Complexity Analysis
Time and space complexity of each algorithm in CiteGraph
VNumber of papers (82,937)
ENumber of citation edges (148,116)
KAverage citations per paper
Build Adjacency List
Parse all citation edges and populate the graph
Each edge is visited exactly once. Two Maps are built: citedBy (in-edges) and cites (out-edges). Nodes are also initialised in O(V).
Top Cited (Max-Heap)
Rank all papers by citation count using a max-heap
All V papers are pushed into a MaxPriorityQueue (binary heap). Each enqueue is O(log V). Dequeuing the top-K papers is O(K log V), dominated by the build phase.
Search by Title / ID
Linear scan across all papers for substring / exact match
Iterates over the papers Map once. ID lookup short-circuits on first match. Title search collects all matches up to a limit of 50 and sorts by citation count.
Paper Cluster (Jaccard)
Find papers with overlapping citation sets
For each of the V papers, the intersection with the target paper's citation set is computed in O(K) where K is the average citation list length. Jaccard similarity = |A∩B| / |A∪B|.
Data Structures · Space Complexity
O(V + E)
Adjacency List (Map)
Two Maps — citedBy and cites — store in- and out-edges per node.
O(V)
MaxPriorityQueue
Binary max-heap holding all papers keyed by citation count.
O(K)
Set (for Jaccard)
Temporary Set built from the target paper's citation list per query.
Overall Space
Graph + heap + working sets