Graph Algorithms

GraphQLite includes 15+ built-in graph algorithms.

Centrality Algorithms

PageRank

Measures node importance based on incoming links from important nodes.

RETURN pageRank()
RETURN pageRank(0.85, 20)  -- damping, iterations

Returns: [{"node_id": int, "user_id": string, "score": float}, ...]

Parameters:

  • damping (default: 0.85) - Probability of following a link
  • iterations (default: 20) - Number of iterations

Degree Centrality

Counts incoming and outgoing connections.

RETURN degreeCentrality()

Returns: [{"node_id": int, "user_id": string, "in_degree": int, "out_degree": int, "degree": int}, ...]

Betweenness Centrality

Measures how often a node lies on shortest paths between other nodes.

RETURN betweennessCentrality()

Returns: [{"node_id": int, "user_id": string, "score": float}, ...]

Closeness Centrality

Measures average distance to all other nodes.

RETURN closenessCentrality()

Returns: [{"node_id": int, "user_id": string, "score": float}, ...]

Eigenvector Centrality

Measures influence based on connections to high-scoring nodes.

RETURN eigenvectorCentrality()
RETURN eigenvectorCentrality(100)  -- max iterations

Returns: [{"node_id": int, "user_id": string, "score": float}, ...]

Community Detection

Label Propagation

Detects communities by propagating labels through the network.

RETURN labelPropagation()
RETURN labelPropagation(10)  -- max iterations
RETURN communities()         -- alias

Returns: [{"node_id": int, "user_id": string, "community": int}, ...]

Louvain

Hierarchical community detection optimizing modularity.

RETURN louvain()
RETURN louvain(1.0)  -- resolution parameter

Returns: [{"node_id": int, "user_id": string, "community": int}, ...]

Connected Components

Weakly Connected Components (WCC)

Groups nodes reachable by ignoring edge direction.

RETURN wcc()

Returns: [{"node_id": int, "user_id": string, "component": int}, ...]

Strongly Connected Components (SCC)

Groups nodes where every node can reach every other node following edge direction.

RETURN scc()

Returns: [{"node_id": int, "user_id": string, "component": int}, ...]

Path Finding

Dijkstra (Shortest Path)

Finds shortest path between two nodes.

RETURN dijkstra('source_id', 'target_id')

Returns: {"found": bool, "distance": int, "path": [node_ids]}

The found field indicates whether a path exists. When found is false, distance is null and path is empty.

Shortest path with heuristic. Can use geographic coordinates for distance estimation or fall back to uniform heuristic.

RETURN astar('source_id', 'target_id')
RETURN astar('source_id', 'target_id', 'lat_prop', 'lon_prop')

When lat_prop and lon_prop are provided, A* uses haversine distance as the heuristic. Without these properties, it behaves similarly to Dijkstra but may explore fewer nodes.

Returns: {"found": bool, "distance": float, "path": [node_ids], "nodes_explored": int}

All-Pairs Shortest Paths (APSP)

Computes shortest distances between all node pairs.

RETURN apsp()

Returns: [{"source": string, "target": string, "distance": int}, ...]

Note: O(n²) space and time complexity. Use with caution on large graphs.

Traversal

Breadth-First Search (BFS)

Explores nodes level by level from a starting point.

RETURN bfs('start_id')
RETURN bfs('start_id', 3)  -- max depth

Returns: [{"node_id": int, "user_id": string, "depth": int, "order": int}, ...]

The order field indicates the traversal order (0 = starting node, then incrementing).

Depth-First Search (DFS)

Explores as far as possible along each branch.

RETURN dfs('start_id')
RETURN dfs('start_id', 5)  -- max depth

Returns: [{"node_id": int, "user_id": string, "depth": int, "order": int}, ...]

Similarity

Node Similarity (Jaccard)

Computes Jaccard similarity between node neighborhoods.

RETURN nodeSimilarity()

Returns: [{"node1": int, "node2": int, "similarity": float}, ...]

K-Nearest Neighbors (KNN)

Finds k most similar nodes to a given node based on Jaccard similarity of neighborhoods.

RETURN knn('node_id', 10)  -- node, k

Returns: [{"neighbor": string, "similarity": float, "rank": int}, ...]

Results are ordered by similarity (highest first), with rank starting at 1.

Triangle Count

Counts triangles and computes clustering coefficient.

RETURN triangleCount()

Returns: [{"node_id": int, "user_id": string, "triangles": int, "clustering_coefficient": float}, ...]

Using Results in SQL

Extract algorithm results using SQLite JSON functions:

SELECT
    json_extract(value, '$.node_id') as id,
    json_extract(value, '$.score') as score
FROM json_each(cypher('RETURN pageRank()'))
ORDER BY score DESC
LIMIT 10;