Graph Analytics
This tutorial shows how to use GraphQLite's built-in graph algorithms for analysis.
What You'll Learn
- Run centrality algorithms to find important nodes
- Detect communities in your graph
- Find shortest paths between nodes
- Use algorithm results in your applications
Setup: Create a Social Network
from graphqlite import Graph
g = Graph(":memory:")
# Create a small social network
people = ["alice", "bob", "carol", "dave", "eve", "frank", "grace", "henry"]
for person in people:
g.upsert_node(person, {"name": person.title()}, label="Person")
# Create connections (who follows whom)
connections = [
("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
("bob", "carol"), ("bob", "eve"),
("carol", "dave"), ("carol", "eve"), ("carol", "frank"),
("dave", "frank"),
("eve", "frank"), ("eve", "grace"),
("frank", "grace"), ("frank", "henry"),
("grace", "henry"),
]
for source, target in connections:
g.upsert_edge(source, target, {}, rel_type="FOLLOWS")
print(g.stats()) # {'nodes': 8, 'edges': 14}
Centrality: Finding Important Nodes
PageRank
PageRank identifies nodes that are linked to by other important nodes:
results = g.pagerank(damping=0.85, iterations=20)
for r in sorted(results, key=lambda x: x["score"], reverse=True)[:3]:
print(f"{r['user_id']}: {r['score']:.4f}")
Output:
frank: 0.1842
grace: 0.1536
eve: 0.1298
Frank is the most "important" because many well-connected people follow him.
Degree Centrality
Count incoming and outgoing connections:
results = g.degree_centrality()
for r in results:
print(f"{r['user_id']}: in={r['in_degree']}, out={r['out_degree']}")
Betweenness Centrality
Find nodes that act as bridges between communities:
results = g.query("RETURN betweennessCentrality()")
# Carol and Eve have high betweenness - they connect different groups
Community Detection
Label Propagation
Find clusters of densely connected nodes:
results = g.community_detection(max_iterations=10)
communities = {}
for r in results:
label = r["community"]
if label not in communities:
communities[label] = []
communities[label].append(r["user_id"])
for label, members in communities.items():
print(f"Community {label}: {members}")
Louvain Algorithm
For larger graphs, Louvain provides hierarchical community detection:
results = g.query("RETURN louvain(1.0)")
Path Finding
Shortest Path
Find the shortest path between two nodes:
path = g.shortest_path("alice", "henry")
print(f"Distance: {path['distance']}")
print(f"Path: {' -> '.join(path['path'])}")
Output:
Distance: 4
Path: alice -> carol -> frank -> henry
All-Pairs Shortest Paths
Compute distances between all node pairs:
results = g.query("RETURN apsp()")
Connected Components
Weakly Connected Components
Find groups of nodes that are connected (ignoring edge direction):
results = g.connected_components()
Strongly Connected Components
Find groups where every node can reach every other node:
results = g.query("RETURN scc()")
Using Results in Your Application
Algorithm results are returned as lists of dictionaries, making them easy to process:
# Find the top influencers
influencers = g.pagerank()
top_3 = sorted(influencers, key=lambda x: x["score"], reverse=True)[:3]
# Get full node data for top influencers
for inf in top_3:
node = g.get_node(inf["user_id"])
print(f"{node['properties']['name']}: PageRank {inf['score']:.4f}")
Combining Algorithms with Cypher
Use algorithm results to guide Cypher queries:
# Find the most central node
pagerank = g.pagerank()
most_central = max(pagerank, key=lambda x: x["score"])["user_id"]
# Query their connections
results = g.query(f"""
MATCH (p:Person {{name: '{most_central.title()}'}})-[:FOLLOWS]->(friend)
RETURN friend.name AS friend
""")
print(f"Top influencer {most_central} follows: {[r['friend'] for r in results]}")
Next Steps
- Graph Algorithms Reference - Complete algorithm documentation
- Performance - Algorithm performance characteristics