Performance

This document covers GraphQLite's performance characteristics and optimization strategies.

Benchmarks

All benchmarks below were run on an Apple M1 Max (10 cores, 64GB RAM).

Insertion Performance

Nodes   Edges   Time    Rate
100K    500K    445ms   1.3M/s
500K    2.5M    2.30s   1.3M/s
1M      5.0M    5.16s   1.1M/s

Traversal by Topology

Topology       Nodes   Edges   1-hop   2-hop
Chain          100K    99K     <1ms    <1ms
Sparse         100K    500K    <1ms    <1ms
Moderate       100K    2.0M    <1ms    2ms
Dense          100K    5.0M    <1ms    9ms
Normal dist.   100K    957K    <1ms    1ms
Power-law      100K    242K    <1ms    <1ms
Moderate       500K    10.0M   1ms     2ms
Moderate       1M      20.0M   <1ms    2ms

Graph Algorithms

Algorithm           Nodes   Edges   Time
PageRank            100K    500K    148ms
Label Propagation   100K    500K    154ms
PageRank            500K    2.5M    953ms
Label Propagation   500K    2.5M    811ms
PageRank            1M      5.0M    37.81s
Label Propagation   1M      5.0M    40.21s

Cypher Query Performance

Query Type    G(100K, 500K)   G(500K, 2.5M)   G(1M, 5M)
Node lookup   <1ms            1ms             <1ms
1-hop         <1ms            <1ms            <1ms
2-hop         <1ms            <1ms            <1ms
3-hop         1ms             1ms             1ms
Filter scan   341ms           1.98s           3.79s
MATCH all     360ms           2.05s           3.98s

Optimization Strategies

Use Indexes Effectively

GraphQLite creates indexes on:

  • nodes(user_id) - Fast node lookup by ID
  • nodes(label) - Fast filtering by label
  • edges(source_id), edges(target_id) - Fast traversal
  • Property tables on (node_id, key) - Fast property access

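Queries that can use these indexes (ID lookups, label filters, edge traversals) avoid scanning whole tables. The Cypher benchmarks above show the difference: node lookups and 1- to 3-hop traversals take about 1ms or less at every graph size, while filter scans and MATCH all take 341ms to 3.98s.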
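
One way to check whether a lookup can use an index is SQLite's EXPLAIN QUERY PLAN. The sketch below uses Python's standard sqlite3 module against a simplified stand-in table. The column layout, the unindexed payload column, and the index name idx_nodes_user_id are assumptions made for illustration; they are not GraphQLite's actual schema.

```python
import sqlite3

# Stand-in schema: an assumption for illustration, not GraphQLite's real layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, user_id TEXT, payload TEXT)")
conn.execute("CREATE INDEX idx_nodes_user_id ON nodes(user_id)")


def query_plan(sql, params=()):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Filter on the indexed column: the plan names the index.
indexed = query_plan("SELECT id FROM nodes WHERE user_id = ?", ("alice",))

# Filter on a column with no index: the plan is a full table scan.
unindexed = query_plan("SELECT id FROM nodes WHERE payload = ?", ("x",))

print(indexed)
print(unindexed)
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the index name or the word SCAN than to match the whole string.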

Limit Variable-Length Paths

Variable-length paths can be expensive because the number of paths to explore grows with depth:

-- Expensive: unlimited depth
MATCH (a)-[*]->(b) RETURN b

-- Better: limit depth
MATCH (a)-[*1..3]->(b) RETURN b
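
To see why the depth bound matters, the following sketch counts how many nodes a breadth-first expansion reaches within a given number of hops on a small synthetic graph. It is plain Python and does not use GraphQLite; the graph shape (each node pointing to three others) and the helper name reachable_within are assumptions chosen for illustration.

```python
from collections import deque


def reachable_within(adjacency, start, max_depth):
    """Count distinct nodes reachable from start in 1..max_depth hops (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    count = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth bound
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                count += 1
                frontier.append((neighbor, depth + 1))
    return count


# Synthetic graph: node i points to 3i+1, 3i+2, 3i+3 (a ternary tree).
n = 10_000
adjacency = {
    i: [c for c in (3 * i + 1, 3 * i + 2, 3 * i + 3) if c < n]
    for i in range(n)
}

for depth in (1, 2, 3, 20):
    print(depth, reachable_within(adjacency, 0, depth))
```

With a bound of 3 the expansion touches 39 nodes; with a bound larger than the tree's height it touches all 9,999 other nodes. This sketch counts distinct nodes only. A query that enumerates paths can do more work than this on graphs where many paths lead to the same node.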

Use Specific Labels

Labels help filter early:

-- Slower: scan all nodes
MATCH (n) WHERE n.type = 'Person' RETURN n

-- Faster: use label
MATCH (n:Person) RETURN n

Batch Operations

For bulk inserts, use batch methods:

# Slow: individual inserts
for person in people:
    g.upsert_node(person["id"], person, label="Person")

# Fast: batch insert
nodes = [(p["id"], p, "Person") for p in people]
g.upsert_nodes_batch(nodes)
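
When the input is too large to hold as a single list, one option is to feed the batch method in fixed-size chunks. The sketch below shows only the chunking; the helper name chunked and the chunk size of 10 are illustrative assumptions, and whether chunking helps in practice depends on your data. It runs without GraphQLite, and the comment marks where the batch call from the example above would go.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


people = [{"id": f"p{i}", "name": f"Person {i}"} for i in range(25)]

# Generator of (id, properties, label) tuples, built lazily.
rows = ((p["id"], p, "Person") for p in people)

batch_sizes = []
for batch in chunked(rows, 10):
    # With a GraphQLite graph `g`, this is where g.upsert_nodes_batch(batch) would go.
    batch_sizes.append(len(batch))

print(batch_sizes)  # [10, 10, 5]
```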

Algorithm Caching

Graph algorithms scan the entire graph. If your graph doesn't change frequently, cache results:

import functools

@functools.lru_cache(maxsize=1)
def get_pagerank():
    return g.pagerank()
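
A cached result goes stale once the graph changes, so the cache has to be cleared after writes. The sketch below illustrates this with functools.lru_cache and its cache_clear() method. FakeGraph is a hypothetical stand-in that only counts calls; it is not part of GraphQLite.

```python
import functools


class FakeGraph:
    """Stand-in for a graph object; counts how often pagerank() runs."""

    def __init__(self):
        self.calls = 0

    def pagerank(self):
        self.calls += 1
        return {"a": 0.5, "b": 0.5}


g = FakeGraph()


@functools.lru_cache(maxsize=1)
def get_pagerank():
    return g.pagerank()


get_pagerank()
get_pagerank()  # served from the cache, pagerank() is not called again
calls_before_clear = g.calls

get_pagerank.cache_clear()  # call this after modifying the graph
get_pagerank()  # recomputed
calls_after_clear = g.calls

print(calls_before_clear, calls_after_clear)  # 1 2
```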

Memory Usage

GraphQLite uses SQLite's memory management. Key factors:

  • Page cache: SQLite caches database pages in memory
  • Algorithm scratch space: Algorithms allocate temporary structures
  • Result buffers: Query results are buffered before returning

For large graphs, consider:

# Increase SQLite page cache (default: about 2MB).
# A negative value is a size in KiB, so -64000 is roughly 64MB.
conn.execute("PRAGMA cache_size = -64000")
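
The setting can be confirmed by reading the pragma back. This sketch uses Python's standard sqlite3 module with an in-memory database; how you obtain the underlying connection for a GraphQLite graph is not covered here, so conn is simply assumed to be an sqlite3 connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Negative values are interpreted as a size in KiB rather than a page count.
conn.execute("PRAGMA cache_size = -64000")

# Reading the pragma back returns the value that was set.
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
print(cache_size)  # -64000
```

The cache size applies to the current connection only and is not stored in the database file, so it has to be set again each time a connection is opened.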

Running Benchmarks

Run benchmarks on your hardware:

make performance

This runs:

  • Insertion benchmarks
  • Traversal benchmarks across topologies
  • Algorithm benchmarks
  • Query benchmarks
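
For timing a single operation of your own alongside the benchmark suite, a small helper built on time.perf_counter is usually enough. The helper name time_ms and the workload in the lambda are placeholders chosen for illustration; substitute the query or algorithm call you want to measure.

```python
import time


def time_ms(fn, repeats=5):
    """Run fn `repeats` times and return the best wall-clock time in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best


# Placeholder workload; replace with the call you want to measure.
elapsed = time_ms(lambda: sum(range(100_000)))
print(f"{elapsed:.2f}ms")
```

Taking the best of several runs reduces noise from other processes, at the cost of hiding cold-cache effects on the first run.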