Storage Model
GraphQLite uses a typed property graph model stored in regular SQLite tables. The schema is designed for query efficiency using an Entity-Attribute-Value (EAV) pattern with property key normalization.
Schema Overview
┌─────────────────────────────────────┐
│ nodes │
│ id (PK, auto-increment) │
├─────────────────────────────────────┤
│ 1 │
│ 2 │
│ 3 │
└─────────────────────────────────────┘
│
│ 1:N
▼
┌─────────────────────────────────────┐
│ node_labels │
│ node_id (FK) │ label │
├───────────────┼─────────────────────┤
│ 1 │ "Person" │
│ 2 │ "Person" │
│ 3 │ "Company" │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ property_keys │
│ id (PK) │ key (UNIQUE) │
├──────────┼──────────────────────────┤
│ 1 │ "name" │
│ 2 │ "age" │
│ 3 │ "id" │
└─────────────────────────────────────┘
│
│ 1:N (via key_id)
▼
┌───────────────────────────────────────────┐
│ node_props_text │
│ node_id (FK) │ key_id (FK) │ value │
├───────────────┼─────────────┼─────────────┤
│ 1 │ 3 │ "alice" │
│ 1 │ 1 │ "Alice" │
│ 2 │ 3 │ "bob" │
│ 2 │ 1 │ "Bob" │
└───────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ edges │
│ id (PK) │ source_id (FK) │ target_id (FK) │ type │
├──────────┼────────────────┼────────────────┼────────────┤
│ 1 │ 1 │ 2 │ "KNOWS" │
│ 2 │ 1 │ 3 │ "WORKS_AT" │
└─────────────────────────────────────────────────────────┘
Core Tables
nodes
The nodes table stores graph nodes with a simple auto-incrementing ID. Node metadata such as labels and properties are stored in separate tables, enabling nodes to have multiple labels and efficient property queries.
| Column | Type | Description |
|---|---|---|
id | INTEGER PRIMARY KEY AUTOINCREMENT | Internal node identifier |
node_labels
Labels are stored in a separate table allowing nodes to have multiple labels. This normalized design enables efficient label-based filtering through indexed lookups.
| Column | Type | Description |
|---|---|---|
node_id | INTEGER FK → nodes(id) | References the node |
label | TEXT | Label name (e.g., "Person") |
The primary key is the composite (node_id, label), which prevents duplicate labels on the same node.
edges
The edges table stores relationships between nodes with a required relationship type.
| Column | Type | Description |
|---|---|---|
id | INTEGER PRIMARY KEY AUTOINCREMENT | Internal edge identifier |
source_id | INTEGER FK → nodes(id) | Source node |
target_id | INTEGER FK → nodes(id) | Target node |
type | TEXT NOT NULL | Relationship type (e.g., "KNOWS") |
Foreign keys use ON DELETE CASCADE so removing a node automatically removes its edges.
property_keys
Property names are normalized into a lookup table to reduce storage overhead and improve query performance. Instead of storing the property name string with every property value, we store a small integer key ID.
| Column | Type | Description |
|---|---|---|
id | INTEGER PRIMARY KEY AUTOINCREMENT | Property key identifier |
key | TEXT UNIQUE | Property name (e.g., "name", "age") |
Property Tables
Properties are stored in separate tables by type. This approach enables type-safe queries, efficient indexing by value, and proper numeric comparisons without type conversion overhead.
Node property tables:
node_props_text— String valuesnode_props_int— Integer valuesnode_props_real— Floating-point valuesnode_props_bool— Boolean values (stored as 0 or 1)
Edge property tables:
edge_props_textedge_props_intedge_props_realedge_props_bool
Each property table has the same structure:
| Column | Type | Description |
|---|---|---|
node_id / edge_id | INTEGER FK | References the owner entity |
key_id | INTEGER FK → property_keys(id) | References the property name |
value | (varies by table) | The property value |
The primary key is the composite (node_id, key_id) or (edge_id, key_id), ensuring each entity has at most one value per property.
Indexes
GraphQLite creates indexes optimized for common graph query patterns:
-- Edge traversal (covers both directions and type filtering)
CREATE INDEX idx_edges_source ON edges(source_id, type);
CREATE INDEX idx_edges_target ON edges(target_id, type);
CREATE INDEX idx_edges_type ON edges(type);
-- Label filtering
CREATE INDEX idx_node_labels_label ON node_labels(label, node_id);
-- Property key lookup
CREATE INDEX idx_property_keys_key ON property_keys(key);
-- Property value queries (enables efficient WHERE clauses)
CREATE INDEX idx_node_props_text_key_value ON node_props_text(key_id, value, node_id);
CREATE INDEX idx_node_props_int_key_value ON node_props_int(key_id, value, node_id);
-- ... similar for other property tables
The property indexes are designed "key-first" to efficiently satisfy queries like WHERE n.name = 'Alice', which translate to lookups by key_id and value.
Why This Design?
Typed property tables provide several advantages over storing all properties as JSON or a single TEXT column. Integer comparisons are performed natively rather than through string parsing. Type-specific indexes enable efficient range queries. Storage is more compact since values don't require type metadata.
Property key normalization through the property_keys table reduces storage by replacing repeated property name strings with integer IDs. This also enables efficient property-first queries and simplifies schema introspection.
Separate label table allows nodes to have multiple labels, which is a common requirement in graph modeling. The label index supports efficient label-based filtering without scanning all nodes.
Query Translation
When you write:
MATCH (p:Person {name: 'Alice'})
WHERE p.age > 25
RETURN p.name, p.age
GraphQLite translates this to SQL that joins the appropriate tables:
SELECT
name_prop.value AS "p.name",
age_prop.value AS "p.age"
FROM nodes p
JOIN node_labels p_label ON p.id = p_label.node_id AND p_label.label = 'Person'
LEFT JOIN node_props_text name_prop
ON p.id = name_prop.node_id
AND name_prop.key_id = (SELECT id FROM property_keys WHERE key = 'name')
LEFT JOIN node_props_int age_prop
ON p.id = age_prop.node_id
AND age_prop.key_id = (SELECT id FROM property_keys WHERE key = 'age')
WHERE name_prop.value = 'Alice'
AND age_prop.value > 25
In practice, the query optimizer uses cached prepared statements for property key lookups, making this translation efficient.
Direct SQL Access
You can query the underlying tables directly for advanced use cases:
-- Count nodes by label
SELECT label, COUNT(*) FROM node_labels GROUP BY label;
-- Find all properties of a specific node
SELECT pk.key, 'text' as type, pt.value
FROM node_props_text pt
JOIN property_keys pk ON pt.key_id = pk.id
WHERE pt.node_id = 1
UNION ALL
SELECT pk.key, 'int' as type, CAST(pi.value AS TEXT)
FROM node_props_int pi
JOIN property_keys pk ON pi.key_id = pk.id
WHERE pi.node_id = 1;
-- Find nodes with a specific property value
SELECT nl.node_id, nl.label, pt.value as name
FROM node_props_text pt
JOIN property_keys pk ON pt.key_id = pk.id
JOIN node_labels nl ON pt.node_id = nl.node_id
WHERE pk.key = 'name' AND pt.value = 'Alice';