
Technical specification for writing RedisGraph client libraries

By design, there is not a full standard for RedisGraph clients to adhere to. Areas such as pretty-print formatting, query validation, and transactional and multithreaded capabilities have no canonically correct behavior, and the implementer is free to choose the approach and complexity that suits them best.

RedisGraph does, however, provide a compact result set format for clients that minimizes the amount of redundant data transmitted from the server. Implementers are encouraged to take advantage of this format, as it provides better performance and removes ambiguity from decoding certain data. This approach requires clients to be capable of issuing procedure calls to the server and performing a small amount of client-side caching.

Retrieving the compact result set

Appending the flag --compact to any query issued to the GRAPH.QUERY endpoint will cause the server to issue results in the compact format. Because the server does not store connection-specific configurations, every query should be issued with this flag.

GRAPH.QUERY demo "MATCH (a) RETURN a" --compact
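A client can make this automatic by appending the flag inside its own query wrapper. The following is a minimal sketch, not taken from any existing client: the helper name compact_query is hypothetical, and client is assumed to be any Redis client object exposing an execute_command(*args) method (as redis-py's Redis class does). A recording stand-in is used here so the example runs without a server.

```python
def compact_query(client, graph, query):
    """Issue `query` against `graph`, always requesting the compact format.

    `client` is assumed to expose execute_command(*args); the helper name
    is illustrative only.
    """
    return client.execute_command("GRAPH.QUERY", graph, query, "--compact")


class RecordingClient:
    """Stand-in client that records commands instead of contacting a server."""

    def __init__(self):
        self.sent = []

    def execute_command(self, *args):
        self.sent.append(args)
        return []


client = RecordingClient()
compact_query(client, "demo", "MATCH (a) RETURN a")
print(client.sent[0])
```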

Formatting differences in the compact result set

The result set has the same overall structure as described in the Result Set documentation.

Certain values are emitted as integer IDs rather than strings:

  1. Node labels
  2. Relationship types
  3. Property keys

Instructions on how to efficiently convert these IDs back to strings are given in the Procedure Calls section below.

Additionally, two enums are exposed:

ColumnType indicates what type of value is held in each column (more formally, that offset into each row of the result set). Each entry in the header row will be a 2-array, with this enum in the first position and the column name string in the second.

PropertyType indicates the data type (such as integer or string) of each returned scalar value. Each scalar value is emitted as a 2-array, with this enum in the first position and the actual value in the second. A column can consist exclusively of scalar values, such as both of the columns created by RETURN a.value, 'this literal string'. Each property on a graph entity also has a scalar as its value, so this construction is nested in each value of the properties array when a column contains a node or relationship.

Decoding the result set

Given the graph created by the query:

GRAPH.QUERY demo "CREATE (:plant {name: 'Tree'})-[:GROWS {season: 'Autumn'}]->(:fruit {name: 'Apple'})"

Let's formulate a query that returns 3 columns: nodes, relationships, and scalars, in that order.

Verbose (default):

127.0.0.1:6379> GRAPH.QUERY demo "MATCH (a)-[e]->(b) RETURN a, e, b.name"
1) 1) "a"
   2) "e"
   3) "b.name"
2) 1) 1) 1) 1) "id"
            2) (integer) 0
         2) 1) "labels"
            2) 1) "plant"
         3) 1) "properties"
            2) 1) 1) "name"
                  2) "Tree"
      2) 1) 1) "id"
            2) (integer) 0
         2) 1) "type"
            2) "GROWS"
         3) 1) "src_node"
            2) (integer) 0
         4) 1) "dest_node"
            2) (integer) 1
         5) 1) "properties"
            2) 1) 1) "season"
                  2) "Autumn"
      3) "Apple"
3) 1) "Query internal execution time: 1.326905 milliseconds"

Compact:

127.0.0.1:6379> GRAPH.QUERY demo "MATCH (a)-[e]->(b) RETURN a, e, b.name" --compact
1) 1) 1) (integer) 2
      2) "a"
   2) 1) (integer) 3
      2) "e"
   3) 1) (integer) 1
      2) "b.name"
2) 1) 1) 1) (integer) 0
         2) 1) (integer) 0
         3) 1) 1) (integer) 0
               2) (integer) 2
               3) "Tree"
      2) 1) (integer) 0
         2) (integer) 0
         3) (integer) 0
         4) (integer) 1
         5) 1) 1) (integer) 1
               2) (integer) 2
               3) "Autumn"
      3) 1) (integer) 2
         2) "Apple"
3) 1) "Query internal execution time: 1.085412 milliseconds"

These results are being parsed by redis-cli, which adds such visual cues as array indexing and indentation, as well as type hints like (integer). The actual data transmitted is formatted using the RESP protocol. All of the current RedisGraph clients rely upon a stable Redis client in the same language (such as redis-rb for Ruby) which handles RESP decoding.

Top-level array results

The result set above had 3 members in its top-level array:

1) Header row
2) Result rows
3) Query statistics

All queries that have a RETURN clause will have these 3 members. Queries that don't return results have only one member in the outermost array, the query statistics:

127.0.0.1:6379> GRAPH.QUERY demo "CREATE (:plant {name: 'Tree'})-[:GROWS {season: 'Autumn'}]->(:fruit {name: 'Apple'})" --compact
1) 1) "Labels added: 2"
   2) "Nodes created: 2"
   3) "Properties set: 3"
   4) "Relationships created: 1"
   5) "Query internal execution time: 1.972868 milliseconds"

Rather than introspecting on the query being emitted, the client implementation can check whether this array contains 1 or 3 elements to choose how to format data.
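This length check can be sketched as follows. The function name split_reply is hypothetical, and the reply is assumed to have already been decoded from RESP into nested Python lists of integers and strings by the underlying Redis client.

```python
def split_reply(reply):
    """Split a GRAPH.QUERY reply into (header, rows, statistics).

    A 1-element reply carries only statistics; a 3-element reply carries
    the header row, the result rows, and the statistics, in that order.
    """
    if len(reply) == 1:
        return None, [], reply[0]
    if len(reply) == 3:
        header, rows, statistics = reply
        return header, rows, statistics
    raise ValueError(f"unexpected top-level reply length: {len(reply)}")


# Reply shape of a query without a RETURN clause (strings from the example above).
write_only = [["Labels added: 2", "Nodes created: 2"]]
header, rows, stats = split_reply(write_only)
print(header, rows, stats)
```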

Reading the header row

Our sample query MATCH (a)-[e]->(b) RETURN a, e, b.name generated the header:

1) 1) (integer) 2
   2) "a"
2) 1) (integer) 3
   2) "e"
3) 1) (integer) 1
   2) "b.name"

The 3 array members correspond, in order, to the 3 entities described in the RETURN clause.

Each is emitted as a 2-array:

1) ColumnType (enum)
2) column name (string)

It is the client's responsibility to store the ColumnType enum. This enum may be extended in the future, but RedisGraph guarantees that the existing values will not be altered.

In this case, a corresponds to a Node column, e corresponds to a Relation column, and b.name corresponds to a Scalar column. No other column types are currently supported.
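A client-side copy of the enum and a header parser might look like the sketch below. The numeric values are inferred from the sample header above (2 for the node column a, 3 for the relation column e, 1 for the scalar column b.name); the class and function names are illustrative assumptions rather than part of any existing client.

```python
from enum import IntEnum


class ColumnType(IntEnum):
    # Values inferred from the sample header in this document.
    SCALAR = 1
    NODE = 2
    RELATION = 3


def parse_header(header):
    """Convert each [ColumnType, name] 2-array into a (ColumnType, name) tuple."""
    return [(ColumnType(type_id), name) for type_id, name in header]


columns = parse_header([[2, "a"], [3, "e"], [1, "b.name"]])
for column_type, name in columns:
    print(name, column_type.name)
```

Because the enum may be extended, ColumnType(type_id) raises ValueError for a value this copy does not know about; a client may prefer to catch that and report an unsupported column type.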

Reading result rows

The entity representations in this section will closely resemble those found in Result Set Graph Entities.

Our query produced one row of results with 3 columns (as described by the header):

1) 1) 1) (integer) 0
      2) 1) (integer) 0
      3) 1) 1) (integer) 0
            2) (integer) 2
            3) "Tree"
   2) 1) (integer) 0
      2) (integer) 0
      3) (integer) 0
      4) (integer) 1
      5) 1) 1) (integer) 1
            2) (integer) 2
            3) "Autumn"
   3) 1) (integer) 2
      2) "Apple"

We know the first column to contain nodes. The node representation contains 3 top-level elements:

  1. The node's internal ID.
  2. An array of all label IDs associated with the node (currently, each node can have either 0 or 1 labels, though this restriction may be lifted in the future).
  3. An array of all properties the node contains. Properties are represented as 3-arrays - [property key ID, PropertyType enum, value].
[
    Node ID (integer),
    [label ID (integer) X label count],
    [[property key ID (integer), PropertyType (enum), value (scalar)] X property count]
]
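The node layout above can be unpacked positionally. This is a minimal sketch with hypothetical names (CompactNode, decode_node); label and property key IDs are left as integers, to be resolved through the cached mappings described under Procedure Calls.

```python
from typing import Any, List, NamedTuple, Tuple


class CompactNode(NamedTuple):
    node_id: int
    label_ids: List[int]
    # Each property is (property key ID, PropertyType enum, value).
    properties: List[Tuple[int, int, Any]]


def decode_node(raw):
    """Decode the 3-element compact node representation."""
    node_id, label_ids, raw_properties = raw
    properties = [(key_id, type_id, value) for key_id, type_id, value in raw_properties]
    return CompactNode(node_id, list(label_ids), properties)


# The node from the sample result row.
node = decode_node([0, [0], [[0, 2, "Tree"]]])
print(node)
```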

The second column contains relations. The relation representation differs from the node representation in two respects:

  • Each relation has exactly one type, rather than the 0+ labels a node may have.
  • A relation is emitted with the IDs of its source and destination nodes.

As such, the complete representation is as follows:

  1. The relation's internal ID.
  2. The relationship type ID.
  3. The source node's internal ID.
  4. The destination node's internal ID.
  5. An array of all properties the relation possesses, each represented as a 3-array in the same form as node properties.
[
    Relation ID (integer),
    type ID (integer),
    source node ID (integer),
    destination node ID (integer),
    [[property key ID (integer), PropertyType (enum), value (scalar)] X property count]
]
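The relation layout can be unpacked the same way. Again a sketch under assumptions: CompactRelation and decode_relation are hypothetical names, and the type ID and property key IDs are kept as integers for later resolution.

```python
from typing import Any, List, NamedTuple, Tuple


class CompactRelation(NamedTuple):
    relation_id: int
    type_id: int
    src_node_id: int
    dest_node_id: int
    # Each property is (property key ID, PropertyType enum, value).
    properties: List[Tuple[int, int, Any]]


def decode_relation(raw):
    """Decode the 5-element compact relation representation."""
    relation_id, type_id, src_node_id, dest_node_id, raw_properties = raw
    properties = [(key_id, ptype, value) for key_id, ptype, value in raw_properties]
    return CompactRelation(relation_id, type_id, src_node_id, dest_node_id, properties)


# The relation from the sample result row.
relation = decode_relation([0, 0, 0, 1, [[1, 2, "Autumn"]]])
print(relation)
```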

The third column contains a scalar. Each scalar is emitted as a 2-array - [PropertyType enum, value].

As with ColumnType, it is the client's responsibility to store the PropertyType enum. This enum may be extended in the future, but RedisGraph guarantees that the existing values will not be altered.
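Putting the three column shapes together, a row can be decoded by dispatching on the ColumnType recorded in the header. The sketch below uses hypothetical function names and the ColumnType values inferred from the sample header (1 scalar, 2 node, 3 relation). The sample shows PropertyType 2 for string values; since the other PropertyType values are not listed in this section, the enum is passed through undecoded.

```python
SCALAR, NODE, RELATION = 1, 2, 3  # ColumnType values inferred from the sample header


def decode_properties(raw_properties):
    """Turn each [key ID, PropertyType, value] 3-array into a tuple."""
    return [(key_id, ptype, value) for key_id, ptype, value in raw_properties]


def decode_cell(column_type, cell):
    if column_type == NODE:
        node_id, label_ids, props = cell
        return {"id": node_id, "label_ids": list(label_ids),
                "properties": decode_properties(props)}
    if column_type == RELATION:
        rel_id, type_id, src_id, dest_id, props = cell
        return {"id": rel_id, "type_id": type_id, "src_node": src_id,
                "dest_node": dest_id, "properties": decode_properties(props)}
    if column_type == SCALAR:
        property_type, value = cell
        return (property_type, value)
    raise ValueError(f"unsupported ColumnType: {column_type}")


def decode_row(header, row):
    """Map each column name to its decoded cell, using the header's ColumnType."""
    return {name: decode_cell(column_type, cell)
            for (column_type, name), cell in zip(header, row)}


header = [[2, "a"], [3, "e"], [1, "b.name"]]
row = [
    [0, [0], [[0, 2, "Tree"]]],
    [0, 0, 0, 1, [[1, 2, "Autumn"]]],
    [2, "Apple"],
]
decoded = decode_row(header, row)
print(decoded["b.name"])
```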

Reading statistics

The final top-level member of the GRAPH.QUERY reply is the execution statistics. This element is identical between the compact and standard response formats.

The statistics always include query execution time, while any combination of the other elements may be included depending on how the graph was modified.

  1. "Labels added: (integer)"
  2. "Nodes created: (integer)"
  3. "Properties set: (integer)"
  4. "Nodes deleted: (integer)"
  5. "Relationships deleted: (integer)"
  6. "Relationships created: (integer)"
  7. "Query internal execution time: (float) milliseconds"
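Since every statistic follows the same "name: value" pattern, a client can turn the array into a dictionary. This is a sketch with a hypothetical function name; it assumes the strings have already been decoded to text and that the number is the first space-separated token after the colon.

```python
def parse_statistics(lines):
    """Parse statistics strings into a {name: number} dictionary."""
    stats = {}
    for line in lines:
        name, _, rest = line.partition(": ")
        number = rest.split(" ")[0]  # drops a trailing unit such as "milliseconds"
        stats[name] = float(number) if "." in number else int(number)
    return stats


stats = parse_statistics([
    "Labels added: 2",
    "Nodes created: 2",
    "Properties set: 3",
    "Relationships created: 1",
    "Query internal execution time: 1.972868 milliseconds",
])
print(stats)
```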

Procedure Calls

Property keys, node labels, and relationship types are all returned as IDs rather than strings in the compact format. For each of these 3 string-ID mappings, IDs start at 0 and increase monotonically.

As such, the client should store a string array for each of these 3 mappings and print the appropriate string for the user by looking up the array at position ID. If an ID greater than or equal to the array length is encountered, the local array should be updated with a procedure call.

These calls are described generally in the Procedures documentation.

To retrieve each full mapping, the appropriate calls are:

db.labels()

127.0.0.1:6379> GRAPH.QUERY demo "CALL db.labels()"
1) 1) "label"
2) 1) 1) "plant"
   2) 1) "fruit"
3) 1) "Query internal execution time: 0.321513 milliseconds"

db.relationshipTypes()

127.0.0.1:6379> GRAPH.QUERY demo "CALL db.relationshipTypes()"
1) 1) "relationshipType"
2) 1) 1) "GROWS"
3) 1) "Query internal execution time: 0.429677 milliseconds"

db.propertyKeys()

127.0.0.1:6379> GRAPH.QUERY demo "CALL db.propertyKeys()"
1) 1) "propertyKey"
2) 1) 1) "name"
   2) 1) "season"
3) 1) "Query internal execution time: 0.318940 milliseconds"

Because the cached values never become outdated, it is possible to retrieve only the new values with a slightly more complex construction:

CALL db.propertyKeys() YIELD propertyKey RETURN propertyKey SKIP [cached_array_length]

That said, these calls are quite efficient regardless of whether this optimization is used.
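The SKIP construction can be sketched as a pair of helpers: one builds the query from the current cache length, the other appends the returned rows to the cache. The function names are hypothetical; the YIELD names are taken from the column headers in the replies above, and the reply is assumed to be in the default (verbose) format, where each row is a 1-element array holding the string. The reply here is a hand-written stand-in, not captured server output.

```python
def new_entries_query(procedure, yield_name, cached_length):
    """Build a query that returns only the entries past the cached ones."""
    return (f"CALL {procedure}() YIELD {yield_name} "
            f"RETURN {yield_name} SKIP {cached_length}")


def extend_cache(cache, reply):
    """Append the strings from a [header, rows, statistics] reply to the cache."""
    _header, rows, _statistics = reply
    cache.extend(row[0] for row in rows)
    return cache


property_keys = ["name"]  # local cache currently knows only ID 0
query = new_entries_query("db.propertyKeys", "propertyKey", len(property_keys))
print(query)

# Stand-in for the server's reply to `query`.
reply = [["propertyKey"], [["season"]],
         ["Query internal execution time: 0.318940 milliseconds"]]
extend_cache(property_keys, reply)
print(property_keys)
```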

As an example, the Python client checks its local array of labels to resolve every label ID.

In the case of an IndexError, it issues a procedure call to fully refresh its label cache.
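The pattern can be illustrated with the sketch below. It is not the Python client's actual code: LabelCache and fetch_labels are hypothetical names, and fetch_labels stands for any callable that issues CALL db.labels() and returns the full list of label strings.

```python
class LabelCache:
    """Resolve label IDs locally, refreshing from the server on a miss."""

    def __init__(self, fetch_labels):
        self._fetch_labels = fetch_labels  # callable returning all label strings
        self._labels = []

    def resolve(self, label_id):
        try:
            return self._labels[label_id]
        except IndexError:
            # Unknown ID: replace the whole cache, then retry the lookup once.
            self._labels = self._fetch_labels()
            return self._labels[label_id]


calls = []


def fake_fetch():
    """Stand-in for a db.labels() procedure call against the demo graph."""
    calls.append("db.labels")
    return ["plant", "fruit"]


cache = LabelCache(fake_fetch)
first = cache.resolve(1)   # miss: triggers one refresh
second = cache.resolve(0)  # hit: served from the local array
print(first, second, len(calls))
```

Only the first lookup reaches the server; later lookups for known IDs are answered from the local array.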

Reference clients

All the logic described in this document has been implemented in most of the clients listed in Client Libraries. Among these, redisgraph-py and JRedisGraph are currently the most sophisticated.