Storage Format in NebulaGraph v2.0.0


NebulaGraph 2.0 has changed considerably across its releases. In the storage architecture, the change with the most significant impact on users is the new encoding format. NebulaGraph stores data as key-value pairs in RocksDB. This article covers the differences between the old and new encoding formats and explains why the format had to change.

Encoding Format in NebulaGraph 1.0

Let's start with a brief review of the encoding format in NebulaGraph 1.0. If you are not familiar with it, I recommend reading this post first: An Introduction to NebulaGraph's Storage Engine. In NebulaGraph 1.0, vertex IDs can only be integers, so every VertexID is stored as an int64.

Vertex key format in NebulaGraph 1.0

Edge key format in NebulaGraph 1.0

The PartID of a VertexID is obtained by hashing the VertexID. For an edge, only the ID of the source vertex is hashed, so a vertex and all its incoming and outgoing edges are mapped to the same partition. Note that in NebulaGraph 1.0, the keys of vertices and edges start with the same Type. As a result, the tags of a vertex are not necessarily stored physically contiguously. They may be stored as shown in the following figure: for the vertex named src, its three tags (tag1, tag2, and tag3) may be separated by edges.
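The interleaving described above can be reproduced with a small sketch. It is a toy model and not the actual NebulaGraph encoder: the field widths, the big-endian packing, the Type value, and the tag and edge type IDs are all assumptions chosen for illustration. It only shows that when vertex keys and edge keys share the same leading bytes, a byte-ordered store such as RocksDB can place edges between the tags of one vertex.

```python
import struct

DATA_TYPE = 0x01  # hypothetical value: one shared Type for vertices and edges


def key_v1(part_id: int, vid: int, schema_id: int) -> bytes:
    """Simplified 1.0-style key prefix (assumed layout):
    Type (1 byte) + PartID (3 bytes) + VertexID (8 bytes) + tag ID or edge type (4 bytes).
    """
    return (
        bytes([DATA_TYPE])
        + part_id.to_bytes(3, "big")
        + struct.pack(">q", vid)
        + struct.pack(">i", schema_id)
    )


# Hypothetical schema IDs: tags use 1, 3, 5 and edge types use 2, 4.
records = {
    key_v1(1, 100, 1): "tag1",
    key_v1(1, 100, 3): "tag2",
    key_v1(1, 100, 5): "tag3",
    key_v1(1, 100, 2): "edge_a",
    key_v1(1, 100, 4): "edge_b",
}

# A sorted key-value store iterates keys in byte order.
ordered = [records[k] for k in sorted(records)]
print(ordered)  # ['tag1', 'edge_a', 'tag2', 'edge_b', 'tag3']
```

Because every key starts with the same Type, PartID, and VertexID, only the trailing ID decides the order, so tags and edges of the same vertex end up mixed.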

RocksDB key format

Such a key format meets most requirements for data operations in NebulaGraph 1.0. For example, given a prefix, the FETCH and GO statements can retrieve the matching data.

Encoding Format in NebulaGraph 2.0

In the 2.0 releases prior to GA, the encoding format in the underlying storage layer was essentially the same as in NebulaGraph 1.0. An integer VertexID is stored in exactly the same format as in 1.0. A string VertexID, however, is stored not as an 8-byte int64 but as a FIXED_STRING whose length users must specify when they run CREATE SPACE. A VertexID shorter than the specified length is automatically padded with \0. A VertexID that exceeds the specified length causes an error.
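The padding rule can be sketched in a few lines. This is a minimal illustration of the behavior described above, not NebulaGraph code: the function name `encode_string_vid` and the use of UTF-8 are assumptions.

```python
def encode_string_vid(vid: str, fixed_len: int) -> bytes:
    """Encode a string VertexID as a fixed-length byte string.

    `fixed_len` stands in for the length given at CREATE SPACE time.
    """
    raw = vid.encode("utf-8")  # assumption: IDs are compared as raw bytes
    if len(raw) > fixed_len:
        # Longer IDs are rejected rather than truncated.
        raise ValueError(f"VertexID {vid!r} exceeds the fixed length {fixed_len}")
    # Shorter IDs are right-padded with \0 up to the fixed length.
    return raw.ljust(fixed_len, b"\x00")


print(encode_string_vid("tom", 8))  # b'tom\x00\x00\x00\x00\x00'
```

Rejecting overlong IDs instead of truncating them avoids two different IDs silently mapping to the same key.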

In the 2.0 GA release, several changes were made to the underlying storage encoding format. Upgrading NebulaGraph to this release therefore requires a dedicated tool that converts the data from the old format to the new one. The following figures show the key formats implemented in the 2.0 GA release.

Key Formats in NebulaGraph 2.0

Vertex key format

Edge key format

Key Format Comparison between NebulaGraph 1.0 and 2.0

Key Format Comparison - Vertex

Key Format Comparison - Edge

As shown in the preceding figures, three major changes were made:

As mentioned earlier, the length of a VertexID is changed from 8 bytes to n bytes. For an integer VertexID, n is 8. For a string VertexID, n is the length specified for the graph space.
In the vertex key format, the timestamp is removed. In the edge key format, the timestamp is replaced with a one-byte placeholder.
Vertex keys and edge keys now use different Type bytes, which separates vertices and edges physically.
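To make the three changes concrete, here is a hedged sketch of 2.0-style key builders. The exact layouts are in the figures, so the field widths used here (3-byte PartID, 4-byte tag ID and edge type, 8-byte rank), the big-endian packing, and the two Type values are assumptions for illustration only.

```python
import struct

VERTEX_TYPE = 0x01  # hypothetical values; the article only says the two differ
EDGE_TYPE = 0x02
PLACEHOLDER = b"\x00"  # one byte that takes the place of the old timestamp


def vertex_key(part_id: int, vid: bytes, tag_id: int) -> bytes:
    """Type + PartID + VertexID (n bytes) + TagID, with no timestamp."""
    return (
        bytes([VERTEX_TYPE])
        + part_id.to_bytes(3, "big")
        + vid
        + struct.pack(">i", tag_id)
    )


def edge_key(part_id: int, src: bytes, edge_type: int, rank: int, dst: bytes) -> bytes:
    """Type + PartID + source VertexID + edge type + rank + destination VertexID + placeholder."""
    return (
        bytes([EDGE_TYPE])
        + part_id.to_bytes(3, "big")
        + src
        + struct.pack(">i", edge_type)
        + struct.pack(">q", rank)
        + dst
        + PLACEHOLDER
    )


int_vid = struct.pack(">q", 42)       # integer VertexID: n = 8
str_vid = b"tom".ljust(16, b"\x00")   # string VertexID in a space created with n = 16

print(len(vertex_key(1, int_vid, 7)), len(vertex_key(1, str_vid, 7)))  # 16 24
print(len(edge_key(1, int_vid, 3, 0, int_vid)))                        # 33
```

Under these assumed widths, every key in a given graph space has a predictable length, because n is fixed per space.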

Why were these changes made? Here are the main reasons:

The changes to VertexID mainly add support for string IDs while remaining compatible with the int IDs of NebulaGraph 1.0. In the Storage layer, VertexIDs are encoded as bytes. In query results, however, VertexIDs are returned as the type configured for the graph space.
> Why is FIXED_STRING applied to string IDs? Without a fixed length, scanning by prefix is not possible. Padding with \0 gives the prefixes of all vertices and edges the same length, so that the corresponding prefix query can be performed.
Removing the timestamp mainly improves performance: storing multiple versions of data lowers performance, and implementing MVCC is currently a low priority for NebulaGraph.
> A one-byte placeholder is reserved in the edge key format. It is designed for TOSS (Transaction on Storage Side) and is mainly used to distinguish the outgoing edge from the incoming edge of an edge. We will write another article to introduce the TOSS feature in detail. Please stay tuned for our updates.
The main benefit of separating vertices and edges physically is that all tags of a vertex can be retrieved easily and quickly, which is used extensively by Cypher's MATCH statements. Before this change, data was scanned by a prefix composed of the shared Type + VertexID, which could greatly reduce performance because vertices and edges may be mixed together. With different Types, scanning by the VertexType + VertexID prefix retrieves all tags quickly.
> In NebulaGraph 1.0, vertices and edges could be stored with the same prefix because there was no need to fetch all tags of a vertex. At the code level, however, such an implementation significantly hurts performance. For example, the FETCH operation in NebulaGraph 1.0 scans by the VertexID prefix, but for a super vertex, retrieving all its tags this way performs poorly. Besides, if the SCAN operation provided by the Storage layer is used to retrieve all the vertices of an entire graph, the entire RocksDB instance is actually scanned.
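Two of these reasons, fixed-length IDs and separate Types, can be illustrated with a toy prefix scan over a sorted list of keys. This is a sketch under simplifying assumptions: the PartID is omitted, the Type values and the 4-byte ID length are hypothetical, and a Python list stands in for RocksDB.

```python
from bisect import bisect_left

VERTEX_TYPE, EDGE_TYPE = b"\x01", b"\x02"  # hypothetical Type values
VID_LEN = 4                                # hypothetical fixed length


def pad(vid: str) -> bytes:
    """Right-pad a string VertexID with \\0 to the fixed length."""
    return vid.encode().ljust(VID_LEN, b"\x00")


def prefix_scan(sorted_keys, prefix):
    """Return every key that starts with `prefix`, seeking first like an iterator would."""
    i = bisect_left(sorted_keys, prefix)
    matches = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches


keys = sorted([
    VERTEX_TYPE + pad("ab") + b"tag1",
    VERTEX_TYPE + pad("ab") + b"tag2",
    VERTEX_TYPE + pad("abc") + b"tag1",
    EDGE_TYPE + pad("ab") + b"edge1",
    EDGE_TYPE + pad("ab") + b"edge2",
])

# VertexType + padded VertexID: only the two tags of vertex "ab", no edges.
tags_of_ab = prefix_scan(keys, VERTEX_TYPE + pad("ab"))

# Without padding, the prefix "ab" also matches the tag of vertex "abc".
unpadded = prefix_scan(keys, VERTEX_TYPE + b"ab")

print(len(tags_of_ab), len(unpadded))  # 2 3
```

The scan touches only the contiguous run of tag keys for one vertex, regardless of how many edges that vertex has.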

In addition to the changes made to the key format of vertices and edges, the format of the indexes has also changed.

Why these changes? The first reason is that with NULL support in NebulaGraph 2.0, the index semantics must change accordingly. The second is that in NebulaGraph 1.0, the string fields in indexes are treated as variable-length strings. Therefore, when an index on a string field is used in a LOOKUP statement, only equality queries are supported. In NebulaGraph 2.0, however, FIXED_STRING is applied to string fields in indexes as well as to VertexIDs in the data, and LOOKUP statements support range scanning on string-type indexes. For example, LOOKUP ON index1 WHERE col > "aaa". We will introduce this design in detail in future articles.
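As a rough illustration of why fixed-length string fields make range scans possible, consider the toy index below. It is not the NebulaGraph index layout: the 8-byte column length, the key shape (padded column value followed by a VertexID), and the helper names are assumptions. Because every column value occupies the same number of bytes, the byte order of the index keys follows the order of the column values, so `col > "aaa"` becomes a single seek plus a forward scan.

```python
from bisect import bisect_left

COL_LEN = 8  # hypothetical fixed length of the indexed string column


def index_key(col_value: str, vid: str) -> bytes:
    """Toy index key: padded column value followed by the VertexID."""
    return col_value.encode().ljust(COL_LEN, b"\x00") + vid.encode()


def lookup_greater_than(index, bound: str):
    """Return (col, vid) pairs with col > bound from a sorted index."""
    lo = bound.encode().ljust(COL_LEN, b"\x00")
    i = bisect_left(index, lo)
    # Skip the entries whose column value equals the bound itself.
    while i < len(index) and index[i][:COL_LEN] == lo:
        i += 1
    return [
        (k[:COL_LEN].rstrip(b"\x00").decode(), k[COL_LEN:].decode())
        for k in index[i:]
    ]


index = sorted(
    index_key(col, vid)
    for col, vid in [("aaa", "v1"), ("aab", "v2"), ("aa", "v3"), ("b", "v4"), ("aaa", "v5")]
)

result = lookup_greater_than(index, "aaa")
print(result)  # [('aab', 'v2'), ('b', 'v4')]
```

With variable-length values, the boundary between the column value and the bytes that follow it would differ from key to key, so byte order would no longer match column order.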

Interested in this topic? Join the NebulaGraph Slack channel and talk to the community.