Pick of the Week at Nebula Graph - Configuration recommendations for data import

Steam
2020-09-18

Pick of the Week

Normally the weekly issue covers Nebula Graph Updates and Community Q&As. If something major happens, it will also be covered in the additional Events of the Week section.

Events of the Week

  1. Live talk: Nebula Graph in practice with WeChat

Graphs hold a promising prospect in areas such as social network recommendation, real-time computing, risk control, and security. How to store and query large-scale heterogeneous graph data efficiently with graph databases is a great challenge.

Most well-known graph databases struggle with big data sets. For example, the community edition of Neo4j, which is widely used in the graph field, provides only single-replica services. JanusGraph solves the storage problem for big data sets through external metadata management, KV storage, and indexing, but its performance is widely criticized.

How can Internet companies that face the challenge of big data storage and processing solve these problems with a graph database? In this live talk, Li Benli, a senior engineer in the WeChat team, shared his experience with us.

Li has previously written an article on this topic. Read the article here.

  2. New release: Nebula Graph 1.1.0 will be released next week

In this release, the dev team has greatly improved the stability and performance of Nebula Graph. There will also be some bug fixes. Stay tuned!

Nebula Graph Updates

Updates to Nebula Graph in the last week:

  • The range scan for string-type indexes is no longer supported. We can only use the == condition in the WHERE clause of a LOOKUP statement while filtering string-type indexes, and the conditions must match all the properties of the indexes. For more information, see PR #2283 and PR #2277.

  • Fixed an issue where stopping the meta service before the initialization of the job manager may cause meta exceptions. For more information, see PR #2332.

  • Optimized the logic of Raft. A delay is added after an election fails to ensure that there is only one election request at a time. For more information, see PR #2305.
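The string-index restriction in the first update above can be read as a simple check on the WHERE conditions. The sketch below is only an illustration of that rule, not Nebula Graph's implementation: the function name `is_valid_string_index_lookup` and the tuple representation of conditions are hypothetical, and it assumes "match all the properties" means each indexed property appears exactly once.

```python
def is_valid_string_index_lookup(index_props, conditions):
    """Return True if the conditions satisfy the rule described above.

    index_props: property names covered by the string-type index
                 (hypothetical representation).
    conditions:  (property, operator, value) tuples taken from the
                 WHERE clause (hypothetical representation).
    """
    # Range operators such as > or < are rejected; only == is allowed.
    only_equality = all(op == "==" for _, op, _ in conditions)
    # Assumption: every indexed property must appear exactly once.
    covered = sorted(prop for prop, _, _ in conditions)
    covers_all = covered == sorted(index_props)
    return only_equality and covers_all
```

Under these assumptions, a single `==` condition on a one-property index passes, while a range condition or a condition set that leaves out an indexed property does not.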

Community Q&A

This week’s topic is about suggestions for the Spark Writer configuration from @nicole, a community user.

Spark Writer Configuration Suggestions

Before using Spark Writer to import data, we need to configure application.conf.

  1. @nicole recommends that we enrich the configuration file with comments that explain every parameter in the file. Parameters that have default values, such as the Spark-related parameters, could be commented out, with a note reminding users to uncomment a parameter for a change to take effect.

  2. For the field mapping configuration of tags and edges, @nicole wonders if we can add an option that automatically maps fields in the source data to Nebula Graph properties that have the same names. For tags and edges with more than 50 properties, this option would save a lot of work.
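To make the second suggestion concrete, here is a minimal sketch of what name-based auto-mapping could look like. It is not part of Spark Writer: the function `auto_map_fields`, its parameters, and the source-to-property direction of the mapping are all assumptions made for illustration.

```python
def auto_map_fields(source_fields, target_props, explicit=None):
    """Build a {source_field: nebula_property} mapping (hypothetical helper).

    Explicit mappings win. Every remaining property is paired with a
    source field of exactly the same name. Properties that cannot be
    paired are returned separately so they can be configured by hand.
    """
    mapping = dict(explicit or {})
    already_mapped = set(mapping.values())
    available = set(source_fields)
    unmapped = []
    for prop in target_props:
        if prop in already_mapped:
            continue  # covered by an explicit mapping
        if prop in available and prop not in mapping:
            mapping[prop] = prop  # same name on both sides
        else:
            unmapped.append(prop)
    return mapping, unmapped


# Only the one differently named field needs an explicit entry.
mapping, unmapped = auto_map_fields(
    source_fields=["name", "age", "team_id"],
    target_props=["name", "age", "team"],
    explicit={"team_id": "team"},
)
```

With an option like this, a tag with more than 50 properties would need explicit entries only for the fields whose names differ, and the returned list of unmapped properties would show what is still missing.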

Practice data migration from Neo4j to Nebula Graph

In this article, you’ll learn about the implementation of Nebula Graph Exchange, a data import tool based on Spark, and how to import data with it.

Previous Pick of the Week

  1. Nebula Graph DBaaS is online
  2. DB-Engines Graph DBMS Ranking Update in September
  3. Import data from Neo4J or JanusGraph to Nebula Graph

Like what we do? Star us on GitHub: https://github.com/vesoft-inc/nebula