LDBC Social Network Benchmark Between AgensGraph and Neo4j
Bitnine has released a performance comparison between AgensGraph (v2.1) and Neo4j (v3.3.5). This post assesses the results on the LDBC Social Network Benchmark (SNB). Benchmark results depend on the software, the testing environment, the time of the test, and various other conditions, so there is no clear winner in a performance comparison: too many factors are involved, and it is extremely difficult to compare performance on exactly the same criteria.
What is the Linked Data Benchmark Council (LDBC)?
The Linked Data Benchmark Council (LDBC) is a consortium comprised of leading companies and organizations in the DBMS industry. The LDBC SNB is used to test various functions of the graph database management system (GDBMS) and provides a set of queries that describe the scenarios of the social network characterized by graph-shaped data. For this test, the LDBC SNB – v0.3.2 criteria were used.
The SNB consists of two workloads: the interactive workload and the business intelligence workload. The interactive workload was used for this test because it “consists of user-centric transactional-like interactive queries”, which makes it suitable for testing the query speed of graph database software.
This post focuses on complex read queries because demand for analyzing such queries is growing. Many technologies of the 4th Industrial Revolution, such as big data, A.I., and robotics, require analysis of complex queries. Gartner also includes graph among its top 10 data and analytics technology trends because of the significance of exploring and querying data with complex interrelationships.
Test Environment and Software Comparison
Neo4j is a leading vendor in the graph database industry. AgensGraph is a multi-model DBMS built on PostgreSQL that supports both SQL and Cypher, used together or separately.
The following tables show the server information and parameters used in this LDBC SNB test.
In the complex read performance test, AgensGraph outperforms Neo4j on 8 of the 14 queries. Of those 8, 5 queries (#3, #4, #6, #9, #14) are significantly faster, and 2 of those 5 (#6 and #14) show a noticeably larger gap.
AgensGraph has improved since the last benchmark and is now more capable of handling queries with a complicated set of relationships and of finding the shortest path between data. In this test, the more complex the conditions in a query, the larger AgensGraph's lead over Neo4j. Because it is multi-model, AgensGraph can use both the relational (RDB) and the graph (GDB) side depending on the query, which gives it a significant performance advantage.
The two queries with the largest gap are Complex Read #6 and #14.
Complex Query #6 (Tag Co-Occurrence)
Query Description: Given a start Person and some Tag, find other Tags that occur together with this Tag on Posts that were created by start Person’s friends and friends of friends (excluding start Person). Return top 10 Tags, and the count of Posts that were created by these Persons, which contain both this Tag and the given Tag.
AgensGraph is about 1,544 times faster on this query: 0.4107 sec versus 634.19992 sec for Neo4j. The query may look simple, since it starts at a given Person, moves to friends and friends of friends, then to their Posts, and ends at the given Tag. The conditions for finding the Tags, however, are intricately tangled: the Tags other than the given Tag must be collected from the many Posts created by friends and friends of friends.
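To make the shape of this query concrete, here is a minimal in-memory Python sketch of the tag co-occurrence logic. It is not the Cypher used in the benchmark, and it says nothing about how either database executes the query. The function name `tag_cooccurrence`, the toy data, and the tie-break on tag name are assumptions made for illustration.

```python
from collections import Counter


def tag_cooccurrence(knows, posts, start, tag, limit=10):
    """Count tags that appear together with `tag` on posts by the start
    person's friends and friends of friends.

    knows: dict mapping a person to the set of people they know (hypothetical layout)
    posts: list of (creator, set_of_tags) pairs (hypothetical layout)
    """
    # Friends, then friends of friends, excluding the start person.
    friends = set(knows.get(start, ()))
    circle = set(friends)
    for friend in friends:
        circle |= knows.get(friend, set())
    circle.discard(start)

    # For each qualifying post that carries the given tag, count every other tag.
    counts = Counter()
    for creator, tags in posts:
        if creator in circle and tag in tags:
            for other in tags:
                if other != tag:
                    counts[other] += 1

    # Highest post count first; ties broken by tag name (an assumption).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Toy data, invented for this example.
knows = {"alice": {"bob"}, "bob": {"alice", "carol"}, "carol": {"bob"}, "dave": set()}
posts = [
    ("bob", {"graph", "sql"}),
    ("carol", {"graph", "sql", "cypher"}),
    ("carol", {"graph", "cypher"}),
    ("alice", {"graph", "rust"}),  # written by the start person: excluded
    ("dave", {"graph", "go"}),     # not within two hops of alice: excluded
    ("bob", {"sql"}),              # does not carry the given tag: excluded
]
result = tag_cooccurrence(knows, posts, "alice", "graph")
```

Even in this small form, the query combines a two-hop traversal, a filter on the given Tag, and an aggregation over the remaining Tags, which is where the "tangled" conditions come from.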
Complex Query #14 (Weighted/Unweighted Paths)
Query Description: Given two Persons, find all (unweighted) shortest paths between these two Persons, in the subgraph induced by the Knows relationship. Then, for each path calculate a weight. The nodes in the path are Persons, and the weight of a path is the sum of weights between every pair of consecutive Person nodes in the path. The weight for a pair of Persons is calculated such that every reply (by one of the Persons) to a Post (by the other Person) contributes 1.0, and every reply (by one of the Persons) to a Comment (by the other Person) contributes 0.5. Return all the paths with the shortest length and their weights. Do not return any rows if there is no path between the two Persons.
AgensGraph is about 1,285 times faster on this query: 0.5018 sec versus 644.9761 sec for Neo4j. This query requires not only finding the shortest path length but also returning every path of that length and computing its weight. To compute the weights, each step of a path needs smaller sub-queries over the short paths formed by the two Persons at that step, a Post, and a Comment.
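The two phases described above can be sketched in plain Python: a breadth-first search that keeps every predecessor on a shortest path, followed by a per-pair weight lookup. This is an illustrative sketch only, not the benchmark query and not how either database evaluates it. The function names, the `(replier, author, kind)` reply layout, and the result ordering are assumptions.

```python
from collections import deque


def all_shortest_paths(knows, src, dst):
    """Return every shortest path from src to dst over the Knows graph."""
    if src == dst:
        return [[src]]
    dist = {src: 0}
    preds = {src: []}  # predecessors of each node on some shortest path
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break  # every node one level closer has already been expanded
        for nxt in knows.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                preds[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                preds[nxt].append(node)
    if dst not in dist:
        return []  # no path: return no rows

    paths = []

    def walk(node, suffix):
        # Follow predecessor links back to src, building the path as we go.
        if node == src:
            paths.append([src] + suffix)
            return
        for pred in preds[node]:
            walk(pred, [node] + suffix)

    walk(dst, [])
    return paths


def pair_weight(replies, a, b):
    """1.0 per reply to a Post and 0.5 per reply to a Comment, in either direction."""
    total = 0.0
    for replier, author, kind in replies:
        if {replier, author} == {a, b}:
            total += 1.0 if kind == "post" else 0.5
    return total


def weighted_shortest_paths(knows, replies, src, dst):
    scored = [
        (path, sum(pair_weight(replies, a, b) for a, b in zip(path, path[1:])))
        for path in all_shortest_paths(knows, src, dst)
    ]
    # Heaviest path first; ties broken by the path itself (an assumption).
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy data, invented for this example.
knows = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"}, "d": {"b", "c"}, "e": set()}
replies = [("a", "b", "post"), ("b", "a", "comment"), ("d", "b", "comment"), ("c", "a", "comment")]
result = weighted_shortest_paths(knows, replies, "a", "d")
```

In the toy graph there are two shortest paths from `a` to `d`, and the weight lookup has to run once for every consecutive pair on each of them, which is why the cost grows with the number of shortest paths.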
Applicable Use Cases of Complex Read Queries
All of the complex read queries are strongly rooted in social network services. Complex Read #4, #6, and #9, for example, are broadly similar queries that recommend or exclude certain people on the social network by comparing commonalities, such as similar tags or messages, between a given person and a friend. A few queries, however, can be adapted to entirely new scenarios in other fields of application.
Query #3 finds the friends and friends of friends of a given person who have been to countries X and Y. This scenario is similar to Bitnine’s latest use case with the Korea Customs Service (KCS). The role of a control center within the KCS was to find links between smugglers and their accomplices from certain countries. The detailed use case related to this query can be downloaded here.
Query #14 finds the unweighted and weighted shortest paths between two people. Using the same unweighted and weighted path information, the query can be reinterpreted as a navigation system, where the weight of an edge would be either the cost of traveling or the time taken for a trip from one city to another. A navigation system built on AgensGraph may be able to narrow down and recommend a faster and more efficient route for a transportation service or industry.
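As an illustration of the navigation reinterpretation, the sketch below finds the cheapest route between two cities when each edge carries a travel cost. It uses Dijkstra's algorithm, which is a choice made for this example and is not stated to be what AgensGraph or the benchmark uses. The function name `cheapest_route`, the city names, and the travel times are all hypothetical.

```python
import heapq


def cheapest_route(roads, origin, destination):
    """Dijkstra's algorithm over a dict: city -> list of (neighbor, cost).

    Costs must be non-negative. Returns (route, total_cost), or None if
    the destination cannot be reached.
    """
    best = {origin: 0.0}  # lowest known cost to reach each city
    prev = {}             # previous city on the best known route
    heap = [(0.0, origin)]
    while heap:
        cost, city = heapq.heappop(heap)
        if city == destination:
            break
        if cost > best.get(city, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for nxt, step in roads.get(city, ()):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = city
                heapq.heappush(heap, (new_cost, nxt))
    if destination not in best:
        return None

    # Rebuild the route by walking the predecessor links backwards.
    route = [destination]
    while route[-1] != origin:
        route.append(prev[route[-1]])
    route.reverse()
    return route, best[destination]


# Hypothetical travel times in hours, invented for this example.
roads = {
    "Seoul": [("Daejeon", 2.0), ("Busan", 5.0)],
    "Daejeon": [("Busan", 2.5)],
    "Busan": [],
}
result = cheapest_route(roads, "Seoul", "Busan")
```

Here the direct edge from Seoul to Busan costs more than the two-hop route through Daejeon, so the weighted answer differs from the unweighted shortest path, which is the distinction Query #14 draws between path length and path weight.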
Based on the complex read queries, AgensGraph is slightly ahead of Neo4j. As mentioned at the beginning, however, there can be no clear winner in a performance comparison because too many factors have to be taken into consideration before a benchmark test. In this test, AgensGraph's complex reads were on average 13.3 times faster than those of its competitor. The AgensGraph engineering team is working to improve overall performance for the next release so that a full-scale LDBC SNB test can be run again in an optimal state.
LDBC. LDBC Social Network Benchmark (SNB) – v0.3.2
※ AgensGraph Community Edition is licensed under Apache 2.0, so please feel free to send us an inquiry/idea about our products to firstname.lastname@example.org or our Github