AgensGraph Tutorial
Import, Query, and Modify Graph Data

Before Getting Started!

The goal of this tutorial is to give users guidance in how to load sample data into AgensGraph, query the loaded data, and/or extract the desired information.

In order to load and use data as guided in this tutorial, AgensGraph should be installed first. You can install/use AgensBrowser, a visualization tool, as well, for easier and quicker analysis of the sample data.

In general, there are two ways to load data into AgensGraph:

  1. Import dataset files into AgensGraph.
  2. Execute queries on AgensGraph to generate the data.

This tutorial describes how to import a dataset file (CSV) into AgensGraph. If you want more information on how to generate data using queries, please visit Bitnine's website and read Basic Cypher and Advanced Cypher white papers. The link to download the dataset we use here is given below.

Contents of this tutorial

  1. Dataset Overview
  2. What is, and How to Generate Graph Model
  3. Export Dataset from RDB to CSV
  4. Import CSV to AgensGraph
  5. Querying Graph Data
  6. Updating Graph Data
  7. Notes

01. Dataset Overview: Northwind

This tutorial uses Northwind, a sample dataset widely used in RDBMS. It is true that the Northwind dataset is mostly used for examples of relational data, but it can be used as examples of graph data as well. This dataset contains sales data of a fictitious company called Northwind Traders, an importer/exporter of specialty foods from/to the global market. The Northwind dataset used in this tutorial consists of a total of 11 tables, 6 vertex labels, and 6 edge labels and covers 91 customers, 830 ordering records, and 77 types of products sold.

The entity-relationship diagram of the dataset is shown below :

northwind

02. What is, and How to Generate Graph Model

A graph model means a data model that is implemented in graph where data are linked by points (nodes or vertices) and lines (links or edges), rather than in a table format consisting of rows and columns.

The followings shall be noted when switching from a relational model (that uses tables) to a graph model:

  • "row" in a relational model corresponds to a vertex or an edge with properties in a graph model
  • "table name" in a relational model corresponds to "vertex label" and "edge label" in a graph model.

The graph model of Northwind is as follows:

metadata

One of the biggest differences between graph model and relational model is that only graph model reflects the reality (business) model as it is.

Relational model processes complicated real-world business model and transform them into a table.

The problem is that the business model in the real world is in the form of relationships (edges) among objects (vertices), and analogous objects are combined and represented by a group (label), all of which is quite similar to the way graph model is implemented.

03. Export Dataset from RDB to CSV

If you have a Northwind dataset in your existing legacy RDB and want to load it into AgensGraph, you must first export the dataset in the RDB to CSV before loading.

COPY (SELECT * FROM categories) TO 'D:\northwind\exp\categories.csv' WITH CSV header;             
                              COPY (SELECT * FROM customers) TO 'D:\northwind\exp\customers.csv' WITH CSV header;               
                              COPY (SELECT * FROM employees) TO 'D:\northwind\exp\employees.csv' WITH CSV header;               
                              COPY (SELECT * FROM employee_territories) TO 'D:\northwind\exp\employee_territories.csv' WITH CSV header;    
                              COPY (SELECT * FROM orders_details) TO 'D:\northwind\exp\orders_details.csv' WITH CSV header;           
                              COPY (SELECT * FROM orders) TO 'D:\northwind\exp\orders.csv' WITH CSV header;                  
                              COPY (SELECT * FROM products) TO 'D:\northwind\exp\products.csv' WITH CSV header;                
                              COPY (SELECT * FROM regions) TO 'D:\northwind\exp\regions.csv' WITH CSV header;                 
                              COPY (SELECT * FROM shippers) TO 'D:\northwind\exp\shippers.csv' WITH CSV header;                
                              COPY (SELECT * FROM suppliers) TO 'D:\northwind\exp\suppliers.csv' WITH CSV header;               
                              COPY (SELECT * FROM territories) TO 'D:\northwind\exp\territories.csv' WITH CSV header;          
                              
                              COPY (SELECT * FROM orders
                              LEFT OUTER JOIN orders_details ON orders_details.OrderID = orders.OrderID) TO 'D:\northwind\exp\orders.csv' WITH CSV header;

04. Import CSV to AgensGraph

AgensGraph Foreign-Data Wrapper (FDW) is needed when importing a CSV-formatted dataset into AgensGraph.

  1. First, install 'file_fdw' extension.

    CREATE EXTENSION IF NOT EXISTS file_fdw;
  2. Create a database connection server.

    CREATE SERVER northwind FOREIGN DATA WRAPPER file_fdw;
  3. Create each foreign table to load a dataset in CSV format.

    CREATE FOREIGN TABLE categories (
                                                    CategoryID int,
                                                    CategoryName varchar(15),
                                                    Description text,
                                                    Picture bytea
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\categories.csv', delimiter ',', quote '"', null '');
                                              
                                              CREATE FOREIGN TABLE customers (
                                                    CustomerID char(5),
                                                    CompanyName varchar(40),
                                                    ContactName varchar(30),
                                                    ContactTitle varchar(30),
                                                    Address varchar(60),
                                                    City varchar(15),
                                                    Region varchar(15),
                                                    PostalCode varchar(10),
                                                    Country varchar(15),
                                                    Phone varchar(24),
                                                    Fax varchar(24)
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\customers.csv', delimiter ',', quote '"', null '');
                                              
                                              CREATE FOREIGN TABLE employees (
                                                    EmployeeID int,
                                                    LastName varchar(20),
                                                    FirstName varchar(10),
                                                    Title varchar(30),
                                                    TitleOfCourtesy varchar(25),
                                                    BirthDate date,
                                                    HireDate date,
                                                    Address varchar(60),
                                                    City varchar(15),
                                                    Region varchar(15),
                                                    PostalCode varchar(10),
                                                    Country varchar(15),
                                                    HomePhone varchar(24),
                                                    Extension varchar(4),
                                                    Photo bytea,
                                                    Notes text,
                                                    ReportTo int,
                                                    PhotoPath varchar(255)
                                              )  
                                              SERVER northwind 
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\employees.csv', delimiter ',', quote '"', null '');
                                              
                                              CREATE FOREIGN TABLE employee_territories (
                                                    EmployeeID int,
                                                    TerritoryID varchar(20)
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\employee_territories.csv', delimiter ',', quote '"', null '');     
                                              
                                              CREATE FOREIGN TABLE orders_details (
                                                    orderID int,
                                                    ProductID int,
                                                    UnitPrice money,
                                                    Quantity smallint,
                                                    Discount real
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\orders_details.csv', delimiter ',', quote '"', null '');     
                                              
                                              CREATE FOREIGN TABLE orders (
                                                    orderID int,
                                                    CustomerID char(5),
                                                    EmployeeID int,
                                                    orderDate date,
                                                    RequiredDate date,
                                                    ShippedDate date,
                                                    ShipVia int,
                                                    Freight money,
                                                    ShipName varchar(40),
                                                    ShipAddress varchar(60),
                                                    ShipCity varchar(15),
                                                    ShipRegion varchar(15),                         
                                                    ShipPostalCode varchar(10),                         
                                                    ShipCountry varchar(15)                     
                                              )                          
                                              SERVER northwind                         
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\orders.csv', delimiter ',', quote '"', null '');                         
                                                                       
                                              CREATE FOREIGN TABLE products (
                                                    ProductID int,         
                                                    ProductName varchar(40),
                                                    SupplierID int,        
                                                    CategoryID int,        
                                                    QuantityPerUnit varchar(20),
                                                    UnitPrice money,       
                                                    UnitsInStock smallint, 
                                                    UnitsOnorder smallint, 
                                                    ReorderLevel smallint, 
                                                    Discontinued bit       
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\products.csv', delimiter ',', quote '"', null '');
                                              
                                              CREATE FOREIGN TABLE regions (
                                                    RegionID int,
                                                    RegionDescription char(50)
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\regions.csv', delimiter ',', quote '"', null '');     
                                              
                                              CREATE FOREIGN TABLE shippers (
                                                    ShipperID int,
                                                    CompanyName varchar(40),
                                                    Phone varchar(24)
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\shippers.csv', delimiter ',', quote '"', null '');     
                                              
                                              CREATE FOREIGN TABLE suppliers (
                                                    SupplierID int,
                                                    CompanyName varchar(40),
                                                    ContactName varchar(30),
                                                    ContactTitle varchar(30),
                                                    Address varchar(60),
                                                    City varchar(15),
                                                    Region varchar(15),
                                                    PostalCode varchar(10),
                                                    Country varchar(15),
                                                    Phone varchar(24),
                                                    Fax varchar(24),
                                                    HomePage text
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\suppliers.csv', delimiter ',', quote '"', null '');
                                              
                                              CREATE FOREIGN TABLE territories (
                                                    TerritoryID varchar(20),
                                                    TerritoryDescription char(50),
                                                    RegionID int
                                              ) 
                                              SERVER northwind
                                              OPTIONS (FORMAT 'csv', HEADER 'true', FILENAME 'D:\northwind\territories.csv', delimiter ',', quote '"', null ''); 
  4. After creating a graph, specify a graph path.

    CREATE GRAPH northwind_graph;
                                              SET graph_path = northwind_graph;
  5. Now create vertices.

    LOAD FROM categories AS source CREATE (n:category=to_jsonb(source));
                                              LOAD FROM customers AS source CREATE (n:customer=to_jsonb(source));
                                              LOAD FROM employees AS source CREATE (n:employee=to_jsonb(source));
                                              create vlabel if not exists "order";
                                              LOAD FROM orders AS source CREATE (n:"order"=to_jsonb(source));
                                              LOAD FROM products AS source CREATE (n:product=to_jsonb(source));
                                              LOAD FROM regions AS source CREATE (n:region=to_jsonb(source));
                                              LOAD FROM shippers AS source CREATE (n:shipper=to_jsonb(source));
                                              LOAD FROM suppliers AS source CREATE (n:supplier=to_jsonb(source));
                                              LOAD FROM territories AS source CREATE (n:territory=to_jsonb(source));
                                              
  6. Create indices of vertices for a quick graph search.

    CREATE PROPERTY INDEX ON category(categoryid);
                                              CREATE PROPERTY INDEX ON customer(customerid);
                                              CREATE PROPERTY INDEX ON employee(employeeid);
                                              CREATE PROPERTY INDEX ON "order"(orderid);
                                              CREATE PROPERTY INDEX ON product(productid);
                                              CREATE PROPERTY INDEX ON region(regionid);
                                              CREATE PROPERTY INDEX ON shipper(shipperid);
                                              CREATE PROPERTY INDEX ON supplier(supplierid);
                                              CREATE PROPERTY INDEX ON territory(territoryid);
                                              
  7. Generate edges among vertices.

    LOAD FROM orders_details AS source
                                              MATCH (n:"order"),(m:product)
                                              WHERE n.orderid=to_jsonb((source).orderid)
                                                    AND m.productid=to_jsonb((source).productid)
                                              CREATE (n)-[r:ORDERS {unitprice:(source).unitprice,quantity:(source).quantity,discount:(source).discount}]->(m);
                                              
                                              MATCH (n:employee),(m:employee)
                                              WHERE m.employeeid=n.reportto
                                              CREATE (n)-[r:REPORTS_TO]->(m);
                                              
                                              MATCH (n:supplier),(m:product)
                                              WHERE m.supplierid=n.supplierid
                                              CREATE (n)-[r:SUPPLIES]->(m);
                                              
                                              MATCH (n:product),(m:category)
                                              WHERE n.categoryid=m.categoryid
                                              CREATE (n)-[r:PART_OF]->(m);
                                              
                                              MATCH (n:customer),(m:"order")
                                              WHERE m.customerid=n.customerid
                                              CREATE (n)-[r:PURCHASED]->(m);
                                              
                                              MATCH (n:employee),(m:"order")
                                              WHERE m.employeeid=n.employeeid
                                              CREATE (n)-[r:SOLD]->(m);
                                              

If you have finished all of the steps above properly, the following graph model will be generated. (By the graph layout you use different graph layout can appear on your screen)

browser-sample

05. Querying Graph Data

After loading the Northwind dataset into AgensGraph, check if the data is properly loaded by retrieving it with a Cypher query. You can also visualize the graphs by executing query statements in AgensBrowser.

  1. Try to retrieve 100 objects from the whole data. Query up to 100 objects, including customers, orders, and products:

    MATCH path=(c:customer)-[:purchased]->(o:"order")-[r:orders]->(p:product) 
                                                    RETURN path
                                                    LIMIT 100;
  2. Try to retrieve only the data of a specific customer. Query the purchase data of the person named “Anton” among the list of customers:

    MATCH path=(c:customer)-[:PURCHASED]->(o:"order")-[:ORDERS]->(p:product)-[:PART_OF]->(:category)
                                              WHERE c.customerid = 'ANTON' 
                                              RETURN path
                                              LIMIT 100;
  3. Create a product purchase ranking list per customer. Calculate the rating (product-specific order rate) to figure out how often an item ordered by a customer has been purchased:

    DROP ELABEL IF EXISTS RATED;
                                              CREATE ELABEL IF NOT EXISTS RATED;
                                              
                                              MATCH (c:customer)-[:PURCHASED]->(o:"order")-[:ORDERS]->(p:product)
                                              WITH c, count(p) AS total
                                              MATCH (c)-[:PURCHASED]->(o:"order")-[:ORDERS]->(p:product)
                                              WITH c, total, p, count(o) AS orders
                                              WITH c, total, p, orders, orders*1.0/total AS rating
                                              MERGE (c)-[rated:RATED {total_count: to_jsonb(total), order_count: to_jsonb(orders), rating: rating}]->(p);

    Now that you have created a rating vertex that represents the order rate of each item with the query above, you can inquire the items ordered by “Anton” and calculate the order rate of each item.

    MATCH path = (c:customer)-[r:RATED]->(p:product)
                                              WHERE c.customerid='ANTON'
                                              RETURN c.customerid as cid, r.total_count, p.productName, r.order_count AS ord_cnt, r.rating, path
                                              ORDER BY cid, ord_cnt DESC
                                              LIMIT 100;

06. Updating the Graph

Once you confirm that you can properly retrieve graph data, now try to update it. You may search for the information you need to update the graph data, and then update the values ​​of existing objects or expand the graph.

The following query assigns Janet's reporting target (REPORTS_TO) to Steven; it finds a Steven vertex first through a MATCH query, and then looks for the graph pattern that expresses the past ‘report step’ of Janet; finally, deletes the existing REPORTS_TO edge from this graph pattern, and creates new REPORTS_TO edge between Janet and Steven.

MATCH (mgr:Employee {EmployeeID:5}) -- Steven (Vertex)
                                    MATCH (emp:Employee {EmployeeID:3})-[rel:REPORTS_TO]->() -- Janet (Vertex) and Her/His REPORTS_TO (Edge)
                                    DELETE rel
                                    CREATE (emp)-[:REPORTS_TO]->(mgr)
                                    RETURN *;

In this way, you may change the graph in AgensGraph. If you have any questions on Northwind (dataset) or would like to learn more about AgensGraph, please refer to the link below

Related Links

Additional Dataset

So far, we’ve been through from importing dataset to AgensGraph to querying and updating loaded graph data, and we can play with another data on the basis of what we’ve learnt; it is time to play with IMDB(Movie Data).