How to import CSV files into AG Cloud Express
As announced at the launch of AG Cloud Express, the Bitnine R&D team has updated the ‘Import your own data’ feature, which lets you import your own CSV data and display it as nodes and edges. We expect many of you will want to jump in and try out the core of a cloud-based graph database, but there are a few things you must understand before using this feature.
First of all, AG Cloud Express combines several technologies and applications developed by Bitnine: AgensGraph for multi-model (relational and graph) data processing, and AGViewer for visualizing graph data and graph queries. The process of making CSV files for AG Cloud Express also applies to AgensGraph, because the modeling techniques in this tutorial follow similar principles.
Importing your data into AG Cloud Express requires you to follow some rules and complete a few setup steps. If you skip them, an error message may appear and AG Cloud Express will fail to add your project.
To show how to create CSV files for AG Cloud Express, we have prepared the following sample data from Google Analytics. The data shows the states and cities around the world from which visitors reached certain merchandise store websites. (Note that more data was gathered than is shown in the image below.)
Depending on your needs, data modeling may look different, so take this guide with a grain of salt, and apply the rules to your own data.
To convert the table data above into CSV files that AG Cloud Express can understand, you first need separate CSV files for nodes and edges.
How to make Node CSV
Notice there are two categories known as Page Title and Region. We will copy and paste these categories separately and save each in different files.
Although not fully shown in the image, the actual data covers a month’s worth of information. It also contains repeated values, so we remove all duplicates to prevent unnecessary nodes from being generated.
The node CSV files for Region and Page are created as shown below.
Cell A1, the header of the only column, must contain “Name” so that AG Cloud Express recognizes the data as nodes. Name the CSV files Region and Page, respectively; the file names become the names of the nodes in AGViewer.
Take out any apostrophes (for example, men’s -> mens) to avoid errors when importing.
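As a rough illustration, the node-file steps above (copy one column, drop duplicates, strip apostrophes, add a header) could be scripted instead of done by hand. This is a minimal Python sketch, not part of AG Cloud Express: the function name build_node_csv and the sample values are made up for this example, and the header defaults to the “Name” shown above.

```python
import csv
import io


def build_node_csv(values, header="Name"):
    """Return CSV text: one header row, then one row per unique cleaned value.

    Hypothetical helper for illustration; the header value is an assumption
    taken from the tutorial text, so adjust it if your import is rejected.
    """
    seen = set()
    unique = []
    for value in values:
        # Remove straight and curly apostrophes, as the tutorial advises.
        cleaned = value.replace("'", "").replace("\u2019", "").strip()
        # Keep only the first occurrence so each node is created once.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])
    writer.writerows([value] for value in unique)
    return buffer.getvalue()


# Made-up sample values standing in for the Region and Page Title columns.
region_csv = build_node_csv(["California", "Colorado", "California"])
page_csv = build_node_csv(["Men's / Unisex", "Home", "Men's / Unisex"])
```

Writing region_csv and page_csv to files named Region.csv and Page.csv would then give one node file per category, matching the naming rule above.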
How to make Edge CSV
Making an edge CSV is simpler than it looks. Start with the file acquired from Google Analytics and rename the column headers so that AGViewer can understand them. The columns next to start_node and end_node are properties that can add quality and depth to your graph data. Keep the significant properties that help you define your data and remove any unnecessary ones that could hinder or cloud your judgment.
Raw data (Before)
The edge file must always begin with the start_node and end_node columns, and the relationship between them is represented by an edge CSV file saved as visit[region&page]. Name the edge file according to the following rules.
- The action one node performs on another is the relationship between them. Thus, the ‘name of the action’ written in the file name becomes your edge. In this case, ‘visit’ is the relationship that will be listed in AGViewer.
- The start node and end node must be enclosed in square brackets [ ], with ‘&’ as the separator between them. When saving directly from Microsoft Excel, you cannot save the file with special characters such as ‘&’ in its name, so you must manually edit the file name afterwards to put ‘&’ between the two nodes.
The files should look like this once they are ready to import into AG Cloud Express.
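The edge-file rules in this section (a relationship[Start&End] file name, start_node and end_node as the first two columns, properties after them) can be sketched in Python as well. The helper names, the ‘users’ property, and the sample row are hypothetical, and the ‘.csv’ extension is an assumption; only the naming pattern and the two column names come from this tutorial.

```python
import csv
import io


def edge_filename(relationship, start_label, end_label):
    """Build a file name such as visit[Region&Page].csv.

    The '.csv' extension is assumed; the tutorial only shows the base name.
    """
    return f"{relationship}[{start_label}&{end_label}].csv"


def build_edge_csv(rows, property_names=()):
    """Return CSV text with start_node and end_node as the first two columns.

    Each row is (start value, end value, *property values). The values should
    match the cleaned names used in the node files.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start_node", "end_node", *property_names])
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Made-up row: one visit edge from a Region node to a Page node,
# carrying a single hypothetical property column.
name = edge_filename("visit", "Region", "Page")
text = build_edge_csv([("California", "Home", 120)], property_names=["users"])
```

Because the file name is generated in code rather than typed into Excel’s save dialog, the ‘&’ can be written directly without the manual rename step.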
Importing into AG Cloud Express
For the sake of the tutorial, the CSV conversion process has been shortened and kept as simple as possible. Depending on what kind of insights you want to draw, it might take more time to model your data.
Once you have a CSV file ready for import, follow the process below. Refer to the link to see how to log in to AG Cloud Express.
Once you log in and click add a new project, a popup will appear as shown below.
Select our newly updated option, User Data (.CSV), to import your data.
Drag your files to the Drag & Drop box or browse files to import.
If you have modeled and named your CSV files correctly, a popup window will appear. If the files were not added successfully, check the file names and the content of the CSV files against the tutorial above, then try again. If the issue persists, send us a request for improvement.
Click the Launch AGViewer button for the moment of truth. An additional tab will open, leading you to the graph visualization tool. Check to see if the nodes and edges are properly aligned as shown in the screenshot below.
If the node labels and edge labels are displayed as above, you are ready to view your data as a graph. If any label shows 0, edit the file that corresponds to that label and import it again.
Above is sample data showing the websites that customers from California and Colorado have visited. A thicker edge indicates more traffic than a thinner one. Many customers have shown interest in various categories of clothing websites and, according to the sample data, Colorado has only two websites in common with California. Among those shared Page nodes, Colorado has heavier traffic on ‘Mens / Unisex / Apparel / Google Merchandise Store’ than California does. Although the analysis is imperfect given the limited amount of data, it suggests that Colorado may have more customers for men’s apparel than California. It also suggests that more online services are accessible in California than in Colorado.
A more in-depth analysis with AGViewer will be covered in another post, if you are interested in what is in store for AGViewer.
Take a glance and gather insights
AGViewer is a graph visualization tool within AG Cloud Express that helps you see your data model more easily. We have gone through how to convert raw data into CSV files that AG Cloud Express can read, and the major points are summarized below. Do check out the rules, practice with the provided datasets below, or import your own data into our cloud-based graph database service!
Tips to Remember:
- The column in each Node file should be named ‘name’.
- In the Edge file(s), make sure to name your node columns ‘start_node’ and ‘end_node’.
- Save the edge file in the following format: ‘relationship[name of the start_node file&name of the end_node file]’ (Example: Visit[Region&Page]).
(If you are saving directly from Excel, ‘&’ won’t be accepted. In that case, put any letter in its place, close the file, and then manually edit the file name to replace that letter with ‘&’.)
- Once all Node and Edge files are ready, click and drag them into the AG Cloud Express Drag & Drop box.
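Before dragging the files in, the naming tips above could be checked with a small script. This is only a sketch under stated assumptions: the ‘.csv’ extension, the regular expression, and the function name check_bundle are hypothetical, and AG Cloud Express may apply additional checks that are not modeled here.

```python
import re

# relationship[Start&End].csv, following the tip above ('.csv' is assumed).
EDGE_NAME = re.compile(
    r"^(?P<rel>[^\[\]&]+)\[(?P<start>[^\[\]&]+)&(?P<end>[^\[\]&]+)\]\.csv$"
)


def check_bundle(filenames):
    """Return a list of naming problems; an empty list means none were found."""
    node_labels = set()
    edges = []
    problems = []
    for name in filenames:
        match = EDGE_NAME.match(name)
        if match:
            edges.append((name, match["start"].strip(), match["end"].strip()))
        elif "[" in name or "]" in name:
            # Looks like an edge file but does not follow the pattern,
            # for example when the Excel placeholder letter was never replaced.
            problems.append(f"{name}: expected relationship[Start&End].csv")
        elif name.endswith(".csv"):
            node_labels.add(name[:-4])
        else:
            problems.append(f"{name}: not a .csv file")

    # Every edge file should point at node files that are part of the upload.
    for name, start, end in edges:
        for label in (start, end):
            if label not in node_labels:
                problems.append(f"{name}: no node file named {label}.csv")
    return problems


ok = check_bundle(["Region.csv", "Page.csv", "Visit[Region&Page].csv"])
missing = check_bundle(["Region.csv", "Visit[Region&Page].csv"])
```

Here ok is an empty list, while missing reports that Page.csv is absent from the upload.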
If you have any questions regarding the newly released AG Cloud Express, please feel free to reach out to us.
Thank you! 😀