[ALGOS-263] feat(algos): Modify the graph schema and dataset;
xuanleilin committed Jul 19, 2024
1 parent 4368f0f commit 848b393
Showing 5 changed files with 48 additions and 49 deletions.
5 changes: 3 additions & 2 deletions algorithms_test/1_dataset.sh
@@ -46,10 +46,11 @@ main() {
# Check if the folder exists before downloading the dataset
dataset_folder="$directory/$top_level_dir"
if [ ! -d "$dataset_folder" ]; then
echo "Created directory: $dataset_folder"
mkdir -p "$dataset_folder"

# Download the dataset if it doesn't exist
if [ ! -f "$directory/$file_name" ]; then
mkdir -p "$dataset_folder"
echo "Created directory: $dataset_folder"
echo "Downloading $file_name..."
if ! wget -O "$directory/$file_name" "$download_link"; then
echo "Failed to download $file_name"
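The hunk above moves `mkdir -p` and the "Created directory" message inside the inner file check, so the folder is only created when a download is about to start. A minimal sketch of that guard order follows. The names `prepare_and_fetch` and `fetch_cmd` are hypothetical and not part of the script; `fetch_cmd` stands in for the `wget` call so the flow can be tried offline.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the script's wget call: creates an empty file at $1.
fetch_cmd() { : > "$1"; }

# Hypothetical wrapper mirroring the guard order shown in the hunk.
prepare_and_fetch() {
    local directory="$1" top_level_dir="$2" file_name="$3"
    local dataset_folder="$directory/$top_level_dir"

    # Nothing to do when the dataset folder already exists
    if [ ! -d "$dataset_folder" ]; then
        # Create the folder only when the archive still has to be downloaded
        if [ ! -f "$directory/$file_name" ]; then
            mkdir -p "$dataset_folder"
            echo "Created directory: $dataset_folder"
            echo "Downloading $file_name..."
            if ! fetch_cmd "$directory/$file_name"; then
                echo "Failed to download $file_name"
                return 1
            fi
        fi
    fi
}
```

Calling `prepare_and_fetch "$HOME/data/public" skitter as-skitter.txt.gz` twice should print the messages on the first call and nothing on the second, because the folder then exists.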
82 changes: 40 additions & 42 deletions algorithms_test/ReadMe.md
@@ -102,9 +102,9 @@ Place a JSON configuration file at `config/1_dataset.json`:
"default_directory": "~/data/public"
},
"datasets": {
"LiveJournal": {
"download_link": "https://snap.stanford.edu/data/soc-LiveJournal1.txt.gz",
"top_level_dir": "livejournal"
"Skitter": {
"download_link": "https://snap.stanford.edu/data/as-skitter.txt.gz",
"top_level_dir": "skitter"
},
"Facebook": {
"download_link": "https://snap.stanford.edu/data/facebook_combined.txt.gz",
@@ -133,38 +133,39 @@ Place a JSON configuration file at `config/1_dataset.json`:
## Example Output

```
======================================== LiveJournal ========================================
Created directory: /home/tigergraph/data/public/livejournal
Downloading soc-LiveJournal1.txt.gz...
--2024-07-19 08:54:43-- https://snap.stanford.edu/data/soc-LiveJournal1.txt.gz
======================================== Skitter ========================================
Created directory: /home/tigergraph/data/public/skitter
Downloading as-skitter.txt.gz...
--2024-07-19 10:28:48-- https://snap.stanford.edu/data/as-skitter.txt.gz
Resolving snap.stanford.edu (snap.stanford.edu)... 171.64.75.80
Connecting to snap.stanford.edu (snap.stanford.edu)|171.64.75.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 259619239 (248M) [application/x-gzip]
Saving to: ‘/home/tigergraph/data/public/soc-LiveJournal1.txt.gz’
Length: 33209863 (32M) [application/x-gzip]
Saving to: ‘/home/tigergraph/data/public/as-skitter.txt.gz’

/home/tigergraph/data/public/soc-LiveJournal1.txt 100%[=============================================================================================================>] 247.59M 16.4MB/s in 12s
/home/tigergraph/data/public/as-skitt 100%[=======================================================================>] 31.67M 12.3MB/s in 2.6s

2024-07-19 08:54:54 (21.2 MB/s) - ‘/home/tigergraph/data/public/soc-LiveJournal1.txt.gz’ saved [259619239/259619239]
2024-07-19 10:28:51 (12.3 MB/s) - ‘/home/tigergraph/data/public/as-skitter.txt.gz’ saved [33209863/33209863]

Unzipping soc-LiveJournal1.txt.gz into /home/tigergraph/data/public...
Finished unzipping soc-LiveJournal1.txt.gz.
Unzipping as-skitter.txt.gz into /home/tigergraph/data/public...
Finished unzipping as-skitter.txt.gz.
======================================== Facebook ========================================
Created directory: /home/tigergraph/data/tmp/facebook
Downloading facebook_combined.txt.gz...
--2024-07-19 08:55:01-- https://snap.stanford.edu/data/facebook_combined.txt.gz
--2024-07-19 10:28:52-- https://snap.stanford.edu/data/facebook_combined.txt.gz
Resolving snap.stanford.edu (snap.stanford.edu)... 171.64.75.80
Connecting to snap.stanford.edu (snap.stanford.edu)|171.64.75.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 218576 (213K) [application/x-gzip]
Saving to: ‘/home/tigergraph/data/tmp/facebook_combined.txt.gz’

/home/tigergraph/data/tmp/facebook_combined.txt.g 100%[=============================================================================================================>] 213.45K 635KB/s in 0.3s
/home/tigergraph/data/tmp/facebook_co 100%[=======================================================================>] 213.45K 622KB/s in 0.3s

2024-07-19 08:55:02 (635 KB/s) - ‘/home/tigergraph/data/tmp/facebook_combined.txt.gz’ saved [218576/218576]
2024-07-19 10:28:52 (622 KB/s) - ‘/home/tigergraph/data/tmp/facebook_combined.txt.gz’ saved [218576/218576]

Unzipping facebook_combined.txt.gz into /home/tigergraph/data/tmp...
Finished unzipping facebook_combined.txt.gz.```
Finished unzipping facebook_combined.txt.gz.
```
# TigerGraph Setup Script
@@ -187,7 +188,7 @@ The configuration file `config/2_setup.json` should have the following structure
},
"graphs": {
"MyGraph": {
"file_path": "/home/tigergraph/data/public/livejournal/soc-LiveJournal1.txt",
"file_path": "/home/tigergraph/data/public/skitter/as-skitter.txt",
"execution_steps": {
"drop_graph": true,
"create_schema": true,
@@ -256,51 +257,48 @@ Finished dropping graph MyGraph.
--------------------------------------------------------------------------------
Running: Creating schema /home/tigergraph/gsql-graph-algorithms/algorithms_test/gsql/MyGraph/1_create_schema.gsql
Stopping GPE GSE RESTPP
Successfully stopped GPE GSE RESTPP in 16.168 seconds
Successfully stopped GPE GSE RESTPP in 16.500 seconds
Starting GPE GSE RESTPP
Successfully started GPE GSE RESTPP in 0.065 seconds
Successfully started GPE GSE RESTPP in 0.063 seconds
The graph MyGraph is created.
Successfully created schema change jobs: [change_schema_of_MyGraph].
WARNING: When modifying the graph schema, reinstalling all affected queries is required, and the duration of this process may vary based on the number and complexity of the queries. To skip query reinstallation, you can run with the '-N' option, but manual reinstallation of queries will be necessary afterwards.
Kick off schema change job change_schema_of_MyGraph
Doing schema change on graph 'MyGraph' (current version: 0)
Trying to add local vertex 'MyNode' to the graph 'MyGraph'.
Trying to add local edge 'MyEdge' and its reverse edge 'rev_MyEdge' to the graph 'MyGraph'.
Trying to add local edge 'MyEdge' to the graph 'MyGraph'.
Graph MyGraph updated to new version 1
The job change_schema_of_MyGraph completes in 1.433 seconds!
The job change_schema_of_MyGraph completes in 0.687 seconds!
Local schema change succeeded.
Successfully dropped jobs on the graph 'MyGraph': [change_schema_of_MyGraph].
--------------------------------------------------------------------------------
Running: Creating loading job /home/tigergraph/gsql-graph-algorithms/algorithms_test/gsql/MyGraph/2_create_loading_job.gsql
Using graph 'MyGraph'
Successfully created loading jobs: [loading_job].
--------------------------------------------------------------------------------
Running loading job for /home/tigergraph/data/public/livejournal/soc-LiveJournal1.txt...
Running loading job for /home/tigergraph/data/public/skitter/as-skitter.txt...
[Tip: Use "CTRL + C" to stop displaying the loading status update, then use "SHOW LOADING STATUS <jobid>" to track the loading progress again]
[Tip: Manage loading jobs with "ABORT/RESUME LOADING JOB <jobid>"]
Running the following loading job:
Job name: loading_job
Jobid: MyGraph.loading_job.file.m1.1721379381684
Log directory: /home/tigergraph/tigergraph/log/fileLoader/MyGraph.loading_job.file.m1.1721379381684
Job "MyGraph.loading_job.file.m1.1721379381684" loading status
Current timestamp is 2024-07-19 08:57:22.611
Loading status was last updated at 2024-07-19 08:57:19.988.
Jobid: MyGraph.loading_job.file.m1.1721390365568
Log directory: /home/tigergraph/tigergraph/log/fileLoader/MyGraph.loading_job.file.m1.1721390365568
Job "MyGraph.loading_job.file.m1.1721390365568" loading status
Current timestamp is 2024-07-19 11:59:41.602
Loading status was last updated at 2024-07-19 11:59:36.576.
[FINISHED] m1 ( Finished: 1 / Total: 1 )
+------------------------------------------------------------------------------------------------+
| FILENAME | LINES | OBJECTS | ERRORS | AVG SPEED | DURATION | PERCENTAGE|
|soc-LiveJournal1.txt | 68993774 | 206981321 | 4 | 1186 kl/s | 58.14 s | 100 %|
+------------------------------------------------------------------------------------------------+
[WARNING] bad data in m1 /home/tigergraph/data/public/livejournal/soc-LiveJournal1.txt: 3 line(s) do not have enough number of tokens.
[WARNING] bad data in m1 /home/tigergraph/data/public/livejournal/soc-LiveJournal1.txt:MyEdge: 1 object(s) have invalid attributes.
Sampling error data can be viewed by executing the 'SHOW LOADING ERROR MyGraph.loading_job.file.m1.1721379381684'.
LOAD SUCCESSFUL for loading jobid: MyGraph.loading_job.file.m1.1721379381684
Job ID: MyGraph.loading_job.file.m1.1721379381684
Elapsed time: 58 sec
Log directory: /home/tigergraph/tigergraph/log/fileLoader/MyGraph.loading_job.file.m1.1721379381684
Summary: /home/tigergraph/tigergraph/log/fileLoader/MyGraph.loading_job.file.m1.1721379381684/summary
Finished running loading job for /home/tigergraph/data/public/livejournal/soc-LiveJournal1.txt.
+---------------------------------------------------------------------------------------------------------+
| FILENAME | LINES | OBJECTS | ERRORS | AVG SPEED | DURATION | PERCENTAGE|
|public/skitter/as-skitter.txt | 11095298 | 33285894 | 0 | 1023 kl/s | 10.84 s | 100 %|
+---------------------------------------------------------------------------------------------------------+
LOAD SUCCESSFUL for loading jobid: MyGraph.loading_job.file.m1.1721390365568
Job ID: MyGraph.loading_job.file.m1.1721390365568
Elapsed time: 11 sec
Log directory: /home/tigergraph/tigergraph/log/fileLoader/MyGraph.loading_job.file.m1.1721390365568
Summary: /home/tigergraph/tigergraph/log/fileLoader/MyGraph.loading_job.file.m1.1721390365568/summary
Finished running loading job for /home/tigergraph/data/public/skitter/as-skitter.txt.
--------------------------------------------------------------------------------
All queries are dropped.
Running: Creating query /home/tigergraph/gsql-graph-algorithms/algorithms_test/../algorithms/Community/connected_components/weakly_connected_components/small_world/tg_wcc_small_world.gsql
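The figures in the new loading table for `as-skitter.txt` can be cross-checked with integer arithmetic. The sketch below only reuses the numbers printed in the table. Reading three objects per line as two `MyNode` vertices plus one `MyEdge` is an assumption, since the loading job definition is not part of this diff.

```bash
#!/usr/bin/env bash
# Values copied from the loading status table for as-skitter.txt
lines=11095298        # LINES
objects=33285894      # OBJECTS
duration_ms=10840     # DURATION of 10.84 s, expressed in milliseconds

# Objects created per input line
objects_per_line=$(( objects / lines ))

# Average speed in thousand lines per second (kl/s), truncated to an integer
kl_per_s=$(( lines * 1000 / duration_ms / 1000 ))

echo "objects per line: $objects_per_line"
echo "average speed:    $kl_per_s kl/s"
```

The truncated speed agrees with the `1023 kl/s` reported in the AVG SPEED column.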
6 changes: 3 additions & 3 deletions algorithms_test/config/1_dataset.json
@@ -3,9 +3,9 @@
"default_directory": "~/data/public"
},
"datasets": {
"LiveJournal": {
"download_link": "https://snap.stanford.edu/data/soc-LiveJournal1.txt.gz",
"top_level_dir": "livejournal"
"Skitter": {
"download_link": "https://snap.stanford.edu/data/as-skitter.txt.gz",
"top_level_dir": "skitter"
},
"Facebook": {
"download_link": "https://snap.stanford.edu/data/facebook_combined.txt.gz",
2 changes: 1 addition & 1 deletion algorithms_test/config/2_setup.json
@@ -5,7 +5,7 @@
},
"graphs": {
"MyGraph": {
"file_path": "/home/tigergraph/data/public/livejournal/soc-LiveJournal1.txt",
"file_path": "/home/tigergraph/data/public/skitter/as-skitter.txt",
"execution_steps": {
"drop_graph": true,
"create_schema": true,
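The `file_path` in `config/2_setup.json` has to stay in step with the dataset entry in `config/1_dataset.json`. The sketch below is a consistency check only, not code from either script. It assumes the unzipped file is named after the last URL segment without `.gz` and sits under `top_level_dir`, which matches the paths shown in this commit.

```bash
#!/usr/bin/env bash
# Values taken from the Skitter entry in config/1_dataset.json;
# the directory is the expanded form of "~/data/public" seen in the example output.
download_link="https://snap.stanford.edu/data/as-skitter.txt.gz"
directory="/home/tigergraph/data/public"
top_level_dir="skitter"

# Archive name is the last path segment of the URL
file_name="$(basename "$download_link")"

# Assumed location of the unzipped edge list: <directory>/<top_level_dir>/<name without .gz>
file_path="$directory/$top_level_dir/${file_name%.gz}"

echo "$file_path"
```

The printed path can be compared against the `file_path` value configured for `MyGraph`.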
2 changes: 1 addition & 1 deletion algorithms_test/gsql/MyGraph/1_create_schema.gsql
@@ -7,7 +7,7 @@ CREATE SCHEMA_CHANGE JOB change_schema_of_MyGraph FOR GRAPH MyGraph {
ADD VERTEX MyNode (PRIMARY_ID id STRING, community STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true";

# 2.2 Create edges
ADD DIRECTED EDGE MyEdge (FROM MyNode, TO MyNode, weight DOUBLE) WITH REVERSE_EDGE="rev_MyEdge";
ADD UNDIRECTED EDGE MyEdge (FROM MyNode, TO MyNode, weight DOUBLE);
}

# 3. Run schema_change job