To add the Neo4j dependency to your sbt build:

```scala
libraryDependencies += "com.github.music-of-the-ainur" %% "neo4j-almaren" % "0.1.4-3.4"
```

The Neo4j connector is built on the Neo4j Spark Connector (https://github.com/neo4j-contrib/neo4j-spark-connector). See that repository for more details.

To run in spark-shell:

For Scala 2.12:

```shell
spark-shell --master "local[*]" --packages "com.github.music-of-the-ainur:almaren-framework_2.12:0.9.10-3.4,com.github.music-of-the-ainur:neo4j-almaren_2.12:0.1.4-3.4"
```

For Scala 2.13:

```shell
spark-shell --master "local[*]" --packages "com.github.music-of-the-ainur:almaren-framework_2.13:0.9.10-3.4,com.github.music-of-the-ainur:neo4j-almaren_2.13:0.1.4-3.4"
```

The connector is also available from the Maven Central repository. It can be loaded with the `--packages` option or the `spark.jars.packages` configuration property. Use the artifact that matches your Spark and Scala versions:

Version | Connector Artifact |
---|---|
Spark 3.4.x and Scala 2.13 | com.github.music-of-the-ainur:neo4j-almaren_2.13:0.1.4-3.4 |
Spark 3.4.x and Scala 2.12 | com.github.music-of-the-ainur:neo4j-almaren_2.12:0.1.4-3.4 |
Spark 3.3.x and Scala 2.13 | com.github.music-of-the-ainur:neo4j-almaren_2.13:0.1.4-3.3 |
Spark 3.3.x and Scala 2.12 | com.github.music-of-the-ainur:neo4j-almaren_2.12:0.1.4-3.3 |
Spark 3.2.x and Scala 2.12 | com.github.music-of-the-ainur:neo4j-almaren_2.12:0.1.4-3.2 |
Spark 3.1.x and Scala 2.12 | com.github.music-of-the-ainur:neo4j-almaren_2.12:0.1.4-3.1 |
Spark 2.4.x and Scala 2.12 | com.github.music-of-the-ainur:neo4j-almaren_2.12:0.1.4-2.4 |
Spark 2.4.x and Scala 2.11 | com.github.music-of-the-ainur:neo4j-almaren_2.11:0.1.4-2.4 |

To read from Neo4j (source), `sourceNeo4j` takes the following parameters and options:

Parameters | Description |
---|---|
url | The URL of the Neo4j instance to connect to |

Options | Description |
---|---|
authentication.basic.username | Username to use for the basic authentication type |
authentication.basic.password | Password to use for the basic authentication type |
authentication.custom.credentials | The credentials that authenticate the principal |
labels | A label is a name or identifier attached to a node or relationship in the Neo4j database. |
Nodes | Nodes are often used to represent entities. The simplest possible graph is a single node. |
Relationship | A relationship connects two nodes. Relationships organize nodes into structures, allowing a graph to resemble a list, a tree, a map, or a compound entity, any of which may be combined into more complex, richly interconnected structures. |

For more driver options, see the Neo4j Spark Connector documentation.

```scala
import com.github.music.of.the.ainur.almaren.Almaren
import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
import com.github.music.of.the.ainur.almaren.neo4j.Neo4j.Neo4jImplicit

val almaren = Almaren("neo4j-almaren")

// Read every node labelled Person from a local Neo4j instance.
val df = almaren.builder
  .sourceNeo4j(
    "bolt://localhost:7687",     // url
    Some("neo4j"),               // basic-auth username
    Some("neo4j1234"),           // basic-auth password
    Map("labels" -> "Person")    // connector options
  ).batch
```

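
The Relationship row in the options table can be illustrated with a read that targets relationships instead of nodes. This is only a sketch: it assumes the options map is passed through unchanged to the underlying Neo4j Spark Connector, whose documented read options include `relationship`, `relationship.source.labels` and `relationship.target.labels`. The `KNOWS` relationship type and the `Person` labels on both ends are hypothetical.

```scala
import com.github.music.of.the.ainur.almaren.Almaren
import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
import com.github.music.of.the.ainur.almaren.neo4j.Neo4j.Neo4jImplicit

val almaren = Almaren("neo4j-almaren")

// Assumption: the options below are forwarded as-is to the Neo4j Spark Connector.
// KNOWS and Person are illustrative names, not part of this project.
val knows = almaren.builder
  .sourceNeo4j(
    "bolt://localhost:7687",
    Some("neo4j"),
    Some("neo4j1234"),
    Map(
      "relationship"               -> "KNOWS",   // relationship type to read
      "relationship.source.labels" -> "Person",  // label(s) of the start node
      "relationship.target.labels" -> "Person"   // label(s) of the end node
    )
  ).batch

// Inspect the columns the connector produced for the relationship.
knows.printSchema()
```
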
To write to Neo4j (target), `targetNeo4j` takes the following parameters and options:

Parameters | Description |
---|---|
url | The URL of the Neo4j instance to connect to |

Options | Description |
---|---|
authentication.basic.username | Username to use for the basic authentication type |
authentication.basic.password | Password to use for the basic authentication type |
authentication.custom.credentials | The credentials that authenticate the principal |
SaveMode | Specifies the expected behavior when saving a DataFrame to a data source. |
node.keys | Comma-separated list of properties treated as node keys when you are using SaveMode.Overwrite |

For more driver options, see the Neo4j Spark Connector documentation.

```scala
import org.apache.spark.sql.SaveMode
import com.github.music.of.the.ainur.almaren.Almaren
import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
import com.github.music.of.the.ainur.almaren.neo4j.Neo4j.Neo4jImplicit

val almaren = Almaren("neo4j-almaren")

// Write the rows of person_info to Neo4j as nodes labelled Person.
val df = almaren.builder
  .sourceSql("select * from person_info")
  .targetNeo4j(
    "bolt://localhost:7687",      // url
    Some("neo4j"),                // basic-auth username
    Some("neo4j1234"),            // basic-auth password
    Map("labels" -> "Person"),    // connector options
    SaveMode.ErrorIfExists        // save mode
  ).batch
```
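
The `node.keys` option only matters with `SaveMode.Overwrite`, so a variant of the write above may help. This is a minimal sketch that assumes `person_info` has an `id` column that uniquely identifies a person; that column name is hypothetical. With the underlying Neo4j Spark Connector, an overwrite of this kind matches existing nodes on the key properties rather than always creating new ones.

```scala
import org.apache.spark.sql.SaveMode
import com.github.music.of.the.ainur.almaren.Almaren
import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
import com.github.music.of.the.ainur.almaren.neo4j.Neo4j.Neo4jImplicit

val almaren = Almaren("neo4j-almaren")

val upserted = almaren.builder
  .sourceSql("select * from person_info")
  .targetNeo4j(
    "bolt://localhost:7687",
    Some("neo4j"),
    Some("neo4j1234"),
    Map(
      "labels"    -> "Person",
      "node.keys" -> "id"        // assumption: person_info exposes an `id` column
    ),
    SaveMode.Overwrite           // node.keys is consulted in this mode
  ).batch
```

Several key properties can be given as a comma-separated list, for example `"node.keys" -> "id,country"` (again with hypothetical column names).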