Update aws.md
sonalgoyal authored Aug 1, 2023
1 parent 5438c5b commit d67cd80
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docs/running/aws.md
@@ -6,6 +6,10 @@ nav_order: 5
---
## Running on AWS Elastic MapReduce

One option is to run Zingg through spark-submit as an EMR Spark step, passing the Zingg config and the phase as step arguments:

```
aws emr create-cluster --name "Add Spark Step Cluster" --release-label emr-6.2.0 --applications Name=Zingg \
--ec2-attributes KeyName=myKey --instance-type <instance type> --instance-count <num instances> \
--steps Type=Spark,Name="Zingg",ActionOnFailure=CONTINUE,Args=[--class,zingg.client.Client,<s3 location of zingg.jar>,--phase,<name of phase - findTrainingData,match etc>,--conf,<s3 location of config.json>] --use-default-roles
```
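Because the `--steps` value is one long comma-separated string, it is easy to mistype. The following is a minimal Python sketch that assembles the same command from its parts so the placeholders can be filled in one place. The helper names `build_step` and `build_command`, the bucket paths, and the instance type are hypothetical and used only for illustration; the flags and fixed values are copied from the command above. The script only prints the command and does not call AWS.

```python
import shlex


def build_step(jar, phase, conf):
    """Return the --steps value for a single Zingg Spark step."""
    # Same argument order as the command above: main class, jar, phase, config.
    step_args = ["--class", "zingg.client.Client", jar,
                 "--phase", phase, "--conf", conf]
    return ("Type=Spark,Name=Zingg,ActionOnFailure=CONTINUE,"
            "Args=[" + ",".join(step_args) + "]")


def build_command(jar, phase, conf, instance_type, instance_count,
                  key_name="myKey", application="Zingg"):
    """Return the aws emr create-cluster command as an argument list."""
    # `application` defaults to the value used in the command above;
    # adjust it if your EMR release expects a different application name.
    return [
        "aws", "emr", "create-cluster",
        "--name", "Add Spark Step Cluster",
        "--release-label", "emr-6.2.0",
        "--applications", f"Name={application}",
        "--ec2-attributes", f"KeyName={key_name}",
        "--instance-type", instance_type,
        "--instance-count", str(instance_count),
        "--steps", build_step(jar, phase, conf),
        "--use-default-roles",
    ]


# Hypothetical S3 locations and instance settings; replace with your own.
command = build_command(
    jar="s3://example-bucket/zingg.jar",
    phase="match",
    conf="s3://example-bucket/config.json",
    instance_type="m5.xlarge",
    instance_count=3,
)

# Print a shell-quoted version that can be reviewed before running it.
print(shlex.join(command))
```

Keeping the command as a list rather than a single string means each value stays one argument, so a name containing spaces (such as the cluster name) is quoted correctly when printed.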

A second option is to run Zingg Python code in [AWS EMR Notebooks](https://aws.amazon.com/emr/features/notebooks/).
