I’m trying to get Event Recovery working following the release last month.
However, when I follow the steps in the docs, I’m unable to get the step to execute on EMR.
If I supply the MainClass (as shown in the docs), I get the error:
Unexpected argument: com.snowplowanalytics.snowplow.event.recovery.Main
If I don’t supply that, I get the error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
The cluster was created with the following config:
aws emr create-cluster --release-label emr-5.19.0 \
  --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large \
  --applications Name=Spark Name=Hadoop \
  --name="Snowplow Event Recovery"
Are there any known issues around this or anything obvious I’m likely to have missed?