For engineers / AWS batch pipeline


Topic Replies Activity
Handling large volumes of duplicated event_ids 4 July 3, 2018
Error in Raw S3 -> Raw HDFS Step 1 June 28, 2018
Monitoring S3 Loader 2 June 12, 2018
Shredded/bad-rows output directory already exists 18 June 6, 2018
Storage loader code from where it gets data from shredded/good 3 May 22, 2018
Problems with Enrich / EMR process provisioning instances 3 May 14, 2018
Monitoring snowplow 4 May 7, 2018
Transient rdbloader error: [Amazon](60000) Error setting/closing connection 4 April 25, 2018
EMR Job - Failing with SSL handshake? 5 April 23, 2018
Enrich problem: "Error writing row" 1 April 19, 2018
Unique network_userid count mismatch 2 April 12, 2018
Incorrect IP Address in the batch pipeline 1 April 9, 2018
Processing a big file in EMR or split it up? 3 March 17, 2018
Executors lost and disconnecting in EMR 5 February 22, 2018
./snowplow-emr-etl-runner: rule 24: exec: java: not found 5 February 21, 2018
ETL RDB Loader Error 5 February 10, 2018
Shredding fails with custom schema in eu-central-1 6 January 27, 2018
Repopulate a single table? 6 January 16, 2018
java.nio.file.FileAlreadyExistsException: ./ip_geo 8 January 15, 2018
Enable Ganglia on Snowplow EMR clusters 4 January 10, 2018
Spark memory woes 2 December 14, 2017
Having issues with config.yaml and Contract Violation 5 December 9, 2017
EmrEtlRunner config.yml, cloudfront format 2 December 8, 2017
Enriched good and bad buckets are empty in the enrich 8 December 4, 2017
Events after enrichment ending in bad bucket 5 November 30, 2017
Failing in the 4th steps of storage process(Input file not found) 3 November 28, 2017
Error in running scala stream collector 3 November 28, 2017
Frequently failing in the 4th steps of storage process 5 November 22, 2017
Need to rebuild snowplow in new region 5 November 20, 2017
Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down issues 3 November 13, 2017