EMR to Redshift Error


#1

I’m getting an interesting error on EMR-ETL-Runner on the RDB Loader (Redshift) stage:

Data loading error Amazon Invalid operation: Cannot COPY into nonexistent table com_snowplowanalytics_snowplow_duplicate_1;
ERROR: Data loading error Amazon Invalid operation: Cannot COPY into nonexistent table com_snowplowanalytics_snowplow_duplicate_1;
Following steps completed: [Discover]
INFO: Logs successfully dumped to S3 [s3://bucket/logs/rdb-loader/2018-05-10-10-49-44/b22169fd-58d5-48a1-9e63-ac313feb0a99]

I have no idea where this duplicate table is coming from. I have already created a database, run atomic-def.sql as referenced on this page, and completed all the other steps.

Here is my config for the Redshift target as well:

{
    "schema": "iglu:com.snowplowanalytics.snowplow.storage/redshift_config/jsonschema/2-1-0",
    "data": {
        "name": "Snowplow Redshift Storage",
        "host": "IP",
        "database": "Database",
        "port": 5439,
        "sslMode": "REQUIRE",
        "username": "username",
        "password": "password",
        "roleArn": "ARN",
        "schema": "atomic",
        "maxError": 1,
        "compRows": 1000,
        "sshTunnel": null,
        "purpose": "ENRICHED_EVENTS"
    }
}

Any thoughts on why this is happening and how to fix it are much appreciated.


#2

You should be able to just run this SQL to create the missing table.
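
If you want to confirm which table the loader is complaining about before running any DDL, something like the following might help. It is only a rough Python sketch: it assumes the error text has the shape quoted in the first post and that the tables live in the `atomic` schema from the posted config. The function names are hypothetical, and the `information_schema.tables` lookup is my assumption about one way to check from a SQL client, not something RDB Loader itself does.

```python
import re

# Matches the loader message quoted in the first post (assumed format).
_MISSING_TABLE = re.compile(r"Cannot COPY into nonexistent table (\w+)")


def missing_table_from_error(message):
    """Return the table named in a 'nonexistent table' error, or None."""
    match = _MISSING_TABLE.search(message)
    return match.group(1) if match else None


def existence_query(schema, table):
    """Build a parameterised query that counts tables with this name.

    Hypothetical helper: pass the returned SQL and parameters to whatever
    database client you use. A count of 0 means the table is missing.
    """
    sql = (
        "SELECT COUNT(*) FROM information_schema.tables "
        "WHERE table_schema = %s AND table_name = %s"
    )
    return sql, (schema, table)


error = (
    "Data loading error Amazon Invalid operation: Cannot COPY into "
    "nonexistent table com_snowplowanalytics_snowplow_duplicate_1;"
)
table = missing_table_from_error(error)
sql, params = existence_query("atomic", table)
print(table)
print(sql, params)
```

If the count comes back as 0 for that table, creating it with the SQL linked above should let the COPY proceed.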


#3

Thanks! That worked. I had to run manifest-def.sql as well. It wasn’t clear from the documentation that I had to run those too.