Resubmit record from stream enrich

We have just switched from the deprecated batch-mode collection and enrichment to the new streaming versions.

Everything was going fine until last week, when we pushed a datatype error to our production JS tracker and started sending a string in a numeric field of an unstruct schema. So now I have an enriched/bad S3 bucket full of events that I want to edit and put back into ETL.

A while back I did this with the batch mode collection and enrichment: pulled events out of enriched/bad, cleaned them up, generated fresh input files, placed them in the raw input bucket. All that worked fine.

Can I do a similar thing now? If I take the payload from an enriched/bad record and correct the defects, can I put it in a bucket where it will get picked up by emr-etl-runner?
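
To make the "correct the defects" step concrete, here is a minimal sketch of the fix for this kind of error. It assumes the payload has already been decoded into the self-describing unstruct event JSON (an outer envelope whose `data` holds the inner `schema` and `data`). The function names, the `com.example` schema URI and the field names `price` and `quantity` are hypothetical. Extracting the payload from the bad record and resubmitting it are not covered here.

```python
import json


def to_number(value):
    """Coerce a numeric string to int or float; leave other values alone.

    Raises ValueError if the string is not numeric, so unfixable
    events surface instead of being silently passed through.
    """
    if isinstance(value, str):
        text = value.strip()
        try:
            return int(text)
        except ValueError:
            return float(text)
    return value


def fix_unstruct_event(raw, numeric_fields):
    """Return the unstruct event JSON with the named fields made numeric.

    Assumes the envelope layout {"schema": ..., "data": {"schema": ..., "data": {...}}}.
    """
    envelope = json.loads(raw)
    fields = envelope["data"]["data"]
    for name in numeric_fields:  # hypothetical field names supplied by the caller
        if name in fields:
            fields[name] = to_number(fields[name])
    return json.dumps(envelope)


if __name__ == "__main__":
    sample = json.dumps({
        "schema": "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0",
        "data": {
            # Hypothetical schema and fields, for illustration only.
            "schema": "iglu:com.example/purchase/jsonschema/1-0-0",
            "data": {"price": "19.99", "quantity": "3", "sku": "abc"},
        },
    })
    print(fix_unstruct_event(sample, ["price", "quantity"]))
```

Only the listed fields are touched, so string fields that are legitimately strings (like `sku` above) stay as they are.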

@wleftwich, yes, you should still be able to recover bad data. However, the workflow might differ from what you used to do, depending on the enrichment version you currently use. There are 2 workflows for streamed data:

The new bad row format is discussed here. It was introduced with the R118 release.
