Parse enriched/bad


#1

For troubleshooting, we need to inspect the data in enriched/bad. We plan to use Redshift Spectrum to analyze it. What format is the enriched/bad data in? What preprocessing is needed before we can query it with Spectrum?

Thanks.


#2

Hi @RichardJ,

Enriched is just TSV (the column names are in the Snowplow code on GitHub). Bad is much more complicated: it is JSON with the Thrift-encoded raw request in one of the fields (again, the Thrift definition is in the Snowplow sources). And of course cx/ue_px is base64-encoded.
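
As a rough illustration of the two simpler pieces (splitting a TSV line and decoding a base64 value such as cx or ue_px), here is a minimal Python sketch. The helper names `split_enriched_line` and `decode_b64_json` are made up for this example, and it assumes the base64 value is URL-safe, possibly without padding, and wraps a JSON document. It does not touch the Thrift-encoded request in the bad rows; for that you would need the Thrift definition from the Snowplow sources.

```python
import base64
import json


def split_enriched_line(line):
    """Split one TSV line into its fields, keeping empty fields."""
    return line.rstrip("\n").split("\t")


def decode_b64_json(value):
    """Decode a base64 string that wraps a JSON document.

    Assumption: the value is URL-safe base64 and may have had its
    trailing '=' padding stripped, so the padding is restored first.
    """
    padded = value + "=" * (-len(value) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return json.loads(raw.decode("utf-8"))


# Hypothetical sample payload, built here only so the sketch is self-contained.
sample_json = {"schema": "example", "data": []}
sample_b64 = (
    base64.urlsafe_b64encode(json.dumps(sample_json).encode("utf-8"))
    .decode("ascii")
    .rstrip("=")
)

fields = split_enriched_line("a\tb\t\tc\n")
decoded = decode_b64_json(sample_b64)
```

To query the result with Spectrum, you would probably write the decoded records back out in a format it can read directly (for example, one JSON object per line), but that part depends on your setup.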