POST data from CloudFront Collector

So I’m dealing with a large amount of raw CloudFront data on S3. I successfully put it through the emr-etl tool and then the storageloader into Redshift. However, only a small portion of my data made it through, and I’m finding most of it in the “enrichment/bad” bucket with this error:

“Only GET operations supported for CloudFront Collector, not POST”

However, the line it’s referencing has Base64-encoded JSON data. I’m a bit confused: I’ve read that CloudFront shouldn’t accept POST requests, but the data is clearly there, so it must have worked somehow. Why won’t the EMR job accept it? Thanks.

Hi @dyerw, can you share an example line with the Base64-encoded JSON data?

Here’s a line from the “bad” folder: http://pastebin.com/raw/y2MiKE6C

Hi @dyerw - the row you have shared is from the Clojure Collector - it contains this giveaway parameter:

&cv=clj-1.1.0-tom-0.2.0

So it seems like something has gone wrong somewhere in the configuration or setup of your pipeline…
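For anyone wanting to check their own rows the same way, here is a minimal Python sketch that pulls the `cv` parameter out of a querystring. The function names and the sample querystring are made up for illustration, and treating a `clj-` prefix as the sign of the Clojure Collector is an assumption based only on the value shown above.

```python
from typing import Optional
from urllib.parse import parse_qs


def collector_version(querystring: str) -> Optional[str]:
    """Return the value of the `cv` parameter, or None if it is absent."""
    # Strip a leading "?" or "&" so fragments like "&cv=..." also parse.
    params = parse_qs(querystring.lstrip("?&"), keep_blank_values=True)
    values = params.get("cv")
    return values[0] if values else None


def looks_like_clojure_collector(querystring: str) -> bool:
    """Assumption: Clojure Collector rows carry a `cv` value starting with 'clj-'."""
    cv = collector_version(querystring)
    return cv is not None and cv.startswith("clj-")


if __name__ == "__main__":
    # Hypothetical querystring; only the cv value comes from the row above.
    sample = "e=pv&p=web&cv=clj-1.1.0-tom-0.2.0"
    print(collector_version(sample))
    print(looks_like_clojure_collector(sample))
```

If most rows in the “bad” bucket come back with a `clj-` value, the collector setting in the pipeline config is the first thing to look at.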

Ah, it appears I had the collector format set to cloudfront in my config.yml. Changing it and trying again.

So I’m getting a different error now for a bunch of the data:

Payload with vendor noonu and version tp2 not supported by this version of Scala Common Enrich"}],"failure_tstamp":"2016-05-16T16:37:36.873Z

So I guess my question now is: what is the vendor and how is it set?
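One way to picture where those two values might come from: as far as I understand it (this is an assumption, not something confirmed in this thread), the vendor and version are read from the path the tracker posts to, in the form `/<vendor>/<version>`, so a request to `/noonu/tp2` would produce the message above. The sketch below only illustrates that splitting; the regex and function name are hypothetical and are not the actual Scala Common Enrich code.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for a "/<vendor>/<version>" request path.
API_PATH = re.compile(r"^/?([^/]+)/([^/]+)/?$")


def vendor_and_version(path: str) -> Optional[Tuple[str, str]]:
    """Split a request path into (vendor, version), or None if it does not fit."""
    match = API_PATH.match(path)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(vendor_and_version("/noonu/tp2"))
    print(vendor_and_version("/com.snowplowanalytics.snowplow/tp2"))
```

Under that assumption, the thing to check would be the POST path configured in the tracker.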

Made a separate post for this because the initial issue was resolved:
http://discourse.snowplow.io/t/cant-enrich-custom-events-with-custom-schemas-repository/237