[redshift] unstructured event not saved in correct schema


#1

I’m sending some unstructured events from iOS, but when I run snowplow-storage-loader, my custom schema events are saved in the atomic.events table and I can’t see my custom properties.

Do I need a different config for snowplow-storage-loader? I’m using just one config shared between emr-etl-runner and snowplow-storage-loader.

Any idea why this is happening?


#2

Hi @carleto,

Did you deploy a table for your self-describing (unstructured) events along with atomic.events? Also, can you confirm that your enriched/good bucket is non-empty?


#3

Hi @anton,

I have created the tables in the _atomic_ schema, and enriched/good is empty.

update

I ran the emr_etl_job again and now have data in enriched/good/run=2017-02-24-11-50-32/ with 105 items.


#4

Sorry @carleto, I meant the shredded/good bucket (not enriched). This is the bucket your data is loaded from.

If shredded/good is empty, you should look into shredded/bad and find out what exactly went wrong with your events (each event should result in either a line in shredded/good or an error message in shredded/bad).


#5

@anton, it’s ok.

After running it again, I looked at the folder as you suggested, and it has the same 105 files, but all of them are 0 KB in size.

How can I analyze these items?

update

I found the unstructured log in the archived files. An example:
mob 2017-02-24 14:50:32.845 2017-02-24 13:46:55.000 2017-02-24 13:46:53.609 struct d8dcf8ec-fa2b-4915-9934-dc5a2cde2a27 ios-0.6.2 clj-1.1.0-tom-0.2.0 hadoop-1.7.0-common-0.23.0 leonardowistuba@gmail.com 179.191.81.14 2c61506f-0c80-42ef-8f4e-3dfc08f3efee {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1","data":[{"schema":"iglu:com.snowplowanalytics.snowplow/mobile_context/jsonschema/1-0-1","data":{"osType":"ios","networkType":"wifi","osVersion":"10.2","appleIdfv":"5E628B31-6DD6-4DA6-B5B6-E69ADBC8D450","carrier":"Claro Brasil","deviceManufacturer":"Apple Inc.","appleIdfa":"9B1211E4-794D-4885-9BF5-9108B51206AB","deviceModel":"iPhone"}}]} none none [MKT] Scroll na página 0 GuiaBolso/17 CFNetwork/808.2.14 Darwin/16.3.0 pt 750 1334 750 1334 2017-02-24 13:46:54.655 2017-02-24 13:46:53.954 com.google.analytics event jsonschema 1-0-0

I don’t understand why the event is using the schema
iglu:com.snowplowanalytics.snowplow/mobile_context/jsonschema/1-0-1 and not my custom-defined schema.


#6

Hi @carleto,

To find out what’s wrong with your shredded items, you need to look at shredded/bad. If you don’t find anything there related to your event, look at enriched/bad. Each line in those files is an error that happened during shredding (or enrichment) of your events. Enrichment is the first step; shredding follows and takes all valid events from enriched/good.

The most common problem is that the events your trackers sent are not valid against your JSON Schema; that error should show up in enriched/bad.
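If it helps, here is a rough Python sketch for skimming the error messages in a bad bucket once the files are downloaded locally. It assumes each line of a bad file is a JSON object with `line`, `errors` and `failure_tstamp` fields; those field names are an assumption about the bad row format, so check them against your own files. The function name and the sample row are made up for illustration.

```python
import json


def summarize_bad_rows(lines):
    """Return (failure_tstamp, [messages]) for each non-empty bad-row line.

    Assumes one JSON object per line with "errors" and "failure_tstamp"
    keys (an assumption about the bad-row layout, not a guaranteed format).
    """
    summary = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        row = json.loads(raw)
        messages = []
        for err in row.get("errors", []):
            # Errors may be plain strings or objects carrying a "message" key
            if isinstance(err, dict):
                messages.append(err.get("message", json.dumps(err)))
            else:
                messages.append(str(err))
        summary.append((row.get("failure_tstamp"), messages))
    return summary


# Hypothetical bad row, for illustration only
sample = [
    '{"line": "raw-event-payload", '
    '"errors": [{"level": "error", "message": "hypothetical validation failure"}], '
    '"failure_tstamp": "2017-02-24T14:50:32.845Z"}',
]

for tstamp, messages in summarize_bad_rows(sample):
    print(tstamp, messages)
```

To use it on a real file, pass an open file handle instead of `sample`, for example `summarize_bad_rows(open("part-00000"))`, where the file name is whatever you downloaded from the bucket.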

The JSON you published above is not an event, it’s a context. The difference is that an event is an atomic entity that happened at some point in time, while contexts are auxiliary entities that can accompany events (so one event may have zero or more contexts). Both contexts and events are expressed as self-describing JSONs. You can find more information here if you want. This particular context could accompany a page_view event.
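To make the difference concrete, here is a small Python sketch that looks at the outer schema of a self-describing JSON and tells a contexts wrapper apart from an unstructured event wrapper. The contexts sample is a trimmed copy of the JSON posted above; the `com.acme/my_event` schema and its `scrollDepth` property are hypothetical stand-ins for your custom event, and the function name is made up for this example.

```python
import json

# Outer (envelope) schemas, matched by prefix so any version is accepted
CONTEXTS_PREFIX = "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/"
UNSTRUCT_PREFIX = "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/"


def inner_schemas(envelope_json):
    """Return (kind, [inner schema URIs]) for a self-describing JSON envelope."""
    envelope = json.loads(envelope_json)
    schema = envelope["schema"]
    data = envelope["data"]
    if schema.startswith(CONTEXTS_PREFIX):
        # A contexts envelope wraps an array of zero or more contexts
        return ("contexts", [item["schema"] for item in data])
    if schema.startswith(UNSTRUCT_PREFIX):
        # An unstructured event envelope wraps exactly one event
        return ("unstruct_event", [data["schema"]])
    return ("unknown", [])


# Trimmed version of the JSON from the archived line above
contexts_sample = json.dumps({
    "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1",
    "data": [{
        "schema": "iglu:com.snowplowanalytics.snowplow/mobile_context/jsonschema/1-0-1",
        "data": {"osType": "ios", "osVersion": "10.2"},
    }],
})

# Hypothetical custom event; com.acme/my_event is a placeholder schema
event_sample = json.dumps({
    "schema": "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0",
    "data": {
        "schema": "iglu:com.acme/my_event/jsonschema/1-0-0",
        "data": {"scrollDepth": 0},
    },
})

print(inner_schemas(contexts_sample))
print(inner_schemas(event_sample))
```

If your custom schema never shows up inside an unstruct_event envelope in the enriched data, the tracker is most likely not sending it as a self-describing event.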