JavaScript tracker with unstructured events

Hello Snowplow community,

I have started with Snowplow on AWS using Terraform.

I successfully created a pipeline for capturing page view info.

I have a couple of questions

  1. Which iglu schema name do I need to give in the JavaScript tracker? I am not sure whether the schema below is correct; I just copied it from a webpage.

iglu:com.snowplowanalytics.snowplow/event_fingerprint_config/jsonschema/1-0-0

  2. What about using multiple iglu schemas?

  3. Handling Unstructured events

I am trying to understand how unstructured events can be tracked. When it comes to iglu schema, I am getting lost. I am not sure how to create my own json schema and use it.

I went through the below given links but couldn’t figure it out.

Could you please help me out to understand these topics?

I could not find any YouTube videos which explain iglu schemas and Snowplow end-to-end setups. If possible, please add some videos on different topics.

Thanks,
Raghav Nayak

Hi @raghavn,

For standard events like page views, or any of the JavaScript tracker’s built-in events/contexts, you don’t need schemas at all.

For custom events, or custom contexts, you do need them.

Iglu is used to host schemas. If you want to use custom events or contexts, you’ll need to set up an Iglu server and connect it to your Enrich component.

Assuming you’ve done that, you’ll need to create a schema to match your custom event or context, and upload it to Iglu for testing. Here’s a guide to doing that, and here’s a guide to self-describing JSON, which is the format of the schema. I would recommend using Snowplow Mini to test your schemas and tracking.
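As a rough sketch of what such a schema might look like, here is a self-describing JSON Schema written as a JavaScript object, plus a small helper that derives the `iglu:` path you would later reference from the tracker. The vendor `com.example`, the event name `button_click`, and its fields are made up for illustration; only the `$schema` and `self` layout follow the self-describing format.

```javascript
// Hypothetical schema for a custom "button_click" event.
// The vendor, name, and properties are illustrative assumptions.
const buttonClickSchema = {
  $schema: 'http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#',
  description: 'Schema for a button click event (example only)',
  self: {
    vendor: 'com.example',
    name: 'button_click',
    format: 'jsonschema',
    version: '1-0-0'
  },
  type: 'object',
  properties: {
    buttonId: { type: 'string', maxLength: 255 },
    label: { type: ['string', 'null'], maxLength: 255 }
  },
  required: ['buttonId'],
  additionalProperties: false
};

// Build the Iglu path (iglu:vendor/name/format/version) from the "self" block.
function igluUri(schema) {
  const { vendor, name, format, version } = schema.self;
  return `iglu:${vendor}/${name}/${format}/${version}`;
}

console.log(igluUri(buttonClickSchema));
// prints "iglu:com.example/button_click/jsonschema/1-0-0"
```

The JSON form of this object is what you would upload to your Iglu server, and the printed path is what the tracker call refers to.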

Once you have uploaded your schema, you can then track your custom events as described in the documentation - reference the path to your custom schema for that event, and the data that you’d like to send against that schema. Same format for attaching custom contexts to events.
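To make that concrete, here is a minimal sketch of a tracking call. It assumes version 3 of the JavaScript tracker, where the second argument is an object with an `event` key (in version 2 the `{ schema, data }` object is passed directly). The schema path `iglu:com.example/button_click/jsonschema/1-0-0` and the helper `trackButtonClick` are hypothetical; the `tracker` parameter stands in for the global `snowplow` function so the example also runs outside a browser.

```javascript
// Send a custom (self-describing) event through a tracker function.
// `tracker` stands in for the global `snowplow` function that the
// JavaScript tracker snippet defines on the page.
function trackButtonClick(tracker, buttonId, label) {
  const event = {
    // Path to a schema hosted on your own Iglu server (hypothetical).
    schema: 'iglu:com.example/button_click/jsonschema/1-0-0',
    // Data that has to validate against that schema.
    data: { buttonId: buttonId, label: label }
  };
  // Argument shape assumed from tracker v3.
  tracker('trackSelfDescribingEvent', { event: event });
  return event;
}

// In the browser: trackButtonClick(window.snowplow, 'signup', 'Sign up');
// Outside the browser, a recording stub makes the call shape visible:
const calls = [];
trackButtonClick((...args) => calls.push(args), 'signup', 'Sign up');
console.log(JSON.stringify(calls[0]));
```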

To give you direct answers to your questions:

  1. Which iglu schema name do I need to give in the Javascript tracker? Not sure whether the below given schema is correct or not. I just copied from some webpage. iglu:com.snowplowanalytics.snowplow/event_fingerprint_config/jsonschema/1-0-0

You don’t need to give any schema name in the JavaScript tracker unless you’re referencing one you’ve created yourself via the above process. The event_fingerprint_config schema defines a config file for an enrichment, so it doesn’t apply here.

  2. What about using multiple iglu schemas?

For custom events, you reference one schema per event - each event can only point to one schema. For custom contexts, you reference a list of schemas - one per context. You can attach as many contexts as you like, and you can attach them to any event, including built-in ones.
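A small sketch of the "list of schemas, one per context" idea, attaching two custom contexts to a built-in page view. It assumes the version 3 tracker, where contexts are passed under a `context` key; the `com.example` schemas, their fields, and the helper name are hypothetical.

```javascript
// Each context is its own self-describing JSON: one schema, one data object.
const userContext = {
  schema: 'iglu:com.example/user/jsonschema/1-0-0', // hypothetical schema
  data: { userId: 'u-123', plan: 'free' }
};
const pageContext = {
  schema: 'iglu:com.example/page_meta/jsonschema/1-0-0', // hypothetical schema
  data: { section: 'pricing' }
};

// The page view itself needs no custom schema; only the contexts do.
function trackPageViewWithContexts(tracker, contexts) {
  tracker('trackPageView', { context: contexts });
}

// In the browser you would pass window.snowplow; here a stub records the call.
const recorded = [];
trackPageViewWithContexts((...args) => recorded.push(args), [userContext, pageContext]);
console.log(recorded[0][1].context.length); // prints 2
```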

  3. Handling Unstructured events

Unstructured events are what I refer to as custom events - in the documentation you’ll sometimes see them referred to as Self-describing events. It’s a bit confusing I know - technically they’re ‘Custom Self-Describing events’. It’s quite the mouthful.

Hello Colm,

Thank you so much for the clarification.

Thanks,
Raghav

Hello Colm,

I have two questions

  1. Organizing event fields
    We have different types of custom events for different pages, and each event has different fields; some of them are common and some unique. What would be the best way to capture these types of custom events?

I am thinking of creating a single schema that holds all the fields in one file. I think this would be easier to maintain and to include in the JavaScript tracker.

  2. Tracking JSON fields
    According to the supported data types, there is no JSON data type. If I want to store a JSON value as a single field, is there any provision for that?

Thanks,
Raghav

Hey @raghavn,

So in terms of what’s possible, you can pretty much do anything with custom events. However, it’s wise to try to follow some of the principles that Snowplow is designed for, and aim for a simple and effective tracking design.

I definitely recommend against using one schema for everything - it’s a short cut on the tracking implementation side, but it’s likely to produce more headaches downstream when you’re trying to use the data.

To give you a quick explanation of some key considerations here:

Snowplow is designed around two important concepts, events and entities (aka contexts). Events are actions - things that happen at a specific moment in time - in this use case, you can think of an event as an interaction with your website. Examples are page views, button clicks, sign-up events.

Entities are persistent things, which are associated with events. Under the paradigm of ‘interactions with your website’, entities are the things doing the interacting, or the things being interacted with (but not limited to those two!). Entities can be attached to events. So, for example, you may have a ‘user’ entity, which has data about the user - username, login date, etc etc. You might have a ‘product’ entity, which contains information about products on your website. When a user clicks on ‘add to cart’ for a product, you might attach the user and product entity to an add to cart event.
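One way to picture this split, as a sketch rather than a recommended design: fields that are common across pages live in reusable entities, and fields unique to an action live in that event's own schema. All schema paths, field names, and helper functions below are hypothetical, and the payload shape assumes the version 3 tracker (`event` plus `context`).

```javascript
// Entities: described once, reused across many different events.
function userEntity(user) {
  return {
    schema: 'iglu:com.example/user/jsonschema/1-0-0', // hypothetical
    data: { username: user.username, loginDate: user.loginDate }
  };
}

function productEntity(product) {
  return {
    schema: 'iglu:com.example/product/jsonschema/1-0-0', // hypothetical
    data: { sku: product.sku, name: product.name }
  };
}

// Events: each has its own small schema holding only its unique fields.
function addToCartEvent(user, product, quantity) {
  return {
    event: {
      schema: 'iglu:com.example/add_to_cart/jsonschema/1-0-0', // hypothetical
      data: { quantity: quantity }
    },
    context: [userEntity(user), productEntity(product)]
  };
}

function signUpEvent(user, method) {
  return {
    event: {
      schema: 'iglu:com.example/sign_up/jsonschema/1-0-0', // hypothetical
      data: { method: method }
    },
    context: [userEntity(user)] // same user entity, different event
  };
}

const user = { username: 'raghav', loginDate: '2021-01-01' };
const payload = addToCartEvent(user, { sku: 'SKU-1', name: 'Mug' }, 2);
// In the browser: window.snowplow('trackSelfDescribingEvent', payload);
console.log(payload.context.map((c) => c.schema).join('\n'));
```

Compared with a single catch-all schema, each event here stays small, and the shared user fields are defined in one place.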

I’ve just picked some examples to try to make the concepts more familiar here. Below are some links which should be useful in figuring out tracking design. They’re written for specific verticals but the concepts are transferrable to any use case.

Retail:
part 1 should help get your head around the concepts
part 2 should help you see how a design can be done well

Media:
part 1 should help get your head around the concepts
part 2 should help you see how a design can be done well

Actually I recommend that whole series of blogs!

Hello Colm,

Thank you so much for the detailed explanation. I will certainly go through the blog posts.

Thanks,
Raghav