Ingesting Auth0 log streams

I am looking to ingest Auth0 events into a GCP-based Snowplow installation. Auth0 is a popular identity management platform and supports near-real-time exports of user creation, login, … events using its log streams feature.
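For context, a single HTTP-webhook log-stream delivery looks roughly like the sketch below. This is an illustrative reconstruction, not an authoritative shape: the exact field set varies by event type, and the values here are placeholders, so check the Auth0 docs for the real thing. The important bit is that one delivery can carry several log entries:

```json
[
  {
    "log_id": "90020210301120000123456789012345678901234567890",
    "data": {
      "date": "2021-03-01T12:00:00.000Z",
      "type": "s",
      "description": "Successful login",
      "client_id": "AaiyAPdpYdesoKnqjj8HJqRn4T5titww",
      "ip": "203.0.113.10",
      "user_agent": "Mozilla/5.0 ...",
      "user_id": "auth0|5f7c8ec7c33c6c004bbafe82"
    }
  }
]
```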

As Auth0 log events are not currently “natively” supported by Snowplow, I was considering using Snowplow’s remote adapter feature, but apparently remote adapters do not work with Beam Enrich, and their usage is somewhat controversial.

I see several alternatives, but would like some feedback on how best to proceed :slight_smile:.

  • Option 1. Implement an adapter in Scala and contribute it to the Snowplow project.
  • Option 2. Send events to a custom endpoint/microservice of my own, which converts them into a “native” payload_data payload and then forwards them to the default Snowplow tp2 tracking endpoint (see the sketch after this list). Forwarding to tp2 seems preferable to forwarding to the Iglu adapter, mostly because the Iglu adapter does not appear to support batching of events, and Auth0 micro-batches its events.
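To make option 2 concrete, here is a minimal sketch of such a bridge in Python. The collector URL, the `iglu:com.auth0/log_event/jsonschema/1-0-0` schema URI, and the shape of the incoming batch (taken from the illustrative delivery above) are all assumptions for illustration; the tp2 endpoint and the `payload_data`/`unstruct_event` envelopes are the standard Snowplow tracker protocol.

```python
import json
import time
import uuid

import requests

# Illustrative assumptions, not official names:
COLLECTOR_URL = "https://collector.example.com/com.snowplowanalytics.snowplow/tp2"
AUTH0_LOG_SCHEMA = "iglu:com.auth0/log_event/jsonschema/1-0-0"  # hypothetical schema
# Standard Snowplow tracker-protocol schemas:
PAYLOAD_DATA_SCHEMA = "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4"
UNSTRUCT_SCHEMA = "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"


def to_tp2_event(log: dict) -> dict:
    """Wrap one Auth0 log entry as a self-describing Snowplow event."""
    return {
        "e": "ue",                            # "unstructured" (self-describing) event
        "p": "srv",                           # platform: server-side
        "tv": "auth0-bridge-0.1.0",           # tracker version label
        "eid": str(uuid.uuid4()),             # event id
        "dtm": str(int(time.time() * 1000)),  # created timestamp, ms since epoch
        "ue_pr": json.dumps({                 # self-describing event payload
            "schema": UNSTRUCT_SCHEMA,
            "data": {"schema": AUTH0_LOG_SCHEMA, "data": log},
        }),
    }


def forward_batch(batch: list) -> None:
    """POST an Auth0 micro-batch to the collector's tp2 endpoint in one request."""
    events = [to_tp2_event(item["data"]) for item in batch]
    body = {"schema": PAYLOAD_DATA_SCHEMA, "data": events}
    requests.post(COLLECTOR_URL, json=body, timeout=10).raise_for_status()
```

Because `payload_data` wraps an array, one POST can carry a whole Auth0 micro-batch, which is what makes tp2 more attractive than a per-event endpoint here.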

I’m quite time-constrained, and as always, having the feature sooner is better. I have limited Scala experience, and implementing a small adapter microservice only takes a matter of hours. All strong points in favor of remote adapters and, now, option 2.

However, Auth0 is a popular, publicly available service. I’d like to contribute, and I’d like an “official” integration. Looking at existing adapters, the actual programming required to ingest log events would be limited; I’d probably spend more time setting up my dev environment than coding. I’d go for this option if there’s a good chance of the adapter ending up in a future release of common-enrich; I don’t want to deploy from source every time there’s a new release…

So…

  • Would an Auth0 event log adapter be a good candidate? We’ve determined that we want to use Snowplow to ingest these logs, but log events are perhaps not as structured as typical Snowplow data. There is structure, though: besides the log message itself, entries have properties such as event type, user ID, … It’s not just text.
  • I’m not affiliated with Auth0; is “third-party” adapter development like this encouraged? Is it OK if I’m the one defining event types under the com.auth0 namespace in Iglu Central? I’m thinking of involving Auth0, or asking them to contribute, as integration with other services is their strong point. They already integrate with Datadog, EventBridge, Event Grid and Slack; Snowplow should be there :wink:.
  • What about option 2? It seems quite simple to me. It’s a fairly generic alternative to the remote adapters, and it actually seems easier to set up, as it does not require any Snowplow configuration changes.

I’d definitely go with option 2 here. There’s a lot more flexibility in defining what business logic is required in terms of transforming (or splitting up) the event, and it means the logic isn’t hardcoded into common-enrich, which can be a problem with Iglu adapters (being pinned to a specific version of a schema). These services are often very inexpensive to run (Cloud Functions, Cloud Run), and you can still use a Snowplow tracker under the hood if you’d like to avoid constructing the network request from scratch (sketched below).
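For what it’s worth, a tracker-based variant of the bridge sketched earlier might look something like this; it assumes the snowplow-tracker Python package, the same hypothetical com.auth0 schema, and the same incoming batch shape:

```python
# Tracker-based variant: let the snowplow-tracker package build and batch the
# requests instead of hand-rolling the tp2 POST.
from snowplow_tracker import Emitter, SelfDescribingJson, Tracker

# collector.example.com is a placeholder for your collector host.
emitter = Emitter("collector.example.com", protocol="https", method="post")
tracker = Tracker(namespace="auth0-bridge", emitters=emitter, app_id="auth0")


def forward_batch(batch: list) -> None:
    """Track each Auth0 log entry as a self-describing event, then flush."""
    for item in batch:
        tracker.track_self_describing_event(
            # Hypothetical schema URI; register your own in an Iglu repository.
            SelfDescribingJson("iglu:com.auth0/log_event/jsonschema/1-0-0", item["data"])
        )
    tracker.flush()  # force-send anything still buffered in the emitter
```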

I’m not sure that my views are necessarily shared by everyone, but I do like having this sort of functionality independent from the enrich code base. Beyond the structure and common manipulations of the Snowplow event, I think it’s better to have loose coupling, where enrich doesn’t need to be aware of third-party events. You’ve already highlighted a couple of these reasons (deployments, release velocity), but I believe there are a number of reasons beyond this.

There’s no issue with this - I’d advise keeping the schema as general as you can and trying to support whatever use cases you can think of (e.g., if there are fields you aren’t using, still include them in the schema).
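As an illustration of “keep it general”, a first cut of a hypothetical com.auth0 log event schema might look like the sketch below; the field list is assumed from Auth0’s log entry documentation rather than authoritative, and you’d likely want to add more fields than you currently use:

```json
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "An Auth0 tenant log event (hypothetical first draft)",
  "self": {
    "vendor": "com.auth0",
    "name": "log_event",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "type": { "type": "string", "description": "Auth0 event type code, e.g. 's' for successful login" },
    "date": { "type": "string", "format": "date-time" },
    "description": { "type": ["string", "null"] },
    "client_id": { "type": ["string", "null"] },
    "ip": { "type": ["string", "null"] },
    "user_agent": { "type": ["string", "null"] },
    "user_id": { "type": ["string", "null"] }
  },
  "required": ["type", "date"],
  "additionalProperties": false
}
```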


Thanks for your response @mike! I’ll get started on option two.

Interesting! The adapters do currently live in the common-enrich codebase, and the docs encourage contributions. It would be cool if it were easier to contribute a (pluggable) adapter.

For anyone who comes across this topic: I wrote a Medium story on this deliberation and on how I ended up ingesting these “unsupported” log events. If you’re in this topic, you may be interested in the post too, so I’m dropping the link here.


@JonMerlevede this post is really awesome! Thanks so much for writing it and posting here. :slight_smile:

Ingesting custom events into Snowplow is not as easy as it perhaps should be.

I think we agree. In fact, we have been discussing this topic recently, and just last week we had a workshop to explore what options we might have to solve this pain point.

It’s not my place to guess at what lands on the roadmap, but I am confident in saying that we take this topic seriously and there’s a desire to innovate here.

I’ve highlighted your post to the relevant teams in case we might take some inspiration from it. Thanks again for going the extra step and writing it up. :slight_smile:
