Ideas around Iglu / schema store


#1

Thought of an interesting idea for Iglu Server: a built-in notification system and/or an event that triggers notifications on schema changes/updates.

Example:

  1. A user updates a schema and pushes it to dev. Other teams are reading off that stream and doing X with it, so an event/email/etc. goes out to all subscribers of the stream saying “X schema changed: {payload}”.
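
To make the example above a bit more concrete, here is a rough sketch of the subscriber fan-out. Everything in it is hypothetical (`SchemaChanged`, `SchemaNotifier`, the `com.acme` schema URI and the stream name are made up for illustration and are not part of Iglu Server); a real version would deliver emails or events rather than call in-process callbacks.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SchemaChanged:
    """Hypothetical event describing a schema update."""
    schema_uri: str   # e.g. "iglu:com.acme/checkout/jsonschema/1-0-1"
    environment: str  # e.g. "dev"
    payload: dict     # the new schema body


class SchemaNotifier:
    """Hypothetical in-memory registry of subscribers, keyed by stream name."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, stream: str, callback: Callable[[str], None]) -> None:
        self._subscribers[stream].append(callback)

    def publish(self, stream: str, event: SchemaChanged) -> int:
        """Send the change message to every subscriber; return how many were notified."""
        message = (
            f"{event.schema_uri} schema changed in {event.environment}: {event.payload}"
        )
        callbacks = self._subscribers.get(stream, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

Usage would be something like `notifier.subscribe("checkout", send_email)` for each downstream team, then one `notifier.publish("checkout", event)` call whenever a schema is pushed.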

On another note, Yelp had an interesting schema store setup that tracks changes and pushes notifications downstream to update things like Redshift schemas based on incoming schema changes (ref: https://github.com/Yelp/schematizer).

Interested in the community’s thoughts on the above!


#2

Of course, if you store your schemas in S3, you could always use S3 on-write events and SES emails or something. Thinking out loud mostly :slight_smile:
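
Roughly what I have in mind, as a sketch only: a Lambda-style handler that reads the S3 event notification and turns each written schema into an email subject/body. The bucket name, key layout and the `send` hook are assumptions for illustration; actually sending through SES (for example with boto3) is left out so the snippet stays self-contained.

```python
from urllib.parse import unquote_plus


def schema_changes_from_s3_event(event: dict) -> list:
    """Extract bucket, key and event name from each record of an S3 notification."""
    changes = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        changes.append({
            "bucket": s3["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 event notifications.
            "key": unquote_plus(s3["object"]["key"]),
            "event": record["eventName"],
        })
    return changes


def build_email(change: dict) -> dict:
    """Build a subject/body pair; delivery (e.g. via SES) is up to the caller."""
    return {
        "subject": f"Schema changed: {change['key']}",
        "body": f"{change['event']} on s3://{change['bucket']}/{change['key']}",
    }


def handler(event, context=None, send=print):
    """Lambda-style entry point; `send` is a hypothetical hook for the mailer."""
    emails = [build_email(change) for change in schema_changes_from_s3_event(event)]
    for email in emails:
        send(email)
    return emails
```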


#3

I really like this idea.

What’s the best way of doing the notifications in a reasonably platform-agnostic way? I was thinking it’d be nice if Iglu Server POSTed to a customisable webhook on schema changes, but that doesn’t easily solve notifications. Is the meta solution to push schema change events into the pipeline itself, so a consumer can read/query those events?
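
For the webhook idea, something along these lines is what I’m picturing. This is a sketch using only the Python standard library; the JSON body shape (`schema`, `change`) and the function names are my own assumptions, not anything Iglu Server defines today.

```python
import json
import urllib.request


def build_webhook_request(
    webhook_url: str, schema_uri: str, change: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing a schema change."""
    body = json.dumps({"schema": schema_uri, "change": change}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(webhook_url: str, schema_uri: str, change: str, timeout: float = 5.0) -> int:
    """POST the change to the configured webhook and return the HTTP status code."""
    request = build_webhook_request(webhook_url, schema_uri, change)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the request construction separate from the send makes the payload easy to inspect or log without hitting the network.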


#4

I like it! It will be especially useful in conjunction with our planned schema-inference mechanism. @13scoobie feel free to submit an issue to the Iglu repository.

@mike are there any specific types of notification you’re thinking about? I think a POST webhook is general enough to be used as a notification approach.


#5

I think a POST webhook is probably sufficient. SNS is a little too platform-specific, and webhooks mean that people can always integrate with Zapier or some other IFTTT-style tool to trigger notifications for other systems.


#6

@anton I’d be interested to know more about what you have planned for schema-inference - is this documented anywhere?


#7

Sorry @acgray, nothing is documented yet and it is at an early blueprinting stage, but long story short, there’s an effort to allow data payloads to have unversioned schemas, which will either be matched against an existing version or derived (and published) automatically based on SchemaVer rules and differences in the data payload.
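
To give a flavour of the SchemaVer part only: versions have the form MODEL-REVISION-ADDITION, and the kind of change decides which component is bumped. The sketch below is purely illustrative and not the planned implementation; `bump_schemaver` is a hypothetical helper, and how the mechanism would classify a payload difference as a model, revision or addition change is exactly the undecided part, so here it is simply passed in.

```python
def bump_schemaver(version: str, change: str) -> str:
    """Return the next SchemaVer (MODEL-REVISION-ADDITION) for a given change kind.

    `change` is assumed to be classified already as one of:
      "model"    - breaking change, incompatible with all historical data
      "revision" - change that may be incompatible with some historical data
      "addition" - change compatible with all historical data
    """
    model, revision, addition = (int(part) for part in version.split("-"))
    if change == "model":
        return f"{model + 1}-0-0"
    if change == "revision":
        return f"{model}-{revision + 1}-0"
    if change == "addition":
        return f"{model}-{revision}-{addition + 1}"
    raise ValueError(f"unknown change kind: {change!r}")
```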


#8

Will do, thanks guys! And agreed: a webhook would allow for multi-platform support vs. locking down to a specific vendor.

EDIT: issue opened - https://github.com/snowplow/iglu/issues/260