Are there best-practice guidelines for maintaining a catalogue of the event models in use in a particular deployment?
The way I see it, machines and developers are reasonably well served by Iglu repositories hosting JSON Schemas and their various derivatives (JSONPaths and Redshift DDL), groomed into a repository index. Has anyone attempted to use this as a basis for human-readable event model documentation: rendering the JSON Schemas into documents, with room for manual input from the model designer to explain the reasoning behind captured attributes, record version change logs, and add other information that makes downstream data analysis more thoughtful?
I, for one, would want to explain to the analysts why certain elements are captured, why field lengths were limited, how the raw events should be interpreted, which elements make it into the database, which fields are expected to have high cardinality, and so on.
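To illustrate the kind of tooling I have in mind, here is a minimal sketch that renders a self-describing (Iglu-style) JSON Schema, together with a hand-maintained annotations object, into a Markdown document. The schema, the annotation structure, and all field names are hypothetical examples of mine, not a Snowplow convention:

```python
# Hypothetical example: a self-describing (Iglu-style) JSON Schema,
# inlined here rather than fetched from an Iglu repository.
SCHEMA = {
    "self": {
        "vendor": "com.example",
        "name": "checkout_started",
        "format": "jsonschema",
        "version": "1-0-0",
    },
    "description": "Fired when a user begins checkout.",
    "type": "object",
    "properties": {
        "cartValue": {"type": "number", "description": "Cart total in cents."},
        "couponCode": {"type": "string", "maxLength": 16,
                       "description": "Optional promo code."},
    },
    "required": ["cartValue"],
}

# Designer-maintained notes kept alongside the schema; the keys and
# structure here are an assumption about how one might organise them.
ANNOTATIONS = {
    "rationale": {
        "couponCode": "Capped at 16 chars to match the promo service; "
                      "expect low cardinality.",
    },
    "changelog": ["1-0-0: initial model."],
}

def render_markdown(schema, notes):
    """Render a schema plus designer notes into a human-readable Markdown doc."""
    meta = schema["self"]
    lines = [
        f"# {meta['vendor']}/{meta['name']} {meta['version']}", "",
        schema.get("description", ""), "",
        "| Field | Type | Required | Constraints | Description | Rationale |",
        "|---|---|---|---|---|---|",
    ]
    required = set(schema.get("required", []))
    for field, spec in schema["properties"].items():
        # Anything beyond type/description is treated as a constraint.
        constraints = ", ".join(f"{k}={v}" for k, v in spec.items()
                                if k not in ("type", "description"))
        lines.append("| {} | {} | {} | {} | {} | {} |".format(
            field,
            spec.get("type", ""),
            "yes" if field in required else "no",
            constraints or "-",
            spec.get("description", ""),
            notes["rationale"].get(field, "-"),
        ))
    lines += ["", "## Changelog"] + [f"- {entry}" for entry in notes["changelog"]]
    return "\n".join(lines)

print(render_markdown(SCHEMA, ANNOTATIONS))
```

In practice the schemas would come from the Iglu repository index rather than being inlined, and the annotations would live wherever the model designer maintains them; the point is only that merging the two into a rendered document seems mechanically straightforward.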
In previous engagements, unrelated to Snowplow, we made extensive use of Atlassian Confluence templates to standardize information gathering, event model documentation, and robust cataloguing. Any pointers on how to achieve similar results with Snowplow assets?