BigQuery Web Model v1 Released

Colm · January 20, 2021, 4:55pm

We are very excited to announce the release of BigQuery Web model v1. This is the second in a series of releases intended to address a hugely important need for Snowplow - extensible, scalable, incremental data modeling.

Improving the Modeling Experience

As described in the Redshift v1 release, these models aim to solve the key problems of modeling Snowplow data by providing Snowplow-maintained incremental logic, and by allowing users to customise their own logic in a more maintainable and more straightforward way.

What the new model brings

The v1 release of the web model is designed to implement a SQL-as-software structure:

We establish core modules which can be thought of as source code
Each module has an explicit input and output (each module also has side-effects - this is unavoidable)
Each module has an ‘entry point’ for custom logic, which can be treated as a plugin
Each module is testable in isolation
Tests can be extended to custom modules
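To make the shape of this structure concrete, here is a minimal Python sketch of the idea rather than the model's actual SQL. The `Module` class, the `page_views_core` step and the `add_is_blog` plugin are all hypothetical names invented for illustration; the real modules are SQL playbooks and are not organised as Python objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


@dataclass
class Module:
    """Hypothetical module: explicit input and output, plus a plugin hook."""
    name: str
    core: Step                                         # maintained "source code"
    custom: List[Step] = field(default_factory=list)   # user-supplied plugins

    def run(self, rows: List[Row]) -> List[Row]:
        out = self.core(rows)
        for step in self.custom:                       # entry point for custom logic
            out = step(out)
        return out


def page_views_core(events: List[Row]) -> List[Row]:
    # Core logic (illustrative only): keep page view events.
    return [e for e in events if e["event_name"] == "page_view"]


def add_is_blog(rows: List[Row]) -> List[Row]:
    # Custom plugin (illustrative only): add a column without editing core logic.
    return [{**r, "is_blog": str(r["page_url"]).startswith("/blog")} for r in rows]


page_views = Module("page_views", page_views_core, custom=[add_is_blog])

events = [
    {"event_name": "page_view", "page_url": "/blog/hello"},
    {"event_name": "page_ping", "page_url": "/blog/hello"},
    {"event_name": "page_view", "page_url": "/pricing"},
]
result = page_views.run(events)
```

Because the core step and the plugin are separate functions with plain inputs and outputs, each can be tested in isolation, and the same tests can be extended to cover custom steps.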

This structure allows us to segregate the ‘heavy lifting’ of an incremental Snowplow model by extracting the incremental logic into its own ‘base’ module. The base module produces a table which contains only the events relevant to this run of the incremental logic - both new events and existing events that require recomputing because related events have since arrived (think of a late-arriving page ping event).

This removes the complexity from customisation - all subsequent logic can operate on this input as if it were a simple drop-and-recompute model, while the model’s structure ensures an efficient incremental update. This means that the end user need only be concerned with the aggregation logic they care about, rather than expending effort on how to make that logic work within a complex structure.
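The pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the model's implementation: the function names, the `session_id` and `event_id` keys, and the choice of sessions as the unit of recomputation are all hypothetical, and the real logic is written in BigQuery SQL.

```python
from collections import defaultdict
from typing import Dict, List, Set

Event = Dict[str, str]


def events_this_run(all_events: List[Event], new_event_ids: Set[str]) -> List[Event]:
    """Hypothetical 'base' step: every event belonging to a session touched by new events."""
    touched = {e["session_id"] for e in all_events if e["event_id"] in new_event_ids}
    return [e for e in all_events if e["session_id"] in touched]


def aggregate_sessions(events: List[Event]) -> Dict[str, int]:
    """User-facing logic: a plain recompute over whatever input it is given."""
    counts: Dict[str, int] = defaultdict(int)
    for e in events:
        counts[e["session_id"]] += 1
    return dict(counts)


def incremental_update(derived: Dict[str, int], all_events: List[Event],
                       new_event_ids: Set[str]) -> Dict[str, int]:
    """Recompute only the touched sessions and overwrite them in the derived table."""
    subset = events_this_run(all_events, new_event_ids)
    updated = dict(derived)
    updated.update(aggregate_sessions(subset))
    return updated


all_events = [
    {"event_id": "e1", "session_id": "s1"},
    {"event_id": "e2", "session_id": "s1"},
    {"event_id": "e3", "session_id": "s2"},
    {"event_id": "e4", "session_id": "s1"},  # late-arriving page ping for s1
    {"event_id": "e5", "session_id": "s3"},  # first event of a new session
]
derived_before = {"s1": 2, "s2": 1}
derived_after = incremental_update(derived_before, all_events, {"e4", "e5"})
```

Here `aggregate_sessions` knows nothing about incremental state: it recomputes from scratch over its input, and only sessions `s1` and `s3` are reprocessed while `s2` is left untouched.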

Additional features introduced

Users can take advantage of the commit_table stored procedure to create and update custom tables, without needing to manage table definitions or migrations
Tests improved
Helper scripts improved
Configs introduced (Snowplow Insights customers can use configs directly on Orchestration - Open Source users should instrument their own dependency management)
Script functionality introduced to use configs to produce ‘pure’ SQL files, for those who wish to run the models using a tool other than SQL-Runner.

More information

Check out the v1 README of the repo for more detail on the structure.

For a quickstart guide, see the BigQuery README.
