Updated custom schema but the Mutator didn't work

Hi,

My Snowplow pipeline is deployed on GCP, and my Iglu server is a static repo hosted on Google Cloud Storage. Recently I ran into a problem when I wanted to update one of my custom schemas in the Iglu repo. I added a new property to the schema, then uploaded it to GCS and overwrote the existing file, keeping the same version “1-0-0”. I kept the version because a large amount of data is already stored against it, and I didn’t want to lose it. But I found that the BigQuery Mutator would not apply the updated schema to BigQuery. If I SSH into the VM and run `mutator add-column`, it also reports an error. So my current workaround is to manually update the BigQuery table schema through the `bq` command line. I am not sure whether this is good practice, or whether it will cause issues in the future. Is there any suggestion or better practice for this situation?

Best practice is to never modify a schema once it has been deployed to production, and to treat it as immutable. If you need to add new properties, you should use schema versioning and increment the version of your schema (e.g. from 1-0-0 to 1-0-1) - the Mutator will then take care of adding a new column for you (it has no capability to alter existing columns).
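For illustration, here is a minimal sketch of what the bumped schema might look like, assuming a hypothetical vendor `com.acme`, schema name `my_schema`, and made-up property names. You would publish it as a new file at `schemas/com.acme/my_schema/jsonschema/1-0-1` in the static repo bucket, alongside the existing 1-0-0 file rather than overwriting it:

```json
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Example schema; 1-0-1 adds new_property on top of 1-0-0",
  "self": {
    "vendor": "com.acme",
    "name": "my_schema",
    "format": "jsonschema",
    "version": "1-0-1"
  },
  "type": "object",
  "properties": {
    "existing_property": {
      "type": "string"
    },
    "new_property": {
      "type": ["string", "null"]
    }
  },
  "additionalProperties": false
}
```

Making the new property optional and nullable keeps 1-0-1 an ADDITION in SchemaVer terms: data validated against 1-0-0 still validates against 1-0-1.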

Thanks, @mike! Another question: if I add a new schema version, upgrading from 1-0-0 to 1-0-1, and I want to move the data from the 1-0-0 column to the 1-0-1 column in the BigQuery table and then remove the 1-0-0 column, is there a convenient way to do this? And is it a recommended approach?

Rather than moving data between columns, the suggested way is to COALESCE these columns (either at query time or at materialisation), as this preserves the schema version associated with each event.

If you are using a data model, you can just pick the first non-null value.
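As a sketch of what that can look like at query time (reusing the hypothetical `com.acme`/`my_schema` event from above, with an assumed property `my_property` present in both versions and a placeholder table name), you coalesce field by field, since the 1-0-0 and 1-0-1 RECORD columns have different structures and can't be coalesced as whole structs:

```sql
-- Read my_property from whichever schema version the event was sent with,
-- preferring the newer 1-0-1 column and falling back to 1-0-0.
SELECT
  event_id,
  COALESCE(
    unstruct_event_com_acme_my_schema_1_0_1.my_property,
    unstruct_event_com_acme_my_schema_1_0_0.my_property
  ) AS my_property
FROM `my-project.my_dataset.events`;
```

This leaves both columns in place, so each event keeps a record of the schema version it was actually validated against, and nothing destructive happens to the table.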