Schema Violations error

Hi Snowplowers,

I am currently hosting the Snowplow pipeline on GCP, with GTM as the tracker and BigQuery as the final storage. Today I ran into something strange. When I triggered a custom event (item_view) on our website, it never arrived in BigQuery. I found the data in the badrows table, where it was classified as schema_violations. I have also checked my schema repository hosted on Google Cloud Storage, and the schema and the Iglu resolver look fine, since other item_view events are received. The failure information is below.

{
  "timestamp": "2022-01-26T18:44:26.547215Z",
  "messages": [
    {
      "schemaKey": "iglu:com.xxx/item_view/jsonschema/1-0-1",
      "error": {
        "error": "ResolutionError",
        "lookupHistory": [
          {
            "repository": "Iglu Central",
            "errors": [
              {
                "error": "RepoFailure",
                "message": "Unexpected exception fetching: org.http4s.client.UnexpectedStatus: unexpected HTTP status: 404 Not Found"
              }
            ],
            "attempts": 33,
            "lastAttempt": "2022-01-26T18:44:26.546Z"
          },
          {
            "repository": "Iglu Client Embedded",
            "errors": [{ "error": "NotFound" }],
            "attempts": 1,
            "lastAttempt": "2022-01-26T07:14:26.404Z"
          },
          {
            "repository": "XXX Static Repo (HTTP)",
            "errors": [
              {
                "error": "RepoFailure",
                "message": "Unexpected exception fetching: org.http4s.client.UnexpectedStatus: unexpected HTTP status: 403 Forbidden"
              }
            ],
            "attempts": 33,
            "lastAttempt": "2022-01-26T18:44:26.440Z"
          }
        ]
      }
    }
  ]
}

Do you know what causes this error and how to fix it? Thanks for sharing your thoughts.

Another strange thing: when I use the published GTM version, this item_view event is dumped into badrows with the error above. But if I trigger the same item_view event from GTM preview, BigQuery receives it. Do you have any idea about this issue? Thanks!

It sounds like the enricher tried to fetch the schema from your Iglu repository (GCS) and got a 403 response back. It retried multiple times, but I'm guessing it got the same response each time and eventually gave up.
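If it helps to see that per repository, here is a minimal sketch that flattens the lookupHistory of a failure payload shaped like the one you posted. The function name and the trimmed sample are my own illustration, not part of any Snowplow tooling.

```python
def summarize_lookup_history(failure: dict) -> list[dict]:
    """Return one row per (schema, repository) from a schema violation failure."""
    rows = []
    for message in failure.get("messages", []):
        history = message.get("error", {}).get("lookupHistory", [])
        for entry in history:
            # RepoFailure entries carry a "message"; others (e.g. NotFound)
            # only carry the error type, so fall back to that.
            details = [
                err.get("message", err.get("error", ""))
                for err in entry.get("errors", [])
            ]
            rows.append({
                "schema": message.get("schemaKey"),
                "repository": entry.get("repository"),
                "attempts": entry.get("attempts"),
                "details": details,
            })
    return rows


# Trimmed copy of the failure posted above (two of the three repositories).
sample = {
    "messages": [{
        "schemaKey": "iglu:com.xxx/item_view/jsonschema/1-0-1",
        "error": {
            "error": "ResolutionError",
            "lookupHistory": [
                {
                    "repository": "Iglu Client Embedded",
                    "errors": [{"error": "NotFound"}],
                    "attempts": 1,
                },
                {
                    "repository": "XXX Static Repo (HTTP)",
                    "errors": [{
                        "error": "RepoFailure",
                        "message": "Unexpected exception fetching: "
                                   "org.http4s.client.UnexpectedStatus: "
                                   "unexpected HTTP status: 403 Forbidden",
                    }],
                    "attempts": 33,
                },
            ],
        },
    }]
}

for row in summarize_lookup_history(sample):
    print(row["repository"], row["attempts"], row["details"])
```

In your payload the interesting row is the static repo: 33 attempts, all ending in 403 Forbidden. The 404 from Iglu Central is expected for a custom schema.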

Unfortunately, a 403 from GCS can mean quite a few things, including quota issues, billing issues, and clock drift on the enricher (full list here). If you have Cloud Logging turned on, you'll hopefully get the response text, which should give you a bit more detail.
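You could also try requesting the schema yourself to see the response text directly. Below is a rough sketch, assuming your static repo follows the usual Iglu layout of schemas/{vendor}/{name}/{format}/{version} under the repository URL. The helper names are made up for illustration, and the base URL is whatever you have configured in your resolver.

```python
import urllib.error
import urllib.request


def schema_url(repo_base: str, schema_key: str) -> str:
    """Map an Iglu schema key to its path in a static repo (assumed layout)."""
    prefix = "iglu:"
    if not schema_key.startswith(prefix):
        raise ValueError(f"not an Iglu schema key: {schema_key!r}")
    return f"{repo_base.rstrip('/')}/schemas/{schema_key[len(prefix):]}"


def fetch_status(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Return (HTTP status, body text). Error bodies are kept for inspection."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # On a 403 the body usually carries more detail than the status line.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling fetch_status(schema_url(your_repo_url, "iglu:com.xxx/item_view/jsonschema/1-0-1")) from the machine that runs the enricher should show whether the 403 is reproducible there and what the response body says.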
