Having some problems running Snowplow Web Model in dbt Cloud

Hey @angelsk,

It definitely seems like a tough one, but I feel like the underlying cause of the issues is still the dbt bug. As far as I can see from my own internal testing, as well as from the screenshots you provided, the manifest table behaves as expected: snowplow_web_incremental_manifest is only updated when models actually run successfully. When models are cancelled or fail during execution, the manifest is not updated. Since on the next run the web package looks at the earliest date in the manifest table to decide where to start processing from, it should re-process events for the models that ran successfully as well as for the models that failed.

Referring to the manifest table you posted, we can see that snowplow_web_page_view_events successfully processed data up until 2022-03-21 14:03:16.214000, while snowplow_web_users_this_run successfully processed data up until 2022-03-02 19:14:08.991000. What this means is that in the next run the web package will begin processing events from 2022-03-02 19:14:08.991000 for all models, so provided that next run is successful, you will find that all models are updated correctly.
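
To make that concrete, here is a rough Python sketch of the idea (not the package's actual code, which is SQL and Jinja). The `manifest` dictionary and the `next_run_start` helper are made up for illustration, and the sketch ignores any lookback or backfill limits the package may apply on top of this:

```python
from datetime import datetime

# Hypothetical snapshot of the manifest: model name -> last successful timestamp.
# The two timestamps are the ones from the manifest table discussed above.
manifest = {
    "snowplow_web_page_view_events": datetime(2022, 3, 21, 14, 3, 16, 214000),
    "snowplow_web_users_this_run": datetime(2022, 3, 2, 19, 14, 8, 991000),
}


def next_run_start(manifest):
    """Return the earliest last-success timestamp across all models.

    Simplifying assumption: the next run starts from this point for every
    model, so models that are ahead re-process some events and models
    that are behind catch up.
    """
    return min(manifest.values())


start = next_run_start(manifest)
print(start.isoformat(sep=" ", timespec="milliseconds"))
# prints: 2022-03-02 19:14:08.991
```

Because a failed or cancelled model never advances its entry, the minimum stays at the last point where every model was known to be good.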

If you want, we can schedule a call to chat about this and see how the runs pan out in real time, and I can walk you through any other questions you might have! I think this discussion has gotten long enough that I’m a bit lost, so chatting over a Zoom call might be more productive. Let me know if that works for you!

Have a great day,
Emiel

@Emiel that sounds like an excellent idea! You should have my email, let’s have a chat :slight_smile:

I managed to get all the backfilling done by disabling snowplow__enable_yauaa, but I have issues with the final run, the one where there may not be any new events or sessions. I’ve upgraded everything I can, but yeah.

So yeah. Can’t use this in dbt Cloud.