Upgrade vs New Install

Brandon_Kane · March 27, 2019, 8:25pm

Hi all,

We have been successfully running Snowplow for quite a while now in production but have sadly not kept up with upgrades. We are currently on R84.

I’m wondering about the possibility of just spinning up a new installation of R113 and cutting over to it, rather than doing the upgrades piece by piece, version by version.

Has anyone else tried this? What would be the best way to handle the actual cutover?

Any other thoughts on this approach would be appreciated.

Thanks!

Brandon

Colm · March 27, 2019, 10:02pm

Hey @Brandon_Kane,

Spinning up a new pipeline is a common strategy for an upgrade that spans a big version jump. I would send all tracking to both pipelines as far as possible: most trackers support either multiple endpoints or multiple trackers. Here are the docs on this for the JS tracker.

That way you’re covered if something goes wrong, and can QA/Audit the new data vs the old before switching the old one off.
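
A rough sketch of what the dual-tracker setup could look like with the JS tracker's command-style API. The namespaces (`r84`, `r113`), collector hostnames, and `appId` are placeholders rather than anything from this thread, and `initDualTrackers` is a hypothetical wrapper so the calls can be exercised without loading the real tracker. On a real page you would pass in `window.snowplow`.

```javascript
// Sketch only: register one tracker per pipeline, then send the same
// page view to both. Hostnames, namespaces, and appId are made up.
function initDualTrackers(snowplow) {
  // Existing pipeline
  snowplow('newTracker', 'r84', 'collector-old.example.com', {
    appId: 'my-site',
  });
  // New pipeline
  snowplow('newTracker', 'r113', 'collector-new.example.com', {
    appId: 'my-site',
  });
  // Name both trackers explicitly so the page view goes to each of them.
  snowplow('trackPageView:r84;r113');
}

// Stand-in for window.snowplow that just records the queued commands.
const calls = [];
initDualTrackers((...args) => calls.push(args));
console.log(calls.map((c) => c[0]));
```

Naming the trackers in the method call makes it obvious which pipelines receive the event, which should help when one of them is switched off later.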

Hope that helps, let us know how you get on!

jakethomas · March 27, 2019, 10:08pm

I’ll echo @Colm here, and was in the process of typing the same exact thing.

If you’re running an AWS ALB you can do some fun things with target groups, but I would definitely lean towards running multiple trackers (pointed at different collector endpoints) while you validate all tracking works as desired with the new infra.

I’ve typically found that it’s not the version number you’ll have trouble with here; it’s making sure your configuration is equivalent.

robkingston · March 27, 2019, 10:13pm

Depends whether you’re running batch or real-time…

If it’s batch, I think you can get to the version you want in one go, but you’ll need to follow all the steps up to the desired version. We’re almost running the latest version of batch (and our current pipeline has been running since 2013). It’s really easy once you get the hang of it.

Here’s how we manage our upgrades:

1. Document the steps you need to reach the desired version (you have to read the blog posts, but they’re really clear).
2. Make the changes you need in a Snowplow config git repository (just create a new branch for the upgrade for testing/reviewing).
3. Run a test enrichment (keep a bunch of test data in some separate S3 buckets).
4. Deploy the config to your production environment.

Brandon_Kane · March 28, 2019, 1:50pm

Thanks for the great tips everyone!

We are using batch right now but might take this opportunity to move to one of the real time approaches since that aligns better with how we use the data now. The multiple tracker approach seems to be the way to go, since we only use the JS tracker right now and have control over the code that initializes them.

@Colm One question though - It looks from the docs like the trackers are all backwards compatible, so we should be ok to use the latest version of the JS tracker to send data to both our R84 collectors and the R113 collectors. Am I interpreting that correctly?

ihor · March 28, 2019, 3:38pm

@Brandon_Kane, which version of the JS tracker are you using now? From v2.5 through the current release there should be no breaking changes. However, some new approaches have been introduced that you might want to switch to when upgrading.

Brandon_Kane · March 28, 2019, 3:52pm

@ihor - We are on 2.6 currently, so should be fine to continue using the current setup with two different versioned collectors, but we will plan to update to the latest JS tracker as part of the upgrade to take advantage of the new stuff.

Colm · March 28, 2019, 4:16pm

My two cents is that it’s best to play it safe with the stack upgrade just in case.

I would upgrade the stack keeping the 2.6 tracker, do some checks across the two pipelines, then upgrade the tracker when you’ve done due diligence on the pipeline upgrade.

I wouldn’t expect it to happen, but the pragmatist in me says that if you do have a confusing issue to dig into you want the smallest range of potential causes possible.
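
One cheap way to do checks across the two pipelines is to compare event counts per event type for the same time window. The sketch below assumes you have already pulled those counts from each side into plain objects. The function name, the input shape, and the 1% tolerance are all illustrative choices, not anything Snowplow provides.

```javascript
// Sketch: flag event types whose counts differ between two pipelines
// by more than a relative tolerance. Inputs are plain objects such as
// { page_view: 1000 }, produced however you query each pipeline.
function diffCounts(oldCounts, newCounts, tolerance = 0.01) {
  const keys = new Set([
    ...Object.keys(oldCounts),
    ...Object.keys(newCounts),
  ]);
  const mismatches = [];
  for (const key of keys) {
    const oldCount = oldCounts[key] || 0;
    const newCount = newCounts[key] || 0;
    const base = Math.max(oldCount, newCount);
    // Relative difference, measured against the larger of the two counts.
    if (base > 0 && Math.abs(oldCount - newCount) / base > tolerance) {
      mismatches.push({ event: key, oldCount, newCount });
    }
  }
  return mismatches;
}

const report = diffCounts(
  { page_view: 1000, struct: 250 },
  { page_view: 998, struct: 200 }
);
console.log(report);
```

Small differences are expected when two trackers fire independently, which is why the comparison uses a tolerance rather than exact equality.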

ihor · March 28, 2019, 4:46pm

@Brandon_Kane, yes, should be safe to upgrade taking the approach @Colm suggested.

Non-breaking changes to keep in mind:

domainUserId is a UUID from v2.6
trackPageView generates a new ID with each page view from v2.7
trackSelfDescribingEvent is an alias for trackUnstructEvent from v2.7
useLocalStorage and useCookies are deprecated from v2.8. They will fire warnings if still used.
identifyUser is an alias for setUserId from v2.9
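
Before moving past v2.8 it may be worth scanning your tracker init config for the deprecated options listed above. This is a minimal sketch, assuming the options are passed as keys of the config object given to the tracker at initialisation. `findDeprecatedOptions` is a hypothetical helper, not part of the tracker.

```javascript
// Options the list above flags as deprecated from v2.8.
const DEPRECATED_OPTIONS = ['useLocalStorage', 'useCookies'];

// Return the deprecated keys present in a tracker config object.
function findDeprecatedOptions(config) {
  return DEPRECATED_OPTIONS.filter((key) => key in config);
}

// Example config with made-up values.
const found = findDeprecatedOptions({
  appId: 'my-site',
  useCookies: true,
});
console.log(found);
```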

Lots of new features in addition. Below is the list of the release posts that go over the new features.

Brandon_Kane · March 28, 2019, 4:58pm

@Colm and @ihor thanks again for the quick and detailed responses!

Definitely want to do the two steps in isolation. I was really just curious whether to do the tracker upgrade first, before starting the rest of the upgrades, or to stay on the 2.6 tracker until the rest of the work is done and then switch from 2.6 to 2.10 afterwards.

With using the same tracker to send to multiple endpoints, should the cookieName be different between the two or is it best to keep the same cookieName connected to both trackers?

ihor · March 28, 2019, 10:55pm

@Brandon_Kane, I assume you still want to keep track of the same users even after the upgrade. I think that introducing a different cookie name will disrupt the continuity of domain_userid across the different trackers.
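
To keep the cookie name from drifting between the two trackers, one option is to build both configs from a single shared base and check them before initialising. The sketch below uses a hypothetical definition shape; the `cookieName` value, namespaces, and hostnames are placeholders.

```javascript
// Shared settings for both trackers; values are placeholders.
const base = { appId: 'my-site', cookieName: '_sp_' };

// Hypothetical per-pipeline definitions: only the namespace and the
// collector endpoint differ, everything else comes from the shared base.
const trackers = [
  { namespace: 'r84', endpoint: 'collector-old.example.com', config: { ...base } },
  { namespace: 'r113', endpoint: 'collector-new.example.com', config: { ...base } },
];

// True when every tracker definition uses the same cookieName.
function cookieNamesMatch(defs) {
  return new Set(defs.map((d) => d.config.cookieName)).size === 1;
}

console.log(cookieNamesMatch(trackers));
```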
