DEPRECATION NOTICE: Snowplow will stop hosting the Maxmind GeoLite2 database on behalf of users

(!) Snowplow Insights customers
This notice does not apply to our paid customers. We will continue to host the MaxMind database for you in the short term to ensure you experience no service disruption. We will be in touch directly when you need to take action.

Snowplow users can leverage the IP Lookup Enrichment to enrich Snowplow data based on data stored in the different Maxmind databases, including the free GeoLite2 database.

Until now, we have hosted that database for Snowplow users on http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind/GeoLite2-City.mmdb. That means that anyone running Snowplow can enable the enrichment by setting up their enrichment config file as follows:

{
	"schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-0",

	"data": {

		"name": "ip_lookups",
		"vendor": "com.snowplowanalytics.snowplow",
		"enabled": true,
		"parameters": {
			"geo": {
				"database": "GeoLite2-City.mmdb",
				"uri": "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind"
			}
		}
	}
}

Following the new CCPA regulation in California, Maxmind has updated the basis on which this database is made available. Anyone who wants to use the database must now register with Maxmind. (Maxmind is still kindly making the database freely available to registered users.) You can read more about the rationale for those changes on the Maxmind blog.

This means that we must no longer host the database on behalf of our users. We intend to remove the hosted asset by February 14th. As a result, we ask all users of the database to, as a matter of priority:

  1. Follow the steps detailed in the Maxmind blog post to set up an account, generate a license key and download the database.
  2. Upload the database to S3 or GCS, but keep it private.
  3. Update their Snowplow enrichment config file to point to the private asset.

Doing this swiftly will prevent any disruption to your Snowplow pipeline once we stop hosting the asset.
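
As a rough illustration of step 3, the sketch below builds the same enrichment config in Python with the `uri` pointing at a private bucket. The bucket name `my-private-bucket` and the helper `build_ip_lookups_config` are made up for this example; the schema and field names are copied from the config above. Note that `uri` is the folder that holds the file, while `database` is the file name itself.

```python
import json


def build_ip_lookups_config(uri, database="GeoLite2-City.mmdb"):
    """Return the ip_lookups enrichment config as a dict.

    `uri` is the folder containing the database (no file name),
    mirroring how the hosted config splits "uri" and "database".
    """
    return {
        "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-0",
        "data": {
            "name": "ip_lookups",
            "vendor": "com.snowplowanalytics.snowplow",
            "enabled": True,
            "parameters": {
                # Strip any trailing slash so the value has the same shape
                # as the original config, which has none.
                "geo": {"database": database, "uri": uri.rstrip("/")},
            },
        },
    }


# "my-private-bucket" is a placeholder; substitute your own private bucket.
config = build_ip_lookups_config("s3://my-private-bucket/third-party/maxmind/")
print(json.dumps(config, indent=2))
```

The printed JSON can be saved as the enrichment config file. Remember that the running enrichment only picks up the new file once it has been redeployed.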

Question: GeoLite has also changed the formats/names of the free databases.
There are now GeoLite2-ASN and GeoLite2-City.
Are the newer versions of the free databases supported in the IP Lookups enrichment?
It looks like the documentation/config schema does not reflect this.

GeoLite2-City is still supported.

GeoLite2-ASN is not currently supported, as it is a separate ASN-only database, but it may be added to the IP Lookups enrichment in the future.

OK, thanks Mike. I just wanted to be sure, as it looks like there might also have been some schema changes to the GeoLite2-City.mmdb database compared with the older version. I will be setting it up later today.

That’s definitely possible. At the moment the enrichment (scala-maxmind-iplookups) picks up only a predefined list of fields. If new fields are added to the database, the enrichment will still run but won’t add those new fields to the payload.

A quick update to this notice: later than we originally anticipated, we will now remove the hosted Maxmind GeoLite2 database on September 8th.

To prevent events from failing for any users who have not yet switched to self-hosting the database, rather than removing the database altogether we have replaced the file with an empty database. This should mean that, from 2020-09-08, any Snowplow users whose pipeline is still set up to use the hosted database will have no values in any of their geo_ fields.
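
If you are unsure whether your pipeline is affected, a rough check along these lines may help. It is only a sketch: it parses an enrichment config and reports whether the `geo` lookup still points at the Snowplow-hosted location named in the original notice. The function name `uses_snowplow_hosted_db` and the bucket `my-private-bucket` are invented for this example, and the check says nothing about whether the running enrichment has actually been redeployed with the new config.

```python
import json
from urllib.parse import urlparse

# Host of the Snowplow-hosted asset, taken from the URL in the original notice.
SNOWPLOW_HOSTED_HOST = "snowplow-hosted-assets.s3.amazonaws.com"


def uses_snowplow_hosted_db(config_text):
    """Return True if the geo lookup in the config points at the hosted asset."""
    config = json.loads(config_text)
    geo = config["data"]["parameters"].get("geo", {})
    # For both http:// and s3:// URIs, the host or bucket is the netloc part.
    return urlparse(geo.get("uri", "")).netloc == SNOWPLOW_HOSTED_HOST


def make_config(uri):
    """Build a minimal ip_lookups config as JSON text, for demonstration only."""
    return json.dumps({
        "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-0",
        "data": {
            "name": "ip_lookups",
            "vendor": "com.snowplowanalytics.snowplow",
            "enabled": True,
            "parameters": {"geo": {"database": "GeoLite2-City.mmdb", "uri": uri}},
        },
    })


old_config = make_config("http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind")
new_config = make_config("s3://my-private-bucket/third-party/maxmind")  # placeholder bucket

print(uses_snowplow_hosted_db(old_config))
print(uses_snowplow_hosted_db(new_config))
```

A config for which this returns `True` is one that will read the empty database and so produce empty geo_ fields.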

We remind any users who have been using the Maxmind GeoLite2 database to sign up to Maxmind as documented above, check the list of IP addresses for which “Do not sell” requests have been registered, and ensure that those requests are enforced.

As a reminder, this change will not impact any Snowplow Insights customers.

Going forward, we would like to do a better job of proactively informing open source users who opt in about relevant changes to Snowplow technology and the way we host it. If you would like to receive these notices, please log your email address here.

Many thanks,

Yali

@yali

I have followed the above steps, but I am still getting null geo fields.

Any idea what I have done wrong in the config below, or how I can debug this issue?

{
        "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-0",

        "data": {

                "name": "ip_lookups",
                "vendor": "com.snowplowanalytics.snowplow",
                "enabled": true,
                "parameters": {
                        "geo": {
                                "database": "GeoLite2-City.mmdb",
                                "uri": "s3://<myBucket>/third-party/maxmind"
                        }
                }
        }
}

@karan
I am having the same problem. Did you find a solution? Thanks a lot.

@yali
Is this solution still valid? I followed the steps above but I am still getting null geo fields. I would appreciate any help. Thanks in advance.

Solved! It was my mistake: I forgot to redeploy the ECS clusters after making the changes. Thanks for the solution.