IP lookups enrichment: ArrayIndexOutOfBoundsException


#1

We are trying to set up Stream Enrich on AWS (Beijing region). Everything is up and running except the IP lookups enrichment.

Problem

We get an ArrayIndexOutOfBoundsException during IP lookups with MaxMind:

[pool-2-thread-1] INFO com.snowplowanalytics.snowplow.enrich.kinesis.sources.KinesisSource - Processing 1 records from shardId-000000000000
{"line":"CwBkAAAADTg3LjIxMC4xOTkuNTcKAMgAAAFX8xtSzQsA0gAAAAVVVEYtOAsA3AAAABFzc2MtMC43LjAta2luZXNpcwsBLAAAAHlNb3ppbGxhLzUuMCAoTWFjaW50b3NoOyBJbnRlbCBNYWMgT1MgWCAxMF8xMV82KSBBcHBsZVdlYktpdC81MzcuMzYgKEtIVE1MLCBsaWtlIEdlY2tvKSBDaHJvbWUvNTMuMC4yNzg1LjE0MyBTYWZhcmkvNTM3LjM2CwE2AAAAT2h0dHA6Ly9lYzItNTQtMjIzLTUwLTI1MC5jbi1ub3J0aC0xLmNvbXB1dGUuYW1hem9uYXdzLmNvbS5jbjo4MDg1L3Nub3dwbG93Lmh0bWwLAUAAAAACL2kLAUoAAAK5c3RtPTE0NzcyNTI0MzE0ODgmZT1wcCZ1cmw9aHR0cCUzQSUyRiUyRmVjMi01NC0yMjMtNTAtMjUwLmNuLW5vcnRoLTEuY29tcHV0ZS5hbWF6b25hd3MuY29tLmNuJTNBODA4NSUyRnNub3dwbG93Lmh0bWwmcGFnZT1XZWxjb21lJTIwcGFnZSZwcF9taXg9MCZwcF9tYXg9MCZwcF9taXk9MCZwcF9tYXk9MCZ0dj1qcy0yLjYuMSZ0bmE9Y2YmYWlkPXNwX3Rlc3QmcD13ZWImdHo9RXVyb3BlJTJGQmVybGluJmxhbmc9ZW4tVVMmdWE9TW96aWxsYSUyRjUuMCUyMChNYWNpbnRvc2glM0IlMjBJbnRlbCUyME1hYyUyME9TJTIwWCUyMDEwXzExXzYpJTIwQXBwbGVXZWJLaXQlMkY1MzcuMzYlMjAoS0hUTUwlMkMlMjBsaWtlJTIwR2Vja28pJTIwQ2hyb21lJTJGNTMuMC4yNzg1LjE0MyUyMFNhZmFyaSUyRjUzNy4zNiZjcz1VVEYtOCZmX3BkZj0xJmZfcXQ9MCZmX3JlYWxwPTAmZl93bWE9MCZmX2Rpcj0wJmZfZmxhPTEmZl9qYXZhPTAmZl9nZWFycz0wJmZfYWc9MCZyZXM9MTI4MHg4MDAmY2Q9MjQmY29va2llPTEmZWlkPWFjY2Q1ZWU5LWZmMTItNDI2Ny1hMjVkLTJmNGZjYWNlMDkzYSZkdG09MTQ3NzI1MjQzMTQ4NyZ2cD0xMjgweDU5MSZkcz0xMjgweDU5MSZ2aWQ9MSZzaWQ9OTY0MWYxMDktZGVhNC00MDgyLTgwYTItMDQwOTk4NjRhNDQ4JmR1aWQ9OWJkMzJiMDYtNGFlOC00OTYzLTk3OGYtMDk5MmMzNDgyZmE1JmZwPTEzMTMwNzYzNQ8BXgsAAAAIAAAAQEhvc3Q6IGVjMi01NC0yMjMtNTAtMjUwLmNuLW5vcnRoLTEuY29tcHV0ZS5hbWF6b25hd3MuY29tLmNuOjgwODkAAAAWQ29ubmVjdGlvbjoga2VlcC1hbGl2ZQAAAIVVc2VyLUFnZW50OiBNb3ppbGxhLzUuMCAoTWFjaW50b3NoOyBJbnRlbCBNYWMgT1MgWCAxMF8xMV82KSBBcHBsZVdlYktpdC81MzcuMzYgKEtIVE1MLCBsaWtlIEdlY2tvKSBDaHJvbWUvNTMuMC4yNzg1LjE0MyBTYWZhcmkvNTM3LjM2AAAAJkFjY2VwdDogaW1hZ2Uvd2VicCwgaW1hZ2UvKiwgKi8qO3E9MC44AAAAWFJlZmVyZXI6IGh0dHA6Ly9lYzItNTQtMjIzLTUwLTI1MC5jbi1ub3J0aC0xLmNvbXB1dGUuYW1hem9uYXdzLmNvbS5jbjo4MDg1L3Nub3dwbG93Lmh0bWwAAAAkQWNjZXB0LUVuY29kaW5nOiBnemlwLCBkZWZsYXRlLCBzZGNoAAAAN0FjY2VwdC1MYW5ndWFnZTogZW4tVVMsIGVuO3E9MC44LCB6aDtxPTAuNiwgemgtQ047cT0wLjQAAADKQ29va2llOiBjb2xsZWN0b3JDb29raWV
OYW1lPWU3ZmIwYjVhLTdkMGQtNGY0My1hN2E0LTI2M2NkNTUwYzZlMzsgX3NwX2lkLmYwYzM9OWJkMzJiMDYtNGFlOC00OTYzLTk3OGYtMDk5MmMzNDgyZmE1LjE0NzcyNTExNTUuMS4xNDc3MjUyNDMxLjE0NzcyNTExNTUuOTY0MWYxMDktZGVhNC00MDgyLTgwYTItMDQwOTk4NjRhNDQ4OyBfc3Bfc2VzLmYwYzM9KgsBkAAAADVlYzItNTQtMjIzLTUwLTI1MC5jbi1ub3J0aC0xLmNvbXB1dGUuYW1hem9uYXdzLmNvbS5jbgsBmgAAACRlN2ZiMGI1YS03ZDBkLTRmNDMtYTdhNC0yNjNjZDU1MGM2ZTMLemkAAABBaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cvQ29sbGVjdG9yUGF5bG9hZC90aHJpZnQvMS0wLTAA","errors":[{"level":"error","message":"Could not extract geo-location from IP address [87.210.199.57]: [java.lang.ArrayIndexOutOfBoundsException: 2446770]"}],"failure_tstamp":"2016-10-23T19:53:57.902Z"}

The IP address above (87.210.199.57) should be mapped to Amsterdam.

Our enrichments/ip_lookups.json file:

{
    "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/1-0-0",

    "data": {

        "name": "ip_lookups",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
            "geo": {
                "database": "GeoLiteCity.dat",
                "uri": "/usr/share/GeoIP"
            }
        }
    }
}

Note: we downloaded the IP database file and stored it locally on the machine where the enrichment process runs.

What we tried in order to look up this IP address outside of the enrichment:

  1. Used MaxMind’s GeoIP API: the address correctly resolved to Amsterdam.
  2. Used scala-maxmind-iplookups: the address correctly resolved to Amsterdam.

Any ideas/suggestions?


#2

Hi @chelven - /usr/share/GeoIP is a local filesystem path, not a valid URI. Per the IP lookups enrichment docs, use an http:// or s3:// URI here.
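
For illustration, a corrected enrichments/ip_lookups.json might look roughly like the sketch below. It assumes GeoLiteCity.dat has been uploaded to a web server that the enrichment process can reach. The host and path in "uri" are hypothetical placeholders, not a real location, and an s3:// URI pointing at your own bucket should work in the same position. As far as I understand, "uri" names the folder and "database" names the file inside it, so the two are kept separate as in the original config.

```json
{
    "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/1-0-0",
    "data": {
        "name": "ip_lookups",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
            "geo": {
                "database": "GeoLiteCity.dat",
                "uri": "http://example.com/path/to/maxmind"
            }
        }
    }
}
```

Only the "uri" value differs from the config in the first post; the rest is unchanged.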