Adapter_failures - python-requests/2.9.1

Hello,

Here's a curiosity I'm trying to wrap my head around: occasionally I get the following in my Bad-Json stream from the enricher:

{
    "schema": "iglu:com.snowplowanalytics.snowplow.badrows/adapter_failures/jsonschema/1-0-0",
    "data":
    {
        "processor":
        {
            "artifact": "stream-enrich",
            "version": "1.4.2"
        },
        "failure":
        {
            "timestamp": "2021-06-01T00:12:28.275400Z",
            "vendor": ".git",
            "version": "config",
            "messages": [
            {
                "field": "vendor/version",
                "value": ".git/config",
                "expectation": "vendor/version combination is not supported"
            }]
        },
        "payload":
        {
            "vendor": ".git",
            "version": "config",
            "querystring": [
            {
                "name": "n3pc",
                "value": "true"
            }],
            "contentType": null,
            "body": null,
            "collector": "ssc-2.1.1-kinesis",
            "encoding": "UTF-8",
            "hostname": "x",
            "timestamp": "2021-06-01T00:12:10.258Z",
            "ipAddress": "x",
            "useragent": "python-requests/2.9.1",
            "refererUri": null,
            "headers": ["Timeout-Access: <function1>", "Connection: upgrade", "Host: x", "X-Real-Ip: x", "X-Forwarded-For: x, x", "X-Forwarded-Proto: http", "X-Forwarded-Port: 80", "X-Amzn-Trace-Id: Root=1-abc", "Accept: */*", "Accept-Encoding: gzip, deflate", "User-Agent: python-requests/2.9.1"],
            "networkUserId": "007"
        }
    }
}

Is this happening because the user agent isn’t recognised?

The user agent is python-requests, which is weird. Possibly a scraper?

If my assumptions are incorrect, what could it be?

Thanks
Kyle

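Yep - this is almost certainly a scraper / exploit scanner. The failure comes from the request path rather than the user agent: the path is read as vendor/version, so a request for /.git/config ends up with vendor ".git" and version "config", which no adapter supports. You’ll typically find quite a few adapter failures like this, where the path is a probe to check whether a file containing credentials or sensitive data is being served.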

Not much to worry about - but if you want to eliminate them, the easiest option is to put something like a web application firewall / shield in front of the collector, which can reject or IP-ban some of this traffic.
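
If you want to see what is being probed before deciding what to block, a rough sketch like the one below may help. It assumes the bad rows are available as newline-delimited JSON strings shaped like the example in the question; the function names `probed_path` and `summarise` are made up for illustration and are not part of Snowplow.

```python
import json
from collections import Counter

# Schema prefix taken from the bad row in the question; the version suffix is ignored.
ADAPTER_FAILURES_PREFIX = (
    "iglu:com.snowplowanalytics.snowplow.badrows/adapter_failures/"
)


def probed_path(bad_row):
    """Return '/vendor/version' for an adapter_failures bad row, else None."""
    if not bad_row.get("schema", "").startswith(ADAPTER_FAILURES_PREFIX):
        return None
    payload = bad_row.get("data", {}).get("payload", {})
    vendor = payload.get("vendor")
    version = payload.get("version")
    if vendor is None or version is None:
        return None
    return f"/{vendor}/{version}"


def summarise(lines):
    """Count (path, useragent) pairs across an iterable of JSON bad-row strings."""
    counts = Counter()
    for line in lines:
        row = json.loads(line)
        path = probed_path(row)
        if path is not None:
            useragent = row["data"]["payload"].get("useragent")
            counts[(path, useragent)] += 1
    return counts
```

Running `summarise` over a day of bad rows and looking at `counts.most_common(20)` should give a quick picture of which paths and clients are worth a firewall rule.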


Thanks @mike