Python Analytics SDK


I’m working with the Python Analytics SDK, trying to read data from Google Cloud Storage (written by the Storage Loader).
However, the event parsing fails for me on:

return jsonify_good_event(line.split('\t'), known_fields, add_geolocation_data)
AttributeError: 'dict' object has no attribute 'split'

I’m not quite sure how to handle this or what the expected input format is, as I’m passing in each TSV row as a string. Do I have to take care of transformations beforehand?

This is my code:

import snowplow_analytics_sdk.event_transformer
import snowplow_analytics_sdk.snowplow_event_transformation_exception

def snowplowTsv2Json(content):
    """Use the Snowplow SDK to generate a JSON object from TSV files
         content: File retrieved from GCS
    """
    snowplowEvents = []

    with open(content, encoding='utf-8') as tsvfile:
        reader = [line.rstrip('\n') for line in tsvfile]
        #reader = csv.DictReader(tsvfile, dialect='excel-tab')
        print('First row of reader: {}, Type of reader var {}'.format(reader[0], type(reader)))
        for row in reader:
            print('Iterator row: {}'.format(str(row)))
            try:
                jsonRow = snowplow_analytics_sdk.event_transformer.transform(
                    snowplow_analytics_sdk.event_transformer.transform(row))
                print('Json row: {}'.format(jsonRow))
            except snowplow_analytics_sdk.snowplow_event_transformation_exception.SnowplowEventTransformationException as e:
                for error_message in e.error_messages:
                    print('Error in snowplowTsv2Json: {}'.format(error_message))
        return snowplowEvents

Do you have any idea on how to progress? I’ve tried various options of feeding the TSV rows, but none worked…

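It looks like you’re calling transform twice? The first call already returns a dict, so the second call ends up trying to split that dict as if it were a TSV string, which is where the AttributeError comes from. As far as I’m aware, you should only need to transform the row once and then you’ll have the JSON object that you’re after.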
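To illustrate why the second call fails, here is a minimal sketch that runs without the SDK installed. It assumes the transform takes one TSV string and returns a dict; `stand_in_transform`, `transform_rows` and the two-column rows are hypothetical and only mimic that shape, they are not part of snowplow_analytics_sdk.

```python
def stand_in_transform(line):
    """Hypothetical stand-in for the SDK's transform: TSV string in, dict out."""
    fields = line.split('\t')
    return {'app_id': fields[0], 'platform': fields[1]}


def transform_rows(lines, transform):
    """Call `transform` exactly once per TSV line and collect the resulting dicts."""
    events = []
    for line in lines:
        events.append(transform(line.rstrip('\n')))
    return events


# Made-up rows; a real enriched event has many more tab-separated fields.
rows = ['shop\tweb\n', 'blog\tmob\n']
events = transform_rows(rows, stand_in_transform)
print(events)

# Feeding the result back into the transform reproduces the reported error,
# because the second call receives a dict rather than a string.
try:
    stand_in_transform(stand_in_transform(rows[0]))
except AttributeError as err:
    double_call_error = str(err)
    print(double_call_error)
```

With the real SDK, passing `snowplow_analytics_sdk.event_transformer.transform` as the `transform` argument should give the same one-call-per-row behaviour, and the results can then be appended to `snowplowEvents`.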

Oh my! Thank you @PaulBoocock! I was trying to fix the input while the problem was somewhere else… Desperately waiting for the weekend!

