Hi all,
What is the best approach to read the good data in S3 from .tsv.gz files?
I would say that’s going to be pretty dependent on:
What would you like to do with the data?
How large are the keys?
How many keys are you planning on reading at a time?
How are the keys partitioned on S3?
My use case is data analysis: instead of saving to Postgres, we want to use Athena for that.
I don't know the size or read volume yet, because we are just starting to track our application.
There’s a guide here: Using AWS Athena to query the 'good' bucket on S3
It was written quite a while ago, but should get you most of the way if not all the way there.
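If you only need to peek at a single good-bucket file rather than query it with Athena, a minimal Python sketch for decompressing and parsing one .tsv.gz object may help. The bucket/key names and the boto3 call mentioned in the comment are assumptions for illustration, not from this thread:

```python
import csv
import gzip


def read_tsv_gz(stream):
    """Yield rows (lists of strings) from a gzipped TSV byte stream.

    In practice the stream would come from S3, e.g.
    boto3.client("s3").get_object(Bucket="my-good-bucket", Key=key)["Body"]
    (bucket and key here are hypothetical placeholders).
    """
    # gzip.open in text mode decompresses on the fly; csv.reader with a
    # tab delimiter handles quoting/escaping better than str.split("\t").
    with gzip.open(stream, mode="rt", newline="") as text:
        yield from csv.reader(text, delimiter="\t")
```

For example, reading an in-memory gzipped TSV (standing in for an S3 object body):

```python
import io

buf = io.BytesIO()
with gzip.open(buf, mode="wt") as f:
    f.write("app_id\tevent\nmy-app\tpage_view\n")
buf.seek(0)

rows = list(read_tsv_gz(buf))
# rows[0] is the header: ["app_id", "event"]
```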