Questions regarding data modeling & analysis

radubogdan · January 20, 2017, 11:49am

Hi,

I’m trying to get a basic batch pipeline with a Clojure collector going, and I reached the point where data is being imported to the atomic.events table in Redshift.

Now I’m trying to wrap my head around the data modeling / analysis part, and I have some questions about things that are unclear to me:

The link 5-data-modeling/sql-runner/redshift in the guide getting-started-with-data-modeling is dead. I guess it should point to 5-data-modeling/web-model/redshift instead, right?
I’m not sure I understand the difference in purpose between 5-data-modeling/web-model/redshift and 5-data-modeling/web-model/sql-runner.
When is the so-called sql-runner meant to be used? Should it be installed separately from EmrEtlRunner?
The analysis section recommends setting up some prebuilt views (Setting-up-the-prebuilt-views-in-Redshift-and-PostgreSQL). How do they differ from 5-data-modeling/web-model/redshift? Do you need one or both?

Thanks!

NielsKSchjoedt · January 25, 2017, 10:47am

+1

travisdevitt · January 25, 2017, 5:34pm

sql-runner is a separate application that executes your data modeling SQL queries in a specified order (so you don’t have to run each query manually every day/hour). It has its own config file, which specifies the queries to run, the order in which to run them, and the database to run them against.
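
To illustrate the idea only (this is not sql-runner's actual code or config format), here is a minimal Python sketch that runs a list of named SQL steps in a fixed order and stops at the first failure. The step names, the `run_steps` helper, and the in-memory SQLite database standing in for Redshift are all assumptions made for the example.

```python
import sqlite3

# Hypothetical ordered list of modeling steps: (name, SQL).
# sql-runner reads its steps from its own config file; this list only stands in for that.
STEPS = [
    ("create_events", "CREATE TABLE events (user_id TEXT, event TEXT)"),
    (
        "load_events",
        "INSERT INTO events VALUES "
        "('u1', 'page_view'), ('u1', 'page_view'), ('u2', 'page_view')",
    ),
    (
        "build_user_counts",
        "CREATE TABLE user_counts AS "
        "SELECT user_id, COUNT(*) AS n_events FROM events GROUP BY user_id",
    ),
]


def run_steps(conn, steps):
    """Run each SQL step in order; raise on the first step that fails."""
    completed = []
    for name, sql in steps:
        try:
            conn.execute(sql)
        except sqlite3.Error as exc:
            conn.rollback()
            raise RuntimeError(f"step {name!r} failed: {exc}") from exc
        completed.append(name)
    conn.commit()
    return completed


conn = sqlite3.connect(":memory:")  # stand-in for the Redshift connection
completed = run_steps(conn, STEPS)
rows = conn.execute(
    "SELECT user_id, n_events FROM user_counts ORDER BY user_id"
).fetchall()
print(completed)
print(rows)
```

Scheduling the script (for example from cron) is what removes the need to run each query by hand.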

The data-modeling/…/redshift folders provide example SQL queries for creating basic data models (higher-level aggregate tables) in Redshift. I imagine a lot of people customize them (as we do) so that they can build in their own business logic. The idea is to build these higher-level aggregate tables so you don’t have to query atomic.events directly with long, complicated queries every time you need to answer a basic business question.
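
As a rough sketch of what such an aggregate table might look like (this is not the SQL from the web-model folder), the Python snippet below builds a one-row-per-session table from event-level rows. The `events` table is a heavily simplified stand-in for atomic.events, the `sessions` table and its columns are hypothetical, and SQLite is used in place of Redshift so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for atomic.events (the real table has many more columns).
conn.execute(
    "CREATE TABLE events (domain_userid TEXT, domain_sessionidx INTEGER, "
    "collector_tstamp TEXT, event TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("u1", 1, "2017-01-20 11:00:00", "page_view"),
        ("u1", 1, "2017-01-20 11:05:00", "page_view"),
        ("u1", 2, "2017-01-21 09:00:00", "page_view"),
        ("u2", 1, "2017-01-20 12:00:00", "page_view"),
    ],
)

# Hypothetical aggregate table: one row per user session.
conn.execute(
    """
    CREATE TABLE sessions AS
    SELECT domain_userid,
           domain_sessionidx,
           MIN(collector_tstamp) AS session_start,
           MAX(collector_tstamp) AS session_end,
           SUM(CASE WHEN event = 'page_view' THEN 1 ELSE 0 END) AS page_views
    FROM events
    GROUP BY domain_userid, domain_sessionidx
    """
)

# A basic business question becomes a short query on the aggregate table.
sessions_per_user = conn.execute(
    "SELECT domain_userid, COUNT(*) FROM sessions "
    "GROUP BY domain_userid ORDER BY domain_userid"
).fetchall()
print(sessions_per_user)
```

Your own business logic (what counts as a session, which events matter) would go into the `CREATE TABLE ... AS SELECT` step.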
