Cron Job for emr-etl and snowflake data

I’m not really familiar with cron jobs, but I’d like to schedule the following:

  • run the snowplow-emr-etl-runner
  • run the Snowflake loader. Today I run this as command-line data flow tasks.
  • run a series of SQL scripts against Snowflake.

I’m wondering if I should create a shell script that gets started by the cron job. The shell script would ensure the steps run in sequence. Any thoughts? Any scripts that you may have would be greatly appreciated.

@sonnypolaris, that is exactly how we manage the pipelines for our clients at the moment. To make the steps easier to schedule and organize, we also use our own open-source tools: Factotum (wrapping EmrEtlRunner), Dataflow Runner (wrapping the Snowflake transformer and loader and/or EmrEtlRunner), and SQL Runner (to run data models on data in Redshift, Snowflake, or BigQuery).
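
If you start with plain cron, a wrapper along these lines may be enough to enforce the order of the three steps. This is only a minimal sketch, written in Python rather than shell. The step names and the `echo` commands in `STEPS` are hypothetical placeholders, not the real EmrEtlRunner, loader, or SQL invocations, so substitute the commands you already run by hand. Each step starts only if the previous one exited with status 0.

```python
#!/usr/bin/env python3
"""Run pipeline steps in order and stop at the first failure."""
import subprocess
import sys

# Hypothetical placeholders: replace each argv list with the command
# you currently run manually for that step.
STEPS = [
    ("emr-etl-runner", ["echo", "run snowplow-emr-etl-runner here"]),
    ("snowflake-loader", ["echo", "run the Snowflake loader here"]),
    ("sql-scripts", ["echo", "run the SQL scripts here"]),
]


def run_steps(steps):
    """Run each (name, argv) pair in order.

    Returns (completed_names, exit_code). Stops at the first step that
    exits non-zero, so later steps never run on top of a failed one.
    """
    completed = []
    for name, argv in steps:
        print(f"starting {name}", flush=True)
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(f"{name} failed with exit code {result.returncode}",
                  file=sys.stderr)
            return completed, result.returncode
        completed.append(name)
    return completed, 0


if __name__ == "__main__":
    _, exit_code = run_steps(STEPS)
    if exit_code != 0:
        # A non-zero exit status makes the failure visible to whatever
        # watches the cron job.
        sys.exit(exit_code)
```

Assuming the script is saved at a path such as /path/to/run_pipeline.py (a made-up location), a crontab entry like `0 2 * * * /usr/bin/python3 /path/to/run_pipeline.py >> /path/to/pipeline.log 2>&1` would run it daily at 02:00 and keep the output in a log file. Note that cron itself will not stop a new run from starting while the previous one is still going, so pick an interval comfortably longer than a full run.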