We would like to switch our storage from Redshift to ClickHouse.
From the user's perspective, the main difference between these two storage backends is the table count: there will be one wide table in ClickHouse, while Redshift has multiple custom tables.
From the Snowplow pipeline's perspective, this means we need to skip the shredding step but keep other functionality such as deduplication.
Our current solution is to load the data into Redshift, export it in Parquet format, then JOIN the exported tables and upload the result to ClickHouse. This workflow is too complicated, so I'm looking for an alternative.
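To make the JOIN step concrete, here is a minimal sketch of what we do today with pandas. All table and column names are hypothetical; in practice the frames come from the Parquet files exported from Redshift:

```python
import pandas as pd

# Stand-ins for the Parquet exports; in practice these would be
# pd.read_parquet(...) calls on the files exported from Redshift.
# Table and column names here are hypothetical.
events = pd.DataFrame({
    "event_id": ["e1", "e2", "e2"],  # e2 is a duplicate
    "user_id": ["u1", "u2", "u2"],
})
contexts = pd.DataFrame({
    "event_id": ["e1", "e2"],
    "page_url": ["/home", "/checkout"],
})

# Deduplicate by event_id (the functionality we want to keep),
# then join the shredded tables back into one wide table.
wide = (
    events.drop_duplicates(subset="event_id")
          .merge(contexts, on="event_id", how="left")
)
print(wide)
```

The resulting wide frame is then inserted into ClickHouse (for example via a client library such as clickhouse-connect), which is the part we would like to avoid maintaining ourselves.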
Is there currently any way to load data into ClickHouse directly, or should we build our own loader for ClickHouse?