I totally agree with @robkingston about starting to gain traction and thorough understanding of your data with a lower-cost solution.
At O’Reilly, we used Pentaho sitting on top of a number of cubes in Postgres/Redshift. It was relatively expensive, inflexible, and well… cubes were valuable in Kimball land before columnar db’s came into their own and storage became cheap. Wagon was heavily used at O’Reilly/Safari for collaborative sql, and worked incredibly well. PopSQL has become the closest post-Wagon-acquisition analog.
At Wanderu we used Metabase sitting on top of Redshift with good success for a couple of years (at an extremently low operational cost). Eventually the company outgrew the system and moved onto Looker (where the snowplow block is easily implemented as @mjensen mentioned above). During that time, the knowledge of internal data assets and shared understanding of how it was generated/loaded/represented was enormously valuable to the organization. Metabase just doesn’t have some of the functionality that Looker does, and Looker is definitely a common solution as BI/ analytics teams mature. But then again, Looker has a very hefty price tag.
Currently the team I work on at CarGurus uses Looker very heavily, sitting on top of a pretty large Snowflake database. A small team of application/ bi engineers builds out the Looker blocks/models, and a large number of analysts are free to explore the data at depth, with no prior SQL experience.
I’ve also helped companies like Friday Feedback and GetAbstract with analytics, and Mode/ Periscope have worked well.
For pricing context, Metabase cost < $50/month to run and reliability was very good. It doesn’t have a lot of the functionality that Looker does, but until you’ve reached a point of in-depth knowledge of your data it would be hard to justify >100k annually for a UI. Periscope/Mode are awesome lower-priced options for small to medium-sized teams (and you don’t want to maintain the system yourself).
With all this being said, what I’ve found is the following:
- There is no perfect solution, and the gripes about various systems are almost always the same: “inflexible for pure exploration”, “doesn’t let me do what d3js.org shows on their site”, “too expensive”, “slow” (b/c a lot of data is still a lot of data, and a columnar db is still a columnar db), etc.
- Having a solid awareness of the tradeoffs, and answering those before picking one solution over another is probably the most important part. Want to optimize for speed of implementation and a low cost but know that you’ll have to migrate in a couple years? Metabase is a very solid option (and getting much better, quickly). Want to avoid potential migrations down the road, don’t care about spend, and want to have/pay for support straight from the company? Looker is great.
- Vendor lock-in is a very real (and very scary!) thing. In my opinion, the most freeing thing a company can do is learn and love SQL and/or model data in the database, and let a UI just be a UI. If you’re unsatisfied with your experience using database-vendor-1 or chart-vendor-1, migrating your dashboards to another ANSI-compliant database is trivially easy. When you need to learn how vendor #1 implemented a “select a, b, count(*) from table where x > y group by 1, 2 order by 3 desc” query, you might as well learn sql once and get the initial pain over with. There is definitely a learning curve, but once you’re over that initial curve you find that your knowledge is directly transferrable to numerous other systems. Another side benefit of sql is you become deeply knowledgable about your own data instead of becoming deeply knowledgable about how company X’s sql abstraction layer. With this being said, Looker’s explore functionality or Metabase’s question exploration are both incredibly useful.
- There (still) isn’t really a good one-size-fits-all solution. Jupyter is enormously valuable, but serves a completely different purpose than a dashboarding tool an executive stares at all day. And neither of those are good sql editors.
- I’ve become rather skeptical of third-party tools that attempt to do it all (ie: “we can chart things, govern your data, do predictive analytics, and track things in one single tool!”).
Hope this helps, apologies for the rant .