Mark Rittman is joined by Special Guest Fangjin Yang to talk about the history of Druid, a high-performance, column-oriented, distributed data store originally developed by the team at Metamarkets to provide fast ad-hoc access to large amounts of event-level marketing data, and his work at Imply to commercialise Druid and build a suite of supporting query and data management tools.
Mark Rittman is joined in this 50th Episode Special by our original guest on the first episode of Drill to Detail, Stewart Bryson, to talk about developing agile BI applications using FiveTran, SnowflakeDB and Looker and his recent work developing a BI solution for Google Play Marketing using Google Data Studio and Google Cloud Platform. We're also joined later in the show by Alex Gorbachev from Pythian, our mystery guest who Stewart then interviews flawlessly armed only with a set of questions given to him as the guest was unveiled ... though be sure to listen past the final closing music for the bonus out-takes.
Mark Rittman is joined by Will Davis from Trifacta to talk about the public beta of Google Cloud Dataprep, Trifacta's data wrangling platform, and topics including metadata management, data quality and data management for big data and cloud data sources.
- Google Cloud Dataprep on Google Cloud Platform
- "Google Cloud Dataprep: Spreadsheet-Style Data Wrangling Powered by Google Cloud Dataflow"
- "A New Cloud-Based Data Prep Solution from Google & Trifacta"
- Trifacta website
- "A Breakthrough Approach to Exploring and Preparing Data"
- Trifacta platform architecture
- "Garbage In, Garbage Out: Why Data Quality Matters"
- "How to Put an Effective Metadata Strategy in Place"
Drill to Detail returns after the New Year break with Special Guest Julian Hyde from Hortonworks to talk about bitmap indexes and CASE tools, Mondrian and open-source OLAP analysis, and Apache Calcite's mission to bring sanity, cost-based optimisers and support for OLAP workloads to today's disaggregated, distributed new-world database engines.
- Oracle Designer page on Oracle.com
- Bitmap Index page on Wikipedia
- Mondrian project page on GitHub
- Mondrian OLAP Server page on Wikipedia
- MultiDimensional eXpressions (MDX) page on Wikipedia
- Julian Hyde blog
- Apache Calcite project homepage
- Apache Calcite Introduction and Overview deck
- Streaming SQL presentation at Apex Big Data World 2017, Mountain View, California
Mark is joined by long-term industry veteran and friend Christian Berg to talk about surviving fifteen years as a contractor in the analytics industry, changes he's seen in the market and in how projects are approached, and the value in getting involved in the community. In this specially extended Christmas and New Year edition we also look back at what was topical in 2017, hear Christian's predictions for 2018 ... and appoint Christian as Head of our Best Practices Found on the Internet.