The Data Engineering Show is a podcast for data engineering and BI practitioners to go beyond theory. Learn from the biggest influencers in tech about their practical day-to-day data challenges and solutions in a casual and fun setting.
SEASON 1 DATA BROS Eldad and Boaz Farkash shared the same stuffed toys growing up as well as a big passion for data. After founding Sisense and building it to become a high-growth analytics unicorn, they moved on to their next venture, Firebolt, a leading hig ...

Revolutionizing Data Governance with DataStrato’s Unified Open Source Approach
23:36
In this episode of The Data Engineering Show, the bros sit with Lisa Cao, Product Manager at DataStrato, to explore data catalogs and Apache Gravitino, a unified metadata lake used to manage access and perform data governance for all data sources. What You’ll Learn: How Apache Gravitino differs from others like Unity Catalog and Polaris by being ab…

Database Technology in the Age of AI with DuckDB Labs co-creator Hannes Mühleisen
30:52
In this episode of The Data Engineering Show, host Benjamin and co-host Eldad sit with Hannes Mühleisen, CEO of DuckDB Labs and co-creator of DuckDB. Together, they: Talk about the journey of DuckDB, an open-source analytical database system designed as a universal wrangling tool. Explain how DuckDB differs from SQLite, highlighting the analytical and tr…

AI and Data Movement: Trends and Best Practices with Estuary’s Daniel Pálma
30:33
In this episode of The Data Engineering Show, the bros sit with Daniel Pálma, Head of Marketing at Estuary. Join them as they: Talk about Daniel’s career transition from data engineering to marketing and how his background in data engineering has been a tremendous help in his marketing work. Discuss the role of AI in the evolution of data mov…

AI and Data Change Management with Chad Sanderson, CEO Gable AI
36:43
In this episode of The Data Engineering Show, host Benjamin and co-host Eldad sit with Chad Sanderson, CEO and co-founder of Gable AI, to explore the interesting world of data change management. Join them as they: Delve into the challenges of data quality, how it degrades over time and the one-sided data quality checks on the “last mile” of the data sup…

Tech Stacks and Tradeoffs: Xudo's Founder on Picking the Right Tools for BI Success
24:56
Wouter Trappers is the founder of Xudo and shares his slightly unconventional path from philosopher to data consultant with the Bros in this latest episode of The Data Engineering Show. Wouter’s grounding in philosophy has proved to be a shaping influence on his approach to business intelligence. Much more than just a software solution, for Wouter,…

Data Rewind: Conversation Highlights from Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan
28:02
In this special roundup episode of The Data Engineering Show, the Bros revisit some of the best bits from episodes with data thought leaders Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan, spotlighting essential trends and lessons learned across the evolving data engineering landscape. From data observability to bridging academia…

The Resurgence of SQL: Insights from Ryanne Dolan from LinkedIn
32:57
In this episode of The Data Engineering Show, the bros, Eldad and Benjamin, are joined by Ryanne Dolan from LinkedIn to discuss the innovative Hoptimator (H2) project. This conversation reveals how LinkedIn has improved its data pipelines by automating the setup and management of complex workflows. Together they cover: Automated Data Pipelines: Ryan…

Vector Databases Won’t Replace SQL - Andy Pavlo
42:59
SQL’s slow. SQL’s stupid. We hear these claims every time a new shiny tool enters the market, only to realize five years later when the hype dies down that SQL is actually a good idea. In this super techie episode of The Data Engineering Show, Andy Pavlo, Associate Professor at Carnegie Mellon University, joins the bros to delve into database inter…

How ZoomInfo transitioned from data graveyards to ROI-driven data projects
39:46
Too often, expensive resources and man-hours are spent on dashboards no one uses, resulting in zero ROI. Philip Zelitchenko, VP of Data & Analytics at ZoomInfo, met the bros to talk about adopting product management principles to ensure data projects have value, and to provide an unfiltered peek into ZoomInfo’s data stack and unique tech culture. …

Matthew Weingarten from Disney Streaming about Data Quality Best Practices
27:21
Matthew Weingarten, Lead Data Engineer at Disney Streaming, talks about principles essential for data quality, cost optimization, debugging, and data modeling, as adopted by the world's leading companies. The Data Engineering Show is handcrafted by our friends over at: fame.so Previous guests include: Joseph Machado of LinkedIn, Matthew Weingarten …

Joseph Machado, Senior Data Engineer @ LinkedIn talks best practices
25:59
Data engineering should be less about the stack and more about best practices. While tools may change, foundational principles will remain constant. Joseph Machado, Senior Data Engineer at LinkedIn, is on The Data Engineering Show to talk about principles that are key to success, leveraging AI for automation, and adopting software engineering metho…

Professors Joe Hellerstein and Joseph Gonzalez on LLMs
46:07
Joe Hellerstein is the Jim Gray Professor of Computer Science at Berkeley and Joseph Gonzalez is an Associate Professor in the Electrical Engineering and Computer Science department. They’ve inspired generations of database enthusiasts (including Benji and Eldad) and have come on the show to talk about all things LLM and RunLLM which they co-founde…

Megan Lieu on powerful notebooks that enable collaboration
31:31
There are two types of data influencers on LinkedIn: 1. Those who talk directly about the products and companies they work for 2. Those who provide more general guidance, tips and opinions. Can influencers actually be passionate about the products they’re developing and straightforwardly talk about them without sounding salesy? We’re kicking off 2…

Transitioning from software engineering to data engineering
29:48
Every data team should have at least one data engineer with a software engineering background. This time on The Data Engineering Show: Xiaoxu Gao, an inspiring Python and data engineering expert with 10.6K followers on Medium. She’s a data engineer at Adyen with a software engineering background, and she met the bros to talk about why both softwa…

Vin Vashishta explains why we should stop using dashboards
35:45
Vin Vashishta, the guy we all love to follow, has never seen a dashboard with positive ROI. This time on The Data Engineering Show, he met the bros to talk about the difference between BI dashboards and analytics that actually introduce knowledge. It’s no longer just about the data volume, it’s about quality and relevance.

Joe Reis and Matt Housley on the fundamentals of data engineering
42:11
After co-writing the best-selling book ‘Fundamentals of Data Engineering’, Joe Reis and Matt Housley joined the bros for some much-needed ranting, priceless data advice, and good laughs. So why are we still talking about providing business value and dashboards, even though we don’t really have anything new to say? If there are so many great tools i…

Bill Inmon, the Godfather of Data Warehousing
30:32
As people in the data industry go, Bill Inmon is among the top, often seen as the godfather of the data warehouse. In this Data Engineering Show episode, Bill Inmon talks about surviving rabbit holes throughout the evolution of data, the data modeling renaissance, and why ChatGPT is not Textual ETL.

Large-scale data engineering at Momentive.ai - Meenal Iyer
38:40
As companies scale, data gets messy. The data team says one thing, the business team says something completely different. Meenal Iyer, VP Data at Momentive.ai, met the Data Bros to talk about enforcing collaboration in large organizations to ensure what she considers the three most important data factors: Adoption, Trust, and Value.

Data engineering from the early 2000s till today - BlackRock
41:49
When it comes to data management, have we come a long way since the early 2000s? Or has it simply taken us 20 years to finally realize that you can’t scale properly without data modeling? With over 20 years of experience in the data space, leading engineering teams at Cisco, Oracle, Greenplum, and now as Sr. Director of Engineering at BlackRock, Kr…

Zach Wilson on what makes a great data engineer
34:02
How good you are at Spark or Flink ≠ how good you are at data engineering. After years of data engineering experience at Airbnb, Netflix, and Facebook, Zach Wilson is now focused on spreading the knowledge in EcZachly and all over social media. He met Benjamin Wagner to explain why data modeling and storytelling are more important than the actual t…

How ZipRecruiter and Yotpo power self-service data platforms that work
45:48
Data engineers are not paid to do support. Liran Yogev, Director of Engineering at ZipRecruiter, and Doron Porat, Director of Infrastructure at Yotpo, talk about building resilient self-service products that keep customers happy and engineers calm. They walked the bros through their data stacks and explained how ZipRecruiter is completely rebuilding…

Data Observability with Millions of Users - Barr Moses
38:36
Barr Moses, CEO of Monte Carlo, explains the difference between data quality and data observability, and how to make sure your data is accurate in a world where so many different teams are accessing it.

How Amplitude Engineers Process 5 Trillion Real-time Events
27:59
Weichen Wang, Senior Engineering Manager at Amplitude, came to meet the bros to talk about Amplitude's cutting-edge data stack and how it processes 5 trillion real-time events while dealing with mutable data and massive scale.

Making Observability a Key Business Driver
48:59
80% of the code that you write doesn’t work on the first try. And that’s fine. But knowing which 80% is not working and which 20% is working is the actual challenge. After 10 years at Facebook, managing and scaling the Seattle site to over 6000 engineers(!), Vijaye Raji founded Statsig to make observability automated and real-time. How is the semant…

A ClickHouse Review from a Practitioner’s Point of View
34:43
Sudeep Kumar, Principal Engineer at Salesforce, is a ClickHouse fan. He considers the shift to ClickHouse one of his biggest accomplishments during his eBay days and walks Boaz through his experience with the platform. How on one hand it handled 2B events per minute, but also how it required rollups which compromised granularity when extending ti…

The Creator of Airflow About His Recipe for Smart Data-Driven Companies
45:56
According to Maxime Beauchemin, CEO & Founder at Preset and Creator of Apache Superset and Apache Airflow, it's not so straightforward to understand what you're really getting into and the vastness of the skills that are required in order to build a thriving company. Picking the right system and services is key for a successful start, and can help…

How Similarweb Delivers Customer Facing Analytics Over 100s of TBs
37:11
According to Yoav Shmaria, VP R&D Platform at Similarweb, the best way to manage data warehouse costs is to tag every table, database or ETL running to have good granularity over every feature. Besides handy cost management tips, Yoav walks the bros through the tech stack he implemented to analyze 100s of TBs of web data to serve fast customer-faci…

How Klarna Designed a New Data Platform in the Cloud
40:37
Klarna is one of the leading fintech companies in the world, valued at $45B. While many corporations are “stuck” on-prem, Klarna made the move and today is a cloud-only company. Gunnar Tangring, Klarna’s Lead Data Engineer, tells Boaz what this new modernized stack looks like.

How Eventbrite is Modernizing its Data Stack
23:25
Archana shares Eventbrite’s data stack modernization process, and how you get engineers to adopt new technologies like dbt which may be outside their comfort zone.

A Deep Dive into Slack's Data Architecture
34:06
Growing from a startup to an IPOed and then an acquired company meant that Slack’s sales org was scaling rapidly. Apun Hiran, Slack’s Director of Software Engineering, explains how the data stack and architecture evolved to support this growth with more reliable and timely metrics. Speaker: Apun Hiran, Director of Software Engineering (Data), Slack …

Transitioning Scopely’s 5.5 PB Data Platform to the Modern Data Stack
31:52
Should data engineering AND BI be handled by the same people? According to Jonathan Palmer, VP Data Platform at Scopely – YES. By Analytics Engineers. His team of Analytics Engineers is in the final stages of transitioning 5.5 PBs of data, which include 15B events per day, to the modern data stack. Tune in to learn how they did it.

Getting rid of raw data with Jens Larsson
29:01
Why would you create ugly data? According to Jens Larsson, don’t even go near raw data. Jens started off at Google, continued to manage data science at Spotify, caught the startup bug at Tink, and recently joined an exciting new company called Ark Kapital, together with Spotify’s former VP Analytics. Jens explains how he and his team killed the not…

How Zendesk engineers manage customer-facing data applications
33:28
This time on The Data Engineering Show, Eldad abandoned his brother Boaz, but it’s OK because Boaz got the full 30 minutes to talk to one of the most interesting people in the data space. Ananth Packkildurai is Principal Software Engineer at Zendesk and runs one of the strongest newsletters in data – Data Engineering Weekly. He talked about data app…

How are those data intensive customer facing apps engineered at Gong?
26:16
Gong manages hundreds of thousands of videoconferences and millions of emails PER DAY, which add up to hundreds of TBs. The Data Bros met Yarin Benado, Gong’s engineering manager, to understand what is required to move to a modern data stack to support all this, what this stack looks like, and why it all comes down to data quality at the end of the …

How Bolt Engineers Are Designing Its Next-Gen Data Platform
35:55
Bolt's ride-hailing app serves 2B users in Europe and Africa and handles 500K queries every day. Erik Heintare, along with Bolt's engineering team, is in the midst of designing a new next-gen data platform and is sharing how it's going to solve their biggest data challenges. Guest: Erik Heintare - Senior Analytics Engineer at Bolt Hosts: Eldad and Bo…

How did Agoda scale its data platform to support 1.5T events per day?
38:40
Scaling a data platform to support 1.5T events per day requires complicated technical migrations and alignment between hundreds of engineers. Want to see how Agoda did it? Guests: Amir Arad, Director of Machine Learning, Agoda; Shaun Sit, Senior Dev Manager, Agoda. Hosts: The Data Bros - Eldad and Boaz Farkash
It’s the mother of all development projects. You use it daily. And so do 65M developers around the world. This time on The Data Engineering Show – a deep dive into GitHub’s data stack. Arfon Smith and KimYen (Truong) Ladia shared GitHub’s data engineering challenges and solutions and explained why every developer should know and adopt the ADR protocol.…

Building Data Products For Data Engineers
39:51
What does a tech stack that always needs to be at the forefront of technology look like? Roy Miara from Explorium talks about building data products for the audience that can’t be fooled – Data Engineers.

How Vimeo Keeps Data Intact with 85B Events Per Month
40:13
How does the Vimeo data team deal with 2 PBs of data and 85B events per month? What made them recently build a data ops team? What data tool does the team love? And why (the hell) did they call their legacy platform Fatal Attraction? Guest: Lior Solomon, VP Data Engineering at Vimeo.

How Substack's Data Stack Supports 500K Paying Subscribers
24:24
Substack is an amazing — if not the most amazing — content publishing platform out there. Essentially, it allows anyone to become a journalist or to start their own newsletters and charge subscriptions for them. So how did they build a data stack that can support all of their 500K paying subscribers? Guest: Mike Cohen, Data Engineer at Substack Hosts…

A Technical Deep Dive to Yelp's Data Infrastructure - With Steven Moy
50:09
As an expert in query engines and performance-related challenges, Steven Moy explains how Yelp handled its huge data growth in the past ten years. Guest: Steven Moy, Software Engineer at Yelp. Hosts: The Data Bros, Eldad and Boaz Farkash, CEO and CPO at Firebolt

How Canva's Data Engineers and Analysts Support 55M Active Users
43:18
Canva is one of the hottest, if not the hottest, graphic design platforms out there. Only a week ago it was announced that they reached a staggering 16 billion dollar valuation, after having seen even stronger growth during the pandemic. With 55 million active users and around 500 million dollars in annual revenue, it seems that Canva is unstoppabl…

How AppsFlyer Delivers Sub-Second BI to 1000 Looker Users - With Alexandra Sudilovsky
31:46
AppsFlyer has exploded in size, growing from a small company of 200 people to 1000 people in just three years. Dealing not only with a huge amount of data on a daily basis but doing so while growing quickly as a company can come with many challenges. Guest: Alexandra Sudilovsky, Senior BI Expert at AppsFlyer. Hosts: The Data Bros, Eldad and Boaz Fark…