Are you on top of the latest innovations in data, analytics, and AI? With data being pivotal to strategy and change, the Data-powered Innovation Jam podcast gives you the key to some of the most crucial aspects of business success. Through our guests, we bring you the latest trends from the world of data and AI, discussing the best ideas and experiences. Our hosts, with their decades of profound experience and a background in avant-garde music, will also explore the edges of jazz, rock, and p ...
Data Lake Architecture Podcasts
This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
By Capgemini

The Data Model That Captures Your Business: Metric Trees Explained
1:01:05
Summary: In this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data modeling that directly captures a company's business model. Vijay shares insights from his decade-long experience building data practices at Rent the Runway and explains how the modern data stack has…

From GPUs-as-a-Service to Workloads-as-a-Service: Flex AI’s Path to High-Utilization AI Infra
56:31
Summary: In this crossover episode of the AI Engineering Podcast, host Tobias Macey interviews Brijesh Tripathi, CEO of Flex AI, about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting…

From RAG to Relational: How Agentic Patterns Are Reshaping Data Architecture
52:58
Summary: In this episode of the AI Engineering Podcast Mark Brooker, VP and Distinguished Engineer at AWS, talks about how agentic workflows are transforming database usage and infrastructure design. He discusses the evolving role of data in AI systems, from traditional models to more modern approaches like vectors, RAG, and relational databases. Ma…

Duck Lake: Simplifying the Lakehouse Ecosystem
1:10:41
Summary: In this episode of the Data Engineering Podcast Hannes Mühleisen and Mark Raasveldt, the creators of DuckDB, share their work on Duck Lake, a new entrant in the open lakehouse ecosystem. They discuss how Duck Lake is focused on simplicity and flexibility, and offers a unified catalog and table format compared to other lakehouse formats like I…

Aligning Business and Data: The Essential Role of Data Modeling
1:06:51
Summary: In this episode of the Data Engineering Podcast Serge Gershkovich, head of product at SQL DBM, talks about the socio-technical aspects of data modeling. Serge shares his background in data modeling and highlights its importance as a collaborative process between business stakeholders and data teams. He debunks common misconceptions that dat…

From Academia to Industry: Bridging Data Engineering Challenges
50:54
Summary: In this episode of the Data Engineering Podcast Professor Paul Groth, from the University of Amsterdam, talks about his research on knowledge graphs and data engineering. Paul shares his background in AI and data management, discussing the evolution of data provenance and lineage, as well as the challenges of data integration. He explores t…

High Performance And Low Overhead Graphs With KuzuDB
1:01:29
Summary: In this episode of the Data Engineering Podcast Prashanth Rao, an AI engineer at KuzuDB, talks about their embeddable graph database. Prashanth explains how KuzuDB addresses performance shortcomings in existing solutions through columnar storage and novel join algorithms. He discusses the usability and scalability of KuzuDB, emphasizing its…

Bridging Data and Decision-Making: AI's Role in Modern Analytics
1:10:44
Summary: In this episode of the Data Engineering Podcast Lucas Thelosen and Drew Gilson from Gravity talk about their development of Orion, an autonomous data analyst that bridges the gap between data availability and business decision-making. Lucas and Drew share their backgrounds in data analytics and how their experiences have shaped their approa…

From Bits to Tables: The Evolution of S3 Storage
50:08
Summary: In this episode of the Data Engineering Podcast Andy Warfield talks about the innovative functionalities of S3 Tables and Vectors and their integration into modern data stacks. Andy shares his journey through the tech industry and his role at Amazon, where he collaborates to enhance storage capabilities, discussing the evolution of S3 from …

Revolutionizing Python Notebooks with Marimo
51:56
Summary: In this episode of the Data Engineering Podcast Akshay Agrawal from Marimo discusses the innovative new Python notebook environment, which offers a reactive execution model, full Python integration, and built-in UI elements to enhance the interactive computing experience. He discusses the challenges of traditional Jupyter notebooks, such as…

Warehouse Native Incremental Data Processing With Dynamic Tables And Delayed View Semantics
55:07
Summary: In this episode of the Data Engineering Podcast Dan Sotolongo from Snowflake talks about the complexities of incremental data processing in warehouse environments. Dan discusses the challenges of handling continuously evolving datasets and the importance of incremental data processing for optimized resource use and reduced latency. He expla…

Streamlining Data Pipelines with MCP Servers and Vector Engines
52:04
Summary: In this episode of the Data Engineering Podcast Kacper Łukawski from Qdrant talks about integrating MCP servers with vector databases to process unstructured data. Kacper shares his experience in data engineering, from building big data pipelines in the automotive industry to leveraging large language models (LLMs) for transforming unstructured d…

Foundational Data Engineering At Two Sigma
55:05
Summary: In this episode of the Data Engineering Podcast Effie Baram, a leader in foundational data engineering at Two Sigma, talks about the complexities and innovations in data engineering within the finance sector. She discusses the critical role of data at Two Sigma, balancing data quality with delivery speed, and the socio-technical challenges …
Ready for your Hour of (Data) Power with some Radioactive and Electric Feel thrown in? Ok, so hang tight, get your coffee or lemonade (depending on what your summer looks like!), as we bring you a double whammy from your newest Data-powered Innovation Jam to celebrate the anniversary launch of the 10th edition of the Data Powered Innovation Review. …

Enabling Agents In The Enterprise With A Platform Approach
54:18
Summary: In this episode of the Data Engineering Podcast Arun Joseph talks about developing and implementing agent platforms to empower businesses with agentic capabilities. From leading AI engineering at Deutsche Telekom to his current entrepreneurial venture focused on multi-agent systems, Arun shares insights on building agentic systems at an org…

Dagster's New Era: Modularizing Data Transformation in the Age of AI
1:01:37
Summary: In this episode of the Data Engineering Podcast we welcome back Nick Schrock, CTO and founder of Dagster Labs, to discuss the evolving landscape of data engineering in the age of AI. As AI begins to impact data platforms and the role of data engineers, Nick shares his insights on how it will ultimately enhance productivity and expand softwa…

AI and the Lakehouse: How Starburst is Pioneering New Workflows
44:09
Summary: In this episode of the Data Engineering Podcast Alex Albu, tech lead for AI initiatives at Starburst, talks about integrating AI workloads with the lakehouse architecture. From his software engineering roots to leading data engineering efforts, Alex shares insights on enhancing Starburst's platform to support AI applications, including an A…
From punk rock to process automation, from Blondie to BSON — it’s a genre-defying episode of the Data-Powered Innovation Jam where we grab a virtual bench in Central Park with Andrew Davidson, SVP of Products at MongoDB, and let the data conversation run wild. What follows is an improvisational jam session on the evolution of data, the art of distr…

Amazon S3: The Backbone of Modern Data Systems
1:01:01
Summary: In this episode of the Data Engineering Podcast Mai-Lan Tomsen Bukovec, Vice President of Technology at AWS, talks about the evolution of Amazon S3 and its profound impact on data architecture. From her work on compute systems to leading the development and operations of S3, Mai-Lan shares insights on how S3 has become a foundational element …

Scaling Data Operations With Platform Engineering
42:20
Summary: In this episode of the Data Engineering Podcast Chakravarthy Kotaru talks about scaling data operations through standardized platform offerings. From his roots as an Oracle developer to leading the data platform at a major online travel company, Chakravarthy shares insights on managing diverse database technologies and providing databases a…

From Data Discovery to AI: The Evolution of Semantic Layers
49:30
Summary: In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Shinji Kim to discuss the evolving role of semantic layers in the era of AI. As they explore the challenges of managing vast data ecosystems and providing context to data users, they delve into the significance of semantic layers for AI applications. They dive i…

The six pistols of technology trends
1:18:45
Ready for the remix? You might have already listened to one of the most recent episodes of the Cloud Realities podcast, where six dynamic voices from two podcast teams came together to unpack Capgemini’s TechnoVision 2025. From infrastructure and applications to collaboration, user experience, automation, and, naturally, a heavy dose of data and …

Balancing Off-the-Shelf and Custom Solutions in Data Engineering
46:05
Summary: In this episode of the Data Engineering Podcast Tulika Bhatt, a senior software engineer at Netflix, talks about her experiences with large-scale data processing and the future of data engineering technologies. Tulika shares her journey into the data engineering field, discussing her work at BlackRock and Verizon before joining Netflix, and…
What happens when Professor Erik Proper of the Technical University of Vienna teams up with the ever-inquisitive ‘Dr. Bob’ Robert Engels – modestly supported by co-hosts Weiwei Feng and Ron Tolido - to discuss AI, models, semantics, ontologies, and context? A true PhD Fest? Or a practical exploration of the synergies between the academic and busine…

StarRocks: Bridging Lakehouse and OLAP for High-Performance Analytics
59:41
Summary: In this episode of the Data Engineering Podcast Sida Shen, product manager at CelerData, talks about StarRocks, a high-performance analytical database. Sida discusses the inception of StarRocks, which was forked from Apache Doris in 2020 and evolved into a high-performance Lakehouse query engine. He explains the architectural design of Star…

Exploring NATS: A Multi-Paradigm Connectivity Layer for Distributed Applications
1:12:50
Summary: In this episode of the Data Engineering Podcast Derek Collison, creator of NATS and CEO of Synadia, talks about the evolution and capabilities of NATS as a multi-paradigm connectivity layer for distributed applications. Derek discusses the challenges and solutions in building distributed systems, and highlights the unique features of NATS t…
Much like a hammer sees every problem as a nail, data experts tend to see opportunity in every single data field. Collaboration on data is still one of the best ways to bring data to life and build business value on top of it. Sharing data across organizations is one of the most powerful ways to spark innovation. But when data needs to stay private…

Advanced Lakehouse Management With The LakeKeeper Iceberg REST Catalog
57:13
Summary: In this episode of the Data Engineering Podcast Viktor Kessler, co-founder of Vakmo, talks about the architectural patterns in the lakehouse enabled by a fast and feature-rich Iceberg catalog. Viktor shares his journey from data warehouses to developing the open-source project, Lakekeeper, an Apache Iceberg REST catalog written in Rust tha…

Simplifying Data Pipelines with Durable Execution
39:49
Summary: In this episode of the Data Engineering Podcast Jeremy Edberg, CEO of DBOS, talks about durable execution and its impact on designing and implementing business logic for data systems. Jeremy explains how DBOS's serverless platform and orchestrator provide local resilience and reduce operational overhead, ensuring exactly-once execution in distrib…
Ever thought AI practitioners could win a Nobel Prize? Well, they already have. Not for predicting your next favourite movie or generating poetry, but for groundbreaking advancements in Life Sciences. And it won’t be the last time. In this episode, hosts Ron Tolido, Robert Engels, and Weiwei Feng find themselves venturing into unexpected territory …

Overcoming Redis Limitations: The Dragonfly DB Approach
43:58
Summary: In this episode of the Data Engineering Podcast Roman Gershman, CTO and founder of Dragonfly DB, explores the development and impact of high-speed in-memory databases. Roman shares his experience creating a more efficient alternative to Redis, focusing on performance gains, scalability, and cost efficiency, while addressing limitations such…
In for a new venture? This episode of the Data-powered Innovation Jam has you covered. Our hosts Weiwei, Robert, and Ron welcome Brett Clark, Director, Global Business Development at blackshark.ai—a company with the modest mission of creating a digital twin of the entire planet using satellite data. It all started with Microsoft Flight Simulator, b…

Bringing AI Into The Inner Loop of Data Engineering With Ascend
52:47
Summary: In this episode of the Data Engineering Podcast Sean Knapp, CEO of Ascend.io, explores the intersection of AI and data engineering. He discusses the evolution of data engineering and the role of AI in automating processes, alleviating burdens on data engineers, and enabling them to focus on complex tasks and innovation. The conversation cov…

Astronomer's Role in the Airflow Ecosystem: A Deep Dive with Pete DeJoy
51:41
Summary: In this episode of the Data Engineering Podcast Pete DeJoy, co-founder and product lead at Astronomer, talks about building and managing Airflow pipelines on Astronomer and the upcoming improvements in Airflow 3. Pete shares his journey into data engineering, discusses Astronomer's contributions to the Airflow project, and highlights the cr…
It’s the ultimate new hit album for generative AI: intelligent automation, autonomous systems, and—of course—agents. But while ServiceNow has been a pioneering force in this space for a long time, it still seems to fly under the radar for many data and AI experts. Time to change that. Time to bridge the worlds of Planet Process and Planet Data! For…

Accelerated Computing in Modern Data Centers With Datapelago
55:36
Summary: In this episode of the Data Engineering Podcast Rajan Goyal, CEO and co-founder of Datapelago, talks about improving efficiencies in data processing by reimagining system architecture. Rajan explains the shift from hyperconverged to disaggregated and composable infrastructure, highlighting the importance of accelerated computing in modern d…
"It Don’t Mean a Thing If It Ain’t Got That Swing"—Duke Ellington’s immortal words composed almost a century ago resonate deeply in the latest episode of the Data-powered Innovation Jam. Hosts Ron Tolido, Robert Engels, and Weiwei Feng explore TechnoVision 2025, Capgemini’s annual technology trend analysis, where ‘The Pendulum Swing’ captures the r…

The Future of Data Engineering: AI, LLMs, and Automation
59:39
Summary: In this episode of the Data Engineering Podcast Gleb Mezhanskiy, CEO and co-founder of DataFold, talks about the intersection of AI and data engineering. He discusses the challenges and opportunities of integrating AI into data engineering, particularly using large language models (LLMs) to enhance productivity and reduce manual toil. The c…

Evolving Responsibilities in AI Data Management
38:57
Summary: In this episode of the Data Engineering Podcast Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including exten…
Cloud computing has long been a familiar fixture in the digital sky, but that doesn’t mean innovation has stopped soaring. In this episode, hosts Ron Tolido, Robert Engels, and Weiwei Feng discuss all things cloud with Carolin Eggers, Director, Data & AI, EMEA North, Google Cloud. The first kind of cloud? There’s nothing UFO about it. That’s well u…
Is quantum computing an early ‘70s thing? Well, certainly if you compare it to the new, experimental music that was released then. To many, quantum computing feels like listening to an early Pink Floyd record for the very first time: alien, incomparable, confusing, but above all: intriguing. Quite different from the more mainstream band the Floyd l…
Knock, knock, Neo. Think Agents are a thing of the future? Look again at the iconic movie, The Matrix—they've been hiding in plain sight. In the first episode of 2025, hosts Ron Tolido, Weiwei Feng, and Robert Engels venture down the digital rabbit hole of Virtual Twins with Morgan Zimmerman, CEO of NETVIBES at Dassault Systèmes. It’s a fascinating…

CSVs Will Never Die And OneSchema Is Counting On It
54:40
Summary: In this episode of the Data Engineering Podcast Andrew Luo, CEO of OneSchema, talks about handling CSV data in business operations. Andrew shares his background in data engineering and CRM migration, which led to the creation of OneSchema, a platform designed to automate CSV imports and improve data validation processes. He discusses the ch…

Breaking Down Data Silos: AI and ML in Master Data Management
57:30
Summary: In this episode of the Data Engineering Podcast Dan Bruckner, co-founder and CTO of Tamr, talks about the application of machine learning (ML) and artificial intelligence (AI) in master data management (MDM). Dan shares his journey from working at CERN to becoming a data expert and discusses the challenges of reconciling large-scale organiz…

Building a Data Vision Board: A Guide to Strategic Planning
49:59
Summary: In this episode of the Data Engineering Podcast Lior Barak shares his insights on developing a three-year strategic vision for data management. He discusses the importance of having a strategic plan for data, highlighting the need for data teams to focus on impact rather than just enablement. He introduces the concept of a "data vision boar…
Do we hear sleigh bells in that surf rock soundtrack? Well, anything can happen when you tune in to the Data-powered Innovation Jam podcast—especially in the very last episode of the year! Our hosts Weiwei Feng (winning top honours for her Christmas sweater), Robert Engels, and Ron Tolido gather with their producer Arne Rossmann by a cozy fireplace…

How Orchestration Impacts Data Platform Architecture
59:39
Summary: The core task of data engineering is managing the flows of data through an organization. Ensuring that those flows execute on schedule and without error is the role of the data orchestrator. Which orchestration engine you choose impacts the ways that you architect the rest of your data platform. In this episode Hugo Lu shares his…
Autonomous AI isn’t just a futuristic dream—it’s here, and robots are redefining industries. But who would have thought the human touch is so crucial to success? In this electrifying episode, hosts Robert Engels, Weiwei Feng, and Ron Tolido are joined by Kence Anderson, CEO and co-founder of Composabl and author of “Designing Autonomous AI: A Guide…

An Exploration Of The Impediments To Reusable Data Pipelines
51:32
Summary: In this episode of the Data Engineering Podcast the inimitable Max Beauchemin talks about reusability in data pipelines. The conversation explores the "write everything twice" problem, where similar pipelines are built without code reuse, and discusses the challenges of managing different SQL dialects and relational databases. Max also touc…