
Building scalable, reproducible workflows for scientific computing often requires bridging the gap between research flexibility and enterprise reliability.

In this episode, Anja MacKenzie, Expert for Cheminformatics at Covestro, explains how her team uses Airflow and Kubernetes to create a shared, self-service platform for computational chemistry.

Key Takeaways:

00:00 Introduction.

06:19 Custom scripts made sharing and reuse difficult.

09:29 Workflows are manually triggered with user traceability.

10:38 Customization supports varied compute requirements.

12:48 Persistent volumes allow tasks to share large amounts of data.

14:25 Custom operators separate logic from infrastructure.

16:43 Modified triggers connect dependent workflows.

18:36 UI plugins enable file uploads and secure access.

Resources Mentioned:

Anja MacKenzie

https://www.linkedin.com/in/anja-mackenzie/

Covestro | LinkedIn

https://www.linkedin.com/company/covestro/

Covestro | Website

https://www.covestro.com

Apache Airflow

https://airflow.apache.org/

Kubernetes

https://kubernetes.io/

Airflow KubernetesPodOperator

https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html

Astronomer

https://www.astronomer.io/

Airflow Academy by Marc Lamberti

https://www.udemy.com/user/lockgfg/

Airflow Documentation

https://airflow.apache.org/docs/

Airflow Plugins

https://airflow.apache.org/docs/apache-airflow/1.10.9/plugins.html

Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

#AI #Automation #Airflow
