Machine learning audio course, teaching the fundamentals of machine learning and artificial intelligence. It covers intuition, models (shallow and deep), math, languages, frameworks, etc. Where your other ML resources provide the trees, I provide the forest. Consider MLG your syllabus, with highly-curated resources for each episode's details at ocdevel.com. Audio is a great supplement during exercise, commute, chores, etc.
OCDevel Podcasts
Autoencoders are neural networks that compress data into a smaller "code," enabling dimensionality reduction, data cleaning, and lossy compression by reconstructing original inputs from this code. Advanced autoencoder types, such as denoising, sparse, and variational autoencoders, extend these concepts for applications in generative modeling, in…
At inference, large language models use in-context learning with zero-, one-, or few-shot examples to perform new tasks without weight updates, and can be grounded with Retrieval Augmented Generation (RAG) by embedding documents into vector databases for real-time factual lookup using cosine similarity. LLM agents autonomously plan, act, and use ex…
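The retrieval step described above can be sketched in a few lines. This is only an illustration, not a real RAG stack: the three-dimensional "embeddings" and document names are made up, and an in-memory dict stands in for a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; a real system would get these from an embedding model.
DOCUMENTS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api rate limits": [0.0, 0.2, 0.9],
}


def retrieve(query_vector, k=1):
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda name: cosine_similarity(query_vector, DOCUMENTS[name]),
        reverse=True,
    )
    return ranked[:k]


top = retrieve([0.8, 0.2, 0.1])
```

In a grounded prompt, the text of the retrieved documents would then be placed in the model's context alongside the user's question.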
Explains advancements in large language models (LLMs): scaling laws (the relationships among model size, data size, and compute) and how emergent abilities such as in-context learning, multi-step reasoning, and instruction following arise once certain scaling thresholds are crossed. The evolution of the transformer architecture with Mixture of Experts (Mo…

MLA 024 Code AI MCP Servers, ML Engineering
43:38
Tool use in code AI agents allows for both in-editor code completion and agent-driven file and command actions, while the Model Context Protocol (MCP) standardizes how these agents communicate with external and internal tools. MCP integration broadens the automation capabilities for developers and machine learning engineers by enabling access to a …
Gemini 2.5 Pro currently leads in both accuracy and cost-effectiveness among code-focused large language models, with Claude 3.7 and a DeepSeek R1/Claude 3.5 combination also performing well in specific modes. Using local open source models via tools like Ollama offers enhanced privacy but trades off model performance, and advanced workflows like c…

MLA 022 Code AI: Cursor, Cline, Roo, Aider, Copilot, Windsurf
55:29
Vibe coding is using large language models within IDEs or plugins to generate, edit, and review code, and has recently become a prominent and evolving technique in software and machine learning engineering. The episode outlines a comparison of current code AI tools - such as Cursor, Copilot, Windsurf, Cline, Roo Code, and Aider - explaining their a…
Links: Notes and resources at ocdevel.com/mlg/33 3Blue1Brown videos: https://3blue1brown.com/ Try a walking desk to stay healthy & sharp while you learn & code. Try Descript audio/video editing with AI power-tools. Background & Motivation: RNN Limitations: Sequential processing prevents full parallelization, even with attention tweaks, making them ineffici…

MLA 021 Databricks: Cloud Analytics and MLOps
26:28
Databricks is a cloud-based platform for data analytics and machine learning operations, integrating features such as a hosted Spark cluster, Python notebook execution, Delta Lake for data management, and seamless IDE connectivity. Raybeam utilizes Databricks and other ML Ops tools according to client infrastructure, scaling needs, and project goal…

MLA 020 Kubeflow and ML Pipeline Orchestration on Kubernetes
1:08:47
Machine learning pipeline orchestration tools, such as SageMaker and Kubeflow, streamline the end-to-end process of data ingestion, model training, deployment, and monitoring, with Kubeflow providing an open-source, cross-cloud platform built atop Kubernetes. Organizations typically choose between cloud-native managed services and open-source solut…

MLA 019 Cloud, DevOps & Architecture
1:15:21
The deployment of machine learning models for real-world use involves a sequence of cloud services and architectural choices, where machine learning expertise must be complemented by DevOps and architecture skills, often requiring collaboration with professionals. Key concepts discussed include infrastructure as code, cloud container orchestration,…

MLA 017 AWS Local Development Environment
1:04:49
AWS development environments for local and cloud deployment can differ significantly, leading to extra complexity and setup during cloud migration. By developing directly within AWS environments, using tools such as Lambda, Cloud9, SageMaker Studio, client VPN connections, or LocalStack, developers can streamline transitions to production and lever…
SageMaker streamlines machine learning workflows by enabling integrated model training, tuning, deployment, monitoring, and pipeline automation within the AWS ecosystem, offering scalable compute options and flexible development environments. Cloud-native AWS machine learning services such as Comprehend and Polly provide off-the-shelf solutions for …
SageMaker is an end-to-end machine learning platform on AWS that covers every stage of the ML lifecycle, including data ingestion, preparation, training, deployment, monitoring, and bias detection. The platform offers integrated tools such as Data Wrangler, Feature Store, Ground Truth, Clarify, Autopilot, and distributed training to enable scalable…

MLA 014 Machine Learning Hosting and Serverless Deployment
52:33
Machine learning model deployment on the cloud is typically handled with solutions like AWS SageMaker for end-to-end training and inference as a REST endpoint, AWS Batch for cost-effective on-demand batch jobs using Docker containers, and AWS Lambda for low-usage, serverless inference without GPU support. Storage and infrastructure options such as …

MLA 013 Tech Stack for Customer-Facing Machine Learning Products
47:37
Primary technology recommendations for building a customer-facing machine learning product include React and React Native for the front end, serverless platforms like AWS Amplify or GCP Firebase for authentication and basic server/database needs, and Postgres as the relational database of choice. Serverless approaches are encouraged for scalability…

MLA 012 Docker for Machine Learning Workflows
31:41
Docker enables efficient, consistent machine learning environment setup across local development and cloud deployment, avoiding many pitfalls of virtual machines and manual dependency management. It streamlines system reproduction, resource allocation, and GPU access, supporting portability and simplified collaboration for ML projects. Machine lear…
Try a walking desk to stay healthy while you study or work! Show notes at ocdevel.com/mlg/32. L1/L2 norm, Manhattan, Euclidean, cosine distances, dot product. Normed distances: a norm is a function that assigns a strictly positive length to each nonzero vector in a vector space. Minkowski distance is the generalization: (sum_i |x_i - y_i|^p)^(1/p). "p" = ? (1, 2, ..) for…
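The distances listed in these notes can be written out directly. The sketch below uses plain Python lists as vectors; the function names are illustrative, and it relies on the standard facts that p = 1 gives Manhattan (L1) distance and p = 2 gives Euclidean (L2) distance.

```python
def minkowski(x, y, p):
    """Minkowski distance: (sum_i |x_i - y_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def manhattan(x, y):
    """L1 distance: Minkowski with p = 1."""
    return minkowski(x, y, 1)


def euclidean(x, y):
    """L2 distance: Minkowski with p = 2."""
    return minkowski(x, y, 2)


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def cosine_distance(x, y):
    """1 - cosine similarity; compares direction only, not length (nonzero vectors)."""
    return 1.0 - dot(x, y) / (dot(x, x) ** 0.5 * dot(y, y) ** 0.5)


a, b = [0.0, 0.0], [3.0, 4.0]
l1 = manhattan(a, b)   # walks the grid: 3 + 4
l2 = euclidean(a, b)   # straight line: sqrt(9 + 16)
```

Cosine distance treats `[1, 0]` and `[2, 0]` as identical because they point the same way, which is why it is common for comparing embeddings of different magnitudes.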
Primary clustering tools for practical applications include K-means using scikit-learn or Faiss, agglomerative clustering leveraging cosine similarity with scikit-learn, and density-based methods like DBSCAN or HDBSCAN. For determining the optimal number of clusters, silhouette score is generally preferred over inertia-based visual heuristics, and …
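To make the silhouette score concrete, here is a hand-rolled version in standard-library Python. It is a stand-in for illustration, not scikit-learn's implementation; it assumes Euclidean distance and at least two clusters, and the sample points are made up.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so only the divisor excludes it)
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette_score(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # each cluster spans both groups
```

Scores near 1 indicate compact, well-separated clusters; scores near or below 0 indicate points that sit closer to another cluster than to their own. Comparing the score across candidate cluster counts is one way to pick the number of clusters.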

MLA 010 NLP packages: transformers, spaCy, Gensim, NLTK
26:22
The landscape of Python natural language processing tools has evolved from broad libraries like NLTK toward more specialized packages such as Gensim for topic modeling, spaCy for linguistic analysis, and Hugging Face Transformers for advanced tasks, with Sentence Transformers extending transformer models to enable efficient semantic search and clus…

MLA 009 Charting and Visualization Tools for Data Science
24:43
Covers Python charting libraries - Matplotlib, Seaborn, and Bokeh - explaining their strengths, from quick EDA to interactive, HTML-exported visualizations, and clarifies where D3.js fits as a JavaScript alternative for end-user applications. It also evaluates major software solutions like Tableau, Power BI, QlikView, and Excel, detailing how modern BI to…
Exploratory data analysis (EDA) sits at the critical pre-modeling stage of the data science pipeline, focusing on uncovering missing values, detecting outliers, and understanding feature distributions through both statistical summaries and visualizations, such as Pandas' info(), describe(), histograms, and box plots. Visualization tools like Matplo…
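The checks named above (missing values, summary statistics, outliers) can be illustrated without any dependencies. This sketch is a standard-library stand-in for the kind of output Pandas' describe() gives, not Pandas itself; the `ages` column is made up, and quartiles from `statistics.quantiles` may differ slightly from Pandas' defaults.

```python
import statistics


def describe(values):
    """Summary statistics for the non-missing entries of a numeric column."""
    present = [v for v in values if v is not None]
    q1, median, q3 = statistics.quantiles(present, n=4)
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "min": min(present),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(present),
    }


def iqr_outliers(values):
    """Values beyond 1.5 * IQR from the quartiles (the usual box-plot whisker rule)."""
    summary = describe(values)
    spread = summary["q3"] - summary["q1"]
    low = summary["q1"] - 1.5 * spread
    high = summary["q3"] + 1.5 * spread
    return [v for v in values if v is not None and (v < low or v > high)]


ages = [23, 25, None, 24, 26, 22, 95, 27]  # one missing value, one suspicious entry
summary = describe(ages)
outliers = iqr_outliers(ages)
```

A box plot draws the same quartiles and whiskers, so a value flagged here is one that would appear as an isolated point on the plot.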
Jupyter Notebooks, originally conceived as IPython Notebooks, enable data scientists to combine code, documentation, and visual outputs in an interactive, browser-based environment supporting multiple languages like Python, Julia, and R. This episode details how Jupyter Notebooks structure workflows into executable cells - mixing markdown explanati…

MLA 006 Salaries for Data Science & Machine Learning
19:35
O'Reilly's 2017 Data Science Salary Survey finds that location is the most significant salary determinant for data professionals, with median salaries ranging from $134,000 in California to under $30,000 in Eastern Europe, and highlights that negotiation skills can lead to salary differences as high as $45,000. Other key factors impacting earnings …

MLA 005 Shapes and Sizes: Tensors and NDArrays
27:18
Explains the fundamental differences between tensor dimensions, size, and shape, clarifying frequent misconceptions - such as the distinction between the number of features (“columns”) and true data dimensions - while also demystifying reshaping operations like expand_dims, squeeze, and transpose in NumPy. Through practical examples from images and nat…
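The dimensions-versus-features distinction can be shown with nested lists. The helpers below are simplified, hypothetical stand-ins written for this sketch; they mimic what NumPy's `.shape`, `expand_dims`, `squeeze`, and `transpose` report for these particular cases, not their full behavior.

```python
def shape(nested):
    """Shape of a regular (non-ragged) nested list, like an array's .shape."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


def expand_dims_front(nested):
    """Add a leading axis of length 1, e.g. to turn one sample into a batch of one."""
    return [nested]


def squeeze_front(nested):
    """Drop a leading axis of length 1 (unlike np.squeeze, only the first axis)."""
    assert len(nested) == 1, "leading axis must have length 1"
    return nested[0]


def transpose_2d(matrix):
    """Swap the two axes of a 2-D nested list."""
    return [list(row) for row in zip(*matrix)]


table = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 rows, 3 feature columns

n_dims = len(shape(table))               # 2: rows and columns
n_features = shape(table)[1]             # 3: length of the second axis
batched = expand_dims_front(table)       # shape (1, 2, 3)
flipped = transpose_2d(table)            # shape (3, 2)
```

The table has three features but only two dimensions: the feature count is the length of one axis, while the number of dimensions is how many axes there are.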
Practical workflow of loading, cleaning, and storing large datasets for machine learning, moving from ingesting raw CSVs or JSON files with pandas to saving processed datasets and neural network weights using HDF5 for efficient numerical storage. It clearly distinguishes among storage options—explaining when to use HDF5, pickle files, or SQL databa…
NumPy enables efficient storage and vectorized computation on large numerical datasets in RAM by leveraging contiguous memory allocation and low-level C/Fortran libraries, drastically reducing memory footprint compared to native Python lists. Pandas, built on top of NumPy, introduces labelled, flexible tabular data manipulation—facilitating intuiti…

MLA 001 Degrees, Certificates, and Machine Learning Careers
11:21
While industry-respected credentials like Udacity Nanodegrees help build a practical portfolio for machine learning job interviews, they remain insufficient stand-alone qualifications - most roles require a Master’s degree as a near-hard requirement, especially compared to more flexible web development fields. A Master’s, such as Georgia Tech’s OMSCS…
Notes and resources: ocdevel.com/mlg/29 Try a walking desk to stay healthy while you study or work! Reinforcement Learning (RL) is a fundamental component of artificial intelligence, different from purely being AI itself. It is considered a key aspect of AI due to its ability to learn through interactions with the environment using a system of rewa…
Notes and resources: ocdevel.com/mlg/28 Try a walking desk to stay healthy while you study or work! More hyperparameters for optimizing neural networks. A focus on regularization, optimizers, feature scaling, and hyperparameter search methods. Hyperparameter Search Techniques: Grid Search involves testing all possible combinations of hyperparameters…
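Grid search as described above amounts to a loop over the Cartesian product of the candidate values. In this sketch the search space and `toy_score` are hypothetical; `toy_score` stands in for "train a model with these settings and return its validation score".

```python
import itertools


def grid_search(search_space, score_fn):
    """Score every combination in the grid and return the best (params, score)."""
    names = list(search_space)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical scoring function that peaks at learning_rate=0.01, layers=2."""
    return -(params["learning_rate"] - 0.01) ** 2 - (params["layers"] - 2) ** 2


space = {"learning_rate": [0.001, 0.01, 0.1], "layers": [1, 2, 3]}
best_params, best_score = grid_search(space, toy_score)
n_trials = len(list(itertools.product(*space.values())))  # 3 * 3 combinations
```

The number of trials is the product of the list lengths, so each added hyperparameter multiplies the cost of the search.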
Full notes and resources at ocdevel.com/mlg/27 Try a walking desk to stay healthy while you study or work! Hyperparameters are crucial elements in the configuration of machine learning models. Unlike parameters, which are learned by the model during training, hyperparameters are set by humans before the learning process begins. They are the knobs a…
Try a walking desk to stay healthy while you study or work! Full notes and resources at ocdevel.com/mlg/26 NOTE. This episode is no longer relevant, and tforce_btc_trader is no longer maintained. The current podcast project is Gnothi. Episode Overview TForce BTC Trader Project: Trading Crypto Special: Intuitively highlights decisions: hypers, supervise…
Try a walking desk to stay healthy while you study or work! Notes and resources at ocdevel.com/mlg/25 Filters and Feature Maps: Filters are small matrices used to detect visual features from an input image by applying them to local pixel patches, creating a 3D output called a feature map. Each filter is tasked with recognizing a specific pattern (e…
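Applying a filter to local pixel patches can be sketched in pure Python. This is a simplified illustration (single channel, no padding, stride 1); the image and the vertical-edge filter are made up, and in a real CNN the filter weights are learned rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide a 2-D filter over a 2-D image (no padding, stride 1).

    Each output value is the sum of element-wise products between the
    filter and the image patch it currently covers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 4x4 image: dark left half (0), bright right half (1)
image = [[0, 0, 1, 1] for _ in range(4)]

# Hypothetical hand-written filter that responds to dark-to-bright vertical edges
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, vertical_edge)
```

The map is nonzero only in the column where the edge sits. One filter yields one 2-D map; stacking the maps from several filters gives the 3D output the notes mention.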
Try a walking desk to stay healthy while you study or work! Notes and resources at ocdevel.com/mlg/24 Hardware Desktop if you're stationary, as you'll get the best performance bang-for-buck and improved longevity; laptop if you're mobile. Desktops. Build your own PC, better value than pre-built. See PC Part Picker, make sure to use an Nvidia graphi…
Try a walking desk to stay healthy while you study or work! Notes and resources at ocdevel.com/mlg/23 Neural Network Types in NLP Vanilla Neural Networks (Feedforward Networks): Used for general classification or regression tasks. Examples include predicting housing costs or classifying images as cat, dog, or tree. Convolutional Neural Networks (CN…
Try a walking desk to stay healthy while you study or work! Notes and resources at ocdevel.com/mlg/22 Deep NLP Fundamentals Deep learning has had a profound impact on natural language processing by introducing models like recurrent neural networks (RNNs) that are specifically adept at handling sequential data. Unlike traditional linear models like …
Try a walking desk to stay healthy while you study or work! Notes and resources at ocdevel.com/mlg/20 NLP progresses through three main layers: text preprocessing, syntax tools, and high-level goals, each building upon the last to achieve complex linguistic tasks. Text Preprocessing Text preprocessing involves essential steps such as tokenization, …

MLG 019 Natural Language Processing 2
1:05:54
Try a walking desk to stay healthy while you study or work! Notes and resources at ocdevel.com/mlg/19 Classical NLP Techniques: Origins and Phases in NLP History: Initially reliant on hardcoded linguistic rules, NLP's evolution significantly pivoted with the introduction of machine learning, particularly shallow learning algorithms, leading eventua…
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/18 Overview: Natural Language Processing (NLP) is a subfield of machine learning that focuses on enabling computers to understand, interpret, and generate human language. It is a complex field that combines linguistics, computer science, and AI to process and …
Try a walking desk to stay healthy while you study or work! At this point, browse #importance:essential on ocdevel.com/mlg/resources with the 45m/d ML, 15m/d Math breakdown.
By OCDevel
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/16 Inspiration in AI Development Early inspirations for AI development centered around solving challenging problems, but recent advancements like self-driving cars and automated scientific discoveries attract professionals due to potential economic automation …
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/15 Concepts Performance Evaluation Metrics: Tools to assess how well a machine learning model performs tasks like spam classification, housing price prediction, etc. Common metrics include accuracy, precision, recall, F1/F2 scores, and confusion matrices. Accu…
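The metrics named above all derive from the four cells of a binary confusion matrix. The sketch below computes them for a made-up spam example (1 = spam); the function names and data are illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, beta=1.0):
    """Accuracy, precision, recall, and F-beta (beta=1 gives F1, beta=2 gives F2)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta ** 2 * precision + recall
    fbeta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,  # of the items flagged positive, how many were right
        "recall": recall,        # of the true positives, how many were found
        "fbeta": fbeta,
    }


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here accuracy is 0.75 while precision and recall are both 2/3, which shows how accuracy can look better than the positive-class metrics when negatives dominate the data.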
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/14 Anomaly Detection Systems Applications: Credit card fraud detection and server activity monitoring. Concept: Identifying outliers on a bell curve. Statistics: Central role of the Gaussian distribution (normal distribution) in detecting anomalies. Process: I…
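The bell-curve idea can be sketched as: fit a mean and standard deviation to normal data, then flag values that fall far out in the tails. The latency numbers are made up, and the cutoff of 3 standard deviations is an assumption chosen for illustration, not a value taken from the episode.

```python
import statistics


def fit_gaussian(values):
    """Estimate the mean and sample standard deviation of 'normal' behavior."""
    return statistics.mean(values), statistics.stdev(values)


def is_anomaly(x, mean, std, threshold=3.0):
    """Flag x when it lies more than `threshold` standard deviations from the mean."""
    return abs(x - mean) / std > threshold


# Hypothetical server response times (ms) observed during normal operation
normal_latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100]
mean, std = fit_gaussian(normal_latencies_ms)

spike_flagged = is_anomaly(150, mean, std)
typical_flagged = is_anomaly(101, mean, std)
```

Lowering the threshold catches more anomalies at the cost of more false alarms, which is the same precision/recall trade-off that applies to any classifier.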
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/13 Support Vector Machines (SVM) Purpose: Classification and regression. Mechanism: Establishes decision boundaries with maximum margin. Margin: The thickness of the decision boundary, large margin minimizes overfitting. Support Vectors: Data points that the m…
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/12 Topics Shallow vs. Deep Learning: Shallow learning can often solve problems more efficiently in time and resources compared to deep learning. Supervised Learning: Key algorithms include linear regression, logistic regression, neural networks, and K Nearest …
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/10 Topics: Recommended Languages and Frameworks: Python and TensorFlow are top recommendations for machine learning. Python's versatile libraries (NumPy, Pandas, Scikit-Learn) enable it to cover all areas of data science including data mining, analytics, and m…
Try a walking desk to stay healthy while you study or work! Full notes at ocdevel.com/mlg/9 Key Concepts: Deep Learning vs. Shallow Learning: Machine learning is broken down hierarchically into AI, ML, and subfields like supervised/unsupervised learning. Deep learning is a specialized area within supervised learning distinct from shallow learning a…
Mathematics essential for machine learning includes linear algebra, statistics, and calculus, each serving distinct purposes: linear algebra handles data representation and computation, statistics underpins the algorithms and evaluation, and calculus enables the optimization process. It is recommended to learn the necessary math alongside or after …
The logistic regression algorithm is used for classification tasks in supervised machine learning, distinguishing items by class (such as "expensive" or "not expensive") rather than predicting continuous numerical values. Logistic regression applies a sigmoid or logistic function to a linear regression model to generate probabilities, which are the…
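Passing a linear model's output through the sigmoid can be sketched as follows. The weight, bias, feature scaling, and the 0.5 decision threshold are all hand-picked assumptions for illustration; a trained model would learn the weight and bias from data.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Linear model output passed through the sigmoid: P(class = 'expensive')."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Turn the probability into a class label using an assumed threshold."""
    probability = predict_proba(features, weights, bias)
    return "expensive" if probability >= threshold else "not expensive"


# Hypothetical parameters for one feature: house size in thousands of square feet
WEIGHTS, BIAS = [4.0], -8.0

small = predict_class([1.0], WEIGHTS, BIAS)
large = predict_class([3.0], WEIGHTS, BIAS)
boundary_probability = predict_proba([2.0], WEIGHTS, BIAS)
```

With these parameters the decision boundary sits at 2,000 square feet, where the linear term is zero and the sigmoid returns exactly 0.5.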
People interested in machine learning can choose between self-guided learning, online certification programs such as MOOCs, accredited university degrees, and doctoral research, with industry acceptance and personal goals influencing which path is most appropriate. Industry employers currently prioritize a strong project portfolio over non-accredit…
Linear regression is introduced as the foundational supervised learning algorithm for predicting continuous numeric values, using cost estimation of Portland houses as an example. The episode explains the three-step process of machine learning - prediction via a hypothesis function, error calculation with a cost function (mean squared error), and p…
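The first two steps named above, prediction with a hypothesis function and error measurement with mean squared error, can be sketched with one feature. The house sizes and prices are made up, and the sketch covers only those two steps.

```python
def hypothesis(x, weight, bias):
    """Predict a price from a single feature with a straight line."""
    return weight * x + bias


def mean_squared_error(xs, ys, weight, bias):
    """Average of squared differences between predictions and actual values."""
    errors = [(hypothesis(x, weight, bias) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Made-up data: size in thousands of square feet -> price in thousands of dollars
sizes = [1.0, 2.0, 3.0]
prices = [200.0, 400.0, 600.0]

perfect_fit = mean_squared_error(sizes, prices, weight=200.0, bias=0.0)
poor_fit = mean_squared_error(sizes, prices, weight=150.0, bias=0.0)
```

A lower cost means the line sits closer to the data, so comparing costs is how candidate weights are ranked against each other.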