Welcome to episode 305 of The Cloud Pod – where the forecast is always cloudy! How did you do on your Microsoft Build Predictions? As badly as us? Plus we’ve got news on AWS service changes, a lifecycle catch up page for all those services that bought the farm, tons of Gemini news (seriously, like a lot) and even some AI for .NET.
Welcome to The Cloud Pod, and thanks for joining us!
Titles we almost went with this week:
- Google’s Jules: An AI Gem for Cloud Devs
- Autonomous Agents of Code: Jules’ Excellent Adventure in the Google Cloud
- Gemini 2.5 Shoots for the Stars with Cosmic-Sized AI Upgrades
- Resistance is Futile: OpenAI Assimilates Your Codebase
- AWS Transformers: Rise of the Agentic AI
- Teaching an old .NET dog new Linux tricks
- CodeBuild Puts Docker Builds in Hyperdrive
- Inspector Gadget’s New Trick: Mapping Container Vulnerabilities
- Yo Dawg, I Heard You Like Scanning Containers…
- Google Cranks AI to 11 with New Ultra Plan
- I, For One, Welcome Our New AI Ultra Overlords
- The Inference Engine That Could: llm-d Chugs Ahead with Kubernetes-Native Scaling
- Scaling Inference to Infinity and Beyond with Google Cloud’s llm-d
- Google Cloud and Spring AI: A Match Made in Java-n
- The Fast and the Serverless: Cloud Run Drifts into AI Studio Territory
- SQL Server 2025: A Vector Victor, Not a Scalar Failure
- AI will solve my life problems of having money in my pocket
- I used to scan all the containers but now I will just scan yours
AI Is Going Great – or How ML Makes Money
01:50 Jules: Google’s autonomous AI coding agent
- Jules is an autonomous AI agent that can read code, understand intent, and make code changes on its own.
- It goes beyond AI coding assistants to operate independently.
- It clones code into a secure Google Cloud VM, allowing it to understand the full context of a project. This enables it to write tests, build features, fix bugs, and more.
- Jules operates asynchronously in the background, presenting its plan and reasoning when complete. This allows developers to focus on other tasks while it works.
- Integration with GitHub enables Jules to work directly in existing workflows without extra setup or context switching. Developers can steer and give feedback throughout the process.
- For cloud developers, Jules demonstrates the rapid advancement of AI for coding moving from prototype to product. Its cloud-based parallel execution enables efficient handling of complex, multi-file changes.
- While in public beta, Jules is free with some usage limits. This allows developers to experiment with this cutting-edge AI coding agent and understand its potential to accelerate development on Google Cloud.
02:56 Ryan – “More and more, as new tools get released, it’s just going to change the way anything gets written… it’s getting more and more capable.”
05:45 Introducing Flow: Google’s AI filmmaking tool designed for Veo
- Flow is an AI-powered filmmaking tool custom-designed for Google’s advanced video, image and language models (Veo, Imagen, Gemini). It allows creators to generate cinematic video clips and scenes.
- The tool leverages cloud AI capabilities to make AI video generation more accessible. Creators can describe their vision in plain language, and bring their own image/video assets.
- Key features include camera controls, scene editing/extension, asset management, and a library of example clips. This aims to enable a new wave of AI-assisted filmmaking.
- Flow is an evolution of Google’s earlier VideoFX experiment, now productized for Google AI cloud subscribers. It’s an example of applied ML moving from research into cloud products and services.
- Potential use cases include storyboarding, pre-visualization, and final rendered clips for both amateurs and professional filmmakers. Early collaborations demonstrate applications in short films.
- For cloud providers and developers, Flow showcases how foundational AI models can be packaged into vertical applications. It represents an emerging class of AI tools built on cloud infrastructure.
- The ‘so what’: Flow demonstrates tangible progress in making generative AI accessible to creatives, powered by the scale and ease-of-use of the cloud. It signals the disruptive potential of cloud AI to reshape content creation industries.
- As of right now, Flow is available to users of Google AI Pro and Google AI Ultra.
06:53 Ryan – “This is another area – like coding – it’s going to change movie making and directing; because not only do you need to have the vision in your head, but you have to be good at the prompt engineering to get it out.”
07:37 Google I/O 2025: Gemini as a universal AI assistant
- Google is extending its multimodal foundation model, Gemini 2.5 Pro, into a “world model” that can understand, simulate, plan and imagine like the human brain. What could go wrong?
- Gemini is showing emerging capabilities to simulate environments, understand physics, and enable robots to grasp and follow instructions.
- The goal is to make Gemini a universal AI assistant that can perform tasks, handle admin, make recommendations, and enrich people’s lives across any device.
- Google is integrating live AI capabilities from Project Astra like video understanding, screen sharing and memory into products like Gemini Live, Search, and the Live API.
- Project Mariner is a research prototype exploring agentic capabilities, with a system of agents that can multitask up to 10 different things like looking up info, making bookings and purchases.
- These AI developments aim to make AI more personal, proactive, and powerful to boost productivity and usher in a new era of discovery.
- For cloud, this points to a future where highly capable AI agents and models can be accessed as a service to enhance any application with intelligent assistance.
- The implications are that cloud AI is evolving from single-purpose APIs to multi-skilled AI assistants that developers can leverage. Businesses should consider how universal AI agents could transform their products and customer experiences.
08:28 Justin – “I can’t wait for an assistant – my own personal JARVIS.”
09:50 Google I/O 2025: Updates to Gemini 2.5 from Google DeepMind
- Google announced major updates to its Gemini 2.5 large language models, including the 2.5 Pro and 2.5 Flash versions, which are leading benchmarks for coding, reasoning, learning, and more.
- New capabilities to the models include native audio output for more natural conversations, advanced security safeguards against prompt injection attacks, and the ability to use tools and access computers.
- An experimental “Deep Think” mode enables enhanced reasoning for highly complex math and coding tasks.
- Developer experience improvements include thought summaries for transparency, adjustable thinking budgets for cost control, and support for Model Context Protocol (MCP) tools.
- The models are available in Google’s cloud AI platforms like Vertex AI, and the Gemini API for businesses and developers to build intelligent applications
- The rapid progress and expanding capabilities of large language models have major implications for unlocking new AI use cases and experiences across industries
- The ‘so what’: Google’s Gemini models represent the state-of-the-art in large language model performance and are poised to enable a new wave of intelligent applications leveraging natural conversations, reasoning, coding and more. Businesses and developers should pay close attention as language AI rapidly becomes an essential cloud computing technology.
11:43 Google DeepMind creates super-advanced AI that can invent new algorithms – Ars Technica
- AlphaEvolve is a new AI coding agent from Google DeepMind based on their Gemini large language models, with the addition of an “evolutionary” approach to evaluate and improve algorithms.
- It uses an automatic evaluation system to generate multiple solutions to a problem, analyze each one, and iteratively focus on and refine the best solution.
- Unlike previous DeepMind AIs trained extensively on a single knowledge domain, AlphaEvolve is a general-purpose AI to aid research on any programming or algorithmic problem.
- Google has already started deploying AlphaEvolve across its business, with positive results.
- For cloud computing, AlphaEvolve could enable more intelligent, efficient and robust cloud services and applications by optimizing underlying algorithms and architectures.
- Businesses and developers could leverage AlphaEvolve to tackle complex problems and accelerate R&D in fields like scientific computing, analytics, AI/ML, etc. on the cloud.
- AlphaEvolve represents an important step towards using AI to augment human intelligence in solving big challenges in math, science and computing.
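The generate, evaluate, and refine loop described above can be illustrated with a toy evolutionary search. This is only a minimal sketch, not DeepMind's implementation: the candidate "programs" here are just coefficient pairs for a line, `mutate` is a hypothetical stand-in for the language model proposing variants, and `score` is a hypothetical stand-in for the automatic evaluator.

```python
import random

def score(candidate, data):
    """Automatic evaluator: negative squared error of y = a*x + b on the data."""
    a, b = candidate
    return -sum((a * x + b - y) ** 2 for x, y in data)

def mutate(candidate, rng, step=0.5):
    """Stand-in for the model proposing a variant of an existing solution."""
    a, b = candidate
    return (a + rng.uniform(-step, step), b + rng.uniform(-step, step))

def evolve(data, generations=200, population=20, keep=5, seed=0):
    rng = random.Random(seed)
    pool = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(population)]
    for _ in range(generations):
        # Rank every candidate with the evaluator and keep the best few.
        pool.sort(key=lambda c: score(c, data), reverse=True)
        survivors = pool[:keep]
        # Refill the pool with mutated copies of the survivors.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population - keep)]
        pool = survivors + children
    return max(pool, key=lambda c: score(c, data))
```

Calling `evolve([(x, 2 * x + 1) for x in range(-5, 6)])` should drift toward coefficients near `(2, 1)`. The point is the shape of the loop: the evaluator, not a human, decides which candidates survive each round.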
13:25 Justin – “The other AIs doing all the programming work, this is creating the new algorithms, and then we’re getting quantum computing which is just going to figure out all the possibilities and figure out that we’re just going to die at this point…”
14:08 OpenAI introduces Codex, its first full-fledged AI agent for coding
- OpenAI has released Codex, an AI agent that can generate production-ready code based on natural language prompts from developers.
- Codex runs in a containerized environment that mirrors the user’s codebase and development setup.
- Developers can provide an “AGENTS.md” file to give Codex additional context and guidance on project standards.
- Codex is built on the codex-1 model, a variant of OpenAI’s o3 reasoning model that was trained via reinforcement learning on a broad set of coding tasks.
- For cloud developers, Codex could automate routine programming work, boosting productivity.
- Businesses could leverage Codex to rapidly prototype cloud applications and services.
- Codex represents a major step towards AI systems becoming full-fledged software development partners working alongside human programmers.
- While still in research preview, Codex points to a future where AI is deeply integrated into the cloud application development lifecycle.
- We’re currently not spending the money on this one – so if any of our listeners out there are using this, we’d love to hear about your experiences.
- RIP to everyone’s jobs.
Cloud Tools
16:11 HashiCorp: Introducing HashiCorp Validated Patterns for Product Use Cases
- HashiCorp Validated Patterns provide pre-built, validated solutions for common use cases using HashiCorp tools like Terraform, Vault, Consul, and Nomad.
- They help accelerate time-to-value by providing a starting point for building and deploying production-ready infrastructure and apps in the cloud.
- Patterns cover core use cases, like service networking, zero trust security, multi-cloud deployments, Kubernetes deployments, and more.
- Validated Patterns integrate with major cloud platforms including AWS, Azure, and Google Cloud Platform. What, no Oracle?
- Validated Patterns solve the problem of figuring out best practices and recommended architectures when using HashiCorp tools for common scenarios.
- The patterns are fully open source and customizable, allowing users to adapt them to their specific needs.
- This matters for YOU – the cloud professional – because it makes it faster and easier to properly implement HashiCorp tools in production by leveraging curated, validated solutions.
17:02 Matt – “I looked a little bit more into the article… they’re like, cool. Terraform with Prisma Cloud by Palo Alto Networks. Maybe that’s a good idea? I don’t know, I just feel like there’s gonna be someone that runs a Terraform destroyer, takes down your time in Prisma Cloud. Feels like a bad life choice.”
AWS
18:22 AWS service changes
- It’s a big week for killing things off… RIP.
- AWS is ending support for several services including Amazon Pinpoint, AWS IQ, IoT Analytics, IoT Events, SimSpace Weaver, Panorama, Inspector Classic, Connect Voice ID, and DMS Fleet Advisor.
- End of support means these services will no longer be available after specific announced dates.
- AWS will provide customers with detailed migration guidance and support to transition to alternative services.
- Some services, like AWS Private 5G and DataSync Discovery, have already reached the end of support and are no longer accessible.
- This announcement matters because ending support for services can significantly impact customers who rely on them, and requires careful planning to migrate.
- Customers should review the end of support dates and migration paths in the linked documentation for each affected service.
- The AWS Product Lifecycle page provides more details on end of support timelines and options: https://aws.amazon.com/products/lifecycle
19:15 Introducing the AWS Product Lifecycle page and AWS service availability updates
- AWS launched a new Product Lifecycle page that provides a centralized view of upcoming changes to AWS service availability, including services closing to new customers, services announcing end of support, and services that have reached end of support.
- The page helps customers stay informed about service changes that may impact their workloads and plan migrations more efficiently by consolidating lifecycle information in one place.
- Several services are closing to new customers after June 20, 2025 but will continue to operate for existing users, while other services have announced specific end of support dates.
- Services that have already reached end of support and are no longer accessible include AWS Private 5G and AWS DataSync Discovery
- The Product Lifecycle page integrates with existing resources like service documentation pages that provide detailed migration guidance for services being discontinued
- Having a single reference for service lifecycle information reduces time spent tracking down updates across different pages and allows customers to focus on their core business
- Checking the Product Lifecycle page regularly along with the What’s New with AWS page is recommended to stay on top of important availability changes
- This page is missing ones previously announced, but it’s a good place to start.
21:09 Justin – “Sometimes they build stuff to see if it sticks to the wall, and maybe it does for one or two customers, but then no one else is interested, and I think that’s a death knell for a lot of these things.”
22:50 Introducing Strands Agents, an Open Source AI Agents SDK
- Strands Agents is an open source SDK that simplifies building AI agents by leveraging advanced language models to plan, chain thoughts, call tools, and reflect.
- Developers can define an agent with just a prompt and list of tools.
- It integrates with Amazon Bedrock models that support tool use and streaming, as well as models from Anthropic, Meta’s Llama, and other providers.
- Strands can run anywhere.
- The model-driven approach of Strands reduces complexity compared to frameworks requiring complex agent workflows and orchestration. This enables faster development and iteration on AI agents.
- Use cases include conversational agents, event-triggered agents, scheduled agents, and continuously running agents.
- Strands provides deployment examples for AWS Lambda, Fargate, and EC2.
- For The Cloud Pod listeners, Strands Agents dramatically lowers the bar to building practical AI agents on AWS by providing an open source, model-driven framework to define, test and deploy agents that leverage state-of-the-art language models. Teams at AWS already use it in production.
- Strands Agents project on GitHub: https://github.com/strands-agents
- Pricing: Varies based on usage of underlying models and AWS services. (General estimate, pricing not provided in article. YMMV.)
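To make the "prompt plus a list of tools" idea concrete, here is a minimal sketch of a model-driven agent loop using only the standard library. It is not the Strands SDK API: the `Agent` class, the `stub_model` function, and both tools are hypothetical, and the stub replaces a real language model call so the example runs offline.

```python
def word_count(text: str) -> int:
    """Example tool: count whitespace-separated words."""
    return len(text.split())

def shout(text: str) -> str:
    """Example tool: upper-case the input."""
    return text.upper()

def stub_model(prompt, user_input, tool_names, history):
    """Hypothetical stand-in for an LLM call.

    Returns ("tool", name, argument) to request a tool, or
    ("final", answer) once it has what it needs.
    """
    asks_for_count = "how many words" in user_input.lower()
    if not history and asks_for_count and "word_count" in tool_names:
        quoted = user_input.split('"')[1]
        return ("tool", "word_count", quoted)
    if history:
        name, result = history[-1]
        return ("final", f"{name} returned {result}")
    return ("final", "I have no tool for that.")

class Agent:
    def __init__(self, prompt, tools, model=stub_model, max_steps=5):
        self.prompt = prompt
        self.tools = {t.__name__: t for t in tools}
        self.model = model
        self.max_steps = max_steps

    def __call__(self, user_input):
        history = []
        for _ in range(self.max_steps):
            # The model, not hand-written workflow code, picks the next step.
            decision = self.model(self.prompt, user_input, list(self.tools), history)
            if decision[0] == "final":
                return decision[1]
            _, name, argument = decision
            history.append((name, self.tools[name](argument)))
        return "Stopped: step limit reached."
```

With `agent = Agent("You are helpful.", [word_count, shout])`, a question such as `agent('How many words are in "the forecast is always cloudy"?')` goes through one tool call before the final answer. The framework code stays a plain loop, and all planning lives in the model.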
23:49 Ryan – “I hope we don’t get too many more of these to be honest, because now OpenAI has one, Google has one, Amazon now has one – it feels like great, we’ve got a whole bunch of open source options that do the same thing. And it’s like, instead of collaborating in the open space, in the open source market, they’re creating their own competing versions of it. And it’s going to make things diverge, which I don’t like.”
25:43 AWS Transform for .NET, the first agentic AI service for modernizing .NET applications at scale
- AWS Transform for .NET is a new AI-powered service that automates porting .NET Framework applications to cross-platform .NET, making modernization faster and less error-prone. This matters because ported apps are 40% cheaper to run on Linux, have 1.5-2x better performance, and 50% better scalability.
- It integrates with source code repositories like GitHub, GitLab, Bitbucket and provides experiences through a web UI for large-scale portfolio transformation and a Visual Studio extension for individual projects.
- New capabilities include support for private NuGet packages, porting MVC Razor views, executing ported unit tests, cross-repo dependency detection, and detailed transformation reports.
- Enterprises with large portfolios of legacy .NET Framework apps that want to modernize to Linux – in order to reduce costs and improve performance/scalability – will benefit most.
- Individual developers can also use it to port specific projects.
- For The Cloud Pod listeners, this automates a previously manual, time-consuming process of porting .NET apps to Linux. It showcases how AWS is innovating by applying AI to solve real customer challenges around app modernization at scale.
- Official service page: https://aws.amazon.com/transform/net/
- Pricing: No additional charge for AWS Transform itself. Standard pricing applies for any AWS resources used to run the ported applications. (General estimate based on article.)
33:00 Accelerate CI/CD pipelines with the new AWS CodeBuild Docker Server capability | AWS News Blog
- Yes, another way to run Docker in Amazon.
- AWS CodeBuild‘s new Docker Server capability provisions a dedicated, persistent Docker server within a CodeBuild project in order to accelerate Docker image builds.
- It centralizes image building on a remote host with consistent caching, reducing wait times and increasing efficiency (up to 98% faster builds in the example).
- The persistent Docker server maintains layer caches between builds, especially beneficial for large, complex Docker images with many layers
- Integrates seamlessly with existing CodeBuild projects – simply enable the Docker Server option when creating or editing a project.
- Supports both x86 (Linux) and ARM architectures
- Ideal for CI/CD pipelines that frequently build and deploy Docker images, dramatically improving throughput.
- Pricing varies based on Docker Server compute type; be sure to check the CodeBuild pricing page for details.
- Available in all regions where CodeBuild is offered.
- For teams heavily using Docker in their build pipelines, this new CodeBuild capability can provide a major speed boost and efficiency gain with minimal setup or workflow changes required. Faster builds mean faster deployments. You’re welcome.
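Why a persistent layer cache helps can be shown with a small simulation. This is a sketch under simplifying assumptions, not how CodeBuild or Docker is implemented: each layer is identified by a hash of its instruction plus its parent layer, and a changed source tree is modeled by a changed instruction string. The names `layer_key` and `build` are hypothetical.

```python
import hashlib

def layer_key(parent_key: str, instruction: str) -> str:
    """A layer's identity depends on its instruction and everything before it."""
    return hashlib.sha256(f"{parent_key}\n{instruction}".encode()).hexdigest()

def build(instructions, cache):
    """Return the number of layers that had to be rebuilt.

    `cache` is a set of layer keys. Reusing the same set across calls models
    a persistent build server; a fresh set models an ephemeral one.
    """
    rebuilt = 0
    parent = ""
    for instruction in instructions:
        key = layer_key(parent, instruction)
        if key not in cache:
            rebuilt += 1          # cache miss: execute the instruction
            cache.add(key)
        parent = key
    return rebuilt

base = ["FROM python:3.12", "RUN pip install -r requirements.txt"]
commit_1 = base + ["COPY . /app (source revision 1)"]
commit_2 = base + ["COPY . /app (source revision 2)"]
```

Building `commit_1` and then `commit_2` against the same `cache` set rebuilds only the final layer the second time, while building `commit_2` against an empty set rebuilds all three. That gap is what grows with large, many-layered images.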
34:02 Justin – “Right now, if you have CodeBuild and you want to build on a Docker server, you have to connect to an ECS or Fargate instance that’s inside of a VPC elsewhere. So you had to do peering to where you code build environments. And now you can basically run this as a fully managed Docker server inside the code build environment. So you don’t have to do all those extra connectivity steps. That’s the advantage here.”
34:50 Amazon Inspector enhances container security by mapping Amazon ECR images to running containers
- Amazon Inspector now maps Amazon ECR container images to running containers in Amazon ECS and EKS, providing visibility into which images are actively deployed and their usage patterns.
- This enhancement allows security teams to prioritize fixing vulnerabilities based on severity and actual runtime usage of the container images.
- Inspector shows the cluster ARN, number of EKS pods/ECS tasks an image is deployed to, and last run time to help prioritize fixes.
- Vulnerability scanning is extended to minimal base images like scratch, distroless, and Chainguard images, and supports additional ecosystems like Go, Oracle JDK, Tomcat, WordPress and more.
- This enables comprehensive security scanning even for highly optimized container environments, eliminating the need for multiple tools.
- The features work across single AWS accounts, cross-account setups, and AWS Organizations via delegated admin for centralized vulnerability management.
- Available now in all regions where Amazon Inspector is offered at no additional cost, so that’s a plus.
- The enhancements significantly improve container security posture by focusing on vulnerabilities in images that are actively running, not just sitting in a repository.
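The prioritization idea, fixing what is actually running first, can be sketched as a sort. The `Finding` fields below are hypothetical and do not mirror the Amazon Inspector API response schema; the CVE labels are placeholders.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1}

@dataclass
class Finding:
    image: str
    cve: str
    severity: str
    running_count: int   # ECS tasks plus EKS pods currently using the image

def prioritize(findings):
    """Order findings: running images first, then severity, then deployment count."""
    return sorted(
        findings,
        key=lambda f: (f.running_count > 0, SEVERITY_RANK[f.severity], f.running_count),
        reverse=True,
    )

findings = [
    Finding("app:old", "CVE-A", "CRITICAL", 0),   # sitting in the repository only
    Finding("app:v2", "CVE-B", "HIGH", 12),
    Finding("api:v5", "CVE-C", "HIGH", 3),
    Finding("web:v1", "CVE-D", "CRITICAL", 1),
]
```

Here `prioritize(findings)` puts the critical finding on a running image first and pushes the critical finding on an unused image to the end, which is the trade-off the runtime mapping makes possible.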
GCP
36:38 Google announces AI Ultra subscription plan
- Google AI Ultra is a new premium subscription plan providing access to Google’s most advanced AI models and features, including Gemini, Flow, Whisk, NotebookLM, and more.
- Offers the highest usage limits and early access to cutting-edge capabilities like Veo 3 video generation and Deep Think 2.5 Pro enhanced reasoning mode
- Integrates Google AI directly into apps like Gmail, Docs, Chrome browser for seamless AI assistance
- Includes YouTube Premium and 30 TB of Google One storage. Let’s be honest: they’re just trying to justify the cost here. YouTube Premium? Really?
- The plan targets filmmakers, developers, researchers and power users demanding “the best Google AI has to offer”.
- Costs $249.99/month with a 50% off intro offer for the first 3 months, U.S. only initially. We DO love a good promo code, but we fully expected this to be the new norm of $100 a month.
- Expands Google’s AI offerings to compete with Microsoft, Amazon, OpenAI and others in the rapidly growing generative AI market.
- They’ll still charge you for other stuff, don’t worry.
40:46 Database Center is now generally available
- Database Center is an AI-powered unified fleet management solution that simplifies monitoring, optimization, and security for database fleets on GCP
- It provides a single pane of glass view across Cloud SQL, AlloyDB, Spanner, Bigtable, Memorystore, and Firestore databases.
- Proactively identifies risks and provides intelligent recommendations to optimize performance, reliability, cost, compliance and security.
- Introduces an AI-powered natural language chat interface to ask questions, resolve issues, and get optimization recommendations
- Leverages Google’s Gemini foundation models to enable assistive performance troubleshooting, of course.
- Database Center allows creating custom views, tracking historical data on database resources and issues, and centralizing database alerts.
- Competes with database management offerings from AWS and Azure, but differentiates with AI-powered insights and tight integration with GCP’s database and AI/ML services.
- Key use cases include enterprises managing large fleets of databases powering critical applications that need unified visibility and optimization
- There is no additional cost for core features, but premium features like Gemini-based performance/cost recommendations require Gemini Cloud Assist.
- Advanced security requires Security Command Center subscription, which is VERY pricey, so be wary.
41:47 Ryan – “While I really like this feature, I want to make fun of it just because it’ll be like a lot of the other Google services where it’ll just be very confusing to the end user – where they won’t really know which service they’re using under the covers. They’ll click a button, they’ll set up a whole bunch of stuff up, and then they’ll get a bill that has AlloyDB on it and they’ll be like, I don’t understand what this is at all. So I look forward to that conversation.”
42:18 GKE Data Cache, now GA, accelerates stateful apps | Google Cloud Blog
- GKE Data Cache is a new managed solution that accelerates read-heavy stateful apps on GKE by intelligently caching data from persistent disks on high-speed local SSDs.
- It can provide up to 480% higher transactions/sec and 80% lower latency for PostgreSQL on GKE.
- It simplifies implementing a high-performance cache layer vs complex manual setup, and supports all read/write Persistent Disk types.
- Competes with offerings like Amazon ElastiCache and Azure Cache for Redis, but is more tightly integrated with GKE and Persistent Disks.
- There are potential cost savings by allowing use of smaller persistent disks and less memory, while still achieving high read performance. Just remember, those local disks go away when the server dies.
- Key use cases include databases, analytics platforms, content management systems, developer environments that need fast startup.
- Pricing is based on local SSD usage and varies by configuration. For example, a 375 GB local SSD is $95.40/month.
- Also, I’d like to point out that once again Ryan is trying to convince Justin to run things in containers that shouldn’t be in containers. Cody would like a word.
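The caching behavior, including the warning about local disks disappearing, can be sketched as a small read cache in front of a durable store. This is an illustration only, not how GKE Data Cache is implemented: the `ReadCache` class is hypothetical, it uses simple LRU eviction, and it assumes writes always reach the durable store before the cache.

```python
from collections import OrderedDict

class ReadCache:
    """LRU read cache (the 'local SSD') in front of a durable store (the 'persistent disk')."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)       # mark as most recently used
            return self.cache[block]
        self.misses += 1
        value = self.backing[block]             # slow path: go to durable storage
        self._store(block, value)
        return value

    def write(self, block, value):
        self.backing[block] = value             # durable copy is updated first
        self._store(block, value)

    def _store(self, block, value):
        self.cache[block] = value
        self.cache.move_to_end(block)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)      # evict least recently used

    def node_failure(self):
        """Local SSD contents vanish with the node; the durable store is untouched."""
        self.cache.clear()
```

After `node_failure()`, reads still return correct data because the durable store holds every write. They are just slow again until the cache warms back up.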
45:30 Enhancing vLLM for distributed inference with llm-d
- Google Cloud is introducing llm-d, an open-source project that enhances the vLLM inference engine to enable distributed and disaggregated inference for large language models (LLMs) in a Kubernetes-native way.
- llm-d makes inference more cost-effective and easier to scale by incorporating a vLLM-aware inference scheduler, support for disaggregated serving to handle longer requests, and a multi-tier KV cache for intermediate values.
- Early tests by Google Cloud using llm-d show 2x improvements in time-to-first-token for use cases like code completion.
- llm-d is a collaboration between Google Cloud, Red Hat, IBM Research, NVIDIA, CoreWeave, AMD, Cisco, Hugging Face, Intel, Lambda, and Mistral AI, all leveraging Google’s proven technology in securely serving AI at scale.
- It works across PyTorch and JAX frameworks and supports both GPU and Google Cloud TPU accelerators, providing flexibility and choice.
- Deploying llm-d on Google Cloud enables low-latency, high-performance inference by integrating with Google’s global network, GKE AI capabilities, and AI Hypercomputer across software and hardware.
- Key use cases include agentic AI workflows and reasoning models that require highly scalable and efficient inference.
- As AI moves from prototyping to large-scale deployment, efficient inference becomes critical. llm-d tackles this challenge head-on, optimizing vLLM to drastically improve performance and cost-effectiveness for demanding LLM workloads. It showcases Google Cloud’s leadership in AI infrastructure and commitment to open innovation.
- Show editor note: Remember in the 300th episode blog post where I said I was doing so much better understanding all the technical information? Yeah. I take it back.
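For anyone else squinting at "vLLM-aware inference scheduler": one way to picture it is a router that sends each request to the replica that already has the most matching prompt prefix cached, so less work is repeated. The sketch below is a toy illustration under that assumption, not llm-d's actual scheduler; `Replica`, `route`, and the word-level "tokens" are all hypothetical.

```python
def common_prefix_len(a, b):
    """Number of leading tokens two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class Replica:
    def __init__(self, name):
        self.name = name
        self.cached_prompts = []   # token sequences whose KV entries are still warm
        self.load = 0              # requests routed here so far

    def best_prefix(self, tokens):
        return max((common_prefix_len(tokens, p) for p in self.cached_prompts), default=0)

def route(tokens, replicas):
    """Prefer the replica with the longest warm prefix; break ties by lowest load."""
    chosen = max(replicas, key=lambda r: (r.best_prefix(tokens), -r.load))
    chosen.cached_prompts.append(list(tokens))
    chosen.load += 1
    return chosen
```

Two requests that share a long system prompt land on the same replica, while an unrelated request goes to the less loaded one. A cache-unaware round-robin balancer would scatter the first two and recompute the shared prefix each time.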
47:48 Ryan – “I wonder if this is capitalizing on… did the community look at Vertex AI and some of the things that they’ve sort of ‘productized’ and be like, how are you doing it? And then started the collaboration that way? It’d be kind of fun to be a fly on the wall and how this was made.”
49:17 Google Cloud and Spring AI 1.0
- Spring AI 1.0 enables seamless integration of AI capabilities into Java applications running on Spring Boot, allowing enterprises to leverage AI without complex integrations.
- Supports various AI models for image generation, audio transcription, semantic search, and chatbots.
- Provides tools to enhance chat models with memory, external functions, private data injection, vector stores, accuracy evaluation, and cross-service connectivity via the Model Context Protocol (MCP).
- Integrates with Google Cloud’s Vertex AI platform and Gemini models, though specific comparisons to other cloud AI offerings are not provided.
- Utilizes Google Cloud’s AlloyDB or Cloud SQL for scalable, highly-available PostgreSQL databases with pgVector capabilities to support vector similarity searches.
- Key use cases include modernizing enterprise Java applications with AI capabilities across various industries already using Spring Boot.
- Developers should care as it significantly lowers the barrier to entry for incorporating AI into their Java applications, with familiar Spring abstractions and starter dependencies
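The vector similarity search mentioned above boils down to ranking stored embeddings by how close they are to a query embedding. The following is a minimal sketch in plain Python rather than Spring AI or pgvector code; the three-dimensional vectors and document names are made up for illustration, and real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, documents, k=2):
    """Return the k document ids most similar to the query vector."""
    ranked = sorted(documents.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

documents = {
    "billing": [0.9, 0.1, 0.0],
    "shipping": [0.1, 0.9, 0.0],
    "returns": [0.2, 0.8, 0.1],
}
```

A query vector of `[0.0, 1.0, 0.0]` ranks "shipping" ahead of "returns" and leaves "billing" out of the top two. A vector store performs the same ranking inside the database, typically with an index so it does not have to scan every row.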
50:14 Ryan – “I guess Spring Boot’s better as a framework for Java apps than some things that have come before it. It’s done a good job of standardizing a lot of Java startups…so I guess if you do the same thing with AI integration perhaps it will be a little easier?”
51:49 AI Studio to Cloud Run and Cloud Run MCP server
- AI Studio now allows deploying apps directly to Cloud Run with one click, making it faster and easier to go from idea to shareable app.
- Gemma 3 models can be deployed from AI Studio to Cloud Run, enabling easy scaling of Gemma projects to production on serverless infrastructure with GPU support
- The new Cloud Run MCP server lets MCP-compatible AI agents (like Claude, Copilot, Google Gen AI SDK) deploy apps to Cloud Run, empowering AI-assisted development.
- These integrations streamline the AI app development workflow on GCP, from building and testing in AI Studio to production deployment on Cloud Run’s scalable serverless platform.
- Cloud Run’s granular billing and free tier make hosting AI Studio apps very cost-effective, with estimates starting at $0/mo with 2M free requests, then pay-per-use after that.
- Automated deployment from AI agents via the MCP server is a differentiator vs. other clouds, leveraging GCP’s strength in AI.
- Key use cases include rapid prototyping and deployment of AI-powered apps, scaling Gemma/LLM workloads, and AI agent-based development.
- Developers and businesses looking to quickly build and deploy AI apps at scale without infrastructure overhead should take note of these new capabilities that demonstrate GCP’s expanding and integrating AI/ML portfolio.
52:47 Justin – “MCP is like the new ClickOps.”
Azure
55:14 Remember last week when Matt made us do Build Predictions? Well, as predicted – we did horribly.
Ryan
- Announce an enhancement to GitHub Copilot that allows agentic code development and agentic tasks.
- Full Coding Agent
- Agent Mode in GitHub Copilot
- Quantum Computing – Double down on Majorana and quantum computing capabilities.
- Augmented/Virtual Reality for Teams (Right subject, wrong cloud.)
Matt
- New Version of the ARM processor Cobalt
- New generation of Surface hardware
- Major update to the App Services Platform in Azure
Justin
- Microsoft will launch their own LLM
- Microsoft Office Copilot upgrade with MCP inclusion in it.
- Agentspaces or Glean Type Competitor
- Specifically, Satya Nadella mentioned that Microsoft 365 Copilot can now search across data from various applications, including Salesforce. (16:53)
Number of times Copilot will be mentioned in the keynote
- 55 Justin
- 75 Matt
- 62 Ryan – Actual Number 69 (If you didn’t chuckle we can’t be friends.)
- 1 Jonathan
Big Congrats to Ryan for winning – at Azure predictions. Lotto? No. Azure? Yes.
https://www.youtube.com/watch?v=LdE3WlQ__GY
1:01:58 Azure AI Foundry: Your AI App and agent factory | Microsoft Azure Blog
- Azure AI Foundry is an end-to-end platform for building and deploying AI apps and agents.
- It provides a unified development experience across code (Visual Studio Code), collaboration (GitHub), and cloud (Azure).
- It offers a growing catalog of state-of-the-art AI models, including Grok 3, Flux Pro 1.1, Sora, and 10,000+ open-source models from Hugging Face.
- A new model router optimizes model selection.
- The Azure AI Foundry Agent Service (now GA) enables designing, deploying and scaling production-grade AI agents. It integrates with 1,400+ enterprise data sources and platforms like Microsoft 365, Slack, Twilio.
- Multi-agent orchestration allows agents to collaborate on complex workflows across clouds. Agentic retrieval in Azure AI Search improves answer relevance by 40% for multi-part questions.
- Enterprise-grade features include end-to-end observability, first-class identity management via Microsoft Entra Agent ID, and built-in responsible AI guardrails.
- Foundry Local is a new runtime for building offline, cross-platform AI apps on Windows and Mac. Integration with Azure Arc enables central management of edge AI.
- Compared to AWS and GCP, Azure AI Foundry offers tighter integration with Microsoft’s developer tools and enterprise platforms. It targets customers building enterprise AI workflows.
- Azure AI Foundry aims to democratize AI development with an integrated, full-stack platform. Its agent orchestration, enterprise features, and edge runtime differentiate it.
- For companies already using Azure and Microsoft 365, it could accelerate adoption of generative AI in their apps and processes.
1:04:33 Powering the next AI frontier with Microsoft Fabric and the Azure data portfolio | Microsoft Azure Blog
- Microsoft Fabric and Azure data services are being enhanced to power the next generation of AI applications that combine analytical, transactional, and operational data in structured and unstructured forms.
- Cosmos DB NoSQL database is now available in Microsoft Fabric to handle semi-structured data for AI apps, in addition to SQL databases.
- Pricing starts at $0.25/hour for serverless instances.
- A new “digital twin builder” low-code tool allows creating virtual replicas of physical and logical entities to enable analytics, simulations and process automation.
- Power BI is getting a new Copilot experience to allow users to chat with their data and ask questions; this will also integrate with Microsoft 365 Copilot.
- SQL Server 2025 preview adds vector database capabilities and integrations with AI frameworks like LangChain to power intelligent apps. Pricing varies based on cores and edition.
- The PostgreSQL extension for VS Code now includes GitHub Copilot for AI assistance writing queries. Azure Database for PostgreSQL adds high-performance vector indexing.
- Azure Cosmos DB and Azure Databricks now integrate with Azure AI Foundry to store conversation data and power AI solutions.
- Microsoft is partnering with SAP on the SAP Business Data Cloud and SAP Databricks on Azure initiatives to help customers innovate on SAP data.
- These enhancements position Azure as a leader in converging databases, analytics and AI compared to point solutions from AWS and GCP, targeting enterprise customers building next-gen AI applications.
1:06:15 Matt- “The big thing here is Cosmos DB – that felt like a little bit of a gap in the past.”
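For listeners wondering what the "vector database capabilities" mentioned above mean in practice: the core operation is ranking stored embeddings by how similar they are to a query embedding. Here is a minimal Python sketch, assuming toy three-dimensional vectors and hypothetical document names. It is not SQL Server's or PostgreSQL's actual syntax, just the idea underneath.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, rows, k=2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in rows.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
DOCS = {
    "invoice_faq": [0.9, 0.1, 0.0],
    "billing_howto": [0.8, 0.3, 0.1],
    "holiday_menu": [0.0, 0.2, 0.9],
}

results = top_k([1.0, 0.2, 0.0], DOCS)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

This sketch scans every row, which is fine for three documents. A vector index exists to avoid that brute-force scan once the table gets large.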
1:07:09 Transforming R&D with agentic AI: Introducing Microsoft Discovery
- Microsoft Discovery is a new enterprise AI platform that aims to accelerate research and development (R&D) by enabling scientists to collaborate with specialized AI agents and a graph-based knowledge engine.
- Microsoft says it can help drive scientific outcomes faster and more accurately.
- Discovery integrates with Azure infrastructure and services to provide enterprise-grade trust, compliance, governance and extensibility.
- Researchers can bring their own models, tools and datasets. It also leverages innovations from Microsoft Research and will integrate future capabilities like reliable quantum computing.
- The platform introduces a new “agentic AI” paradigm where people and AI agents cooperatively refine knowledge and experimentation iteratively in real-time. The AI can deeply reason over nuanced scientific data, specialize across domains, and learn and adapt.
- While AWS and GCP offer some AI/ML tools for research, Microsoft Discovery appears to be a more comprehensive, specialized platform focused on the full R&D lifecycle and scientific reasoning. The agentic AI approach also seems novel.
- Target customers include R&D teams in industries like chemistry, materials, pharma, manufacturing, semiconductors and more. Microsoft is partnering with companies like GSK, Estée Lauder, NVIDIA, Synopsys and others.
- For Cloud Pod listeners, this shows how Microsoft is applying advanced AI to help enterprises accelerate scientific innovation, a key economic engine. It demonstrates Azure’s AI/ML capabilities and how Microsoft is partnering across industries.
1:09:37 Agentic DevOps: Evolving software development with GitHub Copilot and Microsoft Azure
- GitHub Copilot is evolving into an AI-powered coding agent that collaborates with developers across the entire software development lifecycle, from planning to production. GOOD LUCK.
- The new agentic DevOps approach reimagines DevOps by having intelligent agents automate and optimize each stage, while keeping developers in control.
- Agent mode in GitHub Copilot can analyze codebases, make multi-file edits, generate tests, fix bugs and suggest commands based on prompts.
- The new coding agent in Copilot acts as a peer programmer, taking on code reviews, tests, bug fixes and feature specs so developers can focus on high-value work.
- Azure is adding app modernization capabilities to Copilot to assess, update and remediate legacy Java, .NET and mainframe apps to reduce technical debt.
- The new Azure Site Reliability Engineering (SRE) Agent monitors production apps 24/7, responds to incidents and troubleshoots autonomously to improve reliability.
- GitHub Models make it easy to experiment with and deploy AI models from various providers right from GitHub with enterprise guardrails.
- Microsoft is open-sourcing the GitHub Copilot extensions in VS Code, reflecting their commitment to transparency and community-driven AI development.
- These agentic AI capabilities remove friction, reduce complexity and change the cost structure of software development while enabling developer creativity.
1:06:15 Matt- “During the keynote they talked about (if) there’s a production outage and it automatically goes and scales and fixes it and then makes an issue that then it can self fix with their GitHub Copilot agent. It’s really terrifying. You’re gonna wake up all of a sudden to an Azure bill of like $400,000, because it’d be like, hey, there’s a problem with your SQL… All of a sudden I’m writing, you know, 128 vCores on my SQL Hyperscale cluster because someone’s DDoSing me. Feel like there’s gonna be things it’s gonna miss.”
1:12:46 OCI Launches E6 Standard Compute Powered by AMD
- Oracle Cloud Infrastructure (OCI) has launched new E6 Standard Compute instances powered by AMD EPYC processors, claiming “up to 55% better price-performance compared to similar compute offerings” – but specifics are vague and comparisons likely cherry-picked.
- E6 instances are “supposedly” ideal for workloads like web servers, application servers, batch processing, and distributed analytics – but these are generic use cases that any major cloud provider can easily handle.
- Oracle touts security benefits from using “in-house designed servers with built-in firmware-level security” – an improvement, but likely table stakes compared to security from AWS, Azure, GCP.
- E6 instances offer up to 128 OCPUs, 2,048 GB of RAM, and 1 PB of remote block storage – specs that match or trail other cloud providers, despite Oracle’s positioning as “industry-leading price performance.”
- Oracle claims E6 is “the best price-performance in the industry for scale-out workloads” – a bold claim that warrants deep skepticism without rigorous, independent benchmarking.
- Pricing details are unclear beyond a starting price of “$0.075 per OCPU hour” – Oracle’s pricing is notoriously complex and opaque compared to major cloud rivals.
- Oracle is likely targeting existing Oracle database/software customers and trying to keep them in the Oracle ecosystem as they move to the cloud – but organizations are increasingly adopting multi-cloud strategies.
- For most organizations using AWS, Azure or GCP, there’s little reason to get excited – those clouds offer similar or better options, with more mature ecosystems and without Oracle lock-in risks.
- Oracle wants to stay relevant in cloud discussions with splashy “we’re the best!” announcements – but informed observers will remain healthily skeptical until proven otherwise.
- Show note editor Heather would like to remind Oracle fanboys (are there any of those?) that the snark factor in this one was brought to you by AI. Star Wars and Zoolander puns? All me. Oracle snark? Justin’s AI prompts.
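To put the quoted starting price in context, here is a back-of-the-envelope calculation for the largest E6 shape mentioned above. It is a rough sketch only: it assumes the $0.075 figure applies to every OCPU, uses a 730-hour month as a billing approximation, and leaves out memory, storage, and network charges, none of which are listed in these notes.

```python
HOURS_PER_MONTH = 730  # assumption: common billing approximation (8,760 hours / 12)

def ocpu_cost(ocpus, price_per_ocpu_hour, hours):
    """Compute-only cost; memory, storage, and network charges are not included."""
    return ocpus * price_per_ocpu_hour * hours

PRICE = 0.075    # USD per OCPU hour, the starting price quoted in the notes
MAX_OCPUS = 128  # largest OCPU count mentioned in the notes

hourly = ocpu_cost(MAX_OCPUS, PRICE, 1)
monthly = ocpu_cost(MAX_OCPUS, PRICE, HOURS_PER_MONTH)
print(f"{MAX_OCPUS} OCPUs: ${hourly:.2f}/hour, about ${monthly:,.2f}/month")
```

That works out to $9.60 per hour, or roughly $7,008 per month, for compute alone at the quoted rate. A real bill would depend on the pieces this sketch ignores.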
Closing
And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter, Slack team, send feedback or ask questions at theCloudPod.net or tweet at us with hashtag #theCloudPod