Welcome to episode 299 of The Cloud Pod – where the forecast is always cloudy! Google Next is quickly approaching, and you know what that means – it’s time for predictions! Who will win this year’s Crystal Ball award? Only time and the main stage will tell. Join Matthew, Justin, and Ryan as they break down their thoughts on what groundbreaking (and less groundbreaking) announcements are in store for us.
Titles we almost went with this week:
- OpenAI and Anthropic join forces?
- It’s 2025, and AWS is still trying to make Jumbo packets happen
- Beanstalk and Ruby’s Updates!! They’re Alive!!!
- Google Colossus or how to expect a colossal cloud outage someday.
- The Cloud Pod gives an ode to Peter
A big thanks to this week’s sponsor:
We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.
AI Is Going Great – Or How ML Makes All Its Money
02:27 OpenAI adopts rival Anthropic’s standard for connecting AI models to data
- OpenAI is embracing Anthropic’s standard for connecting AI assistants to the systems where data resides, adopting Anthropic’s Model Context Protocol (MCP) across its products, including the ChatGPT desktop app.
- MCP is an open source standard that helps AI models produce better, more relevant responses to certain queries.
- Sam Altman says that people love MCP and that OpenAI is excited to add support across its products. It is available today in the Agents SDK, and support for the ChatGPT desktop app and the Responses API is coming soon.
- MCP lets models draw data from sources like business tools and software to complete tasks, as well as from content repositories and app development environments.
- We found two helpful articles that may help demystify this whole concept.
MCP: What It Is and Why It Matters – by Addy Osmani
Meet MCP: Your LLM’s Super-Helpful Assistant!
- Justin particularly loves Addy Osmani’s blog, which starts out with a simple ELI5 on understanding MCP. We’re going to quote verbatim:
 
- “Imagine you have a single universal plug that fits all your devices – that’s essentially what the Model Context Protocol (MCP) is for AI. MCP is an open standard (think “USB-C for AI integrations”) that allows AI models to connect to many different apps and data sources in a consistent way. In simple terms, MCP lets an AI assistant talk to various software tools using a common language, instead of each tool requiring a different adapter or custom code.”
- So, what does this mean in practice? If you’re using an AI coding assistant like Cursor or Windsurf, MCP is the shared protocol that lets that assistant use external tools on your behalf. For example, with MCP an AI model could fetch information from a database, edit a design in Figma, or control a music app – all by sending natural-language instructions through a standardized interface. You (or the AI) no longer need to manually switch contexts or learn each tool’s API; the MCP “translator” bridges the gap between human language and software commands.
- In a nutshell, MCP is like giving your AI assistant a universal remote control to operate all your digital devices and services. Instead of being stuck in its own world, your AI can now reach out and press the buttons of other applications safely and intelligently. This common protocol means one AI can integrate with thousands of tools as long as those tools have an MCP interface – eliminating the need for custom integrations for each new app. The result: your AI helper becomes far more capable, able to not just chat about things but take actions in the real software you use.
- The problem you’re solving:
 Without MCP, integrating an AI assistant with external tools is a bit like having a bunch of appliances each with a different plug and no universal outlet. Developers were dealing with fragmented integrations everywhere. For example, your AI IDE might use one method to get code from GitHub, another to fetch data from a database, and yet another to automate a design tool – each integration needing a custom adapter. Not only is this labor-intensive, it’s brittle and doesn’t scale. As Anthropic put it:
- “even the most sophisticated models are constrained by their isolation from data – trapped behind information silos…Every new data source requires its own custom implementation, making truly connected systems difficult to scale.”
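To make the "universal plug" idea concrete, here is a minimal sketch of one dispatcher that exposes several tools through a single message shape. It is a toy, loosely modeled on the JSON-RPC style requests MCP uses (`tools/list` and `tools/call`); it is not the official MCP SDK, and the `ToyToolServer` class and the tool names `query_db` and `play_song` are hypothetical.

```python
import json
from typing import Any, Callable, Dict


class ToyToolServer:
    """Toy stand-in for a tool server: one interface, many tools (not the real MCP SDK)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, func: Callable[..., str]) -> None:
        self._tools[name] = func

    def handle(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Dispatch a JSON-RPC-style request to the matching tool."""
        method = request.get("method")
        req_id = request.get("id")
        if method == "tools/list":
            result: Any = {"tools": [{"name": n} for n in sorted(self._tools)]}
        elif method == "tools/call":
            params = request.get("params", {})
            tool = self._tools.get(params.get("name"))
            if tool is None:
                return {"jsonrpc": "2.0", "id": req_id,
                        "error": {"code": -32602, "message": "unknown tool"}}
            text = tool(**params.get("arguments", {}))
            result = {"content": [{"type": "text", "text": text}]}
        else:
            return {"jsonrpc": "2.0", "id": req_id,
                    "error": {"code": -32601, "message": "method not found"}}
        return {"jsonrpc": "2.0", "id": req_id, "result": result}


# Hypothetical tools; a real server would wrap a database, a design tool, a music app, etc.
server = ToyToolServer()
server.register("query_db", lambda table: f"3 rows from {table}")
server.register("play_song", lambda title: f"now playing {title}")

call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "query_db", "arguments": {"table": "orders"}}}
print(json.dumps(server.handle(call)))
```

The point of the sketch is that the caller only ever learns one request shape; adding a new tool means registering it, not writing a new adapter.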
 
04:45 Justin – “Basically, I consider this to be SQL for AI.”
07:43 Announcing Anthropic Claude 3.7 Sonnet is natively available in Databricks
- Databricks is coming in late to the party with support for Claude 3.7 Sonnet
- Databricks is excited to announce that Anthropic Claude 3.7 Sonnet is now natively available in Databricks across AWS, Azure and GCP.
- For the first time, you can securely access Claude’s advanced reasoning, planning and agentic capabilities directly within Databricks.
08:53 OpenAI Goes Ghibli, Tech’s Secret Chats
- We talked last week about ChatGPT’s new image capabilities, but not everyone is pleased with the results.
- ChatGPT can make a pretty realistic version of Studio Ghibli’s unique cartoon/anime style, which will probably get OpenAI sued over copyright infringement.
AWS
11:17 Firewall support for AWS Amplify hosted sites 
- You can now integrate AWS WAF with AWS Amplify Hosting.
- Web application owners are constantly working to protect their applications from a variety of threats. Previously, if you wanted to implement a robust security posture for your Amplify-hosted applications, you needed to build architectures using Amazon CloudFront distributions with AWS WAF protection, which required additional configuration steps, expertise, and management overhead.
- With the GA of AWS WAF for Amplify hosting, you can now directly attach a web app firewall to your AWS Amplify apps through a one-click integration in the Amplify Console or using IaC. This integration gives you access to the full range of AWS WAF capabilities including managed rules, which provide protection against common web exploits and vulnerabilities like SQL injection and cross-site scripting (XSS). You can also create your own custom rules based on your specific application needs.
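For a sense of what "managed rules" look like as configuration, here is a small sketch that builds a rules list shaped like the `Rules` parameter of the WAFv2 `CreateWebACL` API. Nothing here calls AWS, and the `managed_rule` helper is a hypothetical convenience; attaching the resulting web ACL to an Amplify app is left to the console or IaC flow described above.

```python
from typing import Any, Dict, List


def managed_rule(name: str, priority: int) -> Dict[str, Any]:
    """Build one rule entry that references an AWS managed rule group."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": name}
        },
        # Keep the rule group's own actions instead of overriding them.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


rules: List[Dict[str, Any]] = [
    managed_rule("AWSManagedRulesCommonRuleSet", 0),  # common exploits, including XSS
    managed_rule("AWSManagedRulesSQLiRuleSet", 1),    # SQL injection patterns
]
print([r["Name"] for r in rules])
```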
12:19 Ryan – “This is one of those rough edges that you find the wrong way. So I’m glad they fixed this. If you’re using Amplify, I’m sure you don’t want to get down in the dirty in-network routing and how to implement the WAF. So you’re looking for something to apply the managed rules and protect yourself from bots and that kind of traffic. I imagine this is a great integration for those people that are using Amplify.”
17:35 Amazon EC2 now supports more bandwidth and jumbo frames to select destinations
- Amazon EC2 now supports up to the full EC2 instance bandwidth for inter-region VPC peering traffic and traffic to AWS Direct Connect.
- Additionally, EC2 supports jumbo frames of up to 8500 bytes for cross-region VPC peering.
- Before today, egress bandwidth for this traffic was limited to 50% of the aggregate bandwidth limit for instances with 32 or more vCPUs, and to 5 Gbps for smaller instances.
- Cross-region peering previously supported an MTU of only 1500 bytes. Now, customers can send traffic from EC2 across regions or toward AWS Direct Connect at the full instance baseline specification or 5 Gbps, whichever is greater, and can use jumbo frames across regions for peered VPCs.
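A quick back-of-the-envelope sketch shows what the two changes above mean in numbers. It assumes plain 20-byte IPv4 and 20-byte TCP headers with no options, and the 64-vCPU, 100 Gbps instance is hypothetical; the function names are illustrative, not an AWS API.

```python
import math

IP_TCP_HEADER_BYTES = 40  # assumption: 20-byte IPv4 + 20-byte TCP headers, no options


def old_egress_limit_gbps(vcpus: int, aggregate_gbps: float) -> float:
    """Previous cap as described above: 50% of aggregate for 32+ vCPUs, else 5 Gbps."""
    return aggregate_gbps * 0.5 if vcpus >= 32 else 5.0


def new_egress_limit_gbps(baseline_gbps: float) -> float:
    """New cap: full instance baseline or 5 Gbps, whichever is greater."""
    return max(baseline_gbps, 5.0)


def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Packets required when each packet carries mtu minus header bytes of payload."""
    return math.ceil(payload_bytes / (mtu - IP_TCP_HEADER_BYTES))


def payload_efficiency(mtu: int) -> float:
    """Fraction of each full-size packet that is application payload."""
    return (mtu - IP_TCP_HEADER_BYTES) / mtu


one_gib = 2**30
for mtu in (1500, 8500):
    print(mtu, packets_needed(one_gib, mtu), f"{payload_efficiency(mtu):.4f}")

# Hypothetical 64-vCPU instance with a 100 Gbps aggregate and baseline.
print(old_egress_limit_gbps(64, 100.0), new_egress_limit_gbps(100.0))
```

The per-packet efficiency gain is small, but moving the same data in far fewer packets reduces per-packet processing work on both ends.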
18:17 Justin – “I can see some benefits, as much as I made fun of it, but it’s one of those things that you run into in weird outage scenarios…so it’s nice, especially for going between availability zones and cross region peering. ”
20:20 AWS Lambda adds support for Ruby 3.4
- RUBY’S NOT DEAD!
- AWS Lambda now supports creating serverless apps using Ruby 3.4 (released in December 2024).
- Developers can use Ruby 3.4 as both a managed runtime and a container base image, and AWS will automatically apply updates to the managed runtime and base image as they become available.
- Ruby 3.4 is the latest LTS version of Ruby with support expected until March 2028.
20:56 Ryan – “I am astonished. I did not think that Ruby was a thing that was still supported.”
23:55 Amazon API Gateway now supports dual-stack (IPv4 and IPv6) endpoints
- Amazon is finally launching IPv6 support for Amazon API Gateway across all endpoint types, custom domains, and management APIs, in all commercial and AWS GovCloud (US) regions.
- You can configure REST, HTTP, and WebSocket APIs and custom domains to accept calls from IPv6 clients alongside the existing IPv4 support. You can also call the management APIs from IPv6 or IPv4 clients.
- Remember that AWS is still charging you for the IPv4 addresses, and there is no way to remove them.
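If you want to check whether an endpoint is really dual-stack, one simple approach is to look at the IP versions among its resolved addresses. The sketch below works on a list of address strings so it runs offline; the `classify_endpoint` helper is hypothetical, and the documentation-range addresses stand in for real DNS answers.

```python
import ipaddress
from typing import Iterable


def classify_endpoint(addresses: Iterable[str]) -> str:
    """Label an endpoint by the IP versions found among its resolved addresses."""
    versions = {ipaddress.ip_address(a).version for a in addresses}
    if versions == {4, 6}:
        return "dual-stack"
    if versions == {6}:
        return "ipv6-only"
    if versions == {4}:
        return "ipv4-only"
    return "unresolved"


# Documentation-range addresses stand in for real DNS answers.
print(classify_endpoint(["192.0.2.10", "2001:db8::10"]))
print(classify_endpoint(["192.0.2.10"]))
```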
24:45 Matthew – “It’s pretty required in mobile; that’s really the big area where you need it. Because the mobile networks have all gone IPv6.”
27:17 Announcing Amazon EKS community Add-ons catalog | Containers
- EKS supports add-ons that streamline supporting operational capabilities for Kubernetes applications.
- These add-ons come from AWS, partners, and the OSS community. But discovering these tools across multiple different avenues has resulted in chaos, plus security and misconfiguration risks.
- To fix this, Amazon is releasing the community add-ons catalog, which streamlines cluster operations by integrating popular community add-ons through native AWS management, broadening the choice of add-ons that users can install to their clusters directly using the EKS console, AWS SDK, CLI, and CloudFormation.
- Some of the essential capabilities you can find in the add-on catalog include:
- Metrics server
- Kube-state-metrics
- Prometheus-node-exporter
- Cert-manager
- External-DNS
- If you make an add-on you want to add, you can create an issue on the EKS roadmap GitHub requesting its inclusion.
28:04 Justin – “Those five examples all just seem like they should be a part of EKS. Just my personal opinion.”
29:34 Amazon Bedrock Custom Model Import introduces real-time cost transparency
- When importing your customized foundation models on-demand to Bedrock, you now get full transparency into the compute resources being used and can calculate inference costs in real time.
- This launch shows you the minimum compute resources, measured in Custom Model Units (CMUs), required to run the model, before model invocation, in the Bedrock console and through Bedrock APIs. As the models scale to handle more traffic, CloudWatch metrics provide real-time visibility into inference costs by showing the total number of CMUs used.
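As a rough illustration of how a CMU count turns into a cost estimate, here is a minimal sketch. It assumes cost scales linearly with the number of CMUs and the time they are active; the rate below is a made-up placeholder, not a real AWS price, and `inference_cost` is a hypothetical helper.

```python
def inference_cost(cmus: int, minutes: float, price_per_cmu_minute: float) -> float:
    """Estimated cost = CMUs in use x minutes active x price per CMU-minute."""
    if cmus < 0 or minutes < 0 or price_per_cmu_minute < 0:
        raise ValueError("inputs must be non-negative")
    return cmus * minutes * price_per_cmu_minute


# Placeholder numbers only: 2 CMUs active for 30 minutes at a made-up rate.
HYPOTHETICAL_RATE = 0.10  # dollars per CMU-minute, NOT a real AWS price
print(f"${inference_cost(2, 30, HYPOTHETICAL_RATE):.2f}")
```

Plug in the CMU count from the console or CloudWatch and the rate from the current pricing page to get a real estimate.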
30:05 Ryan – “The only common metric is money.”
30:33 AWS Elastic Beanstalk now supports retrieving secrets and configuration from AWS Secrets Manager and AWS Systems Manager
- See Matt – Beanstalk isn’t dead!
- AWS Elastic Beanstalk now enables customers to reference AWS Systems Manager Parameter Store parameters and AWS Secrets Manager secrets in environment variables.
- This new integration provides developers with a native method for accessing data from these services in their applications.
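From the application's side, a referenced secret should then look like any other environment variable. The sketch below assumes the reference has been resolved before the process starts; `DB_PASSWORD` is a hypothetical variable name, and it is set locally here only so the example runs outside Beanstalk.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value


# DB_PASSWORD is a hypothetical name; set here only so the sketch runs locally.
os.environ.setdefault("DB_PASSWORD", "local-dev-placeholder")
password = require_env("DB_PASSWORD")
print("secret loaded:", bool(password))
```

Failing at startup when the variable is missing tends to be easier to debug than a connection error later.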
31:04 Ryan – “It’s a crazy new feature for services that’s been around for a very long time.”
32:33 Amazon makes it easier for developers and tech enthusiasts to explore Amazon Nova, its advanced Gen AI models
- Check out https://nova.amazon.com. Can we kill PartyRock now?
- Amazon has created numerous generative AI applications, including Alexa+, Amazon Q, and Rufus, as well as tools like Bedrock.
- Building on their Amazon Nova models, they are now rolling out nova.amazon.com, a new website for easy exploration of their foundation models.
- They are also introducing Amazon Nova Act, a new AI model trained to perform actions within a web browser. They’re releasing a research preview of the Amazon Nova Act SDK, which will allow developers to experiment with an early version of the new model.
 
- “Nova.amazon.com puts the power of Amazon’s frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova,” said Rohit Prasad, SVP of Amazon Artificial General Intelligence. “We’ve created this experience to inspire builders, so that they can quickly test their ideas with Nova models, and then implement them at scale in Amazon Bedrock. It is an exciting step forward for rapid exploration with AI, including bleeding-edge capabilities such as the Nova Act SDK for building agents that take actions on the web. We’re excited to see what they build and to hear their useful feedback.”
GCP
36:04 Google Next is coming up VERY SOON!
BRK2-024 – Workload-optimized data protection for mission-critical enterprise apps
BRK1-028 – Unlock value for your workloads: Microsoft, Oracle, OpenShift and more
Google Next Predictions
- Ryan
  - Responsible AI, in Console/Service/SDK, to enable and/or visualize your responsible AI creation or usage
  - Endpoint security tools (CrowdStrike, patch management/vulnerability)
  - Won’t announce any new services, just enhancements for AI/Gemini/etc.
- Justin
  - AI agents specialized for DevOps and Kubernetes capabilities
  - Next generation of TPUs/GPUs optimized for multi-modal
  - Unification or major enhancement of Anthos & GKE Enterprise
- Matt
  - Green AI
  - 3 non-AI-specific keynotes
  - AI security thing that is not endpoint. More guardrails.
- Honorable Mentions
  - Industry verticalization for AI LLM models. Fine-tuning marketplace or special model for a specific industry/use case
  - Personal assistant for Workspace productivity
  - Multi-cloud tooling
- Number of times AI or ML is said on stage
  - Matt: 52
  - Justin: 97
  - Ryan: 1
52:08 Secure backups with threat detection and remediation | Google Cloud Blog
- Google is really nibbling on the edges of backups and disaster recovery, which I think is a sign that ransomware is still a big problem and concern for customers.
- Backup vault was announced last year as a powerful storage feature available as part of Google Cloud Backup and DR services.
- The point is to secure backups against tampering and unauthorized deletion, and it integrates with Security Command Center for real-time alerts on high-risk actions.
- To further support security needs, they are deepening the integration between Google Backup and DR and Security Command Center Enterprise. This includes new detections, including threats to the backup vault itself, and end-to-end workflows to help customers protect backup data.
33:53 Ryan – “I think not only is ransomware still a big issue, but also it’s hit the compliance round; it’s a question that comes up all the time in any kind of security audit or attestation – or even a customer walkthrough. It’s definitely an issue that’s in the front of people’s minds and something that’s annoying to fix in reality. So this is great.”
54:12 mLogica and Google Cloud partner on mainframe modernization
- The mainframe is still kicking. Google and mLogica have announced an expanded partnership focused on accelerating and de-risking mainframe application modernization. It combines mLogica’s LIBER*M automated code refactoring suite (available via Marketplace) with Google Cloud Dual Run for validation, offering a validated modernization path to their joint customers.
- LIBER*M provides automated assessment, code analysis, dependency mapping, and code transformation capabilities, and it supports multiple target languages and platforms, providing a crucial foundation for refactoring projects.
- Google Dual Run (I didn’t know this existed) enables the simultaneous operation of mainframe and cloud applications in parallel, letting you compare and validate refactored applications before cutting over.
- This, along with powerful testing capabilities, enables a controlled phase transition, minimizes business disruption and substantially reduces the risks inherent in large-scale mainframe modernization projects.
56:349 How Colossus optimizes data placement for performance
- Google has a great article about its foundational distributed storage system, the Colossus storage platform.
- Google’s universal storage platform Colossus achieves throughput that rivals or exceeds the best parallel file systems, has the management and scale of an object storage system, and has an easy-to-use programming model that’s used by all Google teams.
- Moreover, it does all this while serving the needs of products with incredibly diverse requirements, be it scale, affordability, throughput or latency.
| Example application | I/O sizes | Expected performance | 
| BigQuery scans | hundreds of KBs to tens of MBs | TB/s | 
| Cloud Storage – standard | KBs to tens of MBs | 100s of milliseconds | 
| Gmail messages | less than hundreds of KBs | 10s of milliseconds | 
| Gmail attachments | KBs to MBs | seconds | 
| Hyperdisk reads | KBs to hundreds of KBs | <1 ms | 
| YouTube video storage | MBs | seconds | 
- This flexibility shows up in publicly available Google products, from Hyperdisk ML to tiered storage for Spanner.
- Colossus is the evolution of GFS (Google File System). The traditional Colossus file system is contained in a single datacenter.
- Colossus simplified the GFS programming model to an append-only storage system that combines a file system’s familiar programming interface with the scalability of object storage.
- The Colossus metadata service is made up of “curators” that deal with interactive control operations like file creation and deletion, and “custodians,” which maintain the durability and availability of data as well as disk-space balancing.
- Colossus clients interact with the curators for metadata and then store data directly on “D servers,” which host its SSDs or HDDs.
- It’s also good to understand that Colossus is a zonal product: they build a single Colossus filesystem per cluster, an internal building block of a Google Cloud zone. Most data centers have one cluster and thus one Colossus filesystem, regardless of how many workloads run inside the cluster.
- Many Colossus file systems have multiple exabytes of storage, including two different filesystems that have in excess of 10 exabytes of storage each.
- Demanding applications also need large amounts of IOPS and throughput. In fact, some of Google’s largest file systems regularly exceed read throughputs of 50 TB/s and write throughputs of 25 TB/s. This is enough throughput to send more than 100 full-length 8K movies every second!
- Their single busiest cluster does over 600M IOPS, combined between read and write operations.
- Previously, when they talked about Colossus, they talked about how they place the hottest data on SSDs and balance the remaining data across all of the devices in the cluster. This is more pertinent today: over the years SSDs have gotten more affordable, but they still pose a substantial cost premium over blended fleets of SSD and HDD.
- To make it easier for their developers, they have an L4 distributed SSD caching layer which dynamically picks the data that is most suitable for SSD.
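The "hottest data on SSD" idea can be sketched with a toy greedy heuristic: rank blocks by reads per byte and fill the SSD tier until it is full. This is not Google's actual placement algorithm, which the article does not spell out at this level; the `place_on_ssd` function, block names, sizes, and read counts are all made up for illustration.

```python
from typing import Dict, List, Tuple


def place_on_ssd(blocks: Dict[str, Tuple[int, int]], ssd_capacity: int) -> List[str]:
    """Pick blocks for SSD by reads per byte until capacity runs out.

    blocks maps a block name to (size, recent_read_count); sizes must be positive.
    """
    ranked = sorted(blocks.items(), key=lambda kv: kv[1][1] / kv[1][0], reverse=True)
    chosen: List[str] = []
    used = 0
    for name, (size, _reads) in ranked:
        if used + size <= ssd_capacity:
            chosen.append(name)
            used += size
    return chosen


# Made-up workload: sizes and read counts are illustrative only.
workload = {
    "logs": (400, 40),
    "index": (100, 900),
    "thumbs": (200, 600),
    "archive": (800, 8),
}
print(place_on_ssd(workload, ssd_capacity=350))
```

Everything not chosen would stay on HDD, which is the part developers no longer have to think about when the caching layer makes the decision for them.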
33:53 Justin – “This is more pertinent today as over the years, the SSDs have gotten more affordable but still pose a substantial cost premium over blended fleets of SSD and HDD drives. To make it easier for developers, they have an L4 distributed SSD caching layer which dynamically picks the data that is most suitable for SSDs, so the developers don’t even have to think about the tiering. Take that, Amazon!”
1:03:26 AI-assisted BigQuery data preparation now GA
- BigQuery data preparation is now generally available. It also now integrates with BigQuery pipelines, letting you connect data ingestion and transformation tasks so you can create end-to-end data pipelines with incremental processing, all in a unified environment.
- Features include:
- Comprehensive transformation capabilities
- Data standardization
- Automated schema mapping
- AI-suggested join keys for data enrichment
- Visual data pipelines
- Data quality enforcement with error tables
- Streamlined deployment with GitHub integrations
1:03:59 Ryan – “Automated schema mapping is probably my biggest life work improvement.”
Azure
1:04:52 Announcing backup storage billing for SQL database in Microsoft Fabric: what you need to know
- Azure is back to charge you money for SQL backups. Previously, your Fabric capacity-based billing model included compute and data storage. By default, the system provides a full weekly backup, a differential backup every 12 hours, and transaction log backups every 10 minutes. After April 1st, 2025, backup storage that exceeds the allocated DB size will also be billed.
- Listen. We get charging for this, but what’s unclear to us is whether the backup frequency and retention period are configurable. If they’re not, this feels like a bit of a cost increase you can’t escape.
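To put the default schedule and the new billing rule in numbers, here is a small sketch. The schedule comes from the announcement as summarized above; the 100 GB and 130 GB sizes are hypothetical, and both helper functions are illustrative rather than part of any Microsoft API.

```python
def backups_per_week(diff_interval_hours: int = 12, log_interval_minutes: int = 10) -> dict:
    """Count backups taken in one week under the default schedule described above."""
    hours_per_week = 7 * 24
    return {
        "full": 1,
        "differential": hours_per_week // diff_interval_hours,
        "log": hours_per_week * 60 // log_interval_minutes,
    }


def billable_backup_gb(backup_used_gb: float, allocated_db_gb: float) -> float:
    """Only backup storage beyond the allocated database size is billed."""
    return max(0.0, backup_used_gb - allocated_db_gb)


print(backups_per_week())
# Hypothetical sizes: 100 GB allocated, 130 GB of backups retained.
print(billable_backup_gb(130.0, 100.0))
```

With over a thousand log backups a week, it is easy to see how retained backup storage could outgrow the database itself.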
1:05:46 Matthew – “That’s probably what happened – they realized how much more storage this is actually using.”
1:08:12 Announcing Alert Triage Agents in Microsoft Purview, powered by Security Copilot
- Microsoft says that, per their research, organizations face up to 66 Purview data loss prevention (DLP) alerts per day, up from 52 in 2023, with teams only able to review about 63% of them.
- Given the sheer volume of data security alerts, it’s no surprise – per Microsoft – it’s hard to keep up.
- To help customers increase the efficacy of their data security programs, address key alerts and focus on the most critical data risks, Microsoft is thrilled to announce Alert Triage Agents in Microsoft Purview Data Loss Prevention (DLP) and Insider Risk Management (IRM).
- These autonomous Security Copilot capabilities, integrated directly into Microsoft Purview, offer an agent-managed alert queue that identifies the DLP and IRM alerts that pose the greatest risk.
1:10:09 Ryan – “Doing something with DLP is really tricky, because you don’t want to all up in user’s data – but you want to make sure you are protected from data loss. So each one of these investigations for each one of these alerts is time consuming.”
Oracle
1:11:37 Announcing New AI Infrastructure Capabilities with NVIDIA Blackwell for Public, On-Premises, and Service Provider Clouds
- OCI is making available the latest and greatest NVIDIA GB300 NVL72 and NVIDIA HGX B300 NVL16 with Blackwell Ultra GPUs, providing early access to the AI acceleration.
- You can get the GB300 and B300 on bare metal, or you can use superclusters with up to 131,072 NVIDIA GB300 Grace Blackwell Ultra Superchips as part of rack-scale NVIDIA GB300 NVL72 solutions.
- Justin was trying to figure out what a supercluster would cost, but it wasn’t an option in the pricing calculator. However, he was able to pick 1 BM.GPU.GB200.4 with 4 GPUs and 756GB of memory running Autonomous Linux for $857,088 in monthly on-demand cost. A bargain!
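For scale, the monthly figure above can be broken down into hourly and per-GPU rates. The sketch assumes the calculator bills a 744-hour (31-day) month; that is an assumption, though the quoted total does divide evenly by 744.

```python
MONTHLY_COST_USD = 857_088  # figure quoted above for one BM.GPU.GB200.4
GPUS = 4
HOURS_PER_MONTH = 744  # assumption: the calculator bills a 31-day month

hourly = MONTHLY_COST_USD / HOURS_PER_MONTH
per_gpu_hour = hourly / GPUS
print(f"${hourly:,.2f}/hour for the instance, ${per_gpu_hour:,.2f}/GPU-hour")
```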
1:14:03 Justin – “I want to run Windows on it so I can open up task manager and see all the CPUs just scaling off.”
1:14:41 Oracle Launches OCI Compute E6 Standard Instances: 2X the Performance, Same Price
- In more reasonably priced instances, the E6 Standard bare metal and flex virtual machine instances are now available, powered by the 5th-gen AMD EPYC processors.
- OCI is among the first cloud providers to offer them. (“Among” is doing some heavy lifting here. Google was the *actual* first. Neither AWS nor Azure has announced yet.)
- Oracle is promising a performance of 2x that of the E5 at the same price.
- They feature a 2.7GHz base frequency with max boost up to 4.1GHz, based on the Zen 5 architecture.
- There are configurations from 1-126 OCPUs, with up to 3072 GB of memory for bare metal and 1454 GB for virtual machines.
1:17:37 Justin – “$10,285 for a bare metal running autonomous Linux. So that’s actually not that bad. It does jump up to $27,000 if you go for Windows. Yeah, so not bad. I only added 100 gigs of disk space, because who needs more than that? Capacity reservation didn’t change the price.”
1:18:25 Oracle under fire for its handling of separate security incidents
- Oracle is under fire for potential security breaches.
- The first one is related to Oracle Health; the breach impacts patient data.
- Oracle blamed the Cerner breach on an old legacy server not yet migrated to Oracle Cloud. Sure, Jan.
- The other breach may be on Oracle Cloud, and Oracle is being cagey. A hacker going by rose87168 posted on a cybercrime forum offering the data of 6 million Oracle Cloud customers, including authentication data and encrypted passwords.
- Several Oracle customers have confirmed that the data appears genuine, but Oracle has stated that there has been no breach, and the published credentials are not from the Oracle Cloud. Ok, so where did it come from?
- Cybersecurity expert Kevin Beaumont writes: “This is a serious cybersecurity incident which impacts customers, in a platform managed by Oracle. Oracle are attempting to wordsmith statements around Oracle Cloud and use very specific words to avoid responsibility. This is not ok.”
- Can’t be unbreakable if it’s breakable.
Closing
And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter or Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.