
Google Cloud and Docker Supercharge AI App Deployments with Compose-to-Cloud Run Integration

Simplifying AI App Deployment Workflows


Deploying AI applications from local development to production environments is now easier than ever, thanks to a groundbreaking collaboration between Google Cloud and Docker. 

The seamless integration of Docker Compose with Cloud Run empowers developers to move sophisticated, multi-container AI apps from their desktops to the cloud effortlessly while taking advantage of the affordable and flexible Cloud Run service.

What is Cloud Run?

Google Cloud Run is a fully managed serverless platform that enables you to run stateless containers in a highly scalable and cost-effective environment. It abstracts away all infrastructure management, allowing you to go from a container image to a globally accessible web service with a single command. 

You only pay for the exact resources you use while your code is running, making it an ideal solution for deploying everything from simple microservices to complex, GPU-accelerated AI applications.
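For context, going from a container image to a running service typically takes a single command. Below is a minimal sketch, assuming an authenticated gcloud CLI and an image already pushed to Artifact Registry; the service name, image path, and region are illustrative placeholders:

```shell
# Deploy a container image as a publicly reachable Cloud Run service.
# "my-service", the image path, and "us-central1" are placeholders.
gcloud run deploy my-service \
  --image=us-central1-docker.pkg.dev/my-project/my-repo/my-app:latest \
  --region=us-central1 \
  --allow-unauthenticated
```

Cloud Run builds out the HTTPS endpoint, autoscaling, and TLS automatically; no cluster or load balancer setup is required.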

Streamlined Path from Development to Production

Transitioning AI projects from local Docker Compose environments to managed cloud platforms once required cumbersome manual configuration and translation. This was particularly challenging for agentic applications using several services and self-hosted AI models. 

Now, with support for the open-source Compose Specification, Google Cloud introduces the gcloud run compose up command, letting developers deploy compose.yaml files directly to Cloud Run.

  • No more code refactoring or rewriting configurations for the cloud
  • Automatic container builds from source code
  • Native Cloud Run volume mount support for persistent storage
  • Uniform configuration across local and cloud environments
name: agent
services:
  webapp:
    build: .
    ports:
      - "8080:8080"
    volumes:
      - web_images:/assets/images
    depends_on:
      - adk

  adk:
    image: us-central1-docker.pkg.dev/jmahood-demo/adk:latest
    ports:
      - "3000:3000"
    models:
      - ai-model

models:
  ai-model:
    model: ai/gemma3-qat:4B-Q4_K_M
    x-google-cloudrun:
      inference-endpoint: docker/model-runner:latest-cuda12.2.2

volumes:
  web_images:

Example of a compose.yaml file
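With a file like the one above, the same configuration can drive both local iteration and cloud deployment. The commands below are a hedged sketch based on the names in the announcement; the feature is in private preview, so exact flags and behavior may differ:

```shell
# Iterate locally: build and run all services defined in compose.yaml.
docker compose up

# Deploy the same compose.yaml directly to Cloud Run
# (private preview; run from the directory containing compose.yaml).
gcloud run compose up
```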

This innovation simplifies deployment workflows, allowing for rapid local iteration and frictionless scaling to production. Currently in private preview, the feature invites early adopters to experience its benefits firsthand.

Cloud Run: Tailored for Modern AI Workloads

Cloud Run is emerging as a go-to platform for deploying large language models (LLMs) and AI applications. The recent general availability of GPU support eliminates barriers for developers seeking scalable, GPU-powered infrastructure. Key advantages include:

  • Pay-per-second billing for optimal cost control
  • Scale-to-zero for efficient resource management
  • Impressively fast scaling—around 19 seconds for advanced model inference

These features position Cloud Run as a perfect fit for Docker’s latest OSS MCP Gateway and Model Runner, making it easier to go from local AI development to robust cloud production. Developers can now use Docker Compose’s new ‘models’ attribute to specify AI models, with Cloud Run seamlessly handling inference endpoints and runtime images.
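Locally, the same `models` entries are backed by Docker Model Runner. As a rough sketch, the model from the example compose file can be pulled and exercised with the `docker model` commands that ship with recent Docker Desktop releases (availability and output format may vary by version):

```shell
# Pull the model referenced in compose.yaml into the local Model Runner.
docker model pull ai/gemma3-qat:4B-Q4_K_M

# Send a one-off prompt to verify the model serves inference locally.
docker model run ai/gemma3-qat:4B-Q4_K_M "Summarize Cloud Run in one sentence."
```

On Cloud Run, the `x-google-cloudrun` extension takes over this role, pointing the service at a GPU-backed inference runtime image.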

Real-World Example: Multi-Container AI Apps

The collaboration highlights a compose.yaml file for a demo AI app, demonstrating how:

  • Multiple services (e.g., a web interface and agentic development kit) are defined in a single file
  • Storage volumes provide reliable data persistence
  • The new models attribute streamlines AI model selection
  • Cloud Run-specific extensions automate runtime image configuration for inference

This unified approach ensures developers maintain a single configuration file from prototyping through production, dramatically reducing deployment errors and improving reliability.

Championing Open Standards in AI-Native Development

Google Cloud’s commitment to open standards and developer flexibility shines through in this partnership. By supporting the Compose Specification and working closely with Docker, Cloud Run delivers a familiar, powerful toolkit for building, testing, and deploying agentic AI applications at scale.

As AI projects become increasingly complex, this collaboration sets a new bar for simplicity, consistency, and innovation—enabling teams to focus on advancing AI rather than wrestling with infrastructure.

Empowering the Future of AI App Delivery

The integration of Docker Compose with Cloud Run marks a pivotal moment for AI developers aiming to streamline their delivery pipelines. By bridging the gap between local development and cloud deployment, Google Cloud and Docker are making AI-native development more scalable, accessible, and robust. Developers eager to explore these capabilities can sign up for the private preview, heralding a new era for AI application deployment.

Source: Google Cloud Blog


Joshua Berkowitz July 13, 2025