Google Cloud adds Claude Apps Gateway for enterprises

Wed, 1st Jul 2026

Google Cloud has introduced support for Anthropic's Claude Apps Gateway, a service designed to help organisations manage Claude Code deployments on Google Cloud.

The self-hosted gateway sits between local Claude Code clients and Google Cloud, giving companies a central point for sign-in, access rules, usage tracking, spending limits and request routing. It ships with the same Claude binary and can be deployed on Cloud Run or Kubernetes environments such as GKE.

The move addresses a practical problem for companies that have grown beyond small developer groups. Individual engineers can already connect Claude Code to a Google Cloud project using Vertex-based inference within their existing cloud boundary. But broader roll-outs have required each developer to manage separate cloud credentials and local configuration files, with limited central oversight.

Under the gateway model, developers sign in through an organisation's identity provider, including Google Workspace or another OpenID Connect service. The gateway then exchanges that sign-in for a short-lived session, rather than placing service-account keys, API keys or project identifiers on a developer's machine.
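As a rough illustration of that exchange, the sketch below issues and checks a short-lived signed session token using only the Python standard library. The announcement does not describe the token format, so the HS256 algorithm, the claim names (sub, groups, iat, exp), the 15-minute lifetime and the function names issue_session and verify_session are all assumptions made for the example, not the gateway's actual implementation.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_session(email, groups, key, now=None, ttl_seconds=900):
    """Return a signed token for an identity already verified by the IdP.

    Hypothetical helper: claim names and the HS256 choice are assumptions.
    """
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": email, "groups": groups, "iat": now, "exp": now + ttl_seconds}
    signing_input = (
        _b64url(json.dumps(header).encode()) + "." + _b64url(json.dumps(claims).encode())
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_session(token, key, now=None):
    """Return the claims if the signature is valid and unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        signing_input = f"{header_b64}.{claims_b64}".encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(claims_b64))
    except ValueError:
        # Covers malformed tokens, bad base64 and invalid JSON.
        return None
    if not isinstance(claims, dict) or now >= claims.get("exp", 0):
        return None
    return claims
```

Because the token expires on its own and carries no cloud credential, a copy left on a laptop is only useful until the expiry time passes.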

That design also changes how access control is applied. Role-based access rules are stored in a central configuration file and enforced server-side, with the gateway checking model access on each message request. Local changes on a developer's device do not override those centrally defined policies.
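A minimal sketch of what a server-side check of this kind might look like follows. The policy layout, the group names and the model names are hypothetical, since the announcement does not publish the configuration schema. The point it illustrates is that the decision uses only the groups from the verified session and the central policy, never anything the client claims about itself.

```python
# Hypothetical central policy: group name -> models that group may use.
POLICY = {
    "engineering": {"model-small", "model-large"},
    "contractors": {"model-small"},
}


def allowed_models(session_groups, policy):
    """Union of the models granted to any of the user's groups."""
    allowed = set()
    for group in session_groups:
        allowed |= policy.get(group, set())
    return allowed


def authorize(session_groups, requested_model, policy):
    """Return True only if the central policy grants the requested model.

    session_groups must come from the verified session token, not from
    request headers or local settings supplied by the client.
    """
    return requested_model in allowed_models(session_groups, policy)


def handle_message(session_groups, requested_model, policy=POLICY):
    """Run the check on every message request and return an HTTP-style status."""
    if not authorize(session_groups, requested_model, policy):
        return 403
    return 200
```

A user in an unknown group receives no models at all in this sketch, which is one way to make the default deny rather than allow.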

Central controls

Usage monitoring is another part of the offering. Token usage metrics can be tied to a verified email address and group membership from the session token, then sent through OTLP/HTTP to a customer-selected monitoring system such as Cloud Monitoring, Grafana or Datadog.
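The attribution step can be pictured with the short sketch below. It builds a plain dictionary per request and totals tokens by group. It is not the OTLP wire format, and the attribute names (user.email, user.groups, tokens.input, tokens.output) and claim names are assumptions chosen for illustration.

```python
from collections import defaultdict


def attribute_usage(claims, model, input_tokens, output_tokens):
    """Attach the verified identity from the session claims to one usage record."""
    return {
        "user.email": claims["sub"],
        "user.groups": sorted(claims.get("groups", [])),
        "model": model,
        "tokens.input": input_tokens,
        "tokens.output": output_tokens,
    }


def total_tokens_by_group(records):
    """Sum input and output tokens for each group seen in the records."""
    totals = defaultdict(int)
    for record in records:
        tokens = record["tokens.input"] + record["tokens.output"]
        for group in record["user.groups"]:
            totals[group] += tokens
    return dict(totals)
```

A user who belongs to several groups is counted once under each of them here, so group totals can add up to more than the organisation-wide figure.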

The gateway also includes spending controls. Administrators can set daily, weekly or monthly limits for a user, a group or the whole organisation through an admin API. The service records token usage against a Cloud SQL ledger and returns a 429 response if a limit is reached.

The spend figures behind those limits are calculated from list prices, so the limits are intended to guard against excessive use rather than to replace billing reconciliation where discounts or negotiated pricing apply.
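The spending check can be sketched as follows, with an in-memory list standing in for the Cloud SQL ledger. The price per million tokens, the Monday start of the week and the names Ledger, list_price_cost and check_limit are assumptions for the example; the announcement does not specify how periods are bounded or how prices are stored.

```python
from datetime import date, timedelta


def list_price_cost(tokens, usd_per_million_tokens):
    """Estimate cost from a list price (hypothetical rate supplied by the caller)."""
    return tokens / 1_000_000 * usd_per_million_tokens


def period_start(today, period):
    """First day of the current daily, weekly or monthly window."""
    if period == "daily":
        return today
    if period == "weekly":
        # Assumption: weeks start on Monday.
        return today - timedelta(days=today.weekday())
    if period == "monthly":
        return today.replace(day=1)
    raise ValueError(f"unknown period: {period}")


class Ledger:
    """In-memory stand-in for the spending ledger."""

    def __init__(self):
        self.entries = []  # (day, user, cost_usd)

    def record(self, day, user, cost_usd):
        self.entries.append((day, user, cost_usd))

    def spent(self, user, since):
        """Total recorded cost for a user on or after the given day."""
        return sum(c for d, u, c in self.entries if u == user and d >= since)


def check_limit(ledger, user, today, period, limit_usd):
    """Return 429 once the user's spend in the window reaches the limit, else 200."""
    spent = ledger.spent(user, period_start(today, period))
    return 429 if spent >= limit_usd else 200
```

The same shape extends to group or organisation limits by summing over more users before comparing against the limit.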

For routing, inference requests are sent under a single Cloud Run service identity. Organisations can direct traffic to the global endpoint for Agent Platform or define additional upstream entries to provide failover if the first destination returns 5xx errors, 429 responses or timeouts.
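The failover behaviour described here might be sketched as below. The upstream names, the UpstreamTimeout exception and the send callback are hypothetical; the sketch only shows the ordering logic, in which the next upstream is tried when the current one returns a 5xx error, a 429 response or times out.

```python
class UpstreamTimeout(Exception):
    """Raised by the send callback when an upstream does not answer in time."""


def should_fail_over(status):
    """True for the response codes that trigger a move to the next upstream."""
    return status == 429 or 500 <= status <= 599


def route(upstreams, send):
    """Try each upstream in order and return (name, status, body).

    send(name) must return a (status, body) tuple or raise UpstreamTimeout.
    If every upstream fails, the last failure is returned with name None.
    """
    last_status, last_body = 503, "no upstreams configured"
    for name in upstreams:
        try:
            status, body = send(name)
        except UpstreamTimeout:
            last_status, last_body = 504, f"{name} timed out"
            continue
        if should_fail_over(status):
            last_status, last_body = status, body
            continue
        return name, status, body
    return None, last_status, last_body
```

Client errors such as 400 or 403 are returned as they are in this sketch, since retrying them against another upstream would not change the outcome.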

Inference remains inside the customer's Google Cloud project, with existing quota, billing and data-processing arrangements unchanged. That is likely to matter for larger companies that want to keep AI workloads within an established governance and compliance structure.

Deployment steps

The setup uses several existing Google Cloud services. A typical deployment starts with enabling Agent Platform, Cloud SQL and Secret Manager, creating a dedicated service account with access to Vertex AI functions, and provisioning a small PostgreSQL instance in Cloud SQL to hold sign-in state and spending records.

Configuration is then stored in Secret Manager, including the gateway settings, OpenID Connect client secret, PostgreSQL connection string and a key used to sign JSON Web Tokens. The gateway is deployed as a stateless container, allowing it to scale horizontally behind the Cloud Run load balancer.

Developers connect to the service over the corporate network, and organisations can expose it through an internal Application Load Balancer if they want to keep access private. A public deployment is also possible, provided developers can reach the configured URL.

For end users, onboarding can be handled through mobile device management policies that push managed settings to laptops. Those settings tell Claude Code to use the gateway login method and point to the correct internal address, reducing the need for manual setup during a company-wide roll-out.

The approach reflects a wider trend in enterprise AI deployments: companies want the convenience of coding assistants but also need controls that fit existing identity systems, internal networks and cost-management processes. Rather than requiring each developer to configure direct cloud access separately, the gateway concentrates those decisions into a service that platform teams can operate centrally.

In operation, the gateway validates its own session token, checks policy and forwards requests to Agent Platform using the Cloud Run service account. Cloud SQL stores the device-code sign-in state and the spending ledger, and an OTLP collector receives the attributed metrics.

Image: Ivan Nardini and Roy Arsan