Google Cloud Load Balancing

Worldwide Autoscaling and Load Balancing

Scale your applications on Google Compute Engine from zero to full throttle with Google Cloud Load Balancing, with no pre-warming needed. Distribute your load-balanced compute resources across one or more regions to stay close to your users and meet your high-availability requirements. Cloud Load Balancing can put your resources behind a single anycast IP address and scale them up or down with intelligent autoscaling. It comes in several flavors and integrates with Google Cloud CDN for optimal application and content delivery.

Global Load Balancing with Single Anycast IP

With Cloud Load Balancing, a single anycast IP address fronts all your backend instances in regions around the world. It provides cross-region load balancing, including automatic multi-region failover, which gently shifts traffic in fractions when backends become unhealthy. In contrast to DNS-based global load balancing solutions, Cloud Load Balancing reacts instantaneously to changes in users, traffic, network, backend health, and other related conditions.
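To make the idea of cross-region failover concrete, here is a minimal sketch that fills the nearest region first and spills the remainder to the next one. The `distribute` function, the region names, and the capacity numbers are illustrative assumptions, not the service's actual algorithm.

```python
def distribute(demand, regions):
    """Spread `demand` requests over regions listed nearest-first.

    `regions` is a list of (name, capacity) pairs. An unhealthy region
    can be modelled by giving it a capacity of 0.
    Returns (shares, unserved), where shares maps region name to load.
    """
    shares = {}
    remaining = demand
    for name, capacity in regions:
        # Take as much as this region can absorb, pass the rest along.
        take = min(remaining, capacity)
        shares[name] = take
        remaining -= take
    return shares, remaining
```

For example, with 150 requests and two regions of capacity 100 each, the nearest region takes 100 and the second takes the remaining 50.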

Software-Defined Load Balancing

Cloud Load Balancing is a fully distributed, software-defined, managed service for all your traffic. It is not an instance-based or device-based solution, so you won’t be locked into physical load balancing infrastructure or face the high-availability, scaling, and management challenges inherent in instance-based load balancers. You can apply Cloud Load Balancing to all of your traffic: HTTP(S), TCP/SSL, and UDP. You can also terminate your SSL traffic with HTTPS Load Balancing and SSL proxy.

Over One Million Queries Per Second

Cloud Load Balancing is built on the same front-end serving infrastructure that powers Google. It supports more than 1 million queries per second with consistently high performance and low latency. Traffic enters Cloud Load Balancing through more than 80 distinct global load balancing locations, maximizing the distance traveled on Google's fast private network backbone.

Seamless Autoscaling

Cloud Load Balancing scales as your users and traffic grow, and easily handles huge, unexpected, instantaneous spikes by diverting traffic to other regions that have capacity to take it. Autoscaling does not require pre-warming: you can scale from zero to full throttle in a matter of seconds.

Internal Load Balancing

Internal Load Balancing enables you to build scalable and highly available internal services for your internal client instances without exposing your load balancers to the Internet. GCP Internal Load Balancing is built on Andromeda, Google’s software-defined network virtualization platform.


High-performance, scalable load balancing on Google Cloud Platform

HTTP(S) Load Balancing

HTTP(S) load balancing can balance HTTP and HTTPS traffic across multiple backend instances, across multiple regions. Your entire app is available via a single global IP address, resulting in a simplified DNS setup. HTTP(S) load balancing is scalable, fault-tolerant, requires no pre-warming, and enables content-based load balancing. For HTTPS traffic, it provides SSL termination and load balancing.
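As a rough illustration of content-based load balancing, the sketch below picks a backend group from the request path by longest matching prefix. The route table, the backend group names, and the `pick_backend` function are hypothetical and are not meant to reflect how the service implements routing.

```python
# Hypothetical route table: path prefix -> backend group name.
ROUTES = {
    "/video": "video-backends",
    "/static": "static-backends",
}
DEFAULT_BACKEND = "web-backends"


def pick_backend(path: str) -> str:
    """Return the backend group for the longest matching path prefix."""
    best = ""
    for prefix in ROUTES:
        # Match whole path segments only, so "/videos" does not match "/video".
        if path == prefix or path.startswith(prefix + "/"):
            if len(prefix) > len(best):
                best = prefix
    # No match leaves `best` empty, which falls through to the default group.
    return ROUTES.get(best, DEFAULT_BACKEND)
```

Requests under `/video/` go to one group of instances and everything unmatched goes to the default group, while clients still see a single IP address.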

TCP/SSL Load Balancing

TCP load balancing can spread TCP traffic over a pool of instances within a Compute Engine region. It is scalable, does not require pre-warming, and health checks help ensure only healthy instances receive traffic. SSL proxy provides SSL termination for your non-HTTPS traffic with load balancing.

SSL Offload

SSL offload enables you to centrally manage SSL certificates and decryption. You can also enable encryption between your load balancing layer and your backends for the highest level of security, at the cost of some additional processing overhead on the backends.

UDP Load Balancing

UDP load balancing can spread UDP traffic over a pool of instances within a Compute Engine region. It is scalable, does not require pre-warming, and health checks help ensure only healthy instances receive traffic.

Stackdriver Logging

Stackdriver Logging for load balancing logs every request sent to your load balancer. You can use these logs for debugging and for analyzing your user traffic. You can view request logs and export them to Google Cloud Storage, Google BigQuery, or Google Cloud Pub/Sub for analysis.

Seamless Autoscaling

Autoscaling helps your applications gracefully handle increases in traffic and reduces cost when the need for resources is lower. You define the autoscaling policy, and the autoscaler scales automatically based on the measured load. No pre-warming is required: go from zero to full throttle in seconds.
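One way to picture a load-based autoscaling policy is a target-utilization rule: choose the instance count that brings average utilization back to the target. The sketch below is a simplified assumption for illustration; the `recommended_size` function and its parameters are hypothetical and are not the autoscaler's actual algorithm.

```python
import math


def recommended_size(current_size, avg_utilization, target_utilization,
                     min_size=0, max_size=10):
    """Suggest an instance count that brings utilization back to target.

    Utilization values are fractions, e.g. 0.75 for 75 percent.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Total load expressed in "fully busy instance" units.
    total_load = current_size * avg_utilization
    wanted = math.ceil(total_load / target_utilization)
    # Keep the result inside the limits set by the policy.
    return max(min_size, min(max_size, wanted))
```

With 4 instances at 75 percent utilization and a 50 percent target, the sketch suggests growing to 6 instances; at 25 percent utilization it suggests shrinking to 2.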

High Fidelity Health Checks

Health checks ensure that new connections are only load balanced to healthy backends that are up and ready to receive them. High fidelity health checks ensure that the probes mimic actual traffic to backends.
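The sketch below shows one common way health state can be derived from probe results: a backend changes state only after several consecutive probes disagree with its current state. The `HealthTracker` class and its default thresholds are assumptions made for illustration, not the service's implementation.

```python
class HealthTracker:
    """Track one backend's health from a stream of probe results."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = False  # assume unhealthy until probes succeed
        self._streak = 0      # consecutive probes contradicting current state

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the current health state."""
        if probe_ok == self.healthy:
            # Probe agrees with the current state, so reset the streak.
            self._streak = 0
        else:
            self._streak += 1
            limit = self.healthy_threshold if probe_ok else self.unhealthy_threshold
            if self._streak >= limit:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy
```

Requiring consecutive results keeps a single dropped probe from removing a backend from rotation, and keeps a backend that is still starting up from receiving connections too early.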


Affinity

Cloud Load Balancing affinity lets you direct user traffic to specific backend instances and keep it there.
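A simple way to see how affinity can keep a user on the same backend is to hash a stable client attribute, such as the client IP address, onto the list of instances. This is a minimal sketch under that assumption; the `pick_instance` function and the instance names are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_instance(client_ip: str, instances: Sequence[str]) -> str:
    """Map a client IP address to one instance, always the same one
    as long as the instance list does not change."""
    if not instances:
        raise ValueError("no instances available")
    # A stable hash, unlike Python's built-in hash(), gives the same
    # result across processes and restarts.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]
```

Note that with plain modulo hashing, adding or removing an instance remaps many clients; schemes such as consistent hashing reduce that churn.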

Cloud CDN Integration

Enable Cloud CDN for HTTP(S) Load Balancing with a single checkbox to optimize application delivery for your users.