Everyone in the IT business has heard of load balancing, but many don’t know much about it. Even technology professionals who are familiar with the concept may know little about how it works. A brief introduction to the subject of load balancing and its development will help to round out our IT education.
As the name suggests, load balancing is a way to distribute the workload among network devices. A spokesman for hosting provider UKFast offers this definition in a promotional video: “Load balancing is taking the overall hosting burden and spreading it across multiple devices so that at any given point in time a single piece of hardware is not overwhelmed.”
That’s an excellent description, but we can modify it slightly. By taking out the reference to hardware, we recognize that network devices these days may not be physical at all. Virtualization has spread from servers to network components, and software-defined networking (SDN) and network functions virtualization (NFV) can benefit from load balancing as well.
Another quote from the video gives us more insight into load balancing: “Basically, any business with a multi-server strategy which can’t afford a moment of downtime should consider investing in load balancing. It’s also important to note that high levels of uptime equate to a good online customer experience.”
For mission-critical applications, load balancing makes a lot of sense. Resources on an individual server are finite, and there is always the chance of an outage. Load balancing is about ensuring adequate capacity to run services and eliminating single points of failure. It’s too risky to do without it.
Traditional load balancing came into play in the 1990s. In its white paper “Load Balancing 101: Nuts and Bolts”, industry leader F5 lays out the basics of load balancing, as well as its history and development. Many other sites on the internet cover the subject too. Let’s try to summarize how it works.
When a user attempts to use an online service, a load balancer plays the part of traffic cop. To the user’s device (whether it’s a PC, laptop, phone, or something else), there is just one way to access the service: a single IP address or domain name. What takes place beyond that point is hidden from any curious outside observer; to the user, the balancing is completely transparent.
That’s because the traditional form of load balancing uses Network Address Translation (NAT). This technology masks a pool of destination servers that share the traffic load. The load balancer itself — historically a physical device between the client device and the server pool — uses a virtual IP address (VIP) to receive user traffic. Individual service requests are distributed to the server pool using a load balancing algorithm.
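To make that concrete, here is a minimal Python sketch of the pattern: a TCP proxy that accepts connections on a single frontend address (the VIP) and relays each one to a server from the pool. The backend addresses here are hypothetical, and a real load balancer does far more (health checks, packet-level NAT, and so on), but the flow of traffic is the same.

```python
# A minimal sketch of the VIP pattern, not a production balancer: a TCP
# proxy that listens on one frontend address and relays each connection
# to a backend picked from a pool. Backend addresses are hypothetical.
import itertools
import socket
import threading

BACKENDS = [("10.0.0.11", 8080), ("10.0.0.12", 8080), ("10.0.0.13", 8080)]
pool = itertools.cycle(BACKENDS)  # rotate through the pool per connection

def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes one way until the sending side closes."""
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def handle(client: socket.socket) -> None:
    backend = socket.create_connection(next(pool))
    # Relay in both directions; the client only ever sees the VIP address.
    threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
    threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

def serve(vip: str = "0.0.0.0", port: int = 8080) -> None:
    with socket.create_server((vip, port)) as listener:
        while True:
            client, _ = listener.accept()
            handle(client)
```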
Any number of algorithms can be used to balance traffic loads, but two are the most popular. The first is called “round robin”. It’s pretty simple. Suppose there are five servers. Round robin load balancing means that they just continue to take turns: server 1 handles the first service request, then server 2, and so on. After server 5, it starts over at server 1. The second is called “least connections”: the server currently handling the fewest active connections gets the next service request. At Total Uptime, our Cloud Load Balancer has 10 different methods to choose from, so you can get very granular with how you distribute traffic. We could go into great detail about the other ways to fashion a load balancing algorithm, but the two above give you some idea how they operate.
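Both algorithms are simple enough to sketch in a few lines of Python. These are toy versions for illustration, not how any particular product implements them:

```python
# Toy versions of the two schedulers described above; the server names
# are illustrative only.
import itertools

SERVERS = ["server1", "server2", "server3", "server4", "server5"]

# Round robin: the servers simply take turns, wrapping after the last.
rotation = itertools.cycle(SERVERS)
def round_robin() -> str:
    return next(rotation)

# Least connections: pick whichever server has the fewest open connections.
active = {s: 0 for s in SERVERS}
def least_connections() -> str:
    server = min(active, key=active.get)
    active[server] += 1  # remember to decrement when the connection closes
    return server

print([round_robin() for _ in range(7)])
# ['server1', 'server2', 'server3', 'server4', 'server5', 'server1', 'server2']
```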
Not every service is suitable for load balancing, however. Stateless applications like HTTP, VPN, FTP, or firewalls are good candidates and make for an easy deployment. Load balancing gets a little trickier for dynamic applications, like databases, but that doesn’t mean it can’t be done! The vast majority of load balancing issues are the result of a lack of session stickiness, or what we call persistence. If you can enable that, most of your problems go away.
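One common way to provide persistence is to hash something stable about the client, such as its source IP, so repeat requests always land on the same server. Here is a minimal sketch, assuming a hypothetical three-server pool; real balancers also offer cookie-based and other persistence methods.

```python
# Persistence via client-IP hashing: the same client always maps to the
# same backend, so session state held on that server keeps working.
import hashlib

BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical pool

def sticky_backend(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode()).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]

# The same client IP yields the same backend on every request.
assert sticky_backend("203.0.113.7") == sticky_backend("203.0.113.7")
```

One caveat with plain hashing: adding or removing a server remaps most clients, which is why production systems often use consistent hashing or cookie-based persistence instead.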
The 1990s are long gone, of course. But the need for traffic balancing remains. With virtualization and cloud technology, much of the concern has moved to the higher layers of the OSI model. New terms like Application Delivery Network (ADN) and Application Delivery Controller (ADC) direct our attention to the reason for all this IT infrastructure in the first place: to deliver online services.
The same principles are at play as in traditional load balancing. But the scale has gone global. No longer are we talking about sharing traffic among servers in a single data center. Services like Total Uptime’s Cloud Load Balancer make effective use of IP Anycast to redirect traffic across the globe depending on customer infrastructure, network conditions, and predefined traffic policies. That means a lot of traffic optimization takes place before the service request even hits the customer’s data center.
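Anycast itself works at the routing layer (the same IP address is announced from many locations, and the network delivers each client to the nearest one), so it can’t be reproduced in application code. But the policy decision layered on top of it can be sketched. The points of presence, regions, and latency figures below are invented purely for illustration and don’t reflect any particular provider’s logic:

```python
# A hypothetical steering policy: prefer healthy points of presence, then
# pick the one with the lowest measured latency to the client's region.
from dataclasses import dataclass

@dataclass
class PoP:
    name: str
    healthy: bool
    latency_ms: dict[str, float]  # region -> measured round-trip time

POPS = [
    PoP("us-east", True, {"NA": 20.0, "EU": 95.0, "APAC": 180.0}),
    PoP("eu-west", True, {"NA": 90.0, "EU": 15.0, "APAC": 160.0}),
    PoP("ap-south", False, {"NA": 200.0, "EU": 150.0, "APAC": 25.0}),
]

def steer(client_region: str) -> PoP:
    candidates = [p for p in POPS if p.healthy]  # policy: skip unhealthy sites
    return min(candidates, key=lambda p: p.latency_ms[client_region])

print(steer("APAC").name)  # ap-south is down, so eu-west wins on latency
```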
The differences between traditional load balancing and today’s global load balancing reflect the exponential growth of network traffic and the globalization of commerce since the technology first appeared. And it is as much about managing applications as it is about managing networks.
There is a lot more to the subject. How servers are clustered and how traffic is redirected are matters for other articles. Modern load balancing solutions also depend on the clever resource allocation techniques of the cloud and all the other technologies that make up today’s IT environment. Suffice it to say that load balancing has gone global, and it’s contributing to the total uptime of clients and users everywhere.