Creating web applications is an evolving challenge. As you attract more users, you generate more connections and put greater demand on your infrastructure. As you build rich new features within your application, you will need clever ways to route, process, and store increasing amounts of data. To prevent your back-end from drowning in a flood of requests, text, images, analysis, updates, and uploads, you may consider configuring a Load Balancer.
This article will introduce you to the basics of using a Load Balancer and unveil some of the greater engineering challenges that can accompany the convenience.
In your early days, when your application is little more than a prototype, your application stack will be modest. As demand increases, so will the complexity of your application stack. Whether growth has sprung upon you or you are looking ahead, you will need to figure out how you organize visitor connections within your infrastructure.
Before we go any further it is wise to define the two different types of backend application growth, or scaling:
Vertical Scaling: Vertical scaling occurs when you provide additional computational resources to a server. For example, you have a server and you increase the available RAM and CPU; a 32GB machine with 4 cores becomes 64GB with 8 cores. You do this to improve RAM and CPU capacity, increase IOPS, or increase disk capacity.
Horizontal Scaling: In contrast, horizontal scaling adds more servers. You have a server build you are comfortable with and you add an identical unit; you have a 32GB machine with 4 cores and you add a second 32GB machine with 4 cores. You do this to increase I/O concurrency, reduce the load on existing nodes, or increase disk capacity.
If you are using one large server then vertical scaling looks like an alluring solution: “I have one server, I will make it bigger to meet demand!”.
While this solution may work for a short period of time, what happens when you have reached the maximum capability of a single server? What if you have many connections that are not resource intensive? What do you do when you want to integrate a CDN or include new backend applications?
At some point you will need to welcome another server or service into your infrastructure and grow horizontally. When you have at least two servers, the trick becomes getting them to work in concert. This is where we introduce the Load Balancer.
The Load Balancer acts as the traffic conductor, placed before your back-end infrastructure. It becomes your destination IP address. When a request arrives at the Load Balancer, it will route the request to one of your servers based on an underlying algorithm like Round Robin, Power of Two Random Choices, Least Connection, or Source IP Hash.
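To make the idea concrete, here is a minimal sketch (in Python, with hypothetical backend names) of the simplest of these algorithms, Round Robin, which hands each incoming request to the next server in the rotation:

```python
import itertools

class RoundRobin:
    """Cycle through backends in order, one request each."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        # Each call returns the next backend in the rotation.
        return next(self._cycle)
```

Round Robin is easy to reason about, but it ignores how busy each server actually is; the other algorithms mentioned above trade that simplicity for better load awareness or session stickiness.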
A Load Balancer, as the traffic conductor, provides significant benefit. It makes your application highly-available, increases its capability to handle demand, simplifies routing between backend applications, introduces redundancy and fault tolerance, and it creates an intelligent foundation for further building out your application infrastructure. There are things to be wary of, however…
Down the Rabbit Holes
At a shallow glance, it seems like using a Load Balancer is a simple and graceful way to bring organization to your visitor connections and offer your servers respite. As we dig deeper, we soon see there is much more to be mindful of.
A WebSocket enables a visitor to establish a persistent connection with your application. The connection is full-duplex, allowing flowing bidirectional communication. Think of an online game or video chat. A server is limited in how many WebSocket connections it can send outwards, allowing one per TCP port. A network interface contains 65,536 TCP ports per IP address and that becomes your standard per-server out-going connection limit. Increasing this number and handling connections well is a scenario that requires sophisticated Load Balancing!
Given the relative newness of WebSockets, current Load Balancing solutions have caveats. One example is HAProxy, a common software Load Balancer. HAProxy allows you to add various virtual network interfaces. For example, if you were to create 3 virtual network interfaces, in theory, you would be able to hold 196,608 outgoing socket connections.
Alas, each connection requires that you dedicate more resources to HAProxy. With no horizontal scaling support, your HAProxy instance will need to grow larger and more expensive as connections increase. With this set-up, you will also need to configure fail-over manually and adjust the Load Balancer whenever changes are made to your application code.
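As a rough, hypothetical sketch of such a set-up, an HAProxy configuration might bind outgoing backend connections to different local source addresses, with each address contributing its own pool of roughly 64k ephemeral ports (the addresses and names here are invented for illustration):

```
frontend websocket_in
    bind *:443
    default_backend websocket_pool

backend websocket_pool
    balance leastconn
    # Each `source` address is a separate virtual interface,
    # adding another ~64k outgoing ports toward the backend.
    server app1 10.0.0.10:8080 source 10.0.1.1
    server app2 10.0.0.10:8080 source 10.0.1.2
    server app3 10.0.0.10:8080 source 10.0.1.3
```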
An appropriate solution may require multiple Load Balancers and a more complex balancing algorithm, such as Source IP Hash. And as more applications arrive within your backend infrastructure, security becomes an ever-creeping concern.
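A sketch of the idea behind Source IP Hash, in Python with hypothetical backend names: hashing the visitor's IP address means the same visitor is consistently routed to the same backend, without the balancer storing any session state.

```python
import hashlib

def pick_backend(client_ip, backends):
    """Map a client IP to a backend deterministically.

    The same IP always hashes to the same index, so a visitor's
    WebSocket or session traffic keeps landing on the same server.
    """
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

The trade-off is that when the backend list changes, most IPs re-hash to a different server, which is why production balancers often layer consistent hashing on top of this idea.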
Security should be at the forefront of your architectural decisions. If your application involves sensitive user data then it behooves you to ensure that it is protected. The standard method of securing network packets in motion is HTTPS.
SSL Termination refers to the point at which an encrypted HTTPS connection is decrypted. In a conventional set-up, this point is your Load Balancer:
Green denotes encrypted connections, red denotes unencrypted connections.
We see a secure route between the Visitor and the Load Balancer. This provides data security up until the Load Balancer. This is the most common implementation of HTTPS and what some consider secure application routes. It has a significant weakness, however.
We can unearth a practical example by running a traceroute on your local machine. Consider the probe packets within the traceroute as stand-ins for an encrypted packet:
```
traceroute to fly.io (18.104.22.168), 64 hops max, 52 byte packets
 1  192.168.1.1 (192.168.1.1)  2.432 ms  0.954 ms  0.943 ms
 2  22.214.171.124 (126.96.36.199)  25.553 ms  8.907 ms  9.504 ms
 3  rc1st-be101-1.vc.shawcable.net (188.8.131.52)  10.581 ms  9.787 ms  10.993 ms
 4  rc2wt-be50-1.wa.shawcable.net (184.108.40.206)  15.255 ms  14.694 ms  16.566 ms
 5  xe-0-5-0-17.r04.sttlwa01.us.bb.gin.ntt.net (220.127.116.11)  17.005 ms
...
```
Each routing hop transports the encrypted packet until it arrives at the application or Load Balancer associated with the destination IP address. During transit, encryption prevents HTTP headers from being read. The only data available to the routing nodes are: the sender’s IP address, the destination IP address and hostname, and information about the type of encryption.
An encrypted HTTP header is full of rich information; custom HTTP headers can be invaluable tools when managing visitor sessions and caching, among other things, within your applications. When we look inside a decrypted packet, we can see what is contained within an HTTP request:
```
:authority:www.fly.io
:method:GET
:path:/docs/
:scheme:https
accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
accept-encoding:gzip, deflate, sdch, br
accept-language:en-US,en;q=0.8
cache-control:max-age=0
cookie:flyio_uid=rBEAA1ivQ/JGcgANAywJAg==
dnt:1
if-modified-since:Fri, 03 Mar 2017 01:10:34 GMT
upgrade-insecure-requests:1
user-agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
```
From this HTTP request header sent to https://fly.io/docs/, we know the operating system, browser, caching information, language, path requested, the root domain the path belongs to, cookies, and more. This information is essential for your application to function and provides value with respect to analytics, performance, and security.
Using an SSL connection between your Load Balancer and all back-end applications would be the optimal security practice. However, you introduce discovery and visibility issues: how will you receive and interpret HTTP header data while keeping your traffic secure? Doing so is difficult. Each entity within your infrastructure would need its own signed TLS certificate and would need to connect over TLS. The extra rounds of encryption and decryption increase application response time and are more CPU intensive.
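For illustration, a minimal Python sketch of the client side of such a backend connection: the Load Balancer would hold a TLS context like this one, verifying each backend's certificate against an internal CA (the CA bundle path is a hypothetical placeholder).

```python
import ssl

def backend_tls_context(cafile=None):
    """Build a client-side TLS context for talking to backends.

    `cafile` would point at the internal CA bundle that signed each
    backend's certificate; None falls back to the system trust store.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=cafile)
    ctx.check_hostname = True            # refuse certs for the wrong host
    ctx.verify_mode = ssl.CERT_REQUIRED  # refuse unauthenticated backends
    return ctx
```

Every backend connection wrapped in such a context pays the handshake and crypto cost described above, which is exactly the trade-off between the encrypted and open back-end designs.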
Leaving your infrastructure open and accessible behind your Load Balancer is the easier route. However, your back-ends are exposed. Now, we understand why our diagram shows the datacenter as having an unencrypted back-end; readable HTTP request/response information is vital for a functioning application and keeping the back-end secure is a difficult task.
It is easy to assume that your datacenter will remain free from intrusion, but any unencrypted connection is vulnerable. If your application communicates with various distant datacenters, you lose the ability to guarantee a secure and isolated application. Taking ownership of your security includes protecting each lane that a visitor's requests may travel through.
Using a single server has short-lived perks. One such perk is having all of your vital logging information in one location. As you add more servers and applications it becomes more difficult to parse and organize logging data.
Accessing your error logs, render times, and request and response times across various servers requires an organized Operations effort. Your Load Balancer will not aid you in this task, either.
Wormholes and Other Things
A Load Balancer is a requisite part of building performant and diverse web applications. It can be a convenient and powerful tool in opening your application to the world in a way that is fast and secure for a global visitor base. It can also open you up to deep challenges and broad attack vectors.
Fly is an Application Delivery Network. A big part of application delivery is Load Balancing. In particular, Fly helps facilitate global load balancing. We have many, widely distributed edge-servers. When an application is connected to the Fly network, a visitor to your site will arrive at, and terminate TLS at, the edge-server closest to them.
All edge-servers apply Let's Encrypt TLS and HTTP/2. We apply the clever Power of Two Random Choices algorithm to balance traffic evenly across two or more application backends. This gives you a really good Load Balancer that you don't need to maintain; you wield the global power but don't need to perform global maintenance and upkeep.
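Power of Two Random Choices is simple to express; here is a minimal Python sketch, with the per-backend connection counts assumed to be tracked elsewhere:

```python
import random

def power_of_two_choices(backends, active_connections):
    """Sample two backends at random, send the request to the
    less-loaded of the pair.

    `active_connections` maps each backend name to its current
    open-connection count.
    """
    a, b = random.sample(backends, 2)
    return a if active_connections[a] <= active_connections[b] else b
```

Sampling just two candidates avoids the "thundering herd" of always picking the single least-loaded server, while still keeping load far more even than pure random choice.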
To solve the problem of open connections behind the Load Balancer, we’ve constructed an open-source application known as Wormhole. Wormhole creates a fast and encrypted tunnel between two end-points. You do not need to expose a piece of your infrastructure to the web in order to have it receive encrypted traffic.
Fly turns your project's hostname into the foundation upon which you place your backends, applications, and services. You can use whatever you'd like; once the backend, app, or service is attached, you then specify a subdirectory that it lives on. For example, your GitHub Pages landing page at /, your Shopify store at /store/, and your Kubernetes cluster at a path of your choosing:
```yaml
# ...
spec:
  # ...
  containers:
    # Your web server container
    - env:
      # ...
    # Fly Agent container
    - env:
        - name: "FLY_TOKEN"
          value: "x"
        - name: "FLY_LOCAL_ENDPOINT"
          value: "127.0.0.1:5000" # Your application's port
      image: "flyio/wormhole:0.5.36"
      name: "wormhole"
```
Each container that is spawned will appear within Fly, and Load Balancing will commence using Power of Two Random Choices. You don't need to worry about exposing your cluster's host, whether Amazon Web Services or Google Cloud, to the web; only the port you've specified for Wormhole. In this example, Kubernetes is now infused with a global network providing HTTP/2, automatically renewing Let's Encrypt TLS certificates, caching, and TLS termination as close to the visitor as possible. All while staying secure and unexposed.
With your application taken care of, you can then attach the rest of your services, whether a blog, store, marketing site, or landing pages, to your hostname and have them all delivered in the same way.
Familiarizing yourself with the fundamental idea behind Load Balancers and a few potential rabbit holes will help prepare you for the engineering requirements that come along with a busy and growing application. Fly is a nifty tool that can help you avoid most of these conundrums. That means more time spent writing useful features and faster, safer, and better experiences for your users.
Fly started when we wondered, "What would a programmable edge look like?" Developer workflows work great for infrastructure like CDNs and optimization services. You should really see for yourself, though.