Between monoliths, microservices, serverless functions, and powerful third-party services, you'll have many different logging threads to keep an eye on. Clean aggregation of your log streams can be a time-consuming and complicated task -- especially when you start to consider your proxies and load balancers.
Within this article, we're going to explore how to log disparate applications within one tidy channel. What happens if we want to break from our services or cloud providers? How will we weave several functions into a coherent logging pipe? Onwards, to discourse.
The Life of a Log
Let's introduce some context. We'll demonstrate a few different higher-level approaches to how your logging arrangement may appear. For this example, we'll pretend that we're hosting a set of handy, serverless AWS Lambda functions.
AWS Lambda has logging built in. Each time your function is invoked, Amazon routes detailed logging information to its own logging service, CloudWatch.
CloudWatch is still in its infancy. As such, users report sub-par experiences. While things will improve, many choose to forward their logging detail to a third party. Perhaps it's an ELK stack you're hosting somewhere, or a service like Loggly or Papertrail.
The good news is that all of your logs eventually wind up in one place. The bad news is that breaking out of the garden of your mega-vendor is no clean task. This is due to the presence of the proprietary logging service that the vendor would like you to use, like CloudWatch or Google Cloud Logging. It sure seems like it would be easier to host everything - your containers, functions, logs - in the AWS or GCP bubble, so that all of your logging is threaded together.
Within AWS, the better solution today is to route everything outwards. The alternative is to reconcile disparate log groups and log streams within CloudWatch. Given that you're going to want to parse this data at some point, the slicker option is to route your logs into one pool that excels at post-processing.
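To make "routing outwards" a little more concrete, here is a minimal sketch of the first step a forwarding function might take. It assumes a CloudWatch Logs subscription that delivers batches to a Lambda function as a base64-encoded, gzip-compressed JSON document under `awslogs.data`. The function names, the log group name, and the sample messages are made up for illustration, and the actual shipping of records to your log service is left out.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Unpack a CloudWatch Logs subscription payload into flat records."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    return [
        {
            "group": payload["logGroup"],
            "stream": payload["logStream"],
            "timestamp_ms": entry["timestamp"],
            "message": entry["message"],
        }
        for entry in payload["logEvents"]
    ]


def make_sample_event():
    """Build a fake event in the same shape, for local testing only."""
    payload = {
        "logGroup": "/aws/lambda/compress",  # hypothetical log group
        "logStream": "2018/01/01/[$LATEST]abc",  # hypothetical stream name
        "logEvents": [
            {"id": "1", "timestamp": 1514764800000, "message": "START RequestId: 1"},
            {"id": "2", "timestamp": 1514764800100, "message": "END RequestId: 1"},
        ],
    }
    data = base64.b64encode(gzip.compress(json.dumps(payload).encode("utf-8")))
    return {"awslogs": {"data": data.decode("ascii")}}


if __name__ == "__main__":
    # Each record is now plain JSON, ready to hand to whatever sink you use.
    for record in decode_subscription_event(make_sample_event()):
        print(json.dumps(record))
```

Once the records are flattened like this, the log group and stream become ordinary fields that your third-party pool can filter on, rather than structure you have to reconcile inside the vendor's console.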
Using a third-party venue for logging also opens up the potential for a variety of non-AWS hosting methods. For example, say you have many services: a self-hosted load balancer for your front-end servers, a PaaS that hosts your primary application, and a set of serverless functions for high-intensity, asynchronous compute. One of your functions is of utmost importance, so you're interested in hosting it on the billowing clouds of both Google and Amazon. You'd then circumvent the vendor's logging solution and rope your tangle of pipes over to your log server.
Unified logging is an attractive proposition. Let's step back a bit and look at the overarching process, wherein a log entry is the end result of client engagement.
Your client queries your hostname, via a browser or an API request. The request is sent to a load balancer. The load balancer routes the request where it should go: was the request destined for your API, your serverless function, your front-end application? That destination path exists somewhere on your hostname.
The request is processed, or not; the data of that request, the headers, the verb, are all documented within your logs. We've written before about HTTP logging vs. the Syslog; either output provides essential information about the client's visit. By modifying the request that your function or backend provides, you increase the granularity of the data that you capture.
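As a rough illustration of what one of those entries can look like, the sketch below formats a single request as an RFC 3164 style syslog line. The priority value follows the RFC's rule of facility times eight plus severity; the choice of the local0 facility, the `key=value` message layout, and the host and tag names are assumptions made for this example.

```python
from datetime import datetime

FACILITY_LOCAL0 = 16  # assumed facility for application logs
SEVERITY_INFO = 6     # informational severity


def syslog_line(when, host, tag, method, path, status, user_agent):
    """Format one HTTP request as an RFC 3164 style message."""
    # PRI is facility * 8 + severity, wrapped in angle brackets.
    pri = FACILITY_LOCAL0 * 8 + SEVERITY_INFO
    # RFC 3164 timestamps look like "Jan  5 13:04:09": the day is
    # space-padded, not zero-padded.
    timestamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    message = f'method={method} path={path} status={status} ua="{user_agent}"'
    return f"<{pri}>{timestamp} {host} {tag}: {message}"


if __name__ == "__main__":
    print(
        syslog_line(
            datetime(2018, 1, 5, 13, 4, 9),
            "edge1",      # hypothetical host name
            "compress",   # hypothetical tag for the function
            "GET",
            "/app/compress",
            200,
            "curl/7.58",
        )
    )
```

The more request detail you fold into the message part, the more you can slice on later: verb, path, status, and headers are the usual starting points.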
In the arrangement above, each service must coordinate its own output to our centralized pool. We've made that much easier. You might like it!
Through Fly, you attach a hostname, then stack your various backends and services on top of it. You could, for example, have Heroku, Amazon S3, and a set of powerful AWS Lambda functions attached to your onehostname.com. Each application that's running through Fly would then sit atop your hostname on a subfolder:
/app/compress, in the case of one of your serverless functions.
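To picture how one hostname can front several backends while producing a single log stream, here is a small sketch. It is not Fly's implementation: the mount table, the backend labels, and both function names are hypothetical, and the "log sink" is just a Python list. It only shows the idea of resolving a path to a backend and writing one entry per request at the same point.

```python
import json

# Hypothetical mount table: path prefix -> backend label.
MOUNTS = {
    "/app/compress": "aws-lambda:compress",
    "/api": "heroku:api",
    "/assets": "s3:assets",
    "/": "heroku:frontend",
}


def resolve_backend(path, mounts=MOUNTS):
    """Return the backend for the longest mount prefix matching path."""
    best = None
    for prefix in mounts:
        # Match whole path segments only, so "/apiary" does not hit "/api".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return mounts[best] if best is not None else None


def route_and_log(method, path, sink):
    """Pick a backend and append one JSON log entry to the shared sink."""
    backend = resolve_backend(path)
    entry = {"method": method, "path": path, "backend": backend}
    sink.append(json.dumps(entry, sort_keys=True))
    return backend


if __name__ == "__main__":
    log_sink = []
    for request_path in ["/app/compress/img.png", "/api/users", "/about"]:
        route_and_log("GET", request_path, log_sink)
    print("\n".join(log_sink))
```

Because the routing layer sees every request, the backends themselves never need to know where the logs end up.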
With Fly out in front of your services and backends, you can configure one of our four logging middleware options and receive unified logs; not bad for filling in a couple of fields and clicking a couple of buttons.
In the serverless world, Fly supports AWS Lambda, Google Cloud Functions and Now by Zeit, with more on the way. Fly will automatically log requests made to any backend that you've added. This saves you from making each service coordinate with your logging repository; as a global load balancer, Fly works out where a request should go and handles the response.
Give it a try! Fly started when we wondered, "what would a programmable edge look like?" Developer workflows work great for infrastructure like CDNs and optimization services. You should really see for yourself, though.
To conclude, a lovely quote from the introduction to the Syslog RFC, RFC 3164:
Since the beginning, life has relied upon the transmission of messages. For the self-aware organic unit, these messages can relay many different things. The messages may signal danger, the presence of food or the other necessities of life, and many other things. In many cases, these messages are informative to other units and require no acknowledgement. As people interacted and created processes, this same principle was applied to societal communications. As an example, severe weather warnings may be delivered through any number of channels - a siren blowing, warnings delivered over television and radio stations, and even through the use of flags on ships. - C. Lonvick, RFC: 3164