Is Cost A Real Concern In Serverless?
Why is there so much noise in the industry about the cost of serverless operation?
Cost
Complexity
Lock-in
The three antipathies of serverless adoption — according to many.
While speed, agility, and value have been cited as the benefits of serverless adoption, cost, complexity, and lock-in have been argued as concerns since the early days of serverless.
Are these real concerns or container-influenced conversations?
In this article, I examine cost.
How has cost become a concern?
Are the cost concerns applicable only to serverless?
How can you become cost-conscious in serverless?
How Has Cost Become A Concern In Serverless?
When I first gave a talk on serverless in 2018, it was about the cost of serverless—not architecture, but cost.
It was a money talk — Shillings in Serverless!
The more I learned about serverless and its characteristics, the clearer it became that cost computation (and consideration) should be given equal focus alongside architecture.
My advice to engineers performing solution designs at that time was to include a section for cost estimation—an approximation based on known factors.
This was the approach of many early adopters of serverless.
Over time, the focus shifted more toward architecture, tooling, developer experience, testing, observability, and so on than cost.
Cost consciousness somehow took a backseat in serverless.
In the following sections, I analyze why cost has become — or has always been — a concern in serverless.
The freebie deception
The free tier offer (as a freebie) has hugely influenced cloud adoption.
However, many engineers and architects do not know the difference between free trials, always-free offers, and 12-month free offers in AWS.
The free tier is intended to let everyone explore and try without spending money upfront.
However, many who start with an explore-and-try mindset continue to experiment and build production workloads, underestimating the cumulative cost of cloud operation.
After the early excitement, the curiosity (and practice) to check whether usage is within the free tier ceases until they are hit with a hefty bill.
Lack of cloud cost education
Many engineers lack adequate training in the non-technical aspects of cloud computing.
Engineers are only as good as their team.
Due to this, most conversations in a serverless team center around infrastructure, CI/CD pipelines, patterns, architecture, APIs, events, incidents, etc. — hardly anything about service pricing or their monthly bills.
Many engineers implementing Lambda functions do not know how to estimate their cost, or the cost impact of configuration choices such as memory allocation.
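A Lambda cost estimate is simple enough to sketch in a few lines. The rates below are illustrative of AWS's published x86 pricing at the time of writing and should be checked against the current pricing page for your region.

```python
# Back-of-envelope Lambda cost estimate. Rates are illustrative; check
# the current AWS pricing page for your region before relying on them.
PRICE_PER_GB_SECOND = 0.0000166667   # compute, before tiered discounts
PRICE_PER_MILLION_REQUESTS = 0.20    # flat invocation charge

def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb):
    # Compute is billed in GB-seconds: memory (GB) x duration (s) x count.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute = gb_seconds * PRICE_PER_GB_SECOND
    requests = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return round(compute + requests, 2)

# 10M invocations a month, 120 ms average duration, 512 MB memory
print(lambda_monthly_cost(10_000_000, 120, 512))  # → 12.0
```

Note how the memory setting multiplies directly into the bill: doubling the memory of this function doubles its compute cost unless the duration drops to match.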
An application that works well does not mean it was architected well.
Not-so-well-architected applications
The lack of experience in architecting solutions often leads to building expensive applications. Though this situation is common in new teams, it can also happen in high-profile teams.
The minimum viable product (MVP) mindset can also be a factor when teams evolve their applications without clear overall architectural thinking.
Many engineers do not have a basic understanding of the AWS Well-Architected Framework Principles.
With freedom comes responsibility. — Eleanor Roosevelt.
Autonomy with no accountability
Several teams do not have visibility into the cost of their cloud operations. Either they do not have access or never bother checking.
As an organization's cloud costs are not always distributed to individual teams, configuring billing alarms is not a common practice in many enterprise teams.
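Configuring a billing alarm is a small task. As a sketch, the dict below mirrors the parameters of the boto3 CloudWatch `put_metric_alarm` call for the `EstimatedCharges` metric (which AWS publishes only in us-east-1); the SNS topic ARN is a placeholder you would supply.

```python
# Sketch of a billing alarm definition. The dict mirrors boto3's
# CloudWatch put_metric_alarm parameters; billing metrics are only
# published in us-east-1, and the SNS topic is assumed to exist.
def billing_alarm(threshold_usd, sns_topic_arn):
    return {
        "AlarmName": f"monthly-bill-over-{threshold_usd}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # billing metrics update a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# Hypothetical usage:
# boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(
#     **billing_alarm(100, sns_topic_arn))
```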
Unlike traditional development, cost is an important factor in serverless development, influencing day-to-day architecture and operational decisions.
Cost allocation tags in AWS are unheard of in many serverless teams, especially when teams share a cloud account to operate their workloads.
Distributed does not mean isolated. Information still flows between systems.
Misunderstanding the dynamics of distributed-ness
Modern applications are event-driven and distributed, which means the cost of operation is also distributed.
Although the cloud cost for ingesting data may be low, you can’t assume the same for processing that data.
How about data transfer costs? Cost of events and messages?
Several engineers and architects do not know AWS’s measure of an EventBridge event or the difference between a read and write capacity unit in DynamoDB.
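These measures are easy to internalize once written down. The sketch below encodes the published billing units: a DynamoDB read capacity unit covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), a write capacity unit covers one write per second up to 1 KB, and EventBridge meters custom events in 64 KB chunks.

```python
import math

def read_capacity_units(item_kb, strongly_consistent=True):
    # 1 RCU = one strongly consistent read/sec for an item up to 4 KB;
    # eventually consistent reads cost half as much.
    units = math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)

def write_capacity_units(item_kb):
    # 1 WCU = one write/sec for an item up to 1 KB.
    return math.ceil(item_kb)

def eventbridge_billable_events(payload_kb):
    # Custom events are metered in 64 KB chunks: a 70 KB event bills as 2.
    return math.ceil(payload_kb / 64)
```

A 6 KB item therefore costs two RCUs per strongly consistent read, and a single oversized event can quietly double your EventBridge bill.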
Many apply the general interpretation of economies of scale with serverless and distributed systems without understanding the nuances of serverless operation at scale.
Dark matter is everywhere. In this room. Everywhere. — Fabiola Gianotti.
Missing the obvious things
An audience’s confession at my 2018 money talk: An engineer from a heavy data processing team investigating a production bug enabled all logs and went on a 3-week holiday. Effect? A hefty $15,000 bill.
Services such as Amazon CloudWatch are critical—they are always there, everywhere. However, they often do not receive the same care as other services.
While everyone counts the pennies of Lambda functions, they miss the dollars on log storage, metrics queries, capacity consumption, frequent scans on massive data buckets, and so on.
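The forgotten-debug-logs anecdote is easy to reproduce on paper. Assuming illustrative CloudWatch Logs rates (roughly the us-east-1 standard-class prices at the time of writing), a busy service left logging verbosely for a few weeks lands in the hundreds of dollars:

```python
# Illustrative CloudWatch Logs rates (approximate us-east-1 figures at
# the time of writing; verify against the current AWS pricing page).
INGESTION_PER_GB = 0.50
STORAGE_PER_GB_MONTH = 0.03

def debug_logging_cost(log_lines_per_sec, avg_line_bytes, days):
    # Total GB ingested over the period, then charged for ingestion
    # plus (at least) one month of storage.
    gb = log_lines_per_sec * avg_line_bytes * 86_400 * days / 1024**3
    return round(gb * (INGESTION_PER_GB + STORAGE_PER_GB_MONTH), 2)

# 5,000 debug lines/sec at 200 bytes each, left enabled for 21 days
print(debug_logging_cost(5_000, 200, 21))
```

The ingestion charge dominates, which is why the damage is done by the time the bill arrives: deleting the log groups afterwards recovers almost nothing.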
Is Cost A Concern Only In Serverless?
Cloud spending is part of every modern organization regardless of whether they consume serverless or non-serverless services.
If so,
Why is cost projected as a concern only in serverless?
Are containers saving money for every organization?
Illusions!
Illusions happen when abstractions hide relevant concepts and, therefore, set false expectations. — Gregor Hohpe
Illusions apply to many things in many ways. For example,
Serverless is cheaper — is an illusion
Serverless is expensive — it is also an illusion
Containers are cheaper — is an illusion
If we remove the illusions, it will become clear that both serverless and container solutions can be cheaper or more expensive — depending on the use case, architecture, and usage.
Let’s examine a few situations.
The ‘containers are cheaper’ illusion
Everyone knows that hardly any enterprise production workloads run on t2.micro instances. The higher the container spec, the more expensive it is.
When teams choose a container instance for their workload for a specific duration—on-demand or with reservation—there is a known cost — i.e., expected cost — which is accounted for as part of the bill. Let’s say, in this case, it is 200 dollars per month.
However, most serverless workloads start with a pay-per-X concept. It won’t be clear how much it will be until the end of the month. If the bill jumps from a cheaper (and expected) 100 dollars in one month to an expensive (and unexpected) 1000 dollars in the next, it gets flagged as a concern.
Assuming the serverless application is well-architected, the 1000 dollars is for the actual usage (without any wastage) of all the collaborating managed cloud services.
However, if, as per Datadog’s State of Cloud Costs report, more than 80 percent of container spend is wasted on idle resources, no one really worries, because the business is comfortable with the expected (and cheaper) 200-dollar bill for the containers.
The complexity of never-accounted hidden cloud costs
In my article, Is The Serverless Fairytale Over?, I used the following phrase.
Some technologies are complex by nature. And then, we create complexity with technology.
You would agree that Kubernetes resonates with the first part.
Many articles and discussions constantly highlight its complexities. Yet, in the name of portability (and other legitimate reasons), enterprises accumulate masses of Kubernetes pods and clusters.
In engineering, it is known that when something is complex, it demands special skills and longer engineering hours. The hourly or daily rates of such skills are always high in comparison.
In addition, several enterprises housing such complex technologies seek the help of external experts, paying a premium.
Unfortunately, in this case, the cloud bill will only reflect the cheaper and expected 200-dollar cost!
Fact 6 in Datadog’s State of Cloud Costs report looks at cross-AZ data transfer, which accounts for 48% of data transfer costs. With managed services, high availability and the associated data replication between Availability Zones (AZs) are taken care of.
Economies of scale occur when increasing output leads to lower long-run average costs. It means that as firms increase in size, they become more efficient. — economicshelp.org
Misunderstanding the ‘economies of scale’ in serverless
From my observation, one cannot assuredly state that cost will always come down as serverless operations scale — it depends!
The term economies of scale relates to production and applies to a unit of a product. This is not the case with the cloud.
However, a cloud provider's mammoth cloud operation benefits from economies of scale. A good proportion of these benefits are passed down to the service consumers, which is the main reason behind the democratization of the cloud and its low entry cost.
When you consume a managed service, you start with an affordable price. When your serverless operation scales, the services typically scale out (and not up). The cost benefits of a scale-out cloud operation (and pay-per-use) cannot be equated to the typical unit of a product.
Moreover, many serverless services cost zero to start — the freebies!
Though several cloud services offer price tiers and variations as usage increases, these are not as clean or simple as the unit price of a product.
For example, the AWS Lambda price breakdown for the Europe (Frankfurt) Region is below. Though the price of memory consumption (GB-seconds) drops at scale, the invocation cost remains the same.
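The tiered structure is easy to model. The rates below are illustrative of AWS's published x86 tiers at the time of writing (verify against the pricing page for your region); note that only the compute rate drops, while the invocation price stays flat at every scale.

```python
# Illustrative tiered GB-second rates for Lambda (x86). Only the
# compute rate drops at scale; the invocation price stays flat.
TIERS = [  # (tier ceiling in GB-seconds, price per GB-second)
    (6_000_000_000, 0.0000166667),   # first 6B GB-s per month
    (15_000_000_000, 0.0000150000),  # next 9B GB-s
    (float("inf"), 0.0000133334),    # beyond 15B GB-s
]
FLAT_REQUEST_PRICE = 0.20  # per million invocations, at any scale

def tiered_compute_cost(gb_seconds):
    cost, floor = 0.0, 0
    for ceiling, price in TIERS:
        band = min(gb_seconds, ceiling) - floor
        if band <= 0:
            break
        cost += band * price
        floor = ceiling
    return cost
```

Running 7 billion GB-seconds bills the first 6 billion at the top rate and only the last billion at the discount, so the blended rate falls far more slowly than a naive economies-of-scale reading would suggest.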
The moral of this story is that without enough data and a case-by-case consideration of the use case and the cloud services, one cannot assume and apply the economies of scale expectation in serverless.
Why TCO can’t be an accurate representation of cost
The total cost of ownership (TCO) in software is the price of a software product and its operating costs over time.
How about the TCO of the cloud?
Though there are measures to compute TCO, it can only be a high-level approximation and a guide.
For example, the following picture shows the steps for migrating from on-premises to the cloud and using container services.
In step 4, the platform cost can be more realistic depending on the choice of container service, instance types, savings plan, etc.
How about the modernization effort?
How about the post-migration operations?
Serverless Development on AWS (O’Reilly, 2024) highlights engineering, delivery, operations, and maintenance as additional costs when computing the TCO of serverless operations.
Luc van Donkersgoed’s post below highlights, in a simple way, why calculating TCO is not as easy as everyone thinks.
Even though the adoption of serverless brings several proven benefits to an organization, it is a challenge to highlight its benefits based on the TCO.
Why?
Simply extrapolate what Luc said above with your organization's number of divisions, departments, teams, engineers’ skills versus their remuneration, cloud accounts, the mix of serverless and non-serverless services, annual cloud support cost, and everything else I missed here.
So, what do we do instead?
We pull the monthly cloud bills of 100, 200, or 1000 dollars of a few accounts, compare the costs of serverless and non-serverless operations, and assess TCO!
How Not To Become A Cost Victim In Serverless
It was tempting to title this section ‘How to become cost efficient in serverless?’
I resisted.
Whenever we hear the words cost-effective or cost-efficient, our brains automatically translate them as cheaper or lower-cost. This is not always the case in serverless.
In certain situations, you may be willing to spend more than on a like-for-like non-serverless solution, for valid reasons — the trade-offs!
As I once wrote, you sometimes balance the cost with convenience.
The book Serverless Development on AWS: Building Enterprise-scale Serverless Solutions (O’Reilly, 2024) has an entire chapter dedicated to discussing the Cost of Serverless Operation.
The following sections highlight a few ways to become cost-conscious.
The more you learn, the more you earn. — Warren Buffett.
Serverless cost education
The best way to raise awareness about a subject is to discuss it. Conferences can include more talks and discussions about this topic.
Yan Cui's talk, Money Saving Tips for the Frugal Serverless Developer, contains great insights on cost.
Though enterprises don’t share their cloud spend or TCO, they can share the trends and patterns and their impact on price. Such insights are great lessons for others.
In the talk, Standing on the Shoulders of Giants — Embracing Serverless in the Enterprise, Luc van Donkersgoed often expresses how to understand the cost of serverless and the need for education and optimization.
Luc shared the serverless adoption at PostNL as a case study in the book Serverless Development on AWS.
The value of approximation is better than no value.
Upfront cost estimation
Well-organized software teams conduct architecture or solution design reviews before building their applications. Such meetings are ideal for discussing costs and making projections based on operational expectations.
This practice
adds cost estimation as a process to the team
brings awareness of the expected cost of a service or feature
removes cost ambiguities and minimizes unpleasant surprises
helps optimize the architecture and agree on trade-offs
There will be use cases where a Lambda function cannot meet the computing power or be cost-viable compared to specific container instances. Recognize such situations early.
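One way to recognize such situations early is a break-even calculation: below what monthly invocation volume does a pay-per-use function stay cheaper than a fixed-price container? All figures here are hypothetical, with the 200-dollar container bill borrowed from the earlier illustration.

```python
# Hypothetical break-even sketch: Lambda pay-per-use vs. a fixed
# 200-dollar-a-month container. All rates and workload numbers are
# illustrative, not a pricing reference.
LAMBDA_GB_SECOND = 0.0000166667
LAMBDA_PER_MILLION = 0.20
CONTAINER_MONTHLY = 200.0

def lambda_cost(invocations, duration_ms=100, memory_gb=0.5):
    gb_s = invocations * duration_ms / 1000 * memory_gb
    return gb_s * LAMBDA_GB_SECOND + invocations / 1e6 * LAMBDA_PER_MILLION

def break_even_invocations(duration_ms=100, memory_gb=0.5):
    # Pay-per-use cost is linear in invocations, so solve directly.
    per_invocation = (duration_ms / 1000 * memory_gb) * LAMBDA_GB_SECOND \
        + LAMBDA_PER_MILLION / 1e6
    return int(CONTAINER_MONTHLY / per_invocation)
```

Under these assumptions the crossover sits near 200 million invocations a month; below that, the function wins, and above it the fixed container starts looking attractive, which is exactly the kind of trade-off worth surfacing in a design review.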
The use of apt patterns and practices
The AWS Well-Architected Framework should be a starting point for anyone architecting, building, and operating on AWS.
As I echoed in one of my previous articles, grasping the contents of the Well-Architected Framework is not easy. Hence, there are condensed options for serverless teams, such as Serverless Applications Lens and SCORPS.
Yan Cui’s talk, which I referred to earlier, has several patterns and recommendations.
Serverless teams should focus equally on architecture and cost of operation.
Use the apt service with the right configuration
This is related to the above; however, with the granular nature of managed serverless services, it is essential to focus deeper on service considerations.
One size does not fit all.
Just because a managed service with a given set of configurations works well in terms of cost efficiency does not mean the same configurations will work as well or better in a different use case.
Engineers should be able to identify such nuances while detailing their architectures to build and operate cost-optimized applications.
Continuously refactor to keep your applications efficient and optimized for cost, performance, and sustainability.
You build it; you pay for it. — Serverless Development on AWS (O’Reilly, 2024)
Cost accountability
Cost visibility must be brought to every team that deploys and operates serverless workloads.
Engineers must understand monthly cost variations and the reasons. Performing well-architected reviews with the SCORPS process is an excellent way for engineers to work together, learn, and improve.
In a shared cloud account environment, distribute cost allocation tags per team.
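As a minimal sketch, a shared tagging convention can be a plain helper applied through your IaC tooling; the key names here are hypothetical, and the chosen keys must be activated as cost allocation tags in the billing console before Cost Explorer can split the bill by them.

```python
# A hypothetical per-team tagging convention. Once the "team" key is
# activated as a cost allocation tag in the billing console, Cost
# Explorer can split a shared account's bill by team.
def cost_tags(team, service, env):
    return {"team": team, "service": service, "environment": env}

# e.g. attached via IaC to every resource a team deploys:
# Tags: cost_tags("payments", "refund-api", "prod")
```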
Follow these three principles toward cost accountability.
🔶Create cost awareness
Educate engineers about cloud costs.
Teach them how to estimate the cost from the design or architecture of a service or feature.
🔶Make cost visible
Share the monthly cloud bill with the team.
Let engineers understand and analyze the costs and variations.
🔶Instill cost accountability
Let the team identify hotspots and cost concerns and drill down to specific areas/services.
Let engineers own those areas to identify and act on improvement.
Create cost awareness. Make cost visible. Instill cost accountability.
Conclusion
Cost is part of every business operation.
Cost is part of every cloud operation and, thereby, every serverless operation.
Cost concerns exist in every part of software engineering and cloud computing. We must acknowledge that.
However, how we care for cost and prevent it from spiraling out of control is up to us.
We can draw inspiration from the observability domain, where we employ AI and ML to track and act on anomalies proactively.
If each serverless team proactively tracks cost, it can act on continuous optimization and service refactoring before cost reaches a critical point.
Whether in life or serverless, prevention is always better!
Paying the right price does not always mean paying a lower price. It’s the price you are prepared to pay for the right technology for your job.
In a subsequent article, I will share my thoughts on complexity.
Please like and share your thoughts. Thanks for reading!