Functions as a Service: Four Benefits Worth the Cold-Start Pain

FaaS is oversold for some workloads and undersold for others. Here is our honest take on when Lambda, Azure Functions, and Cloud Run Functions actually pay for themselves — cold starts and all.

John Lane 2023-11-14 6 min read

Functions as a Service — Lambda, Azure Functions, Cloud Run Functions, and the rest — got sold to the market on a simple promise: write a function, never think about servers, pay only for what you use. The reality is more nuanced than that. FaaS is a great fit for some workloads and a mediocre fit for others, and the places it shines are not always the places the marketing focuses on.

We build on all three major FaaS platforms regularly and we also talk customers out of using FaaS when it is the wrong answer. Here are the four benefits that have held up over the last several years, along with an honest account of when the cold-start pain is worth the price of admission.

Benefit One: Event Plumbing Without Servers

The best use of FaaS, in our experience, is as the glue between managed services — the code that runs when a file lands in a bucket, when a queue message arrives, when a database row is updated, when a scheduled timer fires. These workloads have three properties that make FaaS nearly unbeatable: they are short-running, they are triggered by events rather than user requests, and they would otherwise require you to stand up a small service with its own scaling, monitoring, and deployment infrastructure.

A Lambda that transforms uploaded images, an Azure Function that routes inbound email attachments, a Cloud Run function that updates a search index whenever a database row changes — all of these are five to fifty lines of code and would be absurd to build as dedicated services. FaaS lets you write the code, wire it to the event source, and forget about it. No containers, no cluster, no deployment pipeline beyond a git push.
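To make the shape of this glue code concrete, here is a minimal sketch of a Lambda-style handler for the image-upload case. The event structure follows the S3 put-notification payload; the actual fetch-and-resize calls (boto3, Pillow) are only sketched in comments so the self-contained part is the event plumbing itself.

```python
# Hypothetical event-glue handler: one short-lived unit of work per event.
# The bucket/key parsing matches the S3 event notification structure.

def extract_s3_keys(event):
    """Return (bucket, key) pairs from an S3 put-event payload."""
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
    ]

def handler(event, context):
    """Lambda entry point, wired to the bucket's event notifications."""
    for bucket, key in extract_s3_keys(event):
        # In the real function you would fetch the object with boto3,
        # resize it with Pillow, and write the thumbnail back:
        #   body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        #   img = Image.open(io.BytesIO(body)); img.thumbnail((256, 256))
        print(f"processing s3://{bucket}/{key}")
    return {"processed": len(event.get("Records", []))}
```

Everything above the comment block is the whole service: no listener, no scaling policy, no deploy pipeline beyond shipping the function.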

The cold-start pain is irrelevant here because events rarely care if they wait 200 milliseconds to start processing. The glue code is the classic FaaS sweet spot and it is where we recommend starting if you are new to the pattern.

Benefit Two: Genuinely Unpredictable Traffic Patterns

The second benefit is for workloads where traffic is genuinely unpredictable — bursty, spiky, or driven by external events you cannot forecast. Think webhook receivers, notification handlers, SMS response processors, campaign landing pages, or internal tools used by a handful of people at unpredictable times. These workloads are characterized by long idle periods punctuated by short bursts.

On a traditional server, you either pay for idle capacity most of the time or you get crushed when the burst arrives. On FaaS, you pay cents during the idle period and the platform provisions capacity on demand when the burst hits. For truly bursty workloads we have seen FaaS bills come in at under 10 percent of the equivalent reserved-instance cost.

The catch is that "truly bursty" has to be tested against your actual traffic. Workloads that look bursty on a weekly chart often turn out to be smooth on an hourly chart. If your traffic is consistent enough that a single small VM can handle it, FaaS will cost more, not less.
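The burstiness test is easy to run as arithmetic. Here is a back-of-the-envelope sketch with made-up prices — the rates below are illustrative placeholders, not current list prices for any platform — comparing a pay-per-invocation bill against a small always-on VM:

```python
# Illustrative rates only; substitute your platform's actual pricing.
GB_SECOND_PRICE = 0.0000166667  # per GB-second of function compute
REQUEST_PRICE = 0.0000002       # per invocation
VM_MONTHLY_PRICE = 30.0         # small always-on VM, placeholder figure

def monthly_faas_cost(invocations, avg_duration_s, memory_gb):
    """Estimate a month's FaaS bill for a given workload shape."""
    compute = invocations * avg_duration_s * memory_gb * GB_SECOND_PRICE
    requests = invocations * REQUEST_PRICE
    return compute + requests

# A bursty webhook receiver: 200k invocations/month, 150 ms each, 256 MB.
faas = monthly_faas_cost(200_000, 0.15, 0.25)
print(f"FaaS: ${faas:.2f}/mo vs always-on VM: ${VM_MONTHLY_PRICE:.2f}/mo")
```

Run the same arithmetic with your real invocation counts on an hourly window: if the function is busy most of every hour, the comparison flips quickly in the VM's favor.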

Benefit Three: Low-Maintenance Internal Tooling

The third benefit is internal tooling — the scripts, utilities, and admin interfaces that keep your organization running but that nobody wants to maintain. Reporting jobs, data exports, ETL pipelines between SaaS products, one-off automation for the help desk. These are the projects that traditionally end up as a cron job on someone's workstation, a Windows scheduled task on a forgotten server, or a Python script that only runs when the last person who knew about it still works here.

FaaS is a legitimately better home for these workloads than a forgotten VM. The function has version control, the trigger is explicit, the logs are centralized, and when the person who wrote it leaves, the next person can find it. The economics barely matter here because the compute cost is negligible — what matters is that the code lives somewhere durable and discoverable.

We have moved hundreds of legacy cron jobs onto FaaS platforms and the main benefit is almost always operational, not economic. The scripts become visible. They get monitored. They get retired when nobody is using them anymore. None of that happens to a script on someone's laptop.
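The migration itself is usually mechanical. A sketch of what a workstation cron job looks like after the move, assuming a scheduled trigger (an EventBridge rule, an Azure timer trigger, or Cloud Scheduler) — the report logic here is a stand-in for whatever the original script did:

```python
# Hypothetical nightly reporting job, moved from someone's crontab onto a
# scheduled FaaS trigger. The rows and totals are stand-in data.
import datetime
import json

def build_report(rows):
    """Summarize the day's rows; stands in for the real export logic."""
    return {
        "generated_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "row_count": len(rows),
        "total": sum(r["amount"] for r in rows),
    }

def handler(event, context):
    """Entry point fired by the schedule, not by a user."""
    rows = [{"amount": 10}, {"amount": 32}]  # stand-in for a DB query
    report = build_report(rows)
    print(json.dumps(report))  # lands in centralized, searchable logs
    return report
```

The code barely changes; what changes is that the trigger, the logs, and the ownership are now visible to everyone rather than to one laptop.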

Benefit Four: Cheap Experimentation

The fourth benefit is the ability to stand up an experiment without committing to infrastructure. A developer can deploy a new endpoint in hours, see whether anyone uses it, and either promote it to a real service or delete it without ever paying for a server. That speed-of-experimentation is a genuine productivity advantage, especially for teams that are still figuring out what to build.

The honest caveat is that most experiments get "promoted" in place rather than rewritten. A Lambda that was supposed to be a three-month experiment turns into a production dependency serving live traffic two years later, still with its original 200-millisecond cold start and its original scrappy error handling. That is fine if everyone knows about it and is comfortable with it. It becomes a problem when the experiment gets on the critical path without anyone noticing.

When FaaS Is The Wrong Answer

We owe you the other side of this. FaaS is a bad fit for:

  • Sustained high-throughput workloads. If your function runs constantly for minutes at a time under load, you are paying the FaaS premium without getting the FaaS benefit. Move to containers.
  • Latency-sensitive user-facing APIs. Cold starts are real and they matter for user experience. Yes, you can keep functions warm with provisioned concurrency, and yes, it partially helps, and no, it does not fully close the gap with a dedicated service. If your p99 latency requirement is under 200 milliseconds, think carefully.
  • Long-running tasks. FaaS platforms cap function duration (typically 15 minutes for Lambda, longer for Cloud Run Functions). Anything that needs to run for an hour belongs on a container or a VM, not on FaaS.
  • Anything with significant state. Functions are stateless by design. State lives in databases, queues, and caches. If you find yourself fighting the platform to hold state between invocations, you are using the wrong tool.
  • Predictable steady-state workloads at scale. A reserved-instance VM or a small Kubernetes deployment will beat FaaS on cost for any workload where you know roughly what capacity you need.

The Cold-Start Conversation

Cold starts deserve an honest paragraph because the topic is misunderstood in both directions. Yes, cold starts are real — a Lambda in Python or Node.js typically adds 100 to 400 milliseconds on the first invocation of a new container. A Lambda in .NET or Java can add several seconds unless you use SnapStart or a GraalVM native image. No, cold starts do not matter for every workload — most event-driven glue code, internal tooling, and bursty traffic is insensitive to a few hundred milliseconds of initial latency.

The honest rule: if a cold start is visible to an end user in a user-facing flow, plan around it (provisioned concurrency, warm pools, or a different runtime). If the cold start is visible only to a backend system that is already tolerant of latency, ignore it.
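Before planning around cold starts, measure how often they actually occur. A common trick, sketched here, relies on module-level code running once per container: a global flag distinguishes the first invocation (cold) from a reused container (warm), so you can log the cold-start rate alongside your latency metrics.

```python
# Minimal cold-start detector: module scope runs once per container,
# so the flag is True only on the container's first invocation.
import time

_CONTAINER_STARTED = time.monotonic()
_cold = True

def handler(event, context):
    global _cold
    was_cold = _cold
    _cold = False
    age = time.monotonic() - _CONTAINER_STARTED
    # Emit alongside request latency to see what share of real traffic
    # actually lands on a cold container.
    print(f"cold_start={was_cold} container_age_s={age:.1f}")
    return {"cold_start": was_cold}
```

If the logs show that only a fraction of a percent of user-facing requests hit a cold container, provisioned concurrency may be solving a problem you do not have.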

Three Takeaways

  1. FaaS is event plumbing first, application runtime second. The event-glue use case is where FaaS shines the brightest and where the economics are the most compelling.
  2. Cold starts are not the reason to avoid FaaS — sustained load is. The real test is whether your workload has enough idle time to justify paying the per-invocation premium.
  3. Experiments have a way of becoming production. Treat every FaaS function you deploy as if it might be running in three years, because statistically, some of them will be.
