Serverless (FaaS) System Computing & Design

1. Serverless & FaaS Computing

Serverless computing is an execution model where the cloud provider dynamically manages the allocation, provisioning, and scaling of machine resources. The term "serverless" is a misnomer; servers still exist, but they are completely abstracted away from the developer. You do not provision, patch, manage, or scale virtual machines or physical hardware.

Functions-as-a-Service (FaaS)

FaaS is the central programmatic component of serverless computing. Instead of deploying a continuously running monolithic application or a long-lived container, developers break application logic down into small, single-purpose, discrete blocks of code called Functions.

Core Characteristics

  • Event-Driven Execution: Functions do not sit idle listening for requests. They remain dormant until a specific event trigger occurs—such as an HTTP request via an API gateway, a file upload to object storage, a database modification, or a scheduled cron timer.

  • Millisecond-Granularity, Pay-Per-Use Billing: Traditional cloud instances are billed by the hour, minute, or second for as long as they are running, whether they are processing traffic or sitting idle. Serverless models meter usage by the number of executions and by execution duration, typically measured in milliseconds and multiplied by the allocated memory size. If your function runs zero times, your compute cost for that function is zero.

  • Ephemeral Lifespan: Function instances are temporary. An instance spins up to process an event, returns the output, and may be reused for later events for a short time before the platform tears it down. Functions must therefore be treated as stateless: nothing saved locally is guaranteed to survive between runs.
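
The pay-per-use billing model described above can be sketched as simple arithmetic. This is a minimal illustration, not a pricing calculator: the function name `invocation_cost` and both rates are assumptions chosen for the example, and real providers publish their own rates, rounding rules, and free tiers.

```python
def invocation_cost(invocations, avg_duration_ms, memory_mb,
                    price_per_gb_second, price_per_million_requests):
    """Estimate compute cost as executions x duration x allocated memory."""
    # Duration is metered in milliseconds; memory is expressed in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_fee = (invocations / 1_000_000) * price_per_million_requests
    return gb_seconds * price_per_gb_second + request_fee


# Hypothetical rates, for illustration only.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_REQUESTS = 0.20

# 3 million invocations per month, 120 ms each, 512 MB allocated.
busy = invocation_cost(3_000_000, 120, 512,
                       PRICE_PER_GB_SECOND, PRICE_PER_MILLION_REQUESTS)

# A function that is never triggered accrues no compute cost.
idle = invocation_cost(0, 120, 512,
                       PRICE_PER_GB_SECOND, PRICE_PER_MILLION_REQUESTS)

print(f"busy month: {busy:.2f}")
print(f"idle month: {idle:.2f}")
```

Under these assumed rates, the cost scales linearly with each of the three metered quantities, which is why trimming execution time or memory has a direct effect on the bill.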

2. Serverless & FaaS Architecture

Serverless architecture focuses on event routing networks, stateless execution runtimes, and distributed service components.

The Systemic Blueprint

  • 1. Event Providers & Triggers: The system components that detect state changes (e.g., an AWS S3 file upload event, an incoming Stripe webhook call) and emit an event payload.

  • 2. API Gateway & Routing Plane: Validates incoming HTTP requests, checks authorization policies, matches routes, and maps the request payloads to the corresponding target function configuration.

  • 3. Execution Routing Controller: An internal control plane that checks whether an instance of the function is already running and free to take the event. If none is available, it signals the internal orchestrator to start one.

  • 4. Downstream Managed Services: Because the function cannot natively persist state, it connects securely over the network to external decoupled services like managed databases (e.g., Amazon DynamoDB), cache layers (Redis), or external object stores.
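
The gateway and routing step in this blueprint can be sketched as a small dispatch table. This is a toy model under stated assumptions: the route table, the token set, and the names `gateway_dispatch`, `create_order`, and `get_health` are all hypothetical, and a real API gateway is a managed service configured declaratively rather than written by hand.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[dict], dict]


def create_order(event: dict) -> dict:
    # Hypothetical target function for POST /orders.
    return {"status": 201, "body": {"order": event["body"]}}


def get_health(event: dict) -> dict:
    # Hypothetical target function for GET /health.
    return {"status": 200, "body": "ok"}


# Route table: (method, path) -> target function.
ROUTES: Dict[Tuple[str, str], Handler] = {
    ("POST", "/orders"): create_order,
    ("GET", "/health"): get_health,
}
PUBLIC_ROUTES = {("GET", "/health")}  # routes that skip authorization
VALID_TOKENS = {"demo-token"}         # stand-in for a real authorizer


def gateway_dispatch(request: dict) -> dict:
    """Match the route, check authorization, then map the payload to the function."""
    key = (request.get("method"), request.get("path"))
    handler = ROUTES.get(key)
    if handler is None:
        return {"status": 404, "body": "no route"}
    if key not in PUBLIC_ROUTES and request.get("token") not in VALID_TOKENS:
        return {"status": 401, "body": "unauthorized"}
    event = {"route": key, "body": request.get("body")}
    return handler(event)


print(gateway_dispatch({"method": "GET", "path": "/health"}))
print(gateway_dispatch({"method": "POST", "path": "/orders",
                        "token": "demo-token", "body": {"sku": "A1"}}))
```

The functions themselves never see authentication or routing details; they receive only the event payload the gateway builds for them.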

The Execution Lifespan: Cold Starts vs. Warm Starts

Understanding how a function comes alive is the primary architectural consideration when designing serverless systems.

Cold Start (Latency Penalty: roughly 100 ms to 3,000 ms)
[Trigger] ---> [Provision Container] ---> [Bootstrap Runtime] ---> [Init Code/Deps] ---> [Execute App Logic]
                                                                                                ^
                                                                                    Container remains cached
                                                                                                |
Warm Start (Latency Optimized: typically a few ms)                                              |
[Trigger] ------------------------------------------------------------------------------------>+

  • Cold Start: Occurs when a function is triggered for the first time, after a prolonged period of inactivity, or when the platform must add instances to absorb a burst of concurrent traffic. The platform must allocate a slice of compute, pull the function code package, start a lightweight container or isolation cell, bootstrap the language runtime (such as Node.js or Python), run the global initialization code, and only then run the handler logic. This introduces a noticeable latency penalty.

  • Warm Start: If a function is triggered while a previous container instance is still kept alive ("warm") in the platform's cache, the system skips the provisioning and initialization steps and routes the event payload directly into the active instance. Only the handler logic runs, so the added platform latency is typically negligible.

3. Serverless & FaaS Infrastructure

The infrastructure supporting modern serverless execution requires micro-virtualization tools that can spin up isolated, secure compute spaces in a fraction of a second.

Micro-Virtualization Substrates

Standard virtual machines can take tens of seconds to minutes to boot, and conventional containers often take a second or more once image pulls are included. To handle thousands of concurrent serverless triggers without severe latency penalties, cloud providers use specialized micro-VM and isolate technologies:

  • Firecracker (AWS): An open-source, minimalist virtual machine monitor written in Rust. It uses the Linux Kernel-based Virtual Machine (KVM) to create multi-tenant micro-VMs that, according to the project's published figures, can start running guest code in as little as 125 milliseconds, with a memory overhead of less than 5 MiB per micro-VM. This allows very dense packing of workloads onto bare-metal hosts.

  • V8 Isolates (Cloudflare Workers): An alternative infrastructure model that avoids per-function virtual machines and containers. It runs code directly inside Google’s open-source V8 JavaScript engine using "Isolates": lightweight contexts that separate memory and variables without needing a guest operating system or container runtime. Because an isolate starts in a few milliseconds, cold starts become close to imperceptible rather than literally zero.

The Global Network Edge (Edge Functions)

Edge Functions shift serverless code deployment out of a single regional cloud data center and replicate it across hundreds of distributed Point-of-Presence (PoP) edge servers globally. When a user makes an API request, the closest physical edge node intercepts the request, runs the serverless function immediately on the spot, and returns the result without sending the packet back to a home data center.

4. Serverless & FaaS Design

Designing serverless systems requires breaking down architectures into highly isolated, decoupled services and carefully handling state and network connection pools.

Core Architectural Design Principles

1. Managing Database Connection Starvation

Traditional databases (like PostgreSQL or MySQL) expect a modest number of long-lived TCP connections, and each open connection consumes server memory (PostgreSQL, for example, dedicates a backend process to each one). In a serverless environment, if 10,000 users hit a function simultaneously, the FaaS platform can scale out to thousands of concurrent instances, up to the configured concurrency limit, each opening its own connection. This can overwhelm a traditional database, causing it to exhaust its connection limit, run out of memory, or crash.

Design Fix: Place a connection-pooling intermediary (such as AWS RDS Proxy) between the functions and the database, or migrate to a cloud-native database that is accessed over an HTTP API (like DynamoDB or PlanetScale) and is built to absorb large connection spikes.
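
The effect of a pooling intermediary can be sketched with a toy simulation: many concurrent callers share a small, fixed number of backend connections. This is not how RDS Proxy is implemented; `PoolingProxy` and `simulate` are hypothetical names, threads stand in for function instances, and a short sleep stands in for the database round trip.

```python
import threading
import time


class PoolingProxy:
    """Toy pooling layer: callers queue for one of a few backend connections."""

    def __init__(self, max_backend_connections: int):
        self._slots = threading.BoundedSemaphore(max_backend_connections)
        self._lock = threading.Lock()
        self._active = 0
        self.peak_active = 0  # highest number of backend connections in use at once

    def run_query(self, sql: str) -> str:
        with self._slots:  # block until a backend connection is free
            with self._lock:
                self._active += 1
                self.peak_active = max(self.peak_active, self._active)
            try:
                time.sleep(0.001)  # stand-in for the database round trip
                return f"ok: {sql}"
            finally:
                with self._lock:
                    self._active -= 1


def simulate(function_instances: int, max_backend_connections: int) -> int:
    """Run one query per simulated function instance; return peak backend usage."""
    proxy = PoolingProxy(max_backend_connections)
    threads = [
        threading.Thread(target=proxy.run_query, args=("SELECT 1",))
        for _ in range(function_instances)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return proxy.peak_active


peak = simulate(function_instances=200, max_backend_connections=10)
print(f"200 callers, peak backend connections: {peak}")
```

The database only ever sees the pool's connections, however many function instances are running. The cost is queueing delay: callers wait for a free slot instead of failing outright.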

2. Orchestrating State: Function Chaining vs. Step Functions

Because FaaS functions are stateless and have strict execution time limits (15 minutes on AWS Lambda, for example), complex, long-running workflows cannot be packed inside a single function. However, having one function synchronously invoke another and wait for its response (Function Chaining) is an anti-pattern: it creates a "Double Billing" scenario in which Function A is billed for idle time while it waits for Function B to finish processing.

Function Chaining Anti-Pattern (Wasteful Waiting Costs):
[Function A] ---> (Invokes & Waits) ---> [Function B] ---> (Invokes & Waits) ---> [Function C]

State Machine Orchestration Pattern (Optimized & Decoupled):
[Step Functions State Engine]
      |---> Execs [Fn A] (Tears Down Fn A)
      |---> Analyzes Output State
      |---> Execs [Fn B] (Tears Down Fn B)

Design Fix: Use a centralized state machine orchestrator (such as AWS Step Functions or Azure Durable Functions). The orchestration engine handles workflow state, retries, and conditional branching, and invokes each function only when it has work to do, so no function is billed while waiting on another.

Strategic Sizing & Trade-offs

Design Axis: Memory Sizing
  • Structural Selection: Granular allocation ranges (e.g., 128 MB to 10 GB).
  • Architectural Benefit: Cloud platforms generally scale CPU allocation proportionally with memory, so doubling memory roughly doubles the available CPU.
  • Engineering Trade-off: Over-provisioning memory wastes money if code is strictly I/O bound; under-provisioning slows down CPU-heavy operations.

Design Axis: Stateless Persistence
  • Structural Selection: Externalizing state completely to shared caches or databases.
  • Architectural Benefit: Allows the FaaS platform to scale horizontally without instances conflicting over local data.
  • Engineering Trade-off: Introduces network latency overhead for every read/write call during the function lifecycle.

Design Axis: Monolith vs. Micro-Function
  • Structural Selection: A single function routing multiple paths vs. individual functions per route.
  • Architectural Benefit: "Mono-lambda" configurations keep instances warm longer, reducing cold start frequency.
  • Engineering Trade-off: Prevents granular, per-route security policies; changing one path requires redeploying the entire code package.
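
The memory-sizing trade-off can be made concrete with a little arithmetic. The sketch below rests on two stated assumptions: CPU share scales linearly with allocated memory, so CPU-bound duration shrinks in proportion, and I/O-bound duration is unaffected by memory. The rate and all function names are hypothetical.

```python
import math

PRICE_PER_GB_SECOND = 0.0000166667  # hypothetical rate, for illustration only


def cost_per_invocation(memory_mb: float, duration_ms: float) -> float:
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND


def cpu_bound_duration_ms(base_ms_at_128mb: float, memory_mb: float) -> float:
    # Assumption: CPU scales linearly with memory, so the work finishes faster.
    return base_ms_at_128mb * 128 / memory_mb


def io_bound_duration_ms(wait_ms: float, memory_mb: float) -> float:
    # Assumption: time spent waiting on the network does not depend on memory.
    return wait_ms


for memory_mb in (128, 512, 2048):
    cpu_ms = cpu_bound_duration_ms(1600, memory_mb)
    io_ms = io_bound_duration_ms(200, memory_mb)
    print(
        f"{memory_mb:>5} MB | "
        f"CPU-bound: {cpu_ms:7.1f} ms, cost {cost_per_invocation(memory_mb, cpu_ms):.10f} | "
        f"I/O-bound: {io_ms:5.1f} ms, cost {cost_per_invocation(memory_mb, io_ms):.10f}"
    )
```

Under these assumptions, adding memory to CPU-bound code lowers latency at roughly constant cost, while adding memory to I/O-bound code raises cost with no latency gain. Real workloads sit between the two extremes, so measuring at several sizes is more reliable than reasoning from the model alone.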