Cloud System Computing and Cloud-Designing

1. Cloud Computing

Cloud computing is the on-demand delivery of computing services—including servers, storage, databases, networking, software, and analytics—over the internet ("the cloud"). Instead of buying and maintaining physical data centers and servers, organizations rent access to these resources from cloud providers on a pay-as-you-go basis.

Core Driving Characteristics

  • On-Demand Self-Service: Consumers can provision computing capabilities (such as server time or network storage) automatically as needed, without requiring human interaction with the service provider.

  • Broad Network Access: Capabilities are available over the network and accessed through standard mechanisms by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).

  • Resource Pooling: The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model. Physical and virtual resources are dynamically assigned and reassigned according to consumer demand.

  • Rapid Elasticity: Capabilities can be elastically provisioned and released—in some cases automatically—to scale rapidly outward and inward commensurate with demand.

  • Measured Service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts).
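
Measured service and pay-as-you-go pricing can be illustrated with a small metering sketch. The unit prices, resource names, and the UsageRecord type below are hypothetical and chosen only for illustration; real providers publish their own rates and billing units.

```python
from dataclasses import dataclass

# Hypothetical unit prices, invented for illustration only.
RATES = {
    "vm_hours": 0.05,          # price per VM-hour
    "storage_gb_month": 0.02,  # price per GB stored for one month
    "egress_gb": 0.09,         # price per GB transferred out
}


@dataclass
class UsageRecord:
    resource: str    # must be a key in RATES
    quantity: float  # metered amount in that resource's unit


def monthly_bill(records):
    """Sum metered usage per resource and price it with RATES."""
    totals = {}
    for rec in records:
        totals[rec.resource] = totals.get(rec.resource, 0.0) + rec.quantity
    return {res: round(qty * RATES[res], 2) for res, qty in totals.items()}


usage = [
    UsageRecord("vm_hours", 100),
    UsageRecord("vm_hours", 20),
    UsageRecord("storage_gb_month", 50),
    UsageRecord("egress_gb", 10),
]
bill = monthly_bill(usage)
print(bill, "total:", round(sum(bill.values()), 2))
```

The customer pays only for the quantities that were metered, which is the practical meaning of "measured service".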

Cloud Deployment Models

  • Public Cloud: The cloud infrastructure is provisioned for open use by the general public. It is owned, managed, and operated by a business, academic, or government organization (e.g., Amazon Web Services, Microsoft Azure, Google Cloud Platform).

  • Private Cloud: The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers. It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises.

  • Hybrid Cloud: The cloud infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability.

  • Multi-Cloud: The deliberate use of cloud services from more than one public cloud provider to avoid vendor lock-in, optimize costs, or use specialized services that only some platforms offer.

Cloud Service Models (The SPI Framework)

+------------------------------------------------------------------+
| SaaS (Software as a Service)       <---> [End-User Apps: Gmail]  |
+------------------------------------------------------------------+
| PaaS (Platform as a Service)       <---> [Dev Tools: App Engine] |
+------------------------------------------------------------------+
| IaaS (Infrastructure as a Service) <---> [Raw Hardware: EC2 VMs] |
+------------------------------------------------------------------+

  • Infrastructure as a Service (IaaS): Provides access to fundamental computing resources like physical or virtual machines, raw storage, and firewalls. The user manages the operating system, middleware, and applications (e.g., AWS EC2, Azure VMs).

  • Platform as a Service (PaaS): Provides a managed environment with runtime, tools, and operating systems pre-configured. The user only manages the deployment and configuration of their own application code (e.g., AWS Elastic Beanstalk, Heroku, Google App Engine).

  • Software as a Service (SaaS): Delivers a complete, fully functional software application managed entirely by the provider over a web browser or client interface (e.g., Microsoft 365, Salesforce, Google Workspace).
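
The split between what the user manages and what the provider manages under each service model can be sketched as a lookup table. The layer names below are a simplified assumption made for illustration (they leave out data and user-access management, which stay with the customer in practice), not an official taxonomy.

```python
# Layers of a simplified stack, top to bottom (an illustrative assumption).
LAYERS = ["application", "runtime_and_middleware", "operating_system",
          "virtualization", "physical_hardware"]

# Layers the customer manages under each service model, per the descriptions above.
CUSTOMER_MANAGED = {
    "IaaS": {"application", "runtime_and_middleware", "operating_system"},
    "PaaS": {"application"},
    "SaaS": set(),
}


def who_manages(model, layer):
    """Return 'customer' or 'provider' for a layer under a service model."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "customer" if layer in CUSTOMER_MANAGED[model] else "provider"


# Print one row per layer showing who manages it under each model.
for layer in LAYERS:
    row = "  ".join(f"{m}={who_manages(m, layer):<8}" for m in CUSTOMER_MANAGED)
    print(f"{layer:<24}{row}")
```

Reading the output from IaaS to SaaS shows the customer-managed portion shrinking as the provider takes over more of the stack.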

2. Cloud Architecture

Cloud Architecture defines the conceptual blueprint, logical structure, and relationships between the software components, virtual resources, and services that make up a cloud environment.

Shared Responsibility Model

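The shared responsibility model establishes where the security and operational responsibilities of the cloud provider end and where those of the customer begin. The boundary shifts with the service model: the customer carries the most responsibility under IaaS and the least under SaaS.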

  • Provider Responsibility ("Security of the Cloud"): Protecting the physical infrastructure, global facilities, edge locations, virtualization layer, host hardware, and baseline networking that run the cloud services.

  • Customer Responsibility ("Security in the Cloud"): Managing guest operating systems, application patches, network security configurations (firewalls), identity and access management (IAM), data encryption, and corporate data assets.

Fundamental Architectural Concepts

  • Virtualization: The core enabling technology of cloud computing. A software layer called a hypervisor (Type 1, bare-metal, or Type 2, hosted) abstracts physical hardware resources (CPU, memory, storage) into multiple isolated virtual machines (VMs).

  • Microservices Architecture: A design approach where an application is broken down into a collection of loosely coupled, independently deployable, and small modular services that communicate via lightweight APIs (HTTP/REST or gRPC).

  • Containers: An operating-system-level virtualization method used to deploy and run applications without utilizing an entire VM guest operating system. Containers share the host OS kernel, making them lightweight, fast to boot, and highly portable (e.g., Docker).

  • Container Orchestration: Automation tooling required to manage the lifecycle, scaling, networking, and deployment of thousands of containers across a cluster of host machines (e.g., Kubernetes).

  • Serverless Computing: An execution model in which the cloud provider dynamically manages the allocation and provisioning of machine resources. Code is executed in response to discrete events via Functions-as-a-Service (FaaS), and users are billed for what they use, typically per invocation and per unit of execution time (often metered in milliseconds), rather than for idle capacity (e.g., AWS Lambda, Google Cloud Functions).
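
The event-driven execution model behind FaaS can be sketched as a single stateless function. The (event, context) signature mirrors the convention used by AWS Lambda's Python runtime; the event shape, the response format, and the local loop that stands in for the platform are assumptions made for illustration.

```python
import json


def handler(event, context):
    """Hypothetical FaaS entry point: runs once per event and keeps no state between calls."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"Hello, {name}!"})}


# Local simulation of the platform invoking the function for two separate events.
for evt in ({"name": "Ada"}, {}):
    print(handler(evt, context=None))
```

Because the function holds no state between invocations, the platform is free to run zero, one, or many copies of it depending on how many events arrive.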

3. Cloud Infrastructure

Cloud Infrastructure refers to the physical and virtual components—hardware, storage, networks, and virtualization software—that form the substrate required to build and sustain cloud environments.

Physical Enterprise Foundation

  • Regions: Geographical areas across the globe that contain multiple, isolated, and physically separated data centers.

  • Availability Zones (AZs): Distinct locations within a single region. Each AZ consists of one or more physical data centers engineered to be isolated from failures in other AZs (using independent power, cooling, and physical security), yet connected to the other AZs in the region via low-latency fiber-optic networking.

  • Edge Locations: Distributed points of presence (PoPs) located near high-population areas. They cache data close to end-users to reduce latency when delivering content via Content Delivery Networks (CDNs).
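
The value of spreading a workload across several AZs can be estimated with a short calculation. The sketch below assumes that zones fail independently, which real zones only approximate, and the 99% single-zone availability is a made-up figure used purely for illustration.

```python
def combined_availability(single_az, zones):
    """Probability that at least one of `zones` independent AZs is up."""
    return 1 - (1 - single_az) ** zones


# Compare one, two, and three zones at a hypothetical 99% per-zone availability.
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions each additional zone multiplies the chance of a total outage by the single-zone failure probability, which is why multi-AZ deployment is a common reliability measure.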

Software-Defined Resources

Modern cloud infrastructure abstracts physical hardware completely through software-defined mechanisms:

  • Compute Resources: Virtualized processing power. Instances or VMs can be provisioned on demand, usually within minutes, with a chosen number of CPU cores, amount of RAM, and optional hardware accelerators (GPUs, TPUs).

  • Software-Defined Storage (SDS):

    • Block Storage: High-performance, low-latency raw storage volumes attached directly to virtual instances, acting like physical hard drives (e.g., AWS EBS).

    • Object Storage: Flat, highly scalable storage architecture that stores data as unstructured objects along with rich metadata and a unique identifier, accessible over HTTP (e.g., AWS S3, Google Cloud Storage).

    • File Storage: Shared file systems accessible simultaneously by multiple computing instances using network protocols like NFS or SMB.

  • Software-Defined Networking (SDN): Virtualized networking infrastructure managed via software control planes, so that virtual routers, firewalls, and subnets can be created programmatically instead of by reconfiguring physical hardware.
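
Programmatic network creation can be illustrated with Python's standard ipaddress module, which is enough to plan how an address range is carved into subnets. The address range, zone names, and the plan_subnets helper are hypothetical; an actual SDN control plane would be driven through the provider's own API.

```python
import ipaddress


def plan_subnets(network_cidr, zones, new_prefix=24):
    """Assign one subnet per zone by carving the network range into equal blocks."""
    network = ipaddress.ip_network(network_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # iterator over sub-networks
    return {zone: str(next(blocks)) for zone in zones}


plan = plan_subnets("10.0.0.0/16", ["zone-a", "zone-b", "zone-c"])
for zone, cidr in plan.items():
    print(zone, cidr)
```

Because the layout is computed by code, the same plan can be regenerated or changed without touching any physical network equipment.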

4. Cloud Designing

Cloud Designing is the disciplined practice of architecting scalable, resilient, secure, and cost-optimized system topologies within a cloud ecosystem.

The Five/Six Pillars of Cloud Design Frameworks

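Modern systems are designed around published industry frameworks such as the AWS Well-Architected Framework and the Microsoft Azure Well-Architected Framework. The AWS framework defines six pillars; the Azure framework defines five and does not list sustainability as a separate pillar: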

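+------------------------+------------------------------------+--------------------------------------------------+
| Pillar                 | Core Operational Objective         | Common Implementations                           |
+------------------------+------------------------------------+--------------------------------------------------+
| Operational Excellence | Running and monitoring systems to  | Continuous Integration/Continuous Deployment     |
|                        | deliver business value.            | (CI/CD), logging consolidation, infrastructure   |
|                        |                                    | tracking.                                        |
+------------------------+------------------------------------+--------------------------------------------------+
| Security               | Protecting data, assets, and       | Identity & Access Management (IAM), data         |
|                        | infrastructure systems.            | encryption at rest and in transit, Network       |
|                        |                                    | Access Control Lists (NACLs).                    |
+------------------------+------------------------------------+--------------------------------------------------+
| Reliability            | Recovering from infrastructure or  | Multi-AZ deployments, data backups, cross-region |
|                        | service disruptions.               | replication, automated failover.                 |
+------------------------+------------------------------------+--------------------------------------------------+
| Performance Efficiency | Using computing resources          | Auto Scaling groups, choosing correct instance   |
|                        | efficiently to meet demands.       | classes, leveraging managed database caches.     |
+------------------------+------------------------------------+--------------------------------------------------+
| Cost Optimization      | Eliminating unneeded instances or  | Right-sizing instances, selecting spot           |
|                        | suboptimal spending.               | instances, setting up automated resource         |
|                        |                                    | shutdown policies.                               |
+------------------------+------------------------------------+--------------------------------------------------+
| Sustainability         | Minimizing environmental impacts   | Energy-efficient architecture patterns,          |
|                        | of cloud footprints.               | minimizing data movement and storage overhead.   |
+------------------------+------------------------------------+--------------------------------------------------+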

Core Engineering Design Patterns

High Availability & Disaster Recovery (DR)

Designing for infrastructure failure means placing redundant copies of compute and data in separate failure domains (zones or regions) and deciding how state is replicated between them:

  • Active-Active Deployment: Traffic is routed across multiple computing nodes running in parallel in different zones or regions. If one zone fails, the remaining zones absorb its traffic, so they must be provisioned with enough spare capacity to do so.

  • Active-Passive Deployment: A primary infrastructure environment handles all production workloads, while a secondary backup site sits idle or scaled down. If a disaster strikes, traffic is re-routed to the secondary site, a process known as failover.
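
The two deployment styles differ mainly in how a router picks a target, which the following sketch tries to show. The site names and the set of healthy sites are hypothetical inputs; in practice health information would come from load balancer or DNS health checks.

```python
import itertools


def active_passive(primary, secondary, healthy):
    """Send everything to the primary; fail over only when it is unhealthy."""
    if primary in healthy:
        return primary
    if secondary in healthy:
        return secondary
    raise RuntimeError("no healthy site available")


def active_active(sites, healthy):
    """Return an endless round-robin over the sites that are currently healthy."""
    live = [s for s in sites if s in healthy]
    if not live:
        raise RuntimeError("no healthy site available")
    return itertools.cycle(live)


sites = ["zone-a", "zone-b", "zone-c"]
rr = active_active(sites, healthy={"zone-a", "zone-c"})  # zone-b has failed
print([next(rr) for _ in range(4)])
print(active_passive("site-1", "site-2", healthy={"site-2"}))  # primary is down
```

In the active-active case the surviving zones simply receive a larger share of requests, while in the active-passive case the secondary site receives traffic only after the primary is marked unhealthy.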

Elasticity & Auto Scaling

Elastic systems adapt to real-time traffic changes without manual intervention.

          [Cloud Monitoring (CloudWatch)]
                         |
                         |  {Triggers Metric Threshold}
                         v
             [Auto Scaling Controller]
              |                     |
              v (Scale Out)         v (Scale In)
   "Spin up new instances"  "Terminate idle nodes"

  • Horizontal Scaling (Scaling Out/In): Adding or removing computing instances (e.g., adding more small servers to a cluster) dynamically based on metric thresholds like CPU usage or request volume.

  • Vertical Scaling (Scaling Up/Down): Modifying an existing single instance by adding more raw power (more CPU cores or RAM). This often requires downtime or instance restarts.
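
The threshold-driven horizontal scaling described above can be sketched as a small decision function. The 70% and 30% CPU thresholds, the fleet limits, and the one-instance step size are hypothetical values for illustration; managed auto scaling services expose comparable settings under their own names.

```python
def desired_instances(current, cpu_percent, *, scale_out_at=70.0, scale_in_at=30.0,
                      minimum=1, maximum=10):
    """Threshold-based horizontal scaling: add or remove one instance per evaluation."""
    if cpu_percent > scale_out_at:
        target = current + 1   # scale out under load
    elif cpu_percent < scale_in_at:
        target = current - 1   # scale in when mostly idle
    else:
        target = current       # within the comfortable band, do nothing
    return max(minimum, min(maximum, target))  # never leave the allowed fleet size


# Evaluate a three-instance fleet at high, moderate, and low average CPU usage.
for cpu in (85.0, 50.0, 10.0):
    print(cpu, desired_instances(3, cpu))
```

Keeping a gap between the scale-out and scale-in thresholds is a deliberate choice in this sketch: it prevents the fleet from oscillating when the metric hovers near a single cut-off value.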

Infrastructure as Code (IaC)

A foundational design practice in which infrastructure configurations are written, versioned, and managed as declarative text files instead of through manual console clicks. This makes infrastructure provisioning fast, repeatable, and far less error-prone (e.g., Terraform, AWS CloudFormation, OpenTofu).
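
The declarative idea behind IaC can be illustrated with a toy planner that compares a desired state with the actual state and lists the actions needed to converge. This is a simplified sketch, not how Terraform, CloudFormation, or OpenTofu are implemented; the resource names and attributes are hypothetical.

```python
def plan(desired, actual):
    """Diff desired vs. actual resources and list the actions needed to converge."""
    actions = []
    for name, config in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != config:
            actions.append(("update", name))
    # Anything that exists but is no longer declared gets removed.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


desired = {"web": {"size": "small", "count": 3}, "db": {"size": "large", "count": 1}}
actual = {"web": {"size": "small", "count": 2}, "cache": {"size": "small", "count": 1}}
print(plan(desired, actual))
```

Because the desired state lives in a versioned file, running the same plan twice against a converged environment yields no actions, which is what makes the process repeatable.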