
System Design & Scalability

System-Design-and-Scalability.png

Overview

  • Types of Scaling & Load Balancing

  • Data Storage Design

  • Message-oriented Middleware (MOM)

  • Fault Handling

  • Performance Metrics

  • MapReduce

Types of Scaling & Load Balancing

Horizontal vs. Vertical Scaling

Horizontal-vs-Vertical-Scaling.png

Load Balancing

  • Round Robin - requests are handed to servers in rotating order

  • Source IP Hash - a given client IP address will always go to the same server (see the sketch after this list)

  • Request Hash - a given request type will always go to the same server/cache; avoids cache duplication

  • Least Connections - route to the server with the fewest active connections

  • Least Traffic - route to the server currently handling the least data

  • Least Latency - route to the server responding the fastest
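
As an illustration, here is a minimal sketch (in Python, with a hypothetical three-server pool) of the first two strategies, Round Robin and Source IP Hash:

```python
import hashlib
from itertools import cycle

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical server pool

# Round Robin: hand out servers in rotating order.
_rotation = cycle(SERVERS)

def round_robin() -> str:
    return next(_rotation)

# Source IP Hash: the same client IP always maps to the same server.
def source_ip_hash(client_ip: str) -> str:
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

print(round_robin())                   # app-1, then app-2, app-3, app-1, ...
print(source_ip_hash("203.0.113.7"))   # always the same server for this IP
```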


Data Storage Design

Database - Structure

  • Relational - general purpose for tabular/table-based data

  • Specialized - for data structures that don't easily fit the tabular format (e.g. multi-level nesting & hierarchies)

    • NoSQL

    • Others

Database - Read vs. Write Performance

  • Normalization - the process of organizing data to minimize redundancy (removing duplicate columns in different tables). In relational DBs, normalization involves dividing a DB into several tables and defining relationships between those tables using foreign keys.

  • Normalize vs. Denormalize

    • Normalize - ↓duplicate data ⇨ ↑write perf but ↓read perf

    • Denormalize - ↑duplicate data ⇨ ↑read perf but ↓write perf

  • Have your cake and… - use an append-only structure for writes, then asynchronously restructure the data into a read-optimized format (see the sketch below)
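
A minimal sketch of that pattern, assuming an in-process list as the append-only log and a hypothetical per-user purchase-totals read view (a real system would use a durable log and a background job):

```python
from collections import defaultdict

# Write path: append-only event log - writes never update rows in place.
event_log = []

def record_purchase(user: str, amount: float) -> None:
    event_log.append({"user": user, "amount": amount})

# Read path: a denormalized, read-optimized view, rebuilt asynchronously
# in practice (invoked synchronously here for brevity).
totals_by_user = defaultdict(float)
_cursor = 0  # how far into the log the view has been applied

def refresh_read_view() -> None:
    global _cursor
    for event in event_log[_cursor:]:
        totals_by_user[event["user"]] += event["amount"]
    _cursor = len(event_log)

record_purchase("alice", 9.99)
record_purchase("alice", 5.00)
refresh_read_view()
print(totals_by_user["alice"])  # 14.99, served without touching the log
```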

Cache

Cache-not-Cash.png

  • DB reads are expensive, so hold as much data in memory as possible

  • Cache Hit - data were found in the cache; Cache Miss - data were not found, so retrieve them from the DB (see the sketch after this list)

  • Local vs. Distributed Rule of Thumb - use a local cache for small data sets with a predictable number of immutable records; otherwise, prefer a distributed cache

  • Cache Warming - anticipate queries and "prime" the cache not only on startup but also in real-time (e.g. load surrounding tiles of a recently-requested map)
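
A minimal read-through sketch of the hit/miss flow above, with a hypothetical query_db standing in for the expensive database read:

```python
cache = {}

def query_db(key: str) -> str:
    # Hypothetical stand-in for an expensive database read.
    return f"row-for-{key}"

def get(key: str) -> str:
    if key in cache:          # cache hit: data found in memory
        return cache[key]
    value = query_db(key)     # cache miss: fall back to the DB...
    cache[key] = value        # ...and populate the cache for next time
    return value

print(get("user:42"))  # miss: goes to the DB
print(get("user:42"))  # hit: served from memory
```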

Cache - Replacement Policy

Replacement Policy - algorithm used to maximize cache performance by choosing which data to eject & which data to add in its place

  • LRU - ejects the Least Recently Used data first (see the sketch below)

  • Advanced policies - also consider access frequency, size of items, latency & throughput

Cache-Replacement-Policy.png
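
A minimal LRU sketch built on Python's OrderedDict (capacity and keys are hypothetical):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()  # ordered oldest-to-newest access

    def get(self, key):
        if key not in self.items:
            return None                  # cache miss
        self.items.move_to_end(key)      # mark as most recently used
        return self.items[key]

    def put(self, key, value) -> None:
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # eject the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # capacity exceeded: "b" is ejected
print(cache.get("b"))  # None - it was the least recently used
```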

Data Store Sharding

Shard-not-Shart.png

Sharding - partition data across multiple nodes so that each node holds only a subset of the data (see the sketch below)

Sharding-Types.png
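
A minimal sketch of one common approach, hash-based sharding, assuming a hypothetical four-node cluster:

```python
import hashlib

NUM_SHARDS = 4  # hypothetical node count

def shard_for(key: str) -> int:
    # A stable hash spreads keys evenly and deterministically across nodes.
    digest = hashlib.sha1(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for("user:42"))  # the same key always routes to the same shard
```

Note that mod-N routing remaps most keys whenever the node count changes; consistent hashing is the usual remedy.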

Message-oriented Middleware (MOM)

MOM Considerations

  • Used by distributed systems to communicate among nodes (see the sketch after this list)

  • Abstracts OS & network intricacies (e.g. endian format, sockets, etc.)
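
A minimal producer/consumer sketch, with Python's in-process queue.Queue standing in for a real broker (real MOM adds durability, routing & delivery guarantees):

```python
import queue
import threading

broker = queue.Queue()  # stands in for a message broker

def producer() -> None:
    for i in range(3):
        broker.put(f"order-{i}")  # nodes exchange messages, not direct calls
    broker.put(None)              # sentinel: signals the consumer to stop

def consumer() -> None:
    while True:
        message = broker.get()
        if message is None:
            break
        print("processed", message)

threading.Thread(target=producer).start()
consumer()
```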

MOM Types

Message-oriented-Middleware.png

Fault Handling

Fault-Handling.png

Fault Handling Types

  • High Availability (HA) - delayed recovery to a secondary

  • Fault Tolerant - immediate recovery

    • Active/Passive - primary fails over to secondary (see the sketch after this list)

    • Active/Active - no primary vs. secondary; when one fails, the other(s) take the additional load
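
A minimal active/passive sketch, with a hypothetical call() that randomly fails to simulate an outage of the primary:

```python
import random

def call(server: str) -> str:
    # Hypothetical RPC that sometimes fails.
    if random.random() < 0.3:
        raise ConnectionError(f"{server} is down")
    return f"response from {server}"

def request_with_failover(primary: str = "db-primary",
                          secondary: str = "db-secondary") -> str:
    try:
        return call(primary)      # normal path: the primary serves
    except ConnectionError:
        return call(secondary)    # failover: the secondary takes over

print(request_with_failover())
```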


Single-Point-of-Failure.png

Avoid SPOF

A Single Point of Failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working.

Performance Metrics

Performance Metrics - Terminology

Bandwidth - The maximum amount of data that can be transferred in a unit of time (e.g. 100Mbps)
Throughput - The actual amount of data that is transferred in a unit of time (e.g. 88Mbps)
Latency - The time it takes to send & receive (round-trip) a packet of data (e.g. 20ms)

Performance Metrics - Analogy

Given a water pipe, its diameter determines its throughput, and its length determines its latency.  Therefore, to improve:

  • Throughput - Get a fatter pipe

  • Latency - Colocate to reduce distance, or reduce network hops (e.g. a point-to-point circuit), so data have less distance to travel
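
As a rough worked example (figures assumed for illustration): a 100Mbps pipe with 20ms of round-trip latency can hold at most 100Mbps × 0.020s = 2Mb ≈ 0.25MB of data in flight at once (the bandwidth-delay product), which is why, for windowed protocols like TCP, long round trips can cap effective throughput.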

Performance Metrics - Tail Latencies

  • A tail latency, like P99 or P95, represents the near-worst-case scenario rather than the average (see the sketch after this list).

  • e.g. P99 = 0.75ms:  For the 99th percentile, the request latency was 0.75ms.  Therefore, 99% of the requests were faster than 0.75ms.
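
A minimal nearest-rank sketch of computing a tail latency from a list of samples (numbers are hypothetical):

```python
import math

def percentile(latencies_ms: list[float], p: float) -> float:
    # Nearest-rank method: the value that p% of samples fall at or below.
    ranked = sorted(latencies_ms)
    rank = math.ceil(p / 100 * len(ranked))
    return ranked[rank - 1]

samples = [0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.75, 0.9]
print(percentile(samples, 50))  # 0.4 - the typical request
print(percentile(samples, 99))  # 0.9 - the tail most users rarely hit
```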

MapReduce

MapReduce - Explanation

Uses parallel & distributed systems to process large data sets

MapReduce - Steps

Fundamentally, MapReduce consists of two steps, Map & Reduce, but a Shuffle step is also prevalent (see the word-count sketch after this list):

  • Map - Organizes/filters/sorts.  Think of putting elements into a typical Map Interface with key-value pairs (e.g. <key, value>)

  • Shuffle - Redistributes data so that all data pertaining to a given key reside on the same node

  • Reduce - Summary/aggregation (e.g. sum all values for a given key)
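
A minimal in-process word-count sketch of the three steps (documents are hypothetical; a real MapReduce runs each step in parallel across nodes):

```python
from collections import defaultdict

documents = ["to be or not to be", "to do is to be"]

# Map: emit <key, value> pairs - here, (word, 1) for every word.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle: group all values for a given key together (onto one "node").
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: aggregate per key - here, sum the counts.
counts = {key: sum(values) for key, values in groups.items()}
print(counts["to"])  # 4
```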