System Design Strava #17

Open
JiantaoFu opened this issue Jan 16, 2025 · 0 comments

Requirements (5 mins):

Functional Requirements

Identify core features (e.g., "Users should be able to post tweets"). Prioritize 2-3 key features.

  1. users can start, pause, and stop an activity (run or ride)
  2. users should be able to check the activity data (route, distance, time, etc.) while running or riding
  3. users should be able to check history records, including their friends' records

Non-Functional Requirements

Focus on system qualities like scalability, latency, and availability, consistency, security, durability, fault tolerance. Quantify where possible (e.g., "render feeds in under 200ms").

  1. the system should be highly available
  2. the app should work offline when there is no Internet connection
  3. the stats should be accurate
  4. the system should support 10M concurrent users

Capacity Estimation

Skip unnecessary calculations unless they directly impact the design (e.g., sharding in a TopK system).

Route GPS data estimation: with 100M DAU (10M concurrent users), a 5-second collection interval, and 30 minutes of activity per user per day, each user generates 30 x 60 / 5 = 360 records per day, so around 36,000M records per day in total. Each record is around 40 bytes.

Storage: 36,000M x 40 bytes = 1,440 GB per day. Rounding up to 1,500 GB per day over roughly 360 days gives 1,500 GB x 360 = 540,000 GB = 540 TB per year.

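QPS: 36,000M records per day / 100k seconds per day = 360k records per second on average, if every record is sent as its own request.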
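The estimate above can be re-derived with a few lines of arithmetic. This is a minimal sketch that only uses the inputs stated in this section; the constant and function names are made up for illustration. Without rounding the daily volume up to 1,500 GB, the yearly figure comes out slightly lower (about 518 TB), and dividing 36,000M daily records by roughly 100k seconds per day gives about 360k records per second.

```python
# Back-of-the-envelope capacity estimate from the inputs stated above.
DAU = 100_000_000                 # daily active users
ACTIVITY_SECONDS = 30 * 60        # 30 minutes of activity per user per day
SAMPLE_INTERVAL_S = 5             # one GPS record every 5 seconds
RECORD_BYTES = 40                 # approximate size of one record
SECONDS_PER_DAY_APPROX = 100_000  # 86,400 rounded up for easy division
DAYS_PER_YEAR_APPROX = 360        # same rounding as in the text


def estimate():
    """Return the daily/yearly volume and the average record rate."""
    records_per_user = ACTIVITY_SECONDS // SAMPLE_INTERVAL_S
    records_per_day = DAU * records_per_user
    gb_per_day = records_per_day * RECORD_BYTES / 1e9
    return {
        "records_per_user_per_day": records_per_user,
        "records_per_day": records_per_day,
        "gb_per_day": gb_per_day,
        "tb_per_year": gb_per_day * DAYS_PER_YEAR_APPROX / 1e3,
        "avg_records_per_sec": records_per_day / SECONDS_PER_DAY_APPROX,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.1f}")
```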

Core Entities (2 mins)

Identify key entities (e.g., User, Tweet, Follow) to define the system's foundation.

  • user
  • activity
  • route
  • friend

API/System Interface (5 mins)

Define the contract between the system and users. Prefer RESTful APIs unless GraphQL is necessary.

  • POST /activity -> Activity: create an activity, body { type }
  • PATCH /activity/{id}: change status, body { status }, where status is one of start, stop, pause, complete
  • POST /activity/{id}/route: append geo tracking data to the route, body { geolocation }
  • GET /activities/page?mode=user|friend&page=: list the user's own or friends' activities, one page at a time
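To make the status-changing endpoint more concrete, here is a minimal in-memory sketch of the state handling behind it. The state names, the transition table, and the helpers `patch_status` and `append_route` are assumptions for illustration, not part of the original design. In particular, it assumes that "stop" ends recording and "complete" finalizes the activity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed transition table (hypothetical, not taken from the design above).
ALLOWED = {
    "created": {"started"},
    "started": {"paused", "stopped"},
    "paused": {"started", "stopped"},
    "stopped": {"completed"},
    "completed": set(),
}


@dataclass
class Activity:
    id: int
    type: str                      # e.g. "run" or "ride"
    status: str = "created"
    # Each route point is (timestamp_seconds, latitude, longitude).
    route: List[Tuple[float, float, float]] = field(default_factory=list)


def patch_status(activity: Activity, new_status: str) -> Activity:
    """Apply a status change, rejecting transitions the table does not allow."""
    if new_status not in ALLOWED[activity.status]:
        raise ValueError(f"cannot go from {activity.status} to {new_status}")
    activity.status = new_status
    return activity


def append_route(activity: Activity, points) -> Activity:
    """Append GPS points; only accepted while the activity is being recorded."""
    if activity.status != "started":
        raise ValueError("route points are only accepted while started")
    activity.route.extend(points)
    return activity
```

Rejecting invalid transitions on the server keeps the recorded time and distance consistent even if a client retries or reorders requests.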

[Optional] Data Flow (5 mins)

Describe high-level processes for data-heavy systems (e.g., web crawlers).

High-Level Design (10-15 mins)

Draw the system architecture, focusing on core components (e.g., servers, databases). Keep it simple and iterate based on API endpoints.

[image: high-level architecture diagram]

Deep Dives (10 mins)

Address non-functional requirements, edge cases, and bottlenecks. Proactively improve the design (e.g., scaling, caching, database sharding).

No internet connection

We can save the data in a local database on the device and sync it to the server once the connection is back.
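A minimal sketch of that idea, using SQLite as the on-device store. The class name `OfflineBuffer`, the table layout, and the `upload` callback are assumptions for illustration. The sketch deletes rows only after an upload reports success, so a failed sync leaves the data in place for the next attempt. It also assumes the server can deduplicate retried batches.

```python
import sqlite3


class OfflineBuffer:
    """Buffers GPS points locally and uploads them in order when online."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "activity_id TEXT NOT NULL, ts REAL NOT NULL, "
            "lat REAL NOT NULL, lon REAL NOT NULL)"
        )

    def record(self, activity_id, ts, lat, lon):
        # The connection context manager commits on success.
        with self.db:
            self.db.execute(
                "INSERT INTO pending (activity_id, ts, lat, lon) "
                "VALUES (?, ?, ?, ?)",
                (activity_id, ts, lat, lon),
            )

    def pending_count(self):
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def sync(self, upload, batch_size=100):
        """Send pending rows oldest first; `upload(rows)` returns True on success.

        Returns the number of rows that were uploaded and removed.
        """
        sent = 0
        while True:
            rows = self.db.execute(
                "SELECT seq, activity_id, ts, lat, lon FROM pending "
                "ORDER BY seq LIMIT ?",
                (batch_size,),
            ).fetchall()
            if not rows or not upload(rows):
                return sent  # nothing left, or still offline: keep the rest
            with self.db:
                self.db.execute(
                    "DELETE FROM pending WHERE seq <= ?", (rows[-1][0],)
                )
            sent += len(rows)
```

Because activity stats can be computed from the buffered points on the device, the user still sees route, distance, and time while offline.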

Support 100M DAU, 10M concurrent users

storage: 540TB/year

  • purge old route data
  • shard data by time
  • use cold storage to save cost
  • compress the data
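As one possible way to compress route data, consecutive GPS samples can be delta-encoded as fixed-point integers and then passed through a general-purpose compressor. The sketch below is an illustration under stated assumptions, not the design's chosen format. The function names are hypothetical, and coordinates are quantized to 1e-5 degrees (about 1 m of latitude), which makes the encoding slightly lossy.

```python
import struct
import zlib

SCALE = 100_000  # assumed precision: 1e-5 degrees


def encode_route(points):
    """Encode [(ts_seconds, lat, lon), ...] as zlib-compressed deltas."""
    out = bytearray(struct.pack("<I", len(points)))
    prev = (0, 0, 0)
    for ts, lat, lon in points:
        cur = (int(ts), round(lat * SCALE), round(lon * SCALE))
        # Store the difference from the previous point: small, repetitive
        # values compress much better than absolute coordinates.
        out += struct.pack(
            "<qii", cur[0] - prev[0], cur[1] - prev[1], cur[2] - prev[2]
        )
        prev = cur
    return zlib.compress(bytes(out), 9)


def decode_route(blob):
    """Inverse of encode_route, up to the quantization error."""
    raw = zlib.decompress(blob)
    (count,) = struct.unpack_from("<I", raw, 0)
    points, prev, offset = [], (0, 0, 0), 4
    for _ in range(count):
        d_ts, d_lat, d_lon = struct.unpack_from("<qii", raw, offset)
        offset += struct.calcsize("<qii")
        prev = (prev[0] + d_ts, prev[1] + d_lat, prev[2] + d_lon)
        points.append((prev[0], prev[1] / SCALE, prev[2] / SCALE))
    return points
```

Compression like this works best on completed activities, for example just before they are moved to cold storage.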

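QPS: about 360k records per second on average (36,000M records per day over roughly 100k seconds per day), if each record is sent as its own request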

  • use a NoSQL database, more specifically a time-series database
  • clients aggregate the data and upload at longer intervals when not in a realtime view
  • shard the data by user id and activity id to distribute the load across servers
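The last two points can be sketched as follows. The helper names `batch_points` and `shard_for` and the choice of SHA-256 as the hash are assumptions for illustration. Batching one minute of 5-second samples into a single request cuts the number of requests by a factor of 12, and hashing the user id together with the activity id gives a stable shard for all points of one activity.

```python
import hashlib


def batch_points(points, batch_size):
    """Split a list of GPS points into upload batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [points[i:i + batch_size] for i in range(0, len(points), batch_size)]


def shard_for(user_id, activity_id, num_shards):
    """Map (user_id, activity_id) to a shard index in [0, num_shards)."""
    key = f"{user_id}:{activity_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    samples = list(range(360))          # 30 minutes at one sample per 5 s
    batches = batch_points(samples, 12)  # one request per minute
    print(len(samples), "samples ->", len(batches), "requests")
    print("shard:", shard_for("user-42", "activity-7", 64))
```

A plain modulo over the hash reshuffles most keys when the shard count changes; consistent hashing is a common alternative if shards are added often.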