Grokking the System Design by educative.io
System Design Interviews: A step by step guide
A lot of software engineers struggle with system design interviews (SDIs) primarily because of three
reasons:
• The unstructured nature of SDIs, where they are asked to work on an open-ended design
problem that doesn’t have a standard answer.
• Their lack of experience in developing large scale systems.
• They did not prepare for SDIs.
As with coding interviews, candidates who haven’t made a conscious effort to prepare for SDIs mostly
perform poorly, especially at top companies like Google, Facebook, Amazon, Microsoft, etc. At these
companies, candidates who don’t perform above average have a limited chance of getting an offer. On the
other hand, a good performance usually results in a better offer (higher position and salary), since it
shows the candidate’s ability to handle a complex system.
In this course, we’ll follow a step by step approach to solve multiple design problems. First, let’s go
through these steps:
Step 1: Requirements clarifications
It is always a good idea to ask questions about the exact scope of the problem we are solving. Design
questions are mostly open-ended and don’t have ONE correct answer, which is why clarifying
ambiguities early in the interview is critical. Candidates who spend enough time defining the
end goals of the system have a better chance of succeeding in the interview. Also, since we
only have 35-40 minutes to design a (supposedly) large system, we should clarify which parts of the
system we will focus on.
Let’s expand this with an actual example of designing a Twitter-like service. Here are some questions
for designing Twitter that should be answered before moving on to the next steps:
• Will users of our service be able to post tweets and follow other people?
• Should we also design to create and display the user’s timeline?
• Will tweets contain photos and videos?
• Are we focusing on the backend only or are we developing the front-end too?
• Will users be able to search tweets?
• Do we need to display hot trending topics?
• Will there be any push notification for new (or important) tweets?
The answers to all such questions will determine what our end design looks like.
Step 2: System interface definition
Define what APIs are expected from the system. This will not only establish the exact contract expected
from the system, but will also confirm that we haven’t gotten any requirements wrong. Some examples for
our Twitter-like service are:
postTweet(user_id, tweet_data, tweet_location, user_location, timestamp, …)
generateTimeline(user_id, current_time, user_location, …)
markTweetFavorite(user_id, tweet_id, timestamp, …)
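One way to pin the contract down is to write the three calls as an abstract interface. The following is only a sketch under stated assumptions: the class name TweetService, the snake_case method names, the parameter types, and the return types are all illustrative and do not come from the original listing, which leaves them open.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class TweetService(ABC):
    """Hypothetical interface mirroring the three example API calls.

    Types and return values are assumptions made for illustration.
    """

    @abstractmethod
    def post_tweet(self, user_id: int, tweet_data: str,
                   tweet_location: Optional[str],
                   user_location: Optional[str],
                   timestamp: float) -> int:
        """Store a new tweet and return its (assumed integer) tweet ID."""

    @abstractmethod
    def generate_timeline(self, user_id: int, current_time: float,
                          user_location: Optional[str]) -> List[int]:
        """Return the tweet IDs that make up the user's timeline."""

    @abstractmethod
    def mark_tweet_favorite(self, user_id: int, tweet_id: int,
                            timestamp: float) -> None:
        """Record that the user marked the tweet as a favorite."""
```

Writing the signatures out this way tends to surface missing requirements quickly, for example whether generate_timeline needs paging parameters.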
Step 3: Back-of-the-envelope estimation
It is always a good idea to estimate the scale of the system we’re going to design. This will also help
later when we focus on scaling, partitioning, load balancing, and caching.
• What scale is expected from the system (e.g., number of new tweets, number of tweet views,
number of timeline generations per sec., etc.)?
• How much storage will we need? We will have different numbers if users can have photos and
videos in their tweets.
• What network bandwidth usage are we expecting? This will be crucial in deciding how we will
manage traffic and balance load between servers.
Step 4: Defining data model
Defining the data model early will clarify how data will flow among different components of the
system. Later, it will guide us towards data partitioning and management. The candidate should be able to
identify the various entities of the system, how they will interact with each other, and different aspects of
data management like storage, transportation, encryption, etc. Here are some entities for our
Twitter-like service:
User: UserID, Name, Email, DoB, CreationDate, LastLogin, etc.
Tweet: TweetID, Content, TweetLocation, NumberOfLikes, TimeStamp, etc.
UserFollow: UserID1, UserID2
FavoriteTweets: UserID, TweetID, TimeStamp
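As a rough illustration, the entities above could be written as plain Python dataclasses. The field types, the optional fields, and the reading of UserID1 as the follower and UserID2 as the followed user are assumptions; the listing itself does not specify them.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class User:
    user_id: int
    name: str
    email: str
    date_of_birth: date
    creation_date: datetime               # when the account was created
    last_login: Optional[datetime] = None


@dataclass
class Tweet:
    tweet_id: int
    content: str
    timestamp: datetime
    tweet_location: Optional[str] = None  # assumed free-form location string
    number_of_likes: int = 0


@dataclass(frozen=True)
class UserFollow:
    follower_id: int   # UserID1 (assumed to be the follower)
    followee_id: int   # UserID2 (assumed to be the user being followed)


@dataclass(frozen=True)
class FavoriteTweet:
    user_id: int
    tweet_id: int
    timestamp: datetime


# Example: user 1 follows user 2 and favorites tweet 10.
follow = UserFollow(follower_id=1, followee_id=2)
favorite = FavoriteTweet(user_id=1, tweet_id=10, timestamp=datetime(2023, 1, 1))
```

The two relationship records are frozen so that they are hashable and can be kept in sets, which is convenient when checking whether a follow or favorite already exists.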
Which database system should we use? Will a NoSQL store like Cassandra best fit our needs, or should we use
a MySQL-like solution? What kind of block storage should we use to store photos and videos?
Step 5: High-level design
Draw a block diagram with 5-6 boxes representing the core components of our system. We should
identify enough components that are needed to solve the actual problem from end-to-end.
For Twitter, at a high level, we will need multiple application servers to serve all the read/write
requests, with load balancers in front of them for traffic distribution. If we’re assuming that we will
have a lot more read traffic than write traffic, we can decide to have separate servers for handling
each kind of request. On the backend, we need an efficient database that can store all the tweets and can
support a huge number of reads. We will also need a distributed file storage system for storing photos
and videos.
Step 6: Detailed design
Dig deeper into two or three components; the interviewer’s feedback should always guide us on which parts of
the system need further discussion. We should be able to present different approaches, their pros and
cons, and explain why we prefer one approach over the other. Remember that there is no single answer;
the only important thing is to consider tradeoffs between different options while keeping system
constraints in mind.
• Since we will be storing a massive amount of data, how should we partition our data to
distribute it to multiple databases? Should we try to store all the data of a user on the same
database? What issue could it cause?
• How will we handle hot users who tweet a lot or follow lots of people?
• Since users’ timeline will contain the most recent (and relevant) tweets, should we try to store
our data in such a way that is optimized for scanning the latest tweets?
• How much and at which layer should we introduce cache to speed things up?
• What components need better load balancing?
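To make the partitioning question concrete, here is a minimal sketch of one common option: hash-based partitioning keyed on the user ID, so that all of a user's data lands on the same database. The function name shard_for_user and the choice of SHA-256 are assumptions for illustration; the text above does not prescribe a scheme.

```python
import hashlib
from collections import Counter


def shard_for_user(user_id: int, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards) via a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how 10,000 consecutive user IDs spread over 4 shards.
distribution = Counter(shard_for_user(uid, 4) for uid in range(10_000))
print(sorted(distribution.items()))
```

A scheme like this spreads users fairly evenly, but it does not spread load evenly: a hot user still sends all of their traffic to a single shard, which is the issue the questions above point at. Changing num_shards also remaps most users, which is one reason to discuss alternatives such as consistent hashing.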
Step 7: Identifying and resolving bottlenecks
Try to discuss as many bottlenecks as possible and different approaches to mitigate them.
• Is there any single point of failure in our system? What are we doing to mitigate it?
• Do we have enough replicas of the data so that if we lose a few servers we can still serve our
users?
• Similarly, do we have enough copies of different services running such that a few failures will
not cause total system shutdown?
• How are we monitoring the performance of our service? Do we get alerts whenever critical
components fail or their performance degrades?
Summary
In short, preparation and staying organized during the interview are the keys to success in system
design interviews. The steps above should help you stay on track and cover all the
different aspects of designing a system.
Let’s apply the above guidelines to design a few systems that are asked in SDIs.
Designing a URL Shortening service like TinyURL
Let's design a URL shortening service like TinyURL. This service will provide short aliases redirecting
to long URLs. Similar services: bit.ly, goo.gl, qlink.me, etc.
Difficulty Level: Easy
1. Why do we need URL shortening?
URL shortening is used to create shorter aliases for long URLs. We call these shortened aliases “short
links.” Users are redirected to the original URL when they hit these short links. Short links save a lot of
space when displayed, printed, messaged, or tweeted. Additionally, users are less likely to mistype
shorter URLs.
For example, if we shorten this page through TinyURL:
https://www.educative.io/collection/page/5668639101419520/5649050225344512/5668600916475904/
We would get:
http://tinyurl.com/jlg8zpc
The shortened URL is nearly one-third the size of the actual URL.
URL shortening is used for optimizing links across devices, tracking individual links to analyze
audience and campaign performance, and hiding affiliated original URLs.
If you haven’t used tinyurl.com before, please try creating a new shortened URL and spend some time
going through the various options their service offers. This will help you a lot in understanding this
chapter.
2. Requirements and Goals of the System
You should always clarify requirements at the beginning of the interview. Be sure to ask
questions to find the exact scope of the system that the interviewer has in mind.
Our URL shortening system should meet the following requirements:
Functional Requirements:
1. Given a URL, our service should generate a shorter and unique alias of it. This is called a short
link.
2. When users access a short link, our service should redirect them to the original link.
3. Users should optionally be able to pick a custom short link for their URL.
4. Links will expire after a standard default timespan. Users should be able to specify the
expiration time.
Non-Functional Requirements:
1. The system should be highly available. This is required because, if our service is down, all the
URL redirections will start failing.
2. URL redirection should happen in real-time with minimal latency.
3. Shortened links should not be guessable (not predictable).
Extended Requirements:
1. Analytics; e.g., how many times a redirection happened?
2. Our service should also be accessible through REST APIs by other services.
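The functional requirements can be made concrete with a small in-memory sketch. This is not the design developed in the course; it is only an illustration under assumptions: the class name UrlShortener, the one-year default lifetime, and the use of random 8-character tokens from Python's secrets module (to keep aliases hard to guess) are all hypothetical choices.

```python
import secrets
import time
from typing import Dict, Optional, Tuple

DEFAULT_TTL_SECONDS = 365 * 24 * 3600  # assumed default link lifetime


class UrlShortener:
    """In-memory sketch: alias -> (long URL, expiry time in seconds)."""

    def __init__(self) -> None:
        self._links: Dict[str, Tuple[str, float]] = {}

    def _is_live(self, alias: str, now: float) -> bool:
        entry = self._links.get(alias)
        return entry is not None and entry[1] > now

    def shorten(self, long_url: str, custom_alias: Optional[str] = None,
                ttl_seconds: float = DEFAULT_TTL_SECONDS,
                now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        if custom_alias is not None:
            # Requirement 3: users may pick their own alias if it is free.
            if self._is_live(custom_alias, now):
                raise ValueError("alias already in use")
            alias = custom_alias
        else:
            # Random token, so aliases are unique and not predictable.
            alias = secrets.token_urlsafe(6)  # 8 URL-safe characters
            while self._is_live(alias, now):
                alias = secrets.token_urlsafe(6)
        # Requirement 4: every link carries an expiration time.
        self._links[alias] = (long_url, now + ttl_seconds)
        return alias

    def resolve(self, alias: str, now: Optional[float] = None) -> Optional[str]:
        """Return the original URL, or None if unknown or expired."""
        now = time.time() if now is None else now
        if not self._is_live(alias, now):
            return None
        return self._links[alias][0]
```

A real service would replace the dictionary with a replicated data store to meet the availability requirement; the sketch only shows how the four functional requirements fit together.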
3. Capacity Estimation and Constraints
Our system will be read-heavy. There will be many more redirection requests than new URL
shortenings. Let’s assume a 100:1 ratio between reads and writes.
Traffic estimates: Assuming we will have 500M new URL shortenings per month, with a 100:1
read/write ratio, we can expect 50B redirections during the same period:
100 * 500M => 50B
What would the Queries Per Second (QPS) be for our system? New URL shortenings per second:
500 million / (30 days * 24 hours * 3600 seconds) = ~200 URLs/s
Considering the 100:1 read/write ratio, URL redirections per second will be:
100 * 200 URLs/s = 20K/s
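The arithmetic above can be checked with a few lines of Python. The variable names are illustrative; the inputs (500M new URLs per month, a 100:1 read/write ratio, a 30-day month) are the ones assumed in the text.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month

new_urls_per_month = 500_000_000
read_write_ratio = 100

redirections_per_month = read_write_ratio * new_urls_per_month
writes_per_second = new_urls_per_month / SECONDS_PER_MONTH
reads_per_second = read_write_ratio * writes_per_second

print(f"Redirections per month: {redirections_per_month:,}")
print(f"New URLs per second:    {writes_per_second:.0f}")
print(f"Redirections per second: {reads_per_second:.0f}")
```

The exact results are about 193 new URLs/s and about 19.3K redirections/s, which the text rounds up to 200 URLs/s and 20K/s. Rounding up is reasonable in a back-of-the-envelope estimate, since it leaves a little headroom.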