The Architecture of Open Source Applications: Scalability and Distributed Systems
"《开源应用架构》第二卷是一本深入探讨开源软件结构和设计思想的书籍,旨在让开发者从历史上的优秀项目中学习,避免重复错误,提升软件开发的水平。书中涵盖了多个开源项目的架构分析,包括可扩展的Web架构、分布式系统、Firefox的发布工程等,详细阐述了这些项目的关键组件、交互方式以及开发过程中的经验教训。" 在第一部分“可扩展Web架构和分布式系统”中,作者讨论了Web分布式系统设计的基本原则,包括系统的组成部分、冗余和分区策略。通过一个图像托管应用的例子,展示了服务的概念。接着,作者详细介绍了构建快速且可扩展数据访问的基础,如缓存(全局缓存和分布式缓存)、代理、索引、负载均衡器和队列。这些技术都是构建高性能和可扩展Web服务的核心要素。 第二部分聚焦于Firefox的发布工程,详细介绍了从启动发布流程到最终用户更新的全过程。这一过程包括在开始发布前的考虑、"Goto Build"的发送标准和流程、源代码的标记与构建、本地化和合作伙伴的打包、签名验证、更新机制(主要更新和次要更新,完整更新和部分更新)以及内部镜像和质量保证(QA)的推送。此外,还分享了在发布过程中学到的教训,强调了与其他利益相关者的共识、跨团队合作、明确交接以及人员变动管理的重要性。 这本书对于初级开发者来说,是一个了解资深开发者思维模式的绝佳起点;对于中级或高级开发者,则提供了一个观察同行如何解决复杂设计问题的窗口。通过学习这些开源项目的架构,开发者可以更好地理解如何构建可扩展、可靠的软件系统,并从中汲取灵感,提高自己的设计能力。
Not only does this allow us to scale each of them independently (since it is likely we will
always do more reading than writing), but also helps clarify what is going on at each point. Finally, this
separates future concerns, which would make it easier to troubleshoot and scale a problem like slow reads.
The advantage of this approach is that we are able to solve problems independently of one another—we don't
have to worry about writing and retrieving new images in the same context. Both of these services still
leverage the global corpus of images, but they are free to optimize their own performance with
service-appropriate methods (for example, queuing up requests, or caching popular images—more on this
below). And from a maintenance and cost perspective each service can scale independently as needed,
which is great because if they were combined and intermingled, one could inadvertently impact the
performance of the other as in the scenario discussed above.
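To make this split concrete, here is a minimal sketch, assuming a hypothetical in-memory store and service names (none of which come from the chapter): the write service queues uploads so bursts are absorbed, while the read service answers directly from the shared corpus.

```python
import queue

# Hypothetical in-memory stand-in for the shared (global) image corpus;
# in production this would be a file server or object store.
image_store = {}

class WriteService:
    """Accepts uploads and queues them, so write bursts don't block clients."""
    def __init__(self):
        self.pending = queue.Queue()

    def upload(self, image_id, data):
        self.pending.put((image_id, data))  # acknowledge quickly

    def drain(self):
        # In a real deployment a background worker would run this loop.
        while not self.pending.empty():
            image_id, data = self.pending.get()
            image_store[image_id] = data

class ReadService:
    """Serves reads; free to add its own caching without touching the write path."""
    def retrieve(self, image_id):
        return image_store.get(image_id)

writer, reader = WriteService(), ReadService()
writer.upload("dog.jpg", b"...")
writer.drain()
assert reader.retrieve("dog.jpg") == b"..."
```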
Of course, the above example can work well when you have two different endpoints (in fact this is very similar
to several cloud storage providers' implementations and Content Delivery Networks). There are lots of ways
to address these types of bottlenecks though, and each has different tradeoffs.
For example, Flickr solves this read/write issue by distributing users across different shards such that each
shard can only handle a set number of users, and as users increase more shards are added to the cluster
(see the presentation on Flickr's scaling,
http://mysqldba.blogspot.com/2008/04/mysql-uc-2007-presentation-file.html). In the first example it is
easier to scale hardware based on actual usage (the number of reads and writes across the whole system),
whereas Flickr scales with their user base (but forces the assumption of equal usage across users so there
can be extra capacity). In the former an outage or issue with one of the services brings down functionality
across the whole system (no-one can write files, for example), whereas an outage with one of Flickr's shards
will only affect those users. In the first example it is easier to perform operations across the whole
dataset—for example, updating the write service to include new metadata or searching across all image
metadata—whereas with the Flickr architecture each shard would need to be updated or searched (or a
search service would need to be created to collate that metadata—which is in fact what they do).
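As a rough sketch of this kind of user-based sharding (the shard count and modulo placement are illustrative assumptions, not Flickr's actual scheme):

```python
NUM_SHARDS = 4                              # capacity grows by adding shards
shards = [{} for _ in range(NUM_SHARDS)]    # each dict stands in for one shard

def shard_for_user(user_id):
    # Pin every user to one shard: all of that user's reads and writes go
    # to the same place, so an outage on one shard only affects its users.
    return shards[user_id % NUM_SHARDS]

def save_photo(user_id, photo_id, meta):
    shard_for_user(user_id)[(user_id, photo_id)] = meta

def search_all_photos(title):
    # Cross-shard queries must visit every shard (or a separate, collated
    # search service, which is what Flickr actually built).
    return [m for shard in shards
            for m in shard.values() if m["title"] == title]

save_photo(42, "p1", {"title": "Gizmo"})
print(search_all_photos("Gizmo"))           # [{'title': 'Gizmo'}]
```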
When it comes to these systems there is no right answer, but it helps to go back to the principles at the start of
this chapter, determine the system needs (heavy reads or writes or both, level of concurrency, queries across
the data set, ranges, sorts, etc.), benchmark different alternatives, understand how the system will fail, and
have a solid plan for when failure happens.
Redundancy
In order to handle failure gracefully a web architecture must have redundancy of its services and data. For
example, if there is only one copy of a file stored on a single server, then losing that server means losing that
file. Losing data is seldom a good thing, and a common way of handling it is to create multiple, or redundant,
copies.
This same principle also applies to services. If there is a core piece of functionality for an application, ensuring
that multiple copies or versions are running simultaneously can secure against the failure of a single node.
Creating redundancy in a system can remove single points of failure and provide a backup or spare
functionality if needed in a crisis. For example, if there are two instances of the same service running in
production, and one fails or degrades, the system can fail over to the healthy copy. Failover can happen
automatically or require manual intervention.
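A bare-bones sketch of automatic failover between redundant instances might look like the following (the class and function names are hypothetical):

```python
class ServiceInstance:
    def __init__(self, name):
        self.name, self.healthy = name, True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"

def call_with_failover(instances, request):
    # Automatic failover in its simplest form: retry against the next
    # redundant copy when the current one fails.
    for instance in instances:
        try:
            return instance.handle(request)
        except ConnectionError:
            continue  # a real system would also mark the node and alert
    raise RuntimeError("all instances failed")

primary, replica = ServiceInstance("primary"), ServiceInstance("replica")
primary.healthy = False                       # simulate a node failure
print(call_with_failover([primary, replica], "GET /image/42"))
```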
Another key part of service redundancy is creating a shared-nothing architecture. With this architecture, each
node is able to operate independently of the others, and there is no central "brain" managing state or
coordinating activities for the other nodes. This helps a lot with scalability since new nodes can be added
without special conditions or knowledge. However, and most importantly, there is no single point of failure in
these systems, so they are much more resilient to failure.
For example, in our image server application, all images would have redundant copies on another piece of
hardware somewhere (ideally in a different geographic location in the event of a catastrophe like an
earthquake or fire in the data center), and the services to access the images would be redundant, all
potentially servicing requests. (See Figure 1.3.) (Load balancers are a great way to make this possible, but
there is more on that below).
Figure 1.3: Image hosting application with redundancy
Partitions
There may be very large data sets that are unable to fit on a single server. It may also be the case that an
operation requires too many computing resources, diminishing performance and making it necessary to add
capacity. In either case you have two choices: scale vertically or horizontally.
Scaling vertically means adding more resources to an individual server. So for a very large data set, this might
mean adding more (or bigger) hard drives so a single server can contain the entire data set. In the case of the
compute operation, this could mean moving the computation to a bigger server with a faster CPU or more
memory. In each case, vertical scaling is accomplished by making the individual resource capable of handling
more on its own.
To scale horizontally, on the other hand, is to add more nodes. In the case of the large data set, this might be
a second server to store parts of the data set, and for the computing resource it would mean splitting the
operation or load across some additional nodes. To take full advantage of horizontal scaling, it should be
included as an intrinsic design principle of the system architecture, otherwise it can be quite cumbersome to
modify and separate out the context to make this possible.
When it comes to horizontal scaling, one of the more common techniques is to break up your services into
partitions, or shards. The partitions can be distributed such that each logical set of functionality is separate;
this could be done by geographic boundaries, or by other criteria such as non-paying versus paying users. The
advantage of these schemes is that they provide a service or data store with added capacity.
In our image server example, it is possible that the single file server used to store images could be replaced
by multiple file servers, each containing its own unique set of images. (See Figure 1.4.) Such an architecture
would allow the system to fill each file server with images, adding additional servers as the disks become full.
The design would require a naming scheme that tied an image's filename to the server containing it. An
image's name could be formed from a consistent hashing scheme mapped across the servers. Or
alternatively, each image could be assigned an incremental ID, so that when a client makes a request for an
image, the image retrieval service only needs to maintain the range of IDs that are mapped to each of the
servers (like an index).
Figure 1.4: Image hosting application with redundancy and partitioning
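Both naming schemes described above can be sketched in a few lines. The hash ring below is a minimal illustration of consistent hashing (production implementations add virtual nodes for balance), and the range index mirrors the incremental-ID approach; the server names and ID boundaries are made up for the example:

```python
import bisect
import hashlib

# Consistent hashing: map a filename onto a ring of servers, so adding
# a server only remaps the keys adjacent to it on the ring.
servers = ["fs1", "fs2", "fs3"]
ring = sorted((int(hashlib.md5(s.encode()).hexdigest(), 16), s) for s in servers)

def server_for(filename):
    h = int(hashlib.md5(filename.encode()).hexdigest(), 16)
    idx = bisect.bisect(ring, (h,)) % len(ring)   # clockwise successor
    return ring[idx][1]

# Range index: with incremental IDs, the retrieval service only needs
# the first ID stored on each server (like an index).
boundaries = [0, 1_000_000, 2_000_000]
range_servers = ["fs1", "fs2", "fs3"]

def server_for_id(image_id):
    return range_servers[bisect.bisect_right(boundaries, image_id) - 1]

# The second value is fs2 by construction; the first depends on the hash.
print(server_for("dog.jpg"), server_for_id(1_500_000))
```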
Of course there are challenges distributing data or functionality across multiple servers. One of the key issues
is data locality; in distributed systems the closer the data to the operation or point of computation, the better
the performance of the system. Therefore it is potentially problematic to have data spread across multiple
servers, as any time it is needed it may not be local, forcing the servers to perform a costly fetch of the
required information across the network.
Another potential issue comes in the form of inconsistency. When there are different services reading and
writing from a shared resource, potentially another service or data store, there is the chance for race
conditions—where some data is supposed to be updated, but the read happens prior to the update—and in
those cases the data is inconsistent. For example, in the image hosting scenario, a race condition could occur
if one client sent a request to update the dog image with a new title, changing it from "Dog" to "Gizmo", but at
the same time another client was reading the image. In that circumstance it is unclear which title, "Dog" or
"Gizmo", would be the one received by the second client.
There are certainly some obstacles associated with partitioning data, but partitioning allows each problem to
be split—by data, load, usage patterns, etc.—into manageable chunks. This can help with scalability and
manageability, but is not without risk. There are lots of ways to mitigate risk and handle failures; however, in
the interest of brevity they are not covered in this chapter. If you are interested in reading more, you can
check out my blog post on fault tolerance and monitoring.
1.3. The Building Blocks of Fast and Scalable Data Access
Having covered some of the core considerations in designing distributed systems, let's now talk about the
hard part: scaling access to the data.
Most simple web applications, for example, LAMP stack applications, look something like Figure 1.5.
Figure 1.5: Simple web applications
As they grow, there are two main challenges: scaling access to the app server and to the database. In a
highly scalable application design, the app (or web) server is typically minimized and often embodies a
shared-nothing architecture. This makes the app server layer of the system horizontally scalable. As a result
of this design, the heavy lifting is pushed down the stack to the database server and supporting services; it's
at this layer where the real scaling and performance challenges come into play.
The rest of this chapter is devoted to some of the more common strategies and methods for making these
types of services fast and scalable by providing fast access to data.
Figure 1.6: Oversimplified web application
Most systems can be oversimplified to Figure 1.6. This is a great place to start. If you have a lot of data, you
want fast and easy access, like keeping a stash of candy in the top drawer of your desk. Though overly
simplified, the previous statement hints at two hard problems: scalability of storage and fast access of data.
For the sake of this section, let's assume you have many terabytes (TB) of data and you want to allow users to
access small portions of that data at random. (See Figure 1.7.) This is similar to locating an image file
somewhere on the file server in the image application example.
Figure 1.7: Accessing specific data
This is particularly challenging because it can be very costly to load TBs of data into memory; this directly
translates to disk IO. Reading from disk is many times slower than from memory—memory access is as fast
as Chuck Norris, whereas disk access is slower than the line at the DMV. This speed difference really adds up
for large data sets; in real numbers memory access is as little as 6 times faster for sequential reads, or
100,000 times faster for random reads, than reading from disk (see "The Pathologies of Big
Data", http://queue.acm.org/detail.cfm?id=1563874). Moreover, even with unique IDs, solving the problem of
knowing where to find that little bit of data can be an arduous task. It's like trying to get that last Jolly Rancher
from your candy stash without looking.
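To put the quoted ratio in perspective, a quick back-of-envelope calculation helps; the absolute latency figure below is an assumed round number for illustration, not from the chapter:

```python
# Assumed round number for one random memory access; only the
# 100,000x ratio below comes from the text.
MEM_RANDOM_NS = 100
DISK_RANDOM_NS = MEM_RANDOM_NS * 100_000

lookups = 1_000_000  # say, one random read per user request
print(f"memory: {lookups * MEM_RANDOM_NS / 1e9:.1f} s")    # 0.1 s
print(f"disk:   {lookups * DISK_RANDOM_NS / 1e9:,.0f} s")  # 10,000 s, roughly 2.8 hours
```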
Thankfully there are many options that you can employ to make this easier; four of the more important ones
are caches, proxies, indexes and load balancers. The rest of this section discusses how each of these
concepts can be used to make data access a lot faster.
Caches
Caches take advantage of the locality of reference principle: recently requested data is likely to be requested
again. They are used in almost every layer of computing: hardware, operating systems, web browsers, web
applications and more. A cache is like short-term memory: it has a limited amount of space, but is typically
faster than the original data source and contains the most recently accessed items. Caches can exist at all
levels in architecture, but are often found at the level nearest to the front end, where they are implemented to
return data quickly without taxing downstream levels.
How can a cache be used to make your data access faster in our API example? In this case, there are a
couple of places you can insert a cache. One option is to insert a cache on your request layer node, as
in Figure 1.8.
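A request-layer cache can be as simple as a small LRU map sitting in front of the slower data store. This sketch assumes a stand-in fetch function for the backing store:

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least-recently-used entry when full."""
    def __init__(self, capacity=1024):
        self.capacity, self.data = capacity, OrderedDict()

    def get(self, key, fetch):
        if key in self.data:
            self.data.move_to_end(key)     # mark as most recently used
            return self.data[key]          # hit: no call to the slow store
        value = fetch(key)                 # miss: go to the backing store
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the oldest entry
        return value

def fetch_from_store(key):
    return f"<bytes of {key}>"             # stand-in for a disk or API read

cache = LRUCache(capacity=2)
cache.get("dog.jpg", fetch_from_store)     # miss: fetched and cached
cache.get("dog.jpg", fetch_from_store)     # hit: served from memory
```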