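An In-Depth Look at Browser Network Performance Optimization

"High Performance Browser Networking by Ilya Grigorik"

This book by Ilya Grigorik focuses on networking and web performance, and is required reading for any web developer. It stresses the importance of the browser as the most widely deployed development platform today: a browser comes preinstalled on virtually every smartphone, tablet, laptop, desktop, and many other kinds of devices. As the industry grows, an estimated 20 billion connected devices are expected by 2020, each with at least one browser and a Wi-Fi or cellular connection. The browser has evolved from what it once was into today's feature-rich platform. This is thanks to the innovations HTML and CSS brought to the presentation layer, JavaScript becoming the new assembly language of the Web, and the steadily evolving HTML5 APIs that keep extending the browser's capabilities, so that we can build engaging, high-performance applications. The reach and influence of this technology are unprecedented: when we build applications for the browser, no other technology or platform compares.

The book's content may cover the following key topics:

1. **Networking fundamentals**: the TCP/IP protocol stack, HTTP and its evolution (such as HTTP/2 and HTTP/3), and persistent-connection technologies such as WebSocket, all of which are critical to optimizing page load speed and interactive performance.
2. **How browsers work**: how the browser parses HTML, CSS, and JavaScript and renders the page, and how JavaScript engines work, including the optimization strategies of the V8 engine.
3. **Performance optimization**: how to speed up page loads with techniques such as lazy loading, resource caching, preloading, and prefetching, and how to use the browser's developer tools for performance analysis and debugging.
4. **Mobile first**: the network constraints of mobile devices, such as limited bandwidth, battery life, and CPU performance, and how to optimize web applications for these conditions.
5. **Security and privacy**: the importance of HTTPS, the security considerations of cookies, localStorage, and other storage mechanisms, and best practices for protecting user privacy.
6. **Responsive design**: how to adapt content to device characteristics to ensure cross-device compatibility and a good user experience.
7. **Performance metrics**: Time to First Byte (TTFB), First Contentful Paint (FCP), First Input Delay (FID), and others, and how to monitor and improve them.
8. **Future trends**: where web technology is heading, and how emerging technologies such as WebAssembly, Service Worker, and WebRTC improve performance and functionality.

The book gives an accessible yet thorough introduction to the network and browser performance optimization techniques that web developers need to know, and is an essential reference for improving web application performance. By studying this material, developers can better understand how things work in a networked environment and build faster, smoother web experiences.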

208 ms: Client receives eight segments and ACKs each one

236 ms: Server increments its cwnd for each ACK, and sends remaining segments

264 ms: Client receives remaining segments, ACKs each one

264 ms to transfer the 20 KB file on a new TCP connection with 56 ms roundtrip time between the client and server! By

comparison, let’s now assume that the client is able to reuse the same TCP connection (Figure 2-6) and issues the same

request once more.

Figure 2-6. Fetching a file over an existing TCP connection

0 ms: Client sends the HTTP request

28 ms: Server receives the HTTP request

68 ms: Server completes generating the 20 KB response, but the cwnd value is already greater than the 15 segments required to send the file, hence it dispatches all the segments in one burst

96 ms: Client receives all 15 segments, ACKs each one

The same request made on the same connection, but without the cost of the three-way handshake and the penalty of the

slow-start phase, took 96 milliseconds instead of 264: the transfer completes 2.75 times faster!

In both cases, the fact that both the server and the client have access to 5 Mbps of upstream bandwidth had no impact during

the startup phase of the TCP connection. Instead, the latency and the congestion window sizes were the limiting factors.

In fact, the performance gap between the first and the second request dispatched over an existing connection will only widen

if we increase the roundtrip time - as an exercise, try it with a few different values. Once you develop an intuition for the

mechanics of TCP congestion control, dozens of optimizations such as keep-alive, pipelining, and multiplexing will require

little further motivation.

As an exercise, run through Figure 2-5 with cwnd value set to ten network segments, instead of four. You should see a

full roundtrip of network latency disappear - a 22% improvement in performance!
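The timelines for Figures 2-5 and 2-6 can be reproduced with a small model. The sketch below is only an illustration: the function name `transfer_time_ms`, the 1,460-byte segment size, and the 40 ms server processing time are assumptions, chosen because they reproduce the 15 segments and the 264 ms and 96 ms totals quoted above. It also assumes an idealized slow-start in which the window doubles every roundtrip and no packets are lost.

```python
import math


def transfer_time_ms(file_bytes, rtt_ms, cwnd, new_connection=True,
                     server_ms=40, mss=1460):
    """Idealized time to fetch a file over TCP: no loss, cwnd doubles per roundtrip.

    server_ms and mss are assumed values, not figures stated in the text.
    """
    segments = math.ceil(file_bytes / mss)
    one_way = rtt_ms / 2
    elapsed = 0.0
    if new_connection:
        # SYN and SYN-ACK cost one roundtrip; the final ACK travels with the request.
        elapsed += rtt_ms
    # The request reaches the server, which then generates the response.
    elapsed += one_way + server_ms
    sent = 0
    while True:
        sent += min(cwnd, segments - sent)  # send one window's worth of segments
        if sent >= segments:
            return elapsed + one_way  # the last burst reaches the client
        elapsed += rtt_ms  # wait for the ACKs before sending the next burst
        cwnd *= 2  # slow-start: one extra segment per ACK doubles the window


FILE_BYTES = 20 * 1024
print(transfer_time_ms(FILE_BYTES, rtt_ms=56, cwnd=4))   # new connection, cwnd of four
print(transfer_time_ms(FILE_BYTES, rtt_ms=56, cwnd=10))  # new connection, cwnd of ten
print(transfer_time_ms(FILE_BYTES, rtt_ms=56, cwnd=16, new_connection=False))  # reused
```

Under these assumptions the three calls give 264 ms, 208 ms, and 96 ms. Rerunning them with a larger `rtt_ms` is a quick way to see the gap between new and reused connections widen.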

Congestion avoidance

It is important to recognize that TCP is specifically designed to use packet-loss as a feedback mechanism to help regulate its

performance. In other words, it is not a question of if, but rather of when the packet loss will occur. Slow-start initializes the

connection with a conservative window, and for every roundtrip, doubles the amount of data in flight until it exceeds the

receiver’s flow-control window, a system configured congestion threshold (ssthresh) window, or until a packet is lost, at

which point the congestion avoidance algorithm (Figure 2-3) takes over.

The implicit assumption in congestion avoidance is that packet loss is indicative of network congestion - somewhere along

the path we have encountered a congested link or a router, which was forced to drop the packet, and hence we need to adjust

our window to avoid inducing more packet loss and overwhelming the network.

Once the congestion window is reset, congestion avoidance specifies its own algorithms for how to grow the window to

minimize further loss. At a certain point, another packet loss event will occur, and the process will repeat once again. If you

have ever looked at a throughput trace of a TCP connection and observed a sawtooth pattern within it, now you know why it

looks as such - it is the congestion control and avoidance algorithms adjusting the congestion window size to account for

packet loss in the network.
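The sawtooth can be reproduced with a toy model. The sketch below is a simplification rather than any particular TCP variant: it assumes the window doubles each roundtrip during slow-start, is halved whenever a loss occurs, and otherwise grows by a fixed increment per roundtrip. The function name `cwnd_trace`, its parameters, and the chosen loss positions are all hypothetical.

```python
def cwnd_trace(roundtrips, loss_at, initial_cwnd=4, ssthresh=64, increment=1):
    """Return the congestion window (in segments) at the start of each roundtrip.

    loss_at is a set of roundtrip indexes in which a packet is assumed lost.
    """
    cwnd = initial_cwnd
    trace = []
    for rt in range(roundtrips):
        trace.append(cwnd)
        if rt in loss_at:
            # Loss is treated as a congestion signal: halve the window.
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow-start: double every roundtrip, up to the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: grow by a fixed amount per roundtrip.
            cwnd += increment
    return trace


# Print a crude text plot; the losses at roundtrips 8, 18 and 26 are arbitrary.
for rt, cwnd in enumerate(cwnd_trace(roundtrips=30, loss_at={8, 18, 26})):
    print(f"{rt:2d} {'#' * cwnd}")
```

Each assumed loss cuts the bar length in half, after which it climbs linearly until the next loss, which is the sawtooth shape seen in real throughput traces.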

Finally, it is worth noting that improving congestion control and avoidance is an active area both for academic research and

commercial products: there are adaptations for different network types, different types of data transfers, and so on. Today,

depending on your platform, you will likely run one of the many variants: TCP Tahoe and Reno (original implementations),

TCP Vegas, TCP New Reno, TCP BIC, TCP CUBIC (default on Linux), or Compound TCP (default on Windows), amongst

many others. However, regardless of the flavor, the core performance implications of congestion control and avoidance hold

for all.

Increasing TCP’s Initial Congestion Window

Increasing the initial cwnd size on the server to the new RFC 6928 value of ten segments (IW10) is one of the simplest ways to improve performance for all users and all applications running over TCP. The good news is that many operating systems have already updated their latest kernels to use the increased value - check the appropriate documentation and release notes.

For Linux, IW10 is the new default for all kernels above 2.6.39. However, don’t stop there: upgrade to 3.2+ to also get the benefit of other important updates, such as “Proportional Rate Reduction for TCP”.

Proportional Rate Reduction for TCP

Determining the optimal way to recover from packet loss is a non-trivial exercise: if you are too aggressive, then an intermittent lost packet will have a significant impact on the throughput of the entire connection, and if you don’t adjust quickly enough, then you will induce more packet loss!

Originally, TCP used the "Additive Increase and Multiplicative Decrease" (AIMD) algorithm: when packet loss occurs, halve the congestion window size, and then slowly increase the window by a fixed amount per roundtrip. However, in many cases AIMD is too conservative, and hence new algorithms were developed.

Proportional Rate Reduction (PRR) is a new algorithm specified by RFC 6937, whose goal is to improve the speed of recovery when a packet is lost. How much better is it? According to measurements done at Google, where the new algorithm was developed, it provides a 3-10% reduction in average latency for connections with packet loss.

PRR is now the default congestion avoidance algorithm in Linux 3.2+ kernels - another good reason to upgrade your servers!

Bandwidth-delay Product

The built-in congestion control and congestion avoidance mechanisms in TCP carry another important performance implication: the optimal sender and receiver window sizes must vary based on the roundtrip time and the target data rate between them.

To understand why this is the case, first recall that the maximum amount of unacknowledged, in-flight data between the

sender and receiver is defined as the minimum of the receive (rwnd) and congestion (cwnd) window sizes: the current receive

windows are communicated in every ACK, and the congestion window is dynamically adjusted by the sender based on the

congestion control and avoidance algorithms.

If either the sender or receiver exceeds the maximum amount of unacknowledged data, then they must stop and wait for the

other end to ACK some of the packets before proceeding. How long would they have to wait? That’s dictated by the roundtrip

time between them!

Bandwidth-delay product (BDP)

Product of data link’s capacity and its end-to-end delay. The result is the maximum amount of unacknowledged data

that can be in-flight at any point in time.

If either the sender or receiver is frequently forced to stop and wait for ACKs for previous packets, then this would create

gaps in the data flow (Figure 2-7), which would consequently limit the maximum throughput of the connection. To address

this problem, the window sizes should be made just big enough, such that either side can continue sending data until an ACK

arrives back from the client for an earlier packet - no gaps, maximum throughput. Consequently, the optimal window size is

dependent on the roundtrip time! Pick a low window size, and you will limit your connection throughput, regardless of the

available or advertised bandwidth between the peers.

Figure 2-7. Transmission gaps due to low congestion window size

So, how big do the flow control (rwnd) and congestion control (cwnd) window values need to be? The actual calculation is a

simple one. First, let us assume that the minimum of the cwnd and rwnd window sizes is 16 KB, and the roundtrip time is

100 ms:
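Worked out explicitly, and assuming 1 KB = 1,024 bytes, the calculation is roughly:

```latex
\begin{aligned}
16\ \text{KB} &= 16 \times 1024 \times 8\ \text{bits} = 131{,}072\ \text{bits} \\
\frac{131{,}072\ \text{bits}}{0.1\ \text{s}} &= 1{,}310{,}720\ \text{bits/s} \approx 1.31\ \text{Mbps}
\end{aligned}
```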

Regardless of the available bandwidth between the sender and receiver, this TCP connection will not exceed a 1.31 Mbps data

rate! To achieve higher throughput we need to raise the minimum window size, or lower the roundtrip time.

Similarly, we can compute the optimal window size if we know the roundtrip time and the available bandwidth on both ends.

In this scenario, let’s assume the roundtrip time stays the same (100 ms), but the sender has 10 Mbps of available bandwidth,

and the receiver is on a high-throughput 100 Mbps+ link. Assuming there is no network congestion between them, our goal is

to saturate the 10 Mbps link available to the client:
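Written out, and assuming 1 KB = 1,024 bytes, the window needed to keep a 10 Mbps link busy for a full 100 ms roundtrip is approximately:

```latex
\begin{aligned}
10\ \text{Mbps} &= \frac{10{,}000{,}000\ \text{bits/s}}{8\ \text{bits/byte}} = 1{,}250{,}000\ \text{bytes/s} \\
1{,}250{,}000\ \text{bytes/s} \times 0.1\ \text{s} &= 125{,}000\ \text{bytes} \approx 122.1\ \text{KB}
\end{aligned}
```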

The window size needs to be at least 122.1 KB to saturate the 10 Mbps link. Recall that the maximum receive window size in

TCP is 64 KB unless “Window Scaling (RFC 1323)” is present - double check your client and server settings!
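The 64 KB limit comes from the 16-bit window field in the TCP header; window scaling lets each side apply a left shift of up to 14 bits to the advertised value. As a rough sketch of how the two calculations fit together, the helper below computes the window a link needs and the smallest shift that would cover it. The function names `required_window_bytes` and `min_window_scale_shift` are hypothetical, and real stacks negotiate and tune these values on their own.

```python
import math

MAX_UNSCALED_WINDOW = 65_535  # largest value of the 16-bit TCP window field
MAX_SHIFT = 14                # largest window scale shift count allowed


def required_window_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return math.ceil(bandwidth_bps * rtt_ms / 8000)


def min_window_scale_shift(window_bytes):
    """Smallest shift for which a scaled 16-bit window covers window_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= window_bytes:
            return shift
    raise ValueError("window too large even with maximum scaling")


window = required_window_bytes(10_000_000, 100)  # 10 Mbps link, 100 ms roundtrip
print(window, min_window_scale_shift(window))
```

For the 10 Mbps, 100 ms example this reports 125,000 bytes and a shift of one, so a connection without window scaling could not saturate that link.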

The good news is that the window size negotiation and tuning is managed automatically by the network stack and should

adjust accordingly. The bad news is, sometimes it will still be the limiting factor on TCP performance. If you have ever

wondered why your connection is transmitting at a fraction of the available bandwidth, even when you know that both the

client and the server are capable of higher rates, then it is likely due to a small window size: a saturated peer advertising a low

receive window, bad network weather and high packet loss resetting the congestion window, or explicit traffic shaping which

could have been applied to limit throughput of your connection.

Bandwidth-delay product in high-speed LANs

BDP is a function of the roundtrip time and the target data rate. Hence, while the roundtrip time is a common bottleneck in cases with high propagation delay, it can also be a bottleneck on a LAN!

To achieve 1 Gbit/s with 1 ms roundtrip time, we would also need a congestion window of at least 122 KB. The calculation is exactly the same as we saw above: we simply added a few zeroes to the target data rate, and removed the same number of zeroes from the roundtrip latency.

Head of line blocking

TCP provides the abstraction of a reliable network running over an unreliable channel, which includes basic packet error checking and correction, in-order delivery, retransmission of lost packets, as well as flow control, congestion control, and congestion avoidance designed to operate the network at the point of greatest efficiency. Combined, these features make TCP the preferred transport for most applications.

However, while TCP is a popular choice, it is not the only, nor necessarily the best choice for every occasion. Specifically, some of the features such as in-order and reliable packet delivery are not always necessary and can introduce unnecessary delays and negative performance implications.

To understand why that is the case, recall that every TCP packet carries a unique sequence number when put on the wire, and the data must be passed to the receiver in-order (Figure 2-8). If one of the packets is lost en-route to the receiver, then all subsequent packets must be held in the receiver’s TCP buffer until the lost packet is retransmitted and arrives at the receiver. Because this work is done within the TCP layer, our application has no visibility into the TCP retransmissions or the queued packet buffers, and must wait for the full sequence before it is able to access the data. Instead, it simply sees a delivery delay when it tries to read the data from the socket. This effect is known as TCP head-of-line (HOL) blocking.

Figure 2-8. TCP Head-of-line blocking

The delay imposed by head-of-line blocking allows our applications to avoid having to deal with packet re-ordering and reassembly, which makes our application code much simpler. However, this is done at the cost of introducing unpredictable latency variation in the packet arrival times - commonly referred to as jitter - which can negatively impact the performance of the application.

Further, some applications may not even need either reliable delivery or in-order delivery: if every packet is a standalone message, then in-order delivery is strictly unnecessary, and if every message overrides all previous messages, then the requirement for reliable delivery can be removed entirely. Unfortunately, TCP does not provide such configuration - all packets are sequenced and delivered in order.

Applications which can deal with out-of-order delivery or packet loss and which are latency or jitter sensitive are likely better served with an alternate transport such as UDP.
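To make the head-of-line blocking described above concrete, the sketch below simulates a receive buffer that only releases data to the application while the sequence is contiguous. It is a toy model under stated assumptions: segments are numbered 0, 1, 2, ... instead of by byte offset, and the function name `deliver_in_order` is hypothetical.

```python
def deliver_in_order(arrivals):
    """Simulate a TCP receive buffer.

    arrivals is the order in which segment numbers reach the receiver.
    Returns, for each arrival, the segments released to the application
    at that moment.
    """
    buffered = set()
    next_expected = 0
    released_per_arrival = []
    for seq in arrivals:
        buffered.add(seq)
        released = []
        # Release segments only while the sequence has no gaps.
        while next_expected in buffered:
            buffered.remove(next_expected)
            released.append(next_expected)
            next_expected += 1
        released_per_arrival.append(released)
    return released_per_arrival


# Segment 1 is lost in transit and its retransmission arrives last.
arrival_order = [0, 2, 3, 4, 1]
for seq, released in zip(arrival_order, deliver_in_order(arrival_order)):
    print(f"segment {seq} arrived -> application receives {released}")
```

Segments 2, 3, and 4 sit in the buffer even though they arrived on time; the application sees nothing until segment 1 shows up, and then receives all four at once.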

Packet-loss is "OK"

In fact, packet-loss is necessary to get the best performance from TCP! A dropped packet acts as a feedback mechanism, which allows the receiver and sender to adjust their sending rates to avoid overwhelming the network, and to minimize latency - see “Bufferbloat in your local router”. Further, some applications can tolerate packet-loss without adverse effects: audio, video, and game state updates are common examples of application data which do not require either reliable or in-order delivery - incidentally, this is also why WebRTC uses UDP as its base transport.

If a packet is lost, then the audio codec can simply insert a minor break in the audio and continue processing the incoming packets. If the gap is small, the user may not even notice, and waiting for the lost packet runs the risk of introducing variable pauses in audio output, which would result in a much worse experience for the user.

Similarly, if we are delivering game state updates for a character in a 3D world, then waiting for a packet describing their state at time T-1, when we already have the packet for time T, is often simply unnecessary - ideally, we would receive each and every update, but to avoid gameplay delays, we can accept intermittent loss in favor of lower latency.

Optimizing for TCP

TCP is an adaptive protocol designed to be fair to all network peers and to make the most efficient use of the underlying network. Thus, the best way to optimize TCP is to tune how TCP senses the current network conditions and adapts its behavior based on the type and the requirements of the layers below and above it: wireless networks may need different congestion algorithms, and some applications may need custom quality of service (QoS) semantics to deliver the best experience.

The close interplay of the varying application requirements and the many knobs in every TCP algorithm makes TCP tuning and optimization an inexhaustible area of academic and commercial research. In this chapter we have only scratched the surface of the many factors that govern TCP performance. Additional mechanisms such as selective acknowledgments (SACK), delayed acknowledgments, and fast retransmit, amongst many others, make each TCP session much more complicated (or interesting, depending on your perspective) to understand, analyze, and tune.

Having said that, while the specific details of each algorithm and feedback mechanism will continue to evolve, the core principles and their implications remain unchanged:

TCP three-way handshake introduces a full roundtrip of latency

TCP slow-start is applied to every new connection

TCP flow and congestion control regulate throughput of all connections

TCP throughput is regulated by current congestion window size

As a result, the rate with which a TCP connection can transfer data in modern high-speed networks is often limited by the roundtrip time between the receiver and sender. Further, while bandwidth continues to increase, latency is bounded by the speed of light and is already within a small constant factor of its maximum value. In most cases, latency, not bandwidth, is the bottleneck for TCP - e.g. see Figure 2-5.

Tuning server configuration

As a starting point, prior to tuning any specific values for each buffer and timeout variable in TCP, of which there are dozens, you are much better off simply upgrading your hosts to their latest system versions. TCP best practices and the underlying algorithms that govern its performance continue to evolve, and most of these changes are only available in the latest kernels. In short, keep your servers up-to-date to ensure the optimal interaction between the sender and receiver’s TCP stacks.

On the surface, upgrading server kernel versions seems like trivial advice. However, in practice, it is often met with significant resistance: many existing servers are tuned for specific kernel versions, and system administrators are reluctant to perform the upgrade.

To be fair, every upgrade brings its risks, but to get the best TCP performance, it is also likely the single best investment you can make.

With the latest kernel in place, it is good practice to ensure that your server is configured to use the following best practices:

“Increasing TCP’s Initial Congestion Window”

A larger starting congestion window allows TCP to transfer more data in the first roundtrip, and significantly accelerates the
