queuing policies. Although results are omitted due to space, we find (i) scheduling across frameworks follows a simple FIFO queuing model and (ii) scheduling is performed on a per-function basis (instead of other policies like per-chain).
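One way to check such ordering externally is to tag invocations with sequence numbers and compare submission order against execution order. The following is a minimal probe sketch, not the paper's tooling; invoke and fetch_execution_order are hypothetical stand-ins for a platform SDK call and a log query.

```python
# Sketch: probe for FIFO scheduling by submitting numbered invocations and
# checking whether execution order matches submission order. `invoke` and
# `fetch_execution_order` are hypothetical stand-ins for a platform invoke
# call and a log query returning seq numbers ordered by function start time.

def is_fifo(invoke, fetch_execution_order, n=100):
    for seq in range(n):
        invoke({"seq": seq})            # submit in a known order
    executed = fetch_execution_order()  # seq numbers in execution order
    return executed == sorted(executed) # FIFO iff order is preserved
```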
2.2.1 Limitations. This section shows limitations in existing serverless offerings and how these impact QoS for incoming requests and overall performance. Specifically, it is shown that inconsistent and incorrect concurrency limits are prevalent, mid-chain function drops occur, workloads such as bursts are not easily supported, HTTP functions are prioritized without documentation, inefficient resource allocation is common, and concurrency collapses under certain conditions.
[Figure 2: Incorrect concurrency limits. (a) IBM Cloud Functions; (b) Azure Functions. Axes: Time (sec) vs. Count, with curves for Concurrency and Completed.]
Inconsistent and incorrect concurrency limits
We find numerous issues with concurrency limits on serverless platforms. IBM suffers from a simple issue: default concurrency limits are documented to be 1,000, but up to 1,200 concurrent functions are run in parallel. Figure 2a shows a burst of 1,200 Single functions. The x-axis is time, the y-axis is the number of concurrently running functions, the dotted line tracks completions, and the solid line shows up to 1,200 functions running simultaneously.
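A concurrency curve like the one in Figure 2a can be derived from per-invocation start and end timestamps with a simple sweep. The sketch below is an illustration of that derivation under the assumption that (start, end) pairs have been parsed from platform logs; it is not the paper's actual harness.

```python
# Sketch: derive a concurrency timeline from per-invocation (start, end)
# timestamps, e.g. parsed from platform logs. Illustrates how a plot like
# Figure 2a can be produced; not the paper's actual tooling.

def concurrency_timeline(spans):
    """spans: list of (start, end) times in seconds. Returns a list of
    (time, running) points at each change in the running count."""
    events = []
    for start, end in spans:
        events.append((start, +1))  # a function began running
        events.append((end, -1))    # a function completed
    events.sort()                   # ties: completions precede starts
    running, timeline = 0, []
    for t, delta in events:
        running += delta
        timeline.append((t, running))
    return timeline

# Example: three overlapping invocations peak at concurrency 2.
print(max(c for _, c in concurrency_timeline([(0, 5), (1, 3), (4, 8)])))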
In the worst case, no enforcement can occur in Azure. A workload is created in which demand is slowly ramped up over time. Azure does not limit the number of concurrent HTTP functions, which was configured to 1,000, or the number of instances, which is 200 by default. During the test, the Function App's Live Metrics Stream reported up to 440 instances allocated to the Function App with up to 8,000 concurrent requests run at a time, as shown in Figure 2b.
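A ramp-up workload of this kind is straightforward to generate client-side. The sketch below is a minimal load generator, not the paper's harness; TARGET_URL and the step sizes are illustrative assumptions.

```python
import threading, time, urllib.request

TARGET_URL = "https://example.com/api/httptrigger"  # hypothetical HTTP-triggered function

def fire():
    try:
        urllib.request.urlopen(TARGET_URL, timeout=120)
    except Exception:
        pass  # errors and drops are themselves part of what the test observes

# Ramp demand in steps: +10 requests/sec every 10 seconds, up to 200/sec,
# so any enforcement of the concurrency limit would appear as demand grows.
rate = 10
while rate <= 200:
    step_end = time.time() + 10
    while time.time() < step_end:
        for _ in range(rate):
            threading.Thread(target=fire, daemon=True).start()
        time.sleep(1)
    rate += 10

time.sleep(120)  # let in-flight requests finish before exiting
```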
Last, GCF does not limit total CPU consumption in a tight manner. GCF caps total CPU usage over all functions to a specified threshold over a 100 second period. CPU consumption is tracked during the period, and when the threshold is reached, no new functions are invoked. We find two issues, however. First, any outstanding functions are able to complete after the limit is reached, violating CPU limits. Second, a slow trickle of invocations still occurs after the CPU limit is reached. Figure 3 shows CPU usage is more than doubled in the MixedChain workload: CPU limits were set to 40M MHz·s, but over 90M MHz·s of consumption was encountered (dotted red line). Concurrency for each function and total concurrency, the sum of the per-function concurrencies, are also shown.

[Figure 3: GCF: MixedChain workload CPU usage. Axes: Time (sec) vs. Concurrency (left axis) and CPU in million MHz·s (right axis), with per-function, Total, and CPU curves.]
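GCF's quota is expressed as allocated CPU (MHz) multiplied by execution time (seconds), accumulated over the window. The sketch below shows one way such consumption could be tallied from execution records; the record format is a hypothetical illustration, not GCF's API.

```python
# Sketch: tally CPU consumption in MHz*s over a 100-second window from
# per-invocation records (start_s, duration_s, allocated_mhz). Hypothetical
# record format for illustration; GCF tracks usage via its own quota system.

WINDOW_S = 100
LIMIT_MHZ_S = 40_000_000  # 40M MHz*s, the limit used in Figure 3

def window_usage(records, window_start):
    """records: list of (start_s, duration_s, allocated_mhz) tuples."""
    usage = 0
    for start, dur, mhz in records:
        # Count only the execution time that overlaps the window.
        lo = max(start, window_start)
        hi = min(start + dur, window_start + WINDOW_S)
        if hi > lo:
            usage += (hi - lo) * mhz
    return usage

# Rough scale: 3,000 concurrent functions at 2,000 MHz each consume
# 6M MHz per second, reaching the 40M MHz*s limit in under 7 seconds.
```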
The above findings indicate concurrency limits are often inconsistent or incorrect, placing additional burden on serverless developers. When limits are under intended values, workloads may unexpectedly encounter poor performance or increased drops. Dealing with such issues increases serverless application complexity. When limits are over intended values, developers may incur higher costs than budgeted for. And when limits are inconsistent, developers can have difficulty managing and reasoning about serverless performance.
Mid-chain drops
Some serverless platforms provide a hard concurrency limit (AWS and IBM) beyond which all subsequent requests are dropped. When demand rises above a specified function invocation limit, functions can be queued (up to 4 days in the case of AWS [81]), silently dropped [82], or returned with an error (in the synchronous case only).
This is problematic for several reasons. First, developers may rely on function chain completion, and when function chains drop mid-chain, incorrectness may arise. Second, developers can instead solve the problem at the application layer, but this increases complexity and developer burden, two problems serverless aims to solve. Third, drops mid-chain result in inefficiency because the resources spent running functions before the drop are wasted and could have been better used to finish some other outstanding function chain. And last, if providers queue requests mid-chain, then the total function chain running time variance can be significantly increased, impacting SLAs or otherwise negatively affecting performance.
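To make the application-layer workaround concrete, the sketch below shows the kind of retry-and-checkpoint logic a developer would have to wrap around each chain stage. All names here are hypothetical; this is one possible mitigation under stated assumptions, not a mechanism offered by any platform.

```python
import time

def invoke_stage(stage_fn, payload, max_retries=5):
    """Retry one chain stage on throttling/drops, with exponential backoff.
    stage_fn is a hypothetical stand-in for a platform invoke call."""
    for attempt in range(max_retries):
        try:
            return stage_fn(payload)
        except Exception:
            time.sleep(2 ** attempt)  # back off before re-invoking
    raise RuntimeError("stage permanently failed; chain is incomplete")

def run_chain(stages, payload, checkpoint):
    """Resume from the last checkpointed stage so a mid-chain drop does
    not waste the work already done by earlier stages."""
    for i, stage in enumerate(stages):
        if i < checkpoint.get("done", 0):
            continue  # already completed in a previous attempt
        payload = invoke_stage(stage, payload)
        checkpoint["done"] = i + 1  # would be persisted externally in practice
    return payload
```

Even this simple wrapper needs retry policy, backoff tuning, and durable checkpoint storage, illustrating the complexity burden the paragraph above describes.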
To assess the impact of mid-chain drops, a Fan-2 Burst workload is run on AWS Step Functions and IBM Cloud Functions, where the burst is the size of the concurrency limit. Note the "fan" portion of Fan-2 invokes twice as many functions after F1 completes, meaning a burst of 1,000 Fan-2's will ultimately result in 2,000 concurrent functions (i.e., F2 and F3) and a violation of concurrency limits. Figure 4 shows