Adaptive Content Outsourcing: A CDN Strategy Based on Generalized Communities


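"The paper 'CDNs Content Outsourcing via Generalized Communities' examines content outsourcing policies in content distribution networks (CDNs). Using a self-adaptive method based on community clustering, it addresses the problem that traditional prefetching policies depend on unreliable content popularity statistics."

On today's Internet, CDNs play a vital role: they balance service cost against content delivery quality in order to serve customer-specific content and improve performance. CDN providers need efficient content outsourcing policies so that they can tune their services accordingly, improve the user experience, and achieve significant economic benefits. Traditional CDN prefetching policies, however, are usually based on content popularity statistics, which are not always accurate and are highly volatile.

The paper proposes a novel self-adaptive technique that operates within a CDN framework and identifies the content to be outsourced without any a priori knowledge of request statistics. The technique takes a structure-based approach: it identifies coherent clusters of "correlated" Web server content objects, the so-called Web page communities. These communities form the core unit of outsourcing.

A Web page community is a group of Web pages that are strongly correlated in the network's request patterns. By analyzing how these communities form and change, a CDN can predict more precisely which content users will request, and so avoid basing prefetching decisions on unstable popularity data.

The authors, Dimitrios Katsaros, George Pallis, Konstantinos Stamos, Athena Vakali, Antonis Sidiropoulos, and Yannis Manolopoulos, study how these community structures can be used to optimize content distribution. They may discuss how the communities are built and identified, and how community information is used to adjust the outsourcing policy dynamically so as to improve service efficiency and user satisfaction. The paper may also report experimental results for the new method, covering performance gains, cost savings, and adaptability to fluctuations in content popularity. These experiments may compare the new method against traditional prefetching policies to demonstrate its effectiveness in practice.

Through a community-based, self-adaptive outsourcing policy, the paper offers a new perspective on CDN content management and optimization, addresses a key weakness of traditional methods, and provides theoretical support and practical guidance for future CDN design.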


server content structure). The proposed policy, called Communities identification with Betweenness Centrality (CiBC), identifies overlapped Web page communities using the concept of Betweenness Centrality (BC) [5]. Specifically, Newman and Girvan [22] have used the concept of edge betweenness to select edges to be removed from the graph so as to devise a hierarchical agglomerative clustering procedure, which, however, cannot provide the final communities by itself and requires the intervention of administrators. In contrast to that work [22], BC is used in this paper to measure how central each node of the Web site graph is within a community.
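As a rough illustration of the node-level measure mentioned above, the sketch below computes unnormalized betweenness centrality with Brandes' standard algorithm on a small directed graph. It is not the CiBC policy itself, which this excerpt does not specify; the function name `betweenness_centrality`, the `site` graph, and its page names are hypothetical and chosen only for the example.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted directed graph.

    graph: dict mapping every node to a list of its successors
           (every node must appear as a key).
    Returns a dict node -> unnormalized betweenness centrality.
    """
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember the predecessors on those paths.
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        preds = {v: [] for v in graph}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Hypothetical five-page Web site; edges are hyperlinks.
site = {
    "home": ["news", "products"],
    "news": ["article"],
    "products": ["article"],
    "article": ["contact"],
    "contact": [],
}
scores = betweenness_centrality(site)
```

In this toy graph every shortest path to `contact` passes through `article`, so `article` receives the highest score, while `home` and `contact`, which never lie strictly between two other pages, score zero.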

• Experimenting on a detailed simulation testbed. The experimentation involves numerous experiments that evaluate the proposed scheme under regular traffic and under flash crowd events. Current usage of Web technologies and Web server content performance characteristics during a flash crowd event are highlighted, and our experimentation shows the proposed approach to be robust and effective in minimizing both the average response time of users' requests and the costs of CDN providers.

1.2 Road Map

The rest of this paper is structured as follows: Section 2 discusses the related work. In Section 3, we formally define the problem addressed in this paper. Section 4 presents the proposed policy. Sections 5 and 6 present the simulation testbed, examined policies, and performance measures. Section 7 evaluates the proposed approach, and finally, Section 8 concludes this paper.

2 RELEVANT WORK

2.1 Content Outsourcing Policies

As identified by earlier research efforts [9], [15], the choice of the outsourced content has a crucial impact on the CDN's pricing [15] and the CDN's performance [9], and it is quite complex and challenging if we consider the dynamic nature of the Web. A naive solution to this problem is to outsource all the objects of the Web server content (full mirroring) to all the surrogate servers. This may seem feasible, since storage media and networking support have greatly improved. However, the respective demand from the market greatly surpasses these advances. For instance, after the recent agreement between Limelight Networks (footnote 4) and YouTube, under which YouTube adopted Limelight Networks as its content delivery platform, we can only deduce the huge storage requirements of the surrogate servers, since the actual figures are proprietary information. Moreover, the evolution toward completely personalized TV (e.g., the stage6, footnote 5) reveals that the full content of the origin servers cannot be outsourced as a whole. Finally, the problem of updating such a huge collection of Web objects is unmanageable. Thus, we have to resort to a more "selective" outsourcing policy.

A few such content outsourcing policies have been proposed in order to identify which objects to outsource for replication to the CDN's surrogate servers. These can be categorized as follows:

• Empirical-based outsourcing. The Web server content administrators decide empirically about which content will be outsourced [3].

• Popularity-based outsourcing. The most popular objects are replicated to surrogate servers [37].

• Object-based outsourcing. The content is replicated to surrogate servers in units of objects. Each object is replicated to the surrogate server (under the storage constraints) which gives the most performance gain (greedy approach) [9], [37].

• Cluster-based outsourcing. The content is replicated to surrogate servers in units of clusters [9], [14]. A cluster is defined as a group of Web pages which have some common characteristics with respect to their content, the time of references, the number of references, etc.
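The greedy idea behind object-based outsourcing in the list above can be sketched as follows. This is a minimal illustration under strong assumptions, not the algorithm of [9] or [37]: the function `greedy_object_placement`, the `gain` callback, and the sizes, capacities, and request counts are all hypothetical, and the gain of a placement is treated as fixed, whereas a real policy would re-estimate it after every placement.

```python
def greedy_object_placement(objects, servers, gain):
    """Greedy object-to-surrogate placement under storage constraints.

    objects: dict object name -> size
    servers: dict surrogate name -> free storage capacity
    gain:    hypothetical callback gain(obj, srv) -> estimated benefit
    Returns the list of (object, surrogate) placements in the order chosen.
    """
    free = dict(servers)          # remaining capacity per surrogate
    placed = set()
    placement = []
    while True:
        best = None
        # Scan every feasible (object, surrogate) pair for the largest gain.
        for obj, size in objects.items():
            for srv, capacity in free.items():
                if (obj, srv) in placed or size > capacity:
                    continue
                g = gain(obj, srv)
                if g > 0 and (best is None or g > best[0]):
                    best = (g, obj, srv)
        if best is None:          # nothing fits or nothing helps
            break
        _, obj, srv = best
        free[srv] -= objects[obj]
        placed.add((obj, srv))
        placement.append((obj, srv))
    return placement


# Hypothetical sizes, capacities, and per-surrogate request counts.
objects = {"a.html": 4, "b.jpg": 6, "c.css": 3}
servers = {"s1": 8, "s2": 6}
requests = {
    ("a.html", "s1"): 50, ("b.jpg", "s1"): 40, ("c.css", "s1"): 10,
    ("a.html", "s2"): 5, ("b.jpg", "s2"): 30, ("c.css", "s2"): 20,
}
placement = greedy_object_placement(
    objects, servers, lambda obj, srv: requests.get((obj, srv), 0)
)
```

Each round scans every remaining object and surrogate pair, so the cost grows with the product of the two counts per placement, which makes it easy to see why a per-object policy becomes expensive as the number of objects grows.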

Of the above content outsourcing policies, the object-based one achieves high performance [9], [37]. However, as pointed out by the authors of these policies, the huge number of objects prevents it from being implemented in a real application. On the other hand, the popularity-based outsourcing policies do not select the most suitable objects for outsourcing, since the most popular objects remain popular only for a short time period [9]. Moreover, they require quite a long time to collect reliable request statistics for each object. Such a long interval may not be available when new Web server content is published to the Internet and should be protected from flash crowd events.

Thus, we resort to the exploitation of cluster-based outsourcing policies. The cluster-based approach has also attracted the most attention in the research community [9]. In such an approach, the clusters may be identified by using conventional data clustering algorithms. However, due to the lack of a uniform schema for Web documents and the dynamics of Web data, the efficiency of these approaches is unsatisfactory. Furthermore, most of them require administratively tuned parameters (maximum cluster diameter, maximum number of clusters) to decide the number of clusters, which causes additional problems, since there is no a priori knowledge about how many clusters of objects exist and of what shape these clusters are.

In contrast to the above approaches, we exploit the Web server content structure and consider each cluster as a Web page community, whose defining characteristic is that it reflects the dynamic and heterogeneous nature of the Web. Specifically, our approach considers each page as a whole object, rather than breaking down the Web page into information pieces, and reveals mutual relationships among the concerned Web data.

2.2 Identifying Web Page Communities

In the literature, there are several proposals for identifying Web page communities [13], [16]. One of the key distinguishing properties of these algorithms is the degree of locality that is used for assessing whether or not a page should be assigned to a community. Regarding this feature, the methods for identifying the communities can be summarized as follows:

KATSAROS ET AL.: CDNS CONTENT OUTSOURCING VIA GENERALIZED COMMUNITIES 3

4. http://www.limelightnetworks.com.

5. http://stage6.divx.com.
