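Optimization of a Two-Granularity Software Rejuvenation Policy Based on the Markov Regenerative Process: Improving System Performance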

Updated 2024-07-15 · PDF · 1.02 MB

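This article discusses the optimization of a two-granularity software rejuvenation policy based on the Markov regenerative process, a strategy for improving computer system performance. Software rejuvenation is a proactive software control technique that mitigates the performance degradation caused by software aging. The paper proposes a rejuvenation policy built on software inspection at two granularities (the user-level application and the operating system) and uses closed-loop control to counter the effects of aging at each level.

First, the authors build a model based on the Markov regenerative process that takes the system's real-time state into account, so that performance changes can be better understood and predicted. A Markov process is a stochastic process whose future evolution depends only on the current state and not on the path taken to reach it, which makes it well suited to describing the dynamics of complex systems. With this model, the researchers can capture the system's behavior and latent problems at the different software layers.

Next, fault-injection experiments are used to evaluate the performance degradation rates of the application software and the operating system in detail. This experimental approach quantifies the impact of software aging and supplies the data that the optimization needs. The results both validate the effectiveness of the rejuvenation policy and give a deeper understanding of how software aging behaves.

The core of the article is the design of the optimization algorithm, which adjusts the frequency and manner of rejuvenation at the two granularities according to the state of the Markov regenerative process. This may involve periodically reloading user applications, applying operating system patches, or other forms of software maintenance that keep the system running efficiently. The goal of the optimization is to maximize overall system performance while minimizing the disruption or downtime that rejuvenation causes.

The accuracy of the diagnostic monitoring tool is also a key element of the optimized policy, because it directly affects the precision of rejuvenation decisions. The monitoring and analysis techniques mentioned in the paper may include real-time performance monitoring, anomaly detection, and intelligent decision support, all intended to make policy execution efficient.

In summary, the paper examines how software rejuvenation applies to a multi-layer software architecture and, using the Markov regenerative process model and fault-injection experiments, proposes a two-granularity optimization scheme aimed at improving the stability and performance of computer systems. The work has theoretical and practical value for software engineering and data center management.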

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

4 IEEE TRANSACTIONS ON RELIABILITY

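TABLE II
MAINTAIN ACTIONS AND THEIR CORRESPONDING TIME DETERMINED BY THE MonitoredState(t)

| No. | MonitoredState(t) | Action at AS level | Action at OS level | Time of the action |
|-----|-------------------|--------------------|--------------------|--------------------|
| 0 | rr | no action | no action | - |
| 4 | Fr | reactive restart | no action | $T_{A4}$ |
| 5 | FD | proactive restart | proactive reboot | $T_{O3}$ |
| 6 | FF | reactive restart | reactive reboot | $T_{O4}$ |
| 7 | rD | proactive restart | proactive reboot | $T_{O3}$ |
| 8 | Dr | proactive restart | no action | $T_{A3}$ |
| 9 | DD | proactive restart | proactive reboot | $T_{O3}$ |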
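The mapping in Table II is a direct lookup from the diagnosed state to a pair of actions. A minimal Python sketch follows, assuming the abbreviations "Rea./Proa." and "Rest./Rebo." stand for reactive/proactive and restart/reboot. The names `Maintenance`, `TABLE_II`, and `maintenance_for` are illustrative and do not come from the paper, and only the rows visible in this excerpt are included.

```python
from typing import NamedTuple, Optional


class Maintenance(NamedTuple):
    as_action: str            # action taken at the AS level
    os_action: str            # action taken at the OS level
    duration: Optional[str]   # symbol of the action's duration, None if no action


# Rows transcribed from Table II as shown in the excerpt (states 0 and 4-9).
TABLE_II = {
    "rr": Maintenance("no action", "no action", None),
    "Fr": Maintenance("reactive restart", "no action", "T_A4"),
    "FD": Maintenance("proactive restart", "proactive reboot", "T_O3"),
    "FF": Maintenance("reactive restart", "reactive reboot", "T_O4"),
    "rD": Maintenance("proactive restart", "proactive reboot", "T_O3"),
    "Dr": Maintenance("proactive restart", "no action", "T_A3"),
    "DD": Maintenance("proactive restart", "proactive reboot", "T_O3"),
}


def maintenance_for(monitored_state: str) -> Maintenance:
    """Look up the maintenance decision for a diagnosed state such as 'rD'."""
    return TABLE_II[monitored_state]


if __name__ == "__main__":
    for state in TABLE_II:
        print(state, maintenance_for(state))
```

A dictionary keeps the policy declarative: the decision depends only on MonitoredState(t), never on the true state, which the M&A cannot observe directly.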

1) The time to inspection trigger is a constant, $T$, which is actually the constant variable in this paper.

2) There is no delay in carrying out the inspection for the M&A, which means that the system makes instantaneous diagnoses after it is triggered.

3) When the AS fails, the M&A will be triggered immediately to detect the state of the OS.

4) After each rejuvenation of the AS or the OS, the timer of the M&A will be reset.

5) The holding time $T_{A1}$ (or $T_{O1}$) of the AS (or OS) from robust state to degradation state has the exponential distribution with parameter $\lambda_{A1}$ (or $\lambda_{O1}$). The holding time $T_{A2}$ (or $T_{O2}$) of the AS (or OS) from degradation state to failed state also has the exponential distribution with parameter $\lambda_{A2}$ (or $\lambda_{O2}$).

6) The rejuvenation time $T_{A3}$ (or $T_{O3}$) of the AS (or OS) from degradation state to robust state has a general distribution $F_{A3}(t)$ (or $F_{O3}(t)$). The reactive restart (or reboot) time $T_{A4}$ (or $T_{O4}$) of the AS (or OS) also has a general distribution $F_{A4}(t)$ (or $F_{O4}(t)$).
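Assumptions 1, 3, and 5 can be illustrated with a small simulation of a single level (AS or OS) in isolation: it degrades after an exponential holding time, fails after a second one, and is inspected at a fixed interval unless a failure triggers the M&A first. The sketch below is not the paper's model. It ignores the coupling between the two levels, and the function names and rate values are hypothetical.

```python
import math
import random


def first_event(lam1: float, lam2: float, t_inspect: float, rng: random.Random):
    """Return (time, true_state) at the first inspection or failure of one level.

    lam1: rate of robust -> degradation, lam2: rate of degradation -> failed,
    t_inspect: fixed time to the inspection trigger.
    """
    t_degrade = rng.expovariate(lam1)            # robust -> degradation
    if t_degrade >= t_inspect:
        return t_inspect, "r"                    # still robust when inspected
    t_fail = t_degrade + rng.expovariate(lam2)   # degradation -> failed
    if t_fail >= t_inspect:
        return t_inspect, "D"                    # degraded when inspected
    return t_fail, "F"                           # failure triggers the M&A early


def estimate_robust_prob(lam1, lam2, t_inspect, runs=20000, seed=0):
    """Monte Carlo estimate of Pr{level is still robust at the inspection}."""
    rng = random.Random(seed)
    hits = sum(first_event(lam1, lam2, t_inspect, rng)[1] == "r" for _ in range(runs))
    return hits / runs


def exact_robust_prob(lam1: float, t_inspect: float) -> float:
    """Closed form for the same probability: Pr{T1 > t} = exp(-lam1 * t)."""
    return math.exp(-lam1 * t_inspect)


if __name__ == "__main__":
    # Hypothetical rates and inspection interval, chosen only for illustration.
    print(estimate_robust_prob(0.5, 1.0, 1.0), exact_robust_prob(0.5, 1.0))
```

The estimate should approach the closed form as the number of runs grows, which gives a quick sanity check on the sampling logic before adding the general rejuvenation-time distributions of assumption 6.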

Let TrueState(t) be the true state of the system at time t. Assume that the M&A does not always make correct diagnoses. MonitoredState(t) ≠ TrueState(t) means that the M&A makes a misdiagnosis. Recall that the AS and the OS have their separate memory usage. Suppose that the M&A makes diagnoses independently for the AS and the OS. Let $\mathrm{MonitoredState}_A(t)$ and $\mathrm{MonitoredState}_O(t)$ be the diagnosis results of the AS and the OS, respectively. MonitoredState(t) = 'rD' means that $\mathrm{MonitoredState}_A(t)$ = 'r' and $\mathrm{MonitoredState}_O(t)$ = 'D.' Suppose that if the AS and the OS are at one of the states in {failed, rejuvenation}, they can be rightly diagnosed by the M&A. Let

$$
\begin{aligned}
{} &= \Pr\{\mathrm{MonitoredState}_A(t) = r \mid \mathrm{TrueState}_A(t) = r\} \\
{} &= \Pr\{\mathrm{MonitoredState}_A(t) = D \mid \mathrm{TrueState}_A(t) = D\} \\
{} &= \Pr\{\mathrm{MonitoredState}_O(t) = r \mid \mathrm{TrueState}_O(t) = r\} \\
{} &= \Pr\{\mathrm{MonitoredState}_O(t) = D \mid \mathrm{TrueState}_O(t) = D\}.
\end{aligned}
\tag{1}
$$
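Because the AS and the OS are diagnosed independently, the probability of each joint diagnosis is the product of the per-level probabilities in (1). The following Python sketch shows that computation under two assumptions: the only possible misdiagnosis confuses 'r' with 'D', and the accuracy values are made-up numbers. The function names and the parameter names `p_r` and `p_d` are hypothetical stand-ins for the paper's symbols.

```python
from itertools import product


def diagnosis_dist(true_level: str, p_r: float, p_d: float) -> dict:
    """Distribution of one level's diagnosis given its true state.

    p_r: Pr{diagnosed r | truly r}, p_d: Pr{diagnosed D | truly D}.
    """
    if true_level == "r":
        return {"r": p_r, "D": 1.0 - p_r}
    if true_level == "D":
        return {"D": p_d, "r": 1.0 - p_d}
    # Failed or rejuvenating levels are assumed to be diagnosed correctly.
    return {true_level: 1.0}


def monitored_dist(true_state: str, acc_as: tuple, acc_os: tuple) -> dict:
    """Joint distribution of MonitoredState(t) given TrueState(t), e.g. 'rD'."""
    as_dist = diagnosis_dist(true_state[0], *acc_as)
    os_dist = diagnosis_dist(true_state[1], *acc_os)
    return {
        a + o: pa * po
        for (a, pa), (o, po) in product(as_dist.items(), os_dist.items())
    }


if __name__ == "__main__":
    # Hypothetical accuracies: (p_r, p_d) for the AS and for the OS.
    print(monitored_dist("rD", (0.9, 0.8), (0.95, 0.85)))
```

With these illustrative numbers, a system whose true state is 'rD' is diagnosed correctly with probability 0.9 × 0.85, and the four joint probabilities sum to one.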

When and which rejuvenation technique should be executed depends on the diagnosis results from the M&A. For example, if TrueState(t) is "rD," there are four possible actions according to the inspection results MonitoredState(t):

$$
\begin{cases}
\text{no action} \to rD, & \text{if } \mathrm{MonitoredState}(t) = rr \\
\text{Reju. OS} \to rr, & \text{if } \mathrm{MonitoredState}(t) = rD \\
\text{Reju. AS} \to rD, & \text{if } \mathrm{MonitoredState}(t) = Dr \\
\text{Reju. OS} \to rr, & \text{if } \mathrm{MonitoredState}(t) = DD
\end{cases}
$$

by noting our assumption that if the AS and the OS are at one of the states in {failed, rejuvenation}, they can be rightly diagnosed by the M&A. The meaning of the above brace can be explained as follows. Suppose that TrueState(t) is "rD." If MonitoredState(t) is "rr," which means that the M&A makes a right report regarding the AS level and makes a wrong report regarding the OS level, then no maintenance action will be taken. The true state of the system will still be "rD." If MonitoredState(t) is "Dr," which means that the M&A makes wrong reports regarding both the AS level and the OS level, then the action of AS rejuvenation will be taken. The true state of the system will still be "rD" after AS rejuvenation. The true state transition path in this case is rD → RD → rD.

Fig. 2. True state diagram for the two-granularity software aging and rejuvenation model.
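The four cases for a true state of "rD" can be turned into a distribution over the true state that results from one inspection. The sketch below encodes the case list as a dictionary and sums the probabilities of the diagnoses that lead to each outcome. The names and the input probabilities are hypothetical and serve only to illustrate the bookkeeping.

```python
# Case list for TrueState(t) = "rD":
# monitored state -> (action taken, true state once the action completes)
OUTCOME_FROM_RD = {
    "rr": ("no action", "rD"),
    "rD": ("rejuvenate OS", "rr"),
    "Dr": ("rejuvenate AS", "rD"),   # path rD -> RD -> rD
    "DD": ("rejuvenate OS", "rr"),
}


def next_true_state_dist(monitored_probs: dict) -> dict:
    """Aggregate diagnosis probabilities into a distribution over resulting true states."""
    result = {}
    for monitored, prob in monitored_probs.items():
        _action, next_state = OUTCOME_FROM_RD[monitored]
        result[next_state] = result.get(next_state, 0.0) + prob
    return result


if __name__ == "__main__":
    # Hypothetical diagnosis probabilities given a true state of "rD".
    probs = {"rr": 0.135, "rD": 0.765, "Dr": 0.015, "DD": 0.085}
    print(next_true_state_dist(probs))
```

In this case list the system returns to "rr" exactly when the OS-level diagnosis is 'D', so the probability of reaching "rr" equals the probability that the degraded OS is diagnosed correctly, whatever the AS-level diagnosis says.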

Based on the rejuvenation scheduling defined above, the true state diagram is determined. Fig. 2 shows the state diagram of the ten states for our two-granularity software aging and rejuvenation model. In the figure, $F_{(\cdot)}(t)$ are the distribution functions of the corresponding sojourn times. According to our supposition, the holding time $T_{A1}$ of the AS transition from robust state to degradation state follows exponential distribution $F_{A1}(t)$. Also the holding times $T_{A2}$, $T_{O1}$, and $T_{O2}$ follow exponential distribution with the distribution functions $F_{A2}(t)$, $F_{O1}(t)$, and $F_{O2}(t)$, respectively. As shown in Fig. 2, because the time to trigger inspection is a deterministic value $T$, the stochastic process $Z(t)$ determined by the model is not a Markov process
