International Journal of xxxxxx
Vol. x, No. x, xxxxx, 2017
of the contribution consists of two algorithms: the first selects VMs for customers'
applications according to both the usage time and the failure probability of the VMs,
and the second selects a suitable fault tolerance technique. However, the paper lacks
an experimental assessment and a comparison to verify the solution. In
[11], the authors present an approach to transparently delivering fault tolerance to
applications deployed in VM instances. In particular, they propose an approach for
realizing generic fault tolerance mechanisms as independent modules, validating fault
tolerance properties of each mechanism, and matching users’ requirements with
available fault tolerance modules to obtain a comprehensive solution with desired
properties. However, this work does not discuss any experimental results either. Garg et al.
[12] provide a fault tolerance scheme that ensures reliability, scalability and availability.
Their results show that situations such as unexpected traffic spikes and organic traffic
growth, as well as internal challenges such as server failure or urgent component
maintenance, can be handled easily using HAProxy. However, HAProxy cannot be used to
handle requests for multiple applications coming from multiple users at the same time, so
this framework is not suitable for SaaS services, which are multi-tenant. In [13],
H.T. Tran et al. have estimated the theoretical improvements in service availability that
can be achieved using the Retry Fault Tolerance, Recovery Block Fault Tolerance and
Dynamic Sequential Fault Tolerance strategies, and have compared these estimates to
experimentally obtained results. However, they neglect the various practical failure
scenarios in cloud environments. For example, if the system suffers from aging, their
solution performs worse than the reboot strategy.
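The aging argument can be illustrated with a minimal numerical sketch. It is not the model of [13]. It assumes, purely for illustration, that retry attempts are independent and that each succeeds with a fixed probability p, so that at least one of n attempts succeeds with probability 1 - (1 - p)^n. The function name retry_availability and the probabilities 0.99 (fresh instance) and 0.30 (aged instance) are hypothetical and chosen only for the example.

```python
def retry_availability(p_success: float, max_attempts: int) -> float:
    """Probability that at least one of `max_attempts` attempts succeeds.

    Assumes attempts are independent and each succeeds with probability
    `p_success` (an illustrative simplification, not the model of [13]).
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must lie in [0, 1]")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return 1.0 - (1.0 - p_success) ** max_attempts


# Hypothetical per-attempt success probabilities (assumed, not measured).
P_FRESH = 0.99  # instance right after a reboot
P_AGED = 0.30   # same instance after aging has degraded it

# Retrying on the aged instance versus one attempt on a rebooted instance.
retry_on_aged = retry_availability(P_AGED, max_attempts=3)
single_after_reboot = retry_availability(P_FRESH, max_attempts=1)

print(f"3 retries on aged instance: {retry_on_aged:.3f}")
print(f"1 attempt after reboot:     {single_after_reboot:.3f}")
```

Under these assumed values, three retries on the aged instance reach an availability of about 0.657, lower than the 0.99 of a single attempt after a reboot, because retrying does not remove the degraded state. The sketch ignores the downtime caused by the reboot itself.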
Fault injection [14] is a critical method for evaluating the fault tolerance of cloud
services and for benchmarking their dependability [15], and it is also a promising approach
to understanding a system's behavior in the presence of faults [16]. Therefore, to enable a
thorough analysis of a suitable fault tolerance framework, some works have adopted a
fault injection mechanism. In [17], for example, the authors present a
framework by mapping the available Malicious and Accidental Fault Tolerance for Internet
Applications (MAFTIA) intrusion tolerance framework for dependencies, and validate
the framework by integrating the Intrusion Tolerance via Threshold Cryptography (ITTC)
mechanism in a simulated cloud environment. However, this work focuses on
intrusion tolerance in cloud computing, a fault-tolerant design approach for
defending cloud infrastructure against malicious attacks.
Considering the main features, performance and applications of the above publications,
it can be observed that the proposed models benefit from fault tolerance techniques in
different ways. However, over a long runtime, SaaS services inevitably encounter a
variety of failures, such as VM failures, network failures and logical faults. Different types
of failures are likely to occur concurrently, and there may even be causal relationships
between them. However, the works above lack an analysis of different failure scenarios and
a discussion of fault tolerance in hybrid fault situations. Moreover, fault injection for
the experimental assessment of fault tolerance architectures for cloud services still
requires appropriate models, effective design paradigms, practical implementations, and
in-depth experiments for building highly dependable cloud services.
3. Urn and ball model