没有合适的资源?快使用搜索试试~ 我知道了~
首页NIST SP800-22测试:加密应用随机数生成器的统计评估
"NIST SP800-22rev1a.pdf" 是一份由美国国家标准与技术研究所(NIST)发布的关于随机和伪随机数生成器(RNG和PRNG)的统计测试套件,专门针对加密应用。这份文档讨论了在选择和测试这些生成器时的一些关键点,强调了在加密应用中使用生成器的安全性和不可预测性要求。它还涉及统计测试和密码分析的关系,并提供了一些建议的统计测试方法,这些测试可以作为评估生成器是否适合特定加密应用的初步步骤。
在加密应用中,随机和伪随机数生成器的重要性不言而喻,因为它们通常用于生成密钥材料等关键安全元素。根据描述,文档指出,用于此类应用的生成器需要满足比其他应用更严格的标准,尤其是其输出必须在不了解输入的情况下保持不可预测。文档中可能会讨论一些评估生成器性能和安全性的标准和准则。
NIST SP800-22测试套件包含了多个统计测试,这些测试旨在检查生成器的输出是否具有足够的随机性。虽然这些测试可以提供一定的保证,但文档也明确指出,没有任何一组统计测试能够绝对证明一个生成器在特定应用中的适用性。这意味着,除了进行统计测试外,还需要进行密码分析以确保生成器的安全性。
关键词包括“随机数生成器”、“假设检验”和“P值”,暗示文档将深入探讨如何通过统计方法来评估随机性,包括使用P值来判断生成器输出的随机性是否达到预期的显著水平。P值是统计学中衡量观测结果与零假设相符程度的一个指标,若P值低于预设的显著性水平,通常会拒绝零假设,认为数据展现出的随机性足够。
NIST SP800-22rev1a是加密领域中评估随机和伪随机数生成器质量的重要参考资料,它提供了统计测试的指导和建议,帮助确保用于加密的随机数源的可靠性与安全性。这份文档由NIST的专家团队修订并发布,反映了当前对这一领域的科学理解和实践标准。
A STATISTICAL TEST SUITE FOR RANDOM AND PSEUDORANDOM NUMBER GENERATORS FOR CRYPTOGRAPHIC APPLICATIONS
The probability of a Type I error is often called the level of significance of the test. This probability can
be set prior to a test and is denoted as α. For the test, α is the probability that the test will indicate that the
sequence is not random when it really is random. That is, a sequence appears to have non-random
properties even when a “good” generator produced the sequence. Common values of α in cryptography
are about 0.01.
The probability of a Type II error is denoted as β. For the test, β is the probability that the test will
indicate that the sequence is random when it is not; that is, a “bad” generator produced a sequence that
appears to have random properties. Unlike α, β is not a fixed value. β can take on many different values
because there are an infinite number of ways that a data stream can be non-random, and each different
way yields a different β. The calculation of the Type II error β is more difficult than the calculation of α
because of the many possible types of non-randomness.
One of the primary goals of the following tests is to minimize the probability of a Type II error, i.e., to
minimize the probability of accepting a sequence being produced by a generator as good when the
generator was actually bad. The probabilities α and β are related to each other and to the size n of the
tested sequence in such a way that if two of them are specified, the third value is automatically
determined. Practitioners usually select a sample size n and a value for α (the probability of a Type I
error – the level of significance). Then a critical point for a given statistic is selected that will produce the
smallest β (the probability of a Type II error). That is, a suitable sample size is selected along with an
acceptable probability of deciding that a bad generator has produced the sequence when it really is
random. Then the cutoff point for acceptability is chosen such that the probability of falsely accepting a
sequence as random has the smallest possible value.
Each test is based on a calculated test statistic value, which is a function of the data. If the test statistic
value is
S and the critical value is t, then the Type I error probability is P(S > t || H
o
is true) = P(reject H
o
|
H
0
is true), and the Type II error probability is P(S ≤ t || H
0
is false) = P(accept H
0
| H
0
is false). The test
statistic is used to calculate a P-value that summarizes the strength of the evidence against the null
hypothesis. For these tests, each P-value is the probability that a perfect random number generator would
have produced a sequence less random than the sequence that was tested, given the kind of non-
randomness assessed by the test. If a P-value for a test is determined to be equal to 1, then the sequence
appears to have perfect randomness. A P-value of zero indicates that the sequence appears to be
completely non-random. A significance level (
α
) can be chosen for the tests. If P-value
≥ α
, then the
null hypothesis is accepted; i.e., the sequence appears to be random. If P-value
< α
, then the null
hypothesis is rejected; i.e., the sequence appears to be non-random. The parameter
α
denotes the
probability of the Type I error. Typically,
α
is chosen in the range [0.001, 0.01].
• An α of 0.001 indicates that one would expect one sequence in 1000 sequences to be rejected by
the test if the sequence was random. For a P-value ≥ 0.001, a sequence would be considered to be
random with a confidence of 99.9%. For a P-value < 0.001, a sequence would be considered to be non
-
random with a confidence of 99.9%.
• An α of 0.01 indicates that one would expect 1 sequence in 100 sequences to be rejected. A P-
value ≥ 0.01 would mean that the sequence would be considered to be random with a confidence of 99%.
A P-value < 0.01 would mean that the conclusion was that the sequence is non-random with a confidence
of 99%.
For the examples in this document, α has been chosen to be 0.01. Note that, in many cases, the
parameters in the examples do not conform to the recommended values; the examples are for illustrative
purposes only.
1-4
€
A STATISTICAL TEST SUITE FOR RANDOM AND PSEUDORANDOM NUMBER GENERATORS FOR CRYPTOGRAPHIC APPLICATIONS
1.1.6 Considerations for Randomness, Unpredictability and Testing
The following assumptions are made with respect to random binary sequences to be tested:
1. Uniformity: At any point in the generation of a sequence of random or pseudorandom
bits, the occurrence of a zero or one is equally likely, i.e., the probability of each is
exactly 1/2. The expected number of zeros (or ones) is n/2, where n = the sequence
length.
2. Scalability: Any test applicable to a sequence can also be applied to subsequences
extracted at random. If a sequence is random, then any such extracted subsequence
should also be random. Hence, any extracted subsequence should pass any test for
randomness.
3. Consistency: The behavior of a generator must be consistent across starting values
(seeds). It is inadequate to test a PRNG based on the output from a single seed, or an
RNG on the basis of an output produced from a single physical output.
1.2 Definitions
Term
Definition
Asymptotic Analysis
A statistical technique that derives limiting approximations for functions
of interest.
Asymptotic Distribution
The limiting distribution of a test statistic arising when n approaches
infinity.
Bernoulli Random
Variable
A random variable that takes on the value of one with probability p and the
value of zero with probability 1-p.
Binary Sequence
A sequence of zeroes and ones.
Binomial Distribution
A random variable is binomially distributed if there is an integer n and a
probability p such that the random variable is the number of successes in n
independent Bernoulli experiments, where the probability of success in a
single experiment is p. In a Bernoulli experiment, there are only two
possible outcomes.
Bit String
A sequence of bits.
Block
A subset of a bit string. A block has a predetermined length.
Central Limit Theorem
For a random sample of size n from a population with mean
µ
and
variance
σ
2
, the distribution of the sample means is approximately normal
with mean
µ
and variance
σ
2
/n as the sample size increases.
Complementary Error
Function
See Erfc.
Confluent Hypergeometric
Function
The confluent hypergeometric function is defined as
Φ(a; b;z) =
Γ(b)
Γ(a)Γ(b − a)
e
zt
t
a−1
(1− t)
b− a−1
dt
0
1
∫
,0 < a < b
.
Critical Value
The value that is exceeded by the test statistic with a small probability
(significance level). A "look-up" or calculated value of a test statistic (i.e.,
a test statistic value) that, by construction, has a small probability of
occurring (e.g., 5 %) when the null hypothesis of randomness is true.
Cumulative Distribution
Function (CDF) F(x)
A function giving the probability that the random variable X is less than or
equal to x, for every value x. That is,
F(x) = P(X ≤ x).
1-5
A STATISTICAL TEST SUITE FOR RANDOM AND PSEUDORANDOM NUMBER GENERATORS FOR CRYPTOGRAPHIC APPLICATIONS
Entropy
A measure of the disorder or randomness in a closed system. The entropy
of uncertainty of a random variable X with probabilities p
i
, …, p
n
is
defined to be H(X) =
i
n
i 1
i
−
∑
p log p
=
.
Entropy Source
A physical source of information whose output either appears to be
random in itself or by applying some filtering/distillation process. This
output is used as input to either a RNG or PRNG.
Erfc
The complementary error function erfc(z) is defined in Section 5.5.3. This
function is related to the normal cdf.
Geometric Random
Variable
A random variable that takes the value k, a non-negative integer with
probability p
k
(1-p). The random variable x is the number of successes
before a failure in an infinite series of Bernoulli trials.
Global Structure/Global
Value
A structure/value that is available by all routines in the test code.
igamc
The incomplete gamma function Q(a,x) is defined in Section 5.5.3.
Incomplete Gamma
Function
See the definition for igamc.
Hypothesis (Alternative)
A statement H
a
that an analyst will consider as true (e.g., H
a
: the sequence
is non-random) if and when the null hypothesis is determined to be false.
Hypothesis (Null)
A statement H
0
about the assumed default condition/property of the
observed sequence. For the purposes of this document, the null hypothesis
H
0
is that the sequence is random. If H
0
is in fact true, then the reference
distribution and critical values of the test statistic may be derived.
Kolmogorov-Smirnov Test
A statistical test that may be used to determine if a set of data comes from
a particular probability distribution.
Level of Significance (
α
)
The probability of falsely rejecting the null hypothesis, i.e., the probability
of concluding that the null hypothesis is false when the hypothesis is, in
fact, true. The tester usually chooses this value; typical values are 0.05,
0.01 or 0.001; occasionally, smaller values such as 0.0001 are used. The
level of significance is the probability of concluding that a sequence is
non-random when it is in fact random. Synonyms: Type I error,
α
error.
Linear Dependence
In the context of the binary rank matrix test, linear dependence refers to m-
bit vectors that may be expressed as a linear combination of the linearly
independent m-bit vectors.
Maple
An interactive computer algebra system that provides a complete
mathematical environment for the manipulation and simplification of
symbolic algebraic expressions, arbitrary extended precision mathematics,
two- and three-dimensional graphics, and programming.
MATLAB
An integrated, technical computer environment that combines numeric
computation, advanced graphics and visualization, and a high level
programming language. MATLAB includes functions for data analysis and
visualization; numeric and symbolic computation; engineering and
scientific graphics; modeling, simulation and prototyping; and
programming, application development and a GUI design.
Normal (Gaussian)
Distribution
A continuous distribution whose density function is given by
2
2
1
2
2
1
);( ;
−
−
=
σ
µ
πσ
µ σ
x
exf
, where µ and σ are location and scale
parameters.
1-6
A STATISTICAL TEST SUITE FOR RANDOM AND PSEUDORANDOM NUMBER GENERATORS FOR CRYPTOGRAPHIC APPLICATIONS
P-value
The probability (under the null hypothesis of randomness) that the chosen
test statistic will assume values that are equal to or worse than the
observed test statistic value when considering the null hypothesis. The P-
value is frequently called the “tail probability.”
Poisson Distribution - §3.8
Poisson distributions model (some) discrete random variables. Typically,
a Poisson random variable is a count of the number of rare events that
occur in a certain time interval.
Probability Density
Function (PDF)
A function that provides the "local" probability distribution of a test
statistic. From a finite sample size n, a probability density function will be
approximated by a histogram.
Probability Distribution
The assignment of a probability to the possible outcomes (realizations) of
a random variable.
Pseudorandom Number
Generator (PRNG)
A deterministic algorithm which, given a truly random binary sequence of
length k, outputs a binary sequence of length l >> k which appears to be
random. The input to the generator is called the seed, while the output is
called a pseudorandom bit sequence.
Random Number
Generator (RNG)
A mechanism that purports to generate truly random data.
Random Binary Sequence
A sequence of bits for which the probability of each bit being a “0” or “1”
is ½. The value of each bit is independent of any other bit in the sequence,
i.e., each bit is unpredictable.
Random Variable
Random variables differ from the usual deterministic variables (of science
and engineering) in that random variables allow the systematic
distributional assignment of probability values to each possible outcome.
Rank (of a matrix)
Refers to the rank of a matrix in linear algebra over GF(2). Having
reduced a matrix into row-echelon form via elementary row operations, the
number of nonzero rows, if any, are counted in order to determine the
number of linearly independent rows or columns in the matrix.
Run
An uninterrupted sequence of like bits (i.e., either all zeroes or all ones).
Seed
The input to a pseudorandom number generator. Different seeds generate
different pseudorandom sequences.
SHA-1
The Secure Hash Algorithm defined in Federal Information Processing
Standard 180-1.
Standard Normal
Cumulative Distribution
Function
See the definition in Section 5.5.3. This is the normal function for mean =
0 and variance = 1.
Statistically Independent
Events
Two events are independent if the occurrence of one event does not affect
the chances of the occurrence of the other event. The mathematical
formulation of the independence of events A and B is the probability of the
occurrence of both A and B being equal to the product of the probabilities
of A and B (i.e., P(A and B) = P(A)P(B)).
Statistical Test (of a
Hypothesis)
A function of the data (binary stream) which is computed and used to
decide whether or not to reject the null hypothesis. A systematic statistical
rule whose purpose is to generate a conclusion regarding whether the
experimenter should accept or reject the null hypothesis H
o
.
Word
A predefined substring consisting of a fixed pattern/template (e.g., 010,
0110).
1-7
A STATISTICAL TEST SUITE FOR RANDOM AND PSEUDORANDOM NUMBER GENERATORS FOR CRYPTOGRAPHIC APPLICATIONS
1.3 Abbreviations
Abbreviation
Definition
ANSI
American National Standards Institute
FIPS
Federal Information Processing Standard
NIST
National Institute of Standards and Technology
RNG
Random Number Generator
SHA-1
Secure Hash Algorithm
1.4 Mathematical Symbols
In general, the following notation is used throughout this document. However, the tests in this document
have been designed and described by multiple authors who may have used slightly different notation.
The reader is advised to consider the notation used for each test separate from that notation used in other
tests.
Symbol
Meaning
x
The floor function of x; for a given real positive x,
x
= x-g, where
x
is a non-negative integer, and 0
≤
g
<
1
.
α
The significance level.
d
The normalized difference between the observed and expected number of frequency
components. See Sections 2.6 and 3.6.
∇ψ
2
m
(obs);
∇
2
ψ
2
m
(obs)
A measure of how well the observed values match the expected value. See Sections
2.11 and 3.11.
E[ ]
The expected value of a random variable.
ε
The original input string of zero and one bits to be tested.
ε
i
The i
th
bit in the original sequence
ε
.
H
0
The null hypothesis; i.e., the statement that the sequence is random.
log(x)
The natural logarithm of x: log(x) = log
e
(x) = ln(x).
log
2
(x)
Defined as
ln( 2 )
ln( x )
, where ln is the natural logarithm.
M
The number of bits in a substring (block) being tested.
N
The number of M-bit blocks to be tested.
n
The number of bits in the stream being tested.
f
n
The sum of the log
2
distances between matching L-bit templates, i.e., the sum of the
number of digits in the distance between L-bit templates. See Sections 2.9 and 3.9.
π
3.14159… unless defined otherwise for a specific test.
σ
The standard deviation of a random variable =
( )
∫
− f ( x )dxx
2
µ
.
σ
2
The variance of a random variable = (standard deviation)
2
.
s
obs
The observed value which is used as a statistic in the Frequency test.
S
n
The n
th
partial sum for values X
i
= {-1, +1}; i.e., the sum of the first n values of X
i
.
Σ
The summation symbol.
Φ
Standard Normal Cumulative Distribution Function (see Section 5.5.3).
ξ
j
The total number of times that a given state occurs in the identified cycles. See
Section 2.15 and 3.15.
X
i
The elements of the string consisting of ±1 that is to be tested for randomness, where
X
i
= 2
ε
i
-1.
1-8
剩余130页未读,继续阅读
168 浏览量
139 浏览量
130 浏览量
247 浏览量
2024-10-31 上传
2024-10-31 上传
230 浏览量
314 浏览量
157 浏览量
艾米的爸爸
- 粉丝: 800
- 资源: 314
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
最新资源
- cesium js 指北针
- PRIMA-CRM客户关系管理系统源代码
- 数据_扇形FBP_ct数据_扇形CT_giftcja_FBP
- phylopeachtree.github.io:Peachtree-在树上绘制流行病学和对齐字符
- 开课吧 vue面试题训练营
- 易语言超级列表框排序源码,易语言超级列表框排序_增加时间排序源
- Dark Patterns-crx插件
- boxy:使用Phaser 3的演示平台游戏
- staffdashboard
- Textarea Lift-off-crx插件
- TSSOS:基于矩SOS层次结构的稀疏多项式优化工具
- audio-flac:audioflac 包
- wAppbar:Windows桌面应用程序栏(appbar),基于Nim和wNim Framework
- MCQTabbedAppPOC
- Color-Identifying-Game:通过查看红色,绿色和蓝色值来识别颜色
- 易语言超级列表框指定行着色
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功