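Randomly generate 20 data samples, each with 5 attributes whose values lie in the range 1-100, normalize the samples with the min-max and zero-mean methods, and output the normalized results.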
Date: 2024-02-03 21:12:05
The following Python code implements this:
```python
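import random
import statistics

# Generate 20 random samples with 5 attributes each, values in [1, 100]
dataset = []
for i in range(20):
    data = [random.randint(1, 100) for j in range(5)]
    dataset.append(data)
print("Original samples:")
for data in dataset:
    print(data)

# Min-max normalization: x' = (x - min) / (max - min), per attribute.
# Results go into a copy so the original samples stay intact for the next step.
minmax = [row[:] for row in dataset]
for i in range(5):
    column = [data[i] for data in dataset]
    max_val, min_val = max(column), min(column)
    span = (max_val - min_val) or 1  # guard against a constant column
    for j in range(20):
        minmax[j][i] = (dataset[j][i] - min_val) / span
print("Min-max normalization results:")
for data in minmax:
    print(data)

# Zero-mean (z-score) normalization: x' = (x - mean) / std, per attribute,
# computed from the original samples rather than from the min-max output.
zscore = [row[:] for row in dataset]
for i in range(5):
    column = [data[i] for data in dataset]
    avg_val = sum(column) / 20
    std_val = statistics.pstdev(column) or 1  # population std; guard against 0
    for j in range(20):
        zscore[j][i] = (dataset[j][i] - avg_val) / std_val
print("Zero-mean normalization results:")
for data in zscore:
    print(data)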
```
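Because the samples are random, every run prints different numbers. If a repeatable run is wanted, one option is to draw the samples from a seeded generator. The following is a minimal sketch; the helper name `make_samples`, its parameters, and the seed value 42 are illustrative assumptions, not part of the original answer.

```python
import random

def make_samples(n_samples=20, n_attributes=5, low=1, high=100, seed=None):
    """Return n_samples rows of n_attributes random integers in [low, high]."""
    # A private generator keeps the global random state untouched.
    rng = random.Random(seed)
    return [[rng.randint(low, high) for _ in range(n_attributes)]
            for _ in range(n_samples)]

first = make_samples(seed=42)   # 42 is an arbitrary illustrative seed
second = make_samples(seed=42)
print(first == second)            # True: same seed, same samples
print(len(first), len(first[0]))  # 20 5
```

Passing `seed=None` (the default) falls back to a fresh, unpredictable sequence, which matches the behavior of the script.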
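The normalization script prints output of the following form (the samples are random, so the numbers differ on every run):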
```
Original samples:
[25, 2, 42, 38, 35]
[17, 86, 54, 65, 68]
[38, 7, 84, 43, 69]
[39, 49, 54, 55, 41]
[54, 6, 54, 56, 34]
[71, 56, 38, 64, 17]
[13, 44, 38, 31, 67]
[69, 16, 55, 100, 4]
[74, 90, 39, 43, 39]
[39, 52, 8, 25, 79]
[24, 20, 79, 50, 60]
[12, 88, 45, 31, 77]
[3, 68, 24, 70, 36]
[72, 57, 80, 73, 6]
[29, 43, 67, 90, 64]
[87, 81, 60, 37, 66]
[39, 21, 60, 17, 99]
[67, 3, 49, 9, 78]
[40, 91, 87, 6, 48]
[29, 86, 95, 83, 89]
Min-max normalization results:
[0.22826086956521738, 0.0, 0.38636363636363635, 0.32432432432432434, 0.30864197530864196]
[0.11956521739130435, 0.9787234042553191, 0.5454545454545454, 0.6216216216216216, 0.7160493827160493]
[0.3804347826086957, 0.0425531914893617, 0.8409090909090909, 0.40540540540540543, 0.7283950617283951]
[0.391304347826087, 0.46808510638297873, 0.5454545454545454, 0.5675675675675675, 0.38271604938271603]
[0.5760869565217391, 0.031914893617021274, 0.5454545454545454, 0.581081081081081, 0.24691358024691357]
[0.8043478260869565, 0.5531914893617021, 0.3522727272727273, 0.5945945945945945, 0.0]
[0.07608695652173914, 0.3617021276595745, 0.3522727272727273, 0.22972972972972974, 0.7037037037037037]
[0.782608695652174, 0.10638297872340426, 0.5568181818181818, 1.0, 0.0]
[0.8369565217391305, 1.0, 0.36363636363636365, 0.40540540540540543, 0.345679012345679]
[0.391304347826087, 0.5106382978723404, 0.0, 0.16216216216216217, 0.8641975308641974]
[0.21739130434782608, 0.19148936170212766, 0.8181818181818182, 0.5675675675675675, 0.6296296296296295]
[0.06521739130434782, 0.9574468085106384, 0.4772727272727273, 0.22972972972972974, 0.8395061728395061]
[0.0, 0.7021276595744681, 0.22727272727272727, 0.7297297297297296, 0.2716049382716049]
[0.8152173913043478, 0.5638297872340425, 0.8863636363636364, 0.7567567567567567, 0.031746031746031744]
[0.29347826086956524, 0.425531914893617, 0.7045454545454546, 0.918918918918919, 0.7037037037037037]
[1.0, 0.8936170212765957, 0.5909090909090909, 0.2972972972972973, 0.691358024691358]
[0.391304347826087, 0.20212765957446807, 0.5909090909090909, 0.0, 1.0]
[0.7608695652173914, 0.010638297872340425, 0.45454545454545453, 0.05405405405405405, 0.8518518518518517]
[0.29347826086956524, 0.9787234042553191, 1.0, 0.8648648648648649, 0.9259259259259258]
Zero-mean normalization results:
[-0.19999999999999996, -0.95, 1.05, 0.55, 0.40000000000000013]
[-0.5, 0.6999999999999998, 0.10000000000000009, 1.0500000000000003, 0.75]
[0.25, -0.9000000000000001, 1.3000000000000003, 0.20000000000000018, 0.8000000000000002]
[0.30000000000000004, 0.1, 0.10000000000000009, 0.30000000000000004, 0.10000000000000009]
[0.9499999999999998, -0.9500000000000001, 0.10000000000000009, 0.35000000000000003, -0.15000000000000002]
[1.7000000000000002, 0.30000000000000004, -0.19999999999999996, 1.0000000000000002, -0.9500000000000001]
[-0.65, -0.05000000000000002, -0.19999999999999996, -0.5500000000000002, 0.75]
[1.5, -0.7000000000000001, 0.15000000000000002, 2.1500000000000004, -0.9500000000000001]
[1.75, 1.1999999999999997, -0.15000000000000002, 0.20000000000000018, -0.10000000000000009]
[0.30000000000000004, 0.050000000000000044, -1.1500000000000001, -0.9500000000000001, 1.2000000000000002]
[-0.3999999999999999, -0.45, 0.7000000000000002, 0.30000000000000004, 0.45000000000000007]
[-0.85, 0.8, -0.04999999999999993, -0.5500000000000002, 1.1500000000000001]
[-1.3, 0.30000000000000004, -1.0500000000000003, 0.7500000000000002, -0.20000000000000018]
[1.8000000000000003, 0.34999999999999987, 1.2500000000000002, 0.8000000000000003, -1.1500000000000001]
[-0.19999999999999996, -0.15000000000000002, 0.44999999999999996, 1.4500000000000002, 0.7000000000000002]
[2.25, 1.15, 0.050000000000000044, -0.6999999999999998, 0.6500000000000001]
[0.30000000000000004, -0.9000000000000001, 0.050000000000000044, -1.1500000000000001, 1.5500000000000003]
[1.5, -0.85, -0.04999999999999993, -1.3500000000000003, 1.2000000000000002]
[0.44999999999999996, 1.2499999999999998, 1.1500000000000001, -1.1000000000000003, -0.050000000000000044]
```
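The first group of results is the min-max normalization and the second is the zero-mean normalization. Min-max normalization maps each attribute onto [0, 1], with the column minimum at 0 and the column maximum at 1. Zero-mean normalization centers each attribute at 0, so its values are both negative and positive rather than confined to [0, 1]. Either way the attributes end up on comparable scales, which simplifies later analysis and processing. The normalized values in the listing should be read only as an illustration of the output format: each group has 19 rows for 20 samples, and the numbers do not follow from the raw samples shown (for the first attribute, with minimum 3 and maximum 87, the value 25 should map to (25 - 3) / (87 - 3) ≈ 0.262, not 0.228).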
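The properties described here can be checked directly on one column. The sketch below uses the first attribute of the 20 raw samples listed above; the function names `min_max` and `z_score` are illustrative, and the z-score variant assumes the population standard deviation (`statistics.pstdev`), which is one common convention rather than the only one.

```python
import statistics

def min_max(column):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1  # a constant column maps to all zeros
    return [(x - lo) / span for x in column]

def z_score(column):
    """Shift to mean 0 and scale to population standard deviation 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column) or 1  # a constant column maps to all zeros
    return [(x - mean) / std for x in column]

# First attribute of the 20 raw samples in the listing
first_attribute = [25, 17, 38, 39, 54, 71, 13, 69, 74, 39,
                   24, 12, 3, 72, 29, 87, 39, 67, 40, 29]

scaled = min_max(first_attribute)
standardized = z_score(first_attribute)

print(min(scaled), max(scaled))                         # 0.0 1.0
print(abs(statistics.fmean(standardized)) < 1e-9)       # True: mean is 0 up to rounding
print(abs(statistics.pstdev(standardized) - 1) < 1e-9)  # True: std is 1 up to rounding
```

Working on one column at a time keeps each attribute's scale independent of the others, which is what both methods require.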