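Write code that uses the KNN algorithm to predict heart disease, covering model building, tuning, and model evaluation, without using the sklearn library
Time: 2023-12-31 22:03:30 Views: 73
Sure. Below is an example Python implementation of heart disease prediction with the KNN algorithm: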
```python
import csv
import random
import math


# Load the dataset and split it into a training set and a test set
def load_dataset(filename, split_ratio):
    dataset = []
    with open(filename, 'r', newline='') as csvfile:
        lines = csv.reader(csvfile)
        for row in lines:
            if not row:
                continue  # skip blank lines
            try:
                dataset.append([float(x) for x in row])
            except ValueError:
                # Assumption: a row that is not fully numeric is a header
                # line (or has missing values), so it is skipped
                continue
    # Randomly move samples into the training set; the rest form the test set
    train_size = int(len(dataset) * split_ratio)
    train_set = []
    test_set = list(dataset)
    while len(train_set) < train_size:
        index = random.randrange(len(test_set))
        train_set.append(test_set.pop(index))
    return train_set, test_set


# Compute the Euclidean distance between two samples over the first
# `length` columns (the features)
def euclidean_distance(instance1, instance2, length):
    distance = 0
    for i in range(length):
        distance += pow((instance1[i] - instance2[i]), 2)
    return math.sqrt(distance)


# Return the k training samples closest to the test sample
def get_neighbors(train_set, test_instance, k):
    distances = []
    length = len(test_instance) - 1  # the last column is the label
    for i in range(len(train_set)):
        dist = euclidean_distance(test_instance, train_set[i], length)
        distances.append((train_set[i], dist))
    distances.sort(key=lambda x: x[1])
    neighbors = []
    # min() guards against k being larger than the training set
    for i in range(min(k, len(distances))):
        neighbors.append(distances[i][0])
    return neighbors


# Predict the label by majority vote among the k nearest neighbors
def predict_class(train_set, test_instance, k):
    neighbors = get_neighbors(train_set, test_instance, k)
    labels = [neighbor[-1] for neighbor in neighbors]
    return max(set(labels), key=labels.count)


# Compute the accuracy as a percentage
def get_accuracy(test_set, predictions):
    if not test_set:
        return 0.0  # avoid dividing by zero on an empty test set
    correct = 0
    for i in range(len(test_set)):
        if test_set[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(test_set))) * 100.0


# Main function
def main():
    # Load the dataset
    filename = 'heart.csv'
    split_ratio = 0.67
    train_set, test_set = load_dataset(filename, split_ratio)
    print('Train set: ', len(train_set))
    print('Test set: ', len(test_set))
    # Predict the label of every test sample
    k = 5
    predictions = []
    for i in range(len(test_set)):
        result = predict_class(train_set, test_set[i], k)
        predictions.append(result)
        print('> predicted=' + str(result) + ', actual=' + str(test_set[i][-1]))
    # Compute the accuracy
    accuracy = get_accuracy(test_set, predictions)
    print('Accuracy: ', accuracy)


if __name__ == '__main__':
    main()
```
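The listing above fixes `k = 5`, while the question also asks for tuning. One common way to choose `k` without sklearn is k-fold cross-validation on the training rows only. The following is a minimal sketch under stated assumptions, not part of the original answer: the helper names `knn_predict`, `cross_val_accuracy`, and `choose_k`, the candidate values of `k`, and the number of folds are all illustrative. It assumes each row is a list of floats with the label in the last column, as in the listing above, and that there are at least as many rows as folds.

```python
import math
import random
from collections import Counter


def knn_predict(train_rows, features, k):
    """Majority vote among the k training rows closest to `features`."""
    # math.dist (Python 3.8+) computes the Euclidean distance
    nearest = sorted(train_rows, key=lambda row: math.dist(row[:-1], features))[:k]
    votes = Counter(row[-1] for row in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(rows, k, n_folds=5, seed=0):
    """Mean accuracy (between 0 and 1) of k-NN over n_folds folds."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps runs comparable
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the held-out one
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct = sum(knn_predict(training, row[:-1], k) == row[-1] for row in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def choose_k(rows, candidates=(1, 3, 5, 7, 9, 11)):
    """Return the candidate k with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cross_val_accuracy(rows, k))
```

In `main()`, this could be used as `k = choose_k(train_set)` in place of `k = 5`. Passing only `train_set` keeps the test set untouched until the final evaluation, so the reported accuracy is not biased by the choice of `k`. Odd candidate values reduce the chance of tied votes when there are two classes.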
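The script above loads the heart disease dataset from the UCI Machine Learning Repository from a local file (`heart.csv`) and randomly splits it into a training set (67% of the rows) and a test set. It then uses KNN with k = 5 to predict the label of each sample in the test set and reports accuracy as the model's performance metric. The code assumes that every column is numeric and that the label is the last column.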
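Accuracy alone can hide how a classifier treats each class, which matters for a medical task where a missed case and a false alarm have different costs. The following is a minimal sketch of a confusion matrix with precision, recall, and F1 for binary labels. It is not part of the original answer: the function names are illustrative, and treating the label `1.0` as "disease present" is an assumption about how the dataset encodes its target column.

```python
def confusion_counts(actual, predicted, positive=1.0):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # correctly predicted positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # correctly predicted negative
    return tp, fp, fn, tn


def binary_metrics(actual, predicted, positive=1.0):
    """Return accuracy, precision, recall, and F1 as fractions in [0, 1]."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    # Each ratio falls back to 0.0 when its denominator is zero
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {'accuracy': accuracy, 'precision': precision,
            'recall': recall, 'f1': f1}
```

With the script above, the inputs would be `actual = [row[-1] for row in test_set]` and the `predictions` list built in `main()`. Recall answers "what fraction of true positive cases did the model find", and precision answers "what fraction of positive predictions were right".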