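1. Use Python to reassign the garbled user column in a CSV file: replace the existing user IDs with integers starting from 1. Do not remove duplicates; identical IDs must map to the same number.
2. Unify the ordering of the user ID column across the two algorithms, then compare the two algorithms' recommendations for the same song, ideally shown as a chart.
3. Wrap the SVD computation from scipy in a standalone module, placed in its own .py file, so that it can be imported and called.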
Posted: 2023-05-27 22:07:16
1.
```python
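import csv


def fix_user_column(csv_file, has_header=False):
    """Replace the values in the first (user) column with integers starting at 1."""
    user_dict = {}  # original user id -> new numeric id
    new_user_id = 1
    with open(csv_file, 'r', encoding='utf-8', newline='') as file:
        rows = list(csv.reader(file))
    # Leave the header row untouched when the file has one.
    start = 1 if has_header else 0
    for row in rows[start:]:
        if not row:  # skip blank lines
            continue
        user = row[0]
        # The same original id always gets the same number; duplicate rows are kept.
        if user not in user_dict:
            user_dict[user] = new_user_id
            new_user_id += 1
        row[0] = user_dict[user]
    # Overwrite the original file with the renumbered rows.
    with open(csv_file, 'w', encoding='utf-8', newline='') as file:
        csv.writer(file).writerows(rows)


fix_user_column('data.csv')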
```
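The same renumbering can also be sketched with pandas. This is only an illustration: the column name `user_id`, the helper name `renumber_users`, and the sample data are assumptions, not part of the original question.

```python
import pandas as pd


def renumber_users(df, column="user_id"):
    """Return a copy of df with `column` replaced by integers starting at 1.

    pd.factorize assigns codes in order of first appearance, so identical
    ids receive the same number and no rows are dropped.
    Note: missing values get code -1 and would therefore become 0 here.
    """
    codes, _ = pd.factorize(df[column])
    out = df.copy()
    out[column] = codes + 1
    return out


# Hypothetical sample with garbled-looking ids and one repeated user.
sample = pd.DataFrame(
    {"user_id": ["a?x", "b#y", "a?x", "c!z"], "song_id": [10, 11, 12, 10]}
)
result = renumber_users(sample)
print(result)
```

For a real file, the frame would come from `pd.read_csv(...)` and be written back with `to_csv(..., index=False)`.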
2.
```python
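import pandas as pd
import matplotlib.pyplot as plt


def compare_recommendations(algorithm1_file, algorithm2_file, song_id):
    # Each file is expected to have user_id and song_id columns plus its own
    # score column (algorithm1_score / algorithm2_score).
    df1 = pd.read_csv(algorithm1_file)
    df2 = pd.read_csv(algorithm2_file)
    # Compare as strings so that '1234' also matches an integer song_id column.
    df1 = df1[df1['song_id'].astype(str) == str(song_id)]
    df2 = df2[df2['song_id'].astype(str) == str(song_id)]
    # Keep only users present in both results, in the same user_id order.
    merged_df = pd.merge(df1, df2, on='user_id').sort_values('user_id')
    merged_df = merged_df[['user_id', 'algorithm1_score', 'algorithm2_score']]
    # Grouped bar chart: one pair of bars per user.
    merged_df.plot(kind='bar', x='user_id')
    plt.show()


compare_recommendations('algorithm1.csv', 'algorithm2.csv', '1234')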
```
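Before comparing, both result files need to use the same numbering for the same user. One possible way, sketched here under assumptions, is to build a single mapping from the raw ids of both files together. The function name `unify_user_ids`, the column names, and the sample data are hypothetical.

```python
import pandas as pd


def unify_user_ids(df1, df2, column="user_id"):
    """Map the raw ids of both frames onto one shared 1-based numbering."""
    # Collect the ids of both frames so the same raw id gets the same number.
    all_ids = pd.concat([df1[column], df2[column]], ignore_index=True)
    _, uniques = pd.factorize(all_ids)
    mapping = {raw: i + 1 for i, raw in enumerate(uniques)}
    # Apply the shared mapping, then sort both frames the same way.
    out1 = df1.assign(**{column: df1[column].map(mapping)})
    out2 = df2.assign(**{column: df2[column].map(mapping)})
    out1 = out1.sort_values(column).reset_index(drop=True)
    out2 = out2.sort_values(column).reset_index(drop=True)
    return out1, out2, mapping


# Hypothetical outputs of the two algorithms for one song.
alg1 = pd.DataFrame({"user_id": ["u_b", "u_a"], "algorithm1_score": [0.9, 0.4]})
alg2 = pd.DataFrame({"user_id": ["u_a", "u_c"], "algorithm2_score": [0.5, 0.7]})
a1, a2, id_map = unify_user_ids(alg1, alg2)
print(id_map)
print(a1)
print(a2)
```

Numbers are assigned in order of first appearance across the two frames, so the mapping depends on which file is passed first.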
3.
```python
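from scipy.sparse.linalg import svds


def svd_factorization(matrix, k):
    # Truncated SVD keeping k singular values; k must be smaller than
    # min(matrix.shape). The singular values returned by svds are not
    # guaranteed to be in descending order.
    u, s, vt = svds(matrix, k=k)
    return u, s, vt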
```
This function can be imported and called from other scripts, provided the `scipy` package is installed.
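A minimal sketch of what the standalone module could look like, together with a quick check of its output. The file name `svd_module.py`, the descending sort, and the sample rating matrix are assumptions added for illustration; they are not part of the original answer.

```python
# svd_module.py (hypothetical file name)
import numpy as np
from scipy.sparse.linalg import svds


def svd_factorization(matrix, k):
    """Truncated SVD via scipy; returns (u, s, vt) with s in descending order.

    k must satisfy 1 <= k < min(matrix.shape).
    """
    u, s, vt = svds(matrix, k=k)
    # Reorder the factors so the largest singular value comes first.
    order = np.argsort(s)[::-1]
    return u[:, order], s[order], vt[order, :]


if __name__ == "__main__":
    # Made-up user-by-song rating matrix, used only to exercise the function.
    ratings = np.array(
        [
            [5, 3, 0, 1],
            [4, 0, 0, 1],
            [1, 1, 0, 5],
            [1, 0, 0, 4],
            [0, 1, 5, 4],
        ],
        dtype=float,
    )
    u, s, vt = svd_factorization(ratings, k=2)
    # Rank-2 approximation of the original matrix.
    approx = u @ np.diag(s) @ vt
    print(s)
    print(np.round(approx, 2))
```

With the file saved under that name, another script in the same directory could use `from svd_module import svd_factorization`.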