User-Based Collaborative Filtering Recommendation on the MovieLens Dataset: Python Implementation
Date: 2023-08-28 17:03:46 Views: 172
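Here is a Python implementation of collaborative filtering for movie recommendation on MovieLens. It uses NumPy, pandas, and scikit-learn to keep the code short. Note that the code builds a movie-to-movie similarity matrix, so in practice it is the item-based variant rather than the user-based one: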
```python
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Load the MovieLens ratings and movie metadata
ratings = pd.read_csv('ratings.csv')
movies = pd.read_csv('movies.csv')

# Merge ratings with movie metadata
data = pd.merge(ratings, movies, on='movieId')

# Build the rating matrix: rows are users, columns are movie titles
rating_matrix = data.pivot_table(index='userId', columns='title', values='rating')

# Treat missing ratings as 0 (a simplification)
rating_matrix = rating_matrix.fillna(0)

# Movie-to-movie cosine similarity; row/column i corresponds to rating_matrix.columns[i]
movie_similarity = cosine_similarity(rating_matrix.T)


def recommend_movies(user_ratings, n):
    """Return the positions and scores of the n highest-scoring movies.

    user_ratings is a list of (column position in rating_matrix, rating) tuples.
    """
    # Sum the similarity row of each rated movie, weighted by its rating
    scores = np.zeros(movie_similarity.shape[0])
    for movie_idx, rating in user_ratings:
        scores += movie_similarity[movie_idx] * rating
    # Positions of the n largest scores, in descending order
    movie_indices = np.argsort(scores)[::-1][:n]
    movie_scores = scores[movie_indices]
    return movie_indices, movie_scores


# Example: 10 highest-scoring movies for a sample set of (position, rating) pairs
user_ratings = [(0, 5), (10, 4), (20, 3), (30, 2), (40, 1)]
movie_indices, movie_scores = recommend_movies(user_ratings, 10)

# Positions refer to rating_matrix columns (titles), not to rows of movies.csv,
# so look the movies up by title instead of using movies.iloc
titles = rating_matrix.columns[movie_indices]
recommended_movies = (
    movies.drop_duplicates('title')
    .set_index('title')
    .loc[titles, ['genres']]
    .reset_index()
)
recommended_movies['score'] = movie_scores
print(recommended_movies)
```
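The code above implements collaborative filtering in Python. It measures cosine similarity between movies; despite the "user-based" label, it does not compute Pearson correlation or any user-to-user similarity. The `recommend_movies` function takes a list of (movie index, rating) tuples and the number of recommendations N, and returns the indices and scores of the recommended movies.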
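For the user-based variant named in the title, the following is a minimal sketch that uses Pearson correlation between users. It is an illustration under stated assumptions, not the original author's method: the toy ratings, the helper name `recommend_for_user`, and the parameters `k` (number of neighbors) and `min_periods=2` are all hypothetical choices made for this example.

```python
import pandas as pd

# Hypothetical toy data in the same long format as ratings.csv
toy_ratings = pd.DataFrame({
    'userId':  [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4],
    'movieId': [10, 20, 30, 10, 20, 40, 50, 10, 20, 30, 40, 10, 20, 30, 50],
    'rating':  [5.0, 4.0, 1.0, 5.0, 4.0, 2.0, 5.0, 1.0, 2.0, 5.0, 5.0,
                4.0, 5.0, 2.0, 4.0],
})

# Rows are users, columns are movies; unrated cells stay NaN
user_item = toy_ratings.pivot_table(index='userId', columns='movieId', values='rating')

# Pearson correlation between users, computed over co-rated movies only.
# min_periods=2 (an assumption) leaves pairs with fewer shared ratings as NaN.
user_similarity = user_item.T.corr(method='pearson', min_periods=2)


def recommend_for_user(user_id, n=5, k=2):
    """Predict ratings for movies user_id has not rated (hypothetical helper)."""
    # Similarity of every other user to user_id, skipping undefined values
    sims = user_similarity[user_id].drop(user_id).dropna()
    # Keep the k most similar users with positive correlation
    neighbors = sims[sims > 0].nlargest(k)
    if neighbors.empty:
        return pd.Series(dtype=float)

    neighbor_ratings = user_item.loc[neighbors.index]
    # Similarity-weighted average, counting only neighbors who rated each movie
    weighted_sum = neighbor_ratings.fillna(0).mul(neighbors, axis=0).sum()
    weight_total = neighbor_ratings.notna().astype(float).mul(neighbors, axis=0).sum()
    predicted = weighted_sum / weight_total

    # Recommend only movies the target user has not rated yet
    unseen = user_item.loc[user_id].isna()
    return predicted[unseen].dropna().nlargest(n)


print(recommend_for_user(1))
```

On this toy data, user 1 has two positively correlated neighbors (users 2 and 4), and movie 50 ranks above movie 40 among the unseen movies. To run the sketch on the real dataset, `toy_ratings` could be replaced with the `ratings` DataFrame loaded from `ratings.csv`, although the dense user-to-user correlation matrix may become expensive for the larger MovieLens releases.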