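Read the file "ramen-rating.csv" and preprocess the data; count the number of ramen brands in each country and plot a histogram of the top 10 countries; find the most popular ramen brand and packaging type in each country; compute the mean stars for each brand in each country; produce a few cross-tabulations, with rows and columns of your own choice.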

1. Data preprocessing:

```python
import pandas as pd

# Read the data
df = pd.read_csv("ramen-rating.csv")

# Drop columns that are not needed for the analysis
df = df.drop(columns=['Review #', 'Top Ten'], errors='ignore')

# Build the Packaging column from Style; any style not matched below is kept as-is
df['Packaging'] = df['Style'].str.strip()
for label in ['Cup', 'Bowl', 'Box', 'Tray']:
    # na=False keeps missing Style values from breaking the boolean mask
    mask = df['Style'].str.contains(label, case=False, na=False)
    df.loc[mask, 'Packaging'] = label

# Normalise brand names
brand_names = {
    'nissin': 'Nissin',
    'maruchan': 'Maruchan',
    'samyang': 'Samyang',
    'sapporo': 'Sapporo Ichiban',
}
for pattern, name in brand_names.items():
    mask = df['Brand'].str.contains(pattern, case=False, na=False)
    df.loc[mask, 'Brand'] = name

# Normalise country names
country_names = {
    'usa': 'United States',
    'south korea': 'South Korea',
    'hong kong': 'Hong Kong',
    'taiwan': 'Taiwan',
    'singapore': 'Singapore',
    'japan': 'Japan',
    'thailand': 'Thailand',
    'china': 'China',
    'malaysia': 'Malaysia',
    'indonesia': 'Indonesia',
}
for pattern, name in country_names.items():
    mask = df['Country'].str.contains(pattern, case=False, na=False)
    df.loc[mask, 'Country'] = name

# Convert Stars to numbers; 'Unrated' becomes NaN, so mean() skips it
# instead of counting it as a 0-star review
df['Stars'] = pd.to_numeric(df['Stars'], errors='coerce')

# Save the cleaned data
df.to_csv("ramen_cleaned.csv", index=False)
```

2. Count the number of ramen brands in each country and plot a bar chart of the top 10 countries:

```python
import matplotlib.pyplot as plt

# Count the distinct brands per country
country_counts = df.groupby('Country')['Brand'].nunique().sort_values(ascending=False)

# Plot a bar chart of the top 10 countries
top10 = country_counts.head(10)
plt.bar(top10.index, top10.values)
plt.xticks(rotation=45, ha='right')
plt.xlabel('Country')
plt.ylabel('Number of brands')
plt.title('Top 10 countries with most ramen brands')
plt.tight_layout()
plt.show()
```

3. Find the most popular ramen brand and packaging type in each country ("most popular" here means the value that appears most often in the reviews):

```python
def most_common(values):
    """Return the most frequent value, or None if the group has no values."""
    counts = values.value_counts()
    return counts.index[0] if not counts.empty else None

# Find the most popular brand and packaging type in each country
popular_brand = df.groupby('Country')['Brand'].agg(most_common)
popular_packaging = df.groupby('Country')['Packaging'].agg(most_common)

# Print the results
print("Most popular brand by country:\n", popular_brand)
print("\nMost popular packaging by country:\n", popular_packaging)
```

4. Compute the mean stars for each brand in each country:

```python
# Mean stars for each (country, brand) pair
country_brand_stars = df.groupby(['Country', 'Brand'])['Stars'].mean()

# Print the result
print(country_brand_stars)
```

5. Cross-tabulations:

```python
# Number of reviews per brand for each packaging type
packaging_brand_count = pd.crosstab(df['Packaging'], df['Brand'])
print("Packaging vs Brand:\n", packaging_brand_count)

# Number of reviews per packaging type for each country
country_packaging_count = pd.crosstab(df['Country'], df['Packaging'])
print("\nCountry vs Packaging:\n", country_packaging_count)

# Number of reviews per brand for each star rating
stars_brand_count = pd.crosstab(df['Stars'], df['Brand'])
print("\nStars vs Brand:\n", stars_brand_count)
```
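Since the task leaves the choice of rows and columns open, two other forms of `pd.crosstab` may also be useful: row-normalised shares and a table that holds a mean rating rather than a count. The sketch below runs on a small made-up table (the `toy` DataFrame and its values are hypothetical, not taken from the ramen file), so it only illustrates the calls and does not report real results.

```python
import pandas as pd

# Hypothetical toy data standing in for the cleaned ramen table
toy = pd.DataFrame({
    "Country": ["Japan", "Japan", "Japan", "USA", "USA", "USA"],
    "Style": ["Cup", "Pack", "Pack", "Cup", "Cup", "Pack"],
    "Stars": [4.0, 5.0, 3.0, 2.0, 3.0, 4.0],
})

# Row-normalised crosstab: share of each style within a country
style_share = pd.crosstab(toy["Country"], toy["Style"], normalize="index")

# Crosstab whose cells hold the mean rating instead of a count
mean_stars = pd.crosstab(
    toy["Country"], toy["Style"], values=toy["Stars"], aggfunc="mean"
)

print(style_share.round(2))
print(mean_stars)
```

With the real data, the same two calls could be applied to `df['Country']`, `df['Packaging']` and `df['Stars']`; normalising by row makes countries with very different review counts easier to compare.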
