Python Learning Path: From Beginner to Expert

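Updated 2024-07-09 · PDF, 8.76 MB

Python速查表.pdf (a Python cheat sheet) is a quick-reference resource for learning and looking up the Python programming language, covering topics from basic to advanced. The PDF may cover Python environment setup, language basics, data structures, regular expressions, scientific computing libraries, data visualization, and machine learning.

Before starting to learn Python, be clear about why you are learning it and how it is used in practice. Knowing Python's wide range of uses, such as web development, data analysis, and task automation, helps sustain motivation.

Step one is to set up your learning environment. Downloading and installing Anaconda is recommended; it is a distribution that bundles Python with many commonly used data science libraries.

Step two is to master the basics of Python. You can start on an online platform such as Codecademy and learn basic syntax, data types (such as lists, tuples, and dictionaries), and list and dictionary comprehensions. Working through Python exercises helps consolidate what you have learned.

Regular expressions are a key skill in Python. You can learn them through Google's course and work through the 'babynames' exercise to get a better grasp of text-mining concepts.

At a certain stage you will need to master an important data science tool, scikit-learn, along with machine learning. Harvard's CS109 course or Andrew Ng's machine learning course are good references; complete the assignments to build your skills.

Data visualization is an indispensable part of modern data analysis. You can study the data visualization tutorial offered by Harvard and practice presenting data effectively with Python.

With continued practice you will progress from beginner to intermediate to advanced, and eventually to expert level. This requires deep understanding and fluent use of every part of Python, including but not limited to functions, object-oriented programming, exception handling, file operations, multithreading, and network programming.

Python速查表.pdf lays out a clear learning path along which you can keep improving. Remember that sustained practice and project experience are the key to improving programming skills.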

Data Exploration using Pandas: Cheat Sheet

1. Reading and Writing Data

a. Reading a CSV file

>>>import pandas as pd

>>>df=pd.read_csv('AnalyticsVidhya.csv')

b. Writing content of data frame to CSV file

>>>df.to_csv('AV.csv')

c. Reading an Excel file

>>>df=pd.read_excel('AV.xlsx',sheet_name='sheet1')

d. Writing content of data frame to Excel file

>>>df.to_excel('AV2.xlsx',sheet_name='sheet2')
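
A minimal round-trip sketch for the CSV calls above. The sample data, column names, and temporary file location are hypothetical. The `index=False` argument is not used in the original; it is added here on the assumption that the row index should not be written as an extra column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"column1": [5, 15, 25], "column2": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "AV.csv")
    # index=False keeps the row index out of the file.
    df.to_csv(path, index=False)
    # Read the same file back into a new data frame.
    df_back = pd.read_csv(path)

print(df_back)
```

Without `index=False`, reading the file back would typically produce an additional unnamed column holding the old index.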

2. Getting Preview of Dataframe

a. Looking at top n records

>>>df.head(5)

b. Looking at bottom n records

>>>df.tail(5)

c. Viewing column names

>>>df.columns
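
The preview calls above can be tried on a small data frame. This is a sketch with made-up data; the variable names `top`, `bottom`, and `names` are illustrative only.

```python
import pandas as pd

# Hypothetical ten-row data frame.
df = pd.DataFrame({"column1": range(10), "column2": list("abcdefghij")})

top = df.head(3)          # first three rows
bottom = df.tail(2)       # last two rows
names = list(df.columns)  # column labels as a plain list

print(top)
print(bottom)
print(names)
```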

3. Rename Columns of Data Frame

a. The rename method renames columns of a data frame.

>>>df2=df.rename(columns={'old_columnname':'new_columnname'})

This statement creates a new data frame with the new column name; the original data frame is unchanged.

b. To rename the column of the existing data frame, set inplace=True.

>>>df.rename(columns={'old_columnname':'new_columnname'}, inplace=True)
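
A short sketch contrasting the two forms of rename described above. The data is hypothetical, and `names_before` / `names_after` are illustrative helper variables.

```python
import pandas as pd

df = pd.DataFrame({"old_columnname": [1, 2, 3]})

# Without inplace, rename returns a new data frame and leaves df untouched.
df2 = df.rename(columns={"old_columnname": "new_columnname"})
names_before = list(df.columns)

# With inplace=True, df itself is modified.
df.rename(columns={"old_columnname": "new_columnname"}, inplace=True)
names_after = list(df.columns)

print(names_before, list(df2.columns), names_after)
```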

4. Selecting Columns or Rows

a. Accessing sub data frames

>>>df[['column1','column2']]

b. Filtering Records

>>>df[ df['column1']>10]

>>>df[ (df['column1']>10) & (df['column2']==30)]

>>>df[ (df['column1']>10) | (df['column2']==30)]
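
Each condition in a combined filter needs its own parentheses, because Python's `&` and `|` operators bind more tightly than comparisons such as `==` and `>`. The sketch below uses hypothetical data to show both combinations.

```python
import pandas as pd

df = pd.DataFrame({
    "column1": [5, 15, 25, 8, 3],
    "column2": [30, 30, 10, 30, 7],
})

# Rows where both conditions hold.
both = df[(df["column1"] > 10) & (df["column2"] == 30)]

# Rows where at least one condition holds.
either = df[(df["column1"] > 10) | (df["column2"] == 30)]

print(both)
print(either)
```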

5. Handling Missing Values

Missing values are an inevitable part of dealing with data. To handle them, use the dropna or fillna function. Both return a new object and leave the original unchanged unless inplace=True is set.

a. dropna: It is used to drop rows or columns having missing data

>>>df2.dropna() #It drops every row that contains a missing value

b. fillna: It is used to fill missing values

>>>df2.fillna(value=5) #It replaces all missing values with 5

>>>mean=df2['column1'].mean()

>>>df2['column1'].fillna(mean) #It replaces all missing values of column1 with mean of available values
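
A small sketch of dropna and fillna on made-up data. The column `column3` and the `axis=1` call are additions for illustration and do not appear in the original.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "column1": [1.0, np.nan, 3.0],
    "column2": [np.nan, 5.0, 6.0],
    "column3": [7.0, 8.0, 9.0],
})

dropped_rows = df2.dropna()        # keep only rows with no missing values
dropped_cols = df2.dropna(axis=1)  # keep only columns with no missing values

filled_const = df2.fillna(value=5)         # every missing value becomes 5
mean = df2["column1"].mean()               # mean() skips missing values
filled_mean = df2["column1"].fillna(mean)  # fill column1 with its own mean

print(dropped_rows)
print(dropped_cols)
print(filled_const)
print(filled_mean)
```

None of these calls modifies `df2`; each returns a new object.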

6. Creating New Columns

A new column can be computed as a function of existing columns.

>>>df['NewColumn1']=df['column2'] #Create a copy of existing column2

>>>df['NewColumn2']=df['column2']+10 #Add 10 to existing column2 then create a new one

>>>df['NewColumn3']= df['column1'] + df['column2'] #Add elements of column1 and column2 then create new column
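
The three assignments above, run on hypothetical data. Arithmetic on columns is applied element by element, row by row.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2, 3], "column2": [10, 20, 30]})

df["NewColumn1"] = df["column2"]                  # copy of column2
df["NewColumn2"] = df["column2"] + 10             # column2 plus a constant
df["NewColumn3"] = df["column1"] + df["column2"]  # row-wise sum of two columns

print(df)
```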

7. Aggregate

a. Groupby: Groupby helps to perform three operations

i. Splitting the data into groups

ii. Applying a function to each group individually

iii. Combining the result into a data structure

>>>df.groupby('column1').sum()

>>>df.groupby(['column1','column2']).count()

b. Pivot Table: It summarizes data into a table. It has three components: index, columns and values (similar to Excel)

>>>pd.pivot_table(df, values='column1', index=['column2','column3'], columns=['column4'])

By default, it shows the mean of the values column, but you can change this using the aggfunc argument

>>>pd.pivot_table(df, values='column1', index=['column2','column3'], columns=['column4'], aggfunc=len) #it shows count

c. Cross Tab: Cross Tab computes the simple cross tabulation of two factors.

>>>pd.crosstab(df.column1, df.column2)
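
A sketch of the three aggregation tools on a small, made-up sales table. The column names `region`, `product`, and `sales` are hypothetical. Selecting the `sales` column before summing is a choice made here so that only the numeric column is aggregated.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["N", "N", "S", "S", "S"],
    "product": ["x", "y", "x", "x", "y"],
    "sales": [10, 20, 30, 40, 50],
})

# Groupby: split by region, sum sales in each group, combine into a Series.
totals = df.groupby("region")["sales"].sum()

# Pivot table: with no aggfunc given, cells hold the mean of the values column.
mean_table = pd.pivot_table(df, values="sales", index="region", columns="product")

# Same layout, but each cell counts the rows that fall into it.
count_table = pd.pivot_table(
    df, values="sales", index="region", columns="product", aggfunc=len
)

# Cross tab: frequency of each (region, product) pair.
counts = pd.crosstab(df["region"], df["product"])

print(totals)
print(mean_table)
print(count_table)
print(counts)
```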

8. Merging/ Concatenating DataFrames

These functions perform operations similar to those in SQL.

a. Concatenating: It concatenates two or more data frames. By default, rows are stacked and columns are aligned by name.

>>>pd.concat([df1,df2])

b. Merging: We can perform inner, left, right and outer joins.

>>>pd.merge(df1, df2, on='column1', how='inner')

>>>pd.merge(df1, df2, on='column1', how='left')

>>>pd.merge(df1, df2, on='column1', how='right')

>>>pd.merge(df1, df2, on='column1', how='outer')
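
A sketch comparing concatenation with the four join types on hypothetical data. The columns `left_val` and `right_val` and the `ignore_index=True` argument are additions for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"column1": [1, 2, 3], "left_val": ["a", "b", "c"]})
df2 = pd.DataFrame({"column1": [2, 3, 4], "right_val": ["x", "y", "z"]})

# Stack rows; columns missing from one frame are filled with NaN.
stacked = pd.concat([df1, df2], ignore_index=True)

inner = pd.merge(df1, df2, on="column1", how="inner")  # keys present in both
left = pd.merge(df1, df2, on="column1", how="left")    # all keys from df1
right = pd.merge(df1, df2, on="column1", how="right")  # all keys from df2
outer = pd.merge(df1, df2, on="column1", how="outer")  # keys from either

for name, frame in [("inner", inner), ("left", left),
                    ("right", right), ("outer", outer)]:
    print(name, frame["column1"].tolist())
```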

9. Applying function to element, column or dataframe

a. Map: It applies a function to each element of a series.

>>>df['column1'].map(lambda x: 10+x) #this will add 10 to each element of column1

>>>df['column2'].map(lambda x: 'AV'+x) #this will concatenate "AV" at the beginning of each element of column2 (column format is string)

b. Apply: As the name suggests, applies a function along any axis of the DataFrame.

>>>df[['column1','column2']].apply(sum) #it returns the sum of the values of column1 and of column2, one result per column

c. ApplyMap: This helps to apply a function to each element of dataframe.

>>>func = lambda x: x+2

>>>df.applymap(func) #it will add 2 to each element of dataframe (all columns of dataframe must be numeric type); in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map
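
A sketch of the three levels of function application on hypothetical numeric data. The `axis=1` call and the `elementwise` helper are additions for illustration. The helper picks `DataFrame.map` when it exists and falls back to `applymap` otherwise, on the assumption that the installed pandas version provides at least one of them.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2, 3], "column2": [10, 20, 30]})

# map: one element of a Series at a time.
plus_ten = df["column1"].map(lambda x: 10 + x)

# apply: one column at a time by default, one row at a time with axis=1.
column_sums = df[["column1", "column2"]].apply(sum)
row_sums = df.apply(sum, axis=1)

# Element-wise over the whole data frame.
func = lambda x: x + 2
elementwise = df.map if hasattr(df, "map") else df.applymap
plus_two = elementwise(func)

print(plus_ten.tolist())
print(column_sums)
print(row_sums.tolist())
print(plus_two)
```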

10. Identify unique values

The unique function returns the unique values of a column.

>>>df['column1'].unique()
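
A short sketch of unique on made-up data. The `nunique` call is not in the original; it is included as a related way to get only the number of distinct values.

```python
import pandas as pd

df = pd.DataFrame({"column1": ["a", "b", "a", "c", "b"]})

values = df["column1"].unique()       # distinct values, in order of first appearance
n_distinct = df["column1"].nunique()  # how many distinct values there are

print(list(values), n_distinct)
```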

11. Basic Stats

Pandas helps to understand the data using basic statistical methods.

a. describe: This returns the quick stats (count, mean, std, min, first quartile, median, third quartile, max) on suitable columns

>>>df.describe()

b. covariance: It returns the covariance between suitable columns.

>>>df.cov()

c. correlation: It returns the correlation between suitable columns.

>>>df.corr()
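
A sketch showing how covariance and correlation differ, using hypothetical data in which column2 is exactly twice column1. Covariance depends on the scale of the columns, while correlation is normalized to the range -1 to 1.

```python
import pandas as pd

df = pd.DataFrame({
    "column1": [1.0, 2.0, 3.0, 4.0],
    "column2": [2.0, 4.0, 6.0, 8.0],  # exactly 2 * column1
})

summary = df.describe()  # count, mean, std, min, quartiles, max per column
cov = df.cov()           # pairwise sample covariance
corr = df.corr()         # pairwise Pearson correlation

print(summary)
print(cov)
print(corr)
```

Because the two columns are perfectly linearly related, their correlation is 1 (up to floating-point rounding), whereas their covariance changes if either column is rescaled.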

To learn more, we recommend Wes McKinney's Python for Data Analysis, a book for learning Pandas.

For more resources on analytics / data science, visit www.analyticsvidhya.com
