pandas 0.25.1: A Powerful Python Data Analysis Tool

PDF, 9.65 MB. Updated 2024-07-15.

"pandas.pdf：一个强大的Python数据分析工具包，版本0.25.1，由Wes McKinney和PyData开发团队发布。该文档包含了关于pandas库的详细信息，包括数据结构、数据分析工具以及如何下载、安装和获取支持等资源链接。此版本不再支持Python 2.7，仅支持Python 3.5.3及以上版本，并且未来的版本将提高Python版本要求至3.6。Panel组件已被完全移除，推荐使用xarray进行N维数据处理。此外，read_pickle()和read_msgpack()的向后兼容性仅保证到pandas的早期版本。" 在Python编程语言中，pandas是一个不可或缺的数据分析工具包，它提供高性能且易于使用的数据结构和分析工具。这个库的核心是DataFrame对象，它是一种二维表格型数据结构，可以存储许多不同类型的数据（如整数、浮点数、字符串甚至是其他复杂对象）。DataFrame既有行索引也有列索引，使得数据操作变得简单而直观。 pandas库中的另一个重要组件是Series，它类似于一维数组，可以理解为带标签的数组。Series可以包含任何数据类型，并且与DataFrame一样，具有内置的索引功能。除此之外，pandas还提供了Index对象，用于创建和管理数据结构的索引。在pandas 0.25.1版本中，有一些关键更新需要注意： 1. **Python版本支持**：从0.25.x系列开始，pandas仅支持Python 3.5.3及更高版本。未来计划进一步提高Python版本要求，至少为3.6。这意味着对于仍使用Python 2.7的用户，需要升级Python版本才能继续使用pandas。 2. **Panel组件移除**：Panel曾是pandas中的一个数据结构，用于处理三维数据。但在0.25.1版本中，Panel已被完全移除。如果需要处理多维数据，建议使用xarray库，它专门设计用于处理N维带标签的数据。 3. **序列化函数兼容性**：read_pickle()和read_msgpack()这两个用于读取序列化数据的函数，其向后兼容性只保证到pandas的某个早期版本。这意味着在新版本中使用这些函数可能无法加载旧版本保存的文件，因此在升级pandas时需要考虑数据迁移的问题。 pandas库提供了丰富的数据处理功能，如数据清洗（缺失值处理、重复值检测）、数据合并（join、merge）、时间序列分析、数据重塑（pivot、stack、unstack）等。它还与其他Python库如NumPy、SciPy和Matplotlib深度集成，为数据分析和可视化提供了一站式的解决方案。为了充分利用pandas，开发者应该熟悉其主要数据结构的特性，掌握基本的数据操作方法，了解如何利用pandas进行数据预处理、统计分析和数据可视化。同时，及时关注pandas的版本更新，以适应不断变化的开发环境和功能增强。

pandas: powerful Python data analysis toolkit, Release 0.25.1

Providing any SparseSeries or SparseDataFrame to concat() will cause a SparseSeries or SparseDataFrame to be returned, as before.
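
The sentence above concerns the legacy SparseSeries and SparseDataFrame classes, which were deprecated in 0.25 and removed in pandas 1.0. As a rough sketch of the dtype-based style that replaces them, the example below concatenates two ordinary Series that hold sparse values. The data is invented, and the example assumes a pandas version that provides the `Sparse[...]` dtype string and the `.sparse` accessor.

```python
import pandas as pd

# Two ordinary Series whose values are stored sparsely (fill value 0).
left = pd.Series([0, 0, 1]).astype("Sparse[int64]")
right = pd.Series([2, 0, 0]).astype("Sparse[int64]")

# concat() returns a plain Series; the sparseness lives in the dtype.
combined = pd.concat([left, right], ignore_index=True)

is_sparse = isinstance(combined.dtype, pd.SparseDtype)
dense_values = combined.sparse.to_dense().tolist()

print(type(combined).__name__, combined.dtype)
print(dense_values)
```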

1.2.5 The .str-accessor performs stricter type checks

Due to the lack of more fine-grained dtypes, Series.str so far only checked whether the data was of object dtype. Series.str will now infer the dtype of the data within the Series; in particular, 'bytes'-only data will raise an exception (except for Series.str.decode(), Series.str.get(), Series.str.len(), Series.str.slice()), see GH23163, GH23011, GH23551.

Previous behavior:

In [1]: s = pd.Series(np.array(['a', 'ba', 'cba'], 'S'), dtype=object)

In [2]: s
Out[2]:
0      b'a'
1     b'ba'
2    b'cba'
dtype: object

In [3]: s.str.startswith(b'a')
Out[3]:
0     True
1    False
2    False
dtype: bool

New behavior:

In [26]: s = pd.Series(np.array(['a', 'ba', 'cba'], 'S'), dtype=object)

In [27]: s
Out[27]:
0      b'a'
1     b'ba'
2    b'cba'
Length: 3, dtype: object

In [28]: s.str.startswith(b'a')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-ac784692b361> in <module>
----> 1 s.str.startswith(b'a')

/pandas/pandas/core/strings.py in wrapper(self, *args, **kwargs)
   1840                     )
   1841                 )
-> 1842                 raise TypeError(msg)
   1843             return func(self, *args, **kwargs)
   1844

TypeError: Cannot use .str.startswith with values of inferred dtype 'bytes'.
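
One way to work with bytes-only data under the stricter checks is to decode it first. The sketch below is illustrative and not part of the release notes: it assumes the bytes are UTF-8 encoded, and it uses `pd.api.types.infer_dtype` only to show the kind of inference the accessor performs.

```python
import pandas as pd

# Object-dtype Series that holds only bytes values.
s = pd.Series([b"a", b"ba", b"cba"], dtype=object)

# Public helper that reports the inferred type of the values.
inferred = pd.api.types.infer_dtype(s, skipna=True)

# Methods on the exception list, such as len(), still accept bytes.
lengths = s.str.len()

# Most other methods reject bytes-only data with a TypeError.
try:
    s.str.startswith(b"a")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Decoding yields text (assumption: the data is UTF-8), after which
# the usual string methods apply.
decoded = s.str.decode("utf-8")
starts_with_a = decoded.str.startswith("a")

print(inferred)
print(lengths.tolist())
print(error_message)
print(starts_with_a.tolist())
```

Decoding once up front keeps the rest of the pipeline on text data, so no later `.str` call has to special-case bytes.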

Chapter 1. What’s new in 0.25.0 (July 18, 2019)
