Understanding Neural Networks: A From-Scratch Implementation in Python

"本文将介绍如何使用Python从头构建一个简单的神经网络,通过实践加深对神经网络工作原理的理解。文章中使用scikit-learn的make_moons函数生成了一个非线性可分的二分类数据集,以此来演示为什么神经网络在处理此类问题时具有优势。在比较了线性分类器(如Logistic回归)的局限性后,文章构建了一个包含输入层、隐藏层和输出层的3层神经网络,并解释了网络结构和预测过程。" 在这篇文章中,作者强调了亲手实现神经网络的重要性,即便最终可能会使用现成的库,如PyBrain。这样做有助于深入理解神经网络的运作机制,并为设计高效模型打下基础。文章中,首先介绍了数据集的生成,利用scikit-learn的`make_moons`函数创建了一个非线性分布的二分类数据集,呈现出月牙形分布,难以通过线性方法分割。 接着,文章展示了Logistic回归在处理这类数据时的不足。由于数据的非线性特性,Logistic回归无法找到合适的决策边界完全区分两类数据。这突显了神经网络的强项,即它们能自动学习非线性特征,无需手动进行特征工程。 然后,文章进入了神经网络的构建部分,描述了一个简单的3层神经网络架构:一个输入层接收x和y坐标,一个隐藏层负责学习特征,一个输出层产生预测结果,每个类别对应一个输出节点。由于这里只有两个类别,因此输出层有两个节点,分别表示属于两类别的概率。通过这种方式,神经网络能够更好地适应数据的月牙形状,提供更准确的分类结果。 在实际实现过程中,神经网络的训练通常涉及权重初始化、前向传播、损失计算、反向传播以及权重更新等步骤。这些步骤在文章中可能没有详述,但这是构建神经网络的基本流程。权重初始化确保网络在开始训练时具有合适的参数,前向传播用于根据当前权重计算预测,损失函数衡量预测与真实值之间的差距,反向传播则用来计算梯度以便更新权重,以减小损失。 这篇文章通过实例解释了神经网络如何处理非线性问题,以及其相比线性模型的优势。对于初学者来说,这是一个很好的起点,帮助他们理解神经网络的基本结构和工作原理,为进一步深入学习和应用神经网络奠定了基础。
There is no sufficiently complete MATLAB program for neuro-fuzzy classifiers. ANFIS is generally used as a classifier, but ANFIS is a function-approximation program, and its use for classification is problematic. For example, suppose there are three classes labeled 1, 2, and 3. The ANFIS outputs are not integers, so they are rounded to determine the class labels; sometimes, however, ANFIS produces labels of 0 or 4, which are not valid classes. As a result, ANFIS is not suitable for classification problems. In this study, I prepared several adaptive neuro-fuzzy classifiers. In all of the programs given below, the k-means algorithm is used to initialize the fuzzy rules, so the user should give the number of clusters for each class. Also, only the Gaussian membership function is used to describe the fuzzy sets, because of its simple derivative expressions (a short sketch follows the references below).

The first program is scg_nfclass.m. This classifier is based on Jang's neuro-fuzzy classifier [1]. The differences concern the rule weights and the parameter optimization: the rule weights are adapted according to the number of rule samples, and the scaled conjugate gradient (SCG) algorithm is used to determine the optimum values of the nonlinear parameters. SCG is faster than steepest descent and some second-order derivative-based methods, and it is suitable for large-scale problems [2].

The second program is scg_nfclass_speedup.m. This classifier is similar to scg_nfclass; the difference is the parameter optimization. Although it is based on the SCG algorithm, it is faster than traditional SCG because it uses a least-squares estimation method for gradient estimation without using all of the training samples. The speed-up is noticeable on medium- and large-scale problems [2].

The third program is scg_power_nfclass.m. Linguistic hedges are applied to the fuzzy sets of the rules and are adapted by the SCG algorithm. In this way, distinctive features are emphasized by their power values, while irrelevant features are damped by them. The power effect of a given feature generally differs between classes. Using linguistic hedges increases the recognition rate [3].

The last program is scg_power_nfclass_feature.m. In this program, the powers of the fuzzy sets are used for feature selection [4]: if the linguistic hedge values of the classes for a feature are greater than 0.5 and close to 1, the feature is relevant; otherwise it is irrelevant. The program derives a feature selection and rejection criterion from the power values of the features.

References:
[1] C. T. Sun, J. S. R. Jang (1993). A neuro-fuzzy classifier and its applications. Proc. of IEEE Int. Conf. on Fuzzy Systems, San Francisco, 1:94–98.
[2] B. Cetişli, A. Barkana (2010). Speeding up the scaled conjugate gradient algorithm and its application in neuro-fuzzy classifier training. Soft Computing, 14(4):365–378.
[3] B. Cetişli (2010). Development of an adaptive neuro-fuzzy classifier using linguistic hedges: Part 1. Expert Systems with Applications, 37(8):6093–6101.
[4] B. Cetişli (2010). The effect of linguistic hedges on feature selection: Part 2. Expert Systems with Applications, 37(8):6102–6108.

E-mail: bcetisli@mmf.sdu.edu.tr, bcetisli@gmail.com
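As a side illustration of why the Gaussian membership function is convenient here, the following is a minimal Python sketch (the actual programs are MATLAB .m files, and none of this code is taken from them) of the Gaussian membership value and its closed-form derivatives with respect to the tunable center and width:

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    # Membership of x in a Gaussian fuzzy set with center c and width
    # sigma: mu(x) = exp(-(x - c)^2 / (2 * sigma^2)).
    return np.exp(-((x - c) ** 2) / (2.0 * sigma**2))

def gaussian_mf_grads(x, c, sigma):
    # Closed-form derivatives of mu w.r.t. c and sigma. Their simplicity
    # is what makes Gaussian sets convenient for gradient-based training
    # such as the SCG optimization used by these classifiers.
    mu = gaussian_mf(x, c, sigma)
    dmu_dc = mu * (x - c) / sigma**2
    dmu_dsigma = mu * (x - c) ** 2 / sigma**3
    return dmu_dc, dmu_dsigma

# Example with assumed, illustrative parameter values.
print(gaussian_mf(1.5, c=1.0, sigma=0.5))
print(gaussian_mf_grads(1.5, c=1.0, sigma=0.5))
```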