AASRI Procedia 6 (2014) 41–48
Available online at www.sciencedirect.com (ScienceDirect)
2212-6716 © 2014 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/). Peer-review under responsibility of Scientific Committee of American Applied Science Research Institute.
doi: 10.1016/j.aasri.2014.05.007

2013 2nd AASRI Conference on Computational Intelligence and Bioinformatics

A New Fitting Scattered Data Method Based on the Criterion of Geometric Distance

Guowei Yang*, Jia Xu
College of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China

* Corresponding author. Tel.: +86-791-83953432; fax: +86-791-83953432. E-mail address: ygw_ustb@163.com.

Abstract

The traditional data fitting method based on the least squares method is not well suited to fitting vector data whose independent variables are random. This paper therefore proposes a new data fitting criterion, the least quadratic sum of geometrical distances, and puts forward a new method for fitting scattered data based on this criterion. The paper also gives an optimization algorithm for solving the data fitting parameters. Simulation experiments show that, for vector data whose independent variables are random, the fitting precision of the new method is higher than that of the least squares method.

© 2013 Published by Elsevier B.V. Selection and/or peer review under responsibility of American Applied Science Research Institute

Keywords: Data fitting; criterion of data fitting; least squares method; geometrical distance

1. Introduction

In experimental science, the social and behavioral sciences, and engineering fields such as computer-aided design, manufacturing, virtual reality and medical imaging, experiments, surveys and tests frequently produce large amounts of data. To explain these data, or to make forecasts and judgments from them that give policy-makers an important basis for decisions, we often need to fit a linear or nonlinear model to the survey data, seeking a function (model) that approximately reflects how the data change. Data fitting and its applications have a long history and a rich body of results [1-3].

In linear data fitting, the least squares method is the usual way to obtain the fitting parameters. The reason is that, if the fitting model satisfies the Gauss-Markov assumptions, the least squares estimates have good statistical properties such as unbiasedness, consistency and minimum variance. However, real test data vary endlessly, and both the purpose of the fit and the precision required differ from case to case, so a least squares fit cannot always achieve the intended purpose. For example, when the data contain outliers caused by occasional abnormal errors, or when their probability distribution deviates from the normal distribution, least squares regression loses its good statistical properties; one remedy is to use a robust criterion function [4-5]. Moreover, in linear fitting, the straight line determined by the least squares method tends in many fields to have a slope somewhat smaller than the actual trend of the scatter points, and the larger the fluctuation of the sample scatter points, the more obvious this phenomenon becomes.
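The small-slope tendency can be illustrated with a few lines of Python. This is only a sketch on hypothetical data (points lying exactly on y = 2x whose x readings are then perturbed by an alternating error of ±0.5); the helper name `ols_slope` and the data are assumptions made for illustration and do not come from the paper.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least squares line of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: six points exactly on the line y = 2x.
x_true = [0, 1, 2, 3, 4, 5]
ys = [2 * x for x in x_true]

# The same points, but with an alternating reading error in x only.
errors = [-0.5, 0.5, -0.5, 0.5, -0.5, 0.5]
x_obs = [x + e for x, e in zip(x_true, errors)]

slope_clean = ols_slope(x_true, ys)  # error-free x: recovers the slope 2
slope_noisy = ols_slope(x_obs, ys)   # noisy x: 38/22, about 1.73

print(slope_clean, slope_noisy)
```

With error-free x the fitted slope is 2, while the perturbed x values pull it down to 38/22, which is the kind of flattening described above.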
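The reason is that ordinary least squares parameter estimation uses the following criterion: for the model $y = g(x) + v$, choose the linear equation that minimizes the residual sum of squares $\sum_i E_i^2 = \sum_i (y_i - \hat{y}_i)^2$ between the observed and estimated values of the dependent variable. Geometrically, this means finding the straight line for which the sum of squared vertical distances from the observed scatter points to the line is smallest, rather than the sum of squared geometric (perpendicular) distances, and this is what gives the estimated line its tendency toward a smaller slope [6].

Behind least squares theory lies a default assumption that is easily overlooked when the method is used for data fitting: the dependent variable of the sample fluctuates randomly because it is affected by a random disturbance term, while the independent variable of the sample is not random. In other words, the scatter points are assumed to deviate from the linear equation entirely because of fluctuation in the direction of the dependent variable, not because of the combined effect of fluctuation in the directions of all variables. If the least squares method is applied directly without checking this assumption, it is no wonder that the fitted equation is biased. Different methods produce different results, but different methods also rest on different assumptions. No method can therefore be called right or wrong in itself; one can only ask whose assumptions are more reasonable.

In practice, models usually involve one of two kinds of relationship:

(1) The explicit relation model: for the scatter points $(x, y)$, $x$ and $y$ have a definite relationship ($x$ is the independent variable and $y$ the dependent variable, or the other way round).

(2) The random independent variable model: for the scatter points $(x, y)$, the primary and secondary roles of $x$ and $y$ are fuzzy, and the independent and dependent variables cannot always be clearly distinguished. Human height and weight are an example: the two variables have completely equal status, so weight can be written as a function of height, $y = f(x)$, and height can equally be written as a function of weight.

Obviously, the least squares method is not suitable for parameter estimation in every linear fitting model, and for the second kind of model the least squares criterion is usually not used. To solve the fitting problem for the random independent variable model, many researchers have in recent years studied variants of least squares fitting in depth and obtained fruitful results. Examples include principal component regression (PCR), partial least squares (PLS), various kinds of nonlinear principal component regression (NLPCR), various kinds of nonlinear partial least squares (NLPLS), neural network methods, and combinations of these methods, all of which are widely applied in current projects [4-9].

In this paper we do not use the methods from the above literature to fit scattered points. Instead we take the least quadratic sum of geometrical distances as a new fitting criterion and propose a new data fitting method based on it. The method includes a model transformation that turns the solution of the fitting parameters into a constrained optimization problem, together with an algorithm for solving the fitting parameters. Simulation experiments show that the solution algorithm is feasible and effective, and that for vector data whose independent variables are random the new data fitting method achieves higher fitting precision than the least squares method.

2. A New Criterion of Data Fitting and the New Data Fitting Method

Problem: Given variables $X_1, X_2, \dots, X_n$ that are linearly related in $n$-dimensional space, and experimental or observed data (with error) $(x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, \dots, N$ $(N > n)$, find the actual linear relation formula of $X_1, X_2, \dots, X_n$.

The traditional way of establishing the linear relation formula by the least squares method assumes that $X_1$ is the dependent variable and that $X_2, \dots, X_n$ are independent variables that are fixed (that is, their experimental or observed data contain no error). It has been proved that under this assumption least squares data fitting is very effective. In many cases, however, the assumption is not satisfied: sometimes the independent variables $X_2, \dots, X_n$ are random variables, and sometimes it is impossible to tell which of $X_1, X_2, \dots, X_n$ is the dependent variable and which are the independent variables. In either situation it is natural to think of the implicit functional relationship among the variables, that is, the linear relationship of $X_1, X_2, \dots, X_n$ in $n$-dimensional space can be expressed as

$$a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b = 0 \qquad (1)$$

where $a_1, a_2, \dots, a_n$ are not all zero. The data vectors $(x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, 2, \dots, N$, lie on or around this hyperplane.

The hyperplane $a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b = 0$ in $n$-dimensional space can therefore be used to fit the vector data, where $a_1, a_2, \dots, a_n, b$ are undetermined parameters. An intuitive and easily conceived criterion for choosing the hyperplane is the least quadratic sum of the distances from the points $(x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, 2, \dots, N$, to the hyperplane.

Definition 1: Call

$$e_i = \frac{\left| a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right|}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}}, \quad i = 1, 2, \dots, N \qquad (2)$$

the geometric distance from $(x_{i1}, x_{i2}, \dots, x_{in})$ to the hyperplane $a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b = 0$.

Definition 2: Define the evaluation function of data fitting as

$$J = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} \frac{\left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right)^2}{a_1^2 + a_2^2 + \cdots + a_n^2} \qquad (3)$$

Call $\min J$ the geometrical distance criterion (the new data fitting criterion), that is, the least quadratic sum of geometrical distance criterion.

The new data fitting method based on this criterion mainly consists of the following two parts:

(a) Data fitting optimization model
(b) Data fitting optimization model solution

Data Fitting Optimization Model

Divide both sides of equation (1) by $\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$, and let

$$a_j' = \frac{a_j}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}} \ (j = 1, 2, \dots, n), \qquad b' = \frac{b}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}}.$$

Then equation (1) can be changed into

$$a_1' X_1 + a_2' X_2 + \cdots + a_n' X_n + b' = 0 \qquad (4)$$

with $a_1'^2 + a_2'^2 + \cdots + a_n'^2 = 1$.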
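The distance (2), the criterion (3) and the normalization leading to (4) can be sketched in Python as follows. The function names and the small 2-dimensional example (the line 3X1 + 4X2 - 10 = 0 with three points) are hypothetical and chosen only so the numbers are easy to check by hand; they are not taken from the paper.

```python
import math


def geometric_distance(a, b, point):
    """Distance from `point` to the hyperplane a.X + b = 0, as in (2)."""
    numerator = abs(sum(aj * xj for aj, xj in zip(a, point)) + b)
    return numerator / math.sqrt(sum(aj * aj for aj in a))


def criterion_J(a, b, points):
    """Quadratic sum of geometric distances, as in (3)."""
    return sum(geometric_distance(a, b, p) ** 2 for p in points)


def normalize(a, b):
    """Rescale (a, b) so that the squares of the a_j sum to 1, as in (4)."""
    norm = math.sqrt(sum(aj * aj for aj in a))
    return [aj / norm for aj in a], b / norm


# Hypothetical example: the line 3*X1 + 4*X2 - 10 = 0 and three points.
a, b = [3.0, 4.0], -10.0
points = [(2.0, 1.0), (0.0, 0.0), (4.0, 2.0)]

a_n, b_n = normalize(a, b)  # [0.6, 0.8] and -2.0

# J computed from the raw coefficients, with the denominator of (3).
J_raw = criterion_J(a, b, points)

# J computed from the normalized coefficients: the denominator equals 1,
# so only the squared numerators remain.
J_norm = sum((sum(aj * xj for aj, xj in zip(a_n, p)) + b_n) ** 2 for p in points)

print(J_raw, J_norm)
```

The three points lie at distances 0, 2 and 2 from the line, so both expressions give J = 8 (up to rounding), which shows that normalizing the coefficients does not change the criterion.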
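In a similar way, with the normalized coefficients $a_j'$ and $b'$, equation (3) can be changed into

$$J = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} \frac{\left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right)^2}{a_1^2 + a_2^2 + \cdots + a_n^2} = \sum_{i=1}^{N} \left( a_1' x_{i1} + a_2' x_{i2} + \cdots + a_n' x_{in} + b' \right)^2 .$$

Hence, writing $a_j$ and $b$ again for the normalized coefficients, the data fitting model can be changed into the following optimization model:

$$\min \sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right)^2 \quad \text{s.t.} \quad a_1^2 + a_2^2 + \cdots + a_n^2 - 1 = 0 \qquad (5)$$

where $a_1, a_2, \dots, a_n, b$ are undetermined parameters and $(x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, \dots, N$, are the $N$ known data scatter points.

Data Fitting Optimization Model Solution

According to extremum theory, formula (5) can be transformed into the Lagrange function

$$L(a_1, a_2, \dots, a_n, b, \lambda) = \sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right)^2 + \lambda \left( a_1^2 + a_2^2 + \cdots + a_n^2 - 1 \right) \qquad (6)$$

Consider the stationary point $(a_1, a_2, \dots, a_n, b, \lambda)$. Taking the partial derivatives of (6) and setting them to zero, we obtain the following equation group:

$$F(a_1, a_2, \dots, a_n, b, \lambda) = 0 \qquad (7)$$

simply written as $F = 0$, where $F = (F_1, F_2, \dots, F_{n+1}, F_{n+2})^T$. After cancelling the common factor 2, this is

$$\begin{cases}
\sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right) x_{i1} + \lambda a_1 = 0 \\
\qquad \vdots \\
\sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right) x_{in} + \lambda a_n = 0 \\
\sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right) = 0 \\
a_1^2 + a_2^2 + \cdots + a_n^2 - 1 = 0
\end{cases} \qquad (8)$$

where $(x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, \dots, N$, are the $N$ known data scatter points and

$$F_m = \sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right) x_{im} + \lambda a_m, \quad m = 1, 2, \dots, n,$$

$$F_{n+1} = \sum_{i=1}^{N} \left( a_1 x_{i1} + a_2 x_{i2} + \cdots + a_n x_{in} + b \right), \qquad F_{n+2} = a_1^2 + a_2^2 + \cdots + a_n^2 - 1 .$$

Solving the nonlinear equation group (7) gives the fitting parameters $a_1, a_2, \dots, a_n, b$ and $\lambda$, and hence the linear fitting equation $a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b = 0$, which is the data fitting model.

3. Solution Algorithm of the Data Fitting Parameter

An iteration algorithm for the fitting parameters is given as follows. In the nonlinear equation group $F(a_1, a_2, \dots, a_n, b, \lambda) = 0$, the quantities $a_1, a_2, \dots, a_n, b, \lambda$ are the undetermined parameters.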
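The equation group (8) can be evaluated numerically as sketched below, assuming the Lagrange function carries the multiplier term with a plus sign, $+\lambda(a_1^2 + \cdots + a_n^2 - 1)$. To have a point at which to test the residual, the sketch uses a swapped-in closed form for n = 2 (the eigenvector of the centered scatter matrix with the smallest eigenvalue μ, with b chosen so the line passes through the centroid and λ = -μ). That closed form is a standard result, not the paper's iterative algorithm, and all function names and data here are hypothetical.

```python
import math


def stationarity_residual(a, b, lam, points):
    """Left-hand sides F_1, ..., F_{n+2} of the equation group (8)."""
    n = len(a)
    # r_i = a_1*x_i1 + ... + a_n*x_in + b for every data point
    r = [sum(aj * xj for aj, xj in zip(a, p)) + b for p in points]
    F = [sum(ri * p[m] for ri, p in zip(r, points)) + lam * a[m]
         for m in range(n)]                     # F_1 .. F_n
    F.append(sum(r))                            # F_{n+1}
    F.append(sum(aj * aj for aj in a) - 1.0)    # F_{n+2}
    return F


def closed_form_2d(points):
    """Swapped-in check for n = 2 (assumes the centered sxy is nonzero)."""
    N = len(points)
    mx = sum(p[0] for p in points) / N
    my = sum(p[1] for p in points) / N
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    # Smallest eigenvalue of the scatter matrix [[sxx, sxy], [sxy, syy]]
    mu = 0.5 * (sxx + syy - math.hypot(sxx - syy, 2.0 * sxy))
    # A matching eigenvector, scaled to unit length
    v1, v2 = sxy, mu - sxx
    norm = math.hypot(v1, v2)
    a = [v1 / norm, v2 / norm]
    b = -(a[0] * mx + a[1] * my)  # line passes through the centroid
    return a, b, -mu              # lambda = -mu under the assumed sign


# Hypothetical scatter points close to the line y = x.
points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]

a, b, lam = closed_form_2d(points)
F = stationarity_residual(a, b, lam, points)
print(a, b, lam)
print(F)  # every component should vanish up to rounding error
```

A residual function of this kind is what an iterative solver for $F = 0$ would drive to zero; here it only confirms that the closed-form point is a stationary point of (8).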
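Suppose $Y = (a_1, a_2, \dots, a_n, b, \lambda)$, the initial iteration point is $Y^{(0)} = (a_1^{(0)}, a_2^{(0)}, \dots, a_n^{(0)}, b^{(0)}, \lambda^{(0)})$, and the iteration point after $k$ iterations is $Y^{(k)} = (a_1^{(k)}, a_2^{(k)}, \dots, a_n^{(k)}, b^{(k)}, \lambda^{(k)})$. By the Taylor formula we obtain

$$F(Y^{(k)}) \approx F(Y^{(k+1)}) + F'(Y^{(k+1)}) \left( Y^{(k)} - Y^{(k+1)} \right),$$

and after rearrangement

$$F'(Y^{(k+1)}) \left( Y^{(k+1)} - Y^{(k)} \right) \approx F(Y^{(k+1)}) - F(Y^{(k)}) .$$

Definition 3: Call the iteration algorithm

[Equation (9): the iteration formula, for $k = 0, 1, \dots$, is not recoverable from the source text.] (9)

the rank-$m$ correction technique for solving the nonlinear equation group $F(a_1, a_2, \dots, a_n, b, \lambda) = 0$.

By equation (9) we can solve for $a_1, a_2, \dots, a_n$, $b$ and $\lambda$, and thereby determine the variable coefficients $a_1, a_2, \dots, a_n, b$.

4. Simulation Analysis

Suppose the points lie on the straight line $10x - y + 1 = 0$, and the observation data (with error) are as follows:

| i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| x | -2 | -2 | -0.5 | -1 | 0.5 | 0 | 1.5 | 1 | 2.5 |
| y | -19 | -14 | -9 | -4 | 1 | 6 | 11 | 16 | 21 |

Determine the fitting straight line $ax + by + c = 0$ from the above discrete points, where $a, b, c$ are unknown parameters, by solving equation (8) with the data fitting parameter solution algorithm. Table 1 and Fig. 1 give the fitted line parameters and the simulation plot under the perfect condition, under the least squares criterion, and under the geometrical distance criterion.

Table 1. Fitting parameters under different criteria

| Parameter | Perfect condition | Least squares criterion | Geometrical distance criterion |
|---|---|---|---|
| a | 10 | 8.4211 | -0.99377232 |
| b | -1 | -1 | 0.11142971 |
| c | 1 | 1 | -0.1115 |

The geometrical distance column is normalized so that $a^2 + b^2 = 1$; dividing it by $-b$ gives the equivalent form of approximately $8.9184x - y + 1.0006 = 0$, which can be compared directly with the other two columns.

Fig. 1. Simulation results under different fitting criteria

From the table and the figure we can see that the discrete points are evenly distributed on both sides of each of the three lines, so the fits under the least squares criterion and under the geometrical distance criterion are both good. Taking the ideal line as the reference, however, the line fitted under the new criterion is clearly better than the least squares line: it lies between the ideal line and the least squares line, and its fitting precision is higher than that of the least squares line.

Acknowledgements

The authors would like to thank the editors and the anonymous reviewers for their valuable comments and constructive suggestions. This research was supported by the National Natural Science Foundation of China (No. 61272077, 61202319), the Natural Science Foundation of Jiangxi Province (No. 20114BAB201034) and the Science and Technology Project of Jiangxi Province (No. 20133BBE50022).

References

[1] Lin Honghua. Dynamic test data processing. Beijing: Beijing Institute of Technology Press; 1952.
[2] Hongwei Lin. Adaptive data fitting by the progressive-iterative approximation. Computer Aided Geometric Design. 2012; 2(7):463-473.
[3] Lapo Governi, Rocco Furferi, Matteo Palai, Yary Volpe. 3D geometry reconstruction from orthographic views: A method based on 3D image processing and data fitting. Computers in Industry. Available online 19 March 2013.
[4] E. Vassiliou, I.C. Demetriou. An adaptive algorithm for least squares piecewise monotonic data fitting. Computational Statistics & Data Analysis. 2005; 49(2):591-609.
[5] G. Casciola, L. Romani. A Newton-type method for constrained least-squares data-fitting with easy-to-control rational curves. Journal of Computational and Applied Mathematics. 2009; 223:672-692.
[6] Huang Minjie, Ye Hao, Wang Guizeng. Summary of the regression analysis methods based on projection. Control Theory and Applications. 2001; 8(18):1-6.
[7] Alexandru Mihai Bica. Fitting data using optimal Hermite type cubic interpolating splines. Applied Mathematics Letters. 2012; 25(12):2047-2051.
[8] Philipp Reinecke, Tilman Krauß, Katinka Wolter. Cluster-based fitting of phase-type distributions to empirical data. Computers & Mathematics with Applications. 2012; 64(12):3840-3851.
[9] Akemi Gálvez, Andrés Iglesias, Andreina Avila. Immunological-based approach for accurate fitting of 3D noisy data points with Bézier surfaces. Procedia Computer Science. 2013; 18:50-59.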