Translate into English: 我们在对数据集进行预处理后,先对数据集进行机器学习,通过线性回归模型、决策树回归模型、随机森林回归模型、LGBM回归模型、XGBOOST回归模型的相互比较,具有较低的平均绝对百分比误差,但是由于数据集的数据不充分,我们又采取机器学习的方式,通过优化的LSTM模型,得到较低的MAPE值,拥有较好的预测效果,于是我们选择了第二种方法。在证明单词的任何属性是否会影响选择困难人数占总人数的百分比时,我们选择Person相关性分析、线性回归分析、单样本T检验、单位根检验逐渐深入的证明了两者的相关性。
Time: 2023-03-13 10:26:40 Views: 145
Translated to English: After preprocessing the dataset, we first applied machine learning to it, comparing a linear regression model, a decision tree regression model, a random forest regression model, an LGBM regression model, and an XGBoost regression model, which achieved a relatively low mean absolute percentage error (MAPE). However, because the dataset did not contain enough data, we then took a second machine learning approach, using an optimized LSTM model, which gave a lower MAPE value and better predictive performance, so we chose the second method. To determine whether any attribute of a word affects the percentage of people who chose the difficult mode out of the total number of people, we used Pearson correlation analysis, linear regression analysis, a one-sample t-test, and a unit root test, demonstrating the correlation between the two step by step.
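For readers unfamiliar with the two quantities named in the passage, here is a minimal Python sketch of how MAPE and the Pearson correlation coefficient are commonly computed. It is not the authors' code: the function names `mape` and `pearson_r` and all numbers are made up for illustration, and the MAPE function assumes no actual value is zero.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every value in `actual` is non-zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical data: true values and the predictions of two candidate models.
actual = [100.0, 200.0, 400.0]
model_a = [110.0, 180.0, 400.0]
model_b = [120.0, 150.0, 300.0]

# The model with the lower MAPE would be preferred under this criterion.
scores = {"model_a": mape(actual, model_a), "model_b": mape(actual, model_b)}
best = min(scores, key=scores.get)
print(scores, best)
```

A lower MAPE means the predictions are, on average, closer to the true values in relative terms, which is the criterion the passage uses to prefer one model over another.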