Mastering Java Machine Learning (Packt, 2017): A Practical Guide

Updated 2024-07-19 · 23.85 MB · PDF
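Mastering Java Machine Learning (Packt, 2017) is a professional book that helps readers understand and apply machine learning in the Java programming language. It aims to give Java developers practical, comprehensive guidance so they can use machine learning techniques effectively in real projects.

Chapter 1, "Machine Learning Review", opens with the history and definition of machine learning, explaining what it is and how it differs from traditional approaches to problem solving. It covers the basic concepts and terminology, including the main types of learning (supervised, unsupervised, and semi-supervised) and their subtypes. The chapter also introduces the training datasets commonly used in machine learning and real-world applications such as recommender systems, image recognition, and natural language processing.

Chapter 2, "Practical Approach to Real-World Supervised Learning", focuses on applying supervised learning in practice. It covers the formal description and normalization of data, including data preprocessing, feature engineering (feature selection and transformation), and the importance of dimensionality reduction. It also explains model building in detail, including how to choose and implement supervised learning algorithms such as linear regression, decision trees, and support vector machines. A case study on horse colic classification shows how to apply the theory to a real scenario and evaluate model performance.

Chapter 3, "Unsupervised Machine Learning Techniques", turns to unsupervised learning and discusses the issues it shares with supervised learning as well as those unique to it. The chapter concentrates on clustering, association rule learning, and anomaly detection, with an emphasis on discovering the inherent structure and patterns in data that has no explicit labels. It also discusses how to choose a suitable algorithm for different kinds of unsupervised learning tasks.

Mastering Java Machine Learning offers a learning path from fundamentals to advanced topics and is a valuable resource for developers who want to use Java for data science and artificial intelligence projects. It covers theory and includes hands-on case studies that help readers build and improve their Java machine learning skills through practice.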
Uploaded 2018-04-03
Chapter 1, Machine Learning Review, is a refresher of basic concepts and techniques that the reader would have learned from Packt's Learning Machine Learning in Java or a similar text. This chapter is a review of concepts such as data, data transformation, sampling and bias, features and their importance, supervised learning, unsupervised learning, big data learning, stream and real-time learning, probabilistic graphical models, and semi-supervised learning.

Chapter 2, Practical Approach to Real-World Supervised Learning, cobwebs dusted, dives straight into the vast field of supervised learning and the full spectrum of associated techniques. We cover the topics of feature selection and reduction, linear modeling, logistic models, non-linear models, SVM and kernels, ensemble learning techniques such as bagging and boosting, validation techniques and evaluation metrics, and model selection. Using WEKA and RapidMiner, we carry out a detailed case study, going through all the steps from data analysis to analysis of model performance. As in each of the other chapters, the case study is presented as an example to help the reader understand how the techniques introduced in the chapter are applied in real life. The dataset used in the case study is UCI HorseColic.

Chapter 3, Unsupervised Machine Learning Techniques, presents many advanced methods in clustering and outlier techniques, with applications. Topics covered are feature selection and reduction in unsupervised data, clustering algorithms, evaluation methods in clustering, and anomaly detection using statistical, distance, and distribution techniques. At the end of the chapter, we perform a case study for both clustering and outlier detection using a real-world image dataset, MNIST. We use the Smile API to do feature reduction and ELKI for learning.

Chapter 4, Semi-supervised Learning and Active Learning, gives details of algorithms and techniques for learning when only a small amount of labeled data is present. Topics