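Machine Learning for Text, by Charu C. Aggarwal, is an authoritative book on machine learning for text processing. The author is affiliated with the IBM T.J. Watson Research Center in Yorktown Heights, New York, USA, and the book was published in 2018. It offers an in-depth treatment of text mining, natural language processing (NLP), and deep learning for text analysis, and aims to help readers understand and apply these techniques to practical problems.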

The book's ISBNs are 978-3-319-73530-6 and 978-3-319-73531-3. The electronic edition is available via its DOI link: https://doi.org/10.1007/978-3-319-73531-3.

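The book covers core topics such as feature engineering, text classification, sentiment analysis, topic modeling, document summarization, and named entity recognition. It shows how machine learning algorithms such as naive Bayes, support vector machines, and neural networks can be applied to text data to extract useful information. It also introduces techniques and tools that were recent at the time of publication, including deep learning models such as convolutional and recurrent neural networks for text processing.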

For practitioners and researchers who want a deeper understanding of machine learning for text analysis, Machine Learning for Text is an important reference. It provides both theoretical background and practical guidance, helping readers turn theory into working text processing solutions.

The rich area of text analytics draws ideas from information retrieval, machine learning, and natural language processing. Each of these areas is an active and vibrant field in its own right, and numerous books have been written in each of these different areas. As a result, many of these books have covered some aspects of text analytics, but they have not covered all the areas that a book on learning from text is expected to cover.

At this point, a need exists for a focused book on machine learning from text. This book is a first attempt to integrate all the complexities in the areas of machine learning, information retrieval, and natural language processing in a holistic way, in order to create a coherent and integrated book in the area. Therefore, the chapters are divided into three categories:

1. Fundamental algorithms and models: Many fundamental applications in text analytics, such as matrix factorization, clustering, and classification, have uses in domains beyond text. Nevertheless, these methods need to be tailored to the specialized characteristics of text. Chapters 1 through 8 will discuss core analytical methods in the context of machine learning from text.

2. Information retrieval and ranking: Many aspects of information retrieval and ranking are closely related to text analytics. For example, ranking SVMs and link-based ranking are often used for learning from text. Chapter 9 will provide an overview of information retrieval methods from the point of view of text mining.

3. Sequence- and natural language-centric text mining: Although multidimensional representations can be used for basic applications in text analytics, the true richness of the text representation can be leveraged by treating text as sequences. Chapters 10 through 14 will discuss these advanced topics like sequence embedding, deep learning, information extraction, summarization, opinion mining, text segmentation, and event extraction.
