"Research and Application of Text Classification Based on Support Vector Machines: A Case Study of Web Information Extraction"

Updated 2024-03-20 · DOC, 1.93 MB
Support vector machines (SVMs) are widely used in machine learning for both classification and regression. This paper briefly introduces the basic principles of SVMs, discusses their application to text classification, and analyzes in detail how an SVM can be used to build a text classifier. It outlines the text classification process step by step and covers the key techniques involved: tokenization, the vector space model (VSM), feature selection, and SVM cross-validation. The paper then describes how a text classification system was built with Microsoft Visual C++ 6.0, including the implementation and optimization of its main classes and key processing functions, and explores how dynamic link libraries (DLLs) can ease the migration from C++ to Java. Finally, it presents experimental data obtained with this system and the conclusions drawn from them.
Keywords: machine learning, text classification, support vector machines (SVM)
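To make the tokenization and vector space model steps named in the abstract more concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not the paper's Visual C++ implementation. The function names `tokenize` and `build_vsm`, the sample documents, the ASCII-only regular expression, and the TF-IDF weighting are all assumptions made for this example; Chinese text would need a word segmenter in place of the regex, and feature selection and SVM training are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into ASCII alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vsm(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Weight = raw term count * log(N / document frequency).
        vectors.append([
            counts[term] * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical sample documents, used only for illustration.
docs = [
    "SVM text classification",
    "web information extraction",
    "text extraction from web pages",
]
vocabulary, vectors = build_vsm(docs)
```

Each document becomes a fixed-length numeric vector indexed by `vocabulary`, which is the form a classifier such as an SVM takes as input. Terms that appear in every document receive a weight of zero under this weighting, which is one simple reason a feature selection step usually follows.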