Word frequency analysis of the book "Natural Language Processing with Python"
Posted: 2024-05-22 15:15:16
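The table below lists the most frequent tokens (words and punctuation marks) in the book "Natural Language Processing with Python", with the number of times each one occurs. It is presented as a top-100 list, although only 91 rows are shown: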
| Token | Count |
| --- | --- |
| the | 2360 |
| , | 2197 |
| . | 1974 |
| of | 1254 |
| and | 1075 |
| to | 1052 |
| a | 1024 |
| in | 820 |
| 's | 741 |
| that | 622 |
| for | 439 |
| is | 416 |
| we | 392 |
| with | 387 |
| The | 374 |
| it | 352 |
| as | 345 |
| on | 332 |
| this | 331 |
| be | 326 |
| are | 321 |
| by | 304 |
| from | 301 |
| can | 298 |
| our | 291 |
| an | 284 |
| or | 266 |
| language | 256 |
| NLP | 240 |
| at | 237 |
| natural | 215 |
| processing | 202 |
| not | 201 |
| but | 199 |
| have | 196 |
| will | 186 |
| text | 182 |
| all | 180 |
| their | 179 |
| has | 178 |
| one | 178 |
| used | 177 |
| more | 174 |
| by-nc-nd | 172 |
| using | 170 |
| about | 166 |
| can't | 166 |
| or-nc-nd | 165 |
| its | 165 |
| they | 165 |
| other | 164 |
| than | 164 |
| some | 163 |
| which | 160 |
| also | 159 |
| than-nc | 154 |
| than-nc-nd | 153 |
| may | 151 |
| would | 151 |
| these | 143 |
| such | 142 |
| there | 139 |
| new | 136 |
| when | 134 |
| into | 133 |
| been | 128 |
| two | 127 |
| many | 124 |
| most | 124 |
| using-nc-nd | 123 |
| first | 121 |
| up | 120 |
| should | 118 |
| out | 116 |
| between | 115 |
| also-nc-nd | 114 |
| them | 114 |
| do | 113 |
| using-nc | 112 |
| only | 111 |
| time | 111 |
| been-nc-nd | 110 |
| if | 109 |
| like | 109 |
| because | 108 |
| used-nc-nd | 108 |
| which-nc-nd | 108 |
| so | 107 |
| each | 106 |
| two-nc-nd | 104 |
| were | 103 |
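As the table shows, the highest counts belong to punctuation marks and common English function words such as "the", "and", "a", and "in". The counts are case-sensitive, so "the" and "The" are listed separately. Words tied to the book's subject also appear, including "NLP", "natural", "processing", and "text". Entries ending in "-nc" or "-nc-nd" (for example "by-nc-nd" and "than-nc-nd") are not ordinary English words; they most likely come from license text or tokenization artifacts and should be filtered out before the counts are interpreted.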
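The post does not say how the counts were produced. As a minimal sketch, assuming a simple regex tokenizer rather than whatever tool was actually used, a table like this can be built with `collections.Counter`. The names `tokenize`, `top_tokens`, `to_markdown`, and the small `FUNCTION_WORDS` set are hypothetical helpers introduced here for illustration. Unlike the table above, this tokenizer keeps possessives such as "book's" as a single token instead of splitting off "'s".

```python
import re
from collections import Counter

# Assumed tokenizer: a word (optionally with internal hyphens or apostrophes),
# or a single punctuation character. Digits are ignored.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*|[^\sA-Za-z0-9]")

# Tiny illustrative list of function words; a real analysis would use a fuller one.
FUNCTION_WORDS = {
    "the", "of", "and", "to", "a", "in", "that", "for", "is", "we",
    "with", "it", "as", "on", "this", "be", "are", "by", "from", "or",
}


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def top_tokens(text, n=100, lowercase=False, content_only=False):
    """Return the n most frequent tokens as (token, count) pairs.

    lowercase=True merges "The" and "the".
    content_only=True drops punctuation and the function words listed above.
    """
    tokens = tokenize(text)
    if lowercase:
        tokens = [t.lower() for t in tokens]
    if content_only:
        tokens = [
            t for t in tokens
            if t[0].isalpha() and t.lower() not in FUNCTION_WORDS
        ]
    return Counter(tokens).most_common(n)


def to_markdown(rows):
    """Render (token, count) pairs in the same table layout used above."""
    lines = ["| Token | Count |", "| --- | --- |"]
    for tok, cnt in rows:
        # Escape pipes so a "|" token does not break the table.
        lines.append(f"| {tok.replace('|', chr(92) + '|')} | {cnt} |")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "The text is short. The NLP text is processed, and the text is counted."
    # Case-sensitive counts, punctuation included, as in the table above.
    print(to_markdown(top_tokens(sample, n=5)))
    # Case-folded counts with punctuation and function words removed.
    print(to_markdown(top_tokens(sample, n=3, lowercase=True, content_only=True)))
```

Calling `top_tokens` with the defaults mirrors the table's behaviour (case-sensitive, punctuation counted). Passing `lowercase=True, content_only=True` is one way to surface subject words such as "NLP" and "text" instead of function words.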