token.texts_to_sequences

Time: 2023-10-03 18:09:30 Views: 143

`token.texts_to_sequences` is a method of the Keras `Tokenizer` class (here `token` is simply the variable holding the `Tokenizer` instance). It converts a list of texts into a list of sequences, where each sequence is a list of integers. Each integer is the index of a word in the tokenizer's vocabulary, and the order of the integers follows the order of the words in the text. The method takes a list of texts as its argument and returns one sequence per text.

For example, suppose we have a list of text documents:

```
texts = [
    "the cat in the hat",
    "the dog chased the cat",
    "the cat ran away from the dog"
]
```

We can fit a `Tokenizer` on these texts and then convert them into sequences:

```python
from keras.preprocessing.text import Tokenizer

# create tokenizer object
token = Tokenizer()

# fit tokenizer on the texts (builds the word index)
token.fit_on_texts(texts)

# convert texts to sequences
sequences = token.texts_to_sequences(texts)

print(sequences)
```

This will output:

```
[[1, 2, 4, 1, 5], [1, 3, 6, 1, 2], [1, 2, 7, 8, 9, 1, 3]]
```

The indices are assigned by `fit_on_texts` in order of word frequency, not in order of position. In this example "the" occurs six times and is assigned 1, "cat" occurs three times and is assigned 2, and "dog" occurs twice and is assigned 3. The remaining words each occur once and receive the next indices in the order they first appear: "in" is 4, "hat" is 5, "chased" is 6, "ran" is 7, "away" is 8, and "from" is 9. The first sequence (`[1, 2, 4, 1, 5]`) corresponds to the first text ("the cat in the hat"): "the" maps to 1, "cat" to 2, "in" to 4, "the" to 1 again, and "hat" to 5. A repeated word always maps to the same integer.
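To make the index assignment concrete, here is a minimal pure-Python sketch of the idea. It is not the Keras implementation: the helper names `build_word_index` and `texts_to_sequences` are hypothetical, and the sketch only lowercases and splits on whitespace, whereas the real `Tokenizer` also filters punctuation and supports options such as `num_words` and `oov_token`.

```python
from collections import OrderedDict


def build_word_index(texts):
    """Hypothetical helper: rank words by frequency, most frequent first."""
    counts = OrderedDict()
    for text in texts:
        for word in text.lower().split():
            counts[word] = counts.get(word, 0) + 1
    # sorted() is stable, so words with equal counts keep first-seen order
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # indices start at 1; 0 is conventionally left free for padding
    return {word: i for i, (word, _) in enumerate(ranked, start=1)}


def texts_to_sequences(texts, word_index):
    """Hypothetical helper: map each word to its index, skipping unknown words."""
    return [
        [word_index[w] for w in text.lower().split() if w in word_index]
        for text in texts
    ]


texts = [
    "the cat in the hat",
    "the dog chased the cat",
    "the cat ran away from the dog",
]

word_index = build_word_index(texts)
sequences = texts_to_sequences(texts, word_index)

print(word_index)
print(sequences)
```

With the three example texts, this sketch yields `[[1, 2, 4, 1, 5], [1, 3, 6, 1, 2], [1, 2, 7, 8, 9, 1, 3]]`. Words that were never seen while building the index are dropped here, so `texts_to_sequences(["the bird flew"], word_index)` gives `[[1]]`; the Keras `Tokenizer` behaves the same way by default when no `oov_token` is set.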
