Java: using a tokenizer to split input text into a sequence of words or subwords, then encoding each word or subword to obtain tokens
Date: 2024-05-29 14:09:58
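import java.util.Scanner;
import java.util.StringTokenizer;

public class TokenizerExample {
    public static void main(String[] args) {
        // try-with-resources closes the Scanner when main finishes
        try (Scanner scanner = new Scanner(System.in)) {
            System.out.println("Enter some text:");
            String text = scanner.nextLine();

            // StringTokenizer splits on whitespace by default,
            // so punctuation stays attached to the neighboring word
            StringTokenizer tokenizer = new StringTokenizer(text);
            int count = tokenizer.countTokens();
            System.out.println("Found " + count + " token(s):");

            while (tokenizer.hasMoreTokens()) {
                String token = tokenizer.nextToken();
                // hashCode() is only a stand-in for an encoding: the value
                // can be negative and different strings can collide
                int code = token.hashCode();
                System.out.println(token + " -> code: " + code);
            }
        }
    }
}
// Sample input: hello world!
// Sample output:
// Found 2 token(s):
// hello -> code: 99162322
// world! -> code: -782084401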
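String.hashCode() yields an integer per token, but it is not a vocabulary ID: values can be negative, distinct strings can collide, and it never splits a word into subwords. The sketch below shows one possible alternative, assuming a small hand-written vocabulary and a simplified greedy longest-match split. The class name VocabEncoder, the "[UNK]" entry, and the sample vocabulary are hypothetical and chosen only for illustration; this is not the algorithm of any particular tokenizer library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

public class VocabEncoder {
    // ID reserved for words that cannot be built from vocabulary pieces (assumption)
    static final int UNK_ID = 0;

    // Toy vocabulary: piece -> integer ID, assigned in insertion order
    private final Map<String, Integer> vocab = new LinkedHashMap<>();

    public VocabEncoder(List<String> pieces) {
        vocab.put("[UNK]", UNK_ID);
        for (String piece : pieces) {
            // Duplicates are skipped, so IDs stay contiguous
            vocab.putIfAbsent(piece, vocab.size());
        }
    }

    // Splits one word into the longest vocabulary pieces, left to right.
    // If no piece matches at some position, the whole word maps to UNK_ID.
    List<Integer> encodeWord(String word) {
        List<Integer> ids = new ArrayList<>();
        int start = 0;
        while (start < word.length()) {
            int end = word.length();
            Integer id = null;
            while (end > start) {
                id = vocab.get(word.substring(start, end));
                if (id != null) {
                    break;
                }
                end--;
            }
            if (id == null) {
                ids.clear();
                ids.add(UNK_ID);
                return ids;
            }
            ids.add(id);
            start = end;
        }
        return ids;
    }

    // Splits text on whitespace, then encodes each word into piece IDs
    public List<Integer> encode(String text) {
        List<Integer> ids = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            ids.addAll(encodeWord(tokenizer.nextToken()));
        }
        return ids;
    }

    public static void main(String[] args) {
        VocabEncoder encoder = new VocabEncoder(
                Arrays.asList("hello", "world", "!", "token", "izer"));
        // prints [1, 2, 3, 4, 5, 0]
        System.out.println(encoder.encode("hello world! tokenizer xyz"));
    }
}
```

With this vocabulary, "world!" is split into "world" and "!", "tokenizer" into "token" and "izer", and "xyz" falls back to the unknown ID. A real subword tokenizer would learn its vocabulary from data instead of using a fixed list.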