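7-1 jmu-Java-05 Collections-4: Inverted Index
Score: 20
Author: 郑如滨
Institution: Jimei University (集美大学)

Build an inverted index over several lines of text, so that each word maps to the numbers of the lines that contain it. Then look up query keywords in the index and output the lines that contain all of the keywords in a query.

Input
Several lines of English text, terminated by a line containing !!!!!.
Then query lines, each holding one or more keywords separated by a single space.

Output
Print the inverted index. Keys are in ascending alphabetical order, and each value lists its line numbers in ascending order.
Print the result of each query. If matches are found, print the set of line numbers followed by the content of each matching line; otherwise print found 0 results.

Sample input

    where are you from are you ok
    this is a test
    that is an apple
    there are lots of apples you eat it
    who are you
    !!!!!
    you are
    eat
    you test
    abc

Sample output

    a=[2]
    an=[3]
    apple=[3]
    apples=[4]
    are=[1, 4, 5]
    eat=[4]
    from=[1]
    is=[2, 3]
    it=[4]
    lots=[4]
    of=[4]
    ok=[1]
    test=[2]
    that=[3]
    there=[4]
    this=[2]
    where=[1]
    who=[5]
    you=[1, 4, 5]
    [1, 4, 5]
    line 1:where are you from are you ok
    line 4:there are lots of apples you eat it
    line 5:who are you
    [4]
    line 4:there are lots of apples you eat it
    found 0 results
    found 0 results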
Time: 2024-01-04 07:03:30
Sure, here is a Java program you can use as a reference:
```java
import java.util.*;

public class InvertedIndex {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        // TreeMap keeps the words in alphabetical order;
        // each TreeSet keeps its line numbers in ascending order.
        Map<String, Set<Integer>> invertedIndex = new TreeMap<>();
        List<String> lines = new ArrayList<>();
        int lineNumber = 0;

        // Read the text up to the "!!!!!" terminator and build the inverted index.
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            if (line.equals("!!!!!")) {
                break;
            }
            lineNumber++;
            lines.add(line); // keep the original text so it can be echoed in query results
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // blank line: nothing to index
                }
                invertedIndex.computeIfAbsent(word, k -> new TreeSet<>()).add(lineNumber);
            }
        }

        // Print the inverted index, one entry per line.
        for (Map.Entry<String, Set<Integer>> entry : invertedIndex.entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }

        // Answer every query line: a text line matches only if it contains all keywords.
        while (scanner.hasNextLine()) {
            String query = scanner.nextLine().trim();
            Set<Integer> result = null;
            if (!query.isEmpty()) {
                for (String keyword : query.split("\\s+")) {
                    // A keyword that is not in the index matches no line (avoids a null lookup).
                    Set<Integer> lineNumbers =
                            invertedIndex.getOrDefault(keyword, Collections.emptySet());
                    if (result == null) {
                        result = new TreeSet<>(lineNumbers); // copy, so the index is not modified
                    } else {
                        result.retainAll(lineNumbers);       // intersect with the next keyword
                    }
                }
            }

            // Print the query result. An empty query line is treated as no match (an assumption).
            if (result == null || result.isEmpty()) {
                System.out.println("found 0 results");
            } else {
                System.out.println(result); // the set of line numbers, printed once
                for (int n : result) {
                    System.out.println("line " + n + ":" + lines.get(n - 1));
                }
            }
        }
    }
}
```
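The code first uses a Scanner to read the text and build the inverted index. The index is stored in a TreeMap, which guarantees that the words are printed in ascending alphabetical order, and the line numbers for each word are stored in a TreeSet, which guarantees that they are printed in ascending order. It then reads the query keywords, runs the query, and prints the result. To run a query, the line-number set of the first keyword is copied into the result collection; for each subsequent keyword, its line-number set is fetched and intersected with the result. Finally, if the result is empty, the program prints "found 0 results"; otherwise it prints the set of line numbers and the content of each corresponding line.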
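The intersection step can also be isolated into a small helper, which makes it easier to check on its own. The following is only a minimal sketch, not part of the answer above: the class name `IntersectionSketch` and the methods `buildIndex` and `search` are hypothetical names introduced for illustration, and treating a keyword that is missing from the index as matching no line is an assumption about the intended behavior.

```java
import java.util.*;

public class IntersectionSketch {
    // Hypothetical helper: maps each word to the ascending set of
    // 1-based numbers of the lines that contain it.
    public static Map<String, Set<Integer>> buildIndex(List<String> lines) {
        Map<String, Set<Integer>> index = new TreeMap<>();
        for (int i = 0; i < lines.size(); i++) {
            for (String word : lines.get(i).trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    index.computeIfAbsent(word, k -> new TreeSet<>()).add(i + 1);
                }
            }
        }
        return index;
    }

    // Hypothetical helper: returns the numbers of the lines that contain
    // every keyword of a space-separated query.
    public static Set<Integer> search(Map<String, Set<Integer>> index, String query) {
        Set<Integer> result = null;
        for (String keyword : query.trim().split("\\s+")) {
            // Assumption: a keyword missing from the index matches no line.
            Set<Integer> hits = index.getOrDefault(keyword, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(hits); // copy so the index itself is never modified
            } else {
                result.retainAll(hits);       // set intersection
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex(Arrays.asList(
                "where are you from are you ok",
                "this is a test",
                "that is an apple",
                "there are lots of apples you eat it",
                "who are you"));
        System.out.println(search(index, "you are"));  // prints [1, 4, 5]
        System.out.println(search(index, "you test")); // prints []
        System.out.println(search(index, "abc"));      // prints []
    }
}
```

Copying the first keyword's set before calling `retainAll` matters: `retainAll` mutates the collection it is called on, so intersecting directly on a set taken from the index would corrupt the index for later queries.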