Jaro-Winkler algorithm formula, Java
Time: 2023-07-19 12:03:00 Views: 59
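The Jaro-Winkler algorithm is a string similarity measure. It does not locate a pattern inside a text; it scores how closely two strings resemble each other on a scale from 0 to 1. A Java implementation follows: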
```java
/** Jaro-Winkler similarity between two strings. */
public class JaroWinklerDistance {

    /** Returns a score in [0, 1]; 1.0 means the two strings are identical. */
    public static double similarity(final String s1, final String s2) {
        if (s1.equals(s2)) {
            return 1.0;
        }
        // Two equal characters match only if their positions differ by at most `window`.
        final int window = Math.max(0, Math.max(s1.length(), s2.length()) / 2 - 1);
        final String s1Matches = getMatchingCharacters(s1, s2, window);
        final String s2Matches = getMatchingCharacters(s2, s1, window);
        if (s1Matches.length() == 0 || s2Matches.length() == 0) {
            return 0.0;
        }
        final double m = s1Matches.length();
        // getTranspositions already halves the mismatch count, so it is not halved again here.
        final int transpositions = getTranspositions(s1Matches, s2Matches);
        final double jaro = (m / s1.length()
                + (double) s2Matches.length() / s2.length()
                + (m - transpositions) / m) / 3.0;
        // Winkler adjustment: raise the score for a shared prefix of up to 4 characters.
        return jaro + getCommonPrefixLength(s1, s2) * 0.1 * (1.0 - jaro);
    }

    /** Collects, in s1 order, the characters of s1 that have a match in s2 within the window. */
    private static String getMatchingCharacters(final String s1, final String s2, final int window) {
        final StringBuilder common = new StringBuilder();
        // Marks positions of s2 that are already matched, so each is used at most once.
        final boolean[] used = new boolean[s2.length()];
        for (int i = 0; i < s1.length(); i++) {
            final char ch = s1.charAt(i);
            // The window is inclusive on both sides: [i - window, i + window].
            final int end = Math.min(i + window + 1, s2.length());
            for (int j = Math.max(0, i - window); j < end; j++) {
                if (!used[j] && s2.charAt(j) == ch) {
                    used[j] = true;
                    common.append(ch);
                    break;
                }
            }
        }
        return common.toString();
    }

    /** Returns half the number of positions at which the two match sequences differ. */
    private static int getTranspositions(final String s1Matches, final String s2Matches) {
        final int n = Math.min(s1Matches.length(), s2Matches.length());
        int mismatches = 0;
        for (int i = 0; i < n; i++) {
            if (s1Matches.charAt(i) != s2Matches.charAt(i)) {
                mismatches++;
            }
        }
        return mismatches / 2;
    }

    /** Returns the length of the common prefix, capped at 4 characters. */
    private static int getCommonPrefixLength(final String s1, final String s2) {
        final int n = Math.min(4, Math.min(s1.length(), s2.length()));
        for (int i = 0; i < n; i++) {
            if (s1.charAt(i) != s2.charAt(i)) {
                return i;
            }
        }
        return n;
    }
}
```
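In this code, the `similarity` method computes the similarity of two strings and returns a `double` between 0 and 1. It first checks whether the two strings are equal and, if so, returns 1.0 immediately. Otherwise it computes a matching window from the length of the longer string, collects the matching characters of each string within that window, and derives the score from them. The `getMatchingCharacters` method collects the matching characters, `getTranspositions` counts the transpositions (half the number of matched characters that appear in a different order), and `getCommonPrefixLength` returns the length of the common prefix of the two strings, capped at 4.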
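To make the formula itself easier to check by hand, here is a minimal sketch that evaluates it from counts that have already been determined. The class `JaroWinklerFormula` and its method names are hypothetical helpers introduced only for this illustration, and the scaling factor 0.1 and prefix cap of 4 are the commonly used values, assumed here to match the code above.

```java
import java.util.Locale;

// Hypothetical helper for illustration only: evaluates the Jaro and
// Jaro-Winkler formulas from precomputed counts.
final class JaroWinklerFormula {

    // jaro = (m/|s1| + m/|s2| + (m - t)/m) / 3
    // m = number of matching characters, t = number of transpositions
    static double jaro(int m, int t, int len1, int len2) {
        if (m == 0) {
            return 0.0; // no matching characters at all
        }
        return ((double) m / len1 + (double) m / len2 + (double) (m - t) / m) / 3.0;
    }

    // jaroWinkler = jaro + prefix * 0.1 * (1 - jaro), with prefix capped at 4
    static double jaroWinkler(double jaro, int prefix) {
        return jaro + Math.min(4, prefix) * 0.1 * (1.0 - jaro);
    }

    public static void main(String[] args) {
        // "MARTHA" vs "MARHTA": 6 matches, 1 transposition (T and H swapped),
        // common prefix "MAR" of length 3.
        double j = jaro(6, 1, 6, 6);
        double jw = jaroWinkler(j, 3);
        // Prints: jaro=0.9444 jaroWinkler=0.9611
        System.out.printf(Locale.ROOT, "jaro=%.4f jaroWinkler=%.4f%n", j, jw);
    }
}
```

For the pair "MARTHA" / "MARHTA" the Jaro term is (1 + 1 + 5/6) / 3 = 17/18, and the shared prefix "MAR" adds 3 × 0.1 × (1 − 17/18), which gives roughly 0.9611.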