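Align the following two sequences, where A pairs with T and C pairs with G. The scoring rules are: a match scores 5 points, a mismatch costs 3 points, and a gap costs 4 points (linear gap penalty). Sequence 1: CAATTCAGTTA Sequence 2: GGATCGA Give the highest-scoring alignment, and provide the code in Python.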
Time: 2023-04-02 16:04:08
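Here is an answer. Treating this as a global (end-to-end) alignment, the highest score is 3. Several alignments tie at that score; one of them is:
Sequence 1: CAATTCAGTTA
Sequence 2: G--GA-TCGA-
Here - denotes a gap. A column counts as a match when the two bases are complementary (A with T, C with G, in either direction). This alignment has 5 matches, 2 mismatches and 4 gaps, so the score is 5*5 - 2*3 - 4*4 = 3. No alignment can do better: Sequence 1 is 4 bases longer, so at least 4 gaps are needed, and at most 5 of the 7 bases of Sequence 2 can be matched in order.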
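A quick way to check a hand-computed score is to re-score the two gapped rows column by column. The sketch below does that under the rules stated in the question; `score_alignment` and `COMPLEMENT` are hypothetical names introduced here for illustration, and the example rows are one end-to-end alignment of the two sequences.

```python
# Complementary pairing from the question: A-T and C-G, in either direction.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def score_alignment(row1, row2, match=5, mismatch=-3, gap=-4):
    """Score two gapped rows of equal length, one column at a time."""
    if len(row1) != len(row2):
        raise ValueError("both rows must have the same number of columns")
    total = 0
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            total += gap          # linear gap penalty: every gap column costs the same
        elif COMPLEMENT[a] == b:
            total += match        # complementary bases count as a match
        else:
            total += mismatch
    return total


print(score_alignment("CAATTCAGTTA", "G--GA-TCGA-"))
```

Rows of unequal length are rejected rather than padded, since a missing column usually means the alignment was written down incorrectly.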
The Python code is below:
seq1 = "CAATTCAGTTA"
seq2 = "GGATCGA"

# Scoring scheme: match +5, mismatch -3, gap -4 (linear)
MATCH, MISMATCH, GAP = 5, -3, -4
# A pairs with T and C pairs with G, in either direction
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def pair_score(a, b):
    # Two bases "match" when they are complementary
    return MATCH if COMPLEMENT[a] == b else MISMATCH

n, m = len(seq1), len(seq2)

# Build the scoring matrix for a global alignment; the first row and
# column hold the cost of aligning a prefix against gaps only
matrix = [[0 for j in range(m + 1)] for i in range(n + 1)]
for i in range(1, n + 1):
    matrix[i][0] = i * GAP
for j in range(1, m + 1):
    matrix[0][j] = j * GAP
for i in range(1, n + 1):
    for j in range(1, m + 1):
        matrix[i][j] = max(
            matrix[i-1][j-1] + pair_score(seq1[i-1], seq2[j-1]),  # align the two bases
            matrix[i-1][j] + GAP,                                 # gap in seq2
            matrix[i][j-1] + GAP,                                 # gap in seq1
        )

# For a global alignment the best score is in the bottom-right cell
max_score = matrix[n][m]

# Trace back from the bottom-right cell to recover one optimal alignment
match1 = ""
match2 = ""
i, j = n, m
while i > 0 or j > 0:
    if i > 0 and j > 0:
        diag = matrix[i-1][j-1] + pair_score(seq1[i-1], seq2[j-1])
    else:
        diag = None
    if diag is not None and matrix[i][j] == diag:
        match1 = seq1[i-1] + match1
        match2 = seq2[j-1] + match2
        i -= 1
        j -= 1
    elif i > 0 and matrix[i][j] == matrix[i-1][j] + GAP:
        match1 = seq1[i-1] + match1
        match2 = '-' + match2
        i -= 1
    else:
        match1 = '-' + match1
        match2 = seq2[j-1] + match2
        j -= 1

print("Highest alignment score:", max_score)
print("Alignment:")
print(match1)
print(match2)
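The question can also be read as asking for the best-matching region rather than an end-to-end alignment. A minimal sketch of that local reading follows, assuming the same scoring rules with cell scores floored at zero (the Smith-Waterman scheme). `local_align` and `pair_score` are hypothetical names used only for this illustration.

```python
MATCH, MISMATCH, GAP = 5, -3, -4
# Complementary pairing from the question: A-T and C-G, in either direction.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pair_score(a, b):
    return MATCH if COMPLEMENT[a] == b else MISMATCH


def local_align(seq1, seq2):
    """Return (best score, gapped region of seq1, gapped region of seq2)."""
    n, m = len(seq1), len(seq2)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_i, best_j = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0,  # a local alignment may start fresh at any cell
                H[i - 1][j - 1] + pair_score(seq1[i - 1], seq2[j - 1]),
                H[i - 1][j] + GAP,
                H[i][j - 1] + GAP,
            )
            if H[i][j] > best:
                best, best_i, best_j = H[i][j], i, j

    # Trace back from the best cell until the score drops to zero.
    out1, out2 = [], []
    i, j = best_i, best_j
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + pair_score(seq1[i - 1], seq2[j - 1]):
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i -= 1
            j -= 1
        elif H[i][j] == H[i - 1][j] + GAP:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return best, "".join(reversed(out1)), "".join(reversed(out2))


score, region1, region2 = local_align("CAATTCAGTTA", "GGATCGA")
print(score)
print(region1)
print(region2)
```

For the two sequences in the question this reading gives a score of 13. That is higher than the end-to-end score because the unmatched ends of Sequence 1 are simply left out instead of being charged as gaps.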