models of how this happens. It's not so obvious how to build a single model that reasons from symptoms
to diseases, i.e. P(e | f). Furthermore, we may have independent sources of information about P(e) in
isolation, such as old hospital records.
----------------------------------------------------------------------------------
7. Word Reordering in Translation
If we reason directly about translation using P(e | f), then our probability estimates had better be very
good. On the other hand, if we break things apart using Bayes Rule, then we can theoretically get good
translations even if the probability numbers aren't that accurate.
For example, suppose we assign a high value to P(f | e) only if the words in f are generally translations of
words in e. The words in f may be in any order: we don't care. Well, that's not a very accurate model of
how English gets turned into French. Maybe it's an accurate model of how English gets turned into
really bad French.
Now let's talk about P(e). Suppose that we assign a high value to P(e) only if e is grammatical. That's
pretty reasonable, though difficult to do in practice.
An interesting thing happens when we observe f and try to come up with the most likely translation e.
Every e gets the score P(e) * P(f | e). The factor P(f | e) will ensure that a good e will have words that
generally translate to words in f. Various “English” sentences will pass this test. For example, if the
string “the boy runs” passes, then “runs boy the” will also pass. Some word orders will be grammatical
and some will not. However, the factor P(e) will lower the score of ungrammatical sentences.
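The scoring described above can be sketched in a few lines of code. Everything below is invented for illustration: the candidate sentences, the P(e) values, and the toy dictionary are made-up stand-ins, not outputs of any real model.

```python
# Noisy-channel scoring sketch: every candidate e gets P(e) * P(f | e).
# The numbers and dictionary are hypothetical toy values.

candidates = {
    "the boy runs": 0.01,    # toy P(e): grammatical, so higher
    "runs boy the": 0.0001,  # toy P(e): ungrammatical, so lower
}

def p_f_given_e(f_bag, e):
    """Toy channel model: high score if the English words translate,
    word for word, into the French bag -- word order ignored."""
    toy_dict = {"the": "le", "boy": "garcon", "runs": "court"}
    e_translated = sorted(toy_dict.get(w, "?") for w in e.split())
    return 0.9 if e_translated == sorted(f_bag) else 0.0001

f_bag = ["le", "garcon", "court"]
best = max(candidates, key=lambda e: candidates[e] * p_f_given_e(f_bag, e))
print(best)  # prints: the boy runs
```

Both candidates pass the bag test (P(f | e) = 0.9 for each), so P(e) alone decides between them, which is exactly the division of labor the text describes.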
In effect, P(e) worries about English word order so that P(f | e) doesn't have to. That makes P(f | e) easier
to build than you might have thought. It only needs to say whether or not a bag of English words
corresponds to a bag of French words. This might be done with some sort of bilingual dictionary. Or to
put it in algorithmic terms, this module needs to be able to turn a bag of French words into a bag of
English words, and assign a score of P(f | e) to the bag-pair.
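A minimal version of that bag-to-bag module might look like the following. The bilingual dictionary and the penalty value are invented for the sketch; a real module would use translation probabilities rather than an all-or-nothing match.

```python
# Bag-to-bag sketch: turn a bag of French words into a bag of English
# words with a (hypothetical) bilingual dictionary, and score the pair.

from collections import Counter

fr_to_en = {"le": "the", "garcon": "boy", "court": "runs"}  # toy dictionary

def translate_bag(f_bag):
    """Map each French word to its English entry, ignoring order."""
    return Counter(fr_to_en.get(w) for w in f_bag)

def bag_score(f_bag, e_bag):
    """Crude stand-in for P(f | e): full credit if the bags correspond
    word for word, a tiny penalty factor otherwise."""
    return 1.0 if translate_bag(f_bag) == Counter(e_bag) else 1e-6

print(bag_score(["le", "garcon", "court"], ["runs", "the", "boy"]))  # prints: 1.0
```

Note that the module never looks at word order: {"runs", "the", "boy"} in any arrangement gets the same score, just as the text says.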
Exercise. Put these words in order: “have programming a seen never I language better”. This task is
called bag generation.
Exercise. Put these words in order: “actual the hashing is since not collision-free usually the is less
perfectly the of somewhat capacity table”.
Exercise. What kind of knowledge are you applying here? Do you think a machine could do this job?
Can you think of a way to automatically test how well a machine is doing, without a lot of human
checking?
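One possible automatic test, sketched below: take real English sentences, scramble the words, ask the machine to put them back in order, and count how often it reproduces the original exactly. The "machine" here is a hypothetical stand-in that scores permutations with an invented bigram table; any real reordering model could be plugged into its place.

```python
# Automatic bag-generation test: scramble known sentences, reorder,
# and measure exact-match accuracy against the originals.

import itertools
import random

# Invented bigram counts standing in for a real language model.
bigram_score = {("the", "boy"): 3, ("boy", "runs"): 3}

def reorder(bag):
    """Pick the permutation of the bag with the highest bigram score."""
    def score(perm):
        return sum(bigram_score.get(pair, 0) for pair in zip(perm, perm[1:]))
    return " ".join(max(itertools.permutations(bag), key=score))

def accuracy(sentences):
    hits = 0
    for s in sentences:
        bag = s.split()
        random.shuffle(bag)          # destroy the word order
        hits += (reorder(bag) == s)  # can the machine restore it?
    return hits / len(sentences)

print(accuracy(["the boy runs"]))  # prints: 1.0
```

No human checking is needed: the original sentences serve as their own answer key, which is the appeal of this evaluation.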
Exercise. Put these words in order: “loves John Mary”
The last exercise is hard. It seems like P(f | e) needs to know something about word order after all. It
can't simply suggest a bag of English words and be done with it. But, maybe it only needs to know a
little bit about word order, not everything.
----------------------------------------------------------------------------------
8. Word Choice in Translation