A conversation with translator Art Goldhammer
As far as I'm concerned, Andrew Conner's example reveals a shortcoming of the translator far more basic than sexist bias or lack of nuance. The meaning of every pronoun should be entirely unambiguous, thanks to the presence of a unique antecedent ("John", in the first sentence). Even if Google somehow believed that "John" is a female name, the pronouns would all come out the same. My impression is that the previous, logical (as opposed to statistical) wave of AI would have gotten this one right. Apparently, whatever Google is doing cannot connect words that are more than two sentences apart. If this is a general limitation, then we can sleep easy, at least those of us with tech jobs beyond data entry...
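To make the point concrete, here is a toy sketch of the kind of rule-based resolution the earlier wave of AI relied on: scan left to right, remember the most recent gender-matching named antecedent, and link each pronoun back to it. This is an illustration only, not Google's pipeline; the gender lexicon and the example sentence are made up for the demo.

```python
from collections import defaultdict

# Hypothetical minimal gender lexicon for named antecedents (assumption,
# not drawn from any real system).
GENDER = {"John": "male", "Mary": "female"}

MALE_PRONOUNS = {"he", "him", "his"}
FEMALE_PRONOUNS = {"she", "her", "hers"}

def resolve_pronouns(tokens):
    """Link each pronoun to the most recent gender-matching antecedent."""
    antecedent = {"male": None, "female": None}
    links = []
    for tok in tokens:
        if tok in GENDER:
            # A named entity becomes the current antecedent for its gender.
            antecedent[GENDER[tok]] = tok
        elif tok.lower() in MALE_PRONOUNS:
            links.append((tok, antecedent["male"]))
        elif tok.lower() in FEMALE_PRONOUNS:
            links.append((tok, antecedent["female"]))
    return links

tokens = "John lost his keys . He asked Mary if she had seen them .".split()
print(resolve_pronouns(tokens))
# → [('his', 'John'), ('He', 'John'), ('she', 'Mary')]
```

With a unique antecedent, distance is irrelevant: "He" links to "John" no matter how many sentences intervene, which is exactly what the statistical translator fails to do.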
> Right now, MT is only doing step 1 — the wide reading of an entire corpus. There is currently no technical mechanism for doing the more focused reading in step 2.
"Fine-tuning" models on a smaller, more focused dataset definitely exists in other areas of natural-language processing. I wonder why we haven't seen it applied to translation?