thompsonb.github.io - Brian Thompson

I am currently a Senior Applied Scientist at Amazon AWS AI. I previously worked at Apple, Johns Hopkins University (where I also completed my PhD), MIT Lincoln Laboratory, and Rincon Research Corporation on topics including text-to-speech, machine translation (MT), bitext curation and filtering, automatic MT evaluation, multilingual modeling, paraphrasing, cross-language information retrieval, domain adaptation, and digital signal processing.

I developed Vecalign for the ParaCrawl parallel data acquisition project. Vecalign is an accurate sentence alignment algorithm, based on multilingual sentence embeddings, whose complexity is linear in the number of sentences being aligned. In conjunction with LASER, Vecalign makes it easy to perform sentence alignment in about 100 languages (i.e. 100^2 language pairs) without the need for a machine translation system or lexicon. At the time of writing, Vecalign has the best reported performa

I also developed Prism, an automatic MT metric that uses a sequence-to-sequence paraphraser to score MT system outputs conditioned on their respective human references. Prism uses a multilingual neural MT model as a zero-shot paraphraser, which eliminates the need for synthetic paraphrase data and yields a single model that works in many languages (we release a model covering 39 languages). At the time of publication, Prism outperformed or statistically tied with all metrics submitted to the WMT 2019 metri
