value-nlp.org - Multi-VALUE: Cross-Dialectal NLP

Description: A toolkit for measuring and reducing discrepancies in Cross-Dialectal NLP.

Example domain paragraphs

🤔 Problem: Dialect differences cause performance issues for many types of users of language technologies. If we want fair, inclusive, and equitable NLP, our systems need to be dialect invariant: performance should be constant over dialect shifts.

💡 Solution: Multi-VALUE is a suite of resources for evaluating and achieving English dialect invariance. It contains tools for systematically modifying written text in accordance with 189 attested linguistic patterns from 50 varieties of English. Researchers can use this to (1) build dialect stress tests and (2) train more robust models using Multi-VALUE as data augmentation.
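To make the two uses concrete, here is a minimal sketch of rule-based text perturbation used for data augmentation. It does not use the Multi-VALUE toolkit or its API: the names `TOY_RULES`, `perturb`, and `augment` are hypothetical, and the single regex rule is a crude stand-in for one dialect feature (copula deletion before an -ing verb), not one of the 189 patterns as implemented by the authors.

```python
import re

# Hypothetical toy rule set. Each entry is (name, compiled pattern, replacement).
# This is an illustrative approximation, NOT a Multi-VALUE rule implementation.
TOY_RULES = [
    (
        "zero_copula",
        re.compile(r"\b(he|she|they|we|you)\s+(?:is|are)\s+(\w+ing)\b", re.IGNORECASE),
        r"\1 \2",
    ),
]


def perturb(text, rules=TOY_RULES):
    """Apply each rule to the text; return the new text and the names of rules that fired."""
    applied = []
    for name, pattern, replacement in rules:
        text, count = pattern.subn(replacement, text)
        if count:
            applied.append(name)
    return text, applied


def augment(dataset):
    """Return the original (text, label) pairs plus a perturbed copy of each pair a rule changed."""
    augmented = list(dataset)
    for text, label in dataset:
        new_text, applied = perturb(text)
        if applied:
            augmented.append((new_text, label))
    return augmented
```

Running `perturb` over an existing test set gives a small stress test, while `augment` produces extra training pairs that keep the original labels. A real setup would swap the toy rule for the toolkit's own transformations.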

🧪 Experiments: You can reproduce experiments showing significant performance disparities across dialects in question answering (QA), machine translation (MT), and semantic parsing tasks. To close these gaps, you can start by training on synthetic dialect data.
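One simple way to quantify such a disparity is the drop in accuracy between the original test set and its dialect-transformed counterpart. The sketch below assumes a classification-style task with exact-match scoring; the function names `accuracy` and `dialect_gap` are hypothetical, and QA, MT, and semantic parsing would each use their own task metric in place of exact match.

```python
def accuracy(predictions, golds):
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        raise ValueError("cannot score an empty evaluation set")
    return sum(p == g for p, g in zip(predictions, golds)) / len(golds)


def dialect_gap(golds, preds_original, preds_dialect):
    """Accuracy on the original inputs minus accuracy on the dialect-transformed inputs.

    A dialect-invariant system would have a gap of zero.
    """
    return accuracy(preds_original, golds) - accuracy(preds_dialect, golds)
```

A positive gap means the system does worse on the transformed inputs. Retraining with synthetic dialect data and recomputing the gap shows whether the disparity has shrunk.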
