Monday December 16, 2024 13:15 - 14:15 GMT+03

Description:
AI continues to shape industries and innovation, and language plays a critical role in expanding the reach and capabilities of generative AI models. However, many languages are still underrepresented in training datasets. These are called "low-resource languages." For example, Common Crawl is a free and open repository of web crawl data, widely used for training large language models. Yet 46.5% of its documents are primarily in English. English is followed by Russian, German, Japanese and Spanish, each comprising around 5% of the dataset. According to UNESCO, there are over 8300 languages worldwide, whereas Common Crawl contains only 160 languages. Training AI systems on a diverse set of languages is a precondition for advancing human rights and inclusion in the digital age. This session, "The Human Rights Impact of Underrepresented Languages in AI: The Unspoken South," will explore this issue by identifying problems and mapping solutions. First, it will underscore the policy and societal implications of language underrepresentation in AI systems. This will include the impacts on cultural rights under international human rights law, specifically the rights to take part in cultural life; to enjoy the benefits of scientific progress; and to benefit from the protection of scientific, literary or artistic production, including the protection of traditional knowledge. The session will also cover AI-specific policy implications, such as bias, fairness and safety. Second, the session will highlight lines of action to address the challenge. These may include (1) creating incentive systems for people to contribute data ethically; (2) raising awareness to mainstream the topic within the digital rights agenda; (3) advocating to unlock access to language datasets for the communities that are culturally associated with the data therein; and (4) co-designing copyright licenses that attend to the needs of low-resource language communities affected by AI.
Workshop Room 1