A community-driven research ecosystem dedicated to empowering under-resourced languages with cutting-edge AI and Knowledge Graph technology.
To make under-resourced languages community-driven, accessible, reusable, and AI-ready. We aim to bridge the digital divide by providing communities with the tools and structured data necessary to thrive in the modern technological landscape.
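A world where every language, regardless of its resource status, has a sovereign digital presence. We envision an open ecosystem of FAIR (Findable, Accessible, Interoperable, Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) data that empowers speakers to build their own futures.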
Developing foundational Knowledge Graphs for under-represented languages.
Ensuring language data remains under the collective control of the community.
Creating high-quality datasets for training inclusive large language models (LLMs) and speech-to-text (STT) models.
Training community members in digital documentation and AI literacy.
QuechuaBase Lab emerged from the intersection of deep linguistic passion and the urgent need for digital inclusion. What started as a grassroots collaboration between researchers in Puno, Peru, has grown into a global network.
We recognized that most AI technologies overlook the rich complexity of Andean and Amazonian languages. By combining traditional knowledge with modern semantic technologies, we created a "Lab" that is not just a physical space but a decentralized community of practitioners and scientists.
Today, our lab operates from Nuñoa, Peru, and Cambridge, UK, bridging the gap between local language speakers and international research institutions.
"Making languages accessible, one word at a time, is not just a technical challenge — it is a commitment to preserving the shared history of humanity."