Creating a model that can process harmful content without generating more of it is a challenge. Addressing it across multiple languages and the varying cultural sensitivities of different regions adds another layer of complexity. CaLICO is a large language model being developed by Textgain to overcome these challenges and help identify and contextualise harmful content in all official European languages.
Large language models, especially commercial ones, typically refuse to handle toxic language, which makes it almost impossible to use them to process harmful content. With CaLICO, we are developing a language model from scratch that can process harmful content responsibly, without perpetuating it. As a winner of the prestigious Large AI Grand Challenge, CaLICO has received significant support, including the resources necessary to train a foundation model on the EU’s supercomputers.
We believe that CaLICO has the potential to revolutionize harmful content analysis. By identifying harmful content across all EU languages, CaLICO will enable more effective analysis of potentially dangerous content such as violent speech. The project’s ongoing development and our commitment to academic collaboration will ensure that CaLICO continues to evolve, providing novel tools for processing harmful content in the future.