Aya Expanse 8B: Translation-Focused Language Model

27/10/2024 20:51:38

Aya Expanse 8B

The top model of the current week in our ranking is Aya Expanse 8B, developed by Cohere For AI.

About the Model:

Aya-Expanse was released in two versions: 32B and 8B parameters, specifically designed for translation tasks.

The model uses a transformer architecture with a 128K token context length and processes text-only input and output. Its development incorporated data arbitrage, multilingual preference training, safety tuning, and model merging techniques.

The model supports 23 languages: Arabic, Chinese (simplified & traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese.

User testing shows strong performance in translation tasks, particularly excelling in Arabic and Vietnamese translations.

The model operates under a CC-BY-NC license and requires adherence to C4AI's Acceptable Use Policy. Testing shows weak performance in coding tasks and general knowledge queries, with notably strict content filtering that can block even standard technical terminology (as expected, since it was designed for translation tasks).

Benchmark results in multilingual Arena-Hard show win rates of: 70.6% vs Llama-3.1 8B, 60.4% vs Gemma-2 9B, 63.1% vs Ministral 8B, and 55.7% vs Qwen-2.5 7B in translation-related tasks.

Benchmark results in multilingual Arena-Hard