A platform for automated and human evaluation of machine translation

Machine translation models are evolving rapidly. Automated scores alone do not fully explain how models behave in production.

Vera combines automated and human evaluation to assess machine translation quality. Automated metrics provide speed and consistency, while human review improves error detection. Together, they reduce uncertainty in model selection, help identify regressions, and enable more transparent and traceable decision-making.

Request a demo

The model defines the translation

The opacity of neural machine translation (NMT) systems and large language models (LLMs) makes automatic metrics alone insufficient: models with similar scores can perform very differently in production. Frequent model updates can also introduce degradations that go unnoticed. Without rigorous evaluation, these quality losses cannot be detected in time.

Vera workflow

Vera integrates automated metrics, human evaluation based on MQM, and comparative analysis into a single workflow. This unified approach combines the speed and reproducibility of automated evaluation with the linguistic accuracy of human review, enabling reliable comparisons between translation systems and more efficient detection of improvements or degradations in translation quality.

Dataset Upload

Automated Metrics

Human Evaluation (MQM)

Model Comparison

Best Decision

Research focused on real decisions

Vera integrates automatic evaluation, human evaluation, and comparative analysis into a single web environment, facilitating processes that are usually carried out using separate tools and fragmented workflows.

The platform is used both for scientific research and validation and for the continuous quality control of our own machine translation systems, including platforms such as Opentrad.

Licensing for universities and research centres
Continuous evaluation of translation models and engines
Objective comparison between systems and versions
Quality control before production deployments

Key features

Automated Evaluation

Human Evaluation

Analysis

Automated Metrics

Provides fast and consistent evaluation of machine translation quality through automated metrics, enabling large-scale comparisons across models, versions, and configurations.

MQM Annotation

Enables detailed human evaluation using the MQM framework to identify and classify translation errors, providing insight into a model's strengths and weaknesses.
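As a rough illustration of how MQM-style annotations can be turned into numbers, the sketch below weights each error by severity and normalises by text length. The severity weights, the `MqmError` class and the function names are assumptions made for this example, not Vera's actual data model or scoring formula.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical severity weights; real MQM setups choose their own values.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}


@dataclass(frozen=True)
class MqmError:
    category: str  # e.g. "accuracy/mistranslation"
    severity: str  # must be a key of SEVERITY_WEIGHTS


def penalty_per_100_words(errors, word_count):
    """Sum the severity weights and normalise to a 100-word text."""
    total = sum(SEVERITY_WEIGHTS[e.severity] for e in errors)
    return 100.0 * total / word_count


def errors_by_category(errors):
    """Count how many errors were annotated in each category."""
    return Counter(e.category for e in errors)


errors = [
    MqmError("accuracy/mistranslation", "major"),
    MqmError("fluency/grammar", "minor"),
    MqmError("fluency/grammar", "minor"),
]
score = penalty_per_100_words(errors, word_count=200)
by_category = errors_by_category(errors)
```

A lower penalty means fewer or less severe errors, and the per-category counts show where a model tends to fail.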

Correlations

Analyses the relationship between automated scores and human judgements to determine when metrics accurately reflect real translation quality.
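One common way to study this relationship is to compute Pearson and Spearman correlations between per-segment metric scores and human scores. The sketch below uses only the Python standard library; the function names and the sample values are illustrative assumptions, and the coefficients Vera actually reports may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up per-segment scores, for illustration only.
metric_scores = [0.10, 0.40, 0.50, 0.90]
human_scores = [1.0, 2.0, 3.0, 4.0]
rho = spearman(metric_scores, human_scores)
r = pearson(metric_scores, human_scores)
```

A high rank correlation suggests the metric orders segments much as human reviewers do; a low one is a signal to rely more on human review for that domain or language pair.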

Statistical Testing

Applies statistical analysis to determine whether differences between models are significant, supporting reliable decision-making.
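Paired bootstrap resampling is one widely used test for comparing two translation systems on the same segments. The sketch below is a minimal version under that assumption; the source does not state which tests Vera applies, and the function name and parameters are hypothetical.

```python
import random


def paired_bootstrap_win_rate(scores_a, scores_b, n_resamples=1000, seed=0):
    """Fraction of resampled test sets on which system A outscores system B.

    scores_a[i] and scores_b[i] must be scores for the same segment i.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same segments")
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        # Draw n segment indices with replacement, shared by both systems.
        sample = [rng.randrange(n) for _ in range(n)]
        total_a = sum(scores_a[i] for i in sample)
        total_b = sum(scores_b[i] for i in sample)
        if total_a > total_b:
            wins += 1
    return wins / n_resamples


# Made-up segment-level scores, for illustration only.
system_a = [0.80, 0.90, 0.85, 0.95]
system_b = [0.50, 0.60, 0.55, 0.65]
win_rate = paired_bootstrap_win_rate(system_a, system_b)
```

A win rate close to 1.0 (for example above 0.95) is usually read as evidence that system A is better on this test set rather than merely lucky in the choice of segments.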

Sampling

Allows the creation of representative evaluation samples, reducing review effort while maintaining reliable quality analysis.
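A representative sample can be drawn, for instance, by stratifying segments by some attribute and sampling the same fraction from each stratum. The following sketch assumes segments are dictionaries with a `domain` field; that layout and the function name are illustrative, not Vera's sampling procedure.

```python
import random
from collections import defaultdict


def stratified_sample(segments, key, fraction, seed=0):
    """Sample roughly `fraction` of the segments from every stratum.

    `key` maps a segment to its stratum label; each stratum contributes
    at least one segment so that small groups are not lost.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    strata = defaultdict(list)
    for segment in segments:
        strata[key(segment)].append(segment)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Made-up dataset: 10 legal segments and 30 news segments.
segments = [{"id": i, "domain": "legal"} for i in range(10)]
segments += [{"id": i, "domain": "news"} for i in range(10, 40)]
sample = stratified_sample(segments, key=lambda s: s["domain"], fraction=0.2)
```

Here reviewers would annotate 8 segments instead of 40 while both domains keep their original 1:3 proportion.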

Error Filtering

Provides advanced filtering to explore errors by type, severity, model or criteria, helping identify improvement areas.

User-Friendly Interface

Offers a web-based environment to configure evaluations, manage projects and analyse results without fragmented workflows.

Reference Creation

Supports the creation and refinement of reference translations for more accurate evaluation in specific domains and scenarios.

Reporting

Generates structured reports and visual summaries to communicate results and support model selection, improvement and deployment decisions.

Additional features

Multiuser Evaluation

Vera enables collaborative evaluation of models among different users and incorporates the calculation of Inter-Annotator Agreement (IAA) to measure consistency between annotations.
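Cohen's kappa is one standard IAA measure for two annotators who label the same items. The sketch below assumes each annotator assigns one severity label per segment; the source does not say which agreement coefficient Vera computes, so the choice of kappa and the function name are assumptions for this example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up severity labels for four segments, for illustration only.
annotator_1 = ["minor", "major", "minor", "none"]
annotator_2 = ["minor", "major", "major", "none"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

The two annotators agree on three of four segments, yet kappa is lower than 0.75 because part of that agreement would be expected by chance.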

Multilingual

Vera is language-independent and imposes no restrictions on language pairs, so any language combination can be evaluated as part of a research project.

Continuous improvement

The platform evolves constantly with the development of new functionalities aimed at improving the system and expanding its capabilities.

Objective comparison using metrics

Vera incorporates automatic evaluation metrics to measure precision, similarity and translation quality between different systems.

The platform compares performance across models, tests whether the differences between them are statistically significant, and presents the variations in a simple, visual way.

Trust is also evaluated

At imaxin, we believe that artificial intelligence must be measurable, analysable and validated in a transparent way.

That is why Vera is not just an evaluation tool: it is the foundation that allows us to build more reliable, accurate machine translation systems adapted to each linguistic context.

It is not enough to translate. We need to know how well it translates.

Applied research for real problems

Vera is developed in collaboration with universities and international research centres, combining scientific rigour and practical application in production environments.

The platform has been presented at specialised forums for machine translation evaluation and is part of an industrial research line aimed at improving the quality and reliability of linguistic AI systems.

Discover how Vera can help your organisation

Get a demo or a free trial

Book a demo to see how Vera can help you make the best decision.

Vera: evaluation, comparison and improvement of machine translation models.
