
VAARHeT Pilot 3: AR Translation Agent for Multilingual Heritage Tours

Validating real-time translation for live museum tours using mobile AR and smart glasses, recording a Net Promoter Score of -14 and revealing critical minority-language support requirements and AR glasses hardware limitations for cultural heritage applications.

Status: Completed
Published: -
Duration: -
Tags: heritage, immersive interactive, data/AI/ML
Programme: Horizon Europe VOXReality | Grant Agreement: 101070521
Funded by the European Union

This project has received funding from the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.

Grant agreement number: 101070521

Customer Need and Value Proposition

Heritage institutions serving international tourism face a persistent linguistic challenge: providing comprehensive interpretive programming across Europe's language diversity without maintaining specialised multilingual guide staff for every visitor language combination. Āraiši Ezerpils Archaeological Park receives visitors requiring interpretation in Latvian (domestic population), English (international tourism standard), German (a significant European heritage tourism segment), and Russian (regional visitor base), yet economic constraints prevent hiring guides fluent across all language pairs for every tour scheduled throughout the operational season. Current practice concentrates multilingual tours in peak-season high-demand periods, whilst shoulder-season and off-peak schedules offer limited language options that exclude visitors lacking English or Latvian proficiency. The VAARHeT Pilot 3 investigation explored whether real-time translation technology using mobile devices and AR glasses could lower the cost and linguistic-expertise barriers to multilingual live tour interpretation, without requiring guides to be fluent across the full European language spectrum.

Operational Challenge Context

Live museum tours deliver superior educational value compared with self-guided experiences through expert narrative, responsive question-answering, contextual emphasis that adapts to visitor interest, and a social group experience creating shared discovery moments. Multilingual delivery, however, requires either guides fluent across multiple languages (rare and expensive) or separate tour instances for each language (fragmenting visitor groups and multiplying guide resource requirements). Simultaneous interpretation equipment used in conference contexts proves impractical for outdoor museum environments, where visitors move across extensive terrain, and its cost structures, designed for large-scale events, are economically inappropriate for typical heritage tour groups of 5-15 participants. Existing translation applications support asynchronous text translation but do not provide real-time speech translation that keeps the conversational flow synchronised, so that translated content reaches international visitors alongside the guide's original-language narration without disruptive delays undermining tour pacing and group cohesion.

Technical Solution Architecture

Pilot 3 implemented a real-time translation pipeline integrating VOXReality Automatic Speech Recognition (ASR), which captured tour guide speech through a smartphone microphone; Neural Machine Translation (NMT), which converted the German source narration into English and Latvian; and text display rendered either through ActiveLook AR smart glasses overlaying translated subtitles on the visitor's field of view or on a Samsung Galaxy Note10+ 5G mobile device screen for participants without AR glasses. The guide carried a smartphone whose microphone transmitted audio to the cloud-based VOXReality ASR service (German language model), generating text transcriptions that fed the VOXReality Neural Machine Translation service; the resulting English and Latvian translations were transmitted back to participant devices for subtitle rendering. The architecture prioritised real-time processing to minimise translation lag and maintain synchronisation with the tour narrative, whilst the AR glasses aimed to provide hands-free subtitle reading so participants could keep their visual attention on guide gestures, site features, and group interactions without looking down at mobile screens.
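The exact VOXReality service interfaces are not documented in this summary, but the shape of the pipeline can be sketched. The following Python sketch is illustrative only: the endpoint URLs, response fields, and the ParticipantDevice class are assumptions, not the project's actual API.

```python
# Sketch of the Pilot 3 translation pipeline: guide speech -> ASR (German)
# -> NMT (German->English, German->Latvian) -> subtitles pushed to devices.
# All URLs, payload fields and the device class are hypothetical.

from dataclasses import dataclass
import requests

ASR_URL = "https://example.org/voxreality/asr/de"   # assumed endpoint
NMT_URL = "https://example.org/voxreality/nmt"      # assumed endpoint
TARGET_LANGUAGES = ["en", "lv"]                     # English, Latvian

@dataclass
class ParticipantDevice:
    """AR glasses or mobile client, identified by its subtitle language."""
    language: str

    def show(self, text: str) -> None:
        print(f"[{self.language}] {text}")          # stand-in for the real display call

def transcribe_chunk(audio_bytes: bytes) -> str:
    """Send one audio chunk from the guide's smartphone to the ASR service."""
    response = requests.post(ASR_URL, data=audio_bytes,
                             headers={"Content-Type": "audio/wav"}, timeout=5)
    response.raise_for_status()
    return response.json()["transcript"]            # assumed response field

def translate(text: str, target_lang: str) -> str:
    """Translate a German transcript into one target language."""
    response = requests.post(NMT_URL, json={
        "source_lang": "de", "target_lang": target_lang, "text": text,
    }, timeout=5)
    response.raise_for_status()
    return response.json()["translation"]           # assumed response field

def process_audio_chunk(audio_bytes: bytes, devices: list[ParticipantDevice]) -> None:
    """One end-to-end step: transcribe, translate, and render subtitles."""
    transcript = transcribe_chunk(audio_bytes)
    subtitles = {lang: translate(transcript, lang) for lang in TARGET_LANGUAGES}
    for device in devices:
        device.show(subtitles.get(device.language, ""))
```

In a streaming deployment this loop would run continuously on short audio chunks, so per-chunk processing time translates directly into the subtitle lag discussed below.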

Validation Methodology

The Āraiši Ezerpils validation (14-16 July 2025) tested live tour translation with 39 participants, divided between ActiveLook AR glasses users and mobile device display users, who followed authentic museum tours led by a Latvian-speaking guide narrating in German whilst the translation system generated real-time English and Latvian subtitles. Evaluation instruments included Net Promoter Score measuring recommendation likelihood; translation accuracy assessment by language pair, comparing machine translations against a professional translator reference; hardware usability evaluation rating the AR glasses and mobile display interfaces separately; user experience ratings across interaction dimensions (first impressions, ease of use, text readability, synchronisation quality); and qualitative feedback capturing participant perspectives on the value of multilingual access, technical reliability, and deployment readiness for operational museum use.
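Net Promoter Score follows the standard definition: respondents rate likelihood to recommend on a 0-10 scale; 9-10 counts as a promoter, 7-8 as a passive, 0-6 as a detractor; and the score is the percentage of promoters minus the percentage of detractors. A minimal sketch, with invented scores rather than the pilot's raw data:

```python
def net_promoter_score(scores: list[int]) -> float:
    """Standard NPS: % promoters (scores 9-10) minus % detractors (scores 0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Invented example: a score of roughly -14 arises when detractors outnumber
# promoters by about 14 percentage points.
example_scores = [10, 9, 9, 7, 7, 8, 6, 5, 4, 3, 2, 6, 9, 8]
print(round(net_promoter_score(example_scores)))   # -> -14
```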

Quantified Outcomes and Metrics

Pilot 3 produced the lowest satisfaction metrics among the VAARHeT components, with a Net Promoter Score of -14 (10 promoters, 8 passives, 21 detractors from 39 respondents), indicating a negative overall reception requiring substantial improvement before acceptable deployment (VAARHeT Pilot 3 Validation Report, July 2025). Task completion reached only 58%, with participants able to follow the tour narrative through translated subtitles substantially less often than in Pilots 1 and 2, indicating reliability and quality barriers that prevented consistently functional operation. Accuracy averaged 65% across all language pairs with substantial variation: German-English translation met intelligibility thresholds, whilst Latvian translation suffered from frequent vocabulary errors, grammatical inconsistencies, and contextual misunderstandings that undermined comprehension, revealing that the VOXReality Latvian language model is not yet mature enough for heritage interpretation deployments requiring precise cultural and archaeological terminology. Latency averaged 2100 milliseconds, creating a noticeable synchronisation lag in which translated subtitles appeared substantially after the corresponding guide speech, disrupting the tour narrative flow and preventing participants from correlating spoken content with visual site features or guide gestures in real time.
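The 2100 ms average is an end-to-end delay spanning audio capture, ASR, NMT, network transfer, and rendering; attributing it to individual stages would require per-stage timing. A hypothetical instrumentation sketch, with stand-in functions in place of the real services:

```python
import time
from contextlib import contextmanager

# Hypothetical per-stage timing to attribute the end-to-end subtitle lag
# (ASR -> NMT -> render); stage names and stand-in functions are illustrative.

stage_timings: dict[str, float] = {}

@contextmanager
def timed(stage: str):
    """Record the wall-clock duration of one pipeline stage in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_timings[stage] = (time.perf_counter() - start) * 1000

def fake_asr(chunk: bytes) -> str:
    return "transkribierter Text"        # stand-in for the ASR service call

def fake_nmt(text: str, target_lang: str) -> str:
    return "tulkots teksts"              # stand-in for the NMT service call

def fake_render(text: str) -> None:
    pass                                 # stand-in for the subtitle display

def process_chunk(audio_chunk: bytes) -> None:
    """Run one subtitle update, timing each stage separately."""
    with timed("asr"):
        transcript = fake_asr(audio_chunk)
    with timed("nmt"):
        subtitle = fake_nmt(transcript, "lv")
    with timed("render"):
        fake_render(subtitle)

process_chunk(b"\x00" * 1024)
total_ms = sum(stage_timings.values())
for stage, ms in stage_timings.items():
    print(f"{stage}: {ms:.3f} ms ({100 * ms / total_ms:.0f}% of total)")
```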

Strategic Insights and Lessons

Pilot 3 generated a critical strategic insight: European linguistic diversity requires dedicated multilingual AI investment covering smaller language communities (Latvian, Lithuanian, Estonian, regional dialects) beyond the high-resource languages (English, German, French, Spanish) that dominate commercial AI training priorities. Minority language support is an essential baseline requirement, not an optional enhancement, for credible deployment in the European heritage sector. This finding proved particularly significant for the Culturama Platform language strategy, establishing that platforms claiming a European cultural heritage focus without comprehensive linguistic coverage risk rejection by heritage professionals who prioritise local community accessibility over international tourist convenience. The ActiveLook AR glasses hardware evaluation found current smart glasses technology inadequate for comfortable extended text reading in outdoor daylight conditions: the small display area, limited brightness, and font size constraints created usability barriers that prevent recommendation for heritage tour applications, despite the conceptual appeal of hands-free subtitle display. Participant feedback consistently indicated that mobile device interfaces, whilst less immersive, offered superior practical usability, with readable text, intuitive interaction controls, and reliable visual feedback, and without novel hardware adoption barriers or specialised device training requirements.

Platform Evolution and Commercial Pathway

Pilot 3 evidence informed the Culturama Platform language strategy, establishing minority language support as a critical success factor requiring dedicated investment in coverage of the European linguistic spectrum and prioritising accuracy across diverse language communities over feature completeness or rapid deployment timelines. Commercial positioning acknowledges multilingual capability as an essential baseline for credibility in the European heritage market whilst tempering expectations about current technical maturity, explicitly communicating to potential clients that high-quality minority language support requires continued development investment beyond initial platform launch capabilities focused on high-resource language pairs. The platform development roadmap deprioritises AR glasses integration given current hardware limitations, focusing instead on mobile-first interfaces and desktop web applications that provide reliable usability without requiring novel hardware adoption, which would create deployment friction and cost barriers for heritage institutions operating within constrained technology budgets.

Partnership Model and Attribution

Āraiši Ezerpils Archaeological Park provided the live tour operational environment enabling an authentic validation context, with guide participation, multilingual visitor recruitment ensuring representative language distribution, and heritage interpretation expertise validating cultural content accuracy requirements. Technical Art Services contributed hardware usability evaluation methodology, accessibility assessment frameworks, and liaison between technology capabilities and museum operational requirements. Maggioli, as VOXReality consortium leader, provided access to the Neural Machine Translation components whilst acknowledging Latvian language model maturity limitations requiring continued research investment. F6S Innovation facilitated consortium coordination and technical mentorship connections supporting implementation troubleshooting and validation methodology refinement. The validation produced honest negative results that offered valuable strategic intelligence about technology readiness limitations, preventing premature deployment whilst informing targeted improvement priorities for achieving production-ready multilingual capability that meets heritage institutions' quality standards.

Validation Metrics

Net Promoter Score (likelihood to recommend): -14 (Needs Improvement)

Validation Metrics Profile (all validation dimensions normalised to a 0-100 scale):
- Net Promoter Score (NPS): -14
- Task Completion Rate: 58%
- Accuracy Rate: 65%
- Response Time (latency): 2100 ms
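How each dimension is mapped onto the 0-100 profile scale is not specified here. One plausible mapping, assumed purely for illustration, rescales NPS from its -100 to 100 range, passes percentages through unchanged, and scores latency against an assumed 3000 ms ceiling (lower is better):

```python
def normalise_metrics(nps: float, task_pct: float, accuracy_pct: float,
                      latency_ms: float, latency_ceiling_ms: float = 3000) -> dict:
    """Map each validation dimension onto a 0-100 scale (illustrative only).

    Assumptions: NPS (-100..100) is rescaled linearly; percentages pass through;
    latency is scored against an assumed ceiling where 0 ms = 100 and anything
    at or above the ceiling = 0.
    """
    latency_score = max(0.0, 100 * (1 - latency_ms / latency_ceiling_ms))
    return {
        "NPS": (nps + 100) / 2,
        "Task": task_pct,
        "Accuracy": accuracy_pct,
        "Latency": latency_score,
    }

print(normalise_metrics(nps=-14, task_pct=58, accuracy_pct=65, latency_ms=2100))
# -> {'NPS': 43.0, 'Task': 58, 'Accuracy': 65, 'Latency': 30.0}
```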