Nuwa

Evaluation of XR Pilots for Humanitarian Response Use Cases

Peer-reviewed evaluation of extended reality pilot implementations for humanitarian emergency response training, assessing usability, training effectiveness, and technology integration within operational contexts.

Published: by Guillaume Auvray, Mark Roddy, Ciaran O'Floinn, Laurence Knoop
DOI: 10.5281/zenodo.17009269
Funded by the European Union

This project has received funding from the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.

Grant agreement number: 101070192

Abstract

Nuwa partnered with Action Contre la Faim's Emergency Readiness and Response Unit to develop a proof-of-concept platform for demonstrating and evaluating cooperative real-time experiences in extended reality (XR) for humanitarian crisis training. The platform integrated CORTEX2 enabling technologies, including Rainbow CPaaS for secure communication, Video Call Alternative Appearance for avatar-based presence, Conversational Virtual Agent for AI dialogue, and automatic meeting summarisation. Three pilots addressed distinct phases of the emergency management cycle: an arrival briefing introducing fundamental concepts through interaction with an AI mentor, collaborative development of an alert and response strategy in a team environment, and an implementation simulation featuring stakeholder negotiations with AI-powered characters. An evaluation conducted in Paris on 14 May 2025 with eight Action Contre la Faim emergency roster personnel assessed usability with the System Usability Scale (average 59%), training value across five components (average 70% added value), and overall satisfaction (average 66%). The results indicate limited XR value for theoretical knowledge transfer (rated 3.2 out of 5), moderate value for collaborative planning (3.6 out of 5), and substantial value for soft skills practice (4.2 out of 5), which supports a selective application strategy prioritising implementation-phase training. The findings recommend interface streamlining, support for both desktop and VR headsets, improved speech recognition for non-native speakers, and focused deployment on high-value applications rather than coverage of the whole emergency cycle. The platform achieved Technology Readiness Level 7 and received Unity for Humanity Grant recognition for ImmErgenSim, validating both technical maturity and social impact potential whilst identifying clear pathways for continued development toward deployment of the commercial SimExBuilder platform.

Introduction and Research Context

Humanitarian emergency response training faces persistent challenges, including high delivery costs, limited geographic accessibility, and the difficulty of replicating realistic crisis decision-making through conventional modalities. The XRisis project, funded through the cascade mechanism of EU Horizon Europe CORTEX2 Open Call Track 2, investigated whether cooperative real-time extended reality (XR) technologies could address these limitations whilst improving learning outcomes for emergency preparedness competencies. Action Contre la Faim's Emergency Readiness and Response Unit aims to ensure that 80% of country offices maintain current Emergency Preparedness and Response Plans through systematic training delivery. Conventional in-person simulation exercises, however, are expensive (often exceeding €50,000 per event), logistically complex (requiring participant travel and extended time commitments), and difficult to repeat often enough to maintain organisational readiness across global operations spanning 56 countries. In theory, extended reality can reduce costs by removing the need for physical venues, improve accessibility by letting field staff take part remotely without international travel, and increase realism through immersive scenarios that conventional classroom exercises cannot replicate. That potential still requires validation through rigorous evaluation with operational users in authentic contexts, rather than controlled laboratory demonstrations that may not generalise to real deployment conditions. This research addresses the gap between the claims made by XR technology advocates and evidence-based assessment of actual training value in humanitarian emergency preparedness applications.

Methodology and Validation Approach

The evaluation employed a mixed-methods design combining standardised quantitative instruments with structured qualitative feedback and observational analysis. The System Usability Scale provided validated usability measurement through ten Likert-scale statements, producing scores comparable to benchmark data from thousands of previous studies. The added value assessment split the platform into five specific components (informational briefing from AI avatar, interactive response strategy tool, team collaboration in VR coordination office, soft skills practice with AI avatars, facilitator debrief in VR environment), each rated separately on a five-point scale so that differences in effectiveness between components were not hidden behind an aggregate impression. Open-ended survey questions invited narrative feedback on specific strengths, limitations, and improvement suggestions, complementing the numerical ratings with contextual explanations. Structured verbal debriefs with participants after the exercise used facilitated group discussion to explore perceived training value, usability challenges, comparison with conventional simulation exercises, and recommendations for future development. Separate debrief sessions with facilitators and project team members captured perspectives on delivery effectiveness, technical reliability, pedagogical appropriateness, and operational deployment feasibility that participant-focused sessions would not address. Participant selection engaged eight Action Contre la Faim emergency roster members representing the target user population (emergency response professionals). They were organised as two four-person teams reflecting typical country office composition (programme leads, logistics specialists, finance managers), with demographic diversity across age ranges, gender, emergency deployment experience levels, and prior simulation exercise exposure.

The validation workshop sequenced the three pilots (arrival briefing, collaborative alert and response strategy, implementation simulation) with debrief intervals enabling reflection between phases. Total exercise duration was approximately 90 minutes, excluding induction and debrief sessions. The validation was scheduled after the platform had become stable enough for reliable operation, yet early enough in the development lifecycle for the findings to feed into the evolution of the commercial platform. This timing balanced formative assessment informing improvement against summative evaluation of whether minimum viable quality thresholds had been achieved.
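As an illustration of how the ten Likert-scale statements become a single usability score, the sketch below applies the standard System Usability Scale scoring procedure: odd-numbered statements contribute the response minus 1, even-numbered statements contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score between 0 and 100. The function name and the example responses are hypothetical and are not taken from the study data; the evaluation's own scoring workflow is not described in this summary.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    `responses` holds the ten answers in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Statements 1, 3, 5, 7, 9 are positively worded.
            total += response - 1
        else:
            # Statements 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - response
    return total * 2.5


# Hypothetical answers from a single participant (not study data).
example_responses = [4, 2, 4, 3, 3, 2, 4, 3, 3, 2]
print(sus_score(example_responses))  # 65.0
```

A group result such as the average reported for this evaluation would then be the mean of the per-participant scores.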

Results Summary and Component Differentiation

The System Usability Scale assessment produced an average score of 59% with substantial variance (range 41%-73%), indicating acceptable usability for motivated professional users whilst confirming significant room for improvement, particularly in reducing interface complexity and streamlining navigation. Added value ratings differed clearly by component: soft skills practice with AI avatars (4.2 out of 5) substantially exceeded collaborative team work in the VR office (3.6 out of 5), the interactive response strategy tool (3.4 out of 5), facilitator debrief in VR (3.3 out of 5), and the informational briefing from the AI avatar (3.2 out of 5), giving an overall average of 3.5 out of 5, equivalent to 70% added value. User satisfaction averaged 3.3 out of 5 (66%) with high variance: 50% of participants gave the maximum rating of 5 out of 5 whilst others offered more moderate assessments, suggesting strong appeal to certain user profiles whilst leaving others uncertain about the value proposition. Qualitative feedback consistently identified the implementation simulation as the most valuable component for training outcomes, collaborative planning as moderately useful but dependent on deployment context, and the theoretical briefing as a poor match for XR delivery compared with conventional e-learning alternatives. Technical reliability proved generally acceptable. Occasional issues included AI speech recognition failures, state synchronisation delays, and navigation confusion; none was severe enough to prevent exercise completion, but together they were sufficient to affect usability scores and participant satisfaction. Participants specifically praised the realism of the conversational AI, the avatar diversity options, the coherence of the scenario narrative, and the quality of facilitator support. They criticised interface complexity, unclear task signposting, unnecessarily elaborate environments, and limited speech recognition accuracy, providing actionable improvement priorities for subsequent development iterations.
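The arithmetic behind the overall added value figure can be checked with a short sketch. It uses only the five component ratings reported above; the variable and function names are illustrative, and the conversion of a five-point rating to a percentage (rating divided by 5) is assumed from the way the text equates 3.5 out of 5 with 70%.

```python
# Component ratings as reported in the results (out of 5).
component_ratings = {
    "soft skills practice with AI avatars": 4.2,
    "collaborative team work in VR office": 3.6,
    "interactive response strategy tool": 3.4,
    "facilitator debrief in VR": 3.3,
    "informational briefing from AI avatar": 3.2,
}


def to_percent(rating, scale_max=5):
    """Express a rating on a 1..scale_max scale as a percentage of the maximum."""
    return rating / scale_max * 100


# Unweighted mean of the five component ratings, rounded as in the text.
overall = sum(component_ratings.values()) / len(component_ratings)
overall_rounded = round(overall, 1)

# Components ordered from highest to lowest rated.
ranking = sorted(component_ratings, key=component_ratings.get, reverse=True)

print(overall_rounded)                     # 3.5
print(round(to_percent(overall_rounded)))  # 70
print(ranking[0])                          # soft skills practice with AI avatars
```

The unrounded mean is 3.54, so the 70% figure corresponds to the rounded rating of 3.5 rather than to the exact mean.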

Discussion and Recommendations

The results demonstrate that extended reality delivers differing value across emergency management training applications. This calls for selective deployment focused on implementation-phase soft skills practice, rather than a comprehensive platform that attempts to address all training needs. Theoretical knowledge transfer shows limited incremental benefit from XR delivery compared with conventional e-learning, with elaborate virtual environments creating cognitive load that detracts from conceptual learning rather than enhancing it. Collaborative planning in virtual environments provides moderate value for distributed teams that lack opportunities for face-to-face interaction, but proves suboptimal for co-located participants, who coordinate better through direct interaction supplemented by desktop collaboration tools. Implementation simulation featuring AI-powered stakeholder negotiations delivers substantial training value that conventional exercises cannot replicate without extensive role-player coordination and scheduling complexity, making it a defensible use case where XR capabilities justify the deployment investment. Platform accessibility requires ruthless interface simplification, comprehensive user induction, improved speech recognition accuracy for international deployment, and support for both desktop and VR headsets so that organisations can match hardware to budget constraints. Commercial development should prioritise capabilities with validated high value (implementation simulation) whilst reconsidering or eliminating components with modest returns (theoretical briefing), allocating resources to refining core strengths rather than attempting feature parity across all emergency management applications.

The evaluation demonstrates that rigorous validation with operational users in realistic deployment contexts generates the evidence needed for informed decisions about adopting XR training technology. It challenges both uncritical enthusiasm, which assumes that immersion universally enhances learning, and blanket scepticism, which dismisses XR as an unproven novelty without genuine value. The research contributes a methodology for evaluating immersive training effectiveness beyond technical feasibility demonstrations. It combines standardised instruments, which enable cross-study comparison, with domain-specific assessment, which ensures that results meaningfully inform operational deployment decisions rather than merely satisfying academic publication requirements.

Conclusion and Future Research

The XRisis evaluation provides an evidence-based foundation for selective XR deployment in humanitarian emergency preparedness training. It validates implementation-phase simulation as a high-value application whilst revealing limited benefits for theoretical knowledge transfer and mixed results for collaborative planning, depending on deployment context. The platform achieved Technology Readiness Level 7 through successful validation in an operational environment with target users. This positions its capabilities for commercial development and identifies clear refinement priorities addressing usability barriers and technical limitations. Future research should investigate long-term learning retention by comparing XR simulation training against conventional modalities, analyse cost-effectiveness across the full deployment lifecycle including content development and technical support, explore multilingual deployment approaches that address speech recognition limitations, and validate scalability through multi-site implementations across diverse organisational and cultural contexts. The findings inform an evidence-based XR training strategy that emphasises targeted application to specific high-value requirements rather than universal immersion, contributing to a realistic assessment of the capabilities and limitations of immersive technology in educational contexts.

Funding Acknowledgement

This work was supported by the European Union's Horizon Europe research and innovation programme under CORTEX2 grant agreement number 101070192. Views and opinions expressed reflect those of the authors only and do not necessarily represent positions of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for content herein.

Data Availability

Evaluation data and detailed technical documentation are available through the Zenodo repository: https://doi.org/10.5281/zenodo.17009269

Published under the Creative Commons Attribution 4.0 International license, which permits reuse with appropriate citation.