Technical Literacy and Accessibility in XR Training Platforms

Validation evidence challenges assumptions about technical literacy as a deployment barrier, whilst identifying interface complexity, induction design, and speech recognition as critical success factors for inclusive XR training access across diverse user populations.

Published by Anastasiia P.
Funded by the European Union

This project has received funding from the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.

Grant agreement number: 101070192

Challenging Technical Literacy Assumptions

The XRisis validation challenged prevalent assumptions that technical literacy represents a fundamental barrier to XR training platform adoption among humanitarian professionals, particularly across generational and gender demographics. During requirements gathering, stakeholders raised concerns that older staff members and women might struggle with immersive technology interfaces, potentially creating a digital divide in which technology-enabled training benefited some employee populations whilst excluding others, contradicting organisational commitments to inclusive access and equitable professional development. Validation results contradicted these concerns: participants spanning ages from the late twenties to the late fifties, with varied prior gaming experience and general comfort with technology, achieved functional proficiency with the XRisis platform after brief induction sessions, suggesting that motivated professionals can overcome initial unfamiliarity when training serves clear operational purposes. Gender analysis revealed no significant differences in usability scores or effectiveness ratings between male and female participants, challenging assumptions that women would experience greater difficulty with spatial navigation in 3D environments or with manipulating virtual objects in immersive contexts. Several participants explicitly commented that their initial technology anxiety dissipated quickly once they engaged with actual scenarios, discovering that domain familiarity (understanding emergency management procedures, organisational roles, and humanitarian coordination) proved far more important than gaming experience or general technological sophistication for productive platform use.
The validation evidence suggested that concerns about technical literacy often reflect organisational leadership anxieties rather than actual workforce capabilities: decision-makers who themselves lack comfort with immersive technologies project their uncertainties onto staff who may actually adapt readily when provided with appropriate support and authentic reasons to engage with new tools. This finding proves strategically important because technology adoption decisions often stall when leaders fear staff resistance, making evidence about actual adoption patterns essential for overcoming institutional inertia that blocks potentially beneficial innovations. The project discovered that meaningful induction design matters far more than assumed baseline technical literacy: when onboarding processes clearly explain purpose, demonstrate core functions through hands-on practice, provide easily accessible reference materials, and establish psychological safety about making mistakes during initial exploration, users with minimal prior XR experience achieve adequate proficiency for training participation. Technical literacy operates on multiple dimensions beyond general "comfort with technology": participants needed familiarity with videoconferencing (already widespread since the COVID-19 pandemic), basic document navigation (a fundamental professional skill), and willingness to explore unfamiliar interfaces through experimentation (characteristic of curious learners regardless of age or gender), but did not require gaming experience, 3D modelling knowledge, or prior VR exposure. The validation did identify that individuals with very limited computer experience struggled more than those comfortable with standard office applications, suggesting that minimum viable technical literacy involves basic digital competence rather than advanced technical sophistication, a threshold that most humanitarian professionals exceed given the increasing digitisation of operational workflows.
The findings enable more confident deployment planning: whilst some staff will require more induction support than others, organisations need not assume systematic exclusion of particular demographics, instead designing inclusive onboarding, providing differentiated support based on individual needs, and maintaining conventional training alternatives for exceptional cases where immersive technology genuinely proves inaccessible despite accommodation efforts.

Interface Complexity as Primary Accessibility Barrier

Validation results clearly identified interface complexity, rather than technical literacy, as the primary barrier to XRisis accessibility and adoption. The System Usability Scale score of 59 (on the 0-100 scale) reflected genuine design problems: navigation mechanisms that required multiple steps for common tasks, information displays that presented excessive simultaneous content and created cognitive overload, unclear signposting about what actions participants should take during specific scenario phases, and inconsistent interaction patterns in which similar interface elements behaved differently in different contexts, confusing users who expected systematic correspondence. Participants reported needing to remember too many controls: how to move avatars in virtual space, how to access different information sources (email, news, documents, maps), how to initiate or answer communication calls, how to manipulate shared planning tools, and how to signal readiness for scenario progression, a memory burden that diverted cognitive resources from actual training content toward interface operation. The complexity stemmed partly from attempting to accommodate multiple device types (VR headsets, desktop computers, mobile devices) within a single interface design: the compromises required to support lowest-common-denominator capabilities reduced optimisation for any specific platform and created suboptimal experiences across all deployment contexts. VR-specific challenges included accidental triggering of controls through unintended hand gestures or head movements, difficulty reading text on virtual screens without careful positioning and head orientation, and disorientation when teleporting between environment locations rather than walking naturally.
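For context, the System Usability Scale figure quoted above comes from a standard ten-item questionnaire with a fixed scoring rule. The sketch below is an illustrative implementation of that standard rule, not the project's own analysis code:

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100) from the ten
    1-5 Likert responses of a single participant.

    Standard SUS scoring: odd-numbered items are positively worded
    (contribution = response - 1); even-numbered items are negatively
    worded (contribution = 5 - response). The summed contributions
    (0-40) are multiplied by 2.5 to give a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item, r in enumerate(responses, start=1):
        if not 1 <= r <= 5:
            raise ValueError("responses must be on a 1-5 scale")
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5

# A fully neutral response pattern (all 3s) lands at the scale midpoint.
print(sus_score([3] * 10))  # 50.0
```

Per-participant scores are then averaged across the cohort; a mean in the high 60s is usually treated as the "acceptable" threshold, which is why 59 signals genuine design problems rather than a borderline result.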
Desktop-specific challenges involved mouse-and-keyboard interaction paradigms feeling awkward for spatial tasks like pointing at virtual objects or positioning avatars, screen real estate limitations requiring frequent window switching between communication interfaces and scenario environments, and lack of immersive presence reducing engagement compared to VR experiences. Participants suggested several concrete improvement directions: implement progressive disclosure hiding advanced features until users explicitly request them rather than presenting all capabilities simultaneously, provide contextual help overlays explaining what users should do at each scenario phase rather than assuming self-evident tasks, standardise interaction paradigms so that similar tasks always follow identical procedures rather than requiring relearning for each scenario section, and simplify navigation by teleporting users automatically when scenarios progress rather than requiring manual movement between locations. The interface complexity analysis revealed tension between flexibility and simplicity: supporting diverse scenario types required generic interaction capabilities applicable across different contexts, yet this flexibility created complexity compared to highly constrained interfaces optimised for single specific tasks. The solution likely involves scenario authoring tools that generate simplified interfaces tailored to specific training requirements rather than universal interfaces attempting to accommodate all possible scenarios, trading some flexibility for substantial usability gains targeted at actual use cases. 
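The progressive disclosure recommendation above can be made concrete with a minimal sketch: advanced capabilities stay hidden until the user explicitly opts in. The class and feature names here are hypothetical illustrations, not the XRisis interface API:

```python
from dataclasses import dataclass, field

@dataclass
class FeaturePanel:
    """One interface capability; names are illustrative only."""
    name: str
    advanced: bool = False

@dataclass
class Interface:
    """Progressive disclosure: expose only core features until the
    user explicitly requests the advanced set."""
    panels: list = field(default_factory=list)
    show_advanced: bool = False

    def visible(self):
        """Return the names of panels the user should currently see."""
        return [p.name for p in self.panels
                if self.show_advanced or not p.advanced]

ui = Interface(panels=[
    FeaturePanel("move"),
    FeaturePanel("call"),
    FeaturePanel("map-annotation", advanced=True),
])
print(ui.visible())      # core features only
ui.show_advanced = True
print(ui.visible())      # full set, after an explicit request
```

The same toggle could equally drive an "adjustable complexity mode" of the kind the cognitive accessibility discussion calls for, letting users switch off features they find distracting.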
Accessibility extends beyond technical interface design to encompass cognitive accessibility: neurodivergent users may experience particular challenges with complex multi-modal interfaces requiring simultaneous attention to visual displays, audio communications, text instructions, and spatial navigation, suggesting value in providing adjustable complexity modes allowing users to disable features they find distracting. The findings validated a crucial principle: assuming users will adapt to complex interfaces because training value justifies effort leads to accessibility barriers that exclude potential beneficiaries, whereas investing in ruthless simplification that removes every unnecessary complexity element creates inclusive access serving diverse user populations without compromising capability for those who would tolerate more elaborate interfaces.

Induction Design and Onboarding as Critical Success Factors

The validation evidence demonstrated that induction and onboarding design prove as critical for successful XR training deployment as platform capabilities themselves, with inadequate user preparation undermining even well-designed system functionality. Initial validation attempts without formal induction revealed participants struggling to achieve basic tasks, experiencing frustration that coloured subsequent evaluation responses, and abandoning exploration of advanced features because confidence never developed beyond tentative uncertainty about correct interaction approaches. The revised induction approach implemented before the May 2025 validation workshop structured onboarding through progressive skill building: participants first practised basic navigation in empty environments without scenario pressure, then explored communication tools through casual conversations that established comfort with voice interfaces, and finally engaged with simple single-interaction tasks that built confidence with specific capabilities before attempting complex integrated scenarios. This scaffolded learning proved far more effective than comprehensive upfront instruction attempting to explain all capabilities before any hands-on practice, recognising that adult learners absorb procedural knowledge through doing rather than passive observation. The induction design incorporated explicit permission for mistakes and exploration, with facilitators emphasising that fumbling with interfaces during onboarding was expected and valuable rather than a sign of personal inadequacy, creating the psychological safety learners need to experiment without fear of judgement that would restrict exploration to minimum viable interactions.
Reference materials provided quick-access guides showing common tasks (how to move, how to pick up objects, how to initiate calls, how to access documents), enabling just-in-time learning when participants encountered specific needs rather than requiring retention of comprehensive instruction covering capabilities they might not use immediately. Buddy-system approaches, where more technically confident participants informally supported peers encountering difficulties, proved valuable during team scenarios, distributing the support burden beyond facilitators and building collaborative dynamics that enhanced social learning alongside technical skill development. Interface refinements gradually made interactions more intuitive and cut the explicit instruction needed for basic proficiency, though validation evidence suggested induction would remain necessary even with simplified interfaces, because spatial navigation and immersive interaction paradigms differ fundamentally from familiar web browsing and desktop application patterns regardless of how carefully they are designed. Time invested in thorough induction (approximately 15-20 minutes before scenario engagement) proved worthwhile: participants who completed structured onboarding exhibited substantially higher confidence, explored more capabilities, and reported better learning experiences than those who received minimal preparation before scenario immersion. The induction design balanced efficiency (avoiding excessive time consumption that reduces available training duration) against thoroughness (ensuring genuine skill development rather than superficial capability introduction), with facilitators adjusting pace based on observed participant confidence and comfort rather than following rigid timing regardless of actual learning needs.
Future platform development will incorporate embedded tutorials triggered contextually when users first encounter specific features rather than frontloading all instruction, reducing upfront cognitive load whilst ensuring guidance availability when needed. The lesson extends beyond XRisis: any complex interactive system serving non-technical users requires investing as much design effort in onboarding experiences as in core functionality, recognising that brilliant capabilities that users cannot access due to inadequate preparation deliver no value despite technical excellence.
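The contextual tutorial mechanism described above reduces to a simple first-encounter trigger. This is a hedged sketch of one way such a mechanism might work; the class, method, and feature names are hypothetical, not the planned XRisis implementation:

```python
class TutorialManager:
    """Show an embedded tutorial the first time a user encounters a
    feature, instead of frontloading all instruction (illustrative
    sketch; names are assumptions, not the XRisis API)."""

    def __init__(self, tutorials):
        self.tutorials = tutorials   # feature name -> guidance text
        self.seen = set()            # features already introduced

    def on_feature_used(self, feature):
        """Return tutorial text on first encounter, None afterwards."""
        if feature in self.tutorials and feature not in self.seen:
            self.seen.add(feature)
            return self.tutorials[feature]
        return None

tm = TutorialManager({
    "teleport": "Point at a floor marker and press the trigger to move.",
})
print(tm.on_feature_used("teleport"))  # guidance shown on first use
print(tm.on_feature_used("teleport"))  # no repeat on later uses
```

In a real deployment the `seen` set would persist per user, so returning trainees are not re-shown tutorials they have already completed.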

Speech Recognition and Multilingual Deployment Challenges

AI speech recognition accuracy emerged as the most significant technical limitation affecting XRisis accessibility, particularly for international humanitarian deployment where linguistic diversity is a fundamental operational reality. The conversational AI systems powered by CEA's Conversational Virtual Agent platform struggled to reliably understand participants with strong non-native English accents, regional dialect variations, rapid speech under scenario pressure, or domain-specific terminology that language models had not encountered during training, resulting in misrecognitions that generated nonsensical responses and broke scenario immersion. Humanitarian organisations operate globally with staff from dozens of countries speaking English as a second, third, or fourth language, with proficiency spanning basic communication capability to native fluency, creating the accessibility requirement that speech interfaces work reliably across accent diversity rather than only for native speakers. Participants from Francophone African contexts specifically noted that whilst they could understand spoken instructions from AI avatars adequately, the system's poor comprehension of their responses created asymmetric communication in which they could receive information but not effectively provide it, limiting engagement to passive listening rather than active dialogue. The problem compounds in scenarios requiring negotiation or persuasion, where participants need nuanced expression of positions, concerns, and proposals: when speech recognition failures prevent participants from articulating complex positions, they cannot practise the communication skills that are the scenario's learning objectives, negating the training value that conversational AI should provide.
Current workaround approaches, including text chat interfaces supplementing voice communication, proved suboptimal because typing disrupts conversational flow and realism, creating interaction dynamics fundamentally different from the voice communication for which the training aims to prepare participants. Validation with English-as-second-language speakers revealed that certain accent patterns proved particularly problematic (strong French accents, Asian-language influences on English pronunciation, rapid speech from Romance-language speakers) whilst others encountered fewer difficulties (speakers with Germanic-language backgrounds, measured deliberate speech), suggesting that recognition models trained predominantly on native English speakers exhibit systematic biases requiring targeted dataset expansion rather than general capability improvements. The accessibility gap proves particularly concerning because it creates inverse equity effects: participants from well-resourced contexts with extensive English-language education access full platform capabilities, whilst those from resource-constrained contexts with limited language-learning opportunities face degraded functionality, exactly opposite the democratisation objectives motivating XR training platform development. Technical solutions under exploration include multilingual deployment supporting scenario delivery in French, Spanish, Arabic, and other languages prevalent in humanitarian operations; upgraded speech recognition models specifically trained on humanitarian-sector communication patterns and non-native accents; hybrid interaction modalities combining voice with gesture or structured input to reduce dependence on perfect speech understanding; and graceful degradation strategies enabling scenario participation through alternative channels when voice recognition fails.
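One common pattern for the graceful degradation strategy mentioned above is a confidence-threshold fallback: act on the recogniser's transcript only when it reports sufficient confidence, and otherwise route the turn to a structured or text channel rather than responding to a likely misrecognition. The threshold value and channel names below are illustrative assumptions, not project specifics:

```python
def handle_utterance(transcript, confidence, threshold=0.7):
    """Route one participant utterance based on ASR confidence.

    Returns a (channel, payload) pair: the voice transcript when the
    recogniser is confident, or a fallback prompt directing the user
    to a text/structured input channel when it is not.
    (Threshold and channel names are hypothetical.)
    """
    if confidence >= threshold:
        return ("voice", transcript)
    return ("text_fallback",
            "Sorry, I didn't catch that - please type your response "
            "or pick one of the listed options.")

# Confident recognition is passed through; low confidence degrades
# gracefully instead of producing a nonsensical AI reply.
print(handle_utterance("We need two trucks at the north gate", 0.92))
print(handle_utterance("<garbled>", 0.31))
```

A per-accent or per-user adaptive threshold would go further, since the validation showed recognition confidence is systematically lower for some accent groups.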
Market research with potential NGO clients beyond Action Contre la Faim revealed that speech recognition limitations represent a significant adoption barrier: organisations cannot justify deployment of training platforms that function well only for English-fluent staff when substantial proportions of their workforce speak other languages or English with accents that current systems struggle to understand. The accessibility challenge extends beyond technical implementation to fundamental questions about whether conversational AI technologies have matured sufficiently for reliable international deployment or whether current generation systems remain appropriate only for limited contexts where linguistic homogeneity exists, requiring honest assessment that defers certain applications until capabilities improve rather than deploying solutions that exclude populations they claim to serve.