Verify any claim · lenz.io
Claim analyzed
Health
“Questionnaire-based mHealth usability instruments such as the Mobile App Usability Questionnaire (MAUQ), the Mobile App Rating Scale (MARS), and the Health Information Technology Usability Evaluation Scale (Health-ITUES) measure perceived usability but do not directly capture specific interaction errors during task performance.”
Submitted by Keen Zebra 7777
The conclusion
The evidence supports the core distinction. MAUQ, MARS, and Health-ITUES are questionnaire-based rating tools that assess perceived usability or related app-quality dimensions through subjective responses, not by observing and recording concrete errors during tasks. Some items mention mistakes or interaction design, but that is still retrospective judgment rather than direct error capture.
Caveats
- MARS is broader than a pure usability scale; it also rates app-quality dimensions such as engagement, aesthetics, and information quality.
- Some questionnaire items address error recovery or interaction design, but they remain self-reported or rater-scored perceptions rather than observed error counts or logs.
- Direct detection of specific interaction errors usually requires usability testing, task analysis, screen recording, telemetry, or eye-tracking methods.
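The distinction drawn in these caveats can be made concrete with a minimal sketch. It contrasts a questionnaire-style score (a mean of Likert ratings) with error counts taken from an interaction log. The response values, the event-log fields (`task`, `target`, `ok`), and both function names are hypothetical and chosen only for illustration; none of them come from MAUQ, MARS, or Health-ITUES, and the simple mean is an assumed scoring rule, not the official procedure of any of these instruments.

```python
from collections import Counter
from statistics import mean

# Hypothetical 5-point Likert responses (1 = strongly disagree, 5 = strongly agree).
likert_responses = [4, 5, 4, 3, 4, 5, 4, 4]

# Hypothetical interaction log from one observed task session.
# "ok" is False when the interaction failed (e.g. a rejected input).
event_log = [
    {"task": "log_dose", "target": "add_button", "ok": True},
    {"task": "log_dose", "target": "date_field", "ok": False},
    {"task": "log_dose", "target": "date_field", "ok": False},
    {"task": "log_dose", "target": "date_field", "ok": True},
    {"task": "log_dose", "target": "save_button", "ok": False},
    {"task": "log_dose", "target": "save_button", "ok": True},
]


def perceived_score(responses):
    """Return the mean rating: one number summarising a retrospective judgment."""
    if not responses:
        raise ValueError("no responses given")
    if any(r not in range(1, 6) for r in responses):
        raise ValueError("ratings must be integers from 1 to 5")
    return mean(responses)


def observed_errors(log):
    """Count failed interactions per (task, target) pair from the log."""
    return Counter((e["task"], e["target"]) for e in log if not e["ok"])


if __name__ == "__main__":
    print("Perceived usability (mean rating):", perceived_score(likert_responses))
    for (task, target), count in observed_errors(event_log).items():
        print(f"Observed errors in {task} on {target}: {count}")
```

In this made-up session the mean rating is high even though the log records three failed interactions. The score alone cannot say where or how often errors occurred, which is the gap the caveats describe.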
This analysis is for informational purposes only and does not constitute health or medical advice, diagnosis, or treatment. Always consult a qualified healthcare professional before making health-related decisions.
Sources
Sources used in the analysis
1. The Health-ITUES is a customizable questionnaire with a four-factor structure. The Health-ITUES consists of 20 items rated on a five-point Likert scale from strongly disagree (1) to strongly agree (5). A higher scale value indicates higher perceived usability of the technology. The Health-ITUES explicitly considers a task by addressing various levels of expectation of support for the task by the health IT system.
2. An mHealth app usability questionnaire (MAUQ) was designed by the research team based on a number of existing questionnaires used in previous mobile app usability studies, especially the well-validated questionnaires. Psychometric analysis indicated that the MAUQ has three subscales and their internal consistency reliability is high. The three factors correspond to three constructs, or subscales, on the MAUQ: ease of use and satisfaction (8 items, MAUQ_E), system information arrangement (6 items, MAUQ_S), and usefulness (7 items, MAUQ_U).
3. The Mobile App Rating Scale (MARS) was developed after conducting a literature search and classifying previous attempts at rating app quality. The MARS is the first mHealth app quality rating tool, to our knowledge, to provide a multidimensional measure of the app quality indicators of engagement, functionality, aesthetics, and information quality, as well as app subjective quality. These app quality indicators were extracted from previous research across the UX, technical, human-computer interaction, and mHealth literature.
4. The Mobile Application Rating Scale (MARS) is the most widely used scale for evaluating the quality and content of MHA. In total four separate dimensions were derived: engagement, functionality, aesthetics and information quality. All items are assessed on a 5-point scale (1-inadequate, 2-poor, 3-acceptable, 4-good, and 5-excellent).
5. The Health-ITUES consists of 20 items rated on a five-point Likert scale from strongly disagree (1) to strongly agree (5). A higher scale value indicates higher perceived usability of the technology. The Health-ITUES explicitly considers a task by addressing various levels of expectation of support for the task by the health IT system.
6. The Mobile App Rating Scale (MARS) was developed after conducting a literature search and classifying previous attempts at rating app quality. Ultimately, the MARS contained 23 individual items to rate applications on six specific subscales: 1) App classification; 2) User engagement; 3) Function; 4) Aesthetics; 5) Information quality; and 6) Subjective app quality. These subscales also yield a total MARS rating score.
7. Usability is the most evaluated outcome that was assessed by a questionnaire in the studies (n = 99, 40%). Questionnaires were used for usability (40%), quality (34.5%), acceptance (8.5%), and satisfaction (4%) outcomes, respectively. The Mobile Application Rating Scale (35.5%), Health Information Technology Usability Evaluation Scale (2%), and mHealth App Usability Questionnaire were among the most used questionnaires.
8. By integrating subjective and objective metrics, our model serves as a benchmark for enhancing mHealth apps and evaluating their messaging usability. We chose the SUS to measure user satisfaction because it is very effective in assessing user-perceived usability and is simple and economical. In the information access usability aspect, we used eye tracking to measure the test subject's performance.
9. The Health Information Technology Usability Evaluation Scale (Health-ITUES) is a tool evaluating the task addressing different levels of expectation of support for the task by the health information technology (IT). Moreover, the Health-ITUES is customizable and can assess the study purposes without item addition, deletion, or modification. Our findings are the results of a self-assessed questionnaire, and the surgeons can be influenced by the specific event in which they have tried the technology.
10. The Mobile Application Rating Scale (MARS) is a tool designed to evaluate the quality of mobile apps based on multiple criteria. Its purpose is to provide a reliable framework for assessing app usability, engagement, functionality, aesthetics, and information quality. The MARS includes 23 items grouped into five dimensions: Engagement, Functionality, Aesthetics, Information Quality, and Subjective Quality. Each item is rated on a five-point Likert scale, where 1 = Inadequate and 5 = Excellent.
11. The usability testing method, however, detects problems of greater severity. There is also a significant correlation between the number of usability issues found and how long it takes to perform tasks in usability tests. Additionally, it is recommended that SUS questionnaires should not be the sole method of determining the usability level of mobile health applications.
12. This technology is a customizable, 20-item questionnaire for determining perceived health IT usability. The tool consists of various components associated with levels of expectation, capturing user-system interaction and representing higher expectations of system impact. For example, the assessment addresses questions related to quality of work life, perceived usefulness, perceived ease of use, and user control.
13. The Rating scale assesses app quality on four dimensions. All items are rated on a 5-point scale from “1. Inadequate” to “5. Excellent”. The functionality section examines the app's performance, ease of use, navigation and gestural design. For example, item 9 asks: 'Gestural design: Are interactions (taps/swipes/pinches/scrolls) consistent and intuitive across all components/screens?'
14. The document presents the mHealth App Usability Questionnaire (MAUQ), designed to assess the usability of standalone mHealth apps used by patients. It includes a series of statements for users to rate their experience with the app, covering aspects such as ease of use, navigation, and overall satisfaction. One statement asks: 'Whenever I made a mistake using the app, I could recover easily and quickly.'
15. Mobile app usability testing is a systematic evaluation process that assesses how effectively and efficiently users can interact with a mobile application. This testing involves observing real users as they perform specific tasks within the app, identifying usability issues, and gathering feedback to enhance the overall user experience. This method helps identify confusing labels, poorly structured categories, and hidden navigation issues.
16. A study involving ICU nurses assessing system usability through testing, satisfaction surveys, Health-ITUES, and eye-tracking analysis revealed that eye tracking identified some functions, including alarms, were overlooked or not recognized. This suggests that questionnaire-based methods like Health-ITUES may not directly capture specific interaction errors during task performance, which can be revealed by objective measures.
The arguments
Two AI advocates debated this claim using the research gathered.
Argument for
Health-ITUES and MAUQ are explicitly questionnaire instruments scored on Likert-scale agreement items that yield a higher value for higher perceived usability (Health-ITUES: “higher scale value indicates higher perceived usability,” Source 1/5; MAUQ: subscales like ease of use/satisfaction, information arrangement, usefulness, Source 2), which by design captures users' subjective judgments rather than logging concrete, moment-by-moment task interaction failures. Likewise, MARS is a multidimensional rating scale of app quality dimensions (engagement, functionality, aesthetics, information quality, subjective quality) scored on 5-point items (Sources 3/4/6), and evidence comparing questionnaires with objective methods shows that eye-tracking/usability testing can reveal overlooked functions and interaction problems that questionnaires like Health-ITUES do not directly capture during task performance (Source 16), confirming the motion.
The Proponent's reliance on Source 16's eye-tracking comparison to establish that questionnaires fail to capture interaction errors commits a false equivalence fallacy: demonstrating that objective methods reveal additional errors does not prove that questionnaire items never capture interaction-level data, particularly when Source 14 explicitly includes an item probing error recovery during task use. Furthermore, the Proponent's characterization of MAUQ and MARS as capturing only 'subjective judgments' ignores that Source 13 shows MARS's functionality section directly examines specific gestural interaction behaviors (taps, swipes, pinches), which are concrete task-performance elements, not abstract perceptual ratings — meaning the claim that these instruments 'do not directly capture specific interaction errors' remains an overstatement unsupported by the full body of evidence.
Argument against
The claim oversimplifies the instruments' scope: Source 14 shows the MAUQ explicitly includes an item asking whether users could recover from mistakes ('Whenever I made a mistake using the app, I could recover easily and quickly'), which directly probes interaction error recovery during task performance, not merely abstract perceived usability. Furthermore, Source 13 demonstrates that MARS's functionality section examines specific interaction behaviors such as gestural design consistency across taps, swipes, and pinches, meaning these questionnaires do capture elements of concrete task-level interaction rather than purely subjective perception.
The Opponent equivocates between asking respondents to self-report whether they “could recover” from mistakes and actually capturing specific interaction errors as they occur; even the cited MAUQ item in Source 14 is a Likert-style perception statement, consistent with MAUQ's validated constructs of ease of use/satisfaction and usefulness (Source 2) rather than an observational error log. Likewise, pointing to MARS items about gestural design (Source 13) does not rebut the motion because MARS remains a rater-scored quality checklist (Sources 3/4/6) and the brief explicitly shows objective methods like eye tracking can reveal overlooked functions and task-performance interaction problems that questionnaire/rating instruments such as Health-ITUES do not directly capture (Source 16).
Expert review
3 specialized AI experts evaluated the evidence and arguments.
Expert 1 — The Logic Examiner
Sources 1/5 and 2 show Health-ITUES and MAUQ are Likert-style questionnaires yielding perceived usability constructs, and Sources 3/4/6 show MARS is likewise a rater-scored Likert scale of app-quality dimensions; Source 16 further supports that objective measures (e.g., eye tracking) can reveal task-performance interaction problems that such questionnaires may miss, which is consistent with (though not a deductive proof of) the claim's “do not directly capture” phrasing. The opponent's counterexamples (the MAUQ item about recovering from mistakes in Source 14 and the MARS gestural-design item in Source 13) still amount to subjective or rater judgments about interaction quality rather than an observational capture or log of specific errors during task execution, so the claim is logically supported overall.
Expert 2 — The Source Auditor
The most reliable sources here are the peer-reviewed PMC/JMIR publications (Sources 1-7), which are high-authority academic sources. These sources confirm that MAUQ, MARS, and Health-ITUES are questionnaire/rating scale instruments that measure perceived usability through Likert-scale items and subjective ratings. Source 16 (ResearchGate, lower authority) explicitly notes that eye-tracking revealed interaction errors that Health-ITUES did not capture, supporting the claim. The opponent raises a valid point: MAUQ (Source 14, Scribd - low authority) includes an item about error recovery, and MARS (Source 13, Scribd - low authority) includes gestural design items. However, these are still self-reported perception items on Likert scales, not direct observational logs of specific interaction errors during task performance. The high-authority sources consistently describe these instruments as measuring 'perceived usability' through agreement ratings, not as tools that directly log or observe specific interaction errors as they occur. The distinction between asking users to retrospectively rate whether they could recover from errors versus directly capturing those errors during task performance is meaningful and supported by the authoritative literature. The claim is largely accurate: these instruments measure perceived usability and do not directly capture specific interaction errors during task performance, though the opponent's point about error-recovery items being included is a minor caveat that makes the claim slightly overstated rather than false.
Expert 3 — The Precision Analyst
The evidence confirms that MAUQ, MARS, and Health-ITUES are questionnaire-based instruments measuring perceived usability and subjective quality (Sources 1, 2, 4, 6), whereas objective methods like eye-tracking and usability testing are required to directly capture specific interaction errors during task performance (Sources 8, 16). While MAUQ and MARS contain items asking users to self-rate error recovery or gestural consistency (Sources 13, 14), these remain subjective self-reports rather than direct, objective captures of interaction errors as they occur.