Five papers by CSE researchers at ASSETS 2024

CSE authors are presenting innovations in the area of accessibility, from non-visual image editing to sound modification in virtual reality.

Five papers by researchers affiliated with CSE are being presented at the 2024 International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), the leading conference focused on the development and application of computing technologies for people with disabilities and the elderly. This year’s conference is taking place October 28-30 in St. John’s, Newfoundland and Labrador, Canada.

The papers being presented by CSE researchers introduce numerous innovations in the area of accessibility, including non-visual image editing for blind and low-vision users, sound masking for individuals with noise sensitivities, virtual reality sound modification for deaf and hard-of-hearing people, and more.

These papers are as follows:

EditScribe: Non-Visual Image Editing with Natural Language Verification Loops
Ruei-Che Chang, Yuxuan Liu, Lotus Zhang, Anhong Guo

Abstract: Image editing is an iterative process that requires precise visual evaluation and manipulation for the output to match the editing intent. However, current image editing tools do not provide accessible interaction or sufficient feedback for blind and low vision individuals to achieve this level of control. To address this, we developed EditScribe, a prototype system that makes object-level image editing actions accessible using natural language verification loops powered by large multimodal models. Using EditScribe, the user first comprehends the image content through initial general and object descriptions, then specifies edit actions using open-ended natural language prompts. EditScribe performs the image edit and provides four types of verification feedback for the user to verify the performed edit: a summary of visual changes, AI judgement, and updated general and object descriptions. The user can ask follow-up questions to clarify and probe into the edits or verification feedback before performing another edit. In a study with ten blind or low-vision users, we found that EditScribe enabled participants to perform and verify image edit actions non-visually. We observed participants’ different prompting strategies and their perceptions of the various types of verification feedback. Finally, we discuss the implications of leveraging natural language verification loops to make visual authoring non-visually accessible.

Two images of a kitten in a colored bow-tie, with text descriptions demonstrating how EditScribe allows users to edit the photo using language prompts.
EditScribe supports non-visual image editing using natural language verification loops. The user first comprehends the image content through initial general and object descriptions, then specifies edit actions using natural language. EditScribe performs the image edit, and provides feedback for the user to verify the performed edit. The user can ask follow-up questions to clarify the edits or verification feedback, before performing another edit.
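
For readers curious how such a verification loop might be structured, below is a minimal conceptual sketch in Python. The model interface (describe, edit, verify) is hypothetical and stands in for calls to a large multimodal model; it is not the authors’ implementation.

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    """The four types of verification feedback described in the paper."""
    visual_changes: str       # summary of what changed in the image
    ai_judgement: str         # model's assessment of whether the intent was met
    general_description: str  # updated whole-image description
    object_description: str   # updated description of the edited object

class HypotheticalMultimodalModel:
    """Stand-in for a large multimodal model; a real system would call an API."""
    def describe(self, image: bytes) -> str:
        return "general and object-level descriptions of the image"
    def edit(self, image: bytes, prompt: str) -> bytes:
        return image  # placeholder: would apply the requested object-level edit
    def verify(self, before: bytes, after: bytes, prompt: str) -> Feedback:
        return Feedback("...", "...", "...", "...")

def verification_loop(model: HypotheticalMultimodalModel, image: bytes) -> bytes:
    # The user first comprehends the image through initial descriptions,
    # typically read aloud by a screen reader.
    print(model.describe(image))
    while True:
        prompt = input("Edit instruction (blank to finish): ")
        if not prompt:
            return image
        # Perform the edit, then surface verification feedback non-visually.
        edited = model.edit(image, prompt)
        fb = model.verify(image, edited, prompt)
        print(fb.visual_changes, fb.ai_judgement,
              fb.general_description, fb.object_description, sep="\n")
        # The user can accept the edit, refine it, or ask follow-up questions
        # before moving on to the next edit.
        if input("Keep this edit? [y/n]: ").strip().lower() == "y":
            image = edited
```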

SoundModVR: Sound Modifications in Virtual Reality to Support People who are Deaf and Hard of Hearing
Xinyun Cao, Dhruv Jain

Abstract: Previous VR sound accessibility work has substituted sounds with visual or haptic output to increase VR accessibility for deaf and hard of hearing (DHH) people. However, deafness occurs on a spectrum, and many DHH people (e.g., those with partial hearing) can also benefit from manipulating audio (e.g., increasing volume at specific frequencies) instead of substituting it with another modality. In this demo paper, we present a toolkit that allows modifying sounds in VR to support DHH people. We designed and implemented 18 VR sound modification tools spanning four categories: prioritizing sounds, modifying sound parameters, providing spatial assistance, and adding additional sounds. Evaluation of our tools with 10 DHH users across five diverse VR scenarios reveals that our toolkit can improve DHH users’ VR experience but could be further improved by providing more customization options and decreasing cognitive load. We then compiled a Unity toolkit and conducted a preliminary evaluation with six Unity VR developers. Preliminary insights show that our toolkit is easy to use but could be enhanced through modularization.
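
As a rough illustration of the "modifying sound parameters" category, the sketch below boosts a listener-chosen frequency band, similar in spirit to increasing volume at specific frequencies. It is an offline Python/NumPy example for illustration only, not the authors’ Unity toolkit, which would apply such adjustments to VR audio sources in real time.

```python
import numpy as np

def boost_band(samples: np.ndarray, sample_rate: int,
               low_hz: float, high_hz: float, gain_db: float) -> np.ndarray:
    """Amplify frequencies between low_hz and high_hz by gain_db decibels."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[band] *= 10 ** (gain_db / 20.0)  # convert dB gain to a linear factor
    return np.fft.irfft(spectrum, n=len(samples))

# Example: boost 2-4 kHz speech frequencies by 12 dB in a synthetic signal.
if __name__ == "__main__":
    sr = 16_000
    t = np.linspace(0, 1.0, sr, endpoint=False)
    tone = 0.5 * np.sin(2 * np.pi * 3000 * t)  # 3 kHz test tone
    boosted = boost_band(tone, sr, 2000, 4000, gain_db=12)
    print(f"peak before: {np.abs(tone).max():.2f}, after: {np.abs(boosted).max():.2f}")
```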

MaskSound: Exploring Sound Masking Approaches to Support People with Autism in Managing Noise Sensitivity
Anna Y Park, Andy Jin, Jeremy Zhengqi Huang, Jesse Carr, Dhruv Jain

Abstract: Noise sensitivity is a frequently reported characteristic in many autistic individuals. While strategies like sound isolation (e.g., noise-canceling headphones) and avoidance behaviors (e.g., leaving a crowded room) can help, they can reduce situational awareness and limit social engagement. In this paper, we examine an alternate approach to managing noise sensitivity: introducing ambient background sounds to reduce the perception of disruptive noises, i.e., sound masking. Through two studies (with ten and nine autistic individuals, respectively), we investigated autistic individuals’ preferred sound masks (e.g., white noise, brown noise, calming water sounds) for different contexts (e.g., traffic, speech) and elicited reactions to a future interactive tool for delivering effective sound masks. Our findings have implications not just for the accessibility community, but also for designers and researchers working on sound augmentation technology.

Four images of a smartphone screen showing different features of the MaskSound app.
Three views of the customizable sound masking prototype in the form of a mobile app (MaskSound): (A) Sound Palette, (B) Sound Library, (C) Recommendations View.
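
To make the idea of sound masking concrete, here is a small Python sketch that generates two of the masks mentioned in the abstract (white noise and brown noise) and mixes one over a disruptive sound at a user-chosen level. This is an illustrative example, not the MaskSound prototype.

```python
import numpy as np

def white_noise(n: int, seed: int = 0) -> np.ndarray:
    """Flat-spectrum noise, a common general-purpose mask."""
    return np.random.default_rng(seed).standard_normal(n)

def brown_noise(n: int, seed: int = 1) -> np.ndarray:
    """Brown (red) noise: integrated white noise, heavier in low frequencies."""
    steps = np.cumsum(white_noise(n, seed))
    return steps / np.max(np.abs(steps))

def apply_mask(environment: np.ndarray, mask: np.ndarray, mask_level: float) -> np.ndarray:
    """Mix a masking sound over the environment at a relative level (0 to 1)."""
    mask = mask[: len(environment)] / np.max(np.abs(mask))
    return environment + mask_level * mask

sr = 16_000                                   # samples per second
traffic_like = 0.2 * white_noise(sr, seed=2)  # stand-in for a disruptive sound
masked = apply_mask(traffic_like, brown_noise(sr), mask_level=0.5)
print(masked.shape)                           # one second of masked audio
```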

Audio Description Customization
Rosiana Natalie, Ruei-Che Chang, Smitha Sheshadri, Anhong Guo, Kotaro Hara

Abstract: Blind and low-vision (BLV) people use audio descriptions (ADs) to access videos. However, current ADs are unalterable by end users and thus cannot support BLV individuals’ potentially diverse needs and preferences. This research investigates whether customizing AD could improve how BLV individuals consume videos. We conducted an interview study (Study 1) with fifteen BLV participants, which revealed desires for customizing properties like length, emphasis, speed, voice, format, tone, and language. At the same time, concerns like interruptions and increased interaction load due to customization emerged. To examine AD customization’s effectiveness and tradeoffs, we designed CustomAD, a prototype that enables BLV users to customize AD content and presentation. An evaluation study (Study 2) with twelve BLV participants showed that using CustomAD significantly enhanced BLV people’s video understanding, immersion, and information navigation efficiency. Our work illustrates the importance of AD customization and offers a design that enhances video accessibility for BLV individuals.
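
As a thought experiment on what user-facing AD customization might look like, the sketch below models a subset of the preferences elicited in Study 1 (length, emphasis, speed, voice, tone, language) as a settings object applied to one AD segment. The names and logic are hypothetical, not the CustomAD implementation.

```python
from dataclasses import dataclass

@dataclass
class ADPreferences:
    """A subset of the AD properties participants wanted to customize."""
    length: str = "concise"       # "concise" or "detailed"
    emphasis: str = "actions"     # what the description should prioritize
    speed: float = 1.25           # text-to-speech playback rate multiplier
    voice: str = "neutral"        # synthesized voice identity
    tone: str = "matter-of-fact"  # narration style
    language: str = "en"

def render_segment(segment_text: str, prefs: ADPreferences) -> dict:
    """Package one AD segment with the presentation parameters a TTS engine
    and video player would need in order to honor the user's preferences."""
    if prefs.length == "concise":
        # Illustrative shortening; a real system would regenerate the text.
        segment_text = segment_text.split(".")[0] + "."
    return {
        "text": segment_text,
        "rate": prefs.speed,
        "voice": prefs.voice,
        "tone": prefs.tone,
        "lang": prefs.language,
        "emphasis": prefs.emphasis,
    }

print(render_segment(
    "A woman enters the kitchen. She places a kettle on the stove.",
    ADPreferences(length="concise", speed=1.5),
))
```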

Misfitting With AI: How Blind People Verify and Contest AI Errors
Rahaf Alharbi, Pa Lor, Jaylin Herskovitz, Sarita Schoenebeck, Robin N. Brewer

Abstract: Blind people use artificial intelligence-enabled visual assistance technologies (AI VAT) to gain visual access in their everyday lives, but these technologies are embedded with errors that may be difficult to verify non-visually. Previous studies have primarily explored sighted users’ understanding of AI output and created vision-dependent explainable AI (XAI) features. We extend this body of literature by conducting an in-depth qualitative study with 26 blind people to understand their verification experiences and preferences. We begin by describing errors blind people encounter, highlighting how AI VAT fails to support complex document layouts, diverse languages, and cultural artifacts. We then illuminate how blind people make sense of AI by experimenting with AI VAT, employing non-visual skills, strategically including sighted people, and cross-referencing with other devices. Participants identified detailed opportunities for designing accessible XAI, such as affordances to support contestation. Informed by the disability studies framework of misfitting and fitting, we unpack harmful assumptions embedded in AI VAT, underscoring the importance of celebrating disabled ways of knowing. Lastly, we offer practical takeaways for Responsible AI practice to push the field of accessible XAI forward.