Aaron Preece
Discord, a mobile and desktop app that allows you to chat with and call your friends as well as connect with others who share similar interests, has become one of the most dominant communication platforms of the past few years. With its increasing focus on accessibility, this is true for both mainstream audiences and audiences with blindness or low vision.
Considering its prominence in popular culture as well as its increasing use by people with blindness or low vision, it seemed time to dedicate an article to Discord. As a complex web app with a recent focus on accessibility, Discord also serves as an excellent test environment for exploring what to do and what not to do when making a desktop or mobile app accessible.
For this review, Discord was tested on Windows 10 using the NVDA screen reader. Specifically, the desktop app was tested as it is almost entirely identical to the web interface and is browser agnostic. The mobile app was tested on iOS using VoiceOver.
Positives
Discord, specifically the desktop app, is a model example of how to make a complex web app accessible to screen reader users. On both the desktop and mobile apps, there are few to no unlabeled or mislabeled elements. This is true for common elements such as links and buttons, but also for more complex elements. As an example, tabs are labeled properly in both apps and correctly alert the user to which tab currently has focus. Likewise, sliders, such as those used for adjusting the volume of specific users in a call, are labeled correctly and provide the needed feedback as the slider is adjusted. The only unlabeled element encountered during testing of the desktop app was an image on the login screen, assumed to be the Discord logo. Logos should be given a simple alt text label such as “Discord Logo” or similar.
Discord makes liberal use of CAPTCHAs to protect against bots and spam, employing the hCAPTCHA service. Historically, hCAPTCHA required blind and low vision users to install an "Accessibility Cookie", which in many cases required users to disable privacy settings, such as prevention of cross-site tracking, to work correctly. Fortunately, hCAPTCHA has implemented a text CAPTCHA option where a user need only answer a few yes-or-no questions to solve the CAPTCHA. This method is easier for a blind or low vision user than traditional audio CAPTCHAs, and it allows that user to keep their privacy protections in place.
On desktop in particular, the Discord app presents a great deal of information on each page. To aid navigation for screen reader users, the app uses element types strategically so that a user can move through the interface efficiently. As one might expect, headings are used for important items such as main content; in addition, on desktop, in a channel or direct message conversation, headings are used to identify a user's messages. To elaborate, if I submit a message, the message will be preceded by my username as a heading. If I continue to submit messages quickly, they will appear below my original message but without the heading. If someone then submits their own message, their message will be preceded by their name as a heading.
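For developers curious how this heading pattern might be produced, the following TypeScript sketch marks which messages should be preceded by an author heading. It is an illustration only and not Discord's code: the ChatMessage shape, the markAuthorHeadings function, and the five-minute grouping window are all assumptions, since testing only showed that messages sent in quick succession share a heading.

```typescript
// Hypothetical message shape for illustration; not Discord's data model.
interface ChatMessage {
  author: string;
  text: string;
  sentAt: number; // milliseconds since the Unix epoch
}

interface RenderedMessage extends ChatMessage {
  // True when the message starts a new group and needs an author heading.
  showAuthorHeading: boolean;
}

// Assumed grouping window; the real threshold used by Discord is not known.
const GROUP_WINDOW_MS = 5 * 60 * 1000;

function markAuthorHeadings(
  messages: ChatMessage[],
  windowMs: number = GROUP_WINDOW_MS,
): RenderedMessage[] {
  return messages.map((message, index) => {
    const previous = index > 0 ? messages[index - 1] : undefined;
    // A message continues the current group only if the same author
    // sent the previous message within the grouping window.
    const continuesGroup =
      previous !== undefined &&
      previous.author === message.author &&
      message.sentAt - previous.sentAt <= windowMs;
    return { ...message, showAuthorHeading: !continuesGroup };
  });
}
```

A renderer could then emit a heading containing the author's name before each message flagged with showAuthorHeading, so that a screen reader's heading shortcut jumps once per speaker turn rather than once per message.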
Even when headings are not being used on desktop to provide quick navigation, the use of different element types often serves the same purpose. In settings, each setting appears as a button while settings categories appear as tabs. In servers, channels appear as links, in contrast to the buttons used for voice channels and the other controls around them. If a screen reader user is comfortable switching between different element shortcuts, they can learn to navigate rapidly.
In contrast to a mobile app, where there is only one main navigation method when using a screen reader, desktop screen readers offer two modes. In what is commonly known as "Browse Mode", the screen reader user can navigate the site with the arrow keys and use specific element shortcuts to jump to elements like headings and links. "Focus Mode" deactivates the enhanced keyboard navigation provided by Browse Mode and is generally used for text entry or for interacting with certain complex controls. As Discord is a web app, it can be controlled in both Browse Mode and Focus Mode. A user is able to access all controls while in Browse Mode, and when switching to Focus Mode, native keyboard support is available. For example, a user can navigate through messages with the arrow keys as well as use the arrows to navigate through channels in a server. Discord also includes an extensive list of keyboard shortcuts.
Websites and apps often take control of a screen reader's focus when needed, redirecting the user to important content. Failing to redirect focus, or redirecting it incorrectly, is a common problem. In the vast majority of cases, the Discord desktop app does this correctly. When a user opens a context menu, focus is automatically brought to it, and when a user is in a list of messages and presses any letter on the keyboard, they are brought directly to the message edit field.
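The two focus behaviors described above could be sketched in TypeScript roughly as follows. This is a hypothetical illustration, not Discord's implementation: the Focusable and KeyPress types and all three function names are invented for the example, and are written so that a browser's HTMLElement and KeyboardEvent would satisfy them.

```typescript
// Minimal structural types so the sketch does not depend on a browser.
interface Focusable {
  focus(): void;
}

interface KeyPress {
  key: string;
  ctrlKey: boolean;
  altKey: boolean;
  metaKey: boolean;
}

// A key press counts as typing when it produces a single character
// and no command modifier (Ctrl, Alt, Meta) is held down.
function isTypingKey(event: KeyPress): boolean {
  const hasCommandModifier = event.ctrlKey || event.altKey || event.metaKey;
  return event.key.length === 1 && !hasCommandModifier;
}

// Intended for a keydown handler on the message list: typing moves
// focus to the message edit field. Returns true if focus was moved.
function redirectTypingToComposer(event: KeyPress, composer: Focusable): boolean {
  if (!isTypingKey(event)) {
    return false;
  }
  composer.focus();
  return true;
}

// Intended to run right after a context menu is shown: focus moves
// to its first item. Returns false if the menu has no items.
function focusFirstMenuItem(items: Focusable[]): boolean {
  const first = items[0];
  if (first === undefined) {
    return false;
  }
  first.focus();
  return true;
}
```

Keys such as the arrows report multi-character names like "ArrowUp", so under this sketch they are left alone and remain available for moving through the message list.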
Identified issues
Discord has implemented accessibility quite successfully, particularly in its desktop app, though access issues and oversights in usability are still present. Desktop app issues are primarily oversights in usability, while the majority of true access issues are found in the mobile app.
When you are using the Discord desktop app and are focused on a text channel or direct message, you will be alerted (presumably through ARIA) that users are typing but not when new messages are posted. This makes it difficult to keep up with conversations. A user either has to leave the Discord window so that they will receive notifications, which are read by screen readers, or constantly check for new messages. Since users are alerted when others are typing, it seems that alerting a user to activity directly is possible, and it would be extremely useful if new messages were read to users as they arrived in a focused channel or direct/group message.
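One common way to provide this kind of readout on the web is an ARIA live region, an element whose text changes are announced by screen readers. The TypeScript sketch below shows the idea under stated assumptions: how Discord actually produces its typing alerts is not known, and the LiveRegionElement and IncomingMessage types and both function names are hypothetical. Only the aria-live and aria-atomic attributes themselves are standard ARIA.

```typescript
// Minimal structural type; a browser HTMLElement would satisfy it.
interface LiveRegionElement {
  textContent: string | null;
  setAttribute(name: string, value: string): void;
}

// Hypothetical message shape for illustration.
interface IncomingMessage {
  channelId: string;
  author: string;
  text: string;
}

// Mark an element as a live region.
function configureLiveRegion(region: LiveRegionElement): void {
  // "polite" lets the screen reader finish its current speech first.
  region.setAttribute("aria-live", "polite");
  // Announce the region's whole text, not only the part that changed.
  region.setAttribute("aria-atomic", "true");
}

// Announce a message only if it belongs to the channel or conversation
// the user currently has focused. Returns true if it was announced.
function announceNewMessage(
  region: LiveRegionElement,
  message: IncomingMessage,
  focusedChannelId: string | null,
): boolean {
  if (message.channelId !== focusedChannelId) {
    return false;
  }
  region.textContent = `${message.author}: ${message.text}`;
  return true;
}
```

In practice the region would usually be present in the page, visually hidden, before any message arrives, since screen readers generally announce changes to a live region rather than its initial content. Messages for other channels are skipped here on the assumption that ordinary notifications already cover them.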
In addition, if you close the Discord app using the "Close" button at the top of the interface, you will still be able to interact with the window as if it were still present until you manually move out of it. When the app is closed using this method, the behavior should be the same as when the user closes the app using "ALT+F4".
The mobile app has several moderate access issues that make it more difficult to use than the desktop app. In many cases, particularly when scrolling through message history, focus will jump far back in the message history, skipping what should have been next in the focus order. To replicate this, focus on the message field and begin swiping to previous items. The expectation would be that focus moves to the latest message in the history, but instead focus is placed on a message far back in the history. Screen reader focus changes should always be predictable to a user; in the desktop example described earlier, where typing moves focus to the message field, the redirection of focus is logical. If a mechanism is present to jump a user back to a specific point in message history, it should be identified to the user and activated only at the user’s choice.
Elements that are labeled in the desktop app are unlabeled in the mobile app in some cases. For example, if someone joins a server you are a part of and you have the option to react with a sticker, that option appears as an unlabeled button in the mobile app. The unlabeled item in the mobile app should have the same label as the desktop counterpart.
Some issues are unique to Discord itself. For instance, when entering a voice channel on mobile, the audio defaults to the handset speaker, making it very difficult to hear your screen reader. Most apps allow this to be switched by covering and uncovering the proximity sensor (and Discord does allow this when in a direct call with another user), but when in a voice channel, you must navigate to the "Change Audio Output" button and select it, which is quite difficult when your screen reader is coming through a quiet speaker. Default behavior on iOS is for audio to come through the handset speaker, but if the proximity sensor is covered and uncovered, the audio output is switched to the phone's main speakers.
Finally, on certain screens, such as the "Spectators" screen in a call in a voice channel, there is no button for returning to the previous screen. Fortunately, it is possible to use VoiceOver's "Scrub gesture" to activate a back function, but including an actual back or previous button would be useful. In many cases, the "Scrub gesture" is not supported, so using this method to leave a screen without a back button might not occur to a user.
Conclusion
Beyond general good accessibility practices such as proper labeling, Discord is particularly notable for properly using screen reader focus rerouting and for making a web app possible to navigate both in browse mode and in focus mode using standard keyboard navigation, something rarely seen in desktop web apps. Key issues in the web/desktop app are specific to Discord and concern usability rather than accessibility. Examples include the lack of message readout while in a channel or conversation and the lack of focus redirect when closing the app using the in-app "Close" button.
The mobile app lags behind the web app in general accessibility, with some minor labeling issues and some significant oversights or errors in screen reader navigation that can complicate usage for screen reader users. Aside from its generally complex nature, Discord is perfectly usable by people with blindness and low vision. Mobile issues can be identified and worked around, and the lack of some features in the desktop app that would make some tasks easier can be compensated for by using alternative methods.
This article is made possible in part by generous funding from the James H. and Alice Teubert Charitable Trust, Huntington, West Virginia.