You must inform certain classes of users -- or any user in certain contexts -- of conditions, alarms, alerts and other contextually-relevant or timebound content without requiring them to read the device screen.

Practically all mobile devices have audio output of some sort, and almost every application or website can access it. Strict limits can apply, such as devices which only output over headsets, or those which only send phone call audio over Bluetooth, and these can restrict the use of some tones.


Much as Notifications appear on the device screen (and sound alarms and blink the LED) when messages arrive or other such alerts activate, Voice Notifications read the content of a notification so the information can be used when the device is not in the hand, or cannot be viewed.

These can be especially useful for users or contexts in which reading would be difficult. A common case is turn-by-turn directions. The reminders are based on telemetry information instead of time (like alarms) or remote information (like messages), but the principle is identical: a message is read, without specific user input, at the relevant time.

Voice Notifications also leave the screen available for the display of other content. The user may glance at it to retrieve certain types of data (such as a map, in the navigation example), but not at the level or type of detail of the read-aloud text.

The best way to notify is using technologies like push messaging, supported by a surprising number of devices, on most networks. Remote messages can launch the application, then not only read the alert but present a correspondingly well-formatted page to look at. The audio should only play once the entire message has loaded, so the user can glance at the device to get other information and follow along. The user must always be able to pause or stop the audio, in case it is disrupting an event, or the user has privacy concerns.
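The playback rules in the paragraph above can be sketched as a small state model. This is only an illustration: the class and state names are hypothetical and do not correspond to any platform's audio API. It assumes playback is refused until the message has finished loading, and that stop is accepted in every state.

```python
from enum import Enum, auto


class PlaybackState(Enum):
    LOADING = auto()
    READY = auto()
    PLAYING = auto()
    PAUSED = auto()
    STOPPED = auto()


class VoicePlayback:
    """Hypothetical model of when a Voice Notification may be read aloud."""

    def __init__(self) -> None:
        self.state = PlaybackState.LOADING

    def message_loaded(self) -> None:
        # Called once the entire message (audio and page) is available.
        if self.state is PlaybackState.LOADING:
            self.state = PlaybackState.READY

    def play(self) -> bool:
        # Start or resume; refused while still loading or after a stop.
        if self.state in (PlaybackState.READY, PlaybackState.PAUSED):
            self.state = PlaybackState.PLAYING
            return True
        return False

    def pause(self) -> None:
        if self.state is PlaybackState.PLAYING:
            self.state = PlaybackState.PAUSED

    def stop(self) -> None:
        # Stopping is always allowed, whatever the current state.
        self.state = PlaybackState.STOPPED
```

Keeping stop unconditional reflects the requirement that the user can always silence the audio, even before it has started.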


There are only two basic variations:

If the user requests voice output, or voice output is used as a part of generally navigating the device UI, this is a Voice Readback pattern instead.

Interaction Details

Voice Notifications occur based on time, position, remote messaging, or other actions outside the user's direct control.

Treat reminders currently being read the same way as other alarms. Make sure they appear in the Notifications area, or as Pop-Up dialogues, as the rest of the OS standards dictate. You must also enable the same actions; they may be muted, snoozed, or cancelled. Do not snooze alarms that are irrelevant at a later time, such as position-related messages.
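As a minimal sketch of the rule above, the following decides which alarm actions to offer for a notification. The type names and the three trigger categories are assumptions made for illustration; a real implementation would follow the OS notification standards instead.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Trigger(Enum):
    TIME = auto()
    POSITION = auto()
    REMOTE_MESSAGE = auto()


class Action(Enum):
    MUTE = auto()
    SNOOZE = auto()
    CANCEL = auto()


@dataclass(frozen=True)
class VoiceNotification:
    text: str
    trigger: Trigger


def allowed_actions(notification: VoiceNotification) -> set[Action]:
    """Return the alarm-style actions to offer for this notification."""
    actions = {Action.MUTE, Action.CANCEL}
    # Position-related messages are irrelevant later, so snooze is not offered.
    if notification.trigger is not Trigger.POSITION:
        actions.add(Action.SNOOZE)
    return actions
```

Here only position-triggered messages lose the snooze action; other kinds of alert that become irrelevant later would need the same treatment.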

Muting the voice component is the same as silencing an alarm from Tones. The notification itself will remain in the Notifications area, and may be manually selected and viewed (or listened to), or be cleared.

A privacy mode may only read a generic version of the message. When the user interacts with the device, a more detailed message can be seen, or read aloud. This will rapidly become the Voice Readback pattern. See that for specifics of the design and interaction.
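One way to picture the privacy mode is a message that carries two versions of its text. The sketch below is illustrative only: the `Message` type, its field names, and the `user_interacting` flag are hypothetical, and it assumes the application can tell when the user has picked up or unlocked the device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    generic: str   # safe to read aloud over the speaker
    detailed: str  # may contain private data


def text_to_speak(message: Message, privacy_mode: bool, user_interacting: bool) -> str:
    """Choose which version of the message to read aloud."""
    # In privacy mode, read only the generic text until the user interacts.
    if privacy_mode and not user_interacting:
        return message.generic
    return message.detailed
```

The same choice can drive what is shown on screen, so that the spoken and visual versions of the alert stay in step.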

Consider informing users of the risks of reading private data aloud, when setting up Voice Notifications, and provide methods to change or cancel them easily.

In-context Voice Notifications are most often used for features like turn-by-turn navigation. The display is mostly reserved for pictorial displays of information, which are more glanceable, and allow multiple types of information to be presented at once. A summary of the audio message should be presented on screen, even if expected to be unreadable; the user may change contexts (e.g. stop the car), another individual may be able to assist the user, and the mere presence of the information box reinforces that the message is originating from this device, and relates to this task.

Presentation Details

Make sure all alerts are presented visually, as well as via audio. See the patterns under Notifications for display of notifications. Even in-context, the alert should be noted in some manner, though this may be more specific, such as an Annotation over a map. Only use Pop-Up dialogues or other intrusive measures when absolutely required.

To inform the user that audio is about to commence, and to prepare them for the volume level, play a subtle tone or unimportant audio message immediately beforehand. This may also be the normal alert tone, to emphasize the device is communicating an alert. Make sure the volume of this tone is similar to (or lower than) the audio to be read aloud, as the intention is not just to consciously alert the user, but to prime their auditory system for receiving messages.
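The volume rule above can be sketched as a simple clamp. This assumes, for illustration only, that volumes are expressed on a 0.0 to 1.0 scale and that the application knows the speech volume before the tone plays; the function names and the plan format are hypothetical.

```python
def priming_tone_volume(speech_volume: float, preferred_tone_volume: float) -> float:
    """Return a tone volume that is never louder than the speech that follows.

    Both volumes are assumed to be on a 0.0 (silent) to 1.0 (maximum) scale.
    """
    for value in (speech_volume, preferred_tone_volume):
        if not 0.0 <= value <= 1.0:
            raise ValueError("volumes must be between 0.0 and 1.0")
    return min(preferred_tone_volume, speech_volume)


def playback_plan(
    text: str, speech_volume: float, preferred_tone_volume: float
) -> list[tuple[str, str, float]]:
    """Order the audio: priming tone first, then the spoken message."""
    tone_volume = priming_tone_volume(speech_volume, preferred_tone_volume)
    return [("tone", "alert", tone_volume), ("speech", text, speech_volume)]
```

The plan is just an ordered list that a player could step through; the point is that the tone is derived from the speech volume, not set independently.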

Use syntax that makes it clear what is being communicated, and attempt to give users a moment to acclimate to the voice and the instruction format, even after any notification Tones. For example, state the type of message or action that the user must take, then the details:

Repetition is generally not perceived as negatively in spoken phrases as it is in print, due to the immediacy of audio and the differences between auditory and visual perception.

Use caution when choosing what sort of content will be read -- and to what level of detail. Alerts will generally be sent through the speaker, so they have minimal privacy. Since anyone nearby can hear, messages must be formatted to be general or to have no explicitly secret information. For example, instead of stating the specific medicine and dosage, the message could just be "Time to take your medication... Be sure to take it with a meal."

Use a consistent voice and tone of voice for all notifications. This way, the user can become accustomed to the reminder, and be able to comprehend the intent without listening as closely to opening phrases.


Avoid "blurting out" alerts or instructions, especially with important information in the beginning of the phrase. Directions such as "Turn Right in 500 yards" are often not useful, as the user is still acclimating to the voice speaking through the direction statement; they may only hear "... in 500 yards."
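One hedged way to avoid blurting is to build every spoken phrase from a low-information opening followed by the instruction itself, so that nothing critical is lost while the listener tunes in. The function name and the opening phrase below are hypothetical, chosen only to illustrate the idea.

```python
def with_lead_in(instruction: str, lead_in: str = "Navigation instruction") -> str:
    """Prefix an instruction with a short opening phrase.

    The opening carries no critical information, so the listener can miss it
    while acclimating to the voice and still hear the full instruction.
    """
    instruction = instruction.strip()
    lead_in = lead_in.strip().rstrip(".")
    if not instruction:
        raise ValueError("instruction must not be empty")
    return f"{lead_in}. {instruction}"
```

For example, `with_lead_in("Turn right in 500 yards")` moves "Turn right" away from the very start of the audio.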

Voices provoke strong emotional responses, and a voice the user is not accustomed to or expecting may be startling. Avoid using Voice Notifications when users are asleep, and be sure to precede all notifications with Tones to make it clearer that this is a machine, not a scary intruder sneaking up from behind.
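A device cannot know for certain that a user is asleep, so one rough approximation is a user-configured quiet period during which voice output is suppressed. The sketch below assumes such a setting exists; the function name and the example hours are hypothetical. It handles quiet periods that run past midnight.

```python
from datetime import time


def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls in the quiet period [start, end)."""
    if start <= end:
        # Period within a single day, e.g. 13:00 to 15:00.
        return start <= now < end
    # Period wraps past midnight, e.g. 22:00 to 07:00.
    return now >= start or now < end
```

When this returns True, the notification can still appear visually in the Notifications area; only the spoken part is held back.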

The voice you use must be as understandable as possible. Even within a single language, be sure to use the most generic possible accents, and be aware of regional idioms.

Text-to-voice translation of names, especially, can be difficult to understand or improperly pronounced. If quality is too low with the available hardware and software, do not implement the solution.

Voice Notifications (last edited 2011-07-31 23:56:48 by shoobe01)