SAE Technical Standards Board Rules provide that: “This report is published by SAE to advance the state of technical and engineering sciences. The use of this report is entirely voluntary, and its applicability and suitability for any particular use, including any patent infringement arising therefrom, is the sole responsibility of the user.”

SAE reviews each technical report at least every five years at which time it may be revised, reaffirmed, stabilized, or cancelled. SAE invites your written comments and suggestions.

Copyright 2015 SAE International

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of SAE.

TO PLACE A DOCUMENT ORDER:
Tel: 877-606-7323 (inside USA and Canada)
Tel: +1 724-776-4970 (outside USA)
Fax: 724-776-0790
Email: CustomerService@sae.org
SAE WEB ADDRESS: http://www.sae.org

SAE values your input. To provide feedback on this Technical Report, please visit http://www.sae.org/technical/standards/J2988_201506

SURFACE VEHICLE INFORMATION REPORT
J2988 JUN2015
Issued 2015-06

Guidelines for Speech Input and Audible Output in a Driver Vehicle Interface

RATIONALE

This SAE Information Report is provided to establish a set of high-level guidelines for systems with speech input and audible output as a means of controlling select vehicle features and functions. This Information Report addresses the appropriate application of speech input and audible output systems, their general operation and other aspects of these systems. While not a comprehensive guideline, this Information Report helps to establish general guidance to promote consistency in user experiences and expectations for operation of these systems.

TABLE OF CONTENTS

1. SCOPE
1.1 Purpose
2. REFERENCES
2.1 Applicable Documents
3. DEFINITIONS
4. GUIDANCE ON WHEN TO IMPLEMENT SPEECH INPUT AND AUDIBLE OUTPUT
4.1 Appropriate Implementation of Speech Input and Audible Output Systems
4.2 Inappropriate Implementation of Speech Input and Audible Output System
4.3 Speech Input Should Not Be Applied to the Following Vehicle Controls or Features
5. DIALOGUE STRUCTURE GUIDANCE
5.1 Dialogue Structure Elements
5.2 Recognition Guidance
5.3 Dialogue Structure Recommendations
6. GUIDANCE FOR HANDLING ERRORS
6.1 Descriptions and Guidance for Specific Errors
7. SYSTEM FEEDBACK GUIDANCE
8. USER ASSISTANCE GUIDANCE
9. NOTES
9.1 Marginal Indicia

SAE INTERNATIONAL J2988 Issued JUN2015 Page 2 of 10

1. SCOPE

The scope of this document is a technology-neutral approach
to speech input and audible output system guidelines applicable for OEM and aftermarket systems in light vehicles. These may be stand-alone interfaces or the speech aspects of multi-modal interfaces. This document does not apply to speech input and audible output systems used to interact with automation or automated driving systems in vehicles that are equipped with such systems while they are in use (ref. J3016:JAN2014).

1.1 Purpose

To provide speech input and audible output guidelines for system designers and integrators developing these systems intended for use by the driver while the vehicle is in motion.

2. REFERENCES

2.1 Applicable Documents

The following publications form a part of this specification to the extent specified herein. Unless otherwise indicated, the latest issue of SAE publications shall apply.

2.1.1 SAE Publications

Available from SAE International, 400 Commonwealth Drive, Warrendale, PA 15096-0001, Tel: 877-606-7323 (inside USA and Canada) or +1 724-776-4970 (outside USA), www.sae.org.

SAE J3016 Taxonomy and Definitions for Terms Related to On-Road Motor Vehicle Automated Driving Systems

2.1.2 Related Publications

The following publications are provided for
information purposes only and are not a required part of this SAE Technical Report.

Statement of Principles, Criteria and Verification Procedures on Driver Interactions with Advanced In-Vehicle Information and Communication Systems (Alliance of Automobile Manufacturers, 2006)

3. DEFINITIONS

3.1 BARGE-IN

Process that allows or enables a user to speak over an audible system prompt in order to utter the next entry, without having to exercise a mechanical interaction with a control.

NOTE: This is distinguished from mechanical interruption, such as by pressing a button to interrupt a dialogue.

3.2 DIALOGUE

Verbal interaction sequence between the user and the system consisting of what the user says and how the system responds, including the specific phrasing, the format of auditory feedback, the timing and the control logic.

3.3 ENDPOINTING

Process by which a system automatically determines the beginning and end of a human speech utterance.

3.4 MULTI-MODAL INTERFACE

Means of controlling a function or feature that combines voice and other interface modalities (often, visual-manual interfaces) and may alternate between them during a task.

NOTE: Other modalities such as gestures and haptics
may also be part of a multi-modal interface.

3.5 TASK

Sequence of control operations (i.e., specific method) leading to a goal at which the driver will normally persist until the goal is reached.

3.6 TIMEOUT

System-defined amount of elapsed time without user speech entry, after which the user is prompted that the system is again ready to receive input.

NOTE: Timeouts are usually followed by a re-prompt, unless multiple timeouts occur successively.

EXAMPLE: A system has a maximum time of X seconds to wait for speech input, at which point the
system times out and closes the listening window. It then indicates to the user that no input was received.

3.7 UNIVERSAL COMMANDS

Verbal commands that can be uttered at any time and will be recognized and acted upon by the system in question, regardless of the sub-system or function in use, or the present location within a menu hierarchy.

3.8 VISUAL-MANUAL INTERFACE

Means of interacting with a function or device using manual control(s) and visual feedback.

NOTE: There may or may not be system feedback, and if provided, the feedback could be voice, audible (tone or other sounds), a visual display, haptic, or observable completion of a driver-requested action.

3.9 VOICE USER INTERFACE

Means of interacting with a function or device using speech input and audible output.

NOTE: There may or may not be system feedback, and if provided, the feedback could be voice, audible (tone or other sounds), a visual display, haptic, or observable completion of a driver-requested action.

4. GUIDANCE ON WHEN TO IMPLEMENT SPEECH INPUT AND AUDIBLE OUTPUT

Speech input should not be used for every function within the vehicle. The addition of each command affects the recognition rate of the other commands. Drivers also tend to use a subset of commands regularly and otherwise do not continue to explore the system and learn new commands. A voice interface should be provided when supporting data indicate that driver performance would be better than otherwise expected for an alternative interface, such as a visual-manual interface.

4.1 Appropriate Implementation of Speech Input and Audible Output Systems

Speech input should be considered for implementation when the visual-manual implementation might otherwise be overly complex. Examples of uses for speech input include, but are not limited to, the following, and should not be considered mutually exclusive with respect to other means of control:

- Functions that a driver will perform regularly while driving.
- Functions that are complex and cannot be performed more efficiently with traditional controls.
- Functions that are readily understandable to the driver (e.g., phone “dialing” by name).
- Functions where the speech input and audible output add value to the system (e.g., long list searches).

4.2 Inappropriate Implementation of Speech Input and Audible Output System

Restrictions should be placed on the use of voice interfaces as a control mechanism for certain functions. Consideration should be given to the balance of risk and benefit resulting from the application of voice control.

EXAMPLE: The
use of voice control for adjustment of a head restraint has the benefit of allowing the adjustment to be made while the vehicle is being driven, without the driver having to remove his or her gaze from the road or hands from the steering wheel. However, this benefit must be balanced against the risks associated with unintended adjustments, the potential for the user to not know the voice commands required to reverse or stop an adjustment in progress, and the possibility of a substitution error resulting in the unwanted movement of the head restraint.

4.3 Speech Input Should Not Be Applied to the Following Vehicle Controls or Features

There are conditions where speech input should not be allowed. Examples of when speech input should not be implemented include, but are not limited to, the following:

- Vehicle controls that directly and continuously affect vehicle motion (propulsion, braking, steering), or that change the operation or activation status of such controls (i.e., braking systems, transmission shifter/selector, motor/engine on or off).
- Specific operations with obvious adverse implications for driver focus or occupant protection, such as opening a door, trunk, lift gate or hood while the vehicle is in motion.

NOTE: This recommended limitation on the application of speech input does not apply to the opening and closing of components that are designed to be operated during driving, such as windows, sunroofs, or cabin partitions.

If a speech input and audible output system is
provided for controlling the following features, redundant visual-manual controls should be provided to help a driver to quickly prevent or override inadvertent or unintended voice commands:

- Controls specified by FMVSS 101 including: exterior lamps, turn signals and hazard indicator lights.
- Controls used for seat movement, restraint positioning (e.g., seat belt upper anchor position) or deactivation, or vehicle components related to driver vision (mirrors, windows, and camera).

5. DIALOGUE STRUCTURE GUIDANCE

5.1 Dialogue Structure Elements

A dialogue structure includes several components designed to blend together in order to make the user experience consistent and simple. The elements listed below offer recommendations and examples that can help the developer create a dialogue structure that is intuitive and efficient for general users.

5.1.1 Top-Level Grammar

The top-level grammar refers to the initial commands and syntax that are available to a user. Often referred to as the main menu, the top-level grammar should be designed to represent the system-enabled tasks in a way that a novice user can easily be made aware of the grammar items.

EXAMPLE: The system provides the user with
a list of available options: “Available options include Navigation, Climate, Audio, Help or Cancel.”

NOTE: A system that recognizes natural speech will be less dependent on top-level grammar structures.

5.1.2 Sub-Menu Grammars

Sub-menu grammar refers to subsequent commands and syntax available to the user after traversing the top-level grammar. Usually, at a given step of a dialogue, a finite set of commands is available to the user, and those commands may not be available in other locations within the menu. Like the top-level grammar, sub-menu grammars, which are unique to a specific location in a menu, should be designed in a way that a novice user can easily be made aware of the grammar items. If the dialogue leads to a specific set of responses, the system could offer these responses to the user.

EXAMPLE: “Would you like to set a new destination? Please say yes or no.”

5.1.3 Universal Commands

Universal commands are commands that function in the same way, regardless of their location in the dialogue structure, and regardless of function or context. They allow users to perform common actions such as navigating within the dialogue, exiting the dialogue, or accessing help.

EXAMPLE: Universal commands include “help,” “cancel,” “repeat,” “go back,” or “main menu.” See 3.7.

5.1.4 Dialogue Complexity

Dialogue complexity refers to the level of ease and efficiency a system provides for a user attempting to complete a task. Minimizing the number of dialogue steps and optimizing usability are two ways to reduce dialogue complexity. The vocabularies (words a user may use in response to a prompt) should be intuitive and easy for users to identify and/or remember. Intuitiveness is subjective and can be assessed through usability studies. In general, maximizing simplicity for the user is important so that the user can focus on driving the vehicle.

5.2 Recognition Guidance

5.2.1 Recognition Accuracy

Recognition accuracy, namely the measure (e.g., percentage) of valid utterances that are correctly recognized, will vary among different users, but developers should target the highest possible levels of recognition accuracy for valid utterances by native speakers using the target dialect. Command words should be selected to be easily recognized by the system with a minimum potential for confusion with other words. In general, avoid single-syllable words that rhyme. Multiple-syllable words tend to be more easily recognized than single-syllable words.

5.2.2 Recognition Confidence

A recognition confidence score is a measure of the likelihood that an utterance is recognized correctly. Recognition confidence scores should be used to determine whether confirmation by the user is necessary. A low confidence score generally means a high degree of recognition uncertainty, indicating a need for a confirmation prompt. An extremely low confidence score usually indicates misrecognition, indicating the need to re-prompt the user.

EXAMPLE 1: A confirmation prompt could be “204 Main Street. Is this correct?”

EXAMPLE 2: A re-prompt could be “Please repeat.”

5.3 Dialogue Structure Recommendations

The recommendations below are intended to provide additional guidance when constructing a dialogue structure. These recommendations are intended to assist the user in navigating through the dialogue structure.

5.3.1 Consider Including a Clear Dialogue Initiation Method

A system should provide a means for a user to conveniently start a speech session, such as pressing a button with a “speak” symbol/label located on the left or right side of the steering wheel, uttering a keyword, or other appropriate means.
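As an illustration only, the confidence handling described in 5.2.2 could be sketched as follows. This report does not specify numeric thresholds, a score scale, or any programming interface, so the 0.0 to 1.0 scale, the two threshold values, and the names `Action`, `choose_action`, and `next_prompt` are all hypothetical assumptions made for this sketch. The prompt wording is taken from EXAMPLE 1 and EXAMPLE 2 of 5.2.2.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    ACCEPT = "accept"        # act on the recognized utterance
    CONFIRM = "confirm"      # ask the user to confirm (5.2.2, EXAMPLE 1)
    REPROMPT = "re-prompt"   # ask the user to repeat (5.2.2, EXAMPLE 2)


# Hypothetical thresholds on an assumed 0.0-1.0 confidence scale.
# The report gives no numeric values; these would be tuned per recognizer.
CONFIRM_BELOW = 0.80
REPROMPT_BELOW = 0.40


def choose_action(confidence: float) -> Action:
    """Map a recognition confidence score to the next dialogue step."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < REPROMPT_BELOW:
        # Extremely low score: likely misrecognition, so re-prompt.
        return Action.REPROMPT
    if confidence < CONFIRM_BELOW:
        # Low score: high uncertainty, so ask for confirmation.
        return Action.CONFIRM
    return Action.ACCEPT


def next_prompt(recognized_text: str, confidence: float) -> Optional[str]:
    """Return the audible prompt for the chosen action.

    Returns None when the utterance is accepted; what the system says on
    acceptance is outside this sketch (an assumption, not from the report).
    """
    action = choose_action(confidence)
    if action is Action.REPROMPT:
        return "Please repeat."
    if action is Action.CONFIRM:
        return f"{recognized_text}. Is this correct?"
    return None
```

With these assumed thresholds, `next_prompt("204 Main Street", 0.6)` produces the confirmation prompt of EXAMPLE 1, and a score of 0.1 produces the re-prompt of EXAMPLE 2. Keeping the thresholds as named constants makes it straightforward to tune them from usability data.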