ETSI TS 126 243 V15.0.0 (2018-07)

Digital cellular telecommunications system (Phase 2+) (GSM);
Universal Mobile Telecommunications System (UMTS);
LTE;
ANSI-C code for the fixed-point distributed speech recognition extended advanced front-end
(3GPP TS 26.243 version 15.0.0 Release 15)

TECHNICAL SPECIFICATION

Reference
RTS/TSGS-0426243vf00

Keywords
GSM, LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on
a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and
microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members. 3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. oneM2M logo is protected for the benefit of its Members. GSM® and the GSM logo are trademarks registered and owned by the GSM Association.

Intellectual Property Rights

Essential patents

IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest
updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.

Trademarks

The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners. ETSI claims no ownership of these except for any which are indicated as being the property of
ETSI, and conveys no right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.

Foreword

This Technical Specification (TS) has been
produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document “shall”, “shall not”, “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

“must” and “must not” are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 C code structure
4.1 Contents of the C source code
4.2 Program execution
4.3 Code hierarchy
4.5 Variables, constants and tables
4.5.1 Description of constants used in the C-code
4.5.2 Description of fixed tables used in the C-code
4.5.3 Static variables used in the C-code
5 File formats
5.1 Speech file
Annex A (informative): Change history
History
Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.
1 Scope

The present document contains an electronic copy of the ANSI-C code for DSR Extended Advanced Front-end. The ANSI-C code is necessary for a bit exact implementation of DSR Extended Advanced Front-end.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

[1] ETSI ES 202 050 (2007-01) V1.1.5: “Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithm”.
[2] ETSI ES 202 212 (2005-11) V1.1.2: “Distributed Speech Recognition; Extended Advanced Front-end Feature Extraction Algorithm; Compression Algorithm, Back-end Speech Reconstruction Algorithm”.
[3] 3GPP TS 26.177: “Speech Enabled Services (SES); Distributed Speech Recognition (DSR) extended advanced front-end test sequences”.

3 Definitions and abbreviations

3.1 Definitions

Definitions of terms used in the present document can be found in [1] and [2].

3.2 Abbreviations

For the purpose of the present document, the following abbreviations apply:

ANSI   American National Standards Institute
I/O    Input/Output
RAM    Random Access Memory
ROM    Read Only Memory
AFE    Advanced Front-end
X-AFE  eXtended Advanced Front-end
DSR    Distributed Speech Recognition

4 C code structure

This clause gives an overview of the structure of the bit-exact C code and of the contents and organization of the C code attached to this document.

The C code has been verified on the following systems:

- Sun Microsystems workstations and GNU gcc compiler
- IBM PC compatible computers with Linux operating system and GNU gcc compiler.

ANSI-C was selected as the programming language because portability was desirable.

4.1 Contents of the C source code

The distributed files with suffix “c”
contain the source code and the files with suffix “h” are the header files. Makefiles are provided for the platforms in which the C code has been verified (listed above).

4.2 Program execution

There are separate executables for the FrontEnd and Vector Quantization, with and without Extensions. The command line options are described below. In the usage lines, a name following an option is the parameter for that option, and a value in parentheses is the default.

FrontEnd w/ Extension:

USAGE: bin/ExtAdvFrontEnd infile HTK_outfile pitch_outfile class_outfile options

OPTIONS:
  -q                     Quiet Mode (FALSE)
  -F format              Input file format (NIST)
  -fs freq               Sampling frequency in kHz (8)
  -swap                  Change input byte ordering (Native)
  -noh                   No HTK header to output file (FALSE)
  -noc0                  No c0 coefficient to output feature vector (FALSE)
  -nologE                No logE component to output feature vector (FALSE)
  -skip_header_bytes n   Skip header, first n bytes (only for -F RAW)

-noh, -noc0, -nologE and -skip_header_bytes are not used and should not be changed.

FrontEnd w/o Extension:

USAGE: bin/AdvFrontEnd infile HTK_outfile options

OPTIONS: Same as FrontEnd w/ Extension

Vector Quantization w/ Extension:

Usage: extcoder htk_file_in pitch_file_in class_file_in bitstream_file_out pitch_file_out txt_file_out -freq x -VAD/No_VAD

  htk_file_in          Input mel-frequency cepstral coefficient file in HTK MFCC format.
  pitch_file_in        Input pitch period file.
  class_file_in        Input classification file.
  bitstream_file_out   Output binary bitstream.
  pitch_file_out       Output quantised pitch period file.
  txt_file_out         Vector quantiser output in text format.
  -freq x              Sampling frequency in kHz (8 or 16).
  -VAD                 Use voice activity detector data. Voice activity input file must have same name as htk_file, but extension .vad
  -No_VAD              Do not incorporate voice activity detector information in output bitstream.

Vector Quantization w/o Extension:

Usage: coder htk_file_in bitstream_file_out txt_file_out -freq x -VAD/No_VAD

  htk_file_in          Input mel-frequency cepstral coefficient file in HTK MFCC format.
  bitstream_file_out   Binary output bitstream.
  txt_file_out         Vector quantiser output in text format.
  -freq x              Sampling frequency in kHz (8 or 16).
  -VAD                 Use voice activity detector data. Voice activity input file must have same name as htk_file, but extension .vad
  -No_VAD              Do not incorporate voice activity detector information in output bitstream.

File extension descriptions as generated by the sample script:

  .cep     Binary file containing cepstral features in HTK format. Output from the FrontEnd, input to the vector quantizer.
  .pitch   Binary file containing pitch information. Output from the FrontEnd, input to the vector quantizer. Only used for Extension.
  .class   ASCII file containing class information. Output from the FrontEnd, input to the vector quantizer. Only used for Extension.
  .bs      Binary file containing the bitstream. Output from the vector quantizer.
  .log     Log files from the different executables.
4.3 Code hierarchy

Tables 1 to 3 are call graphs that show the functions used for AFE (table 1), VQ (table 2), and Extension (table 3). Each column represents a call level and each cell a function. The functions contain calls to the functions in rightwards neighboring cells. The time order in the call graphs is from the top downwards as the processing of a frame advances. All standard C functions (printf(), fwrite(), etc.) have been omitted. Also, no basic operations (add(), L_add(), mac(), etc.) or double precision extended operations (e.g. L_Extract()) appear in the graphs. The basic operations are not counted as extending the depth, therefore the deepest level in this software is level 7.

Table 1: AFE call structure
main()
AdvProcessInit_B()
DoNoiseSupInit_B()
DoWaveProcInit_B()
DoCompCepsInit_B()
DoPostProcInit_B()
DoVADInit_F()
Do16kProcInit_B()
QMF_FIR_Init_B()
fir_initialization_B()
DP_HP_filters_B()
BufIn32Alloc()
AdvProcessAlloc_B()
DoNoiseSupAlloc_B()
DoWaveProcAlloc_B()
DoCompCepsAlloc_B()
DoPostProcAlloc_B()
DoVADAlloc_F()
Do16kProcAlloc_B()
FlushAdvProcess_B()
DoVADFlush_F()
CvFeatInt2Float()
AdvProcessDelete_B()
DoNoiseSupDelete_B()
DoWaveProcDelete_B()
DoCompCepsDelete_B()
DoPostProcDelete_B()
DoVADDelete_B()
BufIn32Free()
DoAdvProcess_B()
Do16kProcessing_B()
DoNoiseSup_B()
Get16k_p_bufferData16k_B()
Get16k_bufData16kSize_B()
Get16k_p_BandsForCoding16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_dataHP_B()
VAD_F()
Log_2()
DoSigWindowing16_F1()
DoSigWindowing16_F2()
ff4NRFix32_B()
GetL15()
GetH15()
Mult16x32()
Add_Mult16x16_16()
Sub_Mult16x16_16()
Permut()
FFTtoPSD_F()
Square24d2_B()
Square24_B()
Get16k_BFC_dec_B()
GetBandsForCoding16k_B()
PSDMean_F()
NoiseEstimation_F1()
Sqrt_2()
Sqrt16_2()
NoiseEstimation_F2()
Sqrt_2()
Sqrt16_2()
FilterCalc_F()
SpeechQVar()
FilterBank16()
SpeechQSpec()
SpeechQMel()
DoGainFact_F1()
Log_2()
DoGainFact_F2()
Log_2()
DoMelIDCT_F16()
ApplyWF()
Get16k_dec1()
Get16k_dec2()
Get16k_dec3()
DoSigWindowing16_F3()
ff4NRFix32_B()
GetL15()
GetH15()
Mult16x32()
Add_Mult16x16_16()
Sub_Mult16x16_16()
Permut()
FFTtoPSD_F()
Square24d2_B()
Square24_B()
DoMelFB_B()
CodeBands16k_B()
DoSpecSub16k_B()
Log_2()
UpDateDecal()
ApplyDecal()
DCOffsetFil_F()
Get16k_hpBandsSize_B()
Get16k_p_hpBands_B()
Get16k_p_bufferCodeForBands16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_p_bufferCodeWeights_B()
Get16k_p_codeWeights_B()
Set16k_hpBands_dec_B()
DoWaveProc_B()
TeagerEng()
GetTeagerFilter()
GetMaximaPositions()
DoCompCeps_B()
CepsCompute()
Get16k_p_bufferCodeWeights_B()
Get16k_p_bufferCodeForBands16k_B()
PreEmphHamm()
ff4NB16_B()
GetBandsForDecoding16k_B()
DecodeBands16k_B()
FilterBank()
Get16k_hpBands_dec_B()
Get16k_p_hpBands_B()
MergeSSandCoded_B()
CorrectEnergy_B()
CosInv16Khz()
cosInv() (only for 8 kHz)
DoPostProc_B()
DoVADProc_F()
focalpoint()

Table 2: VQ call structure

main()
quantize_and_print()
get_best_dataframe()
best_centroid()
quant_pitch_abs()
get_class_bit()
quant_pitch_diff()
get_class_bit()
mfcc_crc_encode()
pc_crc_encode()
Table 3: Extension call structure

main()
RVC_ConstructPitchRom_be()
RVC_ConstructPitchMeter_be()
Allocate_InterpolatedDft_be()
RVC_ResetPitchMeter_be()
RVC_DestructPitchRom_be()
RVC_DestructPitchMeter_be()
Deallocate_InterpolatedDft_be()
DoAdvProcess_B()
DoPitchExtract()
FilterBank()
dsr_afe_vad()
get_vm()
fnLog2()
IsLowBandNoise()
get_zcm()
pre_process()
iir_d()
iir_s()
RVC_MeasurePi