Voiceprint-based speaker identification, speaker-attributed speech recognition, and voiceprint enrollment templates

Date: 2023-05-04 04:27:13 Views: 113781 Author: 11

1、Statistics pooling http://danielpovey.com/files/2017_interspeech_embeddings.pdf

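The statistics pooling layer calculates the mean vector $\mu$ as well as the second-order statistics as the standard deviation vector $\sigma$ over the frame-level features $h_t$ ($t = 1, \dots, T$):

$$\mu = \frac{1}{T} \sum_{t=1}^{T} h_t$$

$$\sigma = \sqrt{\frac{1}{T} \sum_{t=1}^{T} h_t \odot h_t - \mu \odot \mu}$$

where $\odot$ represents the Hadamard product.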
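As a rough illustration of the pooling step described above, the following pure-Python sketch computes the per-dimension mean and standard deviation over a list of frames. The function name `statistics_pooling` and the toy input are assumptions made for this example, and concatenating the two vectors into one utterance-level vector follows common x-vector practice rather than anything stated here.

```python
import math

def statistics_pooling(frames):
    """Return (mean, std) vectors pooled over a list of equal-length frames."""
    num_frames = len(frames)
    dim = len(frames[0])
    # First-order statistic: mean over frames, per dimension.
    mean = [sum(f[d] for f in frames) / num_frames for d in range(dim)]
    # Second moment per dimension: (1/T) * sum_t h_t * h_t (element-wise).
    second = [sum(f[d] * f[d] for f in frames) / num_frames for d in range(dim)]
    # Clamp at zero to guard against tiny negative values caused by rounding.
    std = [math.sqrt(max(s - m * m, 0.0)) for s, m in zip(second, mean)]
    return mean, std

# Toy input (hypothetical): 3 frames with 2 features each.
frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]
mean, std = statistics_pooling(frames)
embedding = mean + std  # list concatenation: [mean..., std...]
```

The output has twice the feature dimension regardless of how many frames the utterance contains, which is what lets a variable-length input feed fixed-size layers.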

2、Attentive statistics pooling https://arxiv.org/pdf/1803.10963.pdf

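An attention model calculates a scalar score $e_t$ for each frame-level feature:

$$e_t = v^{\top} f(W h_t + b) + k$$

where $f(\cdot)$ is a non-linear activation function, such as a tanh or ReLU function.

The score is normalized over all frames by a softmax function so that the weights sum to one:

$$\alpha_t = \frac{\exp(e_t)}{\sum_{\tau=1}^{T} \exp(e_\tau)}$$

The normalized score $\alpha_t$ is then used as the weight in the pooling layer to calculate the weighted mean vector:

$$\tilde{\mu} = \sum_{t=1}^{T} \alpha_t h_t$$

The weighted standard deviation is defined as follows:

$$\tilde{\sigma} = \sqrt{\sum_{t=1}^{T} \alpha_t h_t \odot h_t - \tilde{\mu} \odot \tilde{\mu}}$$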
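The four steps in this section (score, softmax, weighted mean, weighted standard deviation) can be sketched in pure Python as follows. This is a minimal sketch, not a reference implementation: the function names, the choice of tanh for the activation, and all parameter values (`W`, `b`, `v`, `k`) are hypothetical and would be learned in a real network.

```python
import math

def softmax(scores):
    """Normalize scores so they are positive and sum to one."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_scores(frames, W, b, v, k):
    """One scalar score per frame: v^T tanh(W h_t + b) + k."""
    scores = []
    for h in frames:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
                  for row, bias in zip(W, b)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)) + k)
    return scores

def attentive_statistics_pooling(frames, weights):
    """Weighted mean and weighted standard deviation over frames."""
    dim = len(frames[0])
    mean = [sum(a * f[d] for a, f in zip(weights, frames)) for d in range(dim)]
    second = [sum(a * f[d] * f[d] for a, f in zip(weights, frames))
              for d in range(dim)]
    std = [math.sqrt(max(s - m * m, 0.0)) for s, m in zip(second, mean)]
    return mean, std

# Toy frames and parameters (all hypothetical).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.5], [0.25, 0.75]]
b = [0.0, 0.1]
v = [1.0, -1.0]
k = 0.0

alphas = softmax(attention_scores(frames, W, b, v, k))
w_mean, w_std = attentive_statistics_pooling(frames, alphas)
```

With uniform weights of 1/T the function reduces to plain statistics pooling, which is a convenient sanity check when wiring this into a model.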

3、Self-attentive pooling https://danielpovey.com/files/2018_interspeech_xvector_attention.pdf

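$H = \{h_1, h_2, \dots, h_T\}$, where $h_t$ is the hidden representation of input frame $x_t$ captured by the hidden layer below the self-attention layer. With $h_t$ of dimension $d_h$, $H$ is treated as a $d_h \times T$ matrix, and the self-attention layer outputs an annotation matrix $A$:

$$A = \mathrm{softmax}\left(g(H^{\top} W_1)\, W_2\right)$$

where $W_1$ is a matrix of size $d_h \times d_a$; $W_2$ is a matrix of size $d_a \times d_r$, and $d_r$ is a hyperparameter that represents the number of attention heads; $g(\cdot)$ is some activation function, and ReLU is chosen here. The $\mathrm{softmax}(\cdot)$ is performed column-wise.

Each column vector of $A$ is an annotation vector that represents the weights for different $h_t$. Finally, the weighted means $E$ is obtained by

$$E = H A$$

By increasing $d_r$, we can easily have multiple attention heads to learn different aspects from a speaker's speech. To encourage diversity in the annotation vectors, so that each head extracts dissimilar information from the same speech segment, a penalty term $P$ is introduced when $d_r > 1$:

$$P = \lVert A^{\top} A - I \rVert_F^2$$

where $I$ is the identity matrix and $\lVert \cdot \rVert_F$ represents the Frobenius norm of a matrix. $P$ is similar to L2 regularization and is minimized together with the original cost of the system.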
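To make the column-wise softmax and the diversity penalty more concrete, here is a small pure-Python sketch. It assumes the annotation matrix is stored as a list of T rows with one entry per head, and it takes the pre-softmax scores as given instead of computing them from learned matrices; the function names and toy values are hypothetical.

```python
import math

def column_softmax(scores):
    """Softmax over frames (rows), applied separately to each head (column)."""
    num_heads = len(scores[0])
    columns = []
    for j in range(num_heads):
        col = [row[j] for row in scores]
        top = max(col)
        exps = [math.exp(c - top) for c in col]
        total = sum(exps)
        columns.append([e / total for e in exps])
    # Transpose back so the result is again T rows by num_heads columns.
    return [list(row) for row in zip(*columns)]

def diversity_penalty(A):
    """Squared Frobenius norm of (A^T A - I) for a T x num_heads matrix A."""
    num_heads = len(A[0])
    penalty = 0.0
    for i in range(num_heads):
        for j in range(num_heads):
            gram = sum(row[i] * row[j] for row in A)  # entry (i, j) of A^T A
            target = 1.0 if i == j else 0.0
            penalty += (gram - target) ** 2
    return penalty

# Toy scores (hypothetical): 3 frames, 2 heads.
scores = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
A = column_softmax(scores)
P = diversity_penalty(A)
```

The penalty is zero when each head puts all of its weight on a different frame and grows when two heads attend to the same frames, which is the behaviour the text attributes to it.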

4、https://ieeexplore.ieee.org/document/9053217

The formulation here includes a temperature hyperparameter.
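The exact formula is not reproduced here, so the following is only a generic sketch of how a temperature is commonly applied inside a softmax over attention scores. It is not necessarily the formulation used in the linked paper, and the function name and values are hypothetical.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Softmax of scores divided by a positive temperature."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(scores, 0.5)  # low temperature: peaked weights
flat = softmax_with_temperature(scores, 5.0)   # high temperature: near-uniform
```

A low temperature concentrates the weight on the highest-scoring frames, while a high temperature spreads it more evenly across frames.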

5、NetVLAD https://arxiv.org/pdf/1902.10107.pdf

https://arxiv.org/pdf/1511.07247.pdf

For a more detailed explanation, see: https://zhuanlan.zhihu.com/p/96718053

6、Learnable dictionary encoding (LDE) https://arxiv.org/pdf/1804.05160.pdf

Here, we introduce two groups of learnable parameters. One is the dictionary component center, noted as $\mu = \{\mu_1, \mu_2, \dots\}$. The other is the smoothing factor attached to each dictionary center, which is also learnable.
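One way to picture how the centers and smoothing factors interact is the sketch below. It assumes each frame is softly assigned to the centers with a softmax over the negative scaled squared distances, and that the weighted residuals are averaged over frames. Both the assignment rule and the averaging are assumptions made for this illustration, and the function name `lde_pooling` and all values are hypothetical.

```python
import math

def lde_pooling(frames, centers, smoothing):
    """Return one aggregated residual vector per dictionary center."""
    num_frames = len(frames)
    dim = len(frames[0])
    encoded = [[0.0] * dim for _ in centers]
    for h in frames:
        # Residual of this frame with respect to every center.
        residuals = [[x - m for x, m in zip(h, mu)] for mu in centers]
        # Assignment logits: -s_c * ||h - mu_c||^2 (assumed form).
        logits = [-s * sum(r * r for r in res)
                  for s, res in zip(smoothing, residuals)]
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        for c, res in enumerate(residuals):
            weight = exps[c] / total
            for d in range(dim):
                # Average the weighted residuals over frames (assumed choice).
                encoded[c][d] += weight * res[d] / num_frames
    return encoded

# Toy data (hypothetical): two frames sitting exactly on the two centers.
frames = [[0.0, 0.0], [2.0, 0.0]]
centers = [[0.0, 0.0], [2.0, 0.0]]
smoothing = [1.0, 1.0]
encoded = lde_pooling(frames, centers, smoothing)
```

A larger smoothing factor makes the assignment to that center more selective, so only frames very close to the center contribute to its residual.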

7、https://www.isca-speech.org/archive/interspeech_2020/pdfs/1922.pdf

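Specifically, let $H \in \mathbb{R}^{L \times D}$ be the frame-level feature map captured by the hidden layer below the self-attention layer, where $L$ and $D$ are the number of frames and feature dimension respectively. Then the attention map $A \in \mathbb{R}^{L \times K}$ can be obtained by feeding $H$ into a 1×1 convolutional layer with a non-linear activation, where $K$ is the number of attention heads. The 1st-order and 2nd-order attentive statistics of $H$, the former denoted by $\mu$, are then computed using two operations:

$T_1(x)$ is the operation of reshaping $x$ into a vector, and $T_2(x)$ includes a signed square-root step and an L2-normalization step.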
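The two operations can be sketched in a few lines of pure Python. The names `t1_flatten` and `t2_normalize` are hypothetical, the small `eps` guard is an addition for numerical safety, and applying the second operation to the output of the first is an assumption made for this example.

```python
import math

def t1_flatten(matrix):
    """Reshape a matrix (list of rows) into a single flat vector."""
    return [x for row in matrix for x in row]

def t2_normalize(vector, eps=1e-12):
    """Signed square root followed by L2 normalization."""
    # Signed square root: keep the sign, take the root of the magnitude.
    rooted = [math.copysign(math.sqrt(abs(x)), x) for x in vector]
    norm = math.sqrt(sum(x * x for x in rooted))
    # eps avoids division by zero for an all-zero input.
    return [x / max(norm, eps) for x in rooted]

# Toy statistics matrix (hypothetical values).
stats = [[4.0, -9.0], [0.0, 16.0]]
vec = t2_normalize(t1_flatten(stats))
```

The signed square root compresses large magnitudes while preserving sign, and the L2 step puts every pooled vector on the unit sphere.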

8、Short-time spectral pooling (STSP) https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9414094
