Cute: A concatenative method for voice conversion using exemplar-based unit selection

Author(s): Jin, Zeyu; Finkelstein, Adam; Diverdi, Stephen; Lu, Jingwan; Mysore, Gautham J

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1652h
Full metadata record
DC Field | Value | Language
dc.contributor.author | Jin, Zeyu | -
dc.contributor.author | Finkelstein, Adam | -
dc.contributor.author | Diverdi, Stephen | -
dc.contributor.author | Lu, Jingwan | -
dc.contributor.author | Mysore, Gautham J | -
dc.date.accessioned | 2021-10-08T19:45:34Z | -
dc.date.available | 2021-10-08T19:45:34Z | -
dc.date.issued | 2016 | en_US
dc.identifier.citation | Jin, Zeyu, Adam Finkelstein, Stephen DiVerdi, Jingwan Lu, and Gautham J. Mysore. "Cute: A concatenative method for voice conversion using exemplar-based unit selection." 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016): pp. 5660-5664. doi:10.1109/ICASSP.2016.7472761 | en_US
dc.identifier.uri | https://pixl.cs.princeton.edu/pubs/Jin_2016_CAC/CUTE-icassp_2016.pdf | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1652h | -
dc.description.abstract | State-of-the-art voice conversion methods re-synthesize voice from spectral representations such as MFCCs and STRAIGHT, thereby introducing muffled artifacts. We propose a method that circumvents this concern using concatenative synthesis coupled with exemplar-based unit selection. Given parallel speech from source and target speakers as well as a new query from the source, our method stitches together pieces of the target voice. It optimizes for three goals: matching the query, using long consecutive segments, and smooth transitions between the segments. To achieve these goals, we perform unit selection at the frame level and introduce triphone-based preselection that greatly reduces computation and enforces selection of long, contiguous pieces. Our experiments show that the proposed method has better quality than baseline methods, while preserving high individuality. | en_US
dc.format.extent | 5660 - 5664 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | en_US
dc.rights | Author's manuscript | en_US
dc.title | Cute: A concatenative method for voice conversion using exemplar-based unit selection | en_US
dc.type | Conference Article | en_US
dc.identifier.doi | 10.1109/ICASSP.2016.7472761 | -
dc.identifier.eissn | 2379-190X | -
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US
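
The three goals named in the abstract (matching the query, using long consecutive segments, and smooth transitions) can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the function name select_units, the jump_penalty parameter, the Euclidean costs, and the assumption that query and target frames already live in the same feature space are all hypothetical, and the triphone-based preselection is omitted, so every target frame is a candidate at every step.

```python
from math import dist


def select_units(query, target, jump_penalty=1.0):
    """Pick one target frame index per query frame with Viterbi-style DP.

    query, target: lists of equal-length feature vectors (hypothetical features).
    jump_penalty: assumed fixed cost for breaking a consecutive run.
    """
    n = len(target)
    # cost[j]: best total cost of a path that ends at target frame j.
    cost = [dist(query[0], target[j]) for j in range(n)]
    back = []  # back[t][j]: predecessor of frame j at query step t + 1

    for q in query[1:]:
        new_cost, pointers = [], []
        for j in range(n):
            best_i, best = 0, float("inf")
            for i in range(n):
                if j == i + 1:
                    concat = 0.0  # consecutive frames are free: favors long segments
                elif i + 1 < n:
                    # Jump: compare the frame that would naturally follow i with j.
                    concat = jump_penalty + dist(target[i + 1], target[j])
                else:
                    concat = jump_penalty  # i is the last frame, so it has no successor
                if cost[i] + concat < best:
                    best_i, best = i, cost[i] + concat
            # Add the target cost: how well frame j matches the query frame.
            new_cost.append(best + dist(q, target[j]))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace back the cheapest path from its final frame.
    j = min(range(n), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    return path[::-1]


target = [[0.0], [1.0], [2.0], [3.0]]
print(select_units([[1.0], [2.0], [3.0]], target))  # [1, 2, 3]
```

The inner loops make this sketch quadratic in the number of target frames per query frame, which is the kind of cost that a preselection step, as described in the abstract, would be expected to reduce.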

Files in This Item:
File | Description | Size | Format
CuteVoiceConversion.pdf |  | 287.47 kB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.