Cute: A concatenative method for voice conversion using exemplar-based unit selection
Author(s): Jin, Zeyu; Finkelstein, Adam; DiVerdi, Stephen; Lu, Jingwan; Mysore, Gautham J
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1652h
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jin, Zeyu | - |
dc.contributor.author | Finkelstein, Adam | - |
dc.contributor.author | Diverdi, Stephen | - |
dc.contributor.author | Lu, Jingwan | - |
dc.contributor.author | Mysore, Gautham J | - |
dc.date.accessioned | 2021-10-08T19:45:34Z | - |
dc.date.available | 2021-10-08T19:45:34Z | - |
dc.date.issued | 2016 | en_US |
dc.identifier.citation | Jin, Zeyu, Adam Finkelstein, Stephen DiVerdi, Jingwan Lu, and Gautham J. Mysore. "Cute: A concatenative method for voice conversion using exemplar-based unit selection." 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016): pp. 5660-5664. doi:10.1109/ICASSP.2016.7472761 | en_US |
dc.identifier.uri | https://pixl.cs.princeton.edu/pubs/Jin_2016_CAC/CUTE-icassp_2016.pdf | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1652h | - |
dc.description.abstract | State-of-the-art voice conversion methods re-synthesize voice from spectral representations such as MFCCs and STRAIGHT, thereby introducing muffled artifacts. We propose a method that circumvents this concern using concatenative synthesis coupled with exemplar-based unit selection. Given parallel speech from source and target speakers as well as a new query from the source, our method stitches together pieces of the target voice. It optimizes for three goals: matching the query, using long consecutive segments, and smooth transitions between the segments. To achieve these goals, we perform unit selection at the frame level and introduce triphone-based preselection that greatly reduces computation and enforces selection of long, contiguous pieces. Our experiments show that the proposed method has better quality than baseline methods, while preserving high individuality. | en_US |
dc.format.extent | 5660 - 5664 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | en_US |
dc.rights | Author's manuscript | en_US |
dc.title | Cute: A concatenative method for voice conversion using exemplar-based unit selection | en_US |
dc.type | Conference Article | en_US |
dc.identifier.doi | 10.1109/ICASSP.2016.7472761 | - |
dc.identifier.eissn | 2379-190X | - |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
CuteVoiceConversion.pdf | | 287.47 kB | Adobe PDF | View/Download |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.