Paper link:
https://arxiv.org/abs/2308.16569
Code repository:
https://github.com/thuhcsi/LightGrad
Supported datasets:
Training scripts are provided for BZNSYP and LJSpeech.
The paper identifies two problems with Grad-TTS:
- DPMs are not lightweight enough for resource-constrained devices.
- DPMs require many denoising steps in inference, which increases latency.
Proposed solutions:
- To reduce model parameters, regular convolution networks in the diffusion decoder are substituted with depthwise separable convolutions.
- To accelerate the inference procedure, we adopt a training-free fast sampling technique for DPMs (DPM-solver).
- Streaming inference is also implemented in LightGrad to reduce latency further.
Compared with Grad-TTS, LightGrad achieves a 62.2% reduction in parameters and a 65.7% reduction in latency, while preserving comparable speech quality on both Chinese Mandarin and English in 4 denoising steps.
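To get a feel for why swapping regular convolutions for depthwise separable ones shrinks the model, here is a rough parameter count for a single layer. This is only a sketch: the layer sizes (64 channels, 3x3 kernel) and the helper names `conv2d_params` and `separable_conv2d_params` are made up for illustration and are not taken from the LightGrad code, and the per-layer saving is not the same thing as the whole-model figure quoted above.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Parameter count of a regular 2-D convolution with a k x k kernel."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def separable_conv2d_params(c_in, c_out, k, bias=True):
    """Parameter count of a depthwise separable 2-D convolution.

    Depthwise part: one k x k filter per input channel.
    Pointwise part: a 1 x 1 convolution that mixes channels.
    """
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
C_IN, C_OUT, K = 64, 64, 3

regular = conv2d_params(C_IN, C_OUT, K)
separable = separable_conv2d_params(C_IN, C_OUT, K)
reduction = 1 - separable / regular

print(f"regular:   {regular} parameters")
print(f"separable: {separable} parameters")
print(f"reduction: {reduction:.1%}")
```

The saving grows with the kernel size and the number of output channels, because the expensive `c_in * c_out * k * k` term is split into a cheap spatial part and a cheap channel-mixing part.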
LightGrad streaming scheme (based on a Samsung paper):
Paper link:
https://arxiv.org/abs/2111.09052
Implementation details:
- Decoder input is chopped into chunks at phoneme boundaries to cover several consecutive phonemes, and the chunk lengths are limited to a predefined range.
- To incorporate context information into the decoder, the last phoneme of the previous chunk and the first phoneme of the following chunk are padded to the head and tail of the current chunk.
- Then, the decoder generates a mel-spectrogram for each padded chunk.
- After this, mel-spectrogram frames corresponding to the padded phonemes are removed to reverse the changes to each chunk.
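The four steps above can be sketched in plain Python as follows. This is a simplified illustration, not the LightGrad implementation: `split_into_chunks`, `stream_decode`, and `decode_fn` are hypothetical names, the greedy chunking rule and the frame limits are assumptions, and `decode_fn` stands in for the diffusion decoder and is assumed to return exactly one output frame per input frame.

```python
def split_into_chunks(durations, min_frames, max_frames):
    """Greedily group consecutive phonemes into chunks.

    durations[i] is the number of frames of phoneme i. A chunk is closed
    when it already holds at least min_frames and the next phoneme would
    push it past max_frames. Returns (start, end) phoneme index ranges,
    end exclusive. Simplification: the last chunk may be shorter than
    min_frames, and one very long phoneme may exceed max_frames.
    """
    if not durations:
        return []
    chunks, start, frames = [], 0, 0
    for i, d in enumerate(durations):
        if frames >= min_frames and frames + d > max_frames:
            chunks.append((start, i))
            start, frames = i, 0
        frames += d
    chunks.append((start, len(durations)))
    return chunks


def stream_decode(decoder_input, durations, decode_fn, min_frames, max_frames):
    """Yield mel frames chunk by chunk, using one phoneme of context per side."""
    # offsets[i] is the index of the first frame of phoneme i.
    offsets = [0]
    for d in durations:
        offsets.append(offsets[-1] + d)

    for start, end in split_into_chunks(durations, min_frames, max_frames):
        # Pad with the last phoneme of the previous chunk and the first
        # phoneme of the following chunk, when they exist.
        pad_start = max(start - 1, 0)
        pad_end = min(end + 1, len(durations))
        padded = decoder_input[offsets[pad_start]:offsets[pad_end]]

        mel = decode_fn(padded)

        # Drop the frames that belong to the padded phonemes.
        head = offsets[start] - offsets[pad_start]
        tail = offsets[pad_end] - offsets[end]
        yield mel[head:len(mel) - tail]


# Toy example: 6 phonemes, 17 frames, identity "decoder".
durations = [3, 2, 4, 1, 5, 2]
decoder_input = list(range(sum(durations)))
chunk_ranges = split_into_chunks(durations, min_frames=4, max_frames=8)
streamed = []
for mel_chunk in stream_decode(decoder_input, durations, list, 4, 8):
    streamed.extend(mel_chunk)  # in practice each chunk would go to the vocoder
```

With an identity decoder, the concatenated chunks reproduce the input frame sequence exactly, which is a quick way to check that the padding and trimming offsets line up.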