Audio/video samples from Ke Fang's previous projects where he involved as a major contributor. Due to privacy issue, the technical details will not be revealed.
Samples from my MusicGAN project of unconditional raw Mozart's piano notes generation.
This project tries to use GAN to learn long-term dependencies music generation, which is a main defect of current GAN-based sound art.
Samples generated from random z without any condition.
Samples generated by my modified Tacotron based TTS trained on ~20h Mandarin Chinese single-speaker dataset. The model can run on mobile devices in real-time.
Samples from my TTS model, sentences below are randomly chosen.
“感谢您选购世纪佳缘网,我们将为您提供优质的服务和放心的产品。”
““中国氢弹之父”于敏,于1月16日在京去世,享年93岁。于敏毕业于北京大学,后被著名物理学家钱三强、彭桓武调到中科院近代物理研究所。”
“岳云鹏的相声经常性的在台上中断表演,针对某个观众说一些态度比较激烈的话。”
“作为一个正在进行时的舔狗,我感觉我有必要在这里宣泄释放一下,文字表达能力有限,反正就这样吧”
Samples of using Multi-Speaker TTS model for Speaker-Adaptation. We trained the model on ~800 speakers TTS, then collect ~1min new speakers audio for adaption, below is the results on two famous voices.
Samples of Guo Degang(郭德纲) and Chibi Maruko-chan(樱桃小丸子(国语版)).
“想要带你去浪漫的土耳其。”
“海中月是天上月。”
“亲爱的李岩倪,我是你最喜爱的郭德纲老师。”
“阿倪姐姐我是樱桃小丸子,旁嘎思密达!。”
Samples generated by my implementation of paper Neural Discrete Representation Learning(VQ-VAE) on the task of voice-style transfer with VCTK dataset.
The left audio is ground-truth, the right is the results.
“We are encouraged by the news.”
“It was a breathtaking moment.”
“Who was the mystery MP?”
“Under Alex Ferguson, Aberdeen showed it could be done.”
Samples generated by my open-source repo randomCNN-voice-transfer.
TBD