Category: TTS (5)
devmoon

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
https://arxiv.org/abs/2006.04558
Non-autoregressive text to speech (TTS) models such as FastSpeech can synthesize speech significantly faster than previous autoregressive models with comparable quality. The training of the FastSpeech model relies on an autoregressive teacher model for duration ..
TTS papers I went through while working on a Korean speech synthesis project ..
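FastSpeech's speed comes from dropping the autoregressive decoder: a per-phoneme duration (learned from a teacher model in FastSpeech, predicted directly in FastSpeech 2) expands the phoneme hidden states so all mel frames can be generated in parallel. Below is a toy numpy sketch of that length-regulator idea; the shapes, values, and the `length_regulator` name are illustrative, not taken from the papers' code.

```python
import numpy as np

def length_regulator(phoneme_states, durations):
    """Expand each phoneme's hidden state by its predicted duration
    (in mel frames), so the decoder can emit all frames in parallel."""
    return np.repeat(phoneme_states, durations, axis=0)

# Toy example: 3 phoneme states with durations 2, 1, 3 -> 6 frame states.
h = np.array([[0.1], [0.2], [0.3]])
out = length_regulator(h, [2, 1, 3])
print(out.shape)  # (6, 1)
```

Because the output length is fixed up front by the durations, no step has to wait for the previous frame, which is exactly what makes non-autoregressive synthesis fast.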
FastSpeech: Fast, Robust and Controllable Text to Speech
https://arxiv.org/abs/1905.09263
Neural network based end-to-end text to speech (TTS) has significantly improved the quality of synthesized speech. Prominent methods (e.g., Tacotron 2) usually first generate a mel-spectrogram from text, and then synthesize speech from the mel-spectrogram us ..
Abstract: The TTS models that appeared before FastSpeech commonly ..
Neural Speech Synthesis with Transformer Network
https://arxiv.org/abs/1809.08895
Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron 2) have been proposed and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency during training and inference; 2) difficulty modeling long dependencies using ..
Abstract: Tacotron 2, which supports end-to-end TTS, ... in the TTS field ..
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
https://arxiv.org/abs/1712.05884
This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed ..
Tacotron 2, which came after Tacotron 1, ..
WaveNet: A Generative Model for Raw Audio
https://arxiv.org/abs/1609.03499
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that ..
1. Introduction: The WaveNet paper covers a model applicable to TTS, which converts text into speech ..
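WaveNet conditions each sample on all previous ones using stacks of dilated causal convolutions: each layer looks only at the past, and doubling the dilation per layer grows the receptive field exponentially with depth. A small numpy sketch of that mechanism, assuming a kernel size of 2 as in the paper's dilation stacks (function names are mine):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution: output at t depends only on
    x[t], x[t - d], x[t - 2d], ... -- never on future samples."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so no future leaks in
    return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

def receptive_field(kernel_size, dilations):
    # Each layer adds (kernel_size - 1) * dilation samples of history.
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Dilations doubling each layer (1, 2, 4, ..., 512) cover 1024 past samples.
print(receptive_field(2, [2 ** i for i in range(10)]))  # 1024
```

This exponential growth is why WaveNet can model long-range audio structure without the thousands of layers a plain causal convolution would need.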