Abstract: Recent TTS models with a decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability to perform zero-shot adaptation given a speech ...
Abstract: Radiology report generation aims to automatically produce diagnostic reports from medical images, reducing radiologists' workload. Most existing models use an encoder-decoder ...