numpad0
43 minutes ago
OT, but is any work being done anywhere on the Japanese pronunciation problem?
Japanese is often described as using multiple scripts (kanji, kana, numerals, and sometimes the Latin alphabet), and pronunciation, especially of kanji, is not very well constrained. That creates tons of homophones and homographs: e.g. "koushou" is shared across more than 20 words, and the character for "life" is said to have more than 150 different readings as parts of words.
Even more OT, but the Unicode code space used for Japanese kanji is famously shared with Chinese hanzi, leading to ambiguities.
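To make the sharing concrete, here is a minimal Python sketch using only the standard library. The sample character is my own pick for illustration, not something taken from a particular TTS system. The point is that a code point carries no language tag, so the text alone can't say whether a Japanese or a Chinese reading is intended.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Sample character chosen for illustration (an assumption, not from any dataset).
sample = "生"
info = describe(sample)

# The name says "CJK UNIFIED IDEOGRAPH", with nothing marking it as
# Japanese or Chinese; that has to come from outside the text.
print(info)
```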
This situation causes AI-based TTS (and also image generators) trained directly on Unicode text to go weird on kanji, even for ones as simple as "tomorrow". Classical pre-LLM Japanese TTS avoids this by operating on generated or manually specified pronunciations, skipping kanji altogether. That does occasionally lead to wrong readings, but it won't lead to the sound generation code producing butchered middle-of-the-road sounds.
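A rough sketch of what I mean by the classical approach, in Python. Everything here is a toy assumption: the `LEXICON` table, the `to_kana` function, and the "first listed reading wins" rule are made up for illustration, and real front ends use a morphological analyzer with a large dictionary. It only shows the shape of the idea: kanji is resolved to one discrete kana reading before any audio is generated, so the worst case is a wrong but valid reading.

```python
# Toy lexicon: surface form -> candidate kana readings (hypothetical data).
LEXICON = {
    "明日": ["あした", "あす", "みょうにち"],  # "tomorrow" has several readings
    "晴れ": ["はれ"],
}


def to_kana(text, overrides=None):
    """Replace known words with one kana reading using greedy longest match.

    `overrides` lets a human pin the reading of a word by hand. Anything
    not found in either table (kana, punctuation) passes through unchanged.
    """
    overrides = overrides or {}
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            word = text[i:j]
            if word in overrides:
                out.append(overrides[word])
                i = j
                break
            if word in LEXICON:
                # Default guess: the first listed reading. It may be the
                # wrong one, but it is always a real reading.
                out.append(LEXICON[word][0])
                i = j
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_kana("明日は晴れ"))
print(to_kana("明日は晴れ", {"明日": "あす"}))
```

The synthesizer downstream would only ever see the kana string, so it never has to guess a sound from the kanji itself.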
It doesn't seem like most, or any, AI TTS systems tackle this problem, but I'm not in that field. Does anyone know the status of this?