I am looking for a way to insert silences of varying duration into the TTS audio output, similar to how humans pause when reading. Depending on the text, we sometimes pause longer rather than reading every word at an even pace. Ideally, we could insert a special token into the text, such as [SILENCE]. This could, for example, mean 100 ms of silence; repeated twice, [SILENCE][SILENCE] would mean 200 ms, and so on. This would make the audio sound more natural.
Is this already implemented in some way? Thank you.
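
To make the request concrete, here is a minimal sketch of how the proposed behaviour could be approximated as a preprocessing step outside the model. It is only an illustration under assumptions: `synthesize` is a hypothetical callable standing in for whatever TTS call is used (it is not an API of this library), and the 100 ms per token comes from the example above.

```python
import re
from typing import Callable, List, Sequence, Union

SILENCE_TOKEN = "[SILENCE]"  # token name taken from the proposal above
SILENCE_MS = 100             # assumed duration of one token, in milliseconds

# A segment is either text to speak (str) or a pause length in ms (int).
Segment = Union[str, int]


def split_on_silence(text: str) -> List[Segment]:
    """Split text into spoken chunks and pause durations in milliseconds."""
    segments: List[Segment] = []
    # The capturing group keeps each run of consecutive tokens in the result.
    for part in re.split(r"((?:\[SILENCE\])+)", text):
        if part.startswith(SILENCE_TOKEN):
            # Repeated tokens add up: two tokens give 200 ms, and so on.
            segments.append(part.count(SILENCE_TOKEN) * SILENCE_MS)
        elif part.strip():
            segments.append(part.strip())
    return segments


def render(
    text: str,
    synthesize: Callable[[str], Sequence[float]],  # hypothetical TTS call
    sample_rate: int,
) -> List[float]:
    """Synthesize each spoken chunk and join them with zero-valued samples."""
    samples: List[float] = []
    for segment in split_on_silence(text):
        if isinstance(segment, int):
            samples.extend([0.0] * (sample_rate * segment // 1000))
        else:
            samples.extend(synthesize(segment))
    return samples
```

With this sketch, `split_on_silence("Hello.[SILENCE][SILENCE] How are you?")` separates the two sentences with a single 200 ms pause. Synthesizing each chunk separately may change prosody at the boundaries, which is one reason native support for such a token would be preferable.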