support for alignment output in tsv format #407
I've been trying this out. It looks like when using a long text, some of the last words are skipped in the alignment file.
@vytskalt can you provide an example so I can debug/fix it?
Yes, this is the command I'm running:
cat text.txt | piper --sentence-silence 0.5 -m en_US-ryan-high --output_file out.wav --alignment-data alignment.tsv
This is the text (random Reddit post): text.txt
In the alignment.tsv, 2 of the last words are missing.
OK, it's not the length that is the issue, it's the content. For example, "musical/sport" is spoken as 3 words, and "in the" is merged into one spoken word. My word/phoneme sync trips over this. It needs to be fixed; I have to find another way to sync.
… or split by "musical/sports". Also fixed missing sentence silence in calculation
Hi, I pulled this pull request and made a build, but --alignment-data is not available in the "piper" executable in the install folder. Am I missing something to make it work? Thanks (:
It is only built into the Python script, not the C++ executable.
Makes sense! Thanks (:
Hi, would it be possible to add it to the C++ executable? I am using Windows, which does not have the Python version, so I need it compiled. I don't know how to translate Python to C++.
The |
Support for alignment data output.
Roughly matches issue #364.
Can be used as a base for #391 and #361.
Runs text-to-speech twice: once for the normal audio generation, and a second time for each word individually.
Since the two runs produce different outputs and timings, a correction is applied.
Not perfect, but "good enough". Both runs re-sync after each sentence, so only slight offsets are introduced.
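The correction described above could be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: it assumes the correction is a simple proportional rescaling of the per-word durations so that they add up to the duration of the sentence in the real audio, and that the sentence silence is added to the running offset between sentences. The function names (`align_sentence`, `align_text`, `to_tsv`) and the `word`/`start`/`end` column layout are hypothetical; the actual TSV format written by `--alignment-data` may differ.

```python
import csv
import io


def align_sentence(words, word_durations, sentence_duration, offset=0.0):
    """Rescale per-word durations so they sum to the real sentence duration.

    word_durations come from synthesizing each word on its own (hypothetical
    input); sentence_duration comes from the normal full-sentence synthesis.
    """
    total = sum(word_durations)
    # Assumption: a single proportional factor per sentence.
    scale = sentence_duration / total if total > 0 else 0.0
    rows = []
    start = offset
    for word, duration in zip(words, word_durations):
        end = start + duration * scale
        rows.append((word, start, end))
        start = end
    return rows


def align_text(sentences, sentence_silence=0.0):
    """Align several sentences, re-syncing the offset after each one."""
    rows = []
    offset = 0.0
    for words, word_durations, sentence_duration in sentences:
        rows.extend(align_sentence(words, word_durations, sentence_duration, offset))
        # The silence between sentences must be part of the running offset.
        offset += sentence_duration + sentence_silence
    return rows


def to_tsv(rows):
    """Serialize (word, start, end) rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["word", "start", "end"])  # hypothetical header
    for word, start, end in rows:
        writer.writerow([word, f"{start:.3f}", f"{end:.3f}"])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up durations in seconds, for illustration only.
    example = [
        (["hello", "world"], [0.5, 0.5], 2.0),
        (["bye"], [0.25], 0.5),
    ]
    print(to_tsv(align_text(example, sentence_silence=0.5)), end="")
```

Because the offset is recomputed from the real sentence durations, any per-word error stays confined to its own sentence rather than accumulating over a long text.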