How to accurately sync audio and text

I’m not clear on what you’re saying, but I’m not claiming that there is no syncing with the transcript. I’m only saying that it’s not accurate enough when reading in sentence mode.

When reading in sentence mode, you can play the speaker’s audio for that sentence. That’s the hope, at least. But the audio that’s played consistently mismatches the sentence being read, either partially or totally, so it’s more annoying than effective.

Small differences are tolerable, but playing the audio in sentence mode is mostly unreliable. That’s unfortunate, because the concept would be so useful for practicing speech.