Let me know if you have any luck! Lol. I’ve been on this platform for nearly 2 years and haven’t seen an improvement in this area, except for when whisper sync was added. But since it takes so long, I rarely use it. Also, you can only upload very small clips, which I think is silly: I would like to do it for a whole audiobook, and I don’t have the time to do it 40 times for a single book.
After seeing this thread, I was playing about with this tonight, and I found that if you hit “Regenerate Lesson” and then hit “Generate Timestamps”, it kinda works.
I couldn’t fully test it - I only had a short MP3 clip from the lesson I was using, so it didn’t generate the correct timestamps, but it did at least generate them, which is better than the results I was getting before. With an MP3 file of the correct length, it might get pretty close. Worth a try anyway. It’s late now and I’m about to hit the sack, but I will do further testing tomorrow.
I just tried it again this morning, with the full audio of a lesson, and it works like a charm. So the procedure I used this morning was as follows:
Edit the lesson.
Regenerate Lesson (this is the important bit).
Perform the next two steps immediately after regenerating the lesson (i.e. don’t click out of the webpage - I believe that, in order for it to work, the text has to appear on the page in the regenerated format).
Add the audio file.
Click “Generate Timestamps”.
I think you can also do all this after you already have the audio file loaded. I did not test the procedure that way at first, but I just tried it, and that works too.
You have to make sure the .mp3 and the transcript match exactly. You can’t have an introduction in the podcast/audiobook that is not in the transcript, or text in the transcript that is not actually said. Furthermore, it often has issues with music, which is common in podcast introductions.
If you really need timestamps, I recommend using YouTube content. Alternatively, use Whisper to generate the transcript.
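For anyone who wants to try the Whisper route outside the platform, here is a minimal sketch, assuming the open-source openai-whisper Python package and ffmpeg are installed. The helper names (format_timestamp, segments_to_lines, transcribe) are made up for illustration, and this is not how the platform itself generates timestamps.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(segments):
    """Turn Whisper-style segments (dicts with start, end, text) into text lines."""
    return [
        f"[{format_timestamp(s['start'])} --> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe(audio_path: str, model_name: str = "base", language=None):
    """Transcribe an audio file and return Whisper's timestamped segments.

    Assumption: the openai-whisper package (pip install openai-whisper).
    It is imported here so the helpers above work without it installed.
    """
    import whisper

    model = whisper.load_model(model_name)
    # language=None lets Whisper detect the language itself.
    result = model.transcribe(audio_path, language=language)
    return result["segments"]
```

Something like `segments_to_lines(transcribe("lesson.mp3", language="da"))` would then give one timestamped line per segment, where "lesson.mp3" is just a placeholder file name. Larger models are slower but usually make fewer mistakes.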
I just tested it this morning, and I definitely had words in the transcript which weren’t said, and it still worked fine. But you’re right in that it’s definitely a good idea to remove any music and make sure the transcript and the audio match as perfectly as possible before trying to generate the timestamps.
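If you want a rough check of how well a transcript matches the audio before generating timestamps, one option is to compare it against the output of a speech-to-text tool. The sketch below uses only Python's standard library; the function names are hypothetical, and the recognized text is assumed to come from whatever speech-to-text tool you have.

```python
import difflib
import re


def words(text: str):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def compare_transcripts(transcript: str, recognized: str):
    """Return (similarity ratio, words only in transcript, words only in recognized)."""
    a, b = words(transcript), words(recognized)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    only_transcript, only_recognized = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "delete" and "replace" cover words present only in the transcript.
        if tag in ("delete", "replace"):
            only_transcript.extend(a[i1:i2])
        # "insert" and "replace" cover words present only in the recognized text.
        if tag in ("insert", "replace"):
            only_recognized.extend(b[j1:j2])
    return matcher.ratio(), only_transcript, only_recognized
```

A ratio close to 1.0 suggests the two texts line up well. A long run of words that appears only in the transcript (or only in the recognized text) usually points to an introduction or other passage that should be trimmed from one side. Speech-to-text errors will also show up as differences, so treat the result as a hint rather than a verdict.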
I’ve never clicked on “Generate Timestamps” twice. “Regenerate Lesson” can’t be unrelated, since if you click on it, “Generate Timestamps” works (with only one click) and if you don’t, it doesn’t. Maybe it’s a bug, but it definitely works.
If you have a text, it’s a good idea to convert it to Word format (.docx) before importing and - as @nfera wrote - make sure that the transcript matches the audio exactly. Then the results are great.
Speech-to-text unfortunately leads to quite a few errors, especially in Danish, i.e. words and grammatical forms that do not exist in Danish, e.g. in “Koen paa isen”. This may be better in other languages.