This is how LingQ handles our uploaded text or SRTs:

  1. AI will arbitrarily add periods to your sentences.
  2. The program will force line breaks according to the period.

As a result, your content drifts away from what you uploaded.
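To illustrate why step 2 changes the segmentation, here is a minimal sketch. It is not LingQ's actual code, which I have not seen; `resplit_on_periods` and the sample cues are hypothetical. It only models the second step: joining the cue text and breaking it again after every period.

```python
import re

def resplit_on_periods(cues):
    """Join cue texts and re-split after every period (assumed behaviour)."""
    joined = " ".join(text.strip() for text in cues)
    # Split on whitespace that follows a period; the period stays with its sentence.
    parts = re.split(r"(?<=\.)\s+", joined)
    return [p for p in parts if p]

# Hypothetical cues, as they might appear in an uploaded SRT.
original_cues = [
    "Guten Morgen, wie geht es dir",
    "Mir geht es gut. Und dir?",
    "Auch gut, danke.",
]

resplit = resplit_on_periods(original_cues)
for line in resplit:
    print(line)
```

Three cues go in and two lines come out, with boundaries that no longer match the original timestamps.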


It wasn’t like this before


Upload MP3 & SRT:

  • The sentences inside the course are different from my SRT.

Upload SRT only:

  • The sentences inside the course are different from my SRT too.
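For anyone who wants to check this on their own import, here is a rough sketch using only the Python standard library. The helper names and the sample data are made up, and it assumes you have copied the course sentences into a list by hand. It pulls the cue text out of an SRT and lists the lines that differ.

```python
import difflib
import re

def srt_cue_texts(srt_text):
    """Return the text of each cue in an SRT document, one string per cue."""
    cues = []
    # Cues are separated by blank lines: index line, timing line, then text lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                if text:
                    cues.append(text)
                break
    return cues

def changed_lines(original, imported):
    """Return only the removed ('- ') and added ('+ ') lines of an ndiff."""
    return [d for d in difflib.ndiff(original, imported) if d.startswith(("- ", "+ "))]

# Hypothetical sample: the import added a period to the first cue.
srt = "1\n00:00:01,000 --> 00:00:03,000\nGuten Morgen\n\n2\n00:00:03,500 --> 00:00:05,000\nwie geht es dir\n"
course_sentences = ["Guten Morgen.", "wie geht es dir"]

changes = changed_lines(srt_cue_texts(srt), course_sentences)
for change in changes:
    print(change)
```

An empty result means the course text matches the SRT cue for cue.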

My thoughts:
Don’t use AI to preprocess my SRT.
Don’t use Whisper to process my audio.
I’ll handle it myself.

Here’s a good faster-whisper config:

{
  "faster_whisper_config": {
    "model_path": " ",
    "local_files_only": true,
    "download_root": " ",

    "device": "cuda:0",
    "compute_type": "float16",

    "transcribe_options": {
      "language": "de",
      "task": "transcribe", 
      "beam_size": 10,
      "best_of": 5,
      "patience": 1.0,
      "length_penalty": 2.0,
      "temperature": [0.2, 0.4, 0.6, 0.8, 1.0],
      "compression_ratio_threshold": 2.4,
      "log_prob_threshold": -1.0,
      "no_speech_threshold": 0.0,

      "word_timestamps": true,
      "suppress_blank": true,
      "suppress_tokens": [-1],
      "repetition_penalty": 1.0,
      "no_repeat_ngram_size": 0,
      "prompt_reset_on_temperature": 0.5,
      "chunk_length": 30,

      "initial_prompt": "",
      "prefix": "",
      "hotwords": "This is an audio recording, please detect all inflections and use more punctuation to enrich the emotion, such as ellipses (...), question marks (?), full stops, exclamation points (!) and dashes"
    },

    "thread_num": 24,
    "num_worker": 1
  }
}
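If you transcribe locally like this, you still need to turn the result into an SRT before uploading. Below is a minimal sketch of that step, assuming the transcription gives you `(start_seconds, end_seconds, text)` tuples. That tuple shape and the function names are my own choice for illustration, not something the config above dictates.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def segments_to_srt(segments):
    """Build SRT text from (start, end, text) tuples, numbering cues from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}")
    # Cues are separated by one blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"

# Hypothetical segments standing in for real transcription output.
segments = [
    (0.0, 1.5, " Guten Morgen! "),
    (1.5, 3.25, "Wie geht es dir?"),
]
print(segments_to_srt(segments))
```

Each segment becomes exactly one cue, so the line breaks stay where the transcription put them.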


Thanks for reporting, we will investigate the issue.


Weird things are happening to imported videos with both English and Chinese captions. Captions are broken up differently than the original and alignment with the video is off.


@zoran Here, I just imported this video, which has both Chinese and English subtitles.

By the 4th caption, words were dropped. I can see them in the original captions. The first time I imported it, sections were missing.

I have put both import attempts into a course for your convenience: Login - LingQ

FWIW, I’ve tried a handful of different times/ways, and the best I’ve managed is the one with the missing words and incorrect timestamps.


Thanks for the additional information @BassmasterJJ, much appreciated.
