Is it possible to edit the full text of a lesson instead of paragraph by paragraph? (To convert Traditional Chinese to Simplified)

I searched and there were already a few posts about Traditional and Simplified Chinese conversion but I couldn’t find exactly what I wanted to know.

I know there are websites that can convert (from Traditional to Simplified in my case), but when I try to edit a lesson I can only access each paragraph, not the full text to be converted.

I usually don't use materials that have Traditional characters, but that was the way Whisper transcribed the audio I imported here.

Any tips to facilitate the process?

My reply here assumes the “web browser version” of LingQ.

If you need a way to “copy-and-paste” an entire lesson, enter that lesson and then use the “Print Lesson” option in the LingQ lesson “3 dots” menu. After you have selected “Print Lesson”, you can copy the entire lesson all in one mouse action. You can then paste into a program outside your web browser such as a text editor.

If you use Whisper in the future, note that you can give Whisper a “hint” that you want the text output to be Simplified Chinese. I have not used Whisper for over a year. Back in 2023, I installed Whisper on a machine I own and used the following Python code to get Simplified Chinese:

[image: screenshot of the Python code, not reproduced as text]

I expect that other interfaces to Whisper have similar capabilities.

Whisper expects the initial_prompt to be a hint regarding which spoken language is present in the sound file (or other useful hint about the audio itself). That’s why my initial_prompt mentions “Pu3tong1hua4” (Standard Spoken Chinese) and not “Jian3ti3zi4” (Simplified Characters).
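
For readers who want to try the hint described above, here is a minimal sketch using the open-source `openai-whisper` Python package. It is not the code from the screenshot: the prompt wording, the model size, and the helper names `build_transcribe_options` and `transcribe_with_hint` are assumptions made for illustration. Only `whisper.load_model` and the `language` and `initial_prompt` arguments of `transcribe` come from the package itself.

```python
def build_transcribe_options(prompt="以下是普通话的句子。"):
    """Return keyword arguments for Whisper's transcribe() call.

    The default prompt ("The following are sentences in Putonghua.") is an
    assumed example written in Simplified characters; it is a hint about the
    spoken language, not a guaranteed switch for the output script.
    """
    return {"language": "zh", "initial_prompt": prompt}


def transcribe_with_hint(audio_path, model_name="small"):
    """Transcribe an audio file, passing the Simplified Chinese hint."""
    # Imported lazily so the options helper works without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)  # model size is an assumption
    result = model.transcribe(audio_path, **build_transcribe_options())
    return result["text"]
```

Calling `transcribe_with_hint("lesson.mp3")` (a hypothetical file name) would return the transcript as a string. The prompt only nudges the model, so the output may still contain Traditional characters.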

Thank you. Since I can't open the Chinese lessons right now (worst bug ever), I can't try your copy-and-paste tip for the moment. Once the text is converted, I assume I can't paste it back into the same lesson and need to create a new one by importing the converted text. Is that so?

I can’t give Whisper a hint, since I’m just using it indirectly through LingQ. I imported the audio and it was transcribed in Traditional characters. I’ve seen other people report the same thing before, even though the audio source was from mainland China.

Go to the lesson editor, click “Regenerate lesson” on the left side, replace the text, and make sure the changes are actually saved by clicking outside the text editor. Caveat: this can reset the character splitting and the timestamps.
The “Print Lesson” function is not a good option because it introduces whitespace between characters, which can confuse character converters, translation software, etc., since Chinese is not written with whitespace outside of LingQ.
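
If you do end up with text copied from “Print Lesson”, the extra whitespace can be stripped before conversion. The following is a rough sketch, assuming the separators are ordinary spaces, tabs, no-break spaces, or ideographic spaces. The function name `strip_cjk_spaces` and the chosen Unicode ranges are my own assumptions, not anything provided by LingQ.

```python
import re

# CJK ideographs, CJK punctuation, and full-width forms (assumed ranges;
# rarer extension blocks are not covered).
_CJK = r"\u3001-\u303f\u3400-\u4dbf\u4e00-\u9fff\uff00-\uffef"

# Whitespace that sits directly between two CJK characters.
_SPACE_BETWEEN_CJK = re.compile(
    rf"(?<=[{_CJK}])[ \t\u00a0\u3000]+(?=[{_CJK}])"
)


def strip_cjk_spaces(text):
    """Remove whitespace between CJK characters, keeping other spacing."""
    return _SPACE_BETWEEN_CJK.sub("", text)


print(strip_cjk_spaces("你 好 ， 世 界"))
```

Spaces next to Latin letters or digits are left alone, so mixed text such as `LingQ 很 好` keeps the space after `LingQ`.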
Personally, I have not found the “prompt” feature to be useful; for me, Whisper will happily flip-flop between character sets in the middle of a transcript. This really needs a post-processing step. As for software, I recommend BYVoid/OpenCC on GitHub (“Conversion between Traditional and Simplified Chinese”), which works perfectly in my experience (both directions) and has many language bindings. It would be unrealistic to expect LingQ to implement such an advanced technological solution for a fringe language, but Rooster got this: Web Browser Extensions (Software) for LingQ - #200 by roosterburton
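
As an illustration of that post-processing step, here is a minimal sketch that uses OpenCC's Python binding. It assumes the `opencc` package is installed and accepts the configuration name `t2s` (Traditional to Simplified); some bindings may expect `t2s.json` instead. The wrapper `to_simplified` and its optional `converter` argument are hypothetical and exist only to keep the example short.

```python
def to_simplified(text, converter=None):
    """Convert Traditional Chinese text to Simplified with OpenCC.

    `converter` may be any object with a convert(str) -> str method; when
    omitted, an OpenCC converter is created with the "t2s" configuration.
    """
    if converter is None:
        # Imported lazily; the config name may be "t2s.json" in some bindings.
        import opencc

        converter = opencc.OpenCC("t2s")
    return converter.convert(text)
```

Run over a whole transcript, this should normalize any passages where Whisper switched character sets partway through. The result can then be pasted back via “Regenerate lesson”.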

Great! Thank you very much, Bamboozled!

Rooster is awesome!
