I just clicked the link to look at the site and the pricing is something I could afford. What caught my eye is they can sample your voice and then make you speak all of their languages. I don’t know how well it works of course, but if it works well I would want to try that. It must be way easier to imitate yourself speaking another language than someone else.
No, you don’t need to reimport. Just click to edit the lesson and find the Re-split with AI button in the sidebar. This option is only available on the web right now but once split the lesson will appear with the updated splits on all platforms.
I guess this is also only working for Japanese at present? This could be a very useful feature for both traditional and simplified Chinese also.
Edited to add: Wait, do you mean splitting the words up correctly or is this referring to splitting sentences or chapters in a long book? I meant pairing characters up to make words; many times the characters are paired together oddly in Chinese. If there’s a way to use “AI” to improve the character groupings, that would be very welcome.
It’s not available in Chinese, is it?
I also noticed (at least from the web) that Whisper-transcribed imports from YouTube now have the lines as full sentences rather than having them split. This is much better than before. Thanks.
Whisper transcription should have always resulted in full sentences being generated. It’s the import of automatic captions from YouTube that resulted in split chunks of text. These captions now get ignored and Whisper is used instead.
Ok, we’ll see if we can enable the different readings to show in Chinese too so they can be selected in different contexts @WillowMeDown .
Also, the splitting does refer to the way words are split. Our AI splitter does help to improve the accuracy in Japanese. Could you find a sample text and send us a link to the original lesson along with how you would prefer to see it split? We can look into what we can do with it once we have that. You can just post it here.
Full sentences were generated with Whisper AI… but they were split (seemingly based on the timing events of the auto-generated subtitles). Maybe this is what you are saying. I can assure you that Whisper AI imports from YouTube were definitely splitting mid-sentence (maybe at commas?). Most of my favorite content doesn’t have subtitles, and if the import were actually using auto-generated subtitles there would be no punctuation, so it was definitely using Whisper. I would go in after the fact and join and re-split the lines so they broke at the sentence endings. So I’m definitely happy that I no longer seem to have to do this.
BTW… YouTube videos WITH real subtitles still split along the timestamps in the middle of sentences. It would be nice if these also got re-split at sentence endings (with the timestamps adjusted).
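For anyone wondering what that re-splitting could look like, here is a rough sketch. It is not how LingQ does it; the `Cue` class and `merge_cues_into_sentences` are made-up names for illustration. It assumes each subtitle cue has a start time, an end time, and text, and it simply joins consecutive cues until one ends with sentence-final punctuation, keeping the first start and the last end as the new timestamps.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cue type: one timed subtitle line.
@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str

# Sentence-final punctuation (Latin and CJK). This list is an assumption.
SENTENCE_END = (".", "!", "?", "。", "！", "？")

def _join(parts: List[Cue]) -> Cue:
    """Combine several cues into one, spanning first start to last end."""
    text = " ".join(p.text.strip() for p in parts)
    return Cue(parts[0].start, parts[-1].end, text)

def merge_cues_into_sentences(cues: List[Cue]) -> List[Cue]:
    """Merge consecutive cues until one ends with sentence-final punctuation."""
    merged: List[Cue] = []
    buffer: List[Cue] = []
    for cue in cues:
        buffer.append(cue)
        if cue.text.rstrip().endswith(SENTENCE_END):
            merged.append(_join(buffer))
            buffer = []
    if buffer:  # trailing text with no closing punctuation
        merged.append(_join(buffer))
    return merged

cues = [
    Cue(0.0, 1.5, "I went to the store,"),
    Cue(1.5, 3.0, "and bought some milk."),
    Cue(3.0, 4.0, "Then I went home."),
]
sentences = merge_cues_into_sentences(cues)
for s in sentences:
    print(s.start, s.end, s.text)
```

Note that this only joins cues. A cue that contains a sentence ending in the middle of its text would still need to be split, and that needs word-level timing, which this sketch does not have.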
All in all great updates by the team! Nice to see a list of the changes.
I don’t get it. Are YouTube videos without subtitles being imported as well and converted with Whisper? Because yesterday I tried and only videos with subtitles were imported (using Safari).
After importing the videos (with subtitles), I had the message that I had to wait for the lesson to be generated. Which I don’t understand because if the subtitles are already there, why should I wait?
Confused.
EDIT: I realized they had CC activated but the subtitles were auto-generated. So, I suppose the ones that didn’t work didn’t have the auto-generated feature ON.
I have started importing YouTube videos again, so I’ll check the various combinations further.
Odd, @davideroccato. My understanding has been that for the past X months (even just prior to Whisper?) it wasn’t importing any auto-generated subtitles… whether CC was turned on or not.
I also usually have CC on, and I haven’t seen it import the autogenerated ones in ages…always been Whisper transcribed since it came out. (I think!)
Yes, I mean, they were Whisper-transcribed if auto-generated. But a few videos were not imported. I assume they were without CC activated, so without the auto-generated option either. I’ll pay more attention if it happens again.
EDIT: I thought that with Whisper, every YouTube video is imported.
This is great, thank you for the update Mark!
First of all great updates.
My understanding is that importing Netflix shows into LingQ currently works only on a desktop computer. Since I exclusively use mobile devices, it’d be great to see that feature made available across your mobile apps.
Thanks again for your tireless work to improve the overall experience.
The reality is that we won’t be able to enable importing from Netflix on a mobile device for a variety of technical reasons. But, if you can borrow a friend’s computer for an hour, you can log in to Netflix and import a bunch of shows all at once which you can then access from your mobile device once imported.
Hi! I had set the sentence view to not show spaces, so I didn’t notice before, but I just set it back to “show spaces” and discovered that the spacing seems to be better than before. (Am I imagining this?) I still did see a few things in a quick look-through. I’m sure there are plenty of people out there with better Chinese knowledge than mine, so corrections are welcomed if I make mistakes below.
Just a few examples
結果 子 - I would have put it as 結 果子 or 結果子 (all 3 characters together).
I.e. instead of “BearFru It”, I would have grouped it as “Bear Fruit” or else “BearFruit.”
主說 到 - “MasterSa Id”. I would have grouped it as 主 說到 - “Master Said” - no need to glue the subject and verb together.
而活 - I notice that 而 is generally stuck together with the verb that follows it, but that makes no sense to me. 而 is a word on its own.
並所作 - same as above, I’m pretty sure 並 is a word on its own (“and”) and shouldn’t be prepended to the following word.
這真 葡萄樹 - I would not put 這真 together as one word here. ThisTrue Grapevine should instead be This True Grapevine.
二至 八節 Stanzas 2 through 8. I would prefer to put “Through” as its own separate word. Thus: 二 至 八節 or even 二 至 八 節 (four separate words).
Again, 宇宙 中真 葡萄樹 - “in the universe” is 宇宙 中 or 宇宙中. I don’t see any reason to stick 中真 together. 中- in (in the universe) and 真- true or genuine. So instead of grouping it as Univers AlTrue Grapevine, I would group it as either Universal True Grapevine or else Univers- al True Grapevine.
生機體 - living organism. This may be ok as is, although I’d prefer to see this grouping without any space.
得 着 - my preference would be for these two characters to be grouped together
父是 - “Father-Is” - these two should not be grouped together as one word.
在 結果 子上 - I.e. in fruitbearing. 子上 shouldn’t be stuck together as a single word. The construct is “在 (verb) 上” . So 上 should stand alone here.
會 有 一道 流臨 及 別人 - I would have separated 流臨 into two words. But I’m not sure which way is correct.
這流會結 許多 果子 - ThisFlowWillBear Much Fruit - I would add more spaces: This Flow Will Bear Much Fruit - 這 流 會 結 許多 果子.
這要 藉着 葡萄樹 結果 子得 彰顯 - this must be manifested by the fruitbearing of the grapevine. “ThisMust Through Grapevine BearingFru ItBe Manifested.” I would separate “ThisMust” and “ItBe” since these are not actual pairings in this sentence.
So it would be: 這 要 藉着 葡萄樹 結果 子 得 彰顯 or 這 要 藉着 葡萄樹 結果子 得 彰顯 .
父所 是 的 一切都在 - (everything that the father is, is all in (noun)) - I would separate that into: 父 所是 的 一切 都 在 (or else go ahead and keep 所 是 as two separate words, but I wouldn’t connect 父所 as a single word).
And on and on. This was just from the first few sentences of the very first lesson I opened today. There are too many to list, really. They’re everywhere.
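In case it helps with reporting these, here is a small sketch of one way to describe the difference between the current split and a preferred one without retyping a whole explanation each time. This is only an illustration, and `boundaries` and `diff_splits` are made-up helper names, not anything LingQ provides. It assumes both versions are written with spaces between words, as in the examples above, and it lists the character offsets where a space should be removed or added.

```python
from __future__ import annotations

def boundaries(segmented: str) -> set[int]:
    """Character offsets (spaces ignored) at which a word boundary falls."""
    positions: set[int] = set()
    offset = 0
    # The last word has no boundary after it, so skip it.
    for word in segmented.split()[:-1]:
        offset += len(word)
        positions.add(offset)
    return positions

def diff_splits(current: str, preferred: str) -> dict[str, list[int]]:
    """Compare two space-separated splits of the same text."""
    if "".join(current.split()) != "".join(preferred.split()):
        raise ValueError("the two splits do not cover the same characters")
    cur, pref = boundaries(current), boundaries(preferred)
    return {"remove": sorted(cur - pref), "add": sorted(pref - cur)}

# Two of the examples from the list above.
first = diff_splits("結果 子", "結 果子")
second = diff_splits("這流會結 許多 果子", "這 流 會 結 許多 果子")
print(first)
print(second)
```

For the first example the result says to remove the boundary after the second character and add one after the first, which is the same as saying 結果 子 should become 結 果子.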
Thanks, but how about organizing such lists by topic?
Excited to see Cantonese is being supported with Whisper:
- Enable Generate Timestamps feature for Slovenian, Cantonese, and Belarusian
- Enable Whisper transcription for Cantonese.
I can’t get it to work though. I have tried dozens of times with different browsers and after a while, it always says import failed.
Any idea what the issue is?
If you are referring to YouTube video import with Whisper transcription, my Mandarin imports fail too.
I was uploading an MP3 but the backend processing may be similar.
Generate Timestamps is already working for Cantonese.
We’ve had some difficulty getting Whisper to work in Cantonese and we are still working on this feature at the moment.
Could you give an example of a YouTube video where Whisper transcription fails? This feature works correctly for me at the moment.
In this thread I posted two video links and the problem I am getting (with screenshots): https://forum.lingq.com/t/youtube-imports-fail/300091