App Updates Up Until May 2024

kraemder · May 5, 2024, 10:56pm

I just clicked the link to look at the site and the pricing is something I could afford. What caught my eye is they can sample your voice and then make you speak all of their languages. I don’t know how well it works of course, but if it works well I would want to try that. It must be way easier to imitate yourself speaking another language than someone else.

WillowMeDown · May 6, 2024, 1:35am

No, you don’t need to reimport. Just click to edit the lesson and find the Re-split with AI button in the sidebar. This option is only available on the web right now but once split the lesson will appear with the updated splits on all platforms.

I guess this is also only working for Japanese at present? This could be a very useful feature for both traditional and simplified Chinese also.

Edited to add: Wait, do you mean splitting the words up correctly or is this referring to splitting sentences or chapters in a long book? I meant pairing characters up to make words; many times the characters are paired together oddly in Chinese. If there’s a way to use “AI” to improve the character groupings, that would be very welcome.

rafaelhmay · May 6, 2024, 2:10am

It’s not available in Chinese, is it?

ericb100 · May 6, 2024, 6:18pm

I also noticed (at least from the web) that Whisper-transcribed imports from YouTube now have the lines as full sentences rather than having them split. This is much better than before. Thanks.

mark · May 6, 2024, 10:36pm

Whisper transcription should have always resulted in full sentences being generated. It’s the import of automatic captions from YouTube that resulted in split chunks of text. These captions now get ignored and Whisper is used instead.

mark · May 6, 2024, 11:14pm

Ok, we’ll see if we can enable the different readings to show in Chinese too so they can be selected in different contexts @WillowMeDown .
Also, the splitting does refer to the way words are split. Our AI splitter does help to improve the accuracy in Japanese. Could you find a sample text and send us a link to the original lesson along with how you would prefer to see it split? We can look into what we can do with it once we have that. You can just post it here.

ericb100 · May 7, 2024, 12:32am

Full sentences were generated with whisper ai…but they were split (seemingly based on the timing events of the auto generated subtitles). Maybe this is what you are saying. I can assure you that for Whisper AI imports from Youtube, it was definitely splitting mid sentence (maybe at commas?) as most of my favorite content doesn’t have subtitles and if it was actually using the subtitles from autogeneration there would be no punctuation. So it was definitely using Whisper. I would go in after the fact and join and resplit the lines so they broke at the sentence endings. So I’m definitely happy it doesn’t appear I have to do this on these anymore.

BTW…Youtube videos WITH real subtitles still split along the timestamps in the middle of sentences. Would be nice if this one also got resplit at sentence endings (and the timestamps adjusted).
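To illustrate what I mean by resplitting at sentence endings, here is a rough Python sketch. It is only a guess at the general idea, not how LingQ actually does it: the `Cue` type, the `resplit_at_sentence_ends` name, and the punctuation list are all my own assumptions. It joins consecutive subtitle cues until one ends in sentence-final punctuation, and the merged line takes the start time of its first cue and the end time of its last.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle line (hypothetical format): times in seconds plus text."""
    start: float
    end: float
    text: str


# Assumed set of sentence-final punctuation (Latin and CJK), optionally
# followed by closing quotes or brackets at the very end of the cue.
SENTENCE_END = re.compile(r"[.!?。！？]\W*$")


def _join(cues):
    """Merge a run of cues into one, spanning first start to last end."""
    text = " ".join(c.text.strip() for c in cues)
    return Cue(cues[0].start, cues[-1].end, text)


def resplit_at_sentence_ends(cues):
    """Join consecutive cues so each resulting cue ends at a sentence end."""
    merged = []
    pending = []
    for cue in cues:
        pending.append(cue)
        if SENTENCE_END.search(cue.text.strip()):
            merged.append(_join(pending))
            pending = []
    if pending:
        # Trailing text without final punctuation is kept as its own cue.
        merged.append(_join(pending))
    return merged


cues = [
    Cue(0.0, 1.5, "I went to the store,"),
    Cue(1.5, 3.0, "and bought some bread."),
    Cue(3.0, 4.0, "Then I"),
    Cue(4.0, 5.5, "went home."),
]
for cue in resplit_at_sentence_ends(cues):
    print(cue.start, cue.end, cue.text)
```

Note this only joins whole cues. If a sentence ends in the middle of a cue, splitting there would need word-level timestamps, which this sketch does not assume are available.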

All in all great updates by the team! Nice to see a list of the changes.

davideroccato · May 7, 2024, 6:58am

I don’t get it. Are YouTube videos without subtitles being imported as well and converted with Whisper? Because yesterday I tried and only videos with subtitles were imported (using Safari).

After importing the videos (with subtitles), I had the message that I had to wait for the lesson to be generated. Which I don’t understand because if the subtitles are already there, why should I wait?

Confused.

EDIT: I realized they had CC activated but the subtitles were auto-generated. So, I suppose the ones that didn’t work didn’t have the auto-generated feature ON.
I have started importing YouTube videos again, so I’ll check the various combinations further.

ericb100 · May 7, 2024, 1:09pm

Odd @davideroccato. My understanding has been that for the past X number of months (even just prior to Whisper?) it hasn’t been importing any auto-generated subtitles…whether CC is turned on or not.

I also usually have CC on, and I haven’t seen it import the autogenerated ones in ages…always been Whisper transcribed since it came out. (I think!)

davideroccato · May 7, 2024, 1:18pm

Yes, I mean, they were Whisper transcribed if autogenerated. But a few videos were not imported; I assume they didn’t have CC activated, so they had no autogenerated option either. I’ll pay more attention if it happens again.

EDIT: I thought that with Whisper, every Youtube video is imported.

Caldazar · May 8, 2024, 5:09am

This is great, thank you for the update Mark!

Ramonesfan · May 8, 2024, 6:37am

First of all great updates.
My understanding is that importing Netflix shows into LingQ currently works only on a desktop computer. Since I exclusively use mobile devices, it’d be great to see that feature made available across your mobile apps.
Thanks again for your tireless work to improve the overall experience.

mark · May 8, 2024, 2:53pm

The reality is that we won’t be able to enable importing from Netflix on a mobile device for a variety of technical reasons. But, if you can borrow a friend’s computer for an hour, you can log in to Netflix and import a bunch of shows all at once which you can then access from your mobile device once imported.

WillowMeDown · May 8, 2024, 8:54pm

Hi! I had set the sentence view to not show spaces, so I didn’t notice before, but I just set it back to “show spaces” and discovered that the spacing seems to be better than before. (Am I imagining this?) I still did see a few things in a quick look-through. I’m sure there are plenty of people out there with better Chinese knowledge than mine, so corrections are welcomed if I make mistakes below.

Just a few examples

結果 子 - I would have put it as 結 果子 or 結果子 (all 3 characters together).
I.e. instead of “BearFru It”, I would have grouped it as “Bear Fruit” or else “BearFruit.”
主說 到 - “MasterSa Id”. I would have grouped it as 主 說到 - “Master Said” - no need to glue the subject and verb together.

而活 - I notice that 而 is generally stuck together with the verb that follows it, but that makes no sense to me. 而 is a word on its own.
並所作 - same as above, I’m pretty sure 並 is a word on its own (“and”) and shouldn’t be prepended to the following word.
這真葡萄樹 - I would not put 這真 together as one word here. “ThisTrue Grapevine” should instead be “This True Grapevine.”
二至八節 - Stanzas 2 through 8. I would prefer to put “Through” as its own separate word. Thus: 二 至 八節 or even 二 至 八 節 (four separate words).
Again, 宇宙 中真 葡萄樹 - “in the universe” is 宇宙中 or 宇宙 中. I don’t see any reason to stick 中真 together. 中 - in (in the universe) and 真 - true or genuine. So instead of grouping it as Univers AlTrue Grapevine, I would group it as either Universal True Grapevine or else Univers- al True Grapevine.
生機體 - living organism. This may be ok as is, although I’d prefer to see this grouping without any space.
得着 - my preference would be for these two characters to be grouped together
父是 - “Father-Is” - these two should not be grouped together as one word.
在結果子上 - I.e. in fruitbearing. 子上 shouldn’t be stuck together as a single word. The construct is “在 (verb) 上” . So 上 should stand alone here.
會有一道流臨及別人 - I would have separated 流臨 into two words. But I’m not sure which way is correct.
這流會結 許多 果子 - ThisFlowWillBear Much Fruit - I would add more spaces: This Flow Will Bear Much Fruit - 這 流 會 結 許多 果子.
這要藉着葡萄樹結果子得彰顯 - this must be manifested by the fruitbearing of the grapevine. “ThisMust Through Grapevine BearingFru ItBe Manifested.” I would separate “ThisMust” and “ItBe” since these are not actual pairings in this sentence.
So it would be: 這要藉着葡萄樹結果子得彰顯 or 這要藉着葡萄樹結果子得彰顯 .
父所是的一切都在 - (everything that the father is, is all in (noun)) - I would separate that into: 父所是的一切都在 (or else go ahead and keep 所是 as two separate words), but I wouldn’t connect 父所 as a single word.

And on and on. This was just from the first few sentences of the very first lesson I opened today. There are too many to list, really. They’re everywhere.
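In case it helps with reporting these, here is a small Python sketch of how a current split and a preferred split could be compared mechanically. It assumes both are written as the same characters with spaces between words; the function names `boundaries` and `compare_splits` are made up for this example and have nothing to do with how LingQ’s splitter works. The sample input is the “BearFru It” versus “Bear Fruit” case from above.

```python
def boundaries(segmented):
    """Return the character offsets at which each word ends."""
    offsets = set()
    pos = 0
    for word in segmented.split():
        pos += len(word)
        offsets.add(pos)
    return offsets


def compare_splits(current, preferred):
    """List word boundaries to remove from and add to the current split."""
    if "".join(current.split()) != "".join(preferred.split()):
        raise ValueError("both splits must contain the same characters")
    cur = boundaries(current)
    pref = boundaries(preferred)
    return {"remove": sorted(cur - pref), "add": sorted(pref - cur)}


# Current grouping "BearFru It" versus preferred grouping "Bear Fruit".
print(compare_splits("結果 子", "結 果子"))
```

Each number is a character offset from the start of the phrase, so the result says to drop the break after the second character and add one after the first.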

wernerkuhn · May 9, 2024, 6:03pm

Thanks, but how about organizing such lists by topic?

anthonydclarke · May 12, 2024, 2:58pm

Excited to see Cantonese is being supported with Whisper:

Enable Generate Timestamps feature for Slovenian, Cantonese, and Belarusian
Enable Whisper transcription for Cantonese.

I can’t get it to work though. I have tried dozens of times with different browsers and after a while, it always says import failed.

Any idea what the issue is?

fabiothebest · May 12, 2024, 4:00pm

If you are referring to YouTube video import with Whisper transcription, my Mandarin imports fail too.

anthonydclarke · May 12, 2024, 4:17pm

I was uploading an MP3 but the backend processing may be similar.

nsprung · May 13, 2024, 11:43am

@anthonydclarke

Generate Timestamps is already working for Cantonese.

We’ve had some difficulty getting Whisper to work in Cantonese and we are still working on this feature at the moment.

@fabiothebest

Could you give an example of a YouTube video where Whisper transcription fails? This feature works correctly for me at the moment.

fabiothebest · May 13, 2024, 2:20pm

In this thread I posted links to 2 videos and the problem I am getting (with screenshots): https://forum.lingq.com/t/youtube-imports-fail/300091