Japanese Splitting has suddenly turned Terrible?

I don’t know what happened, but a couple of days ago the way Japanese words are split became very unproductive for learning. In particular, particles are now attached to the words that come before them, and the system treats the results as “new words.” This kind of splitting DRASTICALLY increases the word count. This didn’t happen before.

4 Likes

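This feature has been temporarily removed from the settings for now. The team is presumably adjusting it, and newly imported lessons will not be split automatically.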

1 Like

We are looking into it, thanks.

2 Likes

Also, there used to be a popup offering to resplit with AI for newly imported lessons opened for the first time. It would be great to have it back.

1 Like

After making this topic I learned that you can still find it in edit lesson. At the bottom left it says “Re-split text with AI.”


1 Like

I’ve also been experiencing this exact problem with importing in Japanese.

In my case, I’ve been trying to import from Netflix. I’ve been doing this for several years now with no problems, but suddenly last week, instead of mostly individual words being highlighted, entire phrases and sentences are being highlighted, so I have to edit each sentence in order to create a LingQ for individual words.

I also noticed another post mentioning text being uploaded as one huge block. This is also happening in my case, and it is incredibly inconvenient when it comes to reading dialogue spoken by multiple characters.

For about a week now, I’ve been trying to upload a lesson to see if the situation has gotten better, and at the moment it hasn’t.

1 Like

I’ve been going into the edit panel on the lesson and hitting re-split words, and it seems to (sometimes) fix the issue after it reprocesses it.

1 Like

I haven’t uploaded any articles in a while, so I decided to upload one to see how it would look, and it looks as it should.

However, when I upload the subtitles from a Netflix episode, I’m still getting this.

This discussion was posted 19 days ago, and the Japanese splitting is STILL terrible. I’m also still having the problem of importing episodes from Netflix and the text being one block instead of line by line according to who is speaking.

Is there some kind of benefit I am missing in re-splitting an imported lesson? When I’ve tried it, it takes too much time, and it’s an extra step I never had to do before. And it’s still a giant block of text. Before, I only needed to import each episode once without any issues.

Fortunately, I still have some lessons that I imported before this problem happened. If there is a series I’m interested in, and there aren’t too many episodes, I will usually import them all at once. But, if there are too many, maybe half now and half later. So, I recently went back and went through a series that I never got around to. This current problem with importing Japanese lessons through Netflix is REALLY bad.

I have some other series that I imported before that will occupy me for a while, but I really do hope this gets fixed before that because there are some more series that I would like to import that I would like to study in an enjoyable way without extra hassles.

3 Likes

I just started LingQ a couple of days ago and I’ve been dealing with this problem constantly. I really hope this can be improved / fixed soon. It’s making things very messy.

2 Likes

It’s getting close to a month now since this started and I still can’t believe it’s not fixed. Now, for me, the re-splitting with AI is not even working correctly. I’m a huge advocate for this site, but this is really killing my motivation. I want to just study, but instead I have to try to fix every lesson and waste time reporting bugs.

3 Likes

I have been using LingQ for years and this is only a recent issue. It’s a really bad one that I’m hoping gets fixed soon.

3 Likes

Our team is looking into this now and we are working on improving the Japanese splitting. We expect to have it improved very soon. Thanks everyone for your patience.

2 Likes

Hi @AzureAbyss, @ThewDev, @scrubtaku!
Could you please share the source videos or articles that produce unsplit text when imported?
Or, if this relates to existing internal lessons, links to those lessons.

1 Like

For me it’s all imported YouTube videos and Whisper audio-generated lessons. I could link one, but I’m not sure what difference it would make.
Here is a reply I made on another thread that details the issue further, with screenshots.

Here is a link to the video, if that helps at all: https://youtu.be/c8jRlJ8uuLY?si=vU1VQyP_4le6ziEh but again, it’s all YouTube imports and Whisper-generated text where this happens for me personally.

2 Likes

scrubtaku’s link demonstrates what I’m seeing very well.

Personally, I’ve only imported from YouTube so far. I’m very new to LingQ, so, to be honest, I’m not sure if what I want is what the system is designed to provide.

I very frequently see common Japanese particles like が, に or と being connected to the words in front of them (or sometimes even the following word). Also, the copula です or だ is almost always connected to the word in front of it. This creates a lot of fake “new” vocabulary, which is kind of annoying to deal with.
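To show why this inflates the word count, here is a minimal sketch. It assumes a toy vocabulary and hand-written token lists (the names `nouns`, `particles`, and `unique_terms` are made up for illustration), and it says nothing about how LingQ actually splits text. It only compares the number of distinct terms when particles are split off versus glued to the preceding noun.

```python
from itertools import product

# Toy data, chosen only for illustration.
nouns = ["猫", "犬", "魚", "本"]
particles = ["が", "に", "と"]


def unique_terms(tokens):
    """Count distinct tokens, i.e. what a reader would have to mark as words."""
    return len(set(tokens))


# Every noun followed by every particle, e.g. 猫が, 猫に, 猫と, 犬が, ...
pairs = list(product(nouns, particles))

# Splitting where the particle is its own token.
separate_tokens = [tok for noun, particle in pairs for tok in (noun, particle)]

# Splitting where the particle is attached to the noun before it.
attached_tokens = [noun + particle for noun, particle in pairs]

separate_count = unique_terms(separate_tokens)  # nouns + particles
attached_count = unique_terms(attached_tokens)  # nouns * particles

print(separate_count, attached_count)
```

With separate particles the distinct terms grow roughly like nouns plus particles, while with attached particles they can grow like nouns times particles, which is where the fake “new” vocabulary comes from.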

2 Likes

There has always been a bit of that with LingQ. I’m not sure if they can ever make it perfect, but now it’s way worse.

1 Like

For me, the issue is importing lessons from Netflix. I’m using the LingQ importer extension in Google Chrome to import the subtitles from Netflix. I’ve tried importing 3 different series, and all 3 of them had unsplit text. I’m assuming this is the issue regardless of which series I try to import.

I don’t remember the date this started, but I guess anything I imported before November… 6th? has not been affected.

2 Likes

Hey everyone, thank you for bringing this up. This is an issue we take seriously, and we are actively working on a solution for Japanese word-splitting that will resolve it soon. Our goal is to provide the most accurate splitting possible in a reasonable period of time. Our previous implementation was too time-consuming. Unfortunately, this latest implementation is unstable, but we are confident that the end result will achieve our goals. We are trying to get something out in the next few days that should resolve the issues. Thanks for your patience. We will post here when the new updates are live.

3 Likes

We have pushed some fixes to importing Japanese. Can you all try importing again and let us know if things are improved? They should be. Let us know any issues and also let us know how the splitting could be improved. Using AI we should be able to make it work better than it ever has.

2 Likes

I tried importing 3 different lessons and it’s pretty much the same as before.


After re-splitting with AI

EDIT: I also just tried it with a lesson imported from Whisper AI-generated audio, and it’s the same result.

1 Like