Japanese lingqs don't highlight

Bennels · April 9, 2009, 12:41am

I’m sorry if this question has been answered already in past posts.

I have been studying Japanese on Lingq for about a year, and love this site, and Steve’s methods! The only thing which has constantly been a problem for me is that “lingqing” in Japanese (and possibly some other languages) has a great disadvantage in comparison to lingqing in languages where there are spaces between the words.

Lingqing in Japanese means that most of the time your lingqs will not be highlighted in the text, and it also means that your known words total is inaccurate. It seems the system counts each run of text without spaces as a single new word. This makes the whole system of known words totals and percentages inaccurate.
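To illustrate the counting problem, here is a minimal sketch that assumes words are counted by splitting on whitespace. That is a guess at the behaviour described above, not LingQ's actual code; `count_words` is a hypothetical helper, and the Japanese sentence is the one quoted further down this thread.

```python
def count_words(text: str) -> int:
    # Assumed behaviour: every whitespace-separated chunk counts as one word.
    return len(text.split())


english = "I went to the bank today"
japanese = "ぎんこうへのふりかえのみといわれてて。"  # no spaces anywhere

english_count = count_words(english)
japanese_count = count_words(japanese)  # the whole sentence is one "word"
print(english_count, japanese_count)
```

Under that assumption, a whole Japanese sentence becomes one "word", so it never matches a saved lingq and it skews the known words total.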

I am wondering if there are any updates on the horizon which might find a solution to this problem. I think LingQ is the best thing I’ve ever found, and I would love to be able to take advantage of all the features and ease of use for Japanese in the near future.

Thanks, and keep up the great work!

Ben

nobody · April 9, 2009, 3:20am

Perhaps the LingQ team has already thought about this, but I have a suggestion anyway. It is quite difficult to parse Japanese text and divide it into words, and, BTW, it puts additional load on the server. Why not implement a “Hide spaces” button? The user clicks it and the Japanese text is shown without spaces, but the program behaves as if the spaces were there: it highlights lingqs and counts words. The catch is that providers would have to be asked to put spaces in their texts… It is additional work for them…
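The “Hide spaces” idea could be sketched roughly like this. It assumes the lesson is stored with spaces between words and that lingqs are kept as a set of strings; `hide_spaces` and the spacing of the sample sentence are made up for illustration and are not anything LingQ actually uses.

```python
def hide_spaces(spaced_text: str, lingqs: set[str]):
    """Return (display_text, highlights, word_count).

    highlights are (start, end) character offsets into display_text,
    which is the stored text with all spaces removed.
    """
    words = spaced_text.split()
    highlights = []
    pos = 0
    for word in words:
        if word in lingqs:
            highlights.append((pos, pos + len(word)))
        pos += len(word)  # offsets are measured in the space-free text
    return "".join(words), highlights, len(words)


# Hypothetical provider-supplied text with spaces between words.
spaced = "ぎんこう へ の ふりかえ のみ と いわれてて。"
display, spans, total = hide_spaces(spaced, {"ふりかえ", "のみ"})
print(display)
print(spans, total)
```

The reader sees the text without spaces, while highlighting and word counts still come from the spaced version, so no parsing is needed on the server.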

nobody · April 9, 2009, 3:27am

If there were two versions of the same kanji lesson in the Japanese library, one free without spaces and the other paid with spaces, I would pay for the lesson with spaces (after looking at the free lesson gg, and of course only if I find the quality/price balance appropriate).

steve · April 9, 2009, 6:00am

Ben,

I think that we may be close. Thanks for your patience.

Bennels · April 9, 2009, 7:03am

That’s fantastic news Steve! Thanks for your reply! Keep up the great work! This site, and your learning methods have already helped me tremendously! I’ll be a member for life!

dooo · April 9, 2009, 2:09pm

Further to the idea of spaces in Japanese texts that Cakypa brings up, I understand the need for spaces as a way of accounting for words. But sometimes I find the spaces create challenges rather than lessen them.

For example, in the following:

ぎんこうへのふりかえのみといわれてて。 (Bad day)

のみと looks like it should be a word, but it is really のみ and then と.

Personally I prefer the Japanese as it is actually written and read by Japanese. But it doesn’t bother me that much. I’ll still use the lessons. I just thought I’d mention it in the context of this discussion.

nobody · April 9, 2009, 2:41pm

I have not implemented parsing algorithms for 4 years, which is why I just can’t imagine how to parse Japanese text. I imagine it as a finite-state machine with an almost infinite number of states %) I remember how many states my finite-state machine for a C++ parser had, and C++ is an artificial and very logical language with a finite number of possible structures… (OK, I was a student who did not attend lectures, so perhaps there was a better solution for a C++ parser gg). I wonder whether it is possible to write an algorithm that will correctly parse any Japanese text into separate words?.. I use the Rikaichan add-on for Firefox and it rather often offers a variety of suggestions: from the longest character chain to the shortest.
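For what it’s worth, the “longest chain first” behaviour can be sketched with a plain dictionary lookup. This is only a toy version under my own assumptions: the tiny dictionary is made up, `candidates` and `segment` are hypothetical helpers, and it is not how Rikaichan is actually implemented (real tools also need to handle inflected forms).

```python
def candidates(text, start, dictionary, max_len=8):
    """All dictionary entries that begin at text[start], longest first."""
    found = []
    for length in range(min(max_len, len(text) - start), 0, -1):
        piece = text[start:start + length]
        if piece in dictionary:
            found.append(piece)
    return found


def segment(text, dictionary):
    """Greedy segmentation: always take the longest candidate."""
    words = []
    pos = 0
    while pos < len(text):
        options = candidates(text, pos, dictionary)
        # Unknown character: emit it on its own and move on.
        word = options[0] if options else text[pos]
        words.append(word)
        pos += len(word)
    return words


# Toy dictionary, invented for this example.
toy_dict = {"ぎんこう", "ぎん", "へ", "の", "ふりかえ", "ふり", "のみ", "と"}
first = candidates("ぎんこうへ", 0, toy_dict)
words = segment("ぎんこうへのふりかえのみと", toy_dict)
print(first)
print(words)
```

Always taking the longest match is a heuristic and can pick the wrong split, which is presumably why showing several candidates to the reader is useful.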

I also prefer texts without spaces, because I want to switch to authentic content. For now I submit writings with as many kanji as I can manage and, once I receive a correction, I import it and delete all the spaces. I can’t write without spaces right now as I am not sure that I can write understandable things. With spaces it is easier for a tutor to understand what I mean :))

mark · April 9, 2009, 4:36pm

We are actually trying to use a similar approach to Rikai chan. We hope to have something done there in the next few weeks to a month. I don’t understand all the “computerese” but we hope to remove the need for spaces.

Hitomikanda · April 12, 2009, 3:03am

Hi all, as a Japanese tutor, it has always been a concern for me where to cut sentences.
As Ed has mentioned, spaces can make it difficult to distinguish each word.
We do put gaps between words in children’s books in Japan, but the stories are very simple so I can understand them easily.
However, some content, like “Bad Day”, is spoken in a very casual way, so it makes things complicated if I have to add spaces.

karlt · June 7, 2010, 5:44am

This is my first post. It looks like this thread is dead, so I will start a new one. It is now June 2010, and I was going to ask if there were any improvements coming for the parsing of Japanese… (I’ll start the new thread now)