How Does One Import Japanese into LingQ?

The following is the first page of a story from a graded reader on Tadoku (https://tadoku.org/). The original file is a PDF; here I just pasted in an image to share.

I tried a couple of ways of importing it, with no success.

Does LingQ simply not support traditional Japanese? Or am I missing something, even a workaround?

I haven’t used LingQ for Japanese, but I was curious about this. While studying one language or another, haphazardly, I have seen some imports work and others not work. I decided to give your import a try.

Overall, my experience was better than yours. I hope this helps you get past the hurdle that is blocking you, @gmeyer.

On Windows 11, I downloaded w0004e-michikonohoshizora-v2.pdf (not w0004p-michikonohoshizora-v2.pdf). Viewing lingq.com with a Chrome web browser, I used drag-and-drop to import that file. It worked OK. I got a normal LingQ lesson. My web browser view says that the LingQ lesson has 67 pages. Sentence View sees 670 sentences. So far, so good.

The last section of Japanese text in that lesson is labelled “33”, meaning “end of page 33” of the 50-page PDF. So 17 pages are missing from that lesson. That is normal. It happens when importing one long file (PDF, TXT, etc.) into LingQ. LingQ tries to split the one too-long PDF into several LingQ lessons. The split worked for me. One PDF, two lessons.

There is some material on page one that is not proper Japanese. Stuff like the following. Not a problem, just push past it.
[Image: michiko_no_hoshizora_example_of_import_flaw]

The learning material starts on page five of LingQ lesson 1 (of the 2 lessons generated by the one PDF). The lesson is useable, but the LingQ import does not do word segmentation as well as a human, and you have to be tolerant of transposed furigana. Sometimes the furigana are transposed quite a distance from where you might expect. An example is “ekimae”. A sample from what I see:

I am unable to import PDF files that have images in them, for both German and English. I tried converting them to TXT format with Calibre, but the conversion failed. I am left importing only PDF/TXT files that have no images in them.

@gmeyer LingQ can import PDF files (click the import button near the top right of your library screen on desktop).

However, the PDF file you shared cannot be imported. Vertical Japanese is tricky. Furthermore, I noticed that I’m unable to run my cursor over the text and highlight it. If that’s not possible, most likely the text cannot be imported.
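A rough way to run the same "can I highlight it?" check without opening a viewer is to ask a PDF library for each page's text and see whether anything comes back. This is only a sketch: it assumes the third-party pypdf package is installed, the function names page_texts_from_pdf and has_text_layer are made up for illustration, and the 20-character threshold is an arbitrary guess, not anything LingQ documents.

```python
def page_texts_from_pdf(pdf_path):
    """Return the extracted text of every page (needs the third-party pypdf package)."""
    from pypdf import PdfReader  # imported lazily so has_text_layer works without pypdf

    reader = PdfReader(pdf_path)
    # Image-only pages typically yield an empty string here.
    return [page.extract_text() or "" for page in reader.pages]


def has_text_layer(page_texts, min_chars=20):
    """Guess whether a PDF has selectable text, given its per-page extracted text.

    min_chars is an arbitrary threshold (an assumption, not a LingQ rule).
    """
    total = sum(len(text.strip()) for text in page_texts)
    return total >= min_chars
```

If has_text_layer(page_texts_from_pdf("story.pdf")) comes back False (where "story.pdf" is a placeholder name), the pages are probably just images, and the file would need OCR before there is any text to import.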

Here’s a PDF file that can be imported. However, it’s written horizontally.

I tried another PDF that was written vertically.

LingQ is able to import the file but the text is not in the correct order. I’ll ask the dev team if this can be solved.

Thanks for the prompt reply, Eric.

What I visually shared is just a screenshot for the sake of making a visual post, easy for the forum reader here.

The actual file I was trying to import is here: 美知子の星空 – にほんごたどく

@gmeyer Right now, LingQ is not planning to develop a way to import vertical Japanese text from PDF and re-format horizontally.

However, I fiddled around with ChatGPT and was able to extract the text from the PDF without furigana. From there, I manually imported the text into LingQ. So, there is a way to do it but it just takes a bit of time and prompting.

I’d be curious how you did that. I seem to have sent ChatGPT down the wrong road.

@gmeyer

Upload the PDF to ChatGPT.

Prompt: “I would like you to extract the text and format it vertically. Omit furigana”.

I was able to do a page with no problem. It may take multiple prompts back and forth to do an entire novel. As long as the PDF text can be highlighted, text extraction is possible.
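If the extracted text still has readings in it, a small script can sometimes clean them up before you paste the text into LingQ. The sketch below only handles one layout, and that layout is an assumption on my part: readings written in parentheses right after the kanji, like 駅前(えきまえ). Furigana that come out of a PDF as loose kana scattered through the text will not be caught. The function name strip_parenthesized_furigana is made up for this example.

```python
import re

# A kana-only reading in ASCII or full-width parentheses, directly after a kanji.
# "Kanji" here means the basic CJK Unified Ideographs block plus the repeat mark 々.
_FURIGANA = re.compile(
    r"(?<=[\u4e00-\u9fff々])[(（][\u3041-\u3096\u30a1-\u30fc]+[)）]"
)


def strip_parenthesized_furigana(text):
    """Remove readings written as 漢字(かんじ); leave all other parentheses alone."""
    return _FURIGANA.sub("", text)


print(strip_parenthesized_furigana("駅前(えきまえ)で会う"))  # 駅前で会う
```

Parentheses that do not follow a kanji, or that contain anything other than kana, are left untouched, so ordinary asides in the text should survive.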

ChatGPT can be like an emotionally needy colleague.

You have to give it social interaction and affirmation every few minutes. It can’t focus on a task more than a few seconds!