Can anyone recommend one? I use Notepad or Open Office for English, and Word for Russian, but none of these seem to be able to handle Japanese text without fuss.
What I am trying to do right now is to strip out the HTML from web pages in Japanese before importing them as lessons in LingQ, but there are other things I want to be able to do (like create and read files in Shift-JIS format, or whatever the most common Japanese text format is).
Unfortunately I only read Japanese at kindergarten level so I can’t read web pages in Japanese very well ;-(
Word should be fine for use with Japanese. How are you importing it? And what version are you using? If you copy and paste it should show up fine. It should, in fact, be able to convert between encodings when you copy and paste.
The problem is possibly the code page you’ve got set up as default. In order to cope with the multitude of encodings that aren’t UTF-8, you have to specify which one you normally use. As Shift-JIS is very non-standard, if you’re using a different encoding by default then it won’t show up properly in applications such as Notepad that don’t cope with a huge array of encodings.
Shift-JIS is a legacy encoding anyway. It’s still widely used, but it’s a pretty nasty encoding, and although it is the de facto standard for web pages in Japanese, using UTF-8 is recommended.
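If you do end up with a Shift-JIS file and want it as UTF-8, a few lines of Python can do the conversion. This is only a rough sketch: the function name and the sample string are made up for illustration, and it assumes the input really is Shift-JIS (files saved on Windows are often the cp932 variant, in which case swapping the codec name to "cp932" may work better).

```python
def sjis_to_utf8(data: bytes) -> bytes:
    """Decode Shift-JIS bytes and re-encode the text as UTF-8."""
    # Raises UnicodeDecodeError if the bytes are not valid Shift-JIS.
    text = data.decode("shift_jis")
    return text.encode("utf-8")


# Hypothetical sample text, encoded to Shift-JIS first so there is something to convert.
sample = "日本語のテキスト"
sjis_bytes = sample.encode("shift_jis")
utf8_bytes = sjis_to_utf8(sjis_bytes)
```

For a real file you would read it in binary mode, pass the bytes through the function, and write the result back out in binary mode.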
Sakura, a free text editor, is the standard free one used for Japanese, but I don’t know how it supports Japanese and whether it would have problems on a non-Japanese system.
I didn’t realise that Shift-JIS isn’t the standard, thanks for that!
My immediate problem is to convert HTML to plain text. I’ve just managed to do it using Google Mail as a text editor.
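For anyone who would rather script the HTML-to-plain-text step, here is a minimal sketch using Python's built-in html.parser module. The class name, function name, and sample markup are all invented for the example, and it simply keeps the visible text while dropping tags (including links) and anything inside script or style elements. Real pages are messier, so treat it as a starting point rather than a finished tool.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores tags, scripts, and styles."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical snippet in the style of a Wikipedia paragraph with a link.
sample_html = (
    '<p><a href="https://ja.wikipedia.org/wiki/猫">猫</a>は動物です。</p>'
    "<script>var x = 1;</script>"
)
plain = html_to_text(sample_html)
```

Because the link text is kept but the surrounding tag is dropped, the words stay clickable as ordinary lesson text instead of opening a web page.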
Word may do all I want, I haven’t experimented much with it with Japanese.
I would like to have a Japanese text editing (or at least reading) app on my phone…but that’s a project for another day.
Thanks again for your help, Roan!
I import Japanese often from the web with no problem, and no conversion process on my part. Confused.
Second Best Toll Pricing?
Summer Biomedical Training Program?
Seminole Bromeliad & Tropical Plant?
What does SBTP stand for?
Modern websites use UTF-8. Read more: UTF-8 - Wikipedia
You can find out about the encoding of a web page by opening the source text and searching for “charset”. You’ll find something like: <meta charset="UTF-8">.
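The same "search for charset" trick can be done in a script. The sketch below is an assumption-laden shortcut, not a proper HTML parser: the regular expression, the function name, and the sample page are all made up here, and it only looks at the first few kilobytes of the raw page for a charset=... declaration.

```python
import re

# Matches charset=UTF-8, charset="utf-8", charset='Shift_JIS', and so on.
CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)


def find_charset(raw_html: bytes):
    """Return the declared charset in lowercase, or None if none is found."""
    match = CHARSET_RE.search(raw_html[:4096])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


# Hypothetical page header using the older http-equiv style of declaration.
page = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=Shift_JIS"></head></html>'
)
declared = find_charset(page)
```

The value it returns can then be passed to bytes.decode() to turn the raw page into text, though a page can declare one encoding and actually use another.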
http://canon.jp or http://www.nikon.co.jp or http://www.yahoo.co.jp/ all use UTF-8.
All modern text editors (like Notepad) are able to handle UTF-8. But sometimes you need to tell the text editor the “encoding” when opening the file or saving the file. UTF-8 is always a good choice.
LingQ also uses UTF-8.
@dooo: The problem is getting rid of all the hypertext links from Wikipedia pages, so that I can click on words in a lesson without them trying to open up a web page.
Google mail’s editor does it fine.
I also want to be able to display Japanese ebooks on my ebook reader and mobile phone. I think for the phone I need a text editor / reader that can handle UTF-8 encoded files.
The ebook reader may be a lost cause unless they update the firmware to support UTF-8.
Erm…because I’m using the LingQ quick import bookmarklet. It brings over the HTML whether I want it or not.
Ah, does that cut out the HTML? I may go back to doing it that way then.
I now know what you mean. I just ignore the hotlinks in the lesson, if any. If I really want to know them, I turn on Rikaichan (FF add on) briefly and hover over the word.