First, the disclaimers--this is very much a work in progress, and there have been some errors--nothing that would destroy your machine, but things that would make you do more than you actually had to do. As I find them, either through someone kind enough to point them out, or through experimentation, I fix them. If you find any errors here, please drop me a line.
If you read this page, you'll see that almost all of the information has been gathered from others--in other words, if you have a problem and write me, I will help if I can, but I don't know that much about it.
A few quick introductory links: For a far more detailed treatment of this subject, see Dr. Mike Fabian's page on the Suse website. Charles Muller's site has a page on Japanese in Mandrake and David Thiel's page on Japanese in FreeBSD is brief, but very easy to understand and useful. David Oftedal has a page about Internationalization in Gentoo Linux which covers several languages.
Note that this article only covers using Japanese in X. Japanese input is constantly being improved, and much of this information is deprecated.
I use Japanese in either an aterm or urxvt terminal with things like vi and mutt. I also use it with OpenOffice. Therefore, those are about the only things that I checked. In most cases, if it works in these applications, it will also work with firefox, and, I assume, thunderbird.
The kinput2 cannaserver combination used to be the input method of choice. However, these days, scim and anthy have become far more popular. With distributions I have tested, I will give information on using scim and anthy. With NetBSD I've had more success with kinput2 and canna.
However, as of Fedora Core 4, at least, scim, anthy and scim-anthy are available. I don't know about Core 3. One can check it with
yum search scim anthy scim-anthy
Canna may be running at startup. So, first stop it
pkill cannaserver
Uninstall it
yum remove Canna
(Note that it's case sensitive, the upper case C is necessary).
Install anthy, scim and scim-anthy
yum install scim anthy scim-anthy
Fedora's default is to boot up in graphic mode. If you boot in console mode, you can add the following to your .Xclients file, otherwise, add these lines to your .bash_profile
export XMODIFIERS='@im=SCIM'
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=ja_JP.utf8
scim -d
To put them into effect immediately
source .bash_profile
You should see a message that scim is running. (However, this doesn't always work--if it doesn't, just log off and log on again.)
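If Japanese input still doesn't work after sourcing the file, it can help to confirm that the variables actually reached your current shell. The following is only a rough sketch: check_im_env is a name made up for this example, not part of scim, and it simply reports which of the four variables used above are empty or unset.

```shell
#!/bin/sh
# check_im_env (hypothetical helper): print one line for each input-method
# variable that is unset or empty. Returns 1 if anything is missing.
check_im_env() {
    missing=0
    for var in XMODIFIERS GTK_IM_MODULE QT_IM_MODULE LC_CTYPE; do
        # Indirect lookup of the variable named in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            missing=1
        fi
    done
    return "$missing"
}
```

Run check_im_env in the terminal you plan to use. If it prints nothing, the variables are in place, and pgrep scim should then show whether the daemon itself is running.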
Now when you start any application, hitting ctrl+space will open up a little scim panel in the lower right of the screen. If you type romaji (Japanese spelled out in English letters), you will see hiragana appear. If you hit the space bar, it will offer kanji to select. Note that the panel should have the word Anthy on it. If it doesn't, click the words RAW CODE or English (or whatever is shown there) and you should have an option for Japanese=>Anthy.
Using scim-anthy you should be able to use Japanese in most applications. If you have trouble inputting Japanese in an xterm, if, for example, you're using vi or mutt, use uxterm. (This can be called by simply typing uxterm from any command line.)
At some point, I got into a habit of only calling these variables when I needed them, and made a little lang.sh script.
#!/bin/sh
XMODIFIERS='@im=SCIM' LC_CTYPE=ja_JP.UTF-8 ${1+"$@"} &
Then, I might call mutt, for example, with
lang.sh mutt
Whether or not this saves on resources, it became a habit.
I've found that I can usually get away with leaving out the GTK and QT IM_MODULES variables, although it's probably better to include them.
If you wish your menus and the like to be in Japanese as well, you can add, either to the lang.sh script or your .bash_profile
LANG=ja_JP.utf8
Now, most applications will also work in Japanese--some things may show up as mojibake (gibberish) but you will be able to use Sylpheed, xchat (an irc client) etc in Japanese without problem. You'll also be able to input kanji as text in GIMP.
I used to have an entire section on Gentoo. However, as I don't use it these days, when updating this page, a bit of research indicated that my method was entirely deprecated. Aside from the link to David's page at the top, the reader can also check this thread on Gentoo Forums.
Kevin W. (AKA sandcrawler on Gentoo Forums) was kind enough to send me his mini Gentoo howto.
He added the following USE variables (though we're not sure canna is necessary)
immqt-bc nls cjk canna unicode
Then
emerge --newuse world
Emerge the necessary programs
emerge scim anthy scim-anthy scim-qtimm
He added the following to his .bash_profile
export XMODIFIERS='@im=SCIM'
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=ja_JP.utf8
scim -f socket -c socket -d
(If not booting into X, you might leave off the scim line and put it in .xinitrc or whatever file you use to start X.)
This enables him to input Japanese in most applications.
Debian is another distribution I haven't used in a while. Many other distributions are based on it, including Mepis and Ubuntu, and instructions written for those two distributions should work. One can use kinput2 and canna, but scim and anthy are available, and probably preferable. There is a Mepis article and two for Ubuntu. The older one is from the Ubuntu wiki, and a later one that a friend described as "short, sweet and newer" is available on their forums.
Note that to input Japanese in a terminal, you may need to install one of the multi language terminals, such as rxvt-ml, rxvt-unicode or mlterm. Most distributions now include a uxterm which is an xterm that can handle unicode and UTF-8 and that should work if you don't wish to install any other xterms.
There are now scim-anthy Debian and Ubuntu packages available. Rather than using the scim-uim packages recommended in the howtos linked above, I would suggest going to the scim-anthy page at cvs.sourceforge.jp. Do a search on the page for Debian (or Ubuntu) and it tells you what lines to add to /etc/apt/sources.list. Then, simply apt-get install scim, anthy and scim-anthy and otherwise follow the instructions in the above howtos.
Once installed, set the XMODIFIERS and LC_CTYPE variables and call scim in your .xinitrc, before the line calling your window manager. For example, if your window manager is fluxbox
export XMODIFIERS="@im=SCIM"
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=ja_JP.utf8
scim -d
There is a package for rxvt_ja which supports euc and also a package for rxvt-unicode. If you install rxvt-unicode, it's called with the command urxvt.
One interesting thing was that even if I hadn't started the scim daemon in my .xinitrc, when I started a GTK app, scim would open a panel and one could input Japanese, however it would only work in that application. For example, if I installed firefox and opened it, the scim daemon would open. When I opened a urxvt terminal, even though the daemon was running and I could input Japanese in firefox, I still couldn't input it in the urxvt terminal.
Scim can be downloaded here, and anthy here. Note that the anthy link sends you to a download selection page. You want the latest version of anthy, not anthy-ss. At time of writing, it's 6700b.
The scim-anthy source can be found here.
Once downloaded, untar and install the three programs. Install anthy first, then scim, and scim-anthy last. In each case, the commands are the same. The versions given in these examples are current at time of writing, change the command to fit the version you download.
tar -zxvf anthy-6700b.tar.gz
cd anthy-6700b
./configure --prefix=/usr && make && make install
Do the same for scim and scim-anthy in that order. Restart X and you should be able to call up scim input in any program by hitting ctrl+space.
You will also want Japanese fonts, especially if you are using Japanese in something like OpenOffice. Substitute kochi truetype fonts can be found at download.sourceforge.jp. You want the package kochi-substitute-20030809.tar.bz2.
Download it and untar it.
tar -jxvf kochi-substitute-20030809.tar.bz2
This will create a kochi-substitute-20030809 directory. You will see the kochi-mincho and kochi-gothic substitute fonts. They have a .ttf ending.
Move the fonts to /usr/X11R6/lib/X11/fonts/TrueType or /usr/X11R6/lib/X11/fonts/TTF if there is no TrueType directory.
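The directory choice above can be sketched as a small script. This is only an illustration: pick_font_dir and install_kochi_fonts are names invented for this example, and it assumes the .ttf files sit directly in the untarred directory. It copies rather than moves, so the originals stay put.

```shell
#!/bin/sh
# pick_font_dir BASE (hypothetical helper): print BASE/TrueType if that
# directory exists, otherwise fall back to BASE/TTF.
pick_font_dir() {
    base=$1
    if [ -d "$base/TrueType" ]; then
        echo "$base/TrueType"
    else
        echo "$base/TTF"
    fi
}

# install_kochi_fonts SRC BASE (hypothetical helper): copy every .ttf file
# found in SRC into the directory chosen by pick_font_dir.
install_kochi_fonts() {
    src=$1
    dest=$(pick_font_dir "$2")
    # Assumption: creating the TTF directory is acceptable if it is absent.
    mkdir -p "$dest" && cp "$src"/*.ttf "$dest"/
}
```

As root, something like install_kochi_fonts kochi-substitute-20030809 /usr/X11R6/lib/X11/fonts should do it. On systems that use fontconfig, running fc-cache afterwards may be needed before applications notice the new fonts.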
However, with one of its offshoots, Vector, at first I would open, for example, an mlterm session. I hit ctrl+space and the scim panel appeared. I then entered romaji, but rather than seeing hiragana, I saw dotted squares. If I typed correctly and hit space (for example, typing nihongo and hitting space once), the word nihongo, in kanji, would appear. However, I didn't see this until I hit enter.
The scim faq indicates that this is because scim isn't finding the fonts it needs. I am not sure what packages were missing--however, choosing to install gimp during the initial installation fixed the problem. Afterwards, even if I deinstalled gimp, it would still work properly.
Vector's default editor, like Slackware's, is elvis, which didn't work properly. I had to grab the Slackware package for vim and install it. I used a Slackware CD that I had, but if you don't have one, go to Slackware's package search site. I used the version from 10.1, which may have changed by the time you read this.
Two programming friends, Godwin Stewart and Stuart Bouyer (who has done a great deal of work on Japanese input packages for Gentoo Linux) made me a tarball of a modified kinput2 and canna installation. It is not perfect--when one starts cannaserver, you see the message Terminated. However, doing pgrep cannaserver shows that it is running and it works perfectly for me.
The tarball is available from qnd-guides.org.
Thanks to the generosity of the Tokyo Linux User Group it is also available on their site. To use it, first download and untar it. You will see two gzipped files there, one for Canna and one for kinput. Install canna first as the kinput file will be looking for it.
tar -jxvf vanillajpn.tar.bz2
tar -zxvf Canna36p1.tar.gz
cd Canna36p1
xmkmf
make Makefile
make canna
make install
make install.man
When done, you'll have a file /usr/sbin/cannaserver.
Now kinput2
tar -zxvf kinput2-v3.1-beta3.tar.gz
cd kinput2-v3.1-beta3
xmkmf
make Makefiles
make depend
make
make install
As every distribution has its own way to make a program run at startup, that is an exercise I will leave to the reader. For example, in Slackware, you can add a few lines to /etc/rc.d/rc.M. As I said, you will see, after starting /usr/sbin/cannaserver the word Terminated. However, it can be ignored.
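For what it's worth, here is a minimal sketch of the sort of lines one might add to a startup file such as /etc/rc.d/rc.M. The function name start_cannaserver is made up for this example. It only checks that the binary is there before running it, and the default path is the one given above.

```shell
#!/bin/sh
# start_cannaserver [PATH] (hypothetical helper): run cannaserver if the
# binary exists and is executable, otherwise complain and return 1.
start_cannaserver() {
    server=${1:-/usr/sbin/cannaserver}
    if [ -x "$server" ]; then
        # The "Terminated" message mentioned above may still appear here.
        "$server"
    else
        echo "cannaserver not found at $server" >&2
        return 1
    fi
}
```

Calling start_cannaserver with no argument uses /usr/sbin/cannaserver. Since the Terminated message is misleading, pgrep cannaserver is the more reliable way to see whether it really started.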
You will need a terminal that can display Japanese. As mentioned above, one can use the builtin uxterm. The mlterm and rxvt-unicode programs also work.
Add these lines to your .xinitrc above the line that calls your window manager.
export XMODIFIERS='@im=kinput2'
export LC_CTYPE=ja_JP.utf8
kinput2 -canna &
This should enable you to input Japanese in most programs.
A quick note on the LC_CTYPE variable. In most Linux distributions it is ja_JP.utf8. However, some distros spell it differently, and it is case sensitive. To check it, type
locale -a | grep ja_JP
If you get something like ja_JP.UTF-8, use that rather than utf8.
If you get an error similar to "Unable to set locale", this is often the reason: you have it as, for example, utf8, and the system is looking for UTF-8.
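To avoid guessing at the spelling, the check can be scripted. This is just a sketch: pick_ja_utf8_locale is a made-up name, and it simply filters locale -a style output for a Japanese UTF-8 entry, whichever way the system spells it.

```shell
#!/bin/sh
# pick_ja_utf8_locale (hypothetical helper): read `locale -a` style output
# on stdin and print the first Japanese UTF-8 locale exactly as spelled
# there (ja_JP.utf8, ja_JP.UTF-8 and so on). Prints nothing if none exists.
pick_ja_utf8_locale() {
    # -i ignores case; -\{0,1\} makes the hyphen optional.
    grep -i '^ja_JP\.utf-\{0,1\}8$' | head -n 1
}
```

One could then write export LC_CTYPE=$(locale -a | pick_ja_utf8_locale) in place of a hard-coded value. If the result is empty, the system has no Japanese UTF-8 locale installed at all.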
To sum up, most people consider the scim-anthy combination better than kinput2 and canna. If your distribution doesn't have packages for scim and anthy, you can download and install them, following the instructions given above. If they don't work for you, then use the kinput2 canna combination, using the vanillajpn tarball, for I have found that to work in almost every distribution that I have tried.
Despite there being over 400 Linux distributions, most of them seem to be based on RedHat, Debian or Slackware so the instructions above should work for almost every distribution.
cd /usr/ports/japanese/scim-anthy
make install clean
There is a package message, suggesting setting the LANG variable to ja_JP.eucJP. However, I haven't found this necessary.
In your .xinitrc file
export XMODIFIERS='@im=SCIM'
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=ja_JP.UTF-8
scim -d
One will need a terminal capable of displaying unicode. There is the builtin uxterm, mlterm and rxvt-unicode. One oddity I have found is that if I try to type nihongo directly into one of these terminals, it may not display correctly. However, if one tries to cat a text file written in Japanese, it will display the file correctly.
FreeBSD's vi is nvi. I haven't gotten this working properly with Japanese, so I install /usr/ports/editors/vim-lite. One can create an alias by editing their shell's rc file. For example, I use zsh, so in my $HOME/.zshrc file I have
alias vi=vim
For openoffice and the like, I need Japanese fonts. I use the substitute kochi fonts in /usr/ports/japanese/kochi-ttfonts.
I want to thank Matt Dougherty of tlug for his help and patience with this. He's the one who showed me that it was possible and gave me a few clues as to what I was missing. So, I'm assuming you're at least a bit familiar with NetBSD. If you go into /usr/pkgsrc you'll see that there is a Japanese section. However, what you need is in the inputmethod directory.
So, as root or with root privileges
cd /usr/pkgsrc/inputmethod
cd canna; make install clean
cd ../kinput2; make USE_WNN4=NO USE_WNN6=NO USE_SJ3=NO install clean
cd /usr/pkgsrc/x11/rxvt; make install clean
When done, you'll have a /usr/pkg/sbin/cannaserver as well as kinput2.
Cannaserver should be started as a daemon upon the next reboot. (You'll see that it also provides a script in /usr/pkg/local/rc.d)
Once again, set your variables in your .xinitrc.
export XMODIFIERS='@im=kinput2' LC_CTYPE=ja_JP.eucJP
kinput2 -canna &
NetBSD doesn't have ja_JP.UTF-8 in its default locales, so we will use EUC instead. It is possible to get UTF-8 working with NetBSD, but as of August 2005 it requires more research than I'm willing to give it.
We need a terminal capable of displaying Japanese with euc encoding, which is why we installed rxvt.
Once rxvt is installed, there is a message telling you that double-byte encoding is disabled by default. You then have to edit /usr/pkg/lib/X11/app-defaults/Rxvt. You will see several lines marked !Rxvt.multichar_encoding. One of them has eucj at the end of it. Take out the ! at the beginning of that line. (Also, put a ! at the beginning of the top line, which ends with noenc.)
After that, I was able to input Japanese in rxvt without difficulty.
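The edit to the Rxvt app-defaults file can also be done with sed. This is a sketch under assumptions: enable_eucj is a made-up name, and I am assuming the lines look roughly like "Rxvt.multichar_encoding: noenc" and "!Rxvt.multichar_encoding: eucj", which is how the file is described above. It reads the file on stdin and writes the changed version to stdout, so nothing is overwritten until you choose to copy it into place.

```shell
#!/bin/sh
# enable_eucj (hypothetical helper): filter an Rxvt app-defaults file.
# - the multichar_encoding line ending in noenc gets a single leading "!"
# - the multichar_encoding line ending in eucj loses its leading "!"
# All other lines pass through unchanged.
enable_eucj() {
    sed -e '/multichar_encoding.*noenc[[:space:]]*$/s/^!*/!/' \
        -e '/multichar_encoding.*eucj[[:space:]]*$/s/^!//'
}
```

For example, enable_eucj < /usr/pkg/lib/X11/app-defaults/Rxvt > /tmp/Rxvt.new, then look over /tmp/Rxvt.new and, if it looks right, copy it over the original as root.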
As with FreeBSD, I've found that I have to use vim rather than vi.
You may get an error when trying to start kinput2. It will say it can't load the app-defaults file and that XFILESEARCHPATH might be set incorrectly. If so, set the variable as shown below. This can also be added to .xinitrc. However, be sure to add it ABOVE the kinput2 -canna & line.
export XFILESEARCHPATH=/usr/pkg/lib/X11/app-defaults/Kinput2
Unfortunately, my favorite terminal, aterm, doesn't handle Japanese by default. Debian has an aterm-ml package, but it doesn't do on-the-spot conversion--that is, if one enters Japanese, a window appears underneath the term window. It works, but I find that annoying.
There is a patch to get aterm working with Japanese. The original site seems to have disappeared, however, I have a bzipped copy of the patch here.
The latest version of aterm is 1.0, however this patch is for the previous version, 0.4.2.
If you wish to use aterm with Japanese, you might choose to do it this way rather than use your distribution's version.
The 0.4.2 version can be downloaded from sourceforge.
Unzip the patch
bunzip2 aterm-0.4.2-ja.patch.bz2
Decompress and untar the aterm source.
tar -zxvf aterm-0.4.2.tar.gz
Move the patch into the newly created directory, cd into the directory and apply it.
mv aterm-0.4.2-ja.patch aterm-0.4.2/
cd aterm-0.4.2
patch -p1 < aterm-0.4.2-ja.patch
Run the configuration script, make and make install
./configure --enable-kanji --enable-xim --enable-fading
make
make install
The problem is that aterm can't handle unicode. I have to use it with euc. So, I call it with a script that changes the LC_CTYPE to euc. In FreeBSD, the script reads
#!/bin/sh
XMODIFIERS='@im=SCIM' LC_CTYPE=ja_JP.eucJP ${1+"$@"} &
I use fluxbox as my window manager, and have an entry in my .fluxbox/keys file so I can call aterm with this script with a simple key combination. As you become more experienced, you may wish to experiment with different xterms and window managers.
UTF-8 is rapidly becoming the default for Asian language in *nix. However, euc is still popular. Most browsers and email clients that can handle Asian languages are able to read both encodings.
Although most browsers can read it, you might have to manually select it. In opera, firefox and mozilla, it's in View => encodings. Although there is an autoselect for Japanese, it doesn't always work. If you get a page in Japanese that seems to be mojibake, then try different encodings, including Unicode (which isn't in the Japanese section) and one should work.
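The same problem turns up with plain text files in a terminal: a file saved in euc looks like mojibake in a UTF-8 terminal, and vice versa. A small sketch of converting one to the other with iconv follows. The function name euc_to_utf8 is invented for this example, and it assumes your iconv knows the EUC-JP encoding name, which is usual but not guaranteed.

```shell
#!/bin/sh
# euc_to_utf8 (hypothetical helper): convert EUC-JP text on stdin to UTF-8
# on stdout. iconv exits with an error if the input is not valid EUC-JP.
euc_to_utf8() {
    iconv -f EUC-JP -t UTF-8
}
```

Something like euc_to_utf8 < oldfile.txt > newfile.txt (both file names are placeholders) leaves the original untouched. If iconv complains about the input, the file is probably in some other encoding, such as Shift_JIS.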
Dark Prince from bsdnexus.com forums was kind enough to send the following. If you are creating a web page with Japanese in UTF-8, this code should make the viewer's browser use UTF-8 on the page. He tested this on apache, but it should work with any server that can use php. At the top of the page put
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
Again, this will only work if your server has php enabled. Many ISP provided web pages don't support php.
Then, code that tells the browser to read UTF-8. (Dark Prince says this may not be necessary, but it probably can't hurt.) This code should be between the <head> </head> tags
<meta HTTP-EQUIV="content-type" CONTENT="text/html;charset=UTF-8">
Having the meta tag alone will not be sufficient to make the viewer's browser use UTF-8; the php code is necessary.
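To see what the server is really sending, one can look at the response headers. The sketch below is only an illustration: charset_of_headers is a made-up name, and it just pulls the charset out of a Content-Type header fed to it on stdin.

```shell
#!/bin/sh
# charset_of_headers (hypothetical helper): read HTTP response headers on
# stdin and print the charset from the Content-Type header, if one is given.
charset_of_headers() {
    # Headers end in CRLF, so strip the carriage returns first.
    tr -d '\r' |
        sed -n 's/^[Cc]ontent-[Tt]ype:.*charset=\([^;[:space:]]*\).*$/\1/p' |
        head -n 1
}
```

With curl installed, curl -sI http://www.example.com/page.php | charset_of_headers (the URL is a placeholder) should print UTF-8 if the php header line is taking effect, and nothing if the server sends no charset.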
Lately, I've been playing with mrxvt. Again, there is no unicode support. If building from source, one needs to configure it as follows (in addition to any options you choose)
./configure --enable-xim --enable-cjk --encoding=eucj
If you are using FreeBSD, a patch I submitted has been accepted to add EUC input. When installing the port simply type
make -DWITH_JAPANESE install clean
Printing in *nix, of course, is non-trivial in itself. CUPS is making it easier when it works--when it doesn't work, one ends up spending a lot of time searching google, only to find many people with the same error messages and few solutions. I have a few simple CUPS solutions on another page. I know so little about this that I'm tempted to leave it out, but perhaps it may help some people.
So, first install OpenOffice. Depending upon distribution, this can be a chore in itself. FreeBSD for example, only has the development version available as a port that requires over 9 gigs to compile.
Both OpenOffice and CUPS have a great deal of documentation available, and the various problems and solutions I've come across are worth pages in themselves. So, I'm going to assume you have both of them working, and are able to print from OpenOffice. To get Japanese to print, however, takes a little more work. One needs the fonts (I use the kochi fonts mentioned above). Once they are installed, you have to use spadmin to add the fonts. In FreeBSD, they'll be in /usr/X11R6/lib/X11/fonts/TrueType. In some distributions, the path is the same save that it's called truetype. You may have to be root or have root privilege to run spadmin.
Once these fonts were added, Japanese printed out without problem. I'm afraid I can't give much more, if any information about this. However, before adding these fonts, although I was able to input Japanese in OpenOffice, it wouldn't print correctly.
I haven't been able to print out Japanese text from a web page in *nix, however, one can copy and paste both text files and web pages into OpenOffice and they will then print. An ugly hack, but at least it works.
Special thanks to Dr. Mike Fabian for all his help, as well as several other members of the Tokyo Linux Users Group (tlug).