The state of IMEs under Linux
June 22, 2017
Input Method Editors, or IMEs for short, let a user enter text in a larger, more complex character set using a standard keyboard. They are most commonly used for Chinese, Japanese, and Korean (CJK for short), so in order to type anything in one of those languages, you must have a working IME for it.
Quite obviously, especially considering the massive userbase of these languages, it’s crucial for IMEs to be quick and easy to set up, and to work in any program you decide to use.
The reality is quite far from this. While there are many problems that exist with IMEs under Linux, the largest one I believe is the fact that there’s no (good) standard for communicating with programs.
IMEs all have to implement a number of different interfaces, the 3 most common being XIM, GTK (2 and 3), and Qt (3, 4, and 5).
XIM is the closest we have to a standard interface, but it’s not very powerful: the pre-editing string doesn’t always work properly, it isn’t extensible to more advanced features, and it doesn’t work well under many window systems (in those I’ve tested, the input area always appears at the bottom of the window, instead of beside the text). It also has a number of other shortcomings that I have heard exist, but am not personally aware of (since I don’t use IMEs very often).
GTK and Qt interfaces are much more powerful, and work properly, but, as might be obvious, they only work with GTK and Qt. Any program using another widget toolkit (such as FLTK, or custom widget toolkits, which are especially prevalent in games) needs to fall back to the lesser XIM interface. Working around this is theoretically possible, but very difficult in practice, and requires GTK or Qt to be installed anyway.
IMEs also need to provide libraries for every version of GTK and Qt. If an IME is not updated to support the latest version, you won’t be able to use it in applications built against that version.
This, of course, adds quite a large amount of work for IME developers, and causes quite a problem for IME users, who may no longer be able to use the IME they prefer, simply because it has not been updated to support programs using a newer version of the toolkit.
I believe these issues make it very difficult for the Linux ecosystem to advance as a truly internationalized environment. First, they limit application developers who truly wish to honor international users to 2 GUI toolkits, GTK and Qt. Second, they force IME developers to constantly update their IMEs to support newer versions of GTK and Qt, which requires a large amount of effort and duplicated code, and as a result can lead to many bugs (and abandonment).
I believe fixing this issue would require a unified API that is toolkit agnostic. There are 2 obvious ways that come to mind.
- A library that an IME would provide that every GUI application would include
- A client/server model, where the IME is a server, and the clients are the applications
Option #1 would be the easiest and least painful to implement for IME developers, and I believe is actually the way GTK and Qt IMEs work. But there are also problems with this approach. If the IME crashes, the entire host application will crash with it. There could also only be one IME installed at a time (since every IME would need to provide the same library). The latter is not necessarily a big issue for most users, but on multi-user desktops, it can be.
Option #2 would require more work from the IME developers, juggling client connections and the like (although this could be abstracted with a library, similar to Wayland’s architecture). However, it would also mean a separate address space (so if the IME crashes, nothing else would crash as a direct result), the possibility of more than one IME being installed and used at once, and even the possibility of hotswapping IMEs at runtime.
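To make Option #2 a bit more concrete, here’s a minimal sketch of what the conversation between an application and an IME server might look like. Everything in it is an assumption made up for illustration: the line-delimited JSON messages, the `key`/`preedit`/`commit` field names, the `serve` and `ImeClient` names, and the two-entry romaji table. It also uses a thread and `socket.socketpair()` to stay self-contained, whereas a real server would be a separate process (which is where the crash isolation comes from), most likely listening on a Unix domain socket.

```python
import json
import socket
import threading

# Hypothetical romaji-to-kana table; a real IME would ship a full dictionary.
TABLE = {"ka": "か", "na": "な"}


def serve(conn: socket.socket) -> None:
    """IME side: read one JSON key event per line, write one JSON reply per line."""
    rfile = conn.makefile("r", encoding="utf-8")
    wfile = conn.makefile("w", encoding="utf-8")
    preedit = ""
    with conn, rfile, wfile:
        for line in rfile:  # the loop ends when the application closes its side
            preedit += json.loads(line)["key"]
            if preedit in TABLE:
                # The pre-edit string matched: commit the converted text.
                reply = {"commit": TABLE[preedit], "preedit": ""}
                preedit = ""
            else:
                # Nothing to commit yet: send the pre-edit string back for display.
                reply = {"commit": "", "preedit": preedit}
            wfile.write(json.dumps(reply) + "\n")
            wfile.flush()


class ImeClient:
    """Application side: forwards key presses and returns the server's reply."""

    def __init__(self, conn: socket.socket) -> None:
        self._conn = conn
        self._rfile = conn.makefile("r", encoding="utf-8")
        self._wfile = conn.makefile("w", encoding="utf-8")

    def send_key(self, key: str) -> dict:
        self._wfile.write(json.dumps({"key": key}) + "\n")
        self._wfile.flush()
        return json.loads(self._rfile.readline())

    def close(self) -> None:
        self._wfile.close()
        self._rfile.close()
        self._conn.close()


def demo() -> list:
    """Type "k" then "a" and collect what the server answers for each key."""
    app_side, ime_side = socket.socketpair()
    worker = threading.Thread(target=serve, args=(ime_side,), daemon=True)
    worker.start()
    client = ImeClient(app_side)
    replies = [client.send_key(key) for key in "ka"]
    client.close()
    worker.join(timeout=1)
    return replies
```

Calling `demo()` sends two key events. The first reply only carries a pre-edit string, and the second carries the committed character. The application never links against any IME code, so in principle any toolkit (or no toolkit at all) could speak a protocol like this.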
The problem with both of these options is the lack of standardization. While they can adhere to a standard for communicating with programs, everything else (configuration, dealing with certain common problems, and so on) is left to the IME developers. This is the exact problem we see with Wayland compositors.
However, there’s also a third option: combining the best of both worlds in the options provided above. This would mean having a standard server that loads a library providing the IME-related functions. If there are ever any major protocol changes, common issues, or anything of the like, the server can be updated while the IMEs are left intact. The library that it loads would be, of course, entirely configurable by the user, and the server could also host a number of common options for IMEs (and maybe also host a format for configuring IME-specific options), so if a user decides to switch IMEs, they wouldn’t need to completely redo their configuration.
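As a rough sketch of this third option, the server could own the common options and treat each IME as a swappable plug-in. All of the names here (`Engine`, `ImeServer`, `page_size`, the two toy engines and their tiny candidate lists) are hypothetical and only meant to show the shape of the idea. A real server would load the engines from shared libraries and speak a protocol to its clients, rather than having objects registered by hand.

```python
from typing import Dict, List, Optional, Protocol


class Engine(Protocol):
    """Hypothetical plug-in interface; a real one would need far more than this."""

    name: str

    def candidates(self, preedit: str) -> List[str]: ...


class ToyKanaEngine:
    name = "toy-kana"

    def candidates(self, preedit: str) -> List[str]:
        return {"ka": ["か", "カ", "課", "科"]}.get(preedit, [])


class ToyPinyinEngine:
    name = "toy-pinyin"

    def candidates(self, preedit: str) -> List[str]:
        return {"ma": ["妈", "马", "吗", "骂"]}.get(preedit, [])


class ImeServer:
    """Holds the shared configuration and delegates conversion to one engine."""

    def __init__(self) -> None:
        self._engines: Dict[str, Engine] = {}
        self._active: Optional[str] = None
        # Options that belong to the server, not to any single IME.
        self.common_options: Dict[str, int] = {"page_size": 3}

    def register(self, engine: Engine) -> None:
        self._engines[engine.name] = engine
        if self._active is None:
            self._active = engine.name

    def switch(self, name: str) -> None:
        """Hotswap the active engine; the common options are left untouched."""
        if name not in self._engines:
            raise KeyError(f"unknown engine: {name}")
        self._active = name

    def lookup(self, preedit: str) -> List[str]:
        if self._active is None:
            raise RuntimeError("no engine registered")
        found = self._engines[self._active].candidates(preedit)
        # The server applies the shared option, whichever engine is active.
        return found[: self.common_options["page_size"]]
```

For example, after registering both engines and setting `common_options["page_size"] = 2`, `lookup("ka")` returns two candidates from the kana engine, and after `switch("toy-pinyin")`, `lookup("ma")` still returns two candidates. The setting survives the switch because it lives in the server.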
Of course, the server would also be able to provide clients for XIM and GTK/Qt-based frontends, for programs that don’t use the protocol directly.
Since I’m not very familiar with IMEs, I haven’t yet started a project implementing this idea: there may be challenges with an approach like this that have already been discussed, but that I’m not aware of.
This is why I’m writing this post, to hopefully bring up a discussion about how we can improve the state of IMEs under Linux :) I would be very willing to work with people to properly design and implement a better solution for the problem at hand.