lkubuntu

A listing of random software, tips, tweaks, hacks, and tutorials I made for Ubuntu

The state of IMEs under Linux

Input Method Editors, or IMEs for short, let a user input text in another, more complex character set using a standard keyboard. They are most commonly used for Chinese, Japanese, and Korean (CJK for short). So in order to type anything in Chinese, Japanese, or Korean, you must have a working IME for that language.

Quite obviously, especially considering the massive userbase in these languages, it’s crucial for IMEs to be quick and easy to set up, and to work in any program you decide to use.

The reality is quite far from this. While there are many problems that exist with IMEs under Linux, the largest one I believe is the fact that there’s no (good) standard for communicating with programs.

IMEs all have to implement a number of different interfaces, the 3 most common being XIM, GTK (2 and 3), and Qt (3, 4, and 5).

XIM is the closest we have to a standard interface, but it’s not very powerful: the pre-editing string doesn’t always work properly, it isn’t extensible to more advanced features, and it doesn’t work well under many window systems (in those I’ve tested, the pre-edit text always appears at the bottom of the window, instead of beside the text being typed). It also has a number of other shortcomings that I have heard exist, but am not personally aware of (due to not being one who uses IMEs very often).

GTK and Qt interfaces are much more powerful, and work properly, but, as might be obvious, they only work with GTK and Qt. Any program using another widget toolkit (such as FLTK, or custom widget toolkits, which are especially prevalent in games) needs to fall back to the lesser XIM interface. Working around this is theoretically possible, but very difficult in practice, and requires GTK or Qt to be installed anyway.

IMEs also need to provide libraries for every version of GTK and Qt. If an IME is not updated to support the latest version, you won’t be able to use the IME in applications using the latest version of GTK or Qt.

This, of course, adds quite a large amount of work to IME developers, and causes quite a problem with IME users, where a user will no longer be able to use an IME they prefer, simply because it has not been updated to support programs using a newer version of the toolkit.

I believe these issues make it very difficult for the Linux ecosystem to advance as a truly internationalized environment. First, it limits application developers who truly wish to honor international users to 2 GUI toolkits, GTK and Qt. Secondly, it forces IME developers to constantly update their IMEs to support newer versions of GTK and Qt, which requires a large amount of effort and duplicated code, and as a result can lead to many bugs (and abandonment).

I believe fixing this issue would require a unified API that is toolkit agnostic. There are 2 obvious ways that come to mind.

  1. A library that an IME would provide that every GUI application would include
  2. A client/server model, where the IME is a server, and the clients are the applications

Option #1 would be the easiest and least painful to implement for IME developers, and I believe it is actually the way GTK and Qt IMEs work. But there are also problems with this approach. If the IME crashes, the entire host application will crash as well. There could also only be one IME installed at a time (since every IME would need to provide the same library). The latter is not necessarily a big issue for most users, but on multi-user desktops, it can be.

Option #2 would require more work from the IME developers, juggling client connections and the like (although this could be abstracted with a library, similar to Wayland’s architecture). However, it would also mean a separate address space (therefore, if the IME crashes, nothing else would crash as a direct result of this), the possibility of more than one IME being installed and used at once, and even the possibility of hotswapping IMEs at runtime.
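To make option #2 a bit more concrete, here is a minimal sketch in Python of what the client/server split could look like. Everything in it is an assumption made for illustration: the newline-delimited JSON messages, the `key`/`commit` request types, the `ImeClient` class, and the toy conversion table are all made up, and a real IME would be far more involved.

```python
import json
import socket
import threading

# Hypothetical toy table; a real IME would use a proper dictionary engine.
TOY_TABLE = {"nihao": "你好", "ni": "你", "hao": "好"}


def serve_one_client(conn: socket.socket) -> None:
    """IME server side: answer newline-delimited JSON requests from one app."""
    reader = conn.makefile("r", encoding="utf-8")
    writer = conn.makefile("w", encoding="utf-8")
    preedit = ""  # per-connection state lives in the IME process
    with conn, reader, writer:
        for line in reader:
            request = json.loads(line)
            if request["type"] == "key":
                preedit += request["char"]
                reply = {"type": "preedit", "text": preedit}
            elif request["type"] == "commit":
                reply = {"type": "commit", "text": TOY_TABLE.get(preedit, preedit)}
                preedit = ""
            else:
                reply = {"type": "error", "text": "unknown request"}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


class ImeClient:
    """Application side: any toolkit could speak this without linking IME code."""

    def __init__(self, conn: socket.socket) -> None:
        self._conn = conn
        self._reader = conn.makefile("r", encoding="utf-8")
        self._writer = conn.makefile("w", encoding="utf-8")

    def _call(self, request: dict) -> dict:
        self._writer.write(json.dumps(request) + "\n")
        self._writer.flush()
        return json.loads(self._reader.readline())

    def press(self, char: str) -> str:
        return self._call({"type": "key", "char": char})["text"]

    def commit(self) -> str:
        return self._call({"type": "commit"})["text"]

    def close(self) -> None:
        self._writer.close()
        self._reader.close()
        self._conn.close()


def demo() -> str:
    # A socket pair stands in for the connection between an app and the IME.
    app_side, ime_side = socket.socketpair()
    worker = threading.Thread(target=serve_one_client, args=(ime_side,), daemon=True)
    worker.start()
    client = ImeClient(app_side)
    for char in "nihao":
        client.press(char)
    committed = client.commit()
    client.close()  # the server loop ends when it sees end-of-file
    worker.join(timeout=1)
    return committed


if __name__ == "__main__":
    print(demo())
```

The server runs in a thread here only to keep the sketch self-contained. The point of option #2 is that it would be a separate process, so a crash in the IME would only break the connection rather than take the application down with it.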

The problem with both of these options is the lack of standardization. While they can adhere to a standard for communicating with programs, things like configuration and dealing with certain common problems are all left to the IME developers. This is the exact problem we see with Wayland compositors.

However, there’s also a third option: combining the best of both worlds in the options provided above. This would mean having a standard server that will then load a library that provides the IME-related functions. If there are ever any major protocol changes, common issues, or anything of the sort, the server can be updated while the IMEs are left intact. Which library it loads would be, of course, entirely configurable by the user, and the server could also host a number of common options for IMEs (and maybe also host a format for configuring specific options for IMEs), so if a user decides to switch IMEs, they wouldn’t need to completely redo their configuration.
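As a rough sketch of this third option (in Python, and again entirely hypothetical): a standard server owns the shared configuration and loads whichever engine the user picked. The names `ImeServer`, `SharedConfig`, `Engine`, and the two toy engines are assumptions for illustration. In a real design the engines would more likely be shared libraries loaded at runtime, not Python classes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Protocol


class Engine(Protocol):
    """The only thing an IME has to provide in this sketch."""

    def convert(self, preedit: str) -> str: ...


@dataclass
class SharedConfig:
    """Options hosted by the server, so they survive switching engines."""

    toggle_key: str = "ctrl+space"
    candidate_count: int = 5
    # Hypothetical slot for engine-specific options, keyed by engine name.
    engine_options: Dict[str, Dict[str, str]] = field(default_factory=dict)


class UpperEngine:
    """Toy engine: 'converts' text by upper-casing it."""

    def convert(self, preedit: str) -> str:
        return preedit.upper()


class ReverseEngine:
    """Toy engine: 'converts' text by reversing it."""

    def convert(self, preedit: str) -> str:
        return preedit[::-1]


class ImeServer:
    """Standard server: speaks the protocol, delegates conversion to an engine."""

    def __init__(self, config: SharedConfig) -> None:
        self.config = config
        self._factories: Dict[str, Callable[[], Engine]] = {}
        self._engine: Optional[Engine] = None

    def register(self, name: str, factory: Callable[[], Engine]) -> None:
        self._factories[name] = factory

    def use(self, name: str) -> None:
        # Swapping engines at runtime leaves the shared config untouched.
        self._engine = self._factories[name]()

    def commit(self, preedit: str) -> str:
        if self._engine is None:
            raise RuntimeError("no engine selected")
        return self._engine.convert(preedit)


if __name__ == "__main__":
    server = ImeServer(SharedConfig())
    server.register("upper", UpperEngine)
    server.register("reverse", ReverseEngine)
    server.use("upper")
    print(server.commit("abc"))
    server.use("reverse")
    print(server.commit("abc"))
```

Since the engine only implements `convert`, protocol changes would land in the server, and the engines would not need to be touched.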

Of course, the server would also be able to provide clients for XIM and GTK/Qt-based frontends, for programs that don’t use the protocol directly.

I’m not very familiar with IMEs, so I haven’t yet started a project implementing this idea: there may be challenges with an approach like this that have already been discussed, but that I’m not aware of.

This is why I’m writing this post, to hopefully bring up a discussion about how we can improve the state of IMEs under Linux :) I would be very willing to work with people to properly design and implement a better solution for the problem at hand.

Follow up on the non-windowing display server idea

Note: I’m sorry, this post is a bit of a mess.

I wrote a post 2 days ago, outlining an idea for a non-windowing display server — a layer that wayland compositors (or other programs) could be built upon. It got quite a bit more attention than I expected, and there were many responses to the idea.

Before I go on, I wish to address a few things that weren’t clear in the original post:

The first being that I am not an ubuntu developer, and am in no way associated with canonical. I am only an ubuntu member :) Even though I don’t use ubuntu personally, I wish to improve the user experience of those who do.

Second is a point that I did not address clearly in the original post: One of the main reasons for this idea is to enable users to modify the video resolution, gamma ramp, orientation, brightness, etc. DRM provides an API for doing these operations. However, AFAIK, you cannot run modesetting operations on a virtual terminal that is already running an application that has called video modesetting operations. In other words, you cannot run a DRM-based application on an already-running wayland server in order to run a modesetting operation. So, AFAIK, the only way to enable an application to do this is to write a sort of “proxy” server that handles requests, and then runs the video modesetting operations.

Since I am currently confusing myself re-reading this, I’ll try to provide a diagram in order to explain what I mean.

If you want to change the gamma ramp, for example, this is impossible:

[Diagram: drm_client_wayland]

So with the display server acting as a proxy of sorts, it becomes possible:

[Diagram: drm_client_display_server]

This is also why I believe that having a server over a shared library is crucial. A shared library would allow for abstraction over multiple backends; however, it doesn’t allow communication with more than one application. A wayland compositor can access all of the functions, yes, but wayland clients cannot.
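Here is a small Python sketch of the “proxy” idea, under heavy assumptions: `FakeDevice` stands in for the real display (no actual DRM calls are made), and `DisplayProxy`, the `set_mode`/`set_gamma` operation names, and the client names are all invented for illustration. The only thing it tries to show is that one process owns the device while any number of clients can ask it to run modesetting operations.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class FakeDevice:
    """Stand-in for the real display device; no DRM calls are made here."""

    width: int = 1920
    height: int = 1080
    gamma: float = 1.0


class DisplayProxy:
    """Single owner of the device; clients submit requests instead of touching it."""

    def __init__(self, device: FakeDevice) -> None:
        self._device = device
        self._requests: queue.Queue = queue.Queue()
        self.log = []  # (client, operation, result), in the order applied
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, client: str, op: str, **args) -> str:
        """Called by any client; blocks until the proxy has handled the request."""
        reply: queue.Queue = queue.Queue(maxsize=1)
        self._requests.put((client, op, args, reply))
        return reply.get()

    def stop(self) -> None:
        self._requests.put(None)
        self._thread.join()

    def _run(self) -> None:
        # Requests are applied one at a time, so clients never race each other.
        while True:
            item = self._requests.get()
            if item is None:
                return
            client, op, args, reply = item
            result = self._apply(op, args)
            self.log.append((client, op, result))
            reply.put(result)

    def _apply(self, op: str, args: dict) -> str:
        try:
            if op == "set_mode":
                width, height = int(args["width"]), int(args["height"])
                self._device.width, self._device.height = width, height
            elif op == "set_gamma":
                self._device.gamma = float(args["gamma"])
            else:
                return "error: unknown operation"
        except (KeyError, ValueError):
            return "error: bad arguments"
        return "ok"


if __name__ == "__main__":
    device = FakeDevice()
    proxy = DisplayProxy(device)
    print(proxy.request("compositor", "set_mode", width=800, height=600))
    print(proxy.request("gamma-tool", "set_gamma", gamma=0.8))
    print(device)
    proxy.stop()
```

In this sketch, the compositor and a separate gamma tool both go through the same proxy, which is exactly what a shared library linked into just one of them could not offer.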

The third clarification is that this is not only meant for wayland. Though this is the main “client” I have in mind for this server, it isn’t restricted to only wayland. The idea is that it could be used by anything, for example, as one response pointed out, xen virtualization. Or, in my case, I actually want to write clients that use this server directly, without even using a windowing server like wayland (yes, I actually have a good reason for wanting this XD ). In other words, though I believe that the group that would use this the most would be wayland users (hence why I wrote the original post tailored towards this), it isn’t only meant for wayland.

There were a few responses saying that wayland intentionally doesn’t support this, not because of the reason I originally suspected (it being “only” a windowing protocol), but because one of wayland’s main goals is to let the compositor have full control over the display and make sure that there is no flickering, tearing, etc., which changing the video resolution (or some other modesetting operations) would undoubtedly cause. I understand and respect this; however, I still want to be able to change the resolution or gamma ramp (etc.) myself, and suffer the consequences of the momentary flickering or whatever else. Again though, I respect wayland’s decision in this aspect, so my proposal is now this: make this an optional backend for wayland compositors. Rather than building wayland compositors on top of this (my original proposal, meant to help simplify the stack), have it as an option, so that if users wish to have the video modesetting (etc.) capabilities, they can use this backend instead.

A pretty large concern that many people (including myself) have is performance. Having an extra server on the stack would definitely have an impact on performance, but the question is how much.

So with this being said, going forwards, I am currently working on implementing a proof-of-concept prototype in order to have a better sense of what it entails, especially in regards to performance. The prototype will be anything but production-ready, but hopefully will at least work … maybe XD .

Idea: Non-windowing display server

For the TL;DR folk who are concerned with the title: It’s not an alternative to wayland or X11. It’s a layer that wayland compositors (or other programs) can use.

As a quick foreword: I’m still a newbie in this field. While I try my best to avoid inaccuracies, there might be a few things I state here that are wrong. Feel free to correct me!

Wayland is mainly a windowing protocol. It allows clients to draw windows (or, as the wayland documentation puts it, “surfaces”), and receive input from those surfaces. A wayland server (or “compositor”) has the task of drawing these surfaces, and providing the input to the clients. That is the specification.

However, where does a compositor draw these surfaces to? How does the compositor receive input? It has to provide many backends for various methods of drawing the composited surface. For example, the weston compositor has support for drawing the composited surface using 7 different backends (DRM, Linux Framebuffer, Headless [a fake rendering device], RDP, Raspberry Pi, Wayland, and X11). The amount of work put into making these backends work must be incredible, which is exactly where the problem lies: it’s arguably too much work for a developer to put in if they want to make a new compositor.

That’s not the only issue though. Another big problem is that there is then no standard way to configure the display. Say you wanted a wayland compositor to change the video resolution to 800×600. The only way to do that is to use a compositor-specific extension to the protocol, since the protocol, AFAIK, has no method for changing the video resolution — and rightfully so. Wayland is a windowing protocol, not a display protocol.

My idea is to create a display server that doesn’t handle windowing. It handles display-related things, such as drawing pixels on the screen, changing video mode, etc… Wayland compositors and other programs that require direct access to the screen could then use this server and trust that the server will take care of everything display-related for them.

I believe that this would allow for much simpler code, and add a good deal more power and flexibility.
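To illustrate what “simpler code” could mean for a compositor, here is a minimal Python sketch. The `DisplayBackend` interface, its `set_mode`/`present` methods, and the `HeadlessBackend` are hypothetical names chosen for this example and not an existing API. The compositor-side function only talks to the interface and never needs to know which backend is behind it.

```python
from abc import ABC, abstractmethod
from typing import Tuple


class DisplayBackend(ABC):
    """What the display server would expose; windowing is not its concern."""

    @abstractmethod
    def set_mode(self, width: int, height: int) -> None:
        """Change the video mode."""

    @abstractmethod
    def present(self, pixels: bytes) -> None:
        """Show one full frame of RGBA pixels."""


class HeadlessBackend(DisplayBackend):
    """Fake device that keeps the last frame in memory (handy for testing)."""

    def __init__(self) -> None:
        self.width = 0
        self.height = 0
        self.frame = b""

    def set_mode(self, width: int, height: int) -> None:
        if width <= 0 or height <= 0:
            raise ValueError("mode dimensions must be positive")
        self.width, self.height = width, height

    def present(self, pixels: bytes) -> None:
        expected = self.width * self.height * 4  # 4 bytes per RGBA pixel
        if len(pixels) != expected:
            raise ValueError(f"expected {expected} bytes, got {len(pixels)}")
        self.frame = bytes(pixels)


def draw_solid_frame(
    backend: DisplayBackend, width: int, height: int, rgba: Tuple[int, int, int, int]
) -> None:
    """Compositor-side code: no DRM, framebuffer, or X11 specifics in sight."""
    backend.set_mode(width, height)
    backend.present(bytes(rgba) * (width * height))


if __name__ == "__main__":
    backend = HeadlessBackend()
    draw_solid_frame(backend, 800, 600, (255, 0, 0, 255))
    print(backend.width, backend.height, len(backend.frame))
```

A DRM or X11 backend would implement the same two methods inside the display server, so a new compositor would not have to reimplement each of them.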

To give a more graphic description (forgive my horrible diagramming skills):

Current Stack:

[Diagram: wayland_current]

Proposed Stack:

[Diagram: wayland_new]

I didn’t talk about the input server, but it’s the same idea as the display server: Have a server dedicated to providing input. Of course, if the display server uses something like SDL as the backend, it may also have to provide the input server, because SDL, AFAIK, doesn’t allow a program to access the input of another program.

This is an idea I have toyed around with for some time now (ever since I tried writing my own wayland compositor, in fact! XD), so I’m curious as to what people think of it. I would be more than happy to work with others to implement this.