lkubuntu

A listing of random software, tips, tweaks, hacks, and tutorials I made for Ubuntu

The state of IMEs under Linux

Input Method Editors, or IMEs for short, let a user type text in a larger, more complex character set using a standard keyboard. They are most commonly used for Chinese, Japanese, and Korean (CJK for short): in order to type anything in those languages, you must have a working IME for that language.

Quite obviously, especially considering the massive userbase of these languages, it’s crucial for IMEs to be quick and easy to set up, and to work in any program you decide to use.

The reality is quite far from this. While there are many problems that exist with IMEs under Linux, the largest one I believe is the fact that there’s no (good) standard for communicating with programs.

IMEs all have to implement a number of different interfaces, the 3 most common being XIM, GTK (2 and 3), and Qt (3, 4, and 5).

XIM is the closest thing we have to a standard interface, but it’s not very powerful: the pre-edit string doesn’t always work properly, the protocol isn’t extensible to more advanced features, and it doesn’t behave well under many window systems (in those I’ve tested, the candidate window always appears at the bottom of the window instead of beside the text). It reportedly has a number of other shortcomings as well that I’m not personally aware of (since I don’t use IMEs very often).

GTK and Qt interfaces are much more powerful, and work properly, but, as might be obvious, they only work with GTK and Qt. Any program using another widget toolkit (such as FLTK, or the custom toolkits that are especially prevalent in games) has to fall back to the lesser XIM interface. Working around this is theoretically possible, but very difficult in practice, and requires GTK or Qt to be installed anyway.

IMEs also need to provide libraries for every version of GTK and Qt. If an IME is not updated to support the latest version, you won’t be able to use it in applications built against that version of GTK or Qt.

This, of course, adds a large amount of work for IME developers, and causes real problems for IME users: a user can suddenly be unable to use the IME they prefer, simply because it hasn’t been updated to support programs using a newer version of the toolkit.

I believe these issues make it very difficult for the Linux ecosystem to advance as a truly internationalized environment. First, they limit application developers who truly wish to honor international users to two GUI toolkits, GTK and Qt. Secondly, they force IME developers to constantly update their IMEs to support newer versions of GTK and Qt, which requires a large amount of effort and duplicated code, and as a result leads to many bugs (and abandonment).


I believe fixing this issue would require a unified API that is toolkit-agnostic. There are two obvious approaches that come to mind.

  1. A library that an IME would provide that every GUI application would include
  2. A client/server model, where the IME is a server, and the clients are the applications

Option #1 would be the easiest and least painful to implement for IME developers, and I believe it is actually the way GTK and Qt IMEs work. But there are problems with this approach: if the IME crashes, it takes the entire host application down with it, and only one IME could be installed at a time (since every IME would need to provide the same library). The latter is not necessarily a big issue for most users, but on multi-user desktops it can be.

Option #2 would require more work from IME developers, juggling client connections and the like (although this could be abstracted away with a library, similar to Wayland’s architecture). However, it would also mean a separate address space (so if the IME crashes, nothing else crashes as a direct result), the possibility of more than one IME being installed and used at once, and even the possibility of hot-swapping IMEs at runtime.

The problem with both of these options is the lack of standardization. While either could adhere to a standard for communicating with programs, everything else (configuration, handling of common problems, and so on) is left to the individual IME developers. This is the exact problem we see with Wayland compositors.

However, there’s also a third option: combining the best of both worlds. This would mean having a standard server that loads a library providing the IME-related functions. If there are ever any major protocol changes, common issues, or anything of the like, the server can be updated while the IMEs are left intact. The library it loads would, of course, be entirely configurable by the user, and the server could also host a number of common options for IMEs (and maybe a format for configuring IME-specific options), so if a user decides to switch IMEs, they wouldn’t need to completely redo their configuration.

Of course, the server would also be able to provide clients for XIM and GTK/Qt-based frontends, for programs that don’t use the protocol directly.

Since I’m not very familiar with IMEs, I haven’t yet started a project implementing this idea: there may be challenges with an approach like this that have already been discussed, but that I’m not aware of.

This is why I’m writing this post, to hopefully bring up a discussion about how we can improve the state of IMEs under Linux :) I would be very willing to work with people to properly design and implement a better solution for the problem at hand.

About the “Mir hate-fest”

If you’ve been following the news, you’ll probably know about Ubuntu dropping Unity. This probably came as a surprise to many of us, given the many years of effort invested in Unity 8, and how close it was to completion.

It was speculated that, since Unity 8 is now dropped, Mir would also be dropped. However, it looks like it will still be developed, but not necessarily for desktop usage.

But speaking of that post, I found it quite unfortunate how Mark talked about “Mir-hating”, reducing it to an irrational hatred with very little rational grounds:

“It became a political topic as irrational as climate change or gun control, where being on one side or the other was a sign of tribal allegiance”

“[…] now I think many members of the free software community are just deeply anti-social types who love to hate on whatever is mainstream”

“The very same muppets would write about how terrible it was that IOS/Android had no competition and then how terrible it was that Canonical was investing in (free software!) compositing and convergence”

Now, in all fairness, I haven’t been involved enough in the community to know much about the so-called “Mir hate-fest”. It is very possible that I missed the irrational, tribal hatred he was talking about. However, the “hatred” I have seen falls mainly into two categories:

  1. People (like me) who were worried about Mir splitting up the Linux desktop, effectively forcing any Linux user who cares about graphical performance to be under Canonical’s control.
  2. People worrying about architectural problems in the codebase (or other code-related issues).

Both of these, IMO, are quite valid concerns, and should be allowed to be voiced, without being disregarded as “irrational hate”.

I’ll admit, my original post on this topic was pretty strong (and admittedly not very well articulated either). However, I believe it’s important, especially in a free software community, to be able to voice our opinions about projects and decisions. Software circles that stifle open discussion (I’ve seen this especially in various proprietary software communities) honestly have a terrible atmosphere (at least IMO), and the community tends to suffer as a whole, because companies feel they have power over their users and can do anything they want in hopes of gaining more profit.

In Mark’s defense, I agree that it is very important to stay respectful and constructive, and I apologize for the tone of my first post. I haven’t seen many other rude comments towards Mir, but as I said, I could be wrong. Having a lot of rude comments directed at your software is very difficult for those behind the project to handle, and usually doesn’t lead anywhere constructive anyway.

But I think that saying something along the lines of “anyone who disagrees that Mir is a good project is an idiot” (“I agree, it’s a very fast, clean and powerful graphics composition engine, and smart people love it for that.”, alongside the quotes I mentioned above) is very counterproductive to maintaining a good free software ecosystem.

My bottom line for this post is: I believe it’s vital to allow healthy discussion of projects within the free software community, but in a respectful and constructive manner, and I believe this point is especially important for large projects used by many people (such as Ubuntu, the Linux kernel, etc.).

The Perfect C Array Library

I love C. And I loathe C++.

But there’s one thing I like about C++: The fact that I don’t have to write my own dynamic array libraries each time I try to start a project.

Of course, many libraries exist for working with dynamic arrays in C: GLib, Eina, DynArray, etc. But I wanted something as easy to use as C++’s std::vector, with the same performance and memory usage.

By the way, I am not talking about algorithmic performance. I’m writing this assuming the algorithms are identical (i.e. I’m writing purely about implementation differences).

There are a few problems with the performance and memory usage of the aforementioned libraries, the major one being that the element size is stored as a structure member. That means an extra 4-8 bytes per array, and a variable that must constantly be re-read (which costs many optimization opportunities). While this may not sound too bad (and in the grand scheme of things, probably isn’t), it is undeniably less efficient than C++.

This isn’t the only problem; there are other missed optimization opportunities in the function-based (as opposed to macro-based) variants: for example, calling functions for tiny operations, or calling memcpy for types that fit within registers.

All of this might seem like splitting hairs, and it probably is. But knowing that C++ can be faster, more memory efficient, and less bothersome to code in than C is not a thought I like very much. So I wanted to try to level the playing field.

It took a rather long stretch of sporadic work to create my very own “Perfect C Array Library” that, I thought, fulfilled my requirements.

First, let’s look at some example code using it:

array(int) myarray = array_init();
array_push(myarray, 5);
array_push(myarray, 6);

for (int i = 0; i < myarray.length; i++) {
    printf("%i\n", myarray.data[i]);
}

array_free(myarray);

Alright, it might be a tiny bit less pretty than C++. But hey, this is good enough for me.

In terms of performance and memory issues, I fixed the issues I wrote above. So in theory, it should be just as fast as C++, right?

Turns out I missed one issue: cache misses. In my mind, if everything was written as a macro, it would, in theory, be faster than functions. I was wrong. Inlining large portions of code can cause instruction cache misses, which quite negatively impact performance.

So, as far as I can see, it is impossible to write a set of array functions for C that will be as fast and easy to use as C++’s std::vector. But please correct me if I’m wrong!

With that being said, this implementation is the most efficient I’ve been able to write so far, so let me show you the idea behind it:

#define array(type)  \
  struct {           \
      type* data;    \
      size_t length; \
  }

#define array_init() \
  {                  \
      .data = NULL,  \
      .length = 0    \
  }

#define array_free(array) \
  do {                    \
      free(array.data);   \
      array.data = NULL;  \
      array.length = 0;   \
  } while (0)

#define array_push(array, element)                \
  do {                                            \
      array.data = realloc(array.data,            \
                           sizeof(*array.data) *  \
                             (array.length + 1)); \
      array.data[array.length] = element;         \
      array.length++;                             \
  } while (0)

The magic is in sizeof(*array.data). For some reason I never knew this was legal in C, but it does exactly what it says: it yields the size of the element type, which eliminates the need to store it in the struct.

The code above is vastly oversimplified to demonstrate the idea. It’s very incomplete, algorithmically slow, and unsafe. But the idea is there.

To summarize, I am not aware of any way to write a completely zero-compromise array library in C, but the code above shows the closest I’ve come to that.


P.S. There is one problem I am aware of with this method:

array(int) myarray;
array(int) myarray1 = myarray; /* 'error: invalid initializer' */

There are two ways to get around this:

memcpy(&myarray1, &myarray, sizeof(myarray));
/* or */
myarray1 = *((typeof(myarray1)*)&myarray); /* requires GNU C */

Both of which should, under a decent optimization level, result in the same assembly.

How to fix lockdown error -5 on usbmuxd

When I plug my iPhone in (iOS 9), usbmuxd (and hence libimobiledevice-related software) runs just fine, but when I plug my older iPad in (iOS 5), I get this error:

[18:51:51.199][3] Connected to v2.0 device 1 on location 0x20009 with serial number ("random" string of numbers)
[18:51:51.867][1] preflight_worker_handle_device_add: ERROR StartSession failed on device (same "random" string of numbers), lockdown error -5

I spent quite a long time trying to fix this issue. I eventually tracked down this file, which contains this line:

LOCKDOWN_E_SSL_ERROR = -5

So it’s an SSL-related issue, and it happens with an older device but not a newer one … I figured it might be the certificates, and, since I hadn’t connected my iPad to the internet for a while, decided to give that a shot. Same issue. But hey, it was fun being able to use Cydia! (at the time of writing, there is no public jailbreak for iOS >= 9.2)

After a while of playing with certificates on my host computer (with the same lack of luck), I decided to try compiling both libimobiledevice and usbmuxd with debug flags.

Right off the bat, I got my first (and only) compiler error: undefined reference to 'SSLv3_method' (or something like that).

Turns out that OpenSSL disables SSLv3 by default, for security reasons. After changing SSLv3_method to SSLv23_method (as per this patch) and enabling debug flags, I found that, indeed, the problem lay exactly where the compiler error was:

19:11:34 idevice.c:722 idevice_connection_enable_ssl(): ERROR in SSL_do_handshake: SSL_ERROR_SSL

Thankfully, libimobiledevice supports GnuTLS as a backend, which, also thankfully, still supports SSLv3! There were a few compiler errors (I’m guessing nobody had used the GnuTLS backend in some time), but once it was built, I finally managed to make usbmuxd (and libimobiledevice-using software) recognize my iPad!

EDIT: libimobiledevice fixed the compiler errors upstream.

So, if you encounter this error, here’s how you fix it:

git clone https://github.com/libimobiledevice/libimobiledevice.git
cd libimobiledevice
./autogen.sh
./configure --prefix=/usr --disable-openssl
make
sudo make install

I hope this can help someone else with this issue! If you have any problems, feel free to leave a comment and I’ll try my best to help!

openlux 0.2.1 + roadmap

A few months ago, I wrote openlux as an open-source alternative to f.lux, similar to redshift, but different in goal and execution.

openlux 0.2.1 isn’t a particularly exciting release, but it fixes problems on graphics cards that support 10-bit displays … something that openlux claimed to support but, as it turned out, didn’t. After nvidia’s latest driver update, I found that out the hard way (the screen went really weird, and at one point totally black … not fun haha).

I also changed the logo slightly; I think it looks much nicer now! Here it is:

openlux.svg

Now about the roadmap, I’m planning on adding these features later on:

  • Two new backends, DRM and RandR, to support both direct tty (DRM), and more graphics drivers (RandR) that don’t properly support XF86VM (what openlux currently uses as its main backend)
  • A GUI for automating openlux via cron (and also for using it directly)
  • A GUI for iOS that will support both scheduling and using it directly (similar to the feature above, but iOS doesn’t support cron)
  • iOS 9 support (apparently iOS 9 works in a different way for this … sadly, my iPhone isn’t jailbroken yet, so I can’t test it yet)


Explaining Javascript Closures Simply

Please note that this article might not be entirely accurate; it is simply meant as a very simple explanation that will set you on the right path to understanding closures. If you wish for a technically correct answer, see this question on Stack Overflow.

Javascript closures tend to confuse people who aren’t used to Javascript (don’t feel bad if you’re in this category; Javascript is a weird language, and I’m certainly in this category as well).

However, they’re really simple; many guides just tend to make them seem more complicated than they really are. Stack Overflow contains a wonderful array of answers that ended up leaving me more confused than when I started.

So, I’ll give the answer that I believe would have made it clearest to me: A closure in Javascript inherits the stack of the function that declared it. The stack is an object, in the sense that it only goes away when there are no more references.

If you understand what that means, then great! Hopefully this helped! If not, I’ll explain :)

First, a closure in Javascript is basically a ‘lambda’, a sort of ‘anonymous function’. If these terms confuse you (don’t feel stupid if they do, I think we all suck with terms), here’s what one looks like:

setTimeout(/* This is a closure! --> */ function() {
    alert("Hello world!");
} /* <-- End of the closure */, 1000);

You’ve probably seen these everywhere in Javascript.

Now, the “stack” (in Javascript) is basically the local variables in a function. For example:

function hello() {
    var first = "Hello";
    var space = " ";
    var second = "world!";

    var message = first + space + second;

    /* first, space, second, and message are all part of the "stack" */
}

Technically, every function in Javascript is a closure, and therefore every function inherits the stack of the function that declared it, not the stack of the function that calls it.

In other words:

function hello() {
    var message = "Hello world!";

    return function() {
        alert(message);
    };
}

var message = "Farewell planet";

var closure = hello();
closure(); /* This will output "Hello world!", because it inherits hello's stack */

But doesn’t the stack end once hello() has finished running? Nope. The stack is like an object (it might actually be one, I’m not sure), in the sense that it lives as long as there is a reference to it. The closure holds a reference to it (every function has a reference to its stack), and therefore the stack is still alive.

What if a closure makes a new variable … will that change be reflected in the stack of the old function? Nope: every function makes a new stack that inherits the parent stack (the stack of the parent function that declared this function). For example:

function hello() {
    var first = "Hello";
    var second = "world!";

    [
        function() {
            first = "Goodbye";      // Modifies 'first' in the parent stack

            var second = "planet!"; // Notice the "var" .. this makes a new variable
                                    // in the _current_ stack.
                                    //
                                    // Any future reference to 'second' will refer
                                    // to the variable in the _current_ stack, not
                                    // the variable in the parent stack.

            second = "galaxy!";

            var message = "Hi";     // This was declared in the current stack, message
                                    // will not be declared in the parent stack.
        },
        function() {
            console.log(first);     // "Goodbye"
            console.log(second);    // "world!"
            console.log(message);   // ReferenceError: message isn't declared
        }
    ].forEach(function(x) {x()});   // Runs these functions, one after the other
}

That’s about it!

I hope this helped you to understand how closures worked in Javascript. If you’re still confused, feel free to leave a comment!

I’m not an expert on this, and certainly this guide isn’t meant as an expert technical analysis of how closures work; it’s mainly just meant as a (hopefully) simple bird’s-eye view of how they work. As I mentioned a few times before, you can find a more detailed explanation on Stack Overflow.

Download videos online with Inject2Download

… and I swear this blog hasn’t been hijacked. Apologies for the clickbait title; in all honesty, I’m not sure how else to word it … suggestions? :)

Most video download scripts I’ve seen generally tend to rely on some download server that does magic behind the scenes, which may or may not work, among other issues (ads, loading times, etc.). I also don’t like the fact that I can’t really know what’s going on behind the scenes (one of the reasons I use free software).

Thing is, many websites that embed videos (and don’t host content with their own proprietary players, i.e. not youtube, vimeo, wistia, dailymotion, etc.) have the direct video URL somewhere in the javascript (plainly readable or obfuscated).

Since most websites use the same player engines (jwplayer, flowplayer, or video.js), all that needs to be done is to inject code into those engines when the page loads, capture the video URL, and somehow share it with the user.

I originally started writing this for Chrome, until I found out that Chrome doesn’t support injecting code directly after a library is loaded (which Firefox does), so I ended up making this Firefox-only. I tried to sort of “race” the page so the code would run as soon as possible after the libraries were loaded, but that only worked a fraction of the time, and it slowed down the webpage heavily.

Inject2Download is a user script, so you’ll need Greasemonkey to run it. If there’s enough interest, I’ll make it a proper extension later :)

Download and install it here: https://greasyfork.org/en/scripts/18671-inject2download

Github: https://github.com/AnonymousMeerkat/inject2download

What you’ll (hopefully) notice is that when you go to a website that hosts a video player, a little box will pop up at the top-left corner of the page (you might have to scroll up to see it), containing one or more URLs.

Some websites host ads on the player, and it’s sometimes (although rarely, thankfully) a bit difficult for the script to know which is an ad, and which is a legitimate video, so just use common sense and avoid URLs with “ads” or other suspicious text as part of them :)

If you have any issues with this, please feel free to let me know, either on here or via github, I’d be happy to help!


trivial-require: Closure-friendly Browserify(ish)

… and that title is about as accurate as saying that D is a superset of C, but it’s the most accurate one I could think of that fits in a relatively small space of text.

Before I explain further, I’ll explain my use case. I’m writing a web app that is meant to be used on my iPhone. 3G data gets used up really quickly by browsing websites, and I plan on using the app a lot, so I need to make sure the website uses as little bandwidth as possible.

Google’s Closure Compiler is very good at minifying Javascript code (the best I know of), and, under normal circumstances, it works just fine. Thing is, I’m writing the server in node.js, and I want to be able to share the same codebase with both the server and the client. With a tool like Browserify, this becomes very easy.

Problem is, Closure and Browserify don’t mix very well. Sure, you can run Closure on a Browserified piece of code, but it doesn’t optimize nearly as well as it could: few functions are properly inlined or evaluated, many variable and property names stay intact, and there’s a lot of needless wrapper code around it.

Use case scenario done. Now on to trivial-require (this line is for the TL;DR people :P)

I wrote trivial-require as a very quick and dirty hack for the project I was working on. It might not be ideal for everyone, but hopefully some people will be able to find it useful as well :)

trivial-require sort of functions like Browserify, in the sense that it will turn node.js code into browser code, and, if the winds are in your favour, it might work.

Okay, that might have been a bit of an exaggeration. The following line sums up almost exactly what it does (bolded for the TL;DR folk, again):

trivial-require will literally include the contents of the require()d file at the spot where it is require()d. module.exports is entirely disregarded

No extra code is added around it. Literally the only difference between this and C’s #include directive is that this will only include a file once. In other words, var module = require('file'); var module2 = require('file'); will result in file.js being included once (both lines are removed in the output file).

This works in a very different way from the way that browserify and node.js work. But, it is possible to write code that works both with node.js and trivial-require.

Before I go to this though, I’ll explain how to install/run it:

Installing:

sudo npm install -g trivial-require

Running:

trivial-require script.node.js > script.browser.js

Now that that’s done, let’s go to the guidelines on using it:

Do not use ambiguous variable names. It might be overridden by a future module.

// Wrong
var Logger = require('./Logger')("ModuleName");

// Correct
var Logger_ModuleName = require('./Logger')("ModuleName");

Do not use module.exports as a means for writing a function (or other). Any line containing module.exports is deleted.

// Wrong
module.exports = function() {
    console.log("Hello World");
};

// Correct
function HelloWorld() {
    console.log("Hello World");
}

module.exports = HelloWorld;

Use the same module name when require()ing a file. Each file is literally included, and both require() and module.exports lines are deleted.

//// Wrong

// HelloWorld.js
function HelloWorld() {
    console.log("Hello World");
};

module.exports = HelloWorld;

// index.js
var GreetPlanet = require('./HelloWorld');


//// Correct

// index.js
var HelloWorld = require('./HelloWorld');

Only require() in the global scope. Caching might hurt you later.

// Wrong
function my_function() {
    var HelloWorld = require('./HelloWorld');
    HelloWorld();
}

// Correct
var HelloWorld = require('./HelloWorld');

function my_function() {
    HelloWorld();
}

require()s containing non-relative pathnames will be removed. Use your own modules instead

// Wrong
var utf8 = require('utf8');

var encoded = utf8.encode("Hello World");

// Correct
var is_node = false;
if (typeof window === "undefined")
    is_node = true;

var encoded;

if (is_node) {
    var utf8 = require('utf8');
    encoded = utf8.encode("Hello World");
} else {
    encoded = unescape(encodeURIComponent("Hello World"));
}

And that’s about it! I hope that trivial-require might be useful for you, and that this guide is clear enough :) If you need help with anything, feel free to leave a comment, and I’ll try to help!

Fun obfuscation in openlux

I was working on a free software alternative to f.lux named openlux a while ago. I wasn’t working on any of the interesting aspects of the program, just rewriting functions, which gets a bit tedious after a while, so I decided to write one part of the code in a slightly different manner, for fun! :D

The code is supposed to add a variable to another if a character is ‘+’, and subtract it if the character is ‘-‘. Here is how one might normally implement this:

int a = /* something */;
unsigned short b = /* something ... note that b is a smaller data type than a, this is important */;
char chr = /* '+' or '-' */;

if (chr == '+')
    return b + a;
else
    return b - a;

After a few hours (maybe even a day, I’m not very good at this :P), I came up with this instead:

a = (0x80000000 | a) ^ (0x80000000 - !!(chr - 43));
if (a & 0x80000000) a++;
return a + b;

To dissect this, let’s start with the easiest part (other than return a + b;, of course :P): !!(chr - 43).

43 is simply the ASCII value of ‘+’. Yes, okay, that was a bit cheap :P So, what !!(chr - '+') does is return 0 if chr == ‘+’, and 1 otherwise. It could have been rewritten as (chr != '+').

Easy part out of the way, let’s look at how numbers are encoded, via example, in binary:

00000000 = 0
00000001 = 1
00000010 = 2
00000011 = 3
(...)

So far so good, right? But what about when we reach 10000000? If it’s an unsigned byte, it represents 128 (2^7). If it’s signed, however, it represents -128:

(...)
01111111 = 127
unsigned 10000000 = 128
signed   10000000 = -128
unsigned 10000001 = 129
signed   10000001 = -127
(...)

In both cases, the number keeps getting larger after 10000000, but if it’s signed, it will have wrapped around to -128.

Let’s see larger values:

(...)
unsigned 11111101 = 253
signed   11111101 = -3
unsigned 11111110 = 254
signed   11111110 = -2
unsigned 11111111 = 255
signed   11111111 = -1

Notice that the all-ones bit pattern for a signed number is not 0, but rather -1. This is important: if it were 0, inverting the bits of a number would simply negate it (i.e. ~n == -n would hold).

So what would inverting the bits do?

11111111 = -1
00000000 = 0

11111110 = -2
00000001 = 1

11111101 = -3
00000010 = 2

11111100 = -4
00000011 = 3

(...)

Notice a pattern here? Inverting the bits is almost negation, but off by one: ~n == -n - 1. So, to negate a number (in either direction), we invert the bits and add one: -n == (~n) + 1.

Alright, back to the code!

(0x80000000 - (chr != '+'))

0x80000000 can be represented in binary as 10000000000000000000000000000000; in other words, it’s the sign bit of a 32-bit integer. So if chr == '+', the subtraction evaluates to 0x80000000 - 0, keeping 0x80000000 intact. Otherwise, it becomes 0x7fffffff, which is equivalent to 01111111..., or ~0x80000000.

(0x80000000 | a) simply returns a, with the sign bit on.

Now, to deal with the xor part, let’s use a few examples to clarify:

chr = '+';
(0x80000000 | a) ^ (0x80000000 - (chr != '+'));
(0x80000000 | a) ^ (0x80000000 - 0);
// 0x80000000 ^ 0x80000000 cancel each other out, leaving us with 'a', unchanged (assuming a < 0x80000000)

chr = '-';
(0x80000000 | a) ^ (0x80000000 - (chr != '+'));
(0x80000000 | a) ^ (0x7fffffff);
// assuming a < 0x80000000, this is equivalent to ~a: the sign bit stays on, while every lower bit of a is inverted
// as we discussed, ~a is equal to ((-a) - 1)

if (a & 0x80000000) a++; checks if the sign bit is set (i.e. a < 0), and if so, increments a so that it gets the correct (negative) value, for the reasons I explained earlier.

Lastly, all we have to do is return a + b;, which should hopefully be pretty obvious :P

Let’s recap quickly

If chr == '+', a is left unchanged, and the result is simply a + b

If chr != '+', a‘s bits are inverted and then incremented, which is equivalent to a = -a, so the result (in terms of a’s original value, not the inverted-and-incremented one) is -a + b, or b - a.

I hope that you found this interesting, or at least fun to read! I’m sorry if this isn’t very clear, I’m sort of writing this to try and get to sleep, I’ll edit it tomorrow :)

Injecting code into a running process with linux-inject

I was about to title this “Injecting code, for fun and profit”, until I realized that this may give a different sense than I originally intended… :P

I won’t cover the reasons behind doing such, because I’m pretty sure that if you landed on this article, you would already have a pretty good sense of why you want to do this …. for fun, profit, or both ;)

Anyway, after trying various programs and reading up on how to do it manually (not easy!), I came across linux-inject, a program that injects a .so into a running application. It’s similar to how LD_PRELOAD works, except that it can be done while the program is running, and it also doesn’t actually replace any functions (but see the P.S. at the bottom of this post for a way to do that). In other words, maybe ignore the LD_PRELOAD comparison :P

The documentation for it (and for a few other programs I tried) was pretty lacking, though. And understandably so: the developers probably expect that most people using these kinds of programs aren’t newbies in this field and know exactly what to do. Sadly, however, I am not part of that target audience :P It took me a rather long time to figure out what to do, so in the hope that it may help someone else, I’m writing this post! :D

Let’s start by quickly cloning and building it:

git clone https://github.com/gaffe23/linux-inject.git
cd linux-inject
make

Once that’s done, let’s try the example bundled with the program. Open another terminal (so that you have two free ones), cd to the directory you cloned linux-inject into (e.g. cd ~/workspace/linux-inject), and run ./sample-target.

Back in the first terminal, run sudo ./inject -n sample-target sample-library.so

This injects the library sample-library.so into the process whose -name is sample-target. If instead you want to choose your victim by its PID, simply use the -p option instead of -n.

But … this might or might not work. Since Linux 3.4, there’s a security module named Yama that can disable ptrace-based code injection (or code injection period, I doubt there is any other way). To allow this to work, you’ll have to run one of these commands (I prefer the second, for security reasons):

echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope # Allows any process to inject code into any other process started by the same user. Root can access all processes
echo 2 | sudo tee /proc/sys/kernel/yama/ptrace_scope # Only allows root (processes with CAP_SYS_PTRACE) to inject code

Try it again, and you will hopefully see “I just got loaded” in-between the “sleeping…” messages.

Before I get to the part about writing your own code to inject, I have to warn you: some applications (such as VLC) will segfault if you inject code into them (via linux-inject, at least; I don’t know about other programs, as this is the first injection program I managed to get working, period :P). Make sure that you are okay with the possibility of the program crashing when you inject the code.

With that (possibly ominous) warning out of the way, let’s get to writing some code!

#include <stdio.h>

__attribute__((constructor))
void hello() {
    puts("Hello world!");
}

If you know C, most of this should be pretty easy to understand. The part that confused me was __attribute__((constructor)). All it does is tell the loader to run this function as soon as the library is loaded. In other words, this is the function that will run when the code is injected. As you may imagine, the name of the function (in this case, hello) can be whatever you wish.

Compiling is pretty straightforward, nothing out of the ordinary required:

gcc -shared -fPIC -o libhello.so hello.c

Assuming that sample-target is running, let’s try it!

sudo ./inject -n sample-target libhello.so

Amongst the wall of “sleeping…”, you should see “Hello world!” pop up!

There’s a problem with this though: the code interrupts the program flow. If you try looping puts("Hello world!");, it will continually print “Hello world!” (as expected), but the main program will not resume until the injected library has finished running. In other words, you will not see “sleeping…” pop up.

The answer is to run it in a separate thread! So if you change the code to this …

#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

void* thread(void* a) {
    while (1) {
        puts("Hello world!");
        usleep(1000000);
    }
    return NULL;
}

__attribute__((constructor))
void hello() {
    pthread_t t;
    pthread_create(&t, NULL, thread, NULL);
}

… it should work, right? Not if you inject it into sample-target. sample-target is not linked against libpthread, and therefore any function that uses pthread functions will simply not work. Of course, if you link it against libpthread (by adding -lpthread to the linking arguments), it will work fine.

However, let’s keep it as-is, and instead, use a function that linux-inject depends on: __libc_dlopen_mode(). Why not dlopen()? dlopen() requires the program to be linked to libdl, while __libc_dlopen_mode() is included in the standard C library! (glibc’s version of it, anyways)

Here’s the code:

#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <dlfcn.h>

/* Forward declare these functions */
void* __libc_dlopen_mode(const char*, int);
void* __libc_dlsym(void*, const char*);
int   __libc_dlclose(void*);

void* thread(void* a) {
    while (1) {
        puts("Hello world!");
        usleep(1000000);
    }
}

__attribute__((constructor))
void hello() {
    /* Note libpthread.so.0. For some reason,
       using the symbolic link (libpthread.so) will not work */
    void* pthread_lib = __libc_dlopen_mode("libpthread.so.0", RTLD_LAZY);
    int(*pthread_lib_create)(void*,void*,void*(*)(void*),void*);
    pthread_t t;

    *(void**)(&pthread_lib_create) = __libc_dlsym(pthread_lib, "pthread_create");
    pthread_lib_create(&t, NULL, thread, NULL);

    __libc_dlclose(pthread_lib);
}

If you haven’t used the dl* functions before, this code probably looks absolutely crazy. I would try to explain it, but the man pages are quite readable, and do a way better job of explaining it than I ever could.

And on that note, you should (hopefully) be well off to injecting your own code into other processes!

If anything doesn’t make sense, or you need help, or just even to give a thank you (they are really appreciated!!), feel more than free to leave a comment or send me an email! :D And if you enjoy using linux-inject, make sure to thank the author of it as well!!

P.S. What if you want to change a function inside the host process? This tutorial was getting a little long, so instead, I’ll leave you with this: http://www.ars-informatica.com/Root/Code/2010_04_18/LinuxPTrace.aspx and specifically http://www.ars-informatica.com/Root/Code/2010_04_18/Examples/linkerex.c . I’ll try to make a tutorial on this later if someone wants :)