Saturday, February 26, 2011

Goin' back to...

Just got back from the Mother Land. That's right peoples, I just traveled to the USA. I'm back here in Melbourne now. So, the trip was fast (left Oz last Monday the 21st around noon). The cool piece of this deal was that I left at noon that Monday and arrived in Cali-for-ni-a (the home of the Governator) the same day at 2PM. Like a super-crazy worm-hole. In fact, many abductees claim that extraterrestrial abduction was responsible for their time shift. But I am pretty sure science can explain the loss in hours; I think it was probably due to crossing the International Date Line. Nonetheless, a mere 2 hours after I left, I arrived in LA, but on a ~14 hour plane flight. Anyways, I flew home the following Thursday, and just got back in today... Saturday... freaky. Anyways, California is mega-sweetness. I crossed over the Golden Gate Bridge.... but no jumpers.

-Matt

Sunday, February 13, 2011

Acca Dacca Lane and More

SoooooooOOoOoOoOoOOooOoOo.... posted pictures, I have. Enjoy.
http://www.flickr.com/matt-davis

Saturday, February 12, 2011

Linux syscall, vsyscall, and vDSO... Oh My!

The following article is a brief investigation I wrote up, which looks into the system call functionality of the Linux kernel. Mainly, I was trying to clarify the difference between vsyscall and vDSO. This was conducted, more or less, on the 2.6.32 and 2.6.37 branches, if I recall correctly. Anyways, here goes:


-= What are System Calls =-

System calls are routines that communicate directly with the operating system, either to obtain some specific piece of information or to make a specific request that the OS needs to fulfill. For example, gettimeofday() is a request, often made from user-land, for an application to obtain the current time from the operating system's kernel. In Linux this call is implemented as a system call. This interface allows a lesser privileged application to access higher-privileged kernel information [1, 2].

Often these calls are not even accessed directly by an application; rather, they are executed via a wrapper. The GNU C library, glibc, has wrapper functions that do all the handy-dandy wrapping. For instance, a call to gettimeofday() in an application is really just a call to glibc's wrapper for the gettimeofday() system call. The wrapper adds little overhead: at link-time, gettimeofday() is simply bound to the system's appropriate version of the routine. If one desires to make a true system call, avoiding the wrapper, Linux provides a syscall() routine allowing such. Both the wrapper and the explicit syscall versions of gettimeofday() are demonstrated below:


#include <stdlib.h>       /* NULL */
#include <sys/syscall.h>  /* SYS_gettimeofday */
#include <sys/time.h>     /* gettimeofday(), struct timeval */
#include <unistd.h>       /* syscall() */

int main(void)
{
    struct timeval tv;

    /* glibc wrapped */
    gettimeofday(&tv, NULL);

    /* Explicit syscall */
    syscall(SYS_gettimeofday, &tv, NULL);

    return 0;
}


System calls are set up by the kernel, and are in place by the time it has booted. All of these calls are accessed by a number. In the example above SYS_gettimeofday is just a constant integer. This integer is the index into an array of system calls (the system call table, sys_call_table in the kernel source; not to be confused with the interrupt descriptor table, which is where the "int 0x80" vector itself lives). This table/vector can be thought of as pointers to the actual routines [3]. So at index SYS_gettimeofday is a pointer to the piece of code where the actual sys_gettimeofday() routine exists.

-= Overhead =-

Originally, invoking a syscall in Linux was actually a pretty expensive process, as it was implemented as a software interrupt (int 0x80). When the syscall interface was called, via the "int 0x80" instruction, the CPU would pass control to the OS. The OS would look at what value was passed in, such as SYS_gettimeofday, and then the table of system calls would be indexed and sys_gettimeofday() would get called. Interrupts force the CPU to save the state of execution just before the system call was executed and the system interrupted. This state is restored after the interrupt (in this case a syscall) completes.

Ok, so an interrupt by itself is relatively cheap in execution time. However, the way the kernel is designed adds more overhead when a syscall is executed. Linux is divided into two primary memory segments: user-space and kernel-space (userland, kernelland). User-space memory is that which user applications run in. Kernel-space is where all the kernel services run. This segmentation acts as a security barrier, so that crafty and/or malicious user apps cannot directly access the kernel's memory. Making a syscall, as mentioned previously, is just a bridge, a mere wormhole, between these two segments of memory. It is this hop between address spaces that causes quite a bit of overhead, as execution must switch from the user to the kernel-land memory segment, and then back again as the syscall completes [5, 6]. Changing the memory addressing requires some register shuffling, and just imparts more overhead on the whole syscall process.

-= vsyscall and vDSO =-

To reduce the overhead of hopping between user and kernel spaces, a newer mechanism in Linux allows certain syscalls to be accessed directly from userspace, without the need to cross the user/kernel space barrier. This is just what the vsyscall and vDSO (virtual Dynamic Shared Object) interfaces do. At boot-time a page of memory is dedicated to containing a subset of syscalls, deemed safe to execute from userland, that should not cause a security hole for the kernel. The page of memory where these calls lie is mapped into each running process's user-space. Thus, when one of these syscalls is made, no switch between the memory regions of user and kernel-space takes place, and so there is less overhead.

Another interesting means of reducing syscall overhead is specific to the underlying CPU architecture. The more recent AMD and Intel chips both implement fast syscall functionality. Instead of issuing an interrupt, programs can issue instructions (SYSCALL/SYSENTER to enter the kernel, SYSRET/SYSEXIT to return) that act faster than a traditional interrupt. Which of these gets used is decided by object code that the OS supplies. Therefore, a programmer does not have to consider how to implement/request the use of SYSCALL/SYSENTER over a traditional "int 0x80" [1]. When applications are built, based on the architecture and the system, vsyscall and vDSO linkage is done automagically.

As demonstrated above, by using the syscall() routine, a traditional syscall will be conducted, even if there is vDSO (virtual Dynamic Shared Object) support. However, that call might still be using the newer SYSCALL/SYSENTER CPU instructions. The glibc wrapped gettimeofday() call is what most programs would use. Since the kernel has been designed to use the most efficient syscall mechanism, that version has the potential to be a virtualized syscall that is mapped into userspace.

To determine if a specific call is using a virtualized (user-space) syscall or a traditional, memory segment-shuffling syscall, the strace utility can be used. If a true traditional syscall is being conducted, strace will output the call, and it will look similar to the following (a call serviced entirely in user-space never enters the kernel, so strace shows nothing for it):

gettimeofday({1297472587, 581519}, NULL) = 0


According to comments in glibc-2.12.1: "The vsyscall page is a virtual DSO (Dynamic Shared Object) pre-mapped by the kernel" [7].

vsyscall and vDSO are similar in how they work; however, there are some slight differences. vsyscall is limited to 4 entries, and is static in memory. Therefore, any statically linked application can know in advance where the vsyscalls are loaded. On the other hand, vDSO is dynamically loaded into the user process, so its location is not predictable, due to Linux's randomized address space layout. If more than 4 vsyscalls are needed, then a vDSO should be used instead [8].
For example, run `cat /proc/self/maps` and look at both the '[vdso]' and '[vsyscall]' entries. If your system supports these, the memory range for vDSO is different for each process issued, and vsyscall is totally predictable. If you don't believe me, run 'cat /proc/self/maps' a few times and note the addresses of vdso and vsyscall. Since this example is looking at '/proc/self/maps', the memory mappings displayed are for that 'cat' process [9, 10].

[1] http://en.wikipedia.org/wiki/System_call
[2] Linux User's Manual: intro(2)
[3] http://www.tldp.org/LDP/khg/HyperNews/get/syscall/syscall86.html
[4] http://en.wikipedia.org/wiki/Interrupt
[5] http://en.wikipedia.org/wiki/Kernel_(computing)
[6] http://www.linux.it/~rubini/docs/ksys/ksys.html
[7] glibc-2.12.1
[8] Linux Kernel 2.6.37 arch/x86/kernel/vsyscall_64.c
[9] http://anomit.com/2010/04/18/examining-the-linux-vdso/
[10] http://www.trilithium.com/johan/2005/08/linux-gate/

-Matt

Saturday, February 5, 2011

matt += food;

Soooo. Well, I never posted about me getting out of the hospital, but I was there for three weeks, and was released about two weeks ago. I have been going back to work and research, but I have also been eating more. I have weekly doctor appointments to run blood tests. I have also been enjoying cooking more, and ya know, it's not that hard and it's not boring. It's not hard cause I'm not trying to be the next Master Chef. I need to keep eating, as my ultimate goal is to regain more health. So, for all of those that cared or offered to come out to see me, or sent me messages, or were legitimately concerned... much props.

BTW, I had no idea that I would acquire so many books while being in the crazy-house. Much props to my advisor, friend, and professor for bringing me a copy of Douglas Hofstadter's Godel, Escher, Bach. I have not completed it, but got just under halfway through while being re-fed. I must say, as much as it sucked to be held against my will, I needed it. Technically, I was only against my will for one night, but I have a feeling if I were to have tried to discharge myself, I would have gotten another Section 12 (involuntary) status; thereby preventing me from leaving. I must say I did meet some really cool people. Cool people are interesting. That could have been said in reverse, interesting people are cool. It is a commutative relation.

I did learn three things when in the hospital:
  • If you carry around a large book about science, you might be perceived as a genius.
  • If you can access facebook via a proxy, you might be perceived as a hacker.
  • Sultanas are seemingly the equivalent to raisins in America.

-Matt