Tuesday, December 9, 2008

Do You Know Your Limits!?

To continue our discussion of process resource limits, let's look at some real figures. Try putting the following printk() statements in the myFirstModuleSource.c posted here.

printk(KERN_INFO "Process Name: %s\n", current->comm);
printk(KERN_INFO "current limit- max open files: %lu\n", current->rlim[RLIMIT_NOFILE].rlim_cur);
printk(KERN_INFO "maximum limit- max open files: %lu\n", current->rlim[RLIMIT_NOFILE].rlim_max);

These statements print the soft and hard limits on the maximum number of open files imposed on the current process (which is insmod). On similarly printing the limits of other resources, we see that all the soft and hard limit values come from the array INIT_RLIMITS (see #line 24 in the snapshot of resource.h posted here).


To see how the resource limits of a kernel process compare with those of an ordinary user process, let's also print the resource limits of a user process by adding the following lines of code to myFirstModuleSource.c:

int some_pid = 3630; /* reader should change this value appropriately */
struct task_struct *task = find_task_by_pid(some_pid); /* returns NULL if no such pid is alive */
printk(KERN_INFO "Process Name: %s\n", task->comm);
printk(KERN_INFO "current limit- max open files: %lu\n", task->rlim[RLIMIT_NOFILE].rlim_cur);
printk(KERN_INFO "maximum limit- max open files: %lu\n", task->rlim[RLIMIT_NOFILE].rlim_max);

Here, 3630 is the pid of the shell command tail, which I got by running tail and looking for it in the output of ps. The find_task_by_pid() kernel function returns the process descriptor belonging to some_pid. We can specify the pid of any user process running in the background, but that process must stay alive while the module is inserted.

On my i386/RHEL-4 machine, the two sets of statements given above produce the following output, respectively:

Nov 15 21:27:32 localhost kernel: Process Name: insmod
Nov 15 21:27:32 localhost kernel: current limit- max open files: 1024
Nov 15 21:27:32 localhost kernel: maximum limit- max open files: 1024

Nov 15 21:27:32 localhost kernel: Process Name: tail
Nov 15 21:27:32 localhost kernel: current limit- max open files: 1024
Nov 15 21:27:32 localhost kernel: maximum limit- max open files: 1024

On similarly printing the limits of other resources and analyzing the output, we see that the resource limits of all user and kernel processes are initialized with the same set of values, specified by the INIT_RLIMITS array of resource.h. This small experiment suggests that, by default, the kernel imposes the same resource limits on any user- or kernel-mode process. To verify this further, I changed some values in INIT_RLIMITS, recompiled the kernel and ran the same program again; the changed values were reflected as-is in the output.
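As an aside, the 1024 seen in the output above is where the RLIMIT_NOFILE row of INIT_RLIMITS gets its value from. Paraphrased from the 2.6.9-era include/asm-i386/resource.h (the exact layout may differ on other versions), the relevant entry is simply a soft/hard pair initialized with INR_OPEN, which linux/fs.h defines as 1024:

/* excerpt (paraphrased): the RLIMIT_NOFILE row of INIT_RLIMITS */
{ INR_OPEN, INR_OPEN },   /* soft limit, hard limit; INR_OPEN is 1024 in linux/fs.h */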

How are Limits Assigned?
Unless a user is assigned custom limits by the superuser, all the processes belonging to that user are assigned default limits as follows.

The first Linux process created during system bootup is the swapper process. After kernel initialization, swapper creates another process, init, which then takes full control of all system activities and creates more processes. swapper's data structures are initialized statically through init_task.h; it is the only process whose data structures are allocated statically like this. Data structures for all other processes are allocated dynamically.

As we see in the snapshot here, the INIT_TASK macro initializes the task_struct of swapper. At #line 80, the rlim field of task_struct is initialized with the INIT_RLIMITS macro defined in resource.h (see the last post); #line 82 gives the name of the process.
Since every process inherits the properties of its parent when it is created, init receives the same values as swapper for all of its fields, including rlim. Further, when init creates more processes, these values are inherited by every child process, by their children, and so on. This goes on unless
  1. a process executes the setrlimit() system call, or
  2. the superuser has configured custom values for a user.
We've already covered the first point in the last post. For information on the second point, do a `man limits.conf' and open your /etc/security/limits.conf; the custom values for a user/group are specified in this configuration file by the system superuser.

Therefore, if a particular limit for a resource is set for user X in limits.conf, that limit is assigned to every process created by user X, at process-creation time.
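For illustration, entries in /etc/security/limits.conf follow a domain/type/item/value format; the lines below are hypothetical (the user name and values are made up):

#<domain>   <type>   <item>    <value>
shweta      soft     nofile    2048
shweta      hard     nofile    4096
@students   hard     nproc     50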

** Try the shell command `ulimit -a' to see all soft limits for the current user. `ulimit -s unlimited' sets the stack-size soft limit to the maximum for all processes belonging to the current session of the current user. (The current session is nothing but the shell in which ulimit is entered; the changes made by ulimit do not affect other shells.) If a user/resource doesn't appear in limits.conf, ulimit displays the default limits (i.e. the values of the INIT_RLIMITS array) for that user/resource.

Shweta is going to share some findings on this very soon... on the same page.
-------------------------------------------------------------

Shweta added:

The above explanation can be verified with a small and quick experiment, which should make this behaviour of resource-limit assignment clear. The experiment has two parts:

First, clear the entries belonging to user X from limits.conf (if any), then log in to your machine as user X and write a user program that prints the value of RLIMIT_NOFILE using getrlimit(). Also check this value using `ulimit'. Both outputs should show the value specified in the INIT_RLIMITS array. Now change the value of RLIMIT_NOFILE using setrlimit(), fork() a new process in the same program, and print the value of RLIMIT_NOFILE for the child process using getrlimit(). This should output the value changed in the parent process using setrlimit(), as in the sketch below.
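A minimal user-space sketch of part 1 (assuming the soft limit is simply raised up to the hard limit; adapt the setrlimit() call to whatever change you want to test):

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/resource.h>

static void show(const char *who)
{
	struct rlimit rl;

	getrlimit(RLIMIT_NOFILE, &rl);
	printf("%s: soft = %lu, hard = %lu\n",
	       who, (unsigned long)rl.rlim_cur, (unsigned long)rl.rlim_max);
}

int main(void)
{
	struct rlimit rl;

	show("parent, before setrlimit");

	getrlimit(RLIMIT_NOFILE, &rl);
	rl.rlim_cur = rl.rlim_max;              /* raise the soft limit up to the hard limit */
	setrlimit(RLIMIT_NOFILE, &rl);

	show("parent, after setrlimit");

	if (fork() == 0) {                      /* the child inherits the parent's limits */
		show("child");
		return 0;
	}
	wait(NULL);
	return 0;
}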

Now, for the second part of the experiment, make an entry for RLIMIT_NOFILE for user X in limits.conf and repeat part 1. The outputs this time differ in that wherever the previous run printed values from the INIT_RLIMITS array, this run prints the values specified in the limits.conf file.

Saturday, November 22, 2008

Process Resource Limits

Having seen the task_struct structure representing a process in the kernel, in this post we shall discuss one very important field of task_struct, named rlim (#line 460 in the snapshot of sched.h in the last post). A process cannot consume system resources without bound: the Linux kernel imposes limits on the amount of resources a process is allowed to use, so that a single process cannot starve other processes by consuming large amounts of resources itself. There are RLIM_NLIMITS types of resources on which the kernel imposes such limits; for the 2.6 kernel discussed here, RLIM_NLIMITS is 13. See the snapshot of file include/asm-i386/resource.h below, which contains the macros for all 13 resource types.

All Linux processes are assigned the same resource-limit values when they are created. However, after creation, a process can increase or decrease its limits. The resource limits of a process are stored in the rlim field of the process descriptor. The rlim field is a 13-element array of type struct rlimit, which is defined in file include/linux/resource.h. (Snapshot below)

The rlimit structure contains two fields specifying the current and maximum limits of a resource (also called the soft and hard limits, respectively) for a process. A process can read its limits with getrlimit() and, using setrlimit(), can raise or lower its soft limits anywhere up to the corresponding hard limits. The hard limits of a process can only be increased if the process has superuser privileges; therefore, a "normal" process may use setrlimit() only to irreversibly decrease its hard limits.
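For reference, the kernel-side definition in include/linux/resource.h is roughly the following on 2.6.9-era kernels (user space sees an equivalent declaration in <sys/resource.h>, with the rlim_t type instead of unsigned long):

struct rlimit {
	unsigned long rlim_cur;   /* soft (current) limit */
	unsigned long rlim_max;   /* hard (maximum) limit */
};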

If a process tries to consume a resource beyond the assigned soft limit, the kernel either signals the process, or the library/system call (e.g. malloc(), fopen()) trying to consume the resource fails with errno set to an appropriate value. For example, the signals SIGXCPU, SIGXFSZ and SIGSEGV are delivered when a process exceeds RLIMIT_CPU, RLIMIT_FSIZE and RLIMIT_STACK, respectively.
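The signalling case is easy to observe from user space. Here is a minimal sketch (my own example, not from the kernel sources) that lowers RLIMIT_CPU and then burns CPU time until SIGXCPU arrives:

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/resource.h>

static void on_sigxcpu(int sig)
{
	/* printf() in a signal handler is only for illustration */
	printf("got SIGXCPU: soft RLIMIT_CPU exceeded\n");
	exit(0);
}

int main(void)
{
	struct rlimit rl = { 1, 2 };    /* soft = 1 second, hard = 2 seconds of CPU time */

	signal(SIGXCPU, on_sigxcpu);
	if (setrlimit(RLIMIT_CPU, &rl) != 0) {
		perror("setrlimit");
		return 1;
	}
	for (;;)
		;                       /* burn CPU until the soft limit is hit */
}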

This was the general background; in the next post we will practically see the actual limits imposed on Linux processes, and how these limits are assigned.

Tuesday, July 29, 2008

Kernel Process Data-Structures

In the simplest words, a process is defined as the "execution context of a running program". The execution context is the collection of all the data structures, registers, memory addresses and other resources that the process uses while executing a program. Therefore, a program by itself is not a process; a process is the collection of data structures that fully describes how far the execution of the program has progressed. In other words, a process is an active program, i.e., an in-memory incarnation of a program stored on the hard disk. An important point to note is that two or more processes executing the same program can co-exist, and a single process can execute two or more programs in its life span.


Running and managing a process is a complex activity for the kernel. In this post (and probably the next few) I try to make a note of how beautifully the Linux kernel performs this task. Specifically, in this post I note how the Linux kernel keeps track of the process currently scheduled to run on the CPU.


Data Structure to Hold Entire Information About a Process:

The kernel identifies each process by a unique process descriptor, represented by struct task_struct, which contains all the information about that specific process. The process descriptors of all processes are maintained in a circular doubly linked list.

The following screenshot shows a (partial) definition of the task_struct structure, defined in linux/sched.h. I have removed many fields which are not directly relevant to our discussion here.

getpid() Does Not Return Process id ?!?

But I have always been using it to get the pid of the calling process without knowing this fact!!! Indeed, getpid() returns tgid (the thread group id), which is the pid of the leader of the threads executing within the same group (usually the first process created within that group). Since Linux doesn't support multithreading as a separate abstraction, an ordinary process executes as a single thread in its own thread group; therefore, a single-threaded process is naturally its own thread group leader. The difference shows up only when some library (e.g. POSIX threads) is used to create a multithreaded process, in which case getpid() still returns the value of the calling process's tgid field. By the way, in Linux, threads are called light weight processes and they also have pids, since, as we can see, there is nothing called a thread id in task_struct.
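A quick way to see the two fields side by side is to print both from inside a module (a minimal sketch; for a single-threaded process the two values are equal):

printk(KERN_INFO "pid: %d, tgid: %d\n", current->pid, current->tgid);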

Food for Thought:

1. What if I need to determine the pid of a light weight process, which is not a group leader?


In a uniprocessor system, only a single process can run at a time. The pointer to the task_struct of the process currently running on the CPU is returned by the macro current. Thus, the statement


printk(KERN_ALERT "Process Name: %s\n" "PID: %d\n", current->comm, current->pid);


prints the name and pid of the currently executing process.


How Does the current Macro Determine the Currently Running task_struct?

Other than task_struct, there is one more data structure associated with a process, thread_info, which contains the low-level information the kernel needs about the process while it runs on the CPU. The task_struct and thread_info structures contain fields pointing to each other, named thread_info and task respectively (line #439 in the task_struct snapshot). Unlike the rather large task_struct, thread_info is a small structure, and one is allocated for every process, at the bottom of that process's kernel-mode stack. Because the thread_info of the currently running process always lives inside the kernel stack the processor is using, the kernel can locate it (and from it, the task_struct) purely from the stack pointer, as described below.

For each process, the kernel uses the following union spanning 8 KB; the thread_info occupies the lowest addresses, and the rest serves as the process's kernel-mode stack:

union thread_union {
	struct thread_info thread_info;
	unsigned long stack[2048];   /* (4-byte long) x 2048 = 8 KB */
};


Food for Thought:

2. What happens if I reverse the order of the two declarations above in thread_union?

3. What is wrong with the following code? (courtesy of the Kernel Trap mailing list, http://kerneltrap.org/node/5835)

char *get_sp() { asm("movl %esp, %eax"); }

int main(void)
{
	struct thread_info *current_thread_info;
	struct task_struct *current_task;

	current_thread_info = (struct thread_info *)((unsigned long)get_sp() & ~0x1FFF);
	current_task = (struct task_struct *)current_thread_info->task;
	printf("my pid is: %d\n", (unsigned int)current_task->pid);
}

The 8 KB thread_union of a process lives in the kernel data segment (whose segment selector is given by the macro __KERNEL_DS), an area reserved for the kernel; user-mode processes cannot access the kernel data segment. Whenever the process executes in kernel mode, its esp register points into this 8 KB area.


Since, while the process executes in kernel mode, the esp register points to a memory location lying somewhere within this 8 KB area (addressable by 13 bits), the starting address of the thread_union can be obtained by masking out the 13 least significant bits of the esp register. Furthermore, since the first field of thread_union is thread_info, the obtained value is a pointer to thread_info. This masking operation is performed by the kernel routine current_thread_info(), which looks like the following:


movl $0xffffe000, %ecx   /* store the mask, i.e. ~(8192 - 1) */
andl %esp, %ecx          /* mask esp; ecx now points to thread_info */
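In C, the same routine looks roughly like the following (paraphrased from include/asm-i386/thread_info.h of 2.6 kernels, where THREAD_SIZE is 8192, so ~(THREAD_SIZE - 1) is the 0xffffe000 mask used above):

static inline struct thread_info *current_thread_info(void)
{
	struct thread_info *ti;

	/* mask out the low 13 bits of esp to reach the bottom of the kernel stack */
	__asm__("andl %%esp,%0;" : "=r" (ti) : "0" (~(THREAD_SIZE - 1)));
	return ti;
}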


Having acquired the address of the thread_info structure, getting the pointer to the task_struct is trivial. Hence, the current macro is:


current = current_thread_info()->task
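More precisely, on i386 the macro expands through a small helper, roughly as in include/asm-i386/current.h:

static inline struct task_struct *get_current(void)
{
	return current_thread_info()->task;   /* task_struct of the running process */
}
#define current get_current()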


This is how the current macro points to the task_struct of the process currently running on CPU.

Monday, July 21, 2008

There is Something About printk()

In the example myFirstModuleSource.c of Writing the First Kernel Module, I used printk() just as another name for printf(). In this post I note the differences between the two, and the way the Linux kernel uses printk().


At first glance, the only difference between printk() and printf() is the way the two functions are called. However, this single difference causes all the "mystery" of printk() in displaying kernel output. I call it a mystery because we don't see printk() output at the place where (as naive kernel learners) we expect it, i.e., on the console. Actually, all printk() output is logged into the file /var/log/messages. After executing our program (inserting/removing modules), we have to go and check /var/log/messages to see the output. Let's have a look at how printk() is called, and then see how to get its output at the place we want, besides /var/log/messages.


printk() is called with a log-priority prefix prepended to the format string, like this:


printk(KERN_log_priority "hello world\n");


Here, log_priority is one of eight values (predefined in linux/kernel.h, similar to /usr/include/sys/syslog.h): EMERG, ALERT, CRIT, ERR, WARNING, NOTICE, INFO, DEBUG, in order of decreasing priority. See lines #31 to #38 in the snapshot of linux/kernel.h below. In the example of myFirstModuleSource.c, I used printk() without mentioning any log priority; whenever the log level is not specified explicitly in a printk() call, default_message_loglevel is used (line #43 in the snapshot below).
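For reference, the macros in linux/kernel.h look roughly like this; each one simply prepends a priority tag to the format string:

#define KERN_EMERG   "<0>"  /* system is unusable */
#define KERN_ALERT   "<1>"  /* action must be taken immediately */
#define KERN_CRIT    "<2>"  /* critical conditions */
#define KERN_ERR     "<3>"  /* error conditions */
#define KERN_WARNING "<4>"  /* warning conditions */
#define KERN_NOTICE  "<5>"  /* normal but significant condition */
#define KERN_INFO    "<6>"  /* informational */
#define KERN_DEBUG   "<7>"  /* debug-level messages */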




The kernel treats these different message priorities differently. Different rules are followed depending upon whether the system is in one of the six text console modes (Ctrl+Alt+F1 to Ctrl+Alt+F6) or in GUI mode.


Logging in Console Mode and Getting the printk() Output on the Console:

The file /proc/sys/kernel/printk contains four integer values; e.g., my RHEL-4 machine has the values 6, 4, 1, 7. These integers are the current console log level, the default message log level, the minimum allowed console log level and the boot-time default console log level, respectively (the variables at lines #42 to #45 in linux/kernel.h). Any message whose priority value is less than the current console log level (the first integer; a smaller value means a more severe message) is displayed on the console. All messages, whatever their priority, are also logged into /var/log/messages.


The values in /proc/sys/kernel/printk can be changed according to requirements. For instance, setting the first integer value (the current console log level) to 8, e.g. with `echo 8 > /proc/sys/kernel/printk' as root, causes messages of every priority to be printed on the console. Similarly, changing the second value changes the log level assigned by default when none is specified. The third and fourth values are generally not changed.


Logging When in GUI Mode:

In this case, logging is done according to the rules defined in the /etc/syslog.conf file (snapshot below). This file contains two columns: one for the type of message (in the form facility.priority), and the other for the destination of the corresponding kernel log message. The facility specifies the subsystem that produced the message and the priority specifies its severity.

For instance, EMERG messages are logged everywhere (all terminals and log files), no matter which subsystem produced them (line #16); all messages from the MAIL subsystem are logged into /var/log/maillog (line #12); and INFO (and higher priority) messages from any subsystem are logged into /var/log/messages (line #7). See the syslog.conf manual page for more details.
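For illustration, typical entries follow the facility.priority destination pattern; the exact lines and line numbers vary from system to system, so treat these as hypothetical examples:

# kernel messages of priority info and above go to the common log file
kern.info                                   /var/log/messages
# emergency messages are sent to all logged-in users
*.emerg                                     *
# all mail subsystem messages go to the mail log
mail.*                                      /var/log/maillog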


Getting printk() Messages Displayed on the Terminal in GUI Mode:

This is the point where I am stuck right now. One way to do this is to name one terminal, say /dev/pts/3, as the destination in syslog.conf at line #7, so that all INFO (and higher) messages go there. But this solution is no better than looking for output in /var/log/messages; my requirement is to see the printk() messages on whichever terminal I am using at that time. The klogd manual page suggests starting the klogd daemon with the -c switch to change the current console log level as needed. I changed it to 8, but could not see any change in the way messages were being logged, so this did not solve the problem either.


Thoughts/suggestions are welcome. If someone has done this, guidance is needed.


Saturday, July 19, 2008

Compiling and Inserting the First Kernel Module

Let's now see how to compile the module program shown in the previous post. We shall write a Makefile to make the compilation procedure simpler. Below is a 'template' of the Makefile I use to compile my modules; the description follows.

########################################################################################

#Build as a loadable module
obj-m += <module_name_1>.o <module_name_2>.o
<module_name_1>-objs := <space-separated list of object files that link into module_name_1.o>
<module_name_2>-objs := <space-separated list of object files that link into module_name_2.o>

#Location of the current linux kernel source directory
SRC=/lib/modules/`uname -r`/build

#Working Directory
PWD=`pwd`

default:
	make -C ${SRC} M=${PWD} modules

clean:
	rm -f ${<module_name>-objs} <module_name>.o <module_name>.ko <module_name>.mod.o
	rm -f <module_name>.mod.c

################################################################################

obj-m tells the kernel build system which module objects have to be created on the make command; we can list here as many modules as we want to build. After that, for each module in the obj-m list, a <module_name>-objs line specifies the object files which together link into that particular module's object file.

Under /lib/modules there is one subdirectory for each kernel installed in your system, named after that particular kernel. The build entry inside each of them points to the source tree (headers, Makefiles, configuration) of the corresponding kernel image. The shell command uname -r outputs the name of the currently running kernel. Therefore, the SRC variable stores the path to the build tree of the currently running kernel.

This path is specified so that make can read the kernel's top-level Makefile, which defines the rules for building the target, i.e. modules. The variable can alternatively be passed on the command line as an argument to make (e.g. `make SRC=/usr/src/kernels/<version>'); as usual with make, a variable specified on the command line takes precedence over one set in the Makefile.
The -C switch makes make change its current directory to $SRC (the kernel source directory) and read the kernel top-level Makefile there. M=dir specifies the directory where the module to be built resides; the modules target then builds, inside dir, all the modules listed in the obj-m variable.

The myFirstModule Makefile:


#####################################################################################

#Build as a loadable module
obj-m += myFirstModule.o
myFirstModule-objs := myFirstModuleSource.o

#Location of the linux source directory
SRC=/lib/modules/`uname -r`/build

#Working Directory
PWD=`pwd`

default:
	make -C ${SRC} M=${PWD} modules

clean:
	rm -f ${myFirstModule-objs} myFirstModule.o myFirstModule.ko myFirstModule.mod.o
	rm -f myFirstModule.mod.c

###############################################################################


Running make:
[shweta@localhost modules]# make
make -C /lib/modules/`uname -r`/build M=`pwd` modules
make[1]: Entering directory `/usr/src/kernels/2.6.9-42.EL-smp-i686'
CC [M] /home/Shweta/wikalk/modules/myFirstModuleSource.o
LD [M] /home/Shweta/wikalk/modules/myFirstModule.o
Building modules, stage 2.
MODPOST
CC /home/Shweta/wikalk/modules/myFirstModule.mod.o
LD [M] /home/Shweta/wikalk/modules/myFirstModule.ko
make[1]: Leaving directory `/usr/src/kernels/2.6.9-42.EL-smp-i686'


Inserting (or linking) the module:
The shell provides two commands to insert a module into the kernel, insmod and modprobe, which perform largely the same set of activities. The main differences lie in how they locate the module binary to load (and in the fact that modprobe also loads any modules that the requested module depends on).

insmod takes the path (absolute or relative) of the module file as its argument:
[shweta@localhost modules]# insmod /root/wikalk/modules/myFirstModule.ko
modprobe requires just the module name as argument:

[shweta@localhost modules]#modprobe myFirstModule
modprobe searches for the module by name under the default path /lib/modules/`uname -r` (using the dependency information generated by depmod). If no module with the given name is found there, an error is displayed.

So now, our myFirstModule is inserted/removed as follows:
[shweta@localhost modules]#insmod myFirstModule.ko
Hey! myFirstModule is in the kernel now.

[shweta@localhost modules]#rmmod myFirstModule
myFirstModule is removed from kernel
myFirstModule was in kernel for 143 seconds.

Some Points to Remember:

  • A loaded module stays in the kernel until it is explicitly unloaded (or the system is shut down). It is better to unload a module explicitly once it is no longer required; otherwise it keeps consuming system resources (memory, in particular) for no use.
  • If the return statement is missing in the init() function, compilation succeeds without any errors or warnings, but insmod fails with an "error inserting myFirstModule.ko" message.

Tuesday, July 15, 2008

Writing the First Kernel Module

Having read the bare essential theory, we are ready to be introduced to the raw beauty of a module. Well, at least I found it beautiful. Let's find out how you feel...

A Linux module is just a C program. However, writing a module requires a lot more attention, skill and awareness (and so on...) than a normal C program, which runs in user space. Since a kernel module runs in kernel space, errors must be handled very carefully, as even the smallest problem may result in a system crash.

Now, have a glance at the code below and then read the text that follows. There is nothing in this program that anyone acquainted with C can't understand, except the absence of the main() function.

/***********************************************/

/* myFirstModuleSource.c */

#include <linux/module.h>  /* macros for the init and exit functions */
#include <linux/time.h>    /* kernel data structures to represent time */

struct timespec moduleLoadTime, moduleUnloadTime;

int myFirstModuleInit(void) /* mandatory prototype for an init function */
{
	printk("Hey! myFirstModule is in the kernel now.\n"); /* No, this ain't a typo for printf() */
	moduleLoadTime = current_kernel_time(); /* kernel routine to get the current timestamp */
	return 0;
}

int calculateDifference(int one, int two) /* a normal C function */
{
	return (one - two);
}

void myFirstModuleExit(void) /* mandatory prototype for an exit function */
{
	int moduleLifespan;

	printk("myFirstModule is removed from kernel\n");
	moduleUnloadTime = current_kernel_time(); /* kernel routine to get the current timestamp */
	moduleLifespan = calculateDifference(moduleUnloadTime.tv_sec, moduleLoadTime.tv_sec); /* a C function call */
	printk("myFirstModule was in kernel for %d seconds.\n", moduleLifespan);
}

module_init(myFirstModuleInit);
module_exit(myFirstModuleExit);

/***********************************************/

The above module is merely a "hello world" module, with the added functionality of displaying the duration for which it remained linked into the kernel.


No Main()'s Land.

For a user-space C program, the main() function acts as the entry point, which tells the system where to start execution. For a kernel module, that role is played by an init function. The init function is executed only once, when the module is linked into the kernel (usually by the insmod shell command). However, unlike a user-space program, a kernel module also requires an exit function, which is executed when the module is removed from the kernel (usually with the rmmod shell command). Therefore, running a kernel module requires at least two functions:

1. An init function to load the module

2. An exit function to unload the module

A programmer can designate any function as the init or exit function of a module; it is just that the name of that particular function has to be registered with the kernel using the macros module_init() and module_exit() (as done in the last two lines of the code above). But the prototypes cannot be altered: every init function must return int and take no arguments, and every exit function must return void and take no arguments, as shown.


printk()??? A typo error?

Not really!! How could I make the same spelling mistake at three places?

The Linux kernel does not link against the standard C library, libc (or any user-space library, for that matter), which is where printf() lives; therefore it has no access to printf(). But (thank God) it has its own output function, printk(). There is a lot to say about printk(), which I plan to cover in another post; for now, just consider it an avatar of our old friend printf(). Just remember always to put a '\n' at the end of the format string in every printk() call, and to look for all printk() output in /var/log/messages.


Some points to remember:

  • Each module must have an init function, but an exit function is not mandatory. However, if no exit function is registered, the module is permanently linked into the kernel and can only be removed by rebooting.
  • All clean-up activities should be done in the exit function; this is part of writing a clean and safe module.
  • Floating-point arithmetic should not be used in kernel modules: the kernel does not save and restore the FPU state for kernel-mode code, and it provides no library support for such operations.

Module writing is over now; in the next post I shall tell how I compiled the above program and inserted it into the kernel.

Wednesday, June 25, 2008

The OS, The Kernel and The Modules

Typically, kernel literature starts with words like modules, microkernels, etc. Let's quickly run over these to set the right context for reading.

Often the terms operating system and kernel are confused, because the two are used interchangeably in many books. But there is actually a difference: the kernel is just one part of the operating system, other parts being device drivers, the user interface and so on. The kernel is the innermost layer of an operating system and the UI is the outermost. For example, the command shell is part of the operating system, not of the kernel; when the ls command is fired on the shell, ls invokes kernel routines to read the contents of a particular directory. The kernel manages all the data structures and functions which the operating system (and the applications running on it) need for various purposes. For another example, the bootstrap loader is a part of the operating system which loads the kernel into RAM at system startup. Applications such as a web browser or a document viewer are NOT part of the operating system, although they are installed by default with many operating systems (MS Windows, Red Hat).

The kernel as described in Linux Kernel Development by Robert Love is as follows. "The kernel is the core internals of the operating system; the software that provides "basic services" for all other parts of the system, manages hardware, and distributes system resources. Typical components of a kernel are interrupt handlers to service interrupt requests, a scheduler to share processor time among multiple processes, a memory management system to manage process address spaces, and system services such as networking and interprocess communication."

So now we know, that the core functionality of an operating system is provided by its kernel.

Now, what are modules when we talk about the Linux kernel? Before understanding the concept of a module, one needs to understand that, by design, Linux is a monolithic kernel. A monolithic kernel is implemented as a single large binary image: it loads into memory as a whole and executes everything in kernel mode.
The other category is the microkernel architecture, in which the kernel includes only core essential functionality such as inter-process communication, process scheduling and synchronization (hence the term microkernel: a minimal kernel). Services such as memory management and device drivers run as separate processes in user space, on top of the microkernel.

Compare the two architectures:
  • Monolithic kernels give great performance with a simple design, since communication within the kernel is trivial (direct function calls, as in a single program). But the problem with this design is that one small failure in any part of the code can bring the entire system down.
  • Microkernel architectures offer clean interfaces, hardware independence and better isolation, because everything is implemented as a separate component. This also makes development easier, and in case of a problem only the specific component fails while the other parts keep functioning. But communication within the kernel is done by message passing, which makes the performance comparatively poor.
The concept of the module fits somewhere between these two kernel designs. In order to achieve the advantages of the microkernel architecture while offering monolithic performance, the Linux kernel divides its services into modules. Different filesystems, device drivers and so on are separate modules which can be inserted into or removed from the kernel at run time, i.e. without even having to recompile the kernel. For example, the ext3 and vfat filesystems are inserted into the kernel as modules. After insertion, a module becomes part of the kernel and executes in kernel mode like any other kernel routine (whereas in a microkernel design it would execute as a separate process in user mode).
Try the shell command lsmod to list all the modules currently linked into the kernel.
The module feature of the Linux kernel offers great flexibility to Linux programmers and independent hackers to develop their own modules with customized functionality. For learners, experimenting with the kernel becomes almost as easy as writing normal C code. If there were no modules, even the smallest addition or experiment would require changes directly in the main kernel code. Modules give us the freedom to develop for the kernel without touching the kernel itself. That's one of the greatest opportunities Linux has given to kernel freaks.

That's all the theory before starting kernel programming; from now on, only practicals. In the next post, I shall tell how I wrote a small, useless module and ran it in the kernel. Stay tuned!!

References:
  1. The Tanenbaum-Torvalds Debate.
  2. The Linux Kernel Module Programming Guide.