[Fsf-friends] Re: Reg ur talk at IITM -- with attachment

Anand Babu ab@gnu.org.in
Tue Feb 15 06:44:56 IST 2005


,----[ "Mohammed Riyaz" <p_mdriyaz@fastmail.fm> ]
| Hi,
| 
| I have written this document of what you spoke in IIT. I am
| attaching a copy of it. This document will be put on the ilugc
| webserver. So if you do not like any part of it or want something
| changed, let me know.
| 
| Thank you,
| Mohammed Riyaz P.
| 
| P.S: it was a great session. :)
`----

As I promised, you can download a copy of my presentation and example
code at

http://www.gnu-india.org/gnu/Hacking-GNU.pdf
http://www.gnu-india.org/gnu/Hacking-GNU.sxi
http://www.gnu-india.org/gnu/data-server.tgz

OK, here is a slightly revised version of your minutes (actually
documentation :) 
========================================================================
Hacking GNU/HURD 
================
as a part of GLV@ILUGC

Speaker: Anand Babu
Email: ab (at) gnu.org.in
Date: 05 FEB 2005
Location: csd 320, Tenet seminar hall, IITM.


For all those who couldn't make it, you did miss a lot. Cheer up
though, I will try my best to bridge that gap.


Brief Introduction:
-------------------

The Hurd is a collection of servers that run on the Mach microkernel
to implement file systems, network protocols, file access control, and
other features that are implemented by the monolithic Unix kernel or
similar kernels (such as Linux). GNU Mach 1.x was derived from CMU
Mach 3. There is also a GNU Mach 2.0 branch based on OSKit from Utah
(OSKit has Mach 4 code). L4 (another microkernel) is currently being
developed, and GNU Hurd hasn't booted on L4 yet. With L4 lacking the
basic framework, AB recommended hacking on GNU Mach rather than L4.
Moreover, getting a working GNU Hurd can be much faster with GNU Mach,
as most of the ground work has already been done. What GNU Hurd lacks
as of now is mainly device driver support and performance
optimization.

AB also emphasized the need to get GNU Hurd working (it already works
fine; I mean up to industry standard) within the next two years, in
order to get on par with the Linux kernel ;)

SNIP: L4 tasks manage their resources themselves, unlike in the Linux
	kernel.

Though they are called microkernels, they aren't necessarily small in
size. GNU Mach is big because of the Linux device drivers in it (more
about this later).


Drivers in User Space Vs Drivers in Kernel Space:
-------------------------------------------------
AB spoke about the possibility of running device drivers in user
space, unlike the Linux kernel where device drivers run in kernel
space, though he recommends hacking drivers in kernel space to get
them ready initially. Mach tasks communicate over IPC, and the
advantages of user-space drivers outweigh the overhead of the
abstraction. Performance doesn't suffer that much, because IPC is
essentially an abstraction with the "mach_msg" system call at its
heart. (Messages are not like packets transferred between two socket
applications.) If the device drivers are made to run in user space,
gdb could be run on a device driver to debug it.

SNIP: L4 doesn't copy/queue the messages and is totally synchronous.
	In L4 the designers are planning to use User space drivers.


Micro kernel Vs Monolithic Kernel:
---------------------------------
No it wasn't a flame war ;)!! 

Though Linus calls Linux modular, it isn't actually modular because at
run time the kernel runs as one big program (it is called modular
because of the loadable modules aspect of Linux). The disadvantage is
that as the kernel gets bigger, it becomes more difficult to maintain.
On the other hand, the microkernel as such is small. The modules are
in user space, each a separate program in its own address space, which
helps in maintaining them.


Important concepts discussed about MACH:
----------------------------------------
-Threads
-Tasks
-Ports
-Message & Message queue.

Threads and tasks are exactly what we think they are. A task is like a
container of threads, port rights... The new and interesting things
were ports, messages and message passing. Threads are the basic unit
of execution. A task can have one or more threads. However, tasks are
not like Unix processes. They do not have a pid, gid ...

Ports:
******

Unlike the ports we are familiar with, ports in MACH are the portals
through which different tasks communicate with each other. The messages
to different tasks are sent and received through ports. Ports are
message queues with some properties associated with them (like the
message count).

Port rights:
************
These rights decide whether you can send/receive messages
to/from a task (through the port) or not.

	Different rights:
		Send right      - right to send messages to a port.
		Receive right   - right to receive messages from a port.
		Send-once right - a send right that is revoked after
		                  being used once.
		Port set        - collection of receive rights.
        

Just as one program's file descriptors on Linux cannot be used by
another program, the ports of a task are protected by the kernel. Port
rights are transferred through messages. Though the tasks decide which
rights to transfer, it is the kernel that actually performs the
transfer.
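
To make the three kinds of rights concrete, here is a minimal toy
model in Python. It is only a sketch: the names Port, Right, send and
receive are made up for illustration and are not Mach API. Real rights
live inside the kernel and are manipulated through system calls.

```python
from collections import deque


class Port:
    """Toy stand-in for a kernel port: just a message queue."""

    def __init__(self):
        self.queue = deque()


class Right:
    """A capability on a port, held by one task (hypothetical model)."""

    def __init__(self, port, kind):
        assert kind in ("send", "receive", "send-once")
        self.port = port
        self.kind = kind
        self.valid = True


def send(right, message):
    # Only a valid send or send-once right may enqueue a message.
    if not right.valid or right.kind not in ("send", "send-once"):
        raise PermissionError("no usable send right")
    right.port.queue.append(message)
    if right.kind == "send-once":
        right.valid = False  # consumed by its single use


def receive(right):
    # Only the holder of the receive right may dequeue messages.
    if not right.valid or right.kind != "receive":
        raise PermissionError("no receive right")
    return right.port.queue.popleft()


def demo_rights():
    port = Port()
    rx = Right(port, "receive")
    tx = Right(port, "send")
    once = Right(port, "send-once")
    send(tx, "hello")
    send(once, "reply")
    try:
        send(once, "again")
        second_use = "allowed"
    except PermissionError:
        second_use = "refused"
    return [receive(rx), receive(rx)], second_use
```

In this model the second use of the send-once right is refused, which
is the behaviour described in the list above.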

Port Name space:
****************
The port name space is a structure maintained by the kernel for each
task, which contains the different rights the task holds. Each
send-once right has a unique entry in the name space and hence has a
unique port name (discussed below). Similarly, each receive right has
a unique port name. Each port set will have a unique port name. Send
rights and receive rights to the same port have the same port name.
Remember, a receive right cannot exist both inside a port set and
outside it.

Port Name:
**********
Port names appear as numbers, like file descriptors. A port name is an
index into the task's port name space that identifies a port right.
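
The naming rules above can be sketched as a small per-task table. This
is a toy model under stated assumptions: PortNameSpace and
insert_right are hypothetical names, and handing out names as 1, 2, 3
is a simplification, not how the kernel actually picks them.

```python
class PortNameSpace:
    """Toy per-task table: port name (an int) -> (port, set of right kinds)."""

    def __init__(self):
        self.entries = {}
        self.next_name = 1

    def insert_right(self, port, kind):
        # Send and receive rights to the same port share one name.
        if kind in ("send", "receive"):
            for name, (held_port, kinds) in self.entries.items():
                if held_port is port and "send-once" not in kinds:
                    kinds.add(kind)
                    return name
        # Every send-once right, and the first right to a port,
        # gets a fresh name.
        name = self.next_name
        self.next_name += 1
        self.entries[name] = (port, {kind})
        return name


def demo_names():
    space = PortNameSpace()
    port_a, port_b = object(), object()
    return [
        space.insert_right(port_a, "receive"),
        space.insert_right(port_a, "send"),       # same name as the receive right
        space.insert_right(port_a, "send-once"),  # fresh name
        space.insert_right(port_a, "send-once"),  # another fresh name
        space.insert_right(port_b, "send"),       # different port, fresh name
    ]
```

Like file descriptors, the names are only meaningful inside the task
that owns the table.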

Port Set:
*********
Port sets were introduced with CMU Mach 3. A port set is a collection
of receive rights for a task. You can let the kernel listen on the
entire range of receive rights in a port set and notify the task
whenever a message is ready. (Similar to the "select" system call.)
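
The select-like behaviour can be sketched as follows. This is a toy
model only: PortSet and its methods are hypothetical names, and a real
receive on a port set blocks in the kernel until a message arrives,
whereas this sketch simply returns None when nothing is queued.

```python
from collections import deque


class PortSet:
    """Toy port set: receive from whichever member has a message queued."""

    def __init__(self):
        self.members = {}  # port name -> message queue of that port

    def add(self, name, queue):
        self.members[name] = queue

    def receive(self):
        # Scan the members and hand back the first pending message,
        # together with the name of the port it arrived on.
        for name, queue in self.members.items():
            if queue:
                return name, queue.popleft()
        return None  # nothing pending (the real call would block)


def demo_port_set():
    queue_a, queue_b = deque(), deque()
    port_set = PortSet()
    port_set.add(10, queue_a)
    port_set.add(11, queue_b)
    queue_b.append("to b")
    first = port_set.receive()
    queue_a.append("to a")
    second = port_set.receive()
    third = port_set.receive()
    return first, second, third
```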

Messages:
*********
Messages consist of a mach message header, and data part. The data
part in turn has data type field, count field and a data field.

e.g.
data type - int
count     - 10
data      - 0..9

As you guessed, an array of 10 ints. :)
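
The header-plus-typed-data layout can be sketched with Python's struct
module. The field layout here is an assumption made for illustration
(two 32-bit header fields, then type, count and data); it is not the
real Mach message header, and TYPE_INT32 is a made-up type code.

```python
import struct

TYPE_INT32 = 2  # made-up type code for this sketch


def pack_message(dest_port, msg_id, values):
    """Build header + (data type, count) descriptor + data."""
    header = struct.pack("<II", dest_port, msg_id)
    descriptor = struct.pack("<II", TYPE_INT32, len(values))
    data = struct.pack("<%di" % len(values), *values)
    return header + descriptor + data


def unpack_message(raw):
    """Inverse of pack_message for the toy layout above."""
    dest_port, msg_id = struct.unpack_from("<II", raw, 0)
    type_code, count = struct.unpack_from("<II", raw, 8)
    values = list(struct.unpack_from("<%di" % count, raw, 16))
    return dest_port, msg_id, type_code, values
```

Packing the array 0..9 from the example gives a 56-byte message in
this layout: 8 bytes of header, 8 bytes of type and count, and 40
bytes of data.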


How ports are handled in MACH?
------------------------------
In GNU Mach, a task can communicate with another task only through
ports. More precisely, it needs to have send rights to the other
task's port. So when a task is created in Mach, the kernel creates a
port called the task port (similarly a thread port for threads), and
the send rights for that port are placed in the task's task structure.
Similarly two other ports are created, namely the bootstrap port and
the exception port.

The task in turn calls the routine mach_task_self(), which provides
the send rights to the task port.

Other basic requirements for a task, such as contacting the file
system server (more about servers later), are taken care of by
inheriting the environment ports, which are created during GNU Hurd
initialization.

SNIP: Look at task.c and ipc_tt.c
	Line 86 of ipc_tt.c deals with the above paragraph.

MiG:
----
MiG is the Mach 3.0 interface generator, as maintained by the GNU
Hurd developers for the GNU project.

The interface generator produces stub code from interface definition
(.defs) files. The stub code makes it easy to implement and use Mach
interfaces as remote procedure calls (RPC). 

Generally a .defs file is written which declares the functions to be
implemented, and it is compiled with MiG. The output is two files, a
server part and a client part. The server part contains the function
prototypes. The function definitions are to be filled in by the
developer to suit their requirements. The client part contains the
routines to call the functions implemented by the server.

So the actual message passing part is implemented by MiG.

REFER TO data-server.c and data-client.c example.
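
For readers without the tarball at hand, the division of labour
between the generated stubs and the developer's code can be sketched
as a toy model. Everything named here (data_add, ROUTINES,
server_demux, make_client_stub, message id 1000) is hypothetical; this
is not the content of data-server.c or data-client.c, and real MiG
output is C code that sends Mach messages instead of calling the
server directly.

```python
# Server side: the developer fills in the body of each routine.
def data_add(a, b):
    return a + b


# What the generated server part conceptually provides:
# a table from message id to routine, plus a demultiplexer.
ROUTINES = {1000: data_add}


def server_demux(message):
    msg_id, args = message
    return ROUTINES[msg_id](*args)


# What the generated client part conceptually provides: a function that
# looks like a local call but packs its arguments into a message.
def make_client_stub(msg_id, transport):
    def stub(*args):
        return transport((msg_id, args))
    return stub


# In this sketch the "transport" is a direct call into the demultiplexer;
# in Mach it would be a message sent to the server's port.
add = make_client_stub(1000, server_demux)
```

Calling add(2, 3) then looks like an ordinary function call to the
client, while the work is done by the routine on the server side.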


The Requirement- Device Drivers:
--------------------------------
Shantanu Goel took the Linux (1.3.35) device drivers for block, SCSI,
PCI and ISA devices and got them working with CMU Mach. The advantage
is that no changes were required to the device driver source. The
wrapper took care of initialization, kernel memory allocation and I/O
blocking. Currently GNU Mach has drivers from the Linux 2.0 and 2.2
kernels.

Now the requirement is to port the Linux 2.6 kernel device drivers to
GNU Mach. The emulation will produce a performance drop of a few
microseconds.

The GNU Hurd:
-------------

Servers:
********
In the GNU Hurd you have servers, e.g. the filesystem server, the auth
server, the TCP/IP stack, block drivers, etc. The communication (IPC)
interface of each is defined by the corresponding MiG .defs files.
Each of these servers takes care of a specialized task, and as a whole
they implement a POSIX system. In future GNU Hurd will also support a
distributed model called "collectives".

Translators:
************
Translators are hooks in the filesystem which in turn link to a
task. e.g. You already have the ftpfs and httpfs file systems, with
which you can mount a remote FTP or HTTP server locally and run tar or
grep on it. Similarly,

$ settrans /root/.mbox /hurd/pop3fs --server=mail.gnu.org --user=....

mounts a remote POP3 connection as a local mbox file.

Once a translator is set, you can use the normal file system commands
like cat, ls, grep etc. on its node (say, an ftpfs translator set on
/tmp/ftp) and it will behave like a local file system.

Similarly translators could be written for gzip, http ... the list
ends with your creativity. A whole set of libraries is available for
this (e.g. libdiskfs, libnetfs ...)

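	Two types of translators:

	*Active translators - these are lost with reboot, and
	                     showtrans does not work on them.
	*Passive translators - these are recorded in the file system
	                     node itself, so they survive a reboot and
	                     are started when the node is accessed.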

CASE STUDY:
~~~~~~~~~~~
The famous /dev/null in Linux is implemented as a translator in
HURD. If you run ps you can see /hurd/null running as a task.

This is hooked to /dev/null as a translator.


If you have understood all this, then you should be wondering how a
task (which acts as a translator) is able to understand cat, grep, ls
... because I wondered too, and AB explained it beautifully, which
takes us to the next topic.


POSIX implementation on HURD:
-----------------------------
Welcome to real world!!! or rather How deep is the rabbit hole??..:)

Before we get into this, there is a function that needs to be
discussed: dirlookup(). This function returns send rights (read the
important concepts in Mach above) to a particular task. (Similar to a
DNS server, which returns an IP address for a host name.)

The libc calls like open, read, write ... in turn run a dirlookup() to
get the rights from the required server. e.g. if there is a regular
file /tmp/foo, an open on it would request rights from the filesystem
server, but if /tmp/foo is a translator linked with foobar, then the
rights would be returned for foobar.

Now both the filesystem server and foobar will have to implement the
same set of functions, eg. io_read, io_stat etc.

How is this done, you might ask?

Remember the .defs file? (read MiG above)
It is the developer who fills in the function definitions (go back and
read MiG once more if in doubt). So in foobar, for io_read, I can do a
socket connect, or anything I like.

So when you run cat, you call open, which does a dirlookup and returns
the send right (either to a translator task or to a server), and then
open in turn calls io_open, io_read ... on that server.
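
The whole lookup path can be sketched as a toy model. All class and
method names below are hypothetical, and plain Python objects stand in
for send rights; the point is only that the caller does not care which
server answers, as long as it implements the same io_read interface.

```python
class FileSystemServer:
    """Stands in for the ordinary filesystem server."""

    def __init__(self, files):
        self.files = files

    def io_read(self, path):
        return self.files[path]


class FoobarTranslator:
    """Stands in for a translator task; it fills in io_read its own way."""

    def io_read(self, path):
        return b"generated by foobar\n"


class ToyNamespace:
    def __init__(self, root_fs):
        self.root_fs = root_fs
        self.translators = {}  # path -> server hooked on that node

    def settrans(self, path, server):
        self.translators[path] = server

    def dirlookup(self, path):
        # Return the "send right": the translator if one is set on
        # this node, otherwise the filesystem server.
        return self.translators.get(path, self.root_fs)


def cat(namespace, path):
    server = namespace.dirlookup(path)  # what open() does first
    return server.io_read(path)         # same call, whoever answers
```

Running cat on /tmp/foo in this model returns the stored bytes for a
regular file, and whatever foobar chooses to produce once a translator
has been set on the same path.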

and we already know how to write a translator!!! :).. even though you
missed the meet. LUCK HUH!!
:)

Cool tool librpci:
------------------
I don't really remember much about this (my brain was supercharged by
now, after 4.5 hrs of HURD :)), but this tool will allow you to do a
whole lot of things like stealing the task port and recreating
errors. e.g. if a server seems buggy to you, you could take a dump and
then run librpci with this dump; librpci will simulate the previous
situation with the help of the dump file. This should recreate the
errors, making the debugging easier.

Closing notes:
--------------
If you have installed GNU/Hurd, sshd will not run because /dev/urandom
is not yet implemented in HURD. So you could write a binary which
returns something random (or the same thing every time :) ) and create
a symlink to it as /dev/urandom. You will then have sshd running fine.

Alright, so you didn't attend the meet or you attended it and slept
throughout, but at least you have read this document and come this
far. That gives more meaning to my effort in typing this
document. Thank you.


Mohammed Riyaz P.
(HAPPY HACKING :) (quoting RMS))


==============================================================================

Here is some more info I wrote. Usually you take care of this after
you finish the installation and log in for the first time.


SWAP and CDROM
---------------
Also don't forget to add SWAP entries in your /etc/fstab after
installation completes.

You need to create devices before you use them.
If you have a swap partition say hd0s2 (hda2) and cdrom as hd2 (hdc),
then 
# cd /dev
# ./MAKEDEV hd0s2
# ./MAKEDEV hd2

and add these two lines to /etc/fstab

  /dev/hd0s2  none    swap    sw   0   0
  /dev/hd2    /cdrom  iso9660 ro   0   0

# swapon -a

APT
----
If your network card works, add this to your /etc/apt/sources.list

  deb http://ftp.gnuab.org/debian unreleased main
  deb http://ftp.debian.org/debian unstable main

Or if you have set up APT to use the CDROM instead, use this
  deb file:/cdrom/debian unstable main contrib local non-US/main non-US/contrib

  
NETWORK
--------
Setup networking like this: (Choose IP addresses appropriately) 

# settrans -fgap /servers/socket/2 /hurd/pfinet -i eth0 \
           -a 192.168.1.54 -g 192.168.1.1 -m 255.255.255.0
# echo "nameserver 202.54.15.1" >> /etc/resolv.conf
# ping www.gnu.org

SSH
----
ssh installation will fail because of the missing
/dev/urandom. Before you proceed with the ssh installation,
temporarily create a symbolic link to some binary file in place of
urandom. This is just a dirty hack..

# ln -s /bin/bash /dev/urandom

CONSOLE
--------
I am comfortable with "screen" package, except I re-map C-a prefix
with 

Happy Hacking,
-- 
Anand Babu
Free as in Freedom <www.gnu.org>


