From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alfonso <a.martone@retepnet.it>
Subject: _main and startup code
Date: Wed, 12 Feb 2003 12:15:15 +0100
Sender: linux-8086-owner@vger.kernel.org
Message-ID: <200302121215.15299.a.martone@retepnet.it>
References: <200301301341.h0UDf5Z25323@preshak.recjai.ac.in> <200302060016.28257.a.martone@retepnet.it> <1044942743.1550.70.camel@Castle.goembel>
Reply-To: a.martone@retepnet.it
Mime-Version: 1.0
Content-Transfer-Encoding: 7BIT
Return-path: <linux-8086-owner@vger.kernel.org>
In-Reply-To: <1044942743.1550.70.camel@Castle.goembel>
List-Id: <linux-8086.vger.kernel.org>
Content-Type: text/plain; charset="us-ascii"
To: linux-8086@vger.kernel.org

> Would the program start symbol also need to be _main,
> or is that only needed by the C runtime library?


The name "main" is internal to the C compiler and its libraries.

As of ELKS 0.1.1, an executable always starts at its CS:0000, where bcc 
places the startup code (which adjusts the stack parameters, calls _main 
and then issues an "exit" syscall).

In the Minix header there are two "Unused" fields. I found them defined 
somewhere as "starting address" and "length of symbol table (or DLL 
data) appended to the file".

Since the compiler sets them to zero by default, the patch below should 
be backward- and forward-compatible (I'm sorry, I don't know how to 
use those weird things like diff, patch, cvs, etc, so I show it to you 
in this jerky mode):


in the file  include/linuxmt/minix.h
change the line #23 from:
    unsigned long     unused;
to:
    unsigned long     startaddr;


in the file  fs/exec.c
add this line after line #348 (right after the "tregs->cs=cseg" line):
    tregs->ip = mh.startaddr;    /* guaranteed good at compile time */


This implements a "start address" that may differ from CS:0000.
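
To make the effect of the patch concrete, here is a minimal sketch in 
plain C. The struct layout and every field name except startaddr and 
tseg are my assumption of what the header looks like, and initial_ip() 
is a hypothetical helper, not a function from fs/exec.c. It only shows 
that a zeroed field keeps the old CS:0000 behaviour:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the 32-byte Minix executable header. Fixed-width
 * types stand in for the kernel's "unsigned long" (32 bits on the 8086).
 * Field names other than startaddr and tseg are assumptions. */
struct minix_hdr_sketch {
    uint32_t type;
    uint8_t  hlen;
    uint8_t  reserved1;
    uint16_t version;
    uint32_t tseg;       /* size of the text segment */
    uint32_t dseg;
    uint32_t bseg;
    uint32_t startaddr;  /* formerly the first "unused" field */
    uint32_t chmem;
    uint32_t unused2;    /* the second "unused" field, left alone */
};

/* Hypothetical helper: the value the patch would store in tregs->ip.
 * Existing binaries carry 0 here, so they still start at CS:0000. */
uint16_t initial_ip(const struct minix_hdr_sketch *mh)
{
    return (uint16_t)mh->startaddr;
}
```

With a header produced by today's compiler (field set to zero) 
initial_ip() returns 0, which is why the change should not break 
existing binaries.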

But bcc won't support it. Every program compiled by bcc has a short 
initial startup section which rearranges the argc/argv/envp parameters 
for calling _main (first part) and issues an _exit when main() returns 
(second part).
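
The two parts of the startup section can be modelled in C roughly as 
follows. This is only a sketch: the real code is a handful of assembly 
instructions, and I am assuming the kernel hands over argc followed by 
a vector holding the argv pointers, a NULL, the envp pointers and a 
final NULL. All the names (startup_model, record_exit, demo_main) are 
hypothetical:

```c
#include <stddef.h>

typedef int  (*main_fn)(int argc, char **argv, char **envp);
typedef void (*exit_fn)(int status);

/* Hypothetical C model of the startup section. `vec` is assumed to
 * hold: argv[0..argc-1], NULL, envp[0..], NULL. */
void startup_model(int argc, char **vec, main_fn prog_main, exit_fn sys_exit)
{
    /* First part: derive the three arguments of main(). */
    char **argv = vec;
    char **envp = vec + argc + 1;  /* skip the argv entries and their NULL */

    /* Second part: pass main()'s return value to the exit syscall. */
    sys_exit(prog_main(argc, argv, envp));
}

/* Demo helpers, only for trying the model out on a normal host. */
static int recorded_status = -1;

void record_exit(int status)
{
    recorded_status = status;
}

int demo_main(int argc, char **argv, char **envp)
{
    (void)argv;
    /* Return 40 + argc when the first environment string is visible. */
    return (envp[0] != NULL) ? 40 + argc : 0;
}
```

Calling startup_model(2, vec, demo_main, record_exit) with vec set to 
{ "prog", "arg", NULL, "HOME=/root", NULL } leaves 42 in 
recorded_status, i.e. the value returned by main() ends up as the exit 
status.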

Yes, the "rearrange" could easily be implemented in the ELKS kernel. 
When the program terminates through exit(), the second startup part is 
not needed. But if main() returns a value, that second part is needed.

An ugly hack could be patching bcc so that "main()" gets a different 
stack frame; maybe something like:

_main: mov bp, sp                 ; opening code: no need to push bp

       [maybe: sub sp, NN -- stack space needed for local variables]
       [...rest of main()...]

       mov bx, ax                 ; closing code: bp is no longer needed
       mov ax, 1
       int 0x80                   ; call exit(main())

This is transparent to the programmer except in the case of a recursive 
call to _main (well, in sixteen years of C programming I have never 
found a case of a recursive _main call).

But this is an ugly hack because it needs a different stack frame for 
_main (the compiler would have to check the name of every function)...!


A decent hack should provide, when main() returns, at least the three 
instructions of "closing code" above. But this means either adding a 
"push return-address-to-closing-code" somewhere in the kernel, or... 
simply placing a "call _main -- closing code follows" at the beginning 
of the program (which can start at CS:0000 without any extra work).


I think I would not hack anything. The startup code often does a little 
more than calling _main and exiting with its return value. For example, 
it could save envp in a static variable so that a getenv() library 
function call can fetch an environment string at any moment (without 
_main support).
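
As an illustration of that last point, here is a small sketch of how 
startup code and library could cooperate. The names saved_envp, 
startup_save_env and sketch_getenv are hypothetical; I am not claiming 
this is how the bcc library actually does it:

```c
#include <stddef.h>
#include <string.h>

/* Set once by the startup code, before main() is called. */
static char **saved_envp;

void startup_save_env(char **envp)
{
    saved_envp = envp;
}

/* getenv()-like lookup that needs no help from main(). */
char *sketch_getenv(const char *name)
{
    size_t len = strlen(name);
    char **p;

    if (saved_envp == NULL)
        return NULL;               /* startup code has not run yet */

    for (p = saved_envp; *p != NULL; p++) {
        /* An entry matches only if it is exactly "name=..." */
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;   /* point just past the '=' */
    }
    return NULL;
}
```

Once startup_save_env() has been called with an environment such as 
{ "HOME=/root", "TERM=ansi", NULL }, sketch_getenv("TERM") gives "ansi" 
from anywhere in the program, even if main() was declared without the 
envp argument.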


These notes demonstrate that, at least for C programs (not only 
bcc-compiled programs), a "start address" different from CS:0000 is not 
really needed. Maybe a compiler for some other programming language 
could take some advantage of it.

We still have not discussed assembler-written programs. An assembler 
program does not always need to start at CS:0000 and/or with some 
startup code. The little patch above seems to suffice, but then I would 
add a little extra check in fs/exec.c (to verify that mh.startaddr is 
less than mh.tseg).
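
The extra check could look something like the sketch below. The 
function name check_startaddr is hypothetical, and returning -ENOEXEC 
is only my assumption of a sensible error code, not what fs/exec.c 
does today:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical check: accept the header only if the start address
 * falls inside the text segment. Returns 0 when it does, -ENOEXEC
 * (an assumed choice of error code) when it does not. */
int check_startaddr(uint32_t startaddr, uint32_t tseg)
{
    if (startaddr >= tseg)
        return -ENOEXEC;   /* entry point lies outside the code */
    return 0;
}
```

Existing binaries (startaddr equal to zero) pass as long as they have 
a non-empty text segment, so the check does not affect them.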