* Re: Why does C3 CPU downgrade in kernel 2.4.20?
From: Denis Vlasenko @ 2002-12-14 16:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Daniel Egger, Dave Jones, Joseph, linux-kernel@vger.kernel.org
In-Reply-To: <20021212180957.GA184@elf.ucw.cz>

On 12 December 2002 18:09, Pavel Machek wrote:
> Hi!
>
> > > > I believe someone (Jeff Garzik?) benchmarked gcc code
> > > > generation, and the C3 executed code scheduled for a 486 faster
> > > > than it did for -m586
> > > > I'm not sure about the alignment flags. I've been meaning to
> > > > look into that myself...
> > >
> > > Interesting. I have no clue about which C3 you're talking about
> > > here but a VIA Ezra has all 686 instructions including cmov and
> > > thus optimising for PPro works best for me.
> > >
> > > Prolly I would have to do more benchmarking to find out about
> > > alignment advantages.
> >
> > I heard cmovs are microcoded in Centaurs.
> >
> > s...l...o...w...
>
> It still might be faster than a branch... or not if centaurs are
> really that simple.
> 								Pavel

I did not measure it myself, but rumors were they took tens of cycles.

Well, an IFcc prefix meaning 'execute next instruction if' would be
way cooler than CMOVcc. Because I want CADDcc, CTESTcc, CBSWAPcc too ;)

But all 1-byte opcodes are taken. So, given a sequence like

	Jcc	skip		# <- 2 byte opcode
	opcode	op1,op2
skip:

I think some CPU magic could detect such short jumps and handle them just
as if they were such a prefix, saving a potential branch (mis-)prediction.
--
vda
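
For readers following the cmov-versus-branch argument, here is a minimal C
sketch of the two code shapes being compared. The function names are made up
for illustration, and whether a compiler emits Jcc, CMOVcc or something else
for either form depends on the compiler, its flags and the target; nothing
here says which one is faster on a C3.

```c
#include <stdint.h>

/* Branchy form: a compiler will often emit a short Jcc that skips one
 * instruction, i.e. the "Jcc skip / opcode / skip:" pattern. */
int32_t add_if_branch(int32_t acc, int32_t x, int cond)
{
	if (cond)
		acc += x;
	return acc;
}

/* Branchless form: turn the condition into an all-ones or all-zeros
 * mask and apply it, so no jump is needed to express the conditional. */
int32_t add_if_mask(int32_t acc, int32_t x, int cond)
{
	int32_t mask = -(int32_t)(cond != 0);	/* 0 or -1 (all bits set) */

	return acc + (x & mask);
}
```

Both functions return the same value for any inputs that do not overflow; the
mask form is roughly what a hypothetical "CADDcc" would give you in one
instruction.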

^ permalink raw reply

* [parisc-linux] problems with PCI IDE controller
From: Joerg Steindlberger @ 2002-12-14 11:48 UTC (permalink / raw)
  To: PARISC-LINUX

Hi,

I have some problems using an IDE controller with my C240 (same in
my C360). The controller itself works fine on x86 machines. It's a
non-RAID controller produced by Promise. The kernel is
linux-2.4.20-pa14 with all the IDE stuff built as kernel modules.
Earlier replies to my problem say that it could be a problem with the
probing of Intel's default I/O ports for IDE controllers.

Here is what I did: Building the kernel - booting the system - doing a 
modprobe -k ide-disk (from remote) - pressing Transfer Of Control - 
interrupting boot sequence and typing ser pim. And here is the output of 
the console: (thank You for helping - Joerg)

hp-c240 login:
Debian GNU/Linux testing/unstable hp-c240 tts/0

hp-c240 login: Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PDC20268: IDE controller on PCI bus 00 dev 08
PDC20268: chipset revision 2
PDC20268: not 100% native mode: will probe irqs later


Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PDC20268: IDE controller on PCI bus 00 dev 08
PDC20268: chipset revision 2
PDC20268: not 100% native mode: will probe irqs later

Stack Dump:
  2f1a0900:  00000002 73747576 6f707172 6b6c6d6e
  2f1a08f0:  6768696a 1010fdec 00000000 0000000f
  2f1a08e0:  103dee76 10323810 10323b68 00000036
  2f1a08d0:  00000061 00000000 00000001 10323810
  2f1a08c0:  00000000 00000002 103df23f 00000010
  2f1a08b0:  103dee40 0002423c 00000400 1037da4c

Kernel addresses on the stack:
  [<1010fdec>]  [<0002423c>]  [<0002ec19>]  [<1024ba78>]
  [<00034b44>]  [<000204c8>]  [<00034b44>]  [<0002ebe8>]
  [<0002e898>]  [<1011cd58>]  [<102d0cec>]  [<00034b44>]
  [<00036010>]  [<1011ce84>]  [<00034b44>]  [<00036010>]
  [<000300ea>]  [<0002120c>]  [<00036010>]  [<0002ee98>]
  [<0002e898>]  [<1011cd58>]  [<00036010>]  [<00036010>]
  [<0002c180>]  [<0002ba01>]  [<000213e8>]  [<00020003>]
  [<10135ddc>]  [<0000fe30>]  [<0001eac3>]  [<00017af0>]
  [<0002b9cc>]  [<0002ba04>]  [<00036010>]  [<00010504>]
  [<1011de54>]  [<0002d000>]  [<0002c5c0>]  [<10104904>]
  [<0002c5b4>]  [<0002c5b4>]  [<0000a060>]  [<0002c5d0>]
  [<10108f90>]  [<10108084>]  [<10107cf4>]  [<1013fd44>]
  [<1013e638>]  [<1014ed24>]  [<0002c5b4>]  [<0001436b>]
  [<101205a8>]

High Priority Machine Check (HPMC): Code=1 regs=10313080 (Addr=00000000)

      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001000000011100001100 Not tainted
r00-03  00000000 10320010 1010f840 000000ff
r04-07  0000fd00 0000fd1f 109ab400 0002e898
r08-11  109ab400 00000061 00000000 00000001
r12-15  00000000 2fda3ce0 2f1a0550 00000003
r16-19  00044c08 00045408 00000000 f200006f
r20-23  0000000e 0000000f 1026bee4 10323810
r24-27  0000fd1f 00000003 109ac160 10310010
r28-31  0000006f ffffe003 2f1a0900 1010f840
sr0-3   00000000 00000265 00000000 00000265
sr4-7   00000000 00000000 00000000 00000000

IASQ: 00000000 00000000 IAOQ: 1026bf10 1026bf14
  IIR: 00141860    ISR: 9227ffc8  IOR: 00000064
  CPU:        0   CR30: 2f1a0000 CR31: 103d0000
  ORIG_R28: 00000000



Firmware Version  5.8

Duplex Console IO Dependent Code (IODC) revision 1

------------------------------------------------------------------------------
    (c) Copyright 1995-1998, Hewlett-Packard Company, All rights reserved
------------------------------------------------------------------------------

   Processor   Speed            State           Coprocessor State  I/D Cache
   ---------  --------   ---------------------  -----------------  -------------
       0      236 MHz    Active                 Functional         2 MB/2 MB

   Central Bus Speed (in MHz) :        118

   Available memory (bytes)    : 536870912
   Good memory required (bytes):   44974080

   Primary boot path:    LAN.0.0.0.0.0.0
   Alternate boot path:  FWSCSI.6.0
   Console path:         GRAPHICS(4)
   Keyboard path:        PS2

CPU 0
WARNING:  Self tests have been disabled as a result of FASTBOOT
           being enabled.  To enable self tests, use the FASTBOOT
           command in the CONFIGURATION menu and reboot the system.


Processor is booting from first available device.

To discontinue, press any key within 10 seconds.

Boot terminated.


------- Main Menu -------------------------------------------------------------

         Command                         Description
         -------                         -----------
         BOot [PRI|ALT|<path>]           Boot from specified path
         PAth [PRI|ALT|CON|KEY] [<path>] Display or modify a path
         SEArch [DIsplay|IPL] [<path>]   Search for boot devices

         COnfiguration [<command>]       Access Configuration menu/commands
         INformation [<command>]         Access Information menu/commands
         SERvice [<command>]             Access Service menu/commands

         DIsplay                         Redisplay the current menu
         HElp [<menu>|<command>]         Display help for menu or command
         RESET                           Restart the system
-------
Main Menu: Enter command > ser pim

PROCESSOR PIM INFORMATION

-----------------  Processor 0 HPMC Information ------------------

Timestamp =   Fri Dec  13 23:00:06 GMT 2002    (20:02:12:13:23:00:06)

HPMC Chassis Codes = 0xcbf0  0x5007  0x5408  0x5508  0xcbfb

General Registers 0 - 31
00-03   0000000000000000  0000000010320010  000000001010f840  00000000000000ff
04-07   000000000000fd00  000000000000fd1f  00000000109ab400  000000000002e898
08-11   00000000109ab400  0000000000000061  0000000000000000  0000000000000001
12-15   0000000000000000  000000002fda3ce0  000000002f1a0550  0000000000000003
16-19   0000000000044c08  0000000000045408  0000000000000000  00000000f200006f
20-23   000000000000000e  000000000000000f  000000001026bee4  0000000010323810
24-27   000000000000fd1f  0000000000000003  00000000109ac160  0000000010310010
28-31   000000000000006f  00000000ffffe003  000000002f1a0900  000000001010f840

<Press any key to continue (q to quit)>

Control Registers 0 - 31
00-03   0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07   0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11   00000000000004ca  0000000000000000  00000000000000c0  000000000000001f
12-15   0000000000000000  0000000000000000  0000000000107000  00000000f0000000
16-19   000000195c1e11d2  0000000000000000  000000001026bf10  0000000000141860
20-23   000000009227ffc8  c000000000000064  000000000004070c  0000000080000000
24-27   000000000032d000  000000001f208000  00000000ffffffff  00000000ffffffff
28-31   00000000ffffffff  00000000ffffffff  000000002f1a0000  00000000103d0000

Space Registers 0 - 7
00-03   00000000          00000265          00000000          00000265
04-07   00000000          00000000          00000000          00000000

<Press any key to continue (q to quit)>

IIA Space                    = 0x0000000000000000
IIA Offset                   = 0x000000001026bf14
Check Type                   = 0x20000000
CPU State                    = 0x9e000004
Cache Check                  = 0x00000000
TLB Check                    = 0x00000000
Bus Check                    = 0x0020000c
Assists Check                = 0x00000000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0xfffffffffffa0000
System Requestor Address     = 0x0000000000000000
Check Summary                = 0x8002000040004000
Available Memory             = 0x0000000000000000
CPU Diagnose Register 2      = 0x0501000000000004
CPU Status Register 0        = 0x4420c20000000000
CPU Status Register 1        = 0x8002000000000000
SADD LOG                     = 0x0800000000000000
Read Short LOG               = 0xc10010fff200006f

<Press any key to continue (q to quit)>

Memory Error Log Information:

Timestamp =   Fri Dec  13 23:00:07 GMT 2002    (20:02:12:13:23:00:07)

    No memory errors logged


I/O Module Error Log Information:

Timestamp =   Fri Dec  13 23:00:08 GMT 2002    (20:02:12:13:23:00:08)

Bus    HPA       Module Type      Path  Slt Md Sev  Estat Requestor  Responder
--- ---------- ---------------- -------- -- -- ---- ----- ---------- ----------
  0  0xfff88000 I/O Adapter      8         2  0  he   0x0d 0x00000000 0x00000000
  1  0xf203f000 Bus Converter    8/63     15  3  se   0x07 0xfffa0800 0xf200006f
  1  0xf2000000 Bus Bridge (PCI)                      0x07 0xf2000000 0x0000fd1f

PCI Device Failure Information

Physical Slot   Logical   Path
-------------  ----------------------------
    1            0/255/255/255/8/0/1/0



PCI Error Summary

A Processer IO error occurred. The GSC-PCI bridge was
the requestor. Error bit indicates a master timeout
or master Abort was received during a PCI transaction.

Bus    HPA       Module Type      Path  Slt Md Sev  Estat Requestor  Responder
--- ---------- ---------------- -------- -- -- ---- ----- ---------- ----------
  0  0xfff8a000 I/O Adapter      10        2  2  he   0x0d 0x00000000 0x00000000

<Press any key to continue (q to quit)>
Main Menu: Enter command >

^ permalink raw reply

* 2.5.51 cpufeatures.h
From: Margit Schubert-While @ 2002-12-14 11:55 UTC (permalink / raw)
  To: linux-kernel

Somewhat confused.
In include/asm-i386/cpufeature.h we have:
--snip--
#define X86_FEATURE_XMM2        (0*32+26) /* Streaming SIMD Extensions-2 */
#define X86_FEATURE_SELFSNOOP   (0*32+27) /* CPU self snoop */
#define X86_FEATURE_HT          (0*32+28) /* Hyper-Threading */
#define X86_FEATURE_ACC         (0*32+29) /* Automatic clock control */      <---- ******
#define X86_FEATURE_IA64        (0*32+30) /* IA-64 processor */
--end snip--

According to Intel specs, bit 29 is :
" TM  Thermal Monitor    The processor implements the thermal monitor automatic
   thermal control circuitry (TMM)"

The (wrong?) FEATURE ACC is used in
arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
arch/i386/kernel/cpu/mcheck/p4.c

Margit 
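
As background to the question above: the feature macros encode a position as
word*32+bit, and word 0 holds the Intel-defined flags from CPUID leaf 1 (EDX).
So (0*32+29) tests EDX bit 29 whatever the macro is called; the question is
only about the name and comment. A minimal sketch of how such an encoding is
decoded follows. The helper name and the array size are hypothetical, not the
kernel's own.

```c
#include <stdint.h>

/* Same encoding as the header: word index * 32 + bit index. */
#define X86_FEATURE_ACC		(0*32+29)

#define NCAPINTS_SKETCH		4	/* hypothetical array size, for illustration */

/* Return nonzero if the given feature bit is set in the capability words. */
int feature_is_set(const uint32_t *caps, unsigned int feature)
{
	unsigned int word = feature / 32;	/* which 32-bit word */
	unsigned int bit = feature % 32;	/* which bit inside that word */

	return (caps[word] >> bit) & 1u;
}
```

With caps[0] loaded from CPUID leaf 1 EDX, feature_is_set(caps,
X86_FEATURE_ACC) reports exactly the bit that the Intel documentation quoted
above calls TM.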


^ permalink raw reply

* Re: mmap() and NFS server performance
From: Trond Myklebust @ 2002-12-14 11:22 UTC (permalink / raw)
  To: Matthew Mitchell; +Cc: nfs
In-Reply-To: <3DFA4C9A.50101@geodev.com>

>>>>> " " == Matthew Mitchell <matthew@geodev.com> writes:

     > values.  These apps were originally written on Solaris with
     > Solaris NFS servers assumed to be the data source; the Sun guys
     > said that mmap would be much faster than read/write and they
     > were correct.  However, now that we have a few Linux NFS
     > servers, we're seeing the opposite.

As long as the clients are still Solaris, then the only difference can
be the network, and the server performance.

Of the 2, the bigger 'generic' troublemaker tends to be the network.
Solaris clients always tend to prefer NFS over TCP since that tends to
be more reliable on poor networks than does UDP. Unfortunately, NFS
over TCP on the server side is a fairly recent addition to Linux: it
only just made it into the stable release 2 weeks ago (when 2.4.20 was
released). To the best of my knowledge, none of the RedHat kernels
support it yet.

Cheers,
  Trond


-------------------------------------------------------
This sf.net email is sponsored by:
With Great Power, Comes Great Responsibility 
Learn to use your power at OSDN's High Performance Computing Channel
http://hpc.devchannel.org/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply

* Re: Not able to compile modutils-2.4.21-7.src.rpm
From: Gregoire Favre @ 2002-12-14 11:13 UTC (permalink / raw)
  To: Paolo Ciarrocchi; +Cc: linux-kernel
In-Reply-To: <20021214000944.30118.qmail@linuxmail.org>

On Sat, Dec 14, 2002 at 08:09:43AM +0800, Paolo Ciarrocchi wrote:
> Hi Rusty and Adam,
> I send you again this bug report.
> 
> [root@frodo module-init-tools-0.9.3]# rpm --rebuild /mnt/nt/linux/kernel/modules/modutils-2.4.21-7.src.rpm
> 
> gcc -O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586 -ffast-math -fno-strength-reduce -o modinfo modinfo.o ../obj/libobj.a ../util/libutil.a
> gcc -static -O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586 -ffast-math -fno-strength-reduce -o insmod.static insmod.o rmmod.o modprobe.o lsmod.o ksyms.o kallsyms.o ../obj/libobj.a ../util/libutil.a
> /usr/bin/ld: cannot find -lc
> collect2: ld returned 1 exit status
> make[1]: *** [insmod.static] Error 1

Just install glibc-static-devel...

	Grégoire
________________________________________________________________
http://ulima.unil.ch/greg ICQ:16624071 mailto:greg@ulima.unil.ch

^ permalink raw reply

* Re: [LARTC] ECN and ipitables: a political issue
From: Andrea Rossato @ 2002-12-14 10:52 UTC (permalink / raw)
  To: lartc
In-Reply-To: <marc-lartc-103920005828158@msgid-missing>

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

Andrea Rossato wrote:
> Being able to discriminate between good and bad guys it is possible 
> through a filtering rule,
> 
> iptables -A POSTROUTING -t mangle -p tcp -d bad.guy.com -j ECN 
> --ecn-tcp-remove.

> Now, the problem is the rule seems not to be working and I cannot 
> connect to those hosts unless turning ecn off (echo 0 > 
> /proc/sys/net/ipv4/tcp_ecn), the wrong solution. I suspect I'm getting 
> something wrong.

(just for documentation)

I was not getting anything wrong: there was a bug in checksum
recalculation after application of the ECN target.
Patrick McHardy promptly posted a patch to the netfilter-devel mailing list.
(the patch is attached to the present message)

Now the rule is working just fine!!

(should I submit a patch proposal to LARTC to document the issue?)

andrea


[-- Attachment #2: ipt_ECN.diff.1 --]
[-- Type: text/plain, Size: 499 bytes --]

--- net/ipv4/netfilter/ipt_ECN.c.orig	2002-12-09 23:14:20.000000000 +0100
+++ net/ipv4/netfilter/ipt_ECN.c	2002-12-09 23:13:27.000000000 +0100
@@ -88,8 +88,8 @@
 	}
 	
 	if (diffs[0] != *tcpflags) {
-		diffs[0] = htons(diffs[0]) ^ 0xFFFF;
-		diffs[1] = htons(*tcpflags);
+		diffs[0] = diffs[0] ^ 0xFFFF;
+		diffs[1] = *tcpflags;
 		tcph->check = csum_fold(csum_partial((char *)diffs,
 		                                    sizeof(diffs),
 		                                    tcph->check^0xFFFF));
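
For documentation purposes, the idea behind the patched code is an incremental
checksum update: instead of summing the whole segment again, the old 16-bit
word is taken out of the checksum and the new one is added (RFC 1624, eqn. 3).
Both words are read straight from the packet, so they are already in the same
byte order as the stored checksum; that is presumably why the extra htons()
calls had to go. The sketch below is a plain userspace illustration with
hypothetical helper names, not the kernel's csum_partial()/csum_fold().

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full checksum over n 16-bit words: complement of the one's-complement sum. */
uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~fold(sum);
}

/* Incremental update (RFC 1624, eqn. 3): HC' = ~(~HC + ~m + m'). */
uint16_t csum_update(uint16_t check, uint16_t old_word, uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_word;	/* remove the old word */
	sum += new_word;		/* add the new word */
	return (uint16_t)~fold(sum);
}
```

Changing one word and calling csum_update() with the old checksum should give
the same result as running csum_full() over the modified data, which is an easy
way to sanity-check this kind of code.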

^ permalink raw reply

* Re: usbaudio won't do 24-bit or 32-bit i/o...
From: Patrick Shirkey @ 2002-12-14 10:48 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel
In-Reply-To: <s5hd6o5u6hw.wl@alsa2.suse.de>

Takashi Iwai wrote:

> 
> could you tell me the rcs version numbers of the files on
> alsa-kernel/usb you are using (18 Nov.) ?  i've checked the files via
> cvs but i couldn't see any differences around the date.
> 

I have managed to test a more up-to-date version since then and it is the 
same. I have no idea when this happened though as I wasn't actively 
testing the 24 bit support until recently. I have a vague memory of 
testing it much earlier in the year but I think that was only for playback.

I currently cannot record from input 1 and 2 either.

Working are:

output 1,2,3,4
input 3,4

I also have to initialise both pcms with the small utility you made.




-- 
Patrick Shirkey - Boost Hardware Ltd.
For the discerning hardware connoisseur
Http://www.boosthardware.com
Http://www.djcj.org - The Linux Audio Users guide
========================================

Being on stage with the band in front of crowds shouting, "Get off! No! 
We want normal music!", I think that was more like acting than anything 
I've ever done.

Goldie, 8 Nov, 2002
The Scotsman




^ permalink raw reply

* Re: JDIRTY JWAIT errors in 2.4.19
From: Oleg Drokin @ 2002-12-14 10:55 UTC (permalink / raw)
  To: Tupshin Harper; +Cc: linux-kernel
In-Reply-To: <3DFAF9EF.6000501@tupshin.com>

Hello!

On Sat, Dec 14, 2002 at 01:29:19AM -0800, Tupshin Harper wrote:

> i'm getting the following error logged every 11 seconds or so:
> 
> Dec 14 01:00:49 phylum kernel: vs-3050: wait_buffer_until_released: nobody
> releases buffer (dev 16:01, size 4096, blocknr 2916352, count 3, list 0, state
> 0x10019, page c1172108, (UPTODATE, CLEAN, UNLOCKED)). Still waiting
> (-1320000000) JDIRTY !JWAIT
> Also, some processes are blocking, including ps (so I can't get a complete 
> process list), and shutdown.

Can you please execute SysRq-T, decode it with ksymoops and send us the result?

> such circumstances? Is this associated with reiserfs which all 
> partitions are running? A googling turned up one or two references to 

Yes, this is a reiserfs error message.

Thank you.

Bye,
    Oleg

^ permalink raw reply

* Re: Aic7xxx v6.2.22 and Aic79xx v1.3.0Alpha2 Released
From: Jens Axboe @ 2002-12-14 10:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Justin T. Gibbs, James Bottomley, linux-scsi
In-Reply-To: <20021213210643.B15074@infradead.org>

On Fri, Dec 13 2002, Christoph Hellwig wrote:
> On Thu, Dec 12, 2002 at 06:38:18PM +0100, Jens Axboe wrote:
> > > 1) You have to set a flag when you've already told the system your
> > >    DMA capabilities.
> > 
> > All drivers blindly copy the setting of the dma mask; it typically means
> > nothing.
> 
> Maybe it's a bit too dangerous for 2.4, but for 2.5 Justin's suggestion
> looks very nice, IMHO.

The whole discussion was about the 2.4 patch. For 2.5 I'm not averse to
doing it this way.

-- 
Jens Axboe


^ permalink raw reply

* kmalloc returning bogus values? Re: Kernel Oops  in 2.4.20
From: Oleg Drokin @ 2002-12-14 10:24 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: nfs, Chris Mason, linux-kernel
In-Reply-To: <20021214024428.1e496afb.philippe.gramoulle@mmania.com>

Hello!

   Looking at the trace, it seems that either kmalloc returned a bogus value,
   or kmalloc returned a good value that somehow got corrupted almost
   immediately in the register.

   No, I do not know how that might happen. Also, this oops seems to have
   nothing to do with NFS either; NFS code just happened to be active at the
   moment the interrupt from the network card arrived.

   I am CCing this to lkml in hopes that somebody there might have some ideas.

Bye,
    Oleg
On Sat, Dec 14, 2002 at 02:44:28AM +0100, Philippe Gramoullé wrote:
> Hi,
> 
> Scope: NFS server is a DELL 2550, SMP, 1 GB RAM, with a PowerVault RAID5 array
> controlled by a PERC3/QC (megaraid driver) with about 600 GB of data,
> plenty of directories and lots of small files.
> NIC is Intel eepro100
> 
> The server is used for serving web pages as well as FTP accounts for
> members of our online service. Filesystem is Reiserfs 3.6 format.
> 
> As for software, we use a plain 2.4.20 kernel with patches from Oleg Drokin
> and Chris Mason to enable data logging as well as quota V2.
> No single patch related to NFS was applied.
> 
> All clients use NFSV3 udp with standard mount options (rsize=8192,wsize=8192,hard,intr)
> and all 80 clients kernels mostly use 2.4.19-pre3 , (with nfs fixes at the time of its release)
> few with 2.4.18-pre[23]. There are 285 mounts from this server.
> 
> The NFS server is very busy, running 256 NFS threads with a usual load average of ~ 4/5.
> 
> With the help of the Reiserfs developers, we tried to chase down a bug that would cause the kernel
> to crash each time quotacheck was run, and that at first seemed to be related to the data logging
> feature (the partition is mounted with the data=ordered option).
> 
> After several fixes provided, latest quotacheck made the kernel oops and decoded oops doesn't
> show anything related to reiserfs but more likely something to do with NFS.
> 
> Decoded oops is provided below. The oops happens almost every time quotacheck is run,
> though sometimes it can take many hours for the bug to be triggered.
> Kernel doesn't seem to oops if quotacheck is not run.
> 
> I can provide more information if needed, just let me know.
> 
> Thanks,
> 
> Philippe
> 
> 
> Unable to handle kernel paging request at virtual address 0040648e
>  c020318a
>  *pde = 00000000
>  Oops: 0002
>  CPU:    0
>  EIP:    0010:[<c020318a>]    Not tainted
>  Using defaults from ksymoops -t elf32-i386 -a i386
>  EFLAGS: 00010206
>  eax: 00000011   ebx: ca38b02e   ecx: f5196000   edx: 00406480
>  esi: 0d00c1d5   edi: 00007458   ebp: 3500c1d5   esp: f5197cac
>  ds: 0018   es: 0018   ss: 0018
>  Process nfsd (pid: 394, stackpage=f5197000)
>  Stack: c2a39811 c02038c4 00000005 ca38b02e c2a39860 ca38b02e c2a39860 c2a39860 
>         f7e2b000 ca38b02e c0202900 c2a39860 f7e2b000 ca38b02e c2a39860 c2a39860 
>         c0202d1a c2a39860 00000000 c2a39860 00000800 00000008 00000001 c01f5aeb 
>  Call Trace:    [<c02038c4>] [<c0202900>] [<c0202d1a>] [<c01f5aeb>] [<c01f5b99>]
>    [<c01f5cce>] [<c011bc0f>] [<c01088bb>] [<c010adf8>] [<c0115610>] [<c012808c>]
>    [<c012894d>] [<c0128dd4>] [<c0128ccc>] [<c016c584>] [<c017170b>] [<c0168b23>]
>    [<c0236585>] [<c016890f>] [<c0105684>]
>  Code: 88 42 0e c6 42 0f 00 0f b7 43 04 66 89 42 0c 8b 43 0c 89 42 
> 
> 
>  >>EIP; c020318a <ip_frag_create+26/b0>   <=====
> 
>  >>ebx; ca38b02e <_end+a05832a/3860435c>
>  >>ecx; f5196000 <_end+34e632fc/3860435c>
>  >>esp; f5197cac <_end+34e64fa8/3860435c>
> 
>  Trace; c02038c4 <ip_defrag+bc/16b>
>  Trace; c0202900 <ip_local_deliver+1c/12c>
>  Trace; c0202d1a <ip_rcv+30a/38d>
>  Trace; c01f5aeb <netif_receive_skb+11f/14c>
>  Trace; c01f5b99 <process_backlog+81/124>
>  Trace; c01f5cce <net_rx_action+92/154>
>  Trace; c011bc0f <do_softirq+6f/cc>
>  Trace; c01088bb <do_IRQ+db/ec>
>  Trace; c010adf8 <call_do_IRQ+5/d>
>  Trace; c0115610 <.text.lock.sched+7a/1da>
>  Trace; c012808c <___wait_on_page+98/b8>
>  Trace; c012894d <do_generic_file_read+301/464>
>  Trace; c0128dd4 <generic_file_read+7c/110>
>  Trace; c0128ccc <file_read_actor+0/8c>
>  Trace; c016c584 <nfsd_read+1bc/260>
>  Trace; c017170b <nfsd3_proc_read+127/184>
>  Trace; c0168b23 <nfsd_dispatch+d3/19a>
>  Trace; c0236585 <svc_process+28d/4d4>
>  Trace; c016890f <nfsd+1f7/338>
>  Trace; c0105684 <kernel_thread+28/38>
> 
>  Code;  c020318a <ip_frag_create+26/b0>
>  00000000 <_EIP>:
>  Code;  c020318a <ip_frag_create+26/b0>   <=====
>     0:   88 42 0e                  mov    %al,0xe(%edx)   <=====
>  Code;  c020318d <ip_frag_create+29/b0>
>     3:   c6 42 0f 00               movb   $0x0,0xf(%edx)
>  Code;  c0203191 <ip_frag_create+2d/b0>
>     7:   0f b7 43 04               movzwl 0x4(%ebx),%eax
>  Code;  c0203195 <ip_frag_create+31/b0>
>     b:   66 89 42 0c               mov    %ax,0xc(%edx)
>  Code;  c0203199 <ip_frag_create+35/b0>
>     f:   8b 43 0c                  mov    0xc(%ebx),%eax
>  Code;  c020319c <ip_frag_create+38/b0>
>    12:   89 42 00                  mov    %eax,0x0(%edx)
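
The numbers in the decoded oops fit the "bogus pointer in a register" reading.
The faulting instruction is mov %al,0xe(%edx) with edx = 00406480, which gives
exactly the reported fault address 0040648e, and that value is far below where
kernel data lives, while ebx, ecx and esp all look like ordinary kernel
addresses. A small sketch of that arithmetic follows; the 0xC0000000 split is
the usual i386 default and is an assumption about this particular build, and
the helper names are made up.

```c
#include <stdint.h>

/* Default i386 kernel/user split in 2.4; an assumption about this build. */
#define PAGE_OFFSET_SKETCH	0xC0000000u

/* Effective address of "mov %al,disp(%edx)": base register plus displacement. */
uint32_t effective_address(uint32_t base, uint32_t disp)
{
	return base + disp;
}

/* Nonzero if addr lies in the kernel's part of the address space. */
int is_kernel_address(uint32_t addr)
{
	return addr >= PAGE_OFFSET_SKETCH;
}
```

Feeding in the register values from the oops: effective_address(0x00406480,
0xe) is 0x0040648e, is_kernel_address() is false for edx and true for ebx
(ca38b02e).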

^ permalink raw reply

* Re: via82cxxx probable incorrect detection (urgency=low)........
From: Vojtech Pavlik @ 2002-12-14 10:28 UTC (permalink / raw)
  To: Aryix; +Cc: linux-kernel
In-Reply-To: <courier.3DF9A5CE.00001047@softhome.net>

On Thu, Dec 12, 2002 at 08:14:53AM -0300, Aryix wrote:
> I am Argentino <- i don't speak english, please be patient
> 
> Kernel-2.4.20-final
> 
> lspci -vvv
> 
> 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 10) (prog-if 8a [Master SecP PriP])
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>         Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Latency: 32
>         Region 4: I/O ports at d000 [size=16]
>         Capabilities: [c0] Power Management version 2
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 
> 
> cat /proc/pci
>  Bus  0, device   7, function  0:
>     ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 33).
>   Bus  0, device   7, function  1:
>     IDE interface: VIA Technologies, Inc. VT82C586B PIPC Bus Master IDE (rev 16).
>       Master Capable.  Latency=32.  
>       I/O at 0xd000 [0xd00f].
> 
> dmesg 
> 
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> VP_IDE: IDE controller on PCI bus 00 dev 39
> VP_IDE: chipset revision 16
> VP_IDE: not 100% native mode: will probe irqs later
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx <- is my error?
> VP_IDE: VIA vt82c686a (rev 21) IDE UDMA66 controller on pci00:07.1
>     ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:pio
>     ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:DMA, hdd:DMA
> hda: QUANTUM FIREBALLlct20 30, ATA DISK drive
> hdc: ST36421A, ATA DISK drive
> hdd: ATAPI 44X CDROM, ATAPI CD/DVD-ROM drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide1 at 0x170-0x177,0x376 on irq 15
> blk: queue c030a5c4, I/O limit 4095Mb (mask 0xffffffff)
> hda: 58633344 sectors (30020 MB) w/418KiB Cache, CHS=3649/255/63, (U)DMA <- ??????????

The drive doesn't tell the OS which DMA mode is enabled on it.

> blk: queue c030a928, I/O limit 4095Mb (mask 0xffffffff)
> hdc: 12596850 sectors (6450 MB) w/256KiB Cache, CHS=13330/15/63, UDMA(66) <- this is ok
> hdd: ATAPI 40X CD-ROM drive, 128kB Cache, UDMA(33) <- ok!
> 
> I have a via82c686a chip (Epox 7-kxa).
> At boot time it is detected as via82c686a,
> /proc says via82c586b,
> and lspci -vvv does not say.

Because all VIA IDE chips say they're 586b.

> the UDMA capabilities do not work properly, so I set them manually with "hdparm -m 8 -W 1 -X udma5 /dev/hda"

The chip doesn't support UDMA5. The highest speed it supports
is UDMA4 (UDMA66).
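
Part of the confusion here is naming: hdparm's "udma5" is a mode number, while
the UDMA(33) and UDMA(66) tags in dmesg are named after the nominal bus rate.
The sketch below just tabulates the nominal ATA rates per mode and the numeric
value hdparm -X takes for a UDMA mode (64 + mode, per the hdparm man page).
The function names are invented for illustration and this is not driver code.

```c
#include <stddef.h>

/* Nominal Ultra DMA burst rates in MB/s, indexed by mode number. */
static const int udma_rate_mb[] = { 16, 25, 33, 44, 66, 100, 133 };

/* Nominal rate for a UDMA mode, or -1 if the mode number is out of range. */
int udma_rate(int mode)
{
	size_t count = sizeof(udma_rate_mb) / sizeof(udma_rate_mb[0]);

	if (mode < 0 || (size_t)mode >= count)
		return -1;
	return udma_rate_mb[mode];
}

/* Numeric argument hdparm -X expects for a UDMA mode: 64 + mode. */
int hdparm_xfer_value(int mode)
{
	return 64 + mode;
}
```

So "-X udma5" asks for a 100 MB/s mode (numeric value 69), which is more than
a controller reported as UDMA66 can do.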

> whats happening here?

Nothing unusual.

-- 
Vojtech Pavlik
SuSE Labs

^ permalink raw reply

* Re: 2.4.20-ac1 KT400 AGP support
From: Dave Jones @ 2002-12-14 10:13 UTC (permalink / raw)
  To: Courtney Grimland; +Cc: BoehmeSilvio, linux-kernel
In-Reply-To: <20021213195759.3233dc42.cgrimland@yahoo.com>

On Fri, Dec 13, 2002 at 07:57:59PM -0600, Courtney Grimland wrote:
 > You should be able to set AGP to 4x or 2x in the BIOS.

Apparently some KT400 BIOSes got clever, and took away the option.
They switch to AGP 3.0 if an AGP 3.0 card is present, and drop
back to 2.0 if a 2.0 card is present.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs

^ permalink raw reply

* Re: [2.5.51] Failure to mount ext3 root when ext2 compiled in
From: Andrew Morton @ 2002-12-14 10:11 UTC (permalink / raw)
  To: Mohamed El Ayouty; +Cc: Rusty Russell, LKML
In-Reply-To: <1039790158.25215.48.camel@syKr0n.mine.nu>

Mohamed El Ayouty wrote:
> 
> Hi,
> 
> This sounds more like the bug I have opened:
> 
> http://bugme.osdl.org/show_bug.cgi?id=110
> 
> where if CONFIG_DEVFS_FS = Y and CONFIG_DEVFS_MOUNT = Y, you will get:
> 
> VFS: Cannot open root device "hda2" or 03:02
> Please append a correct "root=" boot option
> Kernel panic: VFS: Unable to mount root fs on 03:02
> 
> I worked around it by enabling CONFIG_UNIX98_PTYS = Y under the
> character devices.
> 
> But, a recent update to the bug shows that a patch was posted but nobody
> cared.
> 
> Personally, I think the patch should be merged.

You mean this one?

	http://www.lkml.org/archive/2002/12/13/50/index.html

It appears to simply disable internal mounting of devfs.

In which kernel did this problem first appear?   There were
devfs changes in 2.5.51.

^ permalink raw reply

* Re: Intel P6 vs P7 system call performance
From: Dave Jones @ 2002-12-14 10:01 UTC (permalink / raw)
  To: Mike Dresser; +Cc: GrandMasterLee, linux-kernel
In-Reply-To: <Pine.LNX.4.33.0212132345040.12319-100000@router.windsormachine.com>

On Fri, Dec 13, 2002 at 11:53:51PM -0500, Mike Dresser wrote:
 > On Fri, 13 Dec 2002, Mike Dresser wrote:
 > 
 > > The single P4/2.53 in another machine can haul down in 3m17s
 > >
 > Amend that to 2m19s, forgot to kill a background backup that was moving
 > files around at about 20 meg a second.

Note that there are more factors at play than raw cpu speed in a
kernel compile. Your time here is slightly faster than my 2.8Ghz P4-HT for
example.  My guess is you have faster disk(s) than I do, as most of
the time mine seems to be waiting for something to do.

*note also that this is compiling stock 2.4.20 with default configuration.
The minute you change any options, we're comparing apples to oranges.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs

^ permalink raw reply

* Re: rmap and nvidia?
From: mdew @ 2002-12-14  9:46 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Linux Kernel
In-Reply-To: <20021214093831.GL9882@holomorphy.com>

On Sat, 2002-12-14 at 22:38, William Lee Irwin III wrote:
> On Sat, Dec 14, 2002 at 10:36:10PM +1300, mdew wrote:
> > is there a nvidia patch available to make it work with rmap?
> > 
> > nirvana:~/NVIDIA_kernel-1.0-3123# make
> > echo \#define NV_COMPILER \"`cc -v 2>&1 | tail -1`\" > nv_compiler.h
> > cc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts
> > -Wparentheses -Wpointer-arith -Wcast-qual -Wno-multichar  -O -MD
> > -D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DNTRM -D_GNU_SOURCE
> > -DRM_HEAPMGR -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE 
> > -DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=3123 
> > -DNV_UNIX   -DNV_LINUX   -DNVCPU_X86       -I.
> > -I/lib/modules/2.4.20-xfs-rmap15b/build/include -Wno-cast-qual nv.c
> > nv.c: In function `nv_get_phys_address':
> > nv.c:2182: warning: implicit declaration of function `pte_offset'
> > nv.c:2182: invalid type argument of `unary *'
> > make: *** [nv.o] Error 1
> 
> Use pte_offset_map() with a corresponding pte_unmap().

err pardon?



^ permalink raw reply

* Re: rmap and nvidia?
From: William Lee Irwin III @ 2002-12-14  9:38 UTC (permalink / raw)
  To: mdew; +Cc: Linux Kernel
In-Reply-To: <1039858571.559.15.camel@nirvana>

On Sat, Dec 14, 2002 at 10:36:10PM +1300, mdew wrote:
> is there a nvidia patch available to make it work with rmap?
> 
> nirvana:~/NVIDIA_kernel-1.0-3123# make
> echo \#define NV_COMPILER \"`cc -v 2>&1 | tail -1`\" > nv_compiler.h
> cc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts
> -Wparentheses -Wpointer-arith -Wcast-qual -Wno-multichar  -O -MD
> -D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DNTRM -D_GNU_SOURCE
> -DRM_HEAPMGR -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE 
> -DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=3123 
> -DNV_UNIX   -DNV_LINUX   -DNVCPU_X86       -I.
> -I/lib/modules/2.4.20-xfs-rmap15b/build/include -Wno-cast-qual nv.c
> nv.c: In function `nv_get_phys_address':
> nv.c:2182: warning: implicit declaration of function `pte_offset'
> nv.c:2182: invalid type argument of `unary *'
> make: *** [nv.o] Error 1

Use pte_offset_map() with a corresponding pte_unmap().
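
In other words: map the PTE, copy the entry out while the mapping is
valid, then unmap it.  A minimal user-space sketch of that pairing
follows.  The pte_t and pmd_t types and the bodies of pte_offset_map()
and pte_unmap() below are stand-ins written only so the fragment
compiles outside the kernel, and read_pte() is a hypothetical helper,
not the actual nv.c code.

```c
#include <stddef.h>

/* Stand-in types: NOT the kernel definitions, just enough to compile. */
typedef struct { unsigned long pte; } pte_t;
typedef struct { pte_t *table; } pmd_t;

#define PAGE_SHIFT   12
#define PTRS_PER_PTE 1024

static int mapped_count;	/* outstanding mappings in this mock */

/* Mock of pte_offset_map(): "map" the PTE page and return the entry. */
static pte_t *pte_offset_map(pmd_t *pmd, unsigned long address)
{
	mapped_count++;
	return &pmd->table[(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)];
}

/* Mock of pte_unmap(): drop the temporary mapping. */
static void pte_unmap(pte_t *pte)
{
	(void)pte;
	mapped_count--;
}

/* Hypothetical helper: copy the entry out, then unmap before returning. */
static pte_t read_pte(pmd_t *pmd, unsigned long address)
{
	pte_t *ptep = pte_offset_map(pmd, address);
	pte_t pte = *ptep;	/* copy while the mapping is valid */

	pte_unmap(ptep);
	return pte;
}
```

The point of the sketch is only the ordering: every pte_offset_map()
is balanced by a pte_unmap(), and the pointer is not used afterwards.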


Bill


* Re: [OOPS] 2.5.51-mm2
From: Andrew Morton @ 2002-12-14  9:38 UTC (permalink / raw)
  To: Paul P Komkoff Jr, ext2-devel; +Cc: Linux Kernel Mailing List
In-Reply-To: <20021213181155.GB2496@stingr.net>

Paul P Komkoff Jr wrote:
> 
> This is very funny.

Actually it's very bad.  Thanks for reporting this.

> mke2fs -j -O dir_index -J size=192 -T news -N 1000100
> atest3 1000000
>  (creat & write 1 byte to 1000000 files)
> 
> free space on device became 0 and voila
> 
> Unable to handle kernel paging request at virtual address 5a5a5b9e


Here's a fix:



If ext3_add_nondir() fails it will do an iput() of the inode.  But we
continue to run ext3_mark_inode_dirty() against the potentially-freed
inode.  This oopses when slab poisoning is enabled.

Fix it so that we only run ext3_mark_inode_dirty() if the inode was
successfully instantiated.
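
The ordering problem can be modelled in user space.  This is a toy
sketch, not the ext3 code: struct obj, add_entry(), mark_dirty() and
add_nondir() are hypothetical stand-ins, and a "freed" flag stands in
for the inode memory actually being released by iput().

```c
#include <stdbool.h>

/* Toy stand-in for an inode; 'freed' models the memory being released. */
struct obj {
	bool freed;
	bool dirty;
	bool instantiated;
};

/* Hypothetical: add the directory entry, failing on request. */
static int add_entry(struct obj *o, bool fail)
{
	(void)o;
	return fail ? -1 : 0;
}

/* Hypothetical: touch the object; returns -1 if it was already freed. */
static int mark_dirty(struct obj *o)
{
	if (o->freed)
		return -1;	/* in the kernel this is the use-after-free */
	o->dirty = true;
	return 0;
}

/* Fixed ordering: dirty the object only on the success path, and
 * release it on any failure so the caller never touches it again. */
static int add_nondir(struct obj *o, bool fail)
{
	int err = add_entry(o, fail);

	if (!err) {
		err = mark_dirty(o);
		if (!err) {
			o->instantiated = true;
			return 0;
		}
	}
	o->freed = true;	/* models ext3_dec_count() + iput() */
	return err;
}
```

With this shape the caller has nothing left to do with the object
after add_nondir() returns an error.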



 fs/ext3/namei.c |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

--- 25/fs/ext3/namei.c~ext3-use-after-free	Sat Dec 14 01:25:03 2002
+++ 25-akpm/fs/ext3/namei.c	Sat Dec 14 01:25:53 2002
@@ -1566,8 +1566,11 @@ static int ext3_add_nondir(handle_t *han
 {
 	int err = ext3_add_entry(handle, dentry, inode);
 	if (!err) {
-		d_instantiate(dentry, inode);
-		return 0;
+		err = ext3_mark_inode_dirty(handle, inode);
+		if (!err) {
+			d_instantiate(dentry, inode);
+			return 0;
+		}
 	}
 	ext3_dec_count(handle, inode);
 	iput(inode);
@@ -1609,7 +1612,6 @@ static int ext3_create (struct inode * d
 		else
 			inode->i_mapping->a_ops = &ext3_aops;
 		err = ext3_add_nondir(handle, dentry, inode);
-		ext3_mark_inode_dirty(handle, inode);
 	}
 	ext3_journal_stop(handle, dir);
 	unlock_kernel();
@@ -1642,7 +1644,6 @@ static int ext3_mknod (struct inode * di
 		inode->i_op = &ext3_special_inode_operations;
 #endif
 		err = ext3_add_nondir(handle, dentry, inode);
-		ext3_mark_inode_dirty(handle, inode);
 	}
 	ext3_journal_stop(handle, dir);
 	unlock_kernel();
@@ -2105,7 +2106,6 @@ static int ext3_symlink (struct inode * 
 	}
 	EXT3_I(inode)->i_disksize = inode->i_size;
 	err = ext3_add_nondir(handle, dentry, inode);
-	ext3_mark_inode_dirty(handle, inode);
 out_stop:
 	ext3_journal_stop(handle, dir);
 	unlock_kernel();
@@ -2140,7 +2140,6 @@ static int ext3_link (struct dentry * ol
 	atomic_inc(&inode->i_count);
 
 	err = ext3_add_nondir(handle, dentry, inode);
-	ext3_mark_inode_dirty(handle, inode);
 	ext3_journal_stop(handle, dir);
 	unlock_kernel();
 	return err;

_


* rmap and nvidia?
From: mdew @ 2002-12-14  9:36 UTC (permalink / raw)
  To: Linux Kernel

is there a nvidia patch available to make it work with rmap?

nirvana:~/NVIDIA_kernel-1.0-3123# make
echo \#define NV_COMPILER \"`cc -v 2>&1 | tail -1`\" > nv_compiler.h
cc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts
-Wparentheses -Wpointer-arith -Wcast-qual -Wno-multichar  -O -MD
-D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DNTRM -D_GNU_SOURCE
-DRM_HEAPMGR -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE 
-DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=3123 
-DNV_UNIX   -DNV_LINUX   -DNVCPU_X86       -I.
-I/lib/modules/2.4.20-xfs-rmap15b/build/include -Wno-cast-qual nv.c
nv.c: In function `nv_get_phys_address':
nv.c:2182: warning: implicit declaration of function `pte_offset'
nv.c:2182: invalid type argument of `unary *'
make: *** [nv.o] Error 1





* JDIRTY JWAIT errors in 2.4.19
From: Tupshin Harper @ 2002-12-14  9:29 UTC (permalink / raw)
  To: linux-kernel

I'm getting the following error logged every 11 seconds or so:

Dec 14 01:00:49 phylum kernel: vs-3050: wait_buffer_until_released: nobody
releases buffer (dev 16:01, size 4096, blocknr 2916352, count 3, list 0,
state 0x10019, page c1172108, (UPTODATE, CLEAN, UNLOCKED)). Still waiting
(-1320000000) JDIRTY !JWAIT

Also, some processes are blocking, including ps (so I can't get a complete
process list) and shutdown.

This is on an old PII that's been up with no problems for about a month.
Is this a known problem? Is there any way to force a clean shutdown in
such circumstances? Is this associated with reiserfs, which all
partitions are running? Googling turned up one or two references to
this problem, but never with a kernel as recent as 2.4.19; they were
associated with, though not necessarily blamed on, reiserfs.

This machine is still up, so if more info would be useful, it can 
probably be provided. Most processes are still working fine.

-Tupshin



* XMAS and NMAP scanning.... With default rules dropping all packets
From: Didier Hung Wan Luk @ 2002-12-14  9:09 UTC (permalink / raw)
  To: Netfilter Mailing List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi people,

I was wondering whether I really need to include these rules if I am
already using a default rule of DROP for INPUT, OUTPUT and FORWARD
chains.

Default rule:-
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

Do I really need these rules? To protect me from these scans..

iptables -I FORWARD -p tcp --tcp-flags ALL ALL -j DROP
iptables -I INPUT -p tcp --tcp-flags ALL ALL -j DROP

#nmap NULL-Packets drop
iptables -I FORWARD -p tcp --tcp-flags ALL NONE -j DROP
iptables -I INPUT -p tcp --tcp-flags ALL NONE -j DROP


Thanks,

Didier
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.0 (MingW32)

iD8DBQE9+vVhH0p2xGbWNGwRAqebAJ9qQqpvAY0wZ50NqiaaW51HyQHLGwCePEAo
NPYRxMYonG0SWe0GzKiNb3M=
=bEjK
-----END PGP SIGNATURE-----
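
For reference, the "--tcp-flags mask comp" test used in the rules above
boils down to (flags & mask) == comp.  The sketch below illustrates
only that arithmetic; it is not the netfilter source.  The function
names and the TCPF_* names are made up for the example; the bit values
are the standard TCP header flag bits, and ALL is taken to mean
FIN,SYN,RST,PSH,ACK,URG.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard TCP header flag bits. */
#define TCPF_FIN 0x01
#define TCPF_SYN 0x02
#define TCPF_RST 0x04
#define TCPF_PSH 0x08
#define TCPF_ACK 0x10
#define TCPF_URG 0x20
#define TCPF_ALL  (TCPF_FIN | TCPF_SYN | TCPF_RST | TCPF_PSH | TCPF_ACK | TCPF_URG)
#define TCPF_NONE 0x00

/* Examine only the bits in 'mask'; match when exactly 'comp' is set. */
static bool tcp_flags_match(uint8_t flags, uint8_t mask, uint8_t comp)
{
	return (flags & mask) == comp;
}

/* --tcp-flags ALL ALL: every flag set. */
static bool matches_all_all(uint8_t flags)
{
	return tcp_flags_match(flags, TCPF_ALL, TCPF_ALL);
}

/* --tcp-flags ALL NONE: no flags set (NULL packet). */
static bool matches_all_none(uint8_t flags)
{
	return tcp_flags_match(flags, TCPF_ALL, TCPF_NONE);
}
```

Note that a packet carrying only FIN, PSH and URG does not satisfy the
ALL ALL test, since the other three bits in the mask are clear.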




* question about ipt_table_info structure
From: Venkatesh Prasad Ranganath @ 2002-12-14  9:01 UTC (permalink / raw)
  To: netfilter-devel



Hi,

I am using netfilter/iptables (1.2.7a) in a project (which may end up
contributing to a netfilter/iptables branch if it succeeds).  Hence, I was
browsing the kernel-space netfilter/iptables code.  I am able to follow
the code apart from a few points.

1> What is the purpose of underflow field in ipt_replace?  Where is it used?
2> What is the purpose of term field in struct initial_table in 
iptables_filter.c?  Where is it used?
3> What is the purpose of ipt_replace structure?  Where is it used?
4> What is the purpose of table field in ipt_table?  It is not used at 
any time during filtering.  (or am I wrong about this?)  If it is used, 
where is it used?
5> Is it correct to say that ACCEPT, DROP, QUEUE, and RETURN are the 
builtin targets?

Also, can someone comment on whether my understanding of part of
netfilter/iptables, as given below, is correct?
"Each rule that can be added via the iptables command is represented by a
set of data rather than a single piece of data.  Each criterion to be
satisfied for the entire rule to be satisfied is represented as a match.
If all of the matches are satisfied, then the target (linked at the
end of the sequence of matches) associated with the rule is executed.
Hence, a rule has only one target, but may have multiple matches."
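
To make that description concrete, it could be modelled roughly as
follows.  This is a simplified sketch of the understanding stated
above, not the kernel's actual data layout; struct rule, struct match,
rule_eval() and the verdict names are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct packet { int proto; int dport; };

/* One criterion; returns true when the packet satisfies it. */
struct match {
	bool (*fn)(const struct packet *pkt, int arg);
	int arg;
};

enum verdict { V_CONTINUE, V_ACCEPT, V_DROP };

/* A rule: any number of matches, followed by exactly one target. */
struct rule {
	const struct match *matches;
	size_t nmatches;
	enum verdict target;
};

static bool match_proto(const struct packet *pkt, int arg)
{
	return pkt->proto == arg;
}

static bool match_dport(const struct packet *pkt, int arg)
{
	return pkt->dport == arg;
}

/* The target fires only if every match succeeds; otherwise the
 * caller moves on to the next rule. */
static enum verdict rule_eval(const struct rule *r, const struct packet *pkt)
{
	for (size_t i = 0; i < r->nmatches; i++)
		if (!r->matches[i].fn(pkt, r->matches[i].arg))
			return V_CONTINUE;
	return r->target;
}
```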

Finally, are there any documents that discuss the performance of and 
issues related (if any) to netfilter/iptables?  In particular, I am 
looking for documents which may have identified bottlenecks or have 
pointers to locations in which to look for such opportunities. Benchmark 
results and/or test run results would also be helpful.  I am just
piggybacking this last question along with the others and I would understand
if someone replied "google would be a good place to start" ;-)

waiting for reply,

-- 

Venkatesh Prasad Ranganath,
Dept. Computing and Information Science,
Kansas State University, US.
web: http://www.cis.ksu.edu/~rvprasad


* MARK matching
From: Rocco Stanzione @ 2002-12-14  8:49 UTC (permalink / raw)
  To: netfilter

Group:

I don't like the idea of allowing all traffic destined for the external IP on 
the external interface on a machine that doubles as a firewall and a server.  
But I have a webmail interface that doesn't work unless I do just that.  What 
I want to know is, is it valid to use the MARK target on these packets on 
their way 'out' so that they can be recognized as not having been spoofed?  I 
haven't seen any documentation on using it like this, and I wonder if this is 
a viable solution, or if anyone has a better idea.

Thanks,

Rocco



* [PATCH] kexec for 2.5.51....
From: Eric W. Biederman @ 2002-12-14  8:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Linus, 

My apologies for not resending this earlier; I've been terribly
busy with other things.

No changes are included since the last time I sent this, except that
the diff now patches cleanly onto 2.5.51.  If there is some problem,
holler and I will see about fixing it.

When I bypass the BIOS in booting clients my only current failure
report is on an IBM NUMAQ and that almost worked.
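
As a rough illustration of the page list that the patch builds and
that relocate_new_kernel walks, here is a user-space sketch.  The IND_*
flag values are the ones from include/linux/kexec.h in the patch;
everything else (plan_copies(), the SKETCH_* constants, the flat array,
the absence of real page copies) is a simplification invented for the
example.  Indirection entries are not followed, since that would need
real physical pages.

```c
#include <stddef.h>

typedef unsigned long kimage_entry_t;

/* Flag bits as defined in the patch's include/linux/kexec.h. */
#define IND_DESTINATION 0x1
#define IND_INDIRECTION 0x2
#define IND_DONE        0x4
#define IND_SOURCE      0x8

#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical helper: walk a flat entry list and record, for each
 * source page, the destination address it would be copied to.
 * Returns the number of copies, or -1 if 'max' is too small. */
static int plan_copies(const kimage_entry_t *list,
		       unsigned long *src, unsigned long *dst, int max)
{
	unsigned long dest = 0;
	int n = 0;

	for (; *list && !(*list & IND_DONE); list++) {
		kimage_entry_t e = *list;

		if (e & IND_DESTINATION) {
			dest = e & SKETCH_PAGE_MASK;	/* new copy target */
		} else if (e & IND_SOURCE) {
			if (n == max)
				return -1;
			src[n] = e & SKETCH_PAGE_MASK;
			dst[n] = dest;
			dest += SKETCH_PAGE_SIZE;	/* next page follows on */
			n++;
		}
		/* IND_INDIRECTION entries are ignored in this flat sketch. */
	}
	return n;
}
```

A destination entry sets the copy target, and each following source
entry is copied to the next consecutive page from that target.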

 MAINTAINERS                        |    7 
 arch/i386/Kconfig                  |   17 
 arch/i386/kernel/Makefile          |    1 
 arch/i386/kernel/entry.S           |    2 
 arch/i386/kernel/machine_kexec.c   |  142 ++++++++
 arch/i386/kernel/relocate_kernel.S |  107 ++++++
 include/asm-i386/kexec.h           |   25 +
 include/asm-i386/unistd.h          |    2 
 include/linux/kexec.h              |   45 ++
 include/linux/reboot.h             |    2 
 kernel/Makefile                    |    1 
 kernel/kexec.c                     |  640 +++++++++++++++++++++++++++++++++++++
 kernel/sys.c                       |   23 +
 13 files changed, 1012 insertions, 2 deletions

diff -uNr linux-2.5.51/MAINTAINERS linux-2.5.51.x86kexec/MAINTAINERS
--- linux-2.5.51/MAINTAINERS	Thu Dec 12 07:41:16 2002
+++ linux-2.5.51.x86kexec/MAINTAINERS	Thu Dec 12 07:43:53 2002
@@ -997,6 +997,13 @@
 W:	http://www.cse.unsw.edu.au/~neilb/patches/linux-devel/
 S:	Maintained
 
+KEXEC
+P:	Eric Biederman
+M:	ebiederm@xmission.com
+M:	ebiederman@lnxi.com
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+
 LANMEDIA WAN CARD DRIVER
 P:	Andrew Stanley-Jones
 M:	asj@lanmedia.com
diff -uNr linux-2.5.51/arch/i386/Kconfig linux-2.5.51.x86kexec/arch/i386/Kconfig
--- linux-2.5.51/arch/i386/Kconfig	Thu Dec 12 07:41:17 2002
+++ linux-2.5.51.x86kexec/arch/i386/Kconfig	Thu Dec 12 07:43:53 2002
@@ -784,6 +784,23 @@
 	depends on (SMP || PREEMPT) && X86_CMPXCHG
 	default y
 
+config KEXEC
+	bool "kexec system call (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	help
+	  kexec is a system call that implements the ability to shut down your
+	  current kernel, and to start another kernel.  It is like a reboot
+	  but it is independent of the system firmware.  And like a reboot
+	  you can start any kernel with it, not just Linux.
+	
+	  The name comes from the similarity to the exec system call.
+	
+	  It is an ongoing process to be certain the hardware in a machine
+	  is properly shut down, so do not be surprised if this code does not
+	  initially work for you.  It may help to enable device hotplugging
+	  support.  As of this writing the exact hardware interface is
+	  strongly in flux, so no good recommendation can be made.
+
 endmenu
 
 
diff -uNr linux-2.5.51/arch/i386/kernel/Makefile linux-2.5.51.x86kexec/arch/i386/kernel/Makefile
--- linux-2.5.51/arch/i386/kernel/Makefile	Sun Nov 17 22:51:14 2002
+++ linux-2.5.51.x86kexec/arch/i386/kernel/Makefile	Thu Dec 12 07:43:53 2002
@@ -24,6 +24,7 @@
 obj-$(CONFIG_X86_MPPARSE)	+= mpparse.o
 obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o nmi.o
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
+obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o
 obj-$(CONFIG_SOFTWARE_SUSPEND)	+= suspend.o suspend_asm.o
 obj-$(CONFIG_X86_NUMAQ)		+= numaq.o
 obj-$(CONFIG_PROFILING)		+= profile.o
diff -uNr linux-2.5.51/arch/i386/kernel/entry.S linux-2.5.51.x86kexec/arch/i386/kernel/entry.S
--- linux-2.5.51/arch/i386/kernel/entry.S	Thu Dec 12 07:41:17 2002
+++ linux-2.5.51.x86kexec/arch/i386/kernel/entry.S	Thu Dec 12 07:43:53 2002
@@ -743,7 +743,7 @@
 	.long sys_epoll_wait
  	.long sys_remap_file_pages
  	.long sys_set_tid_address
-
+	.long sys_kexec_load
 
 	.rept NR_syscalls-(.-sys_call_table)/4
 		.long sys_ni_syscall
diff -uNr linux-2.5.51/arch/i386/kernel/machine_kexec.c linux-2.5.51.x86kexec/arch/i386/kernel/machine_kexec.c
--- linux-2.5.51/arch/i386/kernel/machine_kexec.c	Wed Dec 31 17:00:00 1969
+++ linux-2.5.51.x86kexec/arch/i386/kernel/machine_kexec.c	Thu Dec 12 07:43:53 2002
@@ -0,0 +1,142 @@
+#include <linux/config.h>
+#include <linux/mm.h>
+#include <linux/kexec.h>
+#include <linux/delay.h>
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+#include <asm/tlbflush.h>
+#include <asm/io.h>
+#include <asm/apic.h>
+
+
+/*
+ * machine_kexec
+ * =======================
+ */
+
+
+static void set_idt(void *newidt, __u16 limit)
+{
+	unsigned char curidt[6];
+
+	/* ia32 supports unaligned loads & stores */
+	(*(__u16 *)(curidt)) = limit;
+	(*(__u32 *)(curidt +2)) = (unsigned long)(newidt);
+	/* lidt only reads the descriptor: pass it as an input operand */
+	__asm__ __volatile__ (
+		"lidt %0\n"
+		: : "m" (curidt)
+		);
+}
+
+
+static void set_gdt(void *newgdt, __u16 limit)
+{
+	unsigned char curgdt[6];
+
+	/* ia32 supports unaligned loads & stores */
+	(*(__u16 *)(curgdt)) = limit;
+	(*(__u32 *)(curgdt +2)) = (unsigned long)(newgdt);
+	/* lgdt only reads the descriptor: pass it as an input operand */
+	__asm__ __volatile__ (
+		"lgdt %0\n"
+		: : "m" (curgdt)
+		);
+}
+
+static void load_segments(void)
+{
+#define __STR(X) #X
+#define STR(X) __STR(X)
+
+	__asm__ __volatile__ (
+		"\tljmp $"STR(__KERNEL_CS)",$1f\n"
+		"\t1:\n"
+		"\tmovl $"STR(__KERNEL_DS)",%eax\n"
+		"\tmovl %eax,%ds\n"
+		"\tmovl %eax,%es\n"
+		"\tmovl %eax,%fs\n"
+		"\tmovl %eax,%gs\n"
+		"\tmovl %eax,%ss\n"
+		);
+#undef STR
+#undef __STR
+}
+
+static void identity_map_page(unsigned long address)
+{
+	/* This code is x86 specific...
+	 * general purpose code must be more careful
+	 * of caches and tlbs...
+	 */
+	pgd_t *pgd;
+	pmd_t *pmd;
+	struct mm_struct *mm = current->mm;
+	spin_lock(&mm->page_table_lock);
+	
+	pgd = pgd_offset(mm, address);
+	pmd = pmd_alloc(mm, pgd, address);
+
+	if (pmd) {
+		pte_t *pte = pte_alloc_map(mm, pmd, address);
+		if (pte) {
+			set_pte(pte, 
+				mk_pte(virt_to_page(phys_to_virt(address)), 
+					PAGE_SHARED));
+			__flush_tlb_one(address);
+		}
+	}
+	spin_unlock(&mm->page_table_lock);
+}
+
+
+typedef void (*relocate_new_kernel_t)(
+	unsigned long indirection_page, unsigned long reboot_code_buffer,
+	unsigned long start_address);
+
+const extern unsigned char relocate_new_kernel[];
+extern void relocate_new_kernel_end(void);
+const extern unsigned int relocate_new_kernel_size;
+
+void machine_kexec(struct kimage *image)
+{
+	unsigned long *indirection_page;
+	void *reboot_code_buffer;
+	relocate_new_kernel_t rnk;
+
+	/* Interrupts aren't acceptable while we reboot */
+	local_irq_disable();
+	reboot_code_buffer = image->reboot_code_buffer;
+	indirection_page = phys_to_virt(image->head & PAGE_MASK);
+
+	identity_map_page(virt_to_phys(reboot_code_buffer));
+
+	/* copy it out */
+	memcpy(reboot_code_buffer, relocate_new_kernel, 
+		relocate_new_kernel_size);
+
+	/* The segment registers are funny things: they are
+	 * automatically loaded from a table in memory whenever you
+	 * set them to a specific selector, but this table is never
+	 * accessed again until you set the segment to a different selector.
+	 *
+	 * The more common model is a cache, where the behind-the-scenes
+	 * work is done but is also dropped at arbitrary
+	 * times.
+	 *
+	 * I take advantage of this here by force loading the
+	 * segments, before I zap the gdt with an invalid value.
+	 */
+	load_segments();
+	/* The gdt & idt are now invalid.
+	 * If you want to load them you must set up your own idt & gdt.
+	 */
+	set_gdt(phys_to_virt(0),0);
+	set_idt(phys_to_virt(0),0);
+
+	/* now call it */
+	rnk = (relocate_new_kernel_t) virt_to_phys(reboot_code_buffer);
+	(*rnk)(virt_to_phys(indirection_page), virt_to_phys(reboot_code_buffer), 
+		image->start);
+}
+
diff -uNr linux-2.5.51/arch/i386/kernel/relocate_kernel.S linux-2.5.51.x86kexec/arch/i386/kernel/relocate_kernel.S
--- linux-2.5.51/arch/i386/kernel/relocate_kernel.S	Wed Dec 31 17:00:00 1969
+++ linux-2.5.51.x86kexec/arch/i386/kernel/relocate_kernel.S	Thu Dec 12 07:43:53 2002
@@ -0,0 +1,107 @@
+#include <linux/config.h>
+#include <linux/linkage.h>
+
+	/* Must be relocatable PIC code callable as a C function that, once
+	 * it starts, cannot use the previous process's stack.
+	 *
+	 */
+	.globl relocate_new_kernel
+relocate_new_kernel:
+	/* read the arguments and say goodbye to the stack */
+	movl  4(%esp), %ebx /* indirection_page */
+	movl  8(%esp), %ebp /* reboot_code_buffer */
+	movl  12(%esp), %edx /* start address */
+
+	/* zero out flags, and disable interrupts */
+	pushl $0
+	popfl
+
+	/* set a new stack at the bottom of our page... */
+	lea   4096(%ebp), %esp
+
+	/* store the parameters back on the stack */
+	pushl   %edx /* store the start address */
+
+	/* Set cr0 to a known state:
+	 * 31 0 == Paging disabled
+	 * 18 0 == Alignment check disabled
+	 * 16 0 == Write protect disabled
+	 * 3  0 == No task switch
+	 * 2  0 == Don't do FP software emulation.
+	 * 0  1 == Protected mode enabled
+	 */
+	movl	%cr0, %eax
+	andl	$~((1<<31)|(1<<18)|(1<<16)|(1<<3)|(1<<2)), %eax
+	orl	$(1<<0), %eax
+	movl	%eax, %cr0
+	
+	/* Set cr4 to a known state:
+	 * Setting everything to zero seems safe.
+	 */
+	movl	%cr4, %eax
+	andl	$0, %eax
+	movl	%eax, %cr4
+	
+	jmp 1f
+1:	
+
+	/* Flush the TLB (needed?) */
+	xorl	%eax, %eax
+	movl	%eax, %cr3
+
+	/* Do the copies */
+	cld
+0:	/* top, read another word for the indirection page */
+	movl    %ebx, %ecx
+	movl	(%ebx), %ecx
+	addl	$4, %ebx
+	testl	$0x1,   %ecx  /* is it a destination page */
+	jz	1f
+	movl	%ecx,	%edi
+	andl	$0xfffff000, %edi
+	jmp     0b
+1:
+	testl	$0x2,	%ecx  /* is it an indirection page */
+	jz	1f
+	movl	%ecx,	%ebx
+	andl	$0xfffff000, %ebx
+	jmp     0b
+1:
+	testl   $0x4,   %ecx /* is it the done indicator */
+	jz      1f
+	jmp     2f
+1:
+	testl   $0x8,   %ecx /* is it the source indicator */
+	jz      0b	     /* Ignore it otherwise */
+	movl    %ecx,   %esi /* For every source page do a copy */
+	andl    $0xfffff000, %esi
+
+	movl    $1024, %ecx
+	rep ; movsl
+	jmp     0b
+
+2:
+
+	/* To be certain of avoiding problems with self modifying code
+	 * I need to execute a serializing instruction here.
+	 * So I flush the TLB, it's handy, and not processor dependent.
+	 */
+	xorl	%eax, %eax
+	movl	%eax, %cr3
+	
+	/* set all of the registers to known values */
+	/* leave %esp alone */
+	
+	xorl	%eax, %eax
+	xorl	%ebx, %ebx
+	xorl    %ecx, %ecx
+	xorl    %edx, %edx
+	xorl    %esi, %esi
+	xorl    %edi, %edi
+	xorl    %ebp, %ebp
+	ret
+relocate_new_kernel_end:
+
+	.globl relocate_new_kernel_size
+relocate_new_kernel_size:	
+	.long relocate_new_kernel_end - relocate_new_kernel
diff -uNr linux-2.5.51/include/asm-i386/kexec.h linux-2.5.51.x86kexec/include/asm-i386/kexec.h
--- linux-2.5.51/include/asm-i386/kexec.h	Wed Dec 31 17:00:00 1969
+++ linux-2.5.51.x86kexec/include/asm-i386/kexec.h	Thu Dec 12 07:43:53 2002
@@ -0,0 +1,25 @@
+#ifndef _I386_KEXEC_H
+#define _I386_KEXEC_H
+
+#include <asm/fixmap.h>
+
+/*
+ * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
+ * I.e. Maximum page that is mapped directly into kernel memory,
+ * and kmap is not required.
+ *
+ * Someone correct me if FIXADDR_START - PAGE_OFFSET is not the correct
+ * calculation for the amount of memory directly mappable into the
+ * kernel memory space.
+ */
+
+/* Maximum physical address we can use pages from */
+#define KEXEC_SOURCE_MEMORY_LIMIT (FIXADDR_START - PAGE_OFFSET) 
+/* Maximum address we can reach in physical address mode */
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+#define KEXEC_REBOOT_CODE_SIZE	4096
+#define KEXEC_REBOOT_CODE_ALIGN 0
+
+
+#endif /* _I386_KEXEC_H */
diff -uNr linux-2.5.51/include/asm-i386/unistd.h linux-2.5.51.x86kexec/include/asm-i386/unistd.h
--- linux-2.5.51/include/asm-i386/unistd.h	Thu Dec 12 07:41:35 2002
+++ linux-2.5.51.x86kexec/include/asm-i386/unistd.h	Thu Dec 12 07:43:53 2002
@@ -264,7 +264,7 @@
 #define __NR_epoll_wait		256
 #define __NR_remap_file_pages	257
 #define __NR_set_tid_address	258
-
+#define __NR_sys_kexec_load	259
 
 /* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
 
diff -uNr linux-2.5.51/include/linux/kexec.h linux-2.5.51.x86kexec/include/linux/kexec.h
--- linux-2.5.51/include/linux/kexec.h	Wed Dec 31 17:00:00 1969
+++ linux-2.5.51.x86kexec/include/linux/kexec.h	Thu Dec 12 07:43:53 2002
@@ -0,0 +1,45 @@
+#ifndef LINUX_KEXEC_H
+#define LINUX_KEXEC_H
+
+#if CONFIG_KEXEC
+#include <linux/types.h>
+#include <asm/kexec.h>
+
+/* 
+ * This structure is used to hold the arguments that are used when loading
+ * kernel binaries.
+ */
+
+typedef unsigned long kimage_entry_t;
+#define IND_DESTINATION  0x1
+#define IND_INDIRECTION  0x2
+#define IND_DONE         0x4
+#define IND_SOURCE       0x8
+
+struct kimage {
+	kimage_entry_t head;
+	kimage_entry_t *entry;
+	kimage_entry_t *last_entry;
+
+	unsigned long destination;
+	unsigned long offset;
+
+	unsigned long start;
+	void *reboot_code_buffer;
+};
+
+struct kexec_segment {
+	void *buf;
+	size_t bufsz;
+	void *mem;
+	size_t memsz;
+};
+
+/* kexec interface functions */
+extern void machine_kexec(struct kimage *image);
+extern asmlinkage long sys_kexec(unsigned long entry, long nr_segments, 
+	struct kexec_segment *segments);
+extern struct kimage *kexec_image;
+#endif
+#endif /* LINUX_KEXEC_H */
+
diff -uNr linux-2.5.51/include/linux/reboot.h linux-2.5.51.x86kexec/include/linux/reboot.h
--- linux-2.5.51/include/linux/reboot.h	Thu Dec 12 07:41:37 2002
+++ linux-2.5.51.x86kexec/include/linux/reboot.h	Thu Dec 12 07:43:53 2002
@@ -21,6 +21,7 @@
  * POWER_OFF   Stop OS and remove all power from system, if possible.
  * RESTART2    Restart system using given command string.
  * SW_SUSPEND  Suspend system using Software Suspend if compiled in
+ * KEXEC       Restart the system using a different kernel.
  */
 
 #define	LINUX_REBOOT_CMD_RESTART	0x01234567
@@ -30,6 +31,7 @@
 #define	LINUX_REBOOT_CMD_POWER_OFF	0x4321FEDC
 #define	LINUX_REBOOT_CMD_RESTART2	0xA1B2C3D4
 #define	LINUX_REBOOT_CMD_SW_SUSPEND	0xD000FCE2
+#define LINUX_REBOOT_CMD_KEXEC		0x45584543
 
 
 #ifdef __KERNEL__
diff -uNr linux-2.5.51/kernel/Makefile linux-2.5.51.x86kexec/kernel/Makefile
--- linux-2.5.51/kernel/Makefile	Thu Dec 12 07:41:37 2002
+++ linux-2.5.51.x86kexec/kernel/Makefile	Thu Dec 12 07:44:40 2002
@@ -21,6 +21,7 @@
 obj-$(CONFIG_CPU_FREQ) += cpufreq.o
 obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
 obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend.o
+obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_COMPAT) += compat.o
 
 ifneq ($(CONFIG_IA64),y)
diff -uNr linux-2.5.51/kernel/kexec.c linux-2.5.51.x86kexec/kernel/kexec.c
--- linux-2.5.51/kernel/kexec.c	Wed Dec 31 17:00:00 1969
+++ linux-2.5.51.x86kexec/kernel/kexec.c	Thu Dec 12 07:43:53 2002
@@ -0,0 +1,640 @@
+#include <linux/mm.h>
+#include <linux/file.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/version.h>
+#include <linux/compile.h>
+#include <linux/kexec.h>
+#include <linux/spinlock.h>
+#include <net/checksum.h>
+#include <asm/page.h>
+#include <asm/uaccess.h>
+#include <asm/io.h>
+#include <asm/system.h>
+
+/* As designed, kexec can only use memory that you don't
+ * need kmap to access, i.e. memory that you can use virt_to_phys()
+ * on and call get_free_page to allocate.
+ *
+ * In the best case you need one page for the transition from
+ * virtual to physical memory.  And this page must be identity
+ * mapped.  Which pretty much leaves you with pages < PAGE_OFFSET
+ * as you can only mess with user pages.
+ * 
+ * As the only subset of memory that it is easy to restrict allocation
+ * to is the physical memory mapped into the kernel, I do that
+ * with get_free_page and hope it is enough.
+ *
+ * I don't know of a good way to calculate which pages get_free_page
+ * will return independent of architecture, so I depend on
+ * <asm/kexec.h> to properly set
+ * KEXEC_SOURCE_MEMORY_LIMIT and KEXEC_DESTINATION_MEMORY_LIMIT.
+ * 
+ */
+
+static struct kimage *kimage_alloc(void)
+{
+	struct kimage *image;
+	image = kmalloc(sizeof(*image), GFP_KERNEL);
+	if (!image)
+		return 0;
+	memset(image, 0, sizeof(*image));
+	image->head = 0;
+	image->entry = &image->head;
+	image->last_entry = &image->head;
+	return image;
+}
+static int kimage_add_entry(struct kimage *image, kimage_entry_t entry)
+{
+	if (image->offset != 0) {
+		image->entry++;
+	}
+	if (image->entry == image->last_entry) {
+		kimage_entry_t *ind_page;
+		ind_page = (void *)__get_free_page(GFP_KERNEL);
+		if (!ind_page) {
+			return -ENOMEM;
+		}
+		*image->entry = virt_to_phys(ind_page) | IND_INDIRECTION;
+		image->entry = ind_page;
+		image->last_entry = 
+			ind_page + ((PAGE_SIZE/sizeof(kimage_entry_t)) - 1);
+	}
+	*image->entry = entry;
+	image->entry++;
+	image->offset = 0;
+	return 0;
+}
+
+static int kimage_verify_destination(unsigned long destination)
+{
+	int result;
+	
+	/* Assume the page is bad unless we pass the checks */
+	result = -EADDRNOTAVAIL;
+
+	if (destination >= KEXEC_DESTINATION_MEMORY_LIMIT) {
+		goto out;
+	}
+
+	/* NOTE: The caller is responsible for making certain we
+	 * don't attempt to load the new image into invalid or
+	 * reserved areas of RAM.
+	 */
+	result =  0;
+out:
+	return result;
+}
+
+static int kimage_set_destination(
+	struct kimage *image, unsigned long destination) 
+{
+	int result;
+	destination &= PAGE_MASK;
+	result = kimage_verify_destination(destination);
+	if (result) {
+		return result;
+	}
+	result = kimage_add_entry(image, destination | IND_DESTINATION);
+	if (result == 0) {
+		image->destination = destination;
+	}
+	return result;
+}
+
+
+static int kimage_add_page(struct kimage *image, unsigned long page)
+{
+	int result;
+	page &= PAGE_MASK;
+	result = kimage_verify_destination(image->destination);
+	if (result) {
+		return result;
+	}
+	result = kimage_add_entry(image, page | IND_SOURCE);
+	if (result == 0) {
+		image->destination += PAGE_SIZE;
+	}
+	return result;
+}
+
+
+static int kimage_terminate(struct kimage *image)
+{
+	int result;
+	result = kimage_add_entry(image, IND_DONE);
+	if (result == 0) {
+		/* Point at the terminating element */
+		image->entry--;
+	}
+	return result;
+}
+
+#define for_each_kimage_entry(image, ptr, entry) \
+	for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); \
+		ptr = (entry & IND_INDIRECTION)? \
+			phys_to_virt((entry & PAGE_MASK)): ptr +1)
+
+static void kimage_free(struct kimage *image)
+{
+	kimage_entry_t *ptr, entry;
+	kimage_entry_t ind = 0;
+	if (!image)
+		return;
+	for_each_kimage_entry(image, ptr, entry) {
+		if (entry & IND_INDIRECTION) {
+			/* Free the previous indirection page */
+			if (ind & IND_INDIRECTION) {
+				free_page((unsigned long)phys_to_virt(ind & PAGE_MASK));
+			}
+			/* Save this indirection page until we are
+			 * done with it.
+			 */
+			ind = entry;
+		}
+		else if (entry & IND_SOURCE) {
+			free_page((unsigned long)phys_to_virt(entry & PAGE_MASK));
+		}
+	}
+	kfree(image);
+}
+
+static int kimage_is_destination_page(
+	struct kimage *image, unsigned long page)
+{
+	kimage_entry_t *ptr, entry;
+	unsigned long destination;
+	destination = 0;
+	page &= PAGE_MASK;
+	for_each_kimage_entry(image, ptr, entry) {
+		if (entry & IND_DESTINATION) {
+			destination = entry & PAGE_MASK;
+		}
+		else if (entry & IND_SOURCE) {
+			if (page == destination) {
+				return 1;
+			}
+			destination += PAGE_SIZE;
+		}
+	}
+	return 0;
+}
+
+static int kimage_get_unused_area(
+	struct kimage *image, unsigned long size, unsigned long align,
+	unsigned long *area)
+{
+	/* Walk through mem_map and find the first chunk of
+	 * unused memory that is at least size bytes long.
+	 */
+	/* Since the kernel plays with PG_reserved, mem_map is less
+	 * than ideal for this purpose, but it will give us a correct
+	 * conservative estimate of what we need to do.
+	 */
+	/* For now we take advantage of the fact that all kernel pages
+	 * are marked with PG_reserved to allocate a large
+	 * contiguous area for the reboot code buffer.
+	 */
+	unsigned long addr;
+	unsigned long start, end;
+	unsigned long mask;
+	mask = ((1 << align) -1);
+	start = end = PAGE_SIZE;
+	for(addr = PAGE_SIZE; addr < KEXEC_SOURCE_MEMORY_LIMIT; addr += PAGE_SIZE) {
+		struct page *page;
+		unsigned long aligned_start;
+		page = virt_to_page(phys_to_virt(addr));
+		if (PageReserved(page) ||
+			kimage_is_destination_page(image, addr)) {
+			/* The current page is reserved so the start &
+			 * end of the next area must be at least at the
+			 * next page.
+			 */
+			start = end = addr + PAGE_SIZE;
+		}
+		else {
+			/* O.k.  The current page isn't reserved
+			 * so push up the end of the area.
+			 */
+			end = addr;
+		}
+		aligned_start = (start + mask) & ~mask;
+		if (aligned_start > start) {
+			continue;
+		}
+		if (aligned_start > end) {
+			continue;
+		}
+		if (end - aligned_start >= size) {
+			*area = aligned_start;
+			return 0;
+		}
+	}
+	*area = 0;
+	return -ENOSPC;
+}
+
+static kimage_entry_t *kimage_dst_conflict(
+	struct kimage *image, unsigned long page, kimage_entry_t *limit)
+{
+	kimage_entry_t *ptr, entry;
+	unsigned long destination = 0;
+	for_each_kimage_entry(image, ptr, entry) {
+		if (ptr == limit) {
+			return 0;
+		}
+		else if (entry & IND_DESTINATION) {
+			destination = entry & PAGE_MASK;
+		}
+		else if (entry & IND_SOURCE) {
+			if (page == destination) {
+				return ptr;
+			}
+			destination += PAGE_SIZE;
+		}
+	}
+	return 0;
+}
+
+static kimage_entry_t *kimage_src_conflict(
+	struct kimage *image, unsigned long destination, kimage_entry_t *limit)
+{
+	kimage_entry_t *ptr, entry;
+	for_each_kimage_entry(image, ptr, entry) {
+		unsigned long page;
+		if (ptr == limit) {
+			return 0;
+		}
+		else if (entry & IND_DESTINATION) {
+			/* nop */
+		}
+		else if (entry & IND_DONE) {
+			/* nop */
+		}
+		else {
+			/* SOURCE & INDIRECTION */
+			page = entry & PAGE_MASK;
+			if (page == destination) {
+				return ptr;
+			}
+		}
+	}
+	return 0;
+}
+
+static int kimage_get_off_destination_pages(struct kimage *image)
+{
+	kimage_entry_t *ptr, *cptr, entry;
+	unsigned long buffer, page;
+	unsigned long destination = 0;
+
+	/* Here we implement safeguards to ensure that
+	 * a source page is not copied to its destination
+	 * page before the data on the destination page is
+	 * no longer useful.
+	 *
+	 * To make it work we actually wind up with a
+	 * stronger condition.  For every page considered
+	 * it is either its own destination page or it is
+	 * not a destination page of any page considered.
+	 *
+	 * Invariants 
+	 * 1. buffer is not a destination of a previous page.
+	 * 2. page is not a destination of a previous page.
+	 * 3. destination is not a previous source page.
+	 *
+	 * Result: Either a source page and a destination page 
+	 * are the same or the page is not a destination page.
+	 *
+	 * These checks could be done when we allocate the pages,
+	 * but doing it as a final pass allows us more freedom
+	 * on how we allocate pages.
+	 * 
+	 * Also while the checks are necessary, in practice nothing
+	 * happens.  The destination kernel wants to sit in the
+	 * same physical addresses as the current kernel so we never
+	 * actually allocate a destination page.
+	 *
+	 * BUGS: This is an O(N^2) algorithm.
+	 */
+
+	
+	buffer = __get_free_page(GFP_KERNEL);
+	if (!buffer) {
+		return -ENOMEM;
+	}
+	buffer = virt_to_phys((void *)buffer);
+	for_each_kimage_entry(image, ptr, entry) {
+		/* Check whether an allocated page conflicts with an earlier entry */
+		kimage_entry_t *limit;
+		if (entry & IND_DESTINATION) {
+			destination = entry & PAGE_MASK;
+		}
+		else if (entry & IND_INDIRECTION) {
+			/* Indirection pages must include all of their
+			 * contents in limit checking.
+			 */
+			limit = phys_to_virt(page + PAGE_SIZE - sizeof(*limit));
+		}
+		if (!((entry & IND_SOURCE) | (entry & IND_INDIRECTION))) {
+			continue;
+		}
+		page = entry & PAGE_MASK;
+		limit = ptr;
+
+		/* See if a previous page has the current page as its
+		 * destination.
+		 * i.e. invariant 2
+		 */
+		cptr = kimage_dst_conflict(image, page, limit);
+		if (cptr) {
+			unsigned long cpage;
+ 			kimage_entry_t centry;
+			centry = *cptr;
+			cpage = centry & PAGE_MASK;
+			memcpy(phys_to_virt(buffer), phys_to_virt(page), PAGE_SIZE);
+			memcpy(phys_to_virt(page), phys_to_virt(cpage), PAGE_SIZE);
+			*cptr = page | (centry & ~PAGE_MASK);
+			*ptr = buffer | (entry & ~PAGE_MASK);
+			buffer = cpage;
+		}
+		if (!(entry & IND_SOURCE)) {
+			continue;
+		}
+
+		/* See if a previous page is our destination page.
+		 * If so claim it now.
+		 * i.e. invariant 3
+		 */
+		cptr = kimage_src_conflict(image, destination, limit);
+		if (cptr) {
+			unsigned long cpage;
+ 			kimage_entry_t centry;
+			centry = *cptr;
+			cpage = centry & PAGE_MASK;
+			memcpy(phys_to_virt(buffer), phys_to_virt(cpage), PAGE_SIZE);
+			memcpy(phys_to_virt(cpage), phys_to_virt(page), PAGE_SIZE);
+			*cptr = buffer | (centry & ~PAGE_MASK);
+			*ptr = cpage | ( entry & ~PAGE_MASK);
+			buffer = page;
+		}
+		/* If the buffer is my destination page do the copy now 
+		 * i.e. invariant 3 & 1
+		 */
+		if (buffer == destination) {
+			memcpy(phys_to_virt(buffer), phys_to_virt(page), PAGE_SIZE);
+			*ptr = buffer | (entry & ~PAGE_MASK);
+			buffer = page;
+		}
+	}
+	free_page((unsigned long)phys_to_virt(buffer));
+	return 0;
+}
+
+static int kimage_add_empty_pages(struct kimage *image,
+	unsigned long len)
+{
+	unsigned long pos;
+	int result;
+	for(pos = 0; pos < len; pos += PAGE_SIZE) {
+		char *page;
+		result = -ENOMEM;
+		page = (void *)__get_free_page(GFP_KERNEL);
+		if (!page) {
+			goto out;
+		}
+		result = kimage_add_page(image, virt_to_phys(page));
+		if (result) {
+			goto out;
+		}
+	}
+	result = 0;
+ out:
+	return result;
+}
+
+
+static int kimage_load_segment(struct kimage *image,
+	struct kexec_segment *segment)
+{	
+	unsigned long mstart;
+	int result;
+	unsigned long offset;
+	unsigned long offset_end;
+	unsigned char *buf;
+
+	result = 0;
+	buf = segment->buf;
+	mstart = (unsigned long)segment->mem;
+
+	offset_end = segment->memsz;
+
+	result = kimage_set_destination(image, mstart);
+	if (result < 0) {
+		goto out;
+	}
+	for(offset = 0;  offset < segment->memsz; offset += PAGE_SIZE) {
+		char *page;
+		size_t size, leader;
+		page = (char *)__get_free_page(GFP_KERNEL);
+		if (page == 0) {
+			result  = -ENOMEM;
+			goto out;
+		}
+		result = kimage_add_page(image, virt_to_phys(page));
+		if (result < 0) {
+			goto out;
+		}
+		if (segment->bufsz < offset) {
+			/* We are past the end of the buffer, so zero the whole page */
+			memset(page, 0, PAGE_SIZE);
+			continue;
+		}
+		size = PAGE_SIZE;
+		leader = 0;
+		if (offset == 0) {
+			leader = mstart & ~PAGE_MASK;
+		}
+		if (leader) {
+			/* We are on the first page, so zero the unused leading portion */
+			memset(page, 0, leader);
+			size -= leader;
+			page += leader;
+		}
+		if (size > (segment->bufsz - offset)) {
+			size = segment->bufsz - offset;
+		}
+		result = copy_from_user(page, buf + offset, size);
+		if (result) {
+			result = (result < 0)?result : -EIO;
+			goto out;
+		}
+		if (size < (PAGE_SIZE - leader)) {
+			/* zero the trailing part of the page */
+			memset(page + size, 0, (PAGE_SIZE - leader) - size);
+		}
+	}
+ out:
+	return result;
+}
+
+
+/* do_kexec executes a new kernel 
+ */
+static int do_kexec(unsigned long start, unsigned long nr_segments,
+	struct kexec_segment *arg_segments, struct kimage *image)
+{
+	struct kexec_segment *segments;
+	size_t segment_bytes;
+	int i;
+
+	int result; 
+	unsigned long reboot_code_buffer;
+	kimage_entry_t *end;
+
+	/* Initialize variables */
+	segments = 0;
+
+	segment_bytes = nr_segments * sizeof(*segments);
+	segments = kmalloc(segment_bytes, GFP_KERNEL);
+	if (segments == 0) {
+		result = -ENOMEM;
+		goto out;
+	}
+	if (copy_from_user(segments, arg_segments, segment_bytes)) {
+		result = -EFAULT;
+		goto out;
+	}
+
+	/* Read in the data from user space */
+	image->start = start;
+	for(i = 0; i < nr_segments; i++) {
+		result = kimage_load_segment(image, &segments[i]);
+		if (result) {
+			goto out;
+		}
+	}
+	
+	/* Terminate early so I can get a place holder. */
+	result = kimage_terminate(image);
+	if (result)
+		goto out;
+	end = image->entry;
+
+	/* Usage of the reboot code buffer is subtle.  We first
+	 * find a contiguous area of ram that is not one
+	 * of our destination pages.  We do not allocate the ram.
+	 *
+	 * The algorithm to make certain we do not have address
+	 * conflicts requires each destination region to have some
+	 * backing store so we allocate arbitrary source pages.
+	 *
+	 * Later in machine_kexec when we copy data to the
+	 * reboot_code_buffer it still may be allocated for other
+	 * purposes, but we do know there are no source or destination
+	 * pages in that area.  And since the rest of the kernel
+	 * is already shut down those pages are free for use,
+	 * regardless of their page->count values.
+	 *
+	 * The kernel mapping of the reboot code buffer is passed to
+	 * the machine dependent code.  If it needs something else
+	 * it is free to set that up.
+	 */
+	result = kimage_get_unused_area(
+		image, KEXEC_REBOOT_CODE_SIZE, KEXEC_REBOOT_CODE_ALIGN,
+		&reboot_code_buffer);
+	if (result) 
+		goto out;
+
+	/* Allocating pages we should never need is silly but the
+	 * code won't work correctly unless we have dummy pages to
+	 * work with.
+	 */
+	result = kimage_set_destination(image, reboot_code_buffer);
+	if (result) 
+		goto out;
+	result = kimage_add_empty_pages(image, KEXEC_REBOOT_CODE_SIZE);
+	if (result)
+		goto out;
+	image->reboot_code_buffer = phys_to_virt(reboot_code_buffer);
+
+	result = kimage_terminate(image);
+	if (result)
+		goto out;
+
+	result = kimage_get_off_destination_pages(image);
+	if (result)
+		goto out;
+
+	/* Now hide the extra source pages for the reboot code buffer.
+	 */
+	image->entry = end;
+	result = kimage_terminate(image);
+	if (result)
+		goto out;
+
+	result = 0;
+ out:
+	/* cleanup and exit */
+	if (segments)	kfree(segments);
+	return result;
+}
+
+
+/*
+ * Exec Kernel system call: for obvious reasons only root may call it.
+ * 
+ * This call breaks up into three pieces.  
+ * - A generic part which loads the new kernel from the current
+ *   address space, and very carefully places the data in the
+ *   allocated pages.
+ *
+ * - A generic part that interacts with the kernel and tells all of
+ *   the devices to shut down, preventing ongoing DMA and placing
+ *   the devices in a consistent state so a later kernel can
+ *   reinitialize them.
+ *
+ * - A machine specific part that includes the syscall number
+ *   and then copies the image to its final destination and
+ *   jumps into the image at entry.
+ *
+ * kexec does not sync, or unmount filesystems so if you need
+ * that to happen you need to do that yourself.
+ */
+struct kimage *kexec_image = 0;
+
+asmlinkage long sys_kexec_load(unsigned long entry, unsigned long nr_segments, 
+	struct kexec_segment *segments, unsigned long flags)
+{
+	/* Am I using too much stack space here? */
+	struct kimage *image, *old_image;
+	int result;
+		
+	/* We only trust the superuser with rebooting the system. */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	/* In case we need just a little bit of special behavior for
+	 * reboot on panic 
+	 */
+	if (flags != 0)
+		return -EINVAL;
+
+	image = 0;
+	if (nr_segments > 0) {
+		image = kimage_alloc();
+		if (!image) {
+			return -ENOMEM;
+		}
+		result = do_kexec(entry, nr_segments, segments, image);
+		if (result) {
+			kimage_free(image);
+			return result;
+		}
+	}
+
+	old_image = xchg(&kexec_image, image);
+
+	kimage_free(old_image);
+	return 0;
+}
diff -uNr linux-2.5.51/kernel/sys.c linux-2.5.51.x86kexec/kernel/sys.c
--- linux-2.5.51/kernel/sys.c	Thu Dec 12 07:41:37 2002
+++ linux-2.5.51.x86kexec/kernel/sys.c	Thu Dec 12 07:43:54 2002
@@ -16,6 +16,7 @@
 #include <linux/init.h>
 #include <linux/highuid.h>
 #include <linux/fs.h>
+#include <linux/kexec.h>
 #include <linux/workqueue.h>
 #include <linux/device.h>
 #include <linux/times.h>
@@ -207,6 +208,7 @@
 cond_syscall(sys_lookup_dcookie)
 cond_syscall(sys_swapon)
 cond_syscall(sys_swapoff)
+cond_syscall(sys_kexec_load)
 cond_syscall(sys_init_module)
 cond_syscall(sys_delete_module)
 
@@ -419,6 +421,27 @@
 		machine_restart(buffer);
 		break;
 
+#ifdef CONFIG_KEXEC
+	case LINUX_REBOOT_CMD_KEXEC:
+	{
+		struct kimage *image;
+		if (arg) {
+			unlock_kernel();
+			return -EINVAL;
+		}
+		image = xchg(&kexec_image, 0);
+		if (!image) {
+			unlock_kernel();
+			return -EINVAL;
+		}
+		notifier_call_chain(&reboot_notifier_list, SYS_RESTART, NULL);
+		system_running = 0;
+		device_shutdown();
+		printk(KERN_EMERG "Starting new kernel\n");
+		machine_kexec(image);
+		break;
+	}
+#endif
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	case LINUX_REBOOT_CMD_SW_SUSPEND:
 		if (!software_suspend_enabled) {
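
For readers following kimage_src_conflict() in the patch above, here is a minimal user-space sketch of the entry encoding it relies on: each entry holds a page-aligned physical address with flag bits in the low bits, and masking with PAGE_MASK recovers the page. The flag values, the 4 KiB page size, the flat array without indirection pages, and all sk_* names are assumptions made for illustration; the patch has its own definitions, which are not shown in this part of the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; the patch defines its own. */
#define SK_PAGE_SIZE        4096UL
#define SK_PAGE_MASK        (~(SK_PAGE_SIZE - 1))
#define SK_IND_DESTINATION  0x1UL
#define SK_IND_INDIRECTION  0x2UL
#define SK_IND_DONE         0x4UL
#define SK_IND_SOURCE       0x8UL

typedef unsigned long sk_entry_t;

/* Pack a page-aligned address and a flag into one entry. */
static sk_entry_t sk_make_entry(unsigned long page, unsigned long flag)
{
	return (page & SK_PAGE_MASK) | flag;
}

/*
 * Scan a flat entry list, stopping at `limit` or at the DONE marker,
 * and return the first source or indirection entry whose page equals
 * `destination`.  Returns NULL when there is no conflict.
 */
static sk_entry_t *sk_find_src_conflict(sk_entry_t *list,
	unsigned long destination, sk_entry_t *limit)
{
	sk_entry_t *ptr;

	for (ptr = list; ptr != limit && !(*ptr & SK_IND_DONE); ptr++) {
		if (*ptr & SK_IND_DESTINATION)
			continue;	/* only sets the copy target */
		if ((*ptr & SK_PAGE_MASK) == destination)
			return ptr;
	}
	return NULL;
}
```

Given a list with a destination entry for 0x100000 followed by source pages 0x5000 and 0x9000, a search for 0x9000 returns the third entry, unless limit stops the scan before it gets there. A destination entry never counts as a conflict, which mirrors the "nop" branch in the patch.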

^ permalink raw reply

* ehci-hcd.o apparent load failure in 2.4.20-xx.. but
From: Frank Jacobberger @ 2002-12-14  7:48 UTC (permalink / raw)
  To: linux-kernel

Who maintains this driver?

I'm getting an odd error when the kernel boots, saying that ehci-hcd.o.gz
can't load.

If I do an insmod ehci-hcd, I get:

insmod ehci-hcd
Using /lib/modules/2.4.20-0.pp.7/kernel/drivers/usb/hcd/ehci-hcd.o.gz
/lib/modules/2.4.20-0.pp.7/kernel/drivers/usb/hcd/ehci-hcd.o.gz: init_module: No such device

Dmesg and everything else point to it loading:

hcd.c: ehci-hcd @ 00:1d.7, Intel Corp. 82801DB USB EHCI Controller

lspci bears this out:

00:1d.7 USB Controller: Intel Corp. 82801DB USB EHCI Controller (rev 02)

No idea why the kernel is balking at boot and not logging this to kernel messages!

Any ideas?

Thanks,

Frank





^ permalink raw reply

* Re: Time Patch on 1.2.6a
From: hare ram @ 2002-12-14  7:38 UTC (permalink / raw)
  To: Huw Dixon, fabrice, netfilter
In-Reply-To: <F8au7JJi2Fu6JX8dE520000db97@hotmail.com>

Yes.

Download P-O-M or get the new 1.2.7a tar file from the netfilter site.
I have the same version, and I was not able to recompile the kernel after
applying the time and quote patches.
So I removed the RPM, got the latest 1.2.7a, and patched the kernel.

Now it's working fine.

best of luck

hare
----- Original Message -----
From: "Huw Dixon" <huwdixon@hotmail.com>
To: <fabrice@netfilter.org>; <netfilter@lists.netfilter.org>
Sent: Saturday, December 14, 2002 11:14 AM
Subject: Re: Time Patch on 1.2.6a


> Thx - I did have the iptables RPM installed. I've removed the iptables
> package.
> Since I've already gone through the 'pending-patches' step and the TIME match
> option does show in my kernel .config, can I simply do a kernel compile
> from make dep to make modules_install, reboot and install iptables 1.2.6a?
>
> Huw
>
>
> >
> >You need to recompile & install iptables after your kernel has been
> >patched.
> >Since you run RH, also make sure that the iptables RPM package is not
> >installed.
> >
> ># rpm -q iptables
> >package iptables is not installed
> >
> >If it is, just remove it, and re-compile & reinstall iptables.
> >
> >Have a nice day,
> >
> >Fabrice.
> >--
> >Fabrice MARIE
> >
> >"Silly hacker, root is for administrators"
> >        -Unknown

^ permalink raw reply

