* Re: [2.4.17/18pre] VM and swap - it's really unusable
From: Robert Love @ 2002-01-13 20:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: jogi, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel
In-Reply-To: <3C41E415.9D3DA253@zip.com.au>

On Sun, 2002-01-13 at 14:46, Andrew Morton wrote:

> I can't say that I have ever seen any significant change in throughput
> of anything with any of this stuff.

I can send you some numbers.  It is typically 5-10% throughput increase
under load.  Obviously this work won't help a single task on a single
user system.  But things like (ack!) dbench 16 show a marked
improvement.

> Benchmarks are well and good, but until we have a solid explanation for
> the throughput changes which people are seeing, it's risky to claim
> that there is a general benefit.

I have an explanation.  We can schedule quicker off a woken task.  When
an event occurs that allows an I/O-blocked task to run, its time-to-run
is shorter.  Same event/response improvement that helps interactivity.

	Robert Love



* Re: initramfs buffer spec -- second draft
From: H. Peter Anvin @ 2002-01-13 20:08 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel
In-Reply-To: <E16PqXQ-0000BD-00@starship.berlin>

Daniel Phillips wrote:

> 
>>The structure of the cpio_header is as follows (all 8-byte entries
>>contain 32-bit hexadecimal ASCII numbers):
> 
> I thought there's a binary version of the cpio header.  What is the
> point of the ascii encoding?
> 


Byte order independence.  The binary version of cpio is ancient and 
obsolete.  Unfortunately the SysV people didn't have the htons() etc 
macros of BSD, so they had no concept of portable binary formats.
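
To illustrate the byte-order point: each header field is eight ASCII hex
digits with the most significant digit first, so decoding gives the same
value on any host.  A minimal sketch, assuming a hypothetical helper
named parse_hex8 (not taken from the kernel or from cpio itself):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 8-character ASCII hexadecimal header field.
 * Returns 0 on success, -1 if a non-hex character is found.
 * Digits are consumed most-significant first, so the result does not
 * depend on the byte order of the machine doing the parsing. */
static int parse_hex8(const char *field, uint32_t *out)
{
    uint32_t value = 0;

    assert(field != NULL && out != NULL);
    for (size_t i = 0; i < 8; i++) {
        char c = field[i];
        uint32_t digit;

        if (c >= '0' && c <= '9')
            digit = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f')
            digit = (uint32_t)(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F')
            digit = (uint32_t)(c - 'A') + 10;
        else
            return -1;
        value = (value << 4) | digit;   /* shift in the next nibble */
    }
    *out = value;
    return 0;
}
```

The caller passes a pointer to the first character of a field; the
field does not need to be NUL-terminated or aligned.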

 
>>The c_mode field matches the contents of st_mode returned by stat(2)
>>on Linux, and encodes the file type and file permissions.
>>
>>The c_filesize should be zero for any non-regular file.
>>
>>If the filename is "TRAILER!!!" this is actually an end-of-file
>>marker; the c_filesize for an end-of-file marker must be zero.
>>
> It sure looks ugly, but I suppose the c_filesize=zero is the real
> end-of-file marker.  Did I mention it sure looks ugly?
> 


c_filesize == 0 does *NOT* imply an end-of-archive marker.  It is the 
filename "TRAILER!!!" that does.  And yes, it's ugly.
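
The distinction can be sketched in a few lines.  The struct and function
names here are hypothetical, chosen only for illustration; the point is
that the test looks at the name and never at the size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, already-decoded view of one archive entry. */
struct cpio_entry {
    const char *name;      /* NUL-terminated filename */
    uint32_t    filesize;  /* decoded c_filesize */
};

/* An entry ends the archive only if it is named "TRAILER!!!".
 * A zero c_filesize alone just means an empty (or non-regular) file. */
static bool is_end_of_archive(const struct cpio_entry *e)
{
    assert(e != NULL && e->name != NULL);
    return strcmp(e->name, "TRAILER!!!") == 0;
}
```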

	-hpa




* Re: initramfs buffer spec -- second draft
From: H. Peter Anvin @ 2002-01-13 20:11 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-kernel
In-Reply-To: <m1ofjyqb7t.fsf@frodo.biederman.org>

Eric W. Biederman wrote:

> 
> Comments.  Endian issues are not specified, is the data little, big
> or vax endian?
> 


Not applicable.  There are no endian-specific binary structures in the 
format AT ALL.  ASCII-coded fields are always big-endian.


> What is the point of alignment?  If the data starts as 4 byte aligned,
> the 6 byte magic string guarantees the data will be only 2 byte
> aligned.  This isn't good for 32 or 64 bit architectures.


They're ASCII-coded, so it supposedly doesn't matter (yes, it's a bit 
daft, but blame the SysV people.)  The alignment makes sure the *data* 
field is 4-byte aligned.
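
A worked sketch of that padding rule.  The 110-byte header length is the
6-byte magic plus 13 fields of 8 hex digits; the field count comes from
the standard "newc" layout and is an assumption as far as this thread
goes, as are the helper names:

```c
#include <assert.h>
#include <stddef.h>

/* 6-byte magic + 13 fields * 8 hex digits (assumed newc layout). */
#define CPIO_NEWC_HEADER_LEN ((size_t)110)

/* Round an offset up to the next multiple of 4. */
static size_t align4(size_t offset)
{
    return (offset + 3) & ~(size_t)3;
}

/* Offset of the file data for an entry whose header starts at
 * header_start.  name_len_with_nul counts the filename plus its
 * terminating "\0". */
static size_t data_offset(size_t header_start, size_t name_len_with_nul)
{
    assert(header_start % 4 == 0);  /* headers themselves are aligned */
    return align4(header_start + CPIO_NEWC_HEADER_LEN + name_len_with_nul);
}
```

The header plus name may end on any byte boundary; only after the
padding does the data land on a multiple of 4.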


> I do like having a c_magic that at least allows us to change things
> in the future if necessary.


It's pretty clear from a lot of the comments that a number of people 
haven't understood that the cpio encapsulation *IS A CODIFICATION OF AN 
EXISTING FORMAT.*

	-hpa




* 2.5.1-dj14
From: Dave Jones @ 2002-01-13 20:13 UTC (permalink / raw)
  To: Linux Kernel

Assorted merges, fixes, updates & work in progress.

Most of the remaining compilation errors are drivers in need
of bio work.  There may be a handful of kdev_t fixes still
needed, but this seems mostly done.

Me being away for a week has meant that the backlog grew quite large,
and as such this has a large number of merges; tread carefully.

Patch against 2.5.1 vanilla is available from:
ftp://ftp.xx.kernel.org/pub/linux/kernel/people/davej/patches/2.5/

 -- Davej.

2.5.1-dj14
o   Merge 2.5.2pre10 & pre11
    | Also include Ingosched H7, so my testboxes now boot.
    | Dropped Manfreds ldtgrow patch for now due to conflict.
o   Merge 2.4.18pre2 & pre3
    | DRM4.0 changes dropped.
    | various SCSI layer changes dropped.
o   Yet more kdev_t fixes.				(Various)
o   Fix potential ide-cd oops.				(Zwane Mwaikambo)
o   Reiserfs kmalloc cleanup.				(Oleg Drokin)
o   Reiserfs potential corruption fix.			(Oleg Drokin)
o   Reiserfs endian fixes.				(Oleg Drokin)
o   64 bit cleanness for reiserfs.			(Oleg Drokin)
o   Fix 'sticky alt on chvt' problem.			(James Simmons)
o   Fix 3Dfx fbdev ROP ops namespace collision.		(James Simmons)
o   Console blanking improvement.			(James Simmons)
o   Multiple sound devices for OSS API.			(Chris Rankin)
o   Remove unneeded pidhash clearing.			(Randy Dunlap)
o   Allow enslaved devices with same ethernet address.	(Lennert Buytenhek)
o   Cleanup IDE casts.					(Pavel Machek)
o   Work around FAT fs __divdi generation.		(Tom Rini)
o   Print correct MCE address in bluesmoke.		(Lowell Miles)
o   Numerous 's/more then/more than/'			(Me)


2.5.1-dj13
o   Merge 2.5.2pre9
    | Take akpm fix for ext3 over Linus'	(Andrew Morton)
o   More kdev_t fixes.				(Various)
o   Remerge acmes bio cleanups from -dj11	(Arnaldo Carvalho de Melo)
o   Add __copy_to_user_prefetch()		(Me)
o   Clean up preload_cache() a little.		(Me)


2.5.1-dj12
o   Merge 2.5.2pre7
o   Enable alternative PTE routines.			(Andrea Arcangeli)
o   Reschedule during inode flushes under mem pressure.	(Andrea Arcangeli)
o   More kdev_t compile fixes.				(Andries Brouwer,
					 		 Jonathan Corbet,
							 Luc Van Oostenryck, Me)
o   Further include file cleanups.			(Me)
o   aic7xxx nseg bugfix.				(Jens Axboe, Others)
o   Fix panic on corrupted reiserfs.			(Oleg Drokin)
o   Fix reiserfs taildata corruption on mempressure.	(Oleg Drokin)
o   Fix kreiserfsd sleep timeout thinko.		(Oleg Drokin)
o   apic.c LVTERR fixes.				(Mikael Pettersson)
o   Merge correct & updated sg driver.			(Doug Gilbert)
o   Add NetGear EA201 to ne2k ISAPNP clones.		(Chris Rankin)
o   Various 53c700 fixes.				(James Bottomley)


2.5.1-dj11
o   Merge up to 2.5.2pre6
    | Plus various compile fixes.		(Me, Jeff Garzik,
		    				 Frank Davis, Martin Dalecki)
o   Don't enable APIC on newer Dell laptops.	(Mikael Pettersson)
o   Add more missing MODULE_LICENSE tags.	(Me)
o   Report out-of-spec SMP Athlons.		(Me)
    | Flames to /dev/null
o   More fbdev/console clean up.		(James Simmons)
o   Sync up with latest bootproto.		(H. Peter Anvin)
o   Reiserfs Sparc alignment fix.		(Alexander Zarochentcev)
o   Remove some bogus headers left around.	(Christoph Hellwig)
o   Fix wanrouter build.			(Me)
o   Various bio surgery on SCSI drivers.	(Arnaldo Carvalho de Melo)
o   Reiserfs getblk cleanups.			(Christoph Hellwig)
o   make DASD use generic BLKGETSIZE{64} again	(Christoph Hellwig)
o   Fix devfs & tty breakage.			(James Simmons)


2.5.1-dj10
o   Remove one of the NFS changes. Better fix in mainline.	(Me)
o   Add switch to enable 486 string copies.			(Me)
    | 486 users please try this out, and give feedback
    | so we can see how broken this actually is.
    | It's in the 'kernel hacking' menu.
o   JFFS2 corruption fix.			(David Woodhouse)
o   Bridging CONFIG_INET cleanup.		(Lennert Buytenhek)
o   Bridging recursion bugfix.			(Lennert Buytenhek)
o   Fix up port state handling.			(Lennert Buytenhek)
o   Improved fbdev init.			(James Simmons)
o   PNPOS simple bootflag fix.			(Thomas Hood)
o   Drop most of the USB changes on Greg's request.
    | Newer versions should appear in -linus soon.
    | Some bits still remain, but if I've broken it, blame
    | me and not Greg.
o   Experimental preload_cache() function.	(Me)
o   Ugly hack to file_read_actor() to use the above	(Me)
    | Just playing, this needs more work.


2.5.1-dj9
o   Merge up to 2.5.2pre4.
    | Also fix up a bunch of build errors.
o   Add support for Sony DSC-P5 to USB unusual devs.	(Gregor Jasny)
o   First part of new console locking infrastructure.	(James Simmons)
o   Cleaner/Lighter fbdev api.				(James Simmons,
							 Geert Uytterhoeven)
o   Don't coredump framebuffer contents.		(Andrew Morton)
o   Fix hang on close of serial tty.			(Russell King)
o   Remove the set_current_state() patch, needs work.	(Me)
o   Drop ICH2 addition to ioapic Whitelist. 		(Me)
o   Do the asm/segment.h crapectomy properly.		(David Woodhouse)
o   Reactivate the PNPBIOS Configure.help entry.


2.5.1-dj8
o   Remove leftover EISA cruft in x86 ksyms.		(Me)
o   Add a missing part of the split visws support.	(Me)
o   Make reiserfs partitions mountable again.		(Al Viro,
							 Andrew Morton, Me)
o   Make x86 math emulation work with dynamic LDT.	(Manfred Spraul)
o   Fix problems with tdfxfb & high pixelclocks.	(Jurriaan)
    | Only tested on PCI 4500, feedback to thunder7@xs4all.nl
o   Replace text.lock with .subsection			(Keith Owens)
o   Remove Cyrix SLOP workaround.			(Me)
    | Can be done in userspace/initramfs.
o   Merge pnpbios support.				(Thomas Hood)
    | Should work, but may be nice to bend into shape
    | to fit the new driverfs model at some point.


2.5.1-dj7
o   Merge 2.5.2pre3
    | Drop some of the reiserfs changes. Looks like -dj has
    | a more complete set of fixes from 2.4. This is getting
    | a little hairy, so handle with care.
o   Make rootfs compile.				(Me)
o   Dynamically grow LDT.				(Manfred Spraul)
o   Randomness for ext2 generation numbers.		(Manfred Spraul)
o   Give Manfreds threaded coredump a retry.		(Manfred Spraul)
o   Add missing ad1848 formats.				(Alan Cox)
o   Make ide-floppy compile without PROC_FS.		(Robert Love)
o   generic_serial, rio_linux, serial_tx3912,		(Rasmus Andersen)
    sh-sci and sx drivers janitor work.
o   opl3sa2 Power management support & update.		(Zwane Mwaikambo)
    | Add Zwane to MAINTAINERS for this too.
o   Fix buggy MODINC i2o_config macro.			(Andreas Dilger)
o   Cyclades driver /proc/ioports oops fix.		(Andrew Morton)
    | Untested afaik, but looks sane.
    | rmmod cyclades.o ; cat /proc/ioports to see if this works.
o   SX driver, DCD-HylaFAX problem solved.		(Heinz-Ado Arnolds)
o   Only look in 1KB of EBDA for MP table.		(Zwane Mwaikambo) 
    | Follows the MP1.4 Spec closer, let me know of any
    | SMP problems if any with this change.
o   Better fix for the sunrpc 'missing include'.	(David Woodhouse)
o   Remove bogus <asm/segment.h> includes.		(David Woodhouse)
o   ps2esdi spinlock typo.				(Me)


2.5.1-dj6
o   Merge 2.5.2pre2
    | Includes updated for 2.5 SCSI debug driver.	(Douglas Gilbert)
o   Merge 2.4.18pre1
o   Missing include in sunrpc sched.c			(David S. Miller)
o   Remove incorrect devinit's from bttv & USB.		(Andrew Morton)
o   Remove redundant EISA_bus__is_a_macro macro.	(Me)
o   Split visws support to setup-visws.c		(Me)
    | Can someone with one of these beasts test this, and maybe
    | even *gulp* maintain it ?
o   pc110pad spinlock thinko				(Peter T. Breuer)
o   Fix reiserfs + highmem possible oops.		(Oleg Drokin)
o   Fix reiserfs fsx breakage.				(Oleg Drokin)
o   Make IPV6 accept timestamps in response to SYNs.	(Alexey Kuznetsov)
o   NCR5380_timer_fn needs to be static.		(Rasmus Andersen)
o   CONFIG_SERIAL_ACPI is IA64 only.			(Me)


2.5.1-dj5
o   Sync up to 2.5.2pre1
o   Merge 2.4.17final.
o   Gravis ultrasound PnP update		(Andrey Panin)


2.5.1-dj4
o   Merge with 2.4.17-rc2
    | Most was already here, more or less just fixes for
    | reiserfs & netfilter, and some VM changes.


2.5.1-dj3
o   Drop Manfreds multithread coredump changes		(Me)
    | They caused ltp waitpid05 regression on 2.5
    | (Same patch is fine for 2.4)
o   Intermezzo compile fix.				(Chris Wright)
o   Fix ymfpci & hisax merge errors.			(Me)
o   Drop ad1848 sound driver changes in favour of 2.5	(Me)
o   Make hpfs work again.				(Al Viro)
o   Alpha Jensen compile fixes.				(Ronald Lembcke)
o   Make NCR5380 compile non modularly.			(Erik Andersen)


2.5.1-dj2
o   bio fixes for qlogicfas.			(brett@bad-sports.com)
o   Correct x86 CPU helptext.			(Me)
o   Fix serial.c __ISAPNP__ usage.		(Andrey Panin)
o   Use better ide-floppy fixes.		(Jens Axboe)
o   Make NFS 'fsx' proof.			(Trond Myklebust)
    | 2 races & 4 bugs, hopefully this is all.
o   devfs update				(Richard Gooch)
o   Backout early CPU init, needs more work.	(Me)
    | This should fix several strange reports.
o   drop new POSIX kill semantics for now	(Me)


2.5.1-dj1
o   Resync with 2.5.1
    | drop reiserfs changes. 2.4's look to be more complete.
o   Fix potential sysvfs oops.				(Christoph Hellwig)
o   Loopback driver deadlock fix.			(Andrea Arcangeli)
o   __devexit cleanups in drivers/net/			(Daniel Chen,
    synclink, wdt_pci & via82cxxx_audio 		 John Tapsell)
o   Configure.help updates				(Eric S. Raymond)
o   Make reiserfs compile again.				(Me)
o   bio changes for ide floppy					(Me)
    | handle with care, compiles, but is unfinished.
o   Make x86 identify_cpu() happen earlier			(Me)
    | PPro errata workaround & APIC setup got a little
    | cleaner as a result.
o   Blink keyboard LEDs on panic				(From 2.4.13-ac)
o   Change current->state frobbing to set_current_state()	(From 2.4.13-ac)
o   Add MODULE_LICENSE tags for acpi,md.c,fmvj18x,		(From 2.4.13-ac)
    atyfb & fbmem.


-- 
Dave Jones.                    http://www.codemonkey.org.uk
SuSE Labs.


* Re: [2.4.17/18pre] VM and swap - it's really unusable
From: Dieter Nützel @ 2002-01-13 20:11 UTC (permalink / raw)
  To: J Sloan, Robert Love; +Cc: Linux Kernel List

> The problem here is that when people report
> that the low latency patch works better for them
> than the preempt patch, they aren't talking about
> benchmarking the time to compile a kernel, they
> are talking about interactive feel and smoothness.
>
> You're speaking to a peripheral issue.

Yes, but I did latency testing for Robert for several months, now.

> I've no agenda other than wanting to see linux
> as an attractive option for the multimedia and
> gaming crowds

I am, too. But more for 3D visualization/simulation (with audio).

> - and in my experience, the low
> latency patches simply give a much smoother
> feel and a more pleasant experience.

Not for me. Even when lock-break is applied.

> Kernel
> compilation time is the farthest thing from my
> mind when e.g. playing Q3A!

Q3A is _NOT_ changed in any case. Even some smoother system "feeling" with 
Q3A and UT 436 running in parallel on an UP 1 GHz Athlon II, 640 MB. Have you 
seen something on any Win box?

> I'd be happy to check out the preempt patch
> again and see if anything's changed, if the
> problem of tux+preempt oopsing has been
> dealt with -

You told me before that TUX showed some problems with preempt.
What problems? Are they TUX specific?

Some latency numbers coming soon.

-Dieter

-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel@hamburg.de


* Re: BIO Usage Error or Conflicting Designs
From: Andre Hedrick @ 2002-01-13 19:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andre Hedrick, linux-kernel
In-Reply-To: <20020113135927.A11793@suse.de>

On Sun, 13 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 12 2002, Andre Hedrick wrote:
> > 
> > Jens,
> > 
> > Here is back at you sir.
> 
> Without highmem debug enabled?? I already knew this was the bug
> triggered, nothing new here.
> 
> Please print the two pfn values triggering the BUG_ON, I'll take a look
> at this tomorrow.

That is with highmem debug on, the stuff at the end of the config file.
Nothing more is generated, if there are more flags to set please tell me
where.

Regards,

Andre Hedrick
Linux Disk Certification Project                Linux ATA Development



* Re: ugly warnings with likely/unlikely
From: Andrew Morton @ 2002-01-13 20:12 UTC (permalink / raw)
  To: Alan Cox; +Cc: Oliver.Neukum, linux-kernel
In-Reply-To: <E16PmEA-0007Ai-00@the-village.bc.nu>

Alan Cox wrote:
> 
> > if (likely(stru->pointer))
> >
> > results in an ugly warning about using pointer as int.
> > Is there something that could be done against that ?
> 
>         if (likely(stru->pointer == NULL))
> 

-#define likely(x)       __builtin_expect((x),1)
-#define unlikely(x)     __builtin_expect((x),0)
+#define likely(x)       __builtin_expect((x)!=0,1)
+#define unlikely(x)     __builtin_expect((x)!=0,0)

?
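
For illustration, the proposed definitions can be tried in a small
stand-alone file.  Comparing against zero turns the pointer into an int
truth value before it reaches __builtin_expect (a GCC/Clang builtin), so
a plain pointer test no longer mixes pointer and integer types.  The
struct and function names below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* The definitions from the proposed change above. */
#define likely(x)       __builtin_expect((x) != 0, 1)
#define unlikely(x)     __builtin_expect((x) != 0, 0)

/* Hypothetical structure with a pointer member, as in the report. */
struct stru {
    int *pointer;
};

/* Return the pointed-to value, or fallback when the pointer is NULL.
 * The common case (pointer set) is marked as the expected branch. */
static int deref_or_default(const struct stru *s, int fallback)
{
    assert(s != NULL);
    if (likely(s->pointer))
        return *s->pointer;
    return fallback;
}
```

The same macros still accept ordinary integer conditions, since
(x) != 0 is well defined for both.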


* Re: F00F-bug workaround working?
From: H. Peter Anvin @ 2002-01-13 20:20 UTC (permalink / raw)
  To: linux-kernel
In-Reply-To: <20020112160308.B4926@ksu.edu>

Followup to:  <20020112160308.B4926@ksu.edu>
By author:    Joseph Pingenot <jap3003@ksu.edu>
In newsgroup: linux.dev.kernel
>
> From Tony Glader on Saturday, 12 January, 2002:
> >I have had problems with 2.4.17 running on a Classic Pentium (lots of 
> >oopses). I'm sure that there's no problem with the hardware. Is the
> >F00F-bug workaround still working?
> 
> I'm running a P100 on 2.4.17 without any errors.  I see
> "Intel Pentium with F0 0F bug - workaround enabled."
> in the dmesg logs.
> 
> How about posting your oopses (after running them through ksymoops)
> 

F00F wouldn't oops the machine anyway, it would lock it up solid.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>


* Re: [2.4.17/18pre] VM and swap - it's really unusable
From: jogi @ 2002-01-13 20:17 UTC (permalink / raw)
  To: Robert Love
  Cc: Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox,
	nigel, Rob Landley, linux-kernel
In-Reply-To: <1010946178.11848.14.camel@phantasy>

On Sun, Jan 13, 2002 at 01:22:57PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 12:42, jogi@planetzork.ping.de wrote:
> 
> >         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
> > j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
> > j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
> > j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
> > j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
> > j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
> > 		                                                                                          
> > j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
> > j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
> > j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
> > j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
> > j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
> 
> Again, preempt seems to reign supreme.  Where is all the information
> showing that preempt is inferior?  To be fair, however, we should bench a
> mini-ll+s test.

Your wish is granted. Here are the results for mini-ll + scheduler:

j100:   8:26.54
j100:   7:50.35
j100:   6:49.59
j100:   6:39.30
j100:   6:39.70
j75:    6:01.02
j75:    6:12.16
j75:    6:04.60
j75:    6:24.58
j75:    6:28.00
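
To compare these runs with the columns in the quoted table, the five
timings per load level can be averaged.  A small sketch, assuming the
"m:ss.cc" format shown above; the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Convert a "m:ss.cc" wall-clock string to seconds.
 * Returns a negative value if the string does not parse. */
static double parse_time(const char *s)
{
    int minutes;
    double seconds;

    if (sscanf(s, "%d:%lf", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Arithmetic mean of n timings, in seconds.
 * Returns a negative value if n is 0 or any entry fails to parse. */
static double mean_seconds(const char *const *runs, size_t n)
{
    double total = 0.0;

    assert(runs != NULL);
    if (n == 0)
        return -1.0;
    for (size_t i = 0; i < n; i++) {
        double t = parse_time(runs[i]);
        if (t < 0.0)
            return -1.0;
        total += t;
    }
    return total / (double)n;
}
```

With only five samples per configuration and a wide spread between the
slowest and fastest run, a mean is a rough guide at best.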

Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>


* Re: [PATCH] 1-2-3 GB
From: H. Peter Anvin @ 2002-01-13 20:24 UTC (permalink / raw)
  To: linux-kernel
In-Reply-To: <Pine.LNX.4.21.0201121825200.1105-100000@localhost.localdomain>

Followup to:  <Pine.LNX.4.21.0201121825200.1105-100000@localhost.localdomain>
By author:    Hugh Dickins <hugh@veritas.com>
In newsgroup: linux.dev.kernel
> 
> Usually not a problem: but if you configure for 1GB of user virtual
> and 3GB of kernel virtual, and you have more than 1GB of physical
> memory (as you normally would if you chose HIGHMEM64G), then there's
> a page at physical address 0x3ffff000, directly mapped to virtual
> address 0x7ffff000.  And if that page happens to get used for the
> pmd of a process, then on exit the free_one_pgd loop wraps over
> to carry on freeing "entries" at 0x80000000, 0x80000008, ...
> A lot of pmd_ERROR messages, but eventually an entry scrapes
> through the pmd_bad test and is wrongly freed, not so good.
> 

By the way, expect user programs to fail due to lack of address space
if you only give them 1 GB of userspace.  At 1 GB of userspace there
is *no* address space which is compatible with the normal address
space map available to the user process.

I would personally vote against including that particular option.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>


* Re: [2.4.17/18pre] VM and swap - it's really unusable
From: Roman Zippel @ 2002-01-13 20:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: Robert Love, Kenneth Johansson, arjan, Rob Landley, linux-kernel
In-Reply-To: <E16Pmok-0007GD-00@the-village.bc.nu>

Hi,

Alan Cox wrote:

> > What somehow got lost in this discussion is that both patches don't
> > necessarily conflict with each other; they both attack the same problem
> > with different approaches, which complement each other. I prefer to get
> > the best of both patches.
> 
> When you look at the benchmark there is no difference between ll and
> ll+pre-empt. ll alone takes you to the 1ms point.

I don't doubt that, but would you seriously consider the ll patch for
inclusion into the main kernel?
It's a useful patch for anyone, who needs good latencies now, but it's
still a quick&dirty solution. Preempt offers a clean solution for a
certain part of the problem, as it's possible to cleanly localize the
needed changes for preemption (at least for UP). That means the ll patch
becomes smaller and future work on ll becomes simpler, since a certain
type of latency problems is handled automatically (and transparently),
so you do gain something by it.
The remaining places pointed out in the ll patch are worth a closer look
as well, as mostly now we hold a spinlock for too long. These should be
fixed as well, as they mean possible contention problems on SMP.

> pre-empt takes you no
> further and to get much out of pre-emption requires you go and do all the
> hideously slow and complex priority inversion stuff.

The possibility of priority inversion problems is not new; it was
already discussed before. It was considered not a serious problem, since
all processes will still make progress. Preempt now increases the
likelihood that such a situation occurs, but nonetheless the processes will
still make progress. In the past I can't remember any report that
indicated a problem caused by priority inversion and so I simply can't
believe it should become a massive problem now with preempt.

> > exactly that reason. I don't think we need to work around broken
> > hardware, but halfway decent hardware should not be a problem to get
> > decent latency.
> 
> We have to work around common hardware not designed for SMP - the 8390 isn't
> a broken chip in that sense, it's just from a different era, and there are a
> lot of them.

Please let me rephrase: I just don't expect terribly good latency
numbers with non-DMA hardware.

bye, Roman


* __alloc_pages: 0-order allocation failed on 2.4.18-pre2aa2
From: Diego Calleja @ 2002-01-13 20:35 UTC (permalink / raw)
  To: linux-kernel

I got this message once from the kernel. I was only running 'apt-get
install XXXX'. Not reproducible. Using 2.4.18-pre2aa2:
 
__alloc_pages: 0-order allocation failed (gfp=0xf0/0)

Diego Calleja.


* Re: [patch] O(1) scheduler, -G1, 2.5.2-pre10, 2.4.17 (fwd)
From: Dieter Nützel @ 2002-01-13 20:30 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Ingo Molnar, Robert Love, Linux Kernel List

Rusty wrote:
> Agree.  Anyone who really has 3 CPU hogs on a 2 CPU machine, *and*
> never runs two more tasks to perturb the system, *and* notices that
> one runs twice the speed of the other two, *and* cares about fairness
> (ie. not RC5 etc), feel free to Email abuse to me.  Not Ingo, he has
> real work to do 8)

Or buy a third CPU...;-)

Regards,
	Dieter

BTW Not meant as flaming.
--
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel@hamburg.de


* [parisc-linux] 715/50
From: geezer @ 2002-01-13 20:34 UTC (permalink / raw)
  To: parisc-linux, debian-hppa

[-- Attachment #1: Type: text/plain, Size: 305 bytes --]

So the palinux-0.9-README file says "YOU MUST INSTALL 'sid'", blah, blah, blah...
I get "failed getting release file file:/instmnt/dists/sid/Release"
The woody one installs OK, but I am not sure how "crippled" I may be with it.
Any ideas?  What more can I tell you to help with the diagnosis?
Geezer

[-- Attachment #2: Type: text/html, Size: 811 bytes --]


* Re: [2.4.17/18pre] VM and swap - it's really unusable
From: Andrew Morton @ 2002-01-13 20:30 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel
In-Reply-To: <1010952276.12125.59.camel@phantasy>

Robert Love wrote:
> 
> ...
> > Benchmarks are well and good, but until we have a solid explanation for
> > the throughput changes which people are seeing, it's risky to claim
> > that there is a general benefit.
> 
> I have an explanation.  We can schedule quicker off a woken task.  When
> an event occurs that allows an I/O-blocked task to run, its time-to-run
> is shorter.  Same event/response improvement that helps interactivity.
> 

Sounds more like handwaving than an explanation :)

The way to speed up dbench is to allow the processes which want to delete
files to actually do that.  This reduces the total amount of IO which the
test performs.  Another way is to increase usable memory (or at least to
delay the onset of balance_dirty going synchronous).  Possibly it's something
to do with letting kswapd schedule earlier.  Or bdflush.

In the swapstorm case, it's again not clear to me.  Perhaps it's due to prompter
kswapd activity, perhaps due somehow to improved request merging.

As I say, without a precise and detailed understanding of the mechanisms
I wouldn't be prepared to claim more than "speeds up dbench and swapstorms
for some reason".

(I'd _like_ to know the complete reason - that way we can stare at it
and maybe make things even better.  Doing a binary search through the
various chunks of the mini-ll patch would be instructive).

-


* why do i get kernel panic [ solved ! ]
From: Nico Schottelius @ 2002-01-13 20:37 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Alan Cox

Hello list!

I solved the problem and wanted to tell you what the problem
was:
- the filesystem support was fine
- init was fine
but....:

I forgot that init was dynamically linked, and there was neither libc.so.6
nor ld-linux.so on my system!

My concern is that the kernel says there is no init, although there was
an init;
it just did not have the libs it needed.

Could we not change the panic message to
"could not execute init correctly (possibly a problem with init or
init's libs)" ?


--
{Greetings,Gruss},
Nico Schottelius

I am some kind of busy -
Do not expect an answer within 24 hours.
Instead use the telephone: +49 (0) 173 - 750 7022.





* Re: initramfs buffer spec -- second draft
From: Alexander Viro @ 2002-01-13 20:39 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel
In-Reply-To: <m1ofjyqb7t.fsf@frodo.biederman.org>



On 13 Jan 2002, Eric W. Biederman wrote:

> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
> > This is an update to the initramfs buffer format spec I posted
> > earlier.  The changes are as follows:
> 
> Comments.  Endian issues are not specified, is the data little, big
> or vax endian?

Data is what you put into files, byte-by-byte.  Headers are ASCII.
 
> What is the point of alignment?  If the data starts as 4 byte aligned,
> the 6 byte magic string guarantees the data will be only 2 byte
> aligned.  This isn't good for 32 or 64 bit architectures.

Both data and headers are aligned.  And headers are ascii strings.



* Re: Hard lock when mounting loopback file
From: Andrew Morton @ 2002-01-13 20:35 UTC (permalink / raw)
  To: Marius Gedminas; +Cc: linux-kernel
In-Reply-To: <20020113115230.GB1955@gintaras>

Marius Gedminas wrote:
> 
> On Sat, Jan 12, 2002 at 11:49:04PM -0800, Andrew Morton wrote:
> > I don't know a thing about fat layout, but it appears that it uses a
> > linked list of blocks, and if that list ends up pointing back onto
> > itself, the kernel goes into an infinite loop in several places chasing
> > its way to the end of the list.
> >
> > The below patch fixed it for me, and I was able to mount and read
> > your filesystem image.
> >
> > Unless someone has a smarter fix, I'll send this to the kernel
> > maintainers in a week or two.
> 
> It seems to me that this patch will find only those infinite loops where
> the last link of the chain points to itself.  But there could be loops
> where the last link points to the middle of the chain.

Agree.

> Additional check on the number of followed links could be useful there.
> No chain should be longer than the number of clusters on the fs.
> Although on large FAT32 filesystems the number of clusters can be high,
> a very long loop is still better than an infinite one.  (In cases where
> we know the file size, this limit can be reduced to
> file_size/cluster_size + 1 links).

hmm..  OK, I'll take a look at that approach.
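
The suggested bound might look roughly like this.  It is a sketch only:
it models the FAT as a plain in-memory array with an illustrative
end-of-chain marker, and none of the names come from the kernel's fat
code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative end-of-chain marker, not the on-disk FAT encoding. */
#define CHAIN_END 0xFFFFFFFFu

/* Walk the cluster chain that starts at first.
 * Returns the number of clusters in the chain, or -1 if the chain
 * points outside the table or is longer than the filesystem has
 * clusters, which can only happen if it loops. */
static long chain_length(const uint32_t *fat, uint32_t nclusters,
                         uint32_t first)
{
    uint32_t cur = first;
    uint32_t links = 0;

    assert(fat != NULL);
    while (cur != CHAIN_END) {
        if (cur >= nclusters)
            return -1;          /* corrupt link */
        if (++links > nclusters)
            return -1;          /* followed too many links: a loop */
        cur = fat[cur];
    }
    return (long)links;
}
```

This catches a loop back into the middle of the chain as well as a link
pointing to itself.  Where the file size is known, nclusters could be
replaced by the tighter file_size/cluster_size + 1 limit mentioned
above.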

-


* Re: SCSI host numbers?
From: Itai Nahshon @ 2002-01-13 20:41 UTC (permalink / raw)
  To: Richard Gooch; +Cc: Alan Cox, linux-kernel
In-Reply-To: <200201060144.g061i9E09115@vindaloo.ras.ucalgary.ca>

On Sunday 06 January 2002 03:44 am, Richard Gooch wrote:
> Where exactly is the host_id for an unregistered host being
> remembered?

Sorry for the late reply. I was away from Email for the whole week.

SCSI host numbers (for both registered and unregistered hosts)
are preserved in scsi_host_no_list.

The list is used in the function scsi_register (in drivers/scsi/hosts.c).
Same function also adds new hosts to the list.

The list can be initialized (from boot parameters ?) by 
the function scsi_host_no_init (drivers/scsi/scsi.c).

-- Itai


* [PATCH] SMP kernel deadlocking on UP boxen
From: Alexander Viro @ 2002-01-13 20:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ingo Molnar, linux-kernel

	Still present in -pre11:

diff -urN C2-pre10/kernel/sched.c C2-pre10-resched_task.bork.bork.bork/kernel/sched.c
--- C2-pre10/kernel/sched.c	Mon Jan  7 19:33:06 2002
+++ C2-pre10-resched_task.bork.bork.bork/kernel/sched.c	Sun Jan 13 15:31:10 2002
@@ -219,7 +219,7 @@
 	need_resched = p->need_resched;
 	wmb();
 	p->need_resched = 1;
-	if (!need_resched)
+	if (!need_resched && p->cpu != smp_processor_id())
 		smp_send_reschedule(p->cpu);
 }
 



* Re: [PATCH] SMP kernel deadlocking on UP boxen
From: Ingo Molnar @ 2002-01-13 22:43 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, linux-kernel
In-Reply-To: <Pine.GSO.4.21.0201131540550.27390-100000@weyl.math.psu.edu>


On Sun, 13 Jan 2002, Alexander Viro wrote:

> -	if (!need_resched)
> +	if (!need_resched && p->cpu != smp_processor_id())
>  		smp_send_reschedule(p->cpu);

this is DaveM's fix, which is in the -H7 scheduler patch.
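
The effect of the added condition can be modelled in user space.  This
is a toy model, not kernel code: current_cpu stands in for
smp_processor_id(), send_reschedule() just counts calls instead of
sending an IPI, and the memory barrier is left out:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's task and IPI machinery. */
struct task {
    int cpu;            /* CPU the task last ran on */
    int need_resched;   /* reschedule already requested? */
};

static int current_cpu;   /* stands in for smp_processor_id() */
static int ipis_sent;     /* counts simulated reschedule IPIs */

static void send_reschedule(int cpu)
{
    (void)cpu;
    ipis_sent++;
}

/* Mark p for rescheduling; notify its CPU only if the flag was not
 * already set and p's CPU is not the one we are running on. */
static void resched_task(struct task *p)
{
    int need_resched;

    assert(p != NULL);
    need_resched = p->need_resched;
    p->need_resched = 1;
    if (!need_resched && p->cpu != current_cpu)
        send_reschedule(p->cpu);
}
```

On a box with a single CPU, p->cpu always equals the current CPU, so
with the extra test no reschedule request is ever sent to ourselves.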

	Ingo


^ permalink raw reply

* initramfs buffer spec -- third draft
From: H. Peter Anvin @ 2002-01-13 20:46 UTC (permalink / raw)
  To: linux-kernel

		       initramfs buffer format
		       -----------------------

		       Al Viro, H. Peter Anvin
		      Last revision: 2002-01-13

       ** DRAFT ** DRAFT ** DRAFT ** DRAFT ** DRAFT ** DRAFT **

Starting with kernel 2.5.x, the old "initial ramdisk" protocol is
getting {replaced/complemented} with the new "initial ramfs"
(initramfs) protocol.  The initramfs contents are passed using the same
memory buffer protocol used by the initrd protocol, but the contents
are different.  The initramfs buffer contains an archive which is
expanded into a ramfs filesystem; this document details the format of
the initramfs buffer.

The initramfs buffer format is based around the "newc" or "crc" CPIO
formats, and can be created with the cpio(1) utility.  The cpio
archive can be compressed using gzip(1).  One valid version of an
initramfs buffer is thus a single .cpio.gz file.

The full format of the initramfs buffer is defined by the following
grammar, where:
	*	is used to indicate "0 or more occurrences of"
	(|)	indicates alternatives
	+	indicates concatenation
	GZIP()	indicates the gzip(1) of the operand
	ALGN(n)	means padding with null bytes to an n-byte boundary

	initramfs  := ("\0" | cpio_archive | cpio_gzip_archive)*

	cpio_gzip_archive := GZIP(cpio_archive)

	cpio_archive := cpio_file* + (<nothing> | cpio_trailer)

	cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data

	cpio_trailer := ALGN(4) + cpio_header + "TRAILER!!!\0" + ALGN(4)


In human terms, the initramfs buffer contains a collection of
compressed and/or uncompressed cpio archives (in the "newc" or "crc"
formats); arbitrary amounts of zero bytes (for padding) can be added
between members.
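
The top-level grammar can be illustrated with a short Python sketch.
This is not part of the spec: the names algn() and build_initramfs()
are hypothetical, the archive bytes are treated as opaque input, and
padding each member to a 4-byte boundary measured from the start of
the buffer is an assumption made only for illustration.

```python
import gzip


def algn(buf: bytes, n: int = 4) -> bytes:
    """ALGN(n): pad buf with null bytes up to the next n-byte boundary."""
    return buf + b"\0" * (-len(buf) % n)


def build_initramfs(members):
    """Concatenate (archive_bytes, compress) pairs into one buffer.

    Each member is either stored as-is (cpio_archive) or gzip-compressed
    (cpio_gzip_archive); zero bytes are added between members as padding.
    """
    out = b""
    for archive, compress in members:
        out += gzip.compress(archive) if compress else archive
        out = algn(out, 4)  # assumption: keep every member 4-byte aligned
    return out
```

For example, build_initramfs([(a, False), (b, True)]) yields one
uncompressed member followed by one gzip member, with zero padding
in between.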

The cpio "TRAILER!!!" entry (cpio end-of-archive) is optional, but is
not ignored; see "handling of hard links" below.

The structure of the cpio_header is as follows (all fields contain
hexadecimal ASCII numbers fully padded with '0' on the left to the
full width of the field, for example, the integer 4780 is represented
by the ASCII string "000012ac"):

Field name    Field size	 Meaning
c_magic	      6 bytes		 The string "070701" or "070702"
c_ino	      8 bytes		 File inode number
c_mode	      8 bytes		 File mode and permissions
c_uid	      8 bytes		 File uid
c_gid	      8 bytes		 File gid
c_nlink	      8 bytes		 Number of links
c_mtime	      8 bytes		 Modification time
c_filesize    8 bytes		 Size of data field
c_maj	      8 bytes		 Major part of file device number
c_min	      8 bytes		 Minor part of file device number
c_rmaj	      8 bytes		 Major part of device node reference
c_rmin	      8 bytes		 Minor part of device node reference
c_namesize    8 bytes		 Length of filename, including final \0
c_chksum      8 bytes		 Checksum of data field if c_magic is 070702;
				 otherwise zero
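
As an illustration of the table above, the following Python sketch
formats and parses a header.  The field names and widths come from the
table; the function names pack_header() and parse_header() and the
constant names are hypothetical, and this is a sketch rather than a
reference implementation.

```python
# Field order as listed in the table; c_magic is handled separately.
FIELDS = ("c_ino", "c_mode", "c_uid", "c_gid", "c_nlink", "c_mtime",
          "c_filesize", "c_maj", "c_min", "c_rmaj", "c_rmin",
          "c_namesize", "c_chksum")
HEADER_LEN = 6 + 8 * len(FIELDS)  # 6-byte magic plus thirteen 8-byte fields


def pack_header(magic="070701", **values):
    """Return the ASCII header; fields not given default to zero."""
    parts = [magic]
    for name in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("%s does not fit in 8 hex digits" % name)
        parts.append("%08x" % value)  # zero-padded on the left
    return "".join(parts).encode("ascii")


def parse_header(raw):
    """Split a header into (magic, {field name: integer value})."""
    if len(raw) < HEADER_LEN:
        raise ValueError("short header")
    magic = raw[:6].decode("ascii")
    if magic not in ("070701", "070702"):
        raise ValueError("bad c_magic: %r" % magic)
    fields = {}
    for i, name in enumerate(FIELDS):
        start = 6 + 8 * i
        fields[name] = int(raw[start:start + 8].decode("ascii"), 16)
    return magic, fields
```

With these definitions, pack_header(c_filesize=4780) stores the size
as the ASCII string "000012ac", matching the example given above.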

The c_mode field matches the contents of st_mode returned by stat(2)
on Linux, and encodes the file type and file permissions.

The c_filesize should be zero for any file which is not a regular file
or symlink.

The c_chksum field contains a simple 32-bit unsigned sum of all the
bytes in the data field.  cpio(1) refers to this as "crc", which is
clearly incorrect (a cyclic redundancy check is a different and
significantly stronger integrity check); nevertheless, this is the
algorithm used.
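
A minimal Python sketch of the checksum as described, assuming the
sum simply wraps modulo 2^32.  The function name cpio_chksum() is
hypothetical; zlib.crc32() is used only to contrast it with a real
CRC.

```python
import zlib


def cpio_chksum(data: bytes) -> int:
    """32-bit unsigned sum of all bytes in the data field."""
    return sum(data) & 0xFFFFFFFF


# A plain sum cannot detect reordered bytes; a real CRC can.
same_sum = cpio_chksum(b"ab") == cpio_chksum(b"ba")
same_crc = zlib.crc32(b"ab") == zlib.crc32(b"ba")
```

Here same_sum is true while same_crc is false, which is one concrete
sense in which the sum is the weaker check.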

If the filename is "TRAILER!!!" this is actually an end-of-archive
marker; the c_filesize for an end-of-archive marker must be zero.


*** Handling of hard links

When a nondirectory with c_nlink > 1 is seen, the (c_maj,c_min,c_ino)
tuple is looked up in a tuple buffer.  If not found, it is entered in
the tuple buffer and the entry is created as usual; if found, a hard
link rather than a second copy of the file is created.  It is not
necessary (but permitted) to include a second copy of the file
contents; if the file contents is not included, the c_filesize field
should be set to zero to indicate no data section follows.  If data is
present, the previous instance of the file is overwritten; this allows
the data-carrying instance of a file to occur anywhere in the sequence.
(GNU cpio is reported to attach the data to the last instance of a
file only.)

c_filesize must not be zero for a symlink.

When a "TRAILER!!!" end-of-archive marker is seen, the tuple buffer is
reset.  This permits archives which are generated independently to be
concatenated.

To combine file data from different sources (without having to
regenerate the (c_maj,c_min,c_ino) fields), therefore, either one of
the following techniques can be used:

a) Separate the different file data sources with a "TRAILER!!!"
   end-of-archive marker, or

b) Make sure c_nlink == 1 for all nondirectory entries.
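
The hard link rules above can be sketched in Python as follows.  This
models the target filesystem with plain dictionaries and is only an
illustration under stated assumptions: the function name unpack(), the
(name, header, data) entry layout and the integer inode keys are all
hypothetical, and headers are assumed to be already parsed into
dictionaries of integers.

```python
import stat


def unpack(entries):
    """Apply the tuple-buffer rules to (name, header, data) entries.

    Returns (names, contents): names maps each file name to an inode
    key, contents maps each inode key to its data.
    """
    tuple_buffer = {}  # (c_maj, c_min, c_ino) -> inode key
    names = {}
    contents = {}
    next_inode = 0
    for name, hdr, data in entries:
        if name == "TRAILER!!!":
            tuple_buffer.clear()  # end of archive: reset the tuple buffer
            continue
        key = (hdr["c_maj"], hdr["c_min"], hdr["c_ino"])
        linked = hdr["c_nlink"] > 1 and not stat.S_ISDIR(hdr["c_mode"])
        if linked and key in tuple_buffer:
            inode = tuple_buffer[key]   # hard link to the earlier entry
            if data:
                contents[inode] = data  # later data overwrites earlier
        else:
            inode = next_inode          # create the entry as usual
            next_inode += 1
            contents[inode] = data
            if linked:
                tuple_buffer[key] = inode
        names[name] = inode
    return names, contents
```

Two entries with the same tuple inside one archive end up sharing an
inode key, while an entry with the same tuple after a "TRAILER!!!"
marker gets a fresh one.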

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>

^ permalink raw reply

* A few questions on older Macintosh Powerbooks
From: C S Rosenmund @ 2002-01-13 20:53 UTC (permalink / raw)
  To: linuxppc-dev


Would someone tell me what the latest monolithic kernel and patch
available for a NuBus PPC would be and where they can be found? I'm
working with the 2.4.5 sources (and the patch-2.4.5-Nubus patch), and it
works *almost*. . .

If that is the latest, is there an expansion bay patch available, and if
so, where?
Same question concerning the PCMCIA services.

Hardware in question is a PowerBook 1400c

OS I'm using is Debian-PPC 2.2

any reference material you could send me would be greatly appreciated.
Thanks in advance.

Sanjay
gnuman@attbi.com
gnuman0@pacbell.net

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply

* Re: [2.4.17/18pre] VM and swap - it's really unusable
From: Alan Cox @ 2002-01-13 21:11 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel
In-Reply-To: <3C41ED4E.4D3F2D2C@linux-m68k.org>

> I don't doubt that, but would you seriously consider the ll patch for
> inclusion into the main kernel?

The mini ll patch definitely. The full ll one needs some head scratching to
be sure it's correct. pre-empt is a 2.5 thing which in some ways is easier
because it doesn't matter if it breaks something.

> Please let me rephrase, I just don't expect terrible good latency
> numbers with non dma hardware.

Expect the same with DMA hardware too at times.

^ permalink raw reply

* Re: initramfs buffer spec -- second draft
From: Eric W. Biederman @ 2002-01-13 20:58 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel
In-Reply-To: <3C41EA0D.2050205@zytor.com>

"H. Peter Anvin" <hpa@zytor.com> writes:

> Eric W. Biederman wrote:
> 
> > Comments.  Endian issues are not specified, is the data little, big
> > or vax endian?
> >
> 
> 
> Not applicable.  There are no endian-specific binary structure in the format AT
> ALL.  ASCII-coded fields are always bigendian.

O.k.  Thanks, I missed that part.  I just looked back and it is clear
that there are 32 bit values encoded in hexadecimal.  And I admit the
bigendian (human readable) is strongly implied from the context.

> > What is the point of alignment?  If the data starts as 4 byte aligned,
> > the 6 byte magic string guarantees the data will be only 2 byte
> > aligned.  This isn't good for 32 or 64 bit architectures.
> 
> 
> They're ASCII-coded, so it supposedly doesn't matter (yet, it's a bit daft, but
> blame the SysV people.)  The alignment makes sure the *data* field is 4-byte
> aligned.

O.k.  So then we have a bit of implied padding after the filename.  And
it is necessary to preserve this padding or we break with the
preexisting format definition.  You don't gain much with that, as being
4-byte aligned on 64-bit architectures is not fully aligned.

> > I do like having a c_magic that at least allows us to change things
> > in the future if necessary.
> 
> 
> It's pretty clear from a lot of the comments that a number of people haven't
> understood that the cpio encapsulation *THIS IS A CODIFICATION OF AN EXISTING
> FORMAT.*

Which we are reusing for a different purpose.  And because of that we
become trustees of our version of the format.  To make it clear that
someone else defines how this format works a reference to the
appropriate specification is called for.  

I admit I did a quick search earlier and I did not find this format
specified elsewhere.

The cases where initramfs will be used are some of the most
operating-system-specific cases I can imagine.  To handle those cases it
is necessary to support the full breadth of the capability of the
operating system.  So if initramfs is going to survive today's
implementation of the Linux kernel, or possibly be portable to other
operating systems, we must have an extensible format.  It appears
c_magic gives us that extensibility.

Eric

^ permalink raw reply

