public inbox for linux-kernel@vger.kernel.org
* 2.4.19pre9aa2
@ 2002-05-31  5:18 Andrea Arcangeli
  2002-05-31 13:13 ` 2.4.19pre9aa2 Andrey Nekrasov
  0 siblings, 1 reply; 8+ messages in thread
From: Andrea Arcangeli @ 2002-05-31  5:18 UTC (permalink / raw)
  To: linux-kernel

This in particular includes a fix from Denis Lunev to cure the
instability in 2.4.19pre9aa1 introduced by the first revision of the
inode-highmem fix (I also changed invalidate_inode_pages so that it runs
try_to_release_page for ext3 as suggested by Andrew a few days ago).  So
everybody running 2.4.19pre9aa1 is recommended to upgrade to
2.4.19pre9aa2 ASAP.

	http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa2.gz
	http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa2/

Full changelog follows:

Only in 2.4.19pre9aa2: 00_fix-stat-irq-1

	Avoid showing those 10000... numbers in /proc/stat. From -ac.

Only in 2.4.19pre9aa2: 00_flock-posix-2001-1

	Update to posix 2001 semantics for negative length. From -ac.

Only in 2.4.19pre9aa2: 00_ipc-sem-set-pid-during-setval-1

	Set the sempid during the SETVAL operation, so the GETPID
	operation will see it. From -ac.

Only in 2.4.19pre9aa2: 00_loop-handling-pages-in-cache-1

	If the pagecache was preallocated it could be in highmem.
	From -ac.

Only in 2.4.19pre9aa2: 00_mmap-TASK_SIZE-len-1

	Avoid returning -EINVAL for length <= TASK_SIZE. From -ac;
	spotted by DervishD.

Only in 2.4.19pre9aa2: 00_scsi-error-thread-reparent-1

	Reparent the scsi-error kernel thread to init so it's not left
	floating around as a zombie when it exits. From -ac.

Only in 2.4.19pre9aa2: 00_sig_ign-discard-sigurg-1

	Update semantics: when SIGURG is set to SIG_IGN, any pending
	SIGURG will be flushed. From -ac.
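This is also the behavior POSIX specifies and what current kernels do, so it can be observed from userspace. A minimal sketch (Python used here purely for brevity; assumes a Linux host) that shows a pending SIGURG being discarded once the disposition becomes SIG_IGN:

```python
import os
import signal

def pending_sigurg_flushed_by_ignore():
    """Raise SIGURG while blocked, then set SIG_IGN; return whether the
    signal was pending before and after the disposition change."""
    # Install a real handler first so the raised signal is queued rather
    # than discarded at generation time, then block it so it stays pending.
    signal.signal(signal.SIGURG, lambda signum, frame: None)
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGURG})
    try:
        os.kill(os.getpid(), signal.SIGURG)
        before = signal.SIGURG in signal.sigpending()
        # Setting the disposition to SIG_IGN discards the pending instance.
        signal.signal(signal.SIGURG, signal.SIG_IGN)
        after = signal.SIGURG in signal.sigpending()
        return before, after
    finally:
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGURG})
        signal.signal(signal.SIGURG, signal.SIG_DFL)

print(pending_sigurg_flushed_by_ignore())
```

On a kernel with these semantics the signal is pending before the SIG_IGN and gone afterwards.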

Only in 2.4.19pre9aa2: 00_vm86-drop-v86mode-dead-thread-var-1

	Drop dead variable in vm86 part of the thread struct. From -ac.

Only in 2.4.19pre9aa2: 00_wmem-default-lowmem-machines-1

	Typo fix from -ac.

Only in 2.4.19pre9aa2: 00_x86-optimize-apic-irq-and-cacheline-1

	Cacheline-optimize the apic irq stats and cacheline-align
	the irq_stat array. From -ac.

Only in 2.4.19pre9aa2: 03_sched-pipe-bandwidth-1

	If the pipe writer fills the pipe, reschedule the reader on
	the same cpu as the writer to maximize memory copy bandwidth
	over the pipe page in the local cpu. From Mike Kravetz.
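The workload this patch targets (a writer streaming through a pipe to a reader) is easy to reproduce; a small sketch, with Python used only for brevity and the sizes chosen arbitrarily:

```python
import os
import time

def pipe_bandwidth(total=8 << 20, chunk=4096):
    """Stream `total` bytes through a pipe from a forked child writer to
    the parent reader; return (bytes_received, MiB/s)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the writer that fills the pipe.  The scheduler change
        # above tries to run the reader on the same cpu so the pipe page
        # is copied while still cache-hot.
        os.close(r)
        buf = b"x" * chunk
        for _ in range(total // chunk):
            os.write(w, buf)
        os.close(w)
        os._exit(0)
    os.close(w)
    got = 0
    start = time.monotonic()
    while True:
        data = os.read(r, chunk)
        if not data:
            break
        got += len(data)
    elapsed = max(time.monotonic() - start, 1e-9)
    os.close(r)
    os.waitpid(pid, 0)
    return got, got / elapsed / (1 << 20)

received, rate = pipe_bandwidth(total=1 << 20)
print(received, round(rate, 1))
```

This is the same shape of transfer that lmbench's pipe bandwidth test measures.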

Only in 2.4.19pre9aa1: 05_vm_10_read_write_tweaks-2
Only in 2.4.19pre9aa2: 05_vm_10_read_write_tweaks-3

	Updated comments per Christoph's suggestion.

Only in 2.4.19pre9aa1: 10_inode-highmem-1
Only in 2.4.19pre9aa2: 10_inode-highmem-2

	Fix showstopper bug that was corrupting the inode unused_list
	if the number of unused inodes was < vm_vfs_scan_ratio.
	Spotted and fixed by Denis Lunev.

Only in 2.4.19pre9aa1: 00_gcc-3_1-compile-2
Only in 2.4.19pre9aa2: 61_tux-exports-gcc-3_1-compile-1

	Moved later in the patch stage for clarity (suggested by
	Christoph Hellwig).

Andrea

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.4.19pre9aa2
  2002-05-31  5:18 2.4.19pre9aa2 Andrea Arcangeli
@ 2002-05-31 13:13 ` Andrey Nekrasov
  2002-05-31 18:40   ` 2.4.19pre9aa2 Andrea Arcangeli
  0 siblings, 1 reply; 8+ messages in thread
From: Andrey Nekrasov @ 2002-05-31 13:13 UTC (permalink / raw)
  To: Andrea Arcangeli, linux-kernel

Hello Andrea Arcangeli,


Stability is fine. But something has happened to interactivity. If I start
a cpu-bound task, log in to the machine over ssh, and run "su", the wait is
annoying.

This did not happen on 2.4.19pre8aa3. Is it because of "O1"?


Once you wrote about "2.4.19pre9aa2":
> This in particular includes a fix from Denis Lunev to cure the
> instability in 2.4.19pre9aa1 introduced by the first revision of the
> inode-highmem fix (I also changed invalidate_inode_pages so that it runs
> try_to_release_page for ext3 as suggested by Andrew a few days ago).  So
> everybody running 2.4.19pre9aa1 is recommended to upgrade to
> 2.4.19pre9aa2 ASAP.
> 
> 	http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa2.gz
> 	http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa2/
> 
> [...]

-- 
bye.
Andrey Nekrasov, SpyLOG.


* Re: 2.4.19pre9aa2
  2002-05-31 13:13 ` 2.4.19pre9aa2 Andrey Nekrasov
@ 2002-05-31 18:40   ` Andrea Arcangeli
  2002-05-31 19:55     ` 2.4.19pre9aa2 Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Andrea Arcangeli @ 2002-05-31 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrey Nekrasov

On Fri, May 31, 2002 at 05:13:06PM +0400, Andrey Nekrasov wrote:
> Hello Andrea Arcangeli,
> 
> 
> Stability is fine. [..]

Cool thanks :)

> [..] But something has happened to interactivity. If I start a cpu-bound
> task, log in to the machine over ssh, and run "su", the wait is annoying.
> 
> This did not happen on 2.4.19pre8aa3. Is it because of "O1"?

If it's a userspace cpu-intensive background load, it's most probably
because of o1. The dyn-sched (before I integrated o1, which obsoleted
it) was very good at detecting cpu hogs and preventing them from
disturbing interactive tasks like an ssh shell. Of course o1 also has a
sleep_time/sleep_avg heuristic derived from Davide's dyn-sched idea, but
maybe the constants are tuned in a different manner.

Can you try renicing the cpu hogs to +19 and see if you still get bad
interactivity?

The other possibility is that the bad interactivity is due to bad
sched-latency: the scheduler posts a reschedule via irq
(schedule_tick()) but schedule() is never invoked because the kernel
spins in a loop, etc. Now, the fixes to prune_icache might have
increased the sched-latency in some cases when you shrink the cache, but
in turn we now release the inodes and also roll the list, so overall it
should be an improvement in sched latency for you. And I doubt you're
exercising that path so frequently that it makes a difference (even if
you're certainly exercising it to test that the fix worked). So I would
suggest running readprofile and checking whether
prune_icache/invalidate_inode_pages goes up a lot in the profile. Also
please check whether the cpu load is all in userspace during the bad
interactivity; if it's all userspace load, bad sched latency is almost
certainly not the cause.

Andrea


* Re: 2.4.19pre9aa2
  2002-05-31 18:40   ` 2.4.19pre9aa2 Andrea Arcangeli
@ 2002-05-31 19:55     ` Andrew Morton
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2002-05-31 19:55 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel, Andrey Nekrasov

Andrea Arcangeli wrote:
> 
> ...
> If it's a userspace cpu-intensive background load, it's most probably
> because of o1. The dyn-sched (before I integrated o1, which obsoleted
> it) was very good at detecting cpu hogs and preventing them from
> disturbing interactive tasks like an ssh shell. Of course o1 also has a
> sleep_time/sleep_avg heuristic derived from Davide's dyn-sched idea, but
> maybe the constants are tuned in a different manner.
> 

I've been running 2.5.16+akpmhacks on my desktop (10 days uptime!).
Two impressions:  It's a bit swappy (it's basically the 2.4.15 VM,
so no surprises there).

But it's also quite markedly sluggish in the user interface when the
machine is compiling stuff.

While running a kernel build (-j0) and leaning on the spacebar in 
an X application I see occasional pauses of tens of keystrokes at
the autorepeat rate.  There was no swapin or out according to vmstat
at the time.   So I'd be suspecting that the interactivity heuristics
in the scheduler aren't working.  Renicing ksoftirqd to -19 doesn't
make any difference.

-


* Re: 2.4.19pre9aa2
@ 2002-06-03 12:15 rwhron
  2002-06-03 23:59 ` 2.4.19pre9aa2 Andrea Arcangeli
  0 siblings, 1 reply; 8+ messages in thread
From: rwhron @ 2002-06-03 12:15 UTC (permalink / raw)
  To: andrea; +Cc: linux-kernel

Comparing 2.4.19pre8aa3 and 2.4.19pre9aa2 on quad Xeon:
Both kernels configured with CONFIG_2GB=y and CONFIG_HIGHIO=y.

dbench 64/192 on various filesystems showed a 2-20% improvement
(average of 5 runs).

tbench 192 throughput up over 300%.

LMBench pipe bandwidth and latency improved.

The regression in OSDB aggregate simple report compared to
non-aa kernels is gone.

More benchmarks on quad Xeon at:
http://home.earthlink.net/~rwhron/kernel/bigbox.html

-- 
Randy Hron



* Re: 2.4.19pre9aa2
  2002-06-03 12:15 2.4.19pre9aa2 rwhron
@ 2002-06-03 23:59 ` Andrea Arcangeli
  0 siblings, 0 replies; 8+ messages in thread
From: Andrea Arcangeli @ 2002-06-03 23:59 UTC (permalink / raw)
  To: rwhron; +Cc: linux-kernel

On Mon, Jun 03, 2002 at 08:15:05AM -0400, rwhron@earthlink.net wrote:
> More benchmarks on quad Xeon at:
> http://home.earthlink.net/~rwhron/kernel/bigbox.html

Very cool work as usual :). Many thanks.

Just a note: watch the "File & VM system latencies in microseconds"
lmbench results; creat became significantly slower, and I'm wondering if
that's due to the removal of negative dcache entries after unlink. I
think it's still a global optimization (in fact I think some of the
dbench records are also thanks to maximizing the useful cache
information by immediately dropping negative dentries after unlink), but
I wonder if the benchmark is done in a way that generates false
positives. To avoid false positives and to really benchmark the whole
"creat" path (which in its non-cached form also includes a lookup in the
lowlevel fs), lmbench should rmdir; mkdir the directory where it wants
to do the later creats (the rmdir/mkdir cycle will drop negative
dentries in all 2.[245] kernels too). Otherwise, at the moment I'm
unsure what made creat slower between pre8aa3 and pre9aa2; could it be a
fake result of the benchmark? Maybe you could give it a second spin just
in case. The pipe bandwidth reported by lmbench in pre9aa2 is also very
impressive; that's Mike's patch, and I think it's a very worthwhile
optimization, since many tasks really use pipes to pass big loads of
data through.
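The suggested rmdir/mkdir cycle can be sketched as follows (Python for brevity; the function name and counts are illustrative, and whether it shifts the timings depends entirely on the kernel's dentry cache):

```python
import os
import shutil
import tempfile
import time

def timed_creats(n=1000, drop_negative=True):
    """Create+unlink n files (seeding negative dentries), optionally do
    the rmdir/mkdir cycle to drop them, then time n fresh creats."""
    top = tempfile.mkdtemp()
    d = os.path.join(top, "work")
    os.mkdir(d)
    for i in range(n):
        path = os.path.join(d, "f%d" % i)
        os.close(os.open(path, os.O_CREAT | os.O_WRONLY, 0o600))
        os.unlink(path)          # may leave a negative dentry behind
    if drop_negative:
        os.rmdir(d)              # the rmdir/mkdir cycle invalidates
        os.mkdir(d)              # every dentry cached under d
    start = time.monotonic()
    for i in range(n):
        os.close(os.open(os.path.join(d, "f%d" % i),
                         os.O_CREAT | os.O_WRONLY, 0o600))
    elapsed = time.monotonic() - start
    shutil.rmtree(top)
    return elapsed

print(timed_creats(100, True), timed_creats(100, False))
```

With drop_negative=True the later creats cannot be satisfied from cached negative dentries, so they exercise the full lowlevel-fs lookup path, which is the point of the suggestion above.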

Andrea


* Re: 2.4.19pre9aa2
@ 2002-06-04 12:44 rwhron
  2002-06-04 13:01 ` 2.4.19pre9aa2 Andrea Arcangeli
  0 siblings, 1 reply; 8+ messages in thread
From: rwhron @ 2002-06-04 12:44 UTC (permalink / raw)
  To: andrea; +Cc: linux-kernel

>> More benchmarks on quad Xeon at:
>> http://home.earthlink.net/~rwhron/kernel/bigbox.html

> Just a note: watch the "File & VM system latencies in microseconds"
> lmbench results; creat became significantly slower, and I'm wondering if
> that's due to the removal of negative dcache entries after unlink. I
> think it's still a global optimization (in fact I think some of the
> dbench records are also thanks to maximizing the useful cache
> information by immediately dropping negative dentries after unlink), but
> I wonder if the benchmark is done in a way that generates false
> positives. To avoid false positives and to really benchmark the whole
> "creat" path (which in its non-cached form also includes a lookup in the
> lowlevel fs), lmbench should rmdir; mkdir the directory where it wants
> to do the later creats (the rmdir/mkdir cycle will drop negative
> dentries in all 2.[245] kernels too). Otherwise, at the moment I'm
> unsure what made creat slower between pre8aa3 and pre9aa2; could it be a
> fake result of the benchmark?

I'll send you the 25 samples each of pre9aa2 and pre8aa3 off list.
All of the non-averaged lmbench results are currently at:
http://home.earthlink.net/~rwhron/kernel/lmball.txt

Some lmbench tests vary a lot.  The 0k and 10k creat tests were
pretty consistent for these two kernels.

Other consistent tests that showed notable improvement were context
switching at 8p/16K, 8p/64K, and 16p/16K.  The 16p/64K context switch
latency became inconsistent and higher on pre9aa2.

fork latency was consistent and improved by 10%.

> The pipe bandwidth reported by lmbench in pre9aa2 is also very
> impressive; that's Mike's patch, and I think it's a very worthwhile
> optimization, since many tasks really use pipes to pass big loads of
> data through.

Yeah, that is impressive.

Glancing through the original lmbench logfiles, there are some results
that aren't in any report: creat 1k and 4k, and select on various
numbers of regular and tcp file descriptors.


-- 
Randy Hron



* Re: 2.4.19pre9aa2
  2002-06-04 12:44 2.4.19pre9aa2 rwhron
@ 2002-06-04 13:01 ` Andrea Arcangeli
  0 siblings, 0 replies; 8+ messages in thread
From: Andrea Arcangeli @ 2002-06-04 13:01 UTC (permalink / raw)
  To: rwhron; +Cc: linux-kernel

On Tue, Jun 04, 2002 at 08:44:01AM -0400, rwhron@earthlink.net wrote:
> >> More benchmarks on quad Xeon at:
> >> http://home.earthlink.net/~rwhron/kernel/bigbox.html
> 
> > [...] Otherwise at the moment I'm unsure what made creat slower between
> > pre8aa3 and pre9aa2, could it be a fake result of the benchmark?
> 
> I'll send you the 25 samples each of pre9aa2 and pre8aa3 off list.
> All of the non-averaged lmbench results are currently at:
> http://home.earthlink.net/~rwhron/kernel/lmball.txt
> 
> Some lmbench tests vary a lot.  The 0k and 10k creat tests were
> pretty consistent for these two kernels.

Yes. So if the problem is the removal of negative dentry caching after
unlink, backing out this patch should get you the original performance
back (if you test it, it would also be interesting to launch a dbench
run; as said, I suspect the dbench records are also thanks to this fix,
which maximizes the most "useful" cache):

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa2/00_negative-dentry-waste-ram-1

If it is the above patch that makes the difference, I consider this more
of an lmbench false positive (I mean: if that patch makes a difference,
then lmbench is not benchmarking the real create speed of the
filesystem, because it is skipping the lowlevel lookup, which on huge
dirs would be faster on reiserfs or xfs than on ext3, for example).

> 
> Other consistent tests that showed notable improvement were context
> switching at 8p/16K, 8p/64K, and 16p/16K.  The 16p/64K context switch
> latency became inconsistent and higher on pre9aa2.
> 
> fork latency was consistent and improved by 10%.

Noticed that too :).

> 
> > The pipe bandwith reported
> > by lmbench in pre9aa2 is also very impressive, that's Mike's patch and I
> > think it's also a very worthwhile optimizations since many tasks really
> > uses pipes to passthrough big loads of data.
> 
> Yeah, that is impressive.
> 
> Glancing through the original lmbench logfiles, there are some results
> that aren't in any report: creat 1k and 4k, and select on various
> numbers of regular and tcp file descriptors.

Thanks again for the very useful work you're doing with these detailed
benchmarks.

Andrea


end of thread, other threads:[~2002-06-04 13:40 UTC | newest]

Thread overview: 8+ messages
-- links below jump to the message on this page --
2002-05-31  5:18 2.4.19pre9aa2 Andrea Arcangeli
2002-05-31 13:13 ` 2.4.19pre9aa2 Andrey Nekrasov
2002-05-31 18:40   ` 2.4.19pre9aa2 Andrea Arcangeli
2002-05-31 19:55     ` 2.4.19pre9aa2 Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2002-06-03 12:15 2.4.19pre9aa2 rwhron
2002-06-03 23:59 ` 2.4.19pre9aa2 Andrea Arcangeli
2002-06-04 12:44 2.4.19pre9aa2 rwhron
2002-06-04 13:01 ` 2.4.19pre9aa2 Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox