* ext3-0.9.16 against linux-2.4.17-pre2
@ 2001-12-03 5:51 Andrew Morton
2001-12-05 12:32 ` Florian Lohoff
2001-12-05 23:42 ` Robert Love
0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2001-12-03 5:51 UTC
To: ext3-users@redhat.com, lkml

An ext3 update which also applies to linux-2.4.16 is available at

http://www.zip.com.au/~akpm/linux/ext3/

Quite a lot of miscellany here. It would be appreciated if interested
parties could please test it in preparation for sending upstream. Thanks.

Changelog:

- Merged several ext2 sync-up patches from Christoph Hellwig

- Drop the big kernel lock across the call to block_prepare_write.
  This was causing excessive contention on large SMP machines. Thanks
  to Anton ("dbench") Blanchard for finding this.

- Fixed a couple of potential kmap leaks on error paths.

  There is some question whether the core kernel should be changed so
  that this is not necessary, but it is right for current kernels.

- Fixed bugs concerning the use of bit operations on 32 bit quantities,
  which could cause problems on 64-bit hardware. Thanks davem.

- Fix failure to return EFBIG when an attempt is made to lengthen an
  ext3 file to more than the maximum file size via ftruncate().

- Current ext3 can cause an assertion failure and take down the machine
  when an I/O error is encountered while mapping journal blocks in
  preparation for writing to the journal. Fix from Stephen turns the
  filesystem readonly when this occurs.

- ext3 is presently marking data dirty itself, which defeats the core
  kernel's dirty buffer balancing. Take that out and let the generic
  layer mark the buffers dirty.

  This change, along with core kernel changes in 2.4.17-pre2 can
  potentially reduce system congestion under heavy write loads.

- Update Documentation/Changes to reflect requirement for e2fsprogs
  version (1.25)

- Update Documentation/Locking to describe the two address_space
  methods which ext3 introduced.
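
The changelog item about bit operations on 32 bit quantities is easy to picture with a toy. The changelog does not say which call sites were fixed, so this is only a sketch under the assumption that the bit helpers work on whole `unsigned long` words, which are 8 bytes on most 64-bit Linux targets. `toy_set_bit`, `toy_test_bit`, `toy_bytes_touched` and `struct toy_flags` are made-up user-space names, not kernel interfaces.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits in one unsigned long word on the build target. */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for a long-based set_bit(): it always reads and
 * writes a full unsigned long, never a narrower field. */
static void toy_set_bit(unsigned int nr, unsigned long *addr)
{
    addr[nr / TOY_BITS_PER_LONG] |= 1UL << (nr % TOY_BITS_PER_LONG);
}

/* Hypothetical stand-in for test_bit(), with the same word size. */
static int toy_test_bit(unsigned int nr, const unsigned long *addr)
{
    return (int)((addr[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL);
}

/* Flag storage declared in whole longs, so the helpers above never touch
 * memory outside the field, on 32-bit and 64-bit builds alike. */
struct toy_flags {
    unsigned long bits[1];
};

/* How many bytes one helper call accesses; compare with sizeof(uint32_t). */
static size_t toy_bytes_touched(void)
{
    return sizeof(unsigned long);
}
```

Handing the address of a 32-bit field to a helper like this would make it access `toy_bytes_touched()` bytes, which is more than the field owns on a 64-bit build; on a big-endian machine the bit would also land in the wrong half of the word. Declaring the storage as `unsigned long`, as `struct toy_flags` does, sidesteps both problems.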

* Re: ext3-0.9.16 against linux-2.4.17-pre2
From: Florian Lohoff @ 2001-12-05 12:32 UTC
To: Andrew Morton; +Cc: ext3-users@redhat.com, lkml

On Sun, Dec 02, 2001 at 09:51:01PM -0800, Andrew Morton wrote:
>
> An ext3 update which also applies to linux-2.4.16 is available at
>

It seems something broken between 2.4.15-pre2 and this update - I am
seeing filesystem corruption:

Procmail moans about "locked" mailboxes - Opening them shows that
the last mail originates about 4 hours ago although there are coming
mails every minute.

procmail: Extraneous locallockfile ignored
procmail: Error while writing to "countpl/20011205"
procmail: Truncated file to former size
procmail: Error while writing to "archive/received-200112"
procmail: Truncated file to former size
From flo@mediaways.net Wed Dec 5 13:25:37 2001
 Subject: Cron <nwmgmt@mgr1> /aol/bin/count.pl
  Folder: /home/flo/Mail/inbox

(flo@ping)~# ls -la Mail/countpl/20011205 Mail/archive/received-200112
-rw------- 1 flo flo 51200000 Dec 5 13:25 Mail/archive/received-200112
-rw------- 1 flo flo 51200000 Dec 5 13:25 Mail/countpl/20011205

The last lines of the countpl/20011205 file contain 0 - Cut'n'pasted
from "most".

0x030D3C20: 3E0A4461 74653A20 5765642C 20203520  >.Date: Wed,  5 
0x030D3C30: 44656320 32303031 2030393A 32303A32  Dec 2001 09:20:2
0x030D3C40: 32202B30 30303020 28474D54 290A0A45  2 +0000 (GMT)..E
0x030D3C50: 52522020 31373938 3520322E 72646967  RR  17985 2.rdig
0x030D3C60: 2E756B20 3A203632 2E35352E 382E3132  .uk : 62.55.8.12
0x030D3C70: 36206661 696C6564 20776169 74696E67  6 failed waiting
0x030D3C80: 20666F72 20706167 696E6720 72657175   for paging requ
0x030D3C90: 65737420 696E206C 32747020 73657373  est in l2tp sess
0x030D3CA0: 696F6E20 7461626C 650A0A00 00000000  ion table.......
0x030D3CB0: 00000000 00000000 00000000 00000000  ................
0x030D3CC0: 00000000 00000000 00000000 00000000  ................
0x030D3CD0: 00000000 00000000 00000000 00000000  ................
0x030D3CE0: 00000000 00000000 00000000 00000000  ................
0x030D3CF0: 00000000 00000000 00000000 00000000  ................
0x030D3D00: 00000000 00000000 00000000 00000000  ................
0x030D3D10: 00000000 00000000 00000000 00000000  ................
0x030D3D20: 00000000 00000000 00000000 00000000  ................
0x030D3D30: 00000000 00000000 00000000 00000000  ................
0x030D3D40: 00000000 00000000 00000000 00000000  ................
0x030D3D50: 00000000 00000000 00000000 00000000  ................
0x030D3D60: 00000000 00000000 00000000 00000000  ................
0x030D3D70: 00000000 00000000 00000000 00000000  ................
0x030D3D80: 00000000 00000000 00000000 00000000  ................
0x030D3D90: 00000000 00000000 00000000 00000000  ................
0x030D3DA0: 00000000 00000000 00000000 00000000  ................
0x030D3DB0: 00000000 00000000 00000000 00000000  ................
0x030D3DC0: 00000000 00000000 00000000 00000000  ................
0x030D3DD0: 00000000 00000000 00000000 00000000  ................
0x030D3DE0: 00000000 00000000 00000000 00000000  ................
0x030D3DF0: 00000000 00000000 00000000 00000000  ................
0x030D3E00: 00000000 00000000 00000000 00000000  ................
0x030D3E10: 00000000 00000000 00000000 00000000  ................
0x030D3E20: 00000000 00000000 00000000 00000000  ................
0x030D3E30: 00000000 00000000 00000000 00000000  ................
0x030D3E40: 00000000 00000000 00000000 00000000  ................
0x030D3E50: 00000000 00000000 00000000 00000000  ................
0x030D3E60: 00000000 00000000 00000000 00000000  ................
0x030D3E70: 00000000 00000000 00000000 00000000  ................
0x030D3E80: 00000000 00000000 00000000 00000000  ................
0x030D3E90: 00000000 00000000 00000000 00000000  ................
0x030D3EA0: 00000000 00000000 00000000 00000000  ................
0x030D3EB0: 00000000 00000000 00000000 00000000  ................
0x030D3EC0: 00000000 00000000 00000000 00000000  ................
0x030D3ED0: 00000000 00000000 00000000 00000000  ................
0x030D3EE0: 00000000 00000000 00000000 00000000  ................
0x030D3EF0: 00000000 00000000 00000000 00000000  ................
0x030D3F00: 00000000 00000000 00000000 00000000  ................
0x030D3F10: 00000000 00000000 00000000 00000000  ................
0x030D3F20: 00000000 00000000 00000000 00000000  ................
0x030D3F30: 00000000 00000000 00000000 00000000  ................
0x030D3F40: 00000000 00000000 00000000 00000000  ................
0x030D3F50: 00000000 00000000 00000000 00000000  ................
0x030D3F60: 00000000 00000000 00000000 00000000  ................
0x030D3F70: 00000000 00000000 00000000 00000000  ................
0x030D3F80: 00000000 00000000 00000000 00000000  ................
0x030D3F90: 00000000 00000000 00000000 00000000  ................
0x030D3FA0: 00000000 00000000 00000000 00000000  ................
0x030D3FB0: 00000000 00000000 00000000 00000000  ................
0x030D3FC0: 00000000 00000000 00000000 00000000  ................
0x030D3FD0: 00000000 00000000 00000000 00000000  ................
0x030D3FE0: 00000000 00000000 00000000 00000000  ................
0x030D3FF0: 00000000 00000000 00000000 00000000  ................
0x030D4000:

I am backing out the 2417 changes now - I already did a forced fsck
(e2fs 1.25) which didn't find anything abnormal.

(flo@ping)~# uname -a
Linux ping.mediaways.net 2.4.16 #1 Tue Dec 4 19:42:30 CET 2001 i686 unknown

Flo
--
Florian Lohoff flo@rfc822.org +49-5201-669912
Nine nineth on september the 9th Welcome to the new billenium

* Re: ext3-0.9.16 against linux-2.4.17-pre2
From: Mike Fedyk @ 2001-12-05 12:36 UTC
To: Florian Lohoff; +Cc: Andrew Morton, ext3-users@redhat.com, lkml

On Wed, Dec 05, 2001 at 01:32:04PM +0100, Florian Lohoff wrote:
> On Sun, Dec 02, 2001 at 09:51:01PM -0800, Andrew Morton wrote:
> >
> > An ext3 update which also applies to linux-2.4.16 is available at
> >
>
> It seems something broken between 2.4.15-pre2 and this update - I am
> seeing filesystem corruption:
>

Hmm, that's strange.

> I am backing out the 2417 changes now - I already did a forced fsck
> (e2fs 1.25) which didn't find anything abnormal.
>
> (flo@ping)~# uname -a
> Linux ping.mediaways.net 2.4.16 #1 Tue Dec 4 19:42:30 CET 2001 i686 unknown
>

Did you apply it against 2.4.16? It was meant for 2.4.17-pre2.

Andrew, do you know if that could be the cause of this problem?

Mike

* Re: ext3-0.9.16 against linux-2.4.17-pre2
From: Florian Lohoff @ 2001-12-05 12:37 UTC
To: Andrew Morton; +Cc: ext3-users@redhat.com, lkml

On Wed, Dec 05, 2001 at 01:32:04PM +0100, Florian Lohoff wrote:
> It seems something broken between 2.4.15-pre2 and this update - I am
> seeing filesystem corruption:
>
> Procmail moans about "locked" mailboxes - Opening them shows that
> the last mail originates about 4 hours ago although there are coming
> mails every minute.
>
> procmail: Extraneous locallockfile ignored
> procmail: Error while writing to "countpl/20011205"
> procmail: Truncated file to former size
> procmail: Error while writing to "archive/received-200112"
> procmail: Truncated file to former size
> From flo@mediaways.net Wed Dec 5 13:25:37 2001
>  Subject: Cron <nwmgmt@mgr1> /aol/bin/count.pl
>   Folder: /home/flo/Mail/inbox
>
> (flo@ping)~# ls -la Mail/countpl/20011205 Mail/archive/received-200112
> -rw------- 1 flo flo 51200000 Dec 5 13:25 Mail/archive/received-200112
> -rw------- 1 flo flo 51200000 Dec 5 13:25 Mail/countpl/20011205
>
> The last lines of the countpl/20011205 file contain 0 - Cut'n'pasted
> from "most".

Hand me the brown paperbag and let me die in shame :)

Postfix the ulimit tweaker bit me again ....

Flo
--
Florian Lohoff flo@rfc822.org +49-5201-669912
Nine nineth on september the 9th Welcome to the new billenium

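Both mailboxes stopped at exactly 51200000 bytes (0x030D4000, the offset where the hex dump ends), which is what a per-process file size limit looks like from the outside. The following is a small sketch of that behaviour on Linux, assuming a writable /tmp. `write_under_fsize_limit` is a made-up helper name and the scratch path is arbitrary; nothing here is taken from Postfix or procmail.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Lower the soft RLIMIT_FSIZE to `limit` bytes, then try to write `total`
 * bytes to an empty scratch file. Returns the number of bytes that made it
 * into the file (-1 on setup failure) and stores the errno of the first
 * failed write in *err (0 if no write failed).
 */
long write_under_fsize_limit(long limit, long total, int *err)
{
    char path[] = "/tmp/fsize-demo-XXXXXX";   /* hypothetical scratch file */
    char buf[4096];
    struct rlimit saved, lowered;
    long written = 0;
    int fd;

    *err = 0;
    memset(buf, 'x', sizeof(buf));

    /* By default, crossing the limit kills the process with SIGXFSZ;
     * ignoring the signal turns that into a plain EFBIG from write(). */
    signal(SIGXFSZ, SIG_IGN);

    if (getrlimit(RLIMIT_FSIZE, &saved) != 0)
        return -1;
    lowered = saved;
    lowered.rlim_cur = (rlim_t)limit;          /* soft limit only */
    if (setrlimit(RLIMIT_FSIZE, &lowered) != 0)
        return -1;

    fd = mkstemp(path);
    if (fd < 0) {
        setrlimit(RLIMIT_FSIZE, &saved);
        return -1;
    }
    unlink(path);                              /* file vanishes on close */

    while (written < total) {
        long left = total - written;
        size_t want = left < (long)sizeof(buf) ? (size_t)left : sizeof(buf);
        ssize_t n = write(fd, buf, want);

        if (n < 0) {                           /* expected: EFBIG at the limit */
            *err = errno;
            break;
        }
        written += n;                          /* may be a short write */
    }

    close(fd);
    setrlimit(RLIMIT_FSIZE, &saved);           /* put the old limit back */
    return written;
}
```

Calling `write_under_fsize_limit(10000, 20000, &err)` should leave the file at the limit rather than at the requested size, with the failing write reporting `EFBIG`. That would fit the symptoms in the report: an application that sees "Error while writing" and a file pinned at a round byte count, with nothing for fsck to find.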
* Re: ext3-0.9.16 against linux-2.4.17-pre2
From: Robert Love @ 2001-12-05 23:42 UTC
To: Andrew Morton; +Cc: ext3-users@redhat.com, lkml

On Mon, 2001-12-03 at 00:51, Andrew Morton wrote:
> An ext3 update which also applies to linux-2.4.16 is available at
>
> http://www.zip.com.au/~akpm/linux/ext3/
>
> Quite a lot of miscellany here. It would be appreciated if interested
> parties could please test it in preparation for sending upstream. Thanks.

Running 2.4.17-pre4 + preempt-kernel + ext3-0.9.16.

System survived a preliminary stress test, involving I/O and VM
pressure, with no problems. Seems solid here.

Also, subjectively the combination of 2.4.17-pre2+ and this ext3 patch
yields better performance under load. Can't comment which provides the
benefit without testing, but hey, it's the user experience that counts.

Robert Love

* 2.4.17-pre2+ext3-0.9.16+anton's cache aligned smp
From: Yusuf Goolamabbas @ 2001-12-06 8:30 UTC
To: ext3-users; +Cc: Andrew Morton, linux-kernel, anton, axboe

Running 2.4.17-pre2 + ext3-0.9.16 + Anton Blanchard's
cacheline_aligned_smp patch available at

http://samba.org/~anton/linux/cacheline_aligned/

Running this on a dual Xeon 500/2GB ram attached to a 3ware 6200 with
2x20 IDE disks. RH 7.2 [pain to install on a 440GX+3ware]. Make sure to
look at this bugzilla entry

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=54741

BTW, make install is borked on RH 7.2 (if you use grub) unless you
comment out the lilo in /sbin/installkernel

Workload was two client machines each with 50 mysql clients making mysql
queries to this machine which the local database jock had written, mix
of inserts, selects, updates etc.

mysqladmin status on the server showed around 2100 queries/sec. Seemed
very responsive. I'll be adding some more client machines and reducing
server memory and testing further.

With Anton's patch, the number of ctx-swtch/sec drops by around 3000
from avg of 9000 (for 17-pre2+ext3) to avg of 6000 (with anton) as seen
by vmstat 1

Load avg is around 4-5 for this compared to 10-12 for 2.4.7-10smp as
installed by RH

I'm also trying to see if I can test with Jens Axboe's blk-highmem
patch. It applies cleanly to 2.4.17-pre2+ext3-0.9.16 but I can't seem to
get CONFIG_HIGHIO configured via make {old,menu}config. Any gurus want
to take a look? I'd really like to reduce usage of bounce buffers.

Also, on #kernelnewbies, Andre Hedrick claims blk-highmem eats your
data. That didn't occur last time I tested it. I thought it was rock
solid and ready for inclusion. Anybody confirm/deny ?

> On Mon, 2001-12-03 at 00:51, Andrew Morton wrote:
> > An ext3 update which also applies to linux-2.4.16 is available at
> >
> > http://www.zip.com.au/~akpm/linux/ext3/
> >
> > Quite a lot of miscellany here. It would be appreciated if interested
> > parties could please test it in preparation for sending upstream. Thanks.
>
> Running 2.4.17-pre4 + preempt-kernel + ext3-0.9.16.
>
> System survived a preliminary stress test, involving I/O and VM
> pressure, with no problems. Seems solid here.
>
> Also, subjectively the combination of 2.4.17-pre2+ and this ext3 patch
> yields better performance under load. Can't comment which provide the
> benefit without testing, but hey, it's the user experience that counts.
>
> Robert Love

* Re: 2.4.17-pre2+ext3-0.9.16+anton's cache aligned smp
From: Jens Axboe @ 2001-12-06 8:39 UTC
To: Yusuf Goolamabbas; +Cc: ext3-users, Andrew Morton, linux-kernel, anton

On Thu, Dec 06 2001, Yusuf Goolamabbas wrote:
> I'm also trying to see if I can get test with Jen Axboe's blk-highmem
> patch, It applies cleanly to 2.4.17-pre2+ext3-0.9.16 but I can't seem to
> get CONFIG_HIGHIO configured via make {old,menu}config. Any gurus want
> to take a look. I'd really like to reduce usage of bounce buffers.

There was a config bug, please just use Andrea's -aa kernels, they have
that fixed.

> Also, on #kernelnewbies, Andre Hedrick claims blk-highmem eats your
> data. That didn't occur last time I tested it. I thought it was rock
> solid and ready for inclusion. Anybody confirm/deny ?

Andre claims a lot of things.

--
Jens Axboe

* Re: 2.4.17-pre2+ext3-0.9.16+anton's cache aligned smp
From: Andrew Morton @ 2001-12-06 8:45 UTC
To: Yusuf Goolamabbas; +Cc: ext3-users, linux-kernel, anton, axboe

Yusuf Goolamabbas wrote:
>
> Running 2.4.17-pre2 + ext3-0.9.16 + Anton Blanchards
> cacheline_aligned_smp patch available at
>
> http://samba.org/~anton/linux/cacheline_aligned/

omigod look at that graph.

Excuse me while I get frustrated. Will someone *please* send that
damn patch to marcelo@conectiva.com.br?

(It can be improved further by putting padding *behind* the lock
but hey).

> ...
>
> With Anton's patch, the number of ctx-swtch/sec drops by around 3000
> from avg of 9000 (for 17-pre2+ext3) to avg of 6000 (with anton) as seen
> by vmstat 1

Really? The spinlock cacheline alignment alone made that
difference? I wonder why.

Thanks for testing.

-

* Re: 2.4.17-pre2+ext3-0.9.16+anton's cache aligned smp
From: Anton Blanchard @ 2001-12-06 13:02 UTC
To: Andrew Morton; +Cc: Yusuf Goolamabbas, ext3-users, linux-kernel, axboe

> omigod look at that graph.

:) Well it's probably worth explaining the results a bit. This machine
has a 128 byte cacheline and ppc uses a load with reservation, store
conditional pair to do atomic operations. The reservation granularity is
one cacheline and if another cpu stores into the cacheline we lose the
reservation. If this is the case it's easy to see why forcing hot locks
into their own cacheline makes a big difference.

> Excuse me while I get frustrated. Will someone *please* send that
> damn patch to marcelo@conectiva.com.br?

Marcelo had some concerns with my original patch (I changed some things
in UP too). I redid the patch (aligning kernel_flag too as suggested by
you) which does not affect UP, I'll forward it on :)

> (It can be improved further by putting padding *behind* the lock
> but hey).

Well since we put these spinlocks into the cacheline_aligned section
I don't think we need any padding behind the lock, can you check out
the offsets in System.map, it looks ok on ppc64 at least :)

> Really? The spinlock cacheline alignment alone made that
> difference? I wonder why.

I'm guessing xeon cannot satisfy a cacheline miss by stealing directly
from another cpu's cache. If that is the case it could be quite expensive
to bounce a cacheline between cpus. But I am also surprised it made that
much difference.

Anton

diff -ru 2.4.17-pre2/arch/alpha/kernel/smp.c 2.4.17-pre2_work/arch/alpha/kernel/smp.c
--- 2.4.17-pre2/arch/alpha/kernel/smp.c Fri Nov 23 18:12:35 2001
+++ 2.4.17-pre2_work/arch/alpha/kernel/smp.c Thu Dec 6 22:47:23 2001
@@ -23,6 +23,7 @@
 #include <linux/delay.h>
 #include <linux/spinlock.h>
 #include <linux/irq.h>
+#include <linux/cache.h>
 
 #include <asm/hwrpb.h>
 #include <asm/ptrace.h>
@@ -65,7 +66,7 @@
 	IPI_CPU_STOP,
 };
 
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 /* Set to a secondary's cpuid when it comes online. */
 static unsigned long smp_secondary_alive;
diff -ru 2.4.17-pre2/arch/i386/kernel/smp.c 2.4.17-pre2_work/arch/i386/kernel/smp.c
--- 2.4.17-pre2/arch/i386/kernel/smp.c Thu Oct 25 11:29:51 2001
+++ 2.4.17-pre2_work/arch/i386/kernel/smp.c Thu Dec 6 22:47:26 2001
@@ -17,6 +17,7 @@
 #include <linux/smp_lock.h>
 #include <linux/kernel_stat.h>
 #include <linux/mc146818rtc.h>
+#include <linux/cache.h>
 
 #include <asm/mtrr.h>
 #include <asm/pgalloc.h>
@@ -102,7 +103,7 @@
  */
 
 /* The 'big kernel lock' */
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 struct tlb_state cpu_tlbstate[NR_CPUS] = {[0 ... NR_CPUS-1] = { &init_mm, 0 }};
 
diff -ru 2.4.17-pre2/arch/ia64/kernel/smp.c 2.4.17-pre2_work/arch/ia64/kernel/smp.c
--- 2.4.17-pre2/arch/ia64/kernel/smp.c Fri Nov 23 18:12:36 2001
+++ 2.4.17-pre2_work/arch/ia64/kernel/smp.c Thu Dec 6 22:47:29 2001
@@ -30,6 +30,7 @@
 #include <linux/kernel_stat.h>
 #include <linux/mm.h>
 #include <linux/delay.h>
+#include <linux/cache.h>
 
 #include <asm/atomic.h>
 #include <asm/bitops.h>
@@ -51,7 +52,7 @@
 #include <asm/mca.h>
 
 /* The 'big kernel lock' */
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 /*
  * Structure and data for smp_call_function(). This is designed to minimise static memory
diff -ru 2.4.17-pre2/arch/mips/kernel/smp.c 2.4.17-pre2_work/arch/mips/kernel/smp.c
--- 2.4.17-pre2/arch/mips/kernel/smp.c Fri Nov 23 18:12:37 2001
+++ 2.4.17-pre2_work/arch/mips/kernel/smp.c Thu Dec 6 22:47:37 2001
@@ -31,6 +31,7 @@
 #include <linux/timex.h>
 #include <linux/sched.h>
 #include <linux/interrupt.h>
+#include <linux/cache.h>
 
 #include <asm/atomic.h>
 #include <asm/processor.h>
@@ -52,7 +53,7 @@
 
 
 /* Ze Big Kernel Lock! */
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 int smp_threads_ready; /* Not used */
 int smp_num_cpus;
 int global_irq_holder = NO_PROC_ID;
diff -ru 2.4.17-pre2/arch/mips64/kernel/smp.c 2.4.17-pre2_work/arch/mips64/kernel/smp.c
--- 2.4.17-pre2/arch/mips64/kernel/smp.c Thu Oct 25 11:29:20 2001
+++ 2.4.17-pre2_work/arch/mips64/kernel/smp.c Thu Dec 6 22:47:44 2001
@@ -5,6 +5,7 @@
 #include <linux/time.h>
 #include <linux/timex.h>
 #include <linux/sched.h>
+#include <linux/cache.h>
 
 #include <asm/atomic.h>
 #include <asm/processor.h>
@@ -52,7 +53,7 @@
 #endif /* CONFIG_SGI_IP27 */
 
 /* The 'big kernel lock' */
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 int smp_threads_ready; /* Not used */
 atomic_t smp_commenced = ATOMIC_INIT(0);
 struct cpuinfo_mips cpu_data[NR_CPUS];
diff -ru 2.4.17-pre2/arch/ppc/kernel/smp.c 2.4.17-pre2_work/arch/ppc/kernel/smp.c
--- 2.4.17-pre2/arch/ppc/kernel/smp.c Tue Nov 27 01:12:08 2001
+++ 2.4.17-pre2_work/arch/ppc/kernel/smp.c Thu Dec 6 22:47:54 2001
@@ -23,6 +23,7 @@
 #include <linux/unistd.h>
 #include <linux/init.h>
 #include <linux/spinlock.h>
+#include <linux/cache.h>
 
 #include <asm/ptrace.h>
 #include <asm/atomic.h>
@@ -45,7 +46,7 @@
 struct klock_info_struct klock_info = { KLOCK_CLEAR, 0 };
 atomic_t ipi_recv;
 atomic_t ipi_sent;
-spinlock_t kernel_flag __cacheline_aligned = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 unsigned int prof_multiplier[NR_CPUS];
 unsigned int prof_counter[NR_CPUS];
 cycles_t cacheflush_time;
diff -ru 2.4.17-pre2/arch/s390/kernel/smp.c 2.4.17-pre2_work/arch/s390/kernel/smp.c
--- 2.4.17-pre2/arch/s390/kernel/smp.c Fri Nov 23 18:12:37 2001
+++ 2.4.17-pre2_work/arch/s390/kernel/smp.c Thu Dec 6 22:48:00 2001
@@ -29,6 +29,7 @@
 #include <linux/smp_lock.h>
 
 #include <linux/delay.h>
+#include <linux/cache.h>
 
 #include <asm/sigp.h>
 #include <asm/pgalloc.h>
@@ -55,7 +56,7 @@
 int smp_threads_ready=0; /* Set when the idlers are all forked. */
 static atomic_t smp_commenced = ATOMIC_INIT(0);
 
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 unsigned long cpu_online_map;
 
diff -ru 2.4.17-pre2/arch/s390x/kernel/smp.c 2.4.17-pre2_work/arch/s390x/kernel/smp.c
--- 2.4.17-pre2/arch/s390x/kernel/smp.c Fri Nov 23 18:12:37 2001
+++ 2.4.17-pre2_work/arch/s390x/kernel/smp.c Thu Dec 6 22:48:10 2001
@@ -29,6 +29,7 @@
 #include <linux/smp_lock.h>
 
 #include <linux/delay.h>
+#include <linux/cache.h>
 
 #include <asm/sigp.h>
 #include <asm/pgalloc.h>
@@ -55,7 +56,7 @@
 int smp_threads_ready=0; /* Set when the idlers are all forked. */
 static atomic_t smp_commenced = ATOMIC_INIT(0);
 
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 unsigned long cpu_online_map;
 
diff -ru 2.4.17-pre2/arch/sparc/kernel/smp.c 2.4.17-pre2_work/arch/sparc/kernel/smp.c
--- 2.4.17-pre2/arch/sparc/kernel/smp.c Fri Nov 23 18:12:37 2001
+++ 2.4.17-pre2_work/arch/sparc/kernel/smp.c Thu Dec 6 22:48:17 2001
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/fs.h>
 #include <linux/seq_file.h>
+#include <linux/cache.h>
 
 #include <asm/ptrace.h>
 #include <asm/atomic.h>
@@ -66,7 +67,7 @@
  */
 
 /* Kernel spinlock */
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 /* Used to make bitops atomic */
 unsigned char bitops_spinlock = 0;
diff -ru 2.4.17-pre2/arch/sparc64/kernel/smp.c 2.4.17-pre2_work/arch/sparc64/kernel/smp.c
--- 2.4.17-pre2/arch/sparc64/kernel/smp.c Thu Dec 6 22:50:55 2001
+++ 2.4.17-pre2_work/arch/sparc64/kernel/smp.c Thu Dec 6 22:48:23 2001
@@ -17,6 +17,7 @@
 #include <linux/spinlock.h>
 #include <linux/fs.h>
 #include <linux/seq_file.h>
+#include <linux/cache.h>
 
 #include <asm/head.h>
 #include <asm/ptrace.h>
@@ -49,7 +50,7 @@
 static int smp_activated = 0;
 
 /* Kernel spinlock */
-spinlock_t kernel_flag = SPIN_LOCK_UNLOCKED;
+spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 volatile int smp_processors_ready = 0;
 unsigned long cpu_present_map = 0;
diff -ru 2.4.17-pre2/fs/block_dev.c 2.4.17-pre2_work/fs/block_dev.c
--- 2.4.17-pre2/fs/block_dev.c Fri Nov 23 18:12:43 2001
+++ 2.4.17-pre2_work/fs/block_dev.c Thu Dec 6 22:28:01 2001
@@ -234,7 +234,7 @@
 #define HASH_SIZE (1UL << HASH_BITS)
 #define HASH_MASK (HASH_SIZE-1)
 static struct list_head bdev_hashtable[HASH_SIZE];
-static spinlock_t bdev_lock = SPIN_LOCK_UNLOCKED;
+static spinlock_t bdev_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 static kmem_cache_t * bdev_cachep;
 
 #define alloc_bdev() \
diff -ru 2.4.17-pre2/fs/buffer.c 2.4.17-pre2_work/fs/buffer.c
--- 2.4.17-pre2/fs/buffer.c Thu Dec 6 22:50:56 2001
+++ 2.4.17-pre2_work/fs/buffer.c Thu Dec 6 22:28:01 2001
@@ -73,7 +73,7 @@
 static rwlock_t hash_table_lock = RW_LOCK_UNLOCKED;
 
 static struct buffer_head *lru_list[NR_LIST];
-static spinlock_t lru_list_lock = SPIN_LOCK_UNLOCKED;
+static spinlock_t lru_list_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 static int nr_buffers_type[NR_LIST];
 static unsigned long size_buffers_type[NR_LIST];
 
diff -ru 2.4.17-pre2/fs/dcache.c 2.4.17-pre2_work/fs/dcache.c
--- 2.4.17-pre2/fs/dcache.c Thu Oct 25 11:29:42 2001
+++ 2.4.17-pre2_work/fs/dcache.c Thu Dec 6 22:28:01 2001
@@ -29,7 +29,7 @@
 #define DCACHE_PARANOIA 1
 /* #define DCACHE_DEBUG 1 */
 
-spinlock_t dcache_lock = SPIN_LOCK_UNLOCKED;
+spinlock_t dcache_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 /* Right now the dcache depends on the kernel lock */
 #define check_lock() if (!kernel_locked()) BUG()
diff -ru 2.4.17-pre2/include/linux/cache.h 2.4.17-pre2_work/include/linux/cache.h
--- 2.4.17-pre2/include/linux/cache.h Thu Oct 25 11:29:44 2001
+++ 2.4.17-pre2_work/include/linux/cache.h Thu Dec 6 22:28:01 2001
@@ -34,4 +34,12 @@
 #endif
 #endif /* __cacheline_aligned */
 
+#ifndef __cacheline_aligned_in_smp
+#ifdef CONFIG_SMP
+#define __cacheline_aligned_in_smp __cacheline_aligned
+#else
+#define __cacheline_aligned_in_smp
+#endif /* CONFIG_SMP */
+#endif
+
 #endif /* __LINUX_CACHE_H */
diff -ru 2.4.17-pre2/mm/filemap.c 2.4.17-pre2_work/mm/filemap.c
--- 2.4.17-pre2/mm/filemap.c Tue Nov 27 01:12:08 2001
+++ 2.4.17-pre2_work/mm/filemap.c Thu Dec 6 22:28:01 2001
@@ -53,7 +53,7 @@
 EXPORT_SYMBOL(vm_min_readahead);
 
 
-spinlock_t pagecache_lock ____cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
+spinlock_t pagecache_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 /*
  * NOTE: to avoid deadlocking you must never acquire the pagemap_lru_lock
  * with the pagecache_lock held.
@@ -63,7 +63,7 @@
  * pagemap_lru_lock ->
  * pagecache_lock
  */
-spinlock_t pagemap_lru_lock ____cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
+spinlock_t pagemap_lru_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 #define CLUSTER_PAGES (1 << page_cluster)
 #define CLUSTER_OFFSET(x) (((x) >> page_cluster) << page_cluster)
diff -ru 2.4.17-pre2/mm/highmem.c 2.4.17-pre2_work/mm/highmem.c
--- 2.4.17-pre2/mm/highmem.c Thu Oct 25 11:29:54 2001
+++ 2.4.17-pre2_work/mm/highmem.c Thu Dec 6 22:34:47 2001
@@ -32,7 +32,7 @@
  */
 static int pkmap_count[LAST_PKMAP];
 static unsigned int last_pkmap_nr;
-static spinlock_t kmap_lock = SPIN_LOCK_UNLOCKED;
+static spinlock_t kmap_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 pte_t * pkmap_page_table;
 

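The effect of `__cacheline_aligned_in_smp` in the patch above can be tried out in user space. This is a rough sketch, not kernel code: `TOY_CACHELINE`, `TOY_SMP`, `toy_spinlock_t`, the two lock variables and `toy_same_cacheline` are all invented for illustration. The 128-byte line size is taken from Anton's description of his ppc machine, and the GCC/Clang `aligned` attribute stands in for the kernel's `__cacheline_aligned`, which per Anton's message also places the object in a cacheline_aligned section.

```c
#include <stdint.h>

/* Assumption: 128-byte cache line, as on the ppc machine in the thread. */
#define TOY_CACHELINE 128

/* Stand-in for CONFIG_SMP; remove the define to model a UP build. */
#define TOY_SMP 1

/* Same shape as the cache.h hunk: align on SMP, expand to nothing on UP. */
#ifdef TOY_SMP
#define toy_cacheline_aligned_in_smp __attribute__((aligned(TOY_CACHELINE)))
#else
#define toy_cacheline_aligned_in_smp
#endif

/* Minimal placeholder for a spinlock: one word, like a simple lock. */
typedef struct {
    volatile unsigned int lock;
} toy_spinlock_t;

/* Two "hot" locks. Without the attribute the linker is free to place them
 * next to each other, so CPUs taking different locks would still fight
 * over one cache line. */
toy_spinlock_t toy_lock_a toy_cacheline_aligned_in_smp = { 0 };
toy_spinlock_t toy_lock_b toy_cacheline_aligned_in_smp = { 0 };

/* Returns 1 if both addresses fall into the same TOY_CACHELINE-sized line. */
int toy_same_cacheline(const void *a, const void *b)
{
    return (uintptr_t)a / TOY_CACHELINE == (uintptr_t)b / TOY_CACHELINE;
}
```

With the attribute in place each lock starts on its own line, so `toy_same_cacheline(&toy_lock_a, &toy_lock_b)` returns 0. Alignment alone does not stop the linker from placing unrelated data in the remainder of a lock's line, which is the padding *behind* the lock that Andrew mentions; Anton's answer is that a dedicated section takes care of it.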
* Re: 2.4.17-pre2+ext3-0.9.16+anton's cache aligned smp
From: Daniel Phillips @ 2001-12-07 15:27 UTC
To: Andrew Morton, Yusuf Goolamabbas; +Cc: ext3-users, linux-kernel, anton, axboe

On December 6, 2001 09:45 am, Andrew Morton wrote:
> Yusuf Goolamabbas wrote:
> >
> > Running 2.4.17-pre2 + ext3-0.9.16 + Anton Blanchards
> > cacheline_aligned_smp patch available at
> >
> > http://samba.org/~anton/linux/cacheline_aligned/
>
> omigod look at that graph.
>
> Excuse me while I get frustrated. Will someone *please* send that
> damn patch to marcelo@conectiva.com.br?
>
> (It can be improved further by putting padding *behind* the lock
> but hey).
>
> > ...
> >
> > With Anton's patch, the number of ctx-swtch/sec drops by around 3000
> > from avg of 9000 (for 17-pre2+ext3) to avg of 6000 (with anton) as seen
> > by vmstat 1
>
> Really? The spinlock cacheline alignment alone made that
> difference? I wonder why.

Before getting *too* excited, remember, it's dbench, so effects could
easily be magnified. Maybe test with something better behaved?

--
Daniel