* 2.5.66-mm1
@ 2003-03-26 9:38 ` Andrew Morton
0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2003-03-26 9:38 UTC (permalink / raw)
To: linux-kernel, linux-mm
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/
. The anticipatory scheduler is in wrapup mode now. It is pretty much in
its final form.
. The ext2 locking changes have been significantly redone.
The per-blockgroup data structures had to go. For a 4TB filesystem we
cannot even kmalloc that many pointers, let alone data structures.
So the per-blockgroup spinlocking has been replaced with hashed
spinlocking and the per-blockgroup accounting has been removed. A "per-cpu
counter" thing has been invented to amortise the locking cost of the
filesystem-wide counters.
. ext3 is now using spinlocking in its block allocator rather than a
filesystem-wide semaphore.
It is stability-tested but I have not yet performance tested this
closely. It does appear to have improved the context switch problem (and
the file fragmentation problem which the context switch problem causes).
But there's a way to go here.
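The "per-cpu counter" and hashed spinlocking ideas from the ext2 item above can be illustrated with a small user-space sketch. This is not the code from the patches: the struct, the function names, the batch size and the lock-table size are all assumptions chosen for illustration, and the spinlock is only indicated by a comment.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH  4    /* assumed CPU count */
#define FOLD_BATCH      32   /* assumed threshold before touching the shared total */
#define NR_HASHED_LOCKS 16   /* assumed size of the hashed lock table */

/* Hypothetical per-cpu counter: each CPU accumulates a private delta and
 * only folds it into the shared total once the delta reaches FOLD_BATCH. */
struct percpu_counter_sketch {
    long total;                   /* shared, lock-protected in real code */
    long delta[NR_CPUS_SKETCH];   /* private to each CPU, no lock needed */
    unsigned long lock_taken;     /* counts how often the lock would be taken */
};

static void counter_mod(struct percpu_counter_sketch *c, int cpu, long amount)
{
    long v;

    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    v = c->delta[cpu] + amount;
    if (labs(v) >= FOLD_BATCH) {
        /* A real implementation would hold a spinlock around this update. */
        c->total += v;
        c->lock_taken++;
        v = 0;
    }
    c->delta[cpu] = v;
}

/* Cheap read: ignores deltas that have not been folded in yet. */
static long counter_read_approx(const struct percpu_counter_sketch *c)
{
    return c->total;
}

/* Exact read: sums the pending per-CPU deltas as well. */
static long counter_read_exact(const struct percpu_counter_sketch *c)
{
    long sum = c->total;
    int cpu;

    for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += c->delta[cpu];
    return sum;
}

/* Hashed locking: many block groups share a small fixed table of locks,
 * so nothing has to be allocated per block group. */
static unsigned int hashed_lock_index(unsigned long block_group)
{
    return (unsigned int)(block_group % NR_HASHED_LOCKS);
}
```

With these assumed values, 100 single increments on one CPU touch the shared total 3 times instead of 100, at the cost of the cheap read lagging behind the exact value.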
Changes since 2.5.65-mm4:
linus.patch
Latest -bk
-nfsd-32-bit-dev_t-fixes.patch
-i2c-fix.patch
Merged
+kgdb-ga.patch
George Anzinger's gdb stub
+ppa-null-pointer-fix.patch
Might fix the parport scsi driver
+initcall-debug.patch
Debugging support for misbehaving initcalls
+posix-timers-64-bit-fix.patch
Timer fix for 64-bit machines
+slab-off-by-one-fix.patch
Slab was using too much memory.
+install_page-flush_cache_page.patch
Cache coherency bug in remap_file_pages()
+as-minor-tweaks.patch
+as-remove-stats.patch
Anticipatory scheduler tuning and cleanups.
+posix-timer-double-expiration-fix.patch
Posix timers were sending timer expiry info twice.
+hugh-01-no-SWAP_ERROR.patch
+hugh-02-try_to_unmap-CONFIG_SWAP.patch
+hugh-03-add_to_swap_cache.patch
+hugh-04-page_convert_anon-ENOMEM.patch
+hugh-05-page_convert_anon-unlocking.patch
+hugh-06-wrap-below-vm_start.patch
+hugh-07-objrmap-page_table_lock.patch
+hugh-08-rmap-comments.patch
+hugh-09-tmpfs-truncation.patch
+hugh-10-tmpfs-atomics.patch
+hugh-11-fix-unuse_pmd-fixme.patch
+hugh-12-vm_enough_memory-double-counts.patch
Various vm/mm fixes and cleanups
+ext3-max-file-size-fix.patch
Allow ext3 to create files larger than 32GB (should be nearly 2TB)
-ext2-no-lock_super.patch
-ext2-ialloc-no-lock_super.patch
+ext2-no-lock_super-ng.patch
+ext2-ialloc-no-lock_super-ng.patch
Rework the ext2 block and inode allocator locking changes.
+dev_t-remove-B_FREE.patch
Remove B_FREE.
+tty_io-cleanup.patch
+page_to_pfn-in-blk_queue_bounce.patch
+init_inode_once-bloat-fix.patch
Cleanups and fixlets
+compound-page-warning-fix.patch
Fix a warning
+slab-cache-sizes-cleanup.patch
Unduplicate some tables in slab.
+stat_t-larger-dev_t.patch
Large dev_t fix.
+acpi-build-fix.patch
make acpi compile.
+sync_blockdev-on-final-close.patch
Only write out blockdev mappings on the final close.
+ext3-concurrent-block-inode-allocation.patch
+ext3-concurrent-block-allocation-fix-1.patch
Use spinlocking in the ext3 block allocator, not a fs-wide semaphore.
All 104 patches:
linus.patch
mm.patch
add -mmN to EXTRAVERSION
kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
ppa-null-pointer-fix.patch
initcall-debug.patch
initcall debugging support
posix-timers-64-bit-fix.patch
POSIX timers interface long/int cleanup
slab-off-by-one-fix.patch
slab: fix off-by-one in size calculation
config_spinline.patch
uninline spinlocks for profiling accuracy.
ppc64-reloc_hide.patch
ppc64-pci-patch.patch
Subject: pci patch
ppc64-aio-32bit-emulation.patch
32/64bit emulation for aio
ppc64-scruffiness.patch
Fix some PPC64 compile warnings
sym-do-160.patch
make the SYM driver do 160 MB/sec
install_page-flush_cache_page.patch
add flush_cache_page() to install_page()
config-PAGE_OFFSET.patch
Configurable kernel/user memory split
ptrace-flush.patch
cache flushing in the ptrace code
buffer-debug.patch
buffer.c debugging
warn-null-wakeup.patch
ext3-truncate-ordered-pages.patch
ext3: explicitly free truncated pages
reiserfs_file_write-5.patch
rcu-stats.patch
RCU statistics reporting
ext3-journalled-data-assertion-fix.patch
Remove incorrect assertion from ext3
nfs-speedup.patch
nfs-oom-fix.patch
nfs oom fix
sk-allocation.patch
Subject: Re: nfs oom
nfs-more-oom-fix.patch
rpciod-atomic-allocations.patch
Make rpciod use atomic allocations
linux-isp.patch
isp-update-1.patch
kblockd.patch
Create `kblockd' workqueue
as-iosched.patch
anticipatory I/O scheduler
as-np-reads-1.patch
AS: read-vs-read fixes
as-np-reads-2.patch
AS: more read-vs-read fixes
as-predict-data-direction.patch
as: predict direction of next IO
as-remove-frontmerge.patch
AS: remove frontmerge tunable
as-misc-cleanups.patch
AS: misc cleanups
as-minor-tweaks.patch
AS: tuning and tweaks
as-remove-stats.patch
AS: remove statistics
cfq-2.patch
CFQ scheduler, #2
unplug-use-kblockd.patch
Use kblockd for running request queues
fremap-all-mappings.patch
Make all executable mappings be nonlinear
objrmap-2.5.62-5.patch
object-based rmap
sched-2.5.64-D3.patch
sched-2.5.64-D3, more interactivity changes
scheduler-tunables.patch
scheduler tunables
show_task-free-stack-fix.patch
show_task() fix and cleanup
yellowfin-set_bit-fix.patch
yellowfin driver set_bit fix
htree-nfs-fix.patch
Fix ext3 htree / NFS compatibility problems
task_prio-fix.patch
simple task_prio() fix
slab_store_user-large-objects.patch
slab debug: perform redzoning against larger objects
pcmcia-2.patch
pcmcia-3b.patch
pcmcia-3.patch
pcmcia-4.patch
pcmcia-5.patch
pcmcia-6.patch
pcmcia-7b.patch
pcmcia-7.patch
pcmcia-8.patch
pcmcia-9.patch
pcmcia-10.patch
htree-nfs-fix-2.patch
htree nfs fix
posix-timer-double-expiration-fix.patch
posix timers: fix double-reporting of timer expiration
hugh-01-no-SWAP_ERROR.patch
swap 01/13 no SWAP_ERROR
hugh-02-try_to_unmap-CONFIG_SWAP.patch
Subject: [PATCH] swap 02/13 !CONFIG_SWAP try_to_unmap
hugh-03-add_to_swap_cache.patch
swap 03/13 add_to_swap_cache
hugh-04-page_convert_anon-ENOMEM.patch
swap 04/13 page_convert_anon -ENOMEM
hugh-05-page_convert_anon-unlocking.patch
swap 05/13 page_convert_anon unlocking
hugh-06-wrap-below-vm_start.patch
swap 06/13 wrap below vm_start
hugh-07-objrmap-page_table_lock.patch
swap 07/13 objrmap page_table_lock
hugh-08-rmap-comments.patch
swap 08/13 rmap comments
hugh-09-tmpfs-truncation.patch
swap 09/13 tmpfs truncation
hugh-10-tmpfs-atomics.patch
swap 10/13 tmpfs atomics
hugh-11-fix-unuse_pmd-fixme.patch
swap 11/13 fix unuse_pmd fixme
hugh-12-vm_enough_memory-double-counts.patch
swap 12/13 vm_enough_memory double counts
ext3-max-file-size-fix.patch
ext3: fix max file size
ext2-no-lock_super-ng.patch
ext2-ialloc-no-lock_super-ng.patch
linear-oops-fix-1.patch
md/linear oops fix
dev_t-32-bit.patch
[for playing only] change type of dev_t
dev_t-remove-B_FREE.patch
dev_t: eliminate B_FREE
dev_t-drm-warnings.patch
dev_t: fix drm printk warnings
sg-dev_t-fix.patch
32-bit dev_t fix for sg
oops-dump-preceding-code.patch
i386 oops output: dump preceding code
x86-clock-override-option.patch
x86 clock override boot option
tty_io-cleanup.patch
tty_io cleanup
page_to_pfn-in-blk_queue_bounce.patch
Subject: use page_to_pfn() in __blk_queue_bounce()
init_inode_once-bloat-fix.patch
Subject: init_inode_once() wants sizeof(struct hlist_head)
conntrack-use-after-free-fix.patch
fix use-after-free in ip_conntrack
VM_DONTEXPAND-fix.patch
honour VM_DONTEXPAND in vma merging
compound-page-warning-fix.patch
Fix 64bit warnings in mm/page_alloc.c
cdevname-irq-safety-fix.patch
make cdevname() callable from interrupts
register_chrdev_region-leak-fix.patch
register_chrdev_region() leak and race fix
slab-cache-sizes-cleanup.patch
slab: cache sizes cleanup
stat_t-larger-dev_t.patch
struct stat - support larger dev_t
acpi-build-fix.patch
ACPI build fix
sync_blockdev-on-final-close.patch
sync blockdevs on the final close only
ext3_mark_inode_dirty-speedup.patch
ext3_mark_inode_dirty() speedup
ext3_mark_inode_dirty-less-calls.patch
ext3_commit_write speedup
ext3-handle-cache.patch
ext3: create a slab cache for transaction handles
ext3-no-bkl.patch
journal_dirty_metadata-speedup.patch
journal_get_write_access-speedup.patch
ext3-concurrent-block-inode-allocation.patch
Subject: [PATCH] concurrent block/inode allocation for EXT3
ext3-concurrent-block-allocation-fix-1.patch
^ permalink raw reply [flat|nested] 27+ messages in thread
* LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 9:38 ` 2.5.66-mm1 Andrew Morton
(?)
@ 2003-03-26 12:26 ` Erik Hensema
2003-03-26 13:48 ` Andries Brouwer
-1 siblings, 1 reply; 27+ messages in thread
From: Erik Hensema @ 2003-03-26 12:26 UTC (permalink / raw)
To: linux-kernel
Andrew Morton (akpm@digeo.com) wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/
LVM or device mapper seems to be broken in -mm. I've only tried the
following kernels so far:
2.5.64 - works
2.5.65-mm2 - doesn't work
2.5.66 - works
2.5.66-mm1 - doesn't work
I'm getting these messages while setting up LVM from my bootscripts (I've
included the actual commands prefixed with a > ):
Remounting root file system (/) read/write for vgscan...
> mount -n -o remount,rw /
Removing old device inodes...
> rm /dev/system/* /dev/mapper/*
Setting up devices...
> /usr/local/sbin/devmap_mknod.sh
Creating /dev/mapper/control character device with major:10 minor:63.
Scanning for LVM volume groups...
> /usr/local/sbin/vgscan
Reading all physical volumes. This may take a while...
Found volume group "system" using metadata type lvm1
Activating LVM volume groups...
> /usr/local/sbin/vgchange -a y system
device-mapper: allocating minor 0.
device-mapper: allocating minor 1.
device-mapper: destroying md
device-mapper: destroying table
device-mapper: allocating minor 0.
device-mapper: destroying md
device-mapper: destroying table
1 logical volume(s) in volume group "system" now active
The only active volume is the most recently created volume.
On 2.5.6x-vanilla the output of vgchange is:
device-mapper: allocating minor 0.
device-mapper: allocating minor 1.
device-mapper: allocating minor 2.
3 logical volume(s) in volume group "system" now active
--
Erik Hensema <erik@hensema.net>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 12:26 ` LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1) Erik Hensema
@ 2003-03-26 13:48 ` Andries Brouwer
2003-03-26 14:33 ` Erik Hensema
0 siblings, 1 reply; 27+ messages in thread
From: Andries Brouwer @ 2003-03-26 13:48 UTC (permalink / raw)
To: erik; +Cc: linux-kernel
On Wed, Mar 26, 2003 at 12:26:37PM +0000, Erik Hensema wrote:
> Andrew Morton (akpm@digeo.com) wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/
> LVM or device mapper seems to be broken in -mm. I've only tried the
> following kernels so far:
> 2.5.64 - works
> 2.5.65-mm2 - doesn't work
> 2.5.66 - works
> 2.5.66-mm1 - doesn't work
Probably you are hit by
dev_t-32-bit.patch
[for playing only] change type of dev_t
This is hidden somewhat in the 100+ patches in -mm,
but the kernel is not quite ready yet - that is
why this is labeled "not to be applied, for
playing only". Mostly things work, but some stuff
related to lvm, md, dm, nfs, loop will break
because ioctls use structs with a dev_t field.
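Why a wider dev_t breaks such ioctls can be seen with a made-up structure. The sketch below assumes dev_t grows from 16 to 32 bits; struct demo_ioctl_old and struct demo_ioctl_new are hypothetical and do not correspond to any real kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t old_dev_t;   /* assumed old width: 16 bits */
typedef uint32_t new_dev_t;   /* assumed new width: 32 bits */

/* Hypothetical ioctl payload as an old user-space binary was compiled. */
struct demo_ioctl_old {
    uint32_t  flags;
    old_dev_t dev;
    char      name[16];
};

/* The same payload as a kernel with the wider dev_t would see it. */
struct demo_ioctl_new {
    uint32_t  flags;
    new_dev_t dev;
    char      name[16];
};

/* Returns nonzero if the two sides disagree about the layout. */
static int layouts_differ(void)
{
    return sizeof(struct demo_ioctl_old) != sizeof(struct demo_ioctl_new)
        || offsetof(struct demo_ioctl_old, name)
           != offsetof(struct demo_ioctl_new, name);
}
```

Every field after dev moves, so a tool built against the old layout hands the kernel a buffer that it misreads, even when the total size happens to stay the same.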
You can revert this single patch and probably all will be fine.
More interesting would be to apply
http://marc.theaimsgroup.com/?l=linux-kernel&m=103956089203199&w=3
if possible, and see whether that helps.
You can see some earlier discussion today under a subject
containing the word dm_ioctl.
Andries
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 13:48 ` Andries Brouwer
@ 2003-03-26 14:33 ` Erik Hensema
2003-03-26 16:03 ` Andries Brouwer
0 siblings, 1 reply; 27+ messages in thread
From: Erik Hensema @ 2003-03-26 14:33 UTC (permalink / raw)
To: linux-kernel
Andries Brouwer (aebr@win.tue.nl) wrote:
> On Wed, Mar 26, 2003 at 12:26:37PM +0000, Erik Hensema wrote:
>> Andrew Morton (akpm@digeo.com) wrote:
>> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/
>
>> LVM or device mapper seems to be broken in -mm. I've only tried the
>> following kernels so far:
>> 2.5.64 - works
>> 2.5.65-mm2 - doesn't work
>> 2.5.66 - works
>> 2.5.66-mm1 - doesn't work
>
> Probably you are hit by
>
> dev_t-32-bit.patch
> [for playing only] change type of dev_t
[...]
> You can revert this single patch and probably all will be fine.
For now I've reverted this patch and LVM is working again.
> More interesting would be to apply
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=103956089203199&w=3
I'd rather not change the ioctl interface, since that would make dual
booting with 2.5-vanilla harder.
--
Erik Hensema <erik@hensema.net>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 14:33 ` Erik Hensema
@ 2003-03-26 16:03 ` Andries Brouwer
2003-03-26 17:43 ` Joe Thornber
2003-03-26 18:47 ` Joel Becker
0 siblings, 2 replies; 27+ messages in thread
From: Andries Brouwer @ 2003-03-26 16:03 UTC (permalink / raw)
To: Erik Hensema; +Cc: linux-kernel
On Wed, Mar 26, 2003 at 03:33:26PM +0100, Erik Hensema wrote:
> > You can revert this single patch and probably all will be fine.
>
> For now I've reverted this patch and LVM is working again.
Good.
> > More interesting would be to apply
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=103956089203199&w=3
>
> I'd rather not change the ioctl interface, since that would make dual
> booting with 2.5-vanilla harder.
The ioctl has a version field:
struct dm_ioctl {
uint32_t version[3];
...
and the above patch changes version 1.6.0 into 2.0.0.
With sufficiently recent user space utilities all
should work: they can find out the interface version
using the DM_VERSION ioctl, and then adapt what
they send to the kernel.
(I don't know whether such up-to-date utilities exist.)
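The adaptation step could look roughly like this in a user-space tool. This is only a sketch: the enum and pick_interface() are hypothetical names, the version triple is assumed to have already been obtained through the DM_VERSION ioctl, and treating the first element as the major number is an assumption based on the 1.6.0 to 2.0.0 change mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the two struct layouts a tool might support. */
enum dm_iface {
    DM_IFACE_UNSUPPORTED,
    DM_IFACE_V1,    /* layout used with interface version 1.x.y */
    DM_IFACE_V2     /* layout used with interface version 2.x.y */
};

/* Choose a layout from the version triple reported by the kernel.
 * Only version[0] is inspected; the other two are ignored here. */
static enum dm_iface pick_interface(const uint32_t version[3])
{
    assert(version != NULL);

    switch (version[0]) {
    case 1:
        return DM_IFACE_V1;
    case 2:
        return DM_IFACE_V2;
    default:
        return DM_IFACE_UNSUPPORTED;
    }
}
```

A tool written this way keeps working on both a 2.5-vanilla kernel and a patched one, which is the dual-boot case raised earlier in the thread.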
Andries
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 16:03 ` Andries Brouwer
@ 2003-03-26 17:43 ` Joe Thornber
2003-03-26 18:47 ` Joel Becker
1 sibling, 0 replies; 27+ messages in thread
From: Joe Thornber @ 2003-03-26 17:43 UTC (permalink / raw)
To: Andries Brouwer; +Cc: Erik Hensema, linux-kernel
On Wednesday, March 26, 2003, at 04:03 PM, Andries Brouwer wrote:
> (I don't know whether such up-to-date utilities exist.)
Alasdair Kergon should be making a new release of the dm utilities in
the next couple of days. Once this has been done we will be free to
fix the broken ioctl interface in 2.5.
- Joe
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 16:03 ` Andries Brouwer
2003-03-26 17:43 ` Joe Thornber
@ 2003-03-26 18:47 ` Joel Becker
2003-03-26 20:52 ` Andries Brouwer
1 sibling, 1 reply; 27+ messages in thread
From: Joel Becker @ 2003-03-26 18:47 UTC (permalink / raw)
To: Andries Brouwer; +Cc: Erik Hensema, linux-kernel
On Wed, Mar 26, 2003 at 05:03:50PM +0100, Andries Brouwer wrote:
> With sufficiently recent user space utilities all
> should work: they can find out the interface version
> using the DM_VERSION ioctl, and then adapt what
> they send to the kernel.
We need to start tracking down what userspace needs fixing
still. We also should iron out our representations. eg, hpa's
recommendation for 64bits, or the 12/20 split for 32bit, or etc.
Joel
--
Life's Little Instruction Book #451
"Don't be afraid to say, 'I'm sorry.'"
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 18:47 ` Joel Becker
@ 2003-03-26 20:52 ` Andries Brouwer
2003-03-26 21:12 ` Joel Becker
2003-03-28 2:08 ` Dave Jones
0 siblings, 2 replies; 27+ messages in thread
From: Andries Brouwer @ 2003-03-26 20:52 UTC (permalink / raw)
To: Joel Becker; +Cc: Erik Hensema, linux-kernel
On Wed, Mar 26, 2003 at 10:47:23AM -0800, Joel Becker wrote:
> We need to start tracking down what userspace needs fixing.
My current series of patches is for the ioctls that use a
structure with dev_t field. If someone has time to burn,
or has automated tools that can identify these, that would
be good.
There is a double audit: find these ioctls, and then find
the userspace tools that use them.
For example, struct umsdos_ioctl has twice dev_t followed
by padding. Probably these should become unsigned longs.
I'll send a patch later tonight.
Is it used anywhere? That requires detective work.
It is used by the utilities udosctl (a useless demo utility),
umssync and umssetup. I do not know of any others.
No doubt people will tell me what I overlooked.
Less conservative people will tell me that umsdos has to
be killed entirely.
In old posts and other letters I have mentioned some more ioctls.
The list is not long but they have to be examined one by one,
and in some cases correspondence with authors/maintainers
is required.
> We also should iron out our representations. eg, hpa's
> recommendation for 64bits, or the 12/20 split for 32bit, or etc.
There is no hurry. These changes are just editing a few lines
in kdev_t.h. I tend to prefer 64 bits, like hpa.
Maybe I should send another patch tonight, just for playing.
Andries
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 20:52 ` Andries Brouwer
@ 2003-03-26 21:12 ` Joel Becker
2003-03-28 2:08 ` Dave Jones
1 sibling, 0 replies; 27+ messages in thread
From: Joel Becker @ 2003-03-26 21:12 UTC (permalink / raw)
To: Andries Brouwer; +Cc: Erik Hensema, linux-kernel
On Wed, Mar 26, 2003 at 09:52:28PM +0100, Andries Brouwer wrote:
> > We also should iron out our representations. eg, hpa's
> > recommendation for 64bits, or the 12/20 split for 32bit, or etc.
>
> There is no hurry. These changes are just editing a few lines
> in kdev_t.h. I tend to prefer 64 bits, like hpa.
> Maybe I should send another patch tonight, just for playing.
Please, I'd like that. It does actually matter, because glibc
and mknod (to name a couple) have to pass a proper dev_t for the new
format (glibc actually does an explicit conversion to 8:8 in
sysdeps/sysv/linux/xmknod.c, which we need to fix to the proper
mapping).
Stuff like that.
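To make the representation question concrete, here is a sketch of the 8:8 packing next to one possible 12/20 packing. The function names are hypothetical, and putting the major number in the high bits of the 32-bit value is an assumption; the thread has not settled on a layout.

```c
#include <assert.h>
#include <stdint.h>

/* Old style: 8 bits major, 8 bits minor. */
static uint16_t encode_8_8(unsigned int major, unsigned int minor)
{
    assert(major < (1u << 8) && minor < (1u << 8));
    return (uint16_t)((major << 8) | minor);
}

/* One possible 12/20 split: 12 bits major on top, 20 bits minor below. */
static uint32_t encode_12_20(unsigned int major, unsigned int minor)
{
    assert(major < (1u << 12) && minor < (1u << 20));
    return ((uint32_t)major << 20) | (uint32_t)minor;
}

static unsigned int major_12_20(uint32_t dev)
{
    return (unsigned int)(dev >> 20);
}

static unsigned int minor_12_20(uint32_t dev)
{
    return (unsigned int)(dev & ((1u << 20) - 1));
}
```

For the device-mapper control node mentioned earlier (major 10, minor 63), the 8:8 form is 0x0a3f while this 12/20 form is 0x00a0003f, so a library that still converts to 8:8 before calling mknod passes the wrong number to a kernel expecting the new format.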
Joel
--
"This is the end, beautiful friend.
This is the end, my only friend the end
Of our elaborate plans, the end
Of everything that stands, the end
No safety or surprise, the end
I'll never look into your eyes again."
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
2003-03-26 9:38 ` 2.5.66-mm1 Andrew Morton
@ 2003-03-28 2:06 ` Ed Tomlinson
-1 siblings, 0 replies; 27+ messages in thread
From: Ed Tomlinson @ 2003-03-28 2:06 UTC (permalink / raw)
To: Andrew Morton, linux-kernel, linux-mm
Hi Andrew,
Got this oops after about 20 hours with mm1 (65-mm3 lasted 5 days
until I rebooted).
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c011516d
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c011516d>] Not tainted VLI
EFLAGS: 00010097
EIP is at schedule+0x8d/0x3a0
eax: 00000001 ebx: cf5e99c0 ecx: cf5e99c0 edx: ffffffff
esi: 00000000 edi: c031de00 ebp: cf5ebf08 esp: cf5ebef0
ds: 007b es: 007b ss: 0068
Process newsplex (pid: 1205, threadinfo=cf5ea000 task=cf5e99c0)
Stack: c011fbd7 c02bbc40 00000246 05261e41 cf5ebf14 cf5ebf50 cf5ebf3c c0120754
cf5ebf14 c02bc538 c02bc538 05261e41 4b87ad6e c01206e0 cf5e99c0 c02bbc40
c015abd6 000007d1 00000000 cf5ebf60 c015ac19 cf5ea000 cf5ea000 00000000
Call Trace:
[<c011fbd7>] add_timer+0x57/0xa0
[<c0120754>] schedule_timeout+0x54/0xa0
[<c01206e0>] process_timeout+0x0/0x20
[<c015abd6>] do_poll+0x56/0xc0
[<c015ac19>] do_poll+0x99/0xc0
[<c015ad88>] sys_poll+0x148/0x220
[<c013eb3b>] sys_mprotect+0x21b/0x22f
[<c01079ec>] sys_clone+0x2c/0x60
[<c015a200>] __pollwait+0x0/0xc0
[<c0109277>] syscall_call+0x7/0xb
Code: 40 17 04 75 4d 8b 03 85 c0 74 47 48 0f 84 da 02 00 00 ff 0d 00 de 31 c0 8b 43 68 ff 08 8b 03 83 f8 02 0f 84 b6 02 00 00 8b 73 28 <ff> 4e 00 8b 53 24 8b 43 20 89 50 04 89 02 8b 4b 18 8d 14 ce 8d
<6>note: newsplex[1205] exited with preempt_count 2
Debug: sleeping function called from illegal context at include/linux/rwsem.h:43
Call Trace:
[<c01168d3>] __might_sleep+0x53/0x60
[<c01198d5>] profile_exit_task+0x15/0x60
[<c011aee6>] do_exit+0x86/0x460
[<c0109ab5>] die+0x75/0x80
[<c0113854>] do_page_fault+0x134/0x45e
[<c0114798>] try_to_wake_up+0x138/0x240
[<c011fde4>] mod_timer+0x124/0x180
[<c012a520>] nanosleep_wake_up+0x0/0x20
[<c0131feb>] buffered_rmqueue+0xab/0x140
[<c0132103>] __alloc_pages+0x83/0x280
[<c0113720>] do_page_fault+0x0/0x45e
[<c01094dd>] error_code+0x2d/0x40
[<c011516d>] schedule+0x8d/0x3a0
[<c011fbd7>] add_timer+0x57/0xa0
[<c0120754>] schedule_timeout+0x54/0xa0
[<c01206e0>] process_timeout+0x0/0x20
[<c015abd6>] do_poll+0x56/0xc0
[<c015ac19>] do_poll+0x99/0xc0
[<c015ad88>] sys_poll+0x148/0x220
[<c013eb3b>] sys_mprotect+0x21b/0x22f
[<c01079ec>] sys_clone+0x2c/0x60
[<c015a200>] __pollwait+0x0/0xc0
[<c0109277>] syscall_call+0x7/0xb
Hope this helps
Ed Tomlinson
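For readers decoding the "Oops: 0002" line above: on i386 the number is the page-fault error code, and its low three bits can be unpacked as in the sketch below. The struct and function names are hypothetical; only the bit meanings come from the x86 architecture.

```c
#include <assert.h>

/* Hypothetical holder for the three low bits of an x86 page-fault error code. */
struct pf_error {
    int protection_fault;  /* bit 0: 1 = protection violation, 0 = page not present */
    int write;             /* bit 1: 1 = write access, 0 = read access */
    int user_mode;         /* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

static struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;

    assert((code & ~0x7UL) == 0);  /* this sketch only handles the low three bits */
    e.protection_fault = (int)(code & 1);
    e.write            = (int)((code >> 1) & 1);
    e.user_mode        = (int)((code >> 2) & 1);
    return e;
}
```

Error code 0002 therefore means a kernel-mode write to a page that is not present, which matches the NULL pointer dereference reported at address 00000000.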
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
2003-03-26 20:52 ` Andries Brouwer
2003-03-26 21:12 ` Joel Becker
@ 2003-03-28 2:08 ` Dave Jones
1 sibling, 0 replies; 27+ messages in thread
From: Dave Jones @ 2003-03-28 2:08 UTC (permalink / raw)
To: Andries Brouwer; +Cc: Joel Becker, Erik Hensema, linux-kernel
On Wed, Mar 26, 2003 at 09:52:28PM +0100, Andries Brouwer wrote:
> For example, struct umsdos_ioctl has twice dev_t followed
> by padding. Probably these should become unsigned longs.
> I'll send a patch later tonight.
>
> Is it used anywhere? That requires detective work.
> It is used by the utilities udosctl (a useless demo utility),
> umssync and umssetup. I do not know of any others.
> No doubt people will tell me what I overlooked.
> Less conservative people will tell me that umsdos has to
> be killed entirely.
Isn't it still horribly broken? I remember Al putting it on
the "To be fixed later" burner, but I never saw anything happen
to it after that aside from janitor-style fixes.
Dave
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
2003-03-28 2:06 ` 2.5.66-mm1 Ed Tomlinson
@ 2003-03-28 4:59 ` Andrew Morton
-1 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2003-03-28 4:59 UTC (permalink / raw)
To: Ed Tomlinson; +Cc: linux-kernel, linux-mm, Ingo Molnar
Ed Tomlinson <tomlins@cam.org> wrote:
>
> Hi Andrew,
>
> Got this oops after about 20 hours with mm1 (65-mm3 lasted 5 days
> until I rebooted).
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> printing eip:
> c011516d
> *pde = 00000000
> Oops: 0002 [#1]
> CPU: 0
> EIP: 0060:[<c011516d>] Not tainted VLI
> EFLAGS: 00010097
> EIP is at schedule+0x8d/0x3a0
> eax: 00000001 ebx: cf5e99c0 ecx: cf5e99c0 edx: ffffffff
> esi: 00000000 edi: c031de00 ebp: cf5ebf08 esp: cf5ebef0
> ds: 007b es: 007b ss: 0068
> Process newsplex (pid: 1205, threadinfo=cf5ea000 task=cf5e99c0)
> Stack: c011fbd7 c02bbc40 00000246 05261e41 cf5ebf14 cf5ebf50 cf5ebf3c c0120754
> cf5ebf14 c02bc538 c02bc538 05261e41 4b87ad6e c01206e0 cf5e99c0 c02bbc40
> c015abd6 000007d1 00000000 cf5ebf60 c015ac19 cf5ea000 cf5ea000 00000000
> Call Trace:
> [<c011fbd7>] add_timer+0x57/0xa0
> [<c0120754>] schedule_timeout+0x54/0xa0
> [<c01206e0>] process_timeout+0x0/0x20
> [<c015abd6>] do_poll+0x56/0xc0
> [<c015ac19>] do_poll+0x99/0xc0
> [<c015ad88>] sys_poll+0x148/0x220
> [<c013eb3b>] sys_mprotect+0x21b/0x22f
> [<c01079ec>] sys_clone+0x2c/0x60
> [<c015a200>] __pollwait+0x0/0xc0
> [<c0109277>] syscall_call+0x7/0xb
>
> Code: 40 17 04 75 4d 8b 03 85 c0 74 47 48 0f 84 da 02 00 00 ff 0d 00 de 31 c0 8b 43 68 ff 08 8b 03 83 f8 02 0f 84 b6 02 00 00 8b 73 28 <ff> 4e 00 8b 53 24 8b 43 20 89 50 04 89 02 8b 4b 18 8d 14 ce 8d
That longer Code: line is really handy.
You died in schedule()->deactivate_task()->dequeue_task().
static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
{
array->nr_active--;
`array' is zero.
I'm going to Cc Ingo and run away. Ed uses preempt.
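For readers following the decode: the byte marked <ff> in the Code: line is where EIP pointed, so the faulting instruction starts with ff 4e 00. The sketch below splits the second byte (the x86 ModRM byte) into its fields. The helper names (decode_modrm, reg32_name, is_decl_mem) are made up for this note and are not kernel or binutils APIs, and the register table deliberately ignores the SIB and disp32 special cases.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm {
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;

    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm = byte & 0x7;
    return m;
}

/*
 * 32-bit register selected by the rm field. Simplification: rm == 4
 * (SIB byte follows) and mod == 0 with rm == 5 (disp32) are special
 * cases in real encodings and are not handled here.
 */
static const char *reg32_name(unsigned rm)
{
    static const char *const names[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
    };

    assert(rm < 8);
    return names[rm];
}

/* Opcode ff with reg field 1 is DEC r/m32; mod != 3 means a memory operand. */
static int is_decl_mem(uint8_t opcode, uint8_t modrm_byte)
{
    struct modrm m = decode_modrm(modrm_byte);

    return opcode == 0xff && m.reg == 1 && m.mod != 3;
}
```

With the bytes from the oops, decode_modrm(0x4e) gives mod=1, reg=1, rm=6, i.e. a decrement of the word at 0x0(%esi) with an 8-bit displacement of 0. The register dump shows esi: 00000000, which fits array->nr_active-- with a NULL array, assuming nr_active sits at offset 0 of the structure.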
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
2003-03-28 4:59 ` 2.5.66-mm1 Andrew Morton
@ 2003-03-28 10:45 ` Ingo Molnar
-1 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2003-03-28 10:45 UTC (permalink / raw)
To: Andrew Morton; +Cc: Ed Tomlinson, linux-kernel, linux-mm, Mike Galbraith
On Thu, 27 Mar 2003, Andrew Morton wrote:
> That longer Code: line is really handy.
>
> You died in schedule()->deactivate_task()->dequeue_task().
>
> static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
> {
> array->nr_active--;
>
> `array' is zero.
>
> I'm going to Cc Ingo and run away. Ed uses preempt.
hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
we deactivate a task, we remove it from the runqueue and set p->array to
NULL. Whenever we activate a task again, we set p->array to non-NULL. A
double-deactivate is not possible. I tried to reproduce it with various
scheduler workloads, but didn't succeed.
Mike, do you have a backtrace of the crash you saw?
Ingo
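The invariant described above (p->array is NULL exactly when the task is off the runqueue) can be modelled in userspace roughly as follows. This is only a toy sketch: toy_task, toy_prio_array, toy_activate and toy_deactivate are hypothetical names, there is no locking, and the real kernel/sched.c does considerably more. The point is that a second deactivate finds the pointer already NULL, which is the situation the oops shows.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for prio_array_t and task_struct (hypothetical, userspace only). */
struct toy_prio_array {
    int nr_active;
};

struct toy_task {
    struct toy_prio_array *array;   /* NULL while the task is not queued */
};

/* Queue the task: the array pointer goes from NULL to non-NULL. */
static void toy_activate(struct toy_task *p, struct toy_prio_array *array)
{
    assert(p->array == NULL);       /* a double-activate would trip here */
    array->nr_active++;
    p->array = array;
}

/*
 * Dequeue the task: the array pointer goes back to NULL.
 * Returns 0 on success, or -1 if the task was not queued. The kernel
 * code has no such check, so there a double-deactivate dereferences NULL.
 */
static int toy_deactivate(struct toy_task *p)
{
    if (p->array == NULL)
        return -1;
    p->array->nr_active--;
    p->array = NULL;
    return 0;
}
```

In this model a sequence of activate, deactivate, deactivate returns 0 and then -1. The unchecked equivalent of the second call is the array->nr_active-- on a NULL pointer seen in the trace.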
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
[not found] ` <Pine.LNX.4.44.0303281139500.6678-100000@localhost.localdomain>
@ 2003-03-28 14:26 ` Mike Galbraith
0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 14:26 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm
At 11:45 AM 3/28/2003 +0100, Ingo Molnar wrote:
>On Thu, 27 Mar 2003, Andrew Morton wrote:
>
> > That longer Code: line is really handy.
> >
> > You died in schedule()->deactivate_task()->dequeue_task().
> >
> > static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
> > {
> > array->nr_active--;
> >
> > `array' is zero.
> >
> > I'm going to Cc Ingo and run away. Ed uses preempt.
>
>hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
>we deactivate a task, we remove it from the runqueue and set p->array to
>NULL. Whenever we activate a task again, we set p->array to non-NULL. A
>double-deactivate is not possible. I tried to reproduce it with various
>scheduler workloads, but didn't succeed.
>
>Mike, do you have a backtrace of the crash you saw?
No, I didn't save it due to "grubby fingerprints".
-Mike
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
2003-03-28 14:26 ` 2.5.66-mm1 Mike Galbraith
@ 2003-03-28 14:56 ` Zwane Mwaikambo
-1 siblings, 0 replies; 27+ messages in thread
From: Zwane Mwaikambo @ 2003-03-28 14:56 UTC (permalink / raw)
To: Mike Galbraith
Cc: Ingo Molnar, Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm
On Fri, 28 Mar 2003, Mike Galbraith wrote:
> >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> >we deactivate a task, we remove it from the runqueue and set p->array to
> >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> >double-deactivate is not possible. I tried to reproduce it with various
> >scheduler workloads, but didn't succeed.
> >
> >Mike, do you have a backtrace of the crash you saw?
>
> No, I didn't save it due to "grubby fingerprints".
Hmm, I think I may have hit this one, but I never posted it because I was
unable to reproduce it on a vanilla kernel or on the same kernel afterwards
(which was hacked, so I won't vouch for its cleanliness). I think preempt
might have bitten him in a bad place (mine is also CONFIG_PREEMPT). Is it
possible that when we did the task_rq_unlock we got preempted, and when we
got back we used the local variable requeue_waker, which was set before
dropping the lock and therefore might no longer be valid due to
scheduler decisions made after dropping the runqueue lock?
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c011b8d9
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c011b8d9>] Not tainted
EFLAGS: 00010046
EIP is at try_to_wake_up+0x1e9/0x4f0
eax: c055a000 ebx: c04e5aa0 ecx: c0552fc0 edx: c04e5aa0
esi: 00000000 edi: 00000000 ebp: c055bee4 esp: c055beb8
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001 00000002
00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001 00000000
c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc 00000001
Call Trace:
[<c011d172>] __wake_up_common+0x32/0x60
[<c011d203>] __wake_up+0x63/0xb0
[<c0122fb5>] release_console_sem+0x165/0x170
[<c0122d7b>] printk+0x1eb/0x270
[<c015e210>] invalidate_bh_lru+0x0/0x60
[<c015e210>] invalidate_bh_lru+0x0/0x60
[<c015e210>] invalidate_bh_lru+0x0/0x60
[<c01163f2>] smp_call_function_interrupt+0x42/0xb0
[<c015e210>] invalidate_bh_lru+0x0/0x60
[<c0106eb0>] default_idle+0x0/0x40
[<c010a41a>] call_function_interrupt+0x1a/0x20
[<c0106eb0>] default_idle+0x0/0x40
[<c0106ede>] default_idle+0x2e/0x40
[<c0106f6a>] cpu_idle+0x3a/0x50
[<c0105000>] rest_init+0x0/0x80
Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d
0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
277 /*
278 * Adding/removing a task to/from a priority array:
279 */
280 static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
281 {
282 array->nr_active--;
283 list_del(&p->run_list);
284 if (list_empty(array->queue + p->prio))
285 __clear_bit(p->prio, array->bitmap);
286 }
(gdb) list *__wake_up_common+0x32
0xc011d1b2 is in __wake_up_common (kernel/sched.c:1424).
1419 list_for_each_safe(tmp, next, &q->task_list) {
1420 wait_queue_t *curr;
1421 unsigned flags;
1422 curr = list_entry(tmp, wait_queue_t, task_list);
1423 flags = curr->flags;
1424 if (curr->func(curr, mode, sync) &&
1425 (flags & WQ_FLAG_EXCLUSIVE) &&
1426 !--nr_exclusive)
1427 break;
1428 }
(gdb) list *__wake_up+0x62
0xc011d242 is in __wake_up (kernel/sched.c:1445).
1440
1441 if (unlikely(!q))
1442 return;
1443
1444 spin_lock_irqsave(&q->lock, flags);
1445 __wake_up_common(q, mode, nr_exclusive, 0);
1446 spin_unlock_irqrestore(&q->lock, flags);
1447 }
1448
1449 /*
--
function.linuxpower.ca
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
2003-03-28 14:56 ` 2.5.66-mm1 Zwane Mwaikambo
@ 2003-03-28 15:25 ` Ingo Molnar
-1 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2003-03-28 15:25 UTC (permalink / raw)
To: Zwane Mwaikambo
Cc: Mike Galbraith, Andrew Morton, Ed Tomlinson, linux-kernel,
linux-mm
On Fri, 28 Mar 2003, Zwane Mwaikambo wrote:
> Hmm, I think I may have hit this one, but I never posted it because I
> was unable to reproduce it on a vanilla kernel or on the same kernel
> afterwards (which was hacked, so I won't vouch for its cleanliness). I
> think preempt might have bitten him in a bad place (mine is also
> CONFIG_PREEMPT). Is it possible that when we did the task_rq_unlock we
> got preempted, and when we got back we used the local variable
> requeue_waker, which was set before dropping the lock and therefore
> might no longer be valid due to scheduler decisions made after
> dropping the runqueue lock?
yes, this one was my only suspect, but it should really never cause any
problems. We might change sleep_avg during the wakeup, and carry the
requeue_waker flag over a preemptible window, but the requeueing itself
re-takes the runqueue lock, and does not take anything for granted. The
flag could very well be random as well, and the code should still be
correct - there's no requirement to recalculate the priority every time we
change sleep_avg. (in fact we at times intentionally keep those values
detached.)
Ingo
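The argument above, that a flag carried across a preemptible window is harmless as long as the consumer re-takes the lock and re-checks the state, can be sketched in userspace like this. It is a single-threaded toy model, not the code in kernel/sched.c: model_rq, model_task, model_lock, model_unlock and model_requeue_if_needed are hypothetical names, and the "lock" is just an integer used to show where the critical section is.

```c
#include <assert.h>
#include <stddef.h>

/* Toy runqueue with a fake lock (hypothetical, single-threaded model). */
struct model_rq {
    int locked;
};

struct model_task {
    struct model_rq *rq;
    int queued;     /* 1 while the task is on the runqueue */
    int prio;
};

static void model_lock(struct model_rq *rq)
{
    assert(!rq->locked);
    rq->locked = 1;
}

static void model_unlock(struct model_rq *rq)
{
    assert(rq->locked);
    rq->locked = 0;
}

/*
 * requeue_hint was computed earlier, possibly before a preemption, so
 * it is treated only as a hint. Everything the requeue depends on is
 * re-checked under the lock. Returns 1 if the task was requeued.
 */
static int model_requeue_if_needed(struct model_task *p, int requeue_hint,
                                   int new_prio)
{
    int requeued = 0;

    if (!requeue_hint)
        return 0;

    model_lock(p->rq);
    if (p->queued && p->prio != new_prio) {
        p->prio = new_prio;     /* stands in for dequeue + enqueue */
        requeued = 1;
    }
    model_unlock(p->rq);
    return requeued;
}
```

If the task was dequeued while the hint was in flight, the call is a no-op and returns 0. A stale or even random hint can cost a wasted lock round-trip at worst, but in this model it cannot operate on a task that is no longer queued.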
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
[not found] ` <Pine.LNX.4.50.0303280942420.2884-100000@montezuma.mastecende.com>
@ 2003-03-28 16:01 ` Mike Galbraith
0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 16:01 UTC (permalink / raw)
To: Zwane Mwaikambo
Cc: Ingo Molnar, Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm
[-- Attachment #1: Type: text/plain, Size: 3414 bytes --]
At 09:56 AM 3/28/2003 -0500, Zwane Mwaikambo wrote:
>On Fri, 28 Mar 2003, Mike Galbraith wrote:
>
> > >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> > >we deactivate a task, we remove it from the runqueue and set p->array to
> > >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> > >double-deactivate is not possible. I tried to reproduce it with various
> > >scheduler workloads, but didn't succeed.
> > >
> > >Mike, do you have a backtrace of the crash you saw?
> >
> > No, I didn't save it due to "grubby fingerprints".
>
>Hmm, I think I may have hit this one, but I never posted it because I was
>unable to reproduce it on a vanilla kernel or on the same kernel afterwards
>(which was hacked, so I won't vouch for its cleanliness). I think preempt
>might have bitten him in a bad place (mine is also CONFIG_PREEMPT). Is it
>possible that when we did the task_rq_unlock we got preempted, and when we
>got back we used the local variable requeue_waker, which was set before
>dropping the lock and therefore might no longer be valid due to
>scheduler decisions made after dropping the runqueue lock?
Dunno. I did have one lying around. The attached one happened while printing
out array switch latency after a starvation timeout. Others happened while
printing wakeup stats for p->state > 1 tasks in scheduler_tick() [under
lock w/ wakeup disabled in printk.c]. I don't think it's anything I did to
the scheduler ;-) but this was in 65-mm3-twiddle-twiddle-twiddle.
>Unable to handle kernel NULL pointer dereference at virtual address 00000000
> printing eip:
>c011b8d9
>*pde = 00000000
>Oops: 0000 [#1]
>CPU: 0
>EIP: 0060:[<c011b8d9>] Not tainted
>EFLAGS: 00010046
>EIP is at try_to_wake_up+0x1e9/0x4f0
>eax: c055a000 ebx: c04e5aa0 ecx: c0552fc0 edx: c04e5aa0
>esi: 00000000 edi: 00000000 ebp: c055bee4 esp: c055beb8
>ds: 007b es: 007b ss: 0068
>Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
>Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001 00000002
> 00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001 00000000
> c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc 00000001
>Call Trace:
> [<c011d172>] __wake_up_common+0x32/0x60
> [<c011d203>] __wake_up+0x63/0xb0
> [<c0122fb5>] release_console_sem+0x165/0x170
> [<c0122d7b>] printk+0x1eb/0x270
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c01163f2>] smp_call_function_interrupt+0x42/0xb0
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c0106eb0>] default_idle+0x0/0x40
> [<c010a41a>] call_function_interrupt+0x1a/0x20
> [<c0106eb0>] default_idle+0x0/0x40
> [<c0106ede>] default_idle+0x2e/0x40
> [<c0106f6a>] cpu_idle+0x3a/0x50
> [<c0105000>] rest_init+0x0/0x80
>
>Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d
>
>0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
>277 /*
>278 * Adding/removing a task to/from a priority array:
>279 */
>280 static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
>281 {
>282 array->nr_active--;
>283 list_del(&p->run_list);
>284 if (list_empty(array->queue + p->prio))
>285 __clear_bit(p->prio, array->bitmap);
>286 }
Same spot.
-Mike
[-- Attachment #2: oops.txt --]
[-- Type: text/plain, Size: 3183 bytes --]
Loglevel set to 9
hmm.. 289 ms
hmm.. 6 ms
hmm.. 4 ms
hmm.. 7 ms
hmm.. 13 ms
hmm.. 15 ms
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0114d0a
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c0114d0a>] Not tainted VLI
EFLAGS: 00010006
EIP is at try_to_wake_up+0x1e2/0x258
eax: 00000008 ebx: c02cb3c8 ecx: c0dcf360 edx: c0dcf360
esi: c0c24000 edi: 00000000 ebp: c0c25ed4 esp: c0c25eb8
ds: 007b es: 007b ss: 0068
Process gcc (pid: 592, threadinfo=c0c24000 task=c0dcf360)
Stack: 00000001 00000001 c0298ff4 c0c25ed0 00000001 00000001 00000002 c0c25ee8
c0115887 c7b8a0a0 00000003 00000000 c0c25f08 c01158c2 c2d81e5c 00000003
00000000 c0c24000 00000082 c0298fe8 c0c25f20 c011594a c0298ff0 00000003
Call Trace:
[<c0115887>] default_wake_function+0x17/0x1c
[<c01158c2>] __wake_up_common+0x36/0x50
[<c011594a>] __wake_up_locked+0xe/0x14
[<c0107cdc>] __down_trylock+0x34/0x54
[<c0107d1b>] __down_failed_trylock+0x7/0xc
[<c011928b>] .text.lock.printk+0x5/0x2a
[<c01155f0>] schedule+0x13c/0x378
[<c011ab07>] sys_wait4+0xab/0x234
[<c011ac5d>] sys_wait4+0x201/0x234
[<c0115870>] default_wake_function+0x0/0x1c
[<c0115870>] default_wake_function+0x0/0x1c
[<c0108b5f>] syscall_call+0x7/0xb
Code: ff 48 14 8b 40 08 a8 08 74 07 e8 3e 0b 00 00 89 f6 85 f6 74 7e 8b 55 f0 9c 8f 02 fa be 00 e0 ff ff 21 e6 ff 46 14 8b 16 8b 7a 28 <ff> 0f 8b 42 20 8b 4a 24 89 48 04 89 01 8b 52 18 8d 44 d7 18 39
(gdb) list *try_to_wake_up+0x1e2
0x26a is in try_to_wake_up (kernel/sched.c:310).
305 /*
306 * Adding/removing a task to/from a priority array:
307 */
308 static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
309 {
310 array->nr_active--;
311 list_del(&p->run_list);
312 if (list_empty(array->queue + p->prio))
313 __clear_bit(p->prio, array->bitmap);
314 }
(gdb)
<6>note: gcc[592] exited with preempt_count 5
bad: scheduling while atomic!
Call Trace:
[<c01154f0>] schedule+0x3c/0x378
[<c0134b26>] unmap_vmas+0xea/0x1e0
[<c011647b>] __cond_resched+0x17/0x1c
[<c0134b86>] unmap_vmas+0x14a/0x1e0
[<c0137fb8>] exit_mmap+0x64/0x158
[<c0116dbd>] mmput+0x55/0x74
[<c011a368>] do_exit+0x158/0x3b4
[<c0109267>] die+0x87/0x88
[<c0114068>] do_page_fault+0x2d8/0x404
[<c0113d90>] do_page_fault+0x0/0x404
[<c011b91a>] do_softirq+0x5a/0xac
[<c010a170>] do_IRQ+0xfc/0x118
[<c012e173>] __rmqueue+0xa3/0x10c
[<c012e21f>] rmqueue_bulk+0x43/0x6c
[<c0108d69>] error_code+0x2d/0x38
[<c0114d0a>] try_to_wake_up+0x1e2/0x258
[<c0115887>] default_wake_function+0x17/0x1c
[<c01158c2>] __wake_up_common+0x36/0x50
[<c011594a>] __wake_up_locked+0xe/0x14
[<c0107cdc>] __down_trylock+0x34/0x54
[<c0107d1b>] __down_failed_trylock+0x7/0xc
[<c011928b>] .text.lock.printk+0x5/0x2a
[<c01155f0>] schedule+0x13c/0x378
[<c011ab07>] sys_wait4+0xab/0x234
[<c011ac5d>] sys_wait4+0x201/0x234
[<c0115870>] default_wake_function+0x0/0x1c
[<c0115870>] default_wake_function+0x0/0x1c
[<c0108b5f>] syscall_call+0x7/0xb
hmm.. 42 ms
hmm.. 24 ms
hmm.. 33 ms
hmm.. 23 ms
hmm.. 31 ms
hmm.. 30 ms
hmm.. 30 ms
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
@ 2003-03-28 16:01 ` Mike Galbraith
0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 16:01 UTC (permalink / raw)
To: Zwane Mwaikambo
Cc: Ingo Molnar, Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm
[-- Attachment #1: Type: text/plain, Size: 3414 bytes --]
At 09:56 AM 3/28/2003 -0500, Zwane Mwaikambo wrote:
>On Fri, 28 Mar 2003, Mike Galbraith wrote:
>
> > >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> > >we deactivate a task, we remove it from the runqueue and set p->array to
> > >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> > >double-deactivate is not possible. I tried to reproduce it with various
> > >scheduler workloads, but didnt succeed.
> > >
> > >Mike, do you have a backtrace of the crash you saw?
> >
> > No, I didn't save it due to "grubby fingerprints".
>
>Hmm i think i may have his this one but i never posted due to being unable
>to reproduce it on a vanilla kernel or the same kernel afterwards (which
>was hacked so i won't vouch for it's cleanliness). I think preempt
>might have bitten him in a bad place (mine is also CONFIG_PREEMPT), is it
>possible that when we did the task_rq_unlock we got preempted and when we
>got back we used the local variable requeue_waker which was set before
>dropping the lock, and therefore might not be valid anymore due to
>scheduler decisions done after dropping the runqueue lock?
Dunno. I did have one lying around. The attached one was while printing
out array switch latency after starvation timeout. Others happened while
printing wakeup stats for p->state > 1 tasks in scheduler_tick() [under
lock w/ wakeup disabled in printk.c]. It's nothing I did to the scheduler
;-) I don't think, but this was in 65-mm3-twiddle-twiddle-twiddle.
>Unable to handle kernel NULL pointer dereference at virtual address 00000000
> printing eip:
>c011b8d9
>*pde = 00000000
>Oops: 0000 [#1]
>CPU: 0
>EIP: 0060:[<c011b8d9>] Not tainted
>EFLAGS: 00010046
>EIP is at try_to_wake_up+0x1e9/0x4f0
>eax: c055a000 ebx: c04e5aa0 ecx: c0552fc0 edx: c04e5aa0
>esi: 00000000 edi: 00000000 ebp: c055bee4 esp: c055beb8
>ds: 007b es: 007b ss: 0068
>Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
>Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001
>00000002
> 00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001
> 00000000
> c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc
> 00000001
>Call Trace:
> [<c011d172>] __wake_up_common+0x32/0x60
> [<c011d203>] __wake_up+0x63/0xb0
> [<c0122fb5>] release_console_sem+0x165/0x170
> [<c0122d7b>] printk+0x1eb/0x270
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c01163f2>] smp_call_function_interrupt+0x42/0xb0
> [<c015e210>] invalidate_bh_lru+0x0/0x60
> [<c0106eb0>] default_idle+0x0/0x40
> [<c010a41a>] call_function_interrupt+0x1a/0x20
> [<c0106eb0>] default_idle+0x0/0x40
> [<c0106ede>] default_idle+0x2e/0x40
> [<c0106f6a>] cpu_idle+0x3a/0x50
> [<c0105000>] rest_init+0x0/0x80
>
>Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d
>
>0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
>277 /*
>278 * Adding/removing a task to/from a priority array:
>279 */
>280 static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
>281 {
>282 array->nr_active--;
>283 list_del(&p->run_list);
>284 if (list_empty(array->queue + p->prio))
>285 __clear_bit(p->prio, array->bitmap);
>286 }
Same spot.
-Mike
[-- Attachment #2: oops.txt --]
[-- Type: text/plain, Size: 3183 bytes --]
Loglevel set to 9
hmm.. 289 ms
hmm.. 6 ms
hmm.. 4 ms
hmm.. 7 ms
hmm.. 13 ms
hmm.. 15 ms
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0114d0a
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c0114d0a>] Not tainted VLI
EFLAGS: 00010006
EIP is at try_to_wake_up+0x1e2/0x258
eax: 00000008 ebx: c02cb3c8 ecx: c0dcf360 edx: c0dcf360
esi: c0c24000 edi: 00000000 ebp: c0c25ed4 esp: c0c25eb8
ds: 007b es: 007b ss: 0068
Process gcc (pid: 592, threadinfo=c0c24000 task=c0dcf360)
Stack: 00000001 00000001 c0298ff4 c0c25ed0 00000001 00000001 00000002 c0c25ee8
c0115887 c7b8a0a0 00000003 00000000 c0c25f08 c01158c2 c2d81e5c 00000003
00000000 c0c24000 00000082 c0298fe8 c0c25f20 c011594a c0298ff0 00000003
Call Trace:
[<c0115887>] default_wake_function+0x17/0x1c
[<c01158c2>] __wake_up_common+0x36/0x50
[<c011594a>] __wake_up_locked+0xe/0x14
[<c0107cdc>] __down_trylock+0x34/0x54
[<c0107d1b>] __down_failed_trylock+0x7/0xc
[<c011928b>] .text.lock.printk+0x5/0x2a
[<c01155f0>] schedule+0x13c/0x378
[<c011ab07>] sys_wait4+0xab/0x234
[<c011ac5d>] sys_wait4+0x201/0x234
[<c0115870>] default_wake_function+0x0/0x1c
[<c0115870>] default_wake_function+0x0/0x1c
[<c0108b5f>] syscall_call+0x7/0xb
Code: ff 48 14 8b 40 08 a8 08 74 07 e8 3e 0b 00 00 89 f6 85 f6 74 7e 8b 55 f0 9c 8f 02 fa be 00 e0 ff ff 21 e6 ff 46 14 8b 16 8b 7a 28 <ff> 0f 8b 42 20 8b 4a 24 89 48 04 89 01 8b 52 18 8d 44 d7 18 39
(gdb) list *try_to_wake_up+0x1e2
0x26a is in try_to_wake_up (kernel/sched.c:310).
305 /*
306 * Adding/removing a task to/from a priority array:
307 */
308 static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
309 {
310 array->nr_active--;
311 list_del(&p->run_list);
312 if (list_empty(array->queue + p->prio))
313 __clear_bit(p->prio, array->bitmap);
314 }
(gdb)
<6>note: gcc[592] exited with preempt_count 5
bad: scheduling while atomic!
Call Trace:
[<c01154f0>] schedule+0x3c/0x378
[<c0134b26>] unmap_vmas+0xea/0x1e0
[<c011647b>] __cond_resched+0x17/0x1c
[<c0134b86>] unmap_vmas+0x14a/0x1e0
[<c0137fb8>] exit_mmap+0x64/0x158
[<c0116dbd>] mmput+0x55/0x74
[<c011a368>] do_exit+0x158/0x3b4
[<c0109267>] die+0x87/0x88
[<c0114068>] do_page_fault+0x2d8/0x404
[<c0113d90>] do_page_fault+0x0/0x404
[<c011b91a>] do_softirq+0x5a/0xac
[<c010a170>] do_IRQ+0xfc/0x118
[<c012e173>] __rmqueue+0xa3/0x10c
[<c012e21f>] rmqueue_bulk+0x43/0x6c
[<c0108d69>] error_code+0x2d/0x38
[<c0114d0a>] try_to_wake_up+0x1e2/0x258
[<c0115887>] default_wake_function+0x17/0x1c
[<c01158c2>] __wake_up_common+0x36/0x50
[<c011594a>] __wake_up_locked+0xe/0x14
[<c0107cdc>] __down_trylock+0x34/0x54
[<c0107d1b>] __down_failed_trylock+0x7/0xc
[<c011928b>] .text.lock.printk+0x5/0x2a
[<c01155f0>] schedule+0x13c/0x378
[<c011ab07>] sys_wait4+0xab/0x234
[<c011ac5d>] sys_wait4+0x201/0x234
[<c0115870>] default_wake_function+0x0/0x1c
[<c0115870>] default_wake_function+0x0/0x1c
[<c0108b5f>] syscall_call+0x7/0xb
hmm.. 42 ms
hmm.. 24 ms
hmm.. 33 ms
hmm.. 23 ms
hmm.. 31 ms
hmm.. 30 ms
hmm.. 30 ms
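For readers following the oops: both traces fault at virtual address 00000000 on the `array->nr_active--` line, and in the attached trace the marked bytes `<ff> 0f` decode as `decl (%edi)` with edi = 00000000. That is consistent with a NULL array pointer reaching dequeue_task(). The user-space sketch below only models that failure mode; it is not the sched.c code. `struct prio_array`, `struct task`, `enqueue_task()` and `dequeue_task_checked()` are simplified, hypothetical stand-ins, and the per-priority lists and bitmap are collapsed into plain counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PRIO 140

/* Simplified stand-in for prio_array_t (assumption: counters replace
 * the per-priority list heads and the bitmap). */
struct prio_array {
    int nr_active;            /* first member, so a NULL array faults at address 0 */
    int queued[MAX_PRIO];     /* number of tasks queued at each priority */
};

/* Simplified stand-in for task_struct. */
struct task {
    int prio;
    struct prio_array *array; /* NULL while the task is not on a runqueue */
};

/* Put a task on an array; it must not already be queued. */
static void enqueue_task(struct task *p, struct prio_array *array)
{
    assert(p->array == NULL);
    array->queued[p->prio]++;
    array->nr_active++;
    p->array = array;
}

/* Hypothetical guarded dequeue: returns false instead of dereferencing
 * a NULL array, which is where the unguarded version would oops. */
static bool dequeue_task_checked(struct task *p)
{
    struct prio_array *array = p->array;

    if (array == NULL)
        return false;

    array->nr_active--;
    array->queued[p->prio]--;
    p->array = NULL;
    return true;
}
```

Calling dequeue_task_checked() on a task that was never enqueued, or twice in a row on the same task, returns false in this model rather than touching address 0.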
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.5.66-mm1
[not found] ` <Pine.LNX.4.44.0303281619530.9943-100000@localhost.localdomain>
@ 2003-03-28 16:05 ` Mike Galbraith
0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 16:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: Zwane Mwaikambo, Andrew Morton, Ed Tomlinson, linux-kernel,
linux-mm
At 04:25 PM 3/28/2003 +0100, Ingo Molnar wrote:
>On Fri, 28 Mar 2003, Zwane Mwaikambo wrote:
>
> > Hmm i think i may have hit this one but i never posted due to being
> > unable to reproduce it on a vanilla kernel or the same kernel afterwards
> > (which was hacked so i won't vouch for its cleanliness). I think
> > preempt might have bitten him in a bad place (mine is also
> > CONFIG_PREEMPT), is it possible that when we did the task_rq_unlock we
> > got preempted and when we got back we used the local variable
> > requeue_waker which was set before dropping the lock, and therefore
> > might not be valid anymore due to scheduler decisions done after
> > dropping the runqueue lock?
>
>yes, this one was my only suspect, but it should really never cause any
>problems. We might change sleep_avg during the wakeup, and carry the
>requeue_waker flag over a preemptible window, but the requeueing itself
>re-takes the runqueue lock, and does not take anything for granted. The
>flag could very well be random as well, and the code should still be
>correct - there's no requirement to recalculate the priority every time we
>change sleep_avg. (in fact we at times intentionally keep those values
>detached.)
In my 66-twiddle tree, I moved that under the lock out of pure paranoia. I
can try to see if printing under hefty (very) load will still trigger the
occasional explosion.
-Mike
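The property Ingo describes above (a flag computed before the lock is dropped, carried across a preemptible window, and a requeue step that re-takes the lock and trusts nothing) can be sketched in user space as follows. This is a single-threaded toy model under stated assumptions, not the scheduler code: `struct runqueue`, `struct task`, `rq_lock()`, `rq_unlock()`, `effective_prio()` and `requeue_if_needed()` are all hypothetical names, and the priority formula is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy runqueue: the "lock" is a flag, enough for a single-threaded model. */
struct runqueue {
    bool locked;
};

/* Toy task: on_rq stands in for "p->array != NULL". */
struct task {
    int sleep_avg;
    int prio;
    bool on_rq;
};

static void rq_lock(struct runqueue *rq)
{
    assert(!rq->locked);
    rq->locked = true;
}

static void rq_unlock(struct runqueue *rq)
{
    assert(rq->locked);
    rq->locked = false;
}

/* Made-up priority formula, for illustration only. */
static int effective_prio(const struct task *p)
{
    return 120 - p->sleep_avg / 10;
}

/* The hint may be stale or arbitrary: it was computed before the lock
 * was dropped. The requeue step therefore re-takes the lock and
 * re-reads the task state before changing anything. */
static bool requeue_if_needed(struct runqueue *rq, struct task *p, bool hint)
{
    bool requeued = false;

    if (!hint)
        return false;

    rq_lock(rq);
    if (p->on_rq) {               /* state re-checked under the lock */
        p->prio = effective_prio(p);
        requeued = true;
    }
    rq_unlock(rq);
    return requeued;
}
```

In this model a wrong hint costs at most a skipped or redundant priority recalculation; it cannot touch a task that has left the queue in the meantime.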
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2003-03-28 16:05 UTC | newest]
Thread overview: 27+ messages
2003-03-26 9:38 2.5.66-mm1 Andrew Morton
2003-03-26 9:38 ` 2.5.66-mm1 Andrew Morton
2003-03-26 12:26 ` LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1) Erik Hensema
2003-03-26 13:48 ` Andries Brouwer
2003-03-26 14:33 ` Erik Hensema
2003-03-26 16:03 ` Andries Brouwer
2003-03-26 17:43 ` Joe Thornber
2003-03-26 18:47 ` Joel Becker
2003-03-26 20:52 ` Andries Brouwer
2003-03-26 21:12 ` Joel Becker
2003-03-28 2:08 ` Dave Jones
2003-03-28 2:06 ` 2.5.66-mm1 Ed Tomlinson
2003-03-28 2:06 ` 2.5.66-mm1 Ed Tomlinson
2003-03-28 4:59 ` 2.5.66-mm1 Andrew Morton
2003-03-28 4:59 ` 2.5.66-mm1 Andrew Morton
2003-03-28 10:45 ` 2.5.66-mm1 Ingo Molnar
2003-03-28 10:45 ` 2.5.66-mm1 Ingo Molnar
[not found] ` <Pine.LNX.4.44.0303281139500.6678-100000@localhost.localdomain>
2003-03-28 14:26 ` 2.5.66-mm1 Mike Galbraith
2003-03-28 14:26 ` 2.5.66-mm1 Mike Galbraith
2003-03-28 14:56 ` 2.5.66-mm1 Zwane Mwaikambo
2003-03-28 14:56 ` 2.5.66-mm1 Zwane Mwaikambo
2003-03-28 15:25 ` 2.5.66-mm1 Ingo Molnar
2003-03-28 15:25 ` 2.5.66-mm1 Ingo Molnar
[not found] ` <Pine.LNX.4.44.0303281619530.9943-100000@localhost.localdomain>
2003-03-28 16:05 ` 2.5.66-mm1 Mike Galbraith
2003-03-28 16:05 ` 2.5.66-mm1 Mike Galbraith
[not found] ` <Pine.LNX.4.50.0303280942420.2884-100000@montezuma.mastecende.com>
2003-03-28 16:01 ` 2.5.66-mm1 Mike Galbraith
2003-03-28 16:01 ` 2.5.66-mm1 Mike Galbraith