2.5.66-mm1

* 2.5.66-mm1
@ 2003-03-26  9:38 ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2003-03-26  9:38 UTC (permalink / raw)
  To: linux-kernel, linux-mm


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/


. The anticipatory scheduler is in wrapup mode now.  It is pretty much in
  its final form.

. The ext2 locking changes have been significantly redone.

  The per-blockgroup data structures had to go.  For a 4TB filesystem we
  cannot even kmalloc that many pointers, let alone data structures.

  So the per-blockgroup spinlocking has been replaced with hashed
  spinlocking and the per-blockgroup accounting has been removed.  A "per-cpu
  counter" thing has been invented to amortise the locking cost of the
  filesystem-wide counters.

. ext3 is now using spinlocking in its block allocator rather than a
  filesystem-wide semaphore.

  It is stability-tested but I have not yet performance tested this
  closely.  It does appear to have improved the context switch problem (and
  the file fragmentation problem which the context switch problem causes). 
  But there's a way to go here.
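
A minimal userspace sketch of the "per-cpu counter" idea described above, not the code from the ext2 patches. Every name here (`approx_counter`, `NR_SLOTS`, `BATCH`) is hypothetical, and a toy spinlock built on C11 `atomic_flag` stands in for a kernel spinlock. It assumes each slot is only ever updated by one thread, much as a per-cpu variable is only touched by its own CPU. Updates accumulate in a local slot and are folded into the shared count only once they reach a batch size, so the shared lock is taken roughly once per `BATCH` updates.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS 4   /* stand-in for the number of CPUs (assumed) */
#define BATCH    32  /* fold a slot into the shared count at this size (assumed) */

struct approx_counter {
	atomic_flag lock;      /* toy spinlock guarding count */
	long count;            /* shared, filesystem-wide value; may lag slightly */
	long local[NR_SLOTS];  /* per-slot deltas; each slot has a single writer */
};

static void approx_counter_init(struct approx_counter *c)
{
	atomic_flag_clear(&c->lock);
	c->count = 0;
	for (int i = 0; i < NR_SLOTS; i++)
		c->local[i] = 0;
}

/* Add amount on behalf of one slot; the lock is taken only on a fold. */
static void approx_counter_mod(struct approx_counter *c, int slot, long amount)
{
	long v;

	assert(slot >= 0 && slot < NR_SLOTS);
	v = c->local[slot] + amount;
	if (v >= BATCH || v <= -BATCH) {
		while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
			;	/* spin until the lock is free */
		c->count += v;
		atomic_flag_clear_explicit(&c->lock, memory_order_release);
		v = 0;
	}
	c->local[slot] = v;
}

/* Cheap read of the shared value; ignores deltas still held in slots. */
static long approx_counter_read(struct approx_counter *c)
{
	long v;

	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the lock is free */
	v = c->count;
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
	return v;
}

/* Exact value; only meaningful while no updates are in flight. */
static long approx_counter_sum(struct approx_counter *c)
{
	long v = approx_counter_read(c);

	for (int i = 0; i < NR_SLOTS; i++)
		v += c->local[i];
	return v;
}
```

The trade-off in this sketch is accuracy for lock traffic: a cheap read can be off by a bounded amount per slot, which is acceptable for statistics such as free-block counts but not for anything that needs an exact value.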




Changes since 2.5.65-mm4:


 linus.patch

 Latest -bk

-nfsd-32-bit-dev_t-fixes.patch
-i2c-fix.patch

 Merged

+kgdb-ga.patch

 George Anzinger's gdb stub

+ppa-null-pointer-fix.patch

 Might fix the parport scsi driver

+initcall-debug.patch

 Debugging support for misbehaving initcalls

+posix-timers-64-bit-fix.patch

 Timer fix for 64-bit machines

+slab-off-by-one-fix.patch

 Slab was using too much memory.

+install_page-flush_cache_page.patch

 Cache coherency bug in remap_file_pages()

+as-minor-tweaks.patch
+as-remove-stats.patch

 Anticipatory scheduler tuning and cleanups.

+posix-timer-double-expiration-fix.patch

 Posix timers were sending timer expiry info twice.

+hugh-01-no-SWAP_ERROR.patch
+hugh-02-try_to_unmap-CONFIG_SWAP.patch
+hugh-03-add_to_swap_cache.patch
+hugh-04-page_convert_anon-ENOMEM.patch
+hugh-05-page_convert_anon-unlocking.patch
+hugh-06-wrap-below-vm_start.patch
+hugh-07-objrmap-page_table_lock.patch
+hugh-08-rmap-comments.patch
+hugh-09-tmpfs-truncation.patch
+hugh-10-tmpfs-atomics.patch
+hugh-11-fix-unuse_pmd-fixme.patch
+hugh-12-vm_enough_memory-double-counts.patch

 Various vm/mm fixes and cleanups

+ext3-max-file-size-fix.patch

 Allow ext3 to create files larger than 32GB (should be nearly 2TB)

-ext2-no-lock_super.patch
-ext2-ialloc-no-lock_super.patch
+ext2-no-lock_super-ng.patch
+ext2-ialloc-no-lock_super-ng.patch

 Rework the ext2 block and inode allocator locking changes.

+dev_t-remove-B_FREE.patch

 Remove B_FREE.

+tty_io-cleanup.patch
+page_to_pfn-in-blk_queue_bounce.patch
+init_inode_once-bloat-fix.patch

 Cleanups and fixlets

+compound-page-warning-fix.patch

 Fix a warning

+slab-cache-sizes-cleanup.patch

 Unduplicate some tables in slab.

+stat_t-larger-dev_t.patch

 Large dev_t fix.

+acpi-build-fix.patch

 make acpi compile.

+sync_blockdev-on-final-close.patch

 Only write out blockdev mappings on the final close.

+ext3-concurrent-block-inode-allocation.patch
+ext3-concurrent-block-allocation-fix-1.patch

 Use spinlocking in the ext3 block allocator, not an fs-wide semaphore.



All 104 patches:

linus.patch

mm.patch
  add -mmN to EXTRAVERSION

kgdb-ga.patch
  kgdb stub for ia32 (George Anzinger's one)

ppa-null-pointer-fix.patch

initcall-debug.patch
  initcall debugging support

posix-timers-64-bit-fix.patch
  POSIX timers interface long/int cleanup

slab-off-by-one-fix.patch
  slab: fix off-by-one in size calculation

config_spinline.patch
  uninline spinlocks for profiling accuracy.

ppc64-reloc_hide.patch

ppc64-pci-patch.patch
  Subject: pci patch

ppc64-aio-32bit-emulation.patch
  32/64bit emulation for aio

ppc64-scruffiness.patch
  Fix some PPC64 compile warnings

sym-do-160.patch
  make the SYM driver do 160 MB/sec

install_page-flush_cache_page.patch
  add flush_cache_page() to install_page()

config-PAGE_OFFSET.patch
  Configurable kernel/user memory split

ptrace-flush.patch
  cache flushing in the ptrace code

buffer-debug.patch
  buffer.c debugging

warn-null-wakeup.patch

ext3-truncate-ordered-pages.patch
  ext3: explicitly free truncated pages

reiserfs_file_write-5.patch

rcu-stats.patch
  RCU statistics reporting

ext3-journalled-data-assertion-fix.patch
  Remove incorrect assertion from ext3

nfs-speedup.patch

nfs-oom-fix.patch
  nfs oom fix

sk-allocation.patch
  Subject: Re: nfs oom

nfs-more-oom-fix.patch

rpciod-atomic-allocations.patch
  Make rpciod use atomic allocations

linux-isp.patch

isp-update-1.patch

kblockd.patch
  Create `kblockd' workqueue

as-iosched.patch
  anticipatory I/O scheduler

as-np-reads-1.patch
  AS: read-vs-read fixes

as-np-reads-2.patch
  AS: more read-vs-read fixes

as-predict-data-direction.patch
  as: predict direction of next IO

as-remove-frontmerge.patch
  AS: remove frontmerge tunable

as-misc-cleanups.patch
  AS: misc cleanups

as-minor-tweaks.patch
  AS: tuning and tweaks

as-remove-stats.patch
  AS: remove statistics

cfq-2.patch
  CFQ scheduler, #2

unplug-use-kblockd.patch
  Use kblockd for running request queues

fremap-all-mappings.patch
  Make all executable mappings be nonlinear

objrmap-2.5.62-5.patch
  object-based rmap

sched-2.5.64-D3.patch
  sched-2.5.64-D3, more interactivity changes

scheduler-tunables.patch
  scheduler tunables

show_task-free-stack-fix.patch
  show_task() fix and cleanup

yellowfin-set_bit-fix.patch
  yellowfin driver set_bit fix

htree-nfs-fix.patch
  Fix ext3 htree / NFS compatibility problems

task_prio-fix.patch
  simple task_prio() fix

slab_store_user-large-objects.patch
  slab debug: perform redzoning against larger objects

pcmcia-2.patch

pcmcia-3b.patch

pcmcia-3.patch

pcmcia-4.patch

pcmcia-5.patch

pcmcia-6.patch

pcmcia-7b.patch

pcmcia-7.patch

pcmcia-8.patch

pcmcia-9.patch

pcmcia-10.patch

htree-nfs-fix-2.patch
  htree nfs fix

posix-timer-double-expiration-fix.patch
  posix timers: fix double-reporting of timer expiration

hugh-01-no-SWAP_ERROR.patch
  swap 01/13 no SWAP_ERROR

hugh-02-try_to_unmap-CONFIG_SWAP.patch
  Subject: [PATCH] swap 02/13 !CONFIG_SWAP try_to_unmap

hugh-03-add_to_swap_cache.patch
  swap 03/13 add_to_swap_cache

hugh-04-page_convert_anon-ENOMEM.patch
  swap 04/13 page_convert_anon -ENOMEM

hugh-05-page_convert_anon-unlocking.patch
  swap 05/13 page_convert_anon unlocking

hugh-06-wrap-below-vm_start.patch
  swap 06/13 wrap below vm_start

hugh-07-objrmap-page_table_lock.patch
  swap 07/13 objrmap page_table_lock

hugh-08-rmap-comments.patch
  swap 08/13 rmap comments

hugh-09-tmpfs-truncation.patch
  swap 09/13 tmpfs truncation

hugh-10-tmpfs-atomics.patch
  swap 10/13 tmpfs atomics

hugh-11-fix-unuse_pmd-fixme.patch
  swap 11/13 fix unuse_pmd fixme

hugh-12-vm_enough_memory-double-counts.patch
  swap 12/13 vm_enough_memory double counts

ext3-max-file-size-fix.patch
  ext3: fix max file size

ext2-no-lock_super-ng.patch

ext2-ialloc-no-lock_super-ng.patch

linear-oops-fix-1.patch
  md/linear oops fix

dev_t-32-bit.patch
  [for playing only] change type of dev_t

dev_t-remove-B_FREE.patch
  dev_t: eliminate B_FREE

dev_t-drm-warnings.patch
  dev_t: fix drm printk warnings

sg-dev_t-fix.patch
  32-bit dev_t fix for sg

oops-dump-preceding-code.patch
  i386 oops output: dump preceding code

x86-clock-override-option.patch
  x86 clock override boot option

tty_io-cleanup.patch
  tty_io cleanup

page_to_pfn-in-blk_queue_bounce.patch
  Subject: use page_to_pfn() in __blk_queue_bounce()

init_inode_once-bloat-fix.patch
  Subject: init_inode_once() wants sizeof(struct hlist_head)

conntrack-use-after-free-fix.patch
  fix use-after-free in ip_conntrack

VM_DONTEXPAND-fix.patch
  honour VM_DONTEXPAND in vma merging

compound-page-warning-fix.patch
  Fix 64bit warnings in mm/page_alloc.c

cdevname-irq-safety-fix.patch
  make cdevname() callable from interrupts

register_chrdev_region-leak-fix.patch
  register_chrdev_region() leak and race fix

slab-cache-sizes-cleanup.patch
  slab: cache sizes cleanup

stat_t-larger-dev_t.patch
  struct stat - support larger dev_t

acpi-build-fix.patch
  ACPI build fix

sync_blockdev-on-final-close.patch
  sync blockdevs on the final close only

ext3_mark_inode_dirty-speedup.patch
  ext3_mark_inode_dirty() speedup

ext3_mark_inode_dirty-less-calls.patch
  ext3_commit_write speedup

ext3-handle-cache.patch
  ext3: create a slab cache for transaction handles

ext3-no-bkl.patch

journal_dirty_metadata-speedup.patch

journal_get_write_access-speedup.patch

ext3-concurrent-block-inode-allocation.patch
  Subject: [PATCH] concurrent block/inode allocation for EXT3

ext3-concurrent-block-allocation-fix-1.patch




^ permalink raw reply	[flat|nested] 27+ messages in thread

* LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26  9:38 ` 2.5.66-mm1 Andrew Morton
  (?)
@ 2003-03-26 12:26 ` Erik Hensema
  2003-03-26 13:48   ` Andries Brouwer
  -1 siblings, 1 reply; 27+ messages in thread
From: Erik Hensema @ 2003-03-26 12:26 UTC (permalink / raw)
  To: linux-kernel

Andrew Morton (akpm@digeo.com) wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/

LVM or device mapper seems to be broken in -mm. I've only tried the
following kernels so far:
2.5.64 - works
2.5.65-mm2 - doesn't work
2.5.66 - works
2.5.66-mm1 - doesn't work

I'm getting these messages while setting up LVM from my bootscripts (I've
included the actual commands prefixed with a > ):

Remounting root file system (/) read/write for vgscan...
> mount -n -o remount,rw /
Removing old device inodes...
> rm /dev/system/* /dev/mapper/*
Setting up devices...
> /usr/local/sbin/devmap_mknod.sh
Creating /dev/mapper/control character device with major:10 minor:63.
Scanning for LVM volume groups...
> /usr/local/sbin/vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "system" using metadata type lvm1
Activating LVM volume groups...
> /usr/local/sbin/vgchange -a y system
device-mapper: allocating minor 0.
device-mapper: allocating minor 1.
device-mapper: destroying md
device-mapper: destroying table
device-mapper: allocating minor 0.
device-mapper: destroying md
device-mapper: destroying table
  1 logical volume(s) in volume group "system" now active

The only active volume is the most recently created volume.

On 2.5.6x-vanilla the output of vgchange is:
device-mapper: allocating minor 0.
device-mapper: allocating minor 1.
device-mapper: allocating minor 2.
  3 logical volume(s) in volume group "system" now active
 
-- 
Erik Hensema <erik@hensema.net>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 12:26 ` LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1) Erik Hensema
@ 2003-03-26 13:48   ` Andries Brouwer
  2003-03-26 14:33     ` Erik Hensema
  0 siblings, 1 reply; 27+ messages in thread
From: Andries Brouwer @ 2003-03-26 13:48 UTC (permalink / raw)
  To: erik; +Cc: linux-kernel

On Wed, Mar 26, 2003 at 12:26:37PM +0000, Erik Hensema wrote:
> Andrew Morton (akpm@digeo.com) wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/

> LVM or device mapper seems to be broken in -mm. I've only tried the
> following kernels so far:
> 2.5.64 - works
> 2.5.65-mm2 - doesn't work
> 2.5.66 - works
> 2.5.66-mm1 - doesn't work

Probably you are hit by

  dev_t-32-bit.patch
    [for playing only] change type of dev_t

This is hidden somewhat in the 100+ patches in -mm,
but the kernel is not quite ready yet - that is
why this is labeled "not to be applied, for
playing only". Mostly things work, but some stuff
related to lvm, md, dm, nfs, loop will break
because ioctls use structs with a dev_t field.
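
A rough illustration of why a wider dev_t breaks such ioctls. The struct and function names below are made up for the example and are not any real ioctl argument; the sketch assumes the old dev_t is 16 bits wide and the patched one 32 bits wide. Every field after the dev_t moves, so a kernel built with the wide type reads an old-layout buffer from userspace at the wrong offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t demo_dev16_t;	/* assumed width of the old dev_t */
typedef uint32_t demo_dev32_t;	/* assumed width after dev_t-32-bit.patch */

/* Hypothetical ioctl argument as old userspace lays it out. */
struct demo_ioctl_old {
	uint32_t     flags;
	demo_dev16_t dev;
	char         name[16];	/* starts at offset 6 */
};

/* The same argument as a kernel with the wider dev_t lays it out. */
struct demo_ioctl_new {
	uint32_t     flags;
	demo_dev32_t dev;
	char         name[16];	/* starts at offset 8 */
};

/*
 * Read the name the way the new-layout kernel would, from a buffer
 * that was filled in using the old layout.
 */
static const char *demo_kernel_view_of_name(const void *user_buf)
{
	assert(user_buf != NULL);
	return (const char *)user_buf + offsetof(struct demo_ioctl_new, name);
}
```

With a `struct demo_ioctl_old` whose `name` is "system", the new-layout view starts two bytes late and sees "stem". Both structs happen to have the same `sizeof` here, so a size check alone would not catch the mismatch.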

You can revert this single patch and probably all will be fine.
More interesting would be to apply

http://marc.theaimsgroup.com/?l=linux-kernel&m=103956089203199&w=3

if possible, and see whether that helps.
You can see some earlier discussion today under a subject
containing the word dm_ioctl.

Andries

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 13:48   ` Andries Brouwer
@ 2003-03-26 14:33     ` Erik Hensema
  2003-03-26 16:03       ` Andries Brouwer
  0 siblings, 1 reply; 27+ messages in thread
From: Erik Hensema @ 2003-03-26 14:33 UTC (permalink / raw)
  To: linux-kernel

Andries Brouwer (aebr@win.tue.nl) wrote:
> On Wed, Mar 26, 2003 at 12:26:37PM +0000, Erik Hensema wrote:
>> Andrew Morton (akpm@digeo.com) wrote:
>> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.66/2.5.66-mm1/
> 
>> LVM or device mapper seems to be broken in -mm. I've only tried the
>> following kernels so far:
>> 2.5.64 - works
>> 2.5.65-mm2 - doesn't work
>> 2.5.66 - works
>> 2.5.66-mm1 - doesn't work
> 
> Probably you are hit by
> 
>   dev_t-32-bit.patch
>     [for playing only] change type of dev_t
[...]
> You can revert this single patch and probably all will be fine.

For now I've reverted this patch and LVM is working again.

> More interesting would be to apply
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=103956089203199&w=3

I'd rather not change the ioctl interface, since that would make dual
booting with 2.5-vanilla harder.

-- 
Erik Hensema <erik@hensema.net>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 14:33     ` Erik Hensema
@ 2003-03-26 16:03       ` Andries Brouwer
  2003-03-26 17:43         ` Joe Thornber
  2003-03-26 18:47         ` Joel Becker
  0 siblings, 2 replies; 27+ messages in thread
From: Andries Brouwer @ 2003-03-26 16:03 UTC (permalink / raw)
  To: Erik Hensema; +Cc: linux-kernel

On Wed, Mar 26, 2003 at 03:33:26PM +0100, Erik Hensema wrote:

> > You can revert this single patch and probably all will be fine.
> 
> For now I've reverted this patch and LVM is working again.

Good.

> > More interesting would be to apply
> > 
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=103956089203199&w=3
> 
> I'd rather not change the ioctl interface, since that would make dual
> booting with 2.5-vanilla harder.

The ioctl has a version field:

struct dm_ioctl {
	uint32_t version[3];
	...

and the above patch changes version 1.6.0 into 2.0.0.
With sufficiently recent user space utilities all
should work: they can find out the interface version
using the DM_VERSION ioctl, and then adapt what
they send to the kernel.
(I don't know whether such up-to-date utilities exist.)
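
A sketch of the adaptation step described above, under stated assumptions. The enum and function are hypothetical and not taken from the dm utilities, and the version triple is assumed to have already been fetched (for instance with the DM_VERSION ioctl). The only facts used from this thread are that the interface reports a three-part version and that the patch moves it from 1.6.0 to 2.0.0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the layouts userspace might have to speak. */
enum demo_layout {
	DEMO_LAYOUT_UNSUPPORTED,
	DEMO_LAYOUT_V1,		/* interface 1.x.y, as in 2.5-vanilla */
	DEMO_LAYOUT_V2		/* interface 2.x.y, after the patch */
};

/*
 * Pick a struct layout from the version the kernel reported.
 * Only the major number decides here; that is an assumption of this
 * sketch, not a documented rule.
 */
static enum demo_layout demo_pick_layout(const uint32_t version[3])
{
	assert(version != NULL);
	switch (version[0]) {
	case 1:
		return DEMO_LAYOUT_V1;
	case 2:
		return DEMO_LAYOUT_V2;
	default:
		return DEMO_LAYOUT_UNSUPPORTED;
	}
}
```

A tool written this way could run on both a vanilla and a patched kernel, which addresses the dual-boot concern, provided it carries both struct definitions.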

Andries


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 16:03       ` Andries Brouwer
@ 2003-03-26 17:43         ` Joe Thornber
  2003-03-26 18:47         ` Joel Becker
  1 sibling, 0 replies; 27+ messages in thread
From: Joe Thornber @ 2003-03-26 17:43 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Erik Hensema, linux-kernel

On Wednesday, March 26, 2003, at 04:03 PM, Andries Brouwer wrote:

> (I don't know whether such up-to-date utilities exist.)

Alasdair Kergon should be making a new release of the dm utilities in 
the next couple of days.  Once this has been done we will be free to 
fix the broken ioctl interface in 2.5.

- Joe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 16:03       ` Andries Brouwer
  2003-03-26 17:43         ` Joe Thornber
@ 2003-03-26 18:47         ` Joel Becker
  2003-03-26 20:52           ` Andries Brouwer
  1 sibling, 1 reply; 27+ messages in thread
From: Joel Becker @ 2003-03-26 18:47 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Erik Hensema, linux-kernel

On Wed, Mar 26, 2003 at 05:03:50PM +0100, Andries Brouwer wrote:
> With sufficiently recent user space utilities all
> should work: they can find out the interface version
> using the DM_VERSION ioctl, and then adapt what
> they send to the kernel.

	We need to start tracking down what userspace needs fixing
still.  We also should iron out our representations.  eg, hpa's
recommendation for 64bits, or the 12/20 split for 32bit, or etc.

Joel


-- 

Life's Little Instruction Book #451

	"Don't be afraid to say, 'I'm sorry.'"

Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 18:47         ` Joel Becker
@ 2003-03-26 20:52           ` Andries Brouwer
  2003-03-26 21:12             ` Joel Becker
  2003-03-28  2:08             ` Dave Jones
  0 siblings, 2 replies; 27+ messages in thread
From: Andries Brouwer @ 2003-03-26 20:52 UTC (permalink / raw)
  To: Joel Becker; +Cc: Erik Hensema, linux-kernel

On Wed, Mar 26, 2003 at 10:47:23AM -0800, Joel Becker wrote:

> We need to start tracking down what userspace needs fixing.

My current series of patches is for the ioctls that use a
structure with dev_t field. If someone has time to burn,
or has automated tools that can identify these, that would
be good.

There is a double audit: find these ioctls, and then find
the userspace tools that use them.

For example, struct umsdos_ioctl has twice dev_t followed
by padding. Probably these should become unsigned longs.
I'll send a patch later tonight.

Is it used anywhere? That requires detective work.
It is used by the utilities udosctl (a useless demo utility),
umssync and umssetup. I do not know of any others.
No doubt people will tell me what I overlooked.
Less conservative people will tell me that umsdos has to
be killed entirely.

In old posts and other letters I have mentioned some more ioctls.
The list is not long but they have to be examined one by one,
and in some cases correspondence with authors/maintainers
is required.

> We also should iron out our representations.  eg, hpa's
> recommendation for 64bits, or the 12/20 split for 32bit, or etc.

There is no hurry. These changes are just editing a few lines
in kdev_t.h. I tend to prefer 64 bits, like hpa.
Maybe I should send another patch tonight, just for playing.

Andries

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 20:52           ` Andries Brouwer
@ 2003-03-26 21:12             ` Joel Becker
  2003-03-28  2:08             ` Dave Jones
  1 sibling, 0 replies; 27+ messages in thread
From: Joel Becker @ 2003-03-26 21:12 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Erik Hensema, linux-kernel

On Wed, Mar 26, 2003 at 09:52:28PM +0100, Andries Brouwer wrote:
> > We also should iron out our representations.  eg, hpa's
> > recommendation for 64bits, or the 12/20 split for 32bit, or etc.
> 
> There is no hurry. These changes are just editing a few lines
> in kdev_t.h. I tend to prefer 64 bits, like hpa.
> Maybe I should send another patch tonight, just for playing.

	Please, I'd like that.  It does actually matter, because glibc
and mknod (to name a couple) have to pass a proper dev_t for the new
format (glibc actually does an explicit conversion to 8:8 in
sysdeps/unix/sysv/linux/xmknod.c, which we need to fix to the proper
mapping).
	Stuff like that.
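
To make the representation question concrete, here is a sketch of an 8:8 encoding next to a 12/20 split. The bit layout (major in the high bits, minor in the low bits) is an assumption made for illustration; the thread does not settle on a layout, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old-style 16-bit device number: 8 bits major, 8 bits minor. */
static uint16_t demo_encode_8_8(unsigned major, unsigned minor)
{
	assert(major < 256 && minor < 256);
	return (uint16_t)((major << 8) | minor);
}

/* 32-bit device number with an assumed 12-bit major / 20-bit minor split. */
static uint32_t demo_encode_12_20(unsigned major, unsigned minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return ((uint32_t)major << 20) | minor;
}

static unsigned demo_major_12_20(uint32_t dev)
{
	return dev >> 20;
}

static unsigned demo_minor_12_20(uint32_t dev)
{
	return dev & 0xfffff;
}

/* Re-encode an 8:8 value in the wider format instead of truncating. */
static uint32_t demo_widen_8_8(uint16_t old)
{
	return demo_encode_12_20(old >> 8, old & 0xff);
}
```

For the /dev/mapper/control node from earlier in the thread (major 10, minor 63), the 8:8 value is 0x0a3f and the 12/20 value under this layout is 0x00a0003f. A library that still converts to 8:8 before the system call would lose any minor above 255, which is why the mapping has to be agreed on first.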

Joel


-- 

"This is the end, beautiful friend.
 This is the end, my only friend the end
 Of our elaborate plans, the end
 Of everything that stands, the end
 No safety or surprise, the end
 I'll never look into your eyes again."

Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
  2003-03-26  9:38 ` 2.5.66-mm1 Andrew Morton
@ 2003-03-28  2:06   ` Ed Tomlinson
  -1 siblings, 0 replies; 27+ messages in thread
From: Ed Tomlinson @ 2003-03-28  2:06 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel, linux-mm

Hi Andrew,

Got this oops after about 20 hours with mm1 (65-mm3 lasted 5 days
until I rebooted).

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c011516d
*pde = 00000000
Oops: 0002 [#1]
CPU:    0
EIP:    0060:[<c011516d>]    Not tainted VLI
EFLAGS: 00010097
EIP is at schedule+0x8d/0x3a0
eax: 00000001   ebx: cf5e99c0   ecx: cf5e99c0   edx: ffffffff
esi: 00000000   edi: c031de00   ebp: cf5ebf08   esp: cf5ebef0
ds: 007b   es: 007b   ss: 0068
Process newsplex (pid: 1205, threadinfo=cf5ea000 task=cf5e99c0)
Stack: c011fbd7 c02bbc40 00000246 05261e41 cf5ebf14 cf5ebf50 cf5ebf3c c0120754 
       cf5ebf14 c02bc538 c02bc538 05261e41 4b87ad6e c01206e0 cf5e99c0 c02bbc40 
       c015abd6 000007d1 00000000 cf5ebf60 c015ac19 cf5ea000 cf5ea000 00000000 
Call Trace:
 [<c011fbd7>] add_timer+0x57/0xa0
 [<c0120754>] schedule_timeout+0x54/0xa0
 [<c01206e0>] process_timeout+0x0/0x20
 [<c015abd6>] do_poll+0x56/0xc0
 [<c015ac19>] do_poll+0x99/0xc0
 [<c015ad88>] sys_poll+0x148/0x220
 [<c013eb3b>] sys_mprotect+0x21b/0x22f
 [<c01079ec>] sys_clone+0x2c/0x60
 [<c015a200>] __pollwait+0x0/0xc0
 [<c0109277>] syscall_call+0x7/0xb

Code: 40 17 04 75 4d 8b 03 85 c0 74 47 48 0f 84 da 02 00 00 ff 0d 00 de 31 c0 8b 43 68 ff 08 8b 03 83 f8 02 0f 84 b6 02 00 00 8b 73 28 <ff> 4e 00 8b 53 24 8b 43 20 89 50 04 89 02 8b 4b 18 8d 14 ce 8d 
 <6>note: newsplex[1205] exited with preempt_count 2
Debug: sleeping function called from illegal context at include/linux/rwsem.h:43
Call Trace:
 [<c01168d3>] __might_sleep+0x53/0x60
 [<c01198d5>] profile_exit_task+0x15/0x60
 [<c011aee6>] do_exit+0x86/0x460
 [<c0109ab5>] die+0x75/0x80
 [<c0113854>] do_page_fault+0x134/0x45e
 [<c0114798>] try_to_wake_up+0x138/0x240
 [<c011fde4>] mod_timer+0x124/0x180
 [<c012a520>] nanosleep_wake_up+0x0/0x20
 [<c0131feb>] buffered_rmqueue+0xab/0x140
 [<c0132103>] __alloc_pages+0x83/0x280
 [<c0113720>] do_page_fault+0x0/0x45e
 [<c01094dd>] error_code+0x2d/0x40
 [<c011516d>] schedule+0x8d/0x3a0
 [<c011fbd7>] add_timer+0x57/0xa0
 [<c0120754>] schedule_timeout+0x54/0xa0
 [<c01206e0>] process_timeout+0x0/0x20
 [<c015abd6>] do_poll+0x56/0xc0
 [<c015ac19>] do_poll+0x99/0xc0
 [<c015ad88>] sys_poll+0x148/0x220
 [<c013eb3b>] sys_mprotect+0x21b/0x22f
 [<c01079ec>] sys_clone+0x2c/0x60
 [<c015a200>] __pollwait+0x0/0xc0
 [<c0109277>] syscall_call+0x7/0xb

Hope this helps

Ed Tomlinson



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1)
  2003-03-26 20:52           ` Andries Brouwer
  2003-03-26 21:12             ` Joel Becker
@ 2003-03-28  2:08             ` Dave Jones
  1 sibling, 0 replies; 27+ messages in thread
From: Dave Jones @ 2003-03-28  2:08 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Joel Becker, Erik Hensema, linux-kernel

On Wed, Mar 26, 2003 at 09:52:28PM +0100, Andries Brouwer wrote:

 > For example, struct umsdos_ioctl has twice dev_t followed
 > by padding. Probably these should become unsigned longs.
 > I'll send a patch later tonight.
 > 
 > Is it used anywhere? That requires detective work.
 > It is used by the utilities udosctl (a useless demo utility),
 > umssync and umssetup. I do not know of any others.
 > No doubt people will tell me what I overlooked.
 > Less conservative people will tell me that umsdos has to
 > be killed entirely.

Isn't it still horribly broken? I remember Al putting it on
the "To be fixed later" burner, but I never saw anything happen
to it after that, aside from janitor-style fixes.

		Dave


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
  2003-03-28  2:06   ` 2.5.66-mm1 Ed Tomlinson
@ 2003-03-28  4:59     ` Andrew Morton
  -1 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2003-03-28  4:59 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, linux-mm, Ingo Molnar

Ed Tomlinson <tomlins@cam.org> wrote:
>
> Hi Andrew,
> 
> Got this opps after about 20 hours with mm1 (65-mm3 lasted 5 days
> until I rebooted).
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
> c011516d
> *pde = 00000000
> Oops: 0002 [#1]
> CPU:    0
> EIP:    0060:[<c011516d>]    Not tainted VLI
> EFLAGS: 00010097
> EIP is at schedule+0x8d/0x3a0
> eax: 00000001   ebx: cf5e99c0   ecx: cf5e99c0   edx: ffffffff
> esi: 00000000   edi: c031de00   ebp: cf5ebf08   esp: cf5ebef0
> ds: 007b   es: 007b   ss: 0068
> Process newsplex (pid: 1205, threadinfo=cf5ea000 task=cf5e99c0)
> Stack: c011fbd7 c02bbc40 00000246 05261e41 cf5ebf14 cf5ebf50 cf5ebf3c c0120754 
>        cf5ebf14 c02bc538 c02bc538 05261e41 4b87ad6e c01206e0 cf5e99c0 c02bbc40 
>        c015abd6 000007d1 00000000 cf5ebf60 c015ac19 cf5ea000 cf5ea000 00000000 
> Call Trace:
>  [<c011fbd7>] add_timer+0x57/0xa0
>  [<c0120754>] schedule_timeout+0x54/0xa0
>  [<c01206e0>] process_timeout+0x0/0x20
>  [<c015abd6>] do_poll+0x56/0xc0
>  [<c015ac19>] do_poll+0x99/0xc0
>  [<c015ad88>] sys_poll+0x148/0x220
>  [<c013eb3b>] sys_mprotect+0x21b/0x22f
>  [<c01079ec>] sys_clone+0x2c/0x60
>  [<c015a200>] __pollwait+0x0/0xc0
>  [<c0109277>] syscall_call+0x7/0xb
> 
> Code: 40 17 04 75 4d 8b 03 85 c0 74 47 48 0f 84 da 02 00 00 ff 0d 00 de 31 c0 8b 43 68 ff 08 8b 03 83 f8 02 0f 84 b6 02 00 00 8b 73 28 <ff> 4e 00 8b 53 24 8b 43 20 89 50 04 89 02 8b 4b 18 8d 14 ce 8d 

That longer Code: line is really handy.

You died in schedule()->deactivate_task()->dequeue_task().

static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
{
	array->nr_active--;

`array' is zero.

I'm going to Cc Ingo and run away.  Ed uses preempt.
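
For readers following along, the step from the Code: line to "`array' is
zero" can be sketched as below.  The byte wrapped in <..> is the first
byte of the faulting instruction; opcode ff with ModRM reg field 1 is the
x86 DEC r/m32 instruction.  This is a minimal illustration, not a
disassembler: the helper names parse_code_line() and describe_fault() are
made up here, and only the simple register-base forms (no SIB byte, no
32-bit displacement) are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 32-bit x86 general registers, indexed by their 3-bit encoding. */
static const char *const reg32[8] = {
	"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/*
 * Parse the hex bytes of an oops "Code:" line (without the "Code:" prefix)
 * into buf and return how many were read.  *fault receives the index of
 * the byte written as <xx>, or -1 if no byte is marked.
 */
static int parse_code_line(const char *s, unsigned char *buf, int max, int *fault)
{
	int n = 0;

	*fault = -1;
	while (*s != '\0' && n < max) {
		char *end;
		unsigned long v;

		if (*s == ' ' || *s == '>') {
			s++;
			continue;
		}
		if (*s == '<') {
			*fault = n;	/* next byte starts the faulting insn */
			s++;
			continue;
		}
		v = strtoul(s, &end, 16);
		if (end == s)
			break;		/* not a hex byte: stop parsing */
		buf[n++] = (unsigned char)v;
		s = end;
	}
	return n;
}

/*
 * If the faulting instruction is "ff /1" (DEC r/m32) with a plain register
 * base, write its AT&T-syntax form to out and return 0.  Return -1 for
 * anything this sketch does not handle.
 */
static int describe_fault(const char *code, char *out, size_t len)
{
	unsigned char buf[64];
	int fault, n, mod, reg, rm, disp;

	assert(out != NULL && len > 0);
	n = parse_code_line(code, buf, (int)sizeof(buf), &fault);
	if (fault < 0 || fault + 1 >= n || buf[fault] != 0xff)
		return -1;

	mod = buf[fault + 1] >> 6;
	reg = (buf[fault + 1] >> 3) & 7;
	rm = buf[fault + 1] & 7;
	if (reg != 1 || rm == 4)	/* only DEC, and no SIB-byte forms */
		return -1;

	if (mod == 0 && rm != 5) {	/* (%reg), no displacement */
		snprintf(out, len, "decl (%%%s)", reg32[rm]);
		return 0;
	}
	if (mod == 1 && fault + 2 < n) {	/* disp8(%reg) */
		disp = (signed char)buf[fault + 2];
		if (disp < 0)
			snprintf(out, len, "decl -0x%x(%%%s)", (unsigned)-disp, reg32[rm]);
		else
			snprintf(out, len, "decl 0x%x(%%%s)", (unsigned)disp, reg32[rm]);
		return 0;
	}
	return -1;
}
```

Given the tail of the quoted Code: line, "8b 73 28 <ff> 4e 00 8b 53 24",
describe_fault() yields "decl 0x0(%esi)".  With esi: 00000000 in the
register dump, that is a decrement through a NULL pointer, which is
consistent with array->nr_active-- faulting at virtual address 00000000.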
	

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
  2003-03-28  4:59     ` 2.5.66-mm1 Andrew Morton
@ 2003-03-28 10:45       ` Ingo Molnar
  -1 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2003-03-28 10:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ed Tomlinson, linux-kernel, linux-mm, Mike Galbraith

On Thu, 27 Mar 2003, Andrew Morton wrote:

> That longer Code: line is really handy.
> 
> You died in schedule()->deactivate_task()->dequeue_task().
> 
> static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
> {
> 	array->nr_active--;
> 
> `array' is zero.
> 
> I'm going to Cc Ingo and run away.  Ed uses preempt.

hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
we deactivate a task, we remove it from the runqueue and set p->array to
NULL. Whenever we activate a task again, we set p->array to non-NULL. A
double-deactivate is not possible. I tried to reproduce it with various
scheduler workloads, but didn't succeed.
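
The invariant described above can be modelled in a few lines.  This is a
toy sketch, not the code in kernel/sched.c: struct toy_array, struct
toy_task and the two functions are hypothetical stand-ins, and the
explicit NULL check in toy_deactivate() is added here only to make the
"double-deactivate" case visible instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for prio_array_t and task_struct. */
struct toy_array {
	int nr_active;			/* number of queued tasks */
};

struct toy_task {
	struct toy_array *array;	/* NULL while not on a runqueue */
};

/* Activation: the task goes onto an array and p->array becomes non-NULL. */
static void toy_activate(struct toy_task *p, struct toy_array *a)
{
	assert(p->array == NULL);	/* must not already be queued */
	a->nr_active++;
	p->array = a;
}

/*
 * Deactivation: the task leaves its array and p->array becomes NULL.
 * Returns 0 on success, or -1 if the task was not queued, which is the
 * case that would dereference NULL in array->nr_active--.
 */
static int toy_deactivate(struct toy_task *p)
{
	if (p->array == NULL)
		return -1;
	p->array->nr_active--;
	p->array = NULL;
	return 0;
}
```

In this model a second toy_deactivate() on the same task returns -1.  The
oops reported in this thread corresponds to reaching the decrement with a
NULL array, so some path must have broken the pairing assumed above.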

Mike, do you have a backtrace of the crash you saw?

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
       [not found]     ` <Pine.LNX.4.44.0303281139500.6678-100000@localhost.localdomain>
@ 2003-03-28 14:26         ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 14:26 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm

At 11:45 AM 3/28/2003 +0100, Ingo Molnar wrote:

>On Thu, 27 Mar 2003, Andrew Morton wrote:
>
> > That longer Code: line is really handy.
> >
> > You died in schedule()->deactivate_task()->dequeue_task().
> >
> > static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
> > {
> >       array->nr_active--;
> >
> > `array' is zero.
> >
> > I'm going to Cc Ingo and run away.  Ed uses preempt.
>
>hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
>we deactivate a task, we remove it from the runqueue and set p->array to
>NULL. Whenever we activate a task again, we set p->array to non-NULL. A
>double-deactivate is not possible. I tried to reproduce it with various
>scheduler workloads, but didnt succeed.
>
>Mike, do you have a backtrace of the crash you saw?

No, I didn't save it due to "grubby fingerprints".

         -Mike 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
  2003-03-28 14:26         ` 2.5.66-mm1 Mike Galbraith
@ 2003-03-28 14:56           ` Zwane Mwaikambo
  -1 siblings, 0 replies; 27+ messages in thread
From: Zwane Mwaikambo @ 2003-03-28 14:56 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Ingo Molnar, Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm

On Fri, 28 Mar 2003, Mike Galbraith wrote:

> >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> >we deactivate a task, we remove it from the runqueue and set p->array to
> >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> >double-deactivate is not possible. I tried to reproduce it with various
> >scheduler workloads, but didnt succeed.
> >
> >Mike, do you have a backtrace of the crash you saw?
> 
> No, I didn't save it due to "grubby fingerprints".

Hmm, I think I may have hit this one, but I never posted it because I was
unable to reproduce it on a vanilla kernel, or on the same kernel
afterwards (which was hacked, so I won't vouch for its cleanliness).  I
think preempt might have bitten him in a bad place (mine is also
CONFIG_PREEMPT).  Is it possible that when we did the task_rq_unlock we
got preempted, and when we got back we used the local variable
requeue_waker, which was set before dropping the lock and therefore might
no longer be valid due to scheduler decisions made after dropping the
runqueue lock?

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c011b8d9
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<c011b8d9>]    Not tainted
EFLAGS: 00010046
EIP is at try_to_wake_up+0x1e9/0x4f0
eax: c055a000   ebx: c04e5aa0   ecx: c0552fc0   edx: c04e5aa0
esi: 00000000   edi: 00000000   ebp: c055bee4   esp: c055beb8
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001 00000002 
       00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001 00000000 
       c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc 00000001 
Call Trace:
 [<c011d172>] __wake_up_common+0x32/0x60
 [<c011d203>] __wake_up+0x63/0xb0
 [<c0122fb5>] release_console_sem+0x165/0x170
 [<c0122d7b>] printk+0x1eb/0x270
 [<c015e210>] invalidate_bh_lru+0x0/0x60
 [<c015e210>] invalidate_bh_lru+0x0/0x60
 [<c015e210>] invalidate_bh_lru+0x0/0x60
 [<c01163f2>] smp_call_function_interrupt+0x42/0xb0
 [<c015e210>] invalidate_bh_lru+0x0/0x60
 [<c0106eb0>] default_idle+0x0/0x40
 [<c010a41a>] call_function_interrupt+0x1a/0x20
 [<c0106eb0>] default_idle+0x0/0x40
 [<c0106ede>] default_idle+0x2e/0x40
 [<c0106f6a>] cpu_idle+0x3a/0x50
 [<c0105000>] rest_init+0x0/0x80

Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d 

0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
277     /*
278      * Adding/removing a task to/from a priority array:
279      */
280     static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
281     {
282             array->nr_active--;
283             list_del(&p->run_list);
284             if (list_empty(array->queue + p->prio))
285                     __clear_bit(p->prio, array->bitmap);
286     }

(gdb) list *__wake_up_common+0x32
0xc011d1b2 is in __wake_up_common (kernel/sched.c:1424).
1419            list_for_each_safe(tmp, next, &q->task_list) {
1420                    wait_queue_t *curr;
1421                    unsigned flags;
1422                    curr = list_entry(tmp, wait_queue_t, task_list);
1423                    flags = curr->flags;
1424                    if (curr->func(curr, mode, sync) &&
1425                        (flags & WQ_FLAG_EXCLUSIVE) &&
1426                        !--nr_exclusive)
1427                            break;
1428            }

(gdb) list *__wake_up+0x62
0xc011d242 is in __wake_up (kernel/sched.c:1445).
1440
1441            if (unlikely(!q))
1442                    return;
1443
1444            spin_lock_irqsave(&q->lock, flags);
1445            __wake_up_common(q, mode, nr_exclusive, 0);
1446            spin_unlock_irqrestore(&q->lock, flags);
1447    }
1448
1449    /*


-- 
function.linuxpower.ca

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
  2003-03-28 14:56           ` 2.5.66-mm1 Zwane Mwaikambo
@ 2003-03-28 15:25             ` Ingo Molnar
  -1 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2003-03-28 15:25 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Mike Galbraith, Andrew Morton, Ed Tomlinson, linux-kernel,
	linux-mm

On Fri, 28 Mar 2003, Zwane Mwaikambo wrote:

> Hmm i think i may have his this one but i never posted due to being
> unable to reproduce it on a vanilla kernel or the same kernel afterwards
> (which was hacked so i won't vouch for it's cleanliness). I think
> preempt might have bitten him in a bad place (mine is also
> CONFIG_PREEMPT), is it possible that when we did the task_rq_unlock we
> got preempted and when we got back we used the local variable
> requeue_waker which was set before dropping the lock, and therefore
> might not be valid anymore due to scheduler decisions done after
> dropping the runqueue lock?

yes, this one was my only suspect, but it should really never cause any
problems. We might change sleep_avg during the wakeup, and carry the
requeue_waker flag over a preemptible window, but the requeueing itself
re-takes the runqueue lock, and does not take anything for granted. The
flag could very well be random as well, and the code should still be
correct - there's no requirement to recalculate the priority every time we
change sleep_avg. (in fact we at times intentionally keep those values
detached.)
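
The pattern under discussion (a hint computed under the runqueue lock,
carried across a window where the lock is dropped, then acted on after
re-taking the lock) can be sketched as follows.  This is a single-threaded
toy with an assertion-checked flag standing in for the spinlock; all names
are hypothetical.  In particular, the p->array != NULL re-check is this
sketch's own reading of "does not take anything for granted"; nothing in
this thread establishes whether the 2.5.66-mm1 code performs such a check.

```c
#include <assert.h>
#include <stddef.h>

struct toy_array {
	int nr_active;
};

struct toy_rq {
	int locked;			/* stand-in for the runqueue spinlock */
	struct toy_array active;
};

struct toy_task {
	struct toy_array *array;	/* NULL while not on a runqueue */
	int prio;
};

static void rq_lock(struct toy_rq *rq)
{
	assert(!rq->locked);
	rq->locked = 1;
}

static void rq_unlock(struct toy_rq *rq)
{
	assert(rq->locked);
	rq->locked = 0;
}

/*
 * Act on a requeue hint that was computed during an earlier hold of the
 * lock.  The hint may be stale, so the task's queue state is re-read
 * after the lock is re-taken.  Returns 1 if the task was requeued,
 * 0 if the request was skipped.
 */
static int requeue_if_still_queued(struct toy_rq *rq, struct toy_task *p,
				   int hint, int new_prio)
{
	int done = 0;

	if (!hint)
		return 0;

	rq_lock(rq);
	if (p->array != NULL) {		/* still queued? do not assume */
		p->array->nr_active--;	/* dequeue at the old priority */
		p->prio = new_prio;
		p->array->nr_active++;	/* enqueue at the new priority */
		done = 1;
	}
	rq_unlock(rq);
	return done;
}
```

With a task whose array pointer was cleared while the lock was dropped,
the function returns 0 and touches nothing; without the re-check, the
same call would decrement through a NULL pointer.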

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
       [not found]         ` <Pine.LNX.4.50.0303280942420.2884-100000@montezuma.mastecende.com>
@ 2003-03-28 16:01             ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 16:01 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Ingo Molnar, Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 3414 bytes --]

At 09:56 AM 3/28/2003 -0500, Zwane Mwaikambo wrote:
>On Fri, 28 Mar 2003, Mike Galbraith wrote:
>
> > >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> > >we deactivate a task, we remove it from the runqueue and set p->array to
> > >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> > >double-deactivate is not possible. I tried to reproduce it with various
> > >scheduler workloads, but didnt succeed.
> > >
> > >Mike, do you have a backtrace of the crash you saw?
> >
> > No, I didn't save it due to "grubby fingerprints".
>
>Hmm i think i may have his this one but i never posted due to being unable
>to reproduce it on a vanilla kernel or the same kernel afterwards (which
>was hacked so i won't vouch for it's cleanliness). I think preempt
>might have bitten him in a bad place (mine is also CONFIG_PREEMPT), is it
>possible that when we did the task_rq_unlock we got preempted and when we
>got back we used the local variable requeue_waker which was set before
>dropping the lock, and therefore might not be valid anymore due to
>scheduler decisions done after dropping the runqueue lock?

Dunno.  I did have one lying around.  The attached one happened while
printing out array switch latency after a starvation timeout.  Others
happened while printing wakeup stats for p->state > 1 tasks in
scheduler_tick() [under lock w/ wakeup disabled in printk.c].  I don't
think it's anything I did to the scheduler ;-) but this was in
65-mm3-twiddle-twiddle-twiddle.

>Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
>c011b8d9
>*pde = 00000000
>Oops: 0000 [#1]
>CPU:    0
>EIP:    0060:[<c011b8d9>]    Not tainted
>EFLAGS: 00010046
>EIP is at try_to_wake_up+0x1e9/0x4f0
>eax: c055a000   ebx: c04e5aa0   ecx: c0552fc0   edx: c04e5aa0
>esi: 00000000   edi: 00000000   ebp: c055bee4   esp: c055beb8
>ds: 007b   es: 007b   ss: 0068
>Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
>Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001 00000002
>        00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001 00000000
>        c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc 00000001
>Call Trace:
>  [<c011d172>] __wake_up_common+0x32/0x60
>  [<c011d203>] __wake_up+0x63/0xb0
>  [<c0122fb5>] release_console_sem+0x165/0x170
>  [<c0122d7b>] printk+0x1eb/0x270
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c01163f2>] smp_call_function_interrupt+0x42/0xb0
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c0106eb0>] default_idle+0x0/0x40
>  [<c010a41a>] call_function_interrupt+0x1a/0x20
>  [<c0106eb0>] default_idle+0x0/0x40
>  [<c0106ede>] default_idle+0x2e/0x40
>  [<c0106f6a>] cpu_idle+0x3a/0x50
>  [<c0105000>] rest_init+0x0/0x80
>
>Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d
>
>0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
>277     /*
>278      * Adding/removing a task to/from a priority array:
>279      */
>280     static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
>281     {
>282             array->nr_active--;
>283             list_del(&p->run_list);
>284             if (list_empty(array->queue + p->prio))
>285                     __clear_bit(p->prio, array->bitmap);
>286     }

Same spot.

         -Mike 

[-- Attachment #2: oops.txt --]
[-- Type: text/plain, Size: 3183 bytes --]

Loglevel set to 9
hmm.. 289 ms
hmm.. 6 ms
hmm.. 4 ms
hmm.. 7 ms
hmm.. 13 ms
hmm.. 15 ms
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c0114d0a
*pde = 00000000
Oops: 0002 [#1]
CPU:    0
EIP:    0060:[<c0114d0a>]    Not tainted VLI
EFLAGS: 00010006
EIP is at try_to_wake_up+0x1e2/0x258
eax: 00000008   ebx: c02cb3c8   ecx: c0dcf360   edx: c0dcf360
esi: c0c24000   edi: 00000000   ebp: c0c25ed4   esp: c0c25eb8
ds: 007b   es: 007b   ss: 0068
Process gcc (pid: 592, threadinfo=c0c24000 task=c0dcf360)
Stack: 00000001 00000001 c0298ff4 c0c25ed0 00000001 00000001 00000002 c0c25ee8 
       c0115887 c7b8a0a0 00000003 00000000 c0c25f08 c01158c2 c2d81e5c 00000003 
       00000000 c0c24000 00000082 c0298fe8 c0c25f20 c011594a c0298ff0 00000003 
Call Trace:
 [<c0115887>] default_wake_function+0x17/0x1c
 [<c01158c2>] __wake_up_common+0x36/0x50
 [<c011594a>] __wake_up_locked+0xe/0x14
 [<c0107cdc>] __down_trylock+0x34/0x54
 [<c0107d1b>] __down_failed_trylock+0x7/0xc
 [<c011928b>] .text.lock.printk+0x5/0x2a
 [<c01155f0>] schedule+0x13c/0x378
 [<c011ab07>] sys_wait4+0xab/0x234
 [<c011ac5d>] sys_wait4+0x201/0x234
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0108b5f>] syscall_call+0x7/0xb

Code: ff 48 14 8b 40 08 a8 08 74 07 e8 3e 0b 00 00 89 f6 85 f6 74 7e 8b 55 f0 9c 8f 02 fa be 00 e0 ff ff 21 e6 ff 46 14 8b 16 8b 7a 28 <ff> 0f 8b 42 20 8b 4a 24 89 48 04 89 01 8b 52 18 8d 44 d7 18 39 

 (gdb) list *try_to_wake_up+0x1e2
 0x26a is in try_to_wake_up (kernel/sched.c:310).
 305     /*
 306      * Adding/removing a task to/from a priority array:
 307      */
 308     static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
 309     {
 310             array->nr_active--;
 311             list_del(&p->run_list);
 312             if (list_empty(array->queue + p->prio))
 313                     __clear_bit(p->prio, array->bitmap);
 314     }
 (gdb)
 <6>note: gcc[592] exited with preempt_count 5
bad: scheduling while atomic!
Call Trace:
 [<c01154f0>] schedule+0x3c/0x378
 [<c0134b26>] unmap_vmas+0xea/0x1e0
 [<c011647b>] __cond_resched+0x17/0x1c
 [<c0134b86>] unmap_vmas+0x14a/0x1e0
 [<c0137fb8>] exit_mmap+0x64/0x158
 [<c0116dbd>] mmput+0x55/0x74
 [<c011a368>] do_exit+0x158/0x3b4
 [<c0109267>] die+0x87/0x88
 [<c0114068>] do_page_fault+0x2d8/0x404
 [<c0113d90>] do_page_fault+0x0/0x404
 [<c011b91a>] do_softirq+0x5a/0xac
 [<c010a170>] do_IRQ+0xfc/0x118
 [<c012e173>] __rmqueue+0xa3/0x10c
 [<c012e21f>] rmqueue_bulk+0x43/0x6c
 [<c0108d69>] error_code+0x2d/0x38
 [<c0114d0a>] try_to_wake_up+0x1e2/0x258
 [<c0115887>] default_wake_function+0x17/0x1c
 [<c01158c2>] __wake_up_common+0x36/0x50
 [<c011594a>] __wake_up_locked+0xe/0x14
 [<c0107cdc>] __down_trylock+0x34/0x54
 [<c0107d1b>] __down_failed_trylock+0x7/0xc
 [<c011928b>] .text.lock.printk+0x5/0x2a
 [<c01155f0>] schedule+0x13c/0x378
 [<c011ab07>] sys_wait4+0xab/0x234
 [<c011ac5d>] sys_wait4+0x201/0x234
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0108b5f>] syscall_call+0x7/0xb

hmm.. 42 ms
hmm.. 24 ms
hmm.. 33 ms
hmm.. 23 ms
hmm.. 31 ms
hmm.. 30 ms
hmm.. 30 ms

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.5.66-mm1
@ 2003-03-28 16:01             ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 16:01 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Ingo Molnar, Andrew Morton, Ed Tomlinson, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 3414 bytes --]

At 09:56 AM 3/28/2003 -0500, Zwane Mwaikambo wrote:
>On Fri, 28 Mar 2003, Mike Galbraith wrote:
>
> > >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> > >we deactivate a task, we remove it from the runqueue and set p->array to
> > >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> > >double-deactivate is not possible. I tried to reproduce it with various
> > >scheduler workloads, but didn't succeed.
> > >
> > >Mike, do you have a backtrace of the crash you saw?
> >
> > No, I didn't save it due to "grubby fingerprints".
>
>Hmm i think i may have hit this one but i never posted due to being unable
>to reproduce it on a vanilla kernel or the same kernel afterwards (which
>was hacked so i won't vouch for its cleanliness). I think preempt
>might have bitten him in a bad place (mine is also CONFIG_PREEMPT), is it
>possible that when we did the task_rq_unlock we got preempted and when we
>got back we used the local variable requeue_waker which was set before
>dropping the lock, and therefore might not be valid anymore due to
>scheduler decisions done after dropping the runqueue lock?

Dunno.  I did have one lying around.  The attached one was while printing 
out array switch latency after starvation timeout.  Others happened while 
printing wakeup stats for p->state > 1 tasks in scheduler_tick() [under 
lock w/ wakeup disabled in printk.c].  It's nothing I did to the scheduler 
;-) I don't think, but this was in 65-mm3-twiddle-twiddle-twiddle.

>Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
>c011b8d9
>*pde = 00000000
>Oops: 0000 [#1]
>CPU:    0
>EIP:    0060:[<c011b8d9>]    Not tainted
>EFLAGS: 00010046
>EIP is at try_to_wake_up+0x1e9/0x4f0
>eax: c055a000   ebx: c04e5aa0   ecx: c0552fc0   edx: c04e5aa0
>esi: 00000000   edi: 00000000   ebp: c055bee4   esp: c055beb8
>ds: 007b   es: 007b   ss: 0068
>Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
>Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001 00000002
>        00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001 00000000
>        c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc 00000001
>Call Trace:
>  [<c011d172>] __wake_up_common+0x32/0x60
>  [<c011d203>] __wake_up+0x63/0xb0
>  [<c0122fb5>] release_console_sem+0x165/0x170
>  [<c0122d7b>] printk+0x1eb/0x270
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c01163f2>] smp_call_function_interrupt+0x42/0xb0
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c0106eb0>] default_idle+0x0/0x40
>  [<c010a41a>] call_function_interrupt+0x1a/0x20
>  [<c0106eb0>] default_idle+0x0/0x40
>  [<c0106ede>] default_idle+0x2e/0x40
>  [<c0106f6a>] cpu_idle+0x3a/0x50
>  [<c0105000>] rest_init+0x0/0x80
>
>Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d
>
>0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
>277     /*
>278      * Adding/removing a task to/from a priority array:
>279      */
>280     static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
>281     {
>282             array->nr_active--;
>283             list_del(&p->run_list);
>284             if (list_empty(array->queue + p->prio))
>285                     __clear_bit(p->prio, array->bitmap);
>286     }

Same spot.

         -Mike 
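
Both oopses above fault at virtual address 00000000 on the line `array->nr_active--`, which is what one would expect if `array` were NULL and `nr_active` sat at offset 0 of the priority array. The following is a minimal user-space sketch of that failure mode, not kernel code: the struct layouts are simplified assumptions, and `dequeue_task_checked()` is a hypothetical helper that refuses a second dequeue instead of dereferencing NULL.

```c
#include <stddef.h>

/* Simplified stand-ins. Field names follow the quoted sched.c listing;
 * the layout (nr_active first) is an assumption for illustration. */
struct prio_array {
    int nr_active;
};

struct task {
    struct prio_array *array;   /* NULL while the task is not queued */
};

/* Hypothetical guarded dequeue: returns 0 on success, or -1 if the
 * task is not on any array (the case that would oops in the kernel). */
static int dequeue_task_checked(struct task *p)
{
    struct prio_array *array = p->array;

    if (array == NULL)
        return -1;              /* a double dequeue lands here */

    array->nr_active--;
    p->array = NULL;            /* mark the task as deactivated */
    return 0;
}
```

Called twice on the same task, the first call succeeds and the second returns -1; the unguarded version in the listing would instead write through a NULL pointer, as in the traces.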

[-- Attachment #2: oops.txt --]
[-- Type: text/plain, Size: 3183 bytes --]

Loglevel set to 9
hmm.. 289 ms
hmm.. 6 ms
hmm.. 4 ms
hmm.. 7 ms
hmm.. 13 ms
hmm.. 15 ms
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c0114d0a
*pde = 00000000
Oops: 0002 [#1]
CPU:    0
EIP:    0060:[<c0114d0a>]    Not tainted VLI
EFLAGS: 00010006
EIP is at try_to_wake_up+0x1e2/0x258
eax: 00000008   ebx: c02cb3c8   ecx: c0dcf360   edx: c0dcf360
esi: c0c24000   edi: 00000000   ebp: c0c25ed4   esp: c0c25eb8
ds: 007b   es: 007b   ss: 0068
Process gcc (pid: 592, threadinfo=c0c24000 task=c0dcf360)
Stack: 00000001 00000001 c0298ff4 c0c25ed0 00000001 00000001 00000002 c0c25ee8 
       c0115887 c7b8a0a0 00000003 00000000 c0c25f08 c01158c2 c2d81e5c 00000003 
       00000000 c0c24000 00000082 c0298fe8 c0c25f20 c011594a c0298ff0 00000003 
Call Trace:
 [<c0115887>] default_wake_function+0x17/0x1c
 [<c01158c2>] __wake_up_common+0x36/0x50
 [<c011594a>] __wake_up_locked+0xe/0x14
 [<c0107cdc>] __down_trylock+0x34/0x54
 [<c0107d1b>] __down_failed_trylock+0x7/0xc
 [<c011928b>] .text.lock.printk+0x5/0x2a
 [<c01155f0>] schedule+0x13c/0x378
 [<c011ab07>] sys_wait4+0xab/0x234
 [<c011ac5d>] sys_wait4+0x201/0x234
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0108b5f>] syscall_call+0x7/0xb

Code: ff 48 14 8b 40 08 a8 08 74 07 e8 3e 0b 00 00 89 f6 85 f6 74 7e 8b 55 f0 9c 8f 02 fa be 00 e0 ff ff 21 e6 ff 46 14 8b 16 8b 7a 28 <ff> 0f 8b 42 20 8b 4a 24 89 48 04 89 01 8b 52 18 8d 44 d7 18 39 

 (gdb) list *try_to_wake_up+0x1e2
 0x26a is in try_to_wake_up (kernel/sched.c:310).
 305     /*
 306      * Adding/removing a task to/from a priority array:
 307      */
 308     static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
 309     {
 310             array->nr_active--;
 311             list_del(&p->run_list);
 312             if (list_empty(array->queue + p->prio))
 313                     __clear_bit(p->prio, array->bitmap);
 314     }
 (gdb)
 <6>note: gcc[592] exited with preempt_count 5
bad: scheduling while atomic!
Call Trace:
 [<c01154f0>] schedule+0x3c/0x378
 [<c0134b26>] unmap_vmas+0xea/0x1e0
 [<c011647b>] __cond_resched+0x17/0x1c
 [<c0134b86>] unmap_vmas+0x14a/0x1e0
 [<c0137fb8>] exit_mmap+0x64/0x158
 [<c0116dbd>] mmput+0x55/0x74
 [<c011a368>] do_exit+0x158/0x3b4
 [<c0109267>] die+0x87/0x88
 [<c0114068>] do_page_fault+0x2d8/0x404
 [<c0113d90>] do_page_fault+0x0/0x404
 [<c011b91a>] do_softirq+0x5a/0xac
 [<c010a170>] do_IRQ+0xfc/0x118
 [<c012e173>] __rmqueue+0xa3/0x10c
 [<c012e21f>] rmqueue_bulk+0x43/0x6c
 [<c0108d69>] error_code+0x2d/0x38
 [<c0114d0a>] try_to_wake_up+0x1e2/0x258
 [<c0115887>] default_wake_function+0x17/0x1c
 [<c01158c2>] __wake_up_common+0x36/0x50
 [<c011594a>] __wake_up_locked+0xe/0x14
 [<c0107cdc>] __down_trylock+0x34/0x54
 [<c0107d1b>] __down_failed_trylock+0x7/0xc
 [<c011928b>] .text.lock.printk+0x5/0x2a
 [<c01155f0>] schedule+0x13c/0x378
 [<c011ab07>] sys_wait4+0xab/0x234
 [<c011ac5d>] sys_wait4+0x201/0x234
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0108b5f>] syscall_call+0x7/0xb

hmm.. 42 ms
hmm.. 24 ms
hmm.. 33 ms
hmm.. 23 ms
hmm.. 31 ms
hmm.. 30 ms
hmm.. 30 ms


* Re: 2.5.66-mm1
       [not found]           ` <Pine.LNX.4.44.0303281619530.9943-100000@localhost.localdomain>
@ 2003-03-28 16:05               ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2003-03-28 16:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Zwane Mwaikambo, Andrew Morton, Ed Tomlinson, linux-kernel,
	linux-mm

At 04:25 PM 3/28/2003 +0100, Ingo Molnar wrote:

>On Fri, 28 Mar 2003, Zwane Mwaikambo wrote:
>
> > Hmm i think i may have hit this one but i never posted due to being
> > unable to reproduce it on a vanilla kernel or the same kernel afterwards
> > (which was hacked so i won't vouch for its cleanliness). I think
> > preempt might have bitten him in a bad place (mine is also
> > CONFIG_PREEMPT), is it possible that when we did the task_rq_unlock we
> > got preempted and when we got back we used the local variable
> > requeue_waker which was set before dropping the lock, and therefore
> > might not be valid anymore due to scheduler decisions done after
> > dropping the runqueue lock?
>
>yes, this one was my only suspect, but it should really never cause any
>problems. We might change sleep_avg during the wakeup, and carry the
>requeue_waker flag over a preemptible window, but the requeueing itself
>re-takes the runqueue lock, and does not take anything for granted. The
>flag could very well be random as well, and the code should still be
>correct - there's no requirement to recalculate the priority every time we
>change sleep_avg. (in fact we at times intentionally keep those values
>detached.)

In my 66-twiddle tree, I moved that under the lock out of pure paranoia.  I 
can try to see if printing under hefty (very) load will still trigger the 
occasional explosion.

         -Mike 
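
The argument quoted above is that a flag computed under the runqueue lock can be carried across an unlocked (preemptible) window safely, as long as the later step re-takes the lock and re-checks the task's state itself. A rough user-space sketch of that pattern follows. Everything here is an assumption for illustration: the structs, the C11 `atomic_flag` spinlock, `effective_prio()` and `requeue_if_needed()` are hypothetical stand-ins, and only the names `requeue_waker` and `sleep_avg` come from the thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the runqueue and task. */
struct runqueue {
    atomic_flag lock;           /* toy spinlock */
};

struct task {
    struct runqueue *rq;
    bool queued;                /* true while on the runqueue */
    int sleep_avg;
    int prio;
};

static void rq_lock(struct runqueue *rq)
{
    while (atomic_flag_test_and_set(&rq->lock))
        ;                       /* spin until the lock is free */
}

static void rq_unlock(struct runqueue *rq)
{
    atomic_flag_clear(&rq->lock);
}

/* Made-up priority formula, only so the example has something to update. */
static int effective_prio(const struct task *p)
{
    return 120 - p->sleep_avg / 10;
}

/* requeue_waker is only a hint computed earlier. The function re-takes
 * the lock and re-checks p->queued, so a stale hint cannot touch a task
 * that has left the runqueue. Returns true if the priority was updated. */
static bool requeue_if_needed(struct task *p, bool requeue_waker)
{
    bool done = false;

    if (!requeue_waker)
        return false;

    rq_lock(p->rq);
    if (p->queued) {            /* re-validate under the lock */
        p->prio = effective_prio(p);
        done = true;
    }
    rq_unlock(p->rq);
    return done;
}
```

With this shape a stale hint costs at most a skipped or redundant priority recalculation, which matches the claim that the flag "could very well be random" without breaking correctness.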



end of thread, other threads:[~2003-03-28 16:05 UTC | newest]

Thread overview: 27+ messages
2003-03-26  9:38 2.5.66-mm1 Andrew Morton
2003-03-26  9:38 ` 2.5.66-mm1 Andrew Morton
2003-03-26 12:26 ` LVM/Device mapper breaks with -mm (was: Re: 2.5.66-mm1) Erik Hensema
2003-03-26 13:48   ` Andries Brouwer
2003-03-26 14:33     ` Erik Hensema
2003-03-26 16:03       ` Andries Brouwer
2003-03-26 17:43         ` Joe Thornber
2003-03-26 18:47         ` Joel Becker
2003-03-26 20:52           ` Andries Brouwer
2003-03-26 21:12             ` Joel Becker
2003-03-28  2:08             ` Dave Jones
2003-03-28  2:06 ` 2.5.66-mm1 Ed Tomlinson
2003-03-28  2:06   ` 2.5.66-mm1 Ed Tomlinson
2003-03-28  4:59   ` 2.5.66-mm1 Andrew Morton
2003-03-28  4:59     ` 2.5.66-mm1 Andrew Morton
2003-03-28 10:45     ` 2.5.66-mm1 Ingo Molnar
2003-03-28 10:45       ` 2.5.66-mm1 Ingo Molnar
     [not found]     ` <Pine.LNX.4.44.0303281139500.6678-100000@localhost.localdomain>
2003-03-28 14:26       ` 2.5.66-mm1 Mike Galbraith
2003-03-28 14:26         ` 2.5.66-mm1 Mike Galbraith
2003-03-28 14:56         ` 2.5.66-mm1 Zwane Mwaikambo
2003-03-28 14:56           ` 2.5.66-mm1 Zwane Mwaikambo
2003-03-28 15:25           ` 2.5.66-mm1 Ingo Molnar
2003-03-28 15:25             ` 2.5.66-mm1 Ingo Molnar
     [not found]           ` <Pine.LNX.4.44.0303281619530.9943-100000@localhost.localdomain>
2003-03-28 16:05             ` 2.5.66-mm1 Mike Galbraith
2003-03-28 16:05               ` 2.5.66-mm1 Mike Galbraith
     [not found]         ` <Pine.LNX.4.50.0303280942420.2884-100000@montezuma.mastecende.com>
2003-03-28 16:01           ` 2.5.66-mm1 Mike Galbraith
2003-03-28 16:01             ` 2.5.66-mm1 Mike Galbraith
