kernel-testers.vger.kernel.org archive mirror
* Re: linux-next: Tree for July 16 (crash on quad core AMD)
       [not found] <20080716235011.ac9643aa.sfr@canb.auug.org.au>
@ 2008-07-16 22:53 ` Rafael J. Wysocki
  2008-07-16 23:01   ` James Bottomley
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2008-07-16 22:53 UTC
  To: Stephen Rothwell
  Cc: linux-next, LKML, Andrew Morton, Kernel Testers List, scsi,
	Jens Axboe

On Wednesday, 16 of July 2008, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since next-20080715:
> 
> Temporarily dropped tree: ttydev (it gets too many patch failures).
> 
> Most of the differences were conflicts moving from tree to tree as some
> of the trees are now merged into Linus' tree.  Most have been inflicted
> on the driver-core and usb trees.  I have not notified these separately.
> 
> Because of the moving of conflicts around it is difficult to tell when
> they are going away (though I assume some are).
> 
> The usb.current tree lost its conflict against Linus' tree.
> 
> The usb tree gained a conflict against Linus' tree.
> 
> The cpus4096 tree gained a conflict against Linus' tree.
> 
> The pci tree lost one of its build fix patches.
> 
> The i2c tree lost a conflict against Linus' tree.
> 
> The ide tree gained a lot of conflicts against Linus' tree because part
> of it was merged into Linus' tree but the remaining part modified many of
> the same files further.
> 
> The acpi tree gained a conflict against Linus' tree.
> 
> The net tree gained a conflict against each of Linus' tree and the
> powerpc tree.
> 
> The sparc tree gained conflicts against Linus' tree and the ide and
> sparc-current trees.
> 
> The rr tree gained a conflict against the net tree.
> 
> The semaphore tree gained a conflict against the sparc tree.
> 
> The generic-ipi tree gained a conflict against the powerpc tree.
> 
> The ttydev series had many patches fail to apply, so it was dropped for
> today.
> 
> I have also applied the following patches for known problems:
> 
> 	sparc64: sysdev API change fallout
> 
> Patches no longer necessary:
> 
> 	s390: fix compile error due to smp_call_function
> 	powerpc: mman.h export fixups
> 	powerpc/stacktrace: EXPORT SYMBOL_GPL needs module.h
> 
> This tree fails to build for ARCH=sparc (i.e. 32bit) with a 64bit gcc
> v3.4.5 - it tries to use the 64bit header files.  This may be an artifact
> of one of my merge fixups, but I don't actually think so.
> 
> ----------------------------------------------------------------------------

Crashes during boot on a box with Phenom X4 and AMD 790-based mainboard.
AFAICS, Linus' tree is unaffected and linux-next from yesterday was fine
on the same box with the same .config.

Full dmesg: http://www.sisk.pl/kernel/debug/next/20080716/crash-M3A32-MVP.log
Kernel config: http://www.sisk.pl/kernel/debug/next/20080716/M3A32-MVP-config

scsi scan: INQUIRY result too short (5), using 36
scsi 2:0:0:0: Direct-Access                                    PQ: 0 ANSI: 0
------------[ cut here ]------------
kernel BUG at /home/rafael/src/linux-next/mm/slab.c:2822!
invalid opcode: 0000 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.26-next #51
RIP: 0010:[<ffffffff802b59e8>]  [<ffffffff802b59e8>] cache_free_debugcheck+0x288/0x2b0
RSP: 0018:ffff880127c81880  EFLAGS: 00010016
RAX: 00da9803898590c8 RBX: ffff880127c01880 RCX: 204a483235324448
RDX: 0000000000da9803 RSI: ffff880124488810 RDI: ffff880127c01880
RBP: ffff880127c818b0 R08: 0000000000000058 R09: 2222222222222222
R10: 2222222222222222 R11: 2222222222222222 R12: ffff880124488810
R13: 09f911029d74e35b R14: 09f911029d74e35b R15: ffff880124488000
FS:  0000000000000000(0000) GS:ffffffff806b7f40(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff880127c80000, task ffff880127c7e040)
Stack:  ffffffff803f9bcb ffff880127c01880 ffff880127c0bad0 ffff880124488818
 0000000000000282 ffff880124460000 ffff880127c818e0 ffffffff802b5c16
 0000000000000000 0000000000000002 ffffffff80627d9b 0000000000000000
Call Trace:
 [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff802b5c16>] kfree+0xd6/0x160
 [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
 [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
 [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
 [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
 [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
 [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
 [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
 [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
 [<ffffffff803e6c36>] __driver_attach+0x86/0x90
 [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
 [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
 [<ffffffff803e68ec>] driver_attach+0x1c/0x20
 [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
 [<ffffffff803e6e0f>] driver_register+0x5f/0x140
 [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
 [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
 [<ffffffff806f00d9>] ahci_init+0x19/0x20
 [<ffffffff806c8a48>] kernel_init+0x128/0x310
 [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
 [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
 [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
 [<ffffffff8020c6d9>] child_rip+0xa/0x11
 [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
 [<ffffffff806c8920>] ? kernel_init+0x0/0x310
 [<ffffffff8020c6cf>] ? child_rip+0x0/0x11


Code: 48 8b 40 10 48 8b 08 f6 c5 40 0f 84 06 fe ff ff 48 8b 40 10 48 8b 08 e9 fa fd ff ff 0f 1f 80 00 00 00 00 0f 0b eb fe 0f 0b eb fe <0f> 0b eb fe 0f 1f 40 00 48
RIP  [<ffffffff802b59e8>] cache_free_debugcheck+0x288/0x2b0
 RSP <ffff880127c81880>
---[ end trace c69efc8b7b1131cd ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D   2.6.26-next #51

Call Trace:
 [<ffffffff8023c240>] panic+0xa0/0x190
 [<ffffffff802577f9>] ? up+0x19/0x50
 [<ffffffff8023d0f7>] ? printk+0x67/0x70
 [<ffffffff803be042>] ? account+0xc2/0x100
 [<ffffffff803be1ee>] ? extract_entropy+0x7e/0xa0
 [<ffffffff8024063f>] do_exit+0x8bf/0x8d0
 [<ffffffff803be22b>] ? get_random_bytes+0x1b/0x20
 [<ffffffff8020cbd5>] oops_end+0x85/0x90
 [<ffffffff8020d84e>] die+0x5e/0x90
 [<ffffffff8020da30>] do_trap+0x130/0x150
 [<ffffffff8020e84c>] do_invalid_op+0x9c/0xc0
 [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
 [<ffffffff80311490>] ? sysfs_ilookup_test+0x0/0x20
 [<ffffffff804fede6>] ? _spin_unlock+0x26/0x30
 [<ffffffff803e9927>] ? attribute_container_device_trigger+0x27/0xd0
 [<ffffffff804ff54d>] error_exit+0x0/0xa9
 [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
 [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff802b5c16>] kfree+0xd6/0x160
 [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
 [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
 [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
 [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
 [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
 [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
 [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
 [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
 [<ffffffff803e6c36>] __driver_attach+0x86/0x90
 [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
 [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
 [<ffffffff803e68ec>] driver_attach+0x1c/0x20
 [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
 [<ffffffff803e6e0f>] driver_register+0x5f/0x140
 [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
 [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
 [<ffffffff806f00d9>] ahci_init+0x19/0x20
 [<ffffffff806c8a48>] kernel_init+0x128/0x310
 [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
 [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
 [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
 [<ffffffff8020c6d9>] child_rip+0xa/0x11
 [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
 [<ffffffff806c8920>] ? kernel_init+0x0/0x310
 [<ffffffff8020c6cf>] ? child_rip+0x0/0x11

------------[ cut here ]------------
WARNING: at /home/rafael/src/linux-next/kernel/smp.c:288 smp_call_function_mask+0x198/0x1a0()
Modules linked in:
Pid: 1, comm: swapper Tainted: G      D   2.6.26-next #51

Call Trace:
 [<ffffffff8023beff>] warn_on_slowpath+0x5f/0x80
 [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
 [<ffffffff802577f9>] ? up+0x19/0x50
 [<ffffffff8023d0f7>] ? printk+0x67/0x70
 [<ffffffff80267f48>] smp_call_function_mask+0x198/0x1a0
 [<ffffffff8020d063>] ? dump_trace+0x373/0x400
 [<ffffffff8020d14e>] ? show_trace+0x5e/0x80
 [<ffffffff80267f6b>] smp_call_function+0x1b/0x20
 [<ffffffff8021c4f0>] native_smp_send_stop+0x30/0x60
 [<ffffffff8023c24d>] panic+0xad/0x190
 [<ffffffff802577f9>] ? up+0x19/0x50
 [<ffffffff8023d0f7>] ? printk+0x67/0x70
 [<ffffffff803be042>] ? account+0xc2/0x100
 [<ffffffff803be1ee>] ? extract_entropy+0x7e/0xa0
 [<ffffffff8024063f>] do_exit+0x8bf/0x8d0
 [<ffffffff803be22b>] ? get_random_bytes+0x1b/0x20
 [<ffffffff8020cbd5>] oops_end+0x85/0x90
 [<ffffffff8020d84e>] die+0x5e/0x90
 [<ffffffff8020da30>] do_trap+0x130/0x150
 [<ffffffff8020e84c>] do_invalid_op+0x9c/0xc0
 [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
 [<ffffffff80311490>] ? sysfs_ilookup_test+0x0/0x20
 [<ffffffff804fede6>] ? _spin_unlock+0x26/0x30
 [<ffffffff803e9927>] ? attribute_container_device_trigger+0x27/0xd0
 [<ffffffff804ff54d>] error_exit+0x0/0xa9
 [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
 [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff802b5c16>] kfree+0xd6/0x160
 [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
 [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
 [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
 [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
 [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
 [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
 [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
 [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
 [<ffffffff803e6c36>] __driver_attach+0x86/0x90
 [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
 [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
 [<ffffffff803e68ec>] driver_attach+0x1c/0x20
 [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
 [<ffffffff803e6e0f>] driver_register+0x5f/0x140
 [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
 [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
 [<ffffffff806f00d9>] ahci_init+0x19/0x20
 [<ffffffff806c8a48>] kernel_init+0x128/0x310
 [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
 [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
 [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
 [<ffffffff8020c6d9>] child_rip+0xa/0x11
 [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
 [<ffffffff806c8920>] ? kernel_init+0x0/0x310
 [<ffffffff8020c6cf>] ? child_rip+0x0/0x11

---[ end trace c69efc8b7b1131cd ]---
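
Several register values in the dump above consist almost entirely of printable bytes. The following is a minimal sketch for reading such a value as text, assuming the usual x86-64 little-endian byte order. The helper name `register_as_ascii` is hypothetical, and what the decoded text means for this crash is not established anywhere in the thread.

```python
def register_as_ascii(value_hex: str) -> str:
    """Interpret a 64-bit register value as 8 little-endian bytes of text."""
    raw = int(value_hex, 16).to_bytes(8, byteorder="little")
    # Replace non-printable bytes with '.' so the result is safe to print.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Values copied from the register dump in the report above.
registers = {
    "RCX": "204a483235324448",
    "R09": "2222222222222222",
    "R13": "09f911029d74e35b",
}

for name, value in registers.items():
    print(f"{name}: {register_as_ascii(value)!r}")
```

RCX decodes to a string that looks like a device model name, which may hint that device identification data ended up where the slab debug check did not expect it. That reading is a guess and would need to be confirmed against the full dmesg.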


* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-16 22:53 ` linux-next: Tree for July 16 (crash on quad core AMD) Rafael J. Wysocki
@ 2008-07-16 23:01   ` James Bottomley
       [not found]     ` <1216249292.3358.66.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: James Bottomley @ 2008-07-16 23:01 UTC
  To: Rafael J. Wysocki
  Cc: Stephen Rothwell, linux-next, LKML, Andrew Morton,
	Kernel Testers List, scsi, Jens Axboe, linux-ide

On Thu, 2008-07-17 at 00:53 +0200, Rafael J. Wysocki wrote:
> On Wednesday, 16 of July 2008, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Changes since next-20080715:
> > 
> > Temporarily dropped tree: ttydev (it gets too many patch failures).
> > 
> > Most of the differences were conflicts moving from tree to tree as some
> > of the trees are now merged into Linus' tree.  Most have been inflicted
> > on the driver-core and usb trees.  I have not notified these separately.
> > 
> > Because of the moving of conflicts around it is difficult to tell when
> > they are going away (though I assume some are).
> > 
> > The usb.current tree lost its conflict against Linus' tree.
> > 
> > The usb tree gained a conflict against Linus' tree.
> > 
> > The cpus4096 tree gained a conflict against Linus' tree.
> > 
> > The pci tree lost one of its build fix patches.
> > 
> > The i2c tree lost a conflict against Linus' tree.
> > 
> > The ide tree gained a lot of conflicts against Linus' tree because part
> > of it was merged into Linus' tree but the remaining part modified many of
> > the same files further.
> > 
> > The acpi tree gained a conflict against Linus' tree.
> > 
> > The net tree gained a conflict against each of Linus' tree and the
> > powerpc tree.
> > 
> > The sparc tree gained conflicts against Linus' tree and the ide and
> > sparc-current trees.
> > 
> > The rr tree gained a conflict against the net tree.
> > 
> > The semaphore tree gained a conflict against the sparc tree.
> > 
> > The generic-ipi tree gained a conflict against the powerpc tree.
> > 
> > The ttydev series had many patches fail to apply, so it was dropped for
> > today.
> > 
> > I have also applied the following patches for known problems:
> > 
> > 	sparc64: sysdev API change fallout
> > 
> > Patches no longer necessary:
> > 
> > 	s390: fix compile error due to smp_call_function
> > 	powerpc: mman.h export fixups
> > 	powerpc/stacktrace: EXPORT SYMBOL_GPL needs module.h
> > 
> > This tree fails to build for ARCH=sparc (i.e. 32bit) with a 64bit gcc
> > v3.4.5 - it tries to use the 64bit header files.  This may be an artifact
> > of one of my merge fixups, but I don't actually think so.
> > 
> > ----------------------------------------------------------------------------
> 
> Crashes during boot on a box with Phenom X4 and AMD 790-based mainboard.
> AFAICS, Linus' tree is unaffected and linux-next from yesterday was fine
> on the same box with the same .config.

OK, that means that all the current SCSI patches were merged and it was
still OK (they're all in linus and I haven't put together the next slice
yet).  I'd suspect something in drivers/ata (cc ide list added).

James


> Full dmesg: http://www.sisk.pl/kernel/debug/next/20080716/crash-M3A32-MVP.log
> Kernel config: http://www.sisk.pl/kernel/debug/next/20080716/M3A32-MVP-config
> 
> scsi scan: INQUIRY result too short (5), using 36
> scsi 2:0:0:0: Direct-Access                                    PQ: 0 ANSI: 0
> ------------[ cut here ]------------
> kernel BUG at /home/rafael/src/linux-next/mm/slab.c:2822!
> invalid opcode: 0000 [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.26-next #51
> RIP: 0010:[<ffffffff802b59e8>]  [<ffffffff802b59e8>] cache_free_debugcheck+0x288/0x2b0
> RSP: 0018:ffff880127c81880  EFLAGS: 00010016
> RAX: 00da9803898590c8 RBX: ffff880127c01880 RCX: 204a483235324448
> RDX: 0000000000da9803 RSI: ffff880124488810 RDI: ffff880127c01880
> RBP: ffff880127c818b0 R08: 0000000000000058 R09: 2222222222222222
> R10: 2222222222222222 R11: 2222222222222222 R12: ffff880124488810
> R13: 09f911029d74e35b R14: 09f911029d74e35b R15: ffff880124488000
> FS:  0000000000000000(0000) GS:ffffffff806b7f40(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 1, threadinfo ffff880127c80000, task ffff880127c7e040)
> Stack:  ffffffff803f9bcb ffff880127c01880 ffff880127c0bad0 ffff880124488818
>  0000000000000282 ffff880124460000 ffff880127c818e0 ffffffff802b5c16
>  0000000000000000 0000000000000002 ffffffff80627d9b 0000000000000000
> Call Trace:
>  [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
>  [<ffffffff802b5c16>] kfree+0xd6/0x160
>  [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
>  [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
>  [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
>  [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
>  [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
>  [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
>  [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
>  [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
>  [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
>  [<ffffffff803e6c36>] __driver_attach+0x86/0x90
>  [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
>  [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
>  [<ffffffff803e68ec>] driver_attach+0x1c/0x20
>  [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
>  [<ffffffff803e6e0f>] driver_register+0x5f/0x140
>  [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
>  [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
>  [<ffffffff806f00d9>] ahci_init+0x19/0x20
>  [<ffffffff806c8a48>] kernel_init+0x128/0x310
>  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
>  [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
>  [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
>  [<ffffffff8020c6d9>] child_rip+0xa/0x11
>  [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
>  [<ffffffff806c8920>] ? kernel_init+0x0/0x310
>  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> 
> 
> Code: 48 8b 40 10 48 8b 08 f6 c5 40 0f 84 06 fe ff ff 48 8b 40 10 48 8b 08 e9 fa fd ff ff 0f 1f 80 00 00 00 00 0f 0b eb fe 0f 0b eb fe <0f> 0b eb fe 0f 1f 40 00 48
> RIP  [<ffffffff802b59e8>] cache_free_debugcheck+0x288/0x2b0
>  RSP <ffff880127c81880>
> ---[ end trace c69efc8b7b1131cd ]---
> Kernel panic - not syncing: Attempted to kill init!
> Pid: 1, comm: swapper Tainted: G      D   2.6.26-next #51
> 
> Call Trace:
>  [<ffffffff8023c240>] panic+0xa0/0x190
>  [<ffffffff802577f9>] ? up+0x19/0x50
>  [<ffffffff8023d0f7>] ? printk+0x67/0x70
>  [<ffffffff803be042>] ? account+0xc2/0x100
>  [<ffffffff803be1ee>] ? extract_entropy+0x7e/0xa0
>  [<ffffffff8024063f>] do_exit+0x8bf/0x8d0
>  [<ffffffff803be22b>] ? get_random_bytes+0x1b/0x20
>  [<ffffffff8020cbd5>] oops_end+0x85/0x90
>  [<ffffffff8020d84e>] die+0x5e/0x90
>  [<ffffffff8020da30>] do_trap+0x130/0x150
>  [<ffffffff8020e84c>] do_invalid_op+0x9c/0xc0
>  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
>  [<ffffffff80311490>] ? sysfs_ilookup_test+0x0/0x20
>  [<ffffffff804fede6>] ? _spin_unlock+0x26/0x30
>  [<ffffffff803e9927>] ? attribute_container_device_trigger+0x27/0xd0
>  [<ffffffff804ff54d>] error_exit+0x0/0xa9
>  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
>  [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
>  [<ffffffff802b5c16>] kfree+0xd6/0x160
>  [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
>  [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
>  [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
>  [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
>  [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
>  [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
>  [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
>  [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
>  [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
>  [<ffffffff803e6c36>] __driver_attach+0x86/0x90
>  [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
>  [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
>  [<ffffffff803e68ec>] driver_attach+0x1c/0x20
>  [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
>  [<ffffffff803e6e0f>] driver_register+0x5f/0x140
>  [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
>  [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
>  [<ffffffff806f00d9>] ahci_init+0x19/0x20
>  [<ffffffff806c8a48>] kernel_init+0x128/0x310
>  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
>  [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
>  [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
>  [<ffffffff8020c6d9>] child_rip+0xa/0x11
>  [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
>  [<ffffffff806c8920>] ? kernel_init+0x0/0x310
>  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> 
> ------------[ cut here ]------------
> WARNING: at /home/rafael/src/linux-next/kernel/smp.c:288 smp_call_function_mask+0x198/0x1a0()
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G      D   2.6.26-next #51
> 
> Call Trace:
>  [<ffffffff8023beff>] warn_on_slowpath+0x5f/0x80
>  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
>  [<ffffffff802577f9>] ? up+0x19/0x50
>  [<ffffffff8023d0f7>] ? printk+0x67/0x70
>  [<ffffffff80267f48>] smp_call_function_mask+0x198/0x1a0
>  [<ffffffff8020d063>] ? dump_trace+0x373/0x400
>  [<ffffffff8020d14e>] ? show_trace+0x5e/0x80
>  [<ffffffff80267f6b>] smp_call_function+0x1b/0x20
>  [<ffffffff8021c4f0>] native_smp_send_stop+0x30/0x60
>  [<ffffffff8023c24d>] panic+0xad/0x190
>  [<ffffffff802577f9>] ? up+0x19/0x50
>  [<ffffffff8023d0f7>] ? printk+0x67/0x70
>  [<ffffffff803be042>] ? account+0xc2/0x100
>  [<ffffffff803be1ee>] ? extract_entropy+0x7e/0xa0
>  [<ffffffff8024063f>] do_exit+0x8bf/0x8d0
>  [<ffffffff803be22b>] ? get_random_bytes+0x1b/0x20
>  [<ffffffff8020cbd5>] oops_end+0x85/0x90
>  [<ffffffff8020d84e>] die+0x5e/0x90
>  [<ffffffff8020da30>] do_trap+0x130/0x150
>  [<ffffffff8020e84c>] do_invalid_op+0x9c/0xc0
>  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
>  [<ffffffff80311490>] ? sysfs_ilookup_test+0x0/0x20
>  [<ffffffff804fede6>] ? _spin_unlock+0x26/0x30
>  [<ffffffff803e9927>] ? attribute_container_device_trigger+0x27/0xd0
>  [<ffffffff804ff54d>] error_exit+0x0/0xa9
>  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
>  [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
>  [<ffffffff802b5c16>] kfree+0xd6/0x160
>  [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
>  [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
>  [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
>  [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
>  [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
>  [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
>  [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
>  [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
>  [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
>  [<ffffffff803e6c36>] __driver_attach+0x86/0x90
>  [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
>  [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
>  [<ffffffff803e68ec>] driver_attach+0x1c/0x20
>  [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
>  [<ffffffff803e6e0f>] driver_register+0x5f/0x140
>  [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
>  [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
>  [<ffffffff806f00d9>] ahci_init+0x19/0x20
>  [<ffffffff806c8a48>] kernel_init+0x128/0x310
>  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
>  [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
>  [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
>  [<ffffffff8020c6d9>] child_rip+0xa/0x11
>  [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
>  [<ffffffff806c8920>] ? kernel_init+0x0/0x310
>  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> 
> ---[ end trace c69efc8b7b1131cd ]---



* Re: linux-next: Tree for July 16 (crash on quad core AMD)
       [not found]     ` <1216249292.3358.66.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2008-07-16 23:09       ` Rafael J. Wysocki
  2008-07-18 12:38         ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2008-07-16 23:09 UTC
  To: James Bottomley
  Cc: Stephen Rothwell, linux-next, LKML,
	Andrew Morton, Kernel Testers List, scsi, Jens Axboe, linux-ide,
	Jeff Garzik, Tejun Heo

On Thursday, 17 of July 2008, James Bottomley wrote:
> On Thu, 2008-07-17 at 00:53 +0200, Rafael J. Wysocki wrote:
> > On Wednesday, 16 of July 2008, Stephen Rothwell wrote:
> > > Hi all,
> > > 
> > > Changes since next-20080715:
> > > 
> > > Temporarily dropped tree: ttydev (it gets too many patch failures).
> > > 
> > > Most of the differences were conflicts moving from tree to tree as some
> > > of the trees are now merged into Linus' tree.  Most have been inflicted
> > > on the driver-core and usb trees.  I have not notified these separately.
> > > 
> > > Because of the moving of conflicts around it is difficult to tell when
> > > they are going away (though I assume some are).
> > > 
> > > The usb.current tree lost its conflict against Linus' tree.
> > > 
> > > The usb tree gained a conflict against Linus' tree.
> > > 
> > > The cpus4096 tree gained a conflict against Linus' tree.
> > > 
> > > The pci tree lost one of its build fix patches.
> > > 
> > > The i2c tree lost a conflict against Linus' tree.
> > > 
> > > The ide tree gained a lot of conflicts against Linus' tree because part
> > > of it was merged into Linus' tree but the remaining part modified many of
> > > the same files further.
> > > 
> > > The acpi tree gained a conflict against Linus' tree.
> > > 
> > > The net tree gained a conflict against each of Linus' tree and the
> > > powerpc tree.
> > > 
> > > The sparc tree gained conflicts against Linus' tree and the ide and
> > > sparc-current trees.
> > > 
> > > The rr tree gained a conflict against the net tree.
> > > 
> > > The semaphore tree gained a conflict against the sparc tree.
> > > 
> > > The generic-ipi tree gained a conflict against the powerpc tree.
> > > 
> > > The ttydev series had many patches fail to apply, so it was dropped for
> > > today.
> > > 
> > > I have also applied the following patches for known problems:
> > > 
> > > 	sparc64: sysdev API change fallout
> > > 
> > > Patches no longer necessary:
> > > 
> > > 	s390: fix compile error due to smp_call_function
> > > 	powerpc: mman.h export fixups
> > > 	powerpc/stacktrace: EXPORT SYMBOL_GPL needs module.h
> > > 
> > > This tree fails to build for ARCH=sparc (i.e. 32bit) with a 64bit gcc
> > > v3.4.5 - it tries to use the 64bit header files.  This may be an artifact
> > > of one of my merge fixups, but I don't actually think so.
> > > 
> > > ----------------------------------------------------------------------------
> > 
> > Crashes during boot on a box with Phenom X4 and AMD 790-based mainboard.
> > AFAICS, Linus' tree is unaffected and linux-next from yesterday was fine
> > on the same box with the same .config.
> 
> OK, that means that all the current SCSI patches were merged and it was
> still OK (they're all in linus and I haven't put together the next slice
> yet).  I'd suspect something in drivers/ata (cc ide list added).
> 
> James

OK, let's ask Tejun and Jeff too.

I suspect the AHCI driver, because the same kernel works on my other boxes
with different SATA controllers.
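
In the call trace quoted in this thread, frames marked with "?" are addresses the kernel found on the stack but could not confirm as part of the actual call chain. The following is a minimal sketch for filtering them out so that only the reliable path remains. The regular expression and the `parse_frames` helper are hypothetical, written for the trace format shown in this report and not taken from any kernel tool.

```python
import re

# Matches lines such as " [<ffffffff802b5c16>] kfree+0xd6/0x160",
# with an optional "? " marker before the symbol name.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_frames(text):
    """Return (symbol, is_reliable) pairs for each frame line in text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("unsure") is None))
    return frames


# First lines of the call trace from the report.
SAMPLE = """\
 [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff802b5c16>] kfree+0xd6/0x160
 [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
 [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
 [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
 [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
 [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
 [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
 [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
"""

reliable = [symbol for symbol, ok in parse_frames(SAMPLE) if ok]
print(" <- ".join(reliable))
```

Read this way, the reliable frames run from ahci_init_one through the libata host registration and SCSI scan code down to the kfree() call that trips the slab debug check. The trace by itself does not show which layer corrupted the object.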
 
> > Full dmesg: http://www.sisk.pl/kernel/debug/next/20080716/crash-M3A32-MVP.log
> > Kernel config: http://www.sisk.pl/kernel/debug/next/20080716/M3A32-MVP-config
> > 
> > scsi scan: INQUIRY result too short (5), using 36
> > scsi 2:0:0:0: Direct-Access                                    PQ: 0 ANSI: 0
> > ------------[ cut here ]------------
> > kernel BUG at /home/rafael/src/linux-next/mm/slab.c:2822!
> > invalid opcode: 0000 [1] SMP
> > last sysfs file:
> > CPU 0
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.26-next #51
> > RIP: 0010:[<ffffffff802b59e8>]  [<ffffffff802b59e8>] cache_free_debugcheck+0x288/0x2b0
> > RSP: 0018:ffff880127c81880  EFLAGS: 00010016
> > RAX: 00da9803898590c8 RBX: ffff880127c01880 RCX: 204a483235324448
> > RDX: 0000000000da9803 RSI: ffff880124488810 RDI: ffff880127c01880
> > RBP: ffff880127c818b0 R08: 0000000000000058 R09: 2222222222222222
> > R10: 2222222222222222 R11: 2222222222222222 R12: ffff880124488810
> > R13: 09f911029d74e35b R14: 09f911029d74e35b R15: ffff880124488000
> > FS:  0000000000000000(0000) GS:ffffffff806b7f40(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process swapper (pid: 1, threadinfo ffff880127c80000, task ffff880127c7e040)
> > Stack:  ffffffff803f9bcb ffff880127c01880 ffff880127c0bad0 ffff880124488818
> >  0000000000000282 ffff880124460000 ffff880127c818e0 ffffffff802b5c16
> >  0000000000000000 0000000000000002 ffffffff80627d9b 0000000000000000
> > Call Trace:
> >  [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
> >  [<ffffffff802b5c16>] kfree+0xd6/0x160
> >  [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
> >  [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
> >  [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
> >  [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
> >  [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
> >  [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
> >  [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
> >  [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
> >  [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
> >  [<ffffffff803e6c36>] __driver_attach+0x86/0x90
> >  [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
> >  [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
> >  [<ffffffff803e68ec>] driver_attach+0x1c/0x20
> >  [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
> >  [<ffffffff803e6e0f>] driver_register+0x5f/0x140
> >  [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
> >  [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
> >  [<ffffffff806f00d9>] ahci_init+0x19/0x20
> >  [<ffffffff806c8a48>] kernel_init+0x128/0x310
> >  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
> >  [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
> >  [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
> >  [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
> >  [<ffffffff8020c6d9>] child_rip+0xa/0x11
> >  [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
> >  [<ffffffff806c8920>] ? kernel_init+0x0/0x310
> >  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> > 
> > 
> > Code: 48 8b 40 10 48 8b 08 f6 c5 40 0f 84 06 fe ff ff 48 8b 40 10 48 8b 08 e9 fa fd ff ff 0f 1f 80 00 00 00 00 0f 0b eb fe 0f 0b eb fe <0f> 0b eb fe 0f 1f 40 00 48
> > RIP  [<ffffffff802b59e8>] cache_free_debugcheck+0x288/0x2b0
> >  RSP <ffff880127c81880>
> > ---[ end trace c69efc8b7b1131cd ]---
> > Kernel panic - not syncing: Attempted to kill init!
> > Pid: 1, comm: swapper Tainted: G      D   2.6.26-next #51
> > 
> > Call Trace:
> >  [<ffffffff8023c240>] panic+0xa0/0x190
> >  [<ffffffff802577f9>] ? up+0x19/0x50
> >  [<ffffffff8023d0f7>] ? printk+0x67/0x70
> >  [<ffffffff803be042>] ? account+0xc2/0x100
> >  [<ffffffff803be1ee>] ? extract_entropy+0x7e/0xa0
> >  [<ffffffff8024063f>] do_exit+0x8bf/0x8d0
> >  [<ffffffff803be22b>] ? get_random_bytes+0x1b/0x20
> >  [<ffffffff8020cbd5>] oops_end+0x85/0x90
> >  [<ffffffff8020d84e>] die+0x5e/0x90
> >  [<ffffffff8020da30>] do_trap+0x130/0x150
> >  [<ffffffff8020e84c>] do_invalid_op+0x9c/0xc0
> >  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
> >  [<ffffffff80311490>] ? sysfs_ilookup_test+0x0/0x20
> >  [<ffffffff804fede6>] ? _spin_unlock+0x26/0x30
> >  [<ffffffff803e9927>] ? attribute_container_device_trigger+0x27/0xd0
> >  [<ffffffff804ff54d>] error_exit+0x0/0xa9
> >  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
> >  [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
> >  [<ffffffff802b5c16>] kfree+0xd6/0x160
> >  [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
> >  [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
> >  [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
> >  [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
> >  [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
> >  [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
> >  [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
> >  [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
> >  [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
> >  [<ffffffff803e6c36>] __driver_attach+0x86/0x90
> >  [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
> >  [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
> >  [<ffffffff803e68ec>] driver_attach+0x1c/0x20
> >  [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
> >  [<ffffffff803e6e0f>] driver_register+0x5f/0x140
> >  [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
> >  [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
> >  [<ffffffff806f00d9>] ahci_init+0x19/0x20
> >  [<ffffffff806c8a48>] kernel_init+0x128/0x310
> >  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
> >  [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
> >  [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
> >  [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
> >  [<ffffffff8020c6d9>] child_rip+0xa/0x11
> >  [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
> >  [<ffffffff806c8920>] ? kernel_init+0x0/0x310
> >  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> > 
> > ------------[ cut here ]------------
> > WARNING: at /home/rafael/src/linux-next/kernel/smp.c:288 smp_call_function_mask+0x198/0x1a0()
> > Modules linked in:
> > Pid: 1, comm: swapper Tainted: G      D   2.6.26-next #51
> > 
> > Call Trace:
> >  [<ffffffff8023beff>] warn_on_slowpath+0x5f/0x80
> >  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> >  [<ffffffff802577f9>] ? up+0x19/0x50
> >  [<ffffffff8023d0f7>] ? printk+0x67/0x70
> >  [<ffffffff80267f48>] smp_call_function_mask+0x198/0x1a0
> >  [<ffffffff8020d063>] ? dump_trace+0x373/0x400
> >  [<ffffffff8020d14e>] ? show_trace+0x5e/0x80
> >  [<ffffffff80267f6b>] smp_call_function+0x1b/0x20
> >  [<ffffffff8021c4f0>] native_smp_send_stop+0x30/0x60
> >  [<ffffffff8023c24d>] panic+0xad/0x190
> >  [<ffffffff802577f9>] ? up+0x19/0x50
> >  [<ffffffff8023d0f7>] ? printk+0x67/0x70
> >  [<ffffffff803be042>] ? account+0xc2/0x100
> >  [<ffffffff803be1ee>] ? extract_entropy+0x7e/0xa0
> >  [<ffffffff8024063f>] do_exit+0x8bf/0x8d0
> >  [<ffffffff803be22b>] ? get_random_bytes+0x1b/0x20
> >  [<ffffffff8020cbd5>] oops_end+0x85/0x90
> >  [<ffffffff8020d84e>] die+0x5e/0x90
> >  [<ffffffff8020da30>] do_trap+0x130/0x150
> >  [<ffffffff8020e84c>] do_invalid_op+0x9c/0xc0
> >  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
> >  [<ffffffff80311490>] ? sysfs_ilookup_test+0x0/0x20
> >  [<ffffffff804fede6>] ? _spin_unlock+0x26/0x30
> >  [<ffffffff803e9927>] ? attribute_container_device_trigger+0x27/0xd0
> >  [<ffffffff804ff54d>] error_exit+0x0/0xa9
> >  [<ffffffff802b59e8>] ? cache_free_debugcheck+0x288/0x2b0
> >  [<ffffffff803f9bcb>] ? scsi_probe_and_add_lun+0x86b/0xc20
> >  [<ffffffff802b5c16>] kfree+0xd6/0x160
> >  [<ffffffff803f9bcb>] scsi_probe_and_add_lun+0x86b/0xc20
> >  [<ffffffff803fad4f>] __scsi_add_device+0xff/0x110
> >  [<ffffffff8040c99b>] ata_scsi_scan_host+0xdb/0x2b0
> >  [<ffffffff804094b3>] ata_host_register+0x243/0x2a0
> >  [<ffffffff8041b920>] ? ahci_interrupt+0x0/0x530
> >  [<ffffffff804095b4>] ata_host_activate+0xa4/0x110
> >  [<ffffffff8041b668>] ahci_init_one+0x9a8/0xc60
> >  [<ffffffff80378fb9>] pci_device_probe+0x79/0xa0
> >  [<ffffffff803e6aab>] driver_probe_device+0x9b/0x1a0
> >  [<ffffffff803e6c36>] __driver_attach+0x86/0x90
> >  [<ffffffff803e6bb0>] ? __driver_attach+0x0/0x90
> >  [<ffffffff803e5ffd>] bus_for_each_dev+0x5d/0x90
> >  [<ffffffff803e68ec>] driver_attach+0x1c/0x20
> >  [<ffffffff803e6525>] bus_add_driver+0xc5/0x250
> >  [<ffffffff803e6e0f>] driver_register+0x5f/0x140
> >  [<ffffffff8037927d>] __pci_register_driver+0x7d/0xc0
> >  [<ffffffff806f00c0>] ? ahci_init+0x0/0x20
> >  [<ffffffff806f00d9>] ahci_init+0x19/0x20
> >  [<ffffffff806c8a48>] kernel_init+0x128/0x310
> >  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
> >  [<ffffffff8026171d>] ? trace_hardirqs_on+0xd/0x10
> >  [<ffffffff804ff1eb>] ? _spin_unlock_irq+0x2b/0x40
> >  [<ffffffff804fea80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >  [<ffffffff8026167f>] ? trace_hardirqs_on_caller+0xbf/0x150
> >  [<ffffffff8020c6d9>] child_rip+0xa/0x11
> >  [<ffffffff8020bd0f>] ? restore_args+0x0/0x30
> >  [<ffffffff806c8920>] ? kernel_init+0x0/0x310
> >  [<ffffffff8020c6cf>] ? child_rip+0x0/0x11
> > 
> > ---[ end trace c69efc8b7b1131cd ]---
> 
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-16 23:09       ` Rafael J. Wysocki
@ 2008-07-18 12:38         ` Tejun Heo
       [not found]           ` <48808EE0.2060603-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2008-07-18 12:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: James Bottomley, Stephen Rothwell, linux-next, LKML,
	Andrew Morton, Kernel Testers List, scsi, Jens Axboe, linux-ide,
	Jeff Garzik, Takashi Iwai, tino.keitel, drzeus

Hello,

Rafael J. Wysocki wrote:
>>> Crashes during boot on a box with Phenom X4 and AMD 790-based mainboard.
>>> AFAICS, the Linus' tree is unaffected and linux-next from yesterday was fine
>>> on the same box with the same .config.
>> OK, that means that all the current SCSI patches were merged and it was
>> still OK (they're all in linus and I haven't put together the next slice
>> yet).  I'd suspect something in drivers/ata (cc ide list added).
>>
>> James
> 
> OK, let's ask Tejun and Jeff too.
> 
> I suspect the AHCI driver, because the same kernel works on my other boxes
> with different SATA controllers.
>  
>>> Full dmesg: http://www.sisk.pl/kernel/debug/next/20080716/crash-M3A32-MVP.log
>>> Kernel config: http://www.sisk.pl/kernel/debug/next/20080716/M3A32-MVP-config
>>>
>>> scsi scan: INQUIRY result too short (5), using 36
>>> scsi 2:0:0:0: Direct-Access                                    PQ: 0 ANSI: 0
>>> ------------[ cut here ]------------
>>> kernel BUG at /home/rafael/src/linux-next/mm/slab.c:2822!

The offending commit is 83e7d317cef3ee624886f128401a72e414c0a99d,
which implements the sg iterator but forgets to add the offset to the
kmapped address, so the copy goes out of bounds.  Takashi, this could
also be the problem you were seeing if you don't have slab debugging
turned on.
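
To make the failure mode concrete, here is a minimal userspace sketch,
not kernel code.  `model_sg`, `write_ignoring_offset()` and
`write_with_offset()` are made-up names, and a 16-byte array stands in
for a kmapped page.  Under those assumptions it only illustrates why
dropping the in-page offset makes a copy land on bytes that belong to
someone else.

```c
#include <assert.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16u	/* tiny stand-in for PAGE_SIZE */

/* Hypothetical model of one sg entry: backing page, offset, length. */
struct model_sg {
	unsigned char *page;	/* start of the "mapped" page */
	unsigned int offset;	/* where this entry's data starts */
	unsigned int length;	/* number of data bytes */
};

/* Buggy variant: writes at the page base and ignores sg->offset. */
static void write_ignoring_offset(const struct model_sg *sg,
				  const unsigned char *src)
{
	memcpy(sg->page, src, sg->length);
}

/* Fixed variant: the mapped address is advanced by the offset first. */
static void write_with_offset(const struct model_sg *sg,
			      const unsigned char *src)
{
	memcpy(sg->page + sg->offset, src, sg->length);
}
```

With an entry whose data lives at offset 8 of the page, the first
helper overwrites bytes 0-3 of the page instead, which is the kind of
stray write that slab debugging catches.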

The iterator as implemented wasn't too pretty and was awkward to use:
it involves a callback, and the end condition check is split between
the callback and the outer user who runs the loop.  For copying, the
end of buffer was detected by checking whether the callback returned 0
copied bytes for the iteration, but AFAIK there's no restriction
against a zero-length sg entry in the middle, and such an entry would
terminate the copying prematurely.
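
The zero-length problem can be sketched in plain userspace C.  This is
only a model: `model_ent`, `model_step()` and the two copy loops are
hypothetical names, not anything from lib/scatterlist.c.  The first
loop treats "0 bytes processed" as end of list, the second tracks the
end explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an sg entry: just data and a length. */
struct model_ent {
	const char *data;
	size_t length;
};

/* Process one entry; returns bytes copied (0 for an empty entry). */
static size_t model_step(const struct model_ent *ents, size_t nents,
			 size_t *pos, char *dst)
{
	size_t n;

	if (*pos == nents)
		return 0;		/* end of list */

	n = ents[*pos].length;
	memcpy(dst, ents[*pos].data, n);
	(*pos)++;
	return n;			/* also 0 for an empty entry */
}

/* Ambiguous: a return of 0 is taken to mean "end of list". */
static size_t copy_until_zero(const struct model_ent *ents, size_t nents,
			      char *dst)
{
	size_t pos = 0, total = 0, n;

	while ((n = model_step(ents, nents, &pos, dst + total)) != 0)
		total += n;
	return total;
}

/* Explicit end check: empty entries are simply skipped over. */
static size_t copy_until_end(const struct model_ent *ents, size_t nents,
			     char *dst)
{
	size_t pos = 0, total = 0;

	while (pos < nents)
		total += model_step(ents, nents, &pos, dst + total);
	return total;
}
```

For a list such as "ab", "", "cd", the first loop stops after two
bytes while the second copies all four.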

So, I implemented a slightly different version, which follows below.

Subject: [PATCH next-20080716] sg: reimplement sg mapping iterator

The sg mapping iterator implemented by 83e7d317... had a bug and was a
bit awkward to use.  Reimplement it as a more regular iterator with
start, next and stop.  As there's already an sg iterator which iterates
over sg entries themselves, name this sg_mapping_iterator.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pierre Ossman <drzeus@drzeus.cx>
---
 include/linux/scatterlist.h |   43 +++++++---
 lib/scatterlist.c           |  186 ++++++++++++++++++++++----------------------
 2 files changed, 125 insertions(+), 104 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 93411a1..c9058b0 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -13,12 +13,6 @@ struct sg_table {
 	unsigned int orig_nents;	/* original size of list */
 };
 
-struct sg_iterator {
-	struct scatterlist *sg;		/* current entry */
-	unsigned int nents;		/* number of remaining entries */
-	unsigned int offset;		/* offset within sg */
-};
-
 /*
  * Notes on SG table design.
  *
@@ -219,12 +213,6 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int, gfp_t,
 		     sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 
-typedef size_t (sg_iter_fn)(void *, size_t, struct page *, void *);
-
-void sg_iter_init(struct sg_iterator *iter, struct scatterlist *sgl,
-		  unsigned int nents);
-size_t sg_iterate(struct sg_iterator *iter, sg_iter_fn *fn, void *priv);
-
 size_t sg_copy_from_buffer(struct scatterlist *sgl, unsigned int nents,
 			   void *buf, size_t buflen);
 size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents,
@@ -236,4 +224,35 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents,
  */
 #define SG_MAX_SINGLE_ALLOC		(PAGE_SIZE / sizeof(struct scatterlist))
 
+
+/*
+ * Mapping sg iterator
+ *
+ * Iterates over sg entries mapping page-by-page.  On each successful
+ * iteration, @miter->page points to the mapped page and
+ * @miter->length bytes of data can be accessed at @miter->addr.  As
+ * long as an iteration is enclosed between start and stop, the user
+ * is free to choose control structure and when to stop.
+ */
+
+#define SG_MITER_ATOMIC		(1 << 0)	 /* use kmap_atomic */
+
+struct sg_mapping_iter {
+	/* the following three fields can be accessed directly */
+	struct page		*page;		/* currently mapped page */
+	void			*addr;		/* pointer to the mapped area */
+	size_t			length;		/* length of the mapped area */
+
+	/* these are internal states, keep away */
+	struct scatterlist	*__sg;		/* current entry */
+	unsigned int		__nents;	/* nr of remaining entries */
+	unsigned int		__offset;	/* offset within sg */
+	unsigned int		__flags;
+};
+
+void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl,
+		    unsigned int nents, unsigned int flags);
+bool sg_miter_next(struct sg_mapping_iter *miter);
+void sg_miter_stop(struct sg_mapping_iter *miter);
+
 #endif /* _LINUX_SCATTERLIST_H */
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index ea8c3a1..6d51bd3 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -295,115 +295,107 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_iter_init - Initialize/reset an sg iterator structure
- * @iter:	The sg iterator structure
- * @sgl:	The sg list to iterate over
- * @nents:	Number of sg entries
+ * sg_miter_start - start mapping iteration over a sg list
+ * @miter: sg mapping iter to be started
+ * @sgl: sg list to iterate over
+ * @nents: number of sg entries
  *
- *  Description:
- *    Sets up the internal state of the sg iterator structure
- *    to the beginning of the given sg list.
+ * Description:
+ *   Starts mapping iterator @miter.
+ *
+ * Context:
+ *   Don't care.
  */
-void sg_iter_init(struct sg_iterator *iter, struct scatterlist *sgl,
-		  unsigned int nents)
+void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl,
+		    unsigned int nents, unsigned int flags)
 {
-	memset(iter, 0, sizeof(struct sg_iterator));
+	memset(miter, 0, sizeof(struct sg_mapping_iter));
 
-	iter->sg = sgl;
-	iter->nents = nents;
-	iter->offset = 0;
+	miter->__sg = sgl;
+	miter->__nents = nents;
+	miter->__offset = 0;
+	miter->__flags = flags;
 }
-EXPORT_SYMBOL(sg_iter_init);
+EXPORT_SYMBOL(sg_miter_start);
 
 /**
- * sg_iterate - Process the next chunk of the sg list
- * @iter:	The sg iterator structure
- * @fn:		Function that will process the data
- * @priv:	Private data passed on to @fn@
+ * sg_miter_next - proceed mapping iterator to the next mapping
+ * @miter: sg mapping iter to proceed
  *
- *  Description:
- *    The main worker of sg list iteration. Each invokation will
- *    call @fn@ with a chunk of data that has been mapped into
- *    kernel reachable virtual memory. The function must return
- *    the number of bytes processed, which may be less than the
- *    total size of the current chunk.
+ * Description:
+ *   Proceeds @miter@ to the next mapping.  @miter@ should have been
+ *   started using sg_miter_start().  On successful return,
+ *   @miter@->page, @miter@->addr and @miter@->length point to the
+ *   current mapping.
  *
- *    It is not required that @fn@ and @priv@ are identical between
- *    each invokation, allowing separate processing of different
- *    sections of the sg list.
+ * Context:
+ *   IRQ disabled if SG_MITER_ATOMIC.  IRQ must stay disabled till
+ *   @miter@ is stopped.  May sleep if !SG_MITER_ATOMIC.
  *
- *    The return value is that given by @fn@, or 0 if the end of
- *    the sg list has been reached.
+ * Returns:
+ *   true if @miter contains the next mapping.  false if end of sg
+ *   list is reached.
  */
-size_t sg_iterate(struct sg_iterator *iter, sg_iter_fn *fn, void *priv)
+bool sg_miter_next(struct sg_mapping_iter *miter)
 {
-	struct page *page;
-	int n;
-	size_t buflen;
-	unsigned int sg_offset, sg_remain;
-
-	void *buf;
-	size_t result;
-
-	WARN_ON(!irqs_disabled());
+	unsigned int off, len;
 
-	if (iter->nents == 0)
-		return 0;
+	/* check for end and drop resources from the last iteration */
+	if (!miter->__nents)
+		return false;
 
-	sg_offset = iter->sg->offset + iter->offset;
-	sg_remain = iter->sg->length - iter->offset;
+	sg_miter_stop(miter);
 
-	n = sg_offset / PAGE_SIZE;
-	page = nth_page(sg_page(iter->sg), n);
+	/* map the next page */
+	off = miter->__sg->offset + miter->__offset;
+	len = miter->__sg->length - miter->__offset;
 
-	buflen = PAGE_SIZE - (sg_offset % PAGE_SIZE);
-	if (buflen > sg_remain)
-		buflen = sg_remain;
+	miter->page = nth_page(sg_page(miter->__sg), off >> PAGE_SHIFT);
+	off &= ~PAGE_MASK;
+	miter->length = min_t(unsigned int, len, PAGE_SIZE - off);
 
-	buf = kmap_atomic(page, KM_BIO_SRC_IRQ);
-	result = fn(buf, buflen, page, priv);
-	kunmap_atomic(buf, KM_BIO_SRC_IRQ);
-
-	WARN_ON(result > buflen);
-
-	iter->offset += result;
+	if (miter->__flags & SG_MITER_ATOMIC)
+		miter->addr = kmap_atomic(miter->page, KM_BIO_SRC_IRQ) + off;
+	else
+		miter->addr = kmap(miter->page) + off;
 
-	if (iter->offset == iter->sg->length) {
-		iter->nents--;
-		if (iter->nents)
-			iter->sg = sg_next(iter->sg);
-		iter->offset = 0;
+	/* proceed the iterator */
+	miter->__offset += miter->length;
+	if (miter->__offset == miter->__sg->length && --miter->__nents) {
+		miter->__sg = sg_next(miter->__sg);
+		miter->__offset = 0;
 	}
 
-	return result;
+	return true;
 }
-EXPORT_SYMBOL(sg_iterate);
+EXPORT_SYMBOL(sg_miter_next);
 
-struct sg_copy_state {
-	void *buf;
-	size_t offset, buflen;
-	int to_buffer;
-};
-
-static size_t sg_copy_worker(void *buf, size_t buflen,
-			     struct page *page, void *priv)
+/**
+ * sg_miter_stop - stop mapping iteration
+ * @miter: sg mapping iter to be stopped
+ *
+ * Description:
+ *   Stops mapping iterator @miter.
+ *
+ * Context:
+ *   IRQ disabled if the SG_MITER_ATOMIC is set.  Don't care otherwise.
+ */
+void sg_miter_stop(struct sg_mapping_iter *miter)
 {
-	struct sg_copy_state *st = priv;
-
-	if (buflen > (st->buflen - st->offset))
-		buflen = st->buflen - st->offset;
+	/* drop resources from the last iteration */
+	if (miter->addr) {
+		if (miter->__flags & SG_MITER_ATOMIC) {
+			WARN_ON(!irqs_disabled());
+			kunmap_atomic(miter->addr, KM_BIO_SRC_IRQ);
+		} else
+			kunmap(miter->page);
 
-	if (st->to_buffer)
-		memcpy(st->buf + st->offset, buf, buflen);
-	else {
-		memcpy(buf, st->buf + st->offset, buflen);
-		flush_kernel_dcache_page(page);
+		miter->page = NULL;
+		miter->addr = NULL;
+		miter->length = 0;
 	}
-
-	st->offset += buflen;
-
-	return buflen;
 }
+EXPORT_SYMBOL(sg_miter_stop);
 
 /**
  * sg_copy_buffer - Copy data between a linear buffer and an SG list
@@ -420,19 +412,29 @@ static size_t sg_copy_worker(void *buf, size_t buflen,
 static size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents,
 			     void *buf, size_t buflen, int to_buffer)
 {
-	struct sg_iterator iter;
-	struct sg_copy_state state;
+	unsigned int offset = 0;
+	struct sg_mapping_iter miter;
+
+	sg_miter_start(&miter, sgl, nents, SG_MITER_ATOMIC);
+
+	while (sg_miter_next(&miter) && offset < buflen) {
+		unsigned int len;
 
-	sg_iter_init(&iter, sgl, nents);
+		len = min(miter.length, buflen - offset);
 
-	state.buf = buf;
-	state.offset = 0;
-	state.buflen = buflen;
-	state.to_buffer = to_buffer;
+		if (to_buffer)
+			memcpy(buf + offset, miter.addr, len);
+		else {
+			memcpy(miter.addr, buf + offset, len);
+			flush_kernel_dcache_page(miter.page);
+		}
+
+		offset += len;
+	}
 
-	while (sg_iterate(&iter, sg_copy_worker, &state));
+	sg_miter_stop(&miter);
 
-	return state.offset;
+	return offset;
 }
 
 /**

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
       [not found]           ` <48808EE0.2060603-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2008-07-18 22:47             ` Pierre Ossman
  2008-07-19  0:59               ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Ossman @ 2008-07-18 22:47 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell,
	linux-next-u79uwXL29TY76Z2rM5mHXA, LKML, Andrew Morton,
	Kernel Testers List, scsi, Jens Axboe, linux-ide, Jeff Garzik,
	Takashi Iwai, tino.keitel-Mmb7MZpHnFY

[-- Attachment #1: Type: text/plain, Size: 1718 bytes --]

On Fri, 18 Jul 2008 21:38:56 +0900
Tejun Heo <htejun-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> 
> The offending commit was 83e7d317cef3ee624886f128401a72e414c0a99d
> which implements sg iterator but it forgot to add offset to the
> kmapped address and copy goes out of bounds.  Takashi, this could also
> be the problem you were seeing if you don't have slab debugging turned
> on.
> 

Oops, sorry. This thing wasn't supposed to go out and mess with anyone
else's tree quite yet. I'll make sure to clean out my -next tree right
away.

> The implemented iterator didn't look too pretty and the usage was
> quite awkward involving a callback and end condition check distributed
> between the callback and the outer user who runs the loop.  For
> copying, end of buffer condition was tested by testing whether the
> callback returned 0 copied bytes for the iteration but AFAIK there's
> no restriction against zero length sg entry in the middle and it will
> terminate the copying prematurely.
> 
> So, I implemented slightly different version which follows below.
> 

I just have one objection to your version, and that is that it cannot
be used to nibble away at the sg list. The _next() call jumps an entire
page, whereas you sometimes need to consume that page in two different
sweeps. This could be handled by some external buffer that keeps the
remainder of the page, but the point of these functions was to keep
things simple for the callers.

Rgds
-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-18 22:47             ` Pierre Ossman
@ 2008-07-19  0:59               ` Tejun Heo
  2008-07-19  1:30                 ` Pierre Ossman
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2008-07-19  0:59 UTC (permalink / raw)
  To: Pierre Ossman
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell, linux-next,
	LKML, Andrew Morton, Kernel Testers List, scsi, Jens Axboe,
	linux-ide, Jeff Garzik, Takashi Iwai, tino.keitel

Pierre Ossman wrote:
> I just have one objection to your version, and that is that it cannot
> be used to nibble away at the sg list. The _next() call jumps an entire
> page, whereas you sometimes need to consume that page in two different
> sweeps. This could be handled by some external buffer that keeps the
> remainder of the page, but the point of these functions was to keep
> things simple for the callers.

Well, I don't know how often such usage would be necessary.  If it's a
very common op, you can add a param to the next function, but frankly I
think it's better to build an inner control structure for that.  There's
no need for an external buffer; an inner loop is sufficient.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-19  0:59               ` Tejun Heo
@ 2008-07-19  1:30                 ` Pierre Ossman
       [not found]                   ` <20080719033050.552f9b49-OhHrUh4vRMSnewYJFaQfwJ5kstrrjoWp@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Ossman @ 2008-07-19  1:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell, linux-next,
	LKML, Andrew Morton, Kernel Testers List, scsi, Jens Axboe,
	linux-ide, Jeff Garzik, Takashi Iwai, tino.keitel

[-- Attachment #1: Type: text/plain, Size: 1422 bytes --]

On Sat, 19 Jul 2008 09:59:11 +0900
Tejun Heo <htejun@gmail.com> wrote:

> Pierre Ossman wrote:
> > I just have one objection to your version, and that is that it cannot
> > be used to nibble away at the sg list. The _next() call jumps an entire
> > page, whereas you sometimes need to consume that page in two different
> > sweeps. This could be handled by some external buffer that keeps the
> > remainder of the page, but the point of these functions was to keep
> > things simple for the callers.
> 
> Well, I don't know how often such usages would be necessary.  If it's a
> very common ops, you can add a param to the next function but frankly I
> think it's better to build a inside control structure for that.  There's
> no need for external buffer, just an inner loop is sufficient.
> 

I'm not sure how this can be solved by an inner loop. My primary use
case is:

1. Wait for interrupt
2. Write n bytes
3. goto 1

n has no guarantee of being aligned to any page boundaries, so state
needs to be kept between each invocation of writing a chunk of data. I
doubt I'm alone in this use pattern (in fact, most device drivers using
PIO should do something similar).

-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
       [not found]                   ` <20080719033050.552f9b49-OhHrUh4vRMSnewYJFaQfwJ5kstrrjoWp@public.gmane.org>
@ 2008-07-19  2:12                     ` Tejun Heo
  2008-07-19 12:07                       ` Pierre Ossman
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2008-07-19  2:12 UTC (permalink / raw)
  To: Pierre Ossman
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell,
	linux-next-u79uwXL29TY76Z2rM5mHXA, LKML, Andrew Morton,
	Kernel Testers List, scsi, Jens Axboe, linux-ide, Jeff Garzik,
	Takashi Iwai, tino.keitel-Mmb7MZpHnFY

Pierre Ossman wrote:
>> Well, I don't know how often such usages would be necessary.  If it's a
>> very common ops, you can add a param to the next function but frankly I
>> think it's better to build a inside control structure for that.  There's
>> no need for external buffer, just an inner loop is sufficient.
>>
> 
> I'm not sure how this can be solved by an inner loop. My primary use
> case is:
> 
> 1. Wait for interrupt
> 2. Write n bytes
> 3. goto 1
> 
> n has no guarantee of being aligned to any page boundaries, so state
> needs to be kept between each invokation of writing a chunk of data. I
> doubt I'm alone in this use pattern (in fact, most device drivers using
> PIO should do something similar).

Oh... I see.  How about adding sg_miter_consume(@miter, @bytes)?  If the
function is never called, the whole chunk is assumed to be consumed.  If
the function is called, only @bytes are consumed.
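
A rough userspace model of that idea, for the "wait for interrupt,
write n bytes" pattern.  Everything here is hypothetical
(`model_miter`, `model_next()`, `model_stop()`, `pio_write_chunk()`),
there is no kmap involved, and partial consumption is modelled with a
`consumed` field that the caller may lower rather than with a separate
consume function.  It is a sketch of the control flow only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct model_ent {
	const char *data;
	size_t length;
};

/* Hypothetical iterator state, kept across "interrupts". */
struct model_miter {
	const char *addr;		/* current chunk, NULL if stopped */
	size_t length;			/* bytes available at addr */
	size_t consumed;		/* bytes the caller actually used */
	const struct model_ent *ent;	/* current entry */
	size_t nents;			/* entries left, including ent */
	size_t offset;			/* bytes already used in ent */
};

static void model_start(struct model_miter *m,
			const struct model_ent *ents, size_t nents)
{
	memset(m, 0, sizeof(*m));
	m->ent = ents;
	m->nents = nents;
}

/* Drop the current chunk, remembering only what was consumed. */
static void model_stop(struct model_miter *m)
{
	if (m->addr) {
		m->offset += m->consumed;
		m->addr = NULL;
		m->length = 0;
		m->consumed = 0;
	}
}

static bool model_next(struct model_miter *m)
{
	model_stop(m);

	/* move past entries that are fully used up (or empty) */
	while (m->nents && m->offset == m->ent->length) {
		m->ent++;
		m->nents--;
		m->offset = 0;
	}
	if (!m->nents)
		return false;

	m->addr = m->ent->data + m->offset;
	m->length = m->ent->length - m->offset;
	m->consumed = m->length;	/* default: whole chunk consumed */
	return true;
}

/* One "interrupt": write at most n bytes to out, return bytes written. */
static size_t pio_write_chunk(struct model_miter *m, char *out, size_t n)
{
	size_t done = 0;

	while (done < n && model_next(m)) {
		size_t len = m->length < n - done ? m->length : n - done;

		memcpy(out + done, m->addr, len);
		m->consumed = len;	/* may be less than m->length */
		done += len;
	}
	model_stop(m);			/* release until the next call */
	return done;
}
```

Calling pio_write_chunk() repeatedly with a small n walks the whole
list, with chunks free to straddle entry boundaries, and the only
state carried between calls is the iterator itself.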

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-19  2:12                     ` Tejun Heo
@ 2008-07-19 12:07                       ` Pierre Ossman
  2008-07-19 14:03                         ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Ossman @ 2008-07-19 12:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell, linux-next,
	LKML, Andrew Morton, Kernel Testers List, scsi, Jens Axboe,
	linux-ide, Jeff Garzik, Takashi Iwai, tino.keitel

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

On Sat, 19 Jul 2008 11:12:17 +0900
Tejun Heo <htejun@gmail.com> wrote:

> 
> Oh... I see.  How about adding sg_miter_consume(@miter, @bytes)?  If the
> function is never called, the whole chunk is assumed to be consumed.  If
> the function is called only @bytes are consumed.
> 

Sounds reasonable. Care to make an independent patch so that I can
continue my PIO hacking adventures? :)

Rgds
-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-19 12:07                       ` Pierre Ossman
@ 2008-07-19 14:03                         ` Tejun Heo
  2008-07-20 22:40                           ` Pierre Ossman
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2008-07-19 14:03 UTC (permalink / raw)
  To: Pierre Ossman
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell, linux-next,
	LKML, Andrew Morton, Kernel Testers List, scsi, Jens Axboe,
	linux-ide, Jeff Garzik, Takashi Iwai, tino.keitel

Pierre Ossman wrote:
> On Sat, 19 Jul 2008 11:12:17 +0900
> Tejun Heo <htejun@gmail.com> wrote:
> 
>> Oh... I see.  How about adding sg_miter_consume(@miter, @bytes)?  If the
>> function is never called, the whole chunk is assumed to be consumed.  If
>> the function is called only @bytes are consumed.
>>
> 
> Sounds reasonable. Care to make an independent patch so that I can
> continue my PIO hacking adventures? :)

Sure thing.  Here you go.  :-)

Subject: [PATCH 2.6.26] sg: reimplement sg mapping iterator

This is an alternative implementation of the sg content iterator
introduced by commit 83e7d317... from Pierre Ossman in next-20080716.
As there's already an sg iterator which iterates over sg entries
themselves, name this sg_mapping_iterator.

Slightly edited description from the original implementation follows.

Iteration over an sg list is not that trivial when you take into
account that memory pages might have to be mapped before being used.
Unfortunately, that means that some parts of the kernel restrict
themselves to directly accessible memory just to not have to deal with
the mess.

This patch adds a simple iterator system that allows any code to
easily traverse an sg list and not have to deal with all the details.
The user can decide to consume only part of an iteration.  Also,
iteration can be stopped and resumed later if releasing the kmap
between iteration steps is necessary.  These features are useful for
implementing piecemeal sg copying for interrupt-driven PIO, for example.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pierre Ossman <drzeus@drzeus.cx>
---
 include/linux/scatterlist.h |   38 +++++++++
 lib/scatterlist.c           |  176 ++++++++++++++++++++++++++++++++------------
 2 files changed, 168 insertions(+), 46 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 71fc813..e599698 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -224,4 +224,42 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents,
  */
 #define SG_MAX_SINGLE_ALLOC		(PAGE_SIZE / sizeof(struct scatterlist))
 
+
+/*
+ * Mapping sg iterator
+ *
+ * Iterates over sg entries mapping page-by-page.  On each successful
+ * iteration, @miter->page points to the mapped page and
+ * @miter->length bytes of data can be accessed at @miter->addr.  As
+ * long as an iteration is enclosed between start and stop, the user
+ * is free to choose control structure and when to stop.
+ *
+ * @miter->consumed is set to @miter->length on each iteration.  It
+ * can be adjusted if the user can't consume all the bytes in one go.
+ * Also, a stopped iteration can be resumed by calling next on it.
+ * This is useful when iteration needs to release all resources and
+ * continue later (e.g. at the next interrupt).
+ */
+
+#define SG_MITER_ATOMIC		(1 << 0)	 /* use kmap_atomic */
+
+struct sg_mapping_iter {
+	/* the following three fields can be accessed directly */
+	struct page		*page;		/* currently mapped page */
+	void			*addr;		/* pointer to the mapped area */
+	size_t			length;		/* length of the mapped area */
+	size_t			consumed;	/* number of consumed bytes */
+
+	/* these are internal states, keep away */
+	struct scatterlist	*__sg;		/* current entry */
+	unsigned int		__nents;	/* nr of remaining entries */
+	unsigned int		__offset;	/* offset within sg */
+	unsigned int		__flags;
+};
+
+void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl,
+		    unsigned int nents, unsigned int flags);
+bool sg_miter_next(struct sg_mapping_iter *miter);
+void sg_miter_stop(struct sg_mapping_iter *miter);
+
 #endif /* _LINUX_SCATTERLIST_H */
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index b80c211..876ba6d 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -295,6 +295,117 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
+ * sg_miter_start - start mapping iteration over a sg list
+ * @miter: sg mapping iter to be started
+ * @sgl: sg list to iterate over
+ * @nents: number of sg entries
+ *
+ * Description:
+ *   Starts mapping iterator @miter.
+ *
+ * Context:
+ *   Don't care.
+ */
+void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl,
+		    unsigned int nents, unsigned int flags)
+{
+	memset(miter, 0, sizeof(struct sg_mapping_iter));
+
+	miter->__sg = sgl;
+	miter->__nents = nents;
+	miter->__offset = 0;
+	miter->__flags = flags;
+}
+EXPORT_SYMBOL(sg_miter_start);
+
+/**
+ * sg_miter_next - proceed mapping iterator to the next mapping
+ * @miter: sg mapping iter to proceed
+ *
+ * Description:
+ *   Proceeds @miter@ to the next mapping.  @miter@ should have been
+ *   started using sg_miter_start().  On successful return,
+ *   @miter@->page, @miter@->addr and @miter@->length point to the
+ *   current mapping.
+ *
+ * Context:
+ *   IRQ disabled if SG_MITER_ATOMIC.  IRQ must stay disabled till
+ *   @miter@ is stopped.  May sleep if !SG_MITER_ATOMIC.
+ *
+ * Returns:
+ *   true if @miter contains the next mapping.  false if end of sg
+ *   list is reached.
+ */
+bool sg_miter_next(struct sg_mapping_iter *miter)
+{
+	unsigned int off, len;
+
+	/* check for end and drop resources from the last iteration */
+	if (!miter->__nents)
+		return false;
+
+	sg_miter_stop(miter);
+
+	/* get to the next sg if necessary.  __offset is adjusted by stop */
+	if (miter->__offset == miter->__sg->length && --miter->__nents) {
+		miter->__sg = sg_next(miter->__sg);
+		miter->__offset = 0;
+	}
+
+	/* map the next page */
+	off = miter->__sg->offset + miter->__offset;
+	len = miter->__sg->length - miter->__offset;
+
+	miter->page = nth_page(sg_page(miter->__sg), off >> PAGE_SHIFT);
+	off &= ~PAGE_MASK;
+	miter->length = min_t(unsigned int, len, PAGE_SIZE - off);
+	miter->consumed = miter->length;
+
+	if (miter->__flags & SG_MITER_ATOMIC)
+		miter->addr = kmap_atomic(miter->page, KM_BIO_SRC_IRQ) + off;
+	else
+		miter->addr = kmap(miter->page) + off;
+
+	return true;
+}
+EXPORT_SYMBOL(sg_miter_next);
+
+/**
+ * sg_miter_stop - stop mapping iteration
+ * @miter: sg mapping iter to be stopped
+ *
+ * Description:
+ *   Stops mapping iterator @miter.  @miter should have been
+ *   started using sg_miter_start().  A stopped iteration can be
+ *   resumed by calling sg_miter_next() on it.  This is useful when
+ *   resources (kmap) need to be released during iteration.
+ *
+ * Context:
+ *   IRQ disabled if the SG_MITER_ATOMIC is set.  Don't care otherwise.
+ */
+void sg_miter_stop(struct sg_mapping_iter *miter)
+{
+	WARN_ON(miter->consumed > miter->length);
+
+	/* drop resources from the last iteration */
+	if (miter->addr) {
+		miter->__offset += miter->consumed;
+
+		if (miter->__flags & SG_MITER_ATOMIC) {
+			WARN_ON(!irqs_disabled());
+			kunmap_atomic(miter->addr, KM_BIO_SRC_IRQ);
+		} else
+			kunmap(miter->page);
+
+		miter->page = NULL;
+		miter->addr = NULL;
+		miter->length = 0;
+		miter->consumed = 0;
+	}
+}
+EXPORT_SYMBOL(sg_miter_stop);
+
+/**
  * sg_copy_buffer - Copy data between a linear buffer and an SG list
  * @sgl:		 The SG list
  * @nents:		 Number of SG entries
@@ -309,56 +420,29 @@ EXPORT_SYMBOL(sg_alloc_table);
 static size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents,
 			     void *buf, size_t buflen, int to_buffer)
 {
-	struct scatterlist *sg;
-	size_t buf_off = 0;
-	int i;
-
-	WARN_ON(!irqs_disabled());
-
-	for_each_sg(sgl, sg, nents, i) {
-		struct page *page;
-		int n = 0;
-		unsigned int sg_off = sg->offset;
-		unsigned int sg_copy = sg->length;
-
-		if (sg_copy > buflen)
-			sg_copy = buflen;
-		buflen -= sg_copy;
-
-		while (sg_copy > 0) {
-			unsigned int page_copy;
-			void *p;
-
-			page_copy = PAGE_SIZE - sg_off;
-			if (page_copy > sg_copy)
-				page_copy = sg_copy;
-
-			page = nth_page(sg_page(sg), n);
-			p = kmap_atomic(page, KM_BIO_SRC_IRQ);
-
-			if (to_buffer)
-				memcpy(buf + buf_off, p + sg_off, page_copy);
-			else {
-				memcpy(p + sg_off, buf + buf_off, page_copy);
-				flush_kernel_dcache_page(page);
-			}
-
-			kunmap_atomic(p, KM_BIO_SRC_IRQ);
-
-			buf_off += page_copy;
-			sg_off += page_copy;
-			if (sg_off == PAGE_SIZE) {
-				sg_off = 0;
-				n++;
-			}
-			sg_copy -= page_copy;
+	unsigned int offset = 0;
+	struct sg_mapping_iter miter;
+
+	sg_miter_start(&miter, sgl, nents, SG_MITER_ATOMIC);
+
+	while (sg_miter_next(&miter) && offset < buflen) {
+		unsigned int len;
+
+		len = min(miter.length, buflen - offset);
+
+		if (to_buffer)
+			memcpy(buf + offset, miter.addr, len);
+		else {
+			memcpy(miter.addr, buf + offset, len);
+			flush_kernel_dcache_page(miter.page);
 		}
 
-		if (!buflen)
-			break;
+		offset += len;
 	}
 
-	return buf_off;
+	sg_miter_stop(&miter);
+
+	return offset;
 }
 
 /**

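For readers who want to follow the iteration logic outside the kernel, here is a minimal userspace sketch of the same start/next/stop pattern. It is a model only, not the patch's code: struct seg, struct seg_iter, the seg_* functions and the tiny CHUNK size are hypothetical stand-ins for struct scatterlist, struct sg_mapping_iter, the sg_miter_* functions and PAGE_SIZE, and plain pointer arithmetic replaces kmap()/kunmap(). Unlike the patch, the model stops as soon as the last segment is exhausted and skips empty segments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 8	/* stand-in for PAGE_SIZE, kept tiny for illustration */

/* Hypothetical stand-in for struct scatterlist: a slice of a flat buffer. */
struct seg {
	unsigned char *base;
	size_t offset;
	size_t length;
};

/* Hypothetical stand-in for struct sg_mapping_iter. */
struct seg_iter {
	unsigned char *addr;	/* current window, NULL if nothing is mapped */
	size_t length;		/* length of the current window */
	size_t consumed;	/* bytes the caller used from the window */
	const struct seg *seg;	/* current segment */
	size_t nsegs;		/* segments left, including the current one */
	size_t seg_off;		/* progress within the current segment */
};

static void seg_iter_start(struct seg_iter *it, const struct seg *segs,
			   size_t nsegs)
{
	memset(it, 0, sizeof(*it));
	it->seg = segs;
	it->nsegs = nsegs;
}

/* Drop the current window and advance by what was consumed. */
static void seg_iter_stop(struct seg_iter *it)
{
	assert(it->consumed <= it->length);
	if (it->addr) {
		it->seg_off += it->consumed;
		it->addr = NULL;
		it->length = 0;
		it->consumed = 0;
	}
}

/* Expose the next window; it never crosses a CHUNK boundary. */
static bool seg_iter_next(struct seg_iter *it)
{
	size_t off, len, room;

	if (!it->nsegs)
		return false;
	seg_iter_stop(it);

	/* move on once the current segment is used up */
	while (it->seg_off == it->seg->length) {
		if (--it->nsegs == 0)
			return false;
		it->seg++;
		it->seg_off = 0;
	}

	off = it->seg->offset + it->seg_off;
	len = it->seg->length - it->seg_off;
	room = CHUNK - off % CHUNK;
	it->length = len < room ? len : room;
	it->consumed = it->length;	/* default: whole window consumed */
	it->addr = it->seg->base + off;
	return true;
}

/* Model of the to_buffer direction of sg_copy_buffer(). */
static size_t seg_copy_to_buffer(const struct seg *segs, size_t nsegs,
				 void *buf, size_t buflen)
{
	struct seg_iter it;
	size_t done = 0;

	seg_iter_start(&it, segs, nsegs);
	while (done < buflen && seg_iter_next(&it)) {
		size_t len = it.length < buflen - done ?
			     it.length : buflen - done;

		memcpy((unsigned char *)buf + done, it.addr, len);
		done += len;
	}
	seg_iter_stop(&it);
	return done;
}
```

The point of the shape is that the caller only ever sees one chunk-bounded window at a time, so all the offset and boundary bookkeeping lives in the iterator rather than in each user.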

* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-19 14:03                         ` Tejun Heo
@ 2008-07-20 22:40                           ` Pierre Ossman
       [not found]                             ` <20080721004033.1fc66aa3-OhHrUh4vRMSnewYJFaQfwJ5kstrrjoWp@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Ossman @ 2008-07-20 22:40 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell, linux-next,
	LKML, Andrew Morton, Kernel Testers List, scsi, Jens Axboe,
	linux-ide, Jeff Garzik, Takashi Iwai, tino.keitel

[-- Attachment #1: Type: text/plain, Size: 1295 bytes --]

On Sat, 19 Jul 2008 23:03:35 +0900
Tejun Heo <htejun@gmail.com> wrote:

> Pierre Ossman wrote:
> > On Sat, 19 Jul 2008 11:12:17 +0900
> > Tejun Heo <htejun@gmail.com> wrote:
> > 
> >> Oh... I see.  How about adding sg_miter_consume(@miter, @bytes)?  If the
> >> function is never called, the whole chunk is assumed to be consumed.  If
> >> the function is called only @bytes are consumed.
> >>
> > 
> > Sounds reasonable. Care to make an independent patch so that I can
> > continue my PIO hacking adventures? :)
> 
> Sure thing.  Here you go.  :-)
> 
> Subject: [PATCH 2.6.26] sg: reimplement sg mapping iterator
> 

I've converted my code to use your implementation instead, and it seems
to be working perfectly.

How comfortable do you feel about your patch? :)
This fixes a big problem in the sdhci driver where falling back on PIO
(because of hw DMA bugs) would create lots of potential issues as the
driver couldn't access highmem pages. I'd like to get this fix in for
2.6.27, which means that your patch also needs to go in now.
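A rough userspace sketch of the partial-consumption case may help here: a PIO-style consumer that takes fewer bytes than were mapped and reports that by lowering the consumed count before moving on. Everything in it is an assumption for illustration: struct win_iter, win_next(), win_stop(), pio_read() and the tiny CHUNK size are hypothetical, a single flat destination buffer stands in for the sg list, and no real sdhci or kmap code is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 8	/* stand-in for PAGE_SIZE, kept tiny for illustration */

/* Hypothetical iterator over one flat buffer, windowed per CHUNK. */
struct win_iter {
	unsigned char *addr;	/* current window, NULL if nothing is mapped */
	size_t length;		/* length of the current window */
	size_t consumed;	/* bytes the caller actually used */
	unsigned char *base;	/* start of the buffer */
	size_t total;		/* size of the buffer */
	size_t pos;		/* progress so far */
};

/* Drop the window and advance only by what was consumed. */
static void win_stop(struct win_iter *it)
{
	assert(it->consumed <= it->length);
	if (it->addr) {
		it->pos += it->consumed;
		it->addr = NULL;
		it->length = 0;
		it->consumed = 0;
	}
}

/* Expose the next window, starting where the last consumer left off. */
static bool win_next(struct win_iter *it)
{
	size_t left, room;

	win_stop(it);
	if (it->pos == it->total)
		return false;

	left = it->total - it->pos;
	room = CHUNK - it->pos % CHUNK;
	it->length = left < room ? left : room;
	it->consumed = it->length;	/* default: whole window consumed */
	it->addr = it->base + it->pos;
	return true;
}

/* Fill dst from a pretend FIFO that yields at most burst bytes per step. */
static size_t pio_read(unsigned char *dst, size_t dst_len,
		       const unsigned char *fifo, size_t fifo_len,
		       size_t burst)
{
	struct win_iter it = { .base = dst, .total = dst_len };
	size_t done = 0;

	while (done < fifo_len && win_next(&it)) {
		size_t len = it.length < burst ? it.length : burst;

		if (len > fifo_len - done)
			len = fifo_len - done;
		memcpy(it.addr, fifo + done, len);
		it.consumed = len;	/* the rest is offered again next time */
		done += len;
	}
	win_stop(&it);
	return done;
}
```

Because the iterator advances by the consumed count rather than the mapped length, the unconsumed tail of a window is simply offered again on the next call, and the consumer never has to track offsets itself.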

Rgds
-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.



* Re: linux-next: Tree for July 16 (crash on quad core AMD)
       [not found]                             ` <20080721004033.1fc66aa3-OhHrUh4vRMSnewYJFaQfwJ5kstrrjoWp@public.gmane.org>
@ 2008-07-21  0:38                               ` Tejun Heo
       [not found]                                 ` <4883DA8C.5030306-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2008-07-21  0:38 UTC (permalink / raw)
  To: Pierre Ossman
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell,
	linux-next, LKML, Andrew Morton,
	Kernel Testers List, scsi, Jens Axboe, linux-ide, Jeff Garzik,
	Takashi Iwai, tino.keitel

Hello, Pierre.

Pierre Ossman wrote:
> I've converted my code to use your implementation instead, and it seems
> to be working perfectly.

Cool.

> How comfortable do you feel about your patch? :)

I think it should be okay (well, of course :-) and tested a few corner
cases by modifying the copy function.

> This fixes a big problem in the sdhci driver where falling back on PIO
> (because of hw DMA bugs) would create lots of potential issues as the
> driver couldn't access highmem pages. I'd like to get this fix in for
> 2.6.27, which means that your patch also needs to go in now.

I think it needs an ACK from Jens.  Jens?

-- 
tejun


* Re: linux-next: Tree for July 16 (crash on quad core AMD)
       [not found]                                 ` <4883DA8C.5030306-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2008-07-21 11:32                                   ` Pierre Ossman
  2008-07-21 15:35                                     ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Ossman @ 2008-07-21 11:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell,
	linux-next, LKML, Andrew Morton,
	Kernel Testers List, scsi, Jens Axboe, linux-ide, Jeff Garzik,
	Takashi Iwai, tino.keitel

[-- Attachment #1: Type: text/plain, Size: 624 bytes --]

On Mon, 21 Jul 2008 09:38:36 +0900
Tejun Heo <htejun@gmail.com> wrote:

> 
> I think it needs an ACK from Jens.  Jens?
> 

He seems to be on vacation. I sent him a review request for my original
code a couple of weeks ago, and I haven't heard anything. marc.info
also doesn't show anything since the 3rd. Odds are he won't be back
during the merge window. :/

-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.



* Re: linux-next: Tree for July 16 (crash on quad core AMD)
  2008-07-21 11:32                                   ` Pierre Ossman
@ 2008-07-21 15:35                                     ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2008-07-21 15:35 UTC (permalink / raw)
  To: Pierre Ossman
  Cc: Rafael J. Wysocki, James Bottomley, Stephen Rothwell, linux-next,
	LKML, Andrew Morton, Kernel Testers List, scsi, Jens Axboe,
	linux-ide, Jeff Garzik, Takashi Iwai, tino.keitel

Pierre Ossman wrote:
> On Mon, 21 Jul 2008 09:38:36 +0900
> Tejun Heo <htejun@gmail.com> wrote:
> 
>> I think it needs an ACK from Jens.  Jens?
> 
> He seems to be on vacation. I sent him a review request for my original
> code a couple of weeks ago, and I haven't heard anything. marc.info
> also doesn't show anything since the 3rd. Odds are he won't be back
> during the merge window. :/

Right, I have a patchset waiting for his review too.  Does anyone know
when he'll be back?

We can always back it out later but I don't think pushing stuff in w/o
maintainer's ack is a good idea.  :-(

-- 
tejun


end of thread, other threads:[~2008-07-21 15:35 UTC | newest]

Thread overview: 14+ messages
     [not found] <20080716235011.ac9643aa.sfr@canb.auug.org.au>
2008-07-16 22:53 ` linux-next: Tree for July 16 (crash on quad core AMD) Rafael J. Wysocki
2008-07-16 23:01   ` James Bottomley
     [not found]     ` <1216249292.3358.66.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-07-16 23:09       ` Rafael J. Wysocki
2008-07-18 12:38         ` Tejun Heo
     [not found]           ` <48808EE0.2060603-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2008-07-18 22:47             ` Pierre Ossman
2008-07-19  0:59               ` Tejun Heo
2008-07-19  1:30                 ` Pierre Ossman
     [not found]                   ` <20080719033050.552f9b49-OhHrUh4vRMSnewYJFaQfwJ5kstrrjoWp@public.gmane.org>
2008-07-19  2:12                     ` Tejun Heo
2008-07-19 12:07                       ` Pierre Ossman
2008-07-19 14:03                         ` Tejun Heo
2008-07-20 22:40                           ` Pierre Ossman
     [not found]                             ` <20080721004033.1fc66aa3-OhHrUh4vRMSnewYJFaQfwJ5kstrrjoWp@public.gmane.org>
2008-07-21  0:38                               ` Tejun Heo
     [not found]                                 ` <4883DA8C.5030306-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2008-07-21 11:32                                   ` Pierre Ossman
2008-07-21 15:35                                     ` Tejun Heo
