linux-scsi.vger.kernel.org archive mirror
* Re: 2.6.13-mm1: hangs during boot ...
       [not found]   ` <43191DBF.10407@bigpond.net.au>
@ 2005-09-03  4:16     ` Andrew Morton
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Morton @ 2005-09-03  4:16 UTC (permalink / raw)
  To: Peter Williams; +Cc: linux-kernel, James Bottomley, linux-scsi

Peter Williams <pwil3058@bigpond.net.au> wrote:
>
> Andrew Morton wrote:
> > Peter Williams <pwil3058@bigpond.net.au> wrote:
> > 
> >>... at the point indicated by the following output:
> >>
> >>[    8.197224] Freeing unused kernel memory: 288k freed
> >>[    8.428217] SCSI subsystem initialized
> >>[    8.510376] sym0: <810a> rev 0x23 at pci 0000:00:08.0 irq 11
> >>[    8.587731] sym0: No NVRAM, ID 7, Fast-10, SE, parity checking
> >>[    8.671531] sym0: SCSI BUS has been reset.
> >>[    8.725530] scsi0 : sym-2.2.1
> >>[   17.256480]  0:0:0:0: ABORT operation started.
> >>[   22.323534]  0:0:0:0: ABORT operation timed-out.
> >>[   22.384348]  0:0:0:0: DEVICE RESET operation started.
> >>[   27.458702]  0:0:0:0: DEVICE RESET operation timed-out.
> >>[   27.527544]  0:0:0:0: BUS RESET operation started.
> >>[   32.533775]  0:0:0:0: BUS RESET operation timed-out.
> >>[   32.599173]  0:0:0:0: HOST RESET operation started.
> >>[   32.669659] sym0: SCSI BUS has been reset.
> >>
> > 
> > 
> > Is there no response from sysrq-T?
> 
> Now that I've tried it there is a response.  I've attached the complete 
> output from the boot including the sysrq-T output in the hang.output 
> attachment to this e-mail.

Thanks.

> ...
> [  278.990398] Call Trace:
> [  279.024761]  [<c02279ff>] serio_thread+0xbf/0xf0
> [  279.085573]  [<c013b2a6>] kthread+0xa6/0xb0
> [  279.140552]  [<c01034d9>] kernel_thread_helper+0x5/0xc
> [  279.208130] insmod        D C171DCC0     0   227      1           232    70 )
> [  279.309031] d7f33b04 d7f33ab8 d8836bb0 c171dcc0 00001055 0fbf64f3 00000000 d
> [  279.408678]        0000e83b d7f33acc c01da354 d750e6ac d750e570 c130d160 0a9
> [  279.518639]        d7f32000 0a72aa15 00000002 00000246 c172de50 c172de50 d7f
> [  279.628599] Call Trace:
> [  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
> [  279.732934]  [<c0245c16>] blk_execute_rq+0x66/0xb0
> [  279.796035]  [<d8836eb6>] scsi_execute+0xb6/0xd0 [scsi_mod]
> [  279.869446]  [<d8836f4d>] scsi_execute_req+0x7d/0xb0 [scsi_mod]
> [  279.947438]  [<d88393f6>] scsi_probe_lun+0xb6/0x1d0 [scsi_mod]
> [  280.024285]  [<d883995e>] scsi_probe_and_add_lun+0xde/0x1e0 [scsi_mod]
> [  280.110295]  [<d883a119>] scsi_scan_target+0xc9/0x140 [scsi_mod]
> [  280.189431]  [<d883a208>] scsi_scan_channel+0x78/0x90 [scsi_mod]
> [  280.268569]  [<d883a2e9>] scsi_scan_host_selected+0xc9/0x120 [scsi_mod]
> [  280.355722]  [<d883a362>] scsi_scan_host+0x22/0x30 [scsi_mod]
> [  280.431425]  [<d8864e45>] sym2_probe+0xf5/0x120 [sym53c8xx]
> [  280.504835]  [<c01e6ced>] pci_call_probe+0xd/0x10
> [  280.566791]  [<c01e6d39>] __pci_device_probe+0x49/0x60
> [  280.634369]  [<c01e6d79>] pci_device_probe+0x29/0x50
> [  280.699657]  [<c023e0ee>] driver_probe_device+0x3e/0xc0
> [  280.768486]  [<c023e25f>] __driver_attach+0x5f/0x70
> [  280.832628]  [<c023d7d3>] bus_for_each_dev+0x43/0x70
> [  280.897916]  [<c023e289>] driver_attach+0x19/0x20
> [  280.959770]  [<c023dc5b>] bus_add_driver+0x7b/0xd0
> [  281.022767]  [<c023e692>] driver_register+0x42/0x50
> [  281.086910]  [<c01e6fd0>] pci_register_driver+0x70/0x90
> [  281.155635]  [<d880202b>] sym2_init+0x2b/0x45 [sym53c8xx]
> [  281.226752]  [<c014340c>] sys_init_module+0xec/0x230
> [  281.292042]  [<c0105199>] syscall_call+0x7/0xb
> [  281.350458] scsi_eh_0     D 00000000     0   232      1                 227 )
> [  281.451357] d7a51ea0 d7a51e64 1e62bb57 00000000 d7a50000 1e62c494 00000000 d
> [  281.551005]        00000106 d79b0ab0 c130d160 d79b0bec d79b0ab0 c130d160 9de
> [  281.660963]        d7a50000 9de05c44 00000007 d7a50000 d7a51ef4 d7a51ef0 d7a
> [  281.770923] Call Trace:
> [  281.805288]  [<c02d5c74>] wait_for_completion+0xa4/0x110
> [  281.875159]  [<d8863490>] sym_eh_handler+0x240/0x290 [sym53c8xx]
> [  281.954293]  [<d88635fd>] sym53c8xx_eh_host_reset_handler+0x2d/0x50 [sym53c8]
> [  282.050611]  [<d8835e9b>] scsi_try_host_reset+0x2b/0xa0 [scsi_mod]
> [  282.132041]  [<d883602c>] scsi_eh_host_reset+0x1c/0xa0 [scsi_mod]
> [  282.212324]  [<d88363f7>] scsi_eh_ready_devs+0x57/0x70 [scsi_mod]
> [  282.292604]  [<d883654f>] scsi_unjam_host+0x9f/0xc0 [scsi_mod]
> [  282.369451]  [<d8836629>] scsi_error_handler+0xb9/0xe0 [scsi_mod]
> [  282.449734]  [<c01034d9>] kernel_thread_helper+0x5/0xc
> 

scsi went ga-ga during insertion of the sym2 driver.  Usual culprits cc'ed ;)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: 2.6.13-mm1: hangs during boot ...
@ 2005-09-03  4:22 Brown, Len
  2005-09-03  4:58 ` Peter Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Brown, Len @ 2005-09-03  4:22 UTC (permalink / raw)
  To: Andrew Morton, Peter Williams; +Cc: linux-kernel, James Bottomley, linux-scsi

> > [  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110

possibly a missing interrupt?

> CONFIG_ACPI=y

any difference if booted with "acpi=off" or "acpi=noirq"?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  4:22 Brown, Len
@ 2005-09-03  4:58 ` Peter Williams
  2005-09-03  5:25   ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Williams @ 2005-09-03  4:58 UTC (permalink / raw)
  To: Brown, Len; +Cc: Andrew Morton, linux-kernel, James Bottomley, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 759 bytes --]

Brown, Len wrote:
>>>[  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
> 
> 
> possibly a missing interrupt?
> 
> 
>>CONFIG_ACPI=y
> 
> 
> any difference if booted with "acpi=off" or "acpi=noirq"?

Yes.  In both cases, the system appears to boot normally but I'm unable 
to login or connect via ssh.  Also there's a "device not ready" message 
after the scsi initialization which I don't normally see.  I've attached 
the scsi initialization output.  The PF_NETLINK error messages after the 
login prompt in this output are created whenever I try to log in or 
connect via ssh.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

[-- Attachment #2: acpi=off.output --]
[-- Type: text/plain, Size: 1511 bytes --]

[    8.345086] SCSI subsystem initialized
[    8.427503] sym0: <810a> rev 0x23 at pci 0000:00:08.0 irq 16
[    8.504636] sym0: No NVRAM, ID 7, Fast-10, SE, parity checking
[    8.588216] sym0: SCSI BUS has been reset.
[    8.642194] scsi0 : sym-2.2.1
[   12.368622]   Vendor: PIONEER   Model: DVD-ROM DVD-303R  Rev: 2.00
[   12.450118]   Type:   CD-ROM                             ANSI SCSI revision:2
[   12.546506]  target0:0:2: Beginning Domain Validation
[   12.613354]  target0:0:2: asynchronous.
[   12.667699]  target0:0:2: Domain Validation skipping write tests
[   12.747629]  target0:0:2: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
[   12.837395]  target0:0:2: Ending Domain Validation
[   13.256875]   Vendor: SONY      Model: CD-RW  CRX140S    Rev: 1.0e
[   13.338323]   Type:   CD-ROM                             ANSI SCSI revision:4
[   13.434891]  target0:0:4: Beginning Domain Validation
[   13.503101]  target0:0:4: asynchronous.
[   13.602931]  target0:0:4: Domain Validation skipping write tests
[   13.683605]  target0:0:4: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
[   13.777934]  target0:0:4: Ending Domain Validation
[   14.884703] Device  not ready.
[   15.763312] kjournald starting.  Commit interval 5 seconds
[   15.835612] EXT3-fs: mounted filesystem with ordered data mode.


Fedora Core release 4 (Stentz)
Kernel 2.6.13-mm1 on an i686

origma.pw.nest login:

[  101.886572] DEBUG: Failed to load PF_NETLINK protocol 9
[  101.963572] DEBUG: Failed to load PF_NETLINK protocol 9


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  4:58 ` Peter Williams
@ 2005-09-03  5:25   ` Andrew Morton
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Morton @ 2005-09-03  5:25 UTC (permalink / raw)
  To: Peter Williams
  Cc: len.brown, linux-kernel, James.Bottomley, linux-scsi,
	David S. Miller

Peter Williams <pwil3058@bigpond.net.au> wrote:
>
> Brown, Len wrote:
> >>>[  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
> > 
> > 
> > possibly a missing interrupt?
> > 
> > 
> >>CONFIG_ACPI=y
> > 
> > 
> > any difference if booted with "acpi=off" or "acpi=noirq"?
> 
> Yes.  In both cases, the system appears to boot normally

OK, we can pass this ball over to the ACPI team.

> but I'm unable 
> to login or connect via ssh.  Also there's a "device not ready" message 
> after the scsi initialization which I don't normally see.  I've attached 
> the scsi initialization output.  The PF_NETLINK error messages after the 
> login prompt in this output are created whenever I try to log in or 
> connect via ssh.

Linus hit that too - it's an interaction between PAM and a modified netlink
error code.

Dave, where are we up to with the fix for that?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: 2.6.13-mm1: hangs during boot ...
@ 2005-09-03  5:37 Brown, Len
  2005-09-03  6:18 ` Peter Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Brown, Len @ 2005-09-03  5:37 UTC (permalink / raw)
  To: Peter Williams; +Cc: Andrew Morton, linux-kernel, James Bottomley, linux-scsi

 
>Brown, Len wrote:
>>>>[  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
>> 
>> 
>> possibly a missing interrupt?
>> 
>> 
>>>CONFIG_ACPI=y
>> 
>> 
>> any difference if booted with "acpi=off" or "acpi=noirq"?
>
>Yes.  In both cases, the system appears to boot normally but I'm unable
>to login or connect via ssh.  Also there's a "device not ready" message
>after the scsi initialization which I don't normally see.  I've attached
>the scsi initialization output.  The PF_NETLINK error messages after the
>login prompt in this output are created whenever I try to log in or
>connect via ssh.

Please confirm that vanilla 2.6.13 has none of these symptoms.
Please apply just the ACPI part of the 2.6.13-mm1 patch to see if
these issues are caused by that or if they are caused by something
else in the mm patch.

http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/broken-out/git-acpi.patch

thanks,
-Len

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  5:37 Brown, Len
@ 2005-09-03  6:18 ` Peter Williams
  2005-09-03  7:18   ` Peter Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Williams @ 2005-09-03  6:18 UTC (permalink / raw)
  To: Brown, Len; +Cc: Andrew Morton, linux-kernel, James Bottomley, linux-scsi

Brown, Len wrote:
>  
> 
>>Brown, Len wrote:
>>
>>>>>[  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
>>>
>>>
>>>possibly a missing interrupt?
>>>
>>>
>>>
>>>>CONFIG_ACPI=y
>>>
>>>
>>>any difference if booted with "acpi=off" or "acpi=noirq"?
>>
>>Yes.  In both cases, the system appears to boot normally but I'm unable
>>to login or connect via ssh.  Also there's a "device not ready" message
>>after the scsi initialization which I don't normally see.  I've attached
>>the scsi initialization output.  The PF_NETLINK error messages after the
>>login prompt in this output are created whenever I try to log in or
>>connect via ssh.
> 
> 
> Please confirm that vanilla 2.6.13 has none of these symptoms.

That's correct.  2.6.13 exhibits none of these symptoms.

> Please apply just the ACPI part of the 2.6.13-mm1 patch to see if
> these issues are caused by that or if they are caused by something
> else in the mm patch.
> 
> http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/broken-out/git-acpi.patch

OK.  I'll get back to you shortly.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  6:18 ` Peter Williams
@ 2005-09-03  7:18   ` Peter Williams
  2005-09-03  8:19     ` Len Brown
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Williams @ 2005-09-03  7:18 UTC (permalink / raw)
  To: Peter Williams
  Cc: Brown, Len, Andrew Morton, linux-kernel, James Bottomley,
	linux-scsi

Peter Williams wrote:
> Brown, Len wrote:
> 
>>  
>>
>>> Brown, Len wrote:
>>>
>>>>>> [  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
>>>>
>>>>
>>>>
>>>> possibly a missing interrupt?
>>>>
>>>>
>>>>
>>>>> CONFIG_ACPI=y
>>>>
>>>>
>>>>
>>>> any difference if booted with "acpi=off" or "acpi=noirq"?
>>>
>>>
>>> Yes.  In both cases, the system appears to boot normally but I'm 
>>> unable to login or connect via ssh.  Also there's a "device not 
>>> ready" message after the scsi initialization which I don't normally 
>>> see.  I've attached the scsi initialization output.  The PF_NETLINK 
>>> error messages after the login prompt in this output are created 
>>> whenever I try to log in or connect via ssh.
>>
>>
>>
>> Please confirm that vanilla 2.6.13 has none of these symptoms.
> 
> 
> That's correct.  2.6.13 exhibits none of these symptoms.
> 
>> Please apply just the ACPI part of the 2.6.13-mm1 patch to see if
>> these issues are caused by that or if they are caused by something
>> else in the mm patch.
>>
>> http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/broken-out/git-acpi.patch 
>>
> 
> 
> OK.  I'll get back to you shortly.

I am able to confirm that the problem occurs with vanilla 2.6.13 after I
apply the above patch.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  7:18   ` Peter Williams
@ 2005-09-03  8:19     ` Len Brown
  2005-09-03  9:28       ` Peter Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Len Brown @ 2005-09-03  8:19 UTC (permalink / raw)
  To: Peter Williams; +Cc: Andrew Morton, linux-kernel, James Bottomley, linux-scsi

On Sat, 2005-09-03 at 03:18 -0400, Peter Williams wrote:
> 
> http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/broken-out/git-acpi.patch
> >>

> I am able to confirm that the problem occurs with vanilla 2.6.13 after
> I apply the above patch.

Thanks.

Please then try the latest ACPI patch here:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050902-2.6.13.diff.gz
It should apply to vanilla 2.6.13 with a reject in ia64/Kconfig
that you can ignore.

If this works, then we munged git-acpi.patch in 2.6.13-mm1 somehow.

If this fails, then please confirm it still fails with pnpacpi=off

if it still fails, then please open a bugzilla here:
http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
component=config-interrupts

build the failing kernel with CONFIG_ACPI_DEBUG=y
boot it with "acpi=noirq" and attach the output from
dmesg -s64000
lspci -vv
cat /proc/interrupts
acpidump, available in the latest pmtools here:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

also include the dmesg -s64000 from the successful
acpi-enabled 2.6.13 boot, along with its /proc/interrupts.

If you have a  serial console and can then capture the
failing console log with "debug", that would be ideal.

Where we go from there will depend on what we see...

thanks,
-Len

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  8:19     ` Len Brown
@ 2005-09-03  9:28       ` Peter Williams
  2005-09-03  9:48         ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Williams @ 2005-09-03  9:28 UTC (permalink / raw)
  To: Len Brown; +Cc: Andrew Morton, linux-kernel, James Bottomley, linux-scsi

Len Brown wrote:
> On Sat, 2005-09-03 at 03:18 -0400, Peter Williams wrote:
> 
>>http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/broken-out/git-acpi.patch
>>
> 
>>I am able to confirm that the problem occurs with vanilla 2.6.13 after
>>I apply the above patch.
> 
> 
> Thanks.
> 
> Please then try the latest ACPI patch here:
> http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050902-2.6.13.diff.gz
> It should apply to vanilla 2.6.13 with a reject in ia64/Kconfig
> that you can ignore.
> 
> If this works, then we munged git-acpi.patch in 2.6.13-mm1 somehow.

There were no problems with this patch applied.  So it looks like the 
munge theory is correct.

> 
> If this fails, then please confirm it still fails with pnpacpi=off
> 
> if it still fails, then please open a bugzilla here:
> http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
> component=config-interrupts
> 
> build the failing kernel with CONFIG_ACPI_DEBUG=y
> boot it with "acpi=noirq" and attach the output from
> dmesg -s64000
> lspci -vv
> cat /proc/interrupts
> acpidump, available in the latest pmtools here:
> http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> 
> also include the dmesg -s64000 from the successful
> acpi-enabled 2.6.13 boot, along with its /proc/interrupts.
> 
> If you have a  serial console and can then capture the
> failing console log with "debug", that would be ideal.
> 
> Where we go from there will depend on what we see...
> 
> thanks,
> -Len
> 

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03  9:28       ` Peter Williams
@ 2005-09-03  9:48         ` Andrew Morton
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Morton @ 2005-09-03  9:48 UTC (permalink / raw)
  To: Peter Williams; +Cc: len.brown, linux-kernel, James.Bottomley, linux-scsi

Peter Williams <pwil3058@bigpond.net.au> wrote:
>
> > Please then try the latest ACPI patch here:
>  > http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050902-2.6.13.diff.gz
>  > It should apply to vanilla 2.6.13 with a reject in ia64/Kconfig
>  > that you can ignore.
>  > 
>  > If this works, then we munged git-acpi.patch in 2.6.13-mm1 somehow.
> 
>  There were no problems with this patch applied.  So it looks like the 
>  munge theory is correct.

That diff is significantly different from the diff I plucked from
master.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
for 2.6.13-mm1.

Doing (patch -R | grep FAILED) on 2.6.13-mm1 says:

Hunk #3 FAILED at 356.
1 out of 3 hunks FAILED -- saving rejects to file arch/ia64/Kconfig.rej
Hunk #6 FAILED at 190.
Hunk #8 FAILED at 221.
Hunk #10 FAILED at 254.
Hunk #11 FAILED at 357.
Hunk #15 FAILED at 474.
Hunk #17 FAILED at 569.
6 out of 17 hunks FAILED -- saving rejects to file drivers/acpi/dispatcher/dsmethod.c.rej
Hunk #19 FAILED at 468.
Hunk #29 FAILED at 701.
2 out of 38 hunks FAILED -- saving rejects to file drivers/acpi/dispatcher/dswload.c.rej
Hunk #14 FAILED at 321.
Hunk #43 FAILED at 1159.
2 out of 44 hunks FAILED -- saving rejects to file drivers/acpi/osl.c.rej
Hunk #17 FAILED at 1134.
1 out of 18 hunks FAILED -- saving rejects to file drivers/acpi/parser/psparse.c.rej
Hunk #3 FAILED at 74.
1 out of 3 hunks FAILED -- saving rejects to file drivers/acpi/parser/psxface.c.rej
Hunk #1 FAILED at 35.
1 out of 15 hunks FAILED -- saving rejects to file drivers/acpi/pci_bind.c.rej
Hunk #5 FAILED at 220.
Hunk #15 FAILED at 412.
Hunk #16 FAILED at 425.
Hunk #17 FAILED at 446.
Hunk #19 FAILED at 484.
5 out of 36 hunks FAILED -- saving rejects to file drivers/acpi/processor_core.c.rej
Hunk #1 FAILED at 41.
Hunk #2 FAILED at 71.
Hunk #4 FAILED at -55.
Hunk #5 FAILED at 30.
Hunk #6 FAILED at 40.
Hunk #7 FAILED at 69.
Hunk #9 FAILED at 317.
Hunk #10 FAILED at 344.
Hunk #12 FAILED at 289.
Hunk #14 FAILED at 523.
Hunk #15 FAILED at 607.
Hunk #16 FAILED at 618.
Hunk #17 FAILED at 645.
Hunk #19 FAILED at 534.
Hunk #20 FAILED at 686.
Hunk #22 FAILED at 916.
Hunk #23 FAILED at 968.
Hunk #25 FAILED at 881.
Hunk #26 FAILED at 891.
Hunk #29 FAILED at 953.
20 out of 29 hunks FAILED -- saving rejects to file drivers/acpi/resources/rsaddr.c.rej
Hunk #11 FAILED at 289.
Hunk #16 FAILED at 407.
Hunk #17 FAILED at 425.
Hunk #18 FAILED at 434.
Hunk #20 FAILED at 470.
Hunk #21 FAILED at 527.
6 out of 21 hunks FAILED -- saving rejects to file drivers/acpi/resources/rsirq.c.rej
Hunk #27 FAILED at 553.
1 out of 61 hunks FAILED -- saving rejects to file drivers/acpi/scan.c.rej
Hunk #1 FAILED at 41.
1 out of 34 hunks FAILED -- saving rejects to file drivers/acpi/utilities/utmisc.c.rej
Hunk #5 FAILED at 291.
1 out of 76 hunks FAILED -- saving rejects to file drivers/acpi/video.c.rej
Hunk #2 FAILED at 64.
1 out of 8 hunks FAILED -- saving rejects to file include/acpi/acconfig.h.rej
Hunk #1 FAILED at 41.
1 out of 1 hunk FAILED -- saving rejects to file include/acpi/acdispat.h.rej
Hunk #29 FAILED at 1078.
Hunk #30 FAILED at 1240.
2 out of 31 hunks FAILED -- saving rejects to file include/acpi/actypes.h.rej
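
For readers who want to tally the rejects above per file, here is a minimal
sketch in C.  It assumes the summary-line wording shown in the output
("N out of M hunks FAILED -- saving rejects to file PATH").  The function
name parse_reject_summary is hypothetical; it is not part of patch or of the
kernel tree.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one reject summary line, for example
 *   "6 out of 17 hunks FAILED -- saving rejects to file foo.c.rej"
 * Returns 1 and fills in *failed, *total and path on a match, 0 otherwise.
 * The outputs are only meaningful when 1 is returned.
 * 'path' must have room for at least 256 bytes.
 * (Hypothetical helper, written for illustration only.)
 */
int parse_reject_summary(const char *line, int *failed, int *total, char *path)
{
	assert(line != NULL && failed != NULL && total != NULL && path != NULL);

	/* Plural form: "... hunks FAILED ..." */
	if (sscanf(line,
		   "%d out of %d hunks FAILED -- saving rejects to file %255s",
		   failed, total, path) == 3)
		return 1;

	/* Singular form: "1 out of 1 hunk FAILED ..." */
	if (sscanf(line,
		   "%d out of %d hunk FAILED -- saving rejects to file %255s",
		   failed, total, path) == 3)
		return 1;

	return 0;
}
```

Feeding each line of the output to this function and summing 'failed' would
give the total number of rejected hunks.  The "Hunk #N FAILED at ..." detail
lines are skipped because they do not start with a count.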


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
       [not found] ` <fa.fm9i4v6.1ekchhm@ifi.uio.no>
@ 2005-09-03 13:24   ` Reuben Farrelly
  2005-09-03 13:51     ` Peter Williams
  2005-09-04 16:32     ` James Bottomley
  0 siblings, 2 replies; 17+ messages in thread
From: Reuben Farrelly @ 2005-09-03 13:24 UTC (permalink / raw)
  To: Peter Williams
  Cc: Brown, Len, Andrew Morton, linux-kernel, James Bottomley,
	linux-scsi

Hi Peter,

On 3/09/2005 4:59 a.m., Peter Williams wrote:
> Brown, Len wrote:
>>>> [  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
>>
>>
>> possibly a missing interrupt?
>>
>>
>>> CONFIG_ACPI=y
>>
>>
>> any difference if booted with "acpi=off" or "acpi=noirq"?
> 
> Yes.  In both cases, the system appears to boot normally but I'm unable 
> to login or connect via ssh.  Also there's a "device not ready" message

Are you seeing this "Device  not ready" message appear over and over, or just 
the once?

I am seeing it fill up my messages log as it is logging 1 or so messages each 
minute.  I've emailed the SCSI maintainer James Bottomley twice about it but 
had no response either time.

The SCSI device I have is:

Sep  3 22:14:40 tornado kernel: Vendor: SONY  Model: CD-RW  CRX145S  Rev: 1.0b

As for the inability to log in, this bug may be relevant, given I also had 
that problem:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166422

There are fixes in the pipeline for util-linux audit interaction in Fedora as 
well.  I know because I reported those too ;)

> after the scsi initialization which I don't normally see.  I've attached 
> the scsi initialization output.  The PF_NETLINK error messages after the 
> login prompt in this output are created whenever I try to log in or 
> connect via ssh.

The workaround is to enable audit support, but obviously a better fix is in the
pipeline.

I'm surprised more people aren't discovering these 'interactions' due to 
having audit not turned on.  Does everyone build audit into their kernels?

reuben


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03 13:24   ` 2.6.13-mm1: hangs during boot Reuben Farrelly
@ 2005-09-03 13:51     ` Peter Williams
  2005-09-03 16:14       ` James Bottomley
  2005-09-04 16:32     ` James Bottomley
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Williams @ 2005-09-03 13:51 UTC (permalink / raw)
  To: Reuben Farrelly
  Cc: Brown, Len, Andrew Morton, linux-kernel, James Bottomley,
	linux-scsi

Reuben Farrelly wrote:
> Hi Peter,
> 
> On 3/09/2005 4:59 a.m., Peter Williams wrote:
> 
>> Brown, Len wrote:
>>
>>>>> [  279.662960]  [<c02d5c74>] wait_for_completion+0xa4/0x110
>>>
>>>
>>>
>>> possibly a missing interrupt?
>>>
>>>
>>>> CONFIG_ACPI=y
>>>
>>>
>>>
>>> any difference if booted with "acpi=off" or "acpi=noirq"?
>>
>>
>> Yes.  In both cases, the system appears to boot normally but I'm 
>> unable to login or connect via ssh.  Also there's a "device not ready" 
>> message
> 
> 
> Are you seeing this "Device  not ready" message appear over and over, or 
> just the once?

Just the once.

> 
> I am seeing it fill up my messages log as it is logging 1 or so messages 
> each minute.  I've emailed the SCSI maintainer James Bottomley twice 
> about it but had no response either time.
> 
> The SCSI device I have is:
> 
> Sep  3 22:14:40 tornado kernel: Vendor: SONY  Model: CD-RW  CRX145S  
> Rev: 1.0b
> 
> As for the inability to log in, this bug may be relevant, given I also 
> had that problem:
> 
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166422
> 
> There are fixes in the pipeline for util-linux audit interaction in 
> Fedora as well.  I know because I reported those too ;)
> 
>> after the scsi initialization which I don't normally see.  I've 
>> attached the scsi initialization output.  The PF_NETLINK error 
>> messages after the login prompt in this output are created whenever I 
>> try to log in or connect via ssh.
> 
> 
> The workaround is to enable audit support, but obviously a better fix is
> in the pipeline.
> 
> I'm surprised more people aren't discovering these 'interactions' due to 
> having audit not turned on.  Does everyone build audit into their kernels?
> 
> reuben


-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: 2.6.13-mm1: hangs during boot ...
@ 2005-09-03 15:57 Brown, Len
  2005-09-03 22:48 ` Peter Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Brown, Len @ 2005-09-03 15:57 UTC (permalink / raw)
  To: Andrew Morton, Peter Williams; +Cc: linux-kernel, James.Bottomley, linux-scsi

>> > Please then try the latest ACPI patch here:
>> > http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050902-2.6.13.diff.gz
>> > It should apply to vanilla 2.6.13 with a reject in ia64/Kconfig
>> > that you can ignore.
>> > 
>> > If this works, then we munged git-acpi.patch in 2.6.13-mm1 somehow.
>> 
>>  There were no problems with this patch applied.  So it looks like the
>>  munge theory is correct.
>
>That diff is significantly different from the diff I plucked from
>master.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
>for 2.6.13-mm1.
>
>Doing (patch -R | grep FAILED) on 2.6.13-mm1 says:

Right.
2.6.13/acpi-20050902-2.6.13.diff.gz
is newer than 2.6.13-mm1's git-acpi.patch

2.6.13/acpi-20050815-2.6.13.diff.gz
is a closer match -- though not exact.

Peter, it might be illustrative if you have a moment
if you can also test 2.6.13/acpi-20050815-2.6.13.diff.gz
all by itself.

If it fails, then I broke -mm1
with acpi-20050815-2.6.13.diff.gz, but fixed
it by acpi-20050902-2.6.13.diff.gz.

If it succeeds, then the issue lies in the relatively small delta
between acpi-20050815-2.6.13.diff.gz and 2.6.13-mm1's git-acpi.patch.

thanks,
-Len


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03 13:51     ` Peter Williams
@ 2005-09-03 16:14       ` James Bottomley
  0 siblings, 0 replies; 17+ messages in thread
From: James Bottomley @ 2005-09-03 16:14 UTC (permalink / raw)
  To: Peter Williams
  Cc: Reuben Farrelly, Brown, Len, Andrew Morton, Linux Kernel,
	SCSI Mailing List

On Sat, 2005-09-03 at 23:51 +1000, Peter Williams wrote:
> > Are you seeing this "Device  not ready" message appear over and over, or 
> > just the once?
> 
> Just the once.

OK, I finally have a theory about this.  It's the everything goes via
bios code.  Previously there were several levels at which commands could
exit the SCSI stack; now we make everything go via bios, so they all
come out at the top.

get_capabilities() in sr.c is sending a TEST_UNIT_READY which will get
NOT_READY back.  Previously this was completing before it got to
scsi_io_completion(); now it doesn't.  There must be quite a few cases
like this.  The best fix is probably to use and respect REQ_QUIET for
internally generated commands.

James
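
The idea described above can be modelled outside the kernel.  The following
is a minimal userspace sketch in C, not the kernel code: the DEMO_* flag
values, struct demo_request and both function names are hypothetical
stand-ins for the request flags and the completion path.  It only shows the
pattern of tagging internally generated commands as quiet and having the
completion path respect that tag.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag bits; the real kernel values differ. */
enum {
	DEMO_REQ_SPECIAL = 1 << 0,
	DEMO_REQ_QUIET   = 1 << 1
};

/* Hypothetical, stripped-down stand-in for a block request. */
struct demo_request {
	unsigned int flags;
	const char *dev_name;
};

/*
 * Report a NOT READY completion unless the submitter asked for silence.
 * Returns 1 if a message was written to 'out', 0 if it was suppressed.
 */
int demo_report_not_ready(const struct demo_request *req, FILE *out)
{
	assert(req != NULL && out != NULL);

	if (req->flags & DEMO_REQ_QUIET)
		return 0;

	fprintf(out, "%s: Device not ready.\n", req->dev_name);
	return 1;
}

/* Internally generated commands (e.g. a probe) always carry the quiet flag. */
struct demo_request demo_internal_request(const char *dev_name)
{
	struct demo_request req = { DEMO_REQ_SPECIAL | DEMO_REQ_QUIET, dev_name };

	return req;
}
```

With this model, a request built by demo_internal_request() completes
silently, while a request without the quiet bit still logs the message.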

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03 15:57 Brown, Len
@ 2005-09-03 22:48 ` Peter Williams
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Williams @ 2005-09-03 22:48 UTC (permalink / raw)
  To: Brown, Len; +Cc: Andrew Morton, linux-kernel, James.Bottomley, linux-scsi

Brown, Len wrote:
>>>> Please then try the latest ACPI patch here:
>>>> http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.13/acpi-20050902-2.6.13.diff.gz
>>>> It should apply to vanilla 2.6.13 with a reject in ia64/Kconfig
>>>> that you can ignore.
>>>>
>>>> If this works, then we munged git-acpi.patch in 2.6.13-mm1 somehow.
>>>
>>> There were no problems with this patch applied.  So it looks like the
>>> munge theory is correct.
>>
>>That diff is significantly different from the diff I plucked from
>>master.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
>>for 2.6.13-mm1.
>>
>>Doing (patch -R | grep FAILED) on 2.6.13-mm1 says:
> 
> 
> Right.
> 2.6.13/acpi-20050902-2.6.13.diff.gz
> is newer than 2.6.13-mm1's git-acpi.patch
> 
> 2.6.13/acpi-20050815-2.6.13.diff.gz
> is a closer match -- though not exact.
> 
> Peter, it might be illustrative if you have a moment
> if you can also test 2.6.13/acpi-20050815-2.6.13.diff.gz
> all by itself.
> 
> If it fails,

It does.

> then I broke -mm1
> with acpi-20050815-2.6.13.diff.gz, but fixed
> it by acpi-20050902-2.6.13.diff.gz.

So you did.

> 
> If it succeeds, then the issue lies in the relatively small delta
> between acpi-20050815-2.6.13.diff.gz and 2.6.13-mm1's git-acpi.patch.
> 
> thanks,
> -Len
> 

My pleasure
Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-03 13:24   ` 2.6.13-mm1: hangs during boot Reuben Farrelly
  2005-09-03 13:51     ` Peter Williams
@ 2005-09-04 16:32     ` James Bottomley
  2005-09-05  8:37       ` Reuben Farrelly
  1 sibling, 1 reply; 17+ messages in thread
From: James Bottomley @ 2005-09-04 16:32 UTC (permalink / raw)
  To: Reuben Farrelly
  Cc: Peter Williams, Brown, Len, Andrew Morton, Linux Kernel,
	SCSI Mailing List

On Sun, 2005-09-04 at 01:24 +1200, Reuben Farrelly wrote:
> I am seeing it fill up my messages log as it is logging 1 or so messages each 
> minute.  I've emailed the SCSI maintainer James Bottomley twice about it but 
> had no response either time.

OK, can you try this ... it should confirm the theory if the messages go
away.

Thanks,

James

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -315,7 +315,7 @@ int scsi_execute(struct scsi_device *sde
 	req->sense = sense;
 	req->sense_len = 0;
 	req->timeout = timeout;
-	req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL;
+	req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL | REQ_QUIET;
 
 	/*
 	 * head injection *required* here otherwise quiesce won't work
@@ -927,17 +927,20 @@ void scsi_io_completion(struct scsi_cmnd
 				scsi_requeue_command(q, cmd);
 				return;
 			}
-			printk(KERN_INFO "Device %s not ready.\n",
-			       req->rq_disk ? req->rq_disk->disk_name : "");
+			if (!(req->flags & REQ_QUIET))
+				dev_printk(KERN_INFO,
+					   &cmd->device->sdev_gendev,
+					   "Device not ready.\n");
 			cmd = scsi_end_request(cmd, 0, this_count, 1);
 			return;
 		case VOLUME_OVERFLOW:
-			printk(KERN_INFO "Volume overflow <%d %d %d %d> CDB: ",
-			       cmd->device->host->host_no,
-			       (int)cmd->device->channel,
-			       (int)cmd->device->id, (int)cmd->device->lun);
-			__scsi_print_command(cmd->data_cmnd);
-			scsi_print_sense("", cmd);
+			if (!(req->flags & REQ_QUIET)) {
+				dev_printk(KERN_INFO,
+					   &cmd->device->sdev_gendev,
+					   "Volume overflow, CDB: ");
+				__scsi_print_command(cmd->data_cmnd);
+				scsi_print_sense("", cmd);
+			}
 			cmd = scsi_end_request(cmd, 0, block_bytes, 1);
 			return;
 		default:
@@ -954,15 +957,13 @@ void scsi_io_completion(struct scsi_cmnd
 		return;
 	}
 	if (result) {
-		if (!(req->flags & REQ_SPECIAL))
-			printk(KERN_INFO "SCSI error : <%d %d %d %d> return code "
-			       "= 0x%x\n", cmd->device->host->host_no,
-			       cmd->device->channel,
-			       cmd->device->id,
-			       cmd->device->lun, result);
+		if (!(req->flags & REQ_QUIET)) {
+			dev_printk(KERN_INFO, &cmd->device->sdev_gendev,
+				   "SCSI error: return code = 0x%x\n", result);
 
-		if (driver_byte(result) & DRIVER_SENSE)
-			scsi_print_sense("", cmd);
+			if (driver_byte(result) & DRIVER_SENSE)
+				scsi_print_sense("", cmd);
+		}
 		/*
 		 * Mark a single buffer as not uptodate.  Queue the remainder.
 		 * We sometimes get this cruft in the event that a medium error
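
For readers following the patch, here is a minimal userspace sketch of the
idea it relies on: error reporting is skipped whenever the request carries a
"quiet" flag. The names demo_request, DEMO_REQ_SPECIAL, DEMO_REQ_QUIET and
demo_report_error are hypothetical stand-ins invented for this illustration;
they are not the kernel's definitions, and the flag values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag bits for illustration; the kernel's real values differ. */
#define DEMO_REQ_SPECIAL (1u << 0)
#define DEMO_REQ_QUIET   (1u << 1)

/* Minimal stand-in for the kernel's struct request (assumption). */
struct demo_request {
	unsigned int flags;
};

/*
 * Format an error report into buf unless the request is marked quiet.
 * Returns the number of characters written, or 0 when suppressed.
 */
static int demo_report_error(const struct demo_request *req, int result,
			     char *buf, size_t len)
{
	assert(req != NULL && buf != NULL && len > 0);

	buf[0] = '\0';

	/* Same shape as the patch's test: if (!(req->flags & REQ_QUIET)) */
	if (req->flags & DEMO_REQ_QUIET)
		return 0;

	return snprintf(buf, len, "SCSI error: return code = 0x%x\n",
			(unsigned int)result);
}
```

In the patch itself, scsi_execute() ORs REQ_QUIET into every request it
builds, so commands issued through that path should no longer log, while
requests that lack the flag keep reporting as before.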




* Re: 2.6.13-mm1: hangs during boot ...
  2005-09-04 16:32     ` James Bottomley
@ 2005-09-05  8:37       ` Reuben Farrelly
  0 siblings, 0 replies; 17+ messages in thread
From: Reuben Farrelly @ 2005-09-05  8:37 UTC (permalink / raw)
  To: James Bottomley
  Cc: Peter Williams, Brown, Len, Andrew Morton, Linux Kernel,
	SCSI Mailing List

Hi,

On 5/09/2005 4:32 a.m., James Bottomley wrote:
> On Sun, 2005-09-04 at 01:24 +1200, Reuben Farrelly wrote:
>> I am seeing it fill up my messages log as it is logging 1 or so messages each 
>> minute.  I've emailed the SCSI maintainer James Bottomley twice about it but 
>> had no response either time.
> 
> OK, can you try this ... it should confirm the theory if the messages go
> away.
> 
> Thanks,
> 
> James
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -315,7 +315,7 @@ int scsi_execute(struct scsi_device *sde
>  	req->sense = sense;
>  	req->sense_len = 0;
>  	req->timeout = timeout;
> -	req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL;
> +	req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL | REQ_QUIET;
>  
>  	/*
>  	 * head injection *required* here otherwise quiesce won't work
> @@ -927,17 +927,20 @@ void scsi_io_completion(struct scsi_cmnd
>  				scsi_requeue_command(q, cmd);
>  				return;
>  			}
> -			printk(KERN_INFO "Device %s not ready.\n",
> -			       req->rq_disk ? req->rq_disk->disk_name : "");
> +			if (!(req->flags & REQ_QUIET))
> +				dev_printk(KERN_INFO,
> +					   &cmd->device->sdev_gendev,
> +					   "Device not ready.\n");
>  			cmd = scsi_end_request(cmd, 0, this_count, 1);
>  			return;
>  		case VOLUME_OVERFLOW:
> -			printk(KERN_INFO "Volume overflow <%d %d %d %d> CDB: ",
> -			       cmd->device->host->host_no,
> -			       (int)cmd->device->channel,
> -			       (int)cmd->device->id, (int)cmd->device->lun);
> -			__scsi_print_command(cmd->data_cmnd);
> -			scsi_print_sense("", cmd);
> +			if (!(req->flags & REQ_QUIET)) {
> +				dev_printk(KERN_INFO,
> +					   &cmd->device->sdev_gendev,
> +					   "Volume overflow, CDB: ");
> +				__scsi_print_command(cmd->data_cmnd);
> +				scsi_print_sense("", cmd);
> +			}
>  			cmd = scsi_end_request(cmd, 0, block_bytes, 1);
>  			return;
>  		default:
> @@ -954,15 +957,13 @@ void scsi_io_completion(struct scsi_cmnd
>  		return;
>  	}
>  	if (result) {
> -		if (!(req->flags & REQ_SPECIAL))
> -			printk(KERN_INFO "SCSI error : <%d %d %d %d> return code "
> -			       "= 0x%x\n", cmd->device->host->host_no,
> -			       cmd->device->channel,
> -			       cmd->device->id,
> -			       cmd->device->lun, result);
> +		if (!(req->flags & REQ_QUIET)) {
> +			dev_printk(KERN_INFO, &cmd->device->sdev_gendev,
> +				   "SCSI error: return code = 0x%x\n", result);
>  
> -		if (driver_byte(result) & DRIVER_SENSE)
> -			scsi_print_sense("", cmd);
> +			if (driver_byte(result) & DRIVER_SENSE)
> +				scsi_print_sense("", cmd);
> +		}
>  		/*
>  		 * Mark a single buffer as not uptodate.  Queue the remainder.
>  		 * We sometimes get this cruft in the event that a medium error

This patch fixes it, and there was no message during boot about not being 
ready, nor after the machine had fully booted.  Great ;-)

However, I did get an oops when warm booting the kernel.  I suspect this may be 
the oops that I get every now and then when warm rebooting, with no real 
pattern, so it possibly isn't related to the patch.  As my serial console wasn't 
set up at the time, I took a photo instead, at 
http://www.reub.net/kernel/scsi-oops.jpg

Thanks
reuben


