From: Brian King <brking@charter.net>
To: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: James Stevenson <james@stev.org>, linux-kernel@vger.kernel.org
Subject: Re: OOPS in idescsi_end_request
Date: Thu, 23 Jan 2003 21:15:44 -0600 [thread overview]
Message-ID: <3E30AFE0.5060400@charter.net> (raw)
In-Reply-To: <E18bd5l-000Duj-00@f16.mail.ru>
Can the interrupt handler even look at the request structure in this
scenario though? I can't say I know much about the ide subsystem but it
seems like after ide-scsi returns success on the reset function then it
should have already pushed scsi_done for the op and should not be
looking at the request structure. Would it be possible to have the
ide-scsi reset routine invoke ide_do_reset? Is there a problem with
multiple callers of ide_do_reset? I would think the right solution would
be to do this and not return success until the reset was successful and
the op was sent back to the mid-layer.
-Brian
Andrey Borzenkov wrote:
> As long as we cannot easily abort IDE request (correct me if I am wrong) the workaround seems to be to mark current request as aborted in idescsi_abort and ignore it later in idescsi_end_request, i.e. something like (with new flag PC_ABORTED)
>
> struct request *rq = HWGROUP(idescsi_drives[cmd->target])->rq;
> idescsi_pc_t *pc = (idescsi_pc_t *) rq->buffer;
> pc->flags |= PC_ABORTED;
>
> and later on assume we can ignore SCSI layer completely in this case in idescsi_end_request, just do general cleanup.
>
> If you can reliably reproduce the problem you could give it a try.
>
> Anybody sees yet another race condition here? :))
>
> -andrey
>
>
>>While burning a CD tonight I ended up taking an oops on my system. I had
>>the lkcd patch applied to my 2.4.19 kernel, so I was able to look at the
>> oops after my system rebooted. After digging into it a little and
>>looking at the ide-scsi code I think I found the problem but am not
>>sure. How can idescsi_reset simply return SCSI_RESET_SUCCESS to the scsi
>>mid layer? I think what is happening is that a command times out,
>>idescsi_abort is called, which returns SCSI_ABORT_SNOOZE. Later on
>>idescsi_reset gets called, which returns SCSI_RESET_SUCCESS. At this
>>point the scsi mid-layer owns the scsi_cmnd and returns the failure back
>>up the chain. Later on, the command gets run through
>>idescsi_end_request, which then tries to access the scsi_cmnd structure
>>which is it no longer owns.
>>
>>Any help is appreciated. I have a complete lkcd dump of the failure if
>>anyone would like more information...
>>
>>-Brian King
>>
>>
>>Here is the last bit in the log buffer:
>>
>> <4>scsi : aborting command due to timeout : pid 2534304, scsi0,
>>channel 0, id 1, lun 0 Write (10) 00 00 01 1e 91 00 00 1b 00
>> <4>hdk: timeout waiting for DMA
>> <4>ide_dmaproc: chipset supported ide_dma_timeout func only: 14
>> <4>hdk: status timeout: status=0xd8 { Busy }
>> <4>hdk: drive not ready for command
>> <4>hdk: ATAPI reset complete
>> <4>hdk: irq timeout: status=0x80 { Busy }
>> <4>hdk: ATAPI reset complete
>> <4>hdk: irq timeout: status=0x80 { Busy }
>> <1>Unable to handle kernel NULL pointer dereference at virtual
>>address 00000184
>> <4> printing eip:
>> <4>e0fd22f1
>> <1>*pde = 00000000
>> <4>Oops: 0002
>> <4>CPU: 0
>> <4>EIP: 0010:[<e0fd22f1>] Tainted: PF
>> <4>EFLAGS: 00010046
>> <4>eax: 00000000 ebx: 00000000 ecx: dfef8000 edx: c75bcbc0
>> <4>esi: 00000080 edi: c0491938 ebp: d5908000 esp: c0435ea4
>> <4>ds: 0018 es: 0018 ss: 0018
>> <4>Process swapper (pid: 0, stackpage=c0435000)
>> <4>Stack: c0491938 00000000 00000000 c0491938 00000088 000001f4
>>c03349e2 c75bcbc0
>> <4> ce0a3b80 c0491938 00000080 00000080 c75bcbc0 c0222d6c
>>00000000 c1671580
>> <4> 00000080 c04918f4 c0491938 c0434000 c1671580 e0fd2550
>>c0223b30 c0491938
>> <4>Call Trace: [<c0222d6c>] [<e0fd2550>] [<c0223b30>]
>>[<c0223990>] [<c0127af0>]
>> <4> [<c01233d4>] [<c01232a6>] [<c01230ed>] [<c010a97f>]
>>[<c010d173>] [<c0106f80>]
>> <4> [<c0106fa3>] [<c0107012>] [<c0105000>]
>> <4>
>> <4>Code: c7 80 84 01 00 00 00 00 07 00 75 72 9c 5e fa bb 00 e0 ff ff
>>
>>
>> From lkcd:
>>
>>================================================================
>>STACK TRACE FOR TASK: 0xc0434000 (swapper)
>>
>> 0 [ide-scsi]idescsi_end_request+129 [0xe0fd22f1]
>>TRACE ERROR 0x800000000
>
>
>
--
Some days it's just not worth chewing through the restraints...
next prev parent reply other threads:[~2003-01-24 3:14 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-01-23 8:52 OOPS in idescsi_end_request Andrey Borzenkov
2003-01-23 9:19 ` Strong kernel lock with linux-2.5.59 : futex in Huge Pages dada1
2003-01-23 9:41 ` Andrew Morton
2003-01-24 3:15 ` Brian King [this message]
2003-01-24 6:21 ` Re[2]: OOPS in idescsi_end_request Andrey Borzenkov
2003-01-24 9:06 ` James Stevenson
2003-01-24 13:39 ` Brian King
-- strict thread matches above, loose matches on Subject: below --
2003-01-22 3:41 Brian King
2003-01-22 21:49 ` James Stevenson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3E30AFE0.5060400@charter.net \
--to=brking@charter.net \
--cc=arvidjaar@mail.ru \
--cc=james@stev.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.