Re: Kernel panic reading bad disk sector

The Linux Kernel Mailing List

* Re: Kernel panic reading bad disk sector
       [not found]         ` <20051123095640.GA5022@flint.arm.linux.org.uk>
@ 2005-11-23 10:26           ` Chris Ross
  2005-11-23 10:48             ` Chris Ross
  2005-11-23 12:51             ` Chris Ross
  0 siblings, 2 replies; 3+ messages in thread
From: Chris Ross @ 2005-11-23 10:26 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Greg Ungerer, linux-arm-kernel, linux-kernel



Russell King - ARM Linux wrote:
> On Wed, Nov 23, 2005 at 09:25:40AM +0000, Chris Ross wrote:
>>Greg Ungerer wrote:
>>>Chris Ross wrote:
>>>
>>>>According to System.map it is in the function ide_dma_timeout_retry.
>>>
>>>Ok, that is good information. I would try and figure out which
>>>line of code in there is dereferencing a NULL pointer.
>>
>>It would seem to be this line
>>
>>	rq->errors = 0;

because rq is set to NULL by the earlier line

	ret = DRIVER(drive)->error(drive, "dma timeout retry",
				hwif->INB(IDE_STATUS_REG));


> I'd strongly suggest that you talk to IDE folk about this - I
> suspect HWGROUP(drive)->rq should never be NULL while a request
> is being handled on drive.

Which list specifically? I've taken your advice and "promoted" this to 
LKML so if that was wrong please correct it politely.

For those just tuning in, this is about an ARM system with a Promise 
20275 IDE controller that suffers a kernel panic when attempting to 
read from a bad sector on the disk.

Regards,
Chris R.


* Re: Kernel panic reading bad disk sector
  2005-11-23 10:26           ` Kernel panic reading bad disk sector Chris Ross
@ 2005-11-23 10:48             ` Chris Ross
  2005-11-23 12:51             ` Chris Ross
  1 sibling, 0 replies; 3+ messages in thread
From: Chris Ross @ 2005-11-23 10:48 UTC (permalink / raw)
  To: Chris Ross; +Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel



Chris Ross wrote:
> For those just tuning in this is about an ARM system with a Promise 
> 20275 IDE controller which suffers a kernel panic when attempting to 
> read from a bad sector on the disk.

And the vital piece of information I missed - on kernels 2.4.31-uc0 and 
2.4.27-uc1 at least. At this stage it's believed to be from upstream - 
i.e. it's probably also in vanilla 2.4.32 although I haven't tested that.

Regards,
Chris R.


* Re: Kernel panic reading bad disk sector
  2005-11-23 10:26           ` Kernel panic reading bad disk sector Chris Ross
  2005-11-23 10:48             ` Chris Ross
@ 2005-11-23 12:51             ` Chris Ross
  1 sibling, 0 replies; 3+ messages in thread
From: Chris Ross @ 2005-11-23 12:51 UTC (permalink / raw)
  To: Chris Ross
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel,
	Greg Ungerer


Chris Ross wrote:
> Russell King - ARM Linux wrote:
>> On Wed, Nov 23, 2005 at 09:25:40AM +0000, Chris Ross wrote:
>>> Greg Ungerer wrote:
>>>> Chris Ross wrote:
>>>>
>>>>> According to System.map it is in the function ide_dma_timeout_retry.
>>>>
>>>> Ok, that is good information. I would try and figure out which
>>>> line of code in there is dereferencing a NULL pointer.
>>>
>>> It would seem to be this line
>>>
>>>     rq->errors = 0;
> 
> because rq is set to NULL by the earlier line
> 
>     ret = DRIVER(drive)->error(drive, "dma timeout retry",
>                 hwif->INB(IDE_STATUS_REG));

Which looks like the correct thing to do. In idedisk_error, once the 
maximum number of retries has been reached, the request is ended 
because it cannot be serviced

	if (rq->errors >= ERROR_MAX)
		DRIVER(drive)->end_request(drive, 0);

In idedisk_end_request, HWGROUP(drive)->rq is explicitly set to NULL 
because the request has now ended, in the code block...

	if (!end_that_request_first(rq, uptodate, drive->name)) {
		add_blkdev_randomness(MAJOR(rq->rq_dev));
		blkdev_dequeue_request(rq);
		HWGROUP(drive)->rq = NULL;
		end_that_request_last(rq);
		ret = 0;
	}

Which means that ide_dma_timeout_retry should take account of the fact 
that the request might no longer be valid before using it.

In other words it should be...

	/* Check whether the request ended early due to disk errors */
	if (rq) {
		rq->errors = 0;
		rq->sector = rq->bh->b_rsector;
		rq->current_nr_sectors = rq->bh->b_size >> 9;
		rq->hard_cur_sectors = rq->current_nr_sectors;
		rq->buffer = rq->bh->b_data;
	}
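
To illustrate the sequence described above outside the kernel, here is a
minimal user-space sketch. It is only a model, not the 2.4 IDE code: the
struct layouts, the model_* function names and the ERROR_MAX value are
assumptions made for the illustration. It shows the error handler clearing
the hwgroup's request pointer once the retry limit is reached, and the
guarded retry path coping with that.

```c
#include <stddef.h>

#define ERROR_MAX 8	/* assumed retry limit for this model */

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct request {
	int errors;
};

struct hwgroup {
	struct request *rq;	/* request being serviced, or NULL */
};

/* Models idedisk_end_request: ending the request clears hwgroup->rq. */
static void model_end_request(struct hwgroup *hwgroup)
{
	hwgroup->rq = NULL;
}

/* Models idedisk_error: give up once the retry limit is reached. */
static void model_error(struct hwgroup *hwgroup)
{
	struct request *rq = hwgroup->rq;

	if (rq == NULL)
		return;
	if (++rq->errors >= ERROR_MAX)
		model_end_request(hwgroup);
}

/*
 * Models the tail of ide_dma_timeout_retry with the proposed guard.
 * Returns 1 if the request was reset for a retry, 0 if the error
 * handler had already ended it.
 */
static int model_timeout_retry(struct hwgroup *hwgroup)
{
	struct request *rq;

	model_error(hwgroup);

	rq = hwgroup->rq;
	hwgroup->rq = NULL;

	/* Check whether the request ended early due to disk errors */
	if (rq) {
		rq->errors = 0;
		return 1;
	}
	return 0;
}
```

In this model a request with few errors is reset and model_timeout_retry
returns 1, while a request one error short of ERROR_MAX is ended by the
error handler and the function returns 0 instead of dereferencing NULL.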


If anyone has a better solution I would be glad to hear it. Failing that 
I'll submit this in normal kernel patch format as soon as I've worked 
out how...

Regards,
Chris R.

Thread overview: 3+ messages
     [not found] <4381DA23.10201@tebibyte.org>
     [not found] ` <4382B815.5000701@snapgear.com>
     [not found]   ` <43836758.6050001@tebibyte.org>
     [not found]     ` <4383C205.7020608@snapgear.com>
     [not found]       ` <43843594.9050009@tebibyte.org>
     [not found]         ` <20051123095640.GA5022@flint.arm.linux.org.uk>
2005-11-23 10:26           ` Kernel panic reading bad disk sector Chris Ross
2005-11-23 10:48             ` Chris Ross
2005-11-23 12:51             ` Chris Ross
