From: Hannes Reinecke <hare@suse.de>
To: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>,
Rocko Requin <rockorequin@hotmail.com>,
Jun'ichi Nomura <j-nomura@ce.jp.nec.com>,
Ben Hutchings <ben@decadent.org.uk>,
jaxboe@fusionio.com,
James Bottomley <James.Bottomley@HansenPartnership.com>,
tytso@mit.edu,
Kernel development list <linux-kernel@vger.kernel.org>,
linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [Bug 25832] kernel crashes when a mounted ext3/4 file system is physically removed
Date: Thu, 22 Sep 2011 18:32:26 +0200
Message-ID: <4E7B631A.20605@suse.de>
In-Reply-To: <20110922162047.GD3431@oc1711230544.ibm.com>
On 09/22/2011 06:20 PM, Thadeu Lima de Souza Cascardo wrote:
> On Thu, Sep 22, 2011 at 11:16:30AM -0400, Alan Stern wrote:
>> Rocko:
>>
>> Can you try testing this patch instead of all the patches I sent to
>> you (but keep Ted's patch)?
>>
>> Alan Stern
>>
>> On Thu, 22 Sep 2011, Hannes Reinecke wrote:
>>
>>> On 09/20/2011 09:32 AM, Jun'ichi Nomura wrote:
>>>> On 09/19/11 08:00, Ben Hutchings wrote:
>>> [ .. ]
>>>>>
>>>>> There have been reports of this in Debian going back to 2.6.39:
>>>>>
>>>>> http://bugs.debian.org/631187
>>>>> http://bugs.debian.org/636263
>>>>> http://bugs.debian.org/642043
>>>>>
>>>>> Plus possibly related crashes in elv_put_request after CD-ROM removal:
>>>>>
>>>>> http://bugs.debian.org/633890
>>>>> http://bugs.debian.org/634681
>>>>> http://bugs.debian.org/636103
>>>>>
>>>>> The former was also reported in Ubuntu since their 2.6.38-10:
>>>>>
>>>>> https://bugs.launchpad.net/debian/+source/linux-2.6/+bug/793796
>>>>>
>>>>> The result of the discussion there was that it appeared to be a
>>>>> regression due to commit 86cbfb5607d4b81b1a993ff689bbd2addd5d3a9b
>>>>> ("[SCSI] put stricter guards on queue dead checks") which was also
>>>>> included in a stable update for 2.6.38.
>>>>>
>>>>> There was also a report on bugzilla.kernel.org, though no-one can see
>>>>> quite what that says now:
>>>>>
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=38842
>>>>>
>>>>> I also reported most of the above to James Bottomley and linux-scsi
>>>>> nearly 2 months ago, to no response.
>>>>
>>>> I've reported a similar oops related to the above commit:
>>>> [BUG] Oops when SCSI device under multipath is removed
>>>> https://lkml.org/lkml/2011/8/10/11
>>>>
>>>> Elevator being removed is the core of the problem.
>>>> And the essential issue seems 2 different models of queue/driver relation
>>>> implied by queue_lock.
>>>>
>>>> If reverting the commit is not an option,
>>>> until somebody comes up to fix the essential issue,
>>>> the patch below should close the regressions introduced by the commit.
>>>>
>>> Why do you have to do it that complicated?
>>> Couldn't we just state that any external lock is being disconnected from
>>> queue_lock after blk_cleanup_queue()?
>>>
>>> Then something like this should suffice here:
>>
>>
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 90e1ffd..a4ac005 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -367,10 +367,8 @@ void blk_cleanup_queue(struct request_queue *q)
>>  	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
>>  	mutex_unlock(&q->sysfs_lock);
>>  
>> -	if (q->elevator)
>> -		elevator_exit(q->elevator);
>> -
>> -	blk_throtl_exit(q);
>> +	if (q->queue_lock != q->__queue_lock)
>> +		q->queue_lock = q->__queue_lock;
>
> That should be &q->__queue_lock.
>
Why, but of course.
That has been fixed in the official patch
(cf. "block: Free queue resources at blk_release_queue()").
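
For readers following along, here is a minimal userspace sketch of why the
address-of operator is needed. It assumes, as the correction above implies,
that queue_lock is a pointer while __queue_lock is a lock embedded in the
queue. The mock_* names are hypothetical stand-ins, not kernel API, and the
struct is reduced to the two fields under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's spinlock_t (illustration only). */
typedef struct { int locked; } mock_spinlock_t;

/* Reduced model of a request queue: only the two lock fields matter here.
 * The field names mirror the ones used in the quoted diff. */
struct mock_request_queue {
	mock_spinlock_t *queue_lock;  /* pointer: may reference a driver-owned lock */
	mock_spinlock_t __queue_lock; /* lock embedded in the queue itself */
};

/* Repoint queue_lock at the queue's own embedded lock.
 * Comparing or assigning q->__queue_lock without '&' would mix a
 * pointer with a struct value and would not compile. */
static void mock_detach_external_lock(struct mock_request_queue *q)
{
	assert(q != NULL);
	if (q->queue_lock != &q->__queue_lock)
		q->queue_lock = &q->__queue_lock;
}

/* True once the queue no longer depends on an external lock. */
static bool mock_uses_internal_lock(const struct mock_request_queue *q)
{
	return q->queue_lock == &q->__queue_lock;
}
```

After mock_detach_external_lock() runs, the queue no longer refers to a lock
whose lifetime is controlled by the driver, which is the idea behind
disconnecting the external lock at blk_cleanup_queue() time.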
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)