Re: filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change"


From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>,
	John Garry <john.garry@huawei.com>,
	Andrea Righi <andrea.righi@canonical.com>,
	Martin Wilck <martin.wilck@suse.com>,
	Bart Van Assche <bvanassche@acm.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change"
Date: Thu, 31 Mar 2022 15:12:09 +0900
Message-ID: <9cdf5eb3-201d-254e-1404-54522131eeb0@opensource.wdc.com>
In-Reply-To: <YkUOip75R8DH613s@T590>

On 3/31/22 11:14, Ming Lei wrote:
> On Thu, Mar 31, 2022 at 07:30:35AM +0900, Damien Le Moal wrote:
>> On 3/30/22 22:48, Ming Lei wrote:
>>> On Wed, Mar 30, 2022 at 09:31:35AM -0400, James Bottomley wrote:
>>>> On Wed, 2022-03-30 at 13:59 +0100, John Garry wrote:
>>>>> On 30/03/2022 12:21, Andrea Righi wrote:
>>>>>> On Wed, Mar 30, 2022 at 11:38:02AM +0100, John Garry wrote:
>>>>>>> On 30/03/2022 11:11, Andrea Righi wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> after this commit I'm experiencing some filesystem corruptions
>>>>>>>> at boot on a power9 box with an aacraid controller.
>>>>>>>>
>>>>>>>> At the moment I'm running a 5.15.30 kernel; when the filesystem
>>>>>>>> is mounted at boot I see the following errors in the console:
>>>>>
>>>>> About "scsi: core: Reallocate device's budget map on queue depth
>>>>> change" being added to a stable kernel, I am not sure if this was
>>>>> really a fix or just a memory optimisation.
>>>>
>>>> I can see how it becomes the problem: it frees and allocates a new
>>>> bitmap across a queue freeze, but bits in the old one might still be in
>>>> use.  This isn't a problem except when they return and we now possibly
>>>> see a tag greater than we think we can allocate coming back. 
>>>> Presumably we don't check this and we end up doing a write to
>>>> unallocated memory.
>>>>
>>>> I think if you want to reallocate on queue depth reduction, you might
>>>> have to drain the queue as well as freeze it.
>>>
>>> After the queue is frozen, there can't be any in-flight request/scsi
>>> command, so the sbitmap is zeroed at that time and is safe to reallocate.
>>>
>>> The problem is aacraid specific, since the driver has hard limit
>>> of 256 queue depth, see aac_change_queue_depth().
>>
>> 256 is the scsi hard limit per device... Any SAS drive has the same limit
>> by default since there is no way to know the max queue depth of a scsi
>> disk. So what is special about aacraid?
>>
> 
> I meant aac_change_queue_depth() sets a hard limit of 256.
> 
> Yeah, for any hba driver which implements its own .change_queue_depth(),
> there may be a hard limit there.
> 
> So I still don't understand why you mention '256 is the scsi hard limit per
> device', and where is the code? If both .cmd_per_lun and .can_queue are > 256,
> the driver uses the default scsi_change_queue_depth(), and sdev->tagged_supported
> is true, then the user is free to change the queue depth via
> /sys/block/$SDN/device/queue_depth to > 256. It is the same for SAS, see
> sas_change_queue_depth().
> 
> Also I am pretty sure some types of scsi device are capable of supporting >256 queue
> depth, including SAS, and SAS usually has a big queue depth.

Correct. The 256 hard limit comes from the old parallel scsi transport,
which had only 8 bits for the tag value. Other SCSI transports have
more bits for tags, allowing a higher maximum.

That said, the scsi stack limits the max queue depth to 1024 (see
scsi_device_max_queue_depth()), and most drivers set can_queue to 256 or
less by default.
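
To make the numbers above concrete, here is a rough userspace sketch of
how the limits combine. It is not the kernel code: the sketch_* names and
SKETCH_MAX_QUEUE_DEPTH are made up for illustration, and the clamping
order (requested depth, then host can_queue, then the 1024 cap) is an
assumption based on the description above.

```c
#include <assert.h>

/* Assumption: stands in for the 1024 cap mentioned above; illustrative name. */
#define SKETCH_MAX_QUEUE_DEPTH 1024

/* Number of distinct tag values a transport with tag_bits bits can carry. */
static unsigned int sketch_tag_space(unsigned int tag_bits)
{
	assert(tag_bits < 32);	/* keep the shift well-defined */
	return 1u << tag_bits;
}

static int sketch_min(int a, int b)
{
	return a < b ? a : b;
}

/* Per-device upper bound: the host's can_queue, capped by the stack limit. */
static int sketch_device_max_queue_depth(int can_queue)
{
	return sketch_min(can_queue, SKETCH_MAX_QUEUE_DEPTH);
}

/*
 * Clamp a requested depth to the per-device bound. A non-positive result
 * leaves the current depth unchanged (hypothetical policy for this sketch).
 */
static int sketch_change_queue_depth(int current_depth, int requested,
				     int can_queue)
{
	int depth = sketch_min(requested,
			       sketch_device_max_queue_depth(can_queue));

	return depth > 0 ? depth : current_depth;
}
```

With these helpers, an 8-bit tag gives sketch_tag_space(8) == 256, and a
request for depth 2000 on a host with can_queue 4096 would be clamped to
1024, while a host with can_queue 256 would cap the same request at 256.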

> 
> 
> Thanks,
> Ming
> 


-- 
Damien Le Moal
Western Digital Research


Thread overview: 10+ messages
2022-03-30 10:11 filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change" Andrea Righi
2022-03-30 10:38 ` John Garry
2022-03-30 11:21   ` Andrea Righi
2022-03-30 12:59     ` John Garry
2022-03-30 13:31       ` James Bottomley
2022-03-30 13:48         ` Ming Lei
2022-03-30 22:30           ` Damien Le Moal
2022-03-31  2:14             ` Ming Lei
2022-03-31  6:12               ` Damien Le Moal [this message]
2022-03-30 13:41       ` Ming Lei
