Re: filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change"

public inbox for linux-kernel@vger.kernel.org

From: Ming Lei <ming.lei@redhat.com>
To: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: James Bottomley <jejb@linux.ibm.com>,
	John Garry <john.garry@huawei.com>,
	Andrea Righi <andrea.righi@canonical.com>,
	Martin Wilck <martin.wilck@suse.com>,
	Bart Van Assche <bvanassche@acm.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change"
Date: Thu, 31 Mar 2022 10:14:34 +0800
Message-ID: <YkUOip75R8DH613s@T590>
In-Reply-To: <ba090f1b-a767-46a1-5728-82d9c587ef3c@opensource.wdc.com>

On Thu, Mar 31, 2022 at 07:30:35AM +0900, Damien Le Moal wrote:
> On 3/30/22 22:48, Ming Lei wrote:
> > On Wed, Mar 30, 2022 at 09:31:35AM -0400, James Bottomley wrote:
> >> On Wed, 2022-03-30 at 13:59 +0100, John Garry wrote:
> >>> On 30/03/2022 12:21, Andrea Righi wrote:
> >>>> On Wed, Mar 30, 2022 at 11:38:02AM +0100, John Garry wrote:
> >>>>> On 30/03/2022 11:11, Andrea Righi wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> after this commit I'm experiencing some filesystem corruptions
> >>>>>> at boot on a power9 box with an aacraid controller.
> >>>>>>
> >>>>>> At the moment I'm running a 5.15.30 kernel; when the filesystem
> >>>>>> is mounted at boot I see the following errors in the console:
> >>>
> >>> About "scsi: core: Reallocate device's budget map on queue depth
> >>> change" being added to a stable kernel, I am not sure if this was
> >>> really a fix or just a memory optimisation.
> >>
> >> I can see how it becomes the problem: it frees and allocates a new
> >> bitmap across a queue freeze, but bits in the old one might still be in
> >> use.  This isn't a problem except when they return and we now possibly
> >> see a tag greater than we think we can allocate coming back. 
> >> Presumably we don't check this and we end up doing a write to
> >> unallocated memory.
> >>
> >> I think if you want to reallocate on queue depth reduction, you might
> >> have to drain the queue as well as freeze it.
> > 
> > After queue is frozen, there can't be any in-flight request/scsi
> > command, so the sbitmap is zeroed at that time, and safe to reallocate.
> > 
> > The problem is aacraid specific, since the driver has hard limit
> > of 256 queue depth, see aac_change_queue_depth().
> 
> 256 is the scsi hard limit per device... Any SAS drive has the same limit
> by default since there is no way to know the max queue depth of a scsi
> disk. So what is special about aacraid?
> 

I meant that aac_change_queue_depth() sets a hard limit of 256.

Yeah, any HBA driver which implements its own .change_queue_depth()
may enforce a hard limit of its own there.

So I still don't understand why you say '256 is the scsi hard limit per
device'; where is the code for that? If both .cmd_per_lun and .can_queue
are > 256, the driver uses the default scsi_change_queue_depth(), and
sdev->tagged_supported is true, then the user is free to raise the queue
depth above 256 via /sys/block/$SDN/device/queue_depth. The same holds
for SAS, see sas_change_queue_depth().
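
To make the difference between the two paths concrete, here is a minimal
userspace sketch in C. It is a simplified model, not the kernel code: the
names model_host, model_sdev, model_default_change_depth(),
model_sas_change_depth(), model_capped_change_depth() and MODEL_DRIVER_CAP
are all hypothetical, .cmd_per_lun is not modeled, and the sketch assumes
that the only bound on the default path is the host's can_queue.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the host and device state. */
struct model_host {
	int can_queue;          /* host-wide limit on outstanding commands */
};

struct model_sdev {
	const struct model_host *host;
	bool tagged_supported;  /* device supports tagged queueing */
	int queue_depth;        /* current per-device queue depth */
};

/* Assumed cap for a driver with its own .change_queue_depth() limit. */
#define MODEL_DRIVER_CAP 256

/*
 * Default-style path: the request is only bounded by can_queue
 * (an assumption of this model). Non-positive requests are ignored.
 */
int model_default_change_depth(struct model_sdev *sdev, int depth)
{
	if (depth > sdev->host->can_queue)
		depth = sdev->host->can_queue;
	if (depth > 0)
		sdev->queue_depth = depth;
	return sdev->queue_depth;
}

/*
 * SAS-style path: untagged devices are forced to depth 1, everything
 * else goes through the default path unchanged.
 */
int model_sas_change_depth(struct model_sdev *sdev, int depth)
{
	if (!sdev->tagged_supported)
		depth = 1;
	return model_default_change_depth(sdev, depth);
}

/*
 * Driver-capped path: the driver clamps the request to its own hard
 * limit before handing it to the default path.
 */
int model_capped_change_depth(struct model_sdev *sdev, int depth)
{
	if (depth > MODEL_DRIVER_CAP)
		depth = MODEL_DRIVER_CAP;
	else if (depth < 1)
		depth = 1;
	return model_default_change_depth(sdev, depth);
}
```

In this model, with can_queue = 1024 and tagged_supported set, a request
for 512 yields 512 on the default and SAS-style paths, but 256 on the
driver-capped path.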

Also, I am pretty sure some types of SCSI device can support a queue depth
> 256, including SAS, and SAS devices usually have a large queue depth.


Thanks,
Ming
