Re: [PATCH] block: update queue limits atomically

public inbox for linux-block@vger.kernel.org

From: Mikulas Patocka <mpatocka@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>, Alasdair Kergon <agk@redhat.com>,
	 Mike Snitzer <snitzer@kernel.org>,
	linux-block@vger.kernel.org,  dm-devel@lists.linux.dev
Subject: Re: [PATCH] block: update queue limits atomically
Date: Tue, 18 Mar 2025 16:31:35 +0100 (CET)
Message-ID: <d5193df0-5944-8cf6-7eb6-26adf191269e@redhat.com>
In-Reply-To: <Z9mJmlhmZwnOlnqT@fedora>

On Tue, 18 Mar 2025, Ming Lei wrote:

> On Tue, Mar 18, 2025 at 03:26:10PM +0100, Mikulas Patocka wrote:
> > The block limits may be read while they are being modified. The statement
> 
> It is supposed to not be so for IO path, that is why queue is usually down
> or frozen when updating limit.

The limits are read at some points when constructing a bio - for example 
bio_integrity_add_page, bvec_try_merge_hw_page, bio_integrity_map_user.

> For other cases, limit lock can be held for sync the read/write.
> 
> Or you have cases not covered by both queue freeze and limit lock?

For example, device mapper reads the limits of the underlying devices 
without holding any lock (dm_set_device_limits, __process_abnormal_io, 
__max_io_len). It also writes the limits in the I/O path - 
disable_discard, disable_write_zeroes - you couldn't easily lock it there 
because it happens in interrupt context.

I'm not sure how many other kernel subsystems do it and whether they could 
all be converted to locking.

> > "q->limits = *lim" is not really atomic. The compiler may turn it into
> > memcpy (clang does).
> > 
> > On x86-64, the kernel uses the "rep movsb" instruction for memcpy - it is
> > optimized on modern CPUs, but it is not atomic, it may be interrupted at
> > any byte boundary - and if it is interrupted, the readers may read
> > garbage.
> > 
> > On sparc64, there's an instruction that zeroes a cache line without
> > reading it from memory. The kernel memcpy implementation uses it (see
> > b3a04ed507bf) to avoid loading the destination buffer from memory. The
> > problem is that if we copy a block of data to q->limits and someone reads
> > it at the same time, the reader may read zeros.
> > 
> > This commit changes it to use WRITE_ONCE, so that individual words are
> > updated atomically.
> 
> It isn't necessary, for this particular problem, it is also fragile to
> provide atomic word update in this low level way, such as, what if
> sizeof(struct queue_limits) isn't 8byte aligned?

struct queue_limits contains two "unsigned long" fields, so it must be 
aligned on an "unsigned long" boundary.

In order to make it future-proof, we could use __alignof__(struct 
queue_limits) to determine the size of the update step.
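
A minimal userspace sketch of the word-by-word update discussed above, 
with compile-time checks on alignment and size. Everything here is an 
assumption for illustration: struct demo_limits is a hypothetical stand-in 
for struct queue_limits, demo_copy_limits is not the function from the 
patch, and a volatile store only approximates the kernel's WRITE_ONCE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct queue_limits (the real one has many more fields). */
struct demo_limits {
	unsigned long seg_boundary_mask;
	unsigned long virt_boundary_mask;
	unsigned int max_hw_sectors;
	unsigned int max_sectors;
};

/*
 * Word type that is allowed to alias any other type. The kernel builds with
 * -fno-strict-aliasing; in userspace we need the attribute (GCC/clang).
 */
typedef unsigned long __attribute__((__may_alias__)) demo_word_t;

/* The word-sized update step is only valid if the struct is word-aligned... */
static_assert(__alignof__(struct demo_limits) >= __alignof__(unsigned long),
	      "struct must be aligned to at least unsigned long");
/* ...and its size is a whole number of words. */
static_assert(sizeof(struct demo_limits) % sizeof(unsigned long) == 0,
	      "struct size must be a multiple of unsigned long");

/*
 * Copy src to dst one word at a time. Each volatile store approximates
 * WRITE_ONCE(): the compiler may not split it, merge it, or replace the loop
 * with memcpy. A concurrent reader can still see a mix of old and new words,
 * but should not see a partially written word on common 64-bit targets.
 */
static void demo_copy_limits(struct demo_limits *dst,
			     const struct demo_limits *src)
{
	volatile demo_word_t *d = (volatile demo_word_t *)dst;
	const demo_word_t *s = (const demo_word_t *)src;
	size_t i;

	for (i = 0; i < sizeof(*dst) / sizeof(demo_word_t); i++)
		d[i] = s[i];
}
```

This only protects individual words; fields that must change together 
would still need a lock or a frozen queue.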

> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> 
> stable often requires bug description.

The bug is that you may read mangled numbers when reading the 
queue_limits.

Mikulas

Thread overview: 12+ messages
2025-03-18 14:26 [PATCH] block: update queue limits atomically Mikulas Patocka
2025-03-18 14:56 ` Ming Lei
2025-03-18 15:31   ` Mikulas Patocka [this message]
2025-03-19  1:22     ` Ming Lei
2025-03-19  1:58       ` Jens Axboe
2025-03-19 21:18         ` Mikulas Patocka
2025-03-20  2:22           ` Ming Lei
2025-03-20 12:58             ` Jens Axboe
2025-03-20  5:26         ` Christoph Hellwig
2025-03-18 15:58 ` Bart Van Assche
2025-03-18 16:13   ` Mikulas Patocka
2025-03-20  5:25 ` Christoph Hellwig
