From: Ming Lei <ming.lei@redhat.com>
To: Waiman Long <longman@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Tejun Heo <tj@kernel.org>,
ming.lei@redhat.com
Subject: Re: [PATCH 2/2] blk-cgroup: fix list corruption from reorder of WRITE ->lqueued
Date: Thu, 16 May 2024 08:58:50 +0800 [thread overview]
Message-ID: <ZkVaSg154oU3XltA@fedora> (raw)
In-Reply-To: <2a0cae15-e9d9-4524-a0db-f665b1832db4@redhat.com>
On Wed, May 15, 2024 at 10:09:30AM -0400, Waiman Long wrote:
>
> On 5/14/24 21:31, Ming Lei wrote:
> > __blkcg_rstat_flush() can be run anytime, especially when blk_cgroup_bio_start
> > is being executed.
> >
> > If WRITE of `->lqueued` is re-ordered with READ of 'bisc->lnode.next' in
> > the loop of __blkcg_rstat_flush(), `next_bisc` can be assigned with one
> > stat instance being added in blk_cgroup_bio_start(), then the local
> > list in __blkcg_rstat_flush() could be corrupted.
> >
> > Fix the issue by adding one barrier.
> >
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Waiman Long <longman@redhat.com>
> > Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> > block/blk-cgroup.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> > index 86752b1652b5..b36ba1d40ba1 100644
> > --- a/block/blk-cgroup.c
> > +++ b/block/blk-cgroup.c
> > @@ -1036,6 +1036,16 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu)
> > struct blkg_iostat cur;
> > unsigned int seq;
> > + /*
> > + * Order assignment of `next_bisc` from `bisc->lnode.next` in
> > + * llist_for_each_entry_safe and clearing `bisc->lqueued` for
> > + * avoiding to assign `next_bisc` with new next pointer added
> > + * in blk_cgroup_bio_start() in case of re-ordering.
> > + *
> > + * The pair barrier is implied in llist_add() in blk_cgroup_bio_start().
> > + */
> > + smp_mb();
> > +
> > WRITE_ONCE(bisc->lqueued, false);
>
> I believe replacing WRITE_ONCE() by smp_store_release() and the READ_ONCE()
> in blk_cgroup_bio_start() by smp_load_acquire() should provide enough
> barrier to prevent unexpected reordering as
Yeah, smp_load_acquire() and smp_store_release() pair works too, but with
one extra cost of smp_mb() around llist_add() which implies barrier
already.
So I prefer to this patch.
> the subsequent
> u64_stats_fetch_begin() will also provide a read barrier for subsequent
> read.
u64_stats_fetch_begin() is nop in case of `BITS_PER_LONG == 64`.
Thanks,
Ming
next prev parent reply other threads:[~2024-05-16 0:59 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-15 1:31 [PATCH 0/2] blk-cgroup: two fixes on list corruption Ming Lei
2024-05-15 1:31 ` [PATCH 1/2] blk-cgroup: fix list corruption from resetting io stat Ming Lei
2024-05-15 13:59 ` Waiman Long
2024-05-15 16:49 ` Tejun Heo
2024-05-15 1:31 ` [PATCH 2/2] blk-cgroup: fix list corruption from reorder of WRITE ->lqueued Ming Lei
2024-05-15 14:09 ` Waiman Long
2024-05-15 14:46 ` Waiman Long
2024-05-16 0:58 ` Ming Lei [this message]
2024-05-16 2:21 ` [PATCH 0/2] blk-cgroup: two fixes on list corruption Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZkVaSg154oU3XltA@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=linux-block@vger.kernel.org \
--cc=longman@redhat.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox