From: "Yu Kuai" <yukuai@fnnas.com>
To: "Xiao Ni" <xni@redhat.com>
Cc: <ncroxon@redhat.com>, <linux-raid@vger.kernel.org>, <yukuai@fnnas.com>
Subject: Re: [PATCH v3 1/1] md/raid1: serialize overlap io for writemostly disk
Date: Tue, 7 Apr 2026 13:30:26 +0800
Message-ID: <aafc7d27-ae0d-4580-b1f9-752d3e3e7324@fnnas.com>
In-Reply-To: <20260324072501.59865-1-xni@redhat.com>

On 2026/3/24 15:24, Xiao Ni wrote:
> Previously, using wait_event() would wake up all waiters simultaneously,
> and they would compete for the tree lock. The bio which gets the lock
> first will be handled, so the write sequence cannot be guaranteed.
>
> For example:
> bio1(100,200)
> bio2(150,200)
> bio3(150,300)
>
> The write sequence on the fast device is bio1, bio2, bio3, but the write
> sequence on the slow device could be bio1, bio3, bio2 due to lock
> competition. This causes data corruption.
>
> Replace the waitqueue with a FIFO list to guarantee the write sequence.
> Removing an entry also needs to iterate the list. If it does not, it may
> miss the opportunity to wake up a waiting io.
>
> For example:
> bio1(1,3), bio2(2,4)
> bio3(5,7), bio4(6,8)
> These four bios are in the same bucket. bio1 and bio3 are inserted into
> the rbtree. bio2 and bio4 are added to the waiting list, with bio2 first.
> bio3 returns from the slow disk and tries to wake up the waiting bios.
> bio2 is removed from the list and will be handled, but bio1 hasn't
> finished, so bio2 is added to the waiting list again. Then bio1 returns
> from the slow disk and wakes up the waiting bios. bio4 is removed from the
> list and will be handled. Now bio1, bio3 and bio4 have all finished and
> bio2 is left on the waiting list. So the waiting list must be iterated to
> wake up the right bio.
>
> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
> v2: use prepare_to_wait_exclusive
> v3: go back to a self-managed FIFO list to find the right waiting node
>  drivers/md/md.c    |  1 -
>  drivers/md/md.h    |  5 ++++-
>  drivers/md/raid1.c | 45 ++++++++++++++++++++++++++++++++++-----------
>  3 files changed, 38 insertions(+), 13 deletions(-)
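
For readers following the two examples in the quoted commit message, here is
a minimal user-space sketch of the idea, not the code from the patch. It
assumes each pair such as bio1(1,3) is an inclusive start/end range, uses a
plain array where the driver uses an rbtree, and every name in it
(struct bucket, submit_io, complete_io, MAX_IO) is hypothetical. On each
completion it walks the whole FIFO, oldest waiter first, and admits every
waiter that overlaps neither an in-flight io nor an older io that is still
waiting. The second check is an assumption about one way to keep overlapping
writes in submission order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IO 16

/* Inclusive sector range [lo, hi]; id is a hypothetical label for the bio. */
struct io_range {
	int id;
	int lo;
	int hi;
};

/* Hypothetical stand-in for one serialization bucket. */
struct bucket {
	struct io_range active[MAX_IO];   /* in flight; stands in for the rbtree */
	size_t nr_active;
	struct io_range waiting[MAX_IO];  /* FIFO; index 0 is the oldest waiter */
	size_t nr_waiting;
};

static bool overlaps(const struct io_range *a, const struct io_range *b)
{
	return a->lo <= b->hi && b->lo <= a->hi;
}

static bool conflicts(const struct io_range *set, size_t n,
		      const struct io_range *io)
{
	for (size_t i = 0; i < n; i++)
		if (overlaps(&set[i], io))
			return true;
	return false;
}

/* Start the io now if nothing overlaps it, else queue it at the FIFO tail.
 * Returns true if the io started immediately. */
static bool submit_io(struct bucket *b, struct io_range io)
{
	assert(b->nr_active + b->nr_waiting < MAX_IO);
	if (!conflicts(b->active, b->nr_active, &io) &&
	    !conflicts(b->waiting, b->nr_waiting, &io)) {
		b->active[b->nr_active++] = io;
		return true;
	}
	b->waiting[b->nr_waiting++] = io;
	return false;
}

/* Complete the in-flight io with this id, then scan the whole FIFO.
 * Stores the ids of admitted waiters in woken[] and returns their count. */
static size_t complete_io(struct bucket *b, int id, int *woken)
{
	size_t i, kept = 0, nr_woken = 0;

	/* Drop the finished io from the in-flight set. */
	for (i = 0; i < b->nr_active; i++) {
		if (b->active[i].id == id) {
			for (; i + 1 < b->nr_active; i++)
				b->active[i] = b->active[i + 1];
			b->nr_active--;
			break;
		}
	}

	/* Oldest first: admit a waiter only if it overlaps neither an
	 * in-flight io nor an older waiter that stays queued. */
	for (i = 0; i < b->nr_waiting; i++) {
		struct io_range io = b->waiting[i];

		if (conflicts(b->active, b->nr_active, &io) ||
		    conflicts(b->waiting, kept, &io)) {
			b->waiting[kept++] = io;  /* keeps its FIFO position */
		} else {
			b->active[b->nr_active++] = io;
			woken[nr_woken++] = io.id;
		}
	}
	b->nr_waiting = kept;
	return nr_woken;
}
```

With bio1(1,3) and bio3(5,7) in flight and bio2(2,4), bio4(6,8) queued,
completing bio3 admits bio4 while bio2 keeps its place, and completing bio1
then admits bio2, so no waiter is left behind. With bio1(100,200),
bio2(150,200), bio3(150,300), the completions admit bio2 before bio3.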
Applied to md-7.1
--
Thanks,
Kuai