From: Xiao Ni <xni@redhat.com>
To: Song Liu <song@kernel.org>
Cc: guoqing.jiang@linux.dev, linux-raid@vger.kernel.org, ffan@redhat.com
Subject: Re: [PATCH V2 1/1] Add mddev->io_acct_cnt for raid0_quiesce
Date: Fri, 18 Nov 2022 12:24:18 +0800
Message-ID: <CALTww29FEEDB4yQtYRA1ht2TiX8LGiysDD2DSah46Jc7-ShbpA@mail.gmail.com>
In-Reply-To: <CAPhsuW4Rv5_n8Zp0REr++UAaOab_fq4nBLHM=5Ai-zFkKknfhg@mail.gmail.com>

On Fri, Nov 18, 2022 at 10:37 AM Song Liu <song@kernel.org> wrote:
>
> On Thu, Nov 17, 2022 at 5:39 PM Xiao Ni <xni@redhat.com> wrote:
> >
> > On Fri, Nov 18, 2022 at 3:57 AM Song Liu <song@kernel.org> wrote:
> > >
> > > Hi Xiao,
> > >
> > > Thanks for the results.
> > >
> > > On Wed, Nov 16, 2022 at 6:03 PM Xiao Ni <xni@redhat.com> wrote:
> > > >
> > > > Hi Song
> > > >
> > > > The performance is good. Please check the result below.
> > > >
> > > > And for the patch itself, do you think we should add an smp_mb()
> > > > pair like this?
> > > >
> > > > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > > > index 4d0139cae8b5..3696e3825e27 100644
> > > > --- a/drivers/md/md.c
> > > > +++ b/drivers/md/md.c
> > > > @@ -8650,9 +8650,11 @@ static void md_end_io_acct(struct bio *bio)
> > > >         bio_put(bio);
> > > >         bio_endio(orig_bio);
> > > >
> > > > -       if (atomic_dec_and_test(&mddev->io_acct_cnt))
> > > > +       if (atomic_dec_and_test(&mddev->io_acct_cnt)) {
> > > > +               smp_mb();
> > > >                 if (unlikely(test_bit(MD_QUIESCE, &mddev->flags)))
> > > >                         wake_up(&mddev->wait_io_acct);
> > > > +       }
> > > >  }
> > > >
> > > > /*
> > > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> > > > index 9d4831ca802c..1818f79bfdf7 100644
> > > > --- a/drivers/md/raid0.c
> > > > +++ b/drivers/md/raid0.c
> > > > @@ -757,6 +757,7 @@ static void raid0_quiesce(struct mddev *mddev, int quiesce)
> > > >          * to member disks to avoid memory alloc and performance decrease
> > > >          */
> > > >         set_bit(MD_QUIESCE, &mddev->flags);
> > > > +       smp_mb();
> > > >         wait_event(mddev->wait_io_acct, !atomic_read(&mddev->io_acct_cnt));
> > > >         clear_bit(MD_QUIESCE, &mddev->flags);
> > > >  }
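> > > >
> > > > To illustrate why I think both barriers are needed: the two
> > > > smp_mb() calls pair Dekker-style, each ordering a store before a
> > > > load of the other side's variable. Below is a userspace C11
> > > > analogue of the ordering (purely illustrative, not kernel code;
> > > > the names are made up):
> > > >
> > > > #include <stdatomic.h>
> > > >
> > > > static atomic_int io_acct_cnt;          /* in-flight clones   */
> > > > static atomic_int quiesce;              /* ~ MD_QUIESCE bit   */
> > > >
> > > > static void waiter(void)                /* ~ raid0_quiesce()  */
> > > > {
> > > >         atomic_store_explicit(&quiesce, 1, memory_order_relaxed);
> > > >         atomic_thread_fence(memory_order_seq_cst); /* smp_mb() */
> > > >         while (atomic_load_explicit(&io_acct_cnt,
> > > >                                     memory_order_relaxed))
> > > >                 ;                       /* ~ wait_event()     */
> > > >         atomic_store_explicit(&quiesce, 0, memory_order_relaxed);
> > > > }
> > > >
> > > > static void waker(void)                 /* ~ md_end_io_acct() */
> > > > {
> > > >         if (atomic_fetch_sub_explicit(&io_acct_cnt, 1,
> > > >                                       memory_order_relaxed) == 1) {
> > > >                 atomic_thread_fence(memory_order_seq_cst); /* smp_mb() */
> > > >                 if (atomic_load_explicit(&quiesce,
> > > >                                          memory_order_relaxed))
> > > >                         ;               /* ~ wake_up()        */
> > > >         }
> > > > }
> > > >
> > > > Without both fences, the waker could read quiesce == 0 before the
> > > > waiter's store becomes visible, while the waiter reads a stale
> > > > nonzero counter, so the wake_up would be missed.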
> > > >
> > > > Test result:
> > >
> > > I think there is some noise in the result?
> > >
> > > >
> > > >                  without patch   with patch
> > > > psync read       100MB/s         101MB/s      job:1   bs:4k
> > >
> > > For example, this is a small improvement, but
> > >
> > > >                  1015MB/s        1016MB/s     job:1   bs:128k
> > > >                  1359MB/s        1358MB/s     job:1   bs:256k
> > > >                  1394MB/s        1393MB/s     job:40  bs:4k
> > > >                  4959MB/s        4873MB/s     job:40  bs:128k
> > > >                  6166MB/s        6157MB/s     job:40  bs:256k
> > > >
> > > >                  without patch   with patch
> > > > psync write      286MB/s         275MB/s      job:1   bs:4k
> > >
> > > this is a big regression (~4%).
> > >
> > > >                  1810MB/s        1808MB/s     job:1   bs:128k
> > > >                  1814MB/s        1814MB/s     job:1   bs:256k
> > > >                  1802MB/s        1801MB/s     job:40  bs:4k
> > > >                  1814MB/s        1814MB/s     job:40  bs:128k
> > > >                  1814MB/s        1814MB/s     job:40  bs:256k
> > > >
> > > >                  without patch   with patch
> > > > psync randread   39.3MB/s        39.7MB/s     job:1   bs:4k
> > > >                  791MB/s         783MB/s      job:1   bs:128k
> > > >                  1183MiB/s       1217MB/s     job:1   bs:256k
> > > >                  1183MiB/s       1235MB/s     job:40  bs:4k
> > > >                  3768MB/s        3705MB/s     job:40  bs:128k
> > >
> > > And some regression for 128kB but improvement for 4kB.
> > >
> > > >                  4410MB/s        4418MB/s     job:40  bs:256k
> > >
> > > So I am not quite convinced by these results.
> >
> > Thanks for pointing out the problem. Maybe I need to precondition the
> > disks before testing. I'll rerun the tests and post the results again.
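> >
> > For the precondition I have in mind something like this fio job file
> > (the device path, sizes and runtimes are placeholders, not the exact
> > setup I used):
> >
> > [global]
> > filename=/dev/md0        ; placeholder raid0 device
> > direct=1
> > ioengine=psync
> >
> > [precondition]
> > rw=write
> > bs=1M
> > loops=2                  ; write the whole device twice first
> >
> > [read-4k]
> > stonewall                ; run only after preconditioning finishes
> > rw=read
> > bs=4k
> > numjobs=1
> > runtime=60
> > time_based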
> > >
> > > Also, do we really need an extra counter here? Can we use
> > > mddev->active_io instead?
> >
> > At first, I thought of this way too. But active_io is decreased right
> > after pers->make_request returns, so it can't be used to wait for all
> > bios to complete.
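> >
> > Roughly, the submission path looks like this (a simplified sketch of
> > md_handle_request(), not the verbatim code):
> >
> > static void md_handle_request(struct mddev *mddev, struct bio *bio)
> > {
> >         atomic_inc(&mddev->active_io);          /* submission begins */
> >
> >         /* make_request() may queue clones and return long before
> >          * the underlying bios complete */
> >         mddev->pers->make_request(mddev, bio);
> >
> >         /* dropped here, at the end of submission, not in bi_end_io */
> >         if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
> >                 wake_up(&mddev->sb_wait);
> > }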
>
> Ah, that's right.
>
> >
> > Can we decrease mddev->active_io in the bi_end_io of each personality?
> > Right now only mddev_suspend uses mddev->active_io, and it needs to
> > wait for all active I/O to finish. From that point of view, it would
> > be better to decrease active_io in bi_end_io. What's your opinion?
>
> I think we can give it a try. It may break some cases, though. If the
> mdadm tests behave the same with the change, we can move forward with it.
What are the cases?
>
> OTOH, depending on the perf results, we may consider using percpu_ref
> for both active_io and io_acct_cnt.
Thanks. I'll give it a try.
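
For reference, the shape I'd try is the usual percpu_ref pattern (a
rough sketch, not a tested patch; the io_acct_ref field and the helper
names are made up):

#include <linux/percpu-refcount.h>

static void io_acct_release(struct percpu_ref *ref)
{
        struct mddev *mddev = container_of(ref, struct mddev, io_acct_ref);

        wake_up(&mddev->wait_io_acct);          /* last reference gone */
}

static int io_acct_setup(struct mddev *mddev)
{
        /* starts in fast percpu mode; ALLOW_REINIT lets us drain
         * during quiesce and then reuse the ref */
        return percpu_ref_init(&mddev->io_acct_ref, io_acct_release,
                               PERCPU_REF_ALLOW_REINIT, GFP_KERNEL);
}

/* fast path: percpu_ref_get() per cloned bio, percpu_ref_put()
 * in its bi_end_io */

static void io_acct_quiesce(struct mddev *mddev)
{
        percpu_ref_kill(&mddev->io_acct_ref);   /* atomic mode, drain */
        wait_event(mddev->wait_io_acct,
                   percpu_ref_is_zero(&mddev->io_acct_ref));
        percpu_ref_resurrect(&mddev->io_acct_ref); /* percpu mode again */
}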
Regards
Xiao