From: Michal Rostecki <mrostecki@suse.de>
To: Anand Jain <anand.jain@oracle.com>
Cc: Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>,
"open list:BTRFS FILE SYSTEM" <linux-btrfs@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>,
Michal Rostecki <mrostecki@suse.com>
Subject: Re: [PATCH RFC 0/6] Add roundrobin raid1 read policy
Date: Wed, 10 Feb 2021 12:18:53 +0000
Message-ID: <20210210121853.GA23499@wotan.suse.de>
In-Reply-To: <4f24ef7f-c1cf-3cda-b12f-a2c8c84a7e45@oracle.com>

On Wed, Feb 10, 2021 at 02:52:01PM +0800, Anand Jain wrote:
> On 10/02/2021 04:30, Michal Rostecki wrote:
> > From: Michal Rostecki <mrostecki@suse.com>
> >
> > This patch series adds a new raid1 read policy - roundrobin. For each
> > request, it selects the mirror which has lower load than queue depth.
> > Load is defined as the number of inflight requests + a penalty value
> > (if the scheduled request is not local to the last processed request for
> > a rotational disk).
> >
> > The series consists of preparational changes which add necessary
> > information to the btrfs_device struct and the change with the policy.
> >
> > This policy was tested with fio and compared with the default `pid`
> > policy.
> >
> > The singlethreaded test has the following parameters:
> >
> > [global]
> > name=btrfs-raid1-seqread
> > filename=btrfs-raid1-seqread
> > rw=read
> > bs=64k
> > direct=0
> > numjobs=1
> > time_based=0
> >
> > [file1]
> > size=10G
> > ioengine=libaio
> >
> > and shows the following results:
> >
> > - raid1c3 with 3 HDDs:
> >   3 x Seagate Barracuda ST2000DM008 (2TB)
> >   * pid policy
> >     READ: bw=217MiB/s (228MB/s), 217MiB/s-217MiB/s (228MB/s-228MB/s),
> >     io=10.0GiB (10.7GB), run=47082-47082msec
> >   * roundrobin policy
> >     READ: bw=409MiB/s (429MB/s), 409MiB/s-409MiB/s (429MB/s-429MB/s),
> >     io=10.0GiB (10.7GB), run=25028-25028msec
>
>
> > - raid1c3 with 2 HDDs and 1 SSD:
> >   2 x Seagate Barracuda ST2000DM008 (2TB)
> >   1 x Crucial CT256M550SSD1 (256GB)
> >   * pid policy (the worst case when only HDDs were chosen)
> >     READ: bw=220MiB/s (231MB/s), 220MiB/s-220MiB/s (231MB/s-231MB/s),
> >     io=10.0GiB (10.7GB), run=46577-46577msec
> >   * pid policy (the best case when SSD was used as well)
> >     READ: bw=513MiB/s (538MB/s), 513MiB/s-513MiB/s (538MB/s-538MB/s),
> >     io=10.0GiB (10.7GB), run=19954-19954msec
> >   * roundrobin (there are no noticeable differences when testing multiple
> >     times)
> >     READ: bw=541MiB/s (567MB/s), 541MiB/s-541MiB/s (567MB/s-567MB/s),
> >     io=10.0GiB (10.7GB), run=18933-18933msec
> >
> > The multithreaded test has the following parameters:
> >
> > [global]
> > name=btrfs-raid1-seqread
> > filename=btrfs-raid1-seqread
> > rw=read
> > bs=64k
> > direct=0
> > numjobs=8
> > time_based=0
> >
> > [file1]
> > size=10G
> > ioengine=libaio
> >
> > and shows the following results:
> >
> > - raid1c3 with 3 HDDs:
> >   3 x Seagate Barracuda ST2000DM008 (2TB)
> >   * pid policy
> >     READ: bw=1569MiB/s (1645MB/s), 196MiB/s-196MiB/s (206MB/s-206MB/s),
> >     io=80.0GiB (85.9GB), run=52210-52211msec
> >   * roundrobin
> >     READ: bw=1733MiB/s (1817MB/s), 217MiB/s-217MiB/s (227MB/s-227MB/s),
> >     io=80.0GiB (85.9GB), run=47269-47271msec
> > - raid1c3 with 2 HDDs and 1 SSD:
> >   2 x Seagate Barracuda ST2000DM008 (2TB)
> >   1 x Crucial CT256M550SSD1 (256GB)
> >   * pid policy
> >     READ: bw=1843MiB/s (1932MB/s), 230MiB/s-230MiB/s (242MB/s-242MB/s),
> >     io=80.0GiB (85.9GB), run=44449-44450msec
> >   * roundrobin
> >     READ: bw=2485MiB/s (2605MB/s), 311MiB/s-311MiB/s (326MB/s-326MB/s),
> >     io=80.0GiB (85.9GB), run=32969-32970msec
> >
>
> Both of the above test cases are sequential. How about some random IO
> workload?
>
> Also, the seek time for non-rotational devices does not exist. So it is
> a good idea to test with ssd + nvme and all nvme or all ssd.
>

Good idea. I will test random I/O and will try to test all-nvme /
all-ssd and mixed nonrot.

> > To measure the performance of each policy and find optimal penalty
> > values, I created scripts which are available here:
> >
> > https://gitlab.com/vadorovsky/btrfs-perf
> > https://github.com/mrostecki/btrfs-perf
> >
> > Michal Rostecki (6):
>
>
> > btrfs: Add inflight BIO request counter
> > btrfs: Store the last device I/O offset
>
> These patches look good. But since only the round-robin policy needs
> to monitor the inflight count and the last offset, could you bring
> them under if policy=roundrobin? Otherwise, it is just a waste of CPU
> cycles if the policy != roundrobin.
>

If I bring those stats under if policy=roundrobin, they are going to be
inaccurate if someone switches policies on a running system after
doing any I/O on that filesystem.

I'm open to suggestions on how to make those stats as lightweight as
possible. Unfortunately, I don't think I can store the last physical
location without atomic_t.

The BIO percpu counter is probably the least of the worries, though
I could maybe get rid of it entirely in favor of using part_stat_read().

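To make the cost concrete, here is a rough userspace sketch of the two
stats under discussion and of the load calculation from the cover letter
(inflight requests plus a penalty for a non-local request on a rotational
disk). It assumes C11 atomics in place of the kernel's atomic_t and percpu
counter. The names dev_stats, dev_load and NONLOCAL_PENALTY are
hypothetical and not taken from the patches, and "local" is simplified
here to an exact match with the end of the previous request.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical penalty value; the real one would need tuning. */
#define NONLOCAL_PENALTY 1

/* Hypothetical per-device stats; this is not the btrfs_device layout. */
struct dev_stats {
	atomic_int inflight;          /* requests currently in flight */
	_Atomic uint64_t last_offset; /* end offset of the last request */
	bool rotational;              /* true for HDDs */
};

static void dev_init(struct dev_stats *d, bool rotational)
{
	atomic_init(&d->inflight, 0);
	atomic_init(&d->last_offset, 0);
	d->rotational = rotational;
}

/* Load = inflight requests + penalty for a non-local request on an HDD. */
static int dev_load(struct dev_stats *d, uint64_t offset)
{
	int load = atomic_load(&d->inflight);

	if (d->rotational && atomic_load(&d->last_offset) != offset)
		load += NONLOCAL_PENALTY;
	return load;
}

/* On submission: count the request and remember where it ends. */
static void dev_submit(struct dev_stats *d, uint64_t offset, uint64_t len)
{
	atomic_fetch_add(&d->inflight, 1);
	atomic_store(&d->last_offset, offset + len);
}

/* On completion: drop the request from the inflight count. */
static void dev_complete(struct dev_stats *d)
{
	atomic_fetch_sub(&d->inflight, 1);
}
```

In this sketch both updates happen on every submission regardless of the
active policy, which is what keeps the values meaningful right after a
policy switch.
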
> > btrfs: Add stripe_physical function
> > btrfs: Check if the filesystem is has mixed type of devices
>
> Thanks, Anand
>
> > btrfs: sysfs: Add directory for read policies
> > btrfs: Add roundrobin raid1 read policy
> >
> > fs/btrfs/ctree.h | 3 +
> > fs/btrfs/disk-io.c | 3 +
> > fs/btrfs/sysfs.c | 144 ++++++++++++++++++++++++++----
> > fs/btrfs/volumes.c | 218 +++++++++++++++++++++++++++++++++++++++++++--
> > fs/btrfs/volumes.h | 22 +++++
> > 5 files changed, 366 insertions(+), 24 deletions(-)
> >
>

Thanks for the review,
Michal