From: Ricky WU <ricky_wu@realtek.com>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "tommyhebb@gmail.com" <tommyhebb@gmail.com>,
"linux-mmc@vger.kernel.org" <linux-mmc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v3] mmc: rtsx: improve performance for multi block rw
Date: Wed, 11 Oct 2023 05:36:46 +0000
Message-ID: <a533dde76d2d4345b85cd060a8e403db@realtek.com>
In-Reply-To: <CAPDyKFo59Q3dmUJU-hJ++=k0uwx2KxamW9KckDX=O_CA84O1_g@mail.gmail.com>
Hi Ulf Hansson,
Could you let me know the status of this patch, or whether you have any concerns about it?
Ricky
> -----Original Message-----
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Sent: Thursday, February 10, 2022 10:57 PM
> To: Ricky WU <ricky_wu@realtek.com>
> Cc: tommyhebb@gmail.com; linux-mmc@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi block rw
>
> On Thu, 10 Feb 2022 at 07:43, Ricky WU <ricky_wu@realtek.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Ulf Hansson <ulf.hansson@linaro.org>
> > > Sent: Monday, February 7, 2022 7:11 PM
> > > To: Ricky WU <ricky_wu@realtek.com>
> > > Cc: tommyhebb@gmail.com; linux-mmc@vger.kernel.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi
> > > block rw
> > >
> > > [...]
> > >
> > > > > > > >
> > > > > > > > > Do you have any suggestions for testing random I/O? We
> > > > > > > > > think random I/O will not change much, though.
> > > > > > >
> > > > > > > I would probably look into using fio,
> > > > > > > https://fio.readthedocs.io/en/latest/
> > > > > > >
> > > > > >
> > > > > > Filled random I/O data
> > > > > > Before the patch:
> > > > > > CMD (Randread):
> > > > > > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > > > -group_reporting -ioengine=psync -iodepth=1 -size=1G
> > > > > > -name=mytest -bs=1M -rw=randread
> > > > >
> > > > > Thanks for running the tests! Overall, I would not expect an
> > > > > impact on the throughput when using a big blocksize like 1M.
> > > > > This is also pretty clear from the result you have provided.
> > > > >
> > > > > However, especially for random writes and reads, we want to try
> > > > > with smaller blocksizes. Like 8k or 16k, would you mind running
> > > > > another round of tests to see how that works out?
> > > > >
> > > >
> > > > Filled random I/O data(8k/16k)
> > >
> > > Hi Ricky,
> > >
> > > > Apologies for the delay! Thanks for running the tests. Let me
> > > > comment on them below.
> > >
> > > >
> > > > Before(randread)
> > > > 8k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > > > -bs=8k -rw=randread
> > > > mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B,
> > > > (T) 8192B-8192B, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > READ: bw=16.5MiB/s (17.3MB/s), 16.5MiB/s-16.5MiB/s
> > > > > (17.3MB/s-17.3MB/s), io=1024MiB (1074MB), run=62019-62019msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=130757/0, merge=0/0, ticks=57751/0, in_queue=57751,
> > > > util=99.89%
> > > >
> > > > 16k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > > > -bs=16k -rw=randread
> > > > mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
> > > > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > READ: bw=23.3MiB/s (24.4MB/s), 23.3MiB/s-23.3MiB/s
> > > > > (24.4MB/s-24.4MB/s), io=1024MiB (1074MB), run=44034-44034msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=65333/0, merge=0/0, ticks=39420/0, in_queue=39420,
> > > > util=99.84%
> > > >
> > > > > Before(randwrite)
> > > > 8k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=100M
> > > > -name=mytest -bs=8k -rw=randwrite
> > > > mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B,
> > > > (T) 8192B-8192B, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > WRITE: bw=4060KiB/s (4158kB/s), 4060KiB/s-4060KiB/s
> > > > > (4158kB/s-4158kB/s), io=100MiB (105MB), run=25220-25220msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=51/12759, merge=0/0, ticks=80/24154,
> > > > in_queue=24234, util=99.90%
> > > >
> > > > 16k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=100M
> > > > -name=mytest -bs=16k -rw=randwrite
> > > > mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W)
> > > > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > WRITE: bw=7201KiB/s (7373kB/s), 7201KiB/s-7201KiB/s
> > > > > (7373kB/s-7373kB/s), io=100MiB (105MB), run=14221-14221msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=51/6367, merge=0/0, ticks=82/13647, in_queue=13728,
> > > > util=99.81%
> > > >
> > > >
> > > > After(randread)
> > > > 8k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > > > -bs=8k -rw=randread
> > > > mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B,
> > > > (T) 8192B-8192B, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > READ: bw=12.4MiB/s (13.0MB/s), 12.4MiB/s-12.4MiB/s
> > > > > (13.0MB/s-13.0MB/s), io=1024MiB (1074MB), run=82397-82397msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=130640/0, merge=0/0, ticks=74125/0, in_queue=74125,
> > > > util=99.94%
> > > >
> > > > 16k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > > > -bs=16k -rw=randread
> > > > mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
> > > > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > READ: bw=20.0MiB/s (21.0MB/s), 20.0MiB/s-20.0MiB/s
> > > > > (21.0MB/s-21.0MB/s), io=1024MiB (1074MB), run=51076-51076msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=65282/0, merge=0/0, ticks=46255/0, in_queue=46254,
> > > > util=99.87%
> > > >
> > > > After(randwrite)
> > > > 8k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=100M
> > > > -name=mytest -bs=8k -rw=randwrite
> > > > mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B,
> > > > (T) 8192B-8192B, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > WRITE: bw=4215KiB/s (4317kB/s), 4215KiB/s-4215KiB/s
> > > > > (4317kB/s-4317kB/s), io=100MiB (105MB), run=24292-24292msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=52/12717, merge=0/0, ticks=86/23182,
> > > > in_queue=23267, util=99.92%
> > > >
> > > > 16k:
> > > > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=100M
> > > > -name=mytest -bs=16k -rw=randwrite
> > > > mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W)
> > > > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > > > result:
> > > > Run status group 0 (all jobs):
> > > > WRITE: bw=6499KiB/s (6655kB/s), 6499KiB/s-6499KiB/s
> > > > > (6655kB/s-6655kB/s), io=100MiB (105MB), run=15756-15756msec
> > > > > Disk stats (read/write):
> > > > mmcblk0: ios=51/6347, merge=0/0, ticks=84/15120, in_queue=15204,
> > > > util=99.80%
> > >
> > > It looks like the rand-read tests above are degrading with the new
> > > changes, while rand-writes are both improving and degrading.
> > >
> > > > To summarize my view from all the tests you have done at this point
> > > > (thanks a lot): it looks like block I/O merging isn't really
> > > > happening in the common block layer, at least not to an extent that
> > > > would benefit us. You have clearly shown this with the suggested
> > > > change in the mmc host driver: detecting whether the "next" request
> > > > is sequential to the previous one allows us to skip a CMD12 and
> > > > minimize some command overhead.
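The sequential-detection idea described above can be sketched as follows. This is only an illustration and not the code from the patch: the `Request` class and both function names are hypothetical, and the sketch assumes open-ended multi-block transfers that are each closed with a CMD12 (STOP_TRANSMISSION) unless the next request continues at the following sector in the same direction.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Request:
    """Hypothetical stand-in for a block request; not a kernel structure."""
    is_write: bool
    start_sector: int
    nr_sectors: int


def is_sequential(prev: Optional[Request], nxt: Request) -> bool:
    """True if nxt starts exactly where prev ended, in the same direction."""
    return (
        prev is not None
        and prev.is_write == nxt.is_write
        and prev.start_sector + prev.nr_sectors == nxt.start_sector
    )


def count_stops(requests: Iterable[Request]) -> int:
    """Count stop commands if a stop is sent only when the stream breaks."""
    stops = 0
    prev = None
    for req in requests:
        if prev is not None and not is_sequential(prev, req):
            stops += 1  # close the previous transfer before starting a new one
        prev = req
    if prev is not None:
        stops += 1  # the last open transfer has to be closed as well
    return stops
```

For a stream of N requests, stopping after every request would cost N stop commands under the same assumption, so the saving depends entirely on how many consecutive requests are sequential. For purely random I/O nothing is saved, while the extra bookkeeping remains.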
> > >
> > > > However, according to the latest tests above, you have also proved
> > > > that the changes in the mmc host driver don't come without a cost.
> > > In particular, small random-reads would degrade in performance from
> > > these changes.
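To put numbers on the comparison above, the relative change between the "Before" and "After" bandwidth figures can be computed as below. This is a small sketch: the function name and the dictionary layout are illustrative, while the values are the fio bandwidth figures quoted earlier in this thread (MiB/s for reads, KiB/s for writes).

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) bandwidth as reported by fio in the results above
results = {
    "randread  8k (MiB/s)": (16.5, 12.4),
    "randread 16k (MiB/s)": (23.3, 20.0),
    "randwrite  8k (KiB/s)": (4060, 4215),
    "randwrite 16k (KiB/s)": (7201, 6499),
}

for name, (before, after) in results.items():
    print(f"{name}: {pct_change(before, after):+.1f}%")
```

With these figures, random reads drop by roughly 25% at 8k and 14% at 16k, while random writes gain about 4% at 8k and lose about 10% at 16k. These are single runs, so some of the write difference may be run-to-run noise.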
> > >
> > > That said, it looks to me that rather than trying to improve things
> > > for one specific mmc host driver, it would be better to look at this
> > > from the generic block layer point of view - and investigate why
> > > sequential reads/writes aren't getting merged often enough for the
> > > > MMC/SD case. If we can fix the problem there, all mmc host drivers would
> > > > benefit I assume.
> > >
> >
> > > So are you thinking about how to patch this in MMC/SD?
> > > I don't know whether this method is compatible with other MMC hosts, or
> > > whether they would need to patch other code in their host drivers.
>
> I would not limit this to the core layer of MMC/SD. The point I was trying to
> make was that it doesn't look like the generic block layer is merging the
> sequential I/O requests in the most efficient way, at least for the eMMC/SD
> devices. Why this is the case, I can't tell. It looks like we need to do some more
> in-depth analysis to understand why merging isn't efficient for us.
>
> >
> > > BTW, have you tried with different I/O schedulers? If you haven't
> > > tried BFQ, I suggest you do as it's a good fit for MMC/SD.
> > >
> >
> > > I don't know what "different I/O schedulers" means.
>
> What I/O scheduler did you use when running the test?
>
> For MMC/SD the only one that makes sense to use is BFQ, however that needs
> to be configured via sysfs after boot. There is no way, currently, to make it the
> default, I think. You may look at Documentation/block/bfq-iosched.rst, if you
> are more interested.
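Checking and switching the scheduler through sysfs, as mentioned above, could look roughly like this. It is a sketch under a few assumptions: the device is `mmcblk0` as in the fio commands, BFQ is built into the kernel or loaded as a module, and the script runs with enough privileges to write to sysfs. The helper names are hypothetical. The `queue/scheduler` file lists the available schedulers with the active one in square brackets.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def parse_schedulers(text: str) -> Tuple[Optional[str], List[str]]:
    """Return (active, available) from the contents of queue/scheduler."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # the bracketed entry is the active scheduler
            active = name
        available.append(name)
    return active, available


def select_scheduler(device: str, name: str, sysfs_root: str = "/sys/block") -> None:
    """Make `name` the I/O scheduler for `device` if it is not already."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    active, available = parse_schedulers(path.read_text())
    if name not in available:
        raise ValueError(f"{name} not offered for {device}: {available}")
    if active != name:
        path.write_text(name)  # usually requires root
```

Calling `select_scheduler("mmcblk0", "bfq")` before each fio run would make sure the "Before" and "After" measurements use the same scheduler. The setting does not survive a reboot.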
>
> Kind regards
> Uffe