From: Durval Menezes <jmmml@durval.com>
To: fio <fio@vger.kernel.org>
Cc: Sitsofe Wheeler <sitsofe@gmail.com>
Subject: Re: Samsung PM863 SSD: surprisingly high Write IOPS measured using `fio`, over 4.6 times more than spec!?
Date: Tue, 15 Feb 2022 13:40:26 -0300
Message-ID: <20220215164026.GK487@angmar.tmp.com.br>
Hi Sitsofe,
First of all, thanks for your detailed, thoughtful response. More, below:
On Mon, Feb 14, 2022 at 4:51 PM Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> On Mon, 14 Feb 2022 at 18:44, Durval Menezes <jmmml@durval.com> wrote:
> >
> > Hello everyone,
> >
> > I've arrived at a very surprising number measuring IOPS write performance
> > on my SSDs' "bare metal" (ie, straight on the /dev/$DISK, no filesystem
> > involved):
> >
> > export COMMON_OPTIONS='--ioengine=libaio --direct=1 --runtime=120 --time_based --group_reporting'
> >
> > ls -l /dev/disk/by-id | grep 'ata-.*sda'
> > lrwxrwxrwx 1 root root 9 Feb 13 17:19 ata-SAMSUNG_MZ7LM1T9HCJM-00003_XXXXXXXXXXXXXX -> ../../sda
> >
> >
> > TANGO=/dev/disk/by-id/ata-SAMSUNG_MZ7LM1T9HCJM-00003_XXXXXXXXXXXXXX
> > sudo fio --filename=${TANGO} --name=device_iops_write --rw=randwrite --bs=4k --iodepth=256 --numjobs=4 ${COMMON_OPTIONS}
> > [...]
> > write: *IOPS=83.1k*, BW=325MiB/s (341MB/s)(38.1GiB/120007msec)
> > [...]
> >
> > (please find the complete output at the end of this message, in case I
> > should have looked at some other lines and/or you are curious)
> >
> > As per the official manufacturer specs (both in this whitepaper at their
> > website[1] and in this datasheet I found somewhere else[2]), it's
> > supposed to be only *18K IOPS*.
> >
> > All the other base performance numbers I've measured (read IOPS, read and
> > write MB/s, read and write latencies) are at or very near the
> > manufacturer specs.
> >
> > What's going on?
> >
> > At first I thought that, despite `--direct=1` being explicitly indicated,
> > my machine's 64GB RAM (via the Linux buffer cache) could be caching the
> > writes (even if the number, in that case, should have been much
> > higher)... so, I tested it again with `--runtime=600` to saturate the
> > buffer cache in case it was really the 'culprit'... lo and behold, the
> > result was:
> >
> > [...]
> > write: IOPS=83.1k, BW=325MiB/s (341MB/s)(190GiB/600019msec)
> > [...]
> >
> >
> > So, the surprising over-4.6-times-the-spec Write IOPS is maintained, even
> > for 190GiB total data.
> >
> > And with 190GiB data written (about 10% of the total device capacity), I
> > do not believe it's any kind of cache (RAM, MLC or whatever) inside the
> > SSD either.
> >
> You're running your workload for a comparatively short time
OK, I found in the whitepaper (see below) the manufacturer's statement
that the 4 KB random writes should cover twice the capacity of the
disk. That also implies a much longer test time...
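To put a rough number on "much longer", here is a back-of-the-envelope sketch. It assumes a nominal capacity of 1.92 TB for this model and a constant IOPS rate over the whole run, neither of which comes from the measurements above; `randwrite_hours` is just a hypothetical helper name.

```shell
#!/bin/sh
# Estimate the hours needed for 4 KiB random writes to cover a drive
# a given number of times at a constant IOPS rate.
# usage: randwrite_hours CAPACITY_BYTES PASSES IOPS
randwrite_hours() {
    awk -v cap="$1" -v passes="$2" -v iops="$3" \
        'BEGIN { printf "%.1f\n", cap * passes / (iops * 4096) / 3600 }'
}

# ASSUMPTION: 1.92 TB (1920000000000 bytes) nominal capacity.
randwrite_hours 1920000000000 2 18000   # at the spec-sheet 18K IOPS
randwrite_hours 1920000000000 2 83100   # at the 83.1k IOPS measured so far
```

Under those assumptions the random-write phase alone takes about 14.5 hours at the spec rate and about 3.1 hours at the measured rate, so the real figure should land somewhere in between as the drive slows down.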
> and additionally we don't know how "fresh" your SSD is.
Good point; here's its "freshness"-relevant data straight from `smartctl -a`:
  9 Power_On_Hours       0x0032  091  091  000  Old_age   Always  -  43694
177 Wear_Leveling_Count  0x0013  094  094  005  Pre-fail  Always  -  394
241 Total_LBAs_Written   0x0032  099  099  000  Old_age   Always  -  535797643689
242 Total_LBAs_Read      0x0032  099  099  000  Old_age   Always  -  1848967801660
251 NAND_Writes          0x0032  100  100  000  Old_age   Always  -  1642499721864
So, I think it's a pretty 'mature' disk already (but hopefully with a lot
of 'life' still in it).
In other words, I don't think it's "fresh" enough to explain a
more-than-4x increase in IOPS. Perhaps "freshness" in this case refers to
it having been recently secure-erased (which I did before I started testing)?
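For scale, the raw SMART values can be converted into more familiar units. This is a minimal sketch that assumes Total_LBAs_Written counts 512-byte LBAs (an assumption, not something smartctl states here); it deliberately leaves NAND_Writes alone because its unit is not known. Both helper names are hypothetical.

```shell
#!/bin/sh
# Convert a raw LBA count into terabytes (10^12 bytes).
# ASSUMPTION: one LBA is 512 bytes.
lbas_to_tb() {
    awk -v lbas="$1" 'BEGIN { printf "%.1f\n", lbas * 512 / 1e12 }'
}

# Convert power-on hours into years of 365 days.
hours_to_years() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8760 }'
}

lbas_to_tb 535797643689    # Total_LBAs_Written raw value
hours_to_years 43694       # Power_On_Hours raw value
```

If the 512-byte assumption holds, that is roughly 274 TB of host writes over about 5 years of power-on time.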
> The 18K IOPS value
> might be when the drive has been fully written and there are no
> pre-erased blocks available (via so-called preconditioning)... I'll
> also note the whitepaper [1] mentions this:
>
> SSD Precondition: Sustained state (or steady state)
> [...]
> It's important to note that all performance items mentioned in this
> white paper have been measured at the sustained state, except the
> sequential read/write performance
>
Thanks for going through the whitepaper and picking this up. It passed
right by me...
I went through the whitepaper again, and found this:
    The sustained state in this document refers to the status that a
    128 KB sequential write has been completed equal to the drive capacity and
    then 4 KB random write has completed twice as much as the drive capacity
OK, so at least there's a "recipe" for this preconditioning. I will try it
and come back later to report.
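One possible reading of that recipe as fio invocations is sketched below. It is an interpretation, not the manufacturer's exact procedure: it relies on fio using the whole device when no `--size` is given, and on `--loops=2` to repeat the random-write pass so that twice the capacity gets written. The function only prints the commands (a dry run), since actually running them destroys all data on the target; `precondition_cmds` and `/dev/sdX` are placeholders.

```shell
#!/bin/sh
# Dry run: print the two preconditioning commands, do not execute them.
# WARNING: running the printed commands overwrites the whole device.
precondition_cmds() {
    dev="$1"
    common='--ioengine=libaio --direct=1 --group_reporting'
    # Step 1: one full-device sequential pass with 128 KiB writes.
    echo "fio --filename=$dev --name=precond_seq --rw=write --bs=128k --iodepth=32 $common"
    # Step 2: 4 KiB random writes, two passes over the device.
    echo "fio --filename=$dev --name=precond_rand --rw=randwrite --bs=4k --iodepth=32 --loops=2 $common"
}

# TANGO is the device path variable used earlier; /dev/sdX is a placeholder.
precondition_cmds "${TANGO:-/dev/sdX}"
```

Note the absence of `--runtime` and `--time_based` here: the passes should end when the data has been written, not after a fixed time.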
> I notice that your SSD appears to be SATA (sda) so I'd be surprised
> that a total queue depth greater than 32 makes a difference (your
> total queue depth is 1024). Do you get a similar result with just the
> one job with an iodepth=32?
I tested with iodepth=32 (instead of 256) and got the same result, so I
guess you are not surprised ;-)
Just did it again, this time with `--numjobs=1` (instead of 4) and here's
the result:
write: IOPS=83.0k, BW=324MiB/s (340MB/s)(38.0GiB/120001msec)
So that's not it either.
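As a sanity check that the reported numbers are at least self-consistent, IOPS times block size should reproduce the reported bandwidth. A small sketch, assuming fio's default where `4k` means 4096 bytes (helper names are hypothetical):

```shell
#!/bin/sh
# Bandwidth in MiB/s implied by an IOPS figure at 4 KiB per I/O.
iops_to_mib() {
    awk -v iops="$1" 'BEGIN { printf "%.1f\n", iops * 4096 / 1048576 }'
}

# Ratio of measured IOPS to spec IOPS.
iops_ratio() {
    awk -v measured="$1" -v spec="$2" 'BEGIN { printf "%.1f\n", measured / spec }'
}

iops_to_mib 83000        # single job, iodepth=32
iops_to_mib 83100        # four jobs, iodepth=256
iops_ratio 83100 18000   # measured vs. spec
```

Both runs work out to about 324 to 325 MiB/s, in line with what fio printed, and the ratio against the 18K spec is about 4.6 either way.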
> It's unlikely but if the jobs were submitting I/O to the same areas as
> other jobs at the same time then some of the I/O could be elided but
> given what you've posted this should not be the case.
Agreed.
Cheers,
--
Durval.
>
>
> --
> Sitsofe
> >
> >