From: Durval Menezes <jmmml@durval.com>
To: fio <fio@vger.kernel.org>
Cc: Sitsofe Wheeler <sitsofe@gmail.com>
Subject: Re: Samsung PM863 SSD: surprisingly high Write IOPS measured using `fio`, over 4.6 times more than spec!?
Date: Tue, 15 Feb 2022 13:40:26 -0300
Message-ID: <20220215164026.GK487@angmar.tmp.com.br>
Hi Sitsofe,
First of all, thanks for your detailed, thoughtful response. More below:
On Mon, Feb 14, 2022 at 4:51 PM Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> On Mon, 14 Feb 2022 at 18:44, Durval Menezes <jmmml@durval.com> wrote:
> >
> > Hello everyone,
> >
> > I've arrived at a very surprising number measuring IOPS write performance
> > on my SSDs' "bare metal" (ie, straight on the /dev/$DISK, no filesystem
> > involved):
> >
> > export COMMON_OPTIONS='--ioengine=libaio --direct=1 --runtime=120 --time_based --group_reporting'
> >
> > ls -l /dev/disk/by-id | grep 'ata-.*sda'
> > lrwxrwxrwx 1 root root 9 Feb 13 17:19 ata-SAMSUNG_MZ7LM1T9HCJM-00003_XXXXXXXXXXXXXX -> ../../sda
> >
> > TANGO=/dev/disk/by-id/ata-SAMSUNG_MZ7LM1T9HCJM-00003_XXXXXXXXXXXXXX
> > sudo fio --filename=${TANGO} --name=device_iops_write --rw=randwrite --bs=4k --iodepth=256 --numjobs=4 ${COMMON_OPTIONS}
> > [...]
> > write: *IOPS=83.1k*, BW=325MiB/s (341MB/s)(38.1GiB/120007msec)
> > [...]
> >
> > (please find the complete output at the end of this message, in case I
> > should have looked at some other lines and/or you are curious)
> >
> > As per the official manufacturer specs (both in this whitepaper at their
> > website[1] and in this datasheet I found somewhere else[2]), it's
> > supposed to be only *18K IOPS*.
> >
> > All the other base performance numbers I've measured (read IOPS, read and
> > write MB/s, read and write latencies) are at or very near the
> > manufacturer specs.
> >
> > What's going on?
> >
> > At first I thought that, despite `--direct=1` being explicitly indicated,
> > my machine's 64GB RAM (via the Linux buffer cache) could be caching the
> > writes (even if the number, in that case, should have been much
> > higher)... so, I tested it again with `--runtime=600` to saturate the
> > buffer cache in case it was really the 'culprit'... lo and behold, the
> > result was:
> >
> > [...]
> > write: IOPS=83.1k, BW=325MiB/s (341MB/s)(190GiB/600019msec)
> > [...]
> >
> >
> > So, the surprising over-4.6-times-the-spec write IOPS is maintained, even
> > for 190GiB of total data.
> >
> > And with 190GiB of data written (about 10% of the total device capacity),
> > I do not believe it's any kind of cache (RAM, MLC or whatever) inside the
> > SSD either.
> >
> You're running your workload for a comparatively short time
OK, I was able to find in the whitepaper (see below) the manufacturer's
statement that the random writes should be run for twice the capacity of
the disk. That also implies a much longer test time...
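As a rough sketch of how much longer: the estimate below divides twice the
drive capacity by a constant 4 KiB random-write rate. The 1.92 TB capacity is
an assumption (the nominal size for this model, consistent with 190GiB being
about 10% of the device); the two rates are the measured 83.1k IOPS and the
18K IOPS from the spec, and a real run would presumably land somewhere in
between as the drive slows down.

```shell
# Seconds needed to random-write `passes` x capacity at a constant IOPS rate.
# Args: capacity_bytes passes iops block_bytes (integer division, rounds down)
precondition_seconds() {
    echo $(( $1 * $2 / ($3 * $4) ))
}

CAPACITY=1920000000000   # assumption: nominal 1.92 TB, not read from the device

fast=$(precondition_seconds "$CAPACITY" 2 83100 4096)   # measured rate
slow=$(precondition_seconds "$CAPACITY" 2 18000 4096)   # spec rate

echo "at 83.1k IOPS: ${fast}s (~$(( fast / 3600 ))h)"
echo "at 18k IOPS:   ${slow}s (~$(( slow / 3600 ))h)"
```

Under those assumptions the random-write phase alone takes somewhere between
about 3 and 14 hours, compared with the 2 to 10 minutes of the runs so far.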
> and additionally we don't know how "fresh" your SSD is.
Good point; here's its "freshness"-relevant data straight from `smartctl -a`:
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       43694
177 Wear_Leveling_Count     0x0013   094   094   005    Pre-fail  Always       -       394
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       535797643689
242 Total_LBAs_Read         0x0032   099   099   000    Old_age   Always       -       1848967801660
251 NAND_Writes             0x0032   100   100   000    Old_age   Always       -       1642499721864
So, I think it's a pretty 'mature' disk already (but hopefully with a lot
of 'life' still in it).
In other words, I don't think it's "fresh" enough to explain a 4.6x IOPS
increase. Perhaps "freshness" in this case refers to it having been recently
secure-erased (which I did before I started testing)?
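To put the Total_LBAs_Written figure in perspective, here is a small
conversion sketch. It assumes the raw value counts 512-byte LBAs and that the
drive's nominal capacity is 1.92 TB; neither is confirmed by the smartctl
output above, so treat the result as an order-of-magnitude figure only.

```shell
# Convert the SMART raw value of Total_LBAs_Written to bytes and drive fills.
LBAS_WRITTEN=535797643689   # attribute 241, raw value from smartctl
LBA_BYTES=512               # assumption: the raw value counts 512-byte LBAs
CAPACITY=1920000000000      # assumption: nominal 1.92 TB

host_bytes=$(( LBAS_WRITTEN * LBA_BYTES ))

echo "host writes: $(( host_bytes / 1000000000000 )) TB"
echo "drive fills: $(( host_bytes / CAPACITY ))"
```

If those assumptions hold, that is roughly 274 TB of host writes, or about 142
times the drive's capacity.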
> The 18K IOPS value
> might be when the drive has been fully written and there are no
> pre-erased blocks available (via so-called preconditioning)... I'll
> also note the whitepaper [1] mentions this:
>
> SSD Precondition: Sustained state (or steady state)
> [...]
> It's important to note that all performance items mentioned in this
> white paper have been measured at the sustained state, except the
> sequential read/write performance
>
Thanks for going through the whitepaper and picking this up. It passed
right by me...
I went through the whitepaper again, and found this:
The sustained state in this document refers to the status that a
128 KB sequential write has been completed equal to the drive capacity and
then 4 KB random write has completed twice as much as the drive capacity
OK, so at least there's a "recipe" for this preconditioning. I will try it
and come back later to report.
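A minimal sketch of how that recipe might translate to fio, reusing the
options from the original run. The iodepth of 32, the job names, the `run`
helper and the placeholder device path are all illustrative choices, not
something the whitepaper prescribes. `--runtime` and `--time_based` are left
out on purpose so that each phase is bounded by the amount of data written
rather than by time, and `--loops=2` is assumed to give two full random-write
passes over the device.

```shell
# DESTRUCTIVE when actually run: both phases overwrite the whole device.
# By default the commands are only printed; set RUN=1 to execute them.
DEV=${TANGO:-/dev/disk/by-id/ata-EXAMPLE}   # hypothetical placeholder path
COMMON='--ioengine=libaio --direct=1 --group_reporting'

run() {
    if [ "${RUN:-0}" = "1" ]; then sudo "$@"; else echo "$@"; fi
}

precondition() {
    # Phase 1: 128 KB sequential write covering the full capacity once.
    run fio --filename="$1" --name=precond_seq --rw=write --bs=128k \
        --iodepth=32 --size=100% $COMMON
    # Phase 2: 4 KB random write, two passes over the full capacity.
    run fio --filename="$1" --name=precond_rand --rw=randwrite --bs=4k \
        --iodepth=32 --size=100% --loops=2 $COMMON
}

precondition "$DEV"
```

The measurement run itself would then follow immediately, without a
secure-erase in between.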
> I notice that your SSD appears to be SATA (sda) so I'd be surprised
> that a total queue depth greater than 32 makes a difference (your
> total queue depth is 1024). Do you get a similar result with just the
> one job with an iodepth=32?
I tested with iodepth=32 (instead of 256) and got the same result, so I
guess you are not surprised ;-)
Just did it again, this time with `--numjobs=1` (instead of 4) and here's
the result:
write: IOPS=83.0k, BW=324MiB/s (340MB/s)(38.0GiB/120001msec)
So that's not it either.
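For reference, the total queue depth requested is simply iodepth times
numjobs, which the sketch below works out for both runs. The command it prints
is a reconstruction of the rerun, assuming iodepth=32 and numjobs=1 were
combined; it is printed rather than executed because it writes to the raw
device.

```shell
# Total outstanding I/Os requested = iodepth x numjobs.
total_qd() { echo $(( $1 * $2 )); }

echo "original run: $(total_qd 256 4)"   # iodepth=256, numjobs=4
echo "rerun:        $(total_qd 32 1)"    # iodepth=32,  numjobs=1

# Reconstructed rerun command (printed only; variables left unexpanded).
echo sudo fio --filename='${TANGO}' --name=device_iops_write \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 '${COMMON_OPTIONS}'
```

That takes the request from 1024 down to 32 outstanding I/Os, which is the
most SATA NCQ can queue on the device anyway.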
> It's unlikely but if the jobs were submitting I/O to the same areas as
> other jobs at the same time then some of the I/O could be elided but
> given what you've posted this should not be the case.
Agreed.
Cheers,
--
Durval.
>
>
> --
> Sitsofe
> >
> >