From: Stefan Priebe <s.priebe@profihost.ag>
To: Chinmay V S <cvs268@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>,
linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
LKML <linux-kernel@vger.kernel.org>,
matthew@wil.cx
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?
Date: Fri, 22 Nov 2013 20:55:26 +0100 [thread overview]
Message-ID: <528FB6AE.1080405@profihost.ag> (raw)
In-Reply-To: <CAK-9PRDyGhXPef-Vbt83Os-oowFZ2HzSZVY9PH3rSdnTwNmf2w@mail.gmail.com>
On 20.11.2013 16:22, Chinmay V S wrote:
> Hi Stefan,
>
> >> Thanks for your great and detailed reply. I'm just wondering why an
> >> Intel 520 SSD degrades speed by just 2% in the case of O_SYNC, while
> >> the Intel 530, the newer model and replacement for the 520, degrades
> >> speed by 75%, like the Crucial m4.
>>
> >> The Intel DC S3500, by contrast, delivers nearly 98% of its
> >> performance even under O_SYNC.
>
> If you have confirmed the performance numbers, then it indicates that
> the Intel 530 controller is more advanced and makes better use of the
> internal disk-cache to achieve better performance (as compared to the
> Intel 520). Thus forcing CMD_FLUSH on each IOP (negating the benefits
> of the disk write-cache and not allowing any advanced disk controller
> optimisations) has a more pronounced effect of degrading the
> performance on Intel 530 SSDs. (Someone with some actual info on Intel
> SSDs kindly confirm this.)
>
>>> To simply disable this behaviour and make the SYNC/DSYNC behaviour and
>>> performance on raw block-device I/O resemble the standard filesystem
>>> I/O you may want to apply the following patch to your kernel -
>>> https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba
>>>
>>> The above patch simply disables the CMD_FLUSH command support even on
>>> disks that claim to support it.
>>
>> Is this the right one? By adding ahci_dummy_read_id, do we disable
>> CMD_FLUSH?
>>
>> What is the risk of that one?
>
> Yes, https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba is the
> right one. The dummy read_id() provides a hook into the initial
> disk-properties discovery process when the disk is plugged in. By
> explicitly negating the bits that indicate cache and
> flush-cache (CMD_FLUSH) support, we can ensure that the block driver
> does NOT issue CMD_FLUSH commands to the disk. Note that this does NOT
> disable the write-cache on the disk itself i.e. performance improves
> due to the on-disk write-cache in the absence of any CMD_FLUSH
> commands from the host-PC.
Ah, OK, thanks.
> Theoretically, it increases the chances of data loss, i.e. if power is
> removed while a write from the app is in progress. Personally, though,
> I have found that the impact of this is minimal, because SYNC on a raw
> block device with CMD_FLUSH does NOT guarantee atomicity in case of a
> power-loss. Hence, in the event of a power loss, applications cannot
> rely on SYNC(with CMD_FLUSH) for data integrity. Rather they have to
> maintain other data-structures with redundant disk metadata (which is
> precisely what modern file-systems do). Thus, removing CMD_FLUSH
> doesn't really result in a downside as such.
In my production system I have Crucial m500 drives, which have a
capacitor, so in case of power loss they flush their data to disk
automatically.
> The main thing to consider when applying the above simple patch is
> that it is system-wide. The above patch prevents the host-PC from
> issuing CMD_FLUSH for ALL drives enumerated via SATA/SCSI on the
> system.
>
> If this patch works for you, then to restrict the change in behaviour
> to a specific disk, you will need to:
> 1. Identify the disk by its model number within the dummy read_id().
> 2. Zero the bits ONLY for your particular disk.
> 3. Return without modifying anything for all other disks.
>
> Try out the above patch and let me know if you have any further issues.
The best thing would be a flag under
/sys/block/sdc/device/
for SSDs with a capacitor - so everybody can decide on their own.
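A related knob does already exist: the SCSI disk layer exposes `cache_type`, and when the kernel believes a disk is write-through it stops issuing flushes for it. Whether that fits the capacitor-backed case is debatable; the device address below is just an example, and on recent kernels a "temporary " prefix is supposed to change only the kernel's view without sending a MODE SELECT to the drive.

```shell
# Show the kernel's current idea of the disk's cache mode
cat /sys/class/scsi_disk/2:0:0:0/cache_type

# Claim write-through, so the block layer stops issuing flushes.
# With the "temporary " prefix the drive's own write-cache setting
# should be left untouched (example address, adjust to your disk).
echo "temporary write through" > /sys/class/scsi_disk/2:0:0:0/cache_type
```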
Stefan