linux-fsdevel.vger.kernel.org archive mirror
From: Stefan Priebe <s.priebe@profihost.ag>
To: Chinmay V S <cvs268@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	matthew@wil.cx
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?
Date: Fri, 22 Nov 2013 20:55:26 +0100
Message-ID: <528FB6AE.1080405@profihost.ag>
In-Reply-To: <CAK-9PRDyGhXPef-Vbt83Os-oowFZ2HzSZVY9PH3rSdnTwNmf2w@mail.gmail.com>

On 20.11.2013 16:22, Chinmay V S wrote:
> Hi Stefan,
>
>> thanks for your great and detailed reply. I'm just wondering why an
>> Intel 520 SSD loses only 2% of its speed with O_SYNC, while the Intel
>> 530, the newer model and replacement for the 520, loses 75%, like
>> the Crucial m4.
>>
>> The Intel DC S3500, by contrast, also delivers nearly 98% of its
>> performance even under O_SYNC.
>
> If you have confirmed the performance numbers, then it indicates that
> the Intel 530 controller is more advanced and makes better use of the
> internal disk-cache to achieve better performance (as compared to the
> Intel 520). Thus forcing CMD_FLUSH on each IOP (negating the benefits
> of the disk write-cache and not allowing any advanced disk controller
> optimisations) has a more pronounced effect of degrading the
> performance on Intel 530 SSDs. (Someone with some actual info on Intel
> SSDs kindly confirm this.)
>
>>> To simply disable this behaviour and make the SYNC/DSYNC behaviour and
>>> performance on raw block-device I/O resemble the standard filesystem
>>> I/O you may want to apply the following patch to your kernel -
>>> https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba
>>>
>>> The above patch simply disables the CMD_FLUSH command support even on
>>> disks that claim to support it.
>>
>> Is this the right one? By assigning ahci_dummy_read_id we disable
>> CMD_FLUSH?
>>
>> What is the risk of that one?
>
> Yes, https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba is the
> right one. The dummy read_id() provides a hook into the initial
> disk-properties discovery process when the disk is plugged-in. By
> explicitly negating the bits that indicate cache and
> flush-cache(CMD_FLUSH) support, we can ensure that the block driver
> does NOT issue CMD_FLUSH commands to the disk. Note that this does NOT
> disable the write-cache on the disk itself i.e. performance improves
> due to the on-disk write-cache in the absence of any CMD_FLUSH
> commands from the host-PC.

ah OK thanks.
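
For readers following along, here is a rough user-space sketch of the idea described in the quoted explanation: clear the "cache" and "flush-cache" capability bits in the IDENTIFY data before the rest of the stack looks at it. This is not the patch from the gist. The function names are hypothetical, the word and bit positions are my reading of the ATA IDENTIFY DEVICE layout and should be checked against the spec or the kernel headers, and the buffer is assumed to already be in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Word indexes into the 256-word IDENTIFY DEVICE buffer.
 * ASSUMPTION: taken from the ATA command-set layout as I remember it;
 * verify before relying on them. */
enum {
    ID_COMMAND_SET_1 = 82, /* bit 5: write cache supported */
    ID_COMMAND_SET_2 = 83, /* bit 12: FLUSH CACHE, bit 13: FLUSH CACHE EXT */
    ID_CFS_ENABLE_1  = 85, /* bit 5: write cache enabled */
    ID_CFS_ENABLE_2  = 86  /* bits 12/13: flush commands enabled */
};

#define BIT_WCACHE (1u << 5)
#define BITS_FLUSH ((1u << 12) | (1u << 13))

/* Hypothetical helper: true if the buffer still advertises a flush command. */
static bool advertises_flush(const uint16_t id[256])
{
    return (id[ID_COMMAND_SET_2] & BITS_FLUSH) != 0;
}

/* Hypothetical helper: hide write-cache and flush support from the host.
 * The drive's own write cache is untouched; only the host's view changes. */
static void hide_cache_and_flush(uint16_t id[256])
{
    id[ID_COMMAND_SET_1] &= (uint16_t)~BIT_WCACHE;
    id[ID_CFS_ENABLE_1]  &= (uint16_t)~BIT_WCACHE;
    id[ID_COMMAND_SET_2] &= (uint16_t)~BITS_FLUSH;
    id[ID_CFS_ENABLE_2]  &= (uint16_t)~BITS_FLUSH;
}
```

In a real driver hook the buffer may still be little-endian at that point, so the masks would need the matching byte-order conversion.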

> Theoretically, it increases the chances of data loss, i.e. if power is
> removed while a write from the app is still in progress. Personally though
> I have found that the impact of this is minimal because SYNC on a raw
> block device with CMD_FLUSH does NOT guarantee atomicity in case of a
> power-loss. Hence, in the event of a power loss, applications cannot
> rely on SYNC(with CMD_FLUSH) for data integrity. Rather they have to
> maintain other data-structures with redundant disk metadata (which is
> precisely what modern file-systems do). Thus, removing CMD_FLUSH
> doesn't really result in a downside as such.

In my production system I have Crucial M500 drives, which have a
capacitor, so in case of a power loss they flush their data to disk
automatically.

> The main thing to consider when applying the above simple patch is
> that it is system-wide. The above patch prevents the host-PC from
> issuing CMD_FLUSH for ALL drives enumerated via SATA/SCSI on the
> system.
>
> If this patch works for you, then to restrict the change in behaviour
> to a specific disk, you will need to:
> 1. Identify the disk by its model number within the dummy read_id().
> 2. Zero the bits ONLY for your particular disk.
> 3. Return without modifying anything for all other disks.
>
> Try out the above patch and let me know if you have any further issues.
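
The three steps quoted above could look roughly like the following user-space sketch. It is only an illustration under stated assumptions: the helper names and the model string used in any call are hypothetical, the model string is assumed to live in IDENTIFY words 27 to 46 with two ASCII characters per word (high byte first), and the flush bits are assumed to be bits 12 and 13 of word 83.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ID_MODEL_WORD = 27, ID_MODEL_LEN = 40, ID_COMMAND_SET_2 = 83 };

/* Step 1: extract the model string and trim trailing padding. */
static void id_model(const uint16_t id[256], char out[ID_MODEL_LEN + 1])
{
    size_t n = 0;
    for (size_t w = 0; w < ID_MODEL_LEN / 2; w++) {
        out[n++] = (char)(id[ID_MODEL_WORD + w] >> 8);
        out[n++] = (char)(id[ID_MODEL_WORD + w] & 0xff);
    }
    while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\0'))
        n--;
    out[n] = '\0';
}

/* Steps 2 and 3: clear the flush bits only for the wanted model and
 * leave every other disk's data unmodified. Returns true if changed. */
static bool hide_flush_for_model(uint16_t id[256], const char *wanted_model)
{
    char model[ID_MODEL_LEN + 1];

    id_model(id, model);
    if (strcmp(model, wanted_model) != 0)
        return false; /* some other disk: do not touch it */

    id[ID_COMMAND_SET_2] &= (uint16_t)~((1u << 12) | (1u << 13));
    return true;
}
```

An exact string compare is the strictest choice; matching on a prefix would cover a whole product family, at the cost of catching drives you did not test.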

The best thing would be a flag under
/sys/block/sdc/device/

for SSDs with a capacitor, so everybody can decide on their own.
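
As far as I know, kernels much newer than the ones discussed in this thread expose a per-device knob close to this wish: the block queue attribute queue/write_cache, which accepts "write back" or "write through" and only changes the kernel's view of the device, not the drive's own cache setting. Treat the exact path and its availability on your kernel as an assumption to verify. A minimal sketch of setting such an attribute from C, with a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: write a mode string to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure. */
static int set_cache_mode(const char *attr_path, const char *mode)
{
    FILE *f = fopen(attr_path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(mode, f) >= 0;
    if (fclose(f) != 0) /* write errors may only surface on close */
        ok = 0;
    return ok ? 0 : -1;
}
```

A call for the disk mentioned above might be set_cache_mode("/sys/block/sdc/queue/write_cache", "write through"), run as root and only for drives whose cache is known to be power-loss protected.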

Stefan
