From: Mark Kampe <mark.kampe@inktank.com>
To: sheng qiu <herbert1984106@gmail.com>
Cc: "Chen, Xiaoxi" <xiaoxi.chen@intel.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: some performance issue
Date: Mon, 04 Feb 2013 09:29:26 -0800
Message-ID: <510FEFF6.7010709@inktank.com>
In-Reply-To: <CAB7xdinzWjS7PfWCf+3HLby7DYVr4Yz21C1WLV6Fa=6CwzjTOA@mail.gmail.com>

Writes are intrinsically more expensive (in both the file
system and hardware), but it is not uncommon for individual
small random writes to substantially outperform reads, even
with O_DIRECT.

If the I/O is not massively parallel, reads are going to be
processed one at a time (e.g. ~6ms seek, ~4ms rotational
latency, and 27us transfer).  Writes, however, are commonly
accepted by the drive and then queued, enabling the drive to
choose among the competing requests to significantly (e.g.
2-3x) reduce both average seek time and rotational latency.
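
As a rough illustration of the arithmetic above, here is a minimal
sketch that turns the example figures (6ms seek, 4ms rotational
latency, 27us transfer) into requests per second.  The function names
are hypothetical, and applying the 2-3x reduction uniformly to both
seek and rotational latency is a simplifying assumption, not a
measurement of any particular drive.

```python
# Example figures from the message above; real drives will differ.
SEEK_MS = 6.0          # average seek time
ROTATIONAL_MS = 4.0    # average rotational latency
TRANSFER_MS = 0.027    # 27us to transfer one small request


def iops(seek_ms, rotational_ms, transfer_ms=TRANSFER_MS):
    """Requests per second when each request costs seek + rotation + transfer."""
    return 1000.0 / (seek_ms + rotational_ms + transfer_ms)


def queued_iops(reduction):
    """IOPS if queuing cuts seek and rotational latency by `reduction` (assumed uniform)."""
    return iops(SEEK_MS / reduction, ROTATIONAL_MS / reduction)


serial_read_iops = iops(SEEK_MS, ROTATIONAL_MS)   # one-at-a-time reads
queued_write_iops_2x = queued_iops(2.0)           # queued writes, 2x reduction
queued_write_iops_3x = queued_iops(3.0)           # queued writes, 3x reduction

print(f"serial reads:       {serial_read_iops:.0f} IOPS")
print(f"queued writes (2x): {queued_write_iops_2x:.0f} IOPS")
print(f"queued writes (3x): {queued_write_iops_3x:.0f} IOPS")
```

Under these assumptions, one-at-a-time reads land near 100 IOPS while
queued writes land roughly in the 200-300 IOPS range, which is in line
with a 2x gap between random write and random read IOPS.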

If the I/O is being buffered, the performance advantages for
random writes can be even greater (due to a deeper request
queue and potential request aggregation).  Isolated random
reads (with few cache hits) get a much smaller performance
boost (if any) from buffered I/O.

With massively parallel requests, however, the write
advantage should evaporate.
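
One way to see why queue depth, rather than request type, drives this
effect is a toy simulation.  The sketch below assumes request
positions are uniform on a normalized [0, 1) disk and that the drive
always services the nearest pending request; both are simplifying
assumptions, and mean_seek_distance is a hypothetical helper, not a
model of any real drive firmware.  A queue depth of 1 stands in for
one-at-a-time reads; a deep queue stands in for either queued writes
or massively parallel reads.

```python
import random


def mean_seek_distance(queue_depth, n_requests=20000, seed=0):
    """Mean head travel per request when the nearest of `queue_depth`
    pending requests is always serviced next (toy model)."""
    rng = random.Random(seed)
    head = 0.5
    pending = [rng.random() for _ in range(queue_depth)]
    total = 0.0
    for _ in range(n_requests):
        # Pick the pending request closest to the current head position.
        i = min(range(len(pending)), key=lambda k: abs(pending[k] - head))
        target = pending[i]
        total += abs(target - head)
        head = target
        # A new random request arrives, keeping the queue full.
        pending[i] = rng.random()
    return total / n_requests


depth1 = mean_seek_distance(1)    # requests handled strictly one at a time
depth32 = mean_seek_distance(32)  # drive can choose among 32 pending requests

print(f"queue depth 1:  mean seek distance {depth1:.3f}")
print(f"queue depth 32: mean seek distance {depth32:.3f}")
```

In this model the benefit comes entirely from having many requests to
choose among, so any workload that keeps the queue full gets it,
reads included.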

On 02/04/2013 09:15 AM, sheng qiu wrote:
> Hi Xiaoxi,
>
> thanks for your reply.
>
> On Mon, Feb 4, 2013 at 10:52 AM, Chen, Xiaoxi <xiaoxi.chen@intel.com> wrote:
>> I doubt your data is correct, even the ext4 data. Did you use O_DIRECT when doing the test? It's unusual for random writes to get 2X the IOPS of random reads.
>>
>
>   I did not use O_DIRECT, so the page cache is used during the test.
> My guess as to why random write is better than random read: since
> the I/O request size is 4KB, each write request that misses in the
> page cache allocates a new page and writes the complete 4KB of
> dirty data there (there are no partial writes, so there is no need
> to fetch the missing data from the OSDs). Read requests, however,
> have to wait until the data is fetched from the OSDs.

