Re: ceph and efficient access of distributed resources

From: Mark Kampe <mark.kampe@inktank.com>
To: Gandalf Corvotempesta <gandalf.corvotempesta@gmail.com>
Cc: Matthias Urlichs <matthias@urlichs.de>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: ceph and efficient access of distributed resources
Date: Tue, 16 Apr 2013 07:18:27 -0700
Message-ID: <516D5DB3.4060800@inktank.com>
In-Reply-To: <CAJH6TXhRtWQ1ypEb5JOwzp9T3Nd7A=J_6Dn0M25vXjYMg8j7fQ@mail.gmail.com>

On 04/16/13 00:20, Gandalf Corvotempesta wrote:
> 2013/4/16 Mark Kampe <mark.kampe@inktank.com>:
>> The entire web is richly festooned with cache servers whose
>> sole raison d'etre is to solve precisely this problem.  They
>> are so good at it that back-bone providers often find it more
>> cash-efficient to buy more cache servers than to lay more
>> fiber.  Cache servers don't merely save disk I/O, they catch
>> these requests before they reach the server (or even the
>> backbone).
>
> Mine was just an example; there are many other cases where a frontend
> cache is not possible.
> I think that ceph should spread reads across the whole cluster by
> default (like a big RAID-1), to achieve a bandwidth improvement.

At my previous distributed storage start-up (Parascale) we had the
ability to distribute reads across copies for load distribution
purposes and everybody we talked to said "who cares!".  Why?

    For hot-spot situations (as in your original example)
    higher level caching is far more effective than random
    traffic distribution.

    For lower level (e.g. coincidental) reuse, sending all the
    requests to a single server will usually perform better.
    Network I/O is much faster than disk I/O, and a single
    recipient will have N times the cache hit rate that N
    servers would have.
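
The cache argument can be illustrated with a toy simulation. This is
only a sketch under stated assumptions, not a model of any real Ceph
code path: the LRU cache, the "consecutive servers" replica placement,
the uniform access pattern, and all the sizes are hypothetical choices
made for illustration. It compares sending every read for an object to
one fixed server against picking a random replica for each read.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that only tracks which keys are resident."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert the key and evict the oldest."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
        return False


def replicas_for(obj, num_servers, num_replicas):
    # Hypothetical placement: consecutive servers starting at obj % num_servers.
    # The first entry plays the role of the single "primary" reader.
    return [(obj + i) % num_servers for i in range(num_replicas)]


def hit_rate(spread_reads, num_servers=8, num_replicas=3, cache_size=100,
             num_objects=2000, num_reads=50_000, seed=1):
    """Fraction of reads served from cache under the chosen read policy."""
    rng = random.Random(seed)
    caches = [LRUCache(cache_size) for _ in range(num_servers)]
    hits = 0
    for _ in range(num_reads):
        obj = rng.randrange(num_objects)  # assumed uniform access pattern
        replicas = replicas_for(obj, num_servers, num_replicas)
        server = rng.choice(replicas) if spread_reads else replicas[0]
        hits += caches[server].access(obj)
    return hits / num_reads


single = hit_rate(spread_reads=False)
spread = hit_rate(spread_reads=True)
print(f"one server per object: {single:.2f}  random replica: {spread:.2f}")
```

With reads spread over replicas, each server's cache has to cover every
object it holds a copy of, so the same data ends up cached in several
places and each cache covers a smaller share of what it is asked for.
The exact ratio depends on the assumed workload and replica count.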

> What happens in the case of a big file (for example, 100MB) with multiple
> chunks? Is ceph smart enough to read multiple chunks from multiple
> servers simultaneously, or will the whole file be served by just one OSD?

RADOS is the underlying storage cluster, but the access methods (block,
object, and file) stripe their data across many RADOS objects, which
CRUSH very effectively distributes across all of the servers.  A 100MB
read or write turns into dozens of parallel operations to servers all
over the cluster.
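
As a rough sketch of how a large read fans out, the snippet below
splits a byte range into per-object pieces and maps each object to a
server. Everything here is an assumption made for illustration: the
4 MiB object size, the object naming scheme, the 12-OSD cluster, and
especially `toy_placement`, which is a plain hash standing in for CRUSH
and is not the real algorithm.

```python
import hashlib

OBJECT_SIZE = 4 * 1024 * 1024  # assumed stripe/object size (4 MiB)


def objects_for_range(offset, length, object_size=OBJECT_SIZE):
    """Split a byte range into (object_index, offset_in_object, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // object_size
        within = offset % object_size
        chunk = min(object_size - within, end - offset)
        pieces.append((index, within, chunk))
        offset += chunk
    return pieces


def toy_placement(object_name, num_osds):
    # Stand-in for CRUSH: hash the object name and pick a server.
    digest = hashlib.sha256(object_name.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_osds


NUM_OSDS = 12  # hypothetical cluster size
pieces = objects_for_range(0, 100 * 1024 * 1024)  # a 100 MiB read
osds = {toy_placement(f"file.{index:08x}", NUM_OSDS) for index, _, _ in pieces}
print(len(pieces), "object reads touching", len(osds), "of", NUM_OSDS, "OSDs")
```

Each piece is an independent request, so a client can issue them in
parallel instead of streaming the whole file from one server.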
