From: Loic Dachary <loic@dachary.org>
To: Kyle Bader <kyle@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: Pyramid erasure codes and replica hinted recovery
Date: Mon, 13 Jan 2014 09:38:47 +0100	[thread overview]
Message-ID: <52D3A617.9010409@dachary.org> (raw)
In-Reply-To: <CAHxYaFOzaeeRPv+N0mU_uWODUOvQnAe2kAn8rkuuSchFigkBAQ@mail.gmail.com>

On 13/01/2014 03:35, Kyle Bader wrote:
>> How is it different from what is described above? There must be something I fail to understand.
> 
> No misunderstanding on your part, on second look that does achieve the
> desired placement. Could you please help walk me through the following
> scenarios:
> 
> Can data or local parity chunks that have been lost (erasures) be
> recovered locally, with no inter-dc backfill traffic?

If the primary happens to be located in the same datacenter as the lost chunk and the layout is as described previously, the chunk will be recovered without any inter-dc traffic. If the primary is not in the same datacenter, it may be possible to move it to the datacenter where the lost chunk is located. When the primary OSD is lost, another must be chosen. It would be nice to change the primary not only when it is lost but also when doing so helps recovery.
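To make the effect of the primary's location concrete, here is a minimal sketch that counts chunk transfers crossing a datacenter boundary. It assumes a simplified model in which the primary reads every source chunk, rebuilds the lost chunk and pushes it to the OSD that will hold it. The function name, the datacenter labels and the two-chunk local group are made up for illustration and are not Ceph code.

```python
def inter_dc_transfers(primary_dc, source_dcs, lost_chunk_dc):
    """Count chunk transfers that cross a datacenter boundary.

    Assumed model (not Ceph code): the primary reads each source chunk,
    rebuilds the lost chunk, then writes it to its destination OSD.
    """
    # Every source chunk stored outside the primary's datacenter is one remote read.
    reads = sum(1 for dc in source_dcs if dc != primary_dc)
    # The rebuilt chunk crosses datacenters if its destination is remote.
    write = 1 if lost_chunk_dc != primary_dc else 0
    return reads + write


# Hypothetical layout: the lost chunk and the chunks needed to rebuild it
# all live in "dc1".
local_sources = ["dc1", "dc1"]

print(inter_dc_transfers("dc1", local_sources, "dc1"))  # primary co-located
print(inter_dc_transfers("dc2", local_sources, "dc1"))  # primary elsewhere
```

Under this model a co-located primary yields no inter-dc transfers, which is why moving the primary could help recovery.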

> Global parity chunks that are lost require reading....6x data or
> global parity chunks (effectively 1x the original write)?

From the point of view of recovery, global parity chunks are treated in the same way as data chunks. If you have RS(6,3,3), you will need to read 6 chunks out of 9 (6 data chunks + 3 global parity chunks) to be able to recover from the loss of 2 or 3 chunks (data or parity, it does not matter). In other words, to recover from the loss of more chunks than local parity allows, you need to read 6 chunks, each 1/6 of the object, which amounts to 1x the original write.
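The arithmetic can be sketched as follows. The grouping used for the local case (3 groups of 2 data chunks, each with 1 local parity chunk) is only my assumption about how the second 3 in RS(6,3,3) maps to local parity, and the function name and parameters are made up for illustration.

```python
from fractions import Fraction


def recovery_read_fraction(lost_in_group, local_parity, group_data_chunks, k):
    """Fraction of the original object that must be read to recover.

    k                  number of data chunks (6 in RS(6,3,3))
    local_parity       losses a local group can repair by itself (assumed 1)
    group_data_chunks  data chunks per local group (assumed 2)
    """
    chunk = Fraction(1, k)  # each chunk holds 1/k of the object
    if lost_in_group <= local_parity:
        # Local recovery: read as many surviving chunks of the group as
        # the group has data chunks.
        return group_data_chunks * chunk
    # Otherwise decode from any k of the data + global parity chunks.
    return k * chunk


print(recovery_read_fraction(1, 1, 2, 6))  # one loss, repaired locally
print(recovery_read_fraction(2, 1, 2, 6))  # more than local parity allows
```

With these assumptions a single loss reads 1/3 of the object, while anything beyond what local parity allows reads 6 chunks of 1/6 each, that is 1x the original write.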

> Would placement groups containing a data or local parity chunk that
> have been remapped backfill from the local chunk (member of previous
> acting set)?

David is working on multiple backfill at the moment, see https://github.com/ceph/ceph/pull/931, and will have a definitive answer. The data flows from the primary OSD to the OSDs supporting the other chunks: there is no peer-to-peer communication between the OSDs participating in a placement group.

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre

