From: Loic Dachary <loic@dachary.org>
To: Andreas Joachim Peters <Andreas.Joachim.Peters@cern.ch>
Cc: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: controlling erasure code chunk size
Date: Tue, 04 Feb 2014 17:17:55 +0100
Message-ID: <52F112B3.6050502@dachary.org>
In-Reply-To: <CA+4uBUYROX1gWuBzkUgxTN4Vc2+62BfXq8Ch5FHPQbvKHU8Jsw@mail.gmail.com>
Hi Andreas,
> For w=(multiple of 8) we could probably skip the (*sizeof(int)) and get the chunksize factor 4 down ... Loic we should check if this is ok with the Jerasure implementation .... I wonder if we should have 'packetsize' as a plugin parameter or we should just adjust the packetsize based on the desired chunk_size to get it close.
You are correct: the packet size is best adapted to the object size (or stripe size) rather than being set once and for all. However, Sam wants to use a fixed stripe size, so we don't need this flexibility right now.
I don't fully understand the alignment requirements of Jerasure. Since we're using Cauchy because it is the fastest, here is how I understand its alignment constraints. I copied them from the original encode/decode methods found in jerasure into the get_alignment method without understanding the details.
* each chunk memory address must be aligned to allow https://github.com/ceph/ceph/blob/v0.76/src/osd/ErasureCodePluginJerasure/vectorop.h to be used by https://github.com/ceph/ceph/blob/v0.76/src/osd/ErasureCodePluginJerasure/galois.c#L748 . This is done without consulting get_alignment() because each buffer is created with buffer::create_page_aligned ( https://github.com/ceph/ceph/blob/v0.76/src/common/buffer.cc#L519 ), which calls posix_memalign ( https://github.com/ceph/ceph/blob/v0.76/src/common/buffer.cc#L235 ) with an alignment of CEPH_PAGE_SIZE, which is large enough. This is implicit, though, and it would be better to set this constraint explicitly.
https://github.com/ceph/ceph/blob/v0.76/src/osd/ErasureCodePluginJerasure/ErasureCodeJerasure.cc#L288
* each chunk size must be a multiple of get_alignment() and, in the case of the Cauchy techniques, it means:
** being a multiple of sizeof(int) (why?)
** being a multiple of LARGEST_VECTOR_WORDSIZE (because https://github.com/ceph/ceph/blob/v0.76/src/osd/ErasureCodePluginJerasure/galois.c#L748)
** being a multiple of k*w*packetsize (because each chunk contains k packets of size packetsize and each packet is made of words of size w)
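The two kinds of constraint listed above (on the chunk address and on the chunk size) can be sketched as follows. This is only a minimal illustration, not the Ceph code: kPageSize and kVectorWordSize are assumed stand-ins for CEPH_PAGE_SIZE and LARGEST_VECTOR_WORDSIZE, and alloc_page_aligned, is_aligned and pad_to_alignment are hypothetical helpers. It shows why a page-aligned buffer implicitly satisfies the vector alignment (with these assumed values, any multiple of 4096 is also a multiple of 16) and how a chunk size could be padded up to a multiple of an alignment value.

```cpp
#include <stdlib.h>   // posix_memalign, free
#include <cstddef>
#include <cstdint>

// Assumed values for illustration only; not taken from the Ceph sources.
static const std::size_t kPageSize = 4096;       // stand-in for CEPH_PAGE_SIZE
static const std::size_t kVectorWordSize = 16;   // stand-in for LARGEST_VECTOR_WORDSIZE

// Hypothetical helper: allocate a page-aligned buffer with posix_memalign.
// Returns nullptr on failure; the caller releases the buffer with free().
void *alloc_page_aligned(std::size_t len) {
  void *p = nullptr;
  if (posix_memalign(&p, kPageSize, len) != 0)
    return nullptr;
  return p;
}

// Hypothetical helper: true when the address is a multiple of `alignment`.
bool is_aligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: round a size up to the next multiple of `alignment`.
std::size_t pad_to_alignment(std::size_t size, std::size_t alignment) {
  std::size_t rem = size % alignment;
  return rem == 0 ? size : size + (alignment - rem);
}
```

For example, with an alignment of 384 bytes a 1000-byte chunk would be padded to 1152 bytes, and a buffer returned by alloc_page_aligned passes is_aligned for both kPageSize and kVectorWordSize.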
I would be grateful if you could explain what the sizeof(int) is about. Also, I understand that k*w*packetsize should be a multiple of LARGEST_VECTOR_WORDSIZE, but I don't understand why you would multiply the alignment to achieve this. Would it be enough to do if (alignment % LARGEST_VECTOR_WORDSIZE) alignment += alignment % LARGEST_VECTOR_WORDSIZE ?
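To make the question concrete, here is a small sketch under assumed values: a base of 84 = k*w*packetsize with k=3, w=7, packetsize=4, and a vector word size of 16 (none of these numbers come from the Ceph sources, and all function names are hypothetical). It compares the adjustment written above with rounding up and with taking a common multiple. With these numbers only a common multiple remains a multiple of both k*w*packetsize and the vector word size, which may be the reason for multiplying; this is an illustration of the arithmetic, not a statement of what Jerasure requires.

```cpp
#include <cstddef>

// Greatest common divisor (Euclid's algorithm).
std::size_t gcd_sz(std::size_t a, std::size_t b) {
  while (b != 0) {
    std::size_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

// The adjustment as written in the question: add the remainder.
std::size_t add_remainder(std::size_t alignment, std::size_t word) {
  if (alignment % word)
    alignment += alignment % word;
  return alignment;
}

// Round up to the next multiple of `word`.
std::size_t round_up(std::size_t alignment, std::size_t word) {
  std::size_t rem = alignment % word;
  return rem ? alignment + (word - rem) : alignment;
}

// Smallest value that is a multiple of both arguments.
std::size_t lcm_sz(std::size_t a, std::size_t b) {
  return a / gcd_sz(a, b) * b;
}
```

With base = 84 and word = 16: add_remainder gives 88, which is not a multiple of 16; round_up gives 96, which is no longer a multiple of 84; lcm_sz gives 336; and the plain product 84*16 = 1344 is also a multiple of both, just larger than strictly needed.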
Thanks in advance for your patience :-)
--
Loïc Dachary, Artisan Logiciel Libre