From: Evgeniy Firsov <Evgeniy.Firsov@sandisk.com>
To: Haomai Wang <haomaiwang@gmail.com>, Sage Weil <sage@newdream.net>
Cc: changtao381 <changtao381@163.com>,
"Dałek, Piotr" <Piotr.Dalek@ts.fujitsu.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: 答复: journal alignment
Date: Fri, 20 Nov 2015 19:53:18 +0000
Message-ID: <D274B588.92A7%evgeniy.firsov@sandisk.com>
In-Reply-To: <CACJqLyY48OX4rLFg9xj_UxEmO24x0bx70wyJsLjGnLyM-skkCA@mail.gmail.com>
Do you see any performance difference between direct io and non direct io
mode?
If it's disabled, you don't need any buffer alignment.
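
To make the alignment question concrete: on Linux, O_DIRECT writes generally expect the buffer address, the transfer length, and the file offset to be multiples of the device's logical block size, while buffered IO has no such constraint. Below is a minimal sketch of that check. The helper names (`is_multiple`, `direct_io_ok`, `alloc_aligned`) are hypothetical and not taken from Ceph, and the exact alignment rule depends on the kernel and filesystem, so the alignment value is passed in by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True if x is a multiple of align; align must be a non-zero power of two.
inline bool is_multiple(std::uint64_t x, std::size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (x & (align - 1)) == 0;
}

// Hypothetical helper: checks the three properties that direct IO typically
// requires (buffer address, length, file offset), under the assumption that
// all three must be multiples of the same alignment.
inline bool direct_io_ok(const void* buf, std::size_t len,
                         std::uint64_t file_off, std::size_t align) {
  return is_multiple(reinterpret_cast<std::uintptr_t>(buf), align) &&
         is_multiple(len, align) &&
         is_multiple(file_off, align);
}

// Allocates len bytes whose address is a multiple of align.
// Returns nullptr on failure; release the memory with std::free().
inline void* alloc_aligned(std::size_t len, std::size_t align) {
  void* p = nullptr;
  if (posix_memalign(&p, align, len) != 0) {
    return nullptr;
  }
  return p;
}
```

With direct IO disabled, none of these checks apply, which is why the padding and rebuild cost discussed in the quoted thread would disappear in that mode.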
On 11/20/15, 5:29 AM, "ceph-devel-owner@vger.kernel.org on behalf of
Haomai Wang" <ceph-devel-owner@vger.kernel.org on behalf of
haomaiwang@gmail.com> wrote:
>On Fri, Nov 20, 2015 at 9:08 PM, Sage Weil <sage@newdream.net> wrote:
>> On Fri, 20 Nov 2015, Haomai Wang wrote:
>>> On Fri, Nov 20, 2015 at 7:41 PM, Sage Weil <sage@newdream.net> wrote:
>>> > On Fri, 20 Nov 2015, changtao381 wrote:
>>> >> Hi All,
>>> >>
>>> >> Thanks for your reply!
>>> >>
>>> >> If it is direct IO + async IO that requires the alignment, each journal
>>> >> entry shouldn't have to be PAGE-aligned individually, because many
>>> >> journal entries may be written at one time.
>>> >
>>> > We also want to avoid copying the data around in memory to change the
>>> > alignment. The messenger takes care to read data off the wire into
>>> > buffers with the correct alignment so that we can later use them for
>>> > direct-io.
>>> >
>>> > If you're worried about the small io case, I think this is just a
>>> > matter of setting a threshold for small ios so that we don't bother
>>> > with all of the padding when the memory copy isn't that expensive.
>>> > But... given that we have a header *and* footer in the journal format
>>> > and almost all IOs are 4k multiples I think it'd save you a single 4k
>>> > block at most.
>>> >
>>> > (Also, I thought we already did something like this, but perhaps not!)
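
The "single 4k block at most" estimate follows from simple rounding arithmetic: padding an entry up to the next block boundary can never add a full block. A rough sketch, assuming a 4096-byte block and purely hypothetical header and footer sizes (the real journal format is not reproduced here); `prefer_copy` is likewise a hypothetical name for the small-IO threshold idea:

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kBlock = 4096;      // assumed alignment unit
constexpr std::size_t kHeaderBytes = 40;  // hypothetical header size
constexpr std::size_t kFooterBytes = 40;  // hypothetical footer size

// Rounds n up to the next multiple of align.
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
  return (n + align - 1) / align * align;
}

// Bytes an entry needs when header, payload and footer are packed tightly.
constexpr std::size_t packed_size(std::size_t payload) {
  return kHeaderBytes + payload + kFooterBytes;
}

// Bytes the same entry occupies when it is padded out to a block boundary.
constexpr std::size_t padded_size(std::size_t payload) {
  return round_up(packed_size(payload), kBlock);
}

// Space spent on padding; always strictly less than kBlock.
constexpr std::size_t padding_overhead(std::size_t payload) {
  return padded_size(payload) - packed_size(payload);
}

// Hypothetical threshold rule: below the threshold, copying the payload is
// assumed to be cheap enough that the padding is not worth it.
constexpr bool prefer_copy(std::size_t payload, std::size_t threshold) {
  return payload < threshold;
}

static_assert(padding_overhead(4096) == 4016, "one 4k payload wastes < 1 block");
```

With these assumed sizes, a 4096-byte payload packs into 4176 bytes and pads to 8192, so the padding costs 4016 bytes, just under one block.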
>>>
>>> Hmm, based on our recent tests, the data from the messenger is aligned.
>>> But the encoded data (pglog, transaction) makes things worse. Before
>>> PR https://github.com/ceph/ceph/pull/6368 addressed this, we could even
>>> get 14 ptrs in the bufferlist passed into filejournal, so we had to
>>> rebuild it every time within the filejournal thread. With
>>> PR https://github.com/ceph/ceph/pull/6484 we try to do the rebuild
>>> outside the filejournal thread, of which there is only one.
>>
>> buffer::list::rebuild_page_aligned() should only copy/rebuild ptrs that
>> are unaligned, and leave aligned ones untouched. It looks like the
>> journal code is already doing this?
>
>Yes and no. For example, say we have a bufferlist that contains 2 ptrs:
>the first is unaligned, the second is aligned. The current implementation
>ignores the fact that the second one is aligned. Look at the code:
>
> void buffer::list::rebuild_aligned_size_and_memory(unsigned align_size,
>                                                     unsigned align_memory)
> {
>........
>     list unaligned;
>     unsigned offset = 0;
>     do {
>       /*cout << " segment " << (void*)p->c_str()
>              << " offset " << ((unsigned long)p->c_str() & (align - 1))
>              << " length " << p->length() << " " << (p->length() & (align - 1))
>              << " overall offset " << offset << " " << (offset & (align - 1))
>              << " not ok" << std::endl;
>       */
>       offset += p->length();
>       unaligned.push_back(*p);
>       _buffers.erase(p++);
>     } while (p != _buffers.end() &&
>              (!p->is_aligned(align_memory) ||
>               !p->is_n_align_sized(align_size) ||
>               (offset % align_size)));
>((((((((((((( the loop also checks the offset alignment, so after meeting
>the first unaligned ptr it keeps absorbing the following ptrs, aligned or
>not, until the offset is aligned again ))))))))))))))
>
>     if (!(unaligned.is_contiguous() &&
>           unaligned._buffers.front().is_aligned(align_memory))) {
>       ptr nb(buffer::create_aligned(unaligned._len, align_memory));
>       unaligned.rebuild(nb);
>       _memcopy_count += unaligned._len;
>     }
>     _buffers.insert(p, unaligned._buffers.front());
>   }
> }
>
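
The two-ptr case described above can be traced with a small standalone model of the grouping loop. This is only a sketch, not the Ceph code: `Seg` and `bytes_copied` are hypothetical names, the elided outer loop is assumed to skip segments that are already aligned in both address and size, and every segment is assumed to live in its own allocation, so a group of two or more segments is never contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for one ptr in a bufferlist: only address and length matter.
struct Seg {
  std::uint64_t addr;
  std::size_t len;
};

inline bool mem_aligned(const Seg& s, std::size_t align) {
  return s.addr % align == 0;
}

inline bool size_aligned(const Seg& s, std::size_t align) {
  return s.len % align == 0;
}

// Returns how many bytes the grouping logic would memcpy for this list.
inline std::size_t bytes_copied(const std::vector<Seg>& segs,
                                std::size_t align) {
  assert(align != 0);
  std::size_t copied = 0;
  std::size_t i = 0;
  while (i < segs.size()) {
    // Assumed behavior of the elided part: keep well-aligned segments as-is.
    if (mem_aligned(segs[i], align) && size_aligned(segs[i], align)) {
      ++i;
      continue;
    }
    const std::size_t first = i;
    std::size_t offset = 0;
    // Same stop condition as the quoted do/while: the group keeps growing
    // while the next segment is misaligned OR the running offset is.
    do {
      offset += segs[i].len;
      ++i;
    } while (i < segs.size() &&
             (!mem_aligned(segs[i], align) ||
              !size_aligned(segs[i], align) ||
              (offset % align)));
    // A single segment with an aligned address needs no copy; any other
    // group is rebuilt into one new aligned buffer (model assumption).
    const bool single = (i - first == 1);
    if (!(single && mem_aligned(segs[first], align))) {
      copied += offset;
    }
  }
  return copied;
}
```

For a 100-byte unaligned segment followed by a 4096-byte aligned one, with a 4096-byte alignment, the running offset is 100 after the first segment, so the loop also absorbs the aligned segment and the model reports 4196 bytes copied rather than 100.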
>
>>
>> sage
>
>
>
>--
>Best Regards,
>
>Wheat
Thread overview: 10+ messages
2015-11-20 8:33 journal alignment changtao381
2015-11-20 8:41 ` Haomai Wang
2015-11-20 8:42 ` Piotr.Dalek
2015-11-20 9:12 ` 答复: " changtao381
2015-11-20 9:16 ` 池信泽
2015-11-20 9:19 ` 池信泽
[not found] ` <alpine.DEB.2.00.1511200339470.25088@cobra.newdream.net>
2015-11-20 13:03 ` Haomai Wang
[not found] ` <alpine.DEB.2.00.1511200507120.24879@cobra.newdream.net>
2015-11-20 13:29 ` Haomai Wang
2015-11-20 19:53 ` Evgeniy Firsov [this message]
2015-11-23 16:59 ` Haomai Wang