From: majianpeng <majianpeng@gmail.com>
To: sage <sage@inktank.com>
Cc: "Yan, Zheng" <ukernel@gmail.com>,
ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Re: question about striped_read
Date: Wed, 31 Jul 2013 08:44:56 +0800
Message-ID: <201307310844488455702@gmail.com>
In-Reply-To: alpine.DEB.2.00.1307301738350.14027@cobra.newdream.net
>On Wed, 31 Jul 2013, majianpeng wrote:
>> >On Tue, Jul 30, 2013 at 7:41 PM, majianpeng <majianpeng@gmail.com> wrote:
[snip]
>
>For ceph_osdc_readpages(),
>
>> A: ret = ENOENT
>
According to the original code, we should zero the area in this case.
Why?
Thanks!
Jianpeng Ma
>The object does not exist.
>
>> B: ret = 0
>
>The object exists but we read 0 bytes, which means we are past EOF or the
>object has size 0 bytes. Either way, we are either in a hole or past EOF.
>
>sage
>
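The two cases above can be restated in code. This is a minimal userspace
sketch, not kernel code: the enum and both function names are made up for
illustration. The one convention assumed from the kernel is that errors
come back as negative errno values, so case A shows up as -ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcome labels; these are not kernel identifiers. */
enum read_outcome {
	READ_ERROR,	/* negative errno other than -ENOENT */
	READ_NO_OBJECT,	/* case A: the object does not exist */
	READ_EMPTY,	/* case B: object exists, 0 bytes read */
	READ_SHORT,	/* 0 < ret < want */
	READ_FULL	/* ret == want */
};

/* Classify the return value of one object read that asked for `want` bytes. */
static enum read_outcome classify_read(long ret, size_t want)
{
	assert(want > 0);

	if (ret == -ENOENT)
		return READ_NO_OBJECT;
	if (ret < 0)
		return READ_ERROR;
	if (ret == 0)
		return READ_EMPTY;
	return (size_t)ret < want ? READ_SHORT : READ_FULL;
}

/* In both case A and case B the extent holds no data: a hole or past EOF. */
static int extent_has_no_data(enum read_outcome o)
{
	return o == READ_NO_OBJECT || o == READ_EMPTY;
}
```

Whether such an extent should be zero-filled (hole) or should end the read
(EOF) cannot be decided from the return value alone; that is what the file
size is needed for.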
>>
>> Only once we know this can we handle it correctly.
>> Sage, can you explain what these mean in detail?
>>
>> Thanks!
>> Jianpeng Ma
>>
>> >Regards
>> >Yan, Zheng
>> >
>> >
>> >>>> But I think I will add a hit_hole parameter. It will make the code easier to understand.
>> >>>>
>> >>>
>> >>> I think 'was_short' is equivalent to 'hit_hole'.
>> >>>
>> >[snip]
>> Thanks!
>> Jianpeng Ma
Thanks!
Jianpeng Ma
>On Wed, 31 Jul 2013, majianpeng wrote:
>> >On Tue, Jul 30, 2013 at 7:41 PM, majianpeng <majianpeng@gmail.com> wrote:
>> >>>>
>> >>>>>dd if=/dev/urandom bs=1M count=2 of=file_with_holes
>> >>>>>dd if=/dev/urandom bs=1M count=2 seek=4 of=file_with_holes conv=notrunc
>> >>>>>dd if=file_with_holes bs=8M >/dev/null
>> >>>>>
>> >>>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> >>>> index 2ddf061..22a98e5 100644
>> >>>> --- a/fs/ceph/file.c
>> >>>> +++ b/fs/ceph/file.c
>> >>>> @@ -349,17 +349,17 @@ more:
>> >>>>  	dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read,
>> >>>>  	     ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : "");
>> >>>>
>> >>>> -	if (ret > 0) {
>> >>>> -		int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT;
>> >>>> +	if (ret >= 0) {
>> >>>> +		int didpages = (page_align + this_len) >> PAGE_CACHE_SHIFT;
>> >>>>
>> >>>> -		if (read < pos - off) {
>> >>>> -			dout(" zero gap %llu to %llu\n", off + read, pos);
>> >>>> -			ceph_zero_page_vector_range(page_align + read,
>> >>>> -						    pos - off - read, pages);
>> >>>> +		if (was_short) {
>> >>>> +			dout(" zero gap %llu to %llu\n", pos + ret, pos + this_len);
>> >>>> +			ceph_zero_page_vector_range(page_align + ret,
>> >>>> +						    this_len - ret, pages);
>> >>>>  		}
>> >>>> -		pos += ret;
>> >>>> +		pos += this_len;
>> >>>>  		read = pos - off;
>> >>>> -		left -= ret;
>> >>>> +		left -= this_len;
>> >>>>  		page_pos += didpages;
>> >>>>  		pages_left -= didpages;
>> >>>>
>> >>>> This patch handles those cases. It only adds the ret == 0 case to the 'ret > 0' check.
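The bookkeeping in the quoted patch can be tried out in userspace. The
sketch below is an assumption-laden simplification: a flat byte buffer
stands in for the page vector, memset() stands in for
ceph_zero_page_vector_range(), and struct read_cursor and consume_extent()
are hypothetical names. It shows only the rule the patch introduces: zero
the part of the extent the object did not supply, then advance by the full
requested length rather than by the bytes returned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the loop state (pos, left) in the patch. */
struct read_cursor {
	size_t pos;	/* offset of the next byte to fill in buf */
	size_t left;	/* bytes still requested */
};

/*
 * Account for one object read that asked for this_len bytes at c->pos
 * and got ret bytes back (ret >= 0). Bytes the object did not supply
 * are zeroed, and the cursor advances by the full this_len.
 */
static void consume_extent(unsigned char *buf, struct read_cursor *c,
			   size_t this_len, size_t ret)
{
	assert(ret <= this_len && this_len <= c->left);

	if (ret < this_len)	/* the "was_short" case, including ret == 0 */
		memset(buf + c->pos + ret, 0, this_len - ret);

	c->pos += this_len;
	c->left -= this_len;
}
```

Because the cursor always moves by this_len, this rule on its own never
stops early; a read that runs past the end of the file is padded with
zeros unless something else bounds it.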
>> >>>
>> >>>Maybe we should add an i_size check: stop reading the next stripe
>> >>>object when 'pos > i_size'.
>> >>>
>> >> I think we can't do this because i_size may be smaller than the real size in ceph.
>> >>
>> >ceph_aio_read() calls ceph_do_getattr() when 'checkeof = true'; that
>> >handles this case.
>> >
>> >We must do an i_size check in striped_read(); otherwise a user program
>> >always gets as much data as it requests. For example:
>> >
>> >dd if=/dev/urandom bs=1M count=1 of=file_with_holes
>> >dd if=file_with_holes bs=64M iflag=direct of=/dev/null
>> >
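The effect of the proposed check on the dd example above can be sketched
as follows. clamp_to_i_size() is a hypothetical helper, not a kernel
function, and it assumes the i_size passed in is already up to date (the
concern raised earlier in the thread is that a cached i_size may be
stale).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many of the `len` bytes requested at offset
 * `off` may be returned for a file whose size is i_size.
 */
static uint64_t clamp_to_i_size(uint64_t off, uint64_t len, uint64_t i_size)
{
	uint64_t n = 0;

	if (off < i_size)
		n = len < i_size - off ? len : i_size - off;

	assert(n <= len);
	return n;
}
```

For the 1M file above, a 64M read at offset 0 is cut down to 1M, and any
read starting at or beyond 1M returns 0 bytes instead of zero-filled data.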
>> Before doing that, we must know the meaning of the return value.
>
>For ceph_osdc_readpages(),
>
>> A: ret = ENOENT
>
>The object does not exist.
>
>> B: ret = 0
>
>The object exists but we read 0 bytes, which means we are past EOF or the
>object has size 0 bytes. Either way, we are either in a hole or past EOF.
>
>sage
>
>>
>> Only once we know this can we handle it correctly.
>> Sage, can you explain what these mean in detail?
>>
>> Thanks!
>> Jianpeng Ma
>>
>> >Regards
>> >Yan, Zheng
>> >
>> >
>> >>>> But I think I will add a hit_hole parameter. It will make the code easier to understand.
>> >>>>
>> >>>
>> >>> I think 'was_short' is equivalent to 'hit_hole'.
>> >>>
>> >[snip]
>> Thanks!
>> Jianpeng Ma