From: Luis Henriques <lhenriques@suse.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>, Sage Weil <sage@redhat.com>,
ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] ceph: fix directories inode i_blkbits initialization
Date: Tue, 23 Jul 2019 19:31:00 +0100
Message-ID: <87o91k61sr.fsf@suse.com>
In-Reply-To: <c657b0d65acd5e8bc9d5d726d68e2ad1fff38b51.camel@kernel.org> (Jeff Layton's message of "Tue, 23 Jul 2019 13:18:31 -0400")
"Jeff Layton" <jlayton@kernel.org> writes:
> On Tue, 2019-07-23 at 16:50 +0100, Luis Henriques wrote:
>> When filling an inode with info from the MDS, i_blkbits is being
>> initialized using fl_stripe_unit, which contains the stripe unit in
>> bytes. Unfortunately, this doesn't make sense for directories as they
>> have fl_stripe_unit set to '0'. This means that i_blkbits will be set
>> to 0xff, causing an UBSAN undefined behaviour in i_blocksize():
>>
>> UBSAN: Undefined behaviour in ./include/linux/fs.h:731:12
>> shift exponent 255 is too large for 32-bit type 'int'
>>
>> Fix this by initializing i_blkbits to CEPH_BLOCK_SHIFT if fl_stripe_unit
>> is zero.
>>
>> Signed-off-by: Luis Henriques <lhenriques@suse.com>
>> ---
>> fs/ceph/inode.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> Hi Jeff,
>>
>> To be honest, I'm not sure CEPH_BLOCK_SHIFT is the right value to use
>> here, but for sure the one currently being used isn't correct if the
>> inode is a directory. Using stripe units seems to be a bug that has
>> been there since the beginning, but it definitely became a bigger problem
>> after commit 69448867abcb ("fs: shave 8 bytes off of struct inode").
>>
>> This fix could also be moved into the 'switch' statement later in that
>> function, in the S_IFDIR case, similar to commit 5ba72e607cdb ("ceph:
>> set special inode's blocksize to page size"). Let me know which version
>> you would prefer.
>>
>
> What happens with (e.g.) named pipes or symlinks? Do those inodes also
> get this bogus value? Assuming that they do, I'd probably prefer this
> patch since it'd fix things for all inode types, not just directories.

I tested symlinks and they seem to be handled correctly (i.e. the stripe
unit seems to be the same as the target file's). Regarding pipes, I
didn't test them, but from the code i_blkbits should be set to PAGE_SHIFT
(see the above-mentioned commit 5ba72e607cdb).

Anyway, I can change the code to do *all* the i_blkbits initialization
inside the switch statement. Something like:

	switch (inode->i_mode & S_IFMT) {
	case S_IFIFO:
	case S_IFBLK:
	case S_IFCHR:
	case S_IFSOCK:
		inode->i_blkbits = PAGE_SHIFT;
		...
	case S_IFREG:
		inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
		...
	case S_IFLNK:
		inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
		...
	case S_IFDIR:
		inode->i_blkbits = CEPH_BLOCK_SHIFT;
		...
	default:
		pr_err();
		...
	}

This would add some code duplication (the S_IFREG and S_IFLNK cases), but
maybe it's a bit clearer. The other option would obviously be to leave
the initialization outside the switch and only override the i_blkbits
value in the S_IF{IFO,BLK,CHR,SOCK,DIR} cases.

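To make that second option concrete, here is a rough userspace sketch,
not the actual fs/ceph/inode.c code. Everything prefixed with sketch_ or
SK_ is a made-up name for illustration, and the two shift values are
assumptions (12 for a 4K page, 22 for a 4M Ceph block). It also shows
where the 0xff comes from: fls(0) is 0, so fls(0) - 1 is -1, which wraps
to 255 once stored in an 8-bit i_blkbits.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_PAGE_SHIFT       12
#define SKETCH_CEPH_BLOCK_SHIFT 22

/* Hypothetical stand-in for (inode->i_mode & S_IFMT). */
enum sketch_type { SK_REG, SK_LNK, SK_DIR, SK_SPECIAL };

/*
 * Userspace stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, or 0 when no bit is set.
 */
static int sketch_fls(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Initialize from the stripe unit first (as the current code does),
 * then override the value for the inode types where the stripe unit
 * is meaningless.
 */
static uint8_t sketch_blkbits(enum sketch_type type, uint32_t stripe_unit)
{
	/* Wraps to 0xff when stripe_unit == 0. */
	uint8_t bits = (uint8_t)(sketch_fls(stripe_unit) - 1);

	switch (type) {
	case SK_SPECIAL:	/* fifo, block/char device, socket */
		bits = SKETCH_PAGE_SHIFT;
		break;
	case SK_DIR:		/* directories report a zero stripe unit */
		bits = SKETCH_CEPH_BLOCK_SHIFT;
		break;
	default:		/* regular files and symlinks keep the layout value */
		break;
	}
	return bits;
}
```

With these assumed values, a regular file with a 4M stripe unit ends up
with 22, and a directory with a zero stripe unit gets the fallback
instead of the wrapped 0xff.
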
Cheers,
--
Luis
>
>> Cheers,
>> --
>> Luis
>>
>> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
>> index 791f84a13bb8..0e6d6db848b7 100644
>> --- a/fs/ceph/inode.c
>> +++ b/fs/ceph/inode.c
>> @@ -800,7 +800,12 @@ static int fill_inode(struct inode *inode, struct page *locked_page,
>>
>>  	/* update inode */
>>  	inode->i_rdev = le32_to_cpu(info->rdev);
>> -	inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
>> +	/* directories have fl_stripe_unit set to zero */
>> +	if (le32_to_cpu(info->layout.fl_stripe_unit))
>> +		inode->i_blkbits =
>> +			fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
>> +	else
>> +		inode->i_blkbits = CEPH_BLOCK_SHIFT;
>>
>>  	__ceph_update_quota(ci, iinfo->max_bytes, iinfo->max_files);
>>