From: Luis Henriques <lhenriques@suse.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>, Sage Weil <sage@redhat.com>,
	ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] ceph: fix directories inode i_blkbits initialization
Date: Wed, 24 Jul 2019 11:04:58 +0100
Message-ID: <874l3b694l.fsf@suse.com>
In-Reply-To: <87o91k61sr.fsf@suse.com> (Luis Henriques's message of "Tue, 23 Jul 2019 19:31:00 +0100")

Luis Henriques <lhenriques@suse.com> writes:

> "Jeff Layton" <jlayton@kernel.org> writes:
>
>> On Tue, 2019-07-23 at 16:50 +0100, Luis Henriques wrote:
>>> When filling an inode with info from the MDS, i_blkbits is being
>>> initialized using fl_stripe_unit, which contains the stripe unit in
>>> bytes.  Unfortunately, this doesn't make sense for directories as they
>>> have fl_stripe_unit set to '0'.  This means that i_blkbits will be set
>>> to 0xff, triggering a UBSAN undefined behaviour report in i_blocksize():
>>> 
>>>   UBSAN: Undefined behaviour in ./include/linux/fs.h:731:12
>>>   shift exponent 255 is too large for 32-bit type 'int'
>>> 
>>> Fix this by initializing i_blkbits to CEPH_BLOCK_SHIFT if fl_stripe_unit
>>> is zero.
>>> 
>>> Signed-off-by: Luis Henriques <lhenriques@suse.com>
>>> ---
>>>  fs/ceph/inode.c | 7 ++++++-
>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>> 
>>> Hi Jeff,
>>> 
>>> To be honest, I'm not sure CEPH_BLOCK_SHIFT is the right value to use
>>> here, but for sure the one currently being used isn't correct if the
>>> inode is a directory.  Using stripe units seems to be a bug that has
>>> been there since the beginning, but it definitely became a bigger problem
>>> after commit 69448867abcb ("fs: shave 8 bytes off of struct inode").
>>> 
>>> This fix could also be moved into the 'switch' statement later in that
>>> function, in the S_IFDIR case, similar to commit 5ba72e607cdb ("ceph:
>>> set special inode's blocksize to page size").  Let me know which version
>>> you would prefer.
>>> 
>>
>> What happens with (e.g.) named pipes or symlinks? Do those inodes also
>> get this bogus value? Assuming that they do, I'd probably prefer this
>> patch since it'd fix things for all inode types, not just directories.
>
> I tested symlinks and they seem to be handled correctly (i.e. the stripe
> unit seems to be the same as the target file's).  Regarding pipes, I
> didn't test them, but from the code it should be set to PAGE_SHIFT (see
> the above-mentioned commit 5ba72e607cdb).

Ok, after looking closer at the other inode types and running a few
tests with extra debug code, it all seems to be sane -- only directories
(root dir is an exception) will cause problems with i_blkbits being set
to a bogus value.  So, I'm sticking with my original RFC patch approach,
which should be easy to apply to stable kernels.
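
For readers following along, here is a minimal userspace sketch of the
arithmetic behind the bug and of the fix described in the RFC.  It is not
the actual diff.  The names demo_fls(), blkbits_unpatched() and
blkbits_patched() are made up for illustration, DEMO_CEPH_BLOCK_SHIFT is
assumed to be 22 (4 MB), and i_blkbits is modelled as an 8-bit field, as
the 0xff in the UBSAN report suggests.

```c
#include <stdint.h>

/* Assumed value for this sketch only (22 => 4 MB blocks). */
#define DEMO_CEPH_BLOCK_SHIFT 22

/*
 * Userspace stand-in for the kernel's fls(): returns the 1-based index
 * of the highest set bit, or 0 when x is 0.
 */
static int demo_fls(uint32_t x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/*
 * Behaviour described in the report: with a stripe unit of 0,
 * fls(0) - 1 is -1, which becomes 0xff once stored in an 8-bit field.
 */
static uint8_t blkbits_unpatched(uint32_t stripe_unit)
{
	return (uint8_t)(demo_fls(stripe_unit) - 1);
}

/*
 * Approach described in the RFC: fall back to a fixed shift when the
 * stripe unit is zero, otherwise derive the shift from the stripe unit.
 */
static uint8_t blkbits_patched(uint32_t stripe_unit)
{
	if (stripe_unit == 0)
		return DEMO_CEPH_BLOCK_SHIFT;
	return (uint8_t)(demo_fls(stripe_unit) - 1);
}
```

With these helpers, a zero stripe unit yields 0xff from the unpatched
function and 22 from the patched one, while a 4 MB stripe unit (4194304)
yields 22 from both.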

Cheers,
-- 
Luis

>
> Anyway, I can change the code to do *all* the i_blkbits initialization
> inside the switch statement.  Something like:
>
> switch (inode->i_mode & S_IFMT) {
> case S_IFIFO:
> case S_IFBLK:
> case S_IFCHR:
> case S_IFSOCK:
> 	inode->i_blkbits = PAGE_SHIFT;
> 	...
> case S_IFREG:
> 	inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
> 	...
> case S_IFLNK:
> 	inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
> 	...
> case S_IFDIR:
> 	inode->i_blkbits = CEPH_BLOCK_SHIFT;
> 	...
> default:
> 	pr_err();
> 	...
> }
>
> This would add some code duplication (S_IFREG and S_IFLNK cases), but
> maybe it's a bit clearer.  The other option would obviously be to
> leave the initialization outside the switch and only change the
> i_blkbits value in the S_IF{IFO,BLK,CHR,SOCK,DIR} cases.
>
> Cheers,
