linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: Amir Goldstein <amir73il@gmail.com>,
	Stefan Berger <stefanb@linux.ibm.com>,
	Paul Moore <paul@paul-moore.com>,
	zohar@linux.ibm.com, linux-integrity@vger.kernel.org,
	miklos@szeredi.hu, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org
Subject: Re: [PATCH] overlayfs: Trigger file re-evaluation by IMA / EVM after writes
Date: Tue, 11 Apr 2023 05:32:11 -0400	[thread overview]
Message-ID: <8f5cc243398d5bae731a26e674bdeff465da3968.camel@kernel.org> (raw)
In-Reply-To: <20230411-umgewandelt-gastgewerbe-870e4170781c@brauner>

On Tue, 2023-04-11 at 10:38 +0200, Christian Brauner wrote:
> On Sun, Apr 09, 2023 at 06:12:09PM -0400, Jeff Layton wrote:
> > On Sun, 2023-04-09 at 17:22 +0200, Christian Brauner wrote:
> > > On Fri, Apr 07, 2023 at 09:29:29AM -0400, Jeff Layton wrote:
> > > > > > > > 
> > > > > > > > I would ditch the original proposal in favor of this 2-line patch shown here:
> > > > > > > > 
> > > > > > > > https://lore.kernel.org/linux-integrity/a95f62ed-8b8a-38e5-e468-ecbde3b221af@linux.ibm.com/T/#m3bd047c6e5c8200df1d273c0ad551c645dd43232
> > > > > 
> > > > > We should cool it with the quick hacks to fix things. :)
> > > > > 
> > > > 
> > > > Yeah. It might fix this specific testcase, but I think the way it uses
> > > > the i_version is "gameable" in other situations. Then again, I don't
> > > > know a lot about IMA in this regard.
> > > > 
> > > > When is it expected to remeasure? If it's only expected to remeasure on
> > > > a close(), then that's one thing. That would be a weird design though.
> > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > > Ok, I think I get it. IMA is trying to use the i_version from the
> > > > > > > overlayfs inode.
> > > > > > > 
> > > > > > > I suspect that the real problem here is that IMA is just doing a bare
> > > > > > > inode_query_iversion. Really, we ought to make IMA call
> > > > > > > vfs_getattr_nosec (or something like it) to query the getattr routine in
> > > > > > > the upper layer. Then overlayfs could just propagate the results from
> > > > > > > the upper layer in its response.
> > > > > > > 
> > > > > > > That sort of design may also eventually help IMA work properly with more
> > > > > > > exotic filesystems, like NFS or Ceph.
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > Maybe something like this? It builds for me but I haven't tested it. It
> > > > > > looks like overlayfs already should report the upper layer's i_version
> > > > > > in getattr, though I haven't tested that either:
> > > > > > 
> > > > > > -----------------------8<---------------------------
> > > > > > 
> > > > > > [PATCH] IMA: use vfs_getattr_nosec to get the i_version
> > > > > > 
> > > > > > IMA currently accesses the i_version out of the inode directly when it
> > > > > > does a measurement. This is fine for most simple filesystems, but can be
> > > > > > problematic with more complex setups (e.g. overlayfs).
> > > > > > 
> > > > > > Make IMA instead call vfs_getattr_nosec to get this info. This allows
> > > > > > the filesystem to determine whether and how to report the i_version, and
> > > > > > should allow IMA to work properly with a broader class of filesystems in
> > > > > > the future.
> > > > > > 
> > > > > > Reported-by: Stefan Berger <stefanb@linux.ibm.com>
> > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > > > ---
> > > > > 
> > > > > So, I think we want both; we want the ovl_copyattr() and the
> > > > > vfs_getattr_nosec() change:
> > > > > 
> > > > > (1) overlayfs should copy up the inode version in ovl_copyattr(). That
> > > > >     is in line what we do with all other inode attributes. IOW, the
> > > > >     overlayfs inode's i_version counter should aim to mirror the
> > > > >     relevant layer's i_version counter. I wouldn't know why that
> > > > >     shouldn't be the case. Asking the other way around there doesn't
> > > > >     seem to be any use for overlayfs inodes to have an i_version that
> > > > >     isn't just mirroring the relevant layer's i_version.
> > > > 
> > > > It's less than ideal to do this IMO, particularly with an IS_I_VERSION
> > > > inode.
> > > > 
> > > > You can't just copy up the value from the upper. You'll need to call
> > > > inode_query_iversion(upper_inode), which will flag the upper inode for a
> > > > logged i_version update on the next write. IOW, this could create some
> > > > (probably minor) metadata write amplification in the upper layer inode
> > > > with IS_I_VERSION inodes.
> > > 
> > > I'm likely just missing context and am curious about this so bear with me. Why
> > > do we need to flag the upper inode for a logged i_version update? Any required
> > > i_version interactions should've already happened when overlayfs called into
> > > the upper layer. So all that's left to do is for overlayfs' to mirror the
> > > i_version value after the upper operation has returned.
> > 
> > > ovl_copyattr() - which copies the inode attributes - is always called after the
> > > operation on the upper inode has finished. So the additional query seems odd at
> > > first glance. But there might well be a good reason for it. In my naive
> > > approach I would've thought that sm along the lines of:
> > > 
> > > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > > index 923d66d131c1..8b089035b9b3 100644
> > > --- a/fs/overlayfs/util.c
> > > +++ b/fs/overlayfs/util.c
> > > @@ -1119,4 +1119,5 @@ void ovl_copyattr(struct inode *inode)
> > >         inode->i_mtime = realinode->i_mtime;
> > >         inode->i_ctime = realinode->i_ctime;
> > >         i_size_write(inode, i_size_read(realinode));
> > > +       inode_set_iversion_raw(inode, inode_peek_iversion_raw(realinode));
> > >  }
> > > 
> > > would've been sufficient.
> > > 
> > 
> > Nope, because then you wouldn't get any updates to i_version after that
> > point.
> > 
> > Note that with an IS_I_VERSION inode we only update the i_version when
> > there has been a query since the last update. What you're doing above is
> > circumventing that mechanism. You'll get the i_version at the time of of
> > the ovl_copyattr, but there won't be any updates of it after that point
> > because the QUERIED bit won't end up being set on realinode.
> 
> I get all that.
> But my understanding had been that the i_version value at the time of
> ovl_copyattr() would be correct. Because when ovl_copyattr() is called
> the expected i_version change will have been done in the relevant layer
> includig raising the QUERIED bit. Since the layers are not allowed to be
> changed outside of the overlayfs mount any change to them can only
> originate from overlayfs which would necessarily call ovl_copyattr()
> again. IOW, overlayfs would by virtue of its implementation keep the
> i_version value in sync.
>
> Overlayfs wouldn't even raise SB_I_VERSION. It would indeed just be a
> cache of i_version of the relevant layer.
> 
> > 
> > 
> > > Since overlayfs' does explicitly disallow changes to the upper and lower trees
> > > while overlayfs is mounted it seems intuitive that it should just mirror the
> > > relevant layer's i_version.
> > > 
> > > 
> > > If we don't do this, then we should probably document that i_version doesn't
> > > have a meaning yet for the inodes of stacking filesystems.
> > > 
> > 
> > Trying to cache the i_version is counterproductive, IMO, at least with
> > an IS_I_VERSION inode.
> > 
> > The problem is that a query against the i_version has a side-effect. It
> > has to (atomically) mark the inode for an update on the next change.
> > 
> > If you try to cache that value, you'll likely end up doing more queries
> > than you really need to (because you'll need to keep the cache up to
> > date) and you'll have an i_version that will necessarily lag the one in
> > the upper layer inode.
> > 
> > The whole point of the change attribute is to get the value as it is at
> > this very moment so we can check whether there have been changes. A
> > laggy value is not terribly useful.
> > 
> > Overlayfs should just always call the upper layer's ->getattr to get the
> > version. I wouldn't even bother copying it up in the first place. Doing
> > so is just encouraging someone to try use the value in the overlayfs
> > inode, when they really need to go through ->getattr and get the one
> > from the upper layer.
> 
> That seems reasonable to me. I read this as an agreeing with my earlier
> suggestion to document that i_version doesn't have a meaning for the
> inodes of stacking filesystems and that we should spell out that
> vfs_getattr()/->getattr() needs to be used to interact with i_version.
> 

It really has no meaning in the stacked filesystem's _inode_. The only
i_version that has any meaning in a (simple) stacking setup is the upper
layer inode.

> We need to explain to subsystems such as IMA somwhere what the correct
> way to query i_version agnostically is; independent of filesystem
> implementation details.
> 
> Looking at IMA, it queries the i_version directly without checking
> whether it's an IS_I_VERSION() inode first. This might make a
> difference.
> 

IMA should just use getattr. That allows the filesystem to present the
i_version to the caller as it sees fit. Fetching i_version directly
without testing for IS_I_VERSION is wrong, because you don't know what
that field contains, or whether the fs supports it at all.

> Afaict, filesystems that persist i_version to disk automatically raise
> SB_I_VERSION. I would guess that it be considered a bug if a filesystem
> would persist i_version to disk and not raise SB_I_VERSION. If so IMA
> should probably be made to check for IS_I_VERSION() and it will probably
> get that by switching to vfs_getattr_nosec().

Not quite. SB_I_VERSION tells the vfs that the filesystem wants the
kernel to manage the increment of the i_version for it. The filesystem
is still responsible for persisting that value to disk (if appropriate).

Switching to vfs_getattr_nosec should make it so IMA doesn't need to
worry about the gory details of all of this. If STATX_CHANGE_COOKIE is
set in the response, then it can trust that value. Otherwise, it's no
good.

-- 
Jeff Layton <jlayton@kernel.org>

  reply	other threads:[~2023-04-11  9:32 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-05 17:14 [PATCH] overlayfs: Trigger file re-evaluation by IMA / EVM after writes Stefan Berger
2023-04-06 10:26 ` Christian Brauner
2023-04-06 14:05   ` Paul Moore
2023-04-06 14:20     ` Stefan Berger
2023-04-06 14:36       ` Paul Moore
2023-04-06 15:01         ` Christian Brauner
2023-04-06 18:46           ` Jeff Layton
2023-04-06 19:11             ` Stefan Berger
2023-04-06 19:37               ` Jeff Layton
2023-04-06 20:22                 ` Stefan Berger
2023-04-06 21:24                   ` Jeff Layton
2023-04-06 21:58                     ` Stefan Berger
2023-04-06 22:09                       ` Jeff Layton
2023-04-06 22:04                     ` Jeff Layton
2023-04-06 22:27                       ` Stefan Berger
2023-04-07  8:31                       ` Christian Brauner
2023-04-07 13:29                         ` Jeff Layton
2023-04-09 15:22                           ` Christian Brauner
2023-04-09 22:12                             ` Jeff Layton
2023-04-11  8:38                               ` Christian Brauner
2023-04-11  9:32                                 ` Jeff Layton [this message]
2023-04-11  9:49                                   ` Christian Brauner
2023-04-11 10:13                                     ` Jeff Layton
2023-04-11 14:08                                       ` Christian Brauner
2023-04-21 14:55                                 ` Mimi Zohar
2023-04-17  1:57                           ` Stefan Berger
2023-04-17  8:11                             ` Christian Brauner
2023-04-17 10:05                             ` Jeff Layton
2023-04-17 12:45                               ` Stefan Berger
2023-04-17 13:18                                 ` Jeff Layton
2023-04-21 14:43                           ` Mimi Zohar
2023-05-18 20:46                             ` Paul Moore
2023-05-18 20:50                               ` Mimi Zohar
2023-05-19 14:58                                 ` Paul Moore
2023-05-25 14:43                                   ` Mimi Zohar
2023-05-19 19:42                         ` Mimi Zohar
2023-05-20  9:15                           ` Amir Goldstein
2023-05-22 12:18                             ` Mimi Zohar
2023-05-22 14:00                               ` Amir Goldstein
2023-05-23 19:38                                 ` Mimi Zohar
2023-05-20  9:17                           ` Christian Brauner
2023-05-21 22:49                             ` Dave Chinner
2023-05-23 17:35                               ` Mimi Zohar
2023-04-17 14:07                       ` Stefan Berger
2023-04-07  6:42                   ` Amir Goldstein
2023-04-06 16:10         ` Stefan Berger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8f5cc243398d5bae731a26e674bdeff465da3968.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=paul@paul-moore.com \
    --cc=stefanb@linux.ibm.com \
    --cc=zohar@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).