[viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls]

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls]
@ 2025-10-31  8:06 Al Viro
  2025-10-31 14:45 ` Richard Guy Briggs
  2025-11-01 17:36 ` Paul Moore
  0 siblings, 2 replies; 3+ messages in thread
From: Al Viro @ 2025-10-31  8:06 UTC (permalink / raw)
  To: audit; +Cc: Linus Torvalds, linux-fsdevel, Paul Moore

OK, that's two misspellings of the list name already;-/

Al, deeply embarrassed and crawling to get some sleep...

----- Forwarded message from Al Viro <viro@zeniv.linux.org.uk> -----

Date: Fri, 31 Oct 2025 07:58:56 +0000
From: Al Viro <viro@zeniv.linux.org.uk>
To: linux-audit@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>, linux-fsdevel@vger.kernel.org, Paul Moore <paul@paul-moore.com>
Subject: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls

	FWIW, I've just noticed that a patch in the series I'd been
reordering had the following chunk:
@@ -1421,20 +1421,16 @@ static int do_sys_openat2(int dfd, const char __user *filename,
                          struct open_how *how)
 {
        struct open_flags op;
-       struct filename *tmp;
        int err, fd;
 
        err = build_open_flags(how, &op);
        if (unlikely(err))
                return err;
 
-       tmp = getname(filename);
-       if (IS_ERR(tmp))
-               return PTR_ERR(tmp);
-
        fd = get_unused_fd_flags(how->flags);
        if (likely(fd >= 0)) {
-               struct file *f = do_filp_open(dfd, tmp, &op);
+               struct filename *name __free(putname) = getname(filename);
+               struct file *f = do_filp_open(dfd, name, &op);
                if (IS_ERR(f)) {
                        put_unused_fd(fd);
                        fd = PTR_ERR(f);

	From the VFS or userland POV there's no problem - we would get a
different error reported e.g. in case when *both* EMFILE and ENAMETOOLONG
would be applicable, but that's perfectly fine.  However, from the audit
POV it changes behaviour.

	Consider behaviour of openat2(2).
1.  we do sanity checks on the last ('usize') argument.  If they
fail, we are done.
2.  we copy struct open_how from userland ('how' argument).
If copyin fails, we are done.
3.  we do sanity checks on how->flags, how->resolve and how->mode.
If they fail, we are done.
4.  we copy the pathname to be opened from userland ('filename' argument).
If that fails, or if the pathname is either empty or too long, we are done.
5.  we reserve an unused file descriptor.  If that fails, we are done.
6.  we allocate an empty struct file.  If that fails, we are done.
7.  we finally get around to the business - finding and opening the damn thing.
Which also can fail, of course.

	We are expected to be able to produce a record of failing
syscall.  If we fail on step 4, well, the lack of pathname to come with
the record is to be expected - we have failed to get it, after all.
The same goes for failures on steps 1..3 - we hadn't gotten around to
looking at the pathname yet, so there's no pathname to report.	What (if
anything) makes "insane how->flags" different from "we have too many
descriptors opened already"?  The contents of the pathname is equally
irrelevant in both cases.  Yet in the latter case (failure at step 5)
the pathname would get reported.  Do we need to preserve that behaviour?

	Because the patch quoted above would change it.  It puts the failure
to allocate a descriptor into the same situation as failures on steps 1..3.

	As far as I can see, there are three possible approaches:

1) if the current kernel imports the pathname before some check, that shall
always remain that way, no matter what.  Audit might be happy, but nobody
else would - we'll need to document that constraint and watch out for such
regressions.  And I'm pretty sure that over the years there had been
other such changes that went into mainline unnoticed.

2) reordering is acceptable.  Of course, the pathname import must happen
before we start using it, but that's the only real constraint.  That would
mean the least headache for everyone other than audit folks.

3) import the pathnames as early as possible.  It would mean a non-trivial
amount of churn, but it's at least a definite policy - validity of change
depends only on the resulting code, not the comparison with the earlier
state, as it would in case (1).  From QoI POV it's as nice as audit folks
could possibly ask, but it would cause quite a bit of churn to get there.
Not impossible to do, but I would rather not go there without a need.
Said that, struct filename handling is mostly a decent match to CLASS()
machinery, and all required churn wouldn't be hard to fold into conversion
to that.

	My preference would be (2), obviously.	However, it really depends
upon the kind of requirements audit users have.  Note that currently the
position of pathname import in the sequence is not documented anywhere,
so there's not much audit users can rely upon other than "the current
behaviour is such-and-such, let's hope it doesn't change"... ;-/

	Comments?


----- End forwarded message -----

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls]
  2025-10-31  8:06 [viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls] Al Viro
@ 2025-10-31 14:45 ` Richard Guy Briggs
  2025-11-01 17:36 ` Paul Moore
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Guy Briggs @ 2025-10-31 14:45 UTC (permalink / raw)
  To: Al Viro
  Cc: audit, Linus Torvalds, linux-fsdevel, Paul Moore,
	Linux-Audit Mailing List

On 2025-10-31 08:06, Al Viro wrote:
> OK, that's two misspellings of the list name already;-/

Adding the audit userspace list to get Steve Grubb's certification take on this.

> Al, deeply embarrassed and crawling to get some sleep...
> 
> ----- Forwarded message from Al Viro <viro@zeniv.linux.org.uk> -----
> 
> Date: Fri, 31 Oct 2025 07:58:56 +0000
> From: Al Viro <viro@zeniv.linux.org.uk>
> To: linux-audit@vger.kernel.org
> Cc: Linus Torvalds <torvalds@linux-foundation.org>, linux-fsdevel@vger.kernel.org, Paul Moore <paul@paul-moore.com>
> Subject: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls
> 
> 	FWIW, I've just noticed that a patch in the series I'd been
> reordering had the following chunk:
> @@ -1421,20 +1421,16 @@ static int do_sys_openat2(int dfd, const char __user *filename,
>                           struct open_how *how)
>  {
>         struct open_flags op;
> -       struct filename *tmp;
>         int err, fd;
>  
>         err = build_open_flags(how, &op);
>         if (unlikely(err))
>                 return err;
>  
> -       tmp = getname(filename);
> -       if (IS_ERR(tmp))
> -               return PTR_ERR(tmp);
> -
>         fd = get_unused_fd_flags(how->flags);
>         if (likely(fd >= 0)) {
> -               struct file *f = do_filp_open(dfd, tmp, &op);
> +               struct filename *name __free(putname) = getname(filename);
> +               struct file *f = do_filp_open(dfd, name, &op);
>                 if (IS_ERR(f)) {
>                         put_unused_fd(fd);
>                         fd = PTR_ERR(f);
> 
> 	From the VFS or userland POV there's no problem - we would get a
> different error reported e.g. in case when *both* EMFILE and ENAMETOOLONG
> would be applicable, but that's perfectly fine.  However, from the audit
> POV it changes behaviour.
> 
> 	Consider behaviour of openat2(2).
> 1.  we do sanity checks on the last ('usize') argument.  If they
> fail, we are done.
> 2.  we copy struct open_how from userland ('how' argument).
> If copyin fails, we are done.
> 3.  we do sanity checks on how->flags, how->resolve and how->mode.
> If they fail, we are done.
> 4.  we copy the pathname to be opened from userland ('filename' argument).
> If that fails, or if the pathname is either empty or too long, we are done.
> 5.  we reserve an unused file descriptor.  If that fails, we are done.
> 6.  we allocate an empty struct file.  If that fails, we are done.
> 7.  we finally get around to the business - finding and opening the damn thing.
> Which also can fail, of course.
> 
> 	We are expected to be able to produce a record of failing
> syscall.  If we fail on step 4, well, the lack of pathname to come with
> the record is to be expected - we have failed to get it, after all.
> The same goes for failures on steps 1..3 - we hadn't gotten around to
> looking at the pathname yet, so there's no pathname to report.	What (if
> anything) makes "insane how->flags" different from "we have too many
> descriptors opened already"?  The contents of the pathname is equally
> irrelevant in both cases.  Yet in the latter case (failure at step 5)
> the pathname would get reported.  Do we need to preserve that behaviour?
> 
> 	Because the patch quoted above would change it.  It puts the failure
> to allocate a descriptor into the same situation as failures on steps 1..3.
> 
> 	As far as I can see, there are three possible approaches:
> 
> 1) if the current kernel imports the pathname before some check, that shall
> always remain that way, no matter what.  Audit might be happy, but nobody
> else would - we'll need to document that constraint and watch out for such
> regressions.  And I'm pretty sure that over the years there had been
> other such changes that went into mainline unnoticed.
> 
> 2) reordering is acceptable.  Of course, the pathname import must happen
> before we start using it, but that's the only real constraint.  That would
> mean the least headache for everyone other than audit folks.
> 
> 3) import the pathnames as early as possible.  It would mean a non-trivial
> amount of churn, but it's at least a definite policy - validity of change
> depends only on the resulting code, not the comparison with the earlier
> state, as it would in case (1).  From QoI POV it's as nice as audit folks
> could possibly ask, but it would cause quite a bit of churn to get there.
> Not impossible to do, but I would rather not go there without a need.
> Said that, struct filename handling is mostly a decent match to CLASS()
> machinery, and all required churn wouldn't be hard to fold into conversion
> to that.
> 
> 	My preference would be (2), obviously.	However, it really depends
> upon the kind of requirements audit users have.  Note that currently the
> position of pathname import in the sequence is not documented anywhere,
> so there's not much audit users can rely upon other than "the current
> behaviour is such-and-such, let's hope it doesn't change"... ;-/
> 
> 	Comments?
> 
> 
> ----- End forwarded message -----
> 

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
Upstream IRC: SunRaycer
Voice: +1.613.860 2354 SMS: +1.613.518.6570


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls]
  2025-10-31  8:06 [viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls] Al Viro
  2025-10-31 14:45 ` Richard Guy Briggs
@ 2025-11-01 17:36 ` Paul Moore
  1 sibling, 0 replies; 3+ messages in thread
From: Paul Moore @ 2025-11-01 17:36 UTC (permalink / raw)
  To: Al Viro; +Cc: audit, Linus Torvalds, linux-fsdevel

On Fri, Oct 31, 2025 at 4:06 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>         FWIW, I've just noticed that a patch in the series I'd been
> reordering had the following chunk:
> @@ -1421,20 +1421,16 @@ static int do_sys_openat2(int dfd, const char __user *filename,
>                           struct open_how *how)
>  {
>         struct open_flags op;
> -       struct filename *tmp;
>         int err, fd;
>
>         err = build_open_flags(how, &op);
>         if (unlikely(err))
>                 return err;
>
> -       tmp = getname(filename);
> -       if (IS_ERR(tmp))
> -               return PTR_ERR(tmp);
> -
>         fd = get_unused_fd_flags(how->flags);
>         if (likely(fd >= 0)) {
> -               struct file *f = do_filp_open(dfd, tmp, &op);
> +               struct filename *name __free(putname) = getname(filename);
> +               struct file *f = do_filp_open(dfd, name, &op);
>                 if (IS_ERR(f)) {
>                         put_unused_fd(fd);
>                         fd = PTR_ERR(f);
>
>         From the VFS or userland POV there's no problem - we would get a
> different error reported e.g. in case when *both* EMFILE and ENAMETOOLONG
> would be applicable, but that's perfectly fine.  However, from the audit
> POV it changes behaviour.
>
>         Consider behaviour of openat2(2).
> 1.  we do sanity checks on the last ('usize') argument.  If they
> fail, we are done.
> 2.  we copy struct open_how from userland ('how' argument).
> If copyin fails, we are done.
> 3.  we do sanity checks on how->flags, how->resolve and how->mode.
> If they fail, we are done.
> 4.  we copy the pathname to be opened from userland ('filename' argument).
> If that fails, or if the pathname is either empty or too long, we are done.
> 5.  we reserve an unused file descriptor.  If that fails, we are done.
> 6.  we allocate an empty struct file.  If that fails, we are done.
> 7.  we finally get around to the business - finding and opening the damn thing.
> Which also can fail, of course.
>
>         We are expected to be able to produce a record of failing
> syscall.  If we fail on step 4, well, the lack of pathname to come with
> the record is to be expected - we have failed to get it, after all.
> The same goes for failures on steps 1..3 - we hadn't gotten around to
> looking at the pathname yet, so there's no pathname to report.  What (if
> anything) makes "insane how->flags" different from "we have too many
> descriptors opened already"?  The contents of the pathname is equally
> irrelevant in both cases.  Yet in the latter case (failure at step 5)
> the pathname would get reported.  Do we need to preserve that behaviour?

With the only real difference, from an audit perspective, between the
current behavior and the proposed reordering being audit doesn't
record a pathname in the case a fd can't be allocated, I'm not too
bothered by the change.

Based on discussions with people that actually use audit on real
systems, on open failure the pathname is most interesting when the
failure is caused by a permission failure somewhere along the path
walk, the you're-trying-to-access-things-you-shouldn't-be-accessing
case.  Even with the reordering we still do the getname() call before
the do_filp_open() (duh) so we are fine in that regard.

I suppose one could make an argument about someone possibly trying to
do something "bad" and bumping into the open file limit, either as a
test or the exploit itself, and wanting to audit that.  However, in
this case I think the pathname is probably the least interesting thing
here, the goal would likely be to successfully open the file (or hit
the limit) in which case the user would surely pick a file they know
they would be able to open, making the pathname a bit boring and not
very useful as an indicator of bad behavior.

>         Because the patch quoted above would change it.  It puts the failure
> to allocate a descriptor into the same situation as failures on steps 1..3.
>
>         As far as I can see, there are three possible approaches:
>
> 1) if the current kernel imports the pathname before some check, that shall
> always remain that way, no matter what.  Audit might be happy, but nobody
> else would - we'll need to document that constraint and watch out for such
> regressions.  And I'm pretty sure that over the years there had been
> other such changes that went into mainline unnoticed.

Re: regressions ... sadly we don't have great audit test coverage at
this point.  We used to have a pretty good test suite thanks to the
security certification work, but I never could get the distros to do
the maintenance/upkeep of the test suite outside the scope of the
certification effort so it has fallen out of use.  We've got a
smaller, easier to run and maintain test suite which is "okay" but it
still has a number of gaps; thankfully we're starting to see some
renewed effort there which makes me happy.

> 2) reordering is acceptable.  Of course, the pathname import must happen
> before we start using it, but that's the only real constraint.  That would
> mean the least headache for everyone other than audit folks.

I'm not sure it really bothers me at this point in time.  Can we make
an agreement that reordering is okay right now, but if we have a
certification requirement in the future about having a pathname if
earlier we can revisit/rework this?

... and yes, I know people hate security certifications, and just
"security" in general.  Fine, whatever, good for you.  However, like
it or not, these certifications make or break the use of Linux in a
number of situations that are important to a lot of users.  If we want
to ensure that Linux continues to be a viable solution across a large
number of use cases, and not just the small handful of hyperscalers,
one of the things we need to do is make sure Linux can continue to
meet these security certifications.

> 3) import the pathnames as early as possible.  It would mean a non-trivial
> amount of churn, but it's at least a definite policy - validity of change
> depends only on the resulting code, not the comparison with the earlier
> state, as it would in case (1).  From QoI POV it's as nice as audit folks
> could possibly ask, but it would cause quite a bit of churn to get there.
> Not impossible to do, but I would rather not go there without a need.
> Said that, struct filename handling is mostly a decent match to CLASS()
> machinery, and all required churn wouldn't be hard to fold into conversion
> to that.
>
>         My preference would be (2), obviously.  However, it really depends
> upon the kind of requirements audit users have.  Note that currently the
> position of pathname import in the sequence is not documented anywhere,
> so there's not much audit users can rely upon other than "the current
> behaviour is such-and-such, let's hope it doesn't change"... ;-/

Like I said above, I'm okay with option #2 so long as we can move the
pathname capture earlier if needed at some point in the future.  I
don't want to inflict pain on the kernel if it isn't needed, but I've
also grown very weary of having to fight with every subsystem out
there to claw things back that are necessary for adoption.  If you
need a hard and fast policy now, then it would have to be option #3,
but if we all promise to be reasonable adults I'm okay with option #2.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-11-01 17:36 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-31  8:06 [viro@zeniv.linux.org.uk: [RFC] audit reporting (or not reporting) pathnames on early failures in syscalls] Al Viro
2025-10-31 14:45 ` Richard Guy Briggs
2025-11-01 17:36 ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).