Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry

Linux Security Modules development
 help / color / mirror / Atom feed

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
       [not found] <20240729-zollfrei-verteidigen-cf359eb36601@brauner>
@ 2024-08-19  7:18 ` Song Liu
  2024-08-19 11:16   ` Christian Brauner
  0 siblings, 1 reply; 13+ messages in thread
From: Song Liu @ 2024-08-19  7:18 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

Hi Christian, 

Thanks again for your suggestions here. I have got more questions on
this work. 

> On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:

[...]

>> I am not sure I follow the suggestion to implement this with 
>> security_inode_permission()? Could you please share more details about
>> this idea?
> 
> Given a path like /bin/gcc-6.9/gcc what that code currently does is:
> 
> * walk down to /bin/gcc-6.9/gcc
> * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
>  gcc
>  gcc-6.9/
>  bin/
>  /
> 
> That's broken because someone could've done
> mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
> /attack even though the path lookup was for /bin/gcc-6.9. IOW, the
> security_file_open() checks have nothing to do with the permission checks that
> were done during path lookup.
> 
> Why isn't that logic:
> 
> * walk down to /bin/gcc-6.9/gcc and check for each component:
> 
>  security_inode_permission(/)
>  security_inode_permission(gcc-6.9/)
>  security_inode_permission(bin/)
>  security_inode_permission(gcc)
>  security_file_open(gcc)

I am trying to implement this approach. The idea, IIUC, is:

1. For each open/openat, as we walk the path in do_filp_open=>path_openat, 
   check xattr for "/", "gcc-6.9/", "bin/" for all given flags.
2. Save the value of the flag somewhere (for BPF, we can use inode local
   storage). This is needed, because openat(dfd, ..) will not start from
   root again. 
3. Propagate these flag to children. All the above are done at 
   security_inode_permission. 
4. Finally, at security_file_open, check the xattr with the file, which 
   is probably propagated from some parents.

Did I get this right? 

IIUC, there are a few issues with this approach. 

1. security_inode_permission takes inode as parameter. However, we need 
   dentry to get the xattr. Shall we change security_inode_permission
   to take dentry instead? 
   PS: Maybe we should change most/all inode hooks to take dentry instead?

2. There is no easy way to propagate data from parent. Assuming we already
   changed security_inode_permission to take dentry, we still need some
   mechanism to look up xattr from the parent, which is probably still 
   something like bpf_dget_parent(). Or maybe we should add another hook 
   with both parent and child dentry as input?

3. Given we save the flag from parents in children's inode local storage, 
   we may consume non-trivial extra memory. BPF inode local storage will 
   be freed as the inode gets freed, so we will not leak any memory or 
   overflow some hash map. However, this will probably increase the 
   memory consumption of inode by a few percents. I think a "walk-up" 
   based approach will not have this problem, as we don't need the extra
   storage. Of course, this means more xattr lookups in some cases. 

> 
> I think that dget_parent() logic also wouldn't make sense for relative path
> lookups:
> 
> dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
> 
> This walks down to /bin/gcc-6.9 and then walks back up (subject to the
> same problem mentioned earlier) and check xattrs for:
> 
>  gcc-6.9
>  bin/
>  /
> 
> then that dfd is passed to openat() to open "gcc":
> 
> fd = openat(dfd, "gcc", O_RDONLY);
> 
> which again walks up to /bin/gcc-6.9 and checks xattrs for:
>  gcc
>  gcc-6.9
>  bin/
>  /
> 
> Which means this code ends up charging relative lookups twice. Even if one
> irons that out in the program this encourages really bad patterns.
> Path lookup is iterative top down. One can't just randomly walk back up and
> assume that's equivalent.

I understand that walk-up is not equivalent to walk down. But it is probably
accurate enough for some security policies. For example, LSM LandLock uses
similar logic in the file_open hook (file security/landlock/fs.c, function 
is_access_to_paths_allowed). 

To summary my thoughts here. I think we need:

1. Change security_inode_permission to take dentry instead of inode. 
2. Still add bpf_dget_parent. We will use it with security_inode_permission
   so that we can propagate flags from parents to children. We will need
   a bpf_dput as well. 
3. There are pros and cons with different approaches to implement this
   policy (tags on directory work for all files in it). We probably need 
   the policy writer to decide with one to use. From BPF's POV, dget_parent
   is "safe", because it won't crash the system. It may encourage some bad
   patterns, but it appears to be required in some use cases. 

Does this make sense?

Thanks,
Song

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19  7:18 ` [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr Song Liu
@ 2024-08-19 11:16   ` Christian Brauner
  2024-08-19 13:12     ` Mickaël Salaün
  2024-08-19 20:25     ` Song Liu
  0 siblings, 2 replies; 13+ messages in thread
From: Christian Brauner @ 2024-08-19 11:16 UTC (permalink / raw)
  To: Song Liu, Mickaël Salaün
  Cc: Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

On Mon, Aug 19, 2024 at 07:18:40AM GMT, Song Liu wrote:
> Hi Christian, 
> 
> Thanks again for your suggestions here. I have got more questions on
> this work. 
> 
> > On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:
> 
> [...]
> 
> >> I am not sure I follow the suggestion to implement this with 
> >> security_inode_permission()? Could you please share more details about
> >> this idea?
> > 
> > Given a path like /bin/gcc-6.9/gcc what that code currently does is:
> > 
> > * walk down to /bin/gcc-6.9/gcc
> > * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
> >  gcc
> >  gcc-6.9/
> >  bin/
> >  /
> > 
> > That's broken because someone could've done
> > mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
> > /attack even though the path lookup was for /bin/gcc-6.9. IOW, the
> > security_file_open() checks have nothing to do with the permission checks that
> > were done during path lookup.
> > 
> > Why isn't that logic:
> > 
> > * walk down to /bin/gcc-6.9/gcc and check for each component:
> > 
> >  security_inode_permission(/)
> >  security_inode_permission(gcc-6.9/)
> >  security_inode_permission(bin/)
> >  security_inode_permission(gcc)
> >  security_file_open(gcc)
> 
> I am trying to implement this approach. The idea, IIUC, is:
> 
> 1. For each open/openat, as we walk the path in do_filp_open=>path_openat, 
>    check xattr for "/", "gcc-6.9/", "bin/" for all given flags.
> 2. Save the value of the flag somewhere (for BPF, we can use inode local
>    storage). This is needed, because openat(dfd, ..) will not start from
>    root again. 
> 3. Propagate these flag to children. All the above are done at 
>    security_inode_permission. 
> 4. Finally, at security_file_open, check the xattr with the file, which 
>    is probably propagated from some parents.
> 
> Did I get this right? 
> 
> IIUC, there are a few issues with this approach. 
> 
> 1. security_inode_permission takes inode as parameter. However, we need 
>    dentry to get the xattr. Shall we change security_inode_permission
>    to take dentry instead? 
>    PS: Maybe we should change most/all inode hooks to take dentry instead?

security_inode_permission() is called in generic_permission() which in
turn is called from inode_permission() which in turn is called from
inode->i_op->permission() for various filesystems. So to make
security_inode_permission() take a dentry argument one would need to
change all inode->i_op->permission() to take a dentry argument for all
filesystems. NAK on that.

That's ignoring that it's just plain wrong to pass a dentry to
**inode**_permission() or security_**inode**_permission(). It's
permissions on the inode, not the dentry.

> 
> 2. There is no easy way to propagate data from parent. Assuming we already
>    changed security_inode_permission to take dentry, we still need some
>    mechanism to look up xattr from the parent, which is probably still 
>    something like bpf_dget_parent(). Or maybe we should add another hook 
>    with both parent and child dentry as input?
> 
> 3. Given we save the flag from parents in children's inode local storage, 
>    we may consume non-trivial extra memory. BPF inode local storage will 
>    be freed as the inode gets freed, so we will not leak any memory or 
>    overflow some hash map. However, this will probably increase the 
>    memory consumption of inode by a few percents. I think a "walk-up" 
>    based approach will not have this problem, as we don't need the extra
>    storage. Of course, this means more xattr lookups in some cases. 
> 
> > 
> > I think that dget_parent() logic also wouldn't make sense for relative path
> > lookups:
> > 
> > dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
> > 
> > This walks down to /bin/gcc-6.9 and then walks back up (subject to the
> > same problem mentioned earlier) and check xattrs for:
> > 
> >  gcc-6.9
> >  bin/
> >  /
> > 
> > then that dfd is passed to openat() to open "gcc":
> > 
> > fd = openat(dfd, "gcc", O_RDONLY);
> > 
> > which again walks up to /bin/gcc-6.9 and checks xattrs for:
> >  gcc
> >  gcc-6.9
> >  bin/
> >  /
> > 
> > Which means this code ends up charging relative lookups twice. Even if one
> > irons that out in the program this encourages really bad patterns.
> > Path lookup is iterative top down. One can't just randomly walk back up and
> > assume that's equivalent.
> 
> I understand that walk-up is not equivalent to walk down. But it is probably
> accurate enough for some security policies. For example, LSM LandLock uses
> similar logic in the file_open hook (file security/landlock/fs.c, function 
> is_access_to_paths_allowed). 

I'm not well-versed in landlock so I'll let Mickaël comment on this with
more details but there's very important restrictions and differences
here.

Landlock expresses security policies with file hierarchies and
security_inode_permission() doesn't and cannot have access to that.

Landlock is subject to the same problem that the BPF is here. Namely
that the VFS permission checking could have been done on a path walk
completely different from the path walk that is checked when walking
back up from security_file_open().

But because landlock works with a deny-by-default security policy this
is ok and it takes overmounts into account etc.

> 
> To summary my thoughts here. I think we need:
> 
> 1. Change security_inode_permission to take dentry instead of inode. 

Sorry, no.

> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>    so that we can propagate flags from parents to children. We will need
>    a bpf_dput as well. 
> 3. There are pros and cons with different approaches to implement this
>    policy (tags on directory work for all files in it). We probably need 
>    the policy writer to decide with one to use. From BPF's POV, dget_parent
>    is "safe", because it won't crash the system. It may encourage some bad
>    patterns, but it appears to be required in some use cases. 

You cannot just walk a path upwards and check permissions and assume
that this is safe unless you have a clear idea what makes it safe in
this scenario. Landlock has afaict. But so far you only have a vague
sketch of checking permissions walking upwards and retrieving xattrs
without any notion of the problems involved.

If you provide a bpf_get_parent() api for userspace to consume you'll
end up providing them with an api that is extremly easy to misuse.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 11:16   ` Christian Brauner
@ 2024-08-19 13:12     ` Mickaël Salaün
  2024-08-19 20:35       ` Song Liu
  2024-08-19 20:25     ` Song Liu
  1 sibling, 1 reply; 13+ messages in thread
From: Mickaël Salaün @ 2024-08-19 13:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

On Mon, Aug 19, 2024 at 01:16:04PM +0200, Christian Brauner wrote:
> On Mon, Aug 19, 2024 at 07:18:40AM GMT, Song Liu wrote:
> > Hi Christian, 
> > 
> > Thanks again for your suggestions here. I have got more questions on
> > this work. 
> > 
> > > On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:
> > 
> > [...]
> > 
> > >> I am not sure I follow the suggestion to implement this with 
> > >> security_inode_permission()? Could you please share more details about
> > >> this idea?
> > > 
> > > Given a path like /bin/gcc-6.9/gcc what that code currently does is:
> > > 
> > > * walk down to /bin/gcc-6.9/gcc
> > > * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
> > >  gcc
> > >  gcc-6.9/
> > >  bin/
> > >  /
> > > 
> > > That's broken because someone could've done
> > > mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
> > > /attack even though the path lookup was for /bin/gcc-6.9. IOW, the
> > > security_file_open() checks have nothing to do with the permission checks that
> > > were done during path lookup.
> > > 
> > > Why isn't that logic:
> > > 
> > > * walk down to /bin/gcc-6.9/gcc and check for each component:
> > > 
> > >  security_inode_permission(/)
> > >  security_inode_permission(gcc-6.9/)
> > >  security_inode_permission(bin/)
> > >  security_inode_permission(gcc)
> > >  security_file_open(gcc)
> > 
> > I am trying to implement this approach. The idea, IIUC, is:
> > 
> > 1. For each open/openat, as we walk the path in do_filp_open=>path_openat, 
> >    check xattr for "/", "gcc-6.9/", "bin/" for all given flags.
> > 2. Save the value of the flag somewhere (for BPF, we can use inode local
> >    storage). This is needed, because openat(dfd, ..) will not start from
> >    root again. 
> > 3. Propagate these flag to children. All the above are done at 
> >    security_inode_permission. 
> > 4. Finally, at security_file_open, check the xattr with the file, which 
> >    is probably propagated from some parents.
> > 
> > Did I get this right? 
> > 
> > IIUC, there are a few issues with this approach. 
> > 
> > 1. security_inode_permission takes inode as parameter. However, we need 
> >    dentry to get the xattr. Shall we change security_inode_permission
> >    to take dentry instead? 
> >    PS: Maybe we should change most/all inode hooks to take dentry instead?
> 
> security_inode_permission() is called in generic_permission() which in
> turn is called from inode_permission() which in turn is called from
> inode->i_op->permission() for various filesystems. So to make
> security_inode_permission() take a dentry argument one would need to
> change all inode->i_op->permission() to take a dentry argument for all
> filesystems. NAK on that.
> 
> That's ignoring that it's just plain wrong to pass a dentry to
> **inode**_permission() or security_**inode**_permission(). It's
> permissions on the inode, not the dentry.
> 
> > 
> > 2. There is no easy way to propagate data from parent. Assuming we already
> >    changed security_inode_permission to take dentry, we still need some
> >    mechanism to look up xattr from the parent, which is probably still 
> >    something like bpf_dget_parent(). Or maybe we should add another hook 
> >    with both parent and child dentry as input?
> > 
> > 3. Given we save the flag from parents in children's inode local storage, 
> >    we may consume non-trivial extra memory. BPF inode local storage will 
> >    be freed as the inode gets freed, so we will not leak any memory or 
> >    overflow some hash map. However, this will probably increase the 
> >    memory consumption of inode by a few percents. I think a "walk-up" 
> >    based approach will not have this problem, as we don't need the extra
> >    storage. Of course, this means more xattr lookups in some cases. 
> > 
> > > 
> > > I think that dget_parent() logic also wouldn't make sense for relative path
> > > lookups:
> > > 
> > > dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
> > > 
> > > This walks down to /bin/gcc-6.9 and then walks back up (subject to the
> > > same problem mentioned earlier) and check xattrs for:
> > > 
> > >  gcc-6.9
> > >  bin/
> > >  /
> > > 
> > > then that dfd is passed to openat() to open "gcc":
> > > 
> > > fd = openat(dfd, "gcc", O_RDONLY);
> > > 
> > > which again walks up to /bin/gcc-6.9 and checks xattrs for:
> > >  gcc
> > >  gcc-6.9
> > >  bin/
> > >  /
> > > 
> > > Which means this code ends up charging relative lookups twice. Even if one
> > > irons that out in the program this encourages really bad patterns.
> > > Path lookup is iterative top down. One can't just randomly walk back up and
> > > assume that's equivalent.
> > 
> > I understand that walk-up is not equivalent to walk down. But it is probably
> > accurate enough for some security policies. For example, LSM LandLock uses
> > similar logic in the file_open hook (file security/landlock/fs.c, function 
> > is_access_to_paths_allowed). 
> 
> I'm not well-versed in landlock so I'll let Mickaël comment on this with
> more details but there's very important restrictions and differences
> here.
> 
> Landlock expresses security policies with file hierarchies and
> security_inode_permission() doesn't and cannot have access to that.
> 
> Landlock is subject to the same problem that the BPF is here. Namely
> that the VFS permission checking could have been done on a path walk
> completely different from the path walk that is checked when walking
> back up from security_file_open().
> 
> But because landlock works with a deny-by-default security policy this
> is ok and it takes overmounts into account etc.

Correct. Another point is that Landlock uses the file's path (i.e.
dentry + mnt) to walk down to the parent.  Only using the dentry would
be incorrect for most use cases (i.e. any system with more than one
mount point).

> 
> > 
> > To summary my thoughts here. I think we need:
> > 
> > 1. Change security_inode_permission to take dentry instead of inode. 
> 
> Sorry, no.
> 
> > 2. Still add bpf_dget_parent. We will use it with security_inode_permission
> >    so that we can propagate flags from parents to children. We will need
> >    a bpf_dput as well. 
> > 3. There are pros and cons with different approaches to implement this
> >    policy (tags on directory work for all files in it). We probably need 
> >    the policy writer to decide with one to use. From BPF's POV, dget_parent
> >    is "safe", because it won't crash the system. It may encourage some bad
> >    patterns, but it appears to be required in some use cases. 
> 
> You cannot just walk a path upwards and check permissions and assume
> that this is safe unless you have a clear idea what makes it safe in
> this scenario. Landlock has afaict. But so far you only have a vague
> sketch of checking permissions walking upwards and retrieving xattrs
> without any notion of the problems involved.

Something to keep in mind is that relying on xattr to label files
requires to deny sanboxed processes to change this xattr, otherwise it
would be trivial to bypass such a sandbox.  Sandboxing must be though as
a whole and Landlock's design for file system access control takes into
account all kind of file system operations that could bypass a sandbox
policy (e.g. mount operations), and also protects from impersonations.

What is the use case for this patch series?  Couldn't Landlock be used
for that?

> 
> If you provide a bpf_get_parent() api for userspace to consume you'll
> end up providing them with an api that is extremly easy to misuse.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 13:12     ` Mickaël Salaün
@ 2024-08-19 20:35       ` Song Liu
  2024-08-20 12:45         ` Mickaël Salaün
  0 siblings, 1 reply; 13+ messages in thread
From: Song Liu @ 2024-08-19 20:35 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

Hi Mickaël, 

> On Aug 19, 2024, at 6:12 AM, Mickaël Salaün <mic@digikod.net> wrote:

[...]

>> But because landlock works with a deny-by-default security policy this
>> is ok and it takes overmounts into account etc.
> 
> Correct. Another point is that Landlock uses the file's path (i.e.
> dentry + mnt) to walk down to the parent.  Only using the dentry would
> be incorrect for most use cases (i.e. any system with more than one
> mount point).

Thanks for highlighting the difference. Let me see whether we can bridge
the gap for this set. 

[...]

>>> 
>>> 1. Change security_inode_permission to take dentry instead of inode.
>> 
>> Sorry, no.
>> 
>>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>>>   so that we can propagate flags from parents to children. We will need
>>>   a bpf_dput as well. 
>>> 3. There are pros and cons with different approaches to implement this
>>>   policy (tags on directory work for all files in it). We probably need 
>>>   the policy writer to decide with one to use. From BPF's POV, dget_parent
>>>   is "safe", because it won't crash the system. It may encourage some bad
>>>   patterns, but it appears to be required in some use cases.
>> 
>> You cannot just walk a path upwards and check permissions and assume
>> that this is safe unless you have a clear idea what makes it safe in
>> this scenario. Landlock has afaict. But so far you only have a vague
>> sketch of checking permissions walking upwards and retrieving xattrs
>> without any notion of the problems involved.
> 
> Something to keep in mind is that relying on xattr to label files
> requires to deny sanboxed processes to change this xattr, otherwise it
> would be trivial to bypass such a sandbox.  Sandboxing must be though as
> a whole and Landlock's design for file system access control takes into
> account all kind of file system operations that could bypass a sandbox
> policy (e.g. mount operations), and also protects from impersonations.

Thanks for sharing these experiences! 

> What is the use case for this patch series?  Couldn't Landlock be used
> for that?

We have multiple use cases. We can use Landlock for some of them. The 
primary goal of this patchset is to add useful building blocks to BPF LSM
so that we can build effective and flexible security policies for various
use cases. These building blocks alone won't be very useful. For example,
as you pointed out, to make xattr labels useful, we need some policies 
for xattr read/write.

Does this make sense?

Thanks,
Song



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 20:35       ` Song Liu
@ 2024-08-20 12:45         ` Mickaël Salaün
  2024-08-20 17:42           ` Song Liu
  0 siblings, 1 reply; 13+ messages in thread
From: Mickaël Salaün @ 2024-08-20 12:45 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

On Mon, Aug 19, 2024 at 08:35:53PM +0000, Song Liu wrote:
> Hi Mickaël, 
> 
> > On Aug 19, 2024, at 6:12 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> [...]
> 
> >> But because landlock works with a deny-by-default security policy this
> >> is ok and it takes overmounts into account etc.
> > 
> > Correct. Another point is that Landlock uses the file's path (i.e.
> > dentry + mnt) to walk down to the parent.  Only using the dentry would
> > be incorrect for most use cases (i.e. any system with more than one
> > mount point).
> 
> Thanks for highlighting the difference. Let me see whether we can bridge
> the gap for this set. 
> 
> [...]
> 
> >>> 
> >>> 1. Change security_inode_permission to take dentry instead of inode.
> >> 
> >> Sorry, no.
> >> 
> >>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
> >>>   so that we can propagate flags from parents to children. We will need
> >>>   a bpf_dput as well. 
> >>> 3. There are pros and cons with different approaches to implement this
> >>>   policy (tags on directory work for all files in it). We probably need 
> >>>   the policy writer to decide with one to use. From BPF's POV, dget_parent
> >>>   is "safe", because it won't crash the system. It may encourage some bad
> >>>   patterns, but it appears to be required in some use cases.
> >> 
> >> You cannot just walk a path upwards and check permissions and assume
> >> that this is safe unless you have a clear idea what makes it safe in
> >> this scenario. Landlock has afaict. But so far you only have a vague
> >> sketch of checking permissions walking upwards and retrieving xattrs
> >> without any notion of the problems involved.
> > 
> > Something to keep in mind is that relying on xattr to label files
> > requires to deny sanboxed processes to change this xattr, otherwise it
> > would be trivial to bypass such a sandbox.  Sandboxing must be though as
> > a whole and Landlock's design for file system access control takes into
> > account all kind of file system operations that could bypass a sandbox
> > policy (e.g. mount operations), and also protects from impersonations.
> 
> Thanks for sharing these experiences! 
> 
> > What is the use case for this patch series?  Couldn't Landlock be used
> > for that?
> 
> We have multiple use cases. We can use Landlock for some of them. The 
> primary goal of this patchset is to add useful building blocks to BPF LSM
> so that we can build effective and flexible security policies for various
> use cases. These building blocks alone won't be very useful. For example,
> as you pointed out, to make xattr labels useful, we need some policies 
> for xattr read/write.
> 
> Does this make sense?

Yes, but I think you'll end up with a code pretty close to the Landlock
implementation.

What about adding BPF hooks to Landlock?  User space could create
Landlock sandboxes that would delegate the denials to a BPF program,
which could then also allow such access, but without directly handling
nor reimplementing filesystem path walks.  The Landlock user space ABI
changes would mainly be a new landlock_ruleset_attr field to explicitly
ask for a (system-wide) BPF program to handle access requests if no
Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
Landlock domains for consistent sandbox management.  One of the
advantage of this approach is to only run related BPF programs if the
sandbox policy would deny the request.  Another advantage would be to
leverage the Landlock user space interface to let any program partially
define and extend their security policy.

I'm working on implementing audit support for Landlock [1] and I think
these changes could be useful to implement BPF hooks to run a dedicated
BPF program type per event (see landlock_log_denial() and struct
landlock_request).  I'll get back on this patch series in September.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=wip-audit

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20 12:45         ` Mickaël Salaün
@ 2024-08-20 17:42           ` Song Liu
  2024-08-20 21:11             ` Paul Moore
  0 siblings, 1 reply; 13+ messages in thread
From: Song Liu @ 2024-08-20 17:42 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Song Liu, Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack


> On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> On Mon, Aug 19, 2024 at 08:35:53PM +0000, Song Liu wrote:
>> Hi Mickaël, 
>> 
>>> On Aug 19, 2024, at 6:12 AM, Mickaël Salaün <mic@digikod.net> wrote:
>> 
>> [...]
>> 
>>>> But because landlock works with a deny-by-default security policy this
>>>> is ok and it takes overmounts into account etc.
>>> 
>>> Correct. Another point is that Landlock uses the file's path (i.e.
>>> dentry + mnt) to walk down to the parent.  Only using the dentry would
>>> be incorrect for most use cases (i.e. any system with more than one
>>> mount point).
>> 
>> Thanks for highlighting the difference. Let me see whether we can bridge
>> the gap for this set. 
>> 
>> [...]
>> 
>>>>> 
>>>>> 1. Change security_inode_permission to take dentry instead of inode.
>>>> 
>>>> Sorry, no.
>>>> 
>>>>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>>>>>  so that we can propagate flags from parents to children. We will need
>>>>>  a bpf_dput as well. 
>>>>> 3. There are pros and cons with different approaches to implement this
>>>>>  policy (tags on directory work for all files in it). We probably need 
>>>>>  the policy writer to decide with one to use. From BPF's POV, dget_parent
>>>>>  is "safe", because it won't crash the system. It may encourage some bad
>>>>>  patterns, but it appears to be required in some use cases.
>>>> 
>>>> You cannot just walk a path upwards and check permissions and assume
>>>> that this is safe unless you have a clear idea what makes it safe in
>>>> this scenario. Landlock has afaict. But so far you only have a vague
>>>> sketch of checking permissions walking upwards and retrieving xattrs
>>>> without any notion of the problems involved.
>>> 
>>> Something to keep in mind is that relying on xattr to label files
>>> requires to deny sanboxed processes to change this xattr, otherwise it
>>> would be trivial to bypass such a sandbox.  Sandboxing must be though as
>>> a whole and Landlock's design for file system access control takes into
>>> account all kind of file system operations that could bypass a sandbox
>>> policy (e.g. mount operations), and also protects from impersonations.
>> 
>> Thanks for sharing these experiences! 
>> 
>>> What is the use case for this patch series?  Couldn't Landlock be used
>>> for that?
>> 
>> We have multiple use cases. We can use Landlock for some of them. The 
>> primary goal of this patchset is to add useful building blocks to BPF LSM
>> so that we can build effective and flexible security policies for various
>> use cases. These building blocks alone won't be very useful. For example,
>> as you pointed out, to make xattr labels useful, we need some policies 
>> for xattr read/write.
>> 
>> Does this make sense?
> 
> Yes, but I think you'll end up with a code pretty close to the Landlock
> implementation.

At the moment, I think it is not possible to do full Landlock logic in
BPF. We are learning from other LSMs. 

> What about adding BPF hooks to Landlock?  User space could create
> Landlock sandboxes that would delegate the denials to a BPF program,
> which could then also allow such access, but without directly handling
> nor reimplementing filesystem path walks.  The Landlock user space ABI
> changes would mainly be a new landlock_ruleset_attr field to explicitly
> ask for a (system-wide) BPF program to handle access requests if no
> Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
> Landlock domains for consistent sandbox management.  One of the
> advantage of this approach is to only run related BPF programs if the
> sandbox policy would deny the request.  Another advantage would be to
> leverage the Landlock user space interface to let any program partially
> define and extend their security policy.

Given there is BPF LSM, I have never thought about adding BPF hooks to 
Landlock or other LSMs. I personally would prefer to have a common API
to walk the path, maybe something like vma_iterator. But I need to read
more code to understand whether this makes sense?

Thanks,
Song

> I'm working on implementing audit support for Landlock [1] and I think
> these changes could be useful to implement BPF hooks to run a dedicated
> BPF program type per event (see landlock_log_denial() and struct
> landlock_request).  I'll get back on this patch series in September.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=wip-audit



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20 17:42           ` Song Liu
@ 2024-08-20 21:11             ` Paul Moore
  2024-08-21  3:43               ` Song Liu
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Moore @ 2024-08-20 21:11 UTC (permalink / raw)
  To: Song Liu
  Cc: Mickaël Salaün, Christian Brauner, Song Liu, bpf,
	Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, viro@zeniv.linux.org.uk, jack@suse.cz,
	kpsingh@kernel.org, mattbobrowski@google.com, Liam Wisehart,
	Liang Tang, Shankaran Gnanashanmugam, LSM List,
	Günther Noack

On Tue, Aug 20, 2024 at 1:43 PM Song Liu <songliubraving@meta.com> wrote:
> > On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:

...

> > What about adding BPF hooks to Landlock?  User space could create
> > Landlock sandboxes that would delegate the denials to a BPF program,
> > which could then also allow such access, but without directly handling
> > nor reimplementing filesystem path walks.  The Landlock user space ABI
> > changes would mainly be a new landlock_ruleset_attr field to explicitly
> > ask for a (system-wide) BPF program to handle access requests if no
> > Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
> > Landlock domains for consistent sandbox management.  One of the
> > advantage of this approach is to only run related BPF programs if the
> > sandbox policy would deny the request.  Another advantage would be to
> > leverage the Landlock user space interface to let any program partially
> > define and extend their security policy.
>
> Given there is BPF LSM, I have never thought about adding BPF hooks to
> Landlock or other LSMs. I personally would prefer to have a common API
> to walk the path, maybe something like vma_iterator. But I need to read
> more code to understand whether this makes sense?

Just so there isn't any confusion, I want to make sure that everyone
is clear that "adding BPF hooks to Landlock" should mean "add a new
Landlock specific BPF hook inside Landlock" and not "reuse existing
BPF LSM hooks inside Landlock".

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20 21:11             ` Paul Moore
@ 2024-08-21  3:43               ` Song Liu
  2024-08-23 10:38                 ` Mickaël Salaün
  0 siblings, 1 reply; 13+ messages in thread
From: Song Liu @ 2024-08-21  3:43 UTC (permalink / raw)
  To: Paul Moore
  Cc: Song Liu, Mickaël Salaün, Christian Brauner, Song Liu,
	bpf, Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, viro@zeniv.linux.org.uk, jack@suse.cz,
	kpsingh@kernel.org, mattbobrowski@google.com, Liam Wisehart,
	Liang Tang, Shankaran Gnanashanmugam, LSM List,
	Günther Noack



> On Aug 20, 2024, at 2:11 PM, Paul Moore <paul@paul-moore.com> wrote:
> 
> On Tue, Aug 20, 2024 at 1:43 PM Song Liu <songliubraving@meta.com> wrote:
>>> On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> ...
> 
>>> What about adding BPF hooks to Landlock?  User space could create
>>> Landlock sandboxes that would delegate the denials to a BPF program,
>>> which could then also allow such access, but without directly handling
>>> nor reimplementing filesystem path walks.  The Landlock user space ABI
>>> changes would mainly be a new landlock_ruleset_attr field to explicitly
>>> ask for a (system-wide) BPF program to handle access requests if no
>>> Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
>>> Landlock domains for consistent sandbox management.  One of the
>>> advantage of this approach is to only run related BPF programs if the
>>> sandbox policy would deny the request.  Another advantage would be to
>>> leverage the Landlock user space interface to let any program partially
>>> define and extend their security policy.
>> 
>> Given there is BPF LSM, I have never thought about adding BPF hooks to
>> Landlock or other LSMs. I personally would prefer to have a common API
>> to walk the path, maybe something like vma_iterator. But I need to read
>> more code to understand whether this makes sense?
> 
> Just so there isn't any confusion, I want to make sure that everyone
> is clear that "adding BPF hooks to Landlock" should mean "add a new
> Landlock specific BPF hook inside Landlock" and not "reuse existing
> BPF LSM hooks inside Landlock".

I think we are on the same page. My understanding of Mickaël's idea is
to add some brand new hooks to Landlock code, so that Landlock can
use BPF program to make some decisions. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-21  3:43               ` Song Liu
@ 2024-08-23 10:38                 ` Mickaël Salaün
  0 siblings, 0 replies; 13+ messages in thread
From: Mickaël Salaün @ 2024-08-23 10:38 UTC (permalink / raw)
  To: Song Liu
  Cc: Paul Moore, Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

On Wed, Aug 21, 2024 at 03:43:48AM +0000, Song Liu wrote:
> 
> 
> > On Aug 20, 2024, at 2:11 PM, Paul Moore <paul@paul-moore.com> wrote:
> > 
> > On Tue, Aug 20, 2024 at 1:43 PM Song Liu <songliubraving@meta.com> wrote:
> >>> On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:
> > 
> > ...
> > 
> >>> What about adding BPF hooks to Landlock?  User space could create
> >>> Landlock sandboxes that would delegate the denials to a BPF program,
> >>> which could then also allow such access, but without directly handling
> >>> nor reimplementing filesystem path walks.  The Landlock user space ABI
> >>> changes would mainly be a new landlock_ruleset_attr field to explicitly
> >>> ask for a (system-wide) BPF program to handle access requests if no
> >>> Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
> >>> Landlock domains for consistent sandbox management.  One of the
> >>> advantage of this approach is to only run related BPF programs if the
> >>> sandbox policy would deny the request.  Another advantage would be to
> >>> leverage the Landlock user space interface to let any program partially
> >>> define and extend their security policy.
> >> 
> >> Given there is BPF LSM, I have never thought about adding BPF hooks to
> >> Landlock or other LSMs. I personally would prefer to have a common API
> >> to walk the path, maybe something like vma_iterator. But I need to read
> >> more code to understand whether this makes sense?

I think it would not be an issue to use BPF Landlock hooks along with
BPF LSM hooks for the same global policy.  This could also use the
Landlock domain concept for your use case, including domain inheritance,
domain identification, cross-domain protections... to avoid
reimplementing the same semantic (and going through the same issues).
Limiting the BPF program calls could also improve performance.

> > 
> > Just so there isn't any confusion, I want to make sure that everyone
> > is clear that "adding BPF hooks to Landlock" should mean "add a new
> > Landlock specific BPF hook inside Landlock" and not "reuse existing
> > BPF LSM hooks inside Landlock".
> 
> I think we are on the same page. My understanding of Mickaël's idea is
> to add some brand new hooks to Landlock code, so that Landlock can
> use BPF program to make some decisions. 

Correct

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 11:16   ` Christian Brauner
  2024-08-19 13:12     ` Mickaël Salaün
@ 2024-08-19 20:25     ` Song Liu
  2024-08-20  5:42       ` Song Liu
  2024-08-20  6:29       ` Al Viro
  1 sibling, 2 replies; 13+ messages in thread
From: Song Liu @ 2024-08-19 20:25 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Mickaël Salaün, Song Liu, bpf, Linux-Fsdevel,
	LKML, Kernel Team, andrii@kernel.org, eddyz87@gmail.com,
	ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

Hi Christian, 

> On Aug 19, 2024, at 4:16 AM, Christian Brauner <brauner@kernel.org> wrote:

[...]

>> Did I get this right? 
>> 
>> IIUC, there are a few issues with this approach. 
>> 
>> 1. security_inode_permission takes inode as parameter. However, we need 
>>   dentry to get the xattr. Shall we change security_inode_permission
>>   to take dentry instead? 
>>   PS: Maybe we should change most/all inode hooks to take dentry instead?
> 
> security_inode_permission() is called in generic_permission() which in
> turn is called from inode_permission() which in turn is called from
> inode->i_op->permission() for various filesystems. So to make
> security_inode_permission() take a dentry argument one would need to
> change all inode->i_op->permission() to take a dentry argument for all
> filesystems. NAK on that.
> 
> That's ignoring that it's just plain wrong to pass a dentry to
> **inode**_permission() or security_**inode**_permission(). It's
> permissions on the inode, not the dentry.

Agreed. 

[...]

>>> 
>>> Which means this code ends up charging relative lookups twice. Even if one
>>> irons that out in the program this encourages really bad patterns.
>>> Path lookup is iterative top down. One can't just randomly walk back up and
>>> assume that's equivalent.
>> 
>> I understand that walk-up is not equivalent to walk down. But it is probably
>> accurate enough for some security policies. For example, LSM LandLock uses
>> similar logic in the file_open hook (file security/landlock/fs.c, function 
>> is_access_to_paths_allowed).
> 
> I'm not well-versed in landlock so I'll let Mickaël comment on this with
> more details but there's very important restrictions and differences
> here.
> 
> Landlock expresses security policies with file hierarchies and
> security_inode_permission() doesn't and cannot have access to that.
> 
> Landlock is subject to the same problem that the BPF is here. Namely
> that the VFS permission checking could have been done on a path walk
> completely different from the path walk that is checked when walking
> back up from security_file_open().
> 
> But because landlock works with a deny-by-default security policy this
> is ok and it takes overmounts into account etc.
> 
>> 
>> To summary my thoughts here. I think we need:
>> 
>> 1. Change security_inode_permission to take dentry instead of inode.
> 
> Sorry, no.
> 
>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>>   so that we can propagate flags from parents to children. We will need
>>   a bpf_dput as well. 
>> 3. There are pros and cons with different approaches to implement this
>>   policy (tags on directory work for all files in it). We probably need 
>>   the policy writer to decide with one to use. From BPF's POV, dget_parent
>>   is "safe", because it won't crash the system. It may encourage some bad
>>   patterns, but it appears to be required in some use cases.
> 
> You cannot just walk a path upwards and check permissions and assume
> that this is safe unless you have a clear idea what makes it safe in
> this scenario. Landlock has afaict. But so far you only have a vague
> sketch of checking permissions walking upwards and retrieving xattrs
> without any notion of the problems involved.

I am sorry for being vague with the use case here. We are trying to 
cover a few different use cases, such as sandboxing, allowlisting 
certain operations to selected binaries, prevent operation errors, etc. 
For this work, we are looking for the right building blocks to enable
these use cases. 

> If you provide a bpf_get_parent() api for userspace to consume you'll
> end up providing them with an api that is extremly easy to misuse.

Does this make sense to have higher level API that walks up the path, 
so that it takes mounts into account. It can probably be something like:

int bpf_get_parent_path(struct path *p) {
again:
    if (p->dentry == p->mnt.mnt_root) {
        follow_up(p);
        goto again;
    }
    if (unlikely(IS_ROOT(p->dentry))) {
        return PARENT_WALK_DONE;  
    }
    parent_dentry = dget_parent(p->dentry);
    dput(p->dentry);
    p->dentry = parent_dentry;
    return PARENT_WALK_NEXT; 
}

This will handle the mount. However, we cannot guarantee deny-by-default
policies like LandLock does, because this is just a building block of 
some security policies. 

Is this something we can give a try with?

Thanks,
Song


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 20:25     ` Song Liu
@ 2024-08-20  5:42       ` Song Liu
  2024-08-20  6:29       ` Al Viro
  1 sibling, 0 replies; 13+ messages in thread
From: Song Liu @ 2024-08-20  5:42 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Mickaël Salaün, Song Liu, bpf, Linux-Fsdevel,
	LKML, Kernel Team, andrii@kernel.org, eddyz87@gmail.com,
	ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

Hi Christian, 

> On Aug 19, 2024, at 1:25 PM, Song Liu <songliubraving@fb.com> wrote:
> 
> Hi Christian, 

[...]

> If you provide a bpf_get_parent() api for userspace to consume you'll
>> end up providing them with an api that is extremly easy to misuse.
> 
> Does this make sense to have higher level API that walks up the path, 
> so that it takes mounts into account. It can probably be something like:
> 
> int bpf_get_parent_path(struct path *p) {
> again:
>    if (p->dentry == p->mnt.mnt_root) {
>        follow_up(p);
>        goto again;
>    }
>    if (unlikely(IS_ROOT(p->dentry))) {
>        return PARENT_WALK_DONE;  
>    }
>    parent_dentry = dget_parent(p->dentry);
>    dput(p->dentry);
>    p->dentry = parent_dentry;
>    return PARENT_WALK_NEXT; 
> }
> 
> This will handle the mount. However, we cannot guarantee deny-by-default
> policies like LandLock does, because this is just a building block of 
> some security policies. 

I guess the above is not really clear. Here is a prototype I got. 
With the kernel diff attached below, we are able to do something
like:

SEC("lsm.s/file_open")
int BPF_PROG(test_file_open, struct file *f)
{
	/* ... */

        bpf_for_each(dentry, dentry, &f->f_path, BPF_DENTRY_ITER_TO_ROOT) {
                ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
		/* do work with the xattr in value_ptr */
        }

	/* ... */
}

With this helper, the user cannot walk the tree randomly. Instead, 
the walk has to follow some pattern, namely, TO_ROOT, TO_MNT_ROOT, 
etc. And helper makes sure the walk is safe. 

Does this solution make sense to you? 

Thanks,
Song



The kernel diff below. 
============================== 8< ===============================


diff --git c/fs/bpf_fs_kfuncs.c w/fs/bpf_fs_kfuncs.c
index 3fe9f59ef867..4b1400dec984 100644
--- c/fs/bpf_fs_kfuncs.c
+++ w/fs/bpf_fs_kfuncs.c
@@ -8,6 +8,7 @@
 #include <linux/fs.h>
 #include <linux/file.h>
 #include <linux/mm.h>
+#include <linux/namei.h>
 #include <linux/xattr.h>

 __bpf_kfunc_start_defs();
@@ -154,13 +155,91 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,

 __bpf_kfunc_end_defs();

+struct bpf_iter_dentry {
+       __u64 __opaque[3];
+} __attribute__((aligned(8)));
+
+struct bpf_iter_dentry_kern {
+       struct path path;
+       unsigned int flags;
+} __attribute__((aligned(8)));
+
+enum {
+       /* all the parent paths, until root (/) */
+       BPF_DENTRY_ITER_TO_ROOT,
+       /* all the parent paths, until mnt root */
+       BPF_DENTRY_ITER_TO_MNT_ROOT,
+};
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc int bpf_iter_dentry_new(struct bpf_iter_dentry *it,
+                                        struct path *path, unsigned int flags)
+{
+       struct bpf_iter_dentry_kern *kit = (void*)it;
+
+       BUILD_BUG_ON(sizeof(struct bpf_iter_dentry_kern) >
+                    sizeof(struct bpf_iter_dentry));
+       BUILD_BUG_ON(__alignof__(struct bpf_iter_dentry_kern) !=
+                    __alignof__(struct bpf_iter_dentry));
+
+       if (flags)
+               return -EINVAL;
+
+       switch (flags) {
+       case BPF_DENTRY_ITER_TO_ROOT:
+       case BPF_DENTRY_ITER_TO_MNT_ROOT:
+               break;
+       default:
+               return -EINVAL;
+       }
+       kit->path = *path;
+       path_get(&kit->path);
+       kit->flags = flags;
+       return 0;
+}
+
+__bpf_kfunc struct dentry *bpf_iter_dentry_next(struct bpf_iter_dentry *it)
+{
+       struct bpf_iter_dentry_kern *kit = (void*)it;
+       struct dentry *parent_dentry;
+
+       if (unlikely(IS_ROOT(kit->path.dentry)))
+               return NULL;
+
+jump_up:
+       if (kit->path.dentry == kit->path.mnt->mnt_root) {
+               if (kit->flags == BPF_DENTRY_ITER_TO_MNT_ROOT)
+                       return NULL;
+               if (follow_up(&kit->path)) {
+                       goto jump_up;
+               }
+       }
+       parent_dentry = dget_parent(kit->path.dentry);
+       dput(kit->path.dentry);
+       kit->path.dentry = parent_dentry;
+       return parent_dentry;
+}
+
+__bpf_kfunc void bpf_iter_dentry_destroy(struct bpf_iter_dentry *it)
+{
+       struct bpf_iter_dentry_kern *kit = (void*)it;
+
+       path_put(&kit->path);
+}
+
+__bpf_kfunc_end_defs();
+
 BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
 BTF_ID_FLAGS(func, bpf_get_task_exe_file,
             KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE)
 BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS)
-BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE)    /* Will fix this later */
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_dentry_new, KF_ITER_NEW | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_dentry_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_dentry_destroy, KF_ITER_DESTROY)
 BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)

 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 20:25     ` Song Liu
  2024-08-20  5:42       ` Song Liu
@ 2024-08-20  6:29       ` Al Viro
  2024-08-20  7:23         ` Song Liu
  1 sibling, 1 reply; 13+ messages in thread
From: Al Viro @ 2024-08-20  6:29 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Mickaël Salaün, Song Liu, bpf,
	Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

On Mon, Aug 19, 2024 at 08:25:38PM +0000, Song Liu wrote:

> int bpf_get_parent_path(struct path *p) {
> again:
>     if (p->dentry == p->mnt.mnt_root) {
>         follow_up(p);
>         goto again;
>     }
>     if (unlikely(IS_ROOT(p->dentry))) {
>         return PARENT_WALK_DONE;  
>     }
>     parent_dentry = dget_parent(p->dentry);
>     dput(p->dentry);
>     p->dentry = parent_dentry;
>     return PARENT_WALK_NEXT; 
> }
> 
> This will handle the mount. However, we cannot guarantee deny-by-default
> policies like LandLock does, because this is just a building block of 
> some security policies. 

You do realize that above is racy as hell, right?

Filesystem objects do get moved around.  You can, theoretically, play with
rename_lock, but that is highly antisocial.

What's more, _mounts_ can get moved around.  That is to say, there is no
such thing as stable canonical pathname of a file.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20  6:29       ` Al Viro
@ 2024-08-20  7:23         ` Song Liu
  0 siblings, 0 replies; 13+ messages in thread
From: Song Liu @ 2024-08-20  7:23 UTC (permalink / raw)
  To: Al Viro
  Cc: Song Liu, Christian Brauner, Mickaël Salaün, Song Liu,
	bpf, Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

> On Aug 19, 2024, at 11:29 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> On Mon, Aug 19, 2024 at 08:25:38PM +0000, Song Liu wrote:
> 
>> int bpf_get_parent_path(struct path *p) {
>> again:
>>    if (p->dentry == p->mnt.mnt_root) {
>>        follow_up(p);
>>        goto again;
>>    }
>>    if (unlikely(IS_ROOT(p->dentry))) {
>>        return PARENT_WALK_DONE;  
>>    }
>>    parent_dentry = dget_parent(p->dentry);
>>    dput(p->dentry);
>>    p->dentry = parent_dentry;
>>    return PARENT_WALK_NEXT; 
>> }
>> 
>> This will handle the mount. However, we cannot guarantee deny-by-default
>> policies like LandLock does, because this is just a building block of 
>> some security policies.
> 
> You do realize that above is racy as hell, right?
> 
> Filesystem objects do get moved around.  You can, theoretically, play with
> rename_lock, but that is highly antisocial.

I do understand filesystem objects may get moved around. However, I am not
sure whether we have to avoid all the race conditions (and whether it is
really possible to avoid all race conditions). 

> What's more, _mounts_ can get moved around.  That is to say, there is no
> such thing as stable canonical pathname of a file.

Maybe I should really step back and ask for high level suggestions. 

We are hoping to tag all files in a directory with xattr (or something
else) on the directory. For example, a xattr "Do_not_rename" on /usr 
should block rename of all files inside /usr. 

Our original idea is to start from security_file_open() hook, and walk 
up the tree (/usr/bin/gcc => /usr/bin => /usr). However, this appears
to be wasteful and unreliable, and Christian suggested we should use a 
combination of security_inode_permission and security_file_open. I 
tried to build something on this direction, and hits a few issues:

1. Getting xattr from security_inode_permission() is not easy. Some
   FS requires dentry to get xattr. 
2. Finding parent from security_inode_permission() is also tricky.
   (maybe as trick as doing dget_parent() from security_file_open?)
   We need tag on /usr to work on /usr/bin. But how do we know /usr
   is /usr/bin's parent?

For the original goal… tag all files in a directory with xattr on 
the directory, is it possible at all? If not, we will go back and
implement something to tag all the files one at a time. If it is
indeed possible, what's the right way to do it, and what are the
race conditions we need to avoid and to accept?

Comments and suggestions are highly appreciated. 

Thanks,
Song

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-08-23 10:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20240729-zollfrei-verteidigen-cf359eb36601@brauner>
2024-08-19  7:18 ` [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr Song Liu
2024-08-19 11:16   ` Christian Brauner
2024-08-19 13:12     ` Mickaël Salaün
2024-08-19 20:35       ` Song Liu
2024-08-20 12:45         ` Mickaël Salaün
2024-08-20 17:42           ` Song Liu
2024-08-20 21:11             ` Paul Moore
2024-08-21  3:43               ` Song Liu
2024-08-23 10:38                 ` Mickaël Salaün
2024-08-19 20:25     ` Song Liu
2024-08-20  5:42       ` Song Liu
2024-08-20  6:29       ` Al Viro
2024-08-20  7:23         ` Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox