BPF List
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@meta.com>
To: "Dr. Greg" <greg@enjellic.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>,
	Song Liu <songliubraving@meta.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"jack@suse.cz" <jack@suse.cz>,
	"brauner@kernel.org" <brauner@kernel.org>,
	Song Liu <song@kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-security-module@vger.kernel.org"
	<linux-security-module@vger.kernel.org>,
	Kernel Team <kernel-team@meta.com>,
	"andrii@kernel.org" <andrii@kernel.org>,
	"eddyz87@gmail.com" <eddyz87@gmail.com>,
	"ast@kernel.org" <ast@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"martin.lau@linux.dev" <martin.lau@linux.dev>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"kpsingh@kernel.org" <kpsingh@kernel.org>,
	"mattbobrowski@google.com" <mattbobrowski@google.com>,
	"amir73il@gmail.com" <amir73il@gmail.com>,
	"repnop@google.com" <repnop@google.com>,
	"jlayton@kernel.org" <jlayton@kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	"mic@digikod.net" <mic@digikod.net>,
	"gnoack@google.com" <gnoack@google.com>
Subject: Re: [PATCH bpf-next 0/4] Make inode storage available to tracing prog
Date: Thu, 21 Nov 2024 08:28:05 +0000	[thread overview]
Message-ID: <28FEFAE6-ABEE-454C-AF59-8491FAB08E77@fb.com> (raw)
In-Reply-To: <20241120165425.GA1723@wind.enjellic.com>

Hi Dr. Greg,

Thanks for your input!

> On Nov 20, 2024, at 8:54 AM, Dr. Greg <greg@enjellic.com> wrote:
> 
> On Tue, Nov 19, 2024 at 10:14:29AM -0800, Casey Schaufler wrote:

[...]

> 
>>> 2.) Implement key/value mapping for inode specific storage.
>>> 
>>> The key would be a sub-system specific numeric value that returns a
>>> pointer the sub-system uses to manage its inode specific memory for a
>>> particular inode.
>>> 
>>> A participating sub-system in turn uses its identifier to register an
>>> inode specific pointer for its sub-system.
>>> 
>>> This strategy loses O(1) lookup complexity but reduces total memory
>>> consumption and only imposes memory costs for inodes when a sub-system
>>> desires to use inode specific storage.
> 
>> SELinux and Smack use an inode blob for every inode. The performance
>> regression boggles the mind. Not to mention the additional
>> complexity of managing the memory.
> 
> I guess we would have to measure the performance impacts to understand
> their level of mind boggliness.
> 
> My first thought is that we hear a huge amount of fanfare about BPF
> being a game changer for tracing and network monitoring.  Given
> current networking speeds, if its ability to manage storage needed for
> it purposes are truely abysmal the industry wouldn't be finding the
> technology useful.
> 
> Beyond that.
> 
> As I noted above, the LSM could be an independent subscriber.  The
> pointer to register would come from the the kmem_cache allocator as it
> does now, so that cost is idempotent with the current implementation.
> The pointer registration would also be a single instance cost.
> 
> So the primary cost differential over the common arena model will be
> the complexity costs associated with lookups in a red/black tree, if
> we used the old IMA integrity cache as an example implementation.
> 
> As I noted above, these per inode local storage structures are complex
> in of themselves, including lists and locks.  If touching an inode
> involves locking and walking lists and the like it would seem that
> those performance impacts would quickly swamp an r/b lookup cost.

bpf local storage is designed to be an arena like solution that works
for multiple bpf maps (and we don't know how many of maps we need 
ahead of time). Therefore, we may end up doing what you suggested 
earlier: every LSM should use bpf inode storage. ;) I am only 90%
kidding. 

> 
>>> Approach 2 requires the introduction of generic infrastructure that
>>> allows an inode's key/value mappings to be located, presumably based
>>> on the inode's pointer value.  We could probably just resurrect the
>>> old IMA iint code for this purpose.
>>> 
>>> In the end it comes down to a rather standard trade-off in this
>>> business, memory vs. execution cost.
>>> 
>>> We would posit that option 2 is the only viable scheme if the design
>>> metric is overall good for the Linux kernel eco-system.
> 
>> No. Really, no. You need look no further than secmarks to understand
>> how a key based blob allocation scheme leads to tears. Keys are fine
>> in the case where use of data is sparse. They have no place when data
>> use is the norm.
> 
> Then it would seem that we need to get everyone to agree that we can
> get by with using two pointers in struct inode.  One for uses best
> served by common arena allocation and one for a key/pointer mapping,
> and then convert the sub-systems accordingly.
> 
> Or alternately, getting everyone to agree that allocating a mininum of
> eight additional bytes for every subscriber to private inode data
> isn't the end of the world, even if use of the resource is sparse.

Christian suggested we can use an inode_addon structure, which is 
similar to this idea. It won't work well in all contexts, though. 
So it is not as good as other bpf local storage (task, sock, cgroup). 

Thanks,
Song

> 
> Of course, experience would suggest, that getting everyone in this
> community to agree on something is roughly akin to throwing a hand
> grenade into a chicken coop with an expectation that all of the
> chickens will fly out in a uniform flock formation.
> 
> As always,
> Dr. Greg
> 
> The Quixote Project - Flailing at the Travails of Cybersecurity
>              https://github.com/Quixote-Project


  reply	other threads:[~2024-11-21  8:28 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-12  8:25 [PATCH bpf-next 0/4] Make inode storage available to tracing prog Song Liu
2024-11-12  8:25 ` [PATCH bpf-next 1/4] bpf: lsm: Remove hook to bpf_task_storage_free Song Liu
2024-11-12  8:25 ` [PATCH bpf-next 2/4] bpf: Make bpf inode storage available to tracing program Song Liu
2024-11-13 10:19   ` Christian Brauner
2024-11-13 14:15     ` Song Liu
2024-11-13 18:29       ` Casey Schaufler
2024-11-13 19:00         ` Song Liu
2024-11-21  9:04       ` Christian Brauner
2024-11-14 21:11     ` Song Liu
2024-11-15 11:19       ` Jan Kara
2024-11-15 17:35         ` Song Liu
2024-11-19 14:21           ` Jeff Layton
2024-11-19 15:25             ` Amir Goldstein
2024-11-19 15:30               ` Amir Goldstein
2024-11-19 21:53                 ` Song Liu
2024-11-20  9:19                   ` Amir Goldstein
2024-11-20  9:28                   ` Christian Brauner
2024-11-20 11:19                     ` Amir Goldstein
2024-11-21  8:43                       ` Christian Brauner
2024-11-21 13:48                       ` Jeff Layton
2024-11-21  8:08                     ` Song Liu
2024-11-21  9:14         ` Christian Brauner
2024-11-23  0:08           ` Alexei Starovoitov
2024-11-12  8:25 ` [PATCH bpf-next 3/4] bpf: Add recursion avoid logic for inode storage Song Liu
2024-11-12  8:25 ` [PATCH bpf-next 3/4] bpf: Add recursion prevention " Song Liu
2024-11-12  8:25 ` [PATCH bpf-next 4/4] selftest/bpf: Add test for inode local storage recursion Song Liu
2024-11-12  8:26 ` [PATCH bpf-next 4/4] selftest/bpf: Test inode local storage recursion prevention Song Liu
2024-11-12  8:35 ` [PATCH bpf-next 0/4] Make inode storage available to tracing prog Song Liu
2024-11-12 18:09 ` Casey Schaufler
2024-11-12 18:44   ` Song Liu
2024-11-13  1:10     ` Casey Schaufler
2024-11-13  1:37       ` Song Liu
2024-11-13 18:06         ` Casey Schaufler
2024-11-13 18:57           ` Song Liu
2024-11-14 16:36             ` Dr. Greg
2024-11-14 17:29               ` Casey Schaufler
2024-11-14 18:08                 ` Song Liu
2024-11-14 21:49                   ` James Bottomley
2024-11-14 22:30                     ` Song Liu
2024-11-17 22:59                     ` Song Liu
2024-11-19 12:27                       ` Dr. Greg
2024-11-19 18:14                         ` Casey Schaufler
2024-11-19 22:35                           ` Song Liu
2024-11-20 16:54                           ` Dr. Greg
2024-11-21  8:28                             ` Song Liu [this message]
2024-11-21 16:02                               ` Dr. Greg
2024-11-21 18:11                                 ` Casey Schaufler
2024-11-23 17:01                                   ` Dr. Greg
2024-11-25 20:49                                     ` Casey Schaufler
2024-11-21 17:47                               ` Casey Schaufler
2024-11-21 18:28                                 ` Song Liu
2024-11-23 19:11                     ` Paul Moore
2024-11-14 17:51               ` Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=28FEFAE6-ABEE-454C-AF59-8491FAB08E77@fb.com \
    --to=songliubraving@meta.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=amir73il@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=casey@schaufler-ca.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=gnoack@google.com \
    --cc=greg@enjellic.com \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@meta.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mattbobrowski@google.com \
    --cc=mic@digikod.net \
    --cc=repnop@google.com \
    --cc=song@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox