public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* bpf_task_storage improvement question
@ 2024-12-02 12:38 Christian Brauner
  2024-12-02 19:37 ` Martin KaFai Lau
  0 siblings, 1 reply; 2+ messages in thread
From: Christian Brauner @ 2024-12-02 12:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Song Liu, Andrii Nakryiko, Martin KaFai Lau,
	KP Singh
  Cc: linux-fsdevel, bpf

Hey,

I just had to take a quick look at kernel/bpf/bpf_task_storage.c and
realized that you're doing:


	fd = *(int *)key;
	pid = pidfd_get_pid(fd, &f_flags);

	// something something

	task = pid_task(pid, PIDTYPE_PID);

	bpf_task_storage_lock();
	// something something
	bpf_task_storage_unlock();
	put_pid(pid);

That reference count bump on struct pid seems unnecessary and I suspect
your lookup routines are supposed to be fast. So why don't you just
open-code this. Something like:

diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index bf7fa15fdcc6..dc36a33c7b6d 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -92,10 +92,12 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key)
        struct task_struct *task;
        unsigned int f_flags;
        struct pid *pid;
-       int fd, err;

-       fd = *(int *)key;
-       pid = pidfd_get_pid(fd, &f_flags);
+       CLASS(fd, f)(*(int *)key);
+       if (fd_empty(f))
+               return -EBADF;
+
+       pid = pidfd_pid(f);
        if (IS_ERR(pid))
                return ERR_CAST(pid);

@@ -104,19 +106,13 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key)
         */
        WARN_ON_ONCE(!rcu_read_lock_held());
        task = pid_task(pid, PIDTYPE_PID);
-       if (!task) {
-               err = -ENOENT;
-               goto out;
-       }
+       if (!task)
+               return ERR_PTR(-ENOENT);

        bpf_task_storage_lock();
        sdata = task_storage_lookup(task, map, true);
        bpf_task_storage_unlock();
-       put_pid(pid);
        return sdata ? sdata->data : NULL;
-out:
-       put_pid(pid);
-       return ERR_PTR(err);
 }

which avoids the reference count bumps on @pid.
It remains pinned by the pidfd anyway.

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: bpf_task_storage improvement question
  2024-12-02 12:38 bpf_task_storage improvement question Christian Brauner
@ 2024-12-02 19:37 ` Martin KaFai Lau
  0 siblings, 0 replies; 2+ messages in thread
From: Martin KaFai Lau @ 2024-12-02 19:37 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexei Starovoitov, Song Liu, Andrii Nakryiko, KP Singh,
	linux-fsdevel, bpf

On 12/2/24 4:38 AM, Christian Brauner wrote:
> Hey,
> 
> I just had to take a quick look at kernel/bpf/bpf_task_storage.c and
> realized that you're doing:
> 
> 
> 	fd = *(int *)key;
> 	pid = pidfd_get_pid(fd, &f_flags);
> 
> 	// something something
> 
> 	task = pid_task(pid, PIDTYPE_PID);
> 
> 	bpf_task_storage_lock();
> 	// something something
> 	bpf_task_storage_unlock();
> 	put_pid(pid);
> 
> That reference count bump on struct pid seems unnecessary and I suspect
> your lookup routines are supposed to be fast. So why don't you just
> open-code this. Something like:
> 
> diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
> index bf7fa15fdcc6..dc36a33c7b6d 100644
> --- a/kernel/bpf/bpf_task_storage.c
> +++ b/kernel/bpf/bpf_task_storage.c
> @@ -92,10 +92,12 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key)
>          struct task_struct *task;
>          unsigned int f_flags;
>          struct pid *pid;
> -       int fd, err;
> 
> -       fd = *(int *)key;
> -       pid = pidfd_get_pid(fd, &f_flags);
> +       CLASS(fd, f)(*(int *)key);
> +       if (fd_empty(f))
> +               return -EBADF;
> +
> +       pid = pidfd_pid(f);
>          if (IS_ERR(pid))
>                  return ERR_CAST(pid);
> 
> @@ -104,19 +106,13 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key)
>           */
>          WARN_ON_ONCE(!rcu_read_lock_held());
>          task = pid_task(pid, PIDTYPE_PID);
> -       if (!task) {
> -               err = -ENOENT;
> -               goto out;
> -       }
> +       if (!task)
> +               return ERR_PTR(-ENOENT);
> 
>          bpf_task_storage_lock();
>          sdata = task_storage_lookup(task, map, true);
>          bpf_task_storage_unlock();
> -       put_pid(pid);
>          return sdata ? sdata->data : NULL;
> -out:
> -       put_pid(pid);
> -       return ERR_PTR(err);
>   }
> 
> which avoids the reference count bumps on @pid.
> It remains pinned by the pidfd anyway.

The "bpf_pid_task_storage_lookup_elem()" is used by the syscall path which may 
be less looked at. The bpf prog uses another function "__bpf_task_storage_get()" 
which directly has a task pointer.

The change makes sense to me. A nice improvement on the syscall path. It will be 
great if you can post a patch for it. Thanks.



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-12-02 19:37 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-02 12:38 bpf_task_storage improvement question Christian Brauner
2024-12-02 19:37 ` Martin KaFai Lau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox