All of lore.kernel.org
 help / color / mirror / Atom feed
From: Carlos Antonio Neira Bustos <cneirabustos@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Networking <netdev@vger.kernel.org>, Yonghong Song <yhs@fb.com>,
	ebiederm@xmission.com, Jesper Dangaard Brouer <brouer@redhat.com>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid
Date: Wed, 9 Oct 2019 14:45:38 -0300	[thread overview]
Message-ID: <20191009174538.GB12351@frodo.byteswizards.com> (raw)
In-Reply-To: <CAEf4BzYf77BxVy9bNBhW5SFA7nkMLyt_BfDEKC1Nis8Hcc2MqA@mail.gmail.com>

On Wed, Oct 09, 2019 at 09:14:42AM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2019 at 8:27 AM Carlos Neira <cneirabustos@gmail.com> wrote:
> >
> > New bpf helper bpf_get_ns_current_pid_tgid,
> > This helper will return pid and tgid from current task
> > which namespace matches dev_t and inode number provided,
> > this will allows us to instrument a process inside a container.
> >
> > Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > ---
> >  include/linux/bpf.h      |  1 +
> >  include/uapi/linux/bpf.h | 22 +++++++++++++++++++-
> >  kernel/bpf/core.c        |  1 +
> >  kernel/bpf/helpers.c     | 43 ++++++++++++++++++++++++++++++++++++++++
> >  kernel/trace/bpf_trace.c |  2 ++
> >  5 files changed, 68 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 5b9d22338606..231001475504 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
> >  extern const struct bpf_func_proto bpf_strtol_proto;
> >  extern const struct bpf_func_proto bpf_strtoul_proto;
> >  extern const struct bpf_func_proto bpf_tcp_sock_proto;
> > +extern const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto;
> >
> >  /* Shared helpers among cBPF and eBPF. */
> >  void bpf_user_rnd_init_once(void);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 77c6be96d676..6ad3f2abf00d 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2750,6 +2750,19 @@ union bpf_attr {
> >   *             **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
> >   *
> >   *             **-EPROTONOSUPPORT** IP packet version is not 4 or 6
> > + *
> > + * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)
> 
> Should be:
> 
> struct bpf_pidns_info *nsdata
> 
> > + *     Return
> > + *             0 on success, values for pid and tgid from nsinfo will be as seen
> > + *             from the namespace that matches dev and inum from nsinfo.
> 
> I think its cleaner to have a Description section, explaining that it
> will return pid/tgid in bpf_pidns_info, and then describe exit codes
> in Return section.
> 
> > + *
> > + *             On failure, the returned value is one of the following:
> > + *
> > + *             **-EINVAL** if dev and inum supplied don't match dev_t and inode number
> > + *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
> > + *
> > + *             **-ENOENT** if /proc/self/ns does not exists.
> > + *
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)          \
> >         FN(unspec),                     \
> > @@ -2862,7 +2875,8 @@ union bpf_attr {
> >         FN(sk_storage_get),             \
> >         FN(sk_storage_delete),          \
> >         FN(send_signal),                \
> > -       FN(tcp_gen_syncookie),
> > +       FN(tcp_gen_syncookie),          \
> > +       FN(get_ns_current_pid_tgid),
> >
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >   * function eBPF program intends to call
> > @@ -3613,4 +3627,10 @@ struct bpf_sockopt {
> >         __s32   retval;
> >  };
> >
> > +struct bpf_pidns_info {
> > +       __u64 dev;
> > +       __u64 inum;
> 
> seems like conventionally this should be named "ino", this is what
> ns_match calls it, so let's stay consistent.
> 
> > +       __u32 pid;
> > +       __u32 tgid;
> > +};
> 
> So it seems like dev and inum are treated as input parameters, while
> pid/tgid is output parameter, right? Wouldn't it be cleaner to have
> dev and inum as explicit arguments into bpf_get_ns_current_pid_tgid()?
> What's also not great, is that on failure you'll memset this entire
> struct to zero, and user will lose its dev/inum. So in practice you'll
> be keeping dev/inum somewhere else, then constructing and filling in
> this bpf_pidns_info struct every time you need to invoke
> bpf_get_ns_current_pid_tgid.
> 
> Maybe it was discussed already, but IMO feels cleaner to have only
> pid/tgid in bpf_pidns_info and pass dev/inum as direct arguments.
> 
> >  #endif /* _UAPI__LINUX_BPF_H__ */
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 66088a9e9b9e..b2fd5358f472 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -2042,6 +2042,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
> >  const struct bpf_func_proto bpf_get_current_comm_proto __weak;
> >  const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
> >  const struct bpf_func_proto bpf_get_local_storage_proto __weak;
> > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
> >
> >  const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
> >  {
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 5e28718928ca..78a1ce7726aa 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -11,6 +11,8 @@
> >  #include <linux/uidgid.h>
> >  #include <linux/filter.h>
> >  #include <linux/ctype.h>
> > +#include <linux/pid_namespace.h>
> > +#include <linux/proc_ns.h>
> >
> >  #include "../../lib/kstrtox.h"
> >
> > @@ -487,3 +489,44 @@ const struct bpf_func_proto bpf_strtoul_proto = {
> >         .arg4_type      = ARG_PTR_TO_LONG,
> >  };
> >  #endif
> > +
> > +BPF_CALL_2(bpf_get_ns_current_pid_tgid, struct bpf_pidns_info *, nsdata, u32,
> > +       size)
> > +{
> > +       struct task_struct *task = current;
> > +       struct pid_namespace *pidns;
> > +       int err = -EINVAL;
> > +
> > +       if (unlikely(size != sizeof(struct bpf_pidns_info)))
> > +               goto clear;
> > +
> > +       if ((u64)(dev_t)nsdata->dev != nsdata->dev)
> 
> this seems unlikely() as well :)
> 
> > +               goto clear;
> > +
> > +       if (unlikely(!task))
> > +               goto clear;
> > +
> > +       pidns = task_active_pid_ns(task);
> > +       if (unlikely(!pidns)) {
> > +               err = -ENOENT;
> > +               goto clear;
> > +       }
> > +
> > +       if (!ns_match(&pidns->ns, (dev_t)nsdata->dev, nsdata->inum))
> > +               goto clear;
> > +
> > +       nsdata->pid = task_pid_nr_ns(task, pidns);
> > +       nsdata->tgid = task_tgid_nr_ns(task, pidns);
> > +       return 0;
> > +clear:
> > +       memset((void *)nsdata, 0, (size_t) size);
> > +       return err;
> > +}
> > +
> > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto = {
> > +       .func           = bpf_get_ns_current_pid_tgid,
> > +       .gpl_only       = false,
> > +       .ret_type       = RET_INTEGER,
> > +       .arg1_type      = ARG_PTR_TO_UNINIT_MEM,
> 
> So this is a lie, you do expect part of that struct to be initialized.
> One more reason to just split off dev/inum(ino?).
> 
> 
> > +       .arg2_type      = ARG_CONST_SIZE,
> > +};
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 44bd08f2443b..32331a1dcb6d 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -735,6 +735,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >  #endif
> >         case BPF_FUNC_send_signal:
> >                 return &bpf_send_signal_proto;
> > +       case BPF_FUNC_get_ns_current_pid_tgid:
> > +               return &bpf_get_ns_current_pid_tgid_proto;
> >         default:
> >                 return NULL;
> >         }
> > --
> > 2.20.1
> >
Thanks for reviewing this, I'll make the changes you suggest.
I'm not sure about removing dev and ino from struct bpf_pidns_info, I
think is useful to know to which ns pid/tgid belong to using dev and ino
to figure it out. Maybe in the future we could filter bpf_pidns_info
structs by dev/ino, just an idea.  

Bests

  reply	other threads:[~2019-10-09 17:45 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-09 15:26 [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task Carlos Neira
2019-10-09 15:26 ` [PATCH v13 1/4] fs/nsfs.c: added ns_match Carlos Neira
2019-10-09 15:26 ` [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid Carlos Neira
2019-10-09 16:14   ` Andrii Nakryiko
2019-10-09 17:45     ` Carlos Antonio Neira Bustos [this message]
2019-10-09 19:50       ` Andrii Nakryiko
2019-10-09 20:30         ` Carlos Antonio Neira Bustos
2019-10-09 15:26 ` [PATCH v13 3/4] tools: Added bpf_get_ns_current_pid_tgid helper Carlos Neira
2019-10-09 15:26 ` [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
2019-10-09 16:26   ` Andrii Nakryiko
2019-10-09 16:44     ` Carlos Antonio Neira Bustos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191009174538.GB12351@frodo.byteswizards.com \
    --to=cneirabustos@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=netdev@vger.kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.