BPF List
 help / color / mirror / Atom feed
From: Carlos Antonio Neira Bustos <cneirabustos@gmail.com>
To: Yonghong Song <yhs@fb.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Eric Biederman <ebiederm@xmission.com>,
	"brouer@redhat.com" <brouer@redhat.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next V9 1/3] bpf: new helper to obtain namespace data from current task
Date: Tue, 3 Sep 2019 14:45:04 -0400	[thread overview]
Message-ID: <20190903184502.2vpaqnoubbr7nnf6@ebpf-metal> (raw)
In-Reply-To: <20190828210333.itwtyqa5w5egnrwm@ebpf-metal>

Hi Yonghong,

> > Yes, the samples/bpf test case can be removed.
> > Could you create a selftest with tracpoint net/netif_receive_skb, which 
> > also uses the proposed helper? net/netif_receive_skb will happen in
> > interrupt context and it should catch the issue as well if 
> > filename_lookup still get called in interrupt context.
>
For this one scenario I just created another selftest with the only difference 
that the tracepoint is /net/netif_receive_skb so this fails with -EPERM.
Is that enough?.

I have made this comment on include/uapi/linux/bpf.h, maybe is too terse?

struct bpf_pidns_info {
	__u32 dev;	/* dev_t from /proc/self/ns/pid inode */
	__u32 nsid;
	__u32 tgid;
	__u32 pid;
};

I'm only missing clearing out those questions to be ready to submit v11 of this patch.

Bests

On Wed, Aug 28, 2019 at 05:03:35PM -0400, Carlos Antonio Neira Bustos wrote:
> Thanks, I'll work on the net/netif_receive_skb selftest using this helper.
> I hope I could complete this work this week.
> 
> Bests.
> 
> On Wed, Aug 28, 2019 at 08:53:25PM +0000, Yonghong Song wrote:
> > 
> > 
> > On 8/28/19 1:39 PM, Carlos Antonio Neira Bustos wrote:
> > > Yonghong,
> > > 
> > > Thanks for the pointer, I fixed this bug, but I found another one that's triggered
> > > now the test program I included in  tools/testing/selftests/bpf/test_pidns.
> > > It's seemed that fname was not correctly setup when passing it to filename_lookup.
> > > This is fixed now and I'm doing some more testing.
> > > I think I'll remove the tests on samples/bpf as they are mostly end on -EPERM as
> > > the fix intended.
> > > Is ok to remove them and just focus to finish the self tests code?.
> > 
> > Yes, the samples/bpf test case can be removed.
> > Could you create a selftest with tracpoint net/netif_receive_skb, which 
> > also uses the proposed helper? net/netif_receive_skb will happen in
> > interrupt context and it should catch the issue as well if 
> > filename_lookup still get called in interrupt context.
> > 
> > > 
> > > Bests
> > > 
> > > On Wed, Aug 14, 2019 at 01:25:06AM -0400, carlos antonio neira bustos wrote:
> > >> Thank you very much!
> > >>
> > >> Bests
> > >>
> > >> El mié., 14 de ago. de 2019 00:50, Yonghong Song <yhs@fb.com> escribió:
> > >>
> > >>>
> > >>>
> > >>> On 8/13/19 5:56 PM, Carlos Antonio Neira Bustos wrote:
> > >>>> On Tue, Aug 13, 2019 at 11:11:14PM +0000, Yonghong Song wrote:
> > >>>>>
> > >>>>>
> > >>>>> On 8/13/19 11:47 AM, Carlos Neira wrote:
> > >>>>>> From: Carlos <cneirabustos@gmail.com>
> > >>>>>>
> > >>>>>> New bpf helper bpf_get_current_pidns_info.
> > >>>>>> This helper obtains the active namespace from current and returns
> > >>>>>> pid, tgid, device and namespace id as seen from that namespace,
> > >>>>>> allowing to instrument a process inside a container.
> > >>>>>>
> > >>>>>> Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > >>>>>> ---
> > >>>>>>     fs/internal.h            |  2 --
> > >>>>>>     fs/namei.c               |  1 -
> > >>>>>>     include/linux/bpf.h      |  1 +
> > >>>>>>     include/linux/namei.h    |  4 +++
> > >>>>>>     include/uapi/linux/bpf.h | 31 ++++++++++++++++++++++-
> > >>>>>>     kernel/bpf/core.c        |  1 +
> > >>>>>>     kernel/bpf/helpers.c     | 64
> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++
> > >>>>>>     kernel/trace/bpf_trace.c |  2 ++
> > >>>>>>     8 files changed, 102 insertions(+), 4 deletions(-)
> > >>>>>>
> > >>> [...]
> > >>>>>>
> > >>>>>> +BPF_CALL_2(bpf_get_current_pidns_info, struct bpf_pidns_info *,
> > >>> pidns_info, u32,
> > >>>>>> +    size)
> > >>>>>> +{
> > >>>>>> +   const char *pidns_path = "/proc/self/ns/pid";
> > >>>>>> +   struct pid_namespace *pidns = NULL;
> > >>>>>> +   struct filename *tmp = NULL;
> > >>>>>> +   struct inode *inode;
> > >>>>>> +   struct path kp;
> > >>>>>> +   pid_t tgid = 0;
> > >>>>>> +   pid_t pid = 0;
> > >>>>>> +   int ret;
> > >>>>>> +   int len;
> > >>>>>
> > >>>>
> > >>>> Thank you very much for catching this!.
> > >>>> Could you share how to replicate this bug?.
> > >>>
> > >>> The config is attached. just run trace_ns_info and you
> > >>> can reproduce the issue.
> > >>>
> > >>>>
> > >>>>> I am running your sample program and get the following kernel bug:
> > >>>>>
> > >>>>> ...
> > >>>>> [   26.414825] BUG: sleeping function called from invalid context at
> > >>>>> /data/users/yhs/work/net-next/fs
> > >>>>> /dcache.c:843
> > >>>>> [   26.416314] in_atomic(): 1, irqs_disabled(): 0, pid: 1911, name: ping
> > >>>>> [   26.417189] CPU: 0 PID: 1911 Comm: ping Tainted: G        W
> > >>>>> 5.3.0-rc1+ #280
> > >>>>> [   26.418182] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > >>>>> BIOS 1.9.3-1.el7.centos 04/01/2
> > >>>>> 014
> > >>>>> [   26.419393] Call Trace:
> > >>>>> [   26.419697]  <IRQ>
> > >>>>> [   26.419960]  dump_stack+0x46/0x5b
> > >>>>> [   26.420434]  ___might_sleep+0xe4/0x110
> > >>>>> [   26.420894]  dput+0x2a/0x200
> > >>>>> [   26.421265]  walk_component+0x10c/0x280
> > >>>>> [   26.421773]  link_path_walk+0x327/0x560
> > >>>>> [   26.422280]  ? proc_ns_dir_readdir+0x1a0/0x1a0
> > >>>>> [   26.422848]  ? path_init+0x232/0x330
> > >>>>> [   26.423364]  path_lookupat+0x88/0x200
> > >>>>> [   26.423808]  ? selinux_parse_skb.constprop.69+0x124/0x430
> > >>>>> [   26.424521]  filename_lookup+0xaf/0x190
> > >>>>> [   26.425031]  ? simple_attr_release+0x20/0x20
> > >>>>> [   26.425560]  bpf_get_current_pidns_info+0xfa/0x190
> > >>>>> [   26.426168]  bpf_prog_83627154cefed596+0xe66/0x1000
> > >>>>> [   26.426779]  trace_call_bpf+0xb5/0x160
> > >>>>> [   26.427317]  ? __netif_receive_skb_core+0x1/0xbb0
> > >>>>> [   26.427929]  ? __netif_receive_skb_core+0x1/0xbb0
> > >>>>> [   26.428496]  kprobe_perf_func+0x4d/0x280
> > >>>>> [   26.428986]  ? tracing_record_taskinfo_skip+0x1a/0x30
> > >>>>> [   26.429584]  ? tracing_record_taskinfo+0xe/0x80
> > >>>>> [   26.430152]  ? ttwu_do_wakeup.isra.114+0xcf/0xf0
> > >>>>> [   26.430737]  ? __netif_receive_skb_core+0x1/0xbb0
> > >>>>> [   26.431334]  ? __netif_receive_skb_core+0x5/0xbb0
> > >>>>> [   26.431930]  kprobe_ftrace_handler+0x90/0xf0
> > >>>>> [   26.432495]  ftrace_ops_assist_func+0x63/0x100
> > >>>>> [   26.433060]  0xffffffffc03180bf
> > >>>>> [   26.433471]  ? __netif_receive_skb_core+0x1/0xbb0
> > >>>>> ...
> > >>>>>
> > >>>>> To prevent we are running in arbitrary task (e.g., idle task)
> > >>>>> context which may introduce sleeping issues, the following
> > >>>>> probably appropriate:
> > >>>>>
> > >>>>>           if (in_nmi() || in_softirq())
> > >>>>>                   return -EPERM;
> > >>>>>
> > >>>>> Anyway, if in nmi or softirq, the namespace and pid/tgid
> > >>>>> we get may be just accidentally associated with the bpf running
> > >>>>> context, but it could be in a different context. So such info
> > >>>>> is not reliable any way.
> > >>>>>
> > >>>>>> +
> > >>>>>> +   if (unlikely(size != sizeof(struct bpf_pidns_info)))
> > >>>>>> +           return -EINVAL;
> > >>>>>> +   pidns = task_active_pid_ns(current);
> > >>> [...]
> > >>>

  reply	other threads:[~2019-09-03 18:45 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-13 18:47 [PATCH bpf-next V9 0/3] BPF: New helper to obtain namespace data from current task Carlos Neira
2019-08-13 18:47 ` [PATCH bpf-next V9 1/3] bpf: new " Carlos Neira
2019-08-13 22:35   ` Yonghong Song
2019-08-20 15:10     ` Carlos Antonio Neira Bustos
2019-08-20 17:29       ` Yonghong Song
2019-08-13 23:11   ` Yonghong Song
2019-08-13 23:51     ` [Potential Spoof] " Yonghong Song
2019-08-14  0:56     ` Carlos Antonio Neira Bustos
     [not found]       ` <9a2cacad-b79f-5d39-6d62-bb48cbaaac07@fb.com>
     [not found]         ` <CACiB22jyN9=0ATWWE+x=BoWD6u+8KO+MvBfsFQmcNfkmANb2_w@mail.gmail.com>
2019-08-28 20:39           ` Carlos Antonio Neira Bustos
2019-08-28 20:53             ` Yonghong Song
2019-08-28 21:03               ` Carlos Antonio Neira Bustos
2019-09-03 18:45                 ` Carlos Antonio Neira Bustos [this message]
2019-09-03 20:36                   ` Yonghong Song
2019-08-13 18:47 ` [PATCH bpf-next V9 2/3] samples/bpf: added sample code for bpf_get_current_pidns_info Carlos Neira
2019-08-13 18:47 ` [PATCH bpf-next V9 3/3] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
2019-08-13 23:19   ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190903184502.2vpaqnoubbr7nnf6@ebpf-metal \
    --to=cneirabustos@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=netdev@vger.kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox