From mboxrd@z Thu Jan 1 00:00:00 1970 From: Song Liu Subject: Re: [PATCH bpf-next 1/2] bpf: enable stackmap with build_id in nmi context Date: Wed, 2 May 2018 16:48:32 +0000 Message-ID: <30F32ACF-5155-459B-BD47-5060CCA52788@fb.com> References: <20180502000220.2585320-1-songliubraving@fb.com> <20180502092109.GI12180@hirez.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: Networking , Kernel Team , Alexei Starovoitov , Daniel Borkmann To: Peter Zijlstra Return-path: Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:43482 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751100AbeEBQtN (ORCPT ); Wed, 2 May 2018 12:49:13 -0400 In-Reply-To: <20180502092109.GI12180@hirez.programming.kicks-ass.net> Content-Language: en-US Content-ID: <7C35E04CBB925640A77901CE85BB2730@namprd15.prod.outlook.com> Sender: netdev-owner@vger.kernel.org List-ID: > On May 2, 2018, at 2:21 AM, Peter Zijlstra wrote: >=20 > On Tue, May 01, 2018 at 05:02:19PM -0700, Song Liu wrote: >> @@ -267,17 +285,27 @@ static void stack_map_get_build_id_offset(struct b= pf_stack_build_id *id_offs, >> { >> int i; >> struct vm_area_struct *vma; >> + bool in_nmi_ctx =3D in_nmi(); >> + bool irq_work_busy =3D false; >> + struct stack_map_irq_work *work; >> + >> + if (in_nmi_ctx) { >> + work =3D this_cpu_ptr(&irq_work); >> + if (work->sem) >> + /* cannot queue more up_read, fallback */ >> + irq_work_busy =3D true; >> + } >>=20 >> /* >> + * We cannot do up_read() in nmi context. To do build_id lookup >> + * in nmi context, we need to run up_read() in irq_work. We use >> + * a percpu variable to do the irq_work. If the irq_work is >> + * already used by another lookup, we fall back to report ips. >> * >> * Same fallback is used for kernel stack (!user) on a stackmap >> * with build_id. >> */ >> + if (!user || !current || !current->mm || irq_work_busy || >> down_read_trylock(¤t->mm->mmap_sem) =3D=3D 0) { >> /* cannot access current->mm, fall back to ips */ >> for (i =3D 0; i < trace_nr; i++) { >> @@ -299,7 +327,13 @@ static void stack_map_get_build_id_offset(struct bp= f_stack_build_id *id_offs, >> - vma->vm_start; >> id_offs[i].status =3D BPF_STACK_BUILD_ID_VALID; >> } >> + >> + if (!in_nmi_ctx) >> + up_read(¤t->mm->mmap_sem); >> + else { >> + work->sem =3D ¤t->mm->mmap_sem; >> + irq_work_queue(&work->work); >> + } >> } >=20 > This is disguisting.. :-) >=20 > It's broken though, I've bet you've never actually ran this with lockdep > enabled for example. I am not following here. I just run the new selftest with CONFIG_LOCKDEP on= ,=20 and got no warning for this.=20 > Also, you set work->sem before you do trylock, if the trylock fails you > return early and keep work->sem set, which will thereafter always result > in irq_work_busy. work->sem was set after down_read_trylock(). I guess you misread the patch? Thanks, Song=