From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1ab8fe0d-de92-49be-b10b-ebb5c7f5573a@efficios.com>
Date: Tue, 22 Oct 2024 16:04:49 -0400
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [RFC PATCH] tracing: Fix syscall tracepoint use-after-free
To: Andrii Nakryiko
Cc: Jordan Rife, Steven Rostedt, linux-kernel@vger.kernel.org,
 syzbot+b390c8062d8387b6272a@syzkaller.appspotmail.com,
 Michael Jeanson, Masami Hiramatsu, Peter Zijlstra,
 Alexei Starovoitov, Yonghong Song, "Paul E . McKenney",
 Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
 Alexander Shishkin, Namhyung Kim, bpf@vger.kernel.org,
 Joel Fernandes
References: <20241022151804.284424-1-mathieu.desnoyers@efficios.com>
 <3362d414-4d6f-43a7-80af-1c72c5e66d70@efficios.com>
From: Mathieu Desnoyers
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 2024-10-22 15:53, Andrii Nakryiko wrote:
> On Tue, Oct 22, 2024 at 10:55 AM Mathieu Desnoyers
> wrote:
>>
>> On 2024-10-22 12:14, Jordan Rife wrote:
>>> I assume this patch isn't meant to fix the related issues with freeing
>>> BPF programs/links with call_rcu?
>>
>> No, indeed.
>> I notice that bpf_link_free() uses a prog->sleepable flag to
>> choose between:
>>
>>         if (sleepable)
>>                 call_rcu_tasks_trace(&link->rcu, bpf_link_defer_dealloc_mult_rcu_gp);
>>         else
>>                 call_rcu(&link->rcu, bpf_link_defer_dealloc_rcu_gp);
>>
>> But the faultable syscall tracepoint series does not require syscall programs
>> to be sleepable. So some changes may be needed on the ebpf side there.
>
> Your fix now adds a chain of call_rcu -> call_rcu_tasks_trace ->
> kfree, which should work regardless of sleepable/non-sleepable. For
> the BPF-side, yes, we do different things depending on prog->sleepable
> (adding extra call_rcu_tasks_trace for sleepable, while still keeping
> call_rcu in the chain), so the BPF side should be good, I think.
>
>>
>>>
>>> On the BPF side I think there needs to be some smarter handling of
>>> when to use call_rcu or call_rcu_tasks_trace to free links/programs
>>> based on whether or not the program type can be executed in this
>>> context. Right now call_rcu_tasks_trace is used if the program is
>>> sleepable, but that isn't necessarily the case here. Off the top of my
>>> head this would be BPF_PROG_TYPE_RAW_TRACEPOINT and
>>> BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, but may extend to
>>> BPF_PROG_TYPE_TRACEPOINT? I'll let some of the BPF folks chime in
>>> here, as I'm not entirely sure.
>>
>
> From the BPF standpoint, as of right now, neither of RAW_TRACEPOINT or
> TRACEPOINT programs are sleepable. So a single RCU grace period is
> fine. But even if they were (and we'll allow that later on), we handle
> sleepable programs with the same call_rcu_tasks_trace -> call_rcu
> chain.

Good points, in this commit:

commit 4aadde89d8 ("tracing/bpf: disable preemption in syscall probe")

I took care to disable preemption around use of the bpf program
attached to a syscall tracepoint, which makes this change a no-op
from the tracers' perspective.
It's only when you decide to remove this preempt-off and allow
syscall tracepoints to sleep in bpf that you'll need to tweak that.

>
> That's just to say that I don't think that we need any BPF-specific
> fix beyond what Mathieu is doing in this patch, so:
>
> Acked-by: Andrii Nakryiko

Thanks!

Mathieu

>
>
>> A big hammer solution would be to make all grace periods waited for after
>> a bpf tracepoint probe unregister chain call_rcu and call_rcu_tasks_trace.
>>
>> Else, if we properly tag all programs attached to syscall tracepoints as
>> sleepable, then keeping the call_rcu_tasks_trace() only for those would
>> work.
>>
>> Thanks,
>>
>> Mathieu
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> https://www.efficios.com
>>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
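
For readers following the thread, the call_rcu -> call_rcu_tasks_trace -> kfree
chain mentioned above can be modeled in userspace. This is only a rough sketch
under stated assumptions, not the code from the patch: struct old_probes,
release_old_probes(), after_rcu_gp(), free_old_probes() and the mock_* helpers
are all hypothetical names, and the mock helpers invoke their callback
immediately, pretending the corresponding grace period has already elapsed,
whereas the kernel's call_rcu() and call_rcu_tasks_trace() queue the callback
and return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head (illustration only). */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical object that must outlive readers of both RCU flavors. */
struct old_probes {
	struct rcu_head rcu;
	int id;
};

/* Records the order in which the stages ran; room for 3 stages plus NUL. */
static char stage_log[4];
static int stage_count;

static void log_stage(char c)
{
	assert(stage_count < 3);
	stage_log[stage_count++] = c;
}

/*
 * ASSUMPTION: these mocks call the callback synchronously, as if the
 * grace period were already over. They only model the ordering.
 */
static void mock_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *))
{
	log_stage('R');		/* regular RCU grace period */
	func(head);
}

static void mock_call_rcu_tasks_trace(struct rcu_head *head,
				      void (*func)(struct rcu_head *))
{
	log_stage('T');		/* tasks-trace RCU grace period */
	func(head);
}

/* Final stage: both grace periods are over, so the object can be freed. */
static void free_old_probes(struct rcu_head *head)
{
	struct old_probes *p = container_of(head, struct old_probes, rcu);

	log_stage('F');
	free(p);
}

/* Middle stage: the regular grace period is over, chain the second one. */
static void after_rcu_gp(struct rcu_head *head)
{
	mock_call_rcu_tasks_trace(head, free_old_probes);
}

/* Entry point: start the two-stage chain for a retired object. */
static void release_old_probes(struct old_probes *p)
{
	mock_call_rcu(&p->rcu, after_rcu_gp);
}
```

Calling release_old_probes() once on a heap-allocated struct old_probes leaves
stage_log holding "RTF": the free only happens after both stages have run,
which is why such a chain is safe whether or not the attached program sleeps.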