Date: Sun, 26 Apr 2026 14:31:21 +0800
X-Mailing-List: netdev@vger.kernel.org
Subject: Re: [PATCH bpf] bpf, sockmap: zero-initialize pages allocated in bpf_msg_push_data
To: Weiming Shi, Jiayuan Chen
Cc: Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
 Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
 "David S . Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 John Fastabend, Stanislav Fomichev, Song Liu, Yonghong Song,
 Jiri Olsa, Simon Horman, bpf@vger.kernel.org, netdev@vger.kernel.org,
 Xiang Mei, Xinyu Ma
References: <20260424190310.1520555-2-bestswngs@gmail.com>
From: Jiayuan Chen

On 4/26/26 1:59 AM, Weiming Shi wrote:
> On 26-04-25 11:17, Jiayuan Chen wrote:
>> On 4/25/26 3:03 AM, Weiming Shi wrote:
>>> bpf_msg_push_data() allocates pages via alloc_pages() without
>>> __GFP_ZERO. In the non-copy path, the entire page of uninitialized
>>> heap content is added directly to the sk_msg scatterlist, which is
>>> then transmitted over TCP to userspace via tcp_bpf_push(). In the
>>> copy path, a gap of len bytes between the front and back memcpy
>>> regions is similarly left uninitialized.
>>>
>>> This leads to a kernel heap information leak: stale page content
>>> including kernel pointers from the direct-map and vmemmap regions
>>> is transmitted to userspace, which can be used to defeat KASLR.
>>>
>>> Add __GFP_ZERO to the alloc_pages() call to ensure the allocated
>>> page is always zeroed before it enters the scatterlist.
>>
>>
>> As the helper's own documentation says:
>>
>>     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg it may
>>     want to insert metadata or options into the msg. This can later be
>>     read and used by any of the lower layer BPF hooks.
>>
>> The inserted region is meant to be written by the BPF program -- that's the
>> entire point of calling push.
>>
>> If the program doesn't fill it, the push has no purpose to begin with.
>>
>>
>> Isn't the uninitialized content a bug in the BPF program rather than
>> something the kernel helper should paper over?
>>
> Hi, Thanks for the review.
>
> In my testing a process with only CAP_BPF + CAP_NET_ADMIN can receive
> kernel heap and vmalloc pointers through recv() from the uninitialized
> pushed region. The uninitialized memory contains critical kernel metadata
> such as direct-map and vmalloc pointers, which breaks KASLR.
>
> Kernels without CONFIG_INIT_ON_ALLOC_DEFAULT_ON (e.g. RHEL) are
> directly affected; the leak is not masked by any mitigation.
>
> Thanks,
> Weiming Shi
>

Reviewed-by: Jiayuan Chen

Previously I thought this was the same as bpf_xdp_adjust_head /
bpf_xdp_adjust_meta, but since this function allocates a page itself,
I believe the cost of the __GFP_ZERO flag is negligible.

One more thing: in the future, more and more AI systems will complain
about this kind of problem. I believe the change is worth it.
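
For anyone following the thread, the copy-path gap described in the quoted
commit message can be modeled in plain userspace C. This is only a sketch
under stated assumptions, not the kernel code: model_push_data(),
model_leaked_bytes() and STALE_BYTE are hypothetical names made up for
illustration, a malloc() buffer stands in for the page, and a 0xAA fill
stands in for whatever stale content a real page might carry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STALE_BYTE 0xAA /* hypothetical stand-in for leftover page content */

/*
 * Hypothetical userspace model of the copy path: build a new buffer that
 * holds src[0..start), a gap of len bytes, then src[start..src_len).
 * With zero_fill == 0 the buffer is pre-filled with STALE_BYTE to mimic a
 * page that still carries old data. With zero_fill != 0 it is zeroed
 * first, which models the effect of __GFP_ZERO on a page allocation.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static unsigned char *model_push_data(const unsigned char *src, size_t src_len,
                                      size_t start, size_t len, int zero_fill)
{
    unsigned char *buf;

    assert(start <= src_len);

    buf = malloc(src_len + len);
    if (!buf)
        return NULL;

    memset(buf, zero_fill ? 0 : STALE_BYTE, src_len + len);

    /* Front region: bytes before the insertion point. */
    memcpy(buf, src, start);
    /* Back region: bytes after the insertion point, shifted by len. */
    memcpy(buf + start + len, src + start, src_len - start);

    /* Nothing writes buf[start .. start + len): that is the gap. */
    return buf;
}

/* Count gap bytes that are nonzero, i.e. that would reach the receiver. */
static size_t model_leaked_bytes(const unsigned char *buf, size_t start,
                                 size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if (buf[start + i] != 0)
            n++;
    return n;
}
```

Calling model_push_data() on a 10-byte message with start = 5 and len = 4
leaves four STALE_BYTE bytes in the gap when zero_fill is 0 and none when
it is 1, while the front and back regions are identical in both cases.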