Date: Fri, 8 Aug 2025 14:31:33 -0700
X-Mailing-List: netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v6 9/9] selftests/bpf: Cover metadata access from a modified skb clone
To: Jakub Sitnicki
Cc: Alexei Starovoitov, Andrii Nakryiko, Arthur Fabre, Daniel Borkmann,
 Eduard Zingerman, Eric Dumazet, Jakub Kicinski, Jesper Dangaard Brouer,
 Jesse Brandeburg, Joanne Koong, Lorenzo Bianconi,
 Toke Høiland-Jørgensen, Yan Zhai, kernel-team@cloudflare.com,
 netdev@vger.kernel.org, bpf@vger.kernel.org, Stanislav Fomichev
References: <20250804-skb-metadata-thru-dynptr-v6-0-05da400bfa4b@cloudflare.com>
 <20250804-skb-metadata-thru-dynptr-v6-9-05da400bfa4b@cloudflare.com>
 <7a73fb00-9433-40d7-acb7-691f32f198ff@linux.dev>
 <87h5yi82gp.fsf@cloudflare.com>
From: Martin KaFai Lau
In-Reply-To: <87h5yi82gp.fsf@cloudflare.com>

On 8/8/25 4:41 AM, Jakub Sitnicki wrote:
> On Thu, Aug 07, 2025 at 05:33 PM -07, Martin KaFai Lau wrote:
>> On 8/4/25 5:52 AM, Jakub Sitnicki wrote:
>>> +/* Check that skb_meta dynptr is empty */
>>> +SEC("tc")
>>> +int ing_cls_dynptr_empty(struct __sk_buff *ctx)
>>> +{
>>> +	struct bpf_dynptr data, meta;
>>> +	struct ethhdr *eth;
>>> +
>>> +	bpf_dynptr_from_skb(ctx, 0, &data);
>>> +	eth = bpf_dynptr_slice_rdwr(&data, 0, NULL, sizeof(*eth));
>>
>> If this is bpf_dynptr_slice() instead of bpf_dynptr_slice_rdwr() and...
>>
>>> +	if (!eth)
>>> +		goto out;
>>> +	/* Ignore non-test packets */
>>> +	if (eth->h_proto != 0)
>>> +		goto out;
>>> +	/* Packet write to trigger unclone in prologue */
>>> +	eth->h_proto = 42;
>>
>> ... remove this eth->h_proto write.
>>
>> Then bpf_dynptr_write() will succeed. like,
>>
>> 	bpf_dynptr_from_skb(ctx, 0, &data);
>> 	eth = bpf_dynptr_slice(&data, 0, NULL, sizeof(*eth));
>> 	if (!eth)
>> 		goto out;
>>
>> 	/* Ignore non-test packets */
>> 	if (eth->h_proto != 0)
>> 		goto out;
>>
>> 	bpf_dynptr_from_skb_meta(ctx, 0, &meta);
>> 	/* Expect write to fail because skb is a clone. */
>> 	err = bpf_dynptr_write(&meta, 0, (void *)eth, sizeof(*eth), 0);
>>
>> The bpf_dynptr_write for a skb dynptr will do the pskb_expand_head(). The
>> skb_meta dynptr write is only a memmove. It probably can also do
>> pskb_expand_head() and change it to keep the data_meta.
>>
>> Another option is to set the DYNPTR_RDONLY_BIT in bpf_dynptr_from_skb_meta() for
>> a clone skb. This restriction can be removed in the future.
>
> Ah, crap. Forgot that bpf_dynptr_write->bpf_skb_store_bytes calls
> bpf_try_make_writable(skb) behind the scenes.
>
> OK, so the head page copy for skb clone happens either in BPF prologue
> or lazily inside bpf_dynptr_write() call today.
>
> Best if I make it consistent for skb_meta from the start, no?
>
> Happy to take a shot at tweaking pskb_expand_head() to keep the metadata
> in tact, while at it.

There is no write helper for the data_meta now. A prog must write directly
to skb->data_meta, so data_meta is read-only for a clone now.

I guess the current use case is mostly for tc to read the data_meta immediately
after the xdp prog has added it (fwiw, it is how we tried to use it also), so it
is usually not a clone (?). Not even sure if it currently has a write use case
considering, 1) there is no bpf_"skb"_adjust_meta, and 2) the upper layer cannot
use it.

No strong opinion on whether to copy the metadata on a clone or to set the
dynptr rdonly for a clone.
I am ok with either way.

A brain dump:

On one hand, it is hard to comment without visibility into how it will look
when data_meta can be preserved in the future, e.g. what the overhead may be,
but there are flags in bpf_dynptr_from_skb_meta and bpf_dynptr_write, so there
is some flexibility.

On the other hand, having a copy will be less of a surprise on the clone skb,
like what we have discovered in this and the earlier email thread, but I
suspect there is actually no write use case on the skb data_meta now.
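
For illustration only, the two options discussed above (mark the skb_meta
dynptr read-only for a clone, or take a private copy lazily on the first
write) can be modeled in plain userspace C. This is a toy sketch, not kernel
code: struct toy_skb, struct toy_dynptr, toy_dynptr_from_meta(),
toy_dynptr_write(), the policy enum, and the error codes picked here are all
made up for the example and do not claim to match what the kernel does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace model, NOT kernel code. Every name below is hypothetical. */
#define TOY_META_MAX 32

struct toy_skb {
	unsigned char *meta;             /* metadata area, maybe shared with a clone */
	size_t meta_len;
	bool cloned;                     /* true while meta is shared */
	unsigned char own[TOY_META_MAX]; /* private storage used after unclone */
};

struct toy_dynptr {
	struct toy_skb *skb;
	bool rdonly;
};

enum toy_policy {
	TOY_RDONLY_ON_CLONE, /* option 1: refuse writes to a clone */
	TOY_COPY_ON_WRITE,   /* option 2: unclone lazily on first write */
};

static void toy_dynptr_from_meta(struct toy_skb *skb, enum toy_policy policy,
				 struct toy_dynptr *ptr)
{
	assert(skb && ptr);
	ptr->skb = skb;
	/* Only the rdonly policy marks the dynptr read-only, and only for a clone. */
	ptr->rdonly = (policy == TOY_RDONLY_ON_CLONE) && skb->cloned;
}

static int toy_dynptr_write(struct toy_dynptr *ptr, size_t off,
			    const void *src, size_t len)
{
	struct toy_skb *skb = ptr->skb;

	if (ptr->rdonly)
		return -EINVAL; /* error code is an assumption of this model */
	if (off > skb->meta_len || len > skb->meta_len - off)
		return -E2BIG;  /* write would fall outside the metadata area */
	if (skb->cloned) {
		/* Lazy unclone: copy the shared metadata before touching it. */
		if (skb->meta_len > TOY_META_MAX)
			return -ENOMEM;
		memcpy(skb->own, skb->meta, skb->meta_len);
		skb->meta = skb->own;
		skb->cloned = false;
	}
	memmove(skb->meta + off, src, len);
	return 0;
}
```

With TOY_RDONLY_ON_CLONE a write to a clone fails and the shared bytes stay
untouched. With TOY_COPY_ON_WRITE the write succeeds, but only against the
private copy, which is the "less surprise" behavior at the cost of a copy.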