From: John Fastabend
Subject: Re: [bpf-next PATCH v2 3/4] bpf: sockmap, BPF_F_INGRESS flag for BPF_SK_SKB_STREAM_VERDICT:
Date: Wed, 28 Mar 2018 08:45:25 -0700
Message-ID: <3e7dd92b-a932-d977-2a7e-505b374733df@gmail.com>
References: <20180327172042.11354.81872.stgit@john-Precision-Tower-5810>
 <20180327172322.11354.54016.stgit@john-Precision-Tower-5810>
 <71a13f21-6886-85c3-6911-8ac33c486901@iogearbox.net>
In-Reply-To: <71a13f21-6886-85c3-6911-8ac33c486901@iogearbox.net>
To: Daniel Borkmann, ast@kernel.org
Cc: netdev@vger.kernel.org, davem@davemloft.net
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8

On 03/28/2018 07:21 AM, Daniel Borkmann wrote:
> On 03/27/2018 07:23 PM, John Fastabend wrote:
>> Add support for the BPF_F_INGRESS flag in the skb redirect helper. To
>> do this, convert the skb into a scatterlist and push it onto the
>> ingress queue. This is the same logic used in the sk_msg redirect
>> helper, so it should feel familiar.
>>
>> Signed-off-by: John Fastabend
>> ---
>>  include/linux/filter.h |    1 +
>>  kernel/bpf/sockmap.c   |   94 +++++++++++++++++++++++++++++++++++++++---------
>>  net/core/filter.c      |    2 +
>>  3 files changed, 78 insertions(+), 19 deletions(-)
> [...]
>>  	if (!sg->length && md->sg_start == md->sg_end) {
>>  		list_del(&md->list);
>> +		if (md->skb)
>> +			consume_skb(md->skb);
>>  		kfree(md);
>>  	}
>>  }
>> @@ -1045,27 +1048,72 @@ static int smap_verdict_func(struct smap_psock *psock, struct sk_buff *skb)
>>  		__SK_DROP;
>>  }
>>  
>> +static int smap_do_ingress(struct smap_psock *psock, struct sk_buff *skb)
>> +{
>> +	struct sock *sk = psock->sock;
>> +	int copied = 0, num_sg;
>> +	struct sk_msg_buff *r;
>> +
>> +	r = kzalloc(sizeof(struct sk_msg_buff), __GFP_NOWARN | GFP_ATOMIC);
>> +	if (unlikely(!r))
>> +		return -EAGAIN;
>> +
>> +	if (!sk_rmem_schedule(sk, skb, skb->len)) {
>> +		kfree(r);
>> +		return -EAGAIN;
>> +	}
>> +	sk_mem_charge(sk, skb->len);
>
> Usually mem accounting is based on truesize. This is not done here since
> you need the exact length of the skb for the sg list later on, right?

Correct.

>
>> +	sg_init_table(r->sg_data, MAX_SKB_FRAGS);
>> +	num_sg = skb_to_sgvec(skb, r->sg_data, 0, skb->len);
>> +	if (unlikely(num_sg < 0)) {
>> +		kfree(r);
>
> Don't we need to undo the mem charge here in case of error?
>

Actually, I'll just move the sk_mem_charge() call below this error path; then we don't need to unwind it.

>> +		return num_sg;
>> +	}
>> +	copied = skb->len;
>> +	r->sg_start = 0;
>> +	r->sg_end = num_sg == MAX_SKB_FRAGS ? 0 : num_sg;
>> +	r->skb = skb;
>> +	list_add_tail(&r->list, &psock->ingress);
>> +	sk->sk_data_ready(sk);
>> +	return copied;
>> +}