All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@google.com>
To: Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	brouer@redhat.com, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Larysa Zaremba <larysa.zaremba@intel.com>,
	xdp-hints@xdp-project.net, anthony.l.nguyen@intel.com,
	yoong.siang.song@intel.com, boon.leong.ong@intel.com,
	intel-wired-lan <intel-wired-lan@lists.osuosl.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
Date: Fri, 24 Mar 2023 12:40:32 -0700	[thread overview]
Message-ID: <ZB38kFPy0C3e7bZn@google.com> (raw)
In-Reply-To: <6d9dbeb8-e1e7-8f8e-73eb-8fa66a1323de@redhat.com>

On 03/24, Jesper Dangaard Brouer wrote:


> On 23/03/2023 18.47, Stanislav Fomichev wrote:
> > On Thu, Mar 23, 2023 at 10:35 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
> > > <jbrouer@redhat.com> wrote:
> > > >
> > > >
> > > > On 22/03/2023 20.33, Stanislav Fomichev wrote:
> > > > > On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev  
> <sdf@google.com> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev  
> <sdf@google.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > > > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > > > > > > > > <jbrouer@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > > > > > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard  
> Brouer
> > > > > > > > > > > > <brouer@redhat.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > When driver developers add XDP-hints kfuncs for  
> RX hash it is
> > > > > > > > > > > > > practical to print the return code in bpf_printk  
> trace pipe log.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Print hash value as a hex value, both AF_XDP  
> userspace and bpf_prog,
> > > > > > > > > > > > > as this makes it easier to spot poor quality  
> hashes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jesper Dangaard Brouer  
> <brouer@redhat.com>
> > > > > > > > > > > > > ---
> > > > > > > > > > > >  
> >     .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > > > > > > > > > > >      
> tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > > > > > > > > > > >     2 files changed, 10 insertions(+), 4  
> deletions(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git  
> a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c  
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > index 40c17adbf483..ce07010e4d48 100644
> > > > > > > > > > > > > ---  
> a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > +++  
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > > > > > > > > > >                    meta->rx_timestamp = 0; /*  
> Used by AF_XDP as not avail signal */
> > > > > > > > > > > > >            }
> > > > > > > > > > > > >
> > > > > > > > > > > > > -       if (!bpf_xdp_metadata_rx_hash(ctx,  
> &meta->rx_hash))
> > > > > > > > > > > > > -               bpf_printk("populated rx_hash  
> with %u", meta->rx_hash);
> > > > > > > > > > > > > -       else
> > > > > > > > > > > > > +       ret = bpf_xdp_metadata_rx_hash(ctx,  
> &meta->rx_hash);
> > > > > > > > > > > > > +       if (ret >= 0) {
> > > > > > > > > > > > > +               bpf_printk("populated rx_hash  
> with 0x%08X", meta->rx_hash);
> > > > > > > > > > > > > +       } else {
> > > > > > > > > > > > > +               bpf_printk("rx_hash not-avail  
> errno:%d", ret);
> > > > > > > > > > > > >                    meta->rx_hash = 0; /* Used by  
> AF_XDP as not avail signal */
> > > > > > > > > > > > > +       }
> > > > > > > > > > > > >
> > > > > > > > > > > > >            return bpf_redirect_map(&xsk,  
> ctx->rx_queue_index, XDP_PASS);
> > > > > > > > > > > > >     }
> > > > > > > > > > > > > diff --git  
> a/tools/testing/selftests/bpf/xdp_hw_metadata.c  
> b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > index 400bfe19abfe..f3ec07ccdc95 100644
> > > > > > > > > > > > > ---  
> a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > +++  
> b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > @@ -3,6 +3,9 @@
> > > > > > > > > > > > >     /* Reference program for verifying XDP  
> metadata on real HW. Functional test
> > > > > > > > > > > > >      * only, doesn't test the performance.
> > > > > > > > > > > > >      *
> > > > > > > > > > > > > + * BPF-prog bpf_printk info outout can be access  
> via
> > > > > > > > > > > > > + * /sys/kernel/debug/tracing/trace_pipe
> > > > > > > > > > > >
> > > > > > > > > > > > s/outout/output/
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixed in V3
> > > > > > > > > > >
> > > > > > > > > > > > But let's maybe drop it? If you want to make it  
> more usable, let's
> > > > > > > > > > > > have a separate patch to enable tracing and  
> periodically dump it to
> > > > > > > > > > > > the console instead (as previously discussed).
> > > > > > > > > > >
> > > > > > > > > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for  
> me regardless of
> > > > > > > > > > > setting in
> > > > > > > > > > >  
> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > > > > > > > > >
> > > > > > > > > > > We likely need a followup patch that adds a BPF  
> config switch that can
> > > > > > > > > > > disable bpf_printk calls, because this adds overhead  
> and thus affects
> > > > > > > > > > > the timestamps.
> > > > > > > > > >
> > > > > > > > > > No. This is by design.
> > > > > > > > > > Do not use bpf_printk* in production.
> > > >
> > > > I fully agree do not use bpf_printk in *production*.
> > > >
> > > > > > > > >
> > > > > > > > > But that's not for the production? xdp_hw_metadata is a  
> small tool to
> > > > > > > > > verify that the metadata being dumped is correct (during  
> the
> > > > > > > > > development).
> > > > > > > > > We have a proper (less verbose) selftest in
> > > > > > > > > {progs,prog_tests}/xdp_metadata.c (over veth).
> > > > > > > > > This xdp_hw_metadata was supposed to be used for running  
> it against
> > > > > > > > > the real hardware, so having as much debugging at hand as  
> possible
> > > > > > > > > seems helpful? (at least it was helpful to me when  
> playing with mlx4)
> > > >
> > > > My experience when developing these kfuncs for igc (real hardware),  
> this
> > > > "tool" xdp_hw_metadata was super helpful, because it was very  
> verbose
> > > > (and I was juggling reading chip registers BE/LE and see patch1 a  
> buggy
> > > > implementation for RX-hash).
> > > >
> > > > As I wrote in cover-letter, I recommend other driver developers to  
> do
> > > > the same, because it really help speed up the development. In theory
> > > > xdp_hw_metadata doesn't belong in selftests directory and IMHO it  
> should
> > > > have been placed in samples/bpf/, but given the relationship with  
> real
> > > > selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
> > > > keep here.
> > > >
> > > >
> > > > > > > >
> > > > > > > > The only use of bpf_printk is for debugging of bpf progs  
> themselves.
> > > > > > > > It should not be used in any tool.
> > > > > > >
> > > > > > > Hmm, good point. I guess it also means we won't have to mess  
> with
> > > > > > > enabling/dumping ftrace (and don't need this comment about  
> cat'ing the
> > > > > > > file).
> > > > > > > Jesper, maybe we can instead pass the status of those
> > > > > > > bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump  
> this info
> > > > > > > from the userspace if needed.
> > > > > >
> > > > > > There are so many other ways for bpf prog to communicate with  
> user space.
> > > > > > Use ringbuf, perf_event buffer, global vars, maps, etc.
> > > > > > trace_pipe is debug only because it's global and will conflict  
> with
> > > > > > all other debug sessions.
> > > >
> > > > I want to highlight above paragraph: It is very true for production
> > > > code. (Anyone Googling this pay attention to above paragraph).
> > > >
> > > > >
> > > > > 👍 makes sense, ty! hopefully we won't have to add a separate  
> channel
> > > > > for those and can (ab)use the metadata area.
> > > > >
> > > >
> > > > Proposed solution: How about default disabling the bpf_printk's via  
> a
> > > > macro define, and then driver developer can manually reenable them
> > > > easily via a single define, to enable a debugging mode.
> > > >
> > > > I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
> > > > debugging a driver issue.  Thus, an extra step of modifying a single
> > > > define in BPF seems easier, than instrumenting my driver with  
> printk.
> > >
> > > It's certainly fine to have commented out bpf_printk in selftests
> > > and sample code.
> > > But the patch does:
> > > +       if (ret >= 0) {
> > > +               bpf_printk("populated rx_hash with 0x%08X",  
> meta->rx_hash);
> > > +       } else {
> > > +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > >                  meta->rx_hash = 0; /* Used by AF_XDP as not avail  
> signal */
> > > +       }
> > >
> > > It feels that printk is the only thing that provides the signal to  
> the user.
> > > Doing s/ret >= 0/true/ won't make any difference to selftest and
> > > to the consumer and that's my main concern with such selftest/samples.
> >
> > I agree, having this signal delivered to the user (without the ifdefs)
> > seems like a better way to go.

> I agree that we have a signal that we are not delivering to the user.

> Just so we are on the same page, let me explain how above code is
> problematic. As the rx_hash value zero have two meanings:

>   (1) Negative 'ret' set rx_hash=0 as err signal to AF_XDP "user"
>   (2) Hardware set rx_hash=0, when hash-type == IGC_RSS_TYPE_NO_HASH

> Case (2) happens for all L2 packets e.g. ARP packets.  See in patch-1
> where I map IGC_RSS_TYPE_NO_HASH to netstack type PKT_HASH_TYPE_L2.
> I did consider return errno/negative number for IGC_RSS_TYPE_NO_HASH,
> but I wanted to keep kfunc code as simple as possible (both for speed
> and if we need to unroll as byte-code later). As i225 hardware still
> writes zero into hash word I choose to keep code simple.


> IMHO this symptom is related to an API problem of the kfunc, that
> doesn't provide the hash-type.

> > Seems trivial to do something like the following below? (untested,
> > doesn't compile, gmail-copy-pasted so don't try to apply it)
> >

> If the kfunc provided the hash-type, which will be a positive number.
> Then I would add a signed integer to meta, which can store the hash-type
> or the negative errno number.

What's wrong with storing the return value from the kfuncs as is?
We are not writing super optimized production code here, so should be
fine?

> > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > index 4c55b4d79d3d..061c410f68ea 100644
> > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > @@ -12,6 +12,9 @@ struct {
> >    __type(value, __u32);
> >   } xsk SEC(".maps");
> >
> > +int received;
> > +int dropped;
> > +
> >   extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
> >    __u64 *timestamp) __ksym;
> >   extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
> > @@ -52,11 +55,11 @@ int rx(struct xdp_md *ctx)
> >    if (udp->dest != bpf_htons(9091))
> >    return XDP_PASS;

> It we wanted to make this program user "friendly", we should also count
> packets that doesn't get redirected to AF_XDP "user" and instead skipped.

Sure, let's do that?

> > - bpf_printk("forwarding UDP:9091 to AF_XDP");
> > + __sync_fetch_and_add(&received, 1);
> >
> >    ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
> >    if (ret != 0) {
> > - bpf_printk("bpf_xdp_adjust_meta returned %d", ret);
> > + __sync_fetch_and_add(&dropped, 1);
> >    return XDP_PASS;

> Is it a "dropped" or a "skipped" packet (return XDP_PASS)?

Up to you. I was mostly throwing an idea on how to drop these bpf_printk
completely and report those events from the userspace.

> >    }
> [...]

> --Jesper


WARNING: multiple messages have this Message-ID (diff)
From: Stanislav Fomichev <sdf@google.com>
To: Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: xdp-hints@xdp-project.net,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Larysa Zaremba <larysa.zaremba@intel.com>,
	Network Development <netdev@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	yoong.siang.song@intel.com, brouer@redhat.com,
	boon.leong.ong@intel.com, anthony.l.nguyen@intel.com,
	bpf <bpf@vger.kernel.org>,
	intel-wired-lan <intel-wired-lan@lists.osuosl.org>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
Date: Fri, 24 Mar 2023 12:40:32 -0700	[thread overview]
Message-ID: <ZB38kFPy0C3e7bZn@google.com> (raw)
In-Reply-To: <6d9dbeb8-e1e7-8f8e-73eb-8fa66a1323de@redhat.com>

On 03/24, Jesper Dangaard Brouer wrote:


> On 23/03/2023 18.47, Stanislav Fomichev wrote:
> > On Thu, Mar 23, 2023 at 10:35 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
> > > <jbrouer@redhat.com> wrote:
> > > >
> > > >
> > > > On 22/03/2023 20.33, Stanislav Fomichev wrote:
> > > > > On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev  
> <sdf@google.com> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev  
> <sdf@google.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > > > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > > > > > > > > <jbrouer@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > > > > > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard  
> Brouer
> > > > > > > > > > > > <brouer@redhat.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > When driver developers add XDP-hints kfuncs for  
> RX hash it is
> > > > > > > > > > > > > practical to print the return code in bpf_printk  
> trace pipe log.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Print hash value as a hex value, both AF_XDP  
> userspace and bpf_prog,
> > > > > > > > > > > > > as this makes it easier to spot poor quality  
> hashes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jesper Dangaard Brouer  
> <brouer@redhat.com>
> > > > > > > > > > > > > ---
> > > > > > > > > > > >  
> >     .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > > > > > > > > > > >      
> tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > > > > > > > > > > >     2 files changed, 10 insertions(+), 4  
> deletions(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git  
> a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c  
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > index 40c17adbf483..ce07010e4d48 100644
> > > > > > > > > > > > > ---  
> a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > +++  
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > > > > > > > > > >                    meta->rx_timestamp = 0; /*  
> Used by AF_XDP as not avail signal */
> > > > > > > > > > > > >            }
> > > > > > > > > > > > >
> > > > > > > > > > > > > -       if (!bpf_xdp_metadata_rx_hash(ctx,  
> &meta->rx_hash))
> > > > > > > > > > > > > -               bpf_printk("populated rx_hash  
> with %u", meta->rx_hash);
> > > > > > > > > > > > > -       else
> > > > > > > > > > > > > +       ret = bpf_xdp_metadata_rx_hash(ctx,  
> &meta->rx_hash);
> > > > > > > > > > > > > +       if (ret >= 0) {
> > > > > > > > > > > > > +               bpf_printk("populated rx_hash  
> with 0x%08X", meta->rx_hash);
> > > > > > > > > > > > > +       } else {
> > > > > > > > > > > > > +               bpf_printk("rx_hash not-avail  
> errno:%d", ret);
> > > > > > > > > > > > >                    meta->rx_hash = 0; /* Used by  
> AF_XDP as not avail signal */
> > > > > > > > > > > > > +       }
> > > > > > > > > > > > >
> > > > > > > > > > > > >            return bpf_redirect_map(&xsk,  
> ctx->rx_queue_index, XDP_PASS);
> > > > > > > > > > > > >     }
> > > > > > > > > > > > > diff --git  
> a/tools/testing/selftests/bpf/xdp_hw_metadata.c  
> b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > index 400bfe19abfe..f3ec07ccdc95 100644
> > > > > > > > > > > > > ---  
> a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > +++  
> b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > @@ -3,6 +3,9 @@
> > > > > > > > > > > > >     /* Reference program for verifying XDP  
> metadata on real HW. Functional test
> > > > > > > > > > > > >      * only, doesn't test the performance.
> > > > > > > > > > > > >      *
> > > > > > > > > > > > > + * BPF-prog bpf_printk info outout can be access  
> via
> > > > > > > > > > > > > + * /sys/kernel/debug/tracing/trace_pipe
> > > > > > > > > > > >
> > > > > > > > > > > > s/outout/output/
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixed in V3
> > > > > > > > > > >
> > > > > > > > > > > > But let's maybe drop it? If you want to make it  
> more usable, let's
> > > > > > > > > > > > have a separate patch to enable tracing and  
> periodically dump it to
> > > > > > > > > > > > the console instead (as previously discussed).
> > > > > > > > > > >
> > > > > > > > > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for  
> me regardless of
> > > > > > > > > > > setting in
> > > > > > > > > > >  
> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > > > > > > > > >
> > > > > > > > > > > We likely need a followup patch that adds a BPF  
> config switch that can
> > > > > > > > > > > disable bpf_printk calls, because this adds overhead  
> and thus affects
> > > > > > > > > > > the timestamps.
> > > > > > > > > >
> > > > > > > > > > No. This is by design.
> > > > > > > > > > Do not use bpf_printk* in production.
> > > >
> > > > I fully agree do not use bpf_printk in *production*.
> > > >
> > > > > > > > >
> > > > > > > > > But that's not for the production? xdp_hw_metadata is a  
> small tool to
> > > > > > > > > verify that the metadata being dumped is correct (during  
> the
> > > > > > > > > development).
> > > > > > > > > We have a proper (less verbose) selftest in
> > > > > > > > > {progs,prog_tests}/xdp_metadata.c (over veth).
> > > > > > > > > This xdp_hw_metadata was supposed to be used for running  
> it against
> > > > > > > > > the real hardware, so having as much debugging at hand as  
> possible
> > > > > > > > > seems helpful? (at least it was helpful to me when  
> playing with mlx4)
> > > >
> > > > My experience when developing these kfuncs for igc (real hardware),  
> this
> > > > "tool" xdp_hw_metadata was super helpful, because it was very  
> verbose
> > > > (and I was juggling reading chip registers BE/LE and see patch1 a  
> buggy
> > > > implementation for RX-hash).
> > > >
> > > > As I wrote in cover-letter, I recommend other driver developers to  
> do
> > > > the same, because it really help speed up the development. In theory
> > > > xdp_hw_metadata doesn't belong in selftests directory and IMHO it  
> should
> > > > have been placed in samples/bpf/, but given the relationship with  
> real
> > > > selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
> > > > keep here.
> > > >
> > > >
> > > > > > > >
> > > > > > > > The only use of bpf_printk is for debugging of bpf progs  
> themselves.
> > > > > > > > It should not be used in any tool.
> > > > > > >
> > > > > > > Hmm, good point. I guess it also means we won't have to mess  
> with
> > > > > > > enabling/dumping ftrace (and don't need this comment about  
> cat'ing the
> > > > > > > file).
> > > > > > > Jesper, maybe we can instead pass the status of those
> > > > > > > bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump  
> this info
> > > > > > > from the userspace if needed.
> > > > > >
> > > > > > There are so many other ways for bpf prog to communicate with  
> user space.
> > > > > > Use ringbuf, perf_event buffer, global vars, maps, etc.
> > > > > > trace_pipe is debug only because it's global and will conflict  
> with
> > > > > > all other debug sessions.
> > > >
> > > > I want to highlight above paragraph: It is very true for production
> > > > code. (Anyone Googling this pay attention to above paragraph).
> > > >
> > > > >
> > > > > 👍 makes sense, ty! hopefully we won't have to add a separate  
> channel
> > > > > for those and can (ab)use the metadata area.
> > > > >
> > > >
> > > > Proposed solution: How about default disabling the bpf_printk's via  
> a
> > > > macro define, and then driver developer can manually reenable them
> > > > easily via a single define, to enable a debugging mode.
> > > >
> > > > I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
> > > > debugging a driver issue.  Thus, an extra step of modifying a single
> > > > define in BPF seems easier, than instrumenting my driver with  
> printk.
> > >
> > > It's certainly fine to have commented out bpf_printk in selftests
> > > and sample code.
> > > But the patch does:
> > > +       if (ret >= 0) {
> > > +               bpf_printk("populated rx_hash with 0x%08X",  
> meta->rx_hash);
> > > +       } else {
> > > +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > >                  meta->rx_hash = 0; /* Used by AF_XDP as not avail  
> signal */
> > > +       }
> > >
> > > It feels that printk is the only thing that provides the signal to  
> the user.
> > > Doing s/ret >= 0/true/ won't make any difference to selftest and
> > > to the consumer and that's my main concern with such selftest/samples.
> >
> > I agree, having this signal delivered to the user (without the ifdefs)
> > seems like a better way to go.

> I agree that we have a signal that we are not delivering to the user.

> Just so we are on the same page, let me explain how above code is
> problematic. As the rx_hash value zero have two meanings:

>   (1) Negative 'ret' set rx_hash=0 as err signal to AF_XDP "user"
>   (2) Hardware set rx_hash=0, when hash-type == IGC_RSS_TYPE_NO_HASH

> Case (2) happens for all L2 packets e.g. ARP packets.  See in patch-1
> where I map IGC_RSS_TYPE_NO_HASH to netstack type PKT_HASH_TYPE_L2.
> I did consider return errno/negative number for IGC_RSS_TYPE_NO_HASH,
> but I wanted to keep kfunc code as simple as possible (both for speed
> and if we need to unroll as byte-code later). As i225 hardware still
> writes zero into hash word I choose to keep code simple.


> IMHO this symptom is related to an API problem of the kfunc, that
> doesn't provide the hash-type.

> > Seems trivial to do something like the following below? (untested,
> > doesn't compile, gmail-copy-pasted so don't try to apply it)
> >

> If the kfunc provided the hash-type, which will be a positive number.
> Then I would add a signed integer to meta, which can store the hash-type
> or the negative errno number.

What's wrong with storing the return value from the kfuncs as is?
We are not writing super optimized production code here, so should be
fine?

> > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > index 4c55b4d79d3d..061c410f68ea 100644
> > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > @@ -12,6 +12,9 @@ struct {
> >    __type(value, __u32);
> >   } xsk SEC(".maps");
> >
> > +int received;
> > +int dropped;
> > +
> >   extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
> >    __u64 *timestamp) __ksym;
> >   extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
> > @@ -52,11 +55,11 @@ int rx(struct xdp_md *ctx)
> >    if (udp->dest != bpf_htons(9091))
> >    return XDP_PASS;

> It we wanted to make this program user "friendly", we should also count
> packets that doesn't get redirected to AF_XDP "user" and instead skipped.

Sure, let's do that?

> > - bpf_printk("forwarding UDP:9091 to AF_XDP");
> > + __sync_fetch_and_add(&received, 1);
> >
> >    ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
> >    if (ret != 0) {
> > - bpf_printk("bpf_xdp_adjust_meta returned %d", ret);
> > + __sync_fetch_and_add(&dropped, 1);
> >    return XDP_PASS;

> Is it a "dropped" or a "skipped" packet (return XDP_PASS)?

Up to you. I was mostly throwing an idea on how to drop these bpf_printk
completely and report those events from the userspace.

> >    }
> [...]

> --Jesper

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

  reply	other threads:[~2023-03-24 19:40 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-21 13:47 [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
2023-03-21 13:47 ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-21 13:47 ` [PATCH bpf-next V2 1/6] igc: enable and fix RX hash usage by netstack Jesper Dangaard Brouer
2023-03-21 13:47   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-21 13:47 ` [PATCH bpf-next V2 2/6] selftests/bpf: xdp_hw_metadata track more timestamps Jesper Dangaard Brouer
2023-03-21 13:47   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-21 13:47 ` [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info Jesper Dangaard Brouer
2023-03-21 13:47   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-21 18:47   ` Stanislav Fomichev
2023-03-21 18:47     ` [Intel-wired-lan] " Stanislav Fomichev
2023-03-22 16:05     ` Jesper Dangaard Brouer
2023-03-22 16:05       ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-22 16:07       ` Alexei Starovoitov
2023-03-22 16:07         ` [Intel-wired-lan] " Alexei Starovoitov
2023-03-22 19:00         ` Stanislav Fomichev
2023-03-22 19:00           ` [Intel-wired-lan] " Stanislav Fomichev
2023-03-22 19:17           ` Alexei Starovoitov
2023-03-22 19:17             ` [Intel-wired-lan] " Alexei Starovoitov
2023-03-22 19:22             ` Stanislav Fomichev
2023-03-22 19:22               ` [Intel-wired-lan] " Stanislav Fomichev
2023-03-22 19:30               ` Alexei Starovoitov
2023-03-22 19:30                 ` [Intel-wired-lan] " Alexei Starovoitov
2023-03-22 19:33                 ` Stanislav Fomichev
2023-03-22 19:33                   ` [Intel-wired-lan] " Stanislav Fomichev
2023-03-23  8:51                   ` Jesper Dangaard Brouer
2023-03-23  8:51                     ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-23 17:35                     ` Alexei Starovoitov
2023-03-23 17:35                       ` [Intel-wired-lan] " Alexei Starovoitov
2023-03-23 17:47                       ` Stanislav Fomichev
2023-03-23 17:47                         ` [Intel-wired-lan] " Stanislav Fomichev
2023-03-24 12:14                         ` Jesper Dangaard Brouer
2023-03-24 12:14                           ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-24 19:40                           ` Stanislav Fomichev [this message]
2023-03-24 19:40                             ` Stanislav Fomichev
2023-03-21 13:47 ` [PATCH bpf-next V2 4/6] igc: add igc_xdp_buff wrapper for xdp_buff in driver Jesper Dangaard Brouer
2023-03-21 13:47   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-21 13:47 ` [PATCH bpf-next V2 5/6] igc: add XDP hints kfuncs for RX timestamp Jesper Dangaard Brouer
2023-03-21 13:47   ` [Intel-wired-lan] " Jesper Dangaard Brouer
2023-03-21 13:47 ` [PATCH bpf-next V2 6/6] igc: add XDP hints kfuncs for RX hash Jesper Dangaard Brouer
2023-03-21 13:47   ` [Intel-wired-lan] " Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZB38kFPy0C3e7bZn@google.com \
    --to=sdf@google.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=ast@kernel.org \
    --cc=boon.leong.ong@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jbrouer@redhat.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=larysa.zaremba@intel.com \
    --cc=martin.lau@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yoong.siang.song@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.