From: Stanislav Fomichev <sdf@google.com>
To: Daniele Salvatore Albano <d.albano@gmail.com>
Cc: netdev@vger.kernel.org
Subject: Re: [mlx5_core] kernel NULL pointer dereference when sending packets with AF_XDP using the hw checksum
Date: Mon, 18 Mar 2024 09:16:22 -0700
Message-ID: <Zfho1lRIg0cjpWwK@google.com>
In-Reply-To: <CAKq9yRiy166UpMA1HFiuzs0EMEM_aXbXbaTztbXcJ5CKF4F64w@mail.gmail.com>

On 03/16, Daniele Salvatore Albano wrote:
> On Sat, 16 Mar 2024 at 05:11, Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On 03/16, Daniele Salvatore Albano wrote:
> > > Hey there,
> > >
> > > Hope this is the right ml, if not sorry in advance.
> > >
> > > I have been facing a reproducible kernel panic with 6.8.0 and 6.8.1
> > > when sending packets and enabling the HW checksum calculation with
> > > AF_XDP on my Mellanox ConnectX-5.
> > >
> > > Running xskgen ( https://github.com/fomichev/xskgen ), which I saw
> > > mentioned in some patches related to AF_XDP and the hw checksum
> > > support. In addition to the minimum parameters to make it work, adding
> > > the -m option is enough to trigger the kernel panic.
> >
> > Now I wonder if I ever tested only -m (without passing a flag to request
> > tx timestamp). Maybe you can try to confirm that `xskgen -mC` works?
> 
> No, the kernel panics and, from the look of it, the stack trace and
> the RIP are the same.
> 
> [  157.108402] RIP: 0010:mlx5e_free_xdpsq_desc+0x266/0x320 [mlx5_core]
> ...
> [  157.108827] Call Trace:
> [  157.108841]  <TASK>
> [  157.108855]  ? show_regs+0x6d/0x80
> [  157.108876]  ? __die+0x24/0x80
> [  157.108893]  ? page_fault_oops+0x99/0x1b0
> [  157.108916]  ? do_user_addr_fault+0x2ee/0x6b0
> [  157.108937]  ? exc_page_fault+0x83/0x1b0
> [  157.108958]  ? asm_exc_page_fault+0x27/0x30
> [  157.108986]  ? mlx5e_free_xdpsq_desc+0x266/0x320 [mlx5_core]
> [  157.109154]  mlx5e_poll_xdpsq_cq+0x17c/0x4f0 [mlx5_core]
> [  157.109324]  mlx5e_napi_poll+0x45e/0x7b0 [mlx5_core]
> [  157.109470]  __napi_poll+0x33/0x200
> [  157.109488]  net_rx_action+0x181/0x2e0
> [  157.109502]  ? sched_clock_cpu+0x12/0x1e0
> [  157.109524]  __do_softirq+0xe1/0x363
> [  157.109544]  ? __pfx_smpboot_thread_fn+0x10/0x10
> [  157.109565]  run_ksoftirqd+0x37/0x60
> [  157.109582]  smpboot_thread_fn+0xe3/0x1e0
> [  157.109600]  kthread+0xf2/0x120
> [  157.109616]  ? __pfx_kthread+0x10/0x10
> [  157.109632]  ret_from_fork+0x47/0x70
> [  157.109648]  ? __pfx_kthread+0x10/0x10
> [  157.109663]  ret_from_fork_asm+0x1b/0x30
> [  157.109686]  </TASK>
> 
> > If you can test custom patches, I think the following should fix it:
> >
> > diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> > index 3cb4dc9bd70e..3d54de168a6d 100644
> > --- a/include/net/xdp_sock.h
> > +++ b/include/net/xdp_sock.h
> > @@ -188,6 +188,8 @@ static inline void xsk_tx_metadata_complete(struct xsk_tx_metadata_compl *compl,
> >  {
> >         if (!compl)
> >                 return;
> > +       if (!compl->tx_timestamp)
> > +               return;
> >
> >         *compl->tx_timestamp = ops->tmo_fill_timestamp(priv);
> >  }
> 
> Just built the same kernel from mainline ubuntu 6.8.1 with the patch
> applied and it now works with both xsk and my code.

Thanks, will send this fix shortly!
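
For anyone following along, here is a minimal user-space sketch of what the
extra check in the patch above guards against. It is not the kernel code: the
two structs are simplified stand-ins that only reuse the field and function
names visible in the diff, and fake_fill_timestamp() and run_completion_demo()
are hypothetical helpers made up for illustration. The assumption, inferred
from the patch, is that compl->tx_timestamp can be NULL when the completion
has no timestamp destination (e.g. only checksum offload was requested), in
which case the unguarded store dereferences a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; only the names seen in the
 * patch (tx_timestamp, tmo_fill_timestamp) are taken from the source. */
struct xsk_tx_metadata_compl {
	uint64_t *tx_timestamp;	/* assumed NULL when no timestamp slot exists */
};

struct xsk_tx_metadata_ops {
	uint64_t (*tmo_fill_timestamp)(void *priv);
};

/* Same shape as the patched helper: bail out before the store if there is
 * no completion object or no place to write the timestamp. */
static void xsk_tx_metadata_complete(struct xsk_tx_metadata_compl *compl,
				     const struct xsk_tx_metadata_ops *ops,
				     void *priv)
{
	if (!compl)
		return;
	if (!compl->tx_timestamp)
		return;

	*compl->tx_timestamp = ops->tmo_fill_timestamp(priv);
}

/* Hypothetical driver callback: returns the value that priv points to. */
static uint64_t fake_fill_timestamp(void *priv)
{
	return *(uint64_t *)priv;
}

/* Hypothetical demo: exercises the NULL-slot, valid-slot and NULL-compl paths. */
static int run_completion_demo(void)
{
	const struct xsk_tx_metadata_ops ops = {
		.tmo_fill_timestamp = fake_fill_timestamp,
	};
	uint64_t hw_time = 12345;
	uint64_t slot = 0;
	struct xsk_tx_metadata_compl checksum_only = { .tx_timestamp = NULL };
	struct xsk_tx_metadata_compl with_timestamp = { .tx_timestamp = &slot };

	/* Without the second check this call would write through NULL. */
	xsk_tx_metadata_complete(&checksum_only, &ops, &hw_time);
	assert(slot == 0);

	/* With a valid slot the timestamp is stored as before. */
	xsk_tx_metadata_complete(&with_timestamp, &ops, &hw_time);
	assert(slot == 12345);

	/* The pre-existing !compl check still covers a missing completion. */
	xsk_tx_metadata_complete(NULL, &ops, &hw_time);
	return 0;
}
```

Calling run_completion_demo() from a small main() should run to completion;
dropping the tx_timestamp check from the sketch makes the first call fault,
which is the user-space analogue of the oops in mlx5e_free_xdpsq_desc above.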
