From: "Toke Høiland-Jørgensen" <toke@toke.dk>
To: Saeed Mahameed <saeedm@mellanox.com>,
    "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: Eran Ben Elisha <eranbe@mellanox.com>,
    Tariq Toukan <tariqt@mellanox.com>,
    "brouer@redhat.com" <brouer@redhat.com>
Subject: Re: Kernel oops with mlx5 and dual XDP redirect programs
Date: Tue, 23 Oct 2018 22:29:52 +0200
Message-ID: <87efcgfltr.fsf@toke.dk>
In-Reply-To: <a5b5d2612c5ffdc1805098a63810bb29044530aa.camel@mellanox.com>

Saeed Mahameed <saeedm@mellanox.com> writes:

> On Tue, 2018-10-23 at 12:10 +0200, Toke Høiland-Jørgensen wrote:
>> Saeed Mahameed <saeedm@mellanox.com> writes:
>>
>> > On Thu, 2018-10-18 at 23:53 +0200, Toke Høiland-Jørgensen wrote:
>> > > Saeed Mahameed <saeedm@mellanox.com> writes:
>> > >
>> > > > I think that the mlx5 driver doesn't know how to tell the other
>> > > > device to stop transmitting to it while it is resetting. Maybe
>> > > > Tariq or Jesper know more about this?
>> > > > I will look at this tomorrow afternoon and will try to repro...
>> > >
>> > > Hi Saeed
>> > >
>> > > Did you have a chance to poke at this? :)
>> >
>> > Hi Toke, yes, I have been planning to respond, but I also wanted
>> > to dig more.
>> >
>> > So the root cause is very clear.
>> >
>> > 1. core 1 is doing tx_dev->ndo_xdp_xmit()
>> > 2. core 2 is doing tx_dev->xdp_set() //remove xdp program.
>>
>> Right, it was also my guess that it was related to this interaction.
>> Thanks for looking into it!
>>
>> > and the problem is beyond mlx5, since we don't have a way to tell a
>> > different core/different netdev to stop xmitting, or at least
>> > synchronize with it.
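
To make the race described above concrete, here is a minimal userspace
sketch in C11 of one way a transmitter and a teardown path could
synchronize. It is not the mlx5 patch discussed in this thread and not
kernel code: the struct and the model_xdp_*() names are hypothetical,
the busy-wait stands in for whatever the kernel would really use (for
example an RCU grace period), and it assumes a single transmitter per
queue.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device's XDP TX resources. */
struct xdp_txq_model {
	atomic_bool enabled;   /* true while the TX resources exist */
	atomic_int in_flight;  /* transmitters currently inside xmit */
	int *ring;             /* set to NULL once resources are torn down */
};

void model_xdp_init(struct xdp_txq_model *q, int *ring)
{
	atomic_init(&q->enabled, true);
	atomic_init(&q->in_flight, 0);
	q->ring = ring;
}

/* Models core 1: the redirect path calling into the TX device. */
int model_xdp_xmit(struct xdp_txq_model *q, int frame)
{
	/* Announce ourselves first, then check the state. */
	atomic_fetch_add(&q->in_flight, 1);
	if (!atomic_load(&q->enabled)) {
		atomic_fetch_sub(&q->in_flight, 1);
		return -ENETDOWN;  /* resources are gone or going away */
	}
	q->ring[0] = frame;        /* safe: teardown waits for us */
	atomic_fetch_sub(&q->in_flight, 1);
	return 0;
}

/* Models core 2: removing the XDP program on the TX device. */
void model_xdp_remove(struct xdp_txq_model *q)
{
	/* Refuse new transmitters, then wait for current ones to leave. */
	atomic_store(&q->enabled, false);
	while (atomic_load(&q->in_flight) > 0)
		;                  /* busy-wait, for illustration only */
	q->ring = NULL;            /* models freeing the TX resources */
}
```

With the default sequentially consistent atomics, either the
transmitter observes enabled == false and backs out, or the teardown
path observes in_flight > 0 and waits, so the ring is never touched
after it has been released. Without such a handshake, core 1 can
dereference the ring while core 2 is freeing it.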
>>
>> Hmm, ideally there should be some way for the higher level XDP API to
>> notice this and abort the call before it even reaches the driver on
>> the TX side, shouldn't there? At LPC, Jesper and I will be talking
>> about a proposal for decoupling the ndo_xdp_xmit() resource allocation
>> from loading and unloading XDP programs, which I guess could be a way
>> to deal with this as well.
>>
>> In the meantime...
>>
>
> Yes, totally agree; this is why my fix is temporary.
> Good idea about LPC, let's discuss this there.
>
>> > I will be waiting for your confirmation that the fix did work.
>>
>> I tested your patch, and it does indeed fix the crash. However, it
>> also seems to have the effect that the XDP redirect continues to
>> function even after removing the XDP program on the target device.
>>
>> I.e., after the call to ./xdp_fwd -d $TX_IF, I still see packets
>> being redirected out $TX_IF. Is this intentional?
>>
>
> Interesting, that shouldn't happen, unless there is something weird
> going on when running xdp_fwd -d together with xdp_redirect_map. I just
> checked the code, and if ndo_xdp_set was called with a NULL program we
> will remove the XDP TX resources; nothing suspicious in the driver.
>
> I will look at this later this week.

Cool. Let me know if you need anything more from me :)

-Toke