From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
Luc Pelletier <lucp.at.work@gmail.com>
Cc: "grive@u256.net" <grive@u256.net>, "dev@dpdk.org" <dev@dpdk.org>,
"stable@dpdk.org" <stable@dpdk.org>,
"hyonkim@cisco.com" <hyonkim@cisco.com>,
"johndale@cisco.com" <johndale@cisco.com>
Subject: RE: [PATCH] failsafe: fix segfault on hotplug event
Date: Fri, 18 Nov 2022 16:31:34 +0000 [thread overview]
Message-ID: <774190f3094342c28de2e28f9d93d486@huawei.com> (raw)
In-Reply-To: <20221116142548.554a7a9e@hermes.local>
Hi Luc,
> > Hi Konstantin,
> >
> > > It is not recommended way to update rte_eth_fp_ops[] contents directly.
> > > There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
> > > to be used for that.
> >
> > Good to know. I see another fix that was made in a different PMD that
> > does exactly the same thing:
> >
> > https://github.com/DPDK/dpdk/commit/bcd68b68415172815e55fc67cf3947c0433baf74
> >
> > CC'ing the authors for awareness.
> >
> > > About the fix itself - while it might help till some extent,
> > > I think it will not remove the problem completely.
> > > There still remain a race-condition between rte_eth_rx_burst() and failsafe_eth_rmv_event_callback().
> > > Right now DPDK doesn't support switching PMD fast-ops functions (or updating rxq/txq data)
> > > on the fly.
> >
> > Thanks for the information. This is very helpful.
> >
> > Are you saying that the previous code also had that same race
> > condition?
Yes, I believe so.
> It was only updating the rte_eth_dev structure, but I
> > assume the problem would have been the same since rte_eth_rx_burst()
> > in DPDK versions <=20 use the function pointers in rte_eth_dev, not
> > rte_eth_fp_ops.
> >
> > Can you think of a possible solution to this problem? I'm happy to
> > provide a patch to properly fix the problem. Having your guidance
> > would be extremely helpful.
> >
> > Thanks!
>
> Changing burst mode on a running device is not safe because
> of lack of locking and/or memory barriers.
>
> Would have been better to not to do this optimization.
> Just have one rx_burst/tx_burst function and look at what
> ever conditions are present there.
I think Stephen is right - within DPDK it is just not possible to switch RX/TX
function on the fly (without some external synchronization).
So the safe way is to always use safe version of RX/TX call.
I personally don't think such few extra checks will affect performance that much.
As another nit: inside failsafe rx_burst functions it is probably better not to access dev->data->rx_queues directly,
but call rte_eth_rx_burst(sdev->sdev_port_id, ...); instead.
Same for TX.
Thanks
Konstantin
next prev parent reply other threads:[~2022-11-18 16:31 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
2022-11-10 17:42 ` [PATCH v2] " Luc Pelletier
2022-11-10 23:33 ` Stephen Hemminger
2022-11-16 17:35 ` [PATCH] " Konstantin Ananyev
2022-11-16 21:51 ` Luc Pelletier
2022-11-16 22:25 ` Stephen Hemminger
2022-11-18 16:31 ` Konstantin Ananyev [this message]
2022-11-29 14:48 ` [PATCH v3 0/5] Failsafe bug fixes and improvements Luc Pelletier
2022-11-29 14:48 ` [PATCH v3 1/5] failsafe: fix segfault on hotplug event Luc Pelletier
2023-10-31 17:35 ` Stephen Hemminger
2022-11-29 14:48 ` [PATCH v3 2/5] failsafe: use public rx/tx burst API Luc Pelletier
2022-11-29 14:48 ` [PATCH v3 3/5] failsafe: fix double release of port Luc Pelletier
2022-11-29 14:48 ` [PATCH v3 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
2022-11-29 14:48 ` [PATCH v3 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 0/5] Failsafe bug fixes and improvements Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 1/5] failsafe: fix segfault on hotplug event Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 2/5] failsafe: use public rx/tx burst API Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 3/5] failsafe: fix double release of port Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=774190f3094342c28de2e28f9d93d486@huawei.com \
--to=konstantin.ananyev@huawei.com \
--cc=dev@dpdk.org \
--cc=grive@u256.net \
--cc=hyonkim@cisco.com \
--cc=johndale@cisco.com \
--cc=lucp.at.work@gmail.com \
--cc=stable@dpdk.org \
--cc=stephen@networkplumber.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.