From: Jacob Keller <jacob.e.keller@intel.com>
To: Przemek Kitszel <przemyslaw.kitszel@intel.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Piotr Kwapulinski <piotr.kwapulinski@intel.com>,
Aleksandr Loktionov <aleksandr.loktionov@intel.com>,
Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
Michal Kubiak <michal.kubiak@intel.com>,
Joshua Hay <joshua.a.hay@intel.com>,
Madhu Chittim <madhu.chittim@intel.com>,
Willem de Bruijn <willemb@google.com>,
Dave Ertman <david.m.ertman@intel.com>,
Ivan Vecera <ivecera@redhat.com>,
Grzegorz Nitka <grzegorz.nitka@intel.com>
Cc: <netdev@vger.kernel.org>, <stable@vger.kernel.org>,
Bart Van Assche <bvanassche@acm.org>,
<intel-wired-lan@lists.osuosl.org>,
Arpana Arland <arpanax.arland@intel.com>
Subject: Re: [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild()
Date: Wed, 6 May 2026 14:13:03 -0700 [thread overview]
Message-ID: <7343677e-a652-4770-8cb4-2a938eddaf74@intel.com> (raw)
In-Reply-To: <20260504-jk-iwl-net-2026-05-04-v1-10-a222a88bd962@intel.com>
On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Bart Van Assche <bvanassche@acm.org>
>
> Move the mutex_lock() call up to prevent that DCB settings change after
> the first ice_query_port_ets() call. The second ice_query_port_ets()
> call in ice_dcb_rebuild() is already protected by pf->tc_mutex.
>
> This also fixes a bug in an error path, as before taking the first
> "goto dcb_error" in the function jumped over mutex_lock() to
> mutex_unlock().
>
> This bug has been detected by the clang thread-safety analyzer.
>
> Cc: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org
> Fixes: 242b5e068b25 ("ice: Fix DCB rebuild after reset")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Tested-by: Arpana Arland <arpanax.arland@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> index 16aa25535152..0bc6dd375687 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> @@ -537,14 +537,14 @@ void ice_dcb_rebuild(struct ice_pf *pf)
> struct ice_dcbx_cfg *err_cfg;
> int ret;
>
> + mutex_lock(&pf->tc_mutex);
> +
> ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL);
> if (ret) {
> dev_err(dev, "Query Port ETS failed\n");
> goto dcb_error;
> }
>
> - mutex_lock(&pf->tc_mutex);
> -
> if (!pf->hw.port_info->qos_cfg.is_sw_lldp)
> ice_cfg_etsrec_defaults(pf->hw.port_info);
>
>
Sashiko thinks there is a problem here:
> Does moving mutex_lock(&pf->tc_mutex) to the top of ice_dcb_rebuild()
> introduce an AB-BA deadlock with rtnl_lock?
> In standard DCB netlink operations, the netlink layer calls into the driver
> with rtnl_lock already held, and the driver subsequently acquires
> pf->tc_mutex. This establishes a required lock ordering of rtnl_lock
> followed by pf->tc_mutex.
> With this change, if the first ice_query_port_ets() fails, we take the goto
> dcb_error path while already holding pf->tc_mutex.
> Looking at the error cleanup path lower down in ice_dcb_rebuild():
> dcb_error:
> dev_err(dev, "Disabling DCB until new settings occur\n");
> [ ... ]
> ice_pf_dcb_cfg(pf, err_cfg, false);
> Because the locked parameter is passed as false, ice_pf_dcb_cfg() will
> unconditionally acquire rtnl_lock().
> Does this create a lock inversion (pf->tc_mutex followed by rtnl_lock)
> against concurrent DCB netlink operations?
This seems like a fully pre-existing error. We already jump to dcb_error
else where in the function.
I don't know if this locking order really is an ABBA violation (I did
not review any of the other flows that take tc_mutex to confirm), but I
don't think it should hold this fix.
Someone from the ice team will need to investigate and see what the best
solution is. I suspect we'll have to take RTNL lock then the tc_mutex
and pass true to the ice_pf_dcb_cfg function. Or, better yet, see if
this converts to the netdev per-instance lock and we could drop the
tc_mutex entirely, relying on netdev_lock?
Thanks,
Jake
next prev parent reply other threads:[~2026-05-06 21:13 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-05 5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
2026-05-05 5:14 ` [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure Jacob Keller
2026-05-06 20:24 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 02/13] i40e: Cleanup PTP pins " Jacob Keller
2026-05-06 20:28 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes Jacob Keller
2026-05-06 20:53 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 04/13] idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init() Jacob Keller
2026-05-05 5:14 ` [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported Jacob Keller
2026-05-06 20:59 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 06/13] idpf: fix skb datapath queue based scheduling crashes and timeouts Jacob Keller
2026-05-05 5:14 ` [PATCH net 07/13] idpf: fix xdp crash in soft reset error path Jacob Keller
2026-05-05 5:14 ` [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths Jacob Keller
2026-05-06 21:04 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 09/13] ice: fix setting RSS VSI hash for E830 Jacob Keller
2026-05-06 21:06 ` Jacob Keller
2026-05-07 11:47 ` Marcin Szycik
2026-05-07 16:59 ` Marcin Szycik
2026-05-07 21:13 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild() Jacob Keller
2026-05-06 21:13 ` Jacob Keller [this message]
2026-05-05 5:14 ` [PATCH net 11/13] ice: fix PTP hang for E825C devices Jacob Keller
2026-05-06 21:16 ` Jacob Keller
2026-05-05 5:14 ` [PATCH net 12/13] ice: dpll: fix rclk pin state get for E810 Jacob Keller
2026-05-05 5:14 ` [PATCH net 13/13] ice: dpll: fix misplaced header macros Jacob Keller
2026-05-06 21:21 ` [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7343677e-a652-4770-8cb4-2a938eddaf74@intel.com \
--to=jacob.e.keller@intel.com \
--cc=aleksandr.loktionov@intel.com \
--cc=andrew+netdev@lunn.ch \
--cc=arkadiusz.kubalewski@intel.com \
--cc=arpanax.arland@intel.com \
--cc=bvanassche@acm.org \
--cc=davem@davemloft.net \
--cc=david.m.ertman@intel.com \
--cc=edumazet@google.com \
--cc=grzegorz.nitka@intel.com \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=ivecera@redhat.com \
--cc=joshua.a.hay@intel.com \
--cc=kuba@kernel.org \
--cc=maciej.fijalkowski@intel.com \
--cc=madhu.chittim@intel.com \
--cc=michal.kubiak@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=piotr.kwapulinski@intel.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=stable@vger.kernel.org \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox