From: Anshumali Gaur <agaur@marvell.com>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: <netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
Sunil Goutham <sgoutham@marvell.com>,
Linu Cherian <lcherian@marvell.com>,
Geetha sowjanya <gakula@marvell.com>,
Jerin Jacob <jerinj@marvell.com>, hariprasad <hkelam@marvell.com>,
Subbaraya Sundeep <sbhatta@marvell.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Date: Mon, 2 Feb 2026 10:53:11 +0000 [thread overview]
Message-ID: <aYCCFw3cTeyA3p8R@ccs-coresight-hyd> (raw)
On 2026-01-29 at 23:02:43, Jacob Keller (jacob.e.keller@intel.com) wrote:
>
>
> On 1/29/2026 1:19 AM, Anshumali Gaur wrote:
> > When both AF and PF drivers are built as modules, the PF driver in the
> > kexec kernel may probe before the AF driver is ready. This leads to
> > a crash due to uninitialized hardware state.
> >
> > This patch ensures the PF driver properly detects and waits for AF
> > driver readiness before proceeding with initialization.
> >
>
> To me, the patch description is not sufficient to describe the what and why
> of this change.
>
> Could you please provide a better explanation of how the addition of the
> provided shutdown handler fixes initialization?
>
Hi Jacob,
The issue being addressed here is specific to kexec and persistent AF
hardware state across kernel transitions. When both AF and PF drivers
are built as modules and a kexec kernel is performed, the PF driver in
the new kernel may probe before the AF driver has completed probing and
reinitializing the RVU hardware. In this scenario, the hardware state
left behind by the AF driver in the old kernel is still visible to the
PF driver in the new kernel resulting in crash due to stale state.
> > Fixes: 54494aa5d1e6 ("octeontx2-af: Add Marvell OcteonTX2 RVU AF driver")
> > Signed-off-by: Anshumali Gaur <agaur@marvell.com>
> > ---
> > drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> > index 747fbdf2a908..8530df8b3fda 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> > @@ -3632,11 +3632,22 @@ static void rvu_remove(struct pci_dev *pdev)
> > devm_kfree(&pdev->dev, rvu);
> > }
> > +static void rvu_shutdown(struct pci_dev *pdev)
> > +{
> > + struct rvu *rvu = pci_get_drvdata(pdev);
> > +
> > + if (!rvu)
> > + return;
> > +
> > + rvu_clear_rvum_blk_revid(rvu);
>
> Here, I guess you are clearing some data about the device status. Does that
> mean that when you initialize later you will wait for the AF driver to
> finish probing and configure this? It would be nice to explain how this
> change fixes initialization.
>
The RVUM block revision field acts as an implicit indication that the AF
driver has completed its initialization. If this value is left uncleared
during kexec kernel booting, the PF driver may observe a non-zero/valid
RVUM block revision and incorrectly assume that the AF is already
initialized and ready, even though the AF driver in the kexec kernel has
not yet probed. This leads to PF initialization proceeding against
partially initialized hardware, resulting in a crash.
> > +}
> > +
> > static struct pci_driver rvu_driver = {
> > .name = DRV_NAME,
> > .id_table = rvu_id_table,
> > .probe = rvu_probe,
> > .remove = rvu_remove,
> > + .shutdown = rvu_shutdown,
>
> This is the shutdown handler:
> >
> > * @shutdown: Hook into reboot_notifier_list (kernel/sys.c).
> > * Intended to stop any idling DMA operations.
> > * Useful for enabling wake-on-lan (NIC) or changing
> > * the power state of a device before reboot.
> > * e.g. drivers/net/e100.c.
>
> How does this have anything to do with initialization?
>
> > };
> > static int __init rvu_init_module(void)
>
>
next reply other threads:[~2026-02-02 10:53 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-02 10:53 Anshumali Gaur [this message]
2026-02-03 0:34 ` Jacob Keller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aYCCFw3cTeyA3p8R@ccs-coresight-hyd \
--to=agaur@marvell.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=gakula@marvell.com \
--cc=hkelam@marvell.com \
--cc=jacob.e.keller@intel.com \
--cc=jerinj@marvell.com \
--cc=kuba@kernel.org \
--cc=lcherian@marvell.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sbhatta@marvell.com \
--cc=sgoutham@marvell.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox