public inbox for netdev@vger.kernel.org
From: Daniel Machon <daniel.machon@microchip.com>
To: Herve Codina <herve.codina@bootlin.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Horatiu Vultur <horatiu.vultur@microchip.com>,
	Steen Hegelund <steen.hegelund@microchip.com>,
	<UNGLinuxDriver@microchip.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Arnd Bergmann <arnd@arndb.de>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<bpf@vger.kernel.org>
Subject: Re: [PATCH net-next 00/10] net: lan966x: add support for PCIe FDMA
Date: Tue, 7 Apr 2026 15:20:16 +0200
Message-ID: <20260407132016.4cfivs24ljqneyu7@DEN-DL-M70577>
In-Reply-To: <20260327113337.0368eea3@bootlin.com>

Hi Hervé,

> >
> > As I remembered, doing rmmod on the lan966x_switch followed by modprobe
> > lan966x_switch works fine. This is because neither the switch core, nor the FDMA
> > engine is reset, so they remain in sync.
> >
> > When the lan966x_pci module is removed and reloaded (what you did), the DT
> > overlay is re-applied, which causes the reset controller
> > (reset-microchip-sparx5) to re-probe. During probe, it performs a GCB soft reset
> > that resets the switch core, but protects the CPU domain from the reset. The
> > FDMA engine is part of the CPU domain, so it is not reset.
> >
> > This leaves the switch core in a reset state while the FDMA
> > retains state from the previous driver instance. When the switch driver
> > subsequently probes and activates the FDMA channels, the two are out of
> > sync, and the FDMA immediately reports extraction errors.
> >
> > There's actually an FDMA register called NRESET that resets the FDMA controller
> > state. Calling this in the FDMA init path causes traffic to work correctly on
> > lan966x_pci reload, but it does not get rid of the FDMA splats you posted above.
> > They get queued up between the switch core reset, in the reset controller, and
> > the FDMA enabling. I tried different approaches to drain or flush queues, but
> > they won't go away entirely.
> >
> > The only thing that seems to work consistently is to *not* do the soft reset in
> > the reset controller for the PCI path. The soft reset is actually the problem:
> > it only resets the switch core while protecting the CPU domain (including FDMA),
> > causing a desync.
> >
> > A simple fix could be (in reset-microchip-sparx5.c):
> >
> > +static bool mchp_reset_is_pci(struct device *dev)
> > +{
> > +     for (dev = dev->parent; dev; dev = dev->parent) {
> > +             if (dev_is_pci(dev))
> > +                     return true;
> > +     }
> > +     return false;
> > +}
> >
> > -     /* Issue the reset very early, our actual reset callback is a noop. */
> > -     err = sparx5_switch_reset(ctx);
> > -     if (err)
> > -             return err;
> > +     /* Issue the reset very early, our actual reset callback is a noop.
> > +      *
> > +      * On the PCI path, skip the reset. The endpoint is already in
> > +      * power-on reset state on the first probe. On subsequent probes
> > +      * (after driver reload), resetting the switch core while the FDMA
> > +      * retains state (CPU domain is protected from the soft reset)
> > +      * causes the two to go out of sync, leading to FDMA extraction
> > +      * errors.
> > +      */
> > +     if (!mchp_reset_is_pci(&pdev->dev)) {
> > +             err = sparx5_switch_reset(ctx);
> > +             if (err)
> > +                     return err;
> > +     }
> >
> > Could you test it and see if it helps the problem on your side?
> >
> 
> I have tested it on my ARM and x86 system. It fixes the lan966x_pci module
> unloading / reloading issue.
> 
> However, another regression is present. After a reboot, without power
> off/on, the board is not working (tested on both my ARM and x86 systems).
> 
> According to your explanation, this makes sense.
> 
> IMHO, the problem is that we cannot make the assumption that "The endpoint
> is already in power-on reset state on the first probe". That's not true
> when you just call the reboot command.
> 
> Best regards,
> Hervé

The following diff should fix the FDMA traffic issue, and the FDMA error splat,
when reloading the lan966x-pci driver, by:

1. Resetting the FDMA engine on PCI init()

2. Clearing any rogue FDMA errors that may latch due to the soft reset by the
reset driver.

  diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c
  --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c
  +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c
  @@ -372,6 +372,9 @@ static int lan966x_fdma_pci_init(struct lan966x *lan966x)
        if (!lan966x->fdma)
                return 0;

  +     lan_wr(FDMA_CTRL_NRESET_SET(0), lan966x, FDMA_CTRL);
  +     lan_wr(FDMA_CTRL_NRESET_SET(1), lan966x, FDMA_CTRL);
  +
        fdma_pci_atu_init(&lan966x->atu, lan966x->regs[TARGET_PCIE_DBI]);

        lan966x->rx.lan966x = lan966x;
  diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  @@ -1071,6 +1071,15 @@ static int lan966x_reset_switch(struct lan966x *lan966x)

        reset_control_reset(switch_reset);

  +     /* When in PCI mode, the GCB soft reset issued by the reset
  +      * controller can latch spurious bits in the FDMA error stickies.
  +      * Clear them before request_irq hooks up the FDMA IRQ line,
  +      * otherwise the handler fires immediately on probe.
  +      */
  +     lan_wr(lan_rd(lan966x, FDMA_ERRORS),   lan966x, FDMA_ERRORS);
  +     lan_wr(lan_rd(lan966x, FDMA_INTR_ERR), lan966x, FDMA_INTR_ERR);
  +     lan_wr(lan_rd(lan966x, FDMA_INTR_DB),  lan966x, FDMA_INTR_DB);
  +
        /* Don't reinitialize the switch core, if it is already initialized. In
         * case it is initialized twice, some pointers inside the queue system
         * in HW will get corrupted and then after a while the queue system gets
  diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
  --- a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
  +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
  @@ -1010,6 +1010,15 @@ enum lan966x_target {
   #define FDMA_CH_CFG_CH_MEM_GET(x)\
        FIELD_GET(FDMA_CH_CFG_CH_MEM, x)

  +/*      FDMA:FDMA:FDMA_CTRL */
  +#define FDMA_CTRL                 __REG(TARGET_FDMA, 0, 1, 8, 0, 1, 428, 424, 0, 1, 4)
  +
  +#define FDMA_CTRL_NRESET                         BIT(0)
  +#define FDMA_CTRL_NRESET_SET(x)\
  +     FIELD_PREP(FDMA_CTRL_NRESET, x)
  +#define FDMA_CTRL_NRESET_GET(x)\
  +     FIELD_GET(FDMA_CTRL_NRESET, x)
  +
   /*      FDMA:FDMA:FDMA_PORT_CTRL */
   #define FDMA_PORT_CTRL(r)         __REG(TARGET_FDMA, 0, 1, 8, 0, 1, 428, 376, r, 2, 4)
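
For anyone following along without the datasheet at hand, here is a rough
userspace sketch of the two mechanisms the diff relies on: pulsing an
active-low reset bit, and acknowledging sticky bits by writing back the value
that was read. It is only a model, not driver code. The struct and every
model_* name are hypothetical, and it assumes the sticky registers are
write-1-to-clear, which the read-then-write-back pattern suggests but which I
have not spelled out above.

```c
#include <stdint.h>

/* Userspace model only. All names here are hypothetical, not driver API. */
#define MODEL_CTRL_NRESET (1u << 0) /* active-low reset bit, like FDMA_CTRL_NRESET */

struct fdma_model {
	uint32_t ctrl;       /* models FDMA_CTRL */
	uint32_t errors;     /* models one sticky register, e.g. FDMA_ERRORS */
	uint32_t chan_state; /* stands in for internal engine state */
};

/* Control write: driving NRESET to 0 wipes the modelled engine state. */
static void model_wr_ctrl(struct fdma_model *m, uint32_t val)
{
	m->ctrl = val;
	if (!(val & MODEL_CTRL_NRESET))
		m->chan_state = 0;
}

/* Hardware side: an error event latches bits until software clears them. */
static void model_latch_error(struct fdma_model *m, uint32_t bits)
{
	m->errors |= bits;
}

static uint32_t model_rd_errors(const struct fdma_model *m)
{
	return m->errors;
}

/* Assumed write-1-to-clear: every 1 written clears the matching bit. */
static void model_wr_errors(struct fdma_model *m, uint32_t val)
{
	m->errors &= ~val;
}

/* Step 1 of the diff: assert reset (0), then release it (1). */
static void model_pulse_nreset(struct fdma_model *m)
{
	model_wr_ctrl(m, 0);
	model_wr_ctrl(m, MODEL_CTRL_NRESET);
}

/* Step 2 of the diff: write back what was read, clearing only latched bits. */
static void model_clear_stickies(struct fdma_model *m)
{
	model_wr_errors(m, model_rd_errors(m));
}
```

Writing back the value that was just read clears exactly the bits that were
latched at that moment, so a bit that latches after the read is not lost. The
model says nothing about whether NRESET itself also clears the stickies on
real hardware.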

Let me know if it works on your end.

(By the way, I have noticed another issue where TX stops working on lan966x-pci
reload. It happens less often and is unrelated to this patch series, since it
also happens in register-based INJ/XTR mode. Whenever it happens, you will see
"Flush timeout chip port" in the logs. This should also be fixed, but I believe
it belongs in a separate fix commit.)

/Daniel
