public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Herve Codina <herve.codina@bootlin.com>
To: Daniel Machon <daniel.machon@microchip.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Horatiu Vultur <horatiu.vultur@microchip.com>,
	Steen Hegelund <steen.hegelund@microchip.com>,
	<UNGLinuxDriver@microchip.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Arnd Bergmann <arnd@arndb.de>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<bpf@vger.kernel.org>
Subject: Re: [PATCH net-next 00/10] net: lan966x: add support for PCIe FDMA
Date: Wed, 8 Apr 2026 11:51:26 +0200	[thread overview]
Message-ID: <20260408115126.5529c823@bootlin.com> (raw)
In-Reply-To: <20260407132016.4cfivs24ljqneyu7@DEN-DL-M70577>

Hi Daniel,

On Tue, 7 Apr 2026 15:20:16 +0200
Daniel Machon <daniel.machon@microchip.com> wrote:

...
> 
> The following diff should fix the FDMA traffic issue, and the FDMA error splat,
> when reloading the lan966x-pci driver, by:
> 
> 1. Resetting the FDMA engine on PCI init()
> 
> 2. Clearing any rogue FDMA errors that may latch due to the soft reset by the
> reset driver.
> 
>   diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c                                                   
>   b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c                                                              
>   --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c                                                          
>   +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c                                                          
>   @@ -372,6 +372,9 @@ static int lan966x_fdma_pci_init(struct lan966x *lan966x)                                            
>         if (!lan966x->fdma)                                                                                                
>                 return 0;                                                                                                  
>                                                                                                                          
>   +     lan_wr(FDMA_CTRL_NRESET_SET(0), lan966x, FDMA_CTRL);                                                               
>   +     lan_wr(FDMA_CTRL_NRESET_SET(1), lan966x, FDMA_CTRL);                                                               
>   +                                                                                                                        
>         fdma_pci_atu_init(&lan966x->atu, lan966x->regs[TARGET_PCIE_DBI]);                                                  
>                                                             
>         lan966x->rx.lan966x = lan966x;                                                                                     
>   diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
>   b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c                                                                  
>   --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c                                                              
>   +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c                                                              
>   @@ -1071,6 +1071,15 @@ static int lan966x_reset_switch(struct lan966x *lan966x)                                          
>                                                             
>         reset_control_reset(switch_reset);                                                                                 
>                                                                                                                            
>   +     /* When in PCI mode, the GCB soft reset issued by the reset
>   +      * controller can latch spurious bits in the FDMA error stickies.                                                  
>   +      * Clear them before request_irq hooks up the FDMA IRQ line,
>   +      * otherwise the handler fires immediately on probe.                                                               
>   +      */                                                 
>   +     lan_wr(lan_rd(lan966x, FDMA_ERRORS),   lan966x, FDMA_ERRORS);                                                      
>   +     lan_wr(lan_rd(lan966x, FDMA_INTR_ERR), lan966x, FDMA_INTR_ERR);                                                    
>   +     lan_wr(lan_rd(lan966x, FDMA_INTR_DB),  lan966x, FDMA_INTR_DB);                                                     
>   +                                                                                                                        
>         /* Don't reinitialize the switch core, if it is already initialized. In                                            
>          * case it is initialized twice, some pointers inside the queue system                                             
>          * in HW will get corrupted and then after a while the queue system gets                                           
>   diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h                                                       
>   b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h                                                                  
>   --- a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h                                                              
>   +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h                                                              
>   @@ -1010,6 +1010,15 @@ enum lan966x_target {                                                                             
>    #define FDMA_CH_CFG_CH_MEM_GET(x)\                                                                                      
>         FIELD_GET(FDMA_CH_CFG_CH_MEM, x)                                                                                   
>                                                             
>   +/*      FDMA:FDMA:FDMA_CTRL */                                                                                          
>   +#define FDMA_CTRL                 __REG(TARGET_FDMA, 0, 1, 8, 0, 1, 428, 424, 0, 1, 4)                                  
>   +                                                                                                                        
>   +#define FDMA_CTRL_NRESET                         BIT(0)                                                                 
>   +#define FDMA_CTRL_NRESET_SET(x)\                                                                                        
>   +     FIELD_PREP(FDMA_CTRL_NRESET, x)                                                                                    
>   +#define FDMA_CTRL_NRESET_GET(x)\                         
>   +     FIELD_GET(FDMA_CTRL_NRESET, x)                                                                                     
>   +                                                         
>    /*      FDMA:FDMA:FDMA_PORT_CTRL */                                                                                     
>    #define FDMA_PORT_CTRL(r)         __REG(TARGET_FDMA, 0, 1, 8, 0, 1, 428, 376, r, 2, 4)
> 
> Let me know if it works on your end.
> 
> (Btw. I have noticed another issue where TX stops working on lan966x-pci reload.
> It happens more rarely, but is unrelated to this patch series, as it also
> happens in register-based INJ/XTR mode. Whenever that happens, you will see
> "Flush timeout chip port" in the logs. This should also be fixed, but sent as a
> separate fix commit, I believe.)
> 

I have tested your proposed modification.

- From a cold boot, module unloading / re-loading:
  Tested Ok

- From a cold boot, module loading / no interface configuration / reboot:
  Tested Ok

- From a cold boot, module loading / interface configuration / reboot:
  Issue present.

- From a cold boot, module loading / interface configuration / module unloading / reboot:
  Tested Ok.

The interface configuration was the following command:
  ip link set eth2 up && ip addr add 192.168.32.20/16 dev eth2

With the interface configured and without unloading the lan966x_pci module
before the reboot, I have the already known FDMA error splat after the
reboot:
   [   18.908397] ------------[ cut here ]------------
   [   18.913226] Unexpected error: 64, error_type: 1073741824
   [   18.918704] WARNING: drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c:558 at lan966x_fdma_irq_handler+0xe0/0x12c [lan966x_switch], CPU#0: kworker/u8:2/38
   ...

The issue is not present if I unload the lan966x_pci module before the
reboot. This let me think that something has been cleaned by the module
unloading.

On top of your proposed modification, I use the shutdown() hook in the
lan966x_driver:

    --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
    +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
    @@ -1326,9 +1326,18 @@ static void lan966x_remove(struct platform_device *pdev)
            debugfs_remove_recursive(lan966x->debugfs_root);
     }
     
    +static void lan966x_shutdown(struct platform_device *pdev)
    +{
    +       struct lan966x *lan966x = platform_get_drvdata(pdev);
    +
    +       lan966x->ops->fdma_deinit(lan966x);
    +}
    +
    static struct platform_driver lan966x_driver = {
            .probe = lan966x_probe,
            .remove = lan966x_remove,
    +       .shutdown = lan966x_shutdown,
            .driver = {
                    .name = "lan966x-switch",
                    .of_match_table = lan966x_match,

With that done, the issue is no more present.

I am not sure that this is a correct fix but it can give some clues.

Feel free to ask for more tests or more details if you need to.

Best regards,
Hervé

      reply	other threads:[~2026-04-08  9:51 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 15:00 [PATCH net-next 00/10] net: lan966x: add support for PCIe FDMA Daniel Machon
2026-03-20 15:00 ` [PATCH net-next 01/10] net: microchip: fdma: rename contiguous dataptr helpers Daniel Machon
2026-03-20 15:00 ` [PATCH net-next 02/10] net: microchip: fdma: add PCIe ATU support Daniel Machon
2026-03-20 15:00 ` [PATCH net-next 03/10] net: lan966x: add FDMA LLP register write helper Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 04/10] net: lan966x: export FDMA helpers for reuse Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 05/10] net: lan966x: add FDMA ops dispatch for PCIe support Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 06/10] net: lan966x: add PCIe FDMA support Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 07/10] net: lan966x: add PCIe FDMA MTU change support Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 08/10] net: lan966x: add PCIe FDMA XDP support Daniel Machon
2026-03-22  7:11   ` Mohsin Bashir
2026-03-22 20:30     ` Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 09/10] misc: lan966x-pci: dts: extend cpu reg to cover PCIE DBI space Daniel Machon
2026-03-20 15:01 ` [PATCH net-next 10/10] misc: lan966x-pci: dts: add fdma interrupt to overlay Daniel Machon
2026-03-23 14:52 ` [PATCH net-next 00/10] net: lan966x: add support for PCIe FDMA Herve Codina
2026-03-23 16:26   ` Herve Codina
2026-03-23 19:40     ` Daniel Machon
2026-03-24  8:07       ` Herve Codina
2026-03-26 15:48         ` Daniel Machon
2026-03-27 10:33           ` Herve Codina
2026-03-27 11:07             ` Daniel Machon
2026-04-07 13:20             ` Daniel Machon
2026-04-08  9:51               ` Herve Codina [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260408115126.5529c823@bootlin.com \
    --to=herve.codina@bootlin.com \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=arnd@arndb.de \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel.machon@microchip.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hawk@kernel.org \
    --cc=horatiu.vultur@microchip.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=steen.hegelund@microchip.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox