netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Roger Quadros <rogerq@kernel.org>
To: Siddharth Vadapalli <s-vadapalli@ti.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Grygorii Strashko <grygorii.strashko@ti.com>,
	srk@ti.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] net: ethernet: ti: am65-cpsw: fix freeing IRQ in am65_cpsw_nuss_remove_tx_chns()
Date: Wed, 15 Jan 2025 18:38:57 +0200	[thread overview]
Message-ID: <8829d58b-fbfc-4040-93de-51970631d935@kernel.org> (raw)
In-Reply-To: <3c9bdd38-d60f-466d-a767-63f71368d41e@kernel.org>

Siddharth,

On 15/01/2025 17:49, Roger Quadros wrote:
> Hi Siddharth,
> 
> On 15/01/2025 12:38, Siddharth Vadapalli wrote:
>> On Wed, Jan 15, 2025 at 12:04:17PM +0200, Roger Quadros wrote:
>>> Hi Siddharth,
>>>
>>> On 15/01/2025 07:18, Siddharth Vadapalli wrote:
>>>> On Tue, Jan 14, 2025 at 06:44:02PM +0200, Roger Quadros wrote:
>>>>
>>>> Hello Roger,
>>>>
>>>>> When getting the IRQ we use k3_udma_glue_rx_get_irq() which returns
>>>>
>>>> You probably meant "k3_udma_glue_tx_get_irq()" instead? It is used to
>>>> assign tx_chn->irq within am65_cpsw_nuss_init_tx_chns() as follows:
>>>
>>> Yes I meant tx instead of rx.
>>>
>>>>
>>>> 		tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn);
>>>>
>>>> Additionally, following the above section we have:
>>>>
>>>> 		if (tx_chn->irq < 0) {
>>>> 			dev_err(dev, "Failed to get tx dma irq %d\n",
>>>> 				tx_chn->irq);
>>>> 			ret = tx_chn->irq;
>>>> 			goto err;
>>>> 		}
>>>>
>>>> Could you please provide details on the code-path which will lead to a
>>>> negative "tx_chn->irq" within "am65_cpsw_nuss_remove_tx_chns()"?
>>>>
>>>> There seem to be two callers of am65_cpsw_nuss_remove_tx_chns(), namely:
>>>> 1. am65_cpsw_nuss_update_tx_rx_chns()
>>>> 2. am65_cpsw_nuss_suspend()
>>>> Since both of them seem to invoke am65_cpsw_nuss_remove_tx_chns() only
>>>> in the case where am65_cpsw_nuss_init_tx_chns() *did not* error out, it
>>>> appears to me that "tx_chn->irq" will never be negative within
>>>> am65_cpsw_nuss_remove_tx_chns()
>>>>
>>>> Please let me know if I have overlooked something.
>>>
>>> The issue is with am65_cpsw_nuss_update_tx_rx_chns(). It can be called
>>> repeatedly (by user changing number of TX queues) even if previous call
>>> to am65_cpsw_nuss_init_tx_chns() failed.
>>
>> Thank you for clarifying. So the issue/bug was discovered since the
>> implementation of am65_cpsw_nuss_update_tx_rx_chns(). The "Fixes" tag
>> misled me. Maybe the "Fixes" tag should be updated? Though we should
>> code to future-proof it as done in this patch, the "Fixes" tag pointing
>> to the very first commit of the driver might not be accurate as the
>> code-path associated with the bug cannot be exercised at that commit.
> 
> Fair enough. I'll change the Fixes commit.

Now that I check the code again, am65_cpsw_nuss_remove_tx_chns(),
am65_cpsw_nuss_update_tx_chns() and am65_cpsw_nuss_init_tx_chns()
were all introduced in the Fixes commit I stated.

Could you please share why you thought it is not accurate?

> 
>>
>> Independent of the above change suggested for the "Fixes" tag,
>>
>> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>>
>> There seems to be a different bug in am65_cpsw_nuss_update_tx_rx_chns()
>> which I have described below.
>>
>>>
>>> Please try the below patch to simulate the error condition.
>>>
>>> Then do the following
>>> - bring down all network interfaces
>>> - change num TX queues to 2. IRQ for 2nd channel fails.
>>> - change num TX queues to 3. Now we try to free an invalid IRQ and we see warning.
>>>
>>> Also I think it is good practice to code for set value than to code
>>> for existing code paths as they can change in the future.
>>>
>>> --test patch starts--
>>>
>>> diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
>>> index 36c29d3db329..c22cadaaf3d3 100644
>>> --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
>>> +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
>>> @@ -155,7 +155,7 @@
>>>  			 NETIF_MSG_IFUP	| NETIF_MSG_PROBE | NETIF_MSG_IFDOWN | \
>>>  			 NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR)
>>>  
>>> -#define AM65_CPSW_DEFAULT_TX_CHNS	8
>>> +#define AM65_CPSW_DEFAULT_TX_CHNS	1
>>>  #define AM65_CPSW_DEFAULT_RX_CHN_FLOWS	1
>>>  
>>>  /* CPPI streaming packet interface */
>>> @@ -2346,7 +2348,10 @@ static int am65_cpsw_nuss_init_tx_chns(struct am65_cpsw_common *common)
>>>  		tx_chn->dsize_log2 = __fls(hdesc_size_out);
>>>  		WARN_ON(hdesc_size_out != (1 << tx_chn->dsize_log2));
>>>  
>>> -		tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn);
>>> +		if (i == 1)
>>> +			tx_chn->irq = -ENODEV;
>>> +		else
>>> +			tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn);
>>
>> The pair - am65_cpsw_nuss_init_tx_chns()/am65_cpsw_nuss_remove_tx_chns()
>> seem to be written under the assumption that failure will result in the
>> driver's probe failing.
>>
>> With am65_cpsw_nuss_update_tx_rx_chns(), that assumption no longer holds
>> true. Please consider the following sequence:
>>
>> 1.
>> am65_cpsw_nuss_probe()
>>   am65_cpsw_nuss_register_ndevs()
>>     am65_cpsw_nuss_init_tx_chns() => Succeeds
>>
>> 2.
>> Probe is successful
>>
>> 3.
>> am65_cpsw_nuss_update_tx_rx_chns() => Invoked by user
>>   am65_cpsw_nuss_remove_tx_chns() => Succeeds
>>     am65_cpsw_nuss_init_tx_chns() => Partially fails
>>       devm_add_action(dev, am65_cpsw_nuss_free_tx_chns, common);
>>       ^ DEVM Action is added, but since the driver isn't removed,
>>       the cleanup via am65_cpsw_nuss_free_tx_chns() will not run.
>>
>> Only when the user re-invokes am65_cpsw_nuss_update_tx_rx_chns(),
>> the cleanup will be performed. This might have to be fixed in the
>> following manner:
>>
>> @@ -3416,10 +3416,17 @@ int am65_cpsw_nuss_update_tx_rx_chns(struct am65_cpsw_common *common,
>>         common->tx_ch_num = num_tx;
>>         common->rx_ch_num_flows = num_rx;
>>         ret = am65_cpsw_nuss_init_tx_chns(common);
>> -       if (ret)
>> +       if (ret) {
>> +               devm_remove_action(dev, am65_cpsw_nuss_free_tx_chns, common);
>> +               am65_cpsw_nuss_free_tx_chns(common);
>>                 return ret;
>> +       }
>>
>>         ret = am65_cpsw_nuss_init_rx_chns(common);
>> +       if (ret) {
>> +               devm_remove_action(dev, am65_cpsw_nuss_free_rx_chns, common);
>> +               am65_cpsw_nuss_free_rx_chns(common);
>> +       }
>>
>>         return ret;
>>  }
>>
>>  Please let me know what you think.
> 
> I've already implemented a cleanup series to get rid of devm_add/remove_action,
> cleanup probe error path and streamline TX and RQ queue init/cleanup.
> I'll send out the series soon as soon as I finish some tests.
> 

-- 
cheers,
-roger


  reply	other threads:[~2025-01-15 16:39 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-14 16:44 [PATCH net] net: ethernet: ti: am65-cpsw: fix freeing IRQ in am65_cpsw_nuss_remove_tx_chns() Roger Quadros
2025-01-14 17:12 ` Simon Horman
2025-01-15  5:18 ` Siddharth Vadapalli
2025-01-15 10:04   ` Roger Quadros
2025-01-15 10:38     ` Siddharth Vadapalli
2025-01-15 15:49       ` Roger Quadros
2025-01-15 16:38         ` Roger Quadros [this message]
2025-01-16  5:15           ` Siddharth Vadapalli
2025-01-16 11:47             ` Roger Quadros
2025-01-16 12:10               ` Siddharth Vadapalli
2025-01-16 13:00                 ` Roger Quadros
2025-01-16  5:09         ` Siddharth Vadapalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8829d58b-fbfc-4040-93de-51970631d935@kernel.org \
    --to=rogerq@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=grygorii.strashko@ti.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=s-vadapalli@ti.com \
    --cc=srk@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).