All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rahul Rameshbabu <rrameshbabu@nvidia.com>
To: Vadim Fedorenko <vadfed@meta.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
	Saeed Mahameed <saeed@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Tariq Toukan <tariqt@nvidia.com>
Subject: Re: [pull request][net 00/10] mlx5 fixes 2023-02-07
Date: Tue, 14 Feb 2023 19:19:33 -0800	[thread overview]
Message-ID: <87edqraa4a.fsf@nvidia.com> (raw)
In-Reply-To: <5c6cad41-b54f-ed86-e067-b84c0e4bd647@meta.com> (Vadim Fedorenko's message of "Thu, 9 Feb 2023 01:10:50 +0000")

Sorry for the late response. Needed a bit of time to wrap my head around
the patch.

On Thu, 09 Feb, 2023 01:10:50 +0000 Vadim Fedorenko <vadfed@meta.com> wrote:
> On 08/02/2023 23:52, Rahul Rameshbabu wrote:
>> On Wed, 08 Feb, 2023 21:36:52 +0000 Vadim Fedorenko <vadfed@meta.com> wrote:
>>> On 08/02/2023 21:16, Rahul Rameshbabu wrote:
>>>> On Wed, 08 Feb, 2023 12:52:55 -0800 Saeed Mahameed <saeedm@nvidia.com> wrote:
>>>>> Hi Vadim,
>>>>>
>>>>> We have some new findings internally and Rahul is testing your patches,
>>>>> he found some issues where the patches don't handle the case where only drops are happening, meanings no OOO.
>>>>>
>>>>> Rahul can share more details, he's still working on this and I believe we will have a fully detailed follow-up by the end of the week.
>>>> One thing I noticed was the conditional in mlx5e_ptp_ts_cqe_ooo in v5
>>>> does handle OOO but considers the monotomically increasing case of 1,3,4
>>>> for example to be OOO as well (a resync does not occur when I tested
>>>> this case).
>>>> A simple patch I made to verify this.
>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
>>>> index ae75e230170b..dfa5c53bd0d5 100644
>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
>>>> @@ -125,6 +125,8 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq,
>>>>    	struct sk_buff *skb;
>>>>    	ktime_t hwtstamp;
>>>>    +	pr_info("wqe_counter value: %u\n", skb_id);
>>>> +
>>>>    	if (unlikely(MLX5E_RX_ERR_CQE(cqe))) {
>>>>    		skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo);
>>>>    		ptpsq->cq_stats->err_cqe++;
>>>> @@ -133,6 +135,7 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq,
>>>>      	if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_cc, skb_id)) {
>>>>    		if (mlx5e_ptp_ts_cqe_ooo(ptpsq, skb_id)) {
>>>> +			pr_info("Marked ooo wqe_counter: %u\n", skb_id);
>>>>    			/* already handled by a previous resync */
>>>>    			ptpsq->cq_stats->ooo_cqe_drop++;
>>>>    			return;
>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
>>>> index f7897ddb29c5..8582f0535e21 100644
>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
>>>> @@ -646,7 +646,7 @@ static void mlx5e_cqe_ts_id_eseg(struct mlx5e_ptpsq *ptpsq, struct sk_buff *skb,
>>>>    				 struct mlx5_wqe_eth_seg *eseg)
>>>>    {
>>>>    	if (ptpsq->ts_cqe_ctr_mask && unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
>>>> -		eseg->flow_table_metadata = cpu_to_be32(ptpsq->skb_fifo_pc &
>>>> +		eseg->flow_table_metadata = cpu_to_be32((ptpsq->skb_fifo_pc * 2) &
>>>>    							ptpsq->ts_cqe_ctr_mask);
>>>>    }
>>>>    Basically, I multiply the wqe_counter written in the WQE by two. The
>>>> thing here is we have a situation where we have "lost" a CQE with
>>>> wqe_counter index of one, but the patch treats that as OOO, which
>>>> basically disables our normal resiliency path for resyncs on drops. At
>>>> that point, the patch could just remove the resync logic altogether when
>>>> a drop is detected.
>>>> What I noticed then was that the case of 0,2 was marked as OOO even
>>>> though out of order would be something like 0,2,1.
>>>>     [Feb 8 02:40] wqe_counter value: 0
>>>>     [ +24.199404] wqe_counter value: 2
>>>>     [  +0.001041] Marked ooo wqe_counter: 2
>>>> I acknowledge the OOO issue but not sure the patch as is, correctly
>>>> solves the issue.
>>>>
>>>
>>> With this patch it's not clear how many skbs were in the queue. AFAIU if there
>>> was only skb id = 1 in the queue, then the id = 2 is definitely OOO because it
>>> couldn't be found in the queue. Otherwise resync should be triggered and that is
>>> what I have seen in our setup with v5 patches.
>> 
>> With this patch at the time of testing, the pc is only 2 because we
>> skipped generating a WQE with a wqe_counter of 1. This matches your
>> expectation that it's OOO since we don't have a pc of 3 (wqe_counter
>> <skb id> 1 was never actually put on the WQ).
>> 
>> One thing I am still concerned about then.
>> 
>>    wqe_counter   0   3   1   2
>>    skb_cc        0   1   2   3
>>    skb_pc        4   4   4   4
>> 
>> Lets say we encounter wqe_counter 3 and the pc is currently 4. OOO is
>> not triggered and we go into the resync logic. The resync logic then
>> consumers 3, 1, and 2 out of order which is still an issue?
>
> Resync logic will drop 1 and 2. The 3 will be consumed and the logic 
> will wait for 4 as the next one. And in this case it's OK to count 1 and 
> 2 as OOO because both of them have arrived after 3. I have to mention 
> that I didn't implement "resync logic". It was implemented before as 
> there should never be OOO cqes according to what was stated in the 
> previous versions of patches by reviewers. My patches do not change 
> logic, they just fix the implementation which is currently crashes the 
> kernel. Once the root cause in FW (which is completely closed source and 
> I can only guess what logic is implemented in it) is found we can 
> re-think the logic. But for now I just want to fix the easy reproducible 
> crash, even if the patch is "bandage".
>

Agree with this explanation and tested the behavior just to confirm.
Also, tested a couple other cases and logically understand how this
patch can make use of the resync logic to drop out-of-order skbs using
the resync logic and then ignore the corresponding CQEs with the OOO
check.

>> 
>>>
>>>
>>>>>
>>>>> Sorry for the late update but these new findings are only from yesterday.
>>>>>
>>>>> Thanks,
>>>>> Saeed.
>>>>>
>>>>>    -------------------------------------------------------------------------------------------------------------------------
>>>>> From: Vadim Fedorenko <vadfed@meta.com>
>>>>> Sent: Wednesday, February 8, 2023 4:40 AM
>>>>> To: Saeed Mahameed <saeed@kernel.org>; Jakub Kicinski <kuba@kernel.org>
>>>>> Cc: Saeed Mahameed <saeedm@nvidia.com>; netdev@vger.kernel.org <netdev@vger.kernel.org>; Tariq Toukan <tariqt@nvidia.com>
>>>>> Subject: Re: [pull request][net 00/10] mlx5 fixes 2023-02-07
>>>>>    On 08/02/2023 03:02, Saeed Mahameed wrote:
>>>>>> From: Saeed Mahameed <saeedm@nvidia.com>
>>>>>>
>>>>>> This series provides bug fixes to mlx5 driver.
>>>>>> Please pull and let me know if there is any problem.
>>>>>>
>>>>> Still no patches for PTP queue? That's a bit wierd.
>>>>> Do you think that they are not ready to be in -net?
>>>> -- Rahul Rameshbabu

Acked-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>

      reply	other threads:[~2023-02-15  3:19 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-08  3:02 [pull request][net 00/10] mlx5 fixes 2023-02-07 Saeed Mahameed
2023-02-08  3:02 ` [net 01/10] net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change Saeed Mahameed
2023-02-09  5:00   ` patchwork-bot+netdevbpf
2023-02-08  3:02 ` [net 02/10] net/mlx5: DR, Fix potential race in dr_rule_create_rule_nic Saeed Mahameed
2023-02-08  3:02 ` [net 03/10] net/mlx5: Bridge, fix ageing of peer FDB entries Saeed Mahameed
2023-02-08  3:02 ` [net 04/10] net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode Saeed Mahameed
2023-02-08  3:02 ` [net 05/10] net/mlx5e: IPoIB, Show unknown speed instead of error Saeed Mahameed
2023-02-08  3:02 ` [net 06/10] net/mlx5: Store page counters in a single array Saeed Mahameed
2023-02-08  3:02 ` [net 07/10] net/mlx5: Expose SF firmware pages counter Saeed Mahameed
2023-02-08  3:03 ` [net 08/10] net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers Saeed Mahameed
2023-02-08  3:03 ` [net 09/10] net/mlx5: fw_tracer, Zero consumer index when reloading the tracer Saeed Mahameed
2023-02-08  3:03 ` [net 10/10] net/mlx5: Serialize module cleanup with reload and remove Saeed Mahameed
2023-02-08 12:40 ` [pull request][net 00/10] mlx5 fixes 2023-02-07 Vadim Fedorenko
     [not found]   ` <DM5PR12MB134054EC92BC13E36B6C5711B3D89@DM5PR12MB1340.namprd12.prod.outlook.com>
2023-02-08 21:16     ` Rahul Rameshbabu
2023-02-08 21:36       ` Vadim Fedorenko
2023-02-08 23:52         ` Rahul Rameshbabu
2023-02-09  1:10           ` Vadim Fedorenko
2023-02-15  3:19             ` Rahul Rameshbabu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87edqraa4a.fsf@nvidia.com \
    --to=rrameshbabu@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeed@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=tariqt@nvidia.com \
    --cc=vadfed@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.