All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Cc: jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	rajesh.sivaramasubramaniom@oracle.com,
	rama.nichanamatlu@oracle.com, manjunath.b.patil@oracle.com
Subject: Re: [PATCH RFC] net/sched: adjust device watchdog timer to detect stopped queue at right time
Date: Wed, 1 May 2024 15:12:51 -0700	[thread overview]
Message-ID: <20240501151251.2eccb4d0@kernel.org> (raw)
In-Reply-To: <20240430140010.5005-1-praveen.kannoju@oracle.com>

On Tue, 30 Apr 2024 19:30:10 +0530 Praveen Kumar Kannoju wrote:
> Applications are sensitive to long network latency, particularly
> heartbeat monitoring ones. Longer the tx timeout recovery higher the
> risk with such applications on a production machines. This patch
> remedies, yet honoring device set tx timeout.
> 
> Modify watchdog next timeout to be shorter than the device specified.
> Compute the next timeout be equal to device watchdog timeout less the
> how long ago queue stop had been done. At next watchdog timeout tx
> timeout handler is called into if still in stopped state. Either called
> or not called, restore the watchdog timeout back to device specified.

Idea makes sense, some comments on the code below.

> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 4a2c763e2d11..64e31f8b4ac1 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -506,18 +506,25 @@ static void dev_watchdog(struct timer_list *t)
>  			unsigned int timedout_ms = 0;
>  			unsigned int i;
>  			unsigned long trans_start;
> +			unsigned long next_check = 0;
> +			unsigned long current_jiffies;
>  
>  			for (i = 0; i < dev->num_tx_queues; i++) {
>  				struct netdev_queue *txq;
> +				current_jiffies = jiffies;

Not sure why you save current jiffies.

>  				txq = netdev_get_tx_queue(dev, i);
>  				trans_start = READ_ONCE(txq->trans_start);
> -				if (netif_xmit_stopped(txq) &&
> -				    time_after(jiffies, (trans_start +
> -							 dev->watchdog_timeo))) {
> -					timedout_ms = jiffies_to_msecs(jiffies - trans_start);
> -					atomic_long_inc(&txq->trans_timeout);
> -					break;
> +				if (netif_xmit_stopped(txq)) {

please use continue instead of adding another indentation level

> +					if (time_after(current_jiffies, (trans_start +

wrap at 80 characters

> +								   dev->watchdog_timeo))) {
> +						timedout_ms = jiffies_to_msecs(current_jiffies -
> +										trans_start);
> +						atomic_long_inc(&txq->trans_timeout);
> +						break;
> +					}
> +					next_check = trans_start + dev->watchdog_timeo -
> +									current_jiffies;

this will give us "next_check" for last queue. Let's instead find the
oldest trans_start in the loop. Do:

		unsigned long oldest_start = jiffies;

then in the loop:

		oldest_start = min(...)

>  				}
>  			}
>  
> @@ -530,9 +537,11 @@ static void dev_watchdog(struct timer_list *t)
>  				dev->netdev_ops->ndo_tx_timeout(dev, i);
>  				netif_unfreeze_queues(dev);
>  			}
> +			if (!next_check)
> +				next_check = dev->watchdog_timeo;
>  			if (!mod_timer(&dev->watchdog_timer,
>  				       round_jiffies(jiffies +
> -						     dev->watchdog_timeo)))
> +						     next_check)))

then here you just need to swap jiffies for oldest_start

>  				release = false;
>  		}
>  	}


  reply	other threads:[~2024-05-01 22:12 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-30 14:00 [PATCH RFC] net/sched: adjust device watchdog timer to detect stopped queue at right time Praveen Kumar Kannoju
2024-05-01 22:12 ` Jakub Kicinski [this message]
2024-05-03 14:28   ` Praveen Kannoju
2024-05-03 19:29     ` Jakub Kicinski
2024-05-06 14:02       ` Praveen Kannoju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240501151251.2eccb4d0@kernel.org \
    --to=kuba@kernel.org \
    --cc=davem@davemloft.net \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manjunath.b.patil@oracle.com \
    --cc=netdev@vger.kernel.org \
    --cc=praveen.kannoju@oracle.com \
    --cc=rajesh.sivaramasubramaniom@oracle.com \
    --cc=rama.nichanamatlu@oracle.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.