From: Jiri Wiesner <jwiesner@suse.de>
To: Julian Anastasov <ja@ssi.bg>
Cc: Simon Horman <horms@verge.net.au>,
	lvs-devel@vger.kernel.org,
	yunhong-cgl jiang <xintian1976@gmail.com>,
	dust.li@linux.alibaba.com
Subject: Re: [RFC PATCHv5 3/6] ipvs: use kthreads for stats estimation
Date: Wed, 16 Nov 2022 17:41:19 +0100
Message-ID: <20221116164119.GK3484@incl>
In-Reply-To: <753051f-655d-bef5-70f-cbc41928adeb@ssi.bg>

On Sat, Oct 29, 2022 at 05:12:28PM +0300, Julian Anastasov wrote:
> On Thu, 27 Oct 2022, Jiri Wiesner wrote:
> > On Mon, Oct 24, 2022 at 06:01:32PM +0300, Julian Anastasov wrote:
> > > - fast and safe way to apply a new chain_max or similar
> > > parameter for cond_resched rate. If possible, without
> > > relinking. stop+start can be slow too.
> > 
> > I am still wondering where the requirement for 100 us latency in non-preemptive kernels comes from. Typical time slices assigned by a time-sharing scheduler are measured in milliseconds. A kernel with voluntary preemption does not need any cond_resched statements in ip_vs_tick_estimation() because every spin_unlock() in ip_vs_chain_estimation() is a preemption point. That actually puts the accuracy of the computed estimates at risk, but nothing can be done about that, I guess.
> 
> 	I'm not sure about the 100us requirements for non-RT
> kernels, this document covers only RT requirements, I think:
> 
> Documentation/RCU/Design/Requirements/Requirements.rst
> 
> 	In fact, I don't worry about the RCU-preemptible
> case where we can be rescheduled at any time. In this
> case cond_resched_rcu() is a NOP and chain_max has only
> one purpose: limiting the ests in a kthread, i.e. not
> determining the period between cond_resched calls, which is
> its 2nd purpose in the non-preemptible case.
> 
> 	As for the non-preemptible case,
> rcu_read_lock/rcu_read_unlock are just preempt_disable/preempt_enable,
> which means the spin locking cannot preempt us. The only candidate is
> our call to rcu_read_unlock, which is just preempt_count_dec()
> or a simple barrier(); __preempt_schedule() is not called
> as it would be with CONFIG_PREEMPTION. So, only
> cond_resched() can allow rescheduling.
> 
> 	Also, there are some configurations like nohz_full
> that expect cond_resched() to check for any pending
> rcu_urgent_qs condition via rcu_all_qs(). I'm not an
> expert in areas such as RCU and scheduling, so I'm
> not sure about the 100us latency budget for the
> non-preemptible cases we cover:
> 
> 1. PREEMPT_NONE "No Forced Preemption (Server)"
> 2. PREEMPT_VOLUNTARY "Voluntary Kernel Preemption (Desktop)"
> 
> 	Where the latency can matter is setups where the
> IPVS kthreads are set to some low priority, as a
> way to work in idle times and to allow app servers
> to react to clients' requests faster. Once a request
> is served with a short delay, the app blocks somewhere and
> our kthreads run again in the idle time.
> 
> 	In short, the IPVS kthreads do not have
> urgent work; they should do their 4.8ms of work in 40ms
> or even more, but it is preferred not to delay other
> higher-priority tasks such as applications or even other
> kthreads. That is why I think we should stick to some low
> period between cond_resched calls without causing
> it to take a large part of our CPU usage.

OK, I agree that voluntary preemption without CONFIG_PREEMPT_RCU will need a preemption point in ip_vs_tick_estimation().

> 	If we want to reduce its rate, it can be
> in this way, for example:
> 
> 	int n = 0;
> 
> 	/* 400us for forced cond_resched() but reschedule on demand */
> 	if (!(++n & 3) || need_resched()) {
> 		cond_resched_rcu();
> 		n = 0;
> 	}
> 
> 	This controls both the RCU requirements and
> reacts faster to the scheduler's indication. There will be
> a useless need_resched() call for the RCU-preemptible
> case, though, where cond_resched_rcu is a NOP.

I do not see that as an improvement either.
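
For reference, the behaviour of the quoted condition can be modelled in userspace. The sketch below is only an illustration, not code from the patch: struct resched_model and count_resched_points() are hypothetical names, the pending[] array stands in for need_resched(), and incrementing a counter stands in for cond_resched_rcu(). It assumes one loop iteration corresponds to one chain-sized unit of work (roughly 100 us, as the "400us" comment in the quoted snippet implies).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: pending[i] plays the role of need_resched()
 * returning true after the i-th unit of work. */
struct resched_model {
	const bool *pending;	/* scheduler wants the CPU at step i */
	size_t steps;		/* number of chain-sized work units */
};

/* Count how often the quoted condition would reach cond_resched_rcu(). */
static size_t count_resched_points(const struct resched_model *m)
{
	size_t calls = 0;
	int n = 0;

	assert(m && (m->pending || m->steps == 0));
	for (size_t i = 0; i < m->steps; i++) {
		/* ... one chain of estimators would be processed here ... */

		/* Forced every 4th step, or earlier on demand. */
		if (!(++n & 3) || m->pending[i]) {
			calls++;	/* stands in for cond_resched_rcu() */
			n = 0;
		}
	}
	return calls;
}
```

With no pending request the model yields one call per four steps; a pending request triggers a call right away and restarts the count of four from that step.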

-- 
Jiri Wiesner
SUSE Labs

Thread overview: 15+ messages
2022-10-09 15:37 [RFC PATCHv5 0/6] ipvs: Use kthreads for stats Julian Anastasov
2022-10-09 15:37 ` [RFC PATCHv5 1/6] ipvs: add rcu protection to stats Julian Anastasov
2022-10-09 15:37 ` [RFC PATCHv5 2/6] ipvs: use common functions for stats allocation Julian Anastasov
2022-10-09 15:37 ` [RFC PATCHv5 3/6] ipvs: use kthreads for stats estimation Julian Anastasov
2022-10-15  9:21   ` Jiri Wiesner
2022-10-16 12:21     ` Julian Anastasov
2022-10-22 18:15       ` Jiri Wiesner
2022-10-24 15:01         ` Julian Anastasov
2022-10-26 15:29           ` Julian Anastasov
2022-10-27 18:07           ` Jiri Wiesner
2022-10-29 14:12             ` Julian Anastasov
2022-11-16 16:41               ` Jiri Wiesner [this message]
2022-10-09 15:37 ` [RFC PATCHv5 4/6] ipvs: add est_cpulist and est_nice sysctl vars Julian Anastasov
2022-10-09 15:37 ` [RFC PATCHv5 5/6] ipvs: run_estimation should control the kthread tasks Julian Anastasov
2022-10-09 15:37 ` [RFC PATCHv5 6/6] ipvs: debug the tick time Julian Anastasov
