linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: cl@rock-chips.com
Cc: juri.lelli@redhat.com, mark.rutland@arm.com, heiko@sntech.de,
	geert+renesas@glider.be, peterz@infradead.org,
	catalin.marinas@arm.com, bsegall@google.com, will@kernel.org,
	mpe@ellerman.id.au, linux@armlinux.org.uk,
	dietmar.eggemann@arm.com, ben.dooks@codethink.co.uk,
	mgorman@suse.de, huangtao@rock-chips.com,
	anshuman.khandual@arm.com, rostedt@goodmis.org,
	tglx@linutronix.de, surenb@google.com, mingo@redhat.com,
	allison@lohutok.net, linux-arm-kernel@lists.infradead.org,
	wad@chromium.org, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, luto@amacapital.net,
	george_davis@mentor.com, sudeep.holla@arm.com,
	akpm@linux-foundation.org, info@metux.net,
	kstewart@linuxfoundation.org
Subject: Re: [PATCH v1 1/1] sched/fair: do not preempt current task if it is going to call schedule()
Date: Thu, 5 Mar 2020 00:43:41 -0800	[thread overview]
Message-ID: <202003050031.27889B4@keescook> (raw)
In-Reply-To: <20200305081611.29323-2-cl@rock-chips.com>

On Thu, Mar 05, 2020 at 04:16:11PM +0800, cl@rock-chips.com wrote:
> From: Liang Chen <cl@rock-chips.com>
> 
> when we create a kthread with ktrhead_create_on_cpu(),the child thread
> entry is ktread.c:ktrhead() which will be preempted by the parent after
> call complete(done) while schedule() is not called yet,then the parent
> will call wait_task_inactive(child) but the child is still on the runqueue,
> so the parent will schedule_hrtimeout() for 1 jiffy,it will waste a lot of
> time,especially on startup.
> 
>   parent                             child
> ktrhead_create_on_cpu()
>   wait_fo_completion(&done) -----> ktread.c:ktrhead()
>                              |----- complete(done);--wakeup and preempted by parent
>  kthread_bind() <------------|  |-> schedule();--dequeue here
>   wait_task_inactive(child)     |
>    schedule_hrtimeout(1 jiffy) -|
> 
> So we hope the child just wakeup parent but not preempted by parent, and the
> child is going to call schedule() soon,then the parent will not call
> schedule_hrtimeout(1 jiffy) as the child is already dequeue.
> 
> The same issue for ktrhead_park()&&kthread_parkme().
> This patch can save 120ms on rk312x startup with CONFIG_HZ=300.

Interesting improvement!

> 
> Signed-off-by: Liang Chen <cl@rock-chips.com>
> ---
>  arch/arm/include/asm/thread_info.h   |  1 +
>  arch/arm64/include/asm/thread_info.h |  1 +
>  include/linux/sched.h                | 15 +++++++++++++++
>  kernel/kthread.c                     |  4 ++++
>  kernel/sched/fair.c                  |  4 ++++
>  5 files changed, 25 insertions(+)
> 
> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
> index 0d0d5178e2c3..51802991ba1f 100644
> --- a/arch/arm/include/asm/thread_info.h
> +++ b/arch/arm/include/asm/thread_info.h
> @@ -145,6 +145,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp *,
>  #define TIF_USING_IWMMXT	17
>  #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
>  #define TIF_RESTORE_SIGMASK	20
> +#define TIF_GOING_TO_SCHED	27	/* task is going to call schedule() */
>  
>  #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f0cec4160136..332786f11dc3 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -78,6 +78,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define TIF_SVE_VL_INHERIT	24	/* Inherit sve_vl_onexec across exec */
>  #define TIF_SSBD		25	/* Wants SSB mitigation */
>  #define TIF_TAGGED_ADDR		26	/* Allow tagged user addresses */
> +#define TIF_GOING_TO_SCHED	27	/* task is going to call schedule() */
>  
>  #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
>  #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)

I don't think you want a TIF flag for this (they're used in special
places, especially in entry code, etc). Since you're only changing
"normal" C code, I would suggest a atomic_flags addition instead:

#define PFA_GOING_TO_SCHED	8
...
TASK_PFA_TEST(GOING_TO_SCHED, going_to_sched)
TASK_PFA_SET(GOING_TO_SCHED, going_to_sched)
TASK_PFA_CLEAR(GOING_TO_SCHED, going_to_sched)

(Also if you used TIF, you'd need to add the TIF to every architecture
to use it in the common scheduler code, which too much work.)

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 04278493bf15..cb9058d2cf0b 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1768,6 +1768,21 @@ static inline int test_tsk_need_resched(struct task_struct *tsk)
>  	return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED));
>  }
>  
> +static inline void set_tsk_going_to_sched(struct task_struct *tsk)
> +{
> +	set_tsk_thread_flag(tsk, TIF_GOING_TO_SCHED);
> +}
> +
> +static inline void clear_tsk_going_to_sched(struct task_struct *tsk)
> +{
> +	clear_tsk_thread_flag(tsk, TIF_GOING_TO_SCHED);
> +}
> +
> +static inline int test_tsk_going_to_sched(struct task_struct *tsk)
> +{
> +	return unlikely(test_tsk_thread_flag(tsk, TIF_GOING_TO_SCHED));
> +}

Then you can drop these wrappers since you'll have the test/set/clear
functions declared above (task_set/clear_going_to_sched(),
task_going_to_sched()) with the TASK_PFA... macros.

> +
>  /*
>   * cond_resched() and cond_resched_lock(): latency reduction via
>   * explicit rescheduling in places that are safe. The return
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index b262f47046ca..8a4e4c9cdc22 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -199,8 +199,10 @@ static void __kthread_parkme(struct kthread *self)
>  		if (!test_bit(KTHREAD_SHOULD_PARK, &self->flags))
>  			break;
>  
> +		set_tsk_going_to_sched(current);

task_set_going_to_sched(current);

>  		complete(&self->parked);
>  		schedule();
> +		clear_tsk_going_to_sched(current);

task_clear_going_to_sched(current);

>  	}
>  	__set_current_state(TASK_RUNNING);
>  }
> @@ -245,8 +247,10 @@ static int kthread(void *_create)
>  	/* OK, tell user we're spawned, wait for stop or wakeup */
>  	__set_current_state(TASK_UNINTERRUPTIBLE);
>  	create->result = current;
> +	set_tsk_going_to_sched(current);
>  	complete(done);
>  	schedule();
> +	clear_tsk_going_to_sched(current);

etc.

>  
>  	ret = -EINTR;
>  	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3c8a379c357e..28a308743bf8 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4330,6 +4330,8 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>  			hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
>  		return;
>  #endif
> +	if (test_tsk_going_to_sched(rq_of(cfs_rq)->curr))
> +		return;

if (unlikely(task_going_to_sched(rq_of(cfs_rq)->curr)))
	return;

>  
>  	if (cfs_rq->nr_running > 1)
>  		check_preempt_tick(cfs_rq, curr);
> @@ -6633,6 +6635,8 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
>  	 */
>  	if (test_tsk_need_resched(curr))
>  		return;
> +	if (test_tsk_going_to_sched(curr))
> +		return;

same.

>  
>  	/* Idle tasks are by definition preempted by non-idle tasks. */
>  	if (unlikely(task_has_idle_policy(curr)) &&
> -- 
> 2.17.1
> 
> 
> 

I'd add comments above each of the "return" cases to help people
understand why the test is important.

-- 
Kees Cook

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-03-05  8:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-05  8:16 [PATCH v1 0/1] wait_task_inactive() spend too much time on system startup cl
2020-03-05  8:16 ` [PATCH v1 1/1] sched/fair: do not preempt current task if it is going to call schedule() cl
2020-03-05  8:43   ` Kees Cook [this message]
2020-03-05  9:58   ` Peter Zijlstra
2020-03-05 17:30     ` Kees Cook
2020-03-05 13:22   ` kbuild test robot
2020-03-05 13:33   ` kbuild test robot
2020-03-05 13:33   ` kbuild test robot
2020-03-05 13:59   ` kbuild test robot
2020-03-05 14:38   ` kbuild test robot
2020-03-05 15:14     ` [kbuild-all] " Li, Philip

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202003050031.27889B4@keescook \
    --to=keescook@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=allison@lohutok.net \
    --cc=anshuman.khandual@arm.com \
    --cc=ben.dooks@codethink.co.uk \
    --cc=bsegall@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=cl@rock-chips.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=geert+renesas@glider.be \
    --cc=george_davis@mentor.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heiko@sntech.de \
    --cc=huangtao@rock-chips.com \
    --cc=info@metux.net \
    --cc=juri.lelli@redhat.com \
    --cc=kstewart@linuxfoundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=luto@amacapital.net \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sudeep.holla@arm.com \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=wad@chromium.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).