Re: [PATCH v2] sched: rt: Make RT capacity aware

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Pavan Kondeti <pkondeti@codeaurora.org>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] sched: rt: Make RT capacity aware
Date: Mon, 3 Feb 2020 11:02:35 +0530	[thread overview]
Message-ID: <20200203053235.GE27398@codeaurora.org> (raw)
In-Reply-To: <CAEU1=PnYryM26F-tNAT0JVUoFcygRgE374JiBeJPQeTEoZpANg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7397 bytes --]

Hi Qais,

On Sat, Feb 01, 2020 at 07:08:34AM +0530, Pavan Kondeti wrote:
> Hi Qais,
> 
> On Fri, Jan 31, 2020 at 9:04 PM Qais Yousef <qais.yousef@arm.com> wrote:
> 
> > Hi Pavan
> >
> > On 01/31/20 15:36, Pavan Kondeti wrote:
> > > Hi Qais,
> > >
> > > On Wed, Oct 09, 2019 at 11:46:11AM +0100, Qais Yousef wrote:
> >
> > [...]
> >
> > > >
> > > > For RT we don't have a per task utilization signal and we lack any
> > > > information in general about what performance requirement the RT task
> > > > needs. But with the introduction of uclamp, RT tasks can now control
> > > > that by setting uclamp_min to guarantee a minimum performance point.
> >
> > [...]
> >
> > > > ---
> > > >
> > > > Changes in v2:
> > > >     - Use cpupri_find() to check the fitness of the task instead of
> > > >       sprinkling find_lowest_rq() with several checks of
> > > >       rt_task_fits_capacity().
> > > >
> > > >       The selected implementation opted to pass the fitness function
> > as an
> > > >       argument rather than call rt_task_fits_capacity() capacity which
> > is
> > > >       a cleaner to keep the logical separation of the 2 modules; but it
> > > >       means the compiler has less room to optimize
> > rt_task_fits_capacity()
> > > >       out when it's a constant value.
> > > >
> > > > The logic is not perfect. For example if a 'small' task is occupying a
> > big CPU
> > > > and another big task wakes up; we won't force migrate the small task
> > to clear
> > > > the big cpu for the big task that woke up.
> > > >
> > > > IOW, the logic is best effort and can't give hard guarantees. But
> > improves the
> > > > current situation where a task can randomly end up on any CPU
> > regardless of
> > > > what it needs. ie: without this patch an RT task can wake up on a big
> > or small
> > > > CPU, but with this it will always wake up on a big CPU (assuming the
> > big CPUs
> > > > aren't overloaded) - hence provide a consistent performance.
> >
> > [...]
> >
> > > I understand that RT tasks run on BIG cores by default when uclamp is
> > enabled.
> > > Can you tell what happens when we have more runnable RT tasks than the
> > BIG
> > > CPUs? Do they get packed on the BIG CPUs or eventually silver CPUs pull
> > those
> > > tasks? Since rt_task_fits_capacity() is considered during wakeup, push
> > and
> > > pull, the tasks may get packed on BIG forever. Is my understanding
> > correct?
> >
> > I left up the relevant part from the commit message and my 'cover-letter'
> > above
> > that should contain answers to your question.
> >
> > In short, the logic is best effort and isn't a hard guarantee. When the
> > system
> > is overloaded we'll still spread, and a task that needs a big core might
> > end up
> > on a little one. But AFAIU with RT, if you really want guarantees you need
> > to
> > do some planning otherwise there are no guarantees in general that your
> > task
> > will get what it needs.
> >
> > But I understand your question is for the general purpose case. I've
> > hacked my
> > notebook to run a few tests for you
> >
> >
> > https://gist.github.com/qais-yousef/cfe7487e3b43c3c06a152da31ae09101
> >
> > Look at the diagrams in "Test {1, 2, 3} Results". I spawned 6 tasks which
> > match
> > the 6 cores on the Juno I ran on. Based on Linus' master from a couple of
> > days.
> >
> > Note on Juno cores 1 and 2 are the big cors. 'b_*' and 'l_*' are the task
> > names
> > which are remnants from my previous testing where I spawned different
> > numbers
> > of big and small tasks.
> >
> > I repeat the same tests 3 times to demonstrate the repeatability. The logic
> > causes 2 tasks to run on a big CPU, but there's spreading. IMO on a general
> > purpose system this is a good behavior. On a real time system that needs
> > better
> > guarantee then there's no alternative to doing proper RT planning.
> >
> > In the last test I just spawn 2 tasks which end up on the right CPUs, 1
> > and 2.
> > On system like Android my observations has been that there are very little
> > concurrent RT tasks active at the same time. So if there are some tasks in
> > the
> > system that do want to be on the big CPU, they most likely to get that
> > guarantee. Without this patch what you get is completely random.
> >
> >
> Thanks for the results. I see that tasks are indeed spreading on to silver.
> However it is not
> clear to me from the code how tasks would get spread. Because cpupri_find()
> never returns
> little CPU in the lowest_mask because RT task does not fit on little CPU.
> So from wake up
> path, we place the task on the previous CPU or BIG CPU. The push logic also
> has the
> RT capacity checks, so an overloaded BIG CPU may not push tasks to an idle
> (RT free) little CPU.
> 
> 

I pulled your patch on top of v5.5 and run the below rt-app test on SDM845
platform. We expect 5 RT tasks to spread on different CPUs which was happening
without the patch. With the patch, I see one of the task got woken up on a CPU
which is running another RT task.

{
	"tasks" : {
		"big-task" : {
			"instance" : 5,
			"loop" : 10,
			"sleep" : 100000,
			"runtime" : 100000,
		},
	},
	"global" : {
		"duration" : -1,
		"calibration" : 720,
		"default_policy" : "SCHED_FIFO",
		"pi_enabled" : false,
		"lock_pages" : false,
		"logdir" : "/",
		"log_basename" : "rt-app2",
		"ftrace" : false,
		"gnuplot" : false
	}
}

Full trace is attached. Copy/pasting the snippet where it shows packing is
happening. The custom trace_printk are added in cpupri_find() before calling
fitness_fn(). As you can see pid=535 is woken on CPU#7 where pid=538 RT task
is runnning. Right after waking, the push is tried but it did not work either.

This is not a serious problem for us since we don't set RT tasks
uclamp.min=1024 . However, it changed the behavior and might introduce latency
for RT tasks on b.L platforms running the upstream kernel as is.

        big-task-538   [007] d.h.   403.401633: irq_handler_entry: irq=3 name=arch_timer
        big-task-538   [007] d.h2   403.401633: sched_waking: comm=big-task pid=535 prio=89 target_cpu=007
        big-task-538   [007] d.h2   403.401635: cpupri_find: before task=big-task-535 lowest_mask=f
        big-task-538   [007] d.h2   403.401636: cpupri_find: after task=big-task-535 lowest_mask=0
        big-task-538   [007] d.h2   403.401637: cpupri_find: it comes here
        big-task-538   [007] d.h2   403.401638: find_lowest_rq: task=big-task-535 ret=0 lowest_mask=0
        big-task-538   [007] d.h3   403.401640: sched_wakeup: comm=big-task pid=535 prio=89 target_cpu=007
        big-task-538   [007] d.h3   403.401641: cpupri_find: before task=big-task-535 lowest_mask=f
        big-task-538   [007] d.h3   403.401642: cpupri_find: after task=big-task-535 lowest_mask=0
        big-task-538   [007] d.h3   403.401642: cpupri_find: it comes here
        big-task-538   [007] d.h3   403.401643: find_lowest_rq: task=big-task-535 ret=0 lowest_mask=0
        big-task-538   [007] d.h.   403.401644: irq_handler_exit: irq=3 ret=handled
        big-task-538   [007] d..2   403.402413: sched_switch: prev_comm=big-task prev_pid=538 prev_prio=89 prev_state=S ==> next_comm=big-task next_pid=535 next_prio=89

Thanks,
Pavan
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.


[-- Attachment #2: trace.tar.gz --]
[-- Type: application/octet-stream, Size: 31906 bytes --]

next prev parent reply	other threads:[~2020-02-03  5:32 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-09 10:46 [PATCH v2] sched: rt: Make RT capacity aware Qais Yousef
2019-10-23 12:34 ` Qais Yousef
2019-10-28 14:37 ` Peter Zijlstra
2019-10-28 18:01   ` Steven Rostedt
2019-10-28 20:50     ` Peter Zijlstra
2019-12-20 16:01       ` Qais Yousef
2019-12-20 17:18         ` Peter Zijlstra
2019-12-20 17:36           ` Qais Yousef
2019-11-07  9:15     ` Qais Yousef
2019-11-18 15:43     ` Qais Yousef
2019-11-18 15:53       ` Steven Rostedt
2019-11-18 16:12         ` Qais Yousef
2019-10-29  8:13 ` Vincent Guittot
2019-10-29 11:02   ` Qais Yousef
2019-10-29 11:17     ` Vincent Guittot
2019-10-29 11:48       ` Qais Yousef
2019-10-29 12:20         ` Vincent Guittot
2019-10-29 12:46           ` Qais Yousef
2019-10-29 12:54             ` Vincent Guittot
2019-10-29 13:02               ` Peter Zijlstra
2019-10-29 20:36               ` Patrick Bellasi
2019-10-30  8:04                 ` Vincent Guittot
2019-10-30  9:26                   ` Qais Yousef
2019-10-30 12:11                   ` Quentin Perret
2019-10-30 11:57 ` Dietmar Eggemann
2019-10-30 17:43   ` Qais Yousef
2019-11-28 13:59     ` Dietmar Eggemann
2019-11-25 21:36 ` Steven Rostedt
2019-11-26  9:39   ` Qais Yousef
2019-12-25 10:38 ` [tip: sched/core] sched/rt: Make RT capacity-aware tip-bot2 for Qais Yousef
2020-01-31 10:06 ` [PATCH v2] sched: rt: Make RT capacity aware Pavan Kondeti
2020-01-31 15:34   ` Qais Yousef
     [not found]     ` <CAEU1=PnYryM26F-tNAT0JVUoFcygRgE374JiBeJPQeTEoZpANg@mail.gmail.com>
2020-02-03  5:32       ` Pavan Kondeti [this message]
2020-02-03 14:57         ` Qais Yousef
2020-02-03 14:27       ` Qais Yousef
2020-02-03 16:14         ` Steven Rostedt
2020-02-03 17:15           ` Valentin Schneider
2020-02-03 17:17           ` Qais Yousef
2020-02-03 18:12             ` Steven Rostedt
2020-02-03 19:03               ` Qais Yousef
2020-02-04 17:23                 ` Dietmar Eggemann
2020-02-05 14:48                   ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200203053235.GE27398@codeaurora.org \
    --to=pkondeti@codeaurora.org \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=qais.yousef@arm.com \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.