All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rik van Riel <riel@redhat.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
	mingo@kernel.org, pbonzini@redhat.com, fweisbec@redhat.com,
	wanpeng.li@hotmail.com, efault@gmx.de, tglx@linutronix.de,
	rkrcmar@redhat.com
Subject: Re: [PATCH 1/5] sched,time: count actually elapsed irq & softirq time
Date: Mon, 27 Jun 2016 08:50:06 -0400	[thread overview]
Message-ID: <1467031806.22723.18.camel@redhat.com> (raw)
In-Reply-To: <20160627122553.GA2111@lerouge>

[-- Attachment #1: Type: text/plain, Size: 5303 bytes --]

On Mon, 2016-06-27 at 14:25 +0200, Frederic Weisbecker wrote:
> On Wed, Jun 22, 2016 at 10:25:47PM -0400, riel@redhat.com wrote:
> > 
> > From: Rik van Riel <riel@redhat.com>
> > 
> > Currently, if there was any irq or softirq time during 'ticks'
> > jiffies, the entire period will be accounted as irq or softirq
> > time.
> > 
> > This is inaccurate if only a subset of 'ticks' jiffies was
> > actually spent handling irqs, and could conceivably mis-count
> > all of the ticks during a period as irq time, when there was
> > some irq and some softirq time.
> Good catch!
> 
> Many comments following.
> 
> > 
> > 
> > This can actually happen when irqtime_account_process_tick
> > is called from account_idle_ticks, which can pass a larger
> > number of ticks down all at once.
> > 
> > Fix this by changing irqtime_account_hi_update and
> > irqtime_account_si_update to round elapsed irq and softirq
> > time to jiffies, and return the number of jiffies spent in
> > each mode, similar to how steal time is handled.
> > 
> > Additionally, have irqtime_account_process_tick take into
> > account how much time was spent in each of steal, irq,
> > and softirq time.
> > 
> > The latter could help improve the accuracy of timekeeping
> Maybe you meant cputime? Timekeeping is rather about jiffies and
> GTOD.
> 
> > 
> > when returning from idle on a NO_HZ_IDLE CPU.
> > 
> > Properly accounting how much time was spent in hardirq and
> > softirq time will also allow the NO_HZ_FULL code to re-use
> > these same functions for hardirq and softirq accounting.
> > 
> > Signed-off-by: Rik van Riel <riel@redhat.com>
> > 
> >  	local_irq_save(flags);
> > -	latest_ns = this_cpu_read(cpu_hardirq_time);
> > -	if (nsecs_to_cputime64(latest_ns) > cpustat[CPUTIME_IRQ])
> > -		ret = 1;
> > +	irq = this_cpu_read(cpu_hardirq_time) -
> > cpustat[CPUTIME_IRQ];
> cpu_hardirq_time is made of nsecs whereas cpustat is of cputime_t (in
> fact
> even cputime64_t). So you need to convert cpu_hardirq_time before
> doing the
> substract.

Doh. Good catch!

> > -static int irqtime_account_si_update(void)
> > +static unsigned long irqtime_account_si_update(unsigned long
> > max_jiffies)
> >  {
> >  	u64 *cpustat = kcpustat_this_cpu->cpustat;
> > +	unsigned long si_jiffies = 0;
> >  	unsigned long flags;
> > -	u64 latest_ns;
> > -	int ret = 0;
> > +	u64 softirq;
> >  
> >  	local_irq_save(flags);
> > -	latest_ns = this_cpu_read(cpu_softirq_time);
> > -	if (nsecs_to_cputime64(latest_ns) >
> > cpustat[CPUTIME_SOFTIRQ])
> > -		ret = 1;
> > +	softirq = this_cpu_read(cpu_softirq_time) -
> > cpustat[CPUTIME_SOFTIRQ];
> > +	if (softirq > cputime_one_jiffy) {
> > +		si_jiffies = min(max_jiffies,
> > cputime_to_jiffies(softirq));
> > +		cpustat[CPUTIME_SOFTIRQ] +=
> > jiffies_to_cputime(si_jiffies);
> > +	}
> >  	local_irq_restore(flags);
> > -	return ret;
> > +	return si_jiffies;
> So same comments apply here.
> 
> [...]
> > 
> >   * Accumulate raw cputime values of dead tasks (sig->[us]time) and
> > live
> >   * tasks (sum on group iteration) belonging to @tsk's group.
> >   */
> > @@ -344,19 +378,24 @@ static void
> > irqtime_account_process_tick(struct task_struct *p, int user_tick,
> >  {
> >  	cputime_t scaled = cputime_to_scaled(cputime_one_jiffy);
> >  	u64 cputime = (__force u64) cputime_one_jiffy;
> > -	u64 *cpustat = kcpustat_this_cpu->cpustat;
> > +	unsigned long other;
> >  
> > -	if (steal_account_process_tick(ULONG_MAX))
> > +	/*
> > +	 * When returning from idle, many ticks can get accounted
> > at
> > +	 * once, including some ticks of steal, irq, and softirq
> > time.
> > +	 * Subtract those ticks from the amount of time accounted
> > to
> > +	 * idle, or potentially user or system time. Due to
> > rounding,
> > +	 * other time can exceed ticks occasionally.
> > +	 */
> > +	other = account_other_ticks(ticks);
> > +	if (other >= ticks)
> >  		return;
> > +	ticks -= other;
> >  
> >  	cputime *= ticks;
> >  	scaled *= ticks;
> So instead of dealing with ticks here, I think you should rather use
> the above
> cputime as both the limit and the remaining time to account after
> steal/irqs.
> 
> This should avoid some middle conversions and improve precision when
> cputime_t == nsecs granularity.
> 
> If we account 2 ticks to idle (lets say HZ=100) and irq time to
> account is 15 ms. 2 ticks = 20 ms
> so we have 5 ms left to account to idle. With the jiffies granularity
> in this patch, we would account
> one tick to irqtime (1 tick = 10 ms, there will be 5 ms to account
> back later) and one tick to idle
> time whereas if you deal with cputime_t, you are going to account the
> correct amount of idle time.
> 

Ahhh, so you want irqtime_account_process_tick to work with
and account fractional ticks when calling account_system_time,
account_user_time, account_idle_time, etc?

I guess that should work fine since we already pass cputime
values in, anyway.

I suppose we can do the same for get_vtime_delta, too.

They can both work with the actual remaining time (in cputime_t),
after the other time has been subtracted.

I can rework the series to do that.

-- 
All Rights Reversed.


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

  reply	other threads:[~2016-06-27 12:50 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-23  2:25 [PATCH v2 0/5] sched,time: fix irq time accounting with nohz_idle riel
2016-06-23  2:25 ` [PATCH 1/5] sched,time: count actually elapsed irq & softirq time riel
2016-06-27 12:25   ` Frederic Weisbecker
2016-06-27 12:50     ` Rik van Riel [this message]
2016-06-28 20:56       ` Frederic Weisbecker
2016-06-23  2:25 ` [PATCH 2/5] nohz,cputime: remove VTIME_GEN vtime irq time code riel
2016-06-27 23:21   ` Frederic Weisbecker
2016-06-27 23:31     ` Rik van Riel
2016-06-27 23:51       ` Frederic Weisbecker
2016-06-23  2:25 ` [PATCH 3/5] cputime: allow irq time accounting to be selected as an option riel
2016-06-26  3:03   ` kbuild test robot
2016-06-23  2:25 ` [PATCH 4/5] irqtime: add irq type parameter to irqtime_account_irq riel
2016-06-23  2:25 ` [PATCH 5/5] irqtime: drop local_irq_save/restore from irqtime_account_irq riel
  -- strict thread matches above, loose matches on Subject: below --
2016-07-13 14:50 [GIT PULL] cputime fixes and cleanups Frederic Weisbecker
2016-07-13 14:50 ` [PATCH 1/5] sched,time: Count actually elapsed irq & softirq time Frederic Weisbecker
2016-08-09  3:59   ` Wanpeng Li
2016-08-09 14:06     ` Rik van Riel
2016-08-09 23:07       ` Wanpeng Li
2016-08-10  7:51         ` Wanpeng Li
2016-08-09 23:25       ` Wanpeng Li
2016-08-09 23:31         ` Wanpeng Li
2016-08-09 23:35         ` Wanpeng Li
2016-08-09 23:39         ` Wanpeng Li
2016-08-10  5:07           ` Rik van Riel
2016-08-10  6:33             ` Wanpeng Li
2016-06-16 16:06 [PATCH 0/5] sched,time: fix irq time accounting with nohz_idle riel
2016-06-16 16:06 ` [PATCH 1/5] sched,time: count actually elapsed irq & softirq time riel
2016-06-16 16:22   ` kbuild test robot
2016-06-21 21:21   ` Peter Zijlstra
2016-06-21 22:20     ` Rik van Riel
2016-06-22 10:40     ` Paolo Bonzini
2016-06-22 10:52       ` Peter Zijlstra
2016-06-08  2:29 [PATCH RFC 0/5] sched,time: make irq time accounting work for nohz_idle riel
2016-06-08  2:30 ` [PATCH 1/5] sched,time: count actually elapsed irq & softirq time riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1467031806.22723.18.camel@redhat.com \
    --to=riel@redhat.com \
    --cc=efault@gmx.de \
    --cc=fweisbec@gmail.com \
    --cc=fweisbec@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkrcmar@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=wanpeng.li@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.