From: Rik van Riel
To: Frederic Weisbecker
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@kernel.org, pbonzini@redhat.com, fweisbec@redhat.com, wanpeng.li@hotmail.com, efault@gmx.de, tglx@linutronix.de, rkrcmar@redhat.com
Subject: Re: [PATCH 1/4] sched,time: count actually elapsed irq & softirq time
Date: Tue, 05 Jul 2016 09:08:16 -0400
Message-ID: <1467724096.17336.41.camel@redhat.com>
In-Reply-To: <20160705124033.GA5332@lerouge>
References: <1467315350-3152-1-git-send-email-riel@redhat.com> <1467315350-3152-2-git-send-email-riel@redhat.com> <20160705124033.GA5332@lerouge>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2016-07-05 at 14:40 +0200, Frederic Weisbecker wrote:
> On Thu, Jun 30, 2016 at 03:35:47PM -0400, riel@redhat.com wrote:
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 3d60e5d76fdb..018bae2ada36 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -79,40 +79,50 @@ void irqtime_account_irq(struct task_struct *curr)
> >  }
> >  EXPORT_SYMBOL_GPL(irqtime_account_irq);
> >
> > -static int irqtime_account_hi_update(void)
> > +static cputime_t
> > irqtime_account_hi_update(cputime_t maxtime)
> >  {
> >          u64 *cpustat = kcpustat_this_cpu->cpustat;
> >          unsigned long flags;
> > -        u64 latest_ns;
> > -        int ret = 0;
> > +        cputime_t irq_cputime;
> >
> >          local_irq_save(flags);
> > -        latest_ns = this_cpu_read(cpu_hardirq_time);
> > -        if (nsecs_to_cputime64(latest_ns) > cpustat[CPUTIME_IRQ])
> > -                ret = 1;
> > +        irq_cputime = nsecs_to_cputime(this_cpu_read(cpu_hardirq_time)) -
> > +                      cpustat[CPUTIME_IRQ];
>
> We might want to keep nsecs_to_cputime64(). If cputime_t == jiffies_t == unsigned long,
> we may have a problem after 49 days of interrupts. Arguably that's a lot of IRQs
> but lets be paranoid.

The macro nsecs_to_cputime64 is only defined in
cputime_jiffies.h though, not in cputime_nsecs.h

Want me to add a #define to the second file?

> > +        irq_cputime = min(irq_cputime, maxtime);
> > +        cpustat[CPUTIME_IRQ] += irq_cputime;
> >          local_irq_restore(flags);
> > -        return ret;
> > +        return irq_cputime;
> >  }
> >
> > -static int irqtime_account_si_update(void)
> > +static cputime_t irqtime_account_si_update(cputime_t maxtime)
> >  {
> >          u64 *cpustat = kcpustat_this_cpu->cpustat;
> >          unsigned long flags;
> > -        u64 latest_ns;
> > -        int ret = 0;
> > +        cputime_t softirq_cputime;
> >
> >          local_irq_save(flags);
> > -        latest_ns = this_cpu_read(cpu_softirq_time);
> > -        if (nsecs_to_cputime64(latest_ns) > cpustat[CPUTIME_SOFTIRQ])
> > -                ret = 1;
> > +        softirq_cputime = nsecs_to_cputime(this_cpu_read(cpu_softirq_time)) -
>
> Ditto.
>
> > +                          cpustat[CPUTIME_SOFTIRQ];
> > +        softirq_cputime = min(softirq_cputime, maxtime);
> > +        cpustat[CPUTIME_SOFTIRQ] += softirq_cputime;
> >          local_irq_restore(flags);
> > -        return ret;
> > +        return softirq_cputime;
> >  }
> >
> >  #else /* CONFIG_IRQ_TIME_ACCOUNTING */
> >
> >  #define sched_clock_irqtime (0)
> >
> > +static cputime_t irqtime_account_hi_update(cputime_t dummy)
> > +{
> > +        return 0;
> > +}
> > +
> > +static cputime_t irqtime_account_si_update(cputime_t dummy)
> > +{
> > +        return 0;
> > +}
> > +
> >  #endif /* !CONFIG_IRQ_TIME_ACCOUNTING */
> >
> >  static inline void task_group_account_field(struct task_struct *p, int index,
> > @@ -257,32 +267,45 @@ void account_idle_time(cputime_t cputime)
> >          cpustat[CPUTIME_IDLE] += (__force u64) cputime;
> >  }
> >
> > -static __always_inline unsigned long steal_account_process_tick(unsigned long max_jiffies)
> > +static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
> >  {
> >  #ifdef CONFIG_PARAVIRT
> >          if (static_key_false(&paravirt_steal_enabled)) {
> > +                cputime_t steal_cputime;
> >                  u64 steal;
> > -                unsigned long steal_jiffies;
> >
> >                  steal = paravirt_steal_clock(smp_processor_id());
> >                  steal -= this_rq()->prev_steal_time;
> > +                this_rq()->prev_steal_time += steal;
>
> We are accounting steal_cputime but you make it remember steal_nsecs. This is
> leaking quite some steal time in the way.
>
> Imagine that cputime_t == jiffies_t and HZ=100.
> paravirt_steal_clock() returns 199 nsecs. prev_steal_time gets added those 199.
> nsecs_to_cputime() return 1 jiffy (we are one nsec off the next jiffy). So
> account_steal_time() is accounting 1 jiffy and the 99 remaining nsecs are leaked.
> If some more steal time is to be accounted on the next tick, the 99 previous nsecs
> are forgotten.
>
> A non-leaking sequence would rather be:
>
>         steal = paravirt_steal_clock(smp_processor_id());
>         steal -= this_rq()->prev_steal_time;
>         steal_cputime = min(nsecs_to_cputime(steal), maxtime);
>         account_steal_time(steal_cputime);
>         this_rq()->prev_steal_time += cputime_to_nsecs(steal_cputime);
>
> Thanks!

Good catch. I will fix this!

Thanks for reviewing.

--
All Rights Reversed.
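
For reference, the steal-time leak discussed in this thread can be reproduced with a small userspace sketch. It is not kernel code: `struct steal_state`, `account_leaky()`, `account_exact()` and `NS_PER_TICK` are hypothetical names invented for the illustration, and the tick length of 100 ns is an assumption chosen only to match the round numbers in the review (a real HZ=100 jiffy is 10 ms). The leaky variant mirrors the `prev_steal_time += steal` line from the patch; the exact variant follows the sequence proposed in the review.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tick length, for illustration only (not a real HZ value). */
#define NS_PER_TICK 100ULL

/* Hypothetical stand-in for the per-runqueue steal bookkeeping. */
struct steal_state {
    uint64_t prev_steal_ns;   /* plays the role of this_rq()->prev_steal_time */
    uint64_t accounted_ticks; /* total steal time accounted, in whole ticks */
};

/* Truncating conversions, analogous to nsecs_to_cputime()/cputime_to_nsecs(). */
static uint64_t ns_to_ticks(uint64_t ns)    { return ns / NS_PER_TICK; }
static uint64_t ticks_to_ns(uint64_t ticks) { return ticks * NS_PER_TICK; }

/*
 * Leaky variant: remembers every nanosecond seen so far, but accounts
 * only whole ticks, so the sub-tick remainder is dropped on each call.
 */
static void account_leaky(struct steal_state *s, uint64_t clock_ns)
{
    assert(clock_ns >= s->prev_steal_ns); /* the steal clock is monotonic */
    uint64_t steal = clock_ns - s->prev_steal_ns;

    s->prev_steal_ns += steal;
    s->accounted_ticks += ns_to_ticks(steal);
}

/*
 * Non-leaking variant: remembers only what was actually accounted, so
 * the remainder is carried over and picked up by a later call.
 */
static void account_exact(struct steal_state *s, uint64_t clock_ns,
                          uint64_t max_ticks)
{
    assert(clock_ns >= s->prev_steal_ns);
    uint64_t steal = clock_ns - s->prev_steal_ns;
    uint64_t ticks = ns_to_ticks(steal);

    if (ticks > max_ticks)
        ticks = max_ticks;
    s->accounted_ticks += ticks;
    s->prev_steal_ns += ticks_to_ns(ticks);
}
```

Feeding both variants the steal-clock readings 199 and then 398 shows the difference: the leaky version drops the 99 ns remainder on each call, while the exact version carries it forward so that it contributes to a later tick.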