public inbox for linux-kernel@vger.kernel.org
From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: RFC: futex_wait() can DoS the tick
Date: Wed, 10 Jun 2015 18:29:18 +0200
Message-ID: <1433953758.6306.25.camel@gmail.com>
In-Reply-To: <alpine.DEB.2.11.1506101639230.3786@nanos>

On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote:
> On Wed, 10 Jun 2015, Mike Galbraith wrote:
> > The above was handed to me by a colleague working on a Xen guest that
> > livelocked.  I at first thought Xen arch must have a weird problem, but
> > when I tried proggy on my desktop box, while it didn't stop the tick
> > completely as it did the Xen box, it slowed it to a crawl.  I noticed
> > that this did not happen with newer kernels, so a bisecting I did go,
> > and found that...
> > 
> > 279f14614 x86: apic: Use tsc deadline for oneshot when available
> > 
> > ..is what fixed it up.  Trouble is, while it fixes up my Haswell box, a
> 
> This does not make any sense at all. It does not matter whether the
> box uses tscdeadline or local apic timer. We do not even program the
> hardware because we see that the event is in the past already.

Yup.

> So we raise the hrtimer softirqd, which then expires the timer. So all
> what happens is that ksoftirqd accumulates runtime, but I cannot at
> all see how that amounts to a DoS and brings the machine to a grinding
> halt.

The tick certainly appears to crawl here, and Dom0 boxen gripe if you
let them not tick at all for a while.

> I just booted a SNB with lapic=notscdeadline and ran that test
> program. All what happens is - as expected - that ksoftirqd runs more
> than we would like it to. I cannot observe any anomaly vs. local
> timer interrupts at all. If I run this pinned on an otherwise idle
> core, then I get ~ CONFIG_HZ interrupts per second, which is what you
> expect when the cpu never reaches idle.

Hm.  In order to successfully bisect the thing 3.7->3.8 I ran 2xCPUS
copies because the first bisect went gaga.  I'm not having any trouble
reproducing on master with a single pinned copy though, nor did I have
any on any of the kernels, stable or enterprise, that I tested, and
that's quite a few.  Whatever, that first bisect did go bad.

> > The below targets the symptom, consider it hrtimer cluebat attractant.
> 
> By now I know to take your patches with a grain of salt :)

Sodium being bad for blood pressure is a medical myth.

> Some more information about your symptoms in form of configuration,
> extra patches, kernel traces etc. would be appreciated.

Virgin source or kernels with zillion+ patches, doesn't matter.  To test
virgin source earlier than EFI_STUB I had to pollute the source with
EFI backports, but nothing else.

Just a sec while I check yet again that absolutely virgin master really
really does stall....  Yup.  I pinned the testcase to CPU3..

while sleep 1; do grep LOC /proc/interrupts; done
LOC:       6706       5367       5053       6217       3031       2866       5477       3022   Local timer interrupts
LOC:       6753       5391       5074       6238       3058       2894       5576       3034   Local timer interrupts
LOC:       6791       5422       5104       6265       3066       2903       5582       3039   Local timer interrupts
LOC:       6846       5472       5154       6293       3096       2909       5595       3042   Local timer interrupts
LOC:       6855       5518       5177       6325       3199       2920       5613       3046   Local timer interrupts
LOC:       6892       5552       5217       6338       3234       2935       5637       3053   Local timer interrupts
LOC:       6983       5568       5236       6347       3244       2944       5660       3065   Local timer interrupts
LOC:       7028       5583       5251       6363       3262       2963       5673       3071   Local timer interrupts
LOC:       7217       5676       5343       6383       3305       2976       5682       3078   Local timer interrupts
LOC:       7432       5803       5418       6387       3371       3039       5757       3080   Local timer interrupts <== here
LOC:       7560       6028       5632       6394       3538       3195       5937       3084   Local timer interrupts
LOC:       7747       6135       5720       6394       3543       3262       6087       3086   Local timer interrupts
LOC:       7930       6206       5785       6394       3571       3288       6303       3087   Local timer interrupts
LOC:       8057       6299       5842       6394       3606       3346       6415       3088   Local timer interrupts
LOC:       8236       6361       5921       6394       3632       3409       6630       3090   Local timer interrupts
LOC:       8382       6448       6004       6394       3664       3478       6754       3090   Local timer interrupts
LOC:       8460       6571       6124       6394       3690       3542       6951       3092   Local timer interrupts
LOC:       8605       6670       6224       6394       3723       3614       7078       3093   Local timer interrupts
LOC:       8710       6842       6323       6394       3776       3702       7295       3123   Local timer interrupts
LOC:       8868       6947       6402       6394       3828       3784       7422       3149   Local timer interrupts
LOC:       9077       7124       6523       6394       3901       3848       7637       3172   Local timer interrupts
LOC:       9222       7189       6596       6394       3971       3928       7763       3174   Local timer interrupts
LOC:       9336       7325       6699       6394       4020       3948       7912       3176   Local timer interrupts
LOC:       9423       7414       6849       6395       4089       3979       7940       3177   Local timer interrupts
LOC:       9637       7595       6923       6395       4111       4039       7942       3179   Local timer interrupts
LOC:       9807       7734       7095       6395       4232       4108       8069       3180   Local timer interrupts
^C
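
The stall is easier to see as per-second deltas than as raw counters.
A minimal sketch, assuming the LOC line format shown above; the helper
names parse_loc and loc_deltas are hypothetical, and the two samples
are the pair of lines right after the "<== here" marker:

```python
def parse_loc(line):
    """Return the per-CPU counts from a 'LOC:' line of /proc/interrupts."""
    fields = line.split()
    if not fields or fields[0] != "LOC:":
        raise ValueError("not a LOC line")
    counts = []
    for field in fields[1:]:
        # The numeric columns end where the description text begins.
        if not field.isdigit():
            break
        counts.append(int(field))
    return counts


def loc_deltas(prev, cur):
    """Per-CPU difference between two consecutive samples."""
    return [c - p for p, c in zip(prev, cur)]


before = parse_loc(
    "LOC:       7560       6028       5632       6394       3538"
    "       3195       5937       3084   Local timer interrupts"
)
after = parse_loc(
    "LOC:       7747       6135       5720       6394       3543"
    "       3262       6087       3086   Local timer interrupts"
)

# Index 3 is CPU3, where the testcase was pinned.
print(loc_deltas(before, after))
# prints [187, 107, 88, 0, 5, 67, 150, 2]
```

The zero at index 3 is CPU3 taking no local timer interrupts at all
during that one second interval.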

Config attached.

	-Mike

[-- Attachment #2: config.xz --]
[-- Type: application/x-xz, Size: 23776 bytes --]
