From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: RFC: futex_wait() can DoS the tick
Date: Wed, 10 Jun 2015 18:29:18 +0200
Message-ID: <1433953758.6306.25.camel@gmail.com>
In-Reply-To: <alpine.DEB.2.11.1506101639230.3786@nanos>

On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote:
> On Wed, 10 Jun 2015, Mike Galbraith wrote:
> > The above was handed to me by a colleague working on a Xen guest that
> > livelocked.  I at first thought Xen arch must have a weird problem, but
> > when I tried proggy on my desktop box, while it didn't stop the tick
> > completely as it did the Xen box, it slowed it to a crawl.  I noticed
> > that this did not happen with newer kernels, so a bisecting I did go,
> > and found that...
> > 
> > 279f14614 x86: apic: Use tsc deadline for oneshot when available
> > 
> > ..is what fixed it up.  Trouble is, while it fixes up my Haswell box, a
> 
> This does not make any sense at all. It does not matter whether the
> box uses tscdeadline or local apic timer. We do not even program the
> hardware because we see that the event is in the past already.

Yup.

> So we raise the hrtimer softirq, which then expires the timer. So all
> what happens is that ksoftirqd accumulates runtime, but I cannot at
> all see how that amounts to a DoS and brings the machine to a grinding
> halt.

The tick certainly appears to crawl here, and Dom0 boxen gripe if you
let them not tick at all for a while.

> I just booted a SNB with lapic=notscdeadline and ran that test
> program. All what happens is - as expected - that ksoftirqd runs more
> than we would like it to. I cannot observe any anomaly vs. local
> timer interrupts at all. If I run this pinned on an otherwise idle
> core, then I get ~ CONFIG_HZ interrupts per second, which is what you
> expect when the cpu never reaches idle.

Hm.  In order to successfully bisect the thing 3.7->3.8 I ran 2xCPUS
copies because the first bisect went gaga.  I'm not having any trouble
reproducing on master with a single pinned copy though, nor did I have
any on any of the kernels either stable or enterprise I tested, and
that's quite a few.  Whatever, that first bisect did go bad.

> > The below targets the symptom, consider it hrtimer cluebat attractant.
> 
> By now I know to take your patches with a grain of salt :)

Sodium being bad for blood pressure is a medical myth.

> Some more information about your symptoms in form of configuration,
> extra patches, kernel traces etc. would be appreciated.

Virgin source or kernels with zillion+ patches, doesn't matter.  To test
virgin source earlier than EFI_STUB I had to pollute the source with
EFI backports, but nothing else.

Just a sec while I check yet again that absolutely virgin master really
really does stall....  Yup.  I pinned the testcase to CPU3..

while sleep 1; do grep LOC /proc/interrupts; done
LOC:       6706       5367       5053       6217       3031       2866       5477       3022   Local timer interrupts
LOC:       6753       5391       5074       6238       3058       2894       5576       3034   Local timer interrupts
LOC:       6791       5422       5104       6265       3066       2903       5582       3039   Local timer interrupts
LOC:       6846       5472       5154       6293       3096       2909       5595       3042   Local timer interrupts
LOC:       6855       5518       5177       6325       3199       2920       5613       3046   Local timer interrupts
LOC:       6892       5552       5217       6338       3234       2935       5637       3053   Local timer interrupts
LOC:       6983       5568       5236       6347       3244       2944       5660       3065   Local timer interrupts
LOC:       7028       5583       5251       6363       3262       2963       5673       3071   Local timer interrupts
LOC:       7217       5676       5343       6383       3305       2976       5682       3078   Local timer interrupts
LOC:       7432       5803       5418       6387       3371       3039       5757       3080   Local timer interrupts <== here
LOC:       7560       6028       5632       6394       3538       3195       5937       3084   Local timer interrupts
LOC:       7747       6135       5720       6394       3543       3262       6087       3086   Local timer interrupts
LOC:       7930       6206       5785       6394       3571       3288       6303       3087   Local timer interrupts
LOC:       8057       6299       5842       6394       3606       3346       6415       3088   Local timer interrupts
LOC:       8236       6361       5921       6394       3632       3409       6630       3090   Local timer interrupts
LOC:       8382       6448       6004       6394       3664       3478       6754       3090   Local timer interrupts
LOC:       8460       6571       6124       6394       3690       3542       6951       3092   Local timer interrupts
LOC:       8605       6670       6224       6394       3723       3614       7078       3093   Local timer interrupts
LOC:       8710       6842       6323       6394       3776       3702       7295       3123   Local timer interrupts
LOC:       8868       6947       6402       6394       3828       3784       7422       3149   Local timer interrupts
LOC:       9077       7124       6523       6394       3901       3848       7637       3172   Local timer interrupts
LOC:       9222       7189       6596       6394       3971       3928       7763       3174   Local timer interrupts
LOC:       9336       7325       6699       6394       4020       3948       7912       3176   Local timer interrupts
LOC:       9423       7414       6849       6395       4089       3979       7940       3177   Local timer interrupts
LOC:       9637       7595       6923       6395       4111       4039       7942       3179   Local timer interrupts
LOC:       9807       7734       7095       6395       4232       4108       8069       3180   Local timer interrupts
^C
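
To make the stall easier to see in the numbers above, here is a minimal
sketch that turns consecutive LOC lines into per-interval deltas. The
helper names (loc_counts, per_cpu_deltas, frozen_cpus) are made up for
illustration, and the three sample lines are copied from the output
above, starting just after the "<== here" marker. A zero delta alone
does not distinguish a stalled tick from a tickless idle CPU; it is
only telling here because CPU3 is the one running the pinned testcase.

```python
# Three consecutive one-second samples copied from the output above.
SAMPLE = [
    "LOC:       7560       6028       5632       6394       3538       3195       5937       3084   Local timer interrupts",
    "LOC:       7747       6135       5720       6394       3543       3262       6087       3086   Local timer interrupts",
    "LOC:       7930       6206       5785       6394       3571       3288       6303       3087   Local timer interrupts",
]


def loc_counts(line):
    """Return the per-CPU counters from one 'LOC:' line of /proc/interrupts."""
    fields = line.split()
    if not fields or fields[0] != "LOC:":
        raise ValueError("not a LOC line: %r" % line)
    counts = []
    for field in fields[1:]:
        if not field.isdigit():
            break  # reached the trailing description text
        counts.append(int(field))
    return counts


def per_cpu_deltas(lines):
    """Return one list of per-CPU increments for each pair of adjacent samples."""
    samples = [loc_counts(line) for line in lines]
    return [
        [cur - prev for prev, cur in zip(before, after)]
        for before, after in zip(samples, samples[1:])
    ]


def frozen_cpus(lines):
    """Return the CPU indices whose counter did not advance in any interval."""
    deltas = per_cpu_deltas(lines)
    ncpus = len(deltas[0])
    return [cpu for cpu in range(ncpus) if all(row[cpu] == 0 for row in deltas)]


if __name__ == "__main__":
    for row in per_cpu_deltas(SAMPLE):
        print(row)
    print("frozen:", frozen_cpus(SAMPLE))
```

With these samples, CPU3 is the only one whose counter does not move at
all, while the mostly idle CPU7 still picks up a tick or two per
second.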

Config attached.

	-Mike

[-- Attachment #2: config.xz --]
[-- Type: application/x-xz, Size: 23776 bytes --]
