From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751554AbdITA25 (ORCPT <rfc822;w@1wt.eu>);
        Tue, 19 Sep 2017 20:28:57 -0400
Received: from mail-io0-f195.google.com ([209.85.223.195]:33043 "EHLO
        mail-io0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751348AbdITA24 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 19 Sep 2017 20:28:56 -0400
X-Google-Smtp-Source: AOwi7QB6WejMraEOkG0VKg8vsNw6gM5T7B785sRpqPXoQ8cBJsTcRVkJBvYA6hkrTqAwmc/nle/9xQ==
Date: Tue, 19 Sep 2017 20:30:44 -0400
From: Chuck Ebbert <cebbert.lkml@gmail.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yanko Kaneti <yaneti@declera.com>, LKML <linux-kernel@vger.kernel.org>,
        Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [regression 4.14rc] 74def747bcd0 (genirq: Restrict effective
 affinity to interrupts actually using it)
Message-ID: <20170919203044.560cb9f1@gmail.com>
In-Reply-To: <225dd0d8-2c27-57a6-c17d-c552c011d8da@arm.com>
References: <1505833936.2634.11.camel@declera.com>
        <4374f6c0-dd67-3bd3-91a0-685eb9a0d711@arm.com>
        <1505835616.2634.14.camel@declera.com>
        <225dd0d8-2c27-57a6-c17d-c552c011d8da@arm.com>
Organization: Very little
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 19 Sep 2017 16:51:06 +0100
Marc Zyngier <marc.zyngier@arm.com> wrote:

> On 19/09/17 16:40, Yanko Kaneti wrote:
> > On Tue, 2017-09-19 at 16:33 +0100, Marc Zyngier wrote:  
> >> On 19/09/17 16:12, Yanko Kaneti wrote:  
> >>> Hello, 
> >>>
> >>> Fedora rawhide config here. 
> >>> AMD FX-8370E
> >>>
> >>> Bisected a problem to:
> >>> 74def747bcd0 (genirq: Restrict effective affinity to interrupts
> >>> actually using it) 
> >>>
> >>> It seems to be causing stalls, short lived or long lived lockups
> >>> very shortly after boot. Everything becomes jerky.
> >>>
> >>> The only visible in the log indication is something like :
> >>> ....
> >>> [   59.802129] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
> >>> [   59.802134] clocksource:                       'hpet' wd_now: 3326e7aa wd_last: 329956f8 mask: ffffffff
> >>> [   59.802137] clocksource:                       'tsc' cs_now: 423662bc6f cs_last: 41dfc91650 mask: ffffffffffffffff
> >>> [   59.802140] tsc: Marking TSC unstable due to clocksource watchdog
> >>> [   59.802158] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
> >>> [   59.802161] sched_clock: Marking unstable (59802142067, 15510)<-(59920871789, -118714277)
> >>> [   60.015604] clocksource: Switched to clocksource hpet
> >>> [   89.015994] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 209.660 msecs
> >>> [   89.016003] perf: interrupt took too long (1638003 > 2500), lowering kernel.perf_event_max_sample_rate to 1000
> >>> ....
> >>>
> >>> Just reverting that commit on top of linus mainline cures all the
> >>> symptoms  
> >>
> >> Interesting. Do you still get HPET interrupts?  
> > 
> > Sorry, I might need some basic help here (i.e where do I count
> > them...)  
> 
> /proc/interrupts should display them.
> 
> > After the watchdog switches the clocksource to hpet the system is
> > still somewhat alive, so I'll guess some clock is still
> > ticking....  
> Probably, but I suspect they're not hitting the right CPU, hence the
> lockups.
> 
> Unfortunately, my x86-foo is pretty minimal, and I'm about to drop off
> the net for a few days.
> 
> Thomas, any insight?

Looking at flat_cpu_mask_to_apicid(), I don't see how 74def747bcd0
can be correct:

	struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata);
	unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS;

	if (!cpu_mask)
		return -EINVAL;
	*apicid = (unsigned int)cpu_mask;
	cpumask_bits(effmsk)[0] = cpu_mask;

Before that patch, this function wrote to the effective mask
unconditionally. After it, the function only writes to the effective
mask if that mask is already non-zero.
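
To make the concern concrete, here is a minimal user-space sketch. It
assumes (my reading of the patch, not verified against the tree) that
after 74def747bcd0 the getter hands back the plain affinity mask
whenever the effective mask is empty. The struct and function names
below are made up for illustration and each mask is reduced to a single
word:

```c
#include <assert.h>

/* Hypothetical stand-in for the two masks kept per interrupt. */
struct model_irq_data {
	unsigned long affinity;           /* what the user asked for */
	unsigned long effective_affinity; /* where the irq really lands */
};

/* Assumed pre-patch getter: always returns the effective mask. */
static unsigned long *get_effective_old(struct model_irq_data *d)
{
	return &d->effective_affinity;
}

/*
 * Assumed post-patch getter: falls back to the affinity mask when the
 * effective mask is still empty.
 */
static unsigned long *get_effective_new(struct model_irq_data *d)
{
	if (d->effective_affinity)
		return &d->effective_affinity;
	return &d->affinity;
}

/* Models the final store in flat_cpu_mask_to_apicid(). */
static void model_store(unsigned long *effmsk, unsigned long cpu_mask)
{
	*effmsk = cpu_mask;
}
```

Starting from affinity = 0xff and effective_affinity = 0, a store of
0x1 through get_effective_old() leaves the effective mask at 0x1. The
same store through get_effective_new() leaves the effective mask at 0
and overwrites the affinity mask instead, so the effective mask can
never become non-zero this way.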