From: Jiang Liu <jiang.liu@linux.intel.com>
To: Borislav Petkov <bp@alien8.de>, "Huang, Ying" <ying.huang@intel.com>
Cc: Joe Lawrence <joe.lawrence@stratus.com>,
Thomas Gleixner <tglx@linutronix.de>,
lkp@01.org, LKML <linux-kernel@vger.kernel.org>,
x86-ml <x86@kernel.org>
Subject: Re: [LKP] [lkp] [x86/irq] 4c24cee6b2: IP-Config: Auto-configuration of network failed
Date: Tue, 15 Dec 2015 15:55:14 +0800
Message-ID: <566FC762.1040107@linux.intel.com>
In-Reply-To: <20151214095427.GA11638@pd.tnic>
[-- Attachment #1: Type: text/plain, Size: 4430 bytes --]
On 2015/12/14 17:54, Borislav Petkov wrote:
> On Mon, Dec 14, 2015 at 02:54:02PM +0800, Huang, Ying wrote:
>> No, there are no other systems reporting the same issue. I will queue
>> more tests to make sure this is not a false positive.
>
> I can trigger this too with my guest here.
>
> I have these two on top of rc5:
>
> cc22b9b83f6a x86/irq: Enhance __assign_irq_vector() to rollback in case of failure
> 45dd79e03e1e x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer
> 9f9499ae8e64 Linux 4.4-rc5
>
> and my guest stalls while booting.
>
> The new thing I see in dmesg is this:
>
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> +..MP-BIOS bug: 8254 timer not connected to IO-APIC
> +...trying to set up timer (IRQ0) through the 8259A ...
> +..... (found apic 0 pin 2) ...
> +....... failed.
> +...trying to set up timer as Virtual Wire IRQ...
> +..... failed.
> +...trying to set up timer as ExtINT IRQ...
> +..... works.
> +APIC calibration not consistent with PM-Timer: 111ms instead of 100ms
> +APIC delta adjusted to PM-Timer: 6248393 (6997337)
>
> which leads to boot stalling and timing out when loading the hdd
> driver:
Hi Boris and Ying,
Aha, found a possible regression. Could you please help
apply the attached bugfix patch on top of "cc22b9b83f6a x86/irq:
Enhance __assign_irq_vector() to rollback in case of failure"?
Hi Ying, I have pushed this patch to github, so it should reach
the 0day test farm soon :)
Thanks,
Gerry
>
> ...
> [ 3.973447] console [netcon0] enabled
> [ 3.976099] netconsole: network logging started
> [ 3.979604] rtc_cmos 00:00: setting system clock to 2015-12-14 10:45:35 UTC (1450089935)
> [ 3.985348] PM: Checking hibernation image partition /dev/sdb1
> [ 6.600706] usb 1-1: New USB device found, idVendor=0627, idProduct=0001
> [ 6.613651] usb 1-1: New USB device strings: Mfr=1, Product=3, SerialNumber=5
> [ 6.636905] usb 1-1: Product: QEMU USB Tablet
> [ 6.642248] usb 1-1: Manufacturer: QEMU
> [ 6.647109] usb 1-1: SerialNumber: 42
> [ 7.580995] ata2.00: qc timeout (cmd 0xa0)
> [ 7.589300] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [ 7.750715] ata2.01: NODEV after polling detection
> [ 7.759605] ata2.00: configured for MWDMA2
> [ 8.585691] input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:01.2/usb1/1-1/1-1:1.0/0003:0627:0001.0001/input/input1
> [ 8.602467] hid-generic 0003:0627:0001.0001: input,hidraw0: USB HID v0.01 Pointer [QEMU QEMU USB Tablet] on usb-0000:00:01.2-1/input0
> [ 12.760846] ata2.00: qc timeout (cmd 0xa0)
> [ 12.786543] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [ 12.796576] ata2.00: limiting speed to MWDMA2:PIO3
> [ 12.958455] ata2.01: NODEV after polling detection
> [ 12.969693] ata2.00: configured for MWDMA2
> [ 17.972782] ata2.00: qc timeout (cmd 0xa0)
> [ 17.978967] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [ 17.983495] ata2.00: disabled
> [ 17.986352] ata2: soft resetting link
> [ 18.146586] ata2.01: NODEV after polling detection
> [ 18.151413] ata2: EH complete
> [ 32.745227] ata1: lost interrupt (Status 0x50)
> [ 32.748470] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [ 32.756586] ata1.00: failed command: READ DMA
> [ 32.761251] ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
> [ 32.761251] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [ 32.773928] ata1.00: status: { DRDY }
> [ 32.777028] ata1: soft resetting link
> [ 32.934437] ata1.01: NODEV after polling detection
> [ 32.946663] ata1.00: configured for MWDMA2
> [ 32.949964] ata1.00: device reported invalid CHS sector 0
> [ 32.953793] ata1: EH complete
> [ 63.849089] ata1: lost interrupt (Status 0x50)
> [ 63.857470] ata1.00: limiting speed to MWDMA1:PIO4
> [ 63.860982] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [ 63.865862] ata1.00: failed command: READ DMA
> [ 63.883697] ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
> [ 63.883697] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [ 63.899573] ata1.00: status: { DRDY }
> [ 63.902649] ata1: soft resetting link
> [ 64.062580] ata1.01: NODEV after polling detection
> [ 64.073800] ata1.00: configured for MWDMA1
> [ 64.076813] ata1.00: device reported invalid CHS sector 0
> [ 64.096188] ata1: EH complete
>
[-- Attachment #2: 0001-.patch --]
[-- Type: text/x-patch, Size: 1873 bytes --]
>From c7c3cc3a048576fd1e196e67b11ae0193e7fba1e Mon Sep 17 00:00:00 2001
From: Jiang Liu <jiang.liu@linux.intel.com>
Date: Tue, 15 Dec 2015 15:40:43 +0800
Subject: [PATCH]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
arch/x86/kernel/apic/vector.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index f03957e7c50d..fce2853f70d9 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -116,14 +116,13 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
*/
static int current_vector = FIRST_EXTERNAL_VECTOR + VECTOR_OFFSET_START;
static int current_offset = VECTOR_OFFSET_START % 16;
- int cpu, err;
- unsigned int dest = d->cfg.dest_apicid;
+ int cpu, err = -ENOSPC;
+ unsigned int dest;
if (d->move_in_progress)
return -EBUSY;
/* Only try and allocate irqs on cpus that are present */
- err = -ENOSPC;
cpumask_clear(d->old_domain);
cpumask_clear(used_cpumask);
cpu = cpumask_first_and(mask, cpu_online_mask);
@@ -133,9 +132,6 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
apic->vector_allocation_domain(cpu, vector_cpumask, mask);
if (cpumask_subset(vector_cpumask, d->domain)) {
- err = 0;
- if (cpumask_equal(vector_cpumask, d->domain))
- break;
/*
* New cpumask using the vector is a proper subset of
* the current in use mask. So cleanup the vector
@@ -144,7 +140,7 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
cpumask_and(used_cpumask, d->domain, vector_cpumask);
err = apic->cpu_mask_to_apicid_and(mask, used_cpumask,
&dest);
- if (err)
+ if (err || cpumask_equal(vector_cpumask, d->domain))
break;
cpumask_andnot(d->old_domain, d->domain,
vector_cpumask);
--
1.7.10.4
Thread overview: 24+ messages
2015-12-11 7:49 [lkp] [x86/irq] 4c24cee6b2: IP-Config: Auto-configuration of network failed kernel test robot
2015-12-14 6:38 ` Jiang Liu
2015-12-14 6:54 ` [LKP] " Huang, Ying
2015-12-14 9:54 ` Borislav Petkov
2015-12-15 7:55 ` Jiang Liu [this message]
2015-12-15 10:08 ` Borislav Petkov
2015-12-19 20:31 ` Thomas Gleixner
2015-12-23 14:13 ` [Bugfix v2 1/5] x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer Jiang Liu
2015-12-23 14:13 ` [Bugfix v2 2/5] x86/irq: Enhance __assign_irq_vector() to rollback in case of failure Jiang Liu
2015-12-30 18:52 ` Thomas Gleixner
2015-12-23 14:13 ` [Bugfix v2 3/5] x86/irq: Fix a race window in x86_vector_free_irqs() Jiang Liu
2015-12-29 13:39 ` Thomas Gleixner
2016-01-16 21:16 ` [tip:x86/urgent] x86/irq: Fix a race " tip-bot for Jiang Liu
2015-12-23 14:13 ` [Bugfix v2 4/5] x86/irq: Fix a race condition between vector assigning and cleanup Jiang Liu
2015-12-23 18:41 ` Borislav Petkov
2015-12-30 17:25 ` Thomas Gleixner
2015-12-30 22:50 ` Thomas Gleixner
2015-12-23 14:13 ` [Bugfix v2 5/5] x86/irq: Trivial cleanups for x86 vector allocation code Jiang Liu
2015-12-23 19:10 ` [Bugfix v2 1/5] x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer Borislav Petkov
2015-12-24 5:15 ` Jeremiah Mahler
2015-12-28 8:24 ` Jiang Liu
2015-12-29 3:26 ` Jeremiah Mahler
2015-12-24 14:34 ` Joe Lawrence
2016-01-16 21:16 ` [tip:x86/urgent] x86/irq: Do not use " tip-bot for Jiang Liu