From: Jiang Liu <jiang.liu@linux.intel.com>
To: lkp@lists.01.org
Subject: Re: [x86/irq] 4c24cee6b2: IP-Config: Auto-configuration of network failed
Date: Tue, 15 Dec 2015 15:55:14 +0800
Message-ID: <566FC762.1040107@linux.intel.com>
In-Reply-To: <20151214095427.GA11638@pd.tnic>
[-- Attachment #1: Type: text/plain, Size: 4522 bytes --]
On 2015/12/14 17:54, Borislav Petkov wrote:
> On Mon, Dec 14, 2015 at 02:54:02PM +0800, Huang, Ying wrote:
>> No, there are no other systems reporting the same issue. I will queue
>> more tests to make sure this is not a false positive.
>
> I can trigger this too with my guest here.
>
> I have these two on top of rc5:
>
> cc22b9b83f6a x86/irq: Enhance __assign_irq_vector() to rollback in case of failure
> 45dd79e03e1e x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer
> 9f9499ae8e64 Linux 4.4-rc5
>
> and my guest stalls while booting.
>
> The new thing I see in dmesg is this:
>
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> +..MP-BIOS bug: 8254 timer not connected to IO-APIC
> +...trying to set up timer (IRQ0) through the 8259A ...
> +..... (found apic 0 pin 2) ...
> +....... failed.
> +...trying to set up timer as Virtual Wire IRQ...
> +..... failed.
> +...trying to set up timer as ExtINT IRQ...
> +..... works.
> +APIC calibration not consistent with PM-Timer: 111ms instead of 100ms
> +APIC delta adjusted to PM-Timer: 6248393 (6997337)
>
> which leads to the boot stalling and timing out when loading the hdd
> driver:
Hi Boris and Ying,
Aha, I have found a possible regression. Could you please
apply the attached bugfix patch on top of "cc22b9b83f6a x86/irq:
Enhance __assign_irq_vector() to rollback in case of failure"?
Ying, I have pushed this patch to GitHub, so it should reach the
0day test farm soon :)
Thanks,
Gerry
>
> ...
> [ 3.973447] console [netcon0] enabled
> [ 3.976099] netconsole: network logging started
> [ 3.979604] rtc_cmos 00:00: setting system clock to 2015-12-14 10:45:35 UTC (1450089935)
> [ 3.985348] PM: Checking hibernation image partition /dev/sdb1
> [ 6.600706] usb 1-1: New USB device found, idVendor=0627, idProduct=0001
> [ 6.613651] usb 1-1: New USB device strings: Mfr=1, Product=3, SerialNumber=5
> [ 6.636905] usb 1-1: Product: QEMU USB Tablet
> [ 6.642248] usb 1-1: Manufacturer: QEMU
> [ 6.647109] usb 1-1: SerialNumber: 42
> [ 7.580995] ata2.00: qc timeout (cmd 0xa0)
> [ 7.589300] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [ 7.750715] ata2.01: NODEV after polling detection
> [ 7.759605] ata2.00: configured for MWDMA2
> [ 8.585691] input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:01.2/usb1/1-1/1-1:1.0/0003:0627:0001.0001/input/input1
> [ 8.602467] hid-generic 0003:0627:0001.0001: input,hidraw0: USB HID v0.01 Pointer [QEMU QEMU USB Tablet] on usb-0000:00:01.2-1/input0
> [ 12.760846] ata2.00: qc timeout (cmd 0xa0)
> [ 12.786543] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [ 12.796576] ata2.00: limiting speed to MWDMA2:PIO3
> [ 12.958455] ata2.01: NODEV after polling detection
> [ 12.969693] ata2.00: configured for MWDMA2
> [ 17.972782] ata2.00: qc timeout (cmd 0xa0)
> [ 17.978967] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [ 17.983495] ata2.00: disabled
> [ 17.986352] ata2: soft resetting link
> [ 18.146586] ata2.01: NODEV after polling detection
> [ 18.151413] ata2: EH complete
> [ 32.745227] ata1: lost interrupt (Status 0x50)
> [ 32.748470] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [ 32.756586] ata1.00: failed command: READ DMA
> [ 32.761251] ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
> [ 32.761251] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [ 32.773928] ata1.00: status: { DRDY }
> [ 32.777028] ata1: soft resetting link
> [ 32.934437] ata1.01: NODEV after polling detection
> [ 32.946663] ata1.00: configured for MWDMA2
> [ 32.949964] ata1.00: device reported invalid CHS sector 0
> [ 32.953793] ata1: EH complete
> [ 63.849089] ata1: lost interrupt (Status 0x50)
> [ 63.857470] ata1.00: limiting speed to MWDMA1:PIO4
> [ 63.860982] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [ 63.865862] ata1.00: failed command: READ DMA
> [ 63.883697] ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
> [ 63.883697] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [ 63.899573] ata1.00: status: { DRDY }
> [ 63.902649] ata1: soft resetting link
> [ 64.062580] ata1.01: NODEV after polling detection
> [ 64.073800] ata1.00: configured for MWDMA1
> [ 64.076813] ata1.00: device reported invalid CHS sector 0
> [ 64.096188] ata1: EH complete
>
[-- Attachment #2: 0001-.patch --]
[-- Type: text/x-patch, Size: 1873 bytes --]
>From c7c3cc3a048576fd1e196e67b11ae0193e7fba1e Mon Sep 17 00:00:00 2001
From: Jiang Liu <jiang.liu@linux.intel.com>
Date: Tue, 15 Dec 2015 15:40:43 +0800
Subject: [PATCH]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
arch/x86/kernel/apic/vector.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index f03957e7c50d..fce2853f70d9 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -116,14 +116,13 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 	 */
 	static int current_vector = FIRST_EXTERNAL_VECTOR + VECTOR_OFFSET_START;
 	static int current_offset = VECTOR_OFFSET_START % 16;
-	int cpu, err;
-	unsigned int dest = d->cfg.dest_apicid;
+	int cpu, err = -ENOSPC;
+	unsigned int dest;
 
 	if (d->move_in_progress)
 		return -EBUSY;
 
 	/* Only try and allocate irqs on cpus that are present */
-	err = -ENOSPC;
 	cpumask_clear(d->old_domain);
 	cpumask_clear(used_cpumask);
 	cpu = cpumask_first_and(mask, cpu_online_mask);
@@ -133,9 +132,6 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 		apic->vector_allocation_domain(cpu, vector_cpumask, mask);
 
 		if (cpumask_subset(vector_cpumask, d->domain)) {
-			err = 0;
-			if (cpumask_equal(vector_cpumask, d->domain))
-				break;
 			/*
 			 * New cpumask using the vector is a proper subset of
 			 * the current in use mask. So cleanup the vector
@@ -144,7 +140,7 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 			cpumask_and(used_cpumask, d->domain, vector_cpumask);
 			err = apic->cpu_mask_to_apicid_and(mask, used_cpumask,
 							   &dest);
-			if (err)
+			if (err || cpumask_equal(vector_cpumask, d->domain))
 				break;
 			cpumask_andnot(d->old_domain, d->domain,
 				       vector_cpumask);
--
1.7.10.4
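The last two hunks of the patch move the cpumask_equal() check so that it runs after the call to apic->cpu_mask_to_apicid_and() instead of before it, which means dest is computed before the loop breaks in the "masks are equal" case. The following is only a toy model of that one branch, not kernel code: toy_mask_t, toy_mask_to_apicid_and() and toy_subset_branch() are hypothetical names, plain bitmasks stand in for struct cpumask, and "lowest CPU number" is an assumed placeholder for the real APIC ID lookup.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef unsigned int toy_mask_t;

/*
 * Hypothetical replacement for apic->cpu_mask_to_apicid_and(): reports the
 * lowest CPU number present in both masks, or -EINVAL if they do not overlap.
 */
static int toy_mask_to_apicid_and(toy_mask_t mask, toy_mask_t used,
				  unsigned int *dest)
{
	toy_mask_t both = mask & used;
	unsigned int cpu = 0;

	if (!both)
		return -EINVAL;
	while (!(both & 1u)) {
		both >>= 1;
		cpu++;
	}
	*dest = cpu;
	return 0;
}

/*
 * Models only the "vector_cpumask is a subset of d->domain" branch as it
 * reads after the patch. Returns the value err would hold at the break.
 */
static int toy_subset_branch(toy_mask_t mask, toy_mask_t domain,
			     toy_mask_t vector_mask, unsigned int *dest,
			     toy_mask_t *old_domain)
{
	toy_mask_t used;
	int err;

	/* Caller must have taken the cpumask_subset() branch. */
	assert((vector_mask & ~domain) == 0);

	used = domain & vector_mask;
	err = toy_mask_to_apicid_and(mask, used, dest);
	if (err || vector_mask == domain)
		return err;	/* patched check: dest is already filled in */

	/* Proper subset: remember the CPUs that drop out of the domain. */
	*old_domain = domain & ~vector_mask;
	return 0;
}
```

In this model, calling toy_subset_branch(0xF, 0x6, 0x6, &dest, &old) returns 0 with dest set to 1 and old left untouched, while a vector mask of 0x4 against the same domain sets dest to 2 and old to 0x2. Whether the missing dest computation is the actual cause of the stall reported above is not something this sketch can show.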
WARNING: multiple messages have this Message-ID (diff)
From: Jiang Liu <jiang.liu@linux.intel.com>
To: Borislav Petkov <bp@alien8.de>, "Huang, Ying" <ying.huang@intel.com>
Cc: Joe Lawrence <joe.lawrence@stratus.com>,
Thomas Gleixner <tglx@linutronix.de>,
lkp@01.org, LKML <linux-kernel@vger.kernel.org>,
x86-ml <x86@kernel.org>
Subject: Re: [LKP] [lkp] [x86/irq] 4c24cee6b2: IP-Config: Auto-configuration of network failed
Date: Tue, 15 Dec 2015 15:55:14 +0800
Message-ID: <566FC762.1040107@linux.intel.com>
In-Reply-To: <20151214095427.GA11638@pd.tnic>
Thread overview: 31+ messages
2015-12-11 7:49 [x86/irq] 4c24cee6b2: IP-Config: Auto-configuration of network failed kernel test robot
2015-12-11 7:49 ` [lkp] " kernel test robot
2015-12-14 6:38 ` Jiang Liu
2015-12-14 6:38 ` [lkp] " Jiang Liu
2015-12-14 6:54 ` Huang, Ying
2015-12-14 6:54 ` [LKP] [lkp] " Huang, Ying
2015-12-14 9:54 ` Borislav Petkov
2015-12-14 9:54 ` [LKP] [lkp] " Borislav Petkov
2015-12-15 7:55 ` Jiang Liu [this message]
2015-12-15 7:55 ` Jiang Liu
2015-12-15 10:08 ` Borislav Petkov
2015-12-15 10:08 ` [LKP] [lkp] " Borislav Petkov
2015-12-19 20:31 ` Thomas Gleixner
2015-12-19 20:31 ` [LKP] [lkp] " Thomas Gleixner
2015-12-23 14:13 ` [Bugfix v2 1/5] x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer Jiang Liu
2015-12-23 14:13 ` [Bugfix v2 2/5] x86/irq: Enhance __assign_irq_vector() to rollback in case of failure Jiang Liu
2015-12-30 18:52 ` Thomas Gleixner
2015-12-23 14:13 ` [Bugfix v2 3/5] x86/irq: Fix a race window in x86_vector_free_irqs() Jiang Liu
2015-12-29 13:39 ` Thomas Gleixner
2016-01-16 21:16 ` [tip:x86/urgent] x86/irq: Fix a race " tip-bot for Jiang Liu
2015-12-23 14:13 ` [Bugfix v2 4/5] x86/irq: Fix a race condition between vector assigning and cleanup Jiang Liu
2015-12-23 18:41 ` Borislav Petkov
2015-12-30 17:25 ` Thomas Gleixner
2015-12-30 22:50 ` Thomas Gleixner
2015-12-23 14:13 ` [Bugfix v2 5/5] x86/irq: Trivial cleanups for x86 vector allocation code Jiang Liu
2015-12-23 19:10 ` [Bugfix v2 1/5] x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer Borislav Petkov
2015-12-24 5:15 ` Jeremiah Mahler
2015-12-28 8:24 ` Jiang Liu
2015-12-29 3:26 ` Jeremiah Mahler
2015-12-24 14:34 ` Joe Lawrence
2016-01-16 21:16 ` [tip:x86/urgent] x86/irq: Do not use " tip-bot for Jiang Liu