On 05.07.16 06:17, Wan Zongshun wrote:
>
> On 2016-07-05 09:56, sunnydrake wrote:
>>
>> On 04.07.16 16:51, Wan Zongshun wrote:
>>>
>>> On 7/4/2016 4:48 AM, sunnydrake wrote:
>>>> Thanks for the reply.
>>>> On 03.07.16 17:26, Wan Zongshun wrote:
>>>>>
>>>>> On 7/3/2016 8:59 AM, sunnydrake wrote:
>>>>>> [description]
>>>>>> Working in kernel 3.9.
>>>>>> Oops in current 4.4.0-28 and 4.7.0-040700rc5.
>>>>>> The kernel options ivrs_ioapic[7]=00:14.0 ivrs_ioapic[8]=00:00.1
>>>>>> (a workaround to fix the IVRS table)
>>>>>> cause a kernel Oops on boot.
>>>>> Do you mean "ivrs_ioapic[7]=00:14.0 ivrs_ioapic[8]=00:00.1" works
>>>>> on kernel 3.9 but fails on kernel 4.4?
>>>> 1) Yes, kernel 3.9 boots OK with ivrs_ioapic[7]=00:14.0
>>>> ivrs_ioapic[8]=00:00.1;
>>>> kernels 4.4 and 4.7 fall into an Oops.
>>>>>
>>>>>> [bug]
>>>>>> oops:
>>>>>> short oops text
>>>>>> AMD-Vi: Completion_wait loop timed Out
>>>>>> BUG: unable to handle kernel NULL pointer dereference at 000..03e
>>>>>> ... irq_pm_install_action+0x1c/0xd0
>>>>>> full oops image text
>>>>>> http://img.ctrlv.in/img/16/07/03/577863055370c.jpg
>>>>>>
>>>>>> [additional info]
>>>>>> dmesg|grep AMD-Vi without ivrs_ioapic[8]=00:00.1
>>>>> Is this log from a kernel booted without ivrs_ioapic[8]=00:00.1?
>>>>> Why not provide your kernel log with "ivrs_ioapic[7]=00:14.0
>>>>> ivrs_ioapic[8]=00:00.1"?
>>>>> A full kernel log is better.
>>>>>
>>>>
>>>> 2) Yes, because with ivrs_ioapic[7]=00:14.0 ivrs_ioapic[8]=00:00.1
>>>> the kernels are not bootable. Screenshot of the Oops:
>>>> http://img.ctrlv.in/img/16/07/03/577863055370c.jpg (this is with the
>>>> params ivrs_ioapic[7]=00:14.0 ivrs_ioapic[8]=00:00.1). If you need
>>>> something else, like a kdump, I can provide it.
>>>
>>> If you can provide a full kernel log with ivrs_ioapic[7]=00:14.0
>>> ivrs_ioapic[8]=00:00.1, that is better.
>>> I checked your crash log and found that something related to i8042 may
>>> be wrong. It is the PS/2-related driver; is it necessary in your system?
>>> Can you disable i8042 first to check whether your issue is caused
>>> by it?
>> I have the serial port disabled in the BIOS, and booting with
>> i8042.no_acpi=1 does not fix the problem. I don't think it is i8042
>> related, because i8042_panic_blink is just the Caps Lock blinking when
>> the kernel crashes (standard behavior).
>>
>> Here is a more detailed image of the crash:
>> http://img.ctrlv.in/img/16/07/05/577b0ec96746e.jpg
>
> This is not enough to check this issue. I just see "AMD-vi CW loop
> timeout...", but I cannot see any more info ahead of this timeout.
>
> I guess some PCI device is dead, and that leads to the IOMMU command
> timing out, or something else...
>
Unfortunately kdump can't reproduce this error because it skips some
hardware init. My best bet is to somehow reload the iommu module while
under the kdump kernel (I don't know how). Another finding: I have
irqbypass in use by kvm and vfio_pci, if that is related somehow.
My guess (no, I have not read the iommu code) is that after getting the
IVRS table info it tries to remap interrupts and gets smashed.

From the 4.6 kernel:

/*
 * Called from __setup_irq() with desc->lock held after @action has
 * been installed in the action chain.
 */
void irq_pm_install_action(struct irq_desc *desc, struct irqaction *action)
{
	desc->nr_actions++;

	if (action->flags & IRQF_FORCE_RESUME)
		desc->force_resume_depth++;

	WARN_ON_ONCE(desc->force_resume_depth &&
		     desc->force_resume_depth != desc->nr_actions);

	if (action->flags & IRQF_NO_SUSPEND)
		desc->no_suspend_depth++;
	else if (action->flags & IRQF_COND_SUSPEND)
		desc->cond_suspend_depth++;

	WARN_ON_ONCE(desc->no_suspend_depth &&
		     (desc->no_suspend_depth +
			desc->cond_suspend_depth) != desc->nr_actions);
}

Hmm, this actually checks whether the irq is shared. It is called in
+/source/kernel/irq/manage.c:

/*
 * Internal function to register an irqaction - typically used to
 * allocate special interrupts that are part of the architecture.
 */
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
	[...]
	irq_pm_install_action(desc, new);

>> Unable to handle null pointer reference at irq_pm_install_action...
>> OK, I will set up linux-crashdump and report logs.
>>
>
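
To make the bookkeeping in irq_pm_install_action() easier to follow, here is a
rough userspace sketch of the same counter logic. Everything in it is a
stand-in for illustration only: the model_* names, the F_* flag values and the
struct layouts are hypothetical and are not the kernel's definitions. Instead
of WARN_ON_ONCE() it returns false when one of the two consistency checks
would have fired.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not copied from kernel headers. */
#define F_NO_SUSPEND   0x1u
#define F_FORCE_RESUME 0x2u
#define F_COND_SUSPEND 0x4u

/* Minimal stand-in for struct irqaction: only the flags matter here. */
struct model_action {
	unsigned int flags;
};

/* Minimal stand-in for struct irq_desc: only the PM counters. */
struct model_desc {
	unsigned int nr_actions;
	unsigned int force_resume_depth;
	unsigned int no_suspend_depth;
	unsigned int cond_suspend_depth;
};

/*
 * Mirrors the order of operations in the quoted function.
 * Returns true if no warning condition was hit, false otherwise.
 * Like the quoted code, it dereferences both pointers without a NULL check.
 */
static bool model_install_action(struct model_desc *desc,
				 const struct model_action *action)
{
	bool consistent = true;

	desc->nr_actions++;

	if (action->flags & F_FORCE_RESUME)
		desc->force_resume_depth++;

	/* Either no action forces resume, or every action does. */
	if (desc->force_resume_depth &&
	    desc->force_resume_depth != desc->nr_actions)
		consistent = false;

	if (action->flags & F_NO_SUSPEND)
		desc->no_suspend_depth++;
	else if (action->flags & F_COND_SUSPEND)
		desc->cond_suspend_depth++;

	/* If any action is NO_SUSPEND, all others must be NO_SUSPEND or COND_SUSPEND. */
	if (desc->no_suspend_depth &&
	    (desc->no_suspend_depth + desc->cond_suspend_depth) !=
		    desc->nr_actions)
		consistent = false;

	return consistent;
}
```

In this model, installing a second action with no flags on a descriptor that
already has an F_NO_SUSPEND action makes the function return false, which is
the mixed-flags situation the warnings are there to catch on a shared line.
Since both pointers are used right at the top without any check, a NULL desc
or action would fault almost immediately; which of the two (if either) was
NULL in the Oops above is not established by this sketch.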