From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: linux-pci@vger.kernel.org, stable@vger.kernel.org,
regressions@lists.kernel.dev,
xen-devel <xen-devel@lists.xenproject.org>,
Thomas Gleixner <tglx@linutronix.de>,
Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: Kernel panic in __pci_enable_msix_range on Xen PV with PCI passthrough
Date: Wed, 25 Aug 2021 17:47:31 +0200
Message-ID: <YSZmFMeVeO4Bupn+@mail-itl>
In-Reply-To: <3e72345b-d0e1-7856-de51-e74714474724@suse.com>
On Wed, Aug 25, 2021 at 05:33:54PM +0200, Jan Beulich wrote:
> On 25.08.2021 17:24, Marek Marczykowski-Górecki wrote:
> > On recent kernels I get a kernel panic when starting a Xen PV domain with
> > PCI devices assigned. This happens on 5.10.60 (worked on .54) and
> > 5.4.142 (worked on .136):
> >
> > [ 13.683009] pcifront pci-0: claiming resource 0000:00:00.0/0
> > [ 13.683042] pcifront pci-0: claiming resource 0000:00:00.0/1
> > [ 13.683049] pcifront pci-0: claiming resource 0000:00:00.0/2
> > [ 13.683055] pcifront pci-0: claiming resource 0000:00:00.0/3
> > [ 13.683061] pcifront pci-0: claiming resource 0000:00:00.0/6
> > [ 14.036142] e1000e: Intel(R) PRO/1000 Network Driver
> > [ 14.036179] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> > [ 14.036982] e1000e 0000:00:00.0: Xen PCI mapped GSI11 to IRQ13
> > [ 14.044561] e1000e 0000:00:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> > [ 14.045188] BUG: unable to handle page fault for address: ffffc9004069100c
> > [ 14.045197] #PF: supervisor write access in kernel mode
> > [ 14.045202] #PF: error_code(0x0003) - permissions violation
> > [ 14.045211] PGD 18f1c067 P4D 18f1c067 PUD 4dbd067 PMD 4fba067 PTE 80100000febd4075
>
> I'm curious what lives at physical address FEBD4000.
This is BAR 3 (Region 3) of this device, which holds the MSI-X table and PBA:
00:04.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
    Subsystem: Intel Corporation Device 0000
    Physical Slot: 4
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 11
    Region 0: Memory at feb80000 (32-bit, non-prefetchable) [size=128K]
    Region 1: Memory at feba0000 (32-bit, non-prefetchable) [size=128K]
    Region 2: I/O ports at c080 [size=32]
    Region 3: Memory at febd4000 (32-bit, non-prefetchable) [size=16K]
    Expansion ROM at feb40000 [disabled] [size=256K]
    Capabilities: [c8] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [e0] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
            TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
    Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
        Vector table: BAR=3 offset=00000000
        PBA: BAR=3 offset=00002000
    Kernel driver in use: pciback
    Kernel modules: e1000e
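For reference, the PTE value and the error code from the oops can be decoded
by hand to cross-check this. The sketch below is only an illustration, assuming
the standard x86-64 4-level PTE layout (frame address in bits 12..51, flags in
the low bits, NX in bit 63); the function names are made up for this example
and are not kernel or Xen APIs.

```python
# Decode the PTE and page-fault error code quoted in the oops.
# Bit positions follow the x86-64 4-level paging format (an assumption
# of this sketch; nothing here is taken from kernel or Xen sources).

PTE = 0x80100000FEBD4075      # "PTE 80100000febd4075" from the oops
PF_ERROR_CODE = 0x0003        # "error_code(0x0003)" from the oops

ADDR_MASK = 0x000FFFFFFFFFF000  # bits 12..51: page frame address


def decode_pte(pte):
    """Return the frame address and a few flag bits of a 4K PTE."""
    return {
        "frame_addr": pte & ADDR_MASK,
        "present": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "cache_disable": bool(pte & (1 << 4)),
        "no_execute": bool(pte & (1 << 63)),
    }


def decode_pf_error(code):
    """Return the low three bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(code & (1 << 0)),  # 0 would mean not-present
        "write": bool(code & (1 << 1)),
        "user_mode": bool(code & (1 << 2)),
    }


if __name__ == "__main__":
    pte = decode_pte(PTE)
    print(hex(pte["frame_addr"]), pte)
    print(decode_pf_error(PF_ERROR_CODE))
```

With these inputs the frame address comes out as 0xfebd4000, i.e. the start of
Region 3 above, and the R/W bit is clear, which would fit the "supervisor write
access" / "permissions violation" lines in the oops.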
> The maximum verbosity
> hypervisor log may also have a hint as to why this is a read-only PTE.
I'll try, if that still makes sense.
> > [ 14.045227] Oops: 0003 [#1] SMP NOPTI
> > [ 14.045234] CPU: 0 PID: 234 Comm: kworker/0:2 Tainted: G W 5.14.0-rc7-1.fc32.qubes.x86_64 #15
> > [ 14.045245] Workqueue: events work_for_cpu_fn
> > [ 14.045259] RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
> > [ 14.045271] Code: 2f 96 ff 48 89 44 24 28 48 89 c7 48 85 c0 0f 84 f6 01 00 00 45 0f b7 f6 48 8d 40 0c ba 01 00 00 00 49 c1 e6 04 4a 8d 4c 37 1c <89> 10 48 83 c0 10 48 39 c1 75 f5 41 0f b6 44 24 6a 84 c0 0f 84 48
> > [ 14.045284] RSP: e02b:ffffc9004018bd50 EFLAGS: 00010212
> > [ 14.045290] RAX: ffffc9004069100c RBX: ffff88800ed412f8 RCX: ffffc9004069105c
> > [ 14.045296] RDX: 0000000000000001 RSI: 00000000000febd4 RDI: ffffc90040691000
> > [ 14.045302] RBP: 0000000000000003 R08: 0000000000000000 R09: 00000000febd404f
> > [ 14.045308] R10: 0000000000007ff0 R11: ffff88800ee8ae40 R12: ffff88800ed41000
> > [ 14.045313] R13: 0000000000000000 R14: 0000000000000040 R15: 00000000feba0000
> > [ 14.045393] FS: 0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
> > [ 14.045401] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 14.045407] CR2: ffff8000007f5ea0 CR3: 0000000012f6a000 CR4: 0000000000000660
> > [ 14.045420] Call Trace:
> > [ 14.045431] e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
> > [ 14.045479] e1000_probe+0x41f/0xdb0 [e1000e]
>
> Otoh, from this it's pretty clear it's not a device Xen may have found
> a need to access for its own purposes. If the aforementioned address covers
> (or is adjacent to) the MSI-X table of a device driven by this driver,
> then it would also be helpful to know how many MSI-X entries the device
> reports its table can have.
See above: the MSI-X capability reports Count=5, with the vector table at
offset 0 of BAR 3.
Does PCI passthrough on PV support MSI-X at all?
If so, I guess the issue is the kernel trying to write to the MSI-X table
directly, instead of via some hypercall, right?
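To make the "direct write" guess a bit more concrete, here is a rough sketch of
which addresses a loop that masks every MSI-X entry would touch. It assumes the
standard MSI-X table layout (16-byte entries, Vector Control dword at offset 12,
mask in bit 0) and reads RDI in the register dump as the mapped table base and
RCX as the loop bound; both readings are my interpretation of the dump, not
something confirmed from the source. The helper name is made up for this
example.

```python
# Reconstruct the addresses touched by a "mask every MSI-X entry" loop.
# Layout constants are from the PCI MSI-X table format; the base address and
# entry count are taken from the oops (RDI) and from lspci (Count=5).

MSIX_ENTRY_SIZE = 16          # bytes per table entry
MSIX_ENTRY_VECTOR_CTRL = 12   # offset of the Vector Control dword
MSIX_CTRL_MASKBIT = 0x1       # bit 0 of Vector Control masks the vector

TABLE_BASE = 0xFFFFC90040691000   # RDI in the oops; assumed to be the mapped table
TABLE_ENTRIES = 5                 # "MSI-X: Enable- Count=5" from lspci
FAULT_ADDR = 0xFFFFC9004069100C   # address from the "BUG:" line (also RAX)
LOOP_END = 0xFFFFC9004069105C     # RCX in the oops; assumed loop bound


def vector_ctrl_addrs(base, entries):
    """Addresses of each entry's Vector Control dword."""
    return [base + i * MSIX_ENTRY_SIZE + MSIX_ENTRY_VECTOR_CTRL
            for i in range(entries)]


if __name__ == "__main__":
    addrs = vector_ctrl_addrs(TABLE_BASE, TABLE_ENTRIES)
    for i, addr in enumerate(addrs):
        marker = "  <- faulting write" if addr == FAULT_ADDR else ""
        print(f"entry {i}: write {MSIX_CTRL_MASKBIT:#x} to {addr:#x}{marker}")
```

Under those assumptions the faulting address is the Vector Control word of
entry 0, so the very first write of the loop would be the one that faults.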
> > [ 14.045506] local_pci_probe+0x42/0x80
> > [ 14.045515] work_for_cpu_fn+0x16/0x20
> > [ 14.045522] process_one_work+0x1ec/0x390
> > [ 14.045529] worker_thread+0x53/0x3e0
> > [ 14.045534] ? process_one_work+0x390/0x390
> > [ 14.045540] kthread+0x127/0x150
> > [ 14.045548] ? set_kthread_struct+0x40/0x40
> > [ 14.045554] ret_from_fork+0x22/0x30
> > [ 14.045565] Modules linked in: e1000e(+) edac_mce_amd rfkill xen_pcifront pcspkr xt_REDIRECT ip6table_filter ip6table_mangle ip6table_raw ip6_tables ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack iptable_filter iptable_mangle iptable_raw xt_MASQUERADE iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xen_scsiback target_core_mod xen_netback xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn fuse drm bpf_preload ip_tables overlay xen_blkfront
> > [ 14.045620] CR2: ffffc9004069100c
> > [ 14.045627] ---[ end trace 307f5bb3bd9f30b4 ]---
> > [ 14.045632] RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
> > [ 14.045640] Code: 2f 96 ff 48 89 44 24 28 48 89 c7 48 85 c0 0f 84 f6 01 00 00 45 0f b7 f6 48 8d 40 0c ba 01 00 00 00 49 c1 e6 04 4a 8d 4c 37 1c <89> 10 48 83 c0 10 48 39 c1 75 f5 41 0f b6 44 24 6a 84 c0 0f 84 48
> > [ 14.045652] RSP: e02b:ffffc9004018bd50 EFLAGS: 00010212
> > [ 14.045657] RAX: ffffc9004069100c RBX: ffff88800ed412f8 RCX: ffffc9004069105c
> > [ 14.045663] RDX: 0000000000000001 RSI: 00000000000febd4 RDI: ffffc90040691000
> > [ 14.045668] RBP: 0000000000000003 R08: 0000000000000000 R09: 00000000febd404f
> > [ 14.045674] R10: 0000000000007ff0 R11: ffff88800ee8ae40 R12: ffff88800ed41000
> > [ 14.045679] R13: 0000000000000000 R14: 0000000000000040 R15: 00000000feba0000
> > [ 14.045698] FS: 0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
> > [ 14.045706] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 14.045711] CR2: ffff8000007f5ea0 CR3: 0000000012f6a000 CR4: 0000000000000660
> > [ 14.045718] Kernel panic - not syncing: Fatal exception
> > [ 14.045726] Kernel Offset: disabled
> >
> > I've bisected it down to this commit:
> >
> > commit 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f
> > Author: Thomas Gleixner <tglx@linutronix.de>
> > Date: Thu Jul 29 23:51:41 2021 +0200
> >
> > PCI/MSI: Mask all unused MSI-X entries
> >
> > I can reliably reproduce it on Xen 4.14 and Xen 4.8, so I don't think
> > Xen version matters here.
> >
> > Any idea how to fix it?
> >
>
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab