* ath11k and vfio-pci support @ 2024-01-08 13:17 James Prestwood 2024-01-10 9:00 ` Kalle Valo ` (2 more replies) 0 siblings, 3 replies; 51+ messages in thread
From: James Prestwood @ 2024-01-08 13:17 UTC (permalink / raw)
To: open list:MEDIATEK MT76 WIRELESS LAN DRIVER

Hi,

For a while now I've been using vfio-pci to pass through wireless hardware to a Qemu VM. This makes testing kernel changes quick and easy compared to swapping the host kernel for each iterative change. So far I've had very few issues doing this; maybe I've just been lucky... I tried doing this with an ath11k (WCN6855) card and ran into issues with the driver starting up. I'm wondering if it's a configuration issue or just a lack of support by ath11k? The card works just fine when I use it on my host machine. Based on the logs it may not even be related to ath11k, as wmi-bmof seems to fail first, but I'm not familiar with anything at the PCI level so I've got no idea what's going on.

My host machine is running 5.19, and I'm booting into a 6.2 kernel with Qemu.

Below is a partial kernel log of relevant messages:

[    0.318120] ath11k_pci 0000:00:06.0: BAR 0: assigned [mem 0x100000000-0x1001fffff 64bit]
[    0.319006] ath11k_pci 0000:00:06.0: MSI vectors: 1
[    0.319075] ath11k_pci 0000:00:06.0: wcn6855 hw2.1
[    0.475326] mhi mhi0: Requested to power ON
[    0.475333] mhi mhi0: Power on setup success
[    1.440375] Driver 'wmi-bmof' was unable to register with bus_type 'wmi' because the bus was not initialized.
[    1.459667] ath11k_pci 0000:00:06.0: chip_id 0x12 chip_family 0xb board_id 0xff soc_id 0x400c1211
[    1.459674] ath11k_pci 0000:00:06.0: fw_version 0x1109996e fw_build_timestamp 2023-12-19 11:11 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.36
[    1.528705] ath11k_pci 0000:00:06.0: leaving PCI ASPM disabled to avoid MHI M2 problems
[    2.537835] ath11k_pci 0000:00:06.0: failed to receive control response completion, polling..
[    3.562151] ath11k_pci 0000:00:06.0: Service connect timeout
[    3.563511] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110
[    3.565375] ath11k_pci 0000:00:06.0: failed to start core: -110
[    3.776067] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000
[   13.802149] ath11k_pci 0000:00:06.0: failed to wait wlan mode request (mode 4): -110
[   13.802163] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode off: -110

^ permalink raw reply [flat|nested] 51+ messages in thread
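[Editorial note: the passthrough described in the report above — the host hands the WCN6855 to vfio-pci so Qemu can claim it — can also be done at runtime through the sysfs driver_override interface instead of a boot parameter. A rough sketch, not from the thread: the PCI address 0000:03:00.0 is a placeholder, and the DRY_RUN wrapper is purely illustrative so the sequence can be inspected without root or real hardware.]

```shell
# Sketch: bind one PCI device to vfio-pci at runtime via driver_override.
# The address is a placeholder; run() only echoes the commands when
# DRY_RUN=1, so nothing is written to sysfs during inspection.
bind_vfio() {
    addr="$1"
    sys="/sys/bus/pci/devices/$addr"
    run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi; }
    run modprobe vfio-pci
    # Tell the PCI core to prefer vfio-pci for this one device...
    run sh -c "echo vfio-pci > $sys/driver_override"
    # ...detach the current driver (e.g. ath11k_pci), if one is bound...
    if [ -e "$sys/driver" ]; then
        run sh -c "echo $addr > $sys/driver/unbind"
    fi
    # ...and ask the core to re-probe, which now matches vfio-pci.
    run sh -c "echo $addr > /sys/bus/pci/drivers_probe"
}

DRY_RUN=1 bind_vfio 0000:03:00.0
```

Running without DRY_RUN=1 (as root, with the real address from `lspci -D`) performs the same steps for real; reverting is a matter of clearing driver_override and re-probing.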
* Re: ath11k and vfio-pci support 2024-01-08 13:17 ath11k and vfio-pci support James Prestwood @ 2024-01-10 9:00 ` Kalle Valo 2024-01-10 13:04 ` James Prestwood 2024-08-12 16:59 ` [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization Alex Williamson 2024-08-12 17:00 ` [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless Alex Williamson 2 siblings, 1 reply; 51+ messages in thread From: Kalle Valo @ 2024-01-10 9:00 UTC (permalink / raw) To: James Prestwood; +Cc: linux-wireless, ath11k + ath11k list James Prestwood <prestwoj@gmail.com> writes: > For a while now I've been using vfio-pci to pass through wireless > hardware to a Qemu VM. This makes testing kernel changes quick and > easy compared to swapping the host kernel for reach iterative change. > So far I've had very few issues doing this, maybe I've just been > lucky... I tried doing this with an ath11k (WCN6855) card and ran into > issues with the driver starting up. I'm wondering if its a > configuration issue or just a lack of support by ath11k? The card > works just fine when I use it on my host machine. Based on the logs it > may not even be related to ath11k as wmi-bmof seems to fail first, but > I'm not familiar with anything at the PCI level so I've got no idea > whats going on. There's a bug report about this: https://bugzilla.kernel.org/show_bug.cgi?id=216055 But I have also no idea what is causing this, I guess we are doing something wrong with the PCI communication? That reminds me, you could try this in case that helps: https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-10 9:00 ` Kalle Valo @ 2024-01-10 13:04 ` James Prestwood 2024-01-10 13:49 ` Kalle Valo 0 siblings, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-10 13:04 UTC (permalink / raw) To: Kalle Valo; +Cc: linux-wireless, ath11k Hi Kalle, On 1/10/24 1:00 AM, Kalle Valo wrote: > + ath11k list > > James Prestwood <prestwoj@gmail.com> writes: > >> For a while now I've been using vfio-pci to pass through wireless >> hardware to a Qemu VM. This makes testing kernel changes quick and >> easy compared to swapping the host kernel for reach iterative change. >> So far I've had very few issues doing this, maybe I've just been >> lucky... I tried doing this with an ath11k (WCN6855) card and ran into >> issues with the driver starting up. I'm wondering if its a >> configuration issue or just a lack of support by ath11k? The card >> works just fine when I use it on my host machine. Based on the logs it >> may not even be related to ath11k as wmi-bmof seems to fail first, but >> I'm not familiar with anything at the PCI level so I've got no idea >> whats going on. > There's a bug report about this: > > https://bugzilla.kernel.org/show_bug.cgi?id=216055 Yes I saw that, but apparently didn't read past the first comment... I'm not seeing any of the DMAR faults but the ath-releated ones look the same. Good to know its not just me. > > But I have also no idea what is causing this, I guess we are doing > something wrong with the PCI communication? That reminds me, you could > try this in case that helps: > > https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ Heh, I saw this pop up a day after I sent this and was wondering. Is this something I'd need on the host kernel, guest, or both? Thanks, James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-10 13:04 ` James Prestwood @ 2024-01-10 13:49 ` Kalle Valo 2024-01-10 14:55 ` James Prestwood 0 siblings, 1 reply; 51+ messages in thread From: Kalle Valo @ 2024-01-10 13:49 UTC (permalink / raw) To: James Prestwood; +Cc: linux-wireless, ath11k James Prestwood <prestwoj@gmail.com> writes: >> But I have also no idea what is causing this, I guess we are doing >> something wrong with the PCI communication? That reminds me, you could >> try this in case that helps: >> >> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ > > Heh, I saw this pop up a day after I sent this and was wondering. Is > this something I'd need on the host kernel, guest, or both? On the guest where ath11k is running. I'm not optimistic that this would solve your issue, I suspect there can be also other bugs, but good to know if the patch changes anything. -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-10 13:49 ` Kalle Valo @ 2024-01-10 14:55 ` James Prestwood 2024-01-11 3:51 ` Baochen Qiang 0 siblings, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-10 14:55 UTC (permalink / raw) To: Kalle Valo; +Cc: linux-wireless, ath11k Hi Kalle, On 1/10/24 5:49 AM, Kalle Valo wrote: > James Prestwood <prestwoj@gmail.com> writes: > >>> But I have also no idea what is causing this, I guess we are doing >>> something wrong with the PCI communication? That reminds me, you could >>> try this in case that helps: >>> >>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >> Heh, I saw this pop up a day after I sent this and was wondering. Is >> this something I'd need on the host kernel, guest, or both? > On the guest where ath11k is running. I'm not optimistic that this would > solve your issue, I suspect there can be also other bugs, but good to > know if the patch changes anything. Looks the same here, didn't seem to change anything based on the kernel logs. Thanks, James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-10 14:55 ` James Prestwood @ 2024-01-11 3:51 ` Baochen Qiang 2024-01-11 8:16 ` Kalle Valo 0 siblings, 1 reply; 51+ messages in thread From: Baochen Qiang @ 2024-01-11 3:51 UTC (permalink / raw) To: James Prestwood, Kalle Valo; +Cc: linux-wireless, ath11k On 1/10/2024 10:55 PM, James Prestwood wrote: > Hi Kalle, > > On 1/10/24 5:49 AM, Kalle Valo wrote: >> James Prestwood <prestwoj@gmail.com> writes: >> >>>> But I have also no idea what is causing this, I guess we are doing >>>> something wrong with the PCI communication? That reminds me, you could >>>> try this in case that helps: >>>> >>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>> Heh, I saw this pop up a day after I sent this and was wondering. Is >>> this something I'd need on the host kernel, guest, or both? >> On the guest where ath11k is running. I'm not optimistic that this would >> solve your issue, I suspect there can be also other bugs, but good to >> know if the patch changes anything. > > Looks the same here, didn't seem to change anything based on the kernel > logs. > > Thanks, > > James > Could you try this? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-11 3:51 ` Baochen Qiang @ 2024-01-11 8:16 ` Kalle Valo 2024-01-11 12:48 ` James Prestwood 0 siblings, 1 reply; 51+ messages in thread From: Kalle Valo @ 2024-01-11 8:16 UTC (permalink / raw) To: Baochen Qiang; +Cc: James Prestwood, linux-wireless, ath11k Baochen Qiang <quic_bqiang@quicinc.com> writes: > On 1/10/2024 10:55 PM, James Prestwood wrote: >> Hi Kalle, >> On 1/10/24 5:49 AM, Kalle Valo wrote: >>> James Prestwood <prestwoj@gmail.com> writes: >>> >>>>> But I have also no idea what is causing this, I guess we are doing >>>>> something wrong with the PCI communication? That reminds me, you could >>>>> try this in case that helps: >>>>> >>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>> >>>> Heh, I saw this pop up a day after I sent this and was wondering. Is >>>> this something I'd need on the host kernel, guest, or both? >>> >>> On the guest where ath11k is running. I'm not optimistic that this would >>> solve your issue, I suspect there can be also other bugs, but good to >>> know if the patch changes anything. >> >> Looks the same here, didn't seem to change anything based on the >> kernel logs. >> > Could you try this? > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c This reminds me, I assumed James was testing with ath.git master branch (which has that commit) but I never checked that. So for testing please always use the master branch to get the latest and greatest ath11k: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ There's a quite long delay from ath.git to official releases. -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-11 8:16 ` Kalle Valo @ 2024-01-11 12:48 ` James Prestwood 2024-01-11 13:11 ` Kalle Valo 0 siblings, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-11 12:48 UTC (permalink / raw) To: Kalle Valo, Baochen Qiang; +Cc: linux-wireless, ath11k Hi Kalle, Baochen, On 1/11/24 12:16 AM, Kalle Valo wrote: > Baochen Qiang <quic_bqiang@quicinc.com> writes: > >> On 1/10/2024 10:55 PM, James Prestwood wrote: >>> Hi Kalle, >>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>> James Prestwood <prestwoj@gmail.com> writes: >>>> >>>>>> But I have also no idea what is causing this, I guess we are doing >>>>>> something wrong with the PCI communication? That reminds me, you could >>>>>> try this in case that helps: >>>>>> >>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>> Heh, I saw this pop up a day after I sent this and was wondering. Is >>>>> this something I'd need on the host kernel, guest, or both? >>>> On the guest where ath11k is running. I'm not optimistic that this would >>>> solve your issue, I suspect there can be also other bugs, but good to >>>> know if the patch changes anything. >>> Looks the same here, didn't seem to change anything based on the >>> kernel logs. >>> >> Could you try this? >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c > This reminds me, I assumed James was testing with ath.git master branch > (which has that commit) but I never checked that. So for testing please > always use the master branch to get the latest and greatest ath11k: > > https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ > > There's a quite long delay from ath.git to official releases. Good to know, and I was not in fact using that branch. Rebuilt from ath.git/master but still roughly the same behavior. 
There does appear to be more output now though, specifically a firmware crash:

[    2.281721] ath11k_pci 0000:00:06.0: failed to receive control response completion, polling..
[    2.282101] ip (65) used greatest stack depth: 12464 bytes left
[    3.306039] ath11k_pci 0000:00:06.0: Service connect timeout
[    3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110
[    3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110
[    3.519637] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM
[    3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000
[    3.627087] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM
[    3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000
[   13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode request (mode 4): -110
[   13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode off: -110

Thanks,

James

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-11 12:48 ` James Prestwood @ 2024-01-11 13:11 ` Kalle Valo 2024-01-11 13:38 ` James Prestwood 0 siblings, 1 reply; 51+ messages in thread From: Kalle Valo @ 2024-01-11 13:11 UTC (permalink / raw) To: James Prestwood; +Cc: Baochen Qiang, linux-wireless, ath11k James Prestwood <prestwoj@gmail.com> writes: > Hi Kalle, Baochen, > > On 1/11/24 12:16 AM, Kalle Valo wrote: >> Baochen Qiang <quic_bqiang@quicinc.com> writes: >> >>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>> Hi Kalle, >>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>> >>>>>>> But I have also no idea what is causing this, I guess we are doing >>>>>>> something wrong with the PCI communication? That reminds me, you could >>>>>>> try this in case that helps: >>>>>>> >>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>> Heh, I saw this pop up a day after I sent this and was wondering. Is >>>>>> this something I'd need on the host kernel, guest, or both? >>>>> On the guest where ath11k is running. I'm not optimistic that this would >>>>> solve your issue, I suspect there can be also other bugs, but good to >>>>> know if the patch changes anything. >>>> Looks the same here, didn't seem to change anything based on the >>>> kernel logs. >>>> >>> Could you try this? >>> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >> This reminds me, I assumed James was testing with ath.git master branch >> (which has that commit) but I never checked that. So for testing please >> always use the master branch to get the latest and greatest ath11k: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >> >> There's a quite long delay from ath.git to official releases. > > Good to know, and I was not in fact using that branch. 
Rebuilt from > ath.git/master but still roughly the same behavior. There does appear > to be more output now though, specifically a firmware crash: > > [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control > response completion, polling.. > [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left > [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout > [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 > [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 > [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM > [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 > [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM > [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 > [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode > request (mode 4): -110 > [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode > off: -110 Ok, that's progress now. Can you try next try the iommu patch[1] we talked about earlier? It's already in master-pending branch (along with other pending patches) so you can use that branch if you want. [1] https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-11 13:11 ` Kalle Valo @ 2024-01-11 13:38 ` James Prestwood 2024-01-12 2:04 ` Baochen Qiang 0 siblings, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-11 13:38 UTC (permalink / raw) To: Kalle Valo; +Cc: Baochen Qiang, linux-wireless, ath11k On 1/11/24 5:11 AM, Kalle Valo wrote: > James Prestwood <prestwoj@gmail.com> writes: > >> Hi Kalle, Baochen, >> >> On 1/11/24 12:16 AM, Kalle Valo wrote: >>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>> >>>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>>> Hi Kalle, >>>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>> >>>>>>>> But I have also no idea what is causing this, I guess we are doing >>>>>>>> something wrong with the PCI communication? That reminds me, you could >>>>>>>> try this in case that helps: >>>>>>>> >>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>>> Heh, I saw this pop up a day after I sent this and was wondering. Is >>>>>>> this something I'd need on the host kernel, guest, or both? >>>>>> On the guest where ath11k is running. I'm not optimistic that this would >>>>>> solve your issue, I suspect there can be also other bugs, but good to >>>>>> know if the patch changes anything. >>>>> Looks the same here, didn't seem to change anything based on the >>>>> kernel logs. >>>>> >>>> Could you try this? >>>> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >>> This reminds me, I assumed James was testing with ath.git master branch >>> (which has that commit) but I never checked that. So for testing please >>> always use the master branch to get the latest and greatest ath11k: >>> >>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >>> >>> There's a quite long delay from ath.git to official releases. 
>> Good to know, and I was not in fact using that branch. Rebuilt from >> ath.git/master but still roughly the same behavior. There does appear >> to be more output now though, specifically a firmware crash: >> >> [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control >> response completion, polling.. >> [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left >> [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout >> [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 >> [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 >> [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM >> [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >> [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM >> [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >> [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode >> request (mode 4): -110 >> [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode >> off: -110 > Ok, that's progress now. Can you try next try the iommu patch[1] we > talked about earlier? It's already in master-pending branch (along with > other pending patches) so you can use that branch if you want. > > [1] https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ Same result unfortunately, tried both with just [1] applied to ath.git and at HEAD of master-pending. Thanks, James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-11 13:38 ` James Prestwood @ 2024-01-12 2:04 ` Baochen Qiang 2024-01-12 12:47 ` James Prestwood 0 siblings, 1 reply; 51+ messages in thread From: Baochen Qiang @ 2024-01-12 2:04 UTC (permalink / raw) To: James Prestwood, Kalle Valo; +Cc: linux-wireless, ath11k On 1/11/2024 9:38 PM, James Prestwood wrote: > > On 1/11/24 5:11 AM, Kalle Valo wrote: >> James Prestwood <prestwoj@gmail.com> writes: >> >>> Hi Kalle, Baochen, >>> >>> On 1/11/24 12:16 AM, Kalle Valo wrote: >>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>> >>>>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>>>> Hi Kalle, >>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>>> >>>>>>>>> But I have also no idea what is causing this, I guess we are doing >>>>>>>>> something wrong with the PCI communication? That reminds me, >>>>>>>>> you could >>>>>>>>> try this in case that helps: >>>>>>>>> >>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>>>> Heh, I saw this pop up a day after I sent this and was >>>>>>>> wondering. Is >>>>>>>> this something I'd need on the host kernel, guest, or both? >>>>>>> On the guest where ath11k is running. I'm not optimistic that >>>>>>> this would >>>>>>> solve your issue, I suspect there can be also other bugs, but >>>>>>> good to >>>>>>> know if the patch changes anything. >>>>>> Looks the same here, didn't seem to change anything based on the >>>>>> kernel logs. >>>>>> >>>>> Could you try this? >>>>> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >>>> This reminds me, I assumed James was testing with ath.git master branch >>>> (which has that commit) but I never checked that. 
So for testing please >>>> always use the master branch to get the latest and greatest ath11k: >>>> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >>>> >>>> There's a quite long delay from ath.git to official releases. >>> Good to know, and I was not in fact using that branch. Rebuilt from >>> ath.git/master but still roughly the same behavior. There does appear >>> to be more output now though, specifically a firmware crash: >>> >>> [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control >>> response completion, polling.. >>> [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left >>> [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout >>> [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 >>> [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 >>> [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM >>> [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >>> [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM >>> [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >>> [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode >>> request (mode 4): -110 >>> [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode >>> off: -110 >> Ok, that's progress now. Can you try next try the iommu patch[1] we >> talked about earlier? It's already in master-pending branch (along with >> other pending patches) so you can use that branch if you want. >> >> [1] >> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ > > Same result unfortunately, tried both with just [1] applied to ath.git > and at HEAD of master-pending. > > Thanks, > > James Strange that still fails. Are you now seeing this error in your host or your Qemu? or both? Could you share your test steps? 
And if you can share please be as detailed as possible since I'm not familiar with passing WLAN hardware to a VM using vfio-pci.

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-12 2:04 ` Baochen Qiang @ 2024-01-12 12:47 ` James Prestwood 2024-01-14 12:37 ` Baochen Qiang 0 siblings, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-12 12:47 UTC (permalink / raw) To: Baochen Qiang, Kalle Valo; +Cc: linux-wireless, ath11k Hi, On 1/11/24 6:04 PM, Baochen Qiang wrote: > > > On 1/11/2024 9:38 PM, James Prestwood wrote: >> >> On 1/11/24 5:11 AM, Kalle Valo wrote: >>> James Prestwood <prestwoj@gmail.com> writes: >>> >>>> Hi Kalle, Baochen, >>>> >>>> On 1/11/24 12:16 AM, Kalle Valo wrote: >>>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>>> >>>>>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>>>>> Hi Kalle, >>>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>>>> >>>>>>>>>> But I have also no idea what is causing this, I guess we are >>>>>>>>>> doing >>>>>>>>>> something wrong with the PCI communication? That reminds me, >>>>>>>>>> you could >>>>>>>>>> try this in case that helps: >>>>>>>>>> >>>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>>>>>> >>>>>>>>> Heh, I saw this pop up a day after I sent this and was >>>>>>>>> wondering. Is >>>>>>>>> this something I'd need on the host kernel, guest, or both? >>>>>>>> On the guest where ath11k is running. I'm not optimistic that >>>>>>>> this would >>>>>>>> solve your issue, I suspect there can be also other bugs, but >>>>>>>> good to >>>>>>>> know if the patch changes anything. >>>>>>> Looks the same here, didn't seem to change anything based on the >>>>>>> kernel logs. >>>>>>> >>>>>> Could you try this? >>>>>> >>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >>>>>> >>>>> This reminds me, I assumed James was testing with ath.git master >>>>> branch >>>>> (which has that commit) but I never checked that. 
So for testing >>>>> please >>>>> always use the master branch to get the latest and greatest ath11k: >>>>> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >>>>> >>>>> There's a quite long delay from ath.git to official releases. >>>> Good to know, and I was not in fact using that branch. Rebuilt from >>>> ath.git/master but still roughly the same behavior. There does appear >>>> to be more output now though, specifically a firmware crash: >>>> >>>> [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control >>>> response completion, polling.. >>>> [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left >>>> [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout >>>> [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 >>>> [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 >>>> [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: >>>> MHI_CB_EE_RDDM >>>> [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >>>> [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: >>>> MHI_CB_EE_RDDM >>>> [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >>>> [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode >>>> request (mode 4): -110 >>>> [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode >>>> off: -110 >>> Ok, that's progress now. Can you try next try the iommu patch[1] we >>> talked about earlier? It's already in master-pending branch (along with >>> other pending patches) so you can use that branch if you want. >>> >>> [1] >>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >> >> Same result unfortunately, tried both with just [1] applied to >> ath.git and at HEAD of master-pending. >> >> Thanks, >> >> James > Strange that still fails. Are you now seeing this error in your host > or your Qemu? or both? > Could you share your test steps? 
And if you can share please be as > detailed as possible since I'm not familiar with passing WLAN hardware > to a VM using vfio-pci.

Just in Qemu, the hardware works fine on my host machine.

I basically follow this guide to set it up; it's written in the context of GPUs/libvirt, but the host setup is exactly the same. By no means do you need to read it all, once you set the vfio-pci.ids and see your unclaimed adapter you can stop:

https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF

In short you should be able to set the following host kernel options and reboot (assuming your motherboard/hardware is compatible):

intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103

Obviously change the device/vendor IDs to whatever ath11k hw you have. Once the host is rebooted you should see your wlan adapter as UNCLAIMED, showing the driver in use as vfio-pci. If not, it's likely your motherboard just isn't compatible: the device has to be in its own IOMMU group (you could try switching PCI ports if this is the case).

I then build a "kvm_guest.config" kernel with the driver/firmware for ath11k and boot into that with the following Qemu options:

-enable-kvm -device vfio-pci,host=<PCI address>

If it seems easier you could also utilize IWD's test-runner, which handles launching the Qemu kernel automatically, detects any vfio devices and passes them through, and mounts some useful host folders into the VM. It's actually a very good general purpose tool for kernel testing, not just for IWD:

https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt

Once set up you can just run test-runner with a few flags and you'll boot into a shell:

./tools/test-runner -k <kernel-image> --hw --start /bin/bash

Please reach out if you have questions, thanks for looking into this.

^ permalink raw reply [flat|nested] 51+ messages in thread
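[Editorial note: the Qemu invocation James describes can be collected into a small launch script. A sketch only — the PCI address, kernel image path, and memory/console options below are placeholders, not values taken from the thread; the card must already be bound to vfio-pci on the host. It prints the command instead of running it, so the pieces can be checked first.]

```shell
#!/bin/sh
# Compose (but do not run) a guest launch command for PCI passthrough.
# ADDR and KERNEL are placeholder assumptions.
ADDR="0000:03:00.0"    # PCI address of the passed-through WLAN card
KERNEL="bzImage"       # kernel built from kvm_guest.config with ath11k + firmware

CMD="qemu-system-x86_64 -enable-kvm -m 2G -cpu host \
-kernel $KERNEL -append console=ttyS0 -nographic \
-device vfio-pci,host=$ADDR"

echo "launch command: $CMD"
# $CMD                 # uncomment to actually boot the VM
```

The `-device vfio-pci,host=$ADDR` piece is the part that hands the card to the guest; the rest is an ordinary direct-kernel-boot Qemu invocation.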
* Re: ath11k and vfio-pci support 2024-01-12 12:47 ` James Prestwood @ 2024-01-14 12:37 ` Baochen Qiang 2024-01-14 14:36 ` Kalle Valo 2024-01-16 13:05 ` James Prestwood 0 siblings, 2 replies; 51+ messages in thread From: Baochen Qiang @ 2024-01-14 12:37 UTC (permalink / raw) To: James Prestwood, Kalle Valo; +Cc: linux-wireless, ath11k On 1/12/2024 8:47 PM, James Prestwood wrote: > Hi, > > On 1/11/24 6:04 PM, Baochen Qiang wrote: >> >> >> On 1/11/2024 9:38 PM, James Prestwood wrote: >>> >>> On 1/11/24 5:11 AM, Kalle Valo wrote: >>>> James Prestwood <prestwoj@gmail.com> writes: >>>> >>>>> Hi Kalle, Baochen, >>>>> >>>>> On 1/11/24 12:16 AM, Kalle Valo wrote: >>>>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>>>> >>>>>>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>>>>>> Hi Kalle, >>>>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>>>>> >>>>>>>>>>> But I have also no idea what is causing this, I guess we are >>>>>>>>>>> doing >>>>>>>>>>> something wrong with the PCI communication? That reminds me, >>>>>>>>>>> you could >>>>>>>>>>> try this in case that helps: >>>>>>>>>>> >>>>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>>>>>> Heh, I saw this pop up a day after I sent this and was >>>>>>>>>> wondering. Is >>>>>>>>>> this something I'd need on the host kernel, guest, or both? >>>>>>>>> On the guest where ath11k is running. I'm not optimistic that >>>>>>>>> this would >>>>>>>>> solve your issue, I suspect there can be also other bugs, but >>>>>>>>> good to >>>>>>>>> know if the patch changes anything. >>>>>>>> Looks the same here, didn't seem to change anything based on the >>>>>>>> kernel logs. >>>>>>>> >>>>>>> Could you try this? 
>>>>>>> >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >>>>>> This reminds me, I assumed James was testing with ath.git master >>>>>> branch >>>>>> (which has that commit) but I never checked that. So for testing >>>>>> please >>>>>> always use the master branch to get the latest and greatest ath11k: >>>>>> >>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >>>>>> >>>>>> There's a quite long delay from ath.git to official releases. >>>>> Good to know, and I was not in fact using that branch. Rebuilt from >>>>> ath.git/master but still roughly the same behavior. There does appear >>>>> to be more output now though, specifically a firmware crash: >>>>> >>>>> [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control >>>>> response completion, polling.. >>>>> [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left >>>>> [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout >>>>> [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 >>>>> [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 >>>>> [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: >>>>> MHI_CB_EE_RDDM >>>>> [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >>>>> [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: >>>>> MHI_CB_EE_RDDM >>>>> [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 >>>>> [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode >>>>> request (mode 4): -110 >>>>> [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode >>>>> off: -110 >>>> Ok, that's progress now. Can you try next try the iommu patch[1] we >>>> talked about earlier? It's already in master-pending branch (along with >>>> other pending patches) so you can use that branch if you want. 
>>>> >>>> [1] >>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>> >>> Same result unfortunately, tried both with just [1] applied to >>> ath.git and at HEAD of master-pending. >>> >>> Thanks, >>> >>> James >> Strange that still fails. Are you now seeing this error in your host >> or your Qemu? or both? >> Could you share your test steps? And if you can share please be as >> detailed as possible since I'm not familiar with passing WLAN hardware >> to a VM using vfio-pci. > > Just in Qemu, the hardware works fine on my host machine. > > I basically follow this guide to set it up, its written in the context > of GPUs/libvirt but the host setup is exactly the same. By no means do > you need to read it all, once you set the vfio-pci.ids and see your > unclaimed adapter you can stop: > > https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF > > In short you should be able to set the following host kernel options and > reboot (assuming your motherboard/hardware is compatible): > > intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 > > Obviously change the device/vendor IDs to whatever ath11k hw you have. > Once the host is rebooted you should see your wlan adapter as UNCLAIMED, > showing the driver in use as vfio-pci. If not, its likely your > motherboard just isn't compatible, the device has to be in its own IOMMU > group (you could try switching PCI ports if this is the case). > > I then build a "kvm_guest.config" kernel with the driver/firmware for > ath11k and boot into that with the following Qemu options: > > -enable-kvm -device -vfio-pci,host=<PCI address> > > If it seems easier you could also utilize IWD's test-runner which > handles launching the Qemu kernel automatically, detecting any > vfio-devices and passes them through and mounts some useful host folders > into the VM. 
> Its actually a very good general purpose tool for kernel testing, not
> just for IWD:
>
> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt
>
> Once set up you can just run test-runner with a few flags and you'll
> boot into a shell:
>
> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash
>
> Please reach out if you have questions, thanks for looking into this.

Thanks for these details. I reproduced this issue by following your guide.

Seems the root cause is that the MSI vector assigned to WCN6855 in qemu is different from that in the host. In my case the MSI vector in qemu is [Address: fee00000 Data: 0020] while in the host it is [Address: fee00578 Data: 0000]. So in qemu ath11k configures MSI vector [Address: fee00000 Data: 0020] into the WCN6855 hardware/firmware, and the firmware uses that vector to fire interrupts at the host/qemu. However the host IOMMU doesn't know that vector, because the real vector is [Address: fee00578 Data: 0000]; as a result the host blocks that interrupt and reports an error, see the log below:

[ 1414.206069] DMAR: DRHD: handling fault status reg 2
[ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
[ 1414.210334] DMAR: DRHD: handling fault status reg 2
[ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
[ 1414.212496] DMAR: DRHD: handling fault status reg 2
[ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
[ 1414.214600] DMAR: DRHD: handling fault status reg 2

While I don't think there is a way for qemu/ath11k to get the real MSI vector from the host, I will try to read the vfio code to check further. Before that, to unblock you, a possible hack is to hard-code the MSI vector in qemu to the same value as in the host, on condition that the MSI vector doesn't change.
In my case, the change looks like:

diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
index 09e65c5e55c4..89a9bbe9e4d2 100644
--- a/drivers/net/wireless/ath/ath11k/pci.c
+++ b/drivers/net/wireless/ath/ath11k/pci.c
@@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci *ab_pci)
 		ab->pci.msi.addr_hi = 0;
 	}
 
-	ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", ab->pci.msi.ep_base_data);
+	ab->pci.msi.addr_hi = 0;
+	ab->pci.msi.addr_lo = 0xfee00578;
+	ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base data is %d\n",
+		   ab->pci.msi.addr_hi,
+		   ab->pci.msi.addr_lo,
+		   ab->pci.msi.ep_base_data);
 
 	return 0;
 
@@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct ath11k_pci *ab_pci)
 	}
 
 	ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data;
+	ab_pci->ab->pci.msi.ep_base_data = 0;
 
 	ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq msi_ep_base_data %d\n",
 		   ab_pci->ab->pci.msi.ep_base_data);

This hack works on my setup.

^ permalink raw reply related	[flat|nested] 51+ messages in thread
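Rather than guessing, the Address/Data pair hard-coded in a hack like the one above can be read from the host's `lspci -vv` output before starting Qemu. A hedged sketch; `msi_vector` is a hypothetical helper, and the capability text below is a captured sample standing in for live output:

```shell
# Filter the MSI Address/Data pair out of `lspci -vv` output so the
# values hard-coded in the driver match what the host really programmed.
msi_vector() { awk '/Address:/ { print $2, $4 }'; }
# On the host (02:00.0 standing in for the WCN6855's BDF):
#   sudo lspci -vvs 02:00.0 | msi_vector
# Against a captured MSI capability block:
printf 'Capabilities: [50] MSI: Enable+ Count=1/32\n\tAddress: fee00578  Data: 0000\n' | msi_vector
# -> fee00578 0000
```

Note the caveat in the thread still applies: the pair is only valid until the host reprograms the MSI, so this is a debugging aid, not a fix.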
* Re: ath11k and vfio-pci support 2024-01-14 12:37 ` Baochen Qiang @ 2024-01-14 14:36 ` Kalle Valo 2024-01-15 17:46 ` Alex Williamson 2024-01-16 13:05 ` James Prestwood 1 sibling, 1 reply; 51+ messages in thread From: Kalle Valo @ 2024-01-14 14:36 UTC (permalink / raw) To: Baochen Qiang Cc: James Prestwood, linux-wireless, ath11k, David Woodhouse, iommu Baochen Qiang <quic_bqiang@quicinc.com> writes: >>> Strange that still fails. Are you now seeing this error in your >>> host or your Qemu? or both? >>> Could you share your test steps? And if you can share please be as >>> detailed as possible since I'm not familiar with passing WLAN >>> hardware to a VM using vfio-pci. >> >> Just in Qemu, the hardware works fine on my host machine. >> I basically follow this guide to set it up, its written in the >> context of GPUs/libvirt but the host setup is exactly the same. By >> no means do you need to read it all, once you set the vfio-pci.ids >> and see your unclaimed adapter you can stop: >> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF >> In short you should be able to set the following host kernel options >> and reboot (assuming your motherboard/hardware is compatible): >> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 >> Obviously change the device/vendor IDs to whatever ath11k hw you >> have. Once the host is rebooted you should see your wlan adapter as >> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely >> your motherboard just isn't compatible, the device has to be in its >> own IOMMU group (you could try switching PCI ports if this is the >> case). 
>> I then build a "kvm_guest.config" kernel with the driver/firmware >> for ath11k and boot into that with the following Qemu options: >> -enable-kvm -device -vfio-pci,host=<PCI address> >> If it seems easier you could also utilize IWD's test-runner which >> handles launching the Qemu kernel automatically, detecting any >> vfio-devices and passes them through and mounts some useful host >> folders into the VM. Its actually a very good general purpose tool >> for kernel testing, not just for IWD: >> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt >> Once set up you can just run test-runner with a few flags and you'll >> boot into a shell: >> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash >> Please reach out if you have questions, thanks for looking into >> this. > > Thanks for these details. I reproduced this issue by following your guide. > > Seems the root cause is that the MSI vector assigned to WCN6855 in > qemu is different with that in host. In my case the MSI vector in qemu > is [Address: fee00000 Data: 0020] while in host it is [Address: > fee00578 Data: 0000]. So in qemu ath11k configures MSI vector > [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and > firmware uses that vector to fire interrupts to host/qemu. 
However > host IOMMU doesn't know that vector because the real vector is > [Address: fee00578 Data: 0000], as a result host blocks that > interrupt and reports an error, see below log: > > [ 1414.206069] DMAR: DRHD: handling fault status reg 2 > [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > request > [ 1414.210334] DMAR: DRHD: handling fault status reg 2 > [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > request > [ 1414.212496] DMAR: DRHD: handling fault status reg 2 > [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > request > [ 1414.214600] DMAR: DRHD: handling fault status reg 2 > > While I don't think there is a way for qemu/ath11k to get the real MSI > vector from host, I will try to read the vfio code to check further. > Before that, to unblock you, a possible hack is to hard code the MSI > vector in qemu to the same as in host, on condition that the MSI > vector doesn't change. Baochen, awesome that you were able to debug this further. Now we at least know what's the problem. I'll add David Woodhouse and the iommu list in hopes that they could give us ideas how to solve this. Full thread here: https://lore.kernel.org/all/adcb785e-4dc7-4c4a-b341-d53b72e13467@gmail.com/ -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-14 14:36 ` Kalle Valo @ 2024-01-15 17:46 ` Alex Williamson 2024-01-16 10:08 ` Baochen Qiang 0 siblings, 1 reply; 51+ messages in thread From: Alex Williamson @ 2024-01-15 17:46 UTC (permalink / raw) To: Kalle Valo Cc: Baochen Qiang, James Prestwood, linux-wireless, ath11k, David Woodhouse, iommu On Sun, 14 Jan 2024 16:36:02 +0200 Kalle Valo <kvalo@kernel.org> wrote: > Baochen Qiang <quic_bqiang@quicinc.com> writes: > > >>> Strange that still fails. Are you now seeing this error in your > >>> host or your Qemu? or both? > >>> Could you share your test steps? And if you can share please be as > >>> detailed as possible since I'm not familiar with passing WLAN > >>> hardware to a VM using vfio-pci. > >> > >> Just in Qemu, the hardware works fine on my host machine. > >> I basically follow this guide to set it up, its written in the > >> context of GPUs/libvirt but the host setup is exactly the same. By > >> no means do you need to read it all, once you set the vfio-pci.ids > >> and see your unclaimed adapter you can stop: > >> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF > >> In short you should be able to set the following host kernel options > >> and reboot (assuming your motherboard/hardware is compatible): > >> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 > >> Obviously change the device/vendor IDs to whatever ath11k hw you > >> have. Once the host is rebooted you should see your wlan adapter as > >> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely > >> your motherboard just isn't compatible, the device has to be in its > >> own IOMMU group (you could try switching PCI ports if this is the > >> case). 
> >> I then build a "kvm_guest.config" kernel with the driver/firmware > >> for ath11k and boot into that with the following Qemu options: > >> -enable-kvm -device -vfio-pci,host=<PCI address> > >> If it seems easier you could also utilize IWD's test-runner which > >> handles launching the Qemu kernel automatically, detecting any > >> vfio-devices and passes them through and mounts some useful host > >> folders into the VM. Its actually a very good general purpose tool > >> for kernel testing, not just for IWD: > >> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt > >> Once set up you can just run test-runner with a few flags and you'll > >> boot into a shell: > >> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash > >> Please reach out if you have questions, thanks for looking into > >> this. > > > > Thanks for these details. I reproduced this issue by following your guide. > > > > Seems the root cause is that the MSI vector assigned to WCN6855 in > > qemu is different with that in host. In my case the MSI vector in qemu > > is [Address: fee00000 Data: 0020] while in host it is [Address: > > fee00578 Data: 0000]. So in qemu ath11k configures MSI vector > > [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and > > firmware uses that vector to fire interrupts to host/qemu. 
However > > host IOMMU doesn't know that vector because the real vector is > > [Address: fee00578 Data: 0000], as a result host blocks that > > interrupt and reports an error, see below log: > > > > [ 1414.206069] DMAR: DRHD: handling fault status reg 2 > > [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > request > > [ 1414.210334] DMAR: DRHD: handling fault status reg 2 > > [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > request > > [ 1414.212496] DMAR: DRHD: handling fault status reg 2 > > [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > request > > [ 1414.214600] DMAR: DRHD: handling fault status reg 2 > > > > While I don't think there is a way for qemu/ath11k to get the real MSI > > vector from host, I will try to read the vfio code to check further. > > Before that, to unblock you, a possible hack is to hard code the MSI > > vector in qemu to the same as in host, on condition that the MSI > > vector doesn't change. > > Baochen, awesome that you were able to debug this further. Now we at > least know what's the problem. It's an interesting problem, I don't think we've seen another device where the driver reads the MSI register in order to program another hardware entity to match the MSI address and data configuration. When assigning a device, the host and guest use entirely separate address spaces for MSI interrupts. When the guest enables MSI, the operation is trapped by the VMM and triggers an ioctl to the host to perform an equivalent configuration. 
Generally the physical device will interrupt within the host where it may be directly attached to KVM to signal the interrupt, trigger through the VMM, or where virtualization hardware supports it, the interrupt can directly trigger the vCPU. From the VM perspective, the guest address/data pair is used to signal the interrupt, which is why it makes sense to virtualize the MSI registers. Off hand I don't have a good solution for this, the hardware is essentially imposing a unique requirement for MSI programming that the driver needs visibility of the physical MSI address and data. It's conceivable that device specific code could either make the physical address/data pair visible to the VM or trap the firmware programming to inject the correct physical values. Is there somewhere other than the standard MSI capability in config space that the driver could learn the physical values, ie. somewhere that isn't virtualized? Thanks, Alex ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-15 17:46 ` Alex Williamson @ 2024-01-16 10:08 ` Baochen Qiang 2024-01-16 10:41 ` David Woodhouse 0 siblings, 1 reply; 51+ messages in thread From: Baochen Qiang @ 2024-01-16 10:08 UTC (permalink / raw) To: Alex Williamson, Kalle Valo Cc: James Prestwood, linux-wireless, ath11k, David Woodhouse, iommu On 1/16/2024 1:46 AM, Alex Williamson wrote: > On Sun, 14 Jan 2024 16:36:02 +0200 > Kalle Valo <kvalo@kernel.org> wrote: > >> Baochen Qiang <quic_bqiang@quicinc.com> writes: >> >>>>> Strange that still fails. Are you now seeing this error in your >>>>> host or your Qemu? or both? >>>>> Could you share your test steps? And if you can share please be as >>>>> detailed as possible since I'm not familiar with passing WLAN >>>>> hardware to a VM using vfio-pci. >>>> >>>> Just in Qemu, the hardware works fine on my host machine. >>>> I basically follow this guide to set it up, its written in the >>>> context of GPUs/libvirt but the host setup is exactly the same. By >>>> no means do you need to read it all, once you set the vfio-pci.ids >>>> and see your unclaimed adapter you can stop: >>>> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF >>>> In short you should be able to set the following host kernel options >>>> and reboot (assuming your motherboard/hardware is compatible): >>>> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 >>>> Obviously change the device/vendor IDs to whatever ath11k hw you >>>> have. Once the host is rebooted you should see your wlan adapter as >>>> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely >>>> your motherboard just isn't compatible, the device has to be in its >>>> own IOMMU group (you could try switching PCI ports if this is the >>>> case). 
>>>> I then build a "kvm_guest.config" kernel with the driver/firmware >>>> for ath11k and boot into that with the following Qemu options: >>>> -enable-kvm -device -vfio-pci,host=<PCI address> >>>> If it seems easier you could also utilize IWD's test-runner which >>>> handles launching the Qemu kernel automatically, detecting any >>>> vfio-devices and passes them through and mounts some useful host >>>> folders into the VM. Its actually a very good general purpose tool >>>> for kernel testing, not just for IWD: >>>> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt >>>> Once set up you can just run test-runner with a few flags and you'll >>>> boot into a shell: >>>> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash >>>> Please reach out if you have questions, thanks for looking into >>>> this. >>> >>> Thanks for these details. I reproduced this issue by following your guide. >>> >>> Seems the root cause is that the MSI vector assigned to WCN6855 in >>> qemu is different with that in host. In my case the MSI vector in qemu >>> is [Address: fee00000 Data: 0020] while in host it is [Address: >>> fee00578 Data: 0000]. So in qemu ath11k configures MSI vector >>> [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and >>> firmware uses that vector to fire interrupts to host/qemu. 
However >>> host IOMMU doesn't know that vector because the real vector is >>> [Address: fee00578 Data: 0000], as a result host blocks that >>> interrupt and reports an error, see below log: >>> >>> [ 1414.206069] DMAR: DRHD: handling fault status reg 2 >>> [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>> request >>> [ 1414.210334] DMAR: DRHD: handling fault status reg 2 >>> [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>> request >>> [ 1414.212496] DMAR: DRHD: handling fault status reg 2 >>> [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>> request >>> [ 1414.214600] DMAR: DRHD: handling fault status reg 2 >>> >>> While I don't think there is a way for qemu/ath11k to get the real MSI >>> vector from host, I will try to read the vfio code to check further. >>> Before that, to unblock you, a possible hack is to hard code the MSI >>> vector in qemu to the same as in host, on condition that the MSI >>> vector doesn't change. >> >> Baochen, awesome that you were able to debug this further. Now we at >> least know what's the problem. > > It's an interesting problem, I don't think we've seen another device > where the driver reads the MSI register in order to program another > hardware entity to match the MSI address and data configuration. > > When assigning a device, the host and guest use entirely separate > address spaces for MSI interrupts. When the guest enables MSI, the > operation is trapped by the VMM and triggers an ioctl to the host to > perform an equivalent configuration. 
> Generally the physical device will interrupt within the host where it
> may be directly attached to KVM to signal the interrupt, trigger
> through the VMM, or where virtualization hardware supports it, the
> interrupt can directly trigger the vCPU. From the VM perspective, the
> guest address/data pair is used to signal the interrupt, which is why
> it makes sense to virtualize the MSI registers.

Hi Alex, could you elaborate on that? Why is MSI virtualization necessary from the VM perspective?

And, maybe a stupid question: is it possible for VM/KVM or vfio to virtualize only write operations to the MSI register but leave read operations un-virtualized? I am asking because that way ath11k may get a chance to run in a VM after reading the real vector.

> Off hand I don't have a good solution for this, the hardware is
> essentially imposing a unique requirement for MSI programming that the
> driver needs visibility of the physical MSI address and data. It's
> conceivable that device specific code could either make the physical
> address/data pair visible to the VM or trap the firmware programming to
> inject the correct physical values. Is there somewhere other than the
> standard MSI capability in config space that the driver could learn the
> physical values, ie. somewhere that isn't virtualized? Thanks,

I don't think we have such a capability in configuration space.

>
> Alex
>

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-16 10:08 ` Baochen Qiang @ 2024-01-16 10:41 ` David Woodhouse 2024-01-16 15:29 ` Jason Gunthorpe ` (3 more replies) 0 siblings, 4 replies; 51+ messages in thread From: David Woodhouse @ 2024-01-16 10:41 UTC (permalink / raw) To: Baochen Qiang, Alex Williamson, Kalle Valo Cc: James Prestwood, linux-wireless, ath11k, iommu [-- Attachment #1: Type: text/plain, Size: 11808 bytes --] On Tue, 2024-01-16 at 18:08 +0800, Baochen Qiang wrote: > > > On 1/16/2024 1:46 AM, Alex Williamson wrote: > > On Sun, 14 Jan 2024 16:36:02 +0200 > > Kalle Valo <kvalo@kernel.org> wrote: > > > > > Baochen Qiang <quic_bqiang@quicinc.com> writes: > > > > > > > > > Strange that still fails. Are you now seeing this error in your > > > > > > host or your Qemu? or both? > > > > > > Could you share your test steps? And if you can share please be as > > > > > > detailed as possible since I'm not familiar with passing WLAN > > > > > > hardware to a VM using vfio-pci. > > > > > > > > > > Just in Qemu, the hardware works fine on my host machine. > > > > > I basically follow this guide to set it up, its written in the > > > > > context of GPUs/libvirt but the host setup is exactly the same. By > > > > > no means do you need to read it all, once you set the vfio-pci.ids > > > > > and see your unclaimed adapter you can stop: > > > > > https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF > > > > > In short you should be able to set the following host kernel options > > > > > and reboot (assuming your motherboard/hardware is compatible): > > > > > intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 > > > > > Obviously change the device/vendor IDs to whatever ath11k hw you > > > > > have. Once the host is rebooted you should see your wlan adapter as > > > > > UNCLAIMED, showing the driver in use as vfio-pci. 
If not, its likely > > > > > your motherboard just isn't compatible, the device has to be in its > > > > > own IOMMU group (you could try switching PCI ports if this is the > > > > > case). > > > > > I then build a "kvm_guest.config" kernel with the driver/firmware > > > > > for ath11k and boot into that with the following Qemu options: > > > > > -enable-kvm -device -vfio-pci,host=<PCI address> > > > > > If it seems easier you could also utilize IWD's test-runner which > > > > > handles launching the Qemu kernel automatically, detecting any > > > > > vfio-devices and passes them through and mounts some useful host > > > > > folders into the VM. Its actually a very good general purpose tool > > > > > for kernel testing, not just for IWD: > > > > > https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt > > > > > Once set up you can just run test-runner with a few flags and you'll > > > > > boot into a shell: > > > > > ./tools/test-runner -k <kernel-image> --hw --start /bin/bash > > > > > Please reach out if you have questions, thanks for looking into > > > > > this. > > > > > > > > Thanks for these details. I reproduced this issue by following your guide. > > > > > > > > Seems the root cause is that the MSI vector assigned to WCN6855 in > > > > qemu is different with that in host. In my case the MSI vector in qemu > > > > is [Address: fee00000 Data: 0020] while in host it is [Address: > > > > fee00578 Data: 0000]. So in qemu ath11k configures MSI vector > > > > [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and > > > > firmware uses that vector to fire interrupts to host/qemu. 
However > > > > host IOMMU doesn't know that vector because the real vector is > > > > [Address: fee00578 Data: 0000], as a result host blocks that > > > > interrupt and reports an error, see below log: > > > > > > > > [ 1414.206069] DMAR: DRHD: handling fault status reg 2 > > > > [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > > > request > > > > [ 1414.210334] DMAR: DRHD: handling fault status reg 2 > > > > [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > > > request > > > > [ 1414.212496] DMAR: DRHD: handling fault status reg 2 > > > > [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > > > request > > > > [ 1414.214600] DMAR: DRHD: handling fault status reg 2 > > > > > > > > While I don't think there is a way for qemu/ath11k to get the real MSI > > > > vector from host, I will try to read the vfio code to check further. > > > > Before that, to unblock you, a possible hack is to hard code the MSI > > > > vector in qemu to the same as in host, on condition that the MSI > > > > vector doesn't change. > > > > > > Baochen, awesome that you were able to debug this further. Now we at > > > least know what's the problem. > > > > It's an interesting problem, I don't think we've seen another device > > where the driver reads the MSI register in order to program another > > hardware entity to match the MSI address and data configuration. > > > > When assigning a device, the host and guest use entirely separate > > address spaces for MSI interrupts. When the guest enables MSI, the > > operation is trapped by the VMM and triggers an ioctl to the host to > > perform an equivalent configuration. 
Generally the physical device > > will interrupt within the host where it may be directly attached to KVM > > to signal the interrupt, trigger through the VMM, or where > > virtualization hardware supports it, the interrupt can directly trigger > > the vCPU. From the VM perspective, the guest address/data pair is used > > to signal the interrupt, which is why it makes sense to virtualize the > > MSI registers. > > Hi Alex, could you help elaborate more? why from the VM perspective MSI > virtualization is necessary? An MSI is just a write to physical memory space. You can even use it like that; configure the device to just write 4 bytes to some address in a struct in memory to show that it needs attention, and you then poll that memory. But mostly we don't (ab)use it like that, of course. We tell the device to write to a special range of the physical address space where the interrupt controller lives — the range from 0xfee00000 to 0xfeefffff. The low 20 bits of the address, and the 32 bits of data written to that address, tell the interrupt controller which CPU to interrupt, and which vector to raise on the CPU (as well as some other details and weird interrupt modes which are theoretically encodable). So in your example, the guest writes [Address: fee00000 Data: 0020] which means it wants vector 0x20 on CPU#0 (well, the CPU with APICID 0). But that's what the *guest* wants. If we just blindly programmed that into the hardware, the hardware would deliver vector 0x20 to the host's CPU0... which would be very confused by it. The host has a driver for that device, probably the VFIO driver. The host registers its own interrupt handlers for the real hardware, decides which *host* CPU (and vector) should be notified when something happens. And when that happens, the VFIO driver will raise an event on an eventfd, which will notify QEMU to inject the appropriate interrupt into the guest. So... 
when the guest enables the MSI, that's trapped by QEMU, which remembers which *guest* CPU/vector the interrupt should go to. QEMU tells VFIO to enable the corresponding interrupt, and what gets programmed into the actual hardware is up to the *host* operating system; nothing to do with the guest's information at all.

Then when the actual hardware raises the interrupt, the VFIO interrupt handler runs in the host, signals an event on the eventfd, and QEMU receives that and injects the event into the appropriate guest vCPU. (In practice QEMU doesn't do it these days; there's actually a shortcut which improves latency by allowing the kernel to deliver the event to the guest directly, connecting the eventfd directly to the KVM irq routing table.)

Interrupt remapping is probably not important here, but I'll explain it briefly anyway. With interrupt remapping, the IOMMU handles the 'memory' write from the device, just as it handles all other memory transactions. One of the reasons for interrupt remapping is that the original definitions of the bits in the MSI (the low 20 bits of the address and the 32 bits of what's written) only had 8 bits for the target CPU APIC ID, and we have bigger systems than that now. So by using one of the spare bits in the MSI message, we can indicate that this isn't just a directly-encoded cpu/vector in "Compatibility Format", but is a "Remappable Format" interrupt. Instead of the cpu/vector it just contains an index into the IOMMU's Interrupt Redirection Table, which *does* have a full 32 bits for the target APIC ID. That's why x2apic support (which gives us support for >254 CPUs) depends on interrupt remapping.

The other thing that the IOMMU can do in modern systems is *posted* interrupts, where the entry in the IOMMU's IRT doesn't just specify the host's CPU/vector, but actually specifies a *vCPU* to deliver the interrupt to. All of which is mostly irrelevant as it's just another bypass optimisation to improve latency.
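To make the two formats concrete, here is a rough decoder for the two address/data pairs quoted earlier in this thread, with bit positions taken from my reading of the Intel VT-d specification's MSI layout (a sketch, not a reference implementation):

```shell
# Decode an MSI address/data pair written to the 0xfeexxxxx window.
# Address bit 4 selects the format: 0 = Compatibility, 1 = Remappable.
decode_msi() {
    addr=$1 data=$2
    if [ $(( (addr >> 4) & 1 )) -eq 1 ]; then
        # Remappable: IRT handle in addr[19:5] (addr[2] is handle[15]);
        # if SHV (addr[3]) is set, data[15:0] is added as a subhandle.
        handle=$(( ((addr >> 5) & 0x7fff) | (((addr >> 2) & 1) << 15) ))
        if [ $(( (addr >> 3) & 1 )) -eq 1 ]; then
            handle=$(( handle + (data & 0xffff) ))
        fi
        echo "remappable, IRT index $handle"
    else
        # Compatibility: destination APIC ID in addr[19:12], vector in data[7:0].
        echo "compat, dest APIC $(( (addr >> 12) & 0xff )), vector $(( data & 0xff ))"
    fi
}
decode_msi 0xfee00000 0x0020   # guest's view -> compat, dest APIC 0, vector 32
decode_msi 0xfee00578 0x0000   # host's view  -> remappable, IRT index 43
```

Notice that the guest-programmed pair decodes as a Compatibility Format request while the host's decodes as Remappable Format, which lines up with the host's DMAR log complaining "Blocked a compatibility format interrupt request".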
The key here is that what the guest writes to its emulated MSI table and what the host writes to the real hardware are not at all related. If we had had this posted interrupt support from the beginning, perhaps we could have had a much simpler model — we just let the guest write its intended (v)CPU#/vector *directly* to the MSI table in the device, and let the IOMMU fix it up by having a table pointing to the appropriate set of vCPUs. But that isn't how it happened. The model we have is that the VMM has to *emulate* the config space and handle the interrupts as described above. This means that whenever a device has a non-standard way of configuring MSIs, the VMM has to understand and intercept that. I believe we've even seen some Atheros devices with the MSI target in some weird MMIO registers instead of the standard location, so we've had to hack QEMU to handle those too? > And, maybe a stupid question, is that possible VM/KVM or vfio only > virtualize write operation to MSI register but leave read operation > un-virtualized? I am asking this because in that way ath11k may get a > chance to run in VM after getting the real vector. That might confuse a number of operating systems. Especially if they mask/unmask by reading the register, flipping the mask bit and writing back again. How exactly is the content of this register then given back to the firmware? Is that communication snoopable by the VMM? > > > > Off hand I don't have a good solution for this, the hardware is > > essentially imposing a unique requirement for MSI programming that the > > driver needs visibility of the physical MSI address and data. > > Strictly, the driver doesn't need visibility to the actual values used by the hardware. Another way of looking at it would be to say that the driver programs the MSI through this non-standard method, it just needs the VMM to trap and handle that, just as the VMM does for the standard MSI table. 
Which is what I thought we'd already seen on some Atheros devices. > > It's > > conceivable that device specific code could either make the physical > > address/data pair visible to the VM or trap the firmware programming to > > inject the correct physical values. Is there somewhere other than the > > standard MSI capability in config space that the driver could learn the > > physical values, ie. somewhere that isn't virtualized? Thanks, > > I don't think we have such capability in configuration space. Configuration space is a complete fiction though; it's all emulated. We can do anything we like. Or we can have a PV hypercall which will report it. I don't know that we'd *want* to, but all things are possible. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
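The trap-and-remap flow described in this message can be condensed into a toy model (all names here are invented; this is not QEMU or VFIO code): the guest's MSI write is only recorded, the host programs its own unrelated vector into hardware, and the recorded guest route is what gets injected when the real interrupt fires.

```python
# Toy model (invented names) of assigned-device MSI handling: the value
# the guest writes never reaches the hardware; the VMM remembers the
# guest's intended route and injects it when the host-side handler fires.

class ToyVMM:
    def __init__(self):
        self.guest_route = None   # (dest APICID, vector) the guest asked for
        self.host_vector = None   # what actually went into the hardware

    def guest_writes_msi(self, addr: int, data: int) -> None:
        # Trapped config-space write: record the guest's intended route only.
        self.guest_route = ((addr >> 12) & 0xFF, data & 0xFF)
        # The host allocates whatever vector it likes for the real device;
        # it has nothing to do with the guest's value (0x78 is arbitrary).
        self.host_vector = 0x78

    def hardware_interrupt(self):
        # Host-side handler fires -> eventfd -> inject the *guest* route
        # into the appropriate vCPU.
        assert self.host_vector is not None, "MSI not enabled"
        return self.guest_route

vmm = ToyVMM()
vmm.guest_writes_msi(0xFEE00000, 0x0020)
print(vmm.hardware_interrupt())  # -> (0, 32): APICID 0, vector 0x20
```

The ath11k breakage maps onto this model as the firmware bypassing the trap entirely: it replays the recorded guest value straight onto the bus, where the host IOMMU has no routing entry for it.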
* Re: ath11k and vfio-pci support 2024-01-16 10:41 ` David Woodhouse @ 2024-01-16 15:29 ` Jason Gunthorpe 2024-01-16 18:28 ` Alex Williamson ` (2 subsequent siblings) 3 siblings, 0 replies; 51+ messages in thread From: Jason Gunthorpe @ 2024-01-16 15:29 UTC (permalink / raw) To: David Woodhouse Cc: Baochen Qiang, Alex Williamson, Kalle Valo, James Prestwood, linux-wireless, ath11k, iommu On Tue, Jan 16, 2024 at 11:41:19AM +0100, David Woodhouse wrote: > If we had had this posted interrupt support from the beginning, perhaps > we could have have a much simpler model — we just let the guest write > its intended (v)CPU#/vector *directly* to the MSI table in the device, > and let the IOMMU fix it up by having a table pointing to the > appropriate set of vCPUs. But that isn't how it happened. The model we > have is that the VMM has to *emulate* the config space and handle the > interrupts as described above. I do have a strong desire to rework things to be more like this, just not time yet :) We have enough real problems related to the fake interrupt data in the guest. This ath11k thing sounds more like IMS really - it makes zero sense that a device would be designed where the MSI vector has to be copied to another location - most likely the other location is another interrupt source that can be programmed independently, with its own irqchip, etc? Linux supports this now. Thomas and Intel did it to support SIOV IMS. Are you sure you have implemented your Linux driver correctly? :) Of course IMS doesn't work in VMs, but that is a big motivation to fix the irq organizing. At least you'd know why the device is broken :) Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-16 10:41 ` David Woodhouse 2024-01-16 15:29 ` Jason Gunthorpe @ 2024-01-16 18:28 ` Alex Williamson 2024-01-16 21:10 ` Jeff Johnson 2024-01-17 5:47 ` Baochen Qiang 2024-03-21 19:14 ` Johannes Berg 3 siblings, 1 reply; 51+ messages in thread From: Alex Williamson @ 2024-01-16 18:28 UTC (permalink / raw) To: David Woodhouse Cc: Baochen Qiang, Kalle Valo, James Prestwood, linux-wireless, ath11k, iommu On Tue, 16 Jan 2024 11:41:19 +0100 David Woodhouse <dwmw2@infradead.org> wrote: > On Tue, 2024-01-16 at 18:08 +0800, Baochen Qiang wrote: > > > > > > On 1/16/2024 1:46 AM, Alex Williamson wrote: > > > On Sun, 14 Jan 2024 16:36:02 +0200 > > > Kalle Valo <kvalo@kernel.org> wrote: > > > > > > > Baochen Qiang <quic_bqiang@quicinc.com> writes: > > > > > > > > > > > Strange that still fails. Are you now seeing this error in your > > > > > > > host or your Qemu? or both? > > > > > > > Could you share your test steps? And if you can share please be as > > > > > > > detailed as possible since I'm not familiar with passing WLAN > > > > > > > hardware to a VM using vfio-pci. > > > > > > > > > > > > Just in Qemu, the hardware works fine on my host machine. > > > > > > I basically follow this guide to set it up, its written in the > > > > > > context of GPUs/libvirt but the host setup is exactly the same. By > > > > > > no means do you need to read it all, once you set the vfio-pci.ids > > > > > > and see your unclaimed adapter you can stop: > > > > > > https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF > > > > > > In short you should be able to set the following host kernel options > > > > > > and reboot (assuming your motherboard/hardware is compatible): > > > > > > intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 > > > > > > Obviously change the device/vendor IDs to whatever ath11k hw you > > > > > > have. Once the host is rebooted you should see your wlan adapter as > > > > > > UNCLAIMED, showing the driver in use as vfio-pci. 
If not, its likely > > > > > > your motherboard just isn't compatible, the device has to be in its > > > > > > own IOMMU group (you could try switching PCI ports if this is the > > > > > > case). > > > > > > I then build a "kvm_guest.config" kernel with the driver/firmware > > > > > > for ath11k and boot into that with the following Qemu options: > > > > > > -enable-kvm -device -vfio-pci,host=<PCI address> > > > > > > If it seems easier you could also utilize IWD's test-runner which > > > > > > handles launching the Qemu kernel automatically, detecting any > > > > > > vfio-devices and passes them through and mounts some useful host > > > > > > folders into the VM. Its actually a very good general purpose tool > > > > > > for kernel testing, not just for IWD: > > > > > > https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt > > > > > > Once set up you can just run test-runner with a few flags and you'll > > > > > > boot into a shell: > > > > > > ./tools/test-runner -k <kernel-image> --hw --start /bin/bash > > > > > > Please reach out if you have questions, thanks for looking into > > > > > > this. > > > > > > > > > > Thanks for these details. I reproduced this issue by following your guide. > > > > > > > > > > Seems the root cause is that the MSI vector assigned to WCN6855 in > > > > > qemu is different with that in host. In my case the MSI vector in qemu > > > > > is [Address: fee00000 Data: 0020] while in host it is [Address: > > > > > fee00578 Data: 0000]. So in qemu ath11k configures MSI vector > > > > > [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and > > > > > firmware uses that vector to fire interrupts to host/qemu. 
However > > > > > host IOMMU doesn't know that vector because the real vector is > > > > > [Address: fee00578 Data: 0000], as a result host blocks that > > > > > interrupt and reports an error, see below log: > > > > > > > > > > [ 1414.206069] DMAR: DRHD: handling fault status reg 2 > > > > > [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > > > > request > > > > > [ 1414.210334] DMAR: DRHD: handling fault status reg 2 > > > > > [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > > > > request > > > > > [ 1414.212496] DMAR: DRHD: handling fault status reg 2 > > > > > [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > > > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt > > > > > request > > > > > [ 1414.214600] DMAR: DRHD: handling fault status reg 2 > > > > > > > > > > While I don't think there is a way for qemu/ath11k to get the real MSI > > > > > vector from host, I will try to read the vfio code to check further. > > > > > Before that, to unblock you, a possible hack is to hard code the MSI > > > > > vector in qemu to the same as in host, on condition that the MSI > > > > > vector doesn't change. > > > > > > > > Baochen, awesome that you were able to debug this further. Now we at > > > > least know what's the problem. > > > > > > It's an interesting problem, I don't think we've seen another device > > > where the driver reads the MSI register in order to program another > > > hardware entity to match the MSI address and data configuration. > > > > > > When assigning a device, the host and guest use entirely separate > > > address spaces for MSI interrupts. When the guest enables MSI, the > > > operation is trapped by the VMM and triggers an ioctl to the host to > > > perform an equivalent configuration. 
Generally the physical device > > > will interrupt within the host where it may be directly attached to KVM > > > to signal the interrupt, trigger through the VMM, or where > > > virtualization hardware supports it, the interrupt can directly trigger > > > the vCPU. From the VM perspective, the guest address/data pair is used > > > to signal the interrupt, which is why it makes sense to virtualize the > > > MSI registers. > > > > Hi Alex, could you help elaborate more? why from the VM perspective MSI > > virtualization is necessary? > > An MSI is just a write to physical memory space. You can even use it > like that; configure the device to just write 4 bytes to some address > in a struct in memory to show that it needs attention, and you then > poll that memory. > > But mostly we don't (ab)use it like that, of course. We tell the device > to write to a special range of the physical address space where the > interrupt controller lives — the range from 0xfee00000 to 0xfeefffff. > The low 20 bits of the address, and the 32 bits of data written to that > address, tell the interrupt controller which CPU to interrupt, and > which vector to raise on the CPU (as well as some other details and > weird interrupt modes which are theoretically encodable). > > So in your example, the guest writes [Address: fee00000 Data: 0020] > which means it wants vector 0x20 on CPU#0 (well, the CPU with APICID > 0). But that's what the *guest* wants. If we just blindly programmed > that into the hardware, the hardware would deliver vector 0x20 to the > host's CPU0... which would be very confused by it. > > The host has a driver for that device, probably the VFIO driver. The > host registers its own interrupt handlers for the real hardware, > decides which *host* CPU (and vector) should be notified when something > happens. And when that happens, the VFIO driver will raise an event on > an eventfd, which will notify QEMU to inject the appropriate interrupt > into the guest. > > So... 
when the guest enables the MSI, that's trapped by QEMU which > remembers which *guest* CPU/vector the interrupt should go to. QEMU > tells VFIO to enable the corresponding interrupt, and what gets > programmed into the actual hardware is up to the *host* operating > system; nothing to do with the guest's information at all. > > Then when the actual hardware raises the interrupt, the VFIO interrupt > handler runs in the guest, signals an event on the eventfd, and QEMU s/guest/host/ > receives that and injects the event into the appropriate guest vCPU. > > (In practice QEMU doesn't do it these days; there's actually a shortcut > which improves latency by allowing the kernel to deliver the event to > the guest directly, connecting the eventfd directly to the KVM irq > routing table.) > > > Interrupt remapping is probably not important here, but I'll explain it > briefly anyway. With interrupt remapping, the IOMMU handles the > 'memory' write from the device, just as it handles all other memory > transactions. One of the reasons for interrupt remapping is that the > original definitions of the bits in the MSI (the low 20 bits of the > address and the 32 bits of what's written) only had 8 bits for the > target CPU APICID. And we have bigger systems than that now. > > So by using one of the spare bits in the MSI message, we can indicate > that this isn't just a directly-encoded cpu/vector in "Compatibility > Format", but is a "Remappable Format" interrupt. Instead of the > cpu/vector it just contains an index in to the IOMMU's Interrupt > Redirection Table. Which *does* have a full 32-bits for the target APIC > ID. That's why x2apic support (which gives us support for >254 CPUs) > depends on interrupt remapping. > > The other thing that the IOMMU can do in modern systems is *posted* > interrupts. Where the entry in the IOMMU's IRT doesn't just specify the > host's CPU/vector, but actually specifies a *vCPU* to deliver the > interrupt to. 
> > All of which is mostly irrelevant as it's just another bypass > optimisation to improve latency. The key here is that what the guest > writes to its emulated MSI table and what the host writes to the real > hardware are not at all related. > > If we had had this posted interrupt support from the beginning, perhaps > we could have have a much simpler model — we just let the guest write > its intended (v)CPU#/vector *directly* to the MSI table in the device, > and let the IOMMU fix it up by having a table pointing to the > appropriate set of vCPUs. But that isn't how it happened. The model we > have is that the VMM has to *emulate* the config space and handle the > interrupts as described above. > > This means that whenever a device has a non-standard way of configuring > MSIs, the VMM has to understand and intercept that. I believe we've > even seen some Atheros devices with the MSI target in some weird MMIO > registers instead of the standard location, so we've had to hack QEMU > to handle those too? > > > And, maybe a stupid question, is that possible VM/KVM or vfio only > > virtualize write operation to MSI register but leave read operation > > un-virtualized? I am asking this because in that way ath11k may get a > > chance to run in VM after getting the real vector. > > That might confuse a number of operating systems. Especially if they > mask/unmask by reading the register, flipping the mask bit and writing > back again. > > How exactly is the content of this register then given back to the > firmware? Is that communication snoopable by the VMM? > > > > > > > > Off hand I don't have a good solution for this, the hardware is > > > essentially imposing a unique requirement for MSI programming that the > > > driver needs visibility of the physical MSI address and data. > > > > > Strictly, the driver doesn't need visibility to the actual values used > by the hardware. 
Another way of it looking at it would be to say that > the driver programs the MSI through this non-standard method, it just > needs the VMM to trap and handle that, just as the VMM does for the > standard MSI table. > > Which is what I thought we'd already seen on some Atheros devices. > > > > It's > > > conceivable that device specific code could either make the physical > > > address/data pair visible to the VM or trap the firmware programming to > > > inject the correct physical values. Is there somewhere other than the > > > standard MSI capability in config space that the driver could learn the > > > physical values, ie. somewhere that isn't virtualized? Thanks, > > > > I don't think we have such capability in configuration space. > > Configuration space is a complete fiction though; it's all emulated. We > can do anything we like. Or we can have a PV hypercall which will > report it. I don't know that we'd *want* to, but all things are > possible. RTL8169 has a back door to the MSI-X vector table, maybe that's the one you're thinking of. Alternate methods for the driver to access config space are common on GPUs, presumably because they require extensive vBIOS support, and IO port and MMIO windows through which pre-boot code can interact with config space are faster and easier than standard config accesses. Much of the work of assigning a GPU to a VM is to wrap those alternate methods in virtualization to keep the driver working within the guest address space. The fictitious config space was my thought too; an ath11k vfio-pci variant driver could insert a vendor defined capability into config space to expose the physical MSI address/data. The driver would know by the presence of the capability that it's running in a VM and should prefer that mechanism to retrieve MSI address and data. 
Alternatively as also suggested here, if programming of the firmware with the MSI address/data is something that a hypervisor could trap, then we might be able to make it transparent to the guest. For example if it were programmed via MMIO, the guest address/data values could be auto-magically replaced with physical values. Since QEMU doesn't know the physical values, this would also likely be through a device specific extension to vfio-pci through a variant driver, or maybe some combination of variant driver and QEMU if we need to make trapping conditional in order to avoid a performance penalty. This is essentially device specific interrupt programming, which either needs to be virtualized (performed by the VMM) or paravirtualized (performed in cooperation with the guest). This is also something to keep in mind relative to the initial source of this issue, ie. testing device drivers and hardware under device assignment. There can be subtle differences. Thanks, Alex ^ permalink raw reply [flat|nested] 51+ messages in thread
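As a sketch of the first idea, here is one hypothetical shape such a synthesized capability could take. The layout, field sizes, and length byte are invented for illustration; nothing like this exists in vfio or ath11k today, and only the capability ID (0x09, PCI vendor-specific) is real:

```python
import struct

# Hypothetical: a PCI vendor-specific capability that a vfio-pci variant
# driver could synthesize into the emulated config space to expose the
# host's physical MSI address/data to a cooperating guest driver.
PCI_CAP_ID_VNDR = 0x09  # real: vendor-specific capability ID

def build_msi_info_cap(next_ptr: int, msi_addr: int, msi_data: int) -> bytes:
    # Invented payload: 8-byte physical MSI address + 4-byte MSI data.
    payload = struct.pack("<QI", msi_addr, msi_data)
    # Byte 0: cap ID; byte 1: next-capability pointer; byte 2: total
    # length (vendor-specific caps carry their own length field).
    total_len = 3 + 1 + len(payload)        # header + pad + payload = 16
    header = struct.pack("<BBB", PCI_CAP_ID_VNDR, next_ptr, total_len)
    return header + b"\x00" + payload       # pad byte keeps payload aligned

cap = build_msi_info_cap(0x00, 0xFEE00578, 0x0000)
print(len(cap), hex(cap[0]))  # 16 bytes, capability ID 0x9
```

A guest-side ath11k change would then walk the capability list, and on finding this (hypothetical) capability, hand the enclosed physical address/data to the firmware instead of the values read back from the emulated MSI registers.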
* Re: ath11k and vfio-pci support 2024-01-16 18:28 ` Alex Williamson @ 2024-01-16 21:10 ` Jeff Johnson 0 siblings, 0 replies; 51+ messages in thread From: Jeff Johnson @ 2024-01-16 21:10 UTC (permalink / raw) To: Alex Williamson, David Woodhouse Cc: Baochen Qiang, Kalle Valo, James Prestwood, linux-wireless, ath11k, iommu, kernel@quicinc.com On 1/16/2024 10:28 AM, Alex Williamson wrote: > On Tue, 16 Jan 2024 11:41:19 +0100 > David Woodhouse <dwmw2@infradead.org> wrote: > >> On Tue, 2024-01-16 at 18:08 +0800, Baochen Qiang wrote: >>> >>> >>> On 1/16/2024 1:46 AM, Alex Williamson wrote: >>>> On Sun, 14 Jan 2024 16:36:02 +0200 >>>> Kalle Valo <kvalo@kernel.org> wrote: >>>> >>>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>>> >>>>>>>> Strange that still fails. Are you now seeing this error in your >>>>>>>> host or your Qemu? or both? >>>>>>>> Could you share your test steps? And if you can share please be as >>>>>>>> detailed as possible since I'm not familiar with passing WLAN >>>>>>>> hardware to a VM using vfio-pci. >>>>>>> >>>>>>> Just in Qemu, the hardware works fine on my host machine. >>>>>>> I basically follow this guide to set it up, its written in the >>>>>>> context of GPUs/libvirt but the host setup is exactly the same. By >>>>>>> no means do you need to read it all, once you set the vfio-pci.ids >>>>>>> and see your unclaimed adapter you can stop: >>>>>>> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF >>>>>>> In short you should be able to set the following host kernel options >>>>>>> and reboot (assuming your motherboard/hardware is compatible): >>>>>>> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 >>>>>>> Obviously change the device/vendor IDs to whatever ath11k hw you >>>>>>> have. Once the host is rebooted you should see your wlan adapter as >>>>>>> UNCLAIMED, showing the driver in use as vfio-pci. 
If not, its likely >>>>>>> your motherboard just isn't compatible, the device has to be in its >>>>>>> own IOMMU group (you could try switching PCI ports if this is the >>>>>>> case). >>>>>>> I then build a "kvm_guest.config" kernel with the driver/firmware >>>>>>> for ath11k and boot into that with the following Qemu options: >>>>>>> -enable-kvm -device -vfio-pci,host=<PCI address> >>>>>>> If it seems easier you could also utilize IWD's test-runner which >>>>>>> handles launching the Qemu kernel automatically, detecting any >>>>>>> vfio-devices and passes them through and mounts some useful host >>>>>>> folders into the VM. Its actually a very good general purpose tool >>>>>>> for kernel testing, not just for IWD: >>>>>>> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt >>>>>>> Once set up you can just run test-runner with a few flags and you'll >>>>>>> boot into a shell: >>>>>>> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash >>>>>>> Please reach out if you have questions, thanks for looking into >>>>>>> this. >>>>>> >>>>>> Thanks for these details. I reproduced this issue by following your guide. >>>>>> >>>>>> Seems the root cause is that the MSI vector assigned to WCN6855 in >>>>>> qemu is different with that in host. In my case the MSI vector in qemu >>>>>> is [Address: fee00000 Data: 0020] while in host it is [Address: >>>>>> fee00578 Data: 0000]. So in qemu ath11k configures MSI vector >>>>>> [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and >>>>>> firmware uses that vector to fire interrupts to host/qemu. 
However >>>>>> host IOMMU doesn't know that vector because the real vector is >>>>>> [Address: fee00578 Data: 0000], as a result host blocks that >>>>>> interrupt and reports an error, see below log: >>>>>> >>>>>> [ 1414.206069] DMAR: DRHD: handling fault status reg 2 >>>>>> [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>>>>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>>>>> request >>>>>> [ 1414.210334] DMAR: DRHD: handling fault status reg 2 >>>>>> [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>>>>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>>>>> request >>>>>> [ 1414.212496] DMAR: DRHD: handling fault status reg 2 >>>>>> [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>>>>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>>>>> request >>>>>> [ 1414.214600] DMAR: DRHD: handling fault status reg 2 >>>>>> >>>>>> While I don't think there is a way for qemu/ath11k to get the real MSI >>>>>> vector from host, I will try to read the vfio code to check further. >>>>>> Before that, to unblock you, a possible hack is to hard code the MSI >>>>>> vector in qemu to the same as in host, on condition that the MSI >>>>>> vector doesn't change. >>>>> >>>>> Baochen, awesome that you were able to debug this further. Now we at >>>>> least know what's the problem. >>>> >>>> It's an interesting problem, I don't think we've seen another device >>>> where the driver reads the MSI register in order to program another >>>> hardware entity to match the MSI address and data configuration. >>>> >>>> When assigning a device, the host and guest use entirely separate >>>> address spaces for MSI interrupts. When the guest enables MSI, the >>>> operation is trapped by the VMM and triggers an ioctl to the host to >>>> perform an equivalent configuration. 
Generally the physical device >>>> will interrupt within the host where it may be directly attached to KVM >>>> to signal the interrupt, trigger through the VMM, or where >>>> virtualization hardware supports it, the interrupt can directly trigger >>>> the vCPU. From the VM perspective, the guest address/data pair is used >>>> to signal the interrupt, which is why it makes sense to virtualize the >>>> MSI registers. >>> >>> Hi Alex, could you help elaborate more? why from the VM perspective MSI >>> virtualization is necessary? >> >> An MSI is just a write to physical memory space. You can even use it >> like that; configure the device to just write 4 bytes to some address >> in a struct in memory to show that it needs attention, and you then >> poll that memory. >> >> But mostly we don't (ab)use it like that, of course. We tell the device >> to write to a special range of the physical address space where the >> interrupt controller lives — the range from 0xfee00000 to 0xfeefffff. >> The low 20 bits of the address, and the 32 bits of data written to that >> address, tell the interrupt controller which CPU to interrupt, and >> which vector to raise on the CPU (as well as some other details and >> weird interrupt modes which are theoretically encodable). >> >> So in your example, the guest writes [Address: fee00000 Data: 0020] >> which means it wants vector 0x20 on CPU#0 (well, the CPU with APICID >> 0). But that's what the *guest* wants. If we just blindly programmed >> that into the hardware, the hardware would deliver vector 0x20 to the >> host's CPU0... which would be very confused by it. >> >> The host has a driver for that device, probably the VFIO driver. The >> host registers its own interrupt handlers for the real hardware, >> decides which *host* CPU (and vector) should be notified when something >> happens. 
And when that happens, the VFIO driver will raise an event on >> an eventfd, which will notify QEMU to inject the appropriate interrupt >> into the guest. >> >> So... when the guest enables the MSI, that's trapped by QEMU which >> remembers which *guest* CPU/vector the interrupt should go to. QEMU >> tells VFIO to enable the corresponding interrupt, and what gets >> programmed into the actual hardware is up to the *host* operating >> system; nothing to do with the guest's information at all. >> >> Then when the actual hardware raises the interrupt, the VFIO interrupt >> handler runs in the guest, signals an event on the eventfd, and QEMU > > s/guest/host/ > >> receives that and injects the event into the appropriate guest vCPU. >> >> (In practice QEMU doesn't do it these days; there's actually a shortcut >> which improves latency by allowing the kernel to deliver the event to >> the guest directly, connecting the eventfd directly to the KVM irq >> routing table.) >> >> >> Interrupt remapping is probably not important here, but I'll explain it >> briefly anyway. With interrupt remapping, the IOMMU handles the >> 'memory' write from the device, just as it handles all other memory >> transactions. One of the reasons for interrupt remapping is that the >> original definitions of the bits in the MSI (the low 20 bits of the >> address and the 32 bits of what's written) only had 8 bits for the >> target CPU APICID. And we have bigger systems than that now. >> >> So by using one of the spare bits in the MSI message, we can indicate >> that this isn't just a directly-encoded cpu/vector in "Compatibility >> Format", but is a "Remappable Format" interrupt. Instead of the >> cpu/vector it just contains an index in to the IOMMU's Interrupt >> Redirection Table. Which *does* have a full 32-bits for the target APIC >> ID. That's why x2apic support (which gives us support for >254 CPUs) >> depends on interrupt remapping. 
>> >> The other thing that the IOMMU can do in modern systems is *posted* >> interrupts. Where the entry in the IOMMU's IRT doesn't just specify the >> host's CPU/vector, but actually specifies a *vCPU* to deliver the >> interrupt to. >> >> All of which is mostly irrelevant as it's just another bypass >> optimisation to improve latency. The key here is that what the guest >> writes to its emulated MSI table and what the host writes to the real >> hardware are not at all related. >> >> If we had had this posted interrupt support from the beginning, perhaps >> we could have have a much simpler model — we just let the guest write >> its intended (v)CPU#/vector *directly* to the MSI table in the device, >> and let the IOMMU fix it up by having a table pointing to the >> appropriate set of vCPUs. But that isn't how it happened. The model we >> have is that the VMM has to *emulate* the config space and handle the >> interrupts as described above. >> >> This means that whenever a device has a non-standard way of configuring >> MSIs, the VMM has to understand and intercept that. I believe we've >> even seen some Atheros devices with the MSI target in some weird MMIO >> registers instead of the standard location, so we've had to hack QEMU >> to handle those too? >> >>> And, maybe a stupid question, is that possible VM/KVM or vfio only >>> virtualize write operation to MSI register but leave read operation >>> un-virtualized? I am asking this because in that way ath11k may get a >>> chance to run in VM after getting the real vector. >> >> That might confuse a number of operating systems. Especially if they >> mask/unmask by reading the register, flipping the mask bit and writing >> back again. >> >> How exactly is the content of this register then given back to the >> firmware? Is that communication snoopable by the VMM? 
>> >> >>>> >>>> Off hand I don't have a good solution for this, the hardware is >>>> essentially imposing a unique requirement for MSI programming that the >>>> driver needs visibility of the physical MSI address and data. >>>> >> >> Strictly, the driver doesn't need visibility to the actual values used >> by the hardware. Another way of it looking at it would be to say that >> the driver programs the MSI through this non-standard method, it just >> needs the VMM to trap and handle that, just as the VMM does for the >> standard MSI table. >> >> Which is what I thought we'd already seen on some Atheros devices. >> >>>> It's >>>> conceivable that device specific code could either make the physical >>>> address/data pair visible to the VM or trap the firmware programming to >>>> inject the correct physical values. Is there somewhere other than the >>>> standard MSI capability in config space that the driver could learn the >>>> physical values, ie. somewhere that isn't virtualized? Thanks, >>> >>> I don't think we have such capability in configuration space. >> >> Configuration space is a complete fiction though; it's all emulated. We >> can do anything we like. Or we can have a PV hypercall which will >> report it. I don't know that we'd *want* to, but all things are >> possible. > > RTL8169 has a back door to the MSI-X vector table, maybe that's the one > you're thinking of. Alternate methods for the driver to access config > space is common on GPUs, presumably because they require extensive > vBIOS support and IO port and MMIO windows through which pre-boot code > can interact with config space is faster and easier than standard > config accesses. Much of the work of assigning a GPU to a VM is to > wrap those alternate methods in virtualization to keep the driver > working within the guest address space. 
> > The fictitious config space was my thought too, an ath11k vfio-pci > variant driver could insert a vendor defined capability into config > space to expose the physical MSI address/data. The driver would know > by the presence of the capability that it's running in a VM and to > prefer that mechanism to retrieve MSI address and data. > > Alternatively as also suggested here, if programming of the firmware > with the MSI address/data is something that a hypervisor could trap, > then we might be able to make it transparent to the guest. For example > if it were programmed via MMIO, the guest address/data values could be > auto-magically replaced with physical values. Since QEMU doesn't know > the physical values, this would also likely be through a device > specific extension to vfio-pci through a variant driver, or maybe some > combination of variant driver and QEMU if we need to make trapping > conditional in order to avoid a performance penalty. > > This is essentially device specific interrupt programming, which either > needs to be virtualized (performed by the VMM) or paravirtualized > (performed in cooperation with the guest). This is also something to > keep in mind relative to the initial source of this issue, ie. testing > device drivers and hardware under device assignment. There can be > subtle differences. Thanks, > > Alex > > + kernel@quicinc.com for added visibility and advice Full thread: <https://lore.kernel.org/all/adcb785e-4dc7-4c4a-b341-d53b72e13467@gmail.com/> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-16 10:41 ` David Woodhouse 2024-01-16 15:29 ` Jason Gunthorpe 2024-01-16 18:28 ` Alex Williamson @ 2024-01-17 5:47 ` Baochen Qiang 2024-03-21 19:14 ` Johannes Berg 3 siblings, 0 replies; 51+ messages in thread From: Baochen Qiang @ 2024-01-17 5:47 UTC (permalink / raw) To: David Woodhouse, Alex Williamson, Kalle Valo Cc: James Prestwood, linux-wireless, ath11k, iommu On 1/16/2024 6:41 PM, David Woodhouse wrote: > On Tue, 2024-01-16 at 18:08 +0800, Baochen Qiang wrote: >> >> >> On 1/16/2024 1:46 AM, Alex Williamson wrote: >>> On Sun, 14 Jan 2024 16:36:02 +0200 >>> Kalle Valo <kvalo@kernel.org> wrote: >>> >>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>> >>>>>>> Strange that still fails. Are you now seeing this error in your >>>>>>> host or your Qemu? or both? >>>>>>> Could you share your test steps? And if you can share please be as >>>>>>> detailed as possible since I'm not familiar with passing WLAN >>>>>>> hardware to a VM using vfio-pci. >>>>>> >>>>>> Just in Qemu, the hardware works fine on my host machine. >>>>>> I basically follow this guide to set it up, its written in the >>>>>> context of GPUs/libvirt but the host setup is exactly the same. By >>>>>> no means do you need to read it all, once you set the vfio-pci.ids >>>>>> and see your unclaimed adapter you can stop: >>>>>> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF >>>>>> In short you should be able to set the following host kernel options >>>>>> and reboot (assuming your motherboard/hardware is compatible): >>>>>> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 >>>>>> Obviously change the device/vendor IDs to whatever ath11k hw you >>>>>> have. Once the host is rebooted you should see your wlan adapter as >>>>>> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely >>>>>> your motherboard just isn't compatible, the device has to be in its >>>>>> own IOMMU group (you could try switching PCI ports if this is the >>>>>> case). 
>>>>>> I then build a "kvm_guest.config" kernel with the driver/firmware >>>>>> for ath11k and boot into that with the following Qemu options: >>>>>> -enable-kvm -device vfio-pci,host=<PCI address> >>>>>> If it seems easier you could also utilize IWD's test-runner which >>>>>> handles launching the Qemu kernel automatically, detecting any >>>>>> vfio devices, passing them through, and mounting some useful host >>>>>> folders into the VM. It's actually a very good general purpose tool >>>>>> for kernel testing, not just for IWD: >>>>>> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt >>>>>> Once set up you can just run test-runner with a few flags and you'll >>>>>> boot into a shell: >>>>>> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash >>>>>> Please reach out if you have questions, thanks for looking into >>>>>> this. >>>>> >>>>> Thanks for these details. I reproduced this issue by following your guide. >>>>> >>>>> Seems the root cause is that the MSI vector assigned to WCN6855 in >>>>> qemu is different from that in the host. In my case the MSI vector in qemu >>>>> is [Address: fee00000 Data: 0020] while in the host it is [Address: >>>>> fee00578 Data: 0000]. So in qemu ath11k configures MSI vector >>>>> [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and >>>>> firmware uses that vector to fire interrupts to host/qemu. 
However >>>>> host IOMMU doesn't know that vector because the real vector is >>>>> [Address: fee00578 Data: 0000], as a result host blocks that >>>>> interrupt and reports an error, see below log: >>>>> >>>>> [ 1414.206069] DMAR: DRHD: handling fault status reg 2 >>>>> [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>>>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>>>> request >>>>> [ 1414.210334] DMAR: DRHD: handling fault status reg 2 >>>>> [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>>>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>>>> request >>>>> [ 1414.212496] DMAR: DRHD: handling fault status reg 2 >>>>> [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >>>>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt >>>>> request >>>>> [ 1414.214600] DMAR: DRHD: handling fault status reg 2 >>>>> >>>>> While I don't think there is a way for qemu/ath11k to get the real MSI >>>>> vector from host, I will try to read the vfio code to check further. >>>>> Before that, to unblock you, a possible hack is to hard code the MSI >>>>> vector in qemu to the same as in host, on condition that the MSI >>>>> vector doesn't change. >>>> >>>> Baochen, awesome that you were able to debug this further. Now we at >>>> least know what's the problem. >>> >>> It's an interesting problem, I don't think we've seen another device >>> where the driver reads the MSI register in order to program another >>> hardware entity to match the MSI address and data configuration. >>> >>> When assigning a device, the host and guest use entirely separate >>> address spaces for MSI interrupts. When the guest enables MSI, the >>> operation is trapped by the VMM and triggers an ioctl to the host to >>> perform an equivalent configuration. 
Generally the physical device >>> will interrupt within the host where it may be directly attached to KVM >>> to signal the interrupt, trigger through the VMM, or where >>> virtualization hardware supports it, the interrupt can directly trigger >>> the vCPU. From the VM perspective, the guest address/data pair is used >>> to signal the interrupt, which is why it makes sense to virtualize the >>> MSI registers. >> >> Hi Alex, could you help elaborate more? why from the VM perspective MSI >> virtualization is necessary? > > An MSI is just a write to physical memory space. You can even use it > like that; configure the device to just write 4 bytes to some address > in a struct in memory to show that it needs attention, and you then > poll that memory. > > But mostly we don't (ab)use it like that, of course. We tell the device > to write to a special range of the physical address space where the > interrupt controller lives — the range from 0xfee00000 to 0xfeefffff. > The low 20 bits of the address, and the 32 bits of data written to that > address, tell the interrupt controller which CPU to interrupt, and > which vector to raise on the CPU (as well as some other details and > weird interrupt modes which are theoretically encodable). > > So in your example, the guest writes [Address: fee00000 Data: 0020] > which means it wants vector 0x20 on CPU#0 (well, the CPU with APICID > 0). But that's what the *guest* wants. If we just blindly programmed > that into the hardware, the hardware would deliver vector 0x20 to the > host's CPU0... which would be very confused by it. > > The host has a driver for that device, probably the VFIO driver. The > host registers its own interrupt handlers for the real hardware, > decides which *host* CPU (and vector) should be notified when something > happens. And when that happens, the VFIO driver will raise an event on > an eventfd, which will notify QEMU to inject the appropriate interrupt > into the guest. > > So... 
when the guest enables the MSI, that's trapped by QEMU which > remembers which *guest* CPU/vector the interrupt should go to. QEMU > tells VFIO to enable the corresponding interrupt, and what gets > programmed into the actual hardware is up to the *host* operating > system; nothing to do with the guest's information at all. > > Then when the actual hardware raises the interrupt, the VFIO interrupt > handler runs in the guest, signals an event on the eventfd, and QEMU > receives that and injects the event into the appropriate guest vCPU. > > (In practice QEMU doesn't do it these days; there's actually a shortcut > which improves latency by allowing the kernel to deliver the event to > the guest directly, connecting the eventfd directly to the KVM irq > routing table.) > > > Interrupt remapping is probably not important here, but I'll explain it > briefly anyway. With interrupt remapping, the IOMMU handles the > 'memory' write from the device, just as it handles all other memory > transactions. One of the reasons for interrupt remapping is that the > original definitions of the bits in the MSI (the low 20 bits of the > address and the 32 bits of what's written) only had 8 bits for the > target CPU APICID. And we have bigger systems than that now. > > So by using one of the spare bits in the MSI message, we can indicate > that this isn't just a directly-encoded cpu/vector in "Compatibility > Format", but is a "Remappable Format" interrupt. Instead of the > cpu/vector it just contains an index in to the IOMMU's Interrupt > Redirection Table. Which *does* have a full 32-bits for the target APIC > ID. That's why x2apic support (which gives us support for >254 CPUs) > depends on interrupt remapping. > > The other thing that the IOMMU can do in modern systems is *posted* > interrupts. Where the entry in the IOMMU's IRT doesn't just specify the > host's CPU/vector, but actually specifies a *vCPU* to deliver the > interrupt to. 
> > All of which is mostly irrelevant as it's just another bypass > optimisation to improve latency. The key here is that what the guest > writes to its emulated MSI table and what the host writes to the real > hardware are not at all related. > Thanks. A really detailed and clear explanation. > If we had had this posted interrupt support from the beginning, perhaps > we could have had a much simpler model — we just let the guest write > its intended (v)CPU#/vector *directly* to the MSI table in the device, > and let the IOMMU fix it up by having a table pointing to the > appropriate set of vCPUs. But that isn't how it happened. The model we > have is that the VMM has to *emulate* the config space and handle the > interrupts as described above. > > This means that whenever a device has a non-standard way of configuring > MSIs, the VMM has to understand and intercept that. I believe we've > even seen some Atheros devices with the MSI target in some weird MMIO > registers instead of the standard location, so we've had to hack QEMU > to handle those too? > >> And, maybe a stupid question, is it possible that VM/KVM or vfio only >> virtualize the write operation to the MSI register but leave the read >> operation un-virtualized? I am asking this because in that way ath11k may get a >> chance to run in a VM after getting the real vector. > > That might confuse a number of operating systems. Especially if they > mask/unmask by reading the register, flipping the mask bit and writing > back again. > > How exactly is the content of this register then given back to the > firmware? Is that communication snoopable by the VMM? By programming it into an MMIO register. It is a non-standard, device-specific register, so I am not sure whether it is snoopable by the VMM. > > >>> >>> Off hand I don't have a good solution for this, the hardware is >>> essentially imposing a unique requirement for MSI programming that the >>> driver needs visibility of the physical MSI address and data. 
>>> > > Strictly, the driver doesn't need visibility to the actual values used > by the hardware. Another way of looking at it would be to say that > the driver programs the MSI through this non-standard method, it just > needs the VMM to trap and handle that, just as the VMM does for the > standard MSI table. > > Which is what I thought we'd already seen on some Atheros devices. > >>> It's >>> conceivable that device specific code could either make the physical >>> address/data pair visible to the VM or trap the firmware programming to >>> inject the correct physical values. Is there somewhere other than the >>> standard MSI capability in config space that the driver could learn the >>> physical values, i.e. somewhere that isn't virtualized? Thanks, >> >> I don't think we have such a capability in configuration space. > > Configuration space is a complete fiction though; it's all emulated. We > can do anything we like. Or we can have a PV hypercall which will > report it. I don't know that we'd *want* to, but all things are > possible. OK, I get the point now. > ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-16 10:41 ` David Woodhouse ` (2 preceding siblings ...) 2024-01-17 5:47 ` Baochen Qiang @ 2024-03-21 19:14 ` Johannes Berg 3 siblings, 0 replies; 51+ messages in thread From: Johannes Berg @ 2024-03-21 19:14 UTC (permalink / raw) To: David Woodhouse, Baochen Qiang, Alex Williamson, Kalle Valo Cc: Jose Ignacio Tornos Martinez, James Prestwood, linux-wireless, ath11k, iommu Given the "ewww" patch to make a module parameter workaround for this: https://lore.kernel.org/linux-wireless/20240321172402.346191-1-jtornosm@redhat.com/T/#u On Tue, 2024-01-16 at 11:41 +0100, David Woodhouse wrote: > > How exactly is the content of this register then given back to the > firmware? Is that communication snoopable by the VMM? If I'm reading the code correctly, it's just a write to a register in a memory mapped region, possibly in multiple locations (different queues or something). Possibly indirect (see __ath11k_pcic_write32), but someone would have to trace it or know the HW better to understand which locations it's written to. But yeah, it seems totally feasible to just translate that back in the VMM. > > > Off hand I don't have a good solution for this, the hardware is > > > essentially imposing a unique requirement for MSI programming that the > > > driver needs visibility of the physical MSI address and data. > > > > > Strictly, the driver doesn't need visibility to the actual values used > by the hardware. Another way of looking at it would be to say that > the driver programs the MSI through this non-standard method, it just > needs the VMM to trap and handle that, just as the VMM does for the > standard MSI table. Indeed. Much better than having a module parameter. > Which is what I thought we'd already seen on some Atheros devices. It probably also affects ath12k, seems similar. johannes ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-14 12:37 ` Baochen Qiang 2024-01-14 14:36 ` Kalle Valo @ 2024-01-16 13:05 ` James Prestwood 2024-01-17 5:26 ` Baochen Qiang 1 sibling, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-16 13:05 UTC (permalink / raw) To: Baochen Qiang, Kalle Valo; +Cc: linux-wireless, ath11k Hi Baochen, On 1/14/24 4:37 AM, Baochen Qiang wrote: > > > On 1/12/2024 8:47 PM, James Prestwood wrote: >> Hi, >> >> On 1/11/24 6:04 PM, Baochen Qiang wrote: >>> >>> >>> On 1/11/2024 9:38 PM, James Prestwood wrote: >>>> >>>> On 1/11/24 5:11 AM, Kalle Valo wrote: >>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>> >>>>>> Hi Kalle, Baochen, >>>>>> >>>>>> On 1/11/24 12:16 AM, Kalle Valo wrote: >>>>>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>>>>> >>>>>>>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>>>>>>> Hi Kalle, >>>>>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>>>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>>>>>> >>>>>>>>>>>> But I have also no idea what is causing this, I guess we >>>>>>>>>>>> are doing >>>>>>>>>>>> something wrong with the PCI communication? That reminds >>>>>>>>>>>> me, you could >>>>>>>>>>>> try this in case that helps: >>>>>>>>>>>> >>>>>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>>>>>>>> >>>>>>>>>>> Heh, I saw this pop up a day after I sent this and was >>>>>>>>>>> wondering. Is >>>>>>>>>>> this something I'd need on the host kernel, guest, or both? >>>>>>>>>> On the guest where ath11k is running. I'm not optimistic that >>>>>>>>>> this would >>>>>>>>>> solve your issue, I suspect there can be also other bugs, but >>>>>>>>>> good to >>>>>>>>>> know if the patch changes anything. >>>>>>>>> Looks the same here, didn't seem to change anything based on the >>>>>>>>> kernel logs. >>>>>>>>> >>>>>>>> Could you try this? 
>>>>>>>> >>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >>>>>>>> >>>>>>> This reminds me, I assumed James was testing with ath.git master >>>>>>> branch >>>>>>> (which has that commit) but I never checked that. So for testing >>>>>>> please >>>>>>> always use the master branch to get the latest and greatest ath11k: >>>>>>> >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >>>>>>> >>>>>>> There's a quite long delay from ath.git to official releases. >>>>>> Good to know, and I was not in fact using that branch. Rebuilt from >>>>>> ath.git/master but still roughly the same behavior. There does >>>>>> appear >>>>>> to be more output now though, specifically a firmware crash: >>>>>> >>>>>> [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control >>>>>> response completion, polling.. >>>>>> [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left >>>>>> [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout >>>>>> [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: >>>>>> -110 >>>>>> [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 >>>>>> [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: >>>>>> MHI_CB_EE_RDDM >>>>>> [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags >>>>>> 0x4000 >>>>>> [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: >>>>>> MHI_CB_EE_RDDM >>>>>> [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags >>>>>> 0x4000 >>>>>> [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode >>>>>> request (mode 4): -110 >>>>>> [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode >>>>>> off: -110 >>>>> Ok, that's progress now. Can you try next try the iommu patch[1] we >>>>> talked about earlier? It's already in master-pending branch (along >>>>> with >>>>> other pending patches) so you can use that branch if you want. 
>>>>> >>>>> [1] >>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>> >>>> Same result unfortunately, tried both with just [1] applied to >>>> ath.git and at HEAD of master-pending. >>>> >>>> Thanks, >>>> >>>> James >>> Strange that still fails. Are you now seeing this error in your host >>> or your Qemu? or both? >>> Could you share your test steps? And if you can share please be as >>> detailed as possible since I'm not familiar with passing WLAN >>> hardware to a VM using vfio-pci. >> >> Just in Qemu, the hardware works fine on my host machine. >> >> I basically follow this guide to set it up, its written in the >> context of GPUs/libvirt but the host setup is exactly the same. By no >> means do you need to read it all, once you set the vfio-pci.ids and >> see your unclaimed adapter you can stop: >> >> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF >> >> In short you should be able to set the following host kernel options >> and reboot (assuming your motherboard/hardware is compatible): >> >> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 >> >> Obviously change the device/vendor IDs to whatever ath11k hw you >> have. Once the host is rebooted you should see your wlan adapter as >> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely >> your motherboard just isn't compatible, the device has to be in its >> own IOMMU group (you could try switching PCI ports if this is the case). >> >> I then build a "kvm_guest.config" kernel with the driver/firmware for >> ath11k and boot into that with the following Qemu options: >> >> -enable-kvm -device -vfio-pci,host=<PCI address> >> >> If it seems easier you could also utilize IWD's test-runner which >> handles launching the Qemu kernel automatically, detecting any >> vfio-devices and passes them through and mounts some useful host >> folders into the VM. 
Its actually a very good general purpose tool >> for kernel testing, not just for IWD: >> >> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt >> >> >> Once set up you can just run test-runner with a few flags and you'll >> boot into a shell: >> >> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash >> >> Please reach out if you have questions, thanks for looking into this. >> > Thanks for these details. I reproduced this issue by following your > guide. > > Seems the root cause is that the MSI vector assigned to WCN6855 in > qemu is different with that in host. In my case the MSI vector in qemu > is [Address: fee00000 Data: 0020] while in host it is [Address: > fee00578 Data: 0000]. So in qemu ath11k configures MSI vector > [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and > firmware uses that vector to fire interrupts to host/qemu. However > host IOMMU doesn't know that vector because the real vector is > [Address: fee00578 Data: 0000], as a result host blocks that > interrupt and reports an error, see below log: > > [ 1414.206069] DMAR: DRHD: handling fault status reg 2 > [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request > [ 1414.210334] DMAR: DRHD: handling fault status reg 2 > [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request > [ 1414.212496] DMAR: DRHD: handling fault status reg 2 > [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request > [ 1414.214600] DMAR: DRHD: handling fault status reg 2 > > While I don't think there is a way for qemu/ath11k to get the real MSI > vector from host, I will try to read the vfio code to check further. 
> Before that, to unblock you, a possible hack is to hard code the MSI > vector in qemu to the same as in host, on condition that the MSI > vector doesn't change. In my case, the change looks like > > diff --git a/drivers/net/wireless/ath/ath11k/pci.c > b/drivers/net/wireless/ath/ath11k/pci.c > index 09e65c5e55c4..89a9bbe9e4d2 100644 > --- a/drivers/net/wireless/ath/ath11k/pci.c > +++ b/drivers/net/wireless/ath/ath11k/pci.c > @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci > *ab_pci) > ab->pci.msi.addr_hi = 0; > } > > - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", > ab->pci.msi.ep_base_data); > + ab->pci.msi.addr_hi = 0; > + ab->pci.msi.addr_lo = 0xfee00578; > + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base > data is %d\n", > + ab->pci.msi.addr_hi, > + ab->pci.msi.addr_lo, > + ab->pci.msi.ep_base_data); > > return 0; > > @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct > ath11k_pci *ab_pci) > } > > ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data; > + ab_pci->ab->pci.msi.ep_base_data = 0; > > ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq > msi_ep_base_data %d\n", > ab_pci->ab->pci.msi.ep_base_data); > > > This hack works on my setup. Progress! Thank you. This didn't work for me but its likely because my host MSI vector is not fee00578. Where did you come up with this value? I don't see anything in the dmesg logs, or in lspci etc. Thanks, James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-16 13:05 ` James Prestwood @ 2024-01-17 5:26 ` Baochen Qiang 2024-01-17 13:20 ` James Prestwood 0 siblings, 1 reply; 51+ messages in thread From: Baochen Qiang @ 2024-01-17 5:26 UTC (permalink / raw) To: James Prestwood, Kalle Valo; +Cc: linux-wireless, ath11k On 1/16/2024 9:05 PM, James Prestwood wrote: > Hi Baochen, > > On 1/14/24 4:37 AM, Baochen Qiang wrote: >> >> >> On 1/12/2024 8:47 PM, James Prestwood wrote: >>> Hi, >>> >>> On 1/11/24 6:04 PM, Baochen Qiang wrote: >>>> >>>> >>>> On 1/11/2024 9:38 PM, James Prestwood wrote: >>>>> >>>>> On 1/11/24 5:11 AM, Kalle Valo wrote: >>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>> >>>>>>> Hi Kalle, Baochen, >>>>>>> >>>>>>> On 1/11/24 12:16 AM, Kalle Valo wrote: >>>>>>>> Baochen Qiang <quic_bqiang@quicinc.com> writes: >>>>>>>> >>>>>>>>> On 1/10/2024 10:55 PM, James Prestwood wrote: >>>>>>>>>> Hi Kalle, >>>>>>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote: >>>>>>>>>>> James Prestwood <prestwoj@gmail.com> writes: >>>>>>>>>>> >>>>>>>>>>>>> But I have also no idea what is causing this, I guess we >>>>>>>>>>>>> are doing >>>>>>>>>>>>> something wrong with the PCI communication? That reminds >>>>>>>>>>>>> me, you could >>>>>>>>>>>>> try this in case that helps: >>>>>>>>>>>>> >>>>>>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>>>>>>>>> Heh, I saw this pop up a day after I sent this and was >>>>>>>>>>>> wondering. Is >>>>>>>>>>>> this something I'd need on the host kernel, guest, or both? >>>>>>>>>>> On the guest where ath11k is running. I'm not optimistic that >>>>>>>>>>> this would >>>>>>>>>>> solve your issue, I suspect there can be also other bugs, but >>>>>>>>>>> good to >>>>>>>>>>> know if the patch changes anything. >>>>>>>>>> Looks the same here, didn't seem to change anything based on the >>>>>>>>>> kernel logs. >>>>>>>>>> >>>>>>>>> Could you try this? 
>>>>>>>>> >>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c >>>>>>>> This reminds me, I assumed James was testing with ath.git master >>>>>>>> branch >>>>>>>> (which has that commit) but I never checked that. So for testing >>>>>>>> please >>>>>>>> always use the master branch to get the latest and greatest ath11k: >>>>>>>> >>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/ >>>>>>>> >>>>>>>> There's a quite long delay from ath.git to official releases. >>>>>>> Good to know, and I was not in fact using that branch. Rebuilt from >>>>>>> ath.git/master but still roughly the same behavior. There does >>>>>>> appear >>>>>>> to be more output now though, specifically a firmware crash: >>>>>>> >>>>>>> [ 2.281721] ath11k_pci 0000:00:06.0: failed to receive control >>>>>>> response completion, polling.. >>>>>>> [ 2.282101] ip (65) used greatest stack depth: 12464 bytes left >>>>>>> [ 3.306039] ath11k_pci 0000:00:06.0: Service connect timeout >>>>>>> [ 3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: >>>>>>> -110 >>>>>>> [ 3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110 >>>>>>> [ 3.519637] ath11k_pci 0000:00:06.0: firmware crashed: >>>>>>> MHI_CB_EE_RDDM >>>>>>> [ 3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags >>>>>>> 0x4000 >>>>>>> [ 3.627087] ath11k_pci 0000:00:06.0: firmware crashed: >>>>>>> MHI_CB_EE_RDDM >>>>>>> [ 3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags >>>>>>> 0x4000 >>>>>>> [ 13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode >>>>>>> request (mode 4): -110 >>>>>>> [ 13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode >>>>>>> off: -110 >>>>>> Ok, that's progress now. Can you try next try the iommu patch[1] we >>>>>> talked about earlier? 
It's already in master-pending branch (along >>>>>> with >>>>>> other pending patches) so you can use that branch if you want. >>>>>> >>>>>> [1] >>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/ >>>>> >>>>> Same result unfortunately, tried both with just [1] applied to >>>>> ath.git and at HEAD of master-pending. >>>>> >>>>> Thanks, >>>>> >>>>> James >>>> Strange that still fails. Are you now seeing this error in your host >>>> or your Qemu? or both? >>>> Could you share your test steps? And if you can share please be as >>>> detailed as possible since I'm not familiar with passing WLAN >>>> hardware to a VM using vfio-pci. >>> >>> Just in Qemu, the hardware works fine on my host machine. >>> >>> I basically follow this guide to set it up, its written in the >>> context of GPUs/libvirt but the host setup is exactly the same. By no >>> means do you need to read it all, once you set the vfio-pci.ids and >>> see your unclaimed adapter you can stop: >>> >>> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF >>> >>> In short you should be able to set the following host kernel options >>> and reboot (assuming your motherboard/hardware is compatible): >>> >>> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103 >>> >>> Obviously change the device/vendor IDs to whatever ath11k hw you >>> have. Once the host is rebooted you should see your wlan adapter as >>> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely >>> your motherboard just isn't compatible, the device has to be in its >>> own IOMMU group (you could try switching PCI ports if this is the case). 
>>> >>> I then build a "kvm_guest.config" kernel with the driver/firmware for >>> ath11k and boot into that with the following Qemu options: >>> >>> -enable-kvm -device -vfio-pci,host=<PCI address> >>> >>> If it seems easier you could also utilize IWD's test-runner which >>> handles launching the Qemu kernel automatically, detecting any >>> vfio-devices and passes them through and mounts some useful host >>> folders into the VM. Its actually a very good general purpose tool >>> for kernel testing, not just for IWD: >>> >>> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt >>> >>> Once set up you can just run test-runner with a few flags and you'll >>> boot into a shell: >>> >>> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash >>> >>> Please reach out if you have questions, thanks for looking into this. >>> >> Thanks for these details. I reproduced this issue by following your >> guide. >> >> Seems the root cause is that the MSI vector assigned to WCN6855 in >> qemu is different with that in host. In my case the MSI vector in qemu >> is [Address: fee00000 Data: 0020] while in host it is [Address: >> fee00578 Data: 0000]. So in qemu ath11k configures MSI vector >> [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and >> firmware uses that vector to fire interrupts to host/qemu. 
However >> host IOMMU doesn't know that vector because the real vector is >> [Address: fee00578 Data: 0000], as a result host blocks that >> interrupt and reports an error, see below log: >> >> [ 1414.206069] DMAR: DRHD: handling fault status reg 2 >> [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request >> [ 1414.210334] DMAR: DRHD: handling fault status reg 2 >> [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request >> [ 1414.212496] DMAR: DRHD: handling fault status reg 2 >> [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index >> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request >> [ 1414.214600] DMAR: DRHD: handling fault status reg 2 >> >> While I don't think there is a way for qemu/ath11k to get the real MSI >> vector from host, I will try to read the vfio code to check further. >> Before that, to unblock you, a possible hack is to hard code the MSI >> vector in qemu to the same as in host, on condition that the MSI >> vector doesn't change. 
In my case, the change looks like >> >> diff --git a/drivers/net/wireless/ath/ath11k/pci.c >> b/drivers/net/wireless/ath/ath11k/pci.c >> index 09e65c5e55c4..89a9bbe9e4d2 100644 >> --- a/drivers/net/wireless/ath/ath11k/pci.c >> +++ b/drivers/net/wireless/ath/ath11k/pci.c >> @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci >> *ab_pci) >> ab->pci.msi.addr_hi = 0; >> } >> >> - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", >> ab->pci.msi.ep_base_data); >> + ab->pci.msi.addr_hi = 0; >> + ab->pci.msi.addr_lo = 0xfee00578; >> + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base >> data is %d\n", >> + ab->pci.msi.addr_hi, >> + ab->pci.msi.addr_lo, >> + ab->pci.msi.ep_base_data); >> >> return 0; >> >> @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct >> ath11k_pci *ab_pci) >> } >> >> ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data; >> + ab_pci->ab->pci.msi.ep_base_data = 0; >> >> ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq >> msi_ep_base_data %d\n", >> ab_pci->ab->pci.msi.ep_base_data); >> >> >> This hack works on my setup. > > Progress! Thank you. This didn't work for me but its likely because my > host MSI vector is not fee00578. Where did you come up with this value? It could, and most likely, be different from machine to machine. > I don't see anything in the dmesg logs, or in lspci etc. > fee00578 is the physical MSI vector so I got it using lspci in host, see ... Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit- Address: fee00578 Data: 0000 Masking: fffffffe Pending: 00000000 ... > Thanks, > > James > ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-17 5:26 ` Baochen Qiang @ 2024-01-17 13:20 ` James Prestwood 2024-01-17 13:43 ` Kalle Valo 2024-01-18 2:09 ` Baochen Qiang 0 siblings, 2 replies; 51+ messages in thread From: James Prestwood @ 2024-01-17 13:20 UTC (permalink / raw) To: Baochen Qiang, Kalle Valo; +Cc: linux-wireless, ath11k Hi Baochen, <snip> >>> >>> While I don't think there is a way for qemu/ath11k to get the real >>> MSI vector from host, I will try to read the vfio code to check >>> further. Before that, to unblock you, a possible hack is to hard >>> code the MSI vector in qemu to the same as in host, on condition >>> that the MSI vector doesn't change. In my case, the change looks like >>> >>> diff --git a/drivers/net/wireless/ath/ath11k/pci.c >>> b/drivers/net/wireless/ath/ath11k/pci.c >>> index 09e65c5e55c4..89a9bbe9e4d2 100644 >>> --- a/drivers/net/wireless/ath/ath11k/pci.c >>> +++ b/drivers/net/wireless/ath/ath11k/pci.c >>> @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct >>> ath11k_pci *ab_pci) >>> ab->pci.msi.addr_hi = 0; >>> } >>> >>> - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", >>> ab->pci.msi.ep_base_data); >>> + ab->pci.msi.addr_hi = 0; >>> + ab->pci.msi.addr_lo = 0xfee00578; >>> + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x >>> base data is %d\n", >>> + ab->pci.msi.addr_hi, >>> + ab->pci.msi.addr_lo, >>> + ab->pci.msi.ep_base_data); >>> >>> return 0; >>> >>> @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct >>> ath11k_pci *ab_pci) >>> } >>> >>> ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data; >>> + ab_pci->ab->pci.msi.ep_base_data = 0; >>> >>> ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq >>> msi_ep_base_data %d\n", >>> ab_pci->ab->pci.msi.ep_base_data); >>> >>> >>> This hack works on my setup. >> >> Progress! Thank you. This didn't work for me but its likely because >> my host MSI vector is not fee00578. Where did you come up with this >> value? 
> It could, and most likely, be different from machine to machine. > >> I don't see anything in the dmesg logs, or in lspci etc. >> > fee00578 is the physical MSI vector so I got it using lspci in host, see > ... > Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit- > Address: fee00578 Data: 0000 > Masking: fffffffe Pending: 00000000 > ... Mine looks like this: ... Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit- Address: 00000000 Data: 0000 Masking: 00000000 Pending: 00000000 ... I've adjusted the patch: diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c index 09e65c5e55..1cc7115582 100644 --- a/drivers/net/wireless/ath/ath11k/pci.c +++ b/drivers/net/wireless/ath/ath11k/pci.c @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci *ab_pci) ab->pci.msi.addr_hi = 0; } - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", ab->pci.msi.ep_base_data); + ab->pci.msi.addr_hi = 0; + ab->pci.msi.addr_lo = 0; + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base data is %d\n", + ab->pci.msi.addr_hi, + ab->pci.msi.addr_lo, + ab->pci.msi.ep_base_data); return 0; @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct ath11k_pci *ab_pci) } ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data; + ab_pci->ab->pci.msi.ep_base_data = 0; ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq msi_ep_base_data %d\n", ab_pci->ab->pci.msi.ep_base_data); But still getting the same errors: [ 3.563057] ath11k_pci 0000:00:06.0: Service connect timeout [ 3.565044] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 [ 3.567031] ath11k_pci 0000:00:06.0: failed to start core: -110 [ 3.777514] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM [ 3.777555] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 [ 3.885137] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM [ 3.885178] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 I know this isn't a proper fix, so if its 
something that needs more thought than just hard-coded values I understand. Thanks, James > >> Thanks, >> >> James >> ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-17 13:20 ` James Prestwood @ 2024-01-17 13:43 ` Kalle Valo 2024-01-17 14:25 ` James Prestwood 2024-01-18 2:09 ` Baochen Qiang 1 sibling, 1 reply; 51+ messages in thread From: Kalle Valo @ 2024-01-17 13:43 UTC (permalink / raw) To: James Prestwood; +Cc: Baochen Qiang, linux-wireless, ath11k James Prestwood <prestwoj@gmail.com> writes: >>> I don't see anything in the dmesg logs, or in lspci etc. >>> >> fee00578 is the physical MSI vector so I got it using lspci in host, see >> ... >> Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit- >> Address: fee00578 Data: 0000 >> Masking: fffffffe Pending: 00000000 >> ... > > Mine looks like this: > > ... > > Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit- > Address: 00000000 Data: 0000 > Masking: 00000000 Pending: 00000000 And that lspci output is from the host (not the guest)? -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-17 13:43 ` Kalle Valo @ 2024-01-17 14:25 ` James Prestwood 0 siblings, 0 replies; 51+ messages in thread From: James Prestwood @ 2024-01-17 14:25 UTC (permalink / raw) To: Kalle Valo; +Cc: Baochen Qiang, linux-wireless, ath11k Hi Kalle, On 1/17/24 5:43 AM, Kalle Valo wrote: > James Prestwood <prestwoj@gmail.com> writes: > >>>> I don't see anything in the dmesg logs, or in lspci etc. >>>> >>> fee00578 is the physical MSI vector so I got it using lspci in host, see >>> ... >>> Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit- >>> Address: fee00578 Data: 0000 >>> Masking: fffffffe Pending: 00000000 >>> ... >> Mine looks like this: >> >> ... >> >> Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit- >> Address: 00000000 Data: 0000 >> Masking: 00000000 Pending: 00000000 > And that lspci output is from the host (not the guest)? Yes, that is on the host. On the guest I have: Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit- Address: fee01004 Data: 0028 Masking: fffffffe Pending: 00000000 And I just tried those values, i.e: - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", ab->pci.msi.ep_base_data); + ab->pci.msi.addr_hi = 0; + ab->pci.msi.addr_lo = 0xfee01004; + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base data is %d\n", + ab->pci.msi.addr_hi, + ab->pci.msi.addr_lo, + ab->pci.msi.ep_base_data); Though wasn't sure for "Data"/ep_base_data since the kernel uses a u32 but lspci appears to show a 16 bit value. I tried: + ab_pci->ab->pci.msi.ep_base_data = 0; + ab_pci->ab->pci.msi.ep_base_data = 0x0028; + ab_pci->ab->pci.msi.ep_base_data = 0x0028000; (maybe thats entirely the wrong change) > ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-17 13:20 ` James Prestwood 2024-01-17 13:43 ` Kalle Valo @ 2024-01-18 2:09 ` Baochen Qiang 2024-01-19 17:52 ` James Prestwood 1 sibling, 1 reply; 51+ messages in thread From: Baochen Qiang @ 2024-01-18 2:09 UTC (permalink / raw) To: James Prestwood, Kalle Valo; +Cc: linux-wireless, ath11k On 1/17/2024 9:20 PM, James Prestwood wrote: > Hi Baochen, > > <snip> > >>>> >>>> While I don't think there is a way for qemu/ath11k to get the real >>>> MSI vector from host, I will try to read the vfio code to check >>>> further. Before that, to unblock you, a possible hack is to hard >>>> code the MSI vector in qemu to the same as in host, on condition >>>> that the MSI vector doesn't change. In my case, the change looks like >>>> >>>> diff --git a/drivers/net/wireless/ath/ath11k/pci.c >>>> b/drivers/net/wireless/ath/ath11k/pci.c >>>> index 09e65c5e55c4..89a9bbe9e4d2 100644 >>>> --- a/drivers/net/wireless/ath/ath11k/pci.c >>>> +++ b/drivers/net/wireless/ath/ath11k/pci.c >>>> @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct >>>> ath11k_pci *ab_pci) >>>> ab->pci.msi.addr_hi = 0; >>>> } >>>> >>>> - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", >>>> ab->pci.msi.ep_base_data); >>>> + ab->pci.msi.addr_hi = 0; >>>> + ab->pci.msi.addr_lo = 0xfee00578; >>>> + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x >>>> base data is %d\n", >>>> + ab->pci.msi.addr_hi, >>>> + ab->pci.msi.addr_lo, >>>> + ab->pci.msi.ep_base_data); >>>> >>>> return 0; >>>> >>>> @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct >>>> ath11k_pci *ab_pci) >>>> } >>>> >>>> ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data; >>>> + ab_pci->ab->pci.msi.ep_base_data = 0; >>>> >>>> ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq >>>> msi_ep_base_data %d\n", >>>> ab_pci->ab->pci.msi.ep_base_data); >>>> >>>> >>>> This hack works on my setup. >>> >>> Progress! Thank you. 
This didn't work for me but its likely because >>> my host MSI vector is not fee00578. Where did you come up with this >>> value? >> It could, and most likely, be different from machine to machine. >> >>> I don't see anything in the dmesg logs, or in lspci etc. >>> >> fee00578 is the physical MSI vector so I got it using lspci in host, see >> ... >> Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit- >> Address: fee00578 Data: 0000 >> Masking: fffffffe Pending: 00000000 >> ... > > Mine looks like this: > > ... > > Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit- > Address: 00000000 Data: 0000 > Masking: 00000000 Pending: 00000000 I guess you got this output when qemu/ath11k is not running, either qemu not started or ath11k module not loaded, right? Since MSI is not enabled in this output. Sorry I should have made it clear: step1: start qemu and load ath11k without changing anything, then you get those errors; step2: keep qemu running and ath11k there, don't unload it. Then check lspci output in host to get the MSI vector. step3: Back in qemu, hard code MSI vector to the one you get in step2, then rebuild and reload ath11k. > ... 
> > I've adjusted the patch: > > diff --git a/drivers/net/wireless/ath/ath11k/pci.c > b/drivers/net/wireless/ath/ath11k/pci.c > index 09e65c5e55..1cc7115582 100644 > --- a/drivers/net/wireless/ath/ath11k/pci.c > +++ b/drivers/net/wireless/ath/ath11k/pci.c > @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci > *ab_pci) > ab->pci.msi.addr_hi = 0; > } > > - ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", > ab->pci.msi.ep_base_data); > + ab->pci.msi.addr_hi = 0; > + ab->pci.msi.addr_lo = 0; > + ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base > data is %d\n", > + ab->pci.msi.addr_hi, > + ab->pci.msi.addr_lo, > + ab->pci.msi.ep_base_data); > > return 0; > > @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct > ath11k_pci *ab_pci) > } > > ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data; > + ab_pci->ab->pci.msi.ep_base_data = 0; > > ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq > msi_ep_base_data %d\n", > ab_pci->ab->pci.msi.ep_base_data); > > But still getting the same errors: > > [ 3.563057] ath11k_pci 0000:00:06.0: Service connect timeout > [ 3.565044] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110 > [ 3.567031] ath11k_pci 0000:00:06.0: failed to start core: -110 > [ 3.777514] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM > [ 3.777555] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 > [ 3.885137] ath11k_pci 0000:00:06.0: firmware crashed: MHI_CB_EE_RDDM > [ 3.885178] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000 > > I know this isn't a proper fix, so if its something that needs more > thought than just hard-coded values I understand. > > Thanks, > > James > >> >>> Thanks, >>> >>> James >>> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-18 2:09 ` Baochen Qiang @ 2024-01-19 17:52 ` James Prestwood 2024-01-19 17:57 ` Kalle Valo 0 siblings, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-19 17:52 UTC (permalink / raw) To: Baochen Qiang, Kalle Valo; +Cc: linux-wireless, ath11k Hi Baochen, I guess you got this output when qemu/ath11k is not running, either qemu not started or ath11k module not loaded, right? Since MSI is not enabled in this output. > > Sorry I should have made it clear: > step1: start qemu and load ath11k without changing anything, then you > get those errors; > step2: keep qemu running and ath11k there, don't unload it. Then check > lspci output in host to get the MSI vector. > step3: Back in qemu, hard code MSI vector to the one you get in step2, > then rebuild and reload ath11k. > Thank you for the clarification, this worked like a charm! Thanks, James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-19 17:52 ` James Prestwood @ 2024-01-19 17:57 ` Kalle Valo 2024-01-19 18:07 ` James Prestwood 2024-01-26 18:20 ` James Prestwood 0 siblings, 2 replies; 51+ messages in thread From: Kalle Valo @ 2024-01-19 17:57 UTC (permalink / raw) To: James Prestwood; +Cc: Baochen Qiang, linux-wireless, ath11k James Prestwood <prestwoj@gmail.com> writes: >> Sorry I should have made it clear: >> step1: start qemu and load ath11k without changing anything, then >> you get those errors; >> step2: keep qemu running and ath11k there, don't unload it. Then >> check lspci output in host to get the MSI vector. >> step3: Back in qemu, hard code MSI vector to the one you get in >> step2, then rebuild and reload ath11k. > > Thank you for the clarification, this worked like a charm! Very nice, huge thanks to Baochen for finding the workaround! James, let us know how it works for you in the long run. This would definitely help automating ath11k testing so I'm very interested about this. -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-19 17:57 ` Kalle Valo @ 2024-01-19 18:07 ` James Prestwood 2024-01-26 18:20 ` James Prestwood 1 sibling, 0 replies; 51+ messages in thread From: James Prestwood @ 2024-01-19 18:07 UTC (permalink / raw) To: Kalle Valo; +Cc: Baochen Qiang, linux-wireless, ath11k Hi Kalle, On 1/19/24 9:57 AM, Kalle Valo wrote: > James Prestwood <prestwoj@gmail.com> writes: > >>> Sorry I should have made it clear: >>> step1: start qemu and load ath11k without changing anything, then >>> you get those errors; >>> step2: keep qemu running and ath11k there, don't unload it. Then >>> check lspci output in host to get the MSI vector. >>> step3: Back in qemu, hard code MSI vector to the one you get in >>> step2, then rebuild and reload ath11k. >> Thank you for the clarification, this worked like a charm! > Very nice, huge thanks to Baochen for finding the workaround! James, let > us know how it works for you in the long run. This would definitely help > automating ath11k testing so I'm very interested about this. Yes thank you Baochen! You mean using PCI pass-through? Yes, its immensely useful for testing kernel changes. No swapping host kernels/reboots and you can build a very minimal kernel, mine takes ~10 min to build initially then iterative changes take ~10 seconds. We have also integrated this into IWD's autotest framework so you can, in theory, run tests you normally run using mac80211_hwsim but using real hardware (assuming you aren't needing to modify frames). I did at one point have a system set up with 4-5 wifi cards and could run python tests against them, bring up an AP, connect, etc. It was pretty neat. Thanks, James ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-19 17:57 ` Kalle Valo 2024-01-19 18:07 ` James Prestwood @ 2024-01-26 18:20 ` James Prestwood 2024-01-27 4:31 ` Baochen Qiang 1 sibling, 1 reply; 51+ messages in thread From: James Prestwood @ 2024-01-26 18:20 UTC (permalink / raw) To: Kalle Valo; +Cc: Baochen Qiang, linux-wireless, ath11k Hi Kalle/Baochen, On 1/19/24 9:57 AM, Kalle Valo wrote: > James Prestwood <prestwoj@gmail.com> writes: > >>> Sorry I should have made it clear: >>> step1: start qemu and load ath11k without changing anything, then >>> you get those errors; >>> step2: keep qemu running and ath11k there, don't unload it. Then >>> check lspci output in host to get the MSI vector. >>> step3: Back in qemu, hard code MSI vector to the one you get in >>> step2, then rebuild and reload ath11k. >> Thank you for the clarification, this worked like a charm! > Very nice, huge thanks to Baochen for finding the workaround! James, let > us know how it works for you in the long run. This would definitely help > automating ath11k testing so I'm very interested about this. It appears this hack isn't a reliable solution long term. It did survive several reboots but then I was back to the same behavior. Looked at the MSI vector and it had changed: It went from fee00558 to fee00538 But at least I can just modify this and get it working again. Thanks, James > ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: ath11k and vfio-pci support 2024-01-26 18:20 ` James Prestwood @ 2024-01-27 4:31 ` Baochen Qiang 0 siblings, 0 replies; 51+ messages in thread From: Baochen Qiang @ 2024-01-27 4:31 UTC (permalink / raw) To: James Prestwood, Kalle Valo; +Cc: linux-wireless, ath11k On 1/27/2024 2:20 AM, James Prestwood wrote: > Hi Kalle/Baochen, > > On 1/19/24 9:57 AM, Kalle Valo wrote: >> James Prestwood <prestwoj@gmail.com> writes: >> >>>> Sorry I should have made it clear: >>>> step1: start qemu and load ath11k without changing anything, then >>>> you get those errors; >>>> step2: keep qemu running and ath11k there, don't unload it. Then >>>> check lspci output in host to get the MSI vector. >>>> step3: Back in qemu, hard code MSI vector to the one you get in >>>> step2, then rebuild and reload ath11k. >>> Thank you for the clarification, this worked like a charm! >> Very nice, huge thanks to Baochen for finding the workaround! James, let >> us know how it works for you in the long run. This would definitely help >> automating ath11k testing so I'm very interested about this. > > It appears this hack isn't a reliable solution long term. It did survive > several reboots but then I was back to the same behavior. Looked at the > MSI vector and it had changed: > > It went from fee00558 to feee00538 > Yes, the MSI vector allocation is up to the kernel, and several factors are considered when doing this. Among them, the most important are the IRQ load balance and available vectors on a specific CPU. As a result, the vector changes, not only when you request it (as you already see), but also in run time. > But at least I can just modify this and get it working again. > > Thanks, > > James > >> ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization
  2024-01-08 13:17 ath11k and vfio-pci support James Prestwood
  2024-01-10  9:00 ` Kalle Valo
@ 2024-08-12 16:59 ` Alex Williamson
  2024-08-13 16:30   ` Jason Gunthorpe
  2024-08-12 17:00 ` [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless Alex Williamson
  2 siblings, 1 reply; 51+ messages in thread
From: Alex Williamson @ 2024-08-12 16:59 UTC (permalink / raw)
  To: kvm
  Cc: Alex Williamson, quic_bqiang, kvalo, prestwoj, linux-wireless,
	ath11k, dwmw2, iommu, jgg, kernel, johannes, jtornosm

vfio-pci has always virtualized the MSI address and data registers as
MSI programming is performed through the SET_IRQS ioctl. Often this
virtualization is not used, and in specific cases can be unhelpful.

One such case where the virtualization is a hindrance is when the
device contains an onboard interrupt controller programmed by the guest
driver. Userspace VMMs have a chance to quirk this programming,
injecting the host physical MSI information, but only if the userspace
driver can get access to the host physical address and data registers.

This introduces a device feature which allows the userspace driver to
disable virtualization of the MSI capability address and data registers
in order to provide read-only access to the physical values.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216055
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci_config.c | 26 ++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_core.c   | 21 +++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_priv.h   |  1 +
 include/uapi/linux/vfio.h          | 14 ++++++++++++++
 4 files changed, 62 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 97422aafaa7b..5f86e75ea6ca 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -1259,6 +1259,32 @@ static int vfio_msi_cap_len(struct vfio_pci_core_device *vdev, u8 pos)
 	return len;
 }
 
+/* Disable virtualization of the MSI address and data fields */
+int vfio_pci_msi_novirt(struct vfio_pci_core_device *vdev)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	struct perm_bits *perm = vdev->msi_perm;
+	u16 flags;
+	int ret;
+
+	if (!perm)
+		return -EINVAL;
+
+	ret = pci_read_config_word(pdev, pdev->msi_cap + PCI_MSI_FLAGS, &flags);
+	if (ret)
+		return pcibios_err_to_errno(ret);
+
+	p_setd(perm, PCI_MSI_ADDRESS_LO, NO_VIRT, NO_WRITE);
+	if (flags & PCI_MSI_FLAGS_64BIT) {
+		p_setd(perm, PCI_MSI_ADDRESS_HI, NO_VIRT, NO_WRITE);
+		p_setw(perm, PCI_MSI_DATA_64, (u16)NO_VIRT, (u16)NO_WRITE);
+	} else {
+		p_setw(perm, PCI_MSI_DATA_32, (u16)NO_VIRT, (u16)NO_WRITE);
+	}
+
+	return 0;
+}
+
 /* Determine extended capability length for VC (2 & 9) and MFVC */
 static int vfio_vc_cap_len(struct vfio_pci_core_device *vdev, u16 pos)
 {
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index ba0ce0075b2f..acdced212be2 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1518,6 +1518,24 @@ static int vfio_pci_core_feature_token(struct vfio_device *device, u32 flags,
 	return 0;
 }
 
+static int vfio_pci_core_feature_msi_novirt(struct vfio_device *device,
+					    u32 flags, void __user *arg,
+					    size_t argsz)
+{
+	struct vfio_pci_core_device *vdev =
+		container_of(device, struct vfio_pci_core_device, vdev);
+	int ret;
+
+	if (!vdev->msi_perm)
+		return -ENOTTY;
+
+	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET, 0);
+	if (ret != 1)
+		return ret;
+
+	return vfio_pci_msi_novirt(vdev);
+}
+
 int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 				void __user *arg, size_t argsz)
 {
@@ -1531,6 +1549,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
 		return vfio_pci_core_pm_exit(device, flags, arg, argsz);
 	case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
 		return vfio_pci_core_feature_token(device, flags, arg, argsz);
+	case VFIO_DEVICE_FEATURE_PCI_MSI_NOVIRT:
+		return vfio_pci_core_feature_msi_novirt(device, flags,
+							arg, argsz);
 	default:
 		return -ENOTTY;
 	}
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index 5e4fa69aee16..6e6cc74c6579 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -53,6 +53,7 @@ int vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset,
 int vfio_pci_init_perm_bits(void);
 void vfio_pci_uninit_perm_bits(void);
+int vfio_pci_msi_novirt(struct vfio_pci_core_device *vdev);
 int vfio_config_init(struct vfio_pci_core_device *vdev);
 void vfio_config_free(struct vfio_pci_core_device *vdev);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 2b68e6cdf190..ddf5dd9245fb 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1458,6 +1458,20 @@ struct vfio_device_feature_bus_master {
 };
 #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
 
+/**
+ * Toggle virtualization of PCI MSI address and data fields off. By default
+ * vfio-pci-core based drivers virtualize the MSI address and data fields of
+ * the MSI capability to emulate direct access to the device, ie. writes are
+ * allowed and buffered where subsequent reads return the buffered data.
+ * VMMs often virtualize these registers anyway and there are cases in user-
+ * space where having access to the host MSI fields can be useful, such as
+ * quirking an embedded interrupt controller on the device to generate physical
+ * MSI interrupts. Upon VFIO_DEVICE_FEATURE_SET of the PCI_MSI_NOVIRT feature
+ * this virtualization is disabled, reads of the MSI address and data fields
+ * will return the physical values and writes are dropped.
+ */
+#define VFIO_DEVICE_FEATURE_PCI_MSI_NOVIRT 11
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
-- 
2.45.2

^ permalink raw reply related	[flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-12 16:59 ` [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization Alex Williamson @ 2024-08-13 16:30 ` Jason Gunthorpe 2024-08-13 17:30 ` Thomas Gleixner 2024-08-13 21:14 ` Alex Williamson 0 siblings, 2 replies; 51+ messages in thread From: Jason Gunthorpe @ 2024-08-13 16:30 UTC (permalink / raw) To: Alex Williamson, Thomas Gleixner Cc: kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Mon, Aug 12, 2024 at 10:59:12AM -0600, Alex Williamson wrote: > vfio-pci has always virtualized the MSI address and data registers as > MSI programming is performed through the SET_IRQS ioctl. Often this > virtualization is not used, and in specific cases can be unhelpful. > > One such case where the virtualization is a hinderance is when the > device contains an onboard interrupt controller programmed by the guest > driver. Userspace VMMs have a chance to quirk this programming, > injecting the host physical MSI information, but only if the userspace > driver can get access to the host physical address and data registers. > > This introduces a device feature which allows the userspace driver to > disable virtualization of the MSI capability address and data registers > in order to provide read-only access the the physical values. Personally, I very much dislike this. Encouraging such hacky driver use of the interrupt subsystem is not a good direction. Enabling this in VMs will further complicate fixing the IRQ usages in these drivers over the long run. If the device has it's own interrupt sources then the device needs to create an irq_chip and related and hook them up properly. Not hackily read the MSI-X registers and write them someplace else. Thomas Gleixner has done alot of great work recently to clean this up. So if you imagine the driver is fixed, then this is not necessary. Howver, it will still not work in a VM. 
Making IMS and non-MSI interrupt controlers work within VMs is still something that needs to be done. Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-13 16:30 ` Jason Gunthorpe @ 2024-08-13 17:30 ` Thomas Gleixner 2024-08-13 23:39 ` Jason Gunthorpe 2024-12-13 9:10 ` David Woodhouse 2024-08-13 21:14 ` Alex Williamson 1 sibling, 2 replies; 51+ messages in thread From: Thomas Gleixner @ 2024-08-13 17:30 UTC (permalink / raw) To: Jason Gunthorpe, Alex Williamson Cc: kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Tue, Aug 13 2024 at 13:30, Jason Gunthorpe wrote: > On Mon, Aug 12, 2024 at 10:59:12AM -0600, Alex Williamson wrote: >> vfio-pci has always virtualized the MSI address and data registers as >> MSI programming is performed through the SET_IRQS ioctl. Often this >> virtualization is not used, and in specific cases can be unhelpful. >> >> One such case where the virtualization is a hinderance is when the >> device contains an onboard interrupt controller programmed by the guest >> driver. Userspace VMMs have a chance to quirk this programming, >> injecting the host physical MSI information, but only if the userspace >> driver can get access to the host physical address and data registers. >> >> This introduces a device feature which allows the userspace driver to >> disable virtualization of the MSI capability address and data registers >> in order to provide read-only access the the physical values. > > Personally, I very much dislike this. Encouraging such hacky driver > use of the interrupt subsystem is not a good direction. Enabling this > in VMs will further complicate fixing the IRQ usages in these drivers > over the long run. > > If the device has it's own interrupt sources then the device needs to > create an irq_chip and related and hook them up properly. Not hackily > read the MSI-X registers and write them someplace else. > > Thomas Gleixner has done alot of great work recently to clean this up. 
> > So if you imagine the driver is fixed, then this is not necessary. Yes. I looked at the at11k driver when I was reworking the PCI/MSI subsystem and that's a perfect candidate for a proper device specific interrupt domain to replace the horrible MSI hackery it has. > Howver, it will still not work in a VM. Making IMS and non-MSI > interrupt controlers work within VMs is still something that needs to > be done. Sure, but we really want to do that in a generic way and not based on ad hoc workarounds. Did the debate around this go anywhere? Thanks, tglx ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-13 17:30 ` Thomas Gleixner @ 2024-08-13 23:39 ` Jason Gunthorpe 2024-12-13 9:10 ` David Woodhouse 1 sibling, 0 replies; 51+ messages in thread From: Jason Gunthorpe @ 2024-08-13 23:39 UTC (permalink / raw) To: Thomas Gleixner Cc: Alex Williamson, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Tue, Aug 13, 2024 at 07:30:41PM +0200, Thomas Gleixner wrote: > > Howver, it will still not work in a VM. Making IMS and non-MSI > > interrupt controlers work within VMs is still something that needs to > > be done. > > Sure, but we really want to do that in a generic way and not based on ad > hoc workarounds. > > Did the debate around this go anywhere? No, it got stuck on the impossible situation that there is no existing way for the VM to have any idea if IMS will work or is broken. Recall Intel was planning to "solve" this by sticking a DVSEC in their virtual config space that said to turn off IMS :\ So using IMS in the real world looked impractical and interest faded a bit. But the underlying reasons for IMS haven't gone away and more work is coming that will bring it up again... Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-13 17:30 ` Thomas Gleixner 2024-08-13 23:39 ` Jason Gunthorpe @ 2024-12-13 9:10 ` David Woodhouse 2025-01-03 14:31 ` Jason Gunthorpe 1 sibling, 1 reply; 51+ messages in thread From: David Woodhouse @ 2024-12-13 9:10 UTC (permalink / raw) To: Thomas Gleixner, Jason Gunthorpe, Alex Williamson Cc: kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, iommu, kernel, johannes, jtornosm [-- Attachment #1: Type: text/plain, Size: 4266 bytes --] On Tue, 2024-08-13 at 19:30 +0200, Thomas Gleixner wrote: > On Tue, Aug 13 2024 at 13:30, Jason Gunthorpe wrote: > > On Mon, Aug 12, 2024 at 10:59:12AM -0600, Alex Williamson wrote: > > > vfio-pci has always virtualized the MSI address and data registers as > > > MSI programming is performed through the SET_IRQS ioctl. Often this > > > virtualization is not used, and in specific cases can be unhelpful. > > > > > > One such case where the virtualization is a hinderance is when the > > > device contains an onboard interrupt controller programmed by the guest > > > driver. Userspace VMMs have a chance to quirk this programming, > > > injecting the host physical MSI information, but only if the userspace > > > driver can get access to the host physical address and data registers. > > > > > > This introduces a device feature which allows the userspace driver to > > > disable virtualization of the MSI capability address and data registers > > > in order to provide read-only access the the physical values. > > > > Personally, I very much dislike this. Encouraging such hacky driver > > use of the interrupt subsystem is not a good direction. Enabling this > > in VMs will further complicate fixing the IRQ usages in these drivers > > over the long run. > > > > If the device has it's own interrupt sources then the device needs to > > create an irq_chip and related and hook them up properly. 
Not hackily read the MSI-X registers and write them someplace else. > > > > Thomas Gleixner has done a lot of great work recently to clean this up. > > > > So if you imagine the driver is fixed, then this is not necessary. > Yes. I looked at the ath11k driver when I was reworking the PCI/MSI > subsystem and that's a perfect candidate for a proper device specific > interrupt domain to replace the horrible MSI hackery it has. The ath11k hacks may be awful, but in their defence, that's because the whole way the hardware works is awful. Q: With PCI passthrough to a guest, how does the guest OS tell the device where to do DMA? A: The guest OS just hands the device a guest physical address and the IOMMU does the rest. Nothing 'intercedes' between the guest and the device to mess with that address. Q: MSIs are just DMA. So with PCI passthrough to a guest, how does the guest OS configure the device's MSIs? <fantasy> A: The guest OS just hands the device a standard MSI message encoding the target guest APIC ID and vector (etc.), and the IOMMU does the rest. Nothing 'intercedes' between the guest and the device to mess with that MSI message. And thus ath11k didn't need to do *any* hacks to work around a stupid hardware design with the VMM snooping on stuff it ideally shouldn't have had any business touching in the first place. Posted interrupts are almost the *default* because the IOMMU receives a <source-id, vCPU APIC ID, vector> tuple on the bus. If receiving an interrupt for a vCPU which isn't currently running, that's when the IOMMU sets a bit in a table somewhere and notifies the host OS. All that special case MSI handling and routing code that I had nightmares about because it fell through a wormhole from a parallel universe, doesn't exist. 
And look, DPDK drivers which run in polling mode and 'abuse' MSIs by using real memory addresses and asking the device to "write <these> 32 bits to <this> structure if you want attention" just work nicely in virtual machines too, just as they do on real hardware. </fantasy> /me wakes up... Shit. And we have to enable this Interrupt Remapping crap even to address more than 255 CPUs *without* virtualization? Even a *guest* has to see a virtual IOMMU and enable Interrupt Remapping to be able to use more than 255 vCPUs? Even though there were a metric shitload of spare bits in the MSI message we could have used¹. Wait, so that means we have to offer an IOMMU with *DMA* remapping to guests, which means 2-stage translations and/or massive overhead, just for that guest to be able to use >255 vCPUs? Screw you all, I'm going back to bed. ¹ And *should* use, if we ever do something similar like, say, expand the vector# space past 8 bits. Intel and AMD take note. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-12-13 9:10 ` David Woodhouse @ 2025-01-03 14:31 ` Jason Gunthorpe 2025-01-03 14:47 ` David Woodhouse 0 siblings, 1 reply; 51+ messages in thread From: Jason Gunthorpe @ 2025-01-03 14:31 UTC (permalink / raw) To: David Woodhouse Cc: Thomas Gleixner, Alex Williamson, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, iommu, kernel, johannes, jtornosm On Fri, Dec 13, 2024 at 09:10:30AM +0000, David Woodhouse wrote: > <fantasy> > A: The guest OS just hands the device a standard MSI message encoding > the target guest APIC ID and vector (etc.), and the IOMMU does the > rest. Nothing 'intercedes' between the guest and the device to mess > with that MSI message. > /me wakes up... Well, I share your dream at least. :\ Have the VMM shadow the virtual interrupt remapping tables and assign it to the physical remapping so that the physical addr/data pair doesn't change. Driving interrupt routing fully via the remapping HW and not via MSI interception. IIRC Alex had a patch series for qemu to capture and rewrite the ath non-standard MSI locations, so virtualization worries should not block moving ath to use the device-specific MSI.. Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2025-01-03 14:31 ` Jason Gunthorpe @ 2025-01-03 14:47 ` David Woodhouse 2025-01-03 15:19 ` Jason Gunthorpe 0 siblings, 1 reply; 51+ messages in thread From: David Woodhouse @ 2025-01-03 14:47 UTC (permalink / raw) To: Jason Gunthorpe Cc: Thomas Gleixner, Alex Williamson, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, iommu, kernel, johannes, jtornosm [-- Attachment #1: Type: text/plain, Size: 696 bytes --] On Fri, 2025-01-03 at 10:31 -0400, Jason Gunthorpe wrote: > > IIRC Alex had a patch series for qemu to capture and rewrite the ath > non-standard MSI locations, so virtualization worries should not block > moving ath to use the device-specific MSI.. In the absence of the IOMMU in our shared fantasy, this seems like the best approach. Even for passed-through PCI hardware, the config space is largely a fiction. As are the MSI(-X) vectors. So although it's a device-specific quirk, this seems like the best approach. Probably best for the hypervisor to have some way to *advertise* that it's handling this though, as guests also might want to run on hypervisors which don't. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2025-01-03 14:47 ` David Woodhouse @ 2025-01-03 15:19 ` Jason Gunthorpe 0 siblings, 0 replies; 51+ messages in thread From: Jason Gunthorpe @ 2025-01-03 15:19 UTC (permalink / raw) To: David Woodhouse Cc: Thomas Gleixner, Alex Williamson, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, iommu, kernel, johannes, jtornosm On Fri, Jan 03, 2025 at 02:47:11PM +0000, David Woodhouse wrote: > Probably best for the hypervisor to have some way to *advertise* that > it's handling this though, as guests also might want to run on > hypervisors which don't. If the hypervisor doesn't properly virtualize the device it shouldn't assign it to a VM to start with :\ Intel looked at the question of advertising clean interrupt remapping when trying to virtualize IMS and it didn't seem so great. Bare metal machines need to work, so any test they could think of adding would either fail on bare metal or fail on existing VMs. VMMs have taken the approach of not telling the guest they are in VMs and then also not implementing the bare metal HW behaviors with full fidelity. So we have no way to discover that the VMM is, in fact, emulating broken "hw". Thus we get this push that all kernels need to accommodate the worst VMM behaviors :( Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-13 16:30 ` Jason Gunthorpe 2024-08-13 17:30 ` Thomas Gleixner @ 2024-08-13 21:14 ` Alex Williamson 2024-08-13 23:16 ` Jason Gunthorpe 1 sibling, 1 reply; 51+ messages in thread From: Alex Williamson @ 2024-08-13 21:14 UTC (permalink / raw) To: Jason Gunthorpe Cc: Thomas Gleixner, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Tue, 13 Aug 2024 13:30:53 -0300 Jason Gunthorpe <jgg@ziepe.ca> wrote: > On Mon, Aug 12, 2024 at 10:59:12AM -0600, Alex Williamson wrote: > > vfio-pci has always virtualized the MSI address and data registers as > > MSI programming is performed through the SET_IRQS ioctl. Often this > > virtualization is not used, and in specific cases can be unhelpful. > > > > One such case where the virtualization is a hindrance is when the > > device contains an onboard interrupt controller programmed by the guest > > driver. Userspace VMMs have a chance to quirk this programming, > > injecting the host physical MSI information, but only if the userspace > > driver can get access to the host physical address and data registers. > > > > This introduces a device feature which allows the userspace driver to > > disable virtualization of the MSI capability address and data registers > > in order to provide read-only access to the physical values. > > Personally, I very much dislike this. Encouraging such hacky driver > use of the interrupt subsystem is not a good direction. Enabling this > in VMs will further complicate fixing the IRQ usages in these drivers > over the long run. Clearly these _guest_ drivers are doing this regardless of the interfaces provided by vfio, so I don't see how we're encouraging hacky driver behavior, especially when it comes to Windows guest drivers. > If the device has its own interrupt sources then the device needs to > create an irq_chip and related and hook them up properly. 
Not hackily read the MSI-X registers and write them someplace else. This is how the hardware works, regardless of whether the guest driver represents the hardware using an irq_chip. > Thomas Gleixner has done a lot of great work recently to clean this up. > > So if you imagine the driver is fixed, then this is not necessary. How so? Regardless of the guest driver structure, something is writing the MSI address and data values elsewhere in the device. AFAICT the only way to avoid needing to fixup those values is to give the guest ownership of the address space as you suggested in the other patch. That also seems to have a pile of issues though. > However, it will still not work in a VM. Making IMS and non-MSI > interrupt controllers work within VMs is still something that needs to > be done. Making it work in a VM is sort of the point here. Thanks, Alex ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-13 21:14 ` Alex Williamson @ 2024-08-13 23:16 ` Jason Gunthorpe 2024-08-14 14:55 ` Alex Williamson 0 siblings, 1 reply; 51+ messages in thread From: Jason Gunthorpe @ 2024-08-13 23:16 UTC (permalink / raw) To: Alex Williamson Cc: Thomas Gleixner, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Tue, Aug 13, 2024 at 03:14:01PM -0600, Alex Williamson wrote: > > Personally, I very much dislike this. Encouraging such hacky driver > > use of the interrupt subsystem is not a good direction. Enabling this > > in VMs will further complicate fixing the IRQ usages in these drivers > > over the long run. > > Clearly these _guest_ drivers are doing this regardless of the > interfaces provided by vfio, so I don't see how we're encouraging hacky > driver behavior, especially when it comes to Windows guest drivers. Because people will then say the Linux driver can't be fixed to properly use an irq_domain/etc as the only option that works in VMs will be the hacky copy from MSI-X approach :\ > > Thomas Gleixner has done a lot of great work recently to clean this up. > > > > So if you imagine the driver is fixed, then this is not necessary. > > How so? Because if the driver is properly using the new irq_domain/etc infrastructure to model its additional interrupt source then this patch won't make it work in the VM anyhow, so it is not necessary.. Your other patch would be the only short term answer. Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-13 23:16 ` Jason Gunthorpe @ 2024-08-14 14:55 ` Alex Williamson 2024-08-14 15:20 ` Jason Gunthorpe 0 siblings, 1 reply; 51+ messages in thread From: Alex Williamson @ 2024-08-14 14:55 UTC (permalink / raw) To: Jason Gunthorpe Cc: Thomas Gleixner, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Tue, 13 Aug 2024 20:16:42 -0300 Jason Gunthorpe <jgg@ziepe.ca> wrote: > On Tue, Aug 13, 2024 at 03:14:01PM -0600, Alex Williamson wrote: > > > > Personally, I very much dislike this. Encouraging such hacky driver > > > use of the interrupt subsystem is not a good direction. Enabling this > > > in VMs will further complicate fixing the IRQ usages in these drivers > > > over the long run. > > > > Clearly these _guest_ drivers are doing this regardless of the > > interfaces provided by vfio, so I don't see how we're encouraging hacky > > driver behavior, especially when it comes to Windows guest drivers. > > Because people will then say the Linux driver can't be fixed to > properly use an irq_domain/etc as the only option that works in VMs > will be the hacky copy from MSI-X approach :\ Ironically QEMU already has direct access to the MSI-X vector table in MMIO space and could implement this type of quirk with no kernel changes. It's MSI that is now blocked by virtualization of the address and data registers. Note also that QEMU is still virtualizing these registers, the values seen in the guest are unchanged. It's only the VMM that can bypass that virtualization to see the host values. Let's imagine the guest driver does change to implement an irq_domain. How does that fundamentally change the problem for the VMM that guest MSI values are being written to other portions of the device? 
The guest driver can have whatever architecture it wants (we don't know the architecture of the Windows driver) but we still need to trap writes of the guest MSI address/data and replace it with host values. > > > Thomas Gleixner has done a lot of great work recently to clean this up. > > > > > > So if you imagine the driver is fixed, then this is not necessary. > > > > How so? > > Because if the driver is properly using the new irq_domain/etc > infrastructure to model its additional interrupt source then this > patch won't make it work in the VM anyhow, so it is not necessary.. > > Your other patch would be the only short term answer. The QEMU patch relies on this kernel patch in order to be able to access the host physical MSI address and data values through the vfio interface. Otherwise QEMU has no host values with which to patch-up guest values. As noted above, this does not provide any visible change to a QEMU guest, it only enables QEMU to implement the quirk in the other patch. Thanks, Alex ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization 2024-08-14 14:55 ` Alex Williamson @ 2024-08-14 15:20 ` Jason Gunthorpe 0 siblings, 0 replies; 51+ messages in thread From: Jason Gunthorpe @ 2024-08-14 15:20 UTC (permalink / raw) To: Alex Williamson Cc: Thomas Gleixner, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Wed, Aug 14, 2024 at 08:55:05AM -0600, Alex Williamson wrote: > Let's imagine the guest driver does change to implement an irq_domain. > How does that fundamentally change the problem for the VMM that guest > MSI values are being written to other portions of the device? If changed to irq_domain the VM will write addr/data pairs into those special registers that are unique to that interrupt source and will not re-use values already set in the MSI table. This means the VMM doesn't get any value from inspecting the MSI table because the value it needs won't be there, and also that no interrupt routing will have been setup. The VMM must call VFIO_DEVICE_SET_IRQS to setup the unique routing. These two patches are avoiding VFIO_DEVICE_SET_IRQS based on the assumption that the VM will re-use an addr/data pair already setup in the MSI table. Invalidating that assumption is the fundamental change irq_domain in the VM will make. > The guest driver can have whatever architecture it wants (we don't > know the architecture of the Windows driver) but we still need to > trap writes of the guest MSI address/data and replace it with host > values. Yes you do. But the wrinkle is you can't just assume one of the existing MSI entries is a valid replacement and copy from the MSI table. That works right now only because the Linux/Windows driver is re-using an MSI vector in the IMS registers. 
I suggest the general path is something like: 1) A vfio variant driver sets up an irq_domain for the additional interrupt source registers 2) Somehow wire up VFIO_DEVICE_SET_IRQS so it can target vectors in the additional interrupt domain 3) Have the VMM trap writes to the extra interrupt source registers and execute VFIO_DEVICE_SET_IRQS 4) IRQ layer will setup an appropriate unique IRQ and route it to the guest/whatever just like MSI. Callbacks into the variant driver's irq_domain will program the HW registers. Basically exactly the same flow as MSI, except instead of targeting a vector in the PCI core's MSI irq_domain it targets a vector in the variant driver's IMS IRQ domain. Then we don't make any assumptions about how the VM is using these interrupt vectors, and crucially, SET_IRQs is called for every interrupt source and we rely on the kernel to produce the correct addr/data pair. No need for copying addr/data pairs from MSI tables. > As noted above, this does not provide any visible change to a QEMU > guest, it only enables QEMU to implement the quirk in the other > patch. I see, I definitely didn't understand that it only reaches qemu from the commit message.. Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless 2024-01-08 13:17 ath11k and vfio-pci support James Prestwood 2024-01-10 9:00 ` Kalle Valo 2024-08-12 16:59 ` [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization Alex Williamson @ 2024-08-12 17:00 ` Alex Williamson 2024-08-13 16:43 ` Jason Gunthorpe 2 siblings, 1 reply; 51+ messages in thread From: Alex Williamson @ 2024-08-12 17:00 UTC (permalink / raw) To: qemu-devel, kvm Cc: Alex Williamson, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, jgg, kernel, johannes, jtornosm These devices have an embedded interrupt controller which is programmed with guest physical MSI address/data, which doesn't work. We need vfio-pci kernel support to provide a device feature which disables virtualization of the MSI capability registers. Then we can do brute force testing for writes matching the MSI address, from which we can infer writes of the MSI data, replacing each with host physical values. This has only been tested on ath11k (0x1103), ath12k support is speculative and requires testing. Note that Windows guest drivers make use of multi-vector MSI which requires interrupt remapping support in the host. NB. The #define for the new vfio feature is temporary for RFC/RFT, a formal proposal will include a proper linux-headers update. 
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216055 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- hw/vfio/pci-quirks.c | 236 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/trace-events | 3 + 2 files changed, 239 insertions(+) diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c index 39dae72497e0..5ba37bee3b36 100644 --- a/hw/vfio/pci-quirks.c +++ b/hw/vfio/pci-quirks.c @@ -23,6 +23,7 @@ #include "qapi/visitor.h" #include <sys/ioctl.h> #include "hw/nvram/fw_cfg.h" +#include "hw/pci/msi.h" #include "hw/qdev-properties.h" #include "pci.h" #include "trace.h" @@ -1159,6 +1160,240 @@ static void vfio_probe_rtl8168_bar2_quirk(VFIOPCIDevice *vdev, int nr) trace_vfio_quirk_rtl8168_probe(vdev->vbasedev.name); } +#define PCI_VENDOR_ID_QCOM 0x17cb + +/* + * ath11k wireless adapters do not support INTx and appear to have an interrupt + * controller embedded into the hardware. By default the interrupt controller + * is programmed with the MSI guest physical address, which doesn't work. + * Instead we need to trap and insert the host physical address and data. + * + * By default vfio-pci virtualizes the MSI address and data registers, providing + * a writable buffer, where reads simply return the buffer. QEMU writes + * everything through to hardware, so this only holds the guest MSI address. + * Therefore we first need to invoke a device feature that disables this + * emulation of the MSI address and data registers, allowing access to the host + * physical address and data. + * + * Next, where do these values get written? 
If we disable mmap support and + * trace accesses, we get this: + * + * # grep -A2 region0.*0xfee00000 typescript + * vfio_region_write (0000:01:00.0:region0+0x80048, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8004c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x80050, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x83048, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8304c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x83050, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x830a0, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x830a4, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x830a8, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x85048, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8504c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x85050, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x850a0, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x850a4, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x850a8, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x86048, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8604c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x86050, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x8b048, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8b04c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x8b050, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x8b0a0, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8b0a4, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x8b0a8, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0x8e048, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0x8e04c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0x8e050, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0xb4968, 0xfee00000, 4) + * 
vfio_region_write (0000:01:00.0:region0+0xb496c, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0xb4970, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0xb4a70, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0xb4a74, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0xb4a78, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0xb849c, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0xb84a0, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0xb84a4, 0x21, 4) + * -- + * vfio_region_write (0000:01:00.0:region0+0xb85a4, 0xfee00000, 4) + * vfio_region_write (0000:01:00.0:region0+0xb85a8, 0x100, 4) + * vfio_region_write (0000:01:00.0:region0+0xb85ac, 0x21, 4) + * + * We can find in this example trace that the MSI capability is programmed as: + * + * vfio_pci_write_config (0000:01:00.0, @0x54, 0xfee00000, len=0x4) + * vfio_pci_write_config (0000:01:00.0, @0x58, 0x21, len=0x2) + * + * This is a 32-bit MSI capability based at 0x50, so the MSI address is + * 0xfee00000 with data 0x21. So we see writes of the MSI address, followed + * 8-bytes later by what appears to be the MSI data. There's no obvious + * pattern in the device address where these are being written. + * + * We therefore come up with a really crude quirk that looks for values + * written to the ATH11K_PCI_WINDOW (defined in Linux driver as starting at + * 0x80000 with an 18-bit mask, ie. 256k) that match the guest MSI address. + * When found we replace the data with the host physical address and set a + * cookie to match the MSI data write, again replacing with the host value and + * clearing the cookie. + * + * Amazingly, this seems sufficient to work, and the trapped window only seems + * to be used during initialization, so this should introduce minimal overhead. 
+ * + * The Windows driver makes use of multi-vector MSI, where our sanity test + * of the MSI data value must then mask off the vector offset for comparison + * and add it back to the host base data value on write. + * + * Only 4- and 8-byte accesses are observed to the PCI window, and MSI access + * are only observed with 4-byte width. + */ + +// Temporary, include updated Linux headers +#define VFIO_DEVICE_FEATURE_PCI_MSI_NOVIRT 11 + +typedef struct VFIOQAthQuirk { + VFIOPCIDevice *vdev; + uint32_t pci_window_base; + uint32_t msi_data_addr; +} VFIOQAthQuirk; + +static uint64_t vfio_ath_quirk_read(void *opaque, hwaddr addr, unsigned size) +{ + VFIOQAthQuirk *ath = opaque; + VFIOPCIDevice *vdev = ath->vdev; + + return vfio_region_read(&vdev->bars[0].region, + addr + ath->pci_window_base, size); +} + +static void vfio_ath_quirk_write(void *opaque, hwaddr addr, + uint64_t data, unsigned size) +{ + VFIOQAthQuirk *ath = opaque; + VFIOPCIDevice *vdev = ath->vdev; + + if (size == 4 && msi_enabled(&vdev->pdev)) { + uint32_t phys = 0; + + if (data == pci_get_long(vdev->pdev.config + + vdev->pdev.msi_cap + PCI_MSI_ADDRESS_LO)) { + pread(vdev->vbasedev.fd, &phys, 4, vdev->config_offset + + vdev->pdev.msi_cap + PCI_MSI_ADDRESS_LO); + trace_vfio_quirk_ath_write_address(vdev->vbasedev.name, + addr + ath->pci_window_base, + data, phys); + data = phys; + ath->msi_data_addr = addr + 8; + } else if (ath->msi_data_addr && ath->msi_data_addr == addr) { + uint32_t mask = msi_nr_vectors_allocated(&vdev->pdev) - 1; + uint32_t nr = data & mask; + + if ((data & ~mask) == pci_get_word(vdev->pdev.config + + vdev->pdev.msi_cap + + PCI_MSI_DATA_32)) { + pread(vdev->vbasedev.fd, &phys, 2, vdev->config_offset + + vdev->pdev.msi_cap + PCI_MSI_DATA_32); + phys += nr; + trace_vfio_quirk_ath_write_data(vdev->vbasedev.name, + addr + ath->pci_window_base, + data, phys); + data = phys; + } + ath->msi_data_addr = 0; + } + } + + vfio_region_write(&vdev->bars[0].region, addr + ath->pci_window_base, + 
data, size); +} + +static const MemoryRegionOps vfio_ath_quirk = { + .read = vfio_ath_quirk_read, + .write = vfio_ath_quirk_write, + .valid = { + .min_access_size = 4, + .max_access_size = 8, + .unaligned = false, + }, + .endianness = DEVICE_LITTLE_ENDIAN, +}; + +static bool vfio_pci_msi_novirt(VFIOPCIDevice *vdev) +{ + uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature), + sizeof(uint64_t))] = {}; + struct vfio_device_feature *feature = (struct vfio_device_feature *)buf; + + feature->argsz = sizeof(buf); + feature->flags = VFIO_DEVICE_FEATURE_SET | + VFIO_DEVICE_FEATURE_PCI_MSI_NOVIRT; + + return !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feature); +} + +static void vfio_probe_ath_bar0_quirk(VFIOPCIDevice *vdev, int nr) +{ + VFIOQuirk *quirk; + VFIOQAthQuirk *ath; + uint32_t pci_window_base, pci_window_size; + + if (nr != 0 || + vdev->vendor_id != PCI_VENDOR_ID_QCOM || !msi_present(&vdev->pdev)) { + return; + } + + switch (vdev->device_id) { + case 0x1101: /* Untested */ + case 0x1103: + case 0x1104: /* Untested */ + /* Devices claimed by ath11k_pci_id_table in Linux driver as of v6.10 */ + pci_window_base = 0x80000; /* ATH11K_PCI_WINDOW_START */ + pci_window_size = 0x40000; /* ATH11K_PCI_WINDOW_RANGE_MASK (256k)*/ + break; + case 0x1107: /* Untested */ + case 0x1109: /* Untested */ + /* Devices claimed by ath12k_pci_id_table in Linux driver as of v6.10 */ + pci_window_base = 0x1e00000; /* PCI_BAR_WINDOW0_BASE */ + pci_window_size = 0x80000; /* ~PCI_BAR_WINDOW0_END (512k) */ + break; + default: + return; + } + + if (!vfio_pci_msi_novirt(vdev)) { + warn_report("Found device matching Atheros wireless quirk, but host " + "does not support vfio device feature required for quirk. " + "Device is known not to work with device assignment " + "without this quirk. 
Please update host kernel."); + return; + } + + quirk = vfio_quirk_alloc(1); + quirk->data = ath = g_malloc0(sizeof(*ath)); + ath->vdev = vdev; + ath->pci_window_base = pci_window_base; + + memory_region_init_io(&quirk->mem[0], OBJECT(vdev), &vfio_ath_quirk, + ath, "vfio-ath-quirk", pci_window_size); + memory_region_add_subregion_overlap(vdev->bars[nr].region.mem, + pci_window_base, &quirk->mem[0], 1); + + QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next); + + trace_vfio_quirk_ath_bar0_probe(vdev->vbasedev.name); +} + #define IGD_ASLS 0xfc /* ASL Storage Register */ /* @@ -1261,6 +1496,7 @@ void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr) #ifdef CONFIG_VFIO_IGD vfio_probe_igd_bar4_quirk(vdev, nr); #endif + vfio_probe_ath_bar0_quirk(vdev, nr); } void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr) diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 98bd4dcceadc..3d1154785157 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -82,6 +82,9 @@ vfio_ioeventfd_exit(const char *name, uint64_t addr, unsigned size, uint64_t dat vfio_ioeventfd_handler(const char *name, uint64_t addr, unsigned size, uint64_t data) "%s+0x%"PRIx64"[%d] -> 0x%"PRIx64 vfio_ioeventfd_init(const char *name, uint64_t addr, unsigned size, uint64_t data, bool vfio) "%s+0x%"PRIx64"[%d]:0x%"PRIx64" vfio:%d" vfio_pci_igd_opregion_enabled(const char *name) "%s" +vfio_quirk_ath_bar0_probe(const char *name) "%s" +vfio_quirk_ath_write_address(const char *name, uint64_t addr, uint64_t data, uint32_t phys) "%s[0x%"PRIx64"]: Replacing MSI address 0x%"PRIx64" with value 0x%x" +vfio_quirk_ath_write_data(const char *name, uint64_t addr, uint64_t data, uint32_t phys) "%s[0x%"PRIx64"]: Replacing MSI data 0x%"PRIx64" with value 0x%x" # igd.c vfio_pci_igd_bar4_write(const char *name, uint32_t index, uint32_t data, uint32_t base) "%s [0x%03x] 0x%08x -> 0x%08x" -- 2.45.2 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless 2024-08-12 17:00 ` [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless Alex Williamson @ 2024-08-13 16:43 ` Jason Gunthorpe 2024-08-13 21:03 ` Alex Williamson 0 siblings, 1 reply; 51+ messages in thread From: Jason Gunthorpe @ 2024-08-13 16:43 UTC (permalink / raw) To: Alex Williamson Cc: qemu-devel, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm On Mon, Aug 12, 2024 at 11:00:40AM -0600, Alex Williamson wrote: > These devices have an embedded interrupt controller which is programmed > with guest physical MSI address/data, which doesn't work. We need > vfio-pci kernel support to provide a device feature which disables > virtualization of the MSI capability registers. Then we can do brute > force testing for writes matching the MSI address, from which we can > infer writes of the MSI data, replacing each with host physical values. > > This has only been tested on ath11k (0x1103), ath12k support is > speculative and requires testing. Note that Windows guest drivers make > use of multi-vector MSI which requires interrupt remapping support in > the host. The way it is really supposed to work is that the guest itself controls/knows the MSI addr/data pairs and the interrupt remapping HW makes that delegation safe since all the interrupt processing will be qualified by the RID. Then the guest can make up the unique interrupts for MSI and any internal "IMS" sources and we just let the guest directly write the MSI/MSI-X and any IMS values however it wants. This hackery to capture and substitute the IMS programming is neat and will solve this one device, but there are more IMS style devices in the pipeline that will really need a full solution. > + * The Windows driver makes use of multi-vector MSI, where our sanity test > + * of the MSI data value must then mask off the vector offset for comparison > + * and add it back to the host base data value on write. 
But is that really enough? If the vector offset is newly created then that means the VM built a new interrupt that needs setup to be routed into the VM?? Is that why you say it "requires interrupt remapping support" because that setup is happening implicitly on x86? It looks like Windows is acting as I said Linux should, with a "irq_chip" and so on to get the unique interrupt source a proper unique addr/data pair... Jason ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless 2024-08-13 16:43 ` Jason Gunthorpe @ 2024-08-13 21:03 ` Alex Williamson 2024-08-13 23:37 ` Jason Gunthorpe 0 siblings, 1 reply; 51+ messages in thread
From: Alex Williamson @ 2024-08-13 21:03 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: qemu-devel, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm

On Tue, 13 Aug 2024 13:43:41 -0300 Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Mon, Aug 12, 2024 at 11:00:40AM -0600, Alex Williamson wrote:
> > These devices have an embedded interrupt controller which is programmed
> > with guest physical MSI address/data, which doesn't work. We need
> > vfio-pci kernel support to provide a device feature which disables
> > virtualization of the MSI capability registers. Then we can do brute
> > force testing for writes matching the MSI address, from which we can
> > infer writes of the MSI data, replacing each with host physical values.
> >
> > This has only been tested on ath11k (0x1103), ath12k support is
> > speculative and requires testing. Note that Windows guest drivers make
> > use of multi-vector MSI which requires interrupt remapping support in
> > the host.
>
> The way it is really supposed to work, is that the guest itself
> controls/knows the MSI addr/data pairs and the interrupt remapping HW
> makes that delegation safe since all the interrupt processing will be
> qualified by the RID.
>
> Then the guest can make up the unique interrupts for MSI and any
> internal "IMS" sources and we just let the guest directly write the
> MSI/MSI-X and any IMS values however it wants.
>
> This hackery to capture and substitute the IMS programming is neat and
> will solve this one device, but there are more IMS-style devices in
> the pipeline that will really need a full solution.

How does the guest know to write a remappable vector format? How does the guest know the host interrupt architecture?
For example, why would an aarch64 guest program an MSI vector of 0xfee... if the host is x86?

The idea of the guest owning the physical MSI address space sounds great, but is it practical? Is it something that would be accomplished while this device is still relevant?

> > + * The Windows driver makes use of multi-vector MSI, where our sanity test
> > + * of the MSI data value must then mask off the vector offset for comparison
> > + * and add it back to the host base data value on write.
>
> But is that really enough? If the vector offset is newly created then
> that means the VM built a new interrupt that needs setup to be routed
> into the VM?? Is that why you say it "requires interrupt remapping
> support" because that setup is happening implicitly on x86?
>
> It looks like Windows is acting as I said Linux should, with a
> "irq_chip" and so on to get the unique interrupt source a proper
> unique addr/data pair...

The Windows driver is just programming the MSI capability to use 16 vectors. We configure those vectors on the host at the time the capability is written. Whereas the Linux driver is only using a single vector and therefore writing the same MSI address and data at the locations noted in the trace, the Windows driver is writing different data values at different locations to make use of those vectors. This note is simply describing that we can't directly write the physical data value into the device; we need to determine which vector offset the guest is using and provide the same offset from the host data register value.

I don't know that interrupt remapping is specifically required, but the MSI domain needs to support MSI_FLAG_MULTI_PCI_MSI, and AFAIK that's only available with interrupt remapping on x86, i.e. pci_alloc_irq_vectors() with max_vecs > 1 and PCI_IRQ_MSI flags needs to work on the host to mirror the guest MSI configuration. Thanks,

Alex

^ permalink raw reply [flat|nested] 51+ messages in thread
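The offset fixup Alex describes above (mask the vector offset out of the guest's MSI data value for comparison, then add it back onto the host's base data value on write) reduces to simple arithmetic. The following is a minimal, hedged Python model of that translation, not the actual QEMU quirk code; the function name and all values are invented for illustration, and a power-of-two vector count is assumed.

```python
def translate_msi_data(guest_write, guest_base, host_base, nvec):
    """Map a guest-programmed MSI data value onto the host's vector block.

    With multi-vector MSI the low bits of the data value select the
    vector, so the guest's chosen offset must be preserved when
    substituting the host's base data value.  Raises if the write
    doesn't fall inside the guest's allocated block of `nvec` vectors.
    """
    offset = guest_write - guest_base        # vector offset chosen by guest
    if not 0 <= offset < nvec:
        raise ValueError("write does not match the guest MSI data block")
    return host_base + offset                # same vector, host base value

# Example: the guest programs 16 vectors with base data 0x40 while the
# host allocated its block at base data 0x25.  A guest write of 0x43
# (vector 3) must reach the device as 0x28.
```

A single-vector Linux guest is the degenerate case: `nvec` is 1 and the only matching write is the base value itself.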
* Re: [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless 2024-08-13 21:03 ` Alex Williamson @ 2024-08-13 23:37 ` Jason Gunthorpe 2024-08-15 16:59 ` Alex Williamson 0 siblings, 1 reply; 51+ messages in thread
From: Jason Gunthorpe @ 2024-08-13 23:37 UTC (permalink / raw)
To: Alex Williamson
Cc: qemu-devel, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm

On Tue, Aug 13, 2024 at 03:03:20PM -0600, Alex Williamson wrote:
> How does the guest know to write a remappable vector format? How does
> the guest know the host interrupt architecture? For example why would
> an aarch64 guest program an MSI vector of 0xfee... if the host is x86?

All excellent questions.

Emulating real interrupt controllers in the VM is probably impossible in every scenario. But certainly x86 emulating x86 and ARM emulating ARM would be usefully achievable.

hyperv did a neat thing where their remapping driver seems to make VMM traps, and it looks kind of like the VMM gives it the platform-specific addr/data pair.

It is a big ugly problem for sure, and we definitely have painted ourselves into a corner where the OS has no idea if IMS techniques work properly or are broken. :( :(

But I think there may not be a terribly impossible path where at least the guest could be offered a, say, virtio-irq in addition to the existing platform controllers that would process IMS for it.

> The idea of guest owning the physical MSI address space sounds great,
> but is it practical?

In many cases yes, it is, but more importantly it is the only sane way to support these IMS-like techniques broadly, since IMS is by definition not generally trappable.

> Is it something that would be accomplished while
> this device is still relevant?

I don't know, I fear not. But it keeps coming up. Too many things don't work right with the trapping approach, including this.

> The Windows driver is just programming the MSI capability to use 16
> vectors.
> We configure those vectors on the host at the time the
> capability is written. Whereas the Linux driver is only using a single
> vector and therefore writing the same MSI address and data at the
> locations noted in the trace, the Windows driver is writing different
> data values at different locations to make use of those vectors. This
> note is simply describing that we can't directly write the physical
> data value into the device, we need to determine which vector offset
> the guest is using and provide the same offset from the host data
> register value.

I see, it seems to be assuming also that these extra interrupt sources are generating the same MSI message as the main MSI, not something else. That is more a SW quirk of Windows, I expect. I don't think Linux would do that...

This is probably the only way to approach this: trap and emulate the places in the device that program additional interrupt sources and do a full MSI-like flow to set them up in the kernel.

Jason

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless 2024-08-13 23:37 ` Jason Gunthorpe @ 2024-08-15 16:59 ` Alex Williamson 2024-08-15 17:19 ` Jason Gunthorpe 0 siblings, 1 reply; 51+ messages in thread
From: Alex Williamson @ 2024-08-15 16:59 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: qemu-devel, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm

On Tue, 13 Aug 2024 20:37:24 -0300 Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Tue, Aug 13, 2024 at 03:03:20PM -0600, Alex Williamson wrote:
>
> > How does the guest know to write a remappable vector format? How does
> > the guest know the host interrupt architecture? For example why would
> > an aarch64 guest program an MSI vector of 0xfee... if the host is x86?
>
> All excellent questions.
>
> Emulating real interrupt controllers in the VM is probably impossible
> in every scenario. But certainly x86 emulating x86 and ARM emulating
> ARM would be usefully achievable.
>
> hyperv did a neat thing where their remapping driver seems to make VMM
> traps and looks kind of like the VMM gives it the platform specific
> addr/data pair.
>
> It is a big ugly problem for sure, and we definitely have painted
> ourselves into a corner where the OS has no idea if IMS techniques
> work properly or it is broken. :( :(
>
> But I think there may not be a terribly impossible path where at least
> the guest could be offered a, say, virtio-irq in addition to the
> existing platform controllers that would process IMS for it.
>
> > The idea of guest owning the physical MSI address space sounds great,
> > but is it practical?
>
> In many cases yes, it is, but more importantly it is the only sane way
> to support these IMS like techniques broadly since IMS is by
> definition not generally trappable.
>
> > Is it something that would be accomplished while
> > this device is still relevant?
>
> I don't know, I fear not. But it keeps coming up.
> Too many things
> don't work right with the trapping approach, including this.
>
> > The Windows driver is just programming the MSI capability to use 16
> > vectors. We configure those vectors on the host at the time the
> > capability is written. Whereas the Linux driver is only using a single
> > vector and therefore writing the same MSI address and data at the
> > locations noted in the trace, the Windows driver is writing different
> > data values at different locations to make use of those vectors. This
> > note is simply describing that we can't directly write the physical
> > data value into the device, we need to determine which vector offset
> > the guest is using and provide the same offset from the host data
> > register value.
>
> I see, it seems to be assuming also that these extra interrupt sources
> are generating the same MSI message as the main MSI, not something
> else. That is more a SW quirk of Windows, I expect. I don't think
> Linux would do that..
>
> This is probably the only way to approach this, trap and emulate the
> places in the device that program additional interrupt sources and do
> a full MSI-like flow to set them up in the kernel.

Your last sentence here seems to agree with this approach, but everything else suggests disapproval, so I don't know where you're going here.

I have no specs for this device, nor any involvement from the device vendor, so the idea of creating a vfio-pci variant driver to set up an irq_domain and augment a device-specific SET_IRQs ioctl not only sounds tremendously more complicated (host and VMM), it's simply not possible with the knowledge we have at hand. Making this device work in a VM is dead in the water if that's the bar to achieve.

I observe that the device configures MSI vectors and then writes that same vector address/data elsewhere into the device.
Whether the device can trigger those vectors based only on the MSI capability programming and a secondary source piggybacks on those vectors, or if this is just a hack by Qualcomm to use an MSI capability to acquire some vectors which are exclusively used by the secondary hardware, I have no idea. Who can even say if this is just a cost-saving measure, where a PCI config space is slapped onto a platform device and there's simply no hw/fw support to push the vector data into the hardware, so the driver bridges the gap.

The solution here is arguably fragile; we're relying on having a sufficiently unique MSI address that we can recognize writes with that value in order to both replace it with the host value and mark the location of the data register. If someone with some hardware insight wants to come along and provide a reference for static locations of these writes, I'd certainly welcome it. My sample size is one, which is why this is posted as an RFT.

I do not believe that introducing a vfio device feature that disables virtualization of the MSI address/data _only_ at the vfio interface (not to a QEMU VM) provides some implicit support of this device behavior. These values are already available to a privileged user in the host, and the same is available for an MSI-X use case by directly reading the MSI-X vector table. The only point of the vfio device feature is that we need a vehicle to expose the MSI physical address/data values through the vfio channel, without additional host privileges. The virtualized values are essentially unused by QEMU, so why not give QEMU a way to turn off the virtualization to expose the host values. Thanks,

Alex

^ permalink raw reply [flat|nested] 51+ messages in thread
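The fragile matching described above can be modeled as a trap on device register writes: any write whose value equals the guest-visible MSI address is substituted with the host address, and its location tells us where to expect the companion data write. The sketch below is a hedged toy model only; the class name, the data-follows-address layout, and all values are assumptions for illustration, not the actual quirk's register map.

```python
class MsiWriteQuirk:
    """Toy model of substituting host MSI values into trapped writes.

    A write carrying the guest's MSI address is rewritten to the host
    address, and its offset is remembered so the guest's MSI data
    value, expected at the next word, can be rewritten too.  Real
    layouts are device-specific; this assumes the data register
    directly follows the address register.
    """

    def __init__(self, guest_addr, host_addr, guest_data, host_data):
        self.guest_addr, self.host_addr = guest_addr, host_addr
        self.guest_data, self.host_data = guest_data, host_data
        self.pending_data_off = None    # where we expect the data write

    def on_write(self, offset, value):
        """Return the value that should actually reach the device."""
        if value == self.guest_addr:
            self.pending_data_off = offset + 4   # assume data follows address
            return self.host_addr
        if offset == self.pending_data_off and value == self.guest_data:
            self.pending_data_off = None
            return self.host_data
        return value                    # unrelated write, pass through
```

The pass-through arm is what makes the approach fragile: if an unrelated write happens to carry a value equal to the guest MSI address, it would be rewritten incorrectly, which is exactly the "sufficiently unique MSI address" assumption above.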
* Re: [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless 2024-08-15 16:59 ` Alex Williamson @ 2024-08-15 17:19 ` Jason Gunthorpe 0 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2024-08-15 17:19 UTC (permalink / raw)
To: Alex Williamson
Cc: qemu-devel, kvm, quic_bqiang, kvalo, prestwoj, linux-wireless, ath11k, dwmw2, iommu, kernel, johannes, jtornosm

On Thu, Aug 15, 2024 at 10:59:05AM -0600, Alex Williamson wrote:
> > This is probably the only way to approach this, trap and emulate the
> > places in the device that program additional interrupt sources and do
> > a full MSI-like flow to set them up in the kernel.
>
> Your last sentence here seems to agree with this approach, but
> everything else suggests disapproval, so I don't know where you're
> going here.

Trapping and emulating is fine. My concern is really only about skipping SET_IRQ. That works because of the assumption that the IMS sources are going to re-use addr/data pairs set up in the MSI CAP.

That assumption is frail, and won't work at all under the proper IMS support Linux now has. I really don't want to go down the road and have someone tell Thomas he can't convert the Linux driver to use irq_domain IMS because it will break this stuff here.

> I have no specs for this device, nor any involvement from the device
> vendor, so the idea of creating a vfio-pci variant driver to setup an
> irq_domain and augment a device specific SET_IRQs ioctls not only sounds
> tremendously more complicated (host and VMM), it's simply not possible
> with the knowledge we have at hand.

It seems like you have reverse engineered a lot of the necessary information though?? Maybe there is a more generic approach than a variant driver. If you wanted to use IMS from userspace generically, you could imagine some kind of IMS-focused "SET_IRQ" in generic VFIO, where we'd create the needed irq_domains and pass the addr/data pair back to userspace?
> I observe that the device configures MSI vectors and then writes that
> same vector address/data elsewhere into the device. Whether the device
> can trigger those vectors based only on the MSI capability programming
> and a secondary source piggybacks on those vectors or if this is just a
> hack by Qualcomm to use an MSI capability to acquire some vectors which
> are exclusively used by the secondary hardware, I have no idea.

Well at least that should be testable - but it seems crazy if the device has registers for an addr/data pair and then somehow doesn't use the values that get put in them??

Copying from the MSI is almost certainly a SW hack because IMS support has never really existed in an OS until now. I think your guess for why it is like this is pretty good.

> I do not believe that introducing a vfio device feature that disables
> virtualization of the MSI address/data _only_ at the vfio interface
> (not to a QEMU VM) provides some implicit support of this device
> behavior. These values are already available to a privileged user in
> the host and the same is available for an MSI-X use case by directly
> reading the MSI-X vector table.

To be clear, I'm not really worried about showing the data to userspace. Userspace just shouldn't be using it to implement an IMS technique!

Jason

^ permalink raw reply [flat|nested] 51+ messages in thread
end of thread, other threads:[~2025-01-03 15:19 UTC | newest] Thread overview: 51+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-01-08 13:17 ath11k and vfio-pci support James Prestwood 2024-01-10 9:00 ` Kalle Valo 2024-01-10 13:04 ` James Prestwood 2024-01-10 13:49 ` Kalle Valo 2024-01-10 14:55 ` James Prestwood 2024-01-11 3:51 ` Baochen Qiang 2024-01-11 8:16 ` Kalle Valo 2024-01-11 12:48 ` James Prestwood 2024-01-11 13:11 ` Kalle Valo 2024-01-11 13:38 ` James Prestwood 2024-01-12 2:04 ` Baochen Qiang 2024-01-12 12:47 ` James Prestwood 2024-01-14 12:37 ` Baochen Qiang 2024-01-14 14:36 ` Kalle Valo 2024-01-15 17:46 ` Alex Williamson 2024-01-16 10:08 ` Baochen Qiang 2024-01-16 10:41 ` David Woodhouse 2024-01-16 15:29 ` Jason Gunthorpe 2024-01-16 18:28 ` Alex Williamson 2024-01-16 21:10 ` Jeff Johnson 2024-01-17 5:47 ` Baochen Qiang 2024-03-21 19:14 ` Johannes Berg 2024-01-16 13:05 ` James Prestwood 2024-01-17 5:26 ` Baochen Qiang 2024-01-17 13:20 ` James Prestwood 2024-01-17 13:43 ` Kalle Valo 2024-01-17 14:25 ` James Prestwood 2024-01-18 2:09 ` Baochen Qiang 2024-01-19 17:52 ` James Prestwood 2024-01-19 17:57 ` Kalle Valo 2024-01-19 18:07 ` James Prestwood 2024-01-26 18:20 ` James Prestwood 2024-01-27 4:31 ` Baochen Qiang 2024-08-12 16:59 ` [PATCH RFC/RFT] vfio/pci: Create feature to disable MSI virtualization Alex Williamson 2024-08-13 16:30 ` Jason Gunthorpe 2024-08-13 17:30 ` Thomas Gleixner 2024-08-13 23:39 ` Jason Gunthorpe 2024-12-13 9:10 ` David Woodhouse 2025-01-03 14:31 ` Jason Gunthorpe 2025-01-03 14:47 ` David Woodhouse 2025-01-03 15:19 ` Jason Gunthorpe 2024-08-13 21:14 ` Alex Williamson 2024-08-13 23:16 ` Jason Gunthorpe 2024-08-14 14:55 ` Alex Williamson 2024-08-14 15:20 ` Jason Gunthorpe 2024-08-12 17:00 ` [PATCH RFC/RFT] vfio/pci-quirks: Quirk for ath wireless Alex Williamson 2024-08-13 16:43 ` Jason Gunthorpe 2024-08-13 21:03 ` Alex Williamson 2024-08-13 23:37 ` Jason Gunthorpe 2024-08-15 16:59 
` Alex Williamson 2024-08-15 17:19 ` Jason Gunthorpe