From: Klaus Jensen <its@irrelevant.dk>
To: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: "Kevin Wolf" <kwolf@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Eduardo Habkost" <eduardo@habkost.net>,
"Sriram Yagnaraman" <sriram.yagnaraman@est.tech>,
"Jason Wang" <jasowang@redhat.com>,
"Keith Busch" <kbusch@kernel.org>,
qemu-devel@nongnu.org, qemu-block@nongnu.org,
qemu-stable@nongnu.org
Subject: Re: [PATCH v6 01/15] hw/nvme: Use pcie_sriov_num_vfs()
Date: Tue, 20 Feb 2024 21:40:38 +0100 [thread overview]
Message-ID: <ZdUORlAIMpQN0hAk@cormorant.local> (raw)
In-Reply-To: <763f68fa-5c50-4f68-a758-fadcba36b857@daynix.com>
On Feb 21 00:33, Akihiko Odaki wrote:
> On 2024/02/20 23:53, Kevin Wolf wrote:
> > Am 20.02.2024 um 15:29 hat Kevin Wolf geschrieben:
> > > Am 20.02.2024 um 13:24 hat Akihiko Odaki geschrieben:
> > > > nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
> > > > configurations to know the number of VFs being disabled due to SR-IOV
> > > > configuration writes, but the logic was flawed and resulted in
> > > > out-of-bounds memory access.
> > > >
> > > > It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
> > > > VFs, but it actually doesn't in the following cases:
> > > > - PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
> > > > - PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
> > > > - VFs were only partially enabled because of realization failure.
> > > >
> > > > It is the responsibility of pcie_sriov to interpret SR-IOV configurations
> > > > and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
> > > > provides, to get the number of enabled VFs before and after SR-IOV
> > > > configuration writes.
> > > >
> > > > Cc: qemu-stable@nongnu.org
> > > > Fixes: CVE-2024-26328
> > > > Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management command")
> > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > > > ---
> > > > hw/nvme/ctrl.c | 26 ++++++++------------------
> > > > 1 file changed, 8 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> > > > index f026245d1e9e..7a56e7b79b4d 100644
> > > > --- a/hw/nvme/ctrl.c
> > > > +++ b/hw/nvme/ctrl.c
> > > > @@ -8466,36 +8466,26 @@ static void nvme_pci_reset(DeviceState *qdev)
> > > > nvme_ctrl_reset(n, NVME_RESET_FUNCTION);
> > > > }
> > > > -static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address,
> > > > - uint32_t val, int len)
> > > > +static void nvme_sriov_post_write_config(PCIDevice *dev, uint16_t old_num_vfs)
> > > > {
> > > > NvmeCtrl *n = NVME(dev);
> > > > NvmeSecCtrlEntry *sctrl;
> > > > - uint16_t sriov_cap = dev->exp.sriov_cap;
> > > > - uint32_t off = address - sriov_cap;
> > > > - int i, num_vfs;
> > > > + int i;
> > > > - if (!sriov_cap) {
> > > > - return;
> > > > - }
> > > > -
> > > > - if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
> > > > - if (!(val & PCI_SRIOV_CTRL_VFE)) {
> > > > - num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
> > > > - for (i = 0; i < num_vfs; i++) {
> > > > - sctrl = &n->sec_ctrl_list.sec[i];
> > > > - nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
> > > > - }
> > > > - }
> > > > + for (i = pcie_sriov_num_vfs(dev); i < old_num_vfs; i++) {
> > > > + sctrl = &n->sec_ctrl_list.sec[i];
> > > > + nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
> > > > }
> > > > }
> > >
> > > Maybe I'm missing something, but if the concern is that 'i' could run
> > > beyond the end of the array, I don't see anything that limits
> > > pcie_sriov_num_vfs() to the static size of 127 that n->sec_ctrl_list.sec
> > > has. register_vfs() seems to just take whatever 16 bit value the guest
> > > wrote without imposing additional restrictions.
> > >
> > > If there is some mechanism that makes register_vfs() fail if we exceed
> > > the limit, maybe an assertion with a comment would be in order because
> > > it doesn't seem obvious. I couldn't find any code that enforces it,
> > > sriov_max_vfs only ever seems to be used as a hint for the guest.
> > >
> > > If not, do we need another check that fails gracefully in the error
> > > case?
> >
> > Ok, I see now that patch 2 fixes this. But then the commit message is
> > wrong because it implies that this patch is the only thing you need to
> > fix the problem with nvme. You can't say "Fixes: CVE-2024-26328" if half
> > of the fix is still missing.
>
> I didn't assign CVE-2024-26328 to the case where the value of
> PCI_SRIOV_NUM_VF is greater than PCI_SRIOV_TOTAL_VF; it's what
> CVE-2024-26327 deals with.
>
> The problem I dealt with here is that the value of PCI_SRIOV_NUM_VF may not
> represent the actual number of enabled VFs because another register
> (PCI_SRIOV_CTRL_VFE) is not set, for example.
>
> If an assertion is to be added, I think it should be in pcie_sriov_num_vfs()
> and ensure the returned value is less than the value of PCI_SRIOV_TOTAL_VF
> (aka sriov_max_vfs in hw/nvme/ctrl.c), but I think it's fine without it.
>
> >
> > Maybe asserting old_num_vfs < n->sec_ctrl_list.numcntl would still be a
> > good idea. But looking at this one, it seems to me that numcntl isn't
> > completely correct either:
> >
> > list->numcntl = cpu_to_le16(max_vfs);
> >
> > Both list->numcntl and max_vfs are uint8_t, so I think this will always
> > be 0 on big-endian hosts?
>
> Indeed it looks wrong. Will you write a patch?
>
I'll fix it. And give the SR-IOV parts of hw/nvme some love all around.