* [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores
@ 2024-04-25 16:56 Gerd Bayer
2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer
` (2 more replies)
0 siblings, 3 replies; 25+ messages in thread
From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw)
To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle
Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic,
Julian Ruess, Gerd Bayer
Hi all,
this all started with a single patch by Ben to enable writing a user-mode
driver for a PCI device that requires 64bit register read/writes on s390.
A quick grep showed that there are several other drivers for PCI devices
in the kernel that use readq/writeq and eventually could use this, too.
So we decided to propose this for general inclusion.
A couple of suggestions for refactorizations by Jason Gunthorpe and Alex
Williamson later [1], I arrived at this little series that avoids some
code duplication and structures the different-size accesses in
vfio_pci_core_do_io_rw() in a way that the conditional compile of
8-byte accesses no longer creates an odd split of "else-if".
The initial version was tested with a PCI device on s390. This version
has only been tested for reads of 1..8 byte sizes and only been compile
tested for a 32bit architecture (arm).
Thank you,
Gerd Bayer
[1] https://lore.kernel.org/all/20240422153508.2355844-1-gbayer@linux.ibm.com/
Changes v2 -> v3:
- Introduce macro to generate body of different-size accesses in
vfio_pci_core_do_io_rw (courtesy Alex Williamson).
- Convert if-else if chain to a switch-case construct to better
accommodate conditional compiles.
Changes v1 -> v2:
- On non 64bit architecture use at most 32bit accesses in
vfio_pci_core_do_io_rw and describe that in the commit message.
- Drop the run-time error on 32bit architectures.
- The #endif splitting the "else if" is not really fortunate, but I'm
open to suggestions.
Ben Segal (1):
vfio/pci: Support 8-byte PCI loads and stores
Gerd Bayer (2):
vfio/pci: Extract duplicated code into macro
vfio/pci: Continue to refactor vfio_pci_core_do_io_rw
drivers/vfio/pci/vfio_pci_rdwr.c | 154 +++++++++++++++++--------------
include/linux/vfio_pci_core.h | 3 +
2 files changed, 90 insertions(+), 67 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 25+ messages in thread* [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer @ 2024-04-25 16:56 ` Gerd Bayer 2024-04-29 16:31 ` Alex Williamson ` (2 more replies) 2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer 2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer 2 siblings, 3 replies; 25+ messages in thread From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw) To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Gerd Bayer vfio_pci_core_do_io_rw() repeats the same code for multiple access widths. Factor this out into a macro Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> --- drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++----------------- 1 file changed, 46 insertions(+), 60 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 03b8f7ada1ac..3335f1b868b1 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -90,6 +90,40 @@ VFIO_IOREAD(8) VFIO_IOREAD(16) VFIO_IOREAD(32) +#define VFIO_IORDWR(size) \ +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ + bool iswrite, bool test_mem, \ + void __iomem *io, char __user *buf, \ + loff_t off, size_t *filled) \ +{ \ + u##size val; \ + int ret; \ + \ + if (iswrite) { \ + if (copy_from_user(&val, buf, sizeof(val))) \ + return -EFAULT; \ + \ + ret = vfio_pci_core_iowrite##size(vdev, test_mem, \ + val, io + off); \ + if (ret) \ + return ret; \ + } else { \ + ret = vfio_pci_core_ioread##size(vdev, test_mem, \ + &val, io + off); \ + if (ret) \ + return ret; \ + \ + if (copy_to_user(buf, &val, sizeof(val))) \ + return -EFAULT; \ + } \ + \ + *filled = sizeof(val); \ + return 0; \ +} \ + +VFIO_IORDWR(8) +VFIO_IORDWR(16) +VFIO_IORDWR(32) /* * Read or write from an __iomem region (MMIO or I/O port) with an excluded * range which is inaccessible. The excluded range drops writes and fills @@ -115,71 +149,23 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, fillable = 0; if (fillable >= 4 && !(off % 4)) { - u32 val; - - if (iswrite) { - if (copy_from_user(&val, buf, 4)) - return -EFAULT; - - ret = vfio_pci_core_iowrite32(vdev, test_mem, - val, io + off); - if (ret) - return ret; - } else { - ret = vfio_pci_core_ioread32(vdev, test_mem, - &val, io + off); - if (ret) - return ret; - - if (copy_to_user(buf, &val, 4)) - return -EFAULT; - } + ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, + io, buf, off, &filled); + if (ret) + return ret; - filled = 4; } else if (fillable >= 2 && !(off % 2)) { - u16 val; - - if (iswrite) { - if (copy_from_user(&val, buf, 2)) - return -EFAULT; - - ret = vfio_pci_core_iowrite16(vdev, test_mem, - val, io + off); - if (ret) - return ret; - } else { - ret = vfio_pci_core_ioread16(vdev, test_mem, - &val, io + off); - if (ret) - return ret; - - if (copy_to_user(buf, &val, 2)) - return -EFAULT; - } + ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem, + io, buf, off, &filled); + if (ret) + return ret; - filled = 2; } else if (fillable) { - u8 val; - - if (iswrite) { - if (copy_from_user(&val, buf, 1)) - return -EFAULT; - - ret = vfio_pci_core_iowrite8(vdev, test_mem, - val, io + off); - if (ret) - return ret; - } else { - ret = vfio_pci_core_ioread8(vdev, test_mem, - &val, io + off); - if (ret) - return ret; - - if (copy_to_user(buf, &val, 1)) - return -EFAULT; - } + ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem, + io, buf, off, &filled); + if (ret) + return ret; - filled = 1; } else { /* Fill reads with -1, drop writes */ filled = min(count, (size_t)(x_end - off)); -- 2.44.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer @ 2024-04-29 16:31 ` Alex Williamson 2024-05-17 14:22 ` Gerd Bayer 2024-04-29 20:09 ` Jason Gunthorpe 2024-05-17 10:47 ` Ramesh Thomas 2 siblings, 1 reply; 25+ messages in thread From: Alex Williamson @ 2024-04-29 16:31 UTC (permalink / raw) To: Gerd Bayer Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Thu, 25 Apr 2024 18:56:02 +0200 Gerd Bayer <gbayer@linux.ibm.com> wrote: > vfio_pci_core_do_io_rw() repeats the same code for multiple access > widths. Factor this out into a macro > > Suggested-by: Alex Williamson <alex.williamson@redhat.com> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++----------------- > 1 file changed, 46 insertions(+), 60 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 03b8f7ada1ac..3335f1b868b1 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -90,6 +90,40 @@ VFIO_IOREAD(8) > VFIO_IOREAD(16) > VFIO_IOREAD(32) > > +#define VFIO_IORDWR(size) \ > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > + bool iswrite, bool test_mem, \ > + void __iomem *io, char __user *buf, \ > + loff_t off, size_t *filled) \ I realized later after proposing this that we should drop 'core' from the name since the resulting functions are not currently exported. It also helps with the wordiness. Thanks, Alex > +{ \ > + u##size val; \ > + int ret; \ > + \ > + if (iswrite) { \ > + if (copy_from_user(&val, buf, sizeof(val))) \ > + return -EFAULT; \ > + \ > + ret = vfio_pci_core_iowrite##size(vdev, test_mem, \ > + val, io + off); \ > + if (ret) \ > + return ret; \ > + } else { \ > + ret = vfio_pci_core_ioread##size(vdev, test_mem, \ > + &val, io + off); \ > + if (ret) \ > + return ret; \ > + \ > + if (copy_to_user(buf, &val, sizeof(val))) \ > + return -EFAULT; \ > + } \ > + \ > + *filled = sizeof(val); \ > + return 0; \ > +} \ > + > +VFIO_IORDWR(8) > +VFIO_IORDWR(16) > +VFIO_IORDWR(32) > /* > * Read or write from an __iomem region (MMIO or I/O port) with an excluded > * range which is inaccessible. The excluded range drops writes and fills > @@ -115,71 +149,23 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, > fillable = 0; > > if (fillable >= 4 && !(off % 4)) { > - u32 val; > - > - if (iswrite) { > - if (copy_from_user(&val, buf, 4)) > - return -EFAULT; > - > - ret = vfio_pci_core_iowrite32(vdev, test_mem, > - val, io + off); > - if (ret) > - return ret; > - } else { > - ret = vfio_pci_core_ioread32(vdev, test_mem, > - &val, io + off); > - if (ret) > - return ret; > - > - if (copy_to_user(buf, &val, 4)) > - return -EFAULT; > - } > + ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > > - filled = 4; > } else if (fillable >= 2 && !(off % 2)) { > - u16 val; > - > - if (iswrite) { > - if (copy_from_user(&val, buf, 2)) > - return -EFAULT; > - > - ret = vfio_pci_core_iowrite16(vdev, test_mem, > - val, io + off); > - if (ret) > - return ret; > - } else { > - ret = vfio_pci_core_ioread16(vdev, test_mem, > - &val, io + off); > - if (ret) > - return ret; > - > - if (copy_to_user(buf, &val, 2)) > - return -EFAULT; > - } > + ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > > - filled = 2; > } else if (fillable) { > - u8 val; > - > - if (iswrite) { > - if (copy_from_user(&val, buf, 1)) > - return -EFAULT; > - > - ret = vfio_pci_core_iowrite8(vdev, test_mem, > - val, io + off); > - if (ret) > - return ret; > - } else { > - ret = vfio_pci_core_ioread8(vdev, test_mem, > - &val, io + off); > - if (ret) > - return ret; > - > - if (copy_to_user(buf, &val, 1)) > - return -EFAULT; > - } > + ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > > - filled = 1; > } else { > /* Fill reads with -1, drop writes */ > filled = min(count, (size_t)(x_end - off)); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-29 16:31 ` Alex Williamson @ 2024-05-17 14:22 ` Gerd Bayer 0 siblings, 0 replies; 25+ messages in thread From: Gerd Bayer @ 2024-05-17 14:22 UTC (permalink / raw) To: Alex Williamson Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Mon, 2024-04-29 at 10:31 -0600, Alex Williamson wrote: > On Thu, 25 Apr 2024 18:56:02 +0200 > Gerd Bayer <gbayer@linux.ibm.com> wrote: > > > vfio_pci_core_do_io_rw() repeats the same code for multiple access > > widths. Factor this out into a macro > > > > Suggested-by: Alex Williamson <alex.williamson@redhat.com> > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > > --- > > drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++------------- > > ---- > > 1 file changed, 46 insertions(+), 60 deletions(-) > > > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c > > b/drivers/vfio/pci/vfio_pci_rdwr.c > > index 03b8f7ada1ac..3335f1b868b1 100644 > > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > > @@ -90,6 +90,40 @@ VFIO_IOREAD(8) > > VFIO_IOREAD(16) > > VFIO_IOREAD(32) > > > > +#define > > VFIO_IORDWR(size) \ > > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device > > *vdev,\ > > + bool iswrite, bool > > test_mem, \ > > + void __iomem *io, char __user > > *buf, \ > > + loff_t off, size_t > > *filled) \ > > I realized later after proposing this that we should drop 'core' from > the name since the resulting functions are not currently exported. > It also helps with the wordiness. Thanks, > > Alex > > Sure that's easy enough. Thanks, Gerd ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer 2024-04-29 16:31 ` Alex Williamson @ 2024-04-29 20:09 ` Jason Gunthorpe 2024-04-29 22:11 ` Alex Williamson 2024-05-17 10:47 ` Ramesh Thomas 2 siblings, 1 reply; 25+ messages in thread From: Jason Gunthorpe @ 2024-04-29 20:09 UTC (permalink / raw) To: Gerd Bayer Cc: Alex Williamson, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Thu, Apr 25, 2024 at 06:56:02PM +0200, Gerd Bayer wrote: > vfio_pci_core_do_io_rw() repeats the same code for multiple access > widths. Factor this out into a macro > > Suggested-by: Alex Williamson <alex.williamson@redhat.com> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++----------------- > 1 file changed, 46 insertions(+), 60 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 03b8f7ada1ac..3335f1b868b1 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -90,6 +90,40 @@ VFIO_IOREAD(8) > VFIO_IOREAD(16) > VFIO_IOREAD(32) > > +#define VFIO_IORDWR(size) \ > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > + bool iswrite, bool test_mem, \ > + void __iomem *io, char __user *buf, \ > + loff_t off, size_t *filled) \ > +{ \ > + u##size val; \ > + int ret; \ > + \ > + if (iswrite) { \ > + if (copy_from_user(&val, buf, sizeof(val))) \ > + return -EFAULT; \ > + \ > + ret = vfio_pci_core_iowrite##size(vdev, test_mem, \ > + val, io + off); \ > + if (ret) \ > + return ret; \ > + } else { \ > + ret = vfio_pci_core_ioread##size(vdev, test_mem, \ > + &val, io + off); \ > + if (ret) \ > + return ret; \ > + \ > + if (copy_to_user(buf, &val, sizeof(val))) \ > + return -EFAULT; \ > + } \ > + \ > + *filled = sizeof(val); \ > + return 0; \ > +} \ > + > +VFIO_IORDWR(8) > +VFIO_IORDWR(16) > +VFIO_IORDWR(32) I'd suggest to try writing this without so many macros. This isn't very performance optimal already, we take a lock on every iteration, so there isn't much point in inlining multiple copies of everything to save an branch. Push the sizing switch down to the bottom, start with a function like: static void __iowrite(const void *val, void __iomem *io, size_t len) { switch (len) { case 8: { #ifdef iowrite64 // NOTE this doesn't seem to work on x86? if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) return iowrite64be(*(const u64 *)val, io); return iowrite64(*(const u64 *)val, io); #else if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) { iowrite32be(*(const u32 *)val, io); iowrite32be(*(const u32 *)(val + 4), io + 4); } else { iowrite32(*(const u32 *)val, io); iowrite32(*(const u32 *)(val + 4), io + 4); } return; #endif } case 4: if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) return iowrite32be(*(const u32 *)val, io); return iowrite32(*(const u32 *)val, io); case 2: if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) return iowrite16be(*(const u16 *)val, io); return iowrite16(*(const u16 *)val, io); case 1: return iowrite8(*(const u8 *)val, io); } } And then wrap it with the copy and the lock: static int do_iordwr(struct vfio_pci_core_device *vdev, bool test_mem, const void __user *buf, void __iomem *io, size_t len, bool iswrite) { u64 val; if (iswrite && copy_from_user(&val, buf, len)) return -EFAULT; if (test_mem) { down_read(&vdev->memory_lock); if (!__vfio_pci_memory_enabled(vdev)) { up_read(&vdev->memory_lock); return -EIO; } } if (iswrite) __iowrite(&val, io, len); else __ioread(&val, io, len); if (test_mem) up_read(&vdev->memory_lock); if (!iswrite && copy_to_user(buf, &val, len)) return -EFAULT; return 0; } And then the loop can be simple: if (fillable) { filled = num_bytes(fillable, off); ret = do_iordwr(vdev, test_mem, buf, io + off, filled, iswrite); if (ret) return ret; } else { filled = min(count, (size_t)(x_end - off)); /* Fill reads with -1, drop writes */ ret = fill_err(buf, filled); if (ret) return ret; } Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-29 20:09 ` Jason Gunthorpe @ 2024-04-29 22:11 ` Alex Williamson 2024-04-29 22:33 ` Jason Gunthorpe 2024-04-30 8:16 ` liulongfang 0 siblings, 2 replies; 25+ messages in thread From: Alex Williamson @ 2024-04-29 22:11 UTC (permalink / raw) To: Jason Gunthorpe Cc: Gerd Bayer, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Mon, 29 Apr 2024 17:09:10 -0300 Jason Gunthorpe <jgg@ziepe.ca> wrote: > On Thu, Apr 25, 2024 at 06:56:02PM +0200, Gerd Bayer wrote: > > vfio_pci_core_do_io_rw() repeats the same code for multiple access > > widths. Factor this out into a macro > > > > Suggested-by: Alex Williamson <alex.williamson@redhat.com> > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > > --- > > drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++----------------- > > 1 file changed, 46 insertions(+), 60 deletions(-) > > > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > > index 03b8f7ada1ac..3335f1b868b1 100644 > > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > > @@ -90,6 +90,40 @@ VFIO_IOREAD(8) > > VFIO_IOREAD(16) > > VFIO_IOREAD(32) > > > > +#define VFIO_IORDWR(size) \ > > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > > + bool iswrite, bool test_mem, \ > > + void __iomem *io, char __user *buf, \ > > + loff_t off, size_t *filled) \ > > +{ \ > > + u##size val; \ > > + int ret; \ > > + \ > > + if (iswrite) { \ > > + if (copy_from_user(&val, buf, sizeof(val))) \ > > + return -EFAULT; \ > > + \ > > + ret = vfio_pci_core_iowrite##size(vdev, test_mem, \ > > + val, io + off); \ > > + if (ret) \ > > + return ret; \ > > + } else { \ > > + ret = vfio_pci_core_ioread##size(vdev, test_mem, \ > > + &val, io + off); \ > > + if (ret) \ > > + return ret; \ > > + \ > > + if (copy_to_user(buf, &val, sizeof(val))) \ > > + return -EFAULT; \ > > + } \ > > + \ > > + *filled = sizeof(val); \ > > + return 0; \ > > +} \ > > + > > +VFIO_IORDWR(8) > > +VFIO_IORDWR(16) > > +VFIO_IORDWR(32) > > I'd suggest to try writing this without so many macros. > > This isn't very performance optimal already, we take a lock on every > iteration, so there isn't much point in inlining multiple copies of > everything to save an branch. These macros are to reduce duplicate code blocks and the errors that typically come from such duplication, as well as to provide type safe functions in the spirit of the ioread# and iowrite# helpers. It really has nothing to do with, nor is it remotely effective at saving a branch. Thanks, Alex > Push the sizing switch down to the bottom, start with a function like: > > static void __iowrite(const void *val, void __iomem *io, size_t len) > { > switch (len) { > case 8: { > #ifdef iowrite64 // NOTE this doesn't seem to work on x86? > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) > return iowrite64be(*(const u64 *)val, io); > return iowrite64(*(const u64 *)val, io); > #else > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) { > iowrite32be(*(const u32 *)val, io); > iowrite32be(*(const u32 *)(val + 4), io + 4); > } else { > iowrite32(*(const u32 *)val, io); > iowrite32(*(const u32 *)(val + 4), io + 4); > } > return; > #endif > } > > case 4: > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) > return iowrite32be(*(const u32 *)val, io); > return iowrite32(*(const u32 *)val, io); > case 2: > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) > return iowrite16be(*(const u16 *)val, io); > return iowrite16(*(const u16 *)val, io); > > case 1: > return iowrite8(*(const u8 *)val, io); > } > } > > And then wrap it with the copy and the lock: > > static int do_iordwr(struct vfio_pci_core_device *vdev, bool test_mem, > const void __user *buf, void __iomem *io, size_t len, > bool iswrite) > { > u64 val; > > if (iswrite && copy_from_user(&val, buf, len)) > return -EFAULT; > > if (test_mem) { > down_read(&vdev->memory_lock); > if (!__vfio_pci_memory_enabled(vdev)) { > up_read(&vdev->memory_lock); > return -EIO; > } > } > > if (iswrite) > __iowrite(&val, io, len); > else > __ioread(&val, io, len); > > if (test_mem) > up_read(&vdev->memory_lock); > > if (!iswrite && copy_to_user(buf, &val, len)) > return -EFAULT; > > return 0; > } > > And then the loop can be simple: > > if (fillable) { > filled = num_bytes(fillable, off); > ret = do_iordwr(vdev, test_mem, buf, io + off, filled, > iswrite); > if (ret) > return ret; > } else { > filled = min(count, (size_t)(x_end - off)); > /* Fill reads with -1, drop writes */ > ret = fill_err(buf, filled); > if (ret) > return ret; > } > > Jason > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-29 22:11 ` Alex Williamson @ 2024-04-29 22:33 ` Jason Gunthorpe 2024-05-21 15:47 ` Gerd Bayer 2024-04-30 8:16 ` liulongfang 1 sibling, 1 reply; 25+ messages in thread From: Jason Gunthorpe @ 2024-04-29 22:33 UTC (permalink / raw) To: Alex Williamson Cc: Gerd Bayer, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Mon, Apr 29, 2024 at 04:11:03PM -0600, Alex Williamson wrote: > > This isn't very performance optimal already, we take a lock on every > > iteration, so there isn't much point in inlining multiple copies of > > everything to save an branch. > > These macros are to reduce duplicate code blocks and the errors that > typically come from such duplication, But there is still quite a bit of repetition here.. > as well as to provide type safe functions in the spirit of the > ioread# and iowrite# helpers. But it never really takes any advantage of type safety? It is making a memcpy.. Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-29 22:33 ` Jason Gunthorpe @ 2024-05-21 15:47 ` Gerd Bayer 0 siblings, 0 replies; 25+ messages in thread From: Gerd Bayer @ 2024-05-21 15:47 UTC (permalink / raw) To: Jason Gunthorpe, Alex Williamson Cc: Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Mon, 2024-04-29 at 19:33 -0300, Jason Gunthorpe wrote: > On Mon, Apr 29, 2024 at 04:11:03PM -0600, Alex Williamson wrote: > > > This isn't very performance optimal already, we take a lock on > > > every > > > iteration, so there isn't much point in inlining multiple copies > > > of > > > everything to save an branch. > > > > These macros are to reduce duplicate code blocks and the errors > > that typically come from such duplication, > > But there is still quite a bit of repetition here.. I appears like duplications, I agree - but the vfio_pci_core_ioreadX and vfio_pci_core_iowriteX accessors are exported as such, or might be reused by way of vfio_pci_iordwrX in vfio_pci_core_do_io_rw for arbitrarily sized read/writes, too. > > as well as to provide type safe functions in the spirit of the > > ioread# and iowrite# helpers. > > But it never really takes any advantage of type safety? It is making > a memcpy.. At first, I was overwhelmed by the macro definitions, too. But after a while I started to like the strict typing once the value came out of memcpy or until it is memcpy'd. > > Jason > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-29 22:11 ` Alex Williamson 2024-04-29 22:33 ` Jason Gunthorpe @ 2024-04-30 8:16 ` liulongfang 1 sibling, 0 replies; 25+ messages in thread From: liulongfang @ 2024-04-30 8:16 UTC (permalink / raw) To: Alex Williamson, Jason Gunthorpe Cc: Gerd Bayer, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On 2024/4/30 6:11, Alex Williamson wrote: > On Mon, 29 Apr 2024 17:09:10 -0300 > Jason Gunthorpe <jgg@ziepe.ca> wrote: > >> On Thu, Apr 25, 2024 at 06:56:02PM +0200, Gerd Bayer wrote: >>> vfio_pci_core_do_io_rw() repeats the same code for multiple access >>> widths. Factor this out into a macro >>> >>> Suggested-by: Alex Williamson <alex.williamson@redhat.com> >>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> >>> --- >>> drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++----------------- >>> 1 file changed, 46 insertions(+), 60 deletions(-) >>> >>> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c >>> index 03b8f7ada1ac..3335f1b868b1 100644 >>> --- a/drivers/vfio/pci/vfio_pci_rdwr.c >>> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c >>> @@ -90,6 +90,40 @@ VFIO_IOREAD(8) >>> VFIO_IOREAD(16) >>> VFIO_IOREAD(32) >>> >>> +#define VFIO_IORDWR(size) \ >>> +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ >>> + bool iswrite, bool test_mem, \ >>> + void __iomem *io, char __user *buf, \ >>> + loff_t off, size_t *filled) \ >>> +{ \ >>> + u##size val; \ >>> + int ret; \ >>> + \ >>> + if (iswrite) { \ >>> + if (copy_from_user(&val, buf, sizeof(val))) \ >>> + return -EFAULT; \ >>> + \ >>> + ret = vfio_pci_core_iowrite##size(vdev, test_mem, \ >>> + val, io + off); \ >>> + if (ret) \ >>> + return ret; \ >>> + } else { \ >>> + ret = vfio_pci_core_ioread##size(vdev, test_mem, \ >>> + &val, io + off); \ >>> + if (ret) \ >>> + return ret; \ >>> + \ >>> + if (copy_to_user(buf, &val, sizeof(val))) \ >>> + return -EFAULT; \ >>> + } \ >>> + \ >>> + *filled = sizeof(val); \ >>> + return 0; \ >>> +} \ >>> + >>> +VFIO_IORDWR(8) >>> +VFIO_IORDWR(16) >>> +VFIO_IORDWR(32) >> >> I'd suggest to try writing this without so many macros. >> >> This isn't very performance optimal already, we take a lock on every >> iteration, so there isn't much point in inlining multiple copies of >> everything to save an branch. > > These macros are to reduce duplicate code blocks and the errors that Although simple and straightforward writing will result in more lines of code. But it's not easy to squeeze in "extra" code. The backdoor of "XZ Utils" is implanted through code complication. Thanks. Longfang. > typically come from such duplication, as well as to provide type safe > functions in the spirit of the ioread# and iowrite# helpers. It really > has nothing to do with, nor is it remotely effective at saving a branch. > Thanks, > > Alex > >> Push the sizing switch down to the bottom, start with a function like: >> >> static void __iowrite(const void *val, void __iomem *io, size_t len) >> { >> switch (len) { >> case 8: { >> #ifdef iowrite64 // NOTE this doesn't seem to work on x86? >> if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) >> return iowrite64be(*(const u64 *)val, io); >> return iowrite64(*(const u64 *)val, io); >> #else >> if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) { >> iowrite32be(*(const u32 *)val, io); >> iowrite32be(*(const u32 *)(val + 4), io + 4); >> } else { >> iowrite32(*(const u32 *)val, io); >> iowrite32(*(const u32 *)(val + 4), io + 4); >> } >> return; >> #endif >> } >> >> case 4: >> if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) >> return iowrite32be(*(const u32 *)val, io); >> return iowrite32(*(const u32 *)val, io); >> case 2: >> if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) >> return iowrite16be(*(const u16 *)val, io); >> return iowrite16(*(const u16 *)val, io); >> >> case 1: >> return iowrite8(*(const u8 *)val, io); >> } >> } >> >> And then wrap it with the copy and the lock: >> >> static int do_iordwr(struct vfio_pci_core_device *vdev, bool test_mem, >> const void __user *buf, void __iomem *io, size_t len, >> bool iswrite) >> { >> u64 val; >> >> if (iswrite && copy_from_user(&val, buf, len)) >> return -EFAULT; >> >> if (test_mem) { >> down_read(&vdev->memory_lock); >> if (!__vfio_pci_memory_enabled(vdev)) { >> up_read(&vdev->memory_lock); >> return -EIO; >> } >> } >> >> if (iswrite) >> __iowrite(&val, io, len); >> else >> __ioread(&val, io, len); >> >> if (test_mem) >> up_read(&vdev->memory_lock); >> >> if (!iswrite && copy_to_user(buf, &val, len)) >> return -EFAULT; >> >> return 0; >> } >> >> And then the loop can be simple: >> >> if (fillable) { >> filled = num_bytes(fillable, off); >> ret = do_iordwr(vdev, test_mem, buf, io + off, filled, >> iswrite); >> if (ret) >> return ret; >> } else { >> filled = min(count, (size_t)(x_end - off)); >> /* Fill reads with -1, drop writes */ >> ret = fill_err(buf, filled); >> if (ret) >> return ret; >> } >> >> Jason >> > > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro 2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer 2024-04-29 16:31 ` Alex Williamson 2024-04-29 20:09 ` Jason Gunthorpe @ 2024-05-17 10:47 ` Ramesh Thomas 2 siblings, 0 replies; 25+ messages in thread From: Ramesh Thomas @ 2024-05-17 10:47 UTC (permalink / raw) To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Thomas, Ramesh On 4/25/2024 9:56 AM, Gerd Bayer wrote: > vfio_pci_core_do_io_rw() repeats the same code for multiple access > widths. Factor this out into a macro > > Suggested-by: Alex Williamson <alex.williamson@redhat.com> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 106 ++++++++++++++----------------- > 1 file changed, 46 insertions(+), 60 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 03b8f7ada1ac..3335f1b868b1 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -90,6 +90,40 @@ VFIO_IOREAD(8) > VFIO_IOREAD(16) > VFIO_IOREAD(32) > > +#define VFIO_IORDWR(size) \ > +static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > + bool iswrite, bool test_mem, \ > + void __iomem *io, char __user *buf, \ > + loff_t off, size_t *filled) \ > +{ \ > + u##size val; \ > + int ret; \ > + \ > + if (iswrite) { \ > + if (copy_from_user(&val, buf, sizeof(val))) \ Another way to get the size is (size)/8 "if (copy_from_user(&val, buf, (size)/8))" > + return -EFAULT; \ > + \ > + ret = vfio_pci_core_iowrite##size(vdev, test_mem, \ > + val, io + off); \ > + if (ret) \ > + return ret; \ > + } else { \ > + ret = vfio_pci_core_ioread##size(vdev, test_mem, \ > + &val, io + off); \ > + if (ret) \ > + return ret; \ > + \ > + if (copy_to_user(buf, &val, sizeof(val))) \ > + return -EFAULT; \ > + } \ > + \ > + *filled = sizeof(val); \ > + return 0; \ > +} \ > + > +VFIO_IORDWR(8) > +VFIO_IORDWR(16) > +VFIO_IORDWR(32) > /* > * Read or write from an __iomem region (MMIO or I/O port) with an excluded > * range which is inaccessible. The excluded range drops writes and fills > @@ -115,71 +149,23 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, > fillable = 0; > > if (fillable >= 4 && !(off % 4)) { > - u32 val; > - > - if (iswrite) { > - if (copy_from_user(&val, buf, 4)) > - return -EFAULT; > - > - ret = vfio_pci_core_iowrite32(vdev, test_mem, > - val, io + off); > - if (ret) > - return ret; > - } else { > - ret = vfio_pci_core_ioread32(vdev, test_mem, > - &val, io + off); > - if (ret) > - return ret; > - > - if (copy_to_user(buf, &val, 4)) > - return -EFAULT; > - } > + ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > > - filled = 4; > } else if (fillable >= 2 && !(off % 2)) { > - u16 val; > - > - if (iswrite) { > - if (copy_from_user(&val, buf, 2)) > - return -EFAULT; > - > - ret = vfio_pci_core_iowrite16(vdev, test_mem, > - val, io + off); > - if (ret) > - return ret; > - } else { > - ret = vfio_pci_core_ioread16(vdev, test_mem, > - &val, io + off); > - if (ret) > - return ret; > - > - if (copy_to_user(buf, &val, 2)) > - return -EFAULT; > - } > + ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > > - filled = 2; > } else if (fillable) { > - u8 val; > - > - if (iswrite) { > - if (copy_from_user(&val, buf, 1)) > - return -EFAULT; > - > - ret = vfio_pci_core_iowrite8(vdev, test_mem, > - val, io + off); > - if (ret) > - return ret; > - } else { > - ret = vfio_pci_core_ioread8(vdev, test_mem, > - &val, io + off); > - if (ret) > - return ret; > - > - if (copy_to_user(buf, &val, 1)) > - return -EFAULT; > - } > + ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > > - filled = 1; > } else { > /* Fill reads with -1, drop writes */ > filled = min(count, (size_t)(x_end - off)); ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer 2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer @ 2024-04-25 16:56 ` Gerd Bayer 2024-04-29 16:31 ` Alex Williamson 2024-05-17 10:29 ` Ramesh Thomas 2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer 2 siblings, 2 replies; 25+ messages in thread From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw) To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal, Gerd Bayer From: Ben Segal <bpsegal@us.ibm.com> Many PCI adapters can benefit or even require full 64bit read and write access to their registers. In order to enable work on user-space drivers for these devices add two new variations vfio_pci_core_io{read|write}64 of the existing access methods when the architecture supports 64-bit ioreads and iowrites. Signed-off-by: Ben Segal <bpsegal@us.ibm.com> Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> --- drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++ include/linux/vfio_pci_core.h | 3 +++ 2 files changed, 19 insertions(+) diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 3335f1b868b1..8ed06edaee23 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size); VFIO_IOREAD(8) VFIO_IOREAD(16) VFIO_IOREAD(32) +#ifdef ioread64 +VFIO_IOREAD(64) +#endif #define VFIO_IORDWR(size) \ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ VFIO_IORDWR(8) VFIO_IORDWR(16) VFIO_IORDWR(32) +#if defined(ioread64) && defined(iowrite64) +VFIO_IORDWR(64) +#endif + /* * Read or write from an __iomem region (MMIO or I/O port) with an excluded * range which is inaccessible. The excluded range drops writes and fills @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, else fillable = 0; +#if defined(ioread64) && defined(iowrite64) + if (fillable >= 8 && !(off % 8)) { + ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, + io, buf, off, &filled); + if (ret) + return ret; + + } else +#endif /* defined(ioread64) && defined(iowrite64) */ if (fillable >= 4 && !(off % 4)) { ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, io, buf, off, &filled); diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index a2c8b8bba711..f4cf5fd2350c 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct vfio_pci_core_device *vdev, \ VFIO_IOREAD_DECLATION(8) VFIO_IOREAD_DECLATION(16) VFIO_IOREAD_DECLATION(32) +#ifdef ioread64 +VFIO_IOREAD_DECLATION(64) +#endif #endif /* VFIO_PCI_CORE_H */ -- 2.44.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer @ 2024-04-29 16:31 ` Alex Williamson 2024-05-21 15:50 ` Gerd Bayer 2024-05-17 10:29 ` Ramesh Thomas 1 sibling, 1 reply; 25+ messages in thread From: Alex Williamson @ 2024-04-29 16:31 UTC (permalink / raw) To: Gerd Bayer Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On Thu, 25 Apr 2024 18:56:03 +0200 Gerd Bayer <gbayer@linux.ibm.com> wrote: > From: Ben Segal <bpsegal@us.ibm.com> > > Many PCI adapters can benefit or even require full 64bit read > and write access to their registers. In order to enable work on > user-space drivers for these devices add two new variations > vfio_pci_core_io{read|write}64 of the existing access methods > when the architecture supports 64-bit ioreads and iowrites. > > Signed-off-by: Ben Segal <bpsegal@us.ibm.com> > Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++ > include/linux/vfio_pci_core.h | 3 +++ > 2 files changed, 19 insertions(+) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 3335f1b868b1..8ed06edaee23 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size); > VFIO_IOREAD(8) > VFIO_IOREAD(16) > VFIO_IOREAD(32) > +#ifdef ioread64 > +VFIO_IOREAD(64) > +#endif > > #define VFIO_IORDWR(size) \ > static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > VFIO_IORDWR(8) > VFIO_IORDWR(16) > VFIO_IORDWR(32) > +#if defined(ioread64) && defined(iowrite64) > +VFIO_IORDWR(64) > +#endif > + > /* > * Read or write from an __iomem region (MMIO or I/O port) with an excluded > * range which is inaccessible. The excluded range drops writes and fills > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, > else > fillable = 0; > > +#if defined(ioread64) && defined(iowrite64) Nit, #ifdef vfio_pci_core_iordwr64 > + if (fillable >= 8 && !(off % 8)) { > + ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > + > + } else > +#endif /* defined(ioread64) && defined(iowrite64) */ AFAIK, the comment appended to the #endif is really only suggested when the code block is too long to reasonable fit in a terminal. That's no longer the case with the new helper. Thanks, Alex > if (fillable >= 4 && !(off % 4)) { > ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, > io, buf, off, &filled); > diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h > index a2c8b8bba711..f4cf5fd2350c 100644 > --- a/include/linux/vfio_pci_core.h > +++ b/include/linux/vfio_pci_core.h > @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct vfio_pci_core_device *vdev, \ > VFIO_IOREAD_DECLATION(8) > VFIO_IOREAD_DECLATION(16) > VFIO_IOREAD_DECLATION(32) > +#ifdef ioread64 > +VFIO_IOREAD_DECLATION(64) > +#endif > > #endif /* VFIO_PCI_CORE_H */ ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-04-29 16:31 ` Alex Williamson @ 2024-05-21 15:50 ` Gerd Bayer 0 siblings, 0 replies; 25+ messages in thread From: Gerd Bayer @ 2024-05-21 15:50 UTC (permalink / raw) To: Alex Williamson Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On Mon, 2024-04-29 at 10:31 -0600, Alex Williamson wrote: > On Thu, 25 Apr 2024 18:56:03 +0200 > Gerd Bayer <gbayer@linux.ibm.com> wrote: > > > From: Ben Segal <bpsegal@us.ibm.com> > > > > Many PCI adapters can benefit or even require full 64bit read > > and write access to their registers. In order to enable work on > > user-space drivers for these devices add two new variations > > vfio_pci_core_io{read|write}64 of the existing access methods > > when the architecture supports 64-bit ioreads and iowrites. > > > > Signed-off-by: Ben Segal <bpsegal@us.ibm.com> > > Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com> > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > > --- > > drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++ > > include/linux/vfio_pci_core.h | 3 +++ > > 2 files changed, 19 insertions(+) > > > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c > > b/drivers/vfio/pci/vfio_pci_rdwr.c > > index 3335f1b868b1..8ed06edaee23 100644 > > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > > @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size); > > VFIO_IOREAD(8) > > VFIO_IOREAD(16) > > VFIO_IOREAD(32) > > +#ifdef ioread64 > > +VFIO_IOREAD(64) > > +#endif > > > > #define > > VFIO_IORDWR(size) \ > > static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device > > *vdev,\ > > @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct > > vfio_pci_core_device *vdev,\ > > VFIO_IORDWR(8) > > VFIO_IORDWR(16) > > VFIO_IORDWR(32) > > +#if defined(ioread64) && defined(iowrite64) > > +VFIO_IORDWR(64) > > +#endif > > + > > /* > > * Read or write from an __iomem region (MMIO or I/O port) with an > > excluded > > * range which is inaccessible. The excluded range drops writes > > and fills > > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct > > vfio_pci_core_device *vdev, bool test_mem, > > else > > fillable = 0; > > > > +#if defined(ioread64) && defined(iowrite64) > > Nit, #ifdef vfio_pci_core_iordwr64 I'm sorry, but I'm not sure how I should check for the expanded symbol here. I think I'll have to stick to checking the same condition as for whether VFIO_IORDWR(64) should be expanded. > > + if (fillable >= 8 && !(off % 8)) { > > + ret = vfio_pci_core_iordwr64(vdev, > > iswrite, test_mem, > > + io, buf, off, > > &filled); > > + if (ret) > > + return ret; > > + > > + } else > > +#endif /* defined(ioread64) && defined(iowrite64) */ > > AFAIK, the comment appended to the #endif is really only suggested > when the code block is too long to reasonable fit in a terminal. > That's no longer the case with the new helper. Yes, I'll change that. > Thanks, > > Alex > > Thanks, Gerd ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer 2024-04-29 16:31 ` Alex Williamson @ 2024-05-17 10:29 ` Ramesh Thomas 2024-05-20 9:02 ` Tian, Kevin 2024-05-21 16:40 ` Gerd Bayer 1 sibling, 2 replies; 25+ messages in thread From: Ramesh Thomas @ 2024-05-17 10:29 UTC (permalink / raw) To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal, Thomas, Ramesh On 4/25/2024 9:56 AM, Gerd Bayer wrote: > From: Ben Segal <bpsegal@us.ibm.com> > > Many PCI adapters can benefit or even require full 64bit read > and write access to their registers. In order to enable work on > user-space drivers for these devices add two new variations > vfio_pci_core_io{read|write}64 of the existing access methods > when the architecture supports 64-bit ioreads and iowrites. This is indeed necessary as back to back 32 bit may not be optimal in some devices. > > Signed-off-by: Ben Segal <bpsegal@us.ibm.com> > Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++ > include/linux/vfio_pci_core.h | 3 +++ > 2 files changed, 19 insertions(+) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 3335f1b868b1..8ed06edaee23 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size); > VFIO_IOREAD(8) > VFIO_IOREAD(16) > VFIO_IOREAD(32) > +#ifdef ioread64 > +VFIO_IOREAD(64) > +#endif > > #define VFIO_IORDWR(size) \ > static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device *vdev,\ > VFIO_IORDWR(8) > VFIO_IORDWR(16) > VFIO_IORDWR(32) > +#if defined(ioread64) && defined(iowrite64) > +VFIO_IORDWR(64) > +#endif > + > /* > * Read or write from an __iomem region (MMIO or I/O port) with an excluded > * range which is inaccessible. The excluded range drops writes and fills > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, > else > fillable = 0; > > +#if defined(ioread64) && defined(iowrite64) Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is defined and this check always fails. In include/asm-generic/io.h, asm-generic/iomap.h gets included which declares them as extern functions. One more thing to consider io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h, if included would define it as a macro that calls a function that rw 32 bits back to back. > + if (fillable >= 8 && !(off % 8)) { > + ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, > + io, buf, off, &filled); > + if (ret) > + return ret; > + > + } else > +#endif /* defined(ioread64) && defined(iowrite64) */ > if (fillable >= 4 && !(off % 4)) { > ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, > io, buf, off, &filled); > diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h > index a2c8b8bba711..f4cf5fd2350c 100644 > --- a/include/linux/vfio_pci_core.h > +++ b/include/linux/vfio_pci_core.h > @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct vfio_pci_core_device *vdev, \ > VFIO_IOREAD_DECLATION(8) > VFIO_IOREAD_DECLATION(16) > VFIO_IOREAD_DECLATION(32) > +#ifdef ioread64 > +VFIO_IOREAD_DECLATION(64) nit: This macro is referenced only in this file. Can the typo be corrected (_DECLARATION)? > +#endif > > #endif /* VFIO_PCI_CORE_H */ ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-05-17 10:29 ` Ramesh Thomas @ 2024-05-20 9:02 ` Tian, Kevin 2024-05-23 0:11 ` Ramesh Thomas 2024-05-21 16:40 ` Gerd Bayer 1 sibling, 1 reply; 25+ messages in thread From: Tian, Kevin @ 2024-05-20 9:02 UTC (permalink / raw) To: Thomas, Ramesh, Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm@vger.kernel.org, linux-s390@vger.kernel.org, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal, Thomas, Ramesh > From: Ramesh Thomas <ramesh.thomas@intel.com> > Sent: Friday, May 17, 2024 6:30 PM > > On 4/25/2024 9:56 AM, Gerd Bayer wrote: > > From: Ben Segal <bpsegal@us.ibm.com> > > > > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct > vfio_pci_core_device *vdev, bool test_mem, > > else > > fillable = 0; > > > > +#if defined(ioread64) && defined(iowrite64) > > Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and > iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is > defined and this check always fails. In include/asm-generic/io.h, > asm-generic/iomap.h gets included which declares them as extern functions. > > One more thing to consider io-64-nonatomic-hi-lo.h and > io-64-nonatomic-lo-hi.h, if included would define it as a macro that > calls a function that rw 32 bits back to back. I don't see the problem here. when the defined check fails it falls back to back-to-back vfio_pci_core_iordwr32(). there is no need to do it in an indirect way via including io-64-nonatomic-hi-lo.h. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-05-20 9:02 ` Tian, Kevin @ 2024-05-23 0:11 ` Ramesh Thomas 2024-05-23 21:52 ` Ramesh Thomas 0 siblings, 1 reply; 25+ messages in thread From: Ramesh Thomas @ 2024-05-23 0:11 UTC (permalink / raw) To: Tian, Kevin, Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm@vger.kernel.org, linux-s390@vger.kernel.org, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On 5/20/2024 2:02 AM, Tian, Kevin wrote: >> From: Ramesh Thomas <ramesh.thomas@intel.com> >> Sent: Friday, May 17, 2024 6:30 PM >> >> On 4/25/2024 9:56 AM, Gerd Bayer wrote: >>> From: Ben Segal <bpsegal@us.ibm.com> >>> >>> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct >> vfio_pci_core_device *vdev, bool test_mem, >>> else >>> fillable = 0; >>> >>> +#if defined(ioread64) && defined(iowrite64) >> >> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and >> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is >> defined and this check always fails. In include/asm-generic/io.h, >> asm-generic/iomap.h gets included which declares them as extern functions. >> >> One more thing to consider io-64-nonatomic-hi-lo.h and >> io-64-nonatomic-lo-hi.h, if included would define it as a macro that >> calls a function that rw 32 bits back to back. > > I don't see the problem here. when the defined check fails it falls > back to back-to-back vfio_pci_core_iordwr32(). there is no need to > do it in an indirect way via including io-64-nonatomic-hi-lo.h. The issue is iowrite64 and iowrite64 was not getting defined when CONFIG_GENERIC_IOMAP was not defined, even though the architecture implemented the 64 bit rw functions readq and writeq. io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h define them and map them to generic implementations in lib/iomap.c. The implementation calls the 64 bit rw functions if present, otherwise does 32 bit back to back rw. Besides it also has sanity checks for port numbers in the iomap path. I think it is better to rely on this existing generic method than implementing the checks at places where iowrite64 and ioread64 get called, at least in the IOMAP path. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-05-23 0:11 ` Ramesh Thomas @ 2024-05-23 21:52 ` Ramesh Thomas 0 siblings, 0 replies; 25+ messages in thread From: Ramesh Thomas @ 2024-05-23 21:52 UTC (permalink / raw) To: Tian, Kevin, Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm@vger.kernel.org, linux-s390@vger.kernel.org, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On 5/22/2024 5:11 PM, Ramesh Thomas wrote: > On 5/20/2024 2:02 AM, Tian, Kevin wrote: >>> From: Ramesh Thomas <ramesh.thomas@intel.com> >>> Sent: Friday, May 17, 2024 6:30 PM >>> >>> On 4/25/2024 9:56 AM, Gerd Bayer wrote: >>>> From: Ben Segal <bpsegal@us.ibm.com> >>>> >>>> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct >>> vfio_pci_core_device *vdev, bool test_mem, >>>> else >>>> fillable = 0; >>>> >>>> +#if defined(ioread64) && defined(iowrite64) >>> >>> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and >>> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is >>> defined and this check always fails. In include/asm-generic/io.h, >>> asm-generic/iomap.h gets included which declares them as extern >>> functions. >>> >>> One more thing to consider io-64-nonatomic-hi-lo.h and >>> io-64-nonatomic-lo-hi.h, if included would define it as a macro that >>> calls a function that rw 32 bits back to back. >> >> I don't see the problem here. when the defined check fails it falls >> back to back-to-back vfio_pci_core_iordwr32(). there is no need to >> do it in an indirect way via including io-64-nonatomic-hi-lo.h. > > The issue is iowrite64 and iowrite64 was not getting defined when > CONFIG_GENERIC_IOMAP was not defined, even though the architecture Sorry, I meant they were not getting defined when CONFIG_GENERIC_IOMAP *was defined*. The only definitions of ioread64 and iowrite64 in the code path are in asm/io.h where they are surrounded by #ifndef CONFIG_GENERIC_IOMAP > implemented the 64 bit rw functions readq and writeq. > io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h define them and map > them to generic implementations in lib/iomap.c. The implementation calls > the 64 bit rw functions if present, otherwise does 32 bit back to back > rw. Besides it also has sanity checks for port numbers in the iomap > path. I think it is better to rely on this existing generic method than > implementing the checks at places where iowrite64 and ioread64 get > called, at least in the IOMAP path. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-05-17 10:29 ` Ramesh Thomas 2024-05-20 9:02 ` Tian, Kevin @ 2024-05-21 16:40 ` Gerd Bayer 2024-05-22 13:48 ` Jason Gunthorpe 2024-05-22 23:57 ` Ramesh Thomas 1 sibling, 2 replies; 25+ messages in thread From: Gerd Bayer @ 2024-05-21 16:40 UTC (permalink / raw) To: Ramesh Thomas, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On Fri, 2024-05-17 at 03:29 -0700, Ramesh Thomas wrote: > On 4/25/2024 9:56 AM, Gerd Bayer wrote: > > From: Ben Segal <bpsegal@us.ibm.com> > > > > Many PCI adapters can benefit or even require full 64bit read > > and write access to their registers. In order to enable work on > > user-space drivers for these devices add two new variations > > vfio_pci_core_io{read|write}64 of the existing access methods > > when the architecture supports 64-bit ioreads and iowrites. > > This is indeed necessary as back to back 32 bit may not be optimal in > some devices. > > > > > Signed-off-by: Ben Segal <bpsegal@us.ibm.com> > > Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com> > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > > --- > > drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++ > > include/linux/vfio_pci_core.h | 3 +++ > > 2 files changed, 19 insertions(+) > > > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c > > b/drivers/vfio/pci/vfio_pci_rdwr.c > > index 3335f1b868b1..8ed06edaee23 100644 > > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > > @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size); > > VFIO_IOREAD(8) > > VFIO_IOREAD(16) > > VFIO_IOREAD(32) > > +#ifdef ioread64 > > +VFIO_IOREAD(64) > > +#endif > > > > #define > > VFIO_IORDWR(size) \ > > static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device > > *vdev,\ > > @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct > > vfio_pci_core_device *vdev,\ > > VFIO_IORDWR(8) > > VFIO_IORDWR(16) > > VFIO_IORDWR(32) > > +#if defined(ioread64) && defined(iowrite64) > > +VFIO_IORDWR(64) > > +#endif > > + > > /* > > * Read or write from an __iomem region (MMIO or I/O port) with > > an excluded > > * range which is inaccessible. The excluded range drops writes > > and fills > > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct > > vfio_pci_core_device *vdev, bool test_mem, > > else > > fillable = 0; > > > > +#if defined(ioread64) && defined(iowrite64) > > Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and > iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is > defined and this check always fails. In include/asm-generic/io.h, > asm-generic/iomap.h gets included which declares them as extern > functions. I thinks that should be possible - since ioread64/iowrite64 depend on CONFIG_64BIT themselves. > One more thing to consider io-64-nonatomic-hi-lo.h and > io-64-nonatomic-lo-hi.h, if included would define it as a macro that > calls a function that rw 32 bits back to back. Even today, vfio_pci_core_do_io_rw() makes no guarantees to its users that register accesses will be done in the granularity they've thought to use. The vfs layer may coalesce the accesses and this code will then read/write the largest naturally aligned chunks. I've witnessed in my testing that one device driver was doing 8-byte writes through the 8- byte capable vfio layer all of a sudden when run in a KVM guest. So higher-level code needs to consider how to split register accesses appropriately to get the intended side-effects. Thus, I'd rather stay away from splitting 64bit accesses into two 32bit accesses - and decide if high or low order values should be written first. > > + if (fillable >= 8 && !(off % 8)) { > > + ret = vfio_pci_core_iordwr64(vdev, > > iswrite, test_mem, > > + io, buf, off, > > &filled); > > + if (ret) > > + return ret; > > + > > + } else > > +#endif /* defined(ioread64) && defined(iowrite64) */ > > if (fillable >= 4 && !(off % 4)) { > > ret = vfio_pci_core_iordwr32(vdev, > > iswrite, test_mem, > > io, buf, off, > > &filled); > > diff --git a/include/linux/vfio_pci_core.h > > b/include/linux/vfio_pci_core.h > > index a2c8b8bba711..f4cf5fd2350c 100644 > > --- a/include/linux/vfio_pci_core.h > > +++ b/include/linux/vfio_pci_core.h > > @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct > > vfio_pci_core_device *vdev, \ > > VFIO_IOREAD_DECLATION(8) > > VFIO_IOREAD_DECLATION(16) > > VFIO_IOREAD_DECLATION(32) > > +#ifdef ioread64 > > +VFIO_IOREAD_DECLATION(64) > nit: This macro is referenced only in this file. Can the typo be > corrected (_DECLARATION)? Sure thanks for pointing this out! I'll single this editorial change out into a separate patch of the series, though. > > > +#endif > > > > #endif /* VFIO_PCI_CORE_H */ > > Thanks, Gerd ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-05-21 16:40 ` Gerd Bayer @ 2024-05-22 13:48 ` Jason Gunthorpe 2024-05-22 23:57 ` Ramesh Thomas 1 sibling, 0 replies; 25+ messages in thread From: Jason Gunthorpe @ 2024-05-22 13:48 UTC (permalink / raw) To: Gerd Bayer Cc: Ramesh Thomas, Alex Williamson, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On Tue, May 21, 2024 at 06:40:13PM +0200, Gerd Bayer wrote: > > > @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct > > > vfio_pci_core_device *vdev, bool test_mem, > > > else > > > fillable = 0; > > > > > > +#if defined(ioread64) && defined(iowrite64) > > > > Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and > > iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is > > defined and this check always fails. In include/asm-generic/io.h, > > asm-generic/iomap.h gets included which declares them as extern > > functions. > > I thinks that should be possible - since ioread64/iowrite64 depend on > CONFIG_64BIT themselves. > > > One more thing to consider io-64-nonatomic-hi-lo.h and > > io-64-nonatomic-lo-hi.h, if included would define it as a macro that > > calls a function that rw 32 bits back to back. This might be a better way to go than trying to have vfio provide its own emulation. > Even today, vfio_pci_core_do_io_rw() makes no guarantees to its users > that register accesses will be done in the granularity they've thought > to use. The vfs layer may coalesce the accesses and this code will then > read/write the largest naturally aligned chunks. I've witnessed in my > testing that one device driver was doing 8-byte writes through the 8- > byte capable vfio layer all of a sudden when run in a KVM guest. Sure, KVM has emulation for various byte sizes, and does invoke vfio with the raw size it got from guest, including larger than 8 sizes from things like SSE instructions. This has nothing to do with the VFS layer. > So higher-level code needs to consider how to split register accesses > appropriately to get the intended side-effects. Thus, I'd rather stay > away from splitting 64bit accesses into two 32bit accesses - and decide > if high or low order values should be written first. The VFIO API is a byte for byte memcpy. VFIO should try to do the largest single instruction accesses it knows how to do because some HW is sensitive to that. Otherwise it does a memcpy loop. Jason ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores 2024-05-21 16:40 ` Gerd Bayer 2024-05-22 13:48 ` Jason Gunthorpe @ 2024-05-22 23:57 ` Ramesh Thomas 1 sibling, 0 replies; 25+ messages in thread From: Ramesh Thomas @ 2024-05-22 23:57 UTC (permalink / raw) To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Ben Segal On 5/21/2024 9:40 AM, Gerd Bayer wrote: > On Fri, 2024-05-17 at 03:29 -0700, Ramesh Thomas wrote: >> On 4/25/2024 9:56 AM, Gerd Bayer wrote: >>> From: Ben Segal <bpsegal@us.ibm.com> >>> >>> Many PCI adapters can benefit or even require full 64bit read >>> and write access to their registers. In order to enable work on >>> user-space drivers for these devices add two new variations >>> vfio_pci_core_io{read|write}64 of the existing access methods >>> when the architecture supports 64-bit ioreads and iowrites. >> >> This is indeed necessary as back to back 32 bit may not be optimal in >> some devices. >> >>> >>> Signed-off-by: Ben Segal <bpsegal@us.ibm.com> >>> Co-developed-by: Gerd Bayer <gbayer@linux.ibm.com> >>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> >>> --- >>> drivers/vfio/pci/vfio_pci_rdwr.c | 16 ++++++++++++++++ >>> include/linux/vfio_pci_core.h | 3 +++ >>> 2 files changed, 19 insertions(+) >>> >>> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c >>> b/drivers/vfio/pci/vfio_pci_rdwr.c >>> index 3335f1b868b1..8ed06edaee23 100644 >>> --- a/drivers/vfio/pci/vfio_pci_rdwr.c >>> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c >>> @@ -89,6 +89,9 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_ioread##size); >>> VFIO_IOREAD(8) >>> VFIO_IOREAD(16) >>> VFIO_IOREAD(32) >>> +#ifdef ioread64 >>> +VFIO_IOREAD(64) >>> +#endif >>> >>> #define >>> VFIO_IORDWR(size) \ >>> static int vfio_pci_core_iordwr##size(struct vfio_pci_core_device >>> *vdev,\ >>> @@ -124,6 +127,10 @@ static int vfio_pci_core_iordwr##size(struct >>> vfio_pci_core_device *vdev,\ >>> VFIO_IORDWR(8) >>> VFIO_IORDWR(16) >>> VFIO_IORDWR(32) >>> +#if defined(ioread64) && defined(iowrite64) >>> +VFIO_IORDWR(64) >>> +#endif >>> + >>> /* >>> * Read or write from an __iomem region (MMIO or I/O port) with >>> an excluded >>> * range which is inaccessible. The excluded range drops writes >>> and fills >>> @@ -148,6 +155,15 @@ ssize_t vfio_pci_core_do_io_rw(struct >>> vfio_pci_core_device *vdev, bool test_mem, >>> else >>> fillable = 0; >>> >>> +#if defined(ioread64) && defined(iowrite64) >> >> Can we check for #ifdef CONFIG_64BIT instead? In x86, ioread64 and >> iowrite64 get declared as extern functions if CONFIG_GENERIC_IOMAP is >> defined and this check always fails. In include/asm-generic/io.h, >> asm-generic/iomap.h gets included which declares them as extern >> functions. > > I thinks that should be possible - since ioread64/iowrite64 depend on > CONFIG_64BIT themselves. > >> One more thing to consider io-64-nonatomic-hi-lo.h and >> io-64-nonatomic-lo-hi.h, if included would define it as a macro that >> calls a function that rw 32 bits back to back. > > Even today, vfio_pci_core_do_io_rw() makes no guarantees to its users > that register accesses will be done in the granularity they've thought > to use. The vfs layer may coalesce the accesses and this code will then > read/write the largest naturally aligned chunks. I've witnessed in my > testing that one device driver was doing 8-byte writes through the 8- > byte capable vfio layer all of a sudden when run in a KVM guest. > > So higher-level code needs to consider how to split register accesses > appropriately to get the intended side-effects. Thus, I'd rather stay > away from splitting 64bit accesses into two 32bit accesses - and decide > if high or low order values should be written first. Sorry, I was not clear. The main reason to include io-64-nonatomic-hi-lo.h or io-64-nonatomic-lo-hi.h is to get the iowrite64 and ioread64 macros defined mapping to functions that implement them. They define and map them to generic functions in lib/iomap.c The 64 bit rw functions (readq/writeq) get called if present. If any architecture has not implemented them, then it maps them to functions that do 32 bit back to back. > > >>> + if (fillable >= 8 && !(off % 8)) { >>> + ret = vfio_pci_core_iordwr64(vdev, >>> iswrite, test_mem, >>> + io, buf, off, >>> &filled); >>> + if (ret) >>> + return ret; >>> + >>> + } else >>> +#endif /* defined(ioread64) && defined(iowrite64) */ >>> if (fillable >= 4 && !(off % 4)) { >>> ret = vfio_pci_core_iordwr32(vdev, >>> iswrite, test_mem, >>> io, buf, off, >>> &filled); >>> diff --git a/include/linux/vfio_pci_core.h >>> b/include/linux/vfio_pci_core.h >>> index a2c8b8bba711..f4cf5fd2350c 100644 >>> --- a/include/linux/vfio_pci_core.h >>> +++ b/include/linux/vfio_pci_core.h >>> @@ -157,5 +157,8 @@ int vfio_pci_core_ioread##size(struct >>> vfio_pci_core_device *vdev, \ >>> VFIO_IOREAD_DECLATION(8) >>> VFIO_IOREAD_DECLATION(16) >>> VFIO_IOREAD_DECLATION(32) >>> +#ifdef ioread64 >>> +VFIO_IOREAD_DECLATION(64) >> nit: This macro is referenced only in this file. Can the typo be >> corrected (_DECLARATION)? > > Sure thanks for pointing this out! > I'll single this editorial change out into a separate patch of the > series, though. > >> >>> +#endif >>> >>> #endif /* VFIO_PCI_CORE_H */ >> >> > > Thanks, Gerd ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw 2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer 2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer 2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer @ 2024-04-25 16:56 ` Gerd Bayer 2024-04-28 6:59 ` Tian, Kevin ` (2 more replies) 2 siblings, 3 replies; 25+ messages in thread From: Gerd Bayer @ 2024-04-25 16:56 UTC (permalink / raw) To: Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Gerd Bayer Convert if-elseif-chain into switch-case. Separate out and generalize the code from the if-clauses so the result can be used in the switch statement. Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> --- drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 8ed06edaee23..634c00b03c71 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -131,6 +131,20 @@ VFIO_IORDWR(32) VFIO_IORDWR(64) #endif +static int fill_size(size_t fillable, loff_t off) +{ + unsigned int fill_size; +#if defined(ioread64) && defined(iowrite64) + for (fill_size = 8; fill_size >= 0; fill_size /= 2) { +#else + for (fill_size = 4; fill_size >= 0; fill_size /= 2) { +#endif /* defined(ioread64) && defined(iowrite64) */ + if (fillable >= fill_size && !(off % fill_size)) + return fill_size; + } + return -1; +} + /* * Read or write from an __iomem region (MMIO or I/O port) with an excluded * range which is inaccessible. The excluded range drops writes and fills @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, else fillable = 0; + switch (fill_size(fillable, off)) { #if defined(ioread64) && defined(iowrite64) - if (fillable >= 8 && !(off % 8)) { + case 8: ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) return ret; + break; - } else #endif /* defined(ioread64) && defined(iowrite64) */ - if (fillable >= 4 && !(off % 4)) { + case 4: ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) return ret; + break; - } else if (fillable >= 2 && !(off % 2)) { + case 2: ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) return ret; + break; - } else if (fillable) { + case 1: ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) return ret; + break; - } else { + default: /* Fill reads with -1, drop writes */ filled = min(count, (size_t)(x_end - off)); if (!iswrite) { -- 2.44.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* RE: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw 2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer @ 2024-04-28 6:59 ` Tian, Kevin 2024-04-29 16:32 ` Alex Williamson 2024-05-17 11:41 ` Ramesh Thomas 2 siblings, 0 replies; 25+ messages in thread From: Tian, Kevin @ 2024-04-28 6:59 UTC (permalink / raw) To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm@vger.kernel.org, linux-s390@vger.kernel.org, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess > From: Gerd Bayer <gbayer@linux.ibm.com> > Sent: Friday, April 26, 2024 12:56 AM > > @@ -131,6 +131,20 @@ VFIO_IORDWR(32) > VFIO_IORDWR(64) > #endif > > +static int fill_size(size_t fillable, loff_t off) > +{ > + unsigned int fill_size; > +#if defined(ioread64) && defined(iowrite64) > + for (fill_size = 8; fill_size >= 0; fill_size /= 2) { > +#else > + for (fill_size = 4; fill_size >= 0; fill_size /= 2) { > +#endif /* defined(ioread64) && defined(iowrite64) */ > + if (fillable >= fill_size && !(off % fill_size)) > + return fill_size; > + } > + return -1; this is unreachable with "fill_size >= 0" in the loop. shouldn't it be: #if defined(ioread64) && defined(iowrite64) for (fill_size = 8; fill_size > 0; fill_size /= 2) { #else for (fill_size = 4; fill_size > 0; fill_size /= 2) { #endif /* defined(ioread64) && defined(iowrite64) */ if (fillable >= fill_size && !(off % fill_size)) break; } return fill_size; > > + switch (fill_size(fillable, off)) { > #if defined(ioread64) && defined(iowrite64) > - if (fillable >= 8 && !(off % 8)) { > + case 8: > ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; the check on 'ret' can be moved out of the switch to be generic. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw 2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer 2024-04-28 6:59 ` Tian, Kevin @ 2024-04-29 16:32 ` Alex Williamson 2024-05-21 16:43 ` Gerd Bayer 2024-05-17 11:41 ` Ramesh Thomas 2 siblings, 1 reply; 25+ messages in thread From: Alex Williamson @ 2024-04-29 16:32 UTC (permalink / raw) To: Gerd Bayer Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Thu, 25 Apr 2024 18:56:04 +0200 Gerd Bayer <gbayer@linux.ibm.com> wrote: > Convert if-elseif-chain into switch-case. > Separate out and generalize the code from the if-clauses so the result > can be used in the switch statement. > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++------ > 1 file changed, 24 insertions(+), 6 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 8ed06edaee23..634c00b03c71 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -131,6 +131,20 @@ VFIO_IORDWR(32) > VFIO_IORDWR(64) #define MAX_FILL_SIZE 8 #else #define MAX_FILL_SIZE 4 > #endif > > +static int fill_size(size_t fillable, loff_t off) > +{ > + unsigned int fill_size; unsigned int fill_size = MAX_FILL_SIZE; > +#if defined(ioread64) && defined(iowrite64) > + for (fill_size = 8; fill_size >= 0; fill_size /= 2) { > +#else > + for (fill_size = 4; fill_size >= 0; fill_size /= 2) { > +#endif /* defined(ioread64) && defined(iowrite64) */ > + if (fillable >= fill_size && !(off % fill_size)) > + return fill_size; > + } > + return -1; > +} > + > /* > * Read or write from an __iomem region (MMIO or I/O port) with an excluded > * range which is inaccessible. The excluded range drops writes and fills > @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, > else > fillable = 0; > > + switch (fill_size(fillable, off)) { > #if defined(ioread64) && defined(iowrite64) > - if (fillable >= 8 && !(off % 8)) { > + case 8: > ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else AFAICT, avoiding this dangling else within the #ifdef is really the only tangible advantage of conversion to a switch statement. Getting rid of that alone while keeping, and actually increasing, the inline ifdefs in the code doesn't seem worthwhile to me. I'd probably only go this route if we could make vfio_pci_iordwr64() stubbed as a BUG_ON when we don't have the ioread64 and iowrite64 accessors, in which case the switch helper would never return 8 and the function would be unreachable. > #endif /* defined(ioread64) && defined(iowrite64) */ > - if (fillable >= 4 && !(off % 4)) { > + case 4: > ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else if (fillable >= 2 && !(off % 2)) { > + case 2: > ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else if (fillable) { > + case 1: > ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else { > + default: This condition also seems a little more obfuscated without being preceded by the 'if (fillable)' test, which might warrant handling separate from the switch if we continue to pursue the switch conversion. Thanks, Alex > /* Fill reads with -1, drop writes */ > filled = min(count, (size_t)(x_end - off)); > if (!iswrite) { ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw 2024-04-29 16:32 ` Alex Williamson @ 2024-05-21 16:43 ` Gerd Bayer 0 siblings, 0 replies; 25+ messages in thread From: Gerd Bayer @ 2024-05-21 16:43 UTC (permalink / raw) To: Alex Williamson Cc: Jason Gunthorpe, Niklas Schnelle, kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess On Mon, 2024-04-29 at 10:32 -0600, Alex Williamson wrote: > On Thu, 25 Apr 2024 18:56:04 +0200 > Gerd Bayer <gbayer@linux.ibm.com> wrote: > > > Convert if-elseif-chain into switch-case. > > Separate out and generalize the code from the if-clauses so the > > result > > can be used in the switch statement. > > > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > > --- > > drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++---- > > -- > > 1 file changed, 24 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c > > b/drivers/vfio/pci/vfio_pci_rdwr.c > > index 8ed06edaee23..634c00b03c71 100644 > > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > > @@ -131,6 +131,20 @@ VFIO_IORDWR(32) > > VFIO_IORDWR(64) > > #define MAX_FILL_SIZE 8 > #else > #define MAX_FILL_SIZE 4 > > > #endif > > > > +static int fill_size(size_t fillable, loff_t off) > > +{ > > + unsigned int fill_size; > > unsigned int fill_size = MAX_FILL_SIZE; > > > +#if defined(ioread64) && defined(iowrite64) > > + for (fill_size = 8; fill_size >= 0; fill_size /= 2) { > > +#else > > + for (fill_size = 4; fill_size >= 0; fill_size /= 2) { > > +#endif /* defined(ioread64) && defined(iowrite64) */ > > + if (fillable >= fill_size && !(off % fill_size)) > > + return fill_size; > > + } > > + return -1; > > +} > > + > > /* > > * Read or write from an __iomem region (MMIO or I/O port) with an > > excluded > > * range which is inaccessible. The excluded range drops writes > > and fills > > @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct > > vfio_pci_core_device *vdev, bool test_mem, > > else > > fillable = 0; > > > > + switch (fill_size(fillable, off)) { > > #if defined(ioread64) && defined(iowrite64) > > - if (fillable >= 8 && !(off % 8)) { > > + case 8: > > ret = vfio_pci_core_iordwr64(vdev, > > iswrite, test_mem, > > io, buf, off, > > &filled); > > if (ret) > > return ret; > > + break; > > > > - } else > > AFAICT, avoiding this dangling else within the #ifdef is really the > only tangible advantage of conversion to a switch statement. Getting > rid of that alone while keeping, and actually increasing, the inline > ifdefs in the code doesn't seem worthwhile to me. I'd probably only > go this route if we could make vfio_pci_iordwr64() stubbed as a > BUG_ON when we don't have the ioread64 and iowrite64 accessors, in > which case the switch helper would never return 8 and the function > would be unreachable. > > > #endif /* defined(ioread64) && defined(iowrite64) */ > > - if (fillable >= 4 && !(off % 4)) { > > + case 4: > > ret = vfio_pci_core_iordwr32(vdev, > > iswrite, test_mem, > > io, buf, off, > > &filled); > > if (ret) > > return ret; > > + break; > > > > - } else if (fillable >= 2 && !(off % 2)) { > > + case 2: > > ret = vfio_pci_core_iordwr16(vdev, > > iswrite, test_mem, > > io, buf, off, > > &filled); > > if (ret) > > return ret; > > + break; > > > > - } else if (fillable) { > > + case 1: > > ret = vfio_pci_core_iordwr8(vdev, iswrite, > > test_mem, > > io, buf, off, > > &filled); > > if (ret) > > return ret; > > + break; > > > > - } else { > > + default: > > This condition also seems a little more obfuscated without being > preceded by the 'if (fillable)' test, which might warrant handling > separate from the switch if we continue to pursue the switch > conversion. Thanks, > > Alex > > > /* Fill reads with -1, drop writes */ > > filled = min(count, (size_t)(x_end - > > off)); > > if (!iswrite) { > > Well, overall this sounds like it creates more headaches than it tries to solve - and that is a strong hint to not do it. I'll drop this further refactoring in the next version. Thanks, Gerd ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw 2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer 2024-04-28 6:59 ` Tian, Kevin 2024-04-29 16:32 ` Alex Williamson @ 2024-05-17 11:41 ` Ramesh Thomas 2 siblings, 0 replies; 25+ messages in thread From: Ramesh Thomas @ 2024-05-17 11:41 UTC (permalink / raw) To: Gerd Bayer, Alex Williamson, Jason Gunthorpe, Niklas Schnelle Cc: kvm, linux-s390, Ankit Agrawal, Yishai Hadas, Halil Pasic, Julian Ruess, Thomas, Ramesh On 4/25/2024 9:56 AM, Gerd Bayer wrote: > Convert if-elseif-chain into switch-case. > Separate out and generalize the code from the if-clauses so the result > can be used in the switch statement. > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_rdwr.c | 30 ++++++++++++++++++++++++------ > 1 file changed, 24 insertions(+), 6 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c > index 8ed06edaee23..634c00b03c71 100644 > --- a/drivers/vfio/pci/vfio_pci_rdwr.c > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c > @@ -131,6 +131,20 @@ VFIO_IORDWR(32) > VFIO_IORDWR(64) > #endif > > +static int fill_size(size_t fillable, loff_t off) > +{ > + unsigned int fill_size; > +#if defined(ioread64) && defined(iowrite64) > + for (fill_size = 8; fill_size >= 0; fill_size /= 2) { > +#else > + for (fill_size = 4; fill_size >= 0; fill_size /= 2) { 0 anyway reaches default case, so loop condition can be "fill_size > 0" or at the start of function return 0 or -1 if fillable <= 0 > +#endif /* defined(ioread64) && defined(iowrite64) */ > + if (fillable >= fill_size && !(off % fill_size)) > + return fill_size; > + } > + return -1; > +} > + > /* > * Read or write from an __iomem region (MMIO or I/O port) with an excluded > * range which is inaccessible. The excluded range drops writes and fills > @@ -155,34 +169,38 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, > else > fillable = 0; > > + switch (fill_size(fillable, off)) { > #if defined(ioread64) && defined(iowrite64) > - if (fillable >= 8 && !(off % 8)) { > + case 8: > ret = vfio_pci_core_iordwr64(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; This check and returning is common for all cases except default. Maybe ret can be initialized to 0 before the switch block and do the check and return after the switch block. > + break; > > - } else > #endif /* defined(ioread64) && defined(iowrite64) */ > - if (fillable >= 4 && !(off % 4)) { > + case 4: > ret = vfio_pci_core_iordwr32(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else if (fillable >= 2 && !(off % 2)) { > + case 2: > ret = vfio_pci_core_iordwr16(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else if (fillable) { > + case 1: > ret = vfio_pci_core_iordwr8(vdev, iswrite, test_mem, > io, buf, off, &filled); > if (ret) > return ret; > + break; > > - } else { > + default: > /* Fill reads with -1, drop writes */ > filled = min(count, (size_t)(x_end - off)); > if (!iswrite) { ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2024-05-23 21:53 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-04-25 16:56 [PATCH v3 0/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer 2024-04-25 16:56 ` [PATCH v3 1/3] vfio/pci: Extract duplicated code into macro Gerd Bayer 2024-04-29 16:31 ` Alex Williamson 2024-05-17 14:22 ` Gerd Bayer 2024-04-29 20:09 ` Jason Gunthorpe 2024-04-29 22:11 ` Alex Williamson 2024-04-29 22:33 ` Jason Gunthorpe 2024-05-21 15:47 ` Gerd Bayer 2024-04-30 8:16 ` liulongfang 2024-05-17 10:47 ` Ramesh Thomas 2024-04-25 16:56 ` [PATCH v3 2/3] vfio/pci: Support 8-byte PCI loads and stores Gerd Bayer 2024-04-29 16:31 ` Alex Williamson 2024-05-21 15:50 ` Gerd Bayer 2024-05-17 10:29 ` Ramesh Thomas 2024-05-20 9:02 ` Tian, Kevin 2024-05-23 0:11 ` Ramesh Thomas 2024-05-23 21:52 ` Ramesh Thomas 2024-05-21 16:40 ` Gerd Bayer 2024-05-22 13:48 ` Jason Gunthorpe 2024-05-22 23:57 ` Ramesh Thomas 2024-04-25 16:56 ` [PATCH v3 3/3] vfio/pci: Continue to refactor vfio_pci_core_do_io_rw Gerd Bayer 2024-04-28 6:59 ` Tian, Kevin 2024-04-29 16:32 ` Alex Williamson 2024-05-21 16:43 ` Gerd Bayer 2024-05-17 11:41 ` Ramesh Thomas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox