* [PATCH v2 0/2] vfio: Fix racy bitfields and tighten struct layout
@ 2026-05-11 22:16 Alex Williamson
2026-05-11 22:16 ` [PATCH v2 1/2] vfio/pci: " Alex Williamson
2026-05-11 22:16 ` [PATCH v2 2/2] vfio/mlx5: " Alex Williamson
0 siblings, 2 replies; 8+ messages in thread
From: Alex Williamson @ 2026-05-11 22:16 UTC (permalink / raw)
To: Alex Williamson, kvm
Cc: Alex Williamson, Jason Gunthorpe, Kevin Tian, linux-kernel,
Yishai Hadas, rananta
A recent patch[1] proposed by Raghavendra triggered a Sashiko
review[2] flagging that the proposed new bitfield shares storage
with neighbors and that concurrent updates via RMW may clobber
adjacent fields.
Auditing bitfield users in vfio_pci_core_device finds several
pre-existing fields with the same hazard, and an analogous pattern
in mlx5_vhca_page_tracker / mlx5vf_pci_core_device. This series
splits all such fields out of their shared storage words, resolving
the existing cases.
v2: Uncouple from Raghavendra's patch so that Sashiko can apply
and review (new field dropped), we can handle merge on commit.
Thanks,
Alex
[1] https://lore.kernel.org/all/20260504224142.1041477-1-rananta@google.com/
[2] https://sashiko.dev/#/patchset/20260504224142.1041477-1-rananta@google.com
Alex Williamson (2):
vfio/pci: Fix racy bitfields and tighten struct layout
vfio/mlx5: Fix racy bitfields and tighten struct layout
drivers/vfio/pci/mlx5/cmd.h | 8 ++++----
include/linux/vfio_pci_core.h | 8 ++++----
2 files changed, 8 insertions(+), 8 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/2] vfio/pci: Fix racy bitfields and tighten struct layout
2026-05-11 22:16 [PATCH v2 0/2] vfio: Fix racy bitfields and tighten struct layout Alex Williamson
@ 2026-05-11 22:16 ` Alex Williamson
2026-05-12 13:17 ` David Laight
2026-05-12 13:18 ` Jason Gunthorpe
2026-05-11 22:16 ` [PATCH v2 2/2] vfio/mlx5: " Alex Williamson
1 sibling, 2 replies; 8+ messages in thread
From: Alex Williamson @ 2026-05-11 22:16 UTC (permalink / raw)
To: Alex Williamson, kvm
Cc: Alex Williamson, Jason Gunthorpe, Kevin Tian, linux-kernel,
Yishai Hadas, rananta, stable
Bitfield operations are not atomic, they use a read-modify-write
pattern, therefore we should be careful not to pack bitfields that
can be concurrently updated into the same storage unit.
The split fields (virq_disabled, bardirty, pm_intx_masked,
pm_runtime_engaged, sriov_pwr_active) are mutated post-init from
contexts that don't serialize against the other writers in the same
storage unit, so a bitfield RMW could drop an adjacent field's
update. The remaining bitfields are touched only during probe or
close where no concurrent writer exists, so they stay packed.
While reordering, place virq_disabled and bardirty earlier to fill
an existing alignment hole.
Fixes: 9cd0f6d5cbb6 ("vfio/pci: Use bitfield for struct vfio_pci_core_device flags")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Alex Williamson <alex.williamson@nvidia.com>
---
include/linux/vfio_pci_core.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 2ebba746c18f..24e8db5b1c0d 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -101,6 +101,8 @@ struct vfio_pci_core_device {
const struct vfio_pci_device_ops *pci_ops;
void __iomem *barmap[PCI_STD_NUM_BARS];
bool bar_mmap_supported[PCI_STD_NUM_BARS];
+ bool virq_disabled;
+ bool bardirty;
u8 *pci_config_map;
u8 *vconfig;
struct perm_bits *msi_perm;
@@ -117,16 +119,14 @@ struct vfio_pci_core_device {
u32 rbar[7];
bool has_dyn_msix:1;
bool pci_2_3:1;
- bool virq_disabled:1;
bool reset_works:1;
bool extended_caps:1;
- bool bardirty:1;
bool has_vga:1;
bool needs_reset:1;
bool nointx:1;
bool needs_pm_restore:1;
- bool pm_intx_masked:1;
- bool pm_runtime_engaged:1;
+ bool pm_intx_masked;
+ bool pm_runtime_engaged;
struct pci_saved_state *pci_saved_state;
struct pci_saved_state *pm_save;
int ioeventfds_nr;
--
2.51.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/2] vfio/mlx5: Fix racy bitfields and tighten struct layout
2026-05-11 22:16 [PATCH v2 0/2] vfio: Fix racy bitfields and tighten struct layout Alex Williamson
2026-05-11 22:16 ` [PATCH v2 1/2] vfio/pci: " Alex Williamson
@ 2026-05-11 22:16 ` Alex Williamson
1 sibling, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2026-05-11 22:16 UTC (permalink / raw)
To: Alex Williamson, kvm
Cc: Alex Williamson, Jason Gunthorpe, Kevin Tian, linux-kernel,
Yishai Hadas, rananta, stable
Bitfield operations are not atomic, they use a read-modify-write
pattern, therefore we should be careful not to pack bitfields that
can be concurrently updated into the same storage unit.
The split fields (is_err and object_changed in mlx5_vhca_page_tracker,
deferred_reset in mlx5vf_pci_core_device) are mutated from contexts
that don't serialize against the other writers in the same storage
unit, so a bitfield RMW could drop an adjacent field's update. The
remaining bitfields are either probe-only or share a single writer
context, so they stay packed.
The page tracker's status field is also relocated to fill the
alignment hole the split exposes.
Fixes: f886473071d6 ("vfio/mlx5: Add support for tracker object change event")
Fixes: 61a2f1460fd0 ("vfio/mlx5: Manage the VF attach/detach callback from the PF")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Alex Williamson <alex.williamson@nvidia.com>
---
drivers/vfio/pci/mlx5/cmd.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index deed0f132f39..b782139eb8be 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -158,14 +158,14 @@ struct mlx5_vhca_qp {
struct mlx5_vhca_page_tracker {
u32 id;
u32 pdn;
- u8 is_err:1;
- u8 object_changed:1;
+ u8 is_err;
+ u8 object_changed;
+ int status;
struct mlx5_uars_page *uar;
struct mlx5_vhca_cq cq;
struct mlx5_vhca_qp *host_qp;
struct mlx5_vhca_qp *fw_qp;
struct mlx5_nb nb;
- int status;
};
struct mlx5vf_pci_core_device {
@@ -173,11 +173,11 @@ struct mlx5vf_pci_core_device {
int vf_id;
u16 vhca_id;
u8 migrate_cap:1;
- u8 deferred_reset:1;
u8 mdev_detach:1;
u8 log_active:1;
u8 chunk_mode:1;
u8 mig_state_cap:1;
+ u8 deferred_reset;
struct completion tracker_comp;
/* protect migration state */
struct mutex state_mutex;
--
2.51.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio/pci: Fix racy bitfields and tighten struct layout
2026-05-11 22:16 ` [PATCH v2 1/2] vfio/pci: " Alex Williamson
@ 2026-05-12 13:17 ` David Laight
2026-05-12 13:26 ` Alex Williamson
2026-05-12 13:18 ` Jason Gunthorpe
1 sibling, 1 reply; 8+ messages in thread
From: David Laight @ 2026-05-12 13:17 UTC (permalink / raw)
To: Alex Williamson
Cc: Alex Williamson, kvm, Jason Gunthorpe, Kevin Tian, linux-kernel,
Yishai Hadas, rananta, stable
On Mon, 11 May 2026 16:16:02 -0600
Alex Williamson <alex.williamson@nvidia.com> wrote:
> Bitfield operations are not atomic, they use a read-modify-write
> pattern, therefore we should be careful not to pack bitfields that
> can be concurrently updated into the same storage unit.
>
> The split fields (virq_disabled, bardirty, pm_intx_masked,
> pm_runtime_engaged, sriov_pwr_active) are mutated post-init from
> contexts that don't serialize against the other writers in the same
> storage unit, so a bitfield RMW could drop an adjacent field's
> update. The remaining bitfields are touched only during probe or
> close where no concurrent writer exists, so they stay packed.
>
> While reordering, place virq_disabled and bardirty earlier to fill
> an existing alignment hole.
>
> Fixes: 9cd0f6d5cbb6 ("vfio/pci: Use bitfield for struct vfio_pci_core_device flags")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Alex Williamson <alex.williamson@nvidia.com>
> ---
> include/linux/vfio_pci_core.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 2ebba746c18f..24e8db5b1c0d 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -101,6 +101,8 @@ struct vfio_pci_core_device {
> const struct vfio_pci_device_ops *pci_ops;
> void __iomem *barmap[PCI_STD_NUM_BARS];
> bool bar_mmap_supported[PCI_STD_NUM_BARS];
> + bool virq_disabled;
> + bool bardirty;
I'd put those two after the :1 fields to avoid an extra hole.
-- David
> u8 *pci_config_map;
> u8 *vconfig;
> struct perm_bits *msi_perm;
> @@ -117,16 +119,14 @@ struct vfio_pci_core_device {
> u32 rbar[7];
> bool has_dyn_msix:1;
> bool pci_2_3:1;
> - bool virq_disabled:1;
> bool reset_works:1;
> bool extended_caps:1;
> - bool bardirty:1;
> bool has_vga:1;
> bool needs_reset:1;
> bool nointx:1;
> bool needs_pm_restore:1;
> - bool pm_intx_masked:1;
> - bool pm_runtime_engaged:1;
> + bool pm_intx_masked;
> + bool pm_runtime_engaged;
> struct pci_saved_state *pci_saved_state;
> struct pci_saved_state *pm_save;
> int ioeventfds_nr;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio/pci: Fix racy bitfields and tighten struct layout
2026-05-11 22:16 ` [PATCH v2 1/2] vfio/pci: " Alex Williamson
2026-05-12 13:17 ` David Laight
@ 2026-05-12 13:18 ` Jason Gunthorpe
2026-05-12 18:23 ` Alex Williamson
1 sibling, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2026-05-12 13:18 UTC (permalink / raw)
To: Alex Williamson
Cc: Alex Williamson, kvm, Kevin Tian, linux-kernel, Yishai Hadas,
rananta, stable
On Mon, May 11, 2026 at 04:16:02PM -0600, Alex Williamson wrote:
> Bitfield operations are not atomic, they use a read-modify-write
> pattern, therefore we should be careful not to pack bitfields that
> can be concurrently updated into the same storage unit.
>
> The split fields (virq_disabled, bardirty, pm_intx_masked,
> pm_runtime_engaged, sriov_pwr_active) are mutated post-init from
> contexts that don't serialize against the other writers in the same
> storage unit, so a bitfield RMW could drop an adjacent field's
> update. The remaining bitfields are touched only during probe or
> close where no concurrent writer exists, so they stay packed.
>
> While reordering, place virq_disabled and bardirty earlier to fill
> an existing alignment hole.
I feel like a comment is needed here for the various bool groupings
'write locked by XX' or something?
Jason
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio/pci: Fix racy bitfields and tighten struct layout
2026-05-12 13:17 ` David Laight
@ 2026-05-12 13:26 ` Alex Williamson
0 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2026-05-12 13:26 UTC (permalink / raw)
To: David Laight, Alex Williamson
Cc: kvm, Jason Gunthorpe, Kevin Tian, linux-kernel, Yishai Hadas,
Raghavendra Rao Ananta, stable
On Tue, May 12, 2026, at 7:17 AM, David Laight wrote:
> On Mon, 11 May 2026 16:16:02 -0600
> Alex Williamson <alex.williamson@nvidia.com> wrote:
>
>> Bitfield operations are not atomic, they use a read-modify-write
>> pattern, therefore we should be careful not to pack bitfields that
>> can be concurrently updated into the same storage unit.
>>
>> The split fields (virq_disabled, bardirty, pm_intx_masked,
>> pm_runtime_engaged, sriov_pwr_active) are mutated post-init from
>> contexts that don't serialize against the other writers in the same
>> storage unit, so a bitfield RMW could drop an adjacent field's
>> update. The remaining bitfields are touched only during probe or
>> close where no concurrent writer exists, so they stay packed.
>>
>> While reordering, place virq_disabled and bardirty earlier to fill
>> an existing alignment hole.
>>
>> Fixes: 9cd0f6d5cbb6 ("vfio/pci: Use bitfield for struct vfio_pci_core_device flags")
>> Cc: stable@vger.kernel.org
>> Assisted-by: Claude:claude-opus-4-7
>> Signed-off-by: Alex Williamson <alex.williamson@nvidia.com>
>> ---
>> include/linux/vfio_pci_core.h | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>> index 2ebba746c18f..24e8db5b1c0d 100644
>> --- a/include/linux/vfio_pci_core.h
>> +++ b/include/linux/vfio_pci_core.h
>> @@ -101,6 +101,8 @@ struct vfio_pci_core_device {
>> const struct vfio_pci_device_ops *pci_ops;
>> void __iomem *barmap[PCI_STD_NUM_BARS];
>> bool bar_mmap_supported[PCI_STD_NUM_BARS];
>> + bool virq_disabled;
>> + bool bardirty;
>
> I'd put those two after the :1 fields to avoid an extra hole.
This actually fills a hole
#define PCI_STD_NUM_BARS 6 /* Number of standard BARs */
6 bytes above, pointers below. Thanks,
Alex
>> u8 *pci_config_map;
>> u8 *vconfig;
>> struct perm_bits *msi_perm;
>> @@ -117,16 +119,14 @@ struct vfio_pci_core_device {
>> u32 rbar[7];
>> bool has_dyn_msix:1;
>> bool pci_2_3:1;
>> - bool virq_disabled:1;
>> bool reset_works:1;
>> bool extended_caps:1;
>> - bool bardirty:1;
>> bool has_vga:1;
>> bool needs_reset:1;
>> bool nointx:1;
>> bool needs_pm_restore:1;
>> - bool pm_intx_masked:1;
>> - bool pm_runtime_engaged:1;
>> + bool pm_intx_masked;
>> + bool pm_runtime_engaged;
>> struct pci_saved_state *pci_saved_state;
>> struct pci_saved_state *pm_save;
>> int ioeventfds_nr;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio/pci: Fix racy bitfields and tighten struct layout
2026-05-12 13:18 ` Jason Gunthorpe
@ 2026-05-12 18:23 ` Alex Williamson
2026-05-13 12:31 ` Jason Gunthorpe
0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2026-05-12 18:23 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Alex Williamson, kvm, Kevin Tian, linux-kernel, Yishai Hadas,
rananta, stable
On Tue, 12 May 2026 10:18:12 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:
> On Mon, May 11, 2026 at 04:16:02PM -0600, Alex Williamson wrote:
> > Bitfield operations are not atomic, they use a read-modify-write
> > pattern, therefore we should be careful not to pack bitfields that
> > can be concurrently updated into the same storage unit.
> >
> > The split fields (virq_disabled, bardirty, pm_intx_masked,
> > pm_runtime_engaged, sriov_pwr_active) are mutated post-init from
> > contexts that don't serialize against the other writers in the same
> > storage unit, so a bitfield RMW could drop an adjacent field's
> > update. The remaining bitfields are touched only during probe or
> > close where no concurrent writer exists, so they stay packed.
> >
> > While reordering, place virq_disabled and bardirty earlier to fill
> > an existing alignment hole.
>
> I feel like a comment is needed here for the various bool groupings
>
> 'write locked by XX' or something?
I can provide that, but there are several ways we can approach this.
As I dig into pm_intx_masked vs pm_runtime_engaged, there's an implicit
pm_runtime_get before pm_runtime_engaged, while pm_intx_masked is only
modified in the .suspend/.resume callbacks. So those cannot actually
race. needs_reset is set on close, which is already serialized, and
also via ioctl, which again does a pm_runtime_get, and indirectly takes
memory_lock, so it seems safe that it could share a storage unit.
OTOH, virq_disabled and bardirty are both modified by config space
writes, and while there's likely serialization in a VM, vfio-pci itself
doesn't provide any.
So in the strictest fix, maybe only virq_disabled and bardirty are
pulled out of the bitfield, but the dependencies are sufficiently
subtle that I wonder if it doesn't make sense to limit bitfield use to
anything serialized by probe/open/close and anything dynamically
updated while the device is opened should use its own storage unit.
The mlx5 patch has similar subtle dependencies, mdev_detach and
log_active are serialized by state_mutex, but deferred_reset is set
with reset_lock.
It's not clear the bit compaction is worth the subtle RMW scenarios.
What do you think, should we reserve bitfields for setup/release-time to
avoid this class of issue or handle these as individual point fixes?
Thanks,
Alex
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio/pci: Fix racy bitfields and tighten struct layout
2026-05-12 18:23 ` Alex Williamson
@ 2026-05-13 12:31 ` Jason Gunthorpe
0 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2026-05-13 12:31 UTC (permalink / raw)
To: Alex Williamson
Cc: Alex Williamson, kvm, Kevin Tian, linux-kernel, Yishai Hadas,
rananta, stable
On Tue, May 12, 2026 at 12:23:55PM -0600, Alex Williamson wrote:
> It's not clear the bit compaction is worth the subtle RMW scenarios.
> What do you think, should we reserve bitfields for setup/release-time to
> avoid this class of issue or handle these as individual point fixes?
I think one patch is fine, just that every group of bitfields should
have a description what the locking rule is to write to
it. 'setup/release only' is a fine rule too
Otherwise the next person to add a bitfield will randomly select a
group and we will be back to this again..
Jason
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-05-13 12:31 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-11 22:16 [PATCH v2 0/2] vfio: Fix racy bitfields and tighten struct layout Alex Williamson
2026-05-11 22:16 ` [PATCH v2 1/2] vfio/pci: " Alex Williamson
2026-05-12 13:17 ` David Laight
2026-05-12 13:26 ` Alex Williamson
2026-05-12 13:18 ` Jason Gunthorpe
2026-05-12 18:23 ` Alex Williamson
2026-05-13 12:31 ` Jason Gunthorpe
2026-05-11 22:16 ` [PATCH v2 2/2] vfio/mlx5: " Alex Williamson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox