Linux-HyperV List

* Re: [PATCH 0/9] drm: Limit DRM_IOCTL_WAIT_VBLANK to vblank interrupts
From: Michel Dänzer @ 2026-05-27 14:04 UTC (permalink / raw)
  To: Thomas Zimmermann, simona, airlied, pekka.paalanen, jadahl,
	contact, maarten.lankhorst, mripard
  Cc: amd-gfx, dri-devel, linux-hyperv, virtualization, spice-devel,
	wayland-devel
In-Reply-To: <4a7c2f87-60fd-458c-a579-a36799b86557@suse.de>

On 5/27/26 15:07, Thomas Zimmermann wrote:
> Am 15.05.26 um 18:56 schrieb Michel Dänzer:
>> On 5/15/26 17:12, Michel Dänzer wrote:
>>> On 5/15/26 13:55, Thomas Zimmermann wrote:
>>>> DRM's WAIT_VBLANK ioctl synchronizes user-space clients to display
>>>> refresh. This is meaningless with vblank timers, which run unrelated
>>>> to the hardware's vblank.
>>>>
>>>> Disable the ioctl for simulated vblanks. Set DRM_VBLANK_FLAG_SIMULATED
>>>> for CRTCs with simulated vblank events in all such drivers. The vblank
>>>> timers of these devices still rate-limit the number of page-flip events
>>>> to match the display refresh.
>>>>
>>>> According to maintainers, user-space compositors do not require the ioctl
>>>> for rate-limiting display output. Weston and KWin rely on page-flip
>>>> events. Mutter uses an internal timer to limit the number of display
>>>> updates per second.
>>> Actually mutter fundamentally relies on atomic commit completion events for that, same as Weston & KWin. Mutter uses the WAIT_VBLANK ioctl only for minimizing input → output latency (which can hide issues when completion of atomic commits isn't properly throttled).
>>>
>>>
>>> (Just a side note on the cover letter, no objections to the patches themselves)
>> After more discussion on IRC, I have some concerns.
>>
>>
>> The big one first: For drivers with no strict refresh cycle (i.e. an atomic commit can take effect more or less anytime after at least one "refresh cycle" has passed since the last one), does this change really make sense / what's the actual benefit?
> 
> I don't have a strong opinion on that matter. I just think we should clarify the meaning of these ioctls.
> 
> Timing page flips is currently not supported on any driver without a hardware vblank IRQ or a vblank timer. The situation might vary among compositors, but there have been plenty of reports of animations and frame rates being either too high or too low.

As discussed on IRC, that's due to insufficient throttling of atomic commits in the kernel, not directly related to the functionality in this series.


> So I think we should roll out vblank timers for all drivers without a hardware vblank IRQ.

I'm not arguing against that. It doesn't necessarily invalidate my concern though.


> Right now, vblank timers act like a vblank IRQ in these ioctls. That's a convenient position for user space.
> 
> But we don't really sync anything with hardware here, so the alternate proposal is to not support them.  This also appears to be the original intention of these ioctls.

Speaking as the creator of the DRM_IOCTL_WAIT_VBLANK ioctl, I'm afraid it's not that simple. It's true that the ioctl was originally only available if backed by corresponding HW & driver support, but that's mostly because vblank timers and the scenarios which motivated them weren't really a thing until much later; it wasn't an explicit intention at the time.

I created it to give user space information about the display refresh cycle timing, and to allow it to synchronize to that. This seems like it could be useful even with vblank timers, at least if this ioctl and atomic commits are properly integrated, similar to how they are with a proper HW display refresh cycle. I'm not sure that is currently the case, though.


> Vblank timers would just limit the internal page-flip rate, but nothing else. That's the more defensive approach.

If those two things aren't properly integrated though, then this might indeed be less bad than the status quo.


-- 
Earthling Michel Dänzer       \        GNOME / Xwayland / Mesa developer
https://redhat.com             \               Libre software enthusiast
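
For illustration only: the reply above describes the ioctl as a way for user space to learn the display refresh cycle timing. The sketch below shows one way a client might estimate that timing from commit completion timestamps alone when the ioctl is unavailable. The function name and the extrapolation rule are assumptions made for this example; this is not how Mutter or any other compositor is known to do it, and on drivers without a strict refresh cycle the result is only an estimate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: predict the start of the next refresh cycle strictly
 * after now_ns by extrapolating from the timestamp of the last commit
 * completion event and the nominal refresh interval. All values are in
 * nanoseconds on the same clock.
 */
static uint64_t predict_next_refresh_ns(uint64_t last_event_ns,
					uint64_t interval_ns,
					uint64_t now_ns)
{
	uint64_t elapsed, cycles;

	assert(interval_ns > 0);

	/* The last event lies in the future: it is the next refresh itself. */
	if (now_ns < last_event_ns)
		return last_event_ns;

	/* Number of whole cycles needed to land strictly after now_ns. */
	elapsed = now_ns - last_event_ns;
	cycles = elapsed / interval_ns + 1;

	return last_event_ns + cycles * interval_ns;
}
```

A client would call this with the timestamp of its most recent completion event and the mode's refresh interval, then schedule its next repaint relative to the returned value.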


* [PATCH v2 8/9] drm/virtgpu: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on virtgpu as simulated, so that the WAIT_VBLANK
ioctl fails with an error. The ioctl should not be supported because
the output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c b/drivers/gpu/drm/virtio/virtgpu_display.c
index 44ffffec550f..558d8001c54f 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -381,7 +381,7 @@ int virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev)
 	for (i = 0 ; i < vgdev->num_scanouts; ++i)
 		vgdev_output_init(vgdev, i);
 
-	ret = drm_vblank_init(vgdev->ddev, vgdev->num_scanouts);
+	ret = drmm_vblank_init(vgdev->ddev, vgdev->num_scanouts, DRM_VBLANK_FLAG_SIMULATED);
 	if (ret)
 		return ret;
 
-- 
2.54.0



* [PATCH v2 7/9] drm/qxl: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on qxl as simulated, so that the WAIT_VBLANK
ioctl fails with an error. The ioctl should not be supported because
the output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/qxl/qxl_display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index a026bd35ef48..b808fdebbd89 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -1300,7 +1300,7 @@ int qxl_modeset_init(struct qxl_device *qdev)
 
 	qxl_display_read_client_monitors_config(qdev);
 
-	ret = drm_vblank_init(&qdev->ddev, qxl_num_crtc);
+	ret = drmm_vblank_init(&qdev->ddev, qxl_num_crtc, DRM_VBLANK_FLAG_SIMULATED);
 	if (ret)
 		return ret;
 
-- 
2.54.0



* [PATCH v2 6/9] drm/hypervdrm: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on hypervdrm as simulated, so that the WAIT_VBLANK
ioctl fails with an error. The ioctl should not be supported because the
output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
index 1bbb7de5ab49..24bed31c35e7 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
@@ -329,7 +329,7 @@ int hyperv_mode_config_init(struct hyperv_drm_device *hv)
 		return ret;
 	}
 
-	ret = drm_vblank_init(dev, 1);
+	ret = drmm_vblank_init(dev, 1, DRM_VBLANK_FLAG_SIMULATED);
 	if (ret)
 		return ret;
 
-- 
2.54.0



* [PATCH v2 9/9] drm/vkms: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on vkms as simulated, so that the WAIT_VBLANK
ioctl fails with an error. The ioctl should not be supported because
the output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/vkms/vkms_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index 5a640b531d88..c4cfa1e5ab01 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -192,8 +192,8 @@ int vkms_create(struct vkms_config *config)
 		goto out_devres;
 	}
 
-	ret = drm_vblank_init(&vkms_device->drm,
-			      vkms_config_get_num_crtcs(config));
+	ret = drmm_vblank_init(&vkms_device->drm, vkms_config_get_num_crtcs(config),
+			       DRM_VBLANK_FLAG_SIMULATED);
 	if (ret) {
 		DRM_ERROR("Failed to vblank\n");
 		goto out_devres;
-- 
2.54.0



* [PATCH v2 5/9] drm/cirrus: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on cirrus as simulated, so that the WAIT_VBLANK
ioctl fails with an error. The ioctl should not be supported because
the output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/tiny/cirrus-qemu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tiny/cirrus-qemu.c b/drivers/gpu/drm/tiny/cirrus-qemu.c
index 075221b431d3..01522d1158b2 100644
--- a/drivers/gpu/drm/tiny/cirrus-qemu.c
+++ b/drivers/gpu/drm/tiny/cirrus-qemu.c
@@ -501,7 +501,7 @@ static int cirrus_pipe_init(struct cirrus_device *cirrus)
 	if (ret)
 		return ret;
 
-	ret = drm_vblank_init(dev, 1);
+	ret = drmm_vblank_init(dev, 1, DRM_VBLANK_FLAG_SIMULATED);
 	if (ret)
 		return ret;
 
-- 
2.54.0



* [PATCH v2 3/9] drm/amdgpu: vkms: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on amdgpu's vkms as simulated, so that the
WAIT_VBLANK ioctl fails with an error. The ioctl should not be
supported because the output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
index 170adaf7e76a..bc88acc819a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
@@ -413,7 +413,8 @@ static int amdgpu_vkms_sw_init(struct amdgpu_ip_block *ip_block)
 			return r;
 	}
 
-	r = drm_vblank_init(adev_to_drm(adev), adev->mode_info.num_crtc);
+	r = drmm_vblank_init(adev_to_drm(adev), adev->mode_info.num_crtc,
+			     DRM_VBLANK_FLAG_SIMULATED);
 	if (r)
 		return r;
 
-- 
2.54.0



* [PATCH v2 4/9] drm/bochs: Set DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Mark the vblank event on bochs as simulated, so that the WAIT_VBLANK
ioctl fails with an error. The ioctl should not be supported because
the output is not synchronized to a display refresh.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/tiny/bochs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c
index e2d957e51505..b5955ef39e31 100644
--- a/drivers/gpu/drm/tiny/bochs.c
+++ b/drivers/gpu/drm/tiny/bochs.c
@@ -677,7 +677,7 @@ static int bochs_kms_init(struct bochs_device *bochs)
 	drm_connector_attach_edid_property(connector);
 	drm_connector_attach_encoder(connector, encoder);
 
-	ret = drm_vblank_init(dev, 1);
+	ret = drmm_vblank_init(dev, 1, DRM_VBLANK_FLAG_SIMULATED);
 	if (ret)
 		return ret;
 
-- 
2.54.0



* [PATCH v2 2/9] drm/vblank: Add DRM_VBLANK_FLAG_SIMULATED
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Add DRM_VBLANK_FLAG_SIMULATED for CRTCs that do not have a hardware
vblank interrupt. Setting the flag tells DRM to not report vblank
capabilities from the WAIT_VBLANK ioctl.

DRM_IOCTL_WAIT_VBLANK queries timestamps from a vblank event or waits
for the next vblank event to occur. DRM clients use this functionality
to synchronize their output with the display's vblank phase. Hence this
is only supported for hardware implementations.

Software implementations are not synchronized to the display and merely
act as a rate limiter for page-flip events. The WAIT_VBLANK,
CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls thus fail with
-EOPNOTSUPP for such CRTCs.

v2:
- add filter in CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls (Michel)

Suggested-by: Simona Vetter <simona@ffwll.ch>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/drm_vblank.c | 10 ++++++++++
 include/drm/drm_vblank.h     |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 21ca91b4c014..3ef4ed08e7f6 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -1794,6 +1794,9 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void *data,
 
 	vblank = drm_vblank_crtc(dev, pipe);
 
+	if (vblank->flags & DRM_VBLANK_FLAG_SIMULATED)
+		return -EOPNOTSUPP;
+
 	/* If the counter is currently enabled and accurate, short-circuit
 	 * queries to return the cached timestamp of the last vblank.
 	 */
@@ -2035,6 +2038,10 @@ int drm_crtc_get_sequence_ioctl(struct drm_device *dev, void *data,
 	pipe = drm_crtc_index(crtc);
 
 	vblank = drm_crtc_vblank_crtc(crtc);
+
+	if (vblank->flags & DRM_VBLANK_FLAG_SIMULATED)
+		return -EOPNOTSUPP;
+
 	vblank_enabled = READ_ONCE(vblank->config.disable_immediate) &&
 		READ_ONCE(vblank->enabled);
 
@@ -2102,6 +2109,9 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data,
 
 	vblank = drm_crtc_vblank_crtc(crtc);
 
+	if (vblank->flags & DRM_VBLANK_FLAG_SIMULATED)
+		return -EOPNOTSUPP;
+
 	e = kzalloc_obj(*e);
 	if (e == NULL)
 		return -ENOMEM;
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index 39a201b83781..03fa7259b6ac 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -37,6 +37,11 @@ struct drm_device;
 struct drm_crtc;
 struct drm_vblank_work;
 
+/**
+ * DRM_VBLANK_FLAG_SIMULATED - vblank uses a software timer
+ */
+#define DRM_VBLANK_FLAG_SIMULATED	BIT(1)
+
 /**
  * struct drm_pending_vblank_event - pending vblank event tracking
  */
-- 
2.54.0
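
For illustration only: the three hunks above add the same early-return check to each ioctl handler. The user-space model below restates that check outside the kernel. The struct and function names are hypothetical stand-ins, not kernel API; only the flag value, BIT(1), and the -EOPNOTSUPP return are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same value as DRM_VBLANK_FLAG_SIMULATED in the patch: BIT(1). */
#define MODEL_VBLANK_FLAG_SIMULATED (1u << 1)

/* Hypothetical stand-in for struct drm_vblank_crtc; only the field used here. */
struct model_vblank_crtc {
	unsigned int flags;
};

/*
 * Hypothetical model of the entry check shared by the three ioctl handlers.
 * Returns 0 if the request may proceed, or -EOPNOTSUPP for a simulated vblank.
 */
static int model_vblank_ioctl_check(const struct model_vblank_crtc *vblank)
{
	assert(vblank != NULL);

	if (vblank->flags & MODEL_VBLANK_FLAG_SIMULATED)
		return -EOPNOTSUPP;

	return 0;
}
```

In this model, a CRTC initialized with flags of 0 passes the check, while one initialized with the simulated flag is rejected before any wait or sequence query is attempted.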



* [PATCH v2 1/9] drm/vblank: Add drmm_vblank_init() to indicate managed cleanup
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann
In-Reply-To: <20260527133917.207150-1-tzimmermann@suse.de>

Rename drm_vblank_init() to drmm_vblank_init(). As the initializer
function sets up managed cleanup, it should use the drmm prefix. Keep
the old name around until all callers have been converted.

Also add a flags argument to the function. The first use of the flags
will be to distinguish between hardware vblank interrupts and simulated
vblank timeouts.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/drm_vblank.c        | 16 +++++++++-------
 drivers/gpu/drm/drm_vblank_helper.c |  2 +-
 include/drm/drm_crtc.h              |  2 +-
 include/drm/drm_device.h            |  2 +-
 include/drm/drm_vblank.h            | 10 +++++++++-
 5 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index f90fb2d13e42..21ca91b4c014 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -117,7 +117,7 @@
  * optionally provide a hardware vertical blanking counter.
  *
  * Drivers must initialize the vertical blanking handling core with a call to
- * drm_vblank_init(). Minimally, a driver needs to implement
+ * drmm_vblank_init(). Minimally, a driver needs to implement
  * &drm_crtc_funcs.enable_vblank and &drm_crtc_funcs.disable_vblank plus call
  * drm_crtc_handle_vblank() in its vblank interrupt handler for working vblank
  * support.
@@ -146,7 +146,7 @@
  * See also DRM vblank helpers for more information.
  *
  * Drivers without support for vertical-blanking interrupts nor timers must
- * not call drm_vblank_init(). For these drivers, atomic helpers will
+ * not call drmm_vblank_init(). For these drivers, atomic helpers will
  * automatically generate fake vblank events as part of the display update.
  * This functionality also can be controlled by the driver by enabling and
  * disabling struct drm_crtc_state.no_vblank.
@@ -519,7 +519,7 @@ static void vblank_disable_fn(struct timer_list *t)
 	spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
 }
 
-static void drm_vblank_init_release(struct drm_device *dev, void *ptr)
+static void drmm_vblank_init_release(struct drm_device *dev, void *ptr)
 {
 	struct drm_vblank_crtc *vblank = ptr;
 
@@ -534,9 +534,10 @@ static void drm_vblank_init_release(struct drm_device *dev, void *ptr)
 }
 
 /**
- * drm_vblank_init - initialize vblank support
+ * drmm_vblank_init - initialize vblank support
  * @dev: DRM device
  * @num_crtcs: number of CRTCs supported by @dev
+ * @flags: flags for vblank handling
  *
  * This function initializes vblank support for @num_crtcs display pipelines.
  * Cleanup is handled automatically through a cleanup function added with
@@ -545,7 +546,7 @@ static void drm_vblank_init_release(struct drm_device *dev, void *ptr)
  * Returns:
  * Zero on success or a negative error code on failure.
  */
-int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
+int drmm_vblank_init(struct drm_device *dev, unsigned int num_crtcs, unsigned int flags)
 {
 	int ret;
 	unsigned int i;
@@ -564,11 +565,12 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
 
 		vblank->dev = dev;
 		vblank->pipe = i;
+		vblank->flags = flags;
 		init_waitqueue_head(&vblank->queue);
 		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
 		seqlock_init(&vblank->seqlock);
 
-		ret = drmm_add_action_or_reset(dev, drm_vblank_init_release,
+		ret = drmm_add_action_or_reset(dev, drmm_vblank_init_release,
 					       vblank);
 		if (ret)
 			return ret;
@@ -580,7 +582,7 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
 
 	return 0;
 }
-EXPORT_SYMBOL(drm_vblank_init);
+EXPORT_SYMBOL(drmm_vblank_init);
 
 /**
  * drm_dev_has_vblank - test if vblanking has been initialized for
diff --git a/drivers/gpu/drm/drm_vblank_helper.c b/drivers/gpu/drm/drm_vblank_helper.c
index d3f8147ecdc1..5b05ab72e133 100644
--- a/drivers/gpu/drm/drm_vblank_helper.c
+++ b/drivers/gpu/drm/drm_vblank_helper.c
@@ -25,7 +25,7 @@
  * for drivers without further requirements. The initializer macro
  * DRM_CRTC_HELPER_VBLANK_FUNCS sets them coveniently.
  *
- * Once the driver enables vblank support with drm_vblank_init(), each
+ * Once the driver enables vblank support with drmm_vblank_init(), each
  * CRTC's vblank timer fires according to the programmed display mode. By
  * default, the vblank timer invokes drm_crtc_handle_vblank(). Drivers with
  * more specific requirements can set their own handler function in
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index c6dbe8b7db9e..f981468d9a00 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -163,7 +163,7 @@ struct drm_crtc_state {
 	 *
 	 * One usage is for drivers and/or hardware without support for VBLANK
 	 * interrupts. Such drivers typically do not initialize vblanking
-	 * (i.e., call drm_vblank_init() with the number of CRTCs). For CRTCs
+	 * (i.e., call drmm_vblank_init() with the number of CRTCs). For CRTCs
 	 * without initialized vblanking, this field is set to true in
 	 * drm_atomic_helper_check_modeset(), and a fake VBLANK event will be
 	 * send out on each update of the display pipeline by
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 768a8dae83c5..742996576313 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -283,7 +283,7 @@ struct drm_device {
 	 * Array of vblank tracking structures, one per &struct drm_crtc. For
 	 * historical reasons (vblank support predates kernel modesetting) this
 	 * is free-standing and not part of &struct drm_crtc itself. It must be
-	 * initialized explicitly by calling drm_vblank_init().
+	 * initialized explicitly by calling drmm_vblank_init().
 	 */
 	struct drm_vblank_crtc *vblank;
 
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index 2fcef9c0f5b1..39a201b83781 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -282,10 +282,12 @@ struct drm_vblank_crtc {
 	 * @vblank_timer: Holds the state of the vblank timer
 	 */
 	struct drm_vblank_crtc_timer vblank_timer;
+
+	unsigned int flags;
 };
 
 struct drm_vblank_crtc *drm_crtc_vblank_crtc(struct drm_crtc *crtc);
-int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
+int drmm_vblank_init(struct drm_device *dev, unsigned int num_crtcs, unsigned int flags);
 bool drm_dev_has_vblank(const struct drm_device *dev);
 u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
 u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
@@ -321,6 +323,12 @@ int drm_crtc_vblank_start_timer(struct drm_crtc *crtc);
 void drm_crtc_vblank_cancel_timer(struct drm_crtc *crtc);
 void drm_crtc_vblank_get_vblank_timeout(struct drm_crtc *crtc, ktime_t *vblank_time);
 
+/* deprecated; use drmm_vblank_init() instead */
+static inline int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
+{
+	return drmm_vblank_init(dev, num_crtcs, 0);
+}
+
 /*
  * Helpers for struct drm_crtc_funcs
  */
-- 
2.54.0
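
For illustration only: the patch keeps the old function name as an inline wrapper that forwards to the renamed function with flags set to 0, so unconverted callers keep working. The user-space model below shows that pattern in isolation. The struct and the model_ prefixed names are hypothetical, and the model stores the flags per device for brevity, whereas the patch stores them per struct drm_vblank_crtc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_device; not the kernel type. */
struct model_device {
	unsigned int num_crtcs;
	unsigned int vblank_flags;
};

/* New-style initializer: takes an explicit flags argument. */
static int model_drmm_vblank_init(struct model_device *dev,
				  unsigned int num_crtcs, unsigned int flags)
{
	assert(dev != NULL);

	dev->num_crtcs = num_crtcs;
	dev->vblank_flags = flags;
	return 0;
}

/* Old name kept as a thin wrapper; unconverted callers get flags == 0. */
static inline int model_drm_vblank_init(struct model_device *dev,
					unsigned int num_crtcs)
{
	return model_drmm_vblank_init(dev, num_crtcs, 0);
}
```

Callers of the old name therefore behave as before, and only drivers that switch to the new name can opt in to a flag.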



* [PATCH v2 0/9] drm: Limit DRM_IOCTL_WAIT_VBLANK to vblank interrupts
From: Thomas Zimmermann @ 2026-05-27 13:32 UTC (permalink / raw)
  To: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux
  Cc: amd-gfx, dri-devel, wayland-devel, linux-hyperv, virtualization,
	spice-devel, Thomas Zimmermann

DRM's WAIT_VBLANK ioctl synchronizes user-space clients to display
refresh. This is meaningless with vblank timers, which run unrelated
to the hardware's vblank.

Disable the ioctl for simulated vblanks. Set DRM_VBLANK_FLAG_SIMULATED
for CRTCs with simulated vblank events in all such drivers. The vblank
timers of these devices still rate-limit the number of page-flip events
to match the display refresh.

According to maintainers, user-space compositors do not require the ioctl
for rate-limiting display output. Weston, KWin and Mutter rely on completion
events. Mutter optionally uses the WAIT_VBLANK ioctl only to optimize the
time from input to output.

When testing with mutter and weston, the page-flip rate appears correct
with the patch set applied.

This change has been discussed at length on IRC recently.

https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-08&show_html=true
https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-12&show_html=true
https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-13&show_html=true
https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-15&show_html=true

v2:
- add filter to CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls (Michel)
- clarify Mutter's behavior in cover letter (Michel)

Thomas Zimmermann (9):
  drm/vblank: Add drmm_vblank_init() to indicate managed cleanup
  drm/vblank: Add DRM_VBLANK_FLAG_SIMULATED
  drm/amdgpu: vkms: Set DRM_VBLANK_FLAG_SIMULATED
  drm/bochs: Set DRM_VBLANK_FLAG_SIMULATED
  drm/cirrus: Set DRM_VBLANK_FLAG_SIMULATED
  drm/hypervdrm: Set DRM_VBLANK_FLAG_SIMULATED
  drm/qxl: Set DRM_VBLANK_FLAG_SIMULATED
  drm/virtgpu: Set DRM_VBLANK_FLAG_SIMULATED
  drm/vkms: Set DRM_VBLANK_FLAG_SIMULATED

 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c    |  3 ++-
 drivers/gpu/drm/drm_vblank.c                | 26 +++++++++++++++------
 drivers/gpu/drm/drm_vblank_helper.c         |  2 +-
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c |  2 +-
 drivers/gpu/drm/qxl/qxl_display.c           |  2 +-
 drivers/gpu/drm/tiny/bochs.c                |  2 +-
 drivers/gpu/drm/tiny/cirrus-qemu.c          |  2 +-
 drivers/gpu/drm/virtio/virtgpu_display.c    |  2 +-
 drivers/gpu/drm/vkms/vkms_drv.c             |  4 ++--
 include/drm/drm_crtc.h                      |  2 +-
 include/drm/drm_device.h                    |  2 +-
 include/drm/drm_vblank.h                    | 15 +++++++++++-
 12 files changed, 45 insertions(+), 19 deletions(-)


base-commit: 5fb5a9a63cf5ece68e0eeb6fa397da27712bccf0
-- 
2.54.0



* Re: [PATCH 0/9] drm: Limit DRM_IOCTL_WAIT_VBLANK to vblank interrupts
From: Thomas Zimmermann @ 2026-05-27 13:07 UTC (permalink / raw)
  To: Michel Dänzer, simona, airlied, pekka.paalanen, jadahl,
	contact, maarten.lankhorst, mripard
  Cc: amd-gfx, dri-devel, linux-hyperv, virtualization, spice-devel,
	wayland-devel
In-Reply-To: <cb461424-e4c1-4d2c-934b-ffd7374e2a56@mailbox.org>

Hi

Am 15.05.26 um 18:56 schrieb Michel Dänzer:
> [ Adding the wayland-devel list for awareness ]
>
> On 5/15/26 17:12, Michel Dänzer wrote:
>> On 5/15/26 13:55, Thomas Zimmermann wrote:
>>> DRM's WAIT_VBLANK ioctl synchronizes user-space clients to display
>>> refresh. This is meaningless with vblank timers, which run unrelated
>>> to the hardware's vblank.
>>>
>>> Disable the ioctl for simulated vblanks. Set DRM_VBLANK_FLAG_SIMULATED
>>> for CRTCs with simulated vblank events in all such drivers. The vblank
>>> timers of these devices still rate-limit the number of page-flip events
>>> to match the display refresh.
>>>
>>> According to maintainers, user-space compositors do not require the ioctl
>>> for rate-limiting display output. Weston and KWin rely on page-flip
>>> events. Mutter uses an internal timer to limit the number of display
>>> updates per second.
>> Actually mutter fundamentally relies on atomic commit completion events for that, same as Weston & KWin. Mutter uses the WAIT_VBLANK ioctl only for minimizing input → output latency (which can hide issues when completion of atomic commits isn't properly throttled).
>>
>>
>> (Just a side note on the cover letter, no objections to the patches themselves)
> After more discussion on IRC, I have some concerns.
>
>
> The big one first: For drivers with no strict refresh cycle (i.e. an atomic commit can take effect more or less anytime after at least one "refresh cycle" has passed since the last one), does this change really make sense / what's the actual benefit?

I don't have a strong opinion on that matter. I just think we should 
clarify the meaning of these ioctls.

Timing page flips is currently not supported on any driver without a
hardware vblank IRQ or a vblank timer. The situation might vary among
compositors, but there have been plenty of reports of animations and
frame rates being either too high or too low. So I think we should
roll out vblank timers for all drivers without a hardware vblank IRQ.

Right now, vblank timers act like a vblank IRQ in these ioctls. That's a 
convenient position for user space.

But we don't really sync anything with hardware here, so the alternate 
proposal is to not support them.  This also appears to be the original 
intention of these ioctls.  Vblank timers would just limit the internal 
page-flip rate, but nothing else. That's the more defensive approach. 
I'd prefer this, as it does not give out guarantees to user space that 
future drivers might not be able to fulfill.




>
> In the case of the asahi & nvidia drivers, the problem with exposing this functionality to user space is that if the timestamps aren't accurate, it can result in missing display refresh cycles, which are dictated by hardware. That's why it makes sense to reject it there.
>
> When there's no strict refresh cycle, that issue doesn't apply though.
>
>
> Any changes made to the WAIT_VBLANK ioctl should also be made to the CRTC_GET_SEQUENCE / CRTC_QUEUE_SEQUENCE ioctls, which are slightly different UAPI for the same functionality.

I'll send out an update with that fixed in a bit.

Best regards
Thomas


>
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)




* Re: [PATCH 1/1] drm/hyperv: Replace "hyperv_" with "hvdrm_" as symbol name prefix
From: Hamza Mahfooz @ 2026-05-27 13:04 UTC (permalink / raw)
  To: mhklinux
  Cc: maarten.lankhorst, mripard, tzimmermann, airlied, simona, decui,
	longli, ssengar, dri-devel, linux-kernel, linux-hyperv
In-Reply-To: <20260526205239.1509-1-mhklkml@zohomail.com>

On Tue, May 26, 2026 at 01:52:39PM -0700, Michael Kelley wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Function and structure names in the Hyper-V DRM driver currently
> use "hyperv_" as the prefix. This conflicts with usage in core Hyper-V
> and VMBus code, and incorrectly implies that functions and structures
> in this driver apply generically to Hyper-V. A specific conflict arises
> for "hyperv_init", which is an initcall for generic Hyper-V
> initialization on arm64. The conflict prevents the use of
> initcall_blacklist on the kernel boot line to skip loading this driver.
> 
> Fix this by substituting "hvdrm_" as the prefix for all functions and

I would personally prefer "hv_drm_", since it seems clearer.

> structures in this driver. This prefix marries the existing "hv" prefix
> for Hyper-V related code with "drm" to indicate this driver.
> 
> The changes are all mechanical text substitution in symbol names.
> There are no other code or functional changes.
> 
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> This patch is built against linux-next20260526.
> 
>  drivers/gpu/drm/hyperv/hyperv_drm.h         |  20 ++--
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c     |  88 ++++++++--------
>  drivers/gpu/drm/hyperv/hyperv_drm_modeset.c | 110 ++++++++++----------
>  drivers/gpu/drm/hyperv/hyperv_drm_proto.c   |  70 ++++++-------
>  4 files changed, 144 insertions(+), 144 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm.h b/drivers/gpu/drm/hyperv/hyperv_drm.h
> index 9e776112c03e..66bd8730aad2 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm.h
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm.h
> @@ -8,7 +8,7 @@
>  
>  #define VMBUS_MAX_PACKET_SIZE 0x4000
>  
> -struct hyperv_drm_device {
> +struct hvdrm_drm_device {

"hvdrm_drm_device" looks kinda redundant, perhaps
s/hyperv_drm_device/hv_drm_device/ would be more sensible.

Hamza
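
For context on the initcall_blacklist conflict described in the quoted
commit message: the kernel matches blacklisted initcalls by function
name, so two initcalls that share a name cannot be told apart. The
user-space sketch below only models that name comparison. The names
struct initcall_entry, is_blacklisted() and count_skipped() are
hypothetical and are not kernel APIs, and "hv_drm_init" assumes the
prefix suggested above were also applied to the module init function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an initcall: a symbol name plus a label. */
struct initcall_entry {
	const char *symbol;	/* function name as resolved from the symbol table */
	const char *owner;	/* illustrative label only */
};

/* Before the rename: core Hyper-V code and the DRM driver share a name. */
static const struct initcall_entry before_rename[] = {
	{ "hyperv_init", "arm64 core Hyper-V" },
	{ "hyperv_init", "Hyper-V DRM driver" },
};

/* After the rename (assumed prefix): the two names are distinct. */
static const struct initcall_entry after_rename[] = {
	{ "hyperv_init", "arm64 core Hyper-V" },
	{ "hv_drm_init", "Hyper-V DRM driver" },
};

/* True when the symbol matches the blacklisted name exactly. */
static bool is_blacklisted(const char *symbol, const char *blacklist)
{
	return strcmp(symbol, blacklist) == 0;
}

/* Count how many entries a single blacklisted name would skip. */
static int count_skipped(const struct initcall_entry *calls, size_t n,
			 const char *blacklist)
{
	int skipped = 0;

	for (size_t i = 0; i < n; i++)
		if (is_blacklisted(calls[i].symbol, blacklist))
			skipped++;

	return skipped;
}
```

With before_rename, blacklisting "hyperv_init" skips both entries, so
the DRM driver cannot be skipped on its own. With after_rename,
blacklisting "hv_drm_init" skips only the DRM driver entry.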

>  	/* drm */
>  	struct drm_device dev;
>  	struct drm_plane plane;
> @@ -39,17 +39,17 @@ struct hyperv_drm_device {
>  	struct hv_device *hdev;
>  };
>  
> -#define to_hv(_dev) container_of(_dev, struct hyperv_drm_device, dev)
> +#define to_hv(_dev) container_of(_dev, struct hvdrm_drm_device, dev)
>  
> -/* hyperv_drm_modeset */
> -int hyperv_mode_config_init(struct hyperv_drm_device *hv);
> +/* hvdrm_drm_modeset */
> +int hvdrm_mode_config_init(struct hvdrm_drm_device *hv);
>  
> -/* hyperv_drm_proto */
> -int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp);
> -int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
> +/* hvdrm_drm_proto */
> +int hvdrm_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp);
> +int hvdrm_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
>  			    u32 w, u32 h, u32 pitch);
> -int hyperv_hide_hw_ptr(struct hv_device *hdev);
> -int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect);
> -int hyperv_connect_vsp(struct hv_device *hdev);
> +int hvdrm_hide_hw_ptr(struct hv_device *hdev);
> +int hvdrm_update_dirt(struct hv_device *hdev, struct drm_rect *rect);
> +int hvdrm_connect_vsp(struct hv_device *hdev);
>  
>  #endif
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index b6bf6412ae34..a4456ccf340e 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -26,7 +26,7 @@
>  
>  DEFINE_DRM_GEM_FOPS(hv_fops);
>  
> -static struct drm_driver hyperv_driver = {
> +static struct drm_driver hvdrm_driver = {
>  	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC,
>  
>  	.name		 = DRIVER_NAME,
> @@ -39,17 +39,17 @@ static struct drm_driver hyperv_driver = {
>  	DRM_FBDEV_SHMEM_DRIVER_OPS,
>  };
>  
> -static int hyperv_pci_probe(struct pci_dev *pdev,
> +static int hvdrm_pci_probe(struct pci_dev *pdev,
>  			    const struct pci_device_id *ent)
>  {
>  	return 0;
>  }
>  
> -static void hyperv_pci_remove(struct pci_dev *pdev)
> +static void hvdrm_pci_remove(struct pci_dev *pdev)
>  {
>  }
>  
> -static const struct pci_device_id hyperv_pci_tbl[] = {
> +static const struct pci_device_id hvdrm_pci_tbl[] = {
>  	{
>  		.vendor = PCI_VENDOR_ID_MICROSOFT,
>  		.device = PCI_DEVICE_ID_HYPERV_VIDEO,
> @@ -60,14 +60,14 @@ static const struct pci_device_id hyperv_pci_tbl[] = {
>  /*
>   * PCI stub to support gen1 VM.
>   */
> -static struct pci_driver hyperv_pci_driver = {
> +static struct pci_driver hvdrm_pci_driver = {
>  	.name =		KBUILD_MODNAME,
> -	.id_table =	hyperv_pci_tbl,
> -	.probe =	hyperv_pci_probe,
> -	.remove =	hyperv_pci_remove,
> +	.id_table =	hvdrm_pci_tbl,
> +	.probe =	hvdrm_pci_probe,
> +	.remove =	hvdrm_pci_remove,
>  };
>  
> -static int hyperv_setup_vram(struct hyperv_drm_device *hv,
> +static int hvdrm_setup_vram(struct hvdrm_drm_device *hv,
>  			     struct hv_device *hdev)
>  {
>  	struct drm_device *dev = &hv->dev;
> @@ -102,15 +102,15 @@ static int hyperv_setup_vram(struct hyperv_drm_device *hv,
>  	return ret;
>  }
>  
> -static int hyperv_vmbus_probe(struct hv_device *hdev,
> +static int hvdrm_vmbus_probe(struct hv_device *hdev,
>  			      const struct hv_vmbus_device_id *dev_id)
>  {
> -	struct hyperv_drm_device *hv;
> +	struct hvdrm_drm_device *hv;
>  	struct drm_device *dev;
>  	int ret;
>  
> -	hv = devm_drm_dev_alloc(&hdev->device, &hyperv_driver,
> -				struct hyperv_drm_device, dev);
> +	hv = devm_drm_dev_alloc(&hdev->device, &hvdrm_driver,
> +				struct hvdrm_drm_device, dev);
>  	if (IS_ERR(hv))
>  		return PTR_ERR(hv);
>  
> @@ -119,15 +119,15 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
>  	hv_set_drvdata(hdev, hv);
>  	hv->hdev = hdev;
>  
> -	ret = hyperv_connect_vsp(hdev);
> +	ret = hvdrm_connect_vsp(hdev);
>  	if (ret) {
>  		drm_err(dev, "Failed to connect to vmbus.\n");
>  		goto err_hv_set_drv_data;
>  	}
>  
> -	aperture_remove_all_conflicting_devices(hyperv_driver.name);
> +	aperture_remove_all_conflicting_devices(hvdrm_driver.name);
>  
> -	ret = hyperv_setup_vram(hv, hdev);
> +	ret = hvdrm_setup_vram(hv, hdev);
>  	if (ret)
>  		goto err_vmbus_close;
>  
> @@ -136,11 +136,11 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
>  	 * vram location is not fatal. Device will update dirty area till
>  	 * preferred resolution only.
>  	 */
> -	ret = hyperv_update_vram_location(hdev, hv->fb_base);
> +	ret = hvdrm_update_vram_location(hdev, hv->fb_base);
>  	if (ret)
>  		drm_warn(dev, "Failed to update vram location.\n");
>  
> -	ret = hyperv_mode_config_init(hv);
> +	ret = hvdrm_mode_config_init(hv);
>  	if (ret)
>  		goto err_free_mmio;
>  
> @@ -168,10 +168,10 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
>  	return ret;
>  }
>  
> -static void hyperv_vmbus_remove(struct hv_device *hdev)
> +static void hvdrm_vmbus_remove(struct hv_device *hdev)
>  {
>  	struct drm_device *dev = hv_get_drvdata(hdev);
> -	struct hyperv_drm_device *hv = to_hv(dev);
> +	struct hvdrm_drm_device *hv = to_hv(dev);
>  
>  	vmbus_set_skip_unload(false);
>  	drm_dev_unplug(dev);
> @@ -183,12 +183,12 @@ static void hyperv_vmbus_remove(struct hv_device *hdev)
>  	vmbus_free_mmio(hv->mem->start, hv->fb_size);
>  }
>  
> -static void hyperv_vmbus_shutdown(struct hv_device *hdev)
> +static void hvdrm_vmbus_shutdown(struct hv_device *hdev)
>  {
>  	drm_atomic_helper_shutdown(hv_get_drvdata(hdev));
>  }
>  
> -static int hyperv_vmbus_suspend(struct hv_device *hdev)
> +static int hvdrm_vmbus_suspend(struct hv_device *hdev)
>  {
>  	struct drm_device *dev = hv_get_drvdata(hdev);
>  	int ret;
> @@ -202,67 +202,67 @@ static int hyperv_vmbus_suspend(struct hv_device *hdev)
>  	return 0;
>  }
>  
> -static int hyperv_vmbus_resume(struct hv_device *hdev)
> +static int hvdrm_vmbus_resume(struct hv_device *hdev)
>  {
>  	struct drm_device *dev = hv_get_drvdata(hdev);
> -	struct hyperv_drm_device *hv = to_hv(dev);
> +	struct hvdrm_drm_device *hv = to_hv(dev);
>  	int ret;
>  
> -	ret = hyperv_connect_vsp(hdev);
> +	ret = hvdrm_connect_vsp(hdev);
>  	if (ret)
>  		return ret;
>  
> -	ret = hyperv_update_vram_location(hdev, hv->fb_base);
> +	ret = hvdrm_update_vram_location(hdev, hv->fb_base);
>  	if (ret)
>  		return ret;
>  
>  	return drm_mode_config_helper_resume(dev);
>  }
>  
> -static const struct hv_vmbus_device_id hyperv_vmbus_tbl[] = {
> +static const struct hv_vmbus_device_id hvdrm_vmbus_tbl[] = {
>  	/* Synthetic Video Device GUID */
>  	{HV_SYNTHVID_GUID},
>  	{}
>  };
>  
> -static struct hv_driver hyperv_hv_driver = {
> +static struct hv_driver hvdrm_hv_driver = {
>  	.name = KBUILD_MODNAME,
> -	.id_table = hyperv_vmbus_tbl,
> -	.probe = hyperv_vmbus_probe,
> -	.remove = hyperv_vmbus_remove,
> -	.shutdown = hyperv_vmbus_shutdown,
> -	.suspend = hyperv_vmbus_suspend,
> -	.resume = hyperv_vmbus_resume,
> +	.id_table = hvdrm_vmbus_tbl,
> +	.probe = hvdrm_vmbus_probe,
> +	.remove = hvdrm_vmbus_remove,
> +	.shutdown = hvdrm_vmbus_shutdown,
> +	.suspend = hvdrm_vmbus_suspend,
> +	.resume = hvdrm_vmbus_resume,
>  	.driver = {
>  		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
>  	},
>  };
>  
> -static int __init hyperv_init(void)
> +static int __init hvdrm_init(void)
>  {
>  	int ret;
>  
>  	if (drm_firmware_drivers_only())
>  		return -ENODEV;
>  
> -	ret = pci_register_driver(&hyperv_pci_driver);
> +	ret = pci_register_driver(&hvdrm_pci_driver);
>  	if (ret != 0)
>  		return ret;
>  
> -	return vmbus_driver_register(&hyperv_hv_driver);
> +	return vmbus_driver_register(&hvdrm_hv_driver);
>  }
>  
> -static void __exit hyperv_exit(void)
> +static void __exit hvdrm_exit(void)
>  {
> -	vmbus_driver_unregister(&hyperv_hv_driver);
> -	pci_unregister_driver(&hyperv_pci_driver);
> +	vmbus_driver_unregister(&hvdrm_hv_driver);
> +	pci_unregister_driver(&hvdrm_pci_driver);
>  }
>  
> -module_init(hyperv_init);
> -module_exit(hyperv_exit);
> +module_init(hvdrm_init);
> +module_exit(hvdrm_exit);
>  
> -MODULE_DEVICE_TABLE(pci, hyperv_pci_tbl);
> -MODULE_DEVICE_TABLE(vmbus, hyperv_vmbus_tbl);
> +MODULE_DEVICE_TABLE(pci, hvdrm_pci_tbl);
> +MODULE_DEVICE_TABLE(vmbus, hvdrm_vmbus_tbl);
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Deepak Rawat <drawat.floss@gmail.com>");
>  MODULE_DESCRIPTION("DRM driver for Hyper-V synthetic video device");
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
> index 793dbbf61893..6844d085e709 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
> @@ -25,11 +25,11 @@
>  
>  #include "hyperv_drm.h"
>  
> -static int hyperv_blit_to_vram_rect(struct drm_framebuffer *fb,
> +static int hvdrm_blit_to_vram_rect(struct drm_framebuffer *fb,
>  				    const struct iosys_map *vmap,
>  				    struct drm_rect *rect)
>  {
> -	struct hyperv_drm_device *hv = to_hv(fb->dev);
> +	struct hvdrm_drm_device *hv = to_hv(fb->dev);
>  	struct iosys_map dst = IOSYS_MAP_INIT_VADDR_IOMEM(hv->vram);
>  	int idx;
>  
> @@ -44,9 +44,9 @@ static int hyperv_blit_to_vram_rect(struct drm_framebuffer *fb,
>  	return 0;
>  }
>  
> -static int hyperv_connector_get_modes(struct drm_connector *connector)
> +static int hvdrm_connector_get_modes(struct drm_connector *connector)
>  {
> -	struct hyperv_drm_device *hv = to_hv(connector->dev);
> +	struct hvdrm_drm_device *hv = to_hv(connector->dev);
>  	int count;
>  
>  	count = drm_add_modes_noedid(connector,
> @@ -58,11 +58,11 @@ static int hyperv_connector_get_modes(struct drm_connector *connector)
>  	return count;
>  }
>  
> -static const struct drm_connector_helper_funcs hyperv_connector_helper_funcs = {
> -	.get_modes = hyperv_connector_get_modes,
> +static const struct drm_connector_helper_funcs hvdrm_connector_helper_funcs = {
> +	.get_modes = hvdrm_connector_get_modes,
>  };
>  
> -static const struct drm_connector_funcs hyperv_connector_funcs = {
> +static const struct drm_connector_funcs hvdrm_connector_funcs = {
>  	.fill_modes = drm_helper_probe_single_connector_modes,
>  	.destroy = drm_connector_cleanup,
>  	.reset = drm_atomic_helper_connector_reset,
> @@ -70,15 +70,15 @@ static const struct drm_connector_funcs hyperv_connector_funcs = {
>  	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
>  };
>  
> -static inline int hyperv_conn_init(struct hyperv_drm_device *hv)
> +static inline int hvdrm_conn_init(struct hvdrm_drm_device *hv)
>  {
> -	drm_connector_helper_add(&hv->connector, &hyperv_connector_helper_funcs);
> +	drm_connector_helper_add(&hv->connector, &hvdrm_connector_helper_funcs);
>  	return drm_connector_init(&hv->dev, &hv->connector,
> -				  &hyperv_connector_funcs,
> +				  &hvdrm_connector_funcs,
>  				  DRM_MODE_CONNECTOR_VIRTUAL);
>  }
>  
> -static int hyperv_check_size(struct hyperv_drm_device *hv, int w, int h,
> +static int hvdrm_check_size(struct hvdrm_drm_device *hv, int w, int h,
>  			     struct drm_framebuffer *fb)
>  {
>  	u32 pitch = w * (hv->screen_depth / 8);
> @@ -92,25 +92,25 @@ static int hyperv_check_size(struct hyperv_drm_device *hv, int w, int h,
>  	return 0;
>  }
>  
> -static const uint32_t hyperv_formats[] = {
> +static const uint32_t hvdrm_formats[] = {
>  	DRM_FORMAT_XRGB8888,
>  };
>  
> -static const uint64_t hyperv_modifiers[] = {
> +static const uint64_t hvdrm_modifiers[] = {
>  	DRM_FORMAT_MOD_LINEAR,
>  	DRM_FORMAT_MOD_INVALID
>  };
>  
> -static void hyperv_crtc_helper_atomic_enable(struct drm_crtc *crtc,
> +static void hvdrm_crtc_helper_atomic_enable(struct drm_crtc *crtc,
>  					     struct drm_atomic_commit *state)
>  {
> -	struct hyperv_drm_device *hv = to_hv(crtc->dev);
> +	struct hvdrm_drm_device *hv = to_hv(crtc->dev);
>  	struct drm_plane *plane = &hv->plane;
>  	struct drm_plane_state *plane_state = plane->state;
>  	struct drm_crtc_state *crtc_state = crtc->state;
>  
> -	hyperv_hide_hw_ptr(hv->hdev);
> -	hyperv_update_situation(hv->hdev, 1,  hv->screen_depth,
> +	hvdrm_hide_hw_ptr(hv->hdev);
> +	hvdrm_update_situation(hv->hdev, 1,  hv->screen_depth,
>  				crtc_state->mode.hdisplay,
>  				crtc_state->mode.vdisplay,
>  				plane_state->fb->pitches[0]);
> @@ -118,14 +118,14 @@ static void hyperv_crtc_helper_atomic_enable(struct drm_crtc *crtc,
>  	drm_crtc_vblank_on(crtc);
>  }
>  
> -static const struct drm_crtc_helper_funcs hyperv_crtc_helper_funcs = {
> +static const struct drm_crtc_helper_funcs hvdrm_crtc_helper_funcs = {
>  	.atomic_check = drm_crtc_helper_atomic_check,
>  	.atomic_flush = drm_crtc_vblank_atomic_flush,
> -	.atomic_enable = hyperv_crtc_helper_atomic_enable,
> +	.atomic_enable = hvdrm_crtc_helper_atomic_enable,
>  	.atomic_disable = drm_crtc_vblank_atomic_disable,
>  };
>  
> -static const struct drm_crtc_funcs hyperv_crtc_funcs = {
> +static const struct drm_crtc_funcs hvdrm_crtc_funcs = {
>  	.reset = drm_atomic_helper_crtc_reset,
>  	.destroy = drm_crtc_cleanup,
>  	.set_config = drm_atomic_helper_set_config,
> @@ -135,11 +135,11 @@ static const struct drm_crtc_funcs hyperv_crtc_funcs = {
>  	DRM_CRTC_VBLANK_TIMER_FUNCS,
>  };
>  
> -static int hyperv_plane_atomic_check(struct drm_plane *plane,
> +static int hvdrm_plane_atomic_check(struct drm_plane *plane,
>  				     struct drm_atomic_commit *state)
>  {
>  	struct drm_plane_state *plane_state = drm_atomic_get_new_plane_state(state, plane);
> -	struct hyperv_drm_device *hv = to_hv(plane->dev);
> +	struct hvdrm_drm_device *hv = to_hv(plane->dev);
>  	struct drm_framebuffer *fb = plane_state->fb;
>  	struct drm_crtc *crtc = plane_state->crtc;
>  	struct drm_crtc_state *crtc_state = NULL;
> @@ -167,10 +167,10 @@ static int hyperv_plane_atomic_check(struct drm_plane *plane,
>  	return 0;
>  }
>  
> -static void hyperv_plane_atomic_update(struct drm_plane *plane,
> +static void hvdrm_plane_atomic_update(struct drm_plane *plane,
>  				       struct drm_atomic_commit *state)
>  {
> -	struct hyperv_drm_device *hv = to_hv(plane->dev);
> +	struct hvdrm_drm_device *hv = to_hv(plane->dev);
>  	struct drm_plane_state *old_state = drm_atomic_get_old_plane_state(state, plane);
>  	struct drm_plane_state *new_state = drm_atomic_get_new_plane_state(state, plane);
>  	struct drm_shadow_plane_state *shadow_plane_state = to_drm_shadow_plane_state(new_state);
> @@ -185,15 +185,15 @@ static void hyperv_plane_atomic_update(struct drm_plane *plane,
>  		if (!drm_rect_intersect(&dst_clip, &damage))
>  			continue;
>  
> -		hyperv_blit_to_vram_rect(new_state->fb, &shadow_plane_state->data[0], &damage);
> -		hyperv_update_dirt(hv->hdev, &damage);
> +		hvdrm_blit_to_vram_rect(new_state->fb, &shadow_plane_state->data[0], &damage);
> +		hvdrm_update_dirt(hv->hdev, &damage);
>  	}
>  }
>  
> -static int hyperv_plane_get_scanout_buffer(struct drm_plane *plane,
> +static int hvdrm_plane_get_scanout_buffer(struct drm_plane *plane,
>  					   struct drm_scanout_buffer *sb)
>  {
> -	struct hyperv_drm_device *hv = to_hv(plane->dev);
> +	struct hvdrm_drm_device *hv = to_hv(plane->dev);
>  	struct iosys_map map = IOSYS_MAP_INIT_VADDR_IOMEM(hv->vram);
>  
>  	if (plane->state && plane->state->fb) {
> @@ -207,9 +207,9 @@ static int hyperv_plane_get_scanout_buffer(struct drm_plane *plane,
>  	return -ENODEV;
>  }
>  
> -static void hyperv_plane_panic_flush(struct drm_plane *plane)
> +static void hvdrm_plane_panic_flush(struct drm_plane *plane)
>  {
> -	struct hyperv_drm_device *hv = to_hv(plane->dev);
> +	struct hvdrm_drm_device *hv = to_hv(plane->dev);
>  	struct drm_rect rect;
>  
>  	if (plane->state && plane->state->fb) {
> @@ -218,32 +218,32 @@ static void hyperv_plane_panic_flush(struct drm_plane *plane)
>  		rect.x2 = plane->state->fb->width;
>  		rect.y2 = plane->state->fb->height;
>  
> -		hyperv_update_dirt(hv->hdev, &rect);
> +		hvdrm_update_dirt(hv->hdev, &rect);
>  	}
>  
>  	vmbus_initiate_unload(true);
>  }
>  
> -static const struct drm_plane_helper_funcs hyperv_plane_helper_funcs = {
> +static const struct drm_plane_helper_funcs hvdrm_plane_helper_funcs = {
>  	DRM_GEM_SHADOW_PLANE_HELPER_FUNCS,
> -	.atomic_check = hyperv_plane_atomic_check,
> -	.atomic_update = hyperv_plane_atomic_update,
> -	.get_scanout_buffer = hyperv_plane_get_scanout_buffer,
> -	.panic_flush = hyperv_plane_panic_flush,
> +	.atomic_check = hvdrm_plane_atomic_check,
> +	.atomic_update = hvdrm_plane_atomic_update,
> +	.get_scanout_buffer = hvdrm_plane_get_scanout_buffer,
> +	.panic_flush = hvdrm_plane_panic_flush,
>  };
>  
> -static const struct drm_plane_funcs hyperv_plane_funcs = {
> +static const struct drm_plane_funcs hvdrm_plane_funcs = {
>  	.update_plane		= drm_atomic_helper_update_plane,
>  	.disable_plane		= drm_atomic_helper_disable_plane,
>  	.destroy		= drm_plane_cleanup,
>  	DRM_GEM_SHADOW_PLANE_FUNCS,
>  };
>  
> -static const struct drm_encoder_funcs hyperv_drm_simple_encoder_funcs_cleanup = {
> +static const struct drm_encoder_funcs hvdrm_drm_simple_encoder_funcs_cleanup = {
>  	.destroy = drm_encoder_cleanup,
>  };
>  
> -static inline int hyperv_pipe_init(struct hyperv_drm_device *hv)
> +static inline int hvdrm_pipe_init(struct hvdrm_drm_device *hv)
>  {
>  	struct drm_device *dev = &hv->dev;
>  	struct drm_encoder *encoder = &hv->encoder;
> @@ -253,29 +253,29 @@ static inline int hyperv_pipe_init(struct hyperv_drm_device *hv)
>  	int ret;
>  
>  	ret = drm_universal_plane_init(dev, plane, 0,
> -				       &hyperv_plane_funcs,
> -				       hyperv_formats, ARRAY_SIZE(hyperv_formats),
> -				       hyperv_modifiers,
> +				       &hvdrm_plane_funcs,
> +				       hvdrm_formats, ARRAY_SIZE(hvdrm_formats),
> +				       hvdrm_modifiers,
>  				       DRM_PLANE_TYPE_PRIMARY, NULL);
>  	if (ret)
>  		return ret;
> -	drm_plane_helper_add(plane, &hyperv_plane_helper_funcs);
> +	drm_plane_helper_add(plane, &hvdrm_plane_helper_funcs);
>  	drm_plane_enable_fb_damage_clips(plane);
>  
>  	ret = drm_crtc_init_with_planes(dev, crtc, plane, NULL,
> -					&hyperv_crtc_funcs, NULL);
> +					&hvdrm_crtc_funcs, NULL);
>  	if (ret)
>  		return ret;
> -	drm_crtc_helper_add(crtc, &hyperv_crtc_helper_funcs);
> +	drm_crtc_helper_add(crtc, &hvdrm_crtc_helper_funcs);
>  
>  	encoder->possible_crtcs = drm_crtc_mask(crtc);
>  	ret = drm_encoder_init(dev, encoder,
> -			       &hyperv_drm_simple_encoder_funcs_cleanup,
> +			       &hvdrm_drm_simple_encoder_funcs_cleanup,
>  			       DRM_MODE_ENCODER_NONE, NULL);
>  	if (ret)
>  		return ret;
>  
> -	ret = hyperv_conn_init(hv);
> +	ret = hvdrm_conn_init(hv);
>  	if (ret) {
>  		drm_err(dev, "Failed to initialized connector.\n");
>  		return ret;
> @@ -285,25 +285,25 @@ static inline int hyperv_pipe_init(struct hyperv_drm_device *hv)
>  }
>  
>  static enum drm_mode_status
> -hyperv_mode_valid(struct drm_device *dev,
> +hvdrm_mode_valid(struct drm_device *dev,
>  		  const struct drm_display_mode *mode)
>  {
> -	struct hyperv_drm_device *hv = to_hv(dev);
> +	struct hvdrm_drm_device *hv = to_hv(dev);
>  
> -	if (hyperv_check_size(hv, mode->hdisplay, mode->vdisplay, NULL))
> +	if (hvdrm_check_size(hv, mode->hdisplay, mode->vdisplay, NULL))
>  		return MODE_BAD;
>  
>  	return MODE_OK;
>  }
>  
> -static const struct drm_mode_config_funcs hyperv_mode_config_funcs = {
> +static const struct drm_mode_config_funcs hvdrm_mode_config_funcs = {
>  	.fb_create = drm_gem_fb_create_with_dirty,
> -	.mode_valid = hyperv_mode_valid,
> +	.mode_valid = hvdrm_mode_valid,
>  	.atomic_check = drm_atomic_helper_check,
>  	.atomic_commit = drm_atomic_helper_commit,
>  };
>  
> -int hyperv_mode_config_init(struct hyperv_drm_device *hv)
> +int hvdrm_mode_config_init(struct hvdrm_drm_device *hv)
>  {
>  	struct drm_device *dev = &hv->dev;
>  	int ret;
> @@ -322,9 +322,9 @@ int hyperv_mode_config_init(struct hyperv_drm_device *hv)
>  	dev->mode_config.preferred_depth = hv->screen_depth;
>  	dev->mode_config.prefer_shadow = 0;
>  
> -	dev->mode_config.funcs = &hyperv_mode_config_funcs;
> +	dev->mode_config.funcs = &hvdrm_mode_config_funcs;
>  
> -	ret = hyperv_pipe_init(hv);
> +	ret = hvdrm_pipe_init(hv);
>  	if (ret) {
>  		drm_err(dev, "Failed to initialized pipe.\n");
>  		return ret;
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> index 6e09b0218df4..7c11b20a9124 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> @@ -181,7 +181,7 @@ struct synthvid_msg {
>  	};
>  } __packed;
>  
> -static inline bool hyperv_version_ge(u32 ver1, u32 ver2)
> +static inline bool hvdrm_version_ge(u32 ver1, u32 ver2)
>  {
>  	if (SYNTHVID_VER_GET_MAJOR(ver1) > SYNTHVID_VER_GET_MAJOR(ver2) ||
>  	    (SYNTHVID_VER_GET_MAJOR(ver1) == SYNTHVID_VER_GET_MAJOR(ver2) &&
> @@ -191,10 +191,10 @@ static inline bool hyperv_version_ge(u32 ver1, u32 ver2)
>  	return false;
>  }
>  
> -static inline int hyperv_sendpacket(struct hv_device *hdev, struct synthvid_msg *msg)
> +static inline int hvdrm_sendpacket(struct hv_device *hdev, struct synthvid_msg *msg)
>  {
>  	static atomic64_t request_id = ATOMIC64_INIT(0);
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	int ret;
>  
>  	msg->pipe_hdr.type = PIPE_MSG_DATA;
> @@ -211,9 +211,9 @@ static inline int hyperv_sendpacket(struct hv_device *hdev, struct synthvid_msg
>  	return ret;
>  }
>  
> -static int hyperv_negotiate_version(struct hv_device *hdev, u32 ver)
> +static int hvdrm_negotiate_version(struct hv_device *hdev, u32 ver)
>  {
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg *msg = (struct synthvid_msg *)hv->init_buf;
>  	struct drm_device *dev = &hv->dev;
>  	unsigned long t;
> @@ -223,7 +223,7 @@ static int hyperv_negotiate_version(struct hv_device *hdev, u32 ver)
>  	msg->vid_hdr.size = sizeof(struct synthvid_msg_hdr) +
>  		sizeof(struct synthvid_version_req);
>  	msg->ver_req.version = ver;
> -	hyperv_sendpacket(hdev, msg);
> +	hvdrm_sendpacket(hdev, msg);
>  
>  	t = wait_for_completion_timeout(&hv->wait, VMBUS_VSP_TIMEOUT);
>  	if (!t) {
> @@ -243,9 +243,9 @@ static int hyperv_negotiate_version(struct hv_device *hdev, u32 ver)
>  	return 0;
>  }
>  
> -int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
> +int hvdrm_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
>  {
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg *msg = (struct synthvid_msg *)hv->init_buf;
>  	struct drm_device *dev = &hv->dev;
>  	unsigned long t;
> @@ -257,7 +257,7 @@ int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
>  	msg->vram.user_ctx = vram_pp;
>  	msg->vram.vram_gpa = vram_pp;
>  	msg->vram.is_vram_gpa_specified = 1;
> -	hyperv_sendpacket(hdev, msg);
> +	hvdrm_sendpacket(hdev, msg);
>  
>  	t = wait_for_completion_timeout(&hv->wait, VMBUS_VSP_TIMEOUT);
>  	if (!t) {
> @@ -272,7 +272,7 @@ int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
>  	return 0;
>  }
>  
> -int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
> +int hvdrm_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
>  			    u32 w, u32 h, u32 pitch)
>  {
>  	struct synthvid_msg msg;
> @@ -292,7 +292,7 @@ int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
>  	msg.situ.video_output[0].height_pixels = h;
>  	msg.situ.video_output[0].pitch_bytes = pitch;
>  
> -	hyperv_sendpacket(hdev, &msg);
> +	hvdrm_sendpacket(hdev, &msg);
>  
>  	return 0;
>  }
> @@ -306,11 +306,11 @@ int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
>   * the msg.ptr_shape.data. Note: setting msg.ptr_pos.is_visible to 0 doesn't
>   * work in tests.
>   *
> - * The hyperv_hide_hw_ptr() is also called in the handler of the
> + * The hvdrm_hide_hw_ptr() is also called in the handler of the
>   * SYNTHVID_FEATURE_CHANGE event, otherwise the host still draws an extra
>   * unwanted mouse pointer after the VM Connection window is closed and reopened.
>   */
> -int hyperv_hide_hw_ptr(struct hv_device *hdev)
> +int hvdrm_hide_hw_ptr(struct hv_device *hdev)
>  {
>  	struct synthvid_msg msg;
>  
> @@ -322,7 +322,7 @@ int hyperv_hide_hw_ptr(struct hv_device *hdev)
>  	msg.ptr_pos.video_output = 0;
>  	msg.ptr_pos.image_x = 0;
>  	msg.ptr_pos.image_y = 0;
> -	hyperv_sendpacket(hdev, &msg);
> +	hvdrm_sendpacket(hdev, &msg);
>  
>  	memset(&msg, 0, sizeof(struct synthvid_msg));
>  	msg.vid_hdr.type = SYNTHVID_POINTER_SHAPE;
> @@ -338,14 +338,14 @@ int hyperv_hide_hw_ptr(struct hv_device *hdev)
>  	msg.ptr_shape.data[1] = 1;
>  	msg.ptr_shape.data[2] = 1;
>  	msg.ptr_shape.data[3] = 1;
> -	hyperv_sendpacket(hdev, &msg);
> +	hvdrm_sendpacket(hdev, &msg);
>  
>  	return 0;
>  }
>  
> -int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
> +int hvdrm_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
>  {
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg msg;
>  
>  	if (!hv->dirt_needed)
> @@ -363,14 +363,14 @@ int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
>  	msg.dirt.rect[0].x2 = rect->x2;
>  	msg.dirt.rect[0].y2 = rect->y2;
>  
> -	hyperv_sendpacket(hdev, &msg);
> +	hvdrm_sendpacket(hdev, &msg);
>  
>  	return 0;
>  }
>  
> -static int hyperv_get_supported_resolution(struct hv_device *hdev)
> +static int hvdrm_get_supported_resolution(struct hv_device *hdev)
>  {
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg *msg = (struct synthvid_msg *)hv->init_buf;
>  	struct drm_device *dev = &hv->dev;
>  	unsigned long t;
> @@ -383,7 +383,7 @@ static int hyperv_get_supported_resolution(struct hv_device *hdev)
>  		sizeof(struct synthvid_supported_resolution_req);
>  	msg->resolution_req.maximum_resolution_count =
>  		SYNTHVID_MAX_RESOLUTION_COUNT;
> -	hyperv_sendpacket(hdev, msg);
> +	hvdrm_sendpacket(hdev, msg);
>  
>  	t = wait_for_completion_timeout(&hv->wait, VMBUS_VSP_TIMEOUT);
>  	if (!t) {
> @@ -420,9 +420,9 @@ static int hyperv_get_supported_resolution(struct hv_device *hdev)
>  	return 0;
>  }
>  
> -static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
> +static void hvdrm_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
>  {
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg *msg;
>  	size_t hdr_size;
>  	size_t need;
> @@ -486,7 +486,7 @@ static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
>  		}
>  		hv->dirt_needed = msg->feature_chg.is_dirt_needed;
>  		if (hv->dirt_needed)
> -			hyperv_hide_hw_ptr(hv->hdev);
> +			hvdrm_hide_hw_ptr(hv->hdev);
>  		return;
>  	default:
>  		return;
> @@ -508,10 +508,10 @@ static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
>  	complete(&hv->wait);
>  }
>  
> -static void hyperv_receive(void *ctx)
> +static void hvdrm_receive(void *ctx)
>  {
>  	struct hv_device *hdev = ctx;
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg *recv_buf;
>  	u32 bytes_recvd;
>  	u64 req_id;
> @@ -539,19 +539,19 @@ static void hyperv_receive(void *ctx)
>  					    ret, bytes_recvd);
>  		} else if (bytes_recvd > 0 &&
>  			   recv_buf->pipe_hdr.type == PIPE_MSG_DATA) {
> -			hyperv_receive_sub(hdev, bytes_recvd);
> +			hvdrm_receive_sub(hdev, bytes_recvd);
>  		}
>  	} while (bytes_recvd > 0 && ret == 0);
>  }
>  
> -int hyperv_connect_vsp(struct hv_device *hdev)
> +int hvdrm_connect_vsp(struct hv_device *hdev)
>  {
> -	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
> +	struct hvdrm_drm_device *hv = hv_get_drvdata(hdev);
>  	struct drm_device *dev = &hv->dev;
>  	int ret;
>  
>  	ret = vmbus_open(hdev->channel, VMBUS_RING_BUFSIZE, VMBUS_RING_BUFSIZE,
> -			 NULL, 0, hyperv_receive, hdev);
> +			 NULL, 0, hvdrm_receive, hdev);
>  	if (ret) {
>  		drm_err(dev, "Unable to open vmbus channel\n");
>  		return ret;
> @@ -561,16 +561,16 @@ int hyperv_connect_vsp(struct hv_device *hdev)
>  	switch (vmbus_proto_version) {
>  	case VERSION_WIN10:
>  	case VERSION_WIN10_V5:
> -		ret = hyperv_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
> +		ret = hvdrm_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
>  		if (!ret)
>  			break;
>  		fallthrough;
>  	case VERSION_WIN8:
>  	case VERSION_WIN8_1:
> -		ret = hyperv_negotiate_version(hdev, SYNTHVID_VERSION_WIN8);
> +		ret = hvdrm_negotiate_version(hdev, SYNTHVID_VERSION_WIN8);
>  		break;
>  	default:
> -		ret = hyperv_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
> +		ret = hvdrm_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
>  		break;
>  	}
>  
> @@ -581,8 +581,8 @@ int hyperv_connect_vsp(struct hv_device *hdev)
>  
>  	hv->screen_depth = SYNTHVID_DEPTH_WIN8;
>  
> -	if (hyperv_version_ge(hv->synthvid_version, SYNTHVID_VERSION_WIN10)) {
> -		ret = hyperv_get_supported_resolution(hdev);
> +	if (hvdrm_version_ge(hv->synthvid_version, SYNTHVID_VERSION_WIN10)) {
> +		ret = hvdrm_get_supported_resolution(hdev);
>  		if (ret)
>  			drm_err(dev, "Failed to get supported resolution from host, use default\n");
>  	}
> -- 
> 2.25.1
> 


* Re: [RFC] KVM/x86: Killing kvm_get_time_and_clockread() in favour of ktime_get_snapshot()
From: Paolo Bonzini @ 2026-05-27  8:49 UTC (permalink / raw)
  To: David Woodhouse, Sean Christopherson, Thomas Gleixner,
	John Stultz, Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S. Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <b4895a532344ba6a879d922be8536f9000cd398c.camel@infradead.org>

On 5/26/26 15:57, David Woodhouse wrote:
> In 2012, as part of implementing the "master clock" mode for kvmclock,
> Marcelo added kvm_get_time_and_clockread() in commit d828199e8444
> ("KVM: x86: implement PVCLOCK_TSC_STABLE_BIT pvclock flag").
> 
> In 2016, Christopher Hall added the generic ktime_get_snapshot() in
> commit 9da0f49c8767 ("time: Add timekeeping snapshot code capturing
> system time and counter"), which provides the same paired read of
> { time, counter } through the core timekeeping code.
> 
> Then in 2018, Vitaly Kuznetsov added Hyper-V TSC page support in
> commit b0c39dc68e3b ("x86/kvm: Pass stable clocksource to guests when
> running nested on Hyper-V"), which extended vgettsc() to handle the
> HVCLOCK case.
> 
> I'd quite like to kill it all with fire and make KVM use
> ktime_get_snapshot() instead.
> 
> However, to correlate with the TSC provided to guests, KVM needs the
> underlying host TSC counter value, *not* the cycles count from the
> hyperv_clocksource_tsc_page clocksource which is scaled to 10MHz.
> 
> If we wanted to support master clock mode while nesting under KVM and
> bizarrely using the kvmclock for system timing, we'd have the same
> problem with the kvmclock clocksource, which similarly scales to 1GHz.
> 
> One option is to say "Don't Do That Then™": if you want to provide a
> masterclock kvmclock to guests then *don't* use the silly pvclocks for
> your own kernel's timekeeping, use the damn TSC. Because if the TSC
> *isn't* reliable then you can't do masterclock mode for your guests
> anyway.
> 
> Perhaps that should have been the response when commit b0c39dc68e3b was
> submitted, but I guess we're stuck supporting that mode now. But I
> really do want to kill the KVM hacks and use ktime_get_snapshot().
> 
> Reverse-engineering the original TSC reading from the clocksource
> counter value doesn't look sane, without a loss of precision and/or
> 128-bit division.
> 
> One simple option that occurs to me would be to add a 'cycles_raw'
> value to the system_time_snapshot, for PV clocksources like hyperv and
> kvmclock to populate with the original TSC reading.
> 
> That might actually let us clean up some of the PTP code that currently
> has to deal with TSC vs. kvmclock in counter snapshots too. I think I
> could kill the use of get_cycles() in vmclock for the kvmclock case,
> which might make Thomas happy...

Yeah, when reading I was thinking of PTP as well.  Seems worthwhile.

Paolo



* Re: [RFC] KVM/x86: Killing kvm_get_time_and_clockread() in favour of ktime_get_snapshot()
From: David Woodhouse @ 2026-05-27  8:42 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, John Stultz, Michael Kelley
  Cc: Marcelo Tosatti, Christopher S. Hall, Stephen Boyd,
	Miroslav Lichvar, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86, linux-kernel
In-Reply-To: <87zf1ljluq.fsf@redhat.com>

On Wed, 2026-05-27 at 10:30 +0200, Vitaly Kuznetsov wrote:
> David Woodhouse <dwmw2@infradead.org> writes:
> 
> ...
> 
> > 
> > Then in 2018, Vitaly Kuznetsov added Hyper-V TSC page support in
> > commit b0c39dc68e3b ("x86/kvm: Pass stable clocksource to guests when
> > running nested on Hyper-V"), which extended vgettsc() to handle the
> > HVCLOCK case.
> > 
> > I'd quite like to kill it all with fire and make KVM use
> > ktime_get_snapshot() instead.
> 
> The main motivation is reducing the complexity of KVM's timekeeping code
> I guess?

Reduce the complexity, and clean it up to use the core kernel snapshot
facility as $DEITY intended, yes.

> The statement "TSC isn't reliable" deserves a book of its own :-)

... deserves a whole wing in a rehab facility...

Thanks for the clarification.

> > One simple option that occurs to me would be to add a 'cycles_raw'
> > value to the system_time_snapshot, for PV clocksources like hyperv and
> > kvmclock to populate with the original TSC reading.
> 
> Personally, I don't see this as such an ugly hack.

Yeah, having thrown that together last night to see what it looks like,
I'm not all that unhappy with it. And it does indeed let me provide
precise vmclock paired time when the clocksource is kvmclock, too.

There are probably some details to tweak, and I'm sure Thomas will call
me a moron again, but I think it's worth it for the resulting cleanup
:)



* Re: [RFC] KVM/x86: Killing kvm_get_time_and_clockread() in favour of ktime_get_snapshot()
From: Vitaly Kuznetsov @ 2026-05-27  8:30 UTC (permalink / raw)
  To: David Woodhouse, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, John Stultz, Michael Kelley
  Cc: Marcelo Tosatti, Christopher S. Hall, Stephen Boyd,
	Miroslav Lichvar, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86, linux-kernel
In-Reply-To: <b4895a532344ba6a879d922be8536f9000cd398c.camel@infradead.org>

David Woodhouse <dwmw2@infradead.org> writes:

...

>
> Then in 2018, Vitaly Kuznetsov added Hyper-V TSC page support in
> commit b0c39dc68e3b ("x86/kvm: Pass stable clocksource to guests when
> running nested on Hyper-V"), which extended vgettsc() to handle the
> HVCLOCK case.
>
> I'd quite like to kill it all with fire and make KVM use
> ktime_get_snapshot() instead.

The main motivation is reducing the complexity of KVM's timekeeping code
I guess?

>
> However, to correlate with the TSC provided to guests, KVM needs the
> underlying host TSC counter value, *not* the cycles count from the
> hyperv_clocksource_tsc_page clocksource which is scaled to 10MHz.
>
> If we wanted to support master clock mode while nesting under KVM and
> bizarrely using the kvmclock for system timing, we'd have the same
> problem with the kvmclock clocksource, which similarly scales to 1GHz.
>
> One option is to say "Don't Do That Then™": if you want to provide a
> masterclock kvmclock to guests then *don't* use the silly pvclocks for
> your own kernel's timekeeping, use the damn TSC. Because if the TSC
> *isn't* reliable then you can't do masterclock mode for your guests
> anyway.

The statement "TSC isn't reliable" deserves a book of its own :-)
Historically, we've seen all sorts of issues with it, but by the time of
b0c39dc68e3b, they were mostly gone. The real problem the Hyper-V/Azure
folks were solving back then was that while the TSC *was* reliable
(synchronized across CPUs, not jumping backwards, stable frequency,
...), tons of hardware out there (Azure is quite big) did not support
TSC scaling. VMs on Azure don't migrate very often, but they do migrate
when hardware maintenance is needed. Migrating to a host with a
different TSC frequency would've been a problem, so the Hyper-V TSC page
was introduced. Note: it is a *single* page for all CPUs, so the
clocksource was never intended to be used in a situation where TSCs are
unsynchronized across CPUs.

To deal with migrations, the Hyper-V folks came up with a mechanism
called 'reenlightenment notifications', and we support it in KVM. It's
not really great, as we need to stop all the nested VMs, but it does the
job: we can re-compute guest PV clocksources (kvmclock, TSC page,
... Xen?) and live happily ever after.

>
> Perhaps that should have been the response when commit b0c39dc68e3b was
> submitted, but I guess we're stuck supporting that mode now.

Times are changing, and it is becoming increasingly difficult to find
x86 hardware without TSC scaling support. Linux guests on Hyper-V now
prefer TSC if possible (HV_ACCESS_TSC_INVARIANT; see, e.g., commit
4c78738ead4e), so I expect that in a few years, there will be no need
for the Hyper-V TSC page clocksource or the reenlightenment logic
anyway.

> But I really do want to kill the KVM hacks and use ktime_get_snapshot().
>
> Reverse-engineering the original TSC reading from the clocksource
> counter value doesn't look sane, without a loss of precision and/or
> 128-bit division.
>
> One simple option that occurs to me would be to add a 'cycles_raw'
> value to the system_time_snapshot, for PV clocksources like hyperv and
> kvmclock to populate with the original TSC reading.

Personally, I don't see this as such an ugly hack.
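
To illustrate the precision point quoted above (a sketch, not code from
the series): the Hyper-V TSC page converts the TSC to 100ns units with a
64-bit fixed-point multiply, roughly ((tsc * scale) >> 64) + offset. The
userspace model below assumes a hypothetical 3 GHz TSC, leaves out the
offset, and relies on the GCC/Clang unsigned __int128 extension. All
function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 3 GHz TSC. Real hosts publish their own scale value. */
#define MODEL_TSC_HZ	3000000000ULL
#define MODEL_REF_HZ	10000000ULL	/* reference time counts 100ns units */

/* scale = (REF_HZ << 64) / TSC_HZ, i.e. a 0.64 fixed-point fraction. */
static uint64_t tsc_page_scale(void)
{
	return (uint64_t)(((unsigned __int128)MODEL_REF_HZ << 64) / MODEL_TSC_HZ);
}

/* Forward direction: one 64x64->128 multiply and a shift (offset omitted). */
static uint64_t tsc_to_ref(uint64_t tsc, uint64_t scale)
{
	return (uint64_t)(((unsigned __int128)tsc * scale) >> 64);
}

/*
 * Reverse direction: needs a 128-bit division, and can only return one
 * TSC value out of the ~300 that map to the same 100ns tick.
 */
static uint64_t ref_to_tsc_estimate(uint64_t ref, uint64_t scale)
{
	assert(scale != 0);
	return (uint64_t)(((unsigned __int128)ref << 64) / scale);
}
```

With these numbers, TSC readings 100 cycles apart inside one 100ns window
collapse to the same reference value, so the estimate coming back out of
ref_to_tsc_estimate() can be off by up to a full window. Carrying the
original TSC reading alongside the scaled value avoids both the division
and the error.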

>
> That might actually let us clean up some of the PTP code that currently
> has to deal with TSC vs. kvmclock in counter snapshots too. I think I
> could kill the use of get_cycles() in vmclock for the kvmclock case,
> which might make Thomas happy...
>
> Any better ideas?

-- 
Vitaly


* [RFC PATCH 8/8] ptp: vmclock: Use raw_cycles from snapshot for precise TSC pairing
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

When the system clocksource is kvmclock or Hyper-V (not the TSC
directly), vmclock_get_crosststamp() previously fell through to a
separate get_cycles() call, losing the atomic pairing between the
system time snapshot and the TSC reading.

Now that ktime_get_snapshot_id() populates raw_cycles with the
underlying TSC value for derived clocksources, use it when available.
This gives a perfect (system_time, tsc) pairing for the device time
calculation.

The SUPPORT_KVMCLOCK wrapper is still needed to convert the TSC into
kvmclock nanoseconds for system_counter->cycles, because otherwise
get_device_system_crosststamp() can't interpret the result.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 drivers/ptp/ptp_vmclock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/ptp/ptp_vmclock.c b/drivers/ptp/ptp_vmclock.c
index cb18c15a4697..aacbf8f48f13 100644
--- a/drivers/ptp/ptp_vmclock.c
+++ b/drivers/ptp/ptp_vmclock.c
@@ -140,6 +140,10 @@ static int vmclock_get_crosststamp(struct vmclock_state *st,
 			if (sts->pre_sts.cs_id == st->cs_id) {
 				cycle = sts->pre_sts.cycles;
 				sts->post_sts = sts->pre_sts;
+			} else if (sts->pre_sts.raw_csid == st->cs_id &&
+				   sts->pre_sts.raw_cycles) {
+				cycle = sts->pre_sts.raw_cycles;
+				sts->post_sts = sts->pre_sts;
 			} else {
 				cycle = get_cycles();
 				ptp_read_system_postts(sts);
-- 
2.54.0


* [RFC PATCH 4/8] KVM: x86: Use ktime_get_snapshot_id() for master clock
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Replace the KVM-private vgettsc()/do_kvmclock_base()/do_monotonic()/
do_realtime() timekeeping reimplementation with calls to the generic
ktime_get_snapshot_id() interface.

The snapshot provides both the system time and the raw_cycles (TSC)
atomically paired. When raw_cycles is zero, the clocksource could not
provide a raw hardware counter value, which is equivalent to the
previous vgettsc() returning VDSO_CLOCKMODE_NONE.

For kvm_get_time_and_clockread(), the kvmclock base time is
CLOCK_MONOTONIC_RAW + offs_boot. The snapshot provides the raw time
atomically paired with the TSC; offs_boot is added separately as it
only changes at suspend/resume boundaries.

This is a step towards eliminating the pvclock_gtod_data private copy
of timekeeping state and the associated notifier callback.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 arch/x86/kvm/x86.c | 50 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c9e17e01f82d..e6f740f95ff9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -35,6 +35,7 @@
 #include "smm.h"
 
 #include <linux/clocksource.h>
+#include <linux/timekeeping.h>
 #include <linux/interrupt.h>
 #include <linux/kvm.h>
 #include <linux/fs.h>
@@ -3137,14 +3138,34 @@ static int do_realtime(struct timespec64 *ts, u64 *tsc_timestamp)
  * reports the TSC value from which it do so. Returns true if host is
  * using TSC based clocksource.
  */
+static bool kvm_snapshot_has_tsc(struct system_time_snapshot *snap,
+				u64 *tsc_timestamp)
+{
+	if (snap->cs_id == CSID_X86_TSC) {
+		*tsc_timestamp = snap->cycles;
+		return true;
+	}
+
+	if (snap->raw_csid == CSID_X86_TSC && snap->raw_cycles) {
+		*tsc_timestamp = snap->raw_cycles;
+		return true;
+	}
+
+	return false;
+}
+
 static bool kvm_get_time_and_clockread(s64 *kernel_ns, u64 *tsc_timestamp)
 {
-	/* checked again under seqlock below */
-	if (!gtod_is_based_on_tsc(pvclock_gtod_data.clock.vclock_mode))
+	struct system_time_snapshot snap;
+
+	if (!ktime_get_snapshot_id(&snap, CLOCK_MONOTONIC_RAW))
+		return false;
+	if (!kvm_snapshot_has_tsc(&snap, tsc_timestamp))
 		return false;
 
-	return gtod_is_based_on_tsc(do_kvmclock_base(kernel_ns,
-						     tsc_timestamp));
+	*kernel_ns = ktime_to_ns(snap.sys) +
+		     ktime_to_ns(ktime_mono_to_any(0, TK_OFFS_BOOT));
+	return true;
 }
 
 /*
@@ -3153,12 +3174,15 @@ static bool kvm_get_time_and_clockread(s64 *kernel_ns, u64 *tsc_timestamp)
  */
 bool kvm_get_monotonic_and_clockread(s64 *kernel_ns, u64 *tsc_timestamp)
 {
-	/* checked again under seqlock below */
-	if (!gtod_is_based_on_tsc(pvclock_gtod_data.clock.vclock_mode))
+	struct system_time_snapshot snap;
+
+	if (!ktime_get_snapshot_id(&snap, CLOCK_MONOTONIC))
+		return false;
+	if (!kvm_snapshot_has_tsc(&snap, tsc_timestamp))
 		return false;
 
-	return gtod_is_based_on_tsc(do_monotonic(kernel_ns,
-						 tsc_timestamp));
+	*kernel_ns = ktime_to_ns(snap.sys);
+	return true;
 }
 
 /*
@@ -3171,11 +3195,15 @@ bool kvm_get_monotonic_and_clockread(s64 *kernel_ns, u64 *tsc_timestamp)
 static bool kvm_get_walltime_and_clockread(struct timespec64 *ts,
 					   u64 *tsc_timestamp)
 {
-	/* checked again under seqlock below */
-	if (!gtod_is_based_on_tsc(pvclock_gtod_data.clock.vclock_mode))
+	struct system_time_snapshot snap;
+
+	if (!ktime_get_snapshot_id(&snap, CLOCK_REALTIME))
+		return false;
+	if (!kvm_snapshot_has_tsc(&snap, tsc_timestamp))
 		return false;
 
-	return gtod_is_based_on_tsc(do_realtime(ts, tsc_timestamp));
+	*ts = ktime_to_timespec64(snap.sys);
+	return true;
 }
 #endif
 
-- 
2.54.0



* [RFC PATCH 1/8] timekeeping: Add clocksource read_raw() method and raw_cycles to snapshot
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <to=b6d2173312b8d0469774846eb18b9799832d9cfc.camel@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Add a read_raw() callback to struct clocksource which returns the
derived clocksource value while also providing the underlying hardware
counter reading. This allows ktime_get_snapshot_id() to populate a new
raw_cycles field in struct system_time_snapshot.

For clocksources that are derived from an underlying counter (e.g.,
Hyper-V TSC page scales TSC to 10MHz, kvmclock scales TSC to 1GHz), this
provides atomic access to both the derived value needed for timekeeping
calculations, and the raw hardware counter needed by consumers like
KVM's master clock and the vmclock PTP driver.
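
The intended semantics can be sketched in a small userspace model. This
is only an illustration under stated assumptions: the model_* types, the
MODEL_CSID_* values and the divide-by-300 "derived" clocksource are
hypothetical stand-ins, not kernel definitions. It shows the fallback
order (read_raw() if present, else read()) and that raw_cycles stays zero
when no raw counter is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_CSID_NONE = 0, MODEL_CSID_TSC, MODEL_CSID_DERIVED };

struct model_clocksource {
	uint64_t (*read)(struct model_clocksource *cs);
	uint64_t (*read_raw)(struct model_clocksource *cs, uint64_t *raw);
	int id;
	int raw_csid;
	uint64_t hw_counter;	/* stands in for the hardware TSC */
};

struct model_snapshot {
	uint64_t cycles;
	uint64_t raw_cycles;
	int cs_id;
	int raw_csid;
};

/* A plain counter: the clocksource value is the hardware value. */
static uint64_t model_tsc_read(struct model_clocksource *cs)
{
	return cs->hw_counter;
}

/* A derived clocksource: one hardware read yields both values. */
static uint64_t model_derived_read_raw(struct model_clocksource *cs, uint64_t *raw)
{
	uint64_t tsc = cs->hw_counter;

	*raw = tsc;
	return tsc / 300;	/* hypothetical scaling, e.g. 3 GHz to 10 MHz */
}

static uint64_t model_derived_read(struct model_clocksource *cs)
{
	uint64_t unused;

	return model_derived_read_raw(cs, &unused);
}

/* Mirrors the fallback order described above: read_raw(), else read(). */
static void model_snapshot_take(struct model_clocksource *cs,
				struct model_snapshot *snap)
{
	assert(cs->read != NULL);

	snap->raw_cycles = 0;	/* zero means "no raw counter available" */
	if (cs->read_raw)
		snap->cycles = cs->read_raw(cs, &snap->raw_cycles);
	else
		snap->cycles = cs->read(cs);

	snap->cs_id = cs->id;
	snap->raw_csid = cs->raw_csid;
}
```

A consumer that wants the TSC would first check cs_id, then fall back to
raw_csid together with a non-zero raw_cycles.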

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 include/linux/clocksource.h |  8 ++++++++
 include/linux/timekeeping.h |  6 ++++++
 kernel/time/timekeeping.c   | 30 +++++++++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7c38190b10bf..674299e32f0c 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -37,6 +37,10 @@ struct module;
  *	This is the structure used for system time.
  *
  * @read:		Returns a cycle value, passes clocksource as argument
+ * @read_raw:		Where a clocksource such as kvmclock or the Hyper-V
+ *			scaled TSC is calculated from an underlying hardware
+ *			counter, return both a cycle value and the raw value
+ *			of the underlying counter from which it was calculated
  * @mask:		Bitmask for two's complement
  *			subtraction of non 64 bit counters
  * @mult:		Cycle to nanosecond multiplier
@@ -69,6 +73,8 @@ struct module;
  *			in certain snapshot functions to allow callers to
  *			validate the clocksource from which the snapshot was
  *			taken.
+ * @raw_csid:		If a @read_raw method exists, the clocksource_id of the
+ *			raw underlying counter
  * @flags:		Flags describing special properties
  * @base:		Hardware abstraction for clock on which a clocksource
  *			is based
@@ -97,6 +103,7 @@ struct module;
  */
 struct clocksource {
 	u64			(*read)(struct clocksource *cs);
+	u64			(*read_raw)(struct clocksource *cs, u64 *raw);
 	u64			mask;
 	u32			mult;
 	u32			shift;
@@ -109,6 +116,7 @@ struct clocksource {
 	u32			freq_khz;
 	int			rating;
 	enum clocksource_ids	id;
+	enum clocksource_ids	raw_csid;
 	enum vdso_clock_mode	vdso_clock_mode;
 	unsigned long		flags;
 	struct clocksource_base *base;
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index f7945f1048fc..54799a9ebeb0 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -279,18 +279,24 @@ static inline bool ktime_get_aux_ts64(clockid_t id, struct timespec64 *kt) { ret
  * struct system_time_snapshot - Simultaneous time capture of CLOCK_MONOTONIC_RAW,
  *				 a selected CLOCK_* and the clocksource counter value
  * @cycles:		Clocksource counter value to produce the system times
+ * @raw_cycles:		For derived clocksources, the raw hardware counter value from
+ *			which @cycles was derived
  * @sys:		The system time of the selected CLOCK ID
  * @raw:		Monotonic raw system time
  * @cs_id:		Clocksource ID
+ * @raw_csid:		Clocksource ID of underlying raw hardware counter, set if
+ *			@raw_cycles is non-zero
  * @clock_was_set_seq:	The sequence number of clock-was-set events
  * @cs_was_changed_seq:	The sequence number of clocksource change events
  * @valid:		True if the snapshot is valid
  */
 struct system_time_snapshot {
 	u64			cycles;
+	u64			raw_cycles;
 	ktime_t			sys;
 	ktime_t			raw;
 	enum clocksource_ids	cs_id;
+	enum clocksource_ids	raw_csid;
 	unsigned int		clock_was_set_seq;
 	u8			cs_was_changed_seq;
 	u8			valid;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c4fd7229b7da..6c75a677fd2a 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -304,6 +304,21 @@ static __always_inline u64 tk_clock_read(const struct tk_read_base *tkr)
 	return clock->read(clock);
 }
 
+static __always_inline u64 tk_clock_read_raw(const struct tk_read_base *tkr, u64 *raw)
+{
+	struct clocksource *clock = READ_ONCE(tkr->clock);
+
+	*raw = 0;
+
+	if (static_branch_likely(&clocksource_read_inlined))
+		return arch_inlined_clocksource_read(clock);
+
+	if (clock->read_raw)
+		return clock->read_raw(clock, raw);
+	else
+		return clock->read(clock);
+}
+
 static inline void clocksource_disable_inline_read(void)
 {
 	static_branch_disable(&clocksource_read_inlined);
@@ -320,6 +335,18 @@ static __always_inline u64 tk_clock_read(const struct tk_read_base *tkr)
 
 	return clock->read(clock);
 }
+
+static __always_inline u64 tk_clock_read_raw(const struct tk_read_base *tkr, u64 *raw)
+{
+	struct clocksource *clock = READ_ONCE(tkr->clock);
+
+	*raw = 0;
+
+	if (clock->read_raw)
+		return clock->read_raw(clock, raw);
+	else
+		return clock->read(clock);
+}
 static inline void clocksource_disable_inline_read(void) { }
 static inline void clocksource_enable_inline_read(void) { }
 #endif
@@ -1243,8 +1270,9 @@ bool ktime_get_snapshot_id(struct system_time_snapshot *systime_snapshot, clocki
 		if (!tk->clock_valid)
 			return false;
 
-		now = tk_clock_read(&tk->tkr_mono);
+		now = tk_clock_read_raw(&tk->tkr_mono, &systime_snapshot->raw_cycles);
 		systime_snapshot->cs_id = tk->tkr_mono.clock->id;
+		systime_snapshot->raw_csid = tk->tkr_mono.clock->raw_csid;
 		systime_snapshot->cs_was_changed_seq = tk->cs_was_changed_seq;
 		systime_snapshot->clock_was_set_seq = tk->clock_was_set_seq;
 
-- 
2.54.0



* [RFC PATCH 5/8] KVM: x86: Compute kvmclock base without pvclock_gtod_data
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

get_kvmclock_base_ns() needs CLOCK_MONOTONIC_RAW + offs_boot. Compute
this directly rather than reading offs_boot from the pvclock_gtod_data
private copy. offs_boot only changes at suspend/resume so does not
need to be atomically paired with the raw clock read.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e6f740f95ff9..d057f42603e4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2402,7 +2402,7 @@ static void update_pvclock_gtod(struct timekeeper *tk)
 static s64 get_kvmclock_base_ns(void)
 {
 	/* Count up from boot time, but with the frequency of the raw clock.  */
-	return ktime_to_ns(ktime_add(ktime_get_raw(), pvclock_gtod_data.offs_boot));
+	return ktime_get_raw_ns() + ktime_to_ns(ktime_mono_to_any(0, TK_OFFS_BOOT));
 }
 
 static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock, int sec_hi_ofs)
-- 
2.54.0



* [RFC PATCH 3/8] x86/kvmclock: Implement read_raw() for kvmclock clocksource
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Implement the read_raw() callback for the kvmclock clocksource.
This returns the kvmclock nanosecond value (for timekeeping) while
also providing the raw TSC value that was used to compute it.

The TSC is read inside the pvclock seqlock-protected region,
ensuring the raw TSC and derived kvmclock value are atomically
paired.

This enables ktime_get_snapshot_id() to provide the raw TSC to consumers
like the vmclock PTP driver, which currently has to do a separate call
to get_cycles() to obtain the value to feed through the vmclock
calculation.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 arch/x86/kernel/kvmclock.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 74aca22dc726..ef86635433f9 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -87,6 +87,25 @@ static u64 kvm_clock_get_cycles(struct clocksource *cs)
 	return kvm_clock_read();
 }
 
+static u64 kvm_clock_get_cycles_raw(struct clocksource *cs, u64 *raw)
+{
+	struct pvclock_vcpu_time_info *src;
+	unsigned version;
+	u64 ret, tsc;
+
+	preempt_disable_notrace();
+	src = this_cpu_pvti();
+	do {
+		version = pvclock_read_begin(src);
+		tsc = rdtsc_ordered();
+		ret = __pvclock_read_cycles(src, tsc);
+	} while (pvclock_read_retry(src, version));
+	preempt_enable_notrace();
+
+	*raw = tsc;
+	return ret;
+}
+
 static noinstr u64 kvm_sched_clock_read(void)
 {
 	return pvclock_clocksource_read_nowd(this_cpu_pvti()) - kvm_sched_clock_offset;
@@ -163,6 +182,8 @@ static int kvm_cs_enable(struct clocksource *cs)
 static struct clocksource kvm_clock = {
 	.name	= "kvm-clock",
 	.read	= kvm_clock_get_cycles,
+	.read_raw = kvm_clock_get_cycles_raw,
+	.raw_csid = CSID_X86_TSC,
 	.rating	= 400,
 	.mask	= CLOCKSOURCE_MASK(64),
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
-- 
2.54.0



* [RFC PATCH 6/8] KVM: x86: Replace pvclock_gtod_data vclock_mode with boolean
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

The remaining users of pvclock_gtod_data only need to know whether
the host clocksource is TSC-based. Replace all vclock_mode checks
with a simple kvm_host_has_tsc_clocksource boolean, updated by the
pvclock_gtod_notify callback.

This is inherently racy (as it always was — kvm_track_tsc_matching
never held the gtod seqcount), relying on eventual consistency: the
notifier fires on every timekeeping update and will correct any
transient inconsistency within one tick.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 arch/x86/kvm/x86.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d057f42603e4..c31b19860c13 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2649,6 +2649,8 @@ static s64 compute_guest_tsc(struct kvm_vcpu *vcpu, s64 kernel_ns)
 }
 
 #ifdef CONFIG_X86_64
+static bool kvm_host_has_tsc_clocksource;
+
 static inline bool gtod_is_based_on_tsc(int mode)
 {
 	return mode == VDSO_CLOCKMODE_TSC || mode == VDSO_CLOCKMODE_HVCLOCK;
@@ -2682,7 +2684,6 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu, bool new_generation)
 {
 #ifdef CONFIG_X86_64
 	struct kvm_arch *ka = &vcpu->kvm->arch;
-	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
 
 	/*
 	 * Track whether all vCPUs have matching TSC offsets (for
@@ -2701,7 +2702,7 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu, bool new_generation)
 	 * accounts for its offset.
 	 */
 	bool use_master_clock = kvm_use_master_clock(vcpu->kvm) &&
-				gtod_is_based_on_tsc(gtod->clock.vclock_mode);
+				kvm_host_has_tsc_clocksource;
 
 	/*
 	 * Request a masterclock update if the masterclock needs to be toggled
@@ -2715,7 +2716,7 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu, bool new_generation)
 
 	trace_kvm_track_tsc(vcpu->vcpu_id, ka->nr_vcpus_matched_tsc,
 			    atomic_read(&vcpu->kvm->online_vcpus),
-		            ka->use_master_clock, gtod->clock.vclock_mode);
+		            ka->use_master_clock, kvm_host_has_tsc_clocksource);
 #endif
 }
 
@@ -2836,7 +2837,7 @@ static inline bool kvm_check_tsc_unstable(void)
 	 * TSC is marked unstable when we're running on Hyper-V,
 	 * 'TSC page' clocksource is good.
 	 */
-	if (pvclock_gtod_data.clock.vclock_mode == VDSO_CLOCKMODE_HVCLOCK)
+	if (kvm_host_has_tsc_clocksource)
 		return false;
 #endif
 	return check_tsc_unstable();
@@ -3292,7 +3293,7 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
 					   &ka->master_tsc_mul);
 	}
 
-	vclock_mode = pvclock_gtod_data.clock.vclock_mode;
+	vclock_mode = kvm_host_has_tsc_clocksource;
 	trace_kvm_update_master_clock(ka->use_master_clock, vclock_mode,
 					ka->all_vcpus_matched_freq);
 #endif
@@ -10364,12 +10365,15 @@ static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
 	update_pvclock_gtod(tk);
 
 #ifdef CONFIG_X86_64
+	kvm_host_has_tsc_clocksource =
+		gtod_is_based_on_tsc(tk->tkr_mono.clock->vdso_clock_mode);
+
 	/*
 	 * Disable master clock if host does not trust, or does not use,
 	 * TSC based clocksource. Delegate queue_work() to irq_work as
 	 * this is invoked with tk_core.seq write held.
 	 */
-	if (!gtod_is_based_on_tsc(pvclock_gtod_data.clock.vclock_mode) &&
+	if (!kvm_host_has_tsc_clocksource &&
 	    atomic_read(&kvm_guest_has_master_clock) != 0)
 		irq_work_queue(&pvclock_irq_work);
 #endif
-- 
2.54.0



* [RFC PATCH 7/8] KVM: x86: Remove pvclock_gtod_data and private timekeeping code
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Remove the now-unused KVM-private timekeeping infrastructure:

 - struct pvclock_clock and struct pvclock_gtod_data
 - update_pvclock_gtod() and its seqcount-protected state copy
 - read_tsc() (KVM's private TSC reader with cycle_last clamping)
 - vgettsc() (KVM's private clocksource interpolation)
 - do_kvmclock_base(), do_monotonic(), do_realtime()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
 arch/x86/kvm/x86.c | 175 ---------------------------------------------
 1 file changed, 175 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c31b19860c13..2c34b973fce0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2347,58 +2347,6 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 	return kvm_set_msr_ignored_check(vcpu, index, *data, true);
 }
 
-struct pvclock_clock {
-	int vclock_mode;
-	u64 cycle_last;
-	u64 mask;
-	u32 mult;
-	u32 shift;
-	u64 base_cycles;
-	u64 offset;
-};
-
-struct pvclock_gtod_data {
-	seqcount_t	seq;
-
-	struct pvclock_clock clock; /* extract of a clocksource struct */
-	struct pvclock_clock raw_clock; /* extract of a clocksource struct */
-
-	ktime_t		offs_boot;
-	u64		wall_time_sec;
-};
-
-static struct pvclock_gtod_data pvclock_gtod_data;
-
-static void update_pvclock_gtod(struct timekeeper *tk)
-{
-	struct pvclock_gtod_data *vdata = &pvclock_gtod_data;
-
-	write_seqcount_begin(&vdata->seq);
-
-	/* copy pvclock gtod data */
-	vdata->clock.vclock_mode	= tk->tkr_mono.clock->vdso_clock_mode;
-	vdata->clock.cycle_last		= tk->tkr_mono.cycle_last;
-	vdata->clock.mask		= tk->tkr_mono.mask;
-	vdata->clock.mult		= tk->tkr_mono.mult;
-	vdata->clock.shift		= tk->tkr_mono.shift;
-	vdata->clock.base_cycles	= tk->tkr_mono.xtime_nsec;
-	vdata->clock.offset		= tk->tkr_mono.base;
-
-	vdata->raw_clock.vclock_mode	= tk->tkr_raw.clock->vdso_clock_mode;
-	vdata->raw_clock.cycle_last	= tk->tkr_raw.cycle_last;
-	vdata->raw_clock.mask		= tk->tkr_raw.mask;
-	vdata->raw_clock.mult		= tk->tkr_raw.mult;
-	vdata->raw_clock.shift		= tk->tkr_raw.shift;
-	vdata->raw_clock.base_cycles	= tk->tkr_raw.xtime_nsec;
-	vdata->raw_clock.offset		= tk->tkr_raw.base;
-
-	vdata->wall_time_sec            = tk->xtime_sec;
-
-	vdata->offs_boot		= tk->offs_boot;
-
-	write_seqcount_end(&vdata->seq);
-}
-
 static s64 get_kvmclock_base_ns(void)
 {
 	/* Count up from boot time, but with the frequency of the raw clock.  */
@@ -3012,128 +2960,6 @@ static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment)
 
 #ifdef CONFIG_X86_64
 
-static u64 read_tsc(void)
-{
-	u64 ret = (u64)rdtsc_ordered();
-	u64 last = pvclock_gtod_data.clock.cycle_last;
-
-	if (likely(ret >= last))
-		return ret;
-
-	/*
-	 * GCC likes to generate cmov here, but this branch is extremely
-	 * predictable (it's just a function of time and the likely is
-	 * very likely) and there's a data dependence, so force GCC
-	 * to generate a branch instead.  I don't barrier() because
-	 * we don't actually need a barrier, and if this function
-	 * ever gets inlined it will generate worse code.
-	 */
-	asm volatile ("");
-	return last;
-}
-
-static inline u64 vgettsc(struct pvclock_clock *clock, u64 *tsc_timestamp,
-			  int *mode)
-{
-	u64 tsc_pg_val;
-	long v;
-
-	switch (clock->vclock_mode) {
-	case VDSO_CLOCKMODE_HVCLOCK:
-		if (hv_read_tsc_page_tsc(hv_get_tsc_page(),
-					 tsc_timestamp, &tsc_pg_val)) {
-			/* TSC page valid */
-			*mode = VDSO_CLOCKMODE_HVCLOCK;
-			v = (tsc_pg_val - clock->cycle_last) &
-				clock->mask;
-		} else {
-			/* TSC page invalid */
-			*mode = VDSO_CLOCKMODE_NONE;
-		}
-		break;
-	case VDSO_CLOCKMODE_TSC:
-		*mode = VDSO_CLOCKMODE_TSC;
-		*tsc_timestamp = read_tsc();
-		v = (*tsc_timestamp - clock->cycle_last) &
-			clock->mask;
-		break;
-	default:
-		*mode = VDSO_CLOCKMODE_NONE;
-	}
-
-	if (*mode == VDSO_CLOCKMODE_NONE)
-		*tsc_timestamp = v = 0;
-
-	return v * clock->mult;
-}
-
-/*
- * As with get_kvmclock_base_ns(), this counts from boot time, at the
- * frequency of CLOCK_MONOTONIC_RAW (hence adding gtos->offs_boot).
- */
-static int do_kvmclock_base(s64 *t, u64 *tsc_timestamp)
-{
-	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
-	unsigned long seq;
-	int mode;
-	u64 ns;
-
-	do {
-		seq = read_seqcount_begin(&gtod->seq);
-		ns = gtod->raw_clock.base_cycles;
-		ns += vgettsc(&gtod->raw_clock, tsc_timestamp, &mode);
-		ns >>= gtod->raw_clock.shift;
-		ns += ktime_to_ns(ktime_add(gtod->raw_clock.offset, gtod->offs_boot));
-	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
-	*t = ns;
-
-	return mode;
-}
-
-/*
- * This calculates CLOCK_MONOTONIC at the time of the TSC snapshot, with
- * no boot time offset.
- */
-static int do_monotonic(s64 *t, u64 *tsc_timestamp)
-{
-	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
-	unsigned long seq;
-	int mode;
-	u64 ns;
-
-	do {
-		seq = read_seqcount_begin(&gtod->seq);
-		ns = gtod->clock.base_cycles;
-		ns += vgettsc(&gtod->clock, tsc_timestamp, &mode);
-		ns >>= gtod->clock.shift;
-		ns += ktime_to_ns(gtod->clock.offset);
-	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
-	*t = ns;
-
-	return mode;
-}
-
-static int do_realtime(struct timespec64 *ts, u64 *tsc_timestamp)
-{
-	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
-	unsigned long seq;
-	int mode;
-	u64 ns;
-
-	do {
-		seq = read_seqcount_begin(&gtod->seq);
-		ts->tv_sec = gtod->wall_time_sec;
-		ns = gtod->clock.base_cycles;
-		ns += vgettsc(&gtod->clock, tsc_timestamp, &mode);
-		ns >>= gtod->clock.shift;
-	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
-
-	ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
-	ts->tv_nsec = ns;
-
-	return mode;
-}
-
 /*
  * Calculates the kvmclock_base_ns (CLOCK_MONOTONIC_RAW + boot time) and
  * reports the TSC value from which it do so. Returns true if host is
@@ -10362,7 +10188,6 @@ static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
 {
 	struct timekeeper *tk = priv;
 
-	update_pvclock_gtod(tk);
 
 #ifdef CONFIG_X86_64
 	kvm_host_has_tsc_clocksource =
-- 
2.54.0
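
For reference, the private conversion removed above (vgettsc() plus
do_monotonic()) amounts to the usual clocksource mult/shift arithmetic,
with the counter clamped so it never appears to run backwards relative
to cycle_last. The following is a minimal userspace sketch of that
arithmetic, not kernel code: struct model_clock and the model_* helpers
are hypothetical names that only mirror the fields copied by the removed
update_pvclock_gtod().

```c
#include <stdint.h>

/* Hypothetical mirror of the per-clock fields the removed code copied. */
struct model_clock {
	uint64_t cycle_last;   /* counter value at the last timekeeper update */
	uint64_t mask;         /* counter width mask */
	uint32_t mult;         /* cycles -> shifted-nanoseconds multiplier */
	uint32_t shift;        /* right shift applied after the multiply */
	uint64_t base_cycles;  /* shifted nanoseconds accumulated at cycle_last */
	int64_t  offset_ns;    /* base time added after the shift */
};

/* Same idea as the removed read_tsc(): never report a value below cycle_last. */
static uint64_t model_clamp(uint64_t now, uint64_t last)
{
	return now >= last ? now : last;
}

/* ns = ((base_cycles + delta * mult) >> shift) + offset */
static int64_t model_cycles_to_ns(const struct model_clock *c, uint64_t now)
{
	uint64_t delta = (model_clamp(now, c->cycle_last) - c->cycle_last) & c->mask;
	uint64_t ns = c->base_cycles + delta * c->mult;

	ns >>= c->shift;
	return (int64_t)ns + c->offset_ns;
}
```

With shift = 22 and mult = 5 << 21 (2.5 ns per cycle), a delta of 300
cycles contributes 750 ns on top of the accumulated base and offset.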



* [RFC PATCH 2/8] clocksource/hyperv: Implement read_raw() for TSC page clocksource
From: David Woodhouse @ 2026-05-26 23:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S . Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, K . Y . Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <20260526230635.136914-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Implement the read_raw() callback for the Hyper-V TSC page
clocksource. This returns the derived 10MHz reference time (for
timekeeping) while also providing the raw TSC value that was used
to compute it.

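When the TSC page is valid, hv_read_tsc_page_tsc() captures both values
from a single RDTSC inside the sequence-counter protected read, so the
raw TSC and the derived time always correspond to each other. When the
TSC page is invalid (sequence == 0), the time is read from the reference
counter MSR instead and raw is set to zero to indicate that no TSC value
is available.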

This enables ktime_get_snapshot_id() to provide the raw TSC to
consumers like KVM's master clock when running nested on Hyper-V.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Assisted-by: Kiro:claude-opus-4.6-1m
---
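Note for readers: the behaviour described above can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions. struct model_tsc_page, model_rdtsc() and
model_read_msr_time() are hypothetical stand-ins, the memory barriers
and READ_ONCE() of the real helper are omitted because the model is
single-threaded, and unsigned __int128 is a GCC/Clang extension used
here for the 64x64 -> 128 bit multiply.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hypervisor-provided TSC page. */
struct model_tsc_page {
	uint32_t sequence;  /* 0 means the page is invalid */
	uint64_t scale;     /* 64.64 fixed-point multiplier */
	int64_t  offset;    /* added after scaling, in 100ns units */
};

/* Stand-ins for RDTSC and the MSR-based reference counter. */
static uint64_t model_tsc_now = 1000000;
static uint64_t model_msr_time = 42;

static uint64_t model_rdtsc(void) { return model_tsc_now; }
static uint64_t model_read_msr_time(void) { return model_msr_time; }

/* Capture the raw TSC and the derived time from one counter read. */
static bool model_read_tsc_page(const struct model_tsc_page *pg,
				uint64_t *raw, uint64_t *time)
{
	uint32_t seq;

	do {
		seq = pg->sequence;
		if (seq == 0)
			return false;	/* page invalid, caller must fall back */
		*raw = model_rdtsc();
		*time = (uint64_t)(((unsigned __int128)*raw * pg->scale) >> 64)
			+ (uint64_t)pg->offset;
	} while (pg->sequence != seq);	/* retry if the page changed under us */

	return true;
}

/* Same shape as the callback in this patch: time is returned, raw is an out-parameter. */
static uint64_t model_read_raw(const struct model_tsc_page *pg, uint64_t *raw)
{
	uint64_t time;

	if (!model_read_tsc_page(pg, raw, &time)) {
		time = model_read_msr_time();
		*raw = 0;	/* no TSC value corresponds to this time */
	}
	return time;
}
```

A caller is expected to treat raw == 0 as "time is valid, but there is
no TSC reading paired with it".
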
 drivers/clocksource/hyperv_timer.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index e9f5034a1bc8..c5ae01fdbd8e 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -444,6 +444,18 @@ static u64 notrace read_hv_clock_tsc_cs(struct clocksource *arg)
 	return read_hv_clock_tsc();
 }
 
+static u64 notrace read_hv_clock_tsc_cs_raw(struct clocksource *arg, u64 *raw)
+{
+	u64 time;
+
+	if (!hv_read_tsc_page_tsc(tsc_page, raw, &time)) {
+		time = read_hv_clock_msr();
+		*raw = 0;
+	}
+
+	return time;
+}
+
 static u64 noinstr read_hv_sched_clock_tsc(void)
 {
 	return (read_hv_clock_tsc() - hv_sched_clock_offset) *
@@ -495,6 +507,8 @@ static struct clocksource hyperv_cs_tsc = {
 	.name	= "hyperv_clocksource_tsc_page",
 	.rating	= 500,
 	.read	= read_hv_clock_tsc_cs,
+	.read_raw = read_hv_clock_tsc_cs_raw,
+	.raw_csid = CSID_X86_TSC,
 	.mask	= CLOCKSOURCE_MASK(64),
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
 	.suspend= suspend_hv_clock_tsc,
-- 
2.54.0



* Re: [RFC] KVM/x86: Killing kvm_get_time_and_clockread() in favour of ktime_get_snapshot()
From: David Woodhouse @ 2026-05-26 23:04 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S. Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel
In-Reply-To: <b4895a532344ba6a879d922be8536f9000cd398c.camel@infradead.org>


On Tue, 2026-05-26 at 14:57 +0100, David Woodhouse wrote:
> 
> One simple option that occurs to me would be to add a 'cycles_raw'
> value to the system_time_snapshot, for PV clocksources like hyperv and
> kvmclock to populate with the original TSC reading.
> 
> That might actually let us clean up some of the PTP code that currently
> has to deal with TSC vs. kvmclock in counter snapshots too. I think I
> could kill the use of get_cycles() in vmclock for the kvmclock case,
> which might make Thomas happy...

I hacked that up to see what it looks like, and it kind of seems to work...

Based on merging my kvmclock branch and Thomas's ktime_get_snapshot_id():
 • https://git.infradead.org/?p=users/dwmw2/linux.git;a=shortlog;h=refs/heads/kvmclock5
 • https://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git/log/?h=timers/ptp/timekeeping

I'll probably not post this for real until the above two are merged;
there's no rush but I think it's a worthwhile cleanup. For now it's at
 • https://git.infradead.org/?p=users/dwmw2/linux.git;a=shortlog;h=refs/heads/kvm-ktime-snapshot

David Woodhouse (8):
      timekeeping: Add clocksource read_raw() method and raw_cycles to snapshot
      clocksource/hyperv: Implement read_raw() for TSC page clocksource
      x86/kvmclock: Implement read_raw() for kvmclock clocksource
      KVM: x86: Use ktime_get_snapshot_id() for master clock
      KVM: x86: Compute kvmclock base without pvclock_gtod_data
      KVM: x86: Replace pvclock_gtod_data vclock_mode with boolean
      KVM: x86: Remove pvclock_gtod_data and private timekeeping code
      ptp: vmclock: Use raw_cycles from snapshot for precise TSC pairing

 arch/x86/kernel/kvmclock.c         |  21 ++++
 arch/x86/kvm/x86.c                 | 239 ++++++++-----------------------------
 drivers/clocksource/hyperv_timer.c |  14 +++
 drivers/ptp/ptp_vmclock.c          |   4 +
 include/linux/clocksource.h        |   8 ++
 include/linux/timekeeping.h        |   6 +
 kernel/time/timekeeping.c          |  30 ++++-
 7 files changed, 130 insertions(+), 192 deletions(-)
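
To illustrate the intent, a consumer of such a snapshot might pair the
raw TSC with the snapshot time along these lines. This is a hedged
userspace sketch only: struct model_snapshot, enum model_csid and
model_snapshot_tsc_pair() are hypothetical names, not the definitions
from the series, and treating a zero raw value as "unavailable" in the
consumer is an assumption borrowed from the Hyper-V patch's convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clocksource IDs; the real series uses CSID_X86_TSC. */
enum model_csid { MODEL_CSID_GENERIC, MODEL_CSID_X86_TSC };

/* Hypothetical, reduced view of a time snapshot carrying a raw counter. */
struct model_snapshot {
	uint64_t cycles;          /* clocksource reading used for timekeeping */
	uint64_t raw_cycles;      /* underlying counter reading, 0 if unavailable */
	enum model_csid raw_csid; /* which counter raw_cycles came from */
	int64_t  raw_ns;          /* time captured at the same instant */
};

/*
 * Return true and fill in a (TSC, time) pair only when the snapshot
 * carries a usable raw TSC reading; otherwise leave the outputs alone.
 */
static bool model_snapshot_tsc_pair(const struct model_snapshot *s,
				    uint64_t *tsc, int64_t *ns)
{
	if (s->raw_csid != MODEL_CSID_X86_TSC || s->raw_cycles == 0)
		return false;

	*tsc = s->raw_cycles;
	*ns = s->raw_ns;
	return true;
}
```

When the function returns false, a caller would have to do without a
precise TSC pairing for that snapshot.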

