Linux-HyperV List
* Re: [RFC PATCH 02/20] kho: store incoming radix tree in kho_in
From: sashiko-bot @ 2026-05-28  1:08 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-3-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Missing error handling leads to boot panic
--

commit 84425fcb93791aa9b9c2db978b57ed659f2c9ec2
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: store incoming radix tree in kho_in

This commit stores the incoming radix tree in the kho_in struct to allow
other functions to use it. It also refactors the code to use
kho_get_mem_map_phys() for retrieving the radix tree root from the FDT.

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c

[ ... ]

> @@ -1395,24 +1396,10 @@ EXPORT_SYMBOL_GPL(kho_retrieve_subtree);
>  
>  static int __init kho_mem_retrieve(const void *fdt)
>  {
> -	struct kho_radix_tree tree;
> -	const phys_addr_t *mem;
> -	int len;
> -
> -	/* Retrieve the KHO radix tree from passed-in FDT. */
> -	mem = fdt_getprop(fdt, 0, KHO_FDT_MEMORY_MAP_PROP_NAME, &len);
> -
> -	if (!mem || len != sizeof(*mem)) {
> -		pr_err("failed to get preserved KHO memory tree\n");
> -		return -ENOENT;
> -	}
> -
> -	if (!*mem)
> -		return -EINVAL;
> -
> -	tree.root = phys_to_virt(*mem);
> -	mutex_init(&tree.lock);
> -	return kho_radix_walk_tree(&tree, kho_preserved_memory_reserve);
> +	kho_in.radix_tree.root = phys_to_virt(kho_get_mem_map_phys(fdt));

[Severity: High]
Can this lead to a boot panic if kho_get_mem_map_phys() fails?

If kho_get_mem_map_phys() returns 0 on failure, phys_to_virt(0) yields the
base of the direct map, which is a non-NULL pointer on most architectures.
That defeats the NULL check performed later when walking the tree:

kernel/liveupdate/kexec_handover.c:kho_radix_walk_tree()
    if (WARN_ON_ONCE(!tree->root))
        return -EINVAL;

Walking the tree with this invalid address would then trigger a panic during 
early boot.

I note this issue appears to be fixed later in the series by commit 
80928d5942267 (kho: return virtual address of mem_map), but wanted to point 
out the temporary breakage here.
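
As a rough user-space illustration of the concern (not kernel code), the
sketch below models phys_to_virt() as a plain linear offset and rejects a
zero physical address before converting it. MODEL_PAGE_OFFSET,
model_phys_to_virt() and model_set_root() are hypothetical names, and the
offset value is only an example of a direct-map base.

```c
#include <stdint.h>

/* Hypothetical stand-in for the direct-map base; the value is illustrative. */
#define MODEL_PAGE_OFFSET 0xffff888000000000ULL

/* Models phys_to_virt() as a linear offset: physical 0 maps to a non-zero address. */
uint64_t model_phys_to_virt(uint64_t phys)
{
	return phys + MODEL_PAGE_OFFSET;
}

/*
 * Stores the converted root address and returns 0, or returns -1 and leaves
 * *root untouched when the lookup reported failure (physical address 0).
 */
int model_set_root(uint64_t mem_map_phys, uint64_t *root)
{
	if (!mem_map_phys)
		return -1;

	*root = model_phys_to_virt(mem_map_phys);
	return 0;
}
```

The point of the sketch is only that the zero check has to happen on the
physical address, because the converted value no longer compares equal to 0.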

> +	mutex_init(&kho_in.radix_tree.lock);
> +	return kho_radix_walk_tree(&kho_in.radix_tree,
> +				   kho_preserved_memory_reserve);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=2


* Re: [RFC PATCH 05/20] kho: add data argument to radix walk callback
From: sashiko-bot @ 2026-05-28  1:11 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-6-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Signed integer overflow in size calculation in kho_preserved_memory_reserve().
--

commit 64e1a8a692077719d0e7f42a7ff19088fba754f4
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: add data argument to radix walk callback

This commit adds an opaque data pointer argument to kho_radix_walk_cb_t.
This allows callers to pass extra information to the callback functions
during tree walks.

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 0f8d058f1a272..f6de6bf632261 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c

[ ... ]

> @@ -486,7 +488,7 @@ static struct page *__init kho_get_preserved_page(phys_addr_t phys,
>  	return pfn_to_page(pfn);
>  }
>  
> -static int __init kho_preserved_memory_reserve(unsigned long key)
> +static int __init kho_preserved_memory_reserve(unsigned long key, void *data)
>  {
>  	union kho_page_info info;
>  	struct page *page;

[Severity: High]
This isn't a bug introduced by this patch, but does sz overflow when
calculating the memory size in this function?

Looking at kho_preserved_memory_reserve(), the memory size is calculated as:

    sz = 1 << (order + PAGE_SHIFT);

The literal 1 is a signed 32-bit integer. When KHO preserves memory using
kho_preserve_pages(), the maximum order is bounded by NUMA node boundaries
rather than MAX_PAGE_ORDER, allowing very large contiguous chunks to be
preserved as a single radix tree entry.

For example, if a 2 GiB contiguous memory chunk is preserved, the order is 19.
With PAGE_SHIFT = 12, order + PAGE_SHIFT is 31. The expression 1 << 31
overflows the signed 32-bit integer; in practice it yields -2147483648,
which is then sign-extended into the u64 sz as 0xffffffff80000000.

If order + PAGE_SHIFT >= 32 (e.g., preserving a 4 GiB chunk), the shift count
is at least the width of int and the behavior is undefined. On x86, where
32-bit shift counts are masked to 5 bits, 1 << 32 typically evaluates to 1.

Calling memblock_reserve() with either of these invalid sizes will cause the
system to either exhaust all physical memory and fail to boot, or fail to
actually reserve the preserved memory, leading to severe memory corruption
when the successor kernel allocates and overwrites the preserved state.

Could this expression be updated to use the 1ULL literal to prevent
overflow?
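
A minimal user-space sketch of the arithmetic, assuming PAGE_SHIFT = 12.
MODEL_PAGE_SHIFT, preserved_size() and sign_extended_int_min() are
hypothetical names used only for illustration; the second helper reproduces
the value that results when INT_MIN is sign-extended into a 64-bit variable.

```c
#include <stdint.h>

/* Assumed page shift (4 KiB pages); stands in for the kernel's PAGE_SHIFT. */
#define MODEL_PAGE_SHIFT 12

/* Size in bytes of a preservation of the given order, computed in 64 bits. */
uint64_t preserved_size(unsigned int order)
{
	return 1ULL << (order + MODEL_PAGE_SHIFT);
}

/* The value a u64 holds after a negative 32-bit result is sign-extended. */
uint64_t sign_extended_int_min(void)
{
	return (uint64_t)(int64_t)INT32_MIN;
}
```

With the 64-bit literal, order 19 gives 0x80000000 and order 20 gives
0x100000000, instead of the sign-extended or undefined results above.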

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=5


* Re: [RFC PATCH 10/20] kho: extended scratch
From: sashiko-bot @ 2026-05-28  1:21 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-11-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Missing error check for `kho_get_mem_map_phys()` return value
- [High] Missing static inline stub for `kho_extend_scratch()` causes build failures
- [High] 32-bit integer overflow in `kho_ext_mark_scratch()` causes incorrect physical address calculations
- [Medium] Uninitialized stack variable `radix` passed to `kho_radix_init_tree()`
- [Medium] Nested acquisition of mutexes from the same lockdep class
--

commit 69c3d241e681519286b4c5b53a9ed0d4287fb2b8
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: extended scratch

The scratch space is allocated by the first kernel in the KHO chain and reused.
This commit introduces extended scratch areas discovered at boot by walking the
preserved memory radix tree.

> diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h
> index 8968c56d2d73e..6ce46f36ed992 100644
> --- a/include/linux/kexec_handover.h
> +++ b/include/linux/kexec_handover.h
> @@ -37,6 +37,7 @@ void kho_remove_subtree(void *blob);
>  int kho_retrieve_subtree(const char *name, phys_addr_t *phys, size_t *size);
>  
>  void kho_memory_init(void);
> +void kho_extend_scratch(void);

[Severity: High]
Does this missing static inline stub cause build failures when
CONFIG_KEXEC_HANDOVER=n?

kho_extend_scratch() is called unconditionally from mm_core_init_early().
However, its prototype here is only declared when CONFIG_KEXEC_HANDOVER is
enabled. Compiling the kernel with CONFIG_KEXEC_HANDOVER=n will result in an
implicit function declaration error.
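
The usual pattern for this is a no-op static inline stub in the disabled
branch of the header. The sketch below shows the shape in a self-contained
form; MODEL_CONFIG_KEXEC_HANDOVER, model_extend_scratch() and
model_mm_init_early() are hypothetical stand-ins, not the real symbols.

```c
/* Hypothetical stand-in for the Kconfig symbol; left undefined here. */
#ifdef MODEL_CONFIG_KEXEC_HANDOVER
/* Feature enabled: the real implementation lives in another file. */
void model_extend_scratch(void);
#else
/* Feature disabled: an empty stub keeps unconditional callers compiling. */
static inline void model_extend_scratch(void) { }
#endif

/* Models a caller that invokes the hook regardless of configuration. */
int model_mm_init_early(void)
{
	model_extend_scratch();
	return 0;
}
```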

[ ... ]

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index b2d1572808ebf..a006a883ee94a 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
[ ... ]
> @@ -825,6 +842,120 @@ static void __init kho_reserve_scratch(void)
>  	kho_enable = false;
>  }
>  
> +#define KHO_EXT_SHIFT 30 /* 1 GiB */
> +
> +static int __init kho_ext_walk_key(unsigned long key, void *data)
> +{
> +	struct kho_radix_tree *tree = data;
> +	phys_addr_t start, end;
> +	unsigned int order;
> +	int err;
> +
> +	start = kho_decode_radix_key(key, &order);
> +	end = start + (1UL << (order + PAGE_SHIFT));
> +
> +	while (start < end) {
> +		err = kho_radix_add_key(tree, start >> KHO_EXT_SHIFT);

[Severity: Medium]
Will this trigger a lockdep warning due to nested acquisition of mutexes from
the same lockdep class?

When kho_radix_walk_tree() is called on kho_in.radix_tree, it acquires its
lock and then calls this walk callback. This callback calls kho_radix_add_key()
on the new radix tree, which acquires its lock. Because both locks are
initialized by kho_radix_init_tree(), they share the same lockdep class.
Acquiring a second lock of the same class without using mutex_lock_nested()
can trigger a "possible recursive locking detected" lockdep warning.

> +		if (err)
> +			return err;
> +
> +		start += (1UL << KHO_EXT_SHIFT);
> +	}
> +
> +	return 0;
> +}

[ ... ]

> +static int __init kho_ext_mark_scratch(unsigned long key, void *data)
> +{
> +	phys_addr_t *prev_end = data;
> +	phys_addr_t start = key << KHO_EXT_SHIFT;

[Severity: High]
Can this cause a 32-bit integer overflow on 32-bit architectures?

The key variable is of type unsigned long, which is 32 bits wide on 32-bit
architectures, while phys_addr_t can be 64 bits wide there (for example with
PAE or LPAE). Since KHO_EXT_SHIFT is 30, any key >= 4 (representing memory at
or above 4 GiB) will wrap in the 32-bit shift and produce a truncated
address. Should key be cast to phys_addr_t before shifting?
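
To make the truncation concrete, here is a small user-space sketch that
models a 32-bit unsigned long with uint32_t and a 64-bit phys_addr_t with
uint64_t. MODEL_EXT_SHIFT, start_truncated() and start_widened() are
hypothetical names.

```c
#include <stdint.h>

/* Mirrors the 1 GiB granularity (shift of 30) used by the patch. */
#define MODEL_EXT_SHIFT 30

/* Shift performed in 32 bits, as with a 32-bit unsigned long key. */
uint64_t start_truncated(uint32_t key)
{
	return (uint32_t)(key << MODEL_EXT_SHIFT);
}

/* Key widened to 64 bits before the shift, as a cast to phys_addr_t would do. */
uint64_t start_widened(uint32_t key)
{
	return (uint64_t)key << MODEL_EXT_SHIFT;
}
```

For key = 4 the 32-bit variant wraps to 0, while the widened variant gives
0x100000000.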

> +	int err;
> +
> +	if (start > *prev_end) {
> +		err = memblock_mark_kho_scratch_ext(*prev_end, start - *prev_end);
> +		if (err)
> +			return err;
> +	}
> +
> +	*prev_end = start + (1UL << KHO_EXT_SHIFT);
> +	return 0;
> +}

[ ... ]

> +void __init kho_extend_scratch(void)
> +{
> +	const struct kho_radix_walk_cb kho_cb = {
> +		.key = kho_ext_walk_key,
> +		.table = kho_ext_walk_table,
> +	};
> +	const struct kho_radix_walk_cb ext_cb = {
> +		.key = kho_ext_mark_scratch,
> +	};
> +	struct kho_radix_tree radix;

[Severity: Medium]
Is it safe to pass this uninitialized stack variable radix to
kho_radix_init_tree()?

It is allocated on the stack but not zero-initialized. If kho_radix_init_tree()
conditionally initializes its fields (e.g., checking if tree->root is non-NULL
to avoid double initialization for global trees), it could read garbage stack
memory and return without initializing radix.lock. Subsequent operations would
then dereference uninitialized pointers.

> +	phys_addr_t prev_end = 0, mem_map_phys;
> +	int err = 0;
> +
> +	if (!is_kho_boot())
> +		return;
> +
> +	/* Make sure the KHO radix tree is initialized. */
> +	mem_map_phys = kho_get_mem_map_phys(kho_get_fdt());
> +	err = kho_radix_init_tree(&kho_in.radix_tree, phys_to_virt(mem_map_phys));

[Severity: High]
Does this code miss checking the return value of kho_get_mem_map_phys()?

kho_get_mem_map_phys() returns 0 on failure (e.g., missing FDT property),
and this return value is passed directly to phys_to_virt() without validation.
If the map is missing, the code will initialize the KHO radix tree using
physical address 0x0 as the root, leading to arbitrary memory being parsed
as tree nodes, which can cause memory corruption or a kernel panic during
early boot.

> +	if (err)
> +		goto print;
> +
> +	err = kho_radix_init_tree(&radix, NULL);
> +	if (err)
> +		goto print;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=10


* Re: [RFC PATCH 08/20] kho: add kho_radix_init_tree()
From: sashiko-bot @ 2026-05-28  1:21 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-9-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] kho_radix_destroy_tree() leaves tree->root dangling, causing subsequent re-initializations to erroneously succeed and lead to Use-After-Free.
--

commit d207a159c7c112b64706fe2fb64399a65932fdb6
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: add kho_radix_init_tree()

Move the initialization logic of the radix tree into
kho_radix_init_tree() instead of having users open-code it. Makes the
boundaries cleaner and reduces code duplication when a new user of the
radix tree will be added in a future commit.

[ ... ]

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 3f3ea71baa1ad..b2d1572808ebf 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -305,6 +305,34 @@ static void __kho_radix_destroy_tree(struct kho_radix_node *root,
>  	kho_radix_free_node(root);
>  }
>  
> +/**
> + * kho_radix_init_tree - initialize the radix tree.
> + * @tree:   the tree to initialize.
> + * @root:   root table of the radix tree.
> + *
> + * Initialize the radix tree with the given root node. If root is %NULL, an
> + * empty root table is allocated. If root is not %NULL, it is the caller's
> + * responsibility to make sure the root is valid and in the correct format.
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int kho_radix_init_tree(struct kho_radix_tree *tree, struct kho_radix_node *root)
> +{
> +	/* Already initialized. */
> +	if (tree->root)
> +		return 0;

[Severity: High]
Can this cause a use-after-free if a tree is re-initialized?

Looking at kho_radix_destroy_tree(), it frees the tree nodes but does not set
tree->root to NULL:

kernel/liveupdate/kexec_handover.c:kho_radix_destroy_tree() {
	if (!tree->root)
		return;

	__kho_radix_destroy_tree(tree->root, KHO_TREE_MAX_DEPTH - 1);
}

If a tree is destroyed and then kho_radix_init_tree() is called on it again,
wouldn't this check see the dangling tree->root pointer and incorrectly return
success, leading to a use-after-free on subsequent tree operations?
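
One possible shape for avoiding this, sketched in user space with malloc-style
allocation in place of the kernel allocators. struct model_tree, model_init()
and model_destroy() are hypothetical; the only point is that destroy clears
the root so that the "already initialized" check stays meaningful.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct kho_radix_tree. */
struct model_tree {
	void *root;
};

/* Allocates a root only if the tree does not already have one. */
int model_init(struct model_tree *tree)
{
	if (tree->root)
		return 0;

	tree->root = calloc(1, 4096);
	return tree->root ? 0 : -1;
}

/* Frees the root and clears the pointer so a later init starts fresh. */
void model_destroy(struct model_tree *tree)
{
	free(tree->root);
	tree->root = NULL;
}
```

With the pointer cleared, an init after destroy allocates a new root rather
than reporting success on freed memory.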

> +
> +	if (!root)
> +		root = kho_radix_alloc_node();
> +	if (!root)
> +		return -ENOMEM;
> +
> +	tree->root = root;
> +	mutex_init(&tree->lock);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kho_radix_init_tree);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=8


* Re: [RFC PATCH 01/20] kho: generalize radix tree APIs
From: sashiko-bot @ 2026-05-28  1:22 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-2-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:

New issues:
- [High] Refactoring of scratch_size_update() skips CMA alignment for fixed scratch sizes.
- [High] Unsigned integer underflow when calculating scratch_size_global for unaligned lowmem sizes.
- [High] The generalized radix tree API silently truncates bits from generic 64-bit keys.
- [Low] Unnecessary duplication of __kho_preserve_pages_order() logic in kho_preserve_pages().

Pre-existing issues:
- [Critical] Heap buffer overflow in radix tree node allocation on 32-bit platforms with 4-byte phys_addr_t.
- [High] Sleeping memory allocation (GFP_KERNEL) is performed while holding a mutex required in reclaim paths.
- [High] Bitwise shift operations using a 32-bit signed integer literal '1' overflow for large memory preservations.
- [High] Lack of bounds checking on reconstructed radix tree keys causes an underflow of 'order', leading to severe memory corruption.
--

commit 8b45214cacd5b180aa13f584e115cd95e44f26d2
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: generalize radix tree APIs

This patch generalizes the KHO radix tree APIs to track arbitrary keys directly
instead of hardcoding page tracking operations (PFN and order). It renames the
functions to kho_radix_add_key() and kho_radix_del_key() and adjusts the
callback structures accordingly.

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 4834a809985ab..05a6eb56e1766 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
[ ... ]
> @@ -106,17 +106,17 @@ static unsigned long kho_radix_encode_key(phys_addr_t phys, unsigned int order)
>  }
>  
>  /**
> - * kho_radix_decode_key - Decodes a radix key back into a physical address and order.
> + * kho_decode_radix_key - Decodes a radix key back into a physical address and order.
>   * @key: The unsigned long key to decode.
>   * @order: An output parameter, a pointer to an unsigned int where the decoded
>   *         page order will be stored.
>   *
> - * This function reverses the encoding performed by kho_radix_encode_key(),
> + * This function reverses the encoding performed by kho_encode_radix_key(),
>   * extracting the original physical address and page order from a given key.
>   *
>   * Return: The decoded physical address.
>   */
> -static phys_addr_t kho_radix_decode_key(unsigned long key, unsigned int *order)
> +static phys_addr_t kho_decode_radix_key(unsigned long key, unsigned int *order)
>  {
>  	unsigned int order_bit = fls64(key);
>  	phys_addr_t phys;
>  
>  	/* order_bit is numbered starting at 1 from fls64 */
>  	*order = KHO_ORDER_0_LOG2 - order_bit + 1;

[Severity: High]
This isn't a bug introduced by this patch, but does this calculation risk an
unsigned integer underflow?

If a corrupted tree (for instance, an FDT from a crashing kernel) provides
entries at inappropriately high indices, the reconstructed key in
__kho_radix_walk_tree() might have bits set above KHO_ORDER_0_LOG2.

If fls64(key) returns an order_bit greater than KHO_ORDER_0_LOG2 + 1, this
calculation will underflow and produce an enormous order. This value is later
used in loops bounded by (1UL << order), which could lead to out-of-bounds
iteration.
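
A hedged sketch of a range check before the subtraction, written as
user-space C. MODEL_ORDER_0_LOG2 is assumed to be 52 purely for illustration
(the real KHO_ORDER_0_LOG2 may differ), and model_fls64() and
model_decode_order() are hypothetical helpers.

```c
#include <stdint.h>

/* Assumed value for illustration only; not taken from the kernel headers. */
#define MODEL_ORDER_0_LOG2 52u

/* Position of the highest set bit, numbered from 1; returns 0 for x == 0. */
unsigned int model_fls64(uint64_t x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Decodes the order, rejecting keys whose top bit is out of range. */
int model_decode_order(uint64_t key, unsigned int *order)
{
	unsigned int order_bit = model_fls64(key);

	if (order_bit == 0 || order_bit > MODEL_ORDER_0_LOG2 + 1)
		return -1;

	*order = MODEL_ORDER_0_LOG2 - order_bit + 1;
	return 0;
}
```

Under these assumptions a key with only bit 52 set decodes to order 0, and
a key with bit 53 set is rejected instead of wrapping around.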

[ ... ]
> @@ -144,24 +144,21 @@ static unsigned long kho_radix_get_table_index(unsigned long key,
>  }
>  
>  /**
> - * kho_radix_add_page - Marks a page as preserved in the radix tree.
> + * kho_radix_add_key - Add a key to the radix tree.
>   * @tree: The KHO radix tree.
> - * @pfn: The page frame number of the page to preserve.
> - * @order: The order of the page.
> + * @key: The key to add.
>   *
> - * This function traverses the radix tree based on the key derived from @pfn
> - * and @order. It sets the corresponding bit in the leaf bitmap to mark the
> - * page for preservation. If intermediate nodes do not exist along the path,
> - * they are allocated and added to the tree.
> + * This function traverses the radix tree based on the key provided. It sets the
> + * corresponding bit in the leaf bitmap to mark the key as present. If
> + * intermediate nodes do not exist along the path, they are allocated and added
> + * to the tree.
>   *
>   * Return: 0 on success, or a negative error code on failure.
>   */
> -int kho_radix_add_page(struct kho_radix_tree *tree,
> -		       unsigned long pfn, unsigned int order)
> +int kho_radix_add_key(struct kho_radix_tree *tree, unsigned long key)

[Severity: High]
Does this new arbitrary key API truncate bits from generic 64-bit keys?

The KHO radix tree depth and shifting logic appear to be statically sized
based on physical address and order bounds. When an arbitrary 64-bit key
is passed here, any bits that exceed the maximum addressable depth might
be silently ignored (e.g., bits shifted out or lost in the modulo operations
inside kho_radix_get_table_index()).

Could this cause distinct large 64-bit keys to silently alias to the same
path and overwrite each other?

>  {
>  	/* Newly allocated nodes for error cleanup */
>  	struct kho_radix_node *intermediate_nodes[KHO_TREE_MAX_DEPTH] = { 0 };
> -	unsigned long key = kho_radix_encode_key(PFN_PHYS(pfn), order);
>  	struct kho_radix_node *anchor_node = NULL;

[Severity: Critical]
This is a pre-existing issue, but can the node allocation later in this
function cause an out-of-bounds heap access on 32-bit platforms where
sizeof(phys_addr_t) is 4?

The struct kho_radix_node array size is based on KHO_TABLE_SIZE_LOG2,
which is calculated as ilog2(PAGE_SIZE / sizeof(phys_addr_t)).
With 4 KiB pages and a 4-byte phys_addr_t, this evaluates to 10, giving a
table of 1024 u64 entries that requires 8192 bytes. However, the allocation
relies on get_zeroed_page():

	new_node = (struct kho_radix_node *)get_zeroed_page(GFP_KERNEL);

This only allocates a single 4096-byte page. Could writes to indices >= 512,
or walking the tree via __kho_radix_walk_tree(), overflow this allocation?
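
The size mismatch can be checked with simple arithmetic. The sketch below
assumes 4 KiB pages and 8-byte table entries; MODEL_PAGE_SIZE,
table_entries() and table_bytes() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the illustration. */
#define MODEL_PAGE_SIZE 4096u

/* Number of table entries when sized by PAGE_SIZE / sizeof(phys_addr_t). */
size_t table_entries(size_t phys_addr_size)
{
	return MODEL_PAGE_SIZE / phys_addr_size;
}

/* Bytes needed when each entry is stored as a 64-bit value. */
size_t table_bytes(size_t phys_addr_size)
{
	return table_entries(phys_addr_size) * sizeof(uint64_t);
}
```

With a 4-byte phys_addr_t the table needs 8192 bytes, twice what a single
4096-byte page provides; with an 8-byte phys_addr_t it fits exactly.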

[Severity: High]
This is also a pre-existing issue, but can using GFP_KERNEL for the node
allocation lead to a recursive deadlock on tree->lock?

This allocation happens while holding tree->lock:

kho_radix_add_key()
    mutex_lock(&tree->lock)
    ...
    get_zeroed_page(GFP_KERNEL)

Since GFP_KERNEL can sleep and trigger direct memory reclaim, if a page cache
folio preserved for KHO is evicted, the filesystem's eviction path could call
kho_unpreserve_folio().

That function would attempt to acquire the exact same tree->lock, resulting
in a deadlock. Should this allocation be moved outside the lock or downgraded
to GFP_NOWAIT?

[ ... ]
> @@ -484,13 +474,16 @@ static struct page *__init kho_get_preserved_page(phys_addr_t phys,
>  	return pfn_to_page(pfn);
>  }
>  
> -static int __init kho_preserved_memory_reserve(phys_addr_t phys,
> -					       unsigned int order)
> +static int __init kho_preserved_memory_reserve(unsigned long key)
>  {
>  	union kho_page_info info;
>  	struct page *page;
> +	unsigned int order;
> +	phys_addr_t phys;
>  	u64 sz;
>  
> +	phys = kho_decode_radix_key(key, &order);
> +
>  	sz = 1 << (order + PAGE_SHIFT);

[Severity: High]
This is a pre-existing issue, but will this bitwise shift overflow for large
memory preservations?

Since the literal 1 is a 32-bit signed integer, a shift amount of 31 or
greater (which can happen for contiguous preservations of 2 GiB or more, like
hugepages or CMA areas) results in a signed integer overflow.

A shift of exactly 31 produces a negative 32-bit value that is sign-extended
into the 64-bit sz variable, leading to incorrect memblock reservations.
Does this need to use 1ULL? Note that 1UL would still be only 32 bits wide
on 32-bit builds.

[ ... ]
> @@ -618,30 +611,20 @@ early_param("kho_scratch", kho_parse_scratch_size);
>  
>  static void __init scratch_size_update(void)
>  {
> -	/*
> -	 * If fixed sizes are not provided via command line, calculate them
> -	 * now.
> -	 */
> -	if (scratch_scale) {
> -		phys_addr_t size;
> +	phys_addr_t size;
>  
> -		size = memblock_reserved_kern_size(ARCH_LOW_ADDRESS_LIMIT,
> -						   NUMA_NO_NODE);
> -		size = size * scratch_scale / 100;
> -		scratch_size_lowmem = size;
> +	if (!scratch_scale)
> +		return;

[Severity: High]
Does this early return skip CMA alignment for fixed scratch sizes?

In the previous implementation, scratch_size_lowmem and scratch_size_global
were unconditionally aligned using round_up(..., CMA_MIN_ALIGNMENT_BYTES)
at the end of the function, regardless of whether scratch_scale was set.

By returning early when scratch_scale is 0 (which occurs when fixed sizes
are provided via the command line), the alignment step is bypassed.
Will this cause CMA initialization to fail when iterating over partial
pageblocks?

>  
> -		size = memblock_reserved_kern_size(MEMBLOCK_ALLOC_ANYWHERE,
> -						   NUMA_NO_NODE);
> -		size = size * scratch_scale / 100 - scratch_size_lowmem;
> -		scratch_size_global = size;
> -	}
> +	size = memblock_reserved_kern_size(ARCH_LOW_ADDRESS_LIMIT,
> +					   NUMA_NO_NODE);
> +	size = size * scratch_scale / 100;
> +	scratch_size_lowmem = round_up(size, CMA_MIN_ALIGNMENT_BYTES);
>  
> -	/*
> -	 * Scratch areas are released as MIGRATE_CMA. Round them up to the right
> -	 * size.
> -	 */
> -	scratch_size_lowmem = round_up(scratch_size_lowmem, CMA_MIN_ALIGNMENT_BYTES);
> -	scratch_size_global = round_up(scratch_size_global, CMA_MIN_ALIGNMENT_BYTES);
> +	size = memblock_reserved_kern_size(MEMBLOCK_ALLOC_ANYWHERE,
> +					   NUMA_NO_NODE);
> +	size = size * scratch_scale / 100 - scratch_size_lowmem;

[Severity: High]
Can this subtraction result in an unsigned integer underflow?

Because the round_up() for scratch_size_lowmem now occurs before the
calculation of scratch_size_global, scratch_size_lowmem may be larger
than the unaligned size (size * scratch_scale / 100).

On a system where all memory is below ARCH_LOW_ADDRESS_LIMIT, both calls to
memblock_reserved_kern_size() return the same value. The code then subtracts
the rounded-up scratch_size_lowmem from the unaligned value, so the
mathematical result is negative whenever the unaligned value is not already
a multiple of CMA_MIN_ALIGNMENT_BYTES.

Since size is a phys_addr_t (unsigned), this will wrap around to an enormous
positive value and likely lead to an unfulfillable CMA reservation attempt.
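
A user-space sketch of the wraparound and of one possible guard (clamping at
zero). The 4 MiB alignment is an assumed example value, not the real
CMA_MIN_ALIGNMENT_BYTES, and round_up_pow2(), global_unchecked() and
global_clamped() are hypothetical helpers.

```c
#include <stdint.h>

/* Assumed alignment for illustration; must be a power of two. */
#define MODEL_ALIGN (4ULL << 20)

/* Rounds x up to the next multiple of a power-of-two alignment. */
uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Plain unsigned subtraction: wraps around when lowmem exceeds total. */
uint64_t global_unchecked(uint64_t total, uint64_t lowmem)
{
	return total - lowmem;
}

/* Subtraction clamped at zero so a larger lowmem cannot wrap the result. */
uint64_t global_clamped(uint64_t total, uint64_t lowmem)
{
	return total > lowmem ? total - lowmem : 0;
}
```

For an unaligned total of 5 MiB, the rounded lowmem size is 8 MiB; the
unchecked difference wraps to just under 2^64, while the clamped variant
returns 0.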

[ ... ]
> @@ -937,9 +922,19 @@ int kho_preserve_pages(struct page *page, unsigned long nr_pages)
>  	}
>  
>  	while (pfn < end_pfn) {
> -		unsigned int order = __kho_preserve_pages_order(pfn, end_pfn);
> +		unsigned int order =
> +			min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
> +
> +		/*
> +		 * Make sure all the pages in a single preservation are in the
> +		 * same NUMA node. The restore machinery can not cope with a
> +		 * preservation spanning multiple NUMA nodes.
> +		 */
> +		while (pfn_to_nid(pfn) != pfn_to_nid(pfn + (1UL << order) - 1))
> +			order--;

[Severity: Low]
Is there a specific reason to inline and duplicate the logic from
__kho_preserve_pages_order() here?

The helper function __kho_preserve_pages_order() remains in the file and
is still actively used by __kho_unpreserve(). Duplicating this complex
order calculation and NUMA node alignment logic increases the risk of the
two implementations diverging in the future.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=1


* Re: [RFC PATCH 11/20] kho: return virtual address of mem_map
From: sashiko-bot @ 2026-05-28  1:27 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-12-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] The local variable `radix` in `kho_extend_scratch` is uninitialized, leading to skipped initialization and a subsequent kernel panic when walking the radix tree.
- [Low] kho_get_mem_map returns an integer 0 instead of a NULL pointer on an error path.
--

commit 80928d5942267d8e8edcbc91a1c5643fdd5aecee
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: return virtual address of mem_map

Changes kho_get_mem_map_phys() to kho_get_mem_map() and makes it return a
virtual address instead of a physical address to simplify its callers.

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index a006a883ee94a..797ec285b698c 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -610,10 +610,11 @@ static int __init kho_preserved_memory_reserve(unsigned long key, void *data)
>  	return 0;
>  }
>  
> -/* Returns physical address of the preserved memory map from FDT */
> -static phys_addr_t __init kho_get_mem_map_phys(const void *fdt)
> +/* Returns virtual address of the preserved memory map from FDT */
> +static __init void *kho_get_mem_map(const void *fdt)
>  {
>  	const void *mem_ptr;
> +	phys_addr_t mem_map_phys;
>  	int len;
>  
>  	mem_ptr = fdt_getprop(fdt, 0, KHO_FDT_MEMORY_MAP_PROP_NAME, &len);
> @@ -622,7 +623,11 @@ static phys_addr_t __init kho_get_mem_map_phys(const void *fdt)
>  		return 0;
>  	}

[Severity: Low]
Since the return type has been changed to a pointer, should this return NULL
instead of the integer 0 to avoid a sparse warning?

> -	return get_unaligned((const u64 *)mem_ptr);
> +	mem_map_phys = get_unaligned((const u64 *)mem_ptr);
> +	if (!mem_map_phys)
> +		return NULL;
> +
> +	return phys_to_virt(mem_map_phys);
>  }
>  
[ ... ]
> @@ -917,15 +922,15 @@ void __init kho_extend_scratch(void)
>  		.key = kho_ext_mark_scratch,
>  	};
>  	struct kho_radix_tree radix;
> -	phys_addr_t prev_end = 0, mem_map_phys;
> +	phys_addr_t prev_end = 0;
>  	int err = 0;

[Severity: Critical]
Because radix is declared on the stack without being zeroed, it is
uninitialized. When kho_radix_init_tree() is called later, it checks if
tree->root is already set:

kernel/liveupdate/kexec_handover.c:kho_radix_init_tree() {
    ...
    if (tree->root)
        return 0;
    ...
}

Since radix.root contains uninitialized stack garbage, it is highly likely
to be non-zero. Could this cause kho_radix_init_tree() to return without
initializing the lock and setting up the root node, leading to a panic when
the tree is walked?
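
The effect can be modeled in user space without reading indeterminate memory
by using an explicitly stale root. In this sketch struct model_tree,
model_init_tree() and model_extend_scratch() are hypothetical, and the
lock_ready flag stands in for mutex_init() having run.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct kho_radix_tree. */
struct model_tree {
	void *root;
	int lock_ready;	/* set once the lock would have been initialized */
};

/* Dummy storage that plays the role of a root node. */
char model_node[64];

/* Same early-return shape as the quoted kho_radix_init_tree(). */
int model_init_tree(struct model_tree *tree)
{
	if (tree->root)
		return 0;

	tree->root = model_node;
	tree->lock_ready = 1;
	return 0;
}

/* Zero-initializing the local tree guarantees the early return is not taken. */
int model_extend_scratch(void)
{
	struct model_tree radix = { 0 };

	if (model_init_tree(&radix))
		return -1;
	return radix.lock_ready;
}
```

A tree whose root is already non-NULL returns success with lock_ready still
0, which is the skipped-initialization case described above; the
zero-initialized local always reaches the setup code.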

>  
>  	if (!is_kho_boot())
>  		return;
>  
>  	/* Make sure the KHO radix tree is initialized. */
> -	mem_map_phys = kho_get_mem_map_phys(kho_get_fdt());
> -	err = kho_radix_init_tree(&kho_in.radix_tree, phys_to_virt(mem_map_phys));
> +	err = kho_radix_init_tree(&kho_in.radix_tree,
> +				  kho_get_mem_map(kho_get_fdt()));
>  	if (err)
>  		goto print;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=11


* Re: [RFC PATCH 14/20] kho: Add crash-kernel-safe radix tree presence check
From: sashiko-bot @ 2026-05-28  1:27 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-15-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] In-place conversion destructively modifies the old kernel's memory, corrupting the crash dump.
- [High] Unbounded per-node `memremap()` calls can quickly exhaust the crash kernel's limited memory resources.
- [High] Lack of physical address validation before `memremap()` risks triggering Machine Check Exceptions (MCE).
- [Medium] `kho_radix_crash_contains_page()` fails to identify subpages of higher-order folios.
- [Medium] The exported API lacks a teardown function, resulting in permanent vmap memory leaks.
--

commit c1bba78846e7698e1b8a124f8432407d41f69b0c
Author: Jork Loeser <jloeser@linux.microsoft.com>
Subject: kho: Add crash-kernel-safe radix tree presence check

In the crash kernel, the old kernel's memory is outside the direct map.
This patch adds a read-only radix tree variant that memremaps nodes during
init so that subsequent page presence checks can traverse the tree with
plain pointer dereferencing.

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 2e2b4e73f00d3..0dfdf0f9781ef 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
[ ... ]
> @@ -396,6 +397,129 @@ void kho_radix_destroy_tree(struct kho_radix_tree *tree)
>  }
>  EXPORT_SYMBOL_GPL(kho_radix_destroy_tree);
>  
> +/*
> + * Convert a crash tree node's children from PA to VA in-place via memremap().
> + * On failure, already-remapped pages are not cleaned up -- the crash kernel
> + * is short-lived and will reboot after dump collection, so the leak is
> + * inconsequential.
> + */
> +static int kho_radix_crash_convert_node(struct kho_radix_node *node,
> +					unsigned int level)
> +{
> +	struct kho_radix_node *child;
> +	unsigned int i;
> +	int err;
> +
> +	for (i = 0; i < (1 << KHO_TABLE_SIZE_LOG2); i++) {
> +		if (!node->table[i])
> +			continue;
> +
> +		/* Validate: PA must have bit 63 clear and be page-aligned */
> +		if ((node->table[i] & BIT_ULL(63)) ||
> +		    (node->table[i] & (PAGE_SIZE - 1)))
> +			return -EINVAL;
> +
> +		child = memremap(node->table[i], PAGE_SIZE, MEMREMAP_WB);

[Severity: High]
Since this recursively calls memremap() for every populated node in the tree,
could this exhaust the crash kernel's limited memory resources? A fully
populated tree might create thousands of individual virtual mappings, consuming
significant memory for tracking structures like vm_struct and page tables.

Also, the untrusted physical addresses read from the crashed kernel's memory
are passed blindly to memremap(). If the old kernel crashed due to memory
corruption, could these pointers contain garbage values pointing to non-RAM
regions like MMIO, potentially triggering a Machine Check Exception (MCE) when
accessed?

> +		if (!child)
> +			return -ENOMEM;
> +
> +		/* Overwrite PA with VA in-place */
> +		node->table[i] = (u64)(uintptr_t)child;

[Severity: High]
Because memremap() maps the old kernel's physical memory itself rather than
a copy, does assigning the virtual address here destructively modify that
memory? This seems like it would corrupt the crash dump, causing offline
analysis tools like crash or makedumpfile to encounter meaningless
virtual addresses instead of the original physical addresses.

> +
> +		/* Recurse for intermediate levels; level 1 children are leaves */
> +		if (level > 1) {
> +			err = kho_radix_crash_convert_node(child, level - 1);
> +			if (err)
> +				return err;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * kho_radix_crash_init - Initialize a crash-kernel view of a KHO radix tree.
[ ... ]
> + */
> +int kho_radix_crash_init(struct kho_radix_crash_tree *tree, phys_addr_t root_pa)
> +{
> +	struct kho_radix_node *root;
> +	int err;
> +
> +	tree->root = NULL;
> +
> +	if (!root_pa || (root_pa & (PAGE_SIZE - 1)))
> +		return -EINVAL;
> +
> +	root = memremap(root_pa, PAGE_SIZE, MEMREMAP_WB);
> +	if (!root)
> +		return -ENOMEM;
> +
> +	err = kho_radix_crash_convert_node(root, KHO_TREE_MAX_DEPTH - 1);
> +	if (err)
> +		return err;
> +
> +	tree->root = root;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kho_radix_crash_init);

[Severity: Medium]
This API is exported to modules and allocates mappings via memremap(), but
there is no corresponding kho_radix_crash_destroy() provided. Even if the
crash kernel is short-lived, should kernel APIs exported to modules enforce
symmetrical resource lifecycle management to avoid permanent vmap memory
leaks if a module is unloaded or fails initialization?

> +
> +/**
> + * kho_radix_crash_contains_page - Check if a page is in a crash-kernel radix tree.
[ ... ]
> + * Note: This function checks specifically for the presence of the page at the
> + * given order. If a larger order page that encompasses this page is preserved,
> + * this function will return false.
> + *
> + * Return: true if the page is present in the tree, false otherwise.
> + */
> +bool kho_radix_crash_contains_page(struct kho_radix_crash_tree *tree,
> +				   unsigned long pfn, unsigned int order)
> +{
> +	unsigned long key = kho_encode_radix_key(PFN_PHYS(pfn), order);

[Severity: Medium]
Since this searches for an exact match of the provided pfn and order, doesn't
this fail to identify subpages of higher-order folios? If a page was
preserved as part of a higher-order folio, an order-0 query for a PFN within
that folio will fail to find the key and incorrectly return false.

Would callers like pfn_is_ram() be forced to manually iterate through all
possible orders up to MAX_PAGE_ORDER to reliably determine if a page is
preserved?
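
To make the cost concrete, here is a userspace sketch of what such a caller
would have to do: probe every order with the PFN aligned down to that order.
`model_contains()` is a hypothetical stand-in for the exact-match semantics of
kho_radix_crash_contains_page(), and `MODEL_MAX_ORDER` is an assumed value,
not the kernel's MAX_PAGE_ORDER.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ORDER 10	/* assumption: stand-in for MAX_PAGE_ORDER */

/* Hypothetical model of a preserved entry: an exact (pfn, order) pair. */
struct model_entry {
	unsigned long pfn;
	unsigned int order;
};

/* Exact-match lookup, mirroring the semantics the docstring describes. */
static bool model_contains(const struct model_entry *set, size_t n,
			   unsigned long pfn, unsigned int order)
{
	for (size_t i = 0; i < n; i++)
		if (set[i].pfn == pfn && set[i].order == order)
			return true;
	return false;
}

/* Probe each order, aligning the PFN down to the head page of that order. */
static bool model_covers_pfn(const struct model_entry *set, size_t n,
			     unsigned long pfn)
{
	for (unsigned int order = 0; order <= MODEL_MAX_ORDER; order++) {
		unsigned long head = pfn & ~((1UL << order) - 1);

		if (model_contains(set, n, head, order))
			return true;
	}
	return false;
}
```

With an order-4 entry at PFN 0x1000, an exact order-0 query for PFN 0x1005
misses, while the per-order probe finds it at the cost of up to
MODEL_MAX_ORDER + 1 lookups per PFN.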

> +	struct kho_radix_node *node = tree->root;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=14

* Re: [RFC PATCH 04/20] kho: add callback for table pages
From: sashiko-bot @ 2026-05-28  1:33 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-5-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Severe out-of-bounds memory accesses and ABI incompatibility due to incorrect node sizing on 32-bit platforms.
- [High] Pre-existing memory corruption risk due to `kzalloc` crossing page boundaries for KHO radix root node.
- [High] Undefined behavior and logic failure due to 64-bit shift assumptions on 32-bit platforms.
--

commit a4a688bed2326d7eaaf8a3a47636fe600e228c89
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: add callback for table pages

The KHO memory preservation radix tree does not mark the table pages
themselves as scratch. This means any walker looking for free ranges of
memory outside of scratch areas will ignore the table. This patch adds a
table callback that is invoked for each table page to address this.

> diff --git a/include/linux/kho_radix_tree.h b/include/linux/kho_radix_tree.h
> index 030da6399d288..fe7151d89361d 100644
> --- a/include/linux/kho_radix_tree.h
> +++ b/include/linux/kho_radix_tree.h
> @@ -37,12 +37,15 @@ struct kho_radix_tree {
>  /**
>   * struct kho_radix_walk_cb - Callbacks for KHO radix tree walk.
>   * @key:      Called on each present key in the radix tree.
> + * @table:    Called on each table of the radix tree itself. Receives the
> + *            physical address of the page containing the table.
>   *
>   * For each callback, a return value of 0 continues the walk and a non-zero
>   * return value is directly returned to the caller.
>   */
>  struct kho_radix_walk_cb {
>  	int (*key)(unsigned long key);

[Severity: High]
This isn't a bug introduced by this patch, but does the key encoding work
correctly on 32-bit platforms?

Looking at kho_encode_radix_key(), the code performs
1UL << (KHO_ORDER_0_LOG2 - order). With KHO_ORDER_0_LOG2 defined as
64 - PAGE_SHIFT (52 for 4KB pages), doesn't shifting a 32-bit 1UL by
52 bits invoke undefined behavior?

Additionally, since the radix key is typed as unsigned long (which is 32 bits
on these platforms), is it structurally possible to encode and decode the
52-bit shifted physical address and order bit as designed?
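
For illustration, a fixed-width variant of the encoding avoids both problems.
This is a userspace sketch, not the kernel helper: the layout (one order bit
at position KHO_ORDER_0_LOG2 - order, OR-ed with the shifted address) follows
the description above, and the `model_` names and the 4KB page size are
assumptions.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT   12				/* assumption: 4KB pages */
#define MODEL_ORDER_0_LOG2 (64 - MODEL_PAGE_SHIFT)	/* 52 */

/* 64-bit key: the same layout on 32-bit and 64-bit builds. */
static uint64_t model_encode_key(uint64_t phys, unsigned int order)
{
	/* UINT64_C(1) keeps the shift well-defined where 1UL is 32 bits wide. */
	uint64_t order_bit = UINT64_C(1) << (MODEL_ORDER_0_LOG2 - order);
	uint64_t shifted = phys >> (MODEL_PAGE_SHIFT + order);

	return order_bit | shifted;
}
```

Using uint64_t for the key type would also mean changing the `key` callback
signature, which currently takes unsigned long.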

> +	int (*table)(phys_addr_t phys);
>  };
>  

[ ... ]

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index b22b3cec251ee..0f8d058f1a272 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c

[ ... ]

> @@ -295,6 +301,12 @@ static int __kho_radix_walk_tree(struct kho_radix_node *root,
>  	unsigned int shift;
>  	int err;
>  
> +	if (cb->table) {
> +		err = cb->table(virt_to_phys(root));

[Severity: High]
This isn't a bug introduced by this patch, but does the KHO radix tree root
node get allocated via kzalloc(PAGE_SIZE, GFP_KERNEL) in kho_init()?

The SLUB allocator does not guarantee page alignment for PAGE_SIZE objects
when debugging features (like redzones) are enabled. If the root node crosses
a physical page boundary, won't this table callback only receive the physical
address of the start of the object?

If a walker treats this address as a single page, could it fail to mark the
second page as in-use? During kexec, couldn't this unmarked page be treated
as free scratch memory and overwritten, corrupting the KHO radix tree for the
incoming kernel?

Note: I see this is fixed later in the series in commit
d207a159c7c112b64706fe2fb64399a65932fdb6 ("kho: add kho_radix_init_tree()").

> +		if (err)
> +			return err;
> +	}
> +
>  	for (i = 0; i < PAGE_SIZE / sizeof(phys_addr_t); i++) {

[Severity: Critical]
This is a pre-existing issue, but does this loop correctly bound memory
accesses on 32-bit platforms?

Since sizeof(phys_addr_t) is 4 bytes on 32-bit platforms,
PAGE_SIZE / sizeof(phys_addr_t) evaluates to 1024 for 4KB pages.
KHO_TABLE_SIZE_LOG2 evaluates to 10, meaning the table array in
struct kho_radix_node has 1024 elements of type u64 (8 bytes each).
This makes the structure 8192 bytes large.

If node allocations only allocate a single PAGE_SIZE (4096 bytes), won't
accessing root->table[i] in this loop up to 1024 read and write well past
the bounds of the allocated page?

Also, by relying on sizeof(phys_addr_t), doesn't the FDT format vary by
platform and break ABI cross-compatibility?
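
The arithmetic above can be checked with a small userspace sketch. The
`model_` helpers are hypothetical; they only reproduce the loop bound and the
bytes touched for a given sizeof(phys_addr_t), assuming 4KB pages and u64
table entries.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4KB pages */

/* Entry count as the loop computes it, for a given sizeof(phys_addr_t). */
static unsigned int model_loop_bound(unsigned int phys_addr_size)
{
	return MODEL_PAGE_SIZE / phys_addr_size;
}

/* Bytes touched when every entry is a u64, as in struct kho_radix_node. */
static unsigned int model_bytes_touched(unsigned int phys_addr_size)
{
	return model_loop_bound(phys_addr_size) * (unsigned int)sizeof(uint64_t);
}
```

With a 4-byte phys_addr_t the loop covers 8192 bytes, twice the page; with an
8-byte phys_addr_t it covers exactly 4096. Deriving the bound from
sizeof(u64) instead would keep it at 512 entries on both.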

>  		if (!root->table[i])
>  			continue;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=4

* Re: [RFC PATCH 13/20] kho: add radix tree freeze and del_key() error reporting
From: sashiko-bot @ 2026-05-28  1:34 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-14-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] `kho_radix_tree_freeze()` locks the tree mutex without verifying if the tree has been initialized.
- [Medium] `kho_radix_del_key()` does not accurately report `-ENOENT` if the key is missing but its leaf node exists.
- [Low] The docstring for `kho_radix_tree` instructs developers to pass a physical address to initialize the tree, which will lead to a crash.
- [Low] The error path in `kho_init()` frees the radix tree root using `free_page()`, breaking the allocator abstraction.
--

commit da7ab3d5519dc40884c4213f1650e1238149beb0
Author: Jork Loeser <jloeser@linux.microsoft.com>

kho: add radix tree freeze and del_key() error reporting

This commit adds kho_radix_tree_freeze() to prevent modifications to a KHO
radix tree. It also updates kho_radix_del_key() to return an integer so it
can report errors like -EBUSY and -ENOENT.

> diff --git a/include/linux/kho_radix_tree.h b/include/linux/kho_radix_tree.h
> index c0840ecb230c8..4fe2238e1e30a 100644
> --- a/include/linux/kho_radix_tree.h
> +++ b/include/linux/kho_radix_tree.h
> @@ -21,10 +21,10 @@
>   * scheme. Each key is an unsigned long that combines a page's physical
>   * address and its order.
>   *
> - * Client code is responsible for allocating the root node of the tree,
> - * initializing the mutex lock, and managing its lifecycle. It must use the
> - * tree data structures defined in the KHO ABI,
> - * `include/linux/kho/abi/kexec_handover.h`.
> + * Client code must initialize the tree using kho_radix_tree_init(). Pass
> + * a physical address to restore a tree preserved across kexec, or 0 to

[Severity: Low]
Is this documentation accurate? The tree traversal functions treat
tree->root as a virtual address pointer. If a caller passes a raw physical
address without using phys_to_virt(), would it cause a page fault upon the
first dereference?

> + * allocate a fresh empty tree. The tree uses data structures defined in
> + * the KHO ABI, `include/linux/kho/abi/kexec_handover.h`.
>   */

[ ... ]

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 797ec285b698c..2e2b4e73f00d3 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -79,9 +79,6 @@ struct kho_out {
>  
>  static struct kho_out kho_out = {
>  	.lock = __MUTEX_INITIALIZER(kho_out.lock),
> -	.radix_tree = {
> -		.lock = __MUTEX_INITIALIZER(kho_out.radix_tree.lock),
> -	},
>  };

[ ... ]

> @@ -180,6 +177,28 @@ static void __ref kho_radix_free_node(struct kho_radix_node *node)
>  		memblock_free(node, PAGE_SIZE);
>  }
>  
> +/**
> + * kho_radix_tree_freeze - Freeze the tree, preventing further modifications.
> + * @tree: The KHO radix tree to freeze.
> + *
> + * After freezing, kho_radix_add_key() and kho_radix_del_key() will return
> + * -EBUSY. The check is performed under the tree's mutex, so there is no
> + * race between a concurrent add/del and the freeze.
> + *
> + * Return: 0 on success, -EBUSY if the tree is already frozen.
> + */
> +int kho_radix_tree_freeze(struct kho_radix_tree *tree)
> +{
> +	guard(mutex)(&tree->lock);

[Severity: High]
Could this cause a crash if called on an uninitialized tree?

Because the static __MUTEX_INITIALIZER was removed from
kho_out.radix_tree.lock in this patch, the mutex is zero-initialized before
kho_radix_init_tree() is called.

Unlike kho_radix_add_key() and kho_radix_del_key(), which safely check
if (WARN_ON_ONCE(!tree->root)) before taking the mutex, this function
directly attempts to lock it.

> +
> +	if (tree->frozen)
> +		return -EBUSY;
> +
> +	tree->frozen = true;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kho_radix_tree_freeze);

[ ... ]

> @@ -300,6 +328,8 @@ void kho_radix_del_key(struct kho_radix_tree *tree, unsigned long key)
>  	leaf = (struct kho_radix_leaf *)node;
>  	idx = kho_radix_get_bitmap_index(key);
>  	__clear_bit(idx, leaf->bitmap);
> +
> +	return 0;
>  }

[Severity: Medium]
Will this correctly report -ENOENT if the key is missing but the leaf
node exists?

The commit message states kho_radix_del_key() was updated to report -ENOENT
when a key is not present. However, if other keys share the same leaf node
and the target key's bit is already 0, __clear_bit() does nothing and this
silently returns 0.

Should this use test_and_clear_bit() and return -ENOENT if the bit was not
previously set?
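
A userspace sketch of that suggestion follows. `model_test_and_clear_bit()`
is a hypothetical non-atomic stand-in for the kernel bit helper (the tree
mutex is assumed to be held, so atomicity is not needed), and
`model_del_key()` models only the final bitmap step of kho_radix_del_key().

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Clear the bit and report whether it was set beforehand (non-atomic). */
static bool model_test_and_clear_bit(unsigned long idx, unsigned long *bitmap)
{
	unsigned long mask = 1UL << (idx % MODEL_BITS_PER_LONG);
	unsigned long *word = &bitmap[idx / MODEL_BITS_PER_LONG];
	bool was_set = (*word & mask) != 0;

	*word &= ~mask;
	return was_set;
}

/* Report -ENOENT when the key's bit was already clear in an existing leaf. */
static int model_del_key(unsigned long *bitmap, unsigned long idx)
{
	return model_test_and_clear_bit(idx, bitmap) ? 0 : -ENOENT;
}
```

Deleting the same key twice then returns 0 the first time and -ENOENT the
second, which matches what the commit message promises.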

>  EXPORT_SYMBOL_GPL(kho_radix_del_key);

[ ... ]

> @@ -1807,7 +1836,7 @@ static __init int kho_init(void)
>  err_free_fdt:
>  	kho_unpreserve_free(kho_out.fdt);
>  err_free_kho_radix_tree_root:
> -	kfree(tree->root);
> +	free_page((unsigned long)tree->root);

[Severity: Low]
Is it safe to use free_page() directly here instead of calling
kho_radix_free_node()?

Since kho_radix_init_tree() allocates the node internally via
kho_radix_alloc_node(), calling free_page() directly bypasses the fallback
safety checks in kho_radix_free_node(), which handles memblock_free() when
the slab is unavailable.

>  	tree->root = NULL;
>  err_free_scratch:
>  	kho_out.fdt = NULL;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=13

* Re: [RFC PATCH 17/20] hyperv: Reserve crash MSR P2 for page preservation root PA
From: sashiko-bot @ 2026-05-28  1:34 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-18-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Removing the overwrites of crash MSR P2 in panic paths can leak a stale KHO root PA to the crash kernel if KHO is currently disabled.
- [Medium] The commit message incorrectly describes the register shifts on x86, hiding the undocumented removal of `regs->ax` from the crash report.
--

commit a2f7e475a7dfd02f8ed395438172d715377b9152
Author: Jork Loeser <jloeser@linux.microsoft.com>

hyperv: Reserve crash MSR P2 for page preservation root PA

This commit reserves the crash MSR P2 to pass the KHO radix tree root
physical address to the crash kernel for MSHV page exclusion. It updates
the panic reporting paths to stop overwriting P2 and shifts the remaining
register values accordingly.

>     Move IP/PC to P3 and SP to P4 in hyperv_report_panic() on both x86
>     and ARM64. Remove the P2 write from hv_kmsg_dump().

[Severity: Medium]
Does this description accurately reflect the changes on x86?

On x86, SP was already assigned to P4, so it is not moved. Additionally,
moving IP to P3 overwrites the previous contents of P3 (regs->ax), entirely
dropping it from the crash report.

Could the commit message be updated to explicitly mention this loss of
diagnostic data?

> diff --git a/arch/arm64/hyperv/hv_core.c b/arch/arm64/hyperv/hv_core.c
> --- a/arch/arm64/hyperv/hv_core.c
> +++ b/arch/arm64/hyperv/hv_core.c
> @@ -185,9 +185,9 @@ void hyperv_report_panic(struct pt_regs *regs, long err, bool in_die)
>  	 */
>  	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P0, err);
>  	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P1, guest_id);
> -	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P2, regs->pc);
> -	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P3, regs->sp);
> -	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P4, 0);
> +	/* P2 is reserved for the KHO preserved-pages tree root PA */

[Severity: High]
Could skipping the overwrite of P2 leak a stale KHO root physical address
to the crash kernel if KHO is currently disabled?

If a KHO-enabled kernel writes a valid radix tree root PA to P2 and then
kexecs into a KHO-disabled kernel, the KHO-disabled kernel might not clear
the MSR during its initialization.

If this KHO-disabled kernel later panics, leaving P2 untouched here means
the crash kernel will read the stale PA from P2. Since the memory at the
stale PA was likely reused, could walking it as a valid KHO radix tree
cause a page fault and break crash dump collection?
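
One way to close this window would be to write P2 unconditionally at init
time, storing 0 when there is no tree to publish. A userspace sketch, where
`model_crash_p2` is a hypothetical variable standing in for the MSR and
`model_publish_root()` is not a function from this series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for crash MSR P2; its value outlives a "kexec". */
static uint64_t model_crash_p2;

/* Always write the register so a value from a previous kernel cannot persist. */
static void model_publish_root(bool kho_enabled, uint64_t root_pa)
{
	model_crash_p2 = kho_enabled ? root_pa : 0;
}
```

A zero value is harmless on the reading side, since restore_crash_tree() in
patch 18 returns -ENOENT when the MSR reads as 0.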

> +	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P3, regs->pc);
> +	hv_set_vpreg(HV_REGISTER_GUEST_CRASH_P4, regs->sp);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=17

* Re: [RFC PATCH 06/20] kho: allow early-boot usage of the KHO radix tree
From: sashiko-bot @ 2026-05-28  1:40 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-7-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] kho_radix_alloc_node() incorrectly uses slab_is_available() to determine if the buddy allocator can be used, leading to double-allocation and memory corruption during boot.
- [High] kho_radix_free_node() conditionally frees memory based on the current global boot state rather than how the memory was originally allocated, leading to page state corruption or crashes.
--

commit 470d80274c16904cd71b8b9c878c7272d6b58647
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

kho: allow early-boot usage of the KHO radix tree

This commit updates the KHO radix tree to support allocations during early boot
when the buddy allocator is not yet available. It introduces
kho_radix_alloc_node() and kho_radix_free_node() which use slab_is_available()
to toggle between memblock and buddy allocators.

> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index f6de6bf632261..5c201e605b969 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
[ ... ]
> @@ -143,6 +143,26 @@ static unsigned long kho_radix_get_table_index(unsigned long key,
>  	return (key >> s) % (1 << KHO_TABLE_SIZE_LOG2);
>  }
>  
> +static void __ref *kho_radix_alloc_node(void)
> +{
> +	struct kho_radix_node *node;
> +
> +	if (slab_is_available())
> +		node = (struct kho_radix_node *)get_zeroed_page(GFP_KERNEL);
> +	else
> +		node = memblock_alloc(PAGE_SIZE, PAGE_SIZE);

[Severity: High]
Will this check cause memory corruption during the window between buddy
initialization and slab initialization?

During boot, memblock_free_all() hands over all unreserved pages to the buddy
allocator. However, slab_is_available() remains false until kmem_cache_init()
finishes later in mm_core_init().

If kho_radix_alloc_node() is called in this window, it evaluates
slab_is_available() as false and falls back to memblock_alloc(). This means
it will reserve physical memory that the buddy allocator is already actively
managing, potentially leading to a silent double-allocation where KHO radix
tree nodes alias with other kernel allocations.

> +
> +	return node;
> +}
> +
> +static void __ref kho_radix_free_node(struct kho_radix_node *node)
> +{
> +	if (slab_is_available())
> +		free_page((unsigned long)node);
> +	else
> +		memblock_free(node, PAGE_SIZE);
> +}

[Severity: High]
Could checking slab_is_available() at the time of freeing cause crashes or
memory leaks for nodes allocated during early boot?

If a node is allocated early via memblock_alloc(), it is backed by a
PageReserved physical page. If this same node is later freed after the
slab allocator has been initialized, slab_is_available() will be true,
causing the free path to incorrectly call free_page() on the reserved page.

Calling free_page() on a memblock-reserved page fails in __free_pages_prepare()
because PAGE_FLAGS_CHECK_AT_FREE includes PG_reserved. This triggers a
bad_page() warning, dumps the stack, and leaks the memory.

Should this code determine the deallocation method based on the node's
actual allocation origin, such as checking PageReserved(virt_to_page(node)),
rather than checking the current global boot state?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=6

* Re: [RFC PATCH 15/20] mshv: Use page tracker to manage MSHV-owned pages and preserve with KHO
From: sashiko-bot @ 2026-05-28  1:41 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-16-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] KHO radix tree is accessed without initialization if KHO is disabled, leading to API misuse and potential crashes.
- [High] Concurrent memory withdrawal during kexec tree freeze causes memory corruption and use-after-free.
--

commit 9c3872dd668bb8911798d680d377645cc757d8e0
Author: Jork Loeser <jloeser@linux.microsoft.com>

mshv: Use page tracker to manage MSHV-owned pages and preserve with KHO
    
This commit adapts MSHV page deposit and withdrawal paths to use a tracker,
and registers these pages with KHO to preserve them across kexecs.

> diff --git a/drivers/hv/mshv_page_preserve.c b/drivers/hv/mshv_page_preserve.c
> --- /dev/null
> +++ b/drivers/hv/mshv_page_preserve.c

[ ... ]

> +int __init mshv_preserve_init(void)
> +{
> +	int err;
> +
> +	if (!kho_is_enabled()) {
> +		pr_err("KHO is disabled; page deposits will fail.\n");
> +		return 0;
> +	}

[Severity: High]
If kho_is_enabled() returns false, this function returns 0 but sets no global
state to actually reject subsequent deposit requests. Does this mean
hv_call_deposit_pages() will proceed and unconditionally call
mshv_register_preserve_page(), leading to a crash when it tries to operate on
the uninitialized radix tree?
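
A userspace sketch of the missing gate: record at init time whether
preservation is usable and have the registration path check it. The
`model_preserve_ready` flag and the -ENODEV return value are assumptions for
illustration, not part of the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: set only when the preservation tree was initialized. */
static bool model_preserve_ready;

/* Mirrors the init path: still returns 0, but remembers the outcome. */
static int model_preserve_init(bool kho_enabled)
{
	model_preserve_ready = kho_enabled;
	return 0;
}

/* Registration refuses to touch the tree unless init actually set it up. */
static int model_register_preserve_page(unsigned long pfn)
{
	(void)pfn;	/* the model does not track individual pages */

	if (!model_preserve_ready)
		return -ENODEV;
	return 0;
}
```

With such a gate the "page deposits will fail" message would describe actual
behavior: the deposit path gets an error code instead of reaching an
uninitialized tree.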

[ ... ]

> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> --- a/drivers/hv/mshv_root_hv_call.c
> +++ b/drivers/hv/mshv_root_hv_call.c

[ ... ]

> @@ -69,8 +69,16 @@ int hv_call_withdraw_memory(u64 count, int node, u64 partition_id)
>  
>  		completed = hv_repcomp(status);
>  
> -		for (i = 0; i < completed; i++)
> -			__free_page(pfn_to_page(output_page->gpa_page_list[i]));
> +		for (i = 0; i < completed; i++) {
> +			struct page *pg = pfn_to_page(output_page->gpa_page_list[i]);
> +			int res = mshv_unregister_preserve_page(pg);
> +
> +			WARN_ONCE(res, "Failed to unregister PFN %#llx\n",
> +				  output_page->gpa_page_list[i]);
> +
> +			/* Free regardless -- HV has already released the page */
> +			__free_page(pg);
> +		}

[Severity: High]
Does this code introduce a use-after-free and memory corruption during kexec?

If the reboot notifier calls preserve_tree() and freezes the page tree,
mshv_unregister_preserve_page() will fail. However, the page is still freed
to the buddy allocator here. The concurrent preserve_tree() walk will then
visit this still-registered node and preserve the freed page, corrupting the
buddy allocator state.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=15

* Re: [RFC PATCH 16/20] mshv: Add debugfs interface to page tracker
From: sashiko-bot @ 2026-05-28  1:48 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-17-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Medium] The newly introduced `mshv_iterate_preserved()` and its debugfs callers invoke KHO APIs without checking `kho_is_enabled()`, triggering a kernel warning on systems where KHO is disabled.
- [Medium] The error handling path in `mshv_debugfs_init()` fails to clean up resources allocated by `mshv_debugfs_parent_partition_create()`.
- [Medium] The `pt_tree` debugfs interface uses `single_open()` for potentially unbounded radix tree iteration, leading to O(N^2) processing overhead, prolonged lock contention, and allocation failures.
--

commit 0782bfee5c54e7b8251a28d7d1fba762890a41cc
Author: Jork Loeser <jloeser@linux.microsoft.com>

mshv: Add debugfs interface to page tracker

This commit adds debugfs files to expose statistics and tracked pages from the
MSHV page tracker, as well as the scheduler type. It also adds a new API to
iterate over preserved pages using the KHO radix tree.

> diff --git a/drivers/hv/mshv_debugfs.c b/drivers/hv/mshv_debugfs.c
> index 3c3e02237ae94..d79898e21b361 100644
> --- a/drivers/hv/mshv_debugfs.c
> +++ b/drivers/hv/mshv_debugfs.c

[ ... ]

> @@ -668,8 +675,89 @@ void mshv_debugfs_partition_remove(struct mshv_partition *partition)

[ ... ]

> +static int pt_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct kho_radix_walk_cb cb = {
> +		.key = pt_count_data_cb,
> +		.table = pt_count_meta_cb,
> +	};
> +
> +	struct mshv_pt_stats pt_stats = {0};
> +
> +	mshv_iterate_preserved(&cb, &pt_stats);

[Severity: Medium]
Is it possible to read this file when KHO is disabled?

According to KHO subsystem rules, callers must gate KHO usage on
kho_is_enabled(). If kho_is_enabled() is false, preserved_pages_tree may remain
uninitialized (root is NULL), and kho_radix_walk_tree() will trigger a kernel
warning.

> +	seq_printf(m, "Data pages: %lu\n", pt_stats.count_data);
> +	seq_printf(m, "Meta pages: %lu\n", pt_stats.count_meta);
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(pt_stats);

[ ... ]

> +static int pt_tree_show(struct seq_file *m, void *v)
> +{
> +	const struct kho_radix_walk_cb cb = {
> +		.key = pt_tree_data_cb,
> +		.table = pt_tree_meta_cb,
> +	};
> +
> +	struct mshv_pt_stats pt_stats = {.stat_file = m};
> +
> +	mshv_iterate_preserved(&cb, &pt_stats);
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(pt_tree);

[Severity: Medium]
Will this lead to large allocations and O(N^2) lock contention for large VMs?

DEFINE_SHOW_ATTRIBUTE() uses single_open(), which requires all output to fit in
a single seq_file buffer. Iterating over all preserved pages can scale to
millions of pages.

If the output exceeds the current seq_file buffer, seq_printf() discards the
overflow, but the tree traversal continues to completion. Afterwards, seq_read()
will reallocate a larger buffer and restart the entire traversal. This creates
a retry loop that repeatedly holds the KHO radix tree lock and will likely
fail with -ENOMEM once the buffer can no longer be grown.

Should unbounded outputs use stateful seq_operations instead?

[ ... ]

> @@ -694,6 +782,17 @@ int __init mshv_debugfs_init(void)
>  	if (err)
>  		goto unmap_lp_stats;
>  
> +	err = mshv_debugfs_pt_create(mshv_debugfs);
> +	if (err)
> +		goto unmap_lp_stats;

[Severity: Medium]
Does this error path bypass cleaning up the parent partition stats?

If mshv_debugfs_pt_create() fails, it jumps to unmap_lp_stats, which
skips mshv_debugfs_parent_partition_remove(). This appears to permanently leak
the resources allocated by the preceding mshv_debugfs_parent_partition_create()
call.
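
The usual fix is an extra unwind label that falls through to the existing
one. A userspace sketch of that shape, where `struct model_state`,
`enum model_fail` and the `remove_parent` label are hypothetical and the
real function has more steps than modeled here:

```c
#include <stdbool.h>

/* Hypothetical flags recording which setup steps are currently live. */
struct model_state {
	bool lp_stats_mapped;
	bool parent_created;
};

enum model_fail { MODEL_FAIL_NONE, MODEL_FAIL_PARENT, MODEL_FAIL_PT };

static int model_debugfs_init(struct model_state *s, enum model_fail fail)
{
	int err = -1;

	s->lp_stats_mapped = true;		/* step 1 */

	if (fail == MODEL_FAIL_PARENT)
		goto unmap_lp_stats;		/* only step 1 to undo */
	s->parent_created = true;		/* step 2 */

	if (fail == MODEL_FAIL_PT)
		goto remove_parent;		/* steps 2 and 1 to undo */

	return 0;

remove_parent:
	s->parent_created = false;		/* undo step 2, then fall through */
unmap_lp_stats:
	s->lp_stats_mapped = false;		/* undo step 1 */
	return err;
}
```

Failures after the parent partition entry exists jump to the new label, so
teardown runs in reverse order of setup.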

> +
> +	d = debugfs_create_file("scheduler_info", 0400, mshv_debugfs, NULL,
> +				&scheduler_info_fops);
> +	if (IS_ERR(d)) {
> +		err = PTR_ERR(d);
> +		goto unmap_lp_stats;
> +	}
> +
>  	return 0;
>  
>  unmap_lp_stats:

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=16

* Re: [RFC PATCH 20/20] mshv: freeze and vacuum partitions across kexec
From: sashiko-bot @ 2026-05-28  2:11 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-21-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Critical] Concurrent kfree in partition creation error path causes UAF or NULL pointer dereference in mshv_freeze_and_get_partition_ids().
- [High] KHO serialization APIs are invoked without verifying if the KHO subsystem is enabled.
- [High] Lockless call to mshv_eventfd_release() races with concurrent ioctls, causing list corruption and double-frees.
--

commit 1d32c3ff3fb719ececcc6529a01a0e1cb8e01bc9
Author: Jork Loeser <jloeser@linux.microsoft.com>

mshv: freeze and vacuum partitions across kexec

This commit introduces logic to freeze VM partitions prior to a kexec,
ensuring no VP modifies VM-memory reused by the new kernel. It preserves
the partition IDs using the KHO subsystem so the successor kernel can
retrieve and clean them up. Finally, it provides a mechanism to vacuum
stale partitions left over from a previous kernel.

> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -1762,6 +1790,201 @@ static void drain_all_vps(const struct mshv_partition *partition)
[ ... ]
> +	if (!nr_alloc) {
> +		pr_info("Frozen 0 partition(s) for kexec\n");
> +		return 0;
> +	}
> +
> +	ids = kho_alloc_preserve(nr_alloc * sizeof(*ids));

[Severity: High]
Should we verify kho_is_enabled() before calling KHO serialization APIs
like kho_alloc_preserve() here and kho_add_subtree() during preservation?

If a standard kexec is performed (with kexec_in_progress true but KHO
disabled), can calling these KHO serialization APIs lead to a NULL pointer
dereference on kho_out.fdt?

> +	if (IS_ERR(ids)) {

[ ... ]

> +	for (i = 0; i < nr_ref; i++) {
> +		/* Ref held; partition stays in hash and alive outside RCU */
> +		scoped_guard(rcu)
> +			partition = mshv_partition_find(ids[i]);
> +
> +		/* Barrier: wait for any in-flight create_vp to complete */
> +		scoped_guard(mutex, &partition->pt_mutex) {}

[Severity: Critical]
Can mshv_partition_find() return NULL here if a partition creation error
path concurrently unlinks and unconditionally frees the partition?

If mshv_freeze_and_get_partition_ids() obtains a reference in the first
loop, but the error path ignores the reference count and frees the
partition anyway, would mshv_partition_find() return NULL after the
partition is removed from the hash table?

This would cause a panic when attempting to acquire &partition->pt_mutex.
Additionally, if the reference count was ignored during a concurrent
kfree, could this result in a use-after-free regression?

[ ... ]

> +		/*
> +		 * Tear down doorbell ports owned by the parent partition.
> +		 * These survive child partition deletion and kexec, so the
> +		 * new kernel would collide on port IDs if we leave them.
> +		 */
> +		mshv_eventfd_release(partition);

[Severity: High]
Is it safe to call mshv_eventfd_release() here without holding
partition->pt_mutex?

If mshv_root.frozen is not checked in the ioeventfd ioctl dispatch path,
could userspace concurrently issue ioctls that modify the ioeventfds_list
while this function is moving it, resulting in list corruption or
double-frees?

> +
> +		mshv_partition_put(partition);
> +	}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=20

* Re: [RFC PATCH 18/20] mshv: Exclude Hyper-V donated pages from crash dump collection
From: sashiko-bot @ 2026-05-28  2:13 UTC (permalink / raw)
  To: Jork Loeser; +Cc: linux-hyperv
In-Reply-To: <20260528004204.1484584-19-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Radix tree key encoding mismatch prevents exclusion of Hyper-V owned pages during crash dump.
- [High] Failure to clear the Hyper-V crash MSR P2 when KHO is disabled or fails to initialize leaves a stale physical address, leading to memory corruption during crash dump collection.
- [Medium] The kdump initialization path calls a KHO API without being gated by `kho_is_enabled()`, violating strict KHO subsystem guidelines.
--

commit a672e76ca918e68a4de01d571d5f2e72810eec75
Author: Jork Loeser <jloeser@linux.microsoft.com>

mshv: Exclude Hyper-V donated pages from crash dump collection

Pages donated to Hyper-V must not be read during crash dump collection.
They are not ordinary RAM and accessing them can hang or corrupt the
crash kernel.

> diff --git a/drivers/hv/mshv_page_preserve.c b/drivers/hv/mshv_page_preserve.c
> index bc3a3a688f5b2..e16fb946790d5 100644
> --- a/drivers/hv/mshv_page_preserve.c
> +++ b/drivers/hv/mshv_page_preserve.c

[ ... ]

> @@ -327,6 +328,57 @@ static int __init alloc_tree(void)
>  	return 0;
>  }
>  
> +#ifdef CONFIG_CRASH_DUMP
> +static struct kho_radix_crash_tree crash_preserved_pages_tree;
> +
> +/**
> + * restore_crash_tree() - Set up the crash tree for dump-time page exclusion.
> + *
> + * In the crash kernel, the old kernel's memory is not in the direct map.
> + * The old kernel stashes the radix tree root PA in Hyper-V crash MSR P2
> + * so we can retrieve it without touching the old kernel's FDT.
> + *
> + * Return: 0 on success, negative error code on failure.
> + */
> +static int __init restore_crash_tree(void)
> +{
> +	phys_addr_t root_pa;
> +
> +	root_pa = hv_get_msr(HV_MSR_CRASH_P2);
> +	if (!root_pa)
> +		return -ENOENT;
> +
> +	/*
> +	 * The MSR may contain stale data from a previous
> +	 * hyperv_report_panic().  Sanity-check that it looks like a
> +	 * page-aligned physical address within the architectural limit.
> +	 */
> +	if (!PAGE_ALIGNED(root_pa) || root_pa >> MAX_POSSIBLE_PHYSMEM_BITS) {
> +		pr_warn("Invalid crash tree root PA: 0x%llx\n",
> +			(unsigned long long)root_pa);
> +		return -EINVAL;
> +	}
> +
> +	return kho_radix_crash_init(&crash_preserved_pages_tree, root_pa);
> +}
> +
> +static bool mshv_vmcore_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
> +{
> +	/*
> +	 * MSHV-owned pages must not be read during crash dump collection.
> +	 * Currently all pages are registered at order 0. If higher-order
> +	 * registrations are added, this lookup will need to handle them
> +	 * (e.g. by querying multiple orders or using a range-based API).
> +	 */
> +	return !kho_radix_crash_contains_page(&crash_preserved_pages_tree,
> +					      pfn, 0);

[Severity: High]
Is the key inserted into the radix tree matched correctly here?

If pages are inserted into the tree using the raw page PFN (e.g.,
page_to_pfn()), this lookup might always miss. The
kho_radix_crash_contains_page() function internally encodes the query key
using kho_encode_radix_key(PFN_PHYS(pfn), order), which adds high-order
metadata bits.

Will this cause the crash kernel to fail to exclude Hyper-V donated pages,
treating them as normal RAM?
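
To make the concern concrete, here is a minimal userspace sketch of how a
raw-PFN key and an encoded key can diverge. The encoding is hypothetical: it
only assumes that the encoder shifts the physical address by the page shift
plus the order and ORs in an order-dependent marker bit, which is one reading
of "high-order metadata bits". The real kho_encode_radix_key() may differ, and
toy_encode_key(), TOY_PAGE_SHIFT and TOY_ORDER0_BIT are illustrative names,
not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12u  /* assumption: 4 KiB pages */
#define TOY_ORDER0_BIT  52u  /* assumption: marker bit position for order 0 */

/* Hypothetical encoder: page frame index plus an order marker bit. */
static uint64_t toy_encode_key(uint64_t phys, unsigned int order)
{
	assert(order < TOY_ORDER0_BIT);
	return (phys >> (TOY_PAGE_SHIFT + order)) |
	       (UINT64_C(1) << (TOY_ORDER0_BIT - order));
}

/* Key as the lookup path would compute it for a given PFN. */
static uint64_t toy_lookup_key(uint64_t pfn, unsigned int order)
{
	return toy_encode_key(pfn << TOY_PAGE_SHIFT, order);
}

/* True when an entry stored under the raw PFN would be found by the lookup. */
static bool toy_raw_pfn_matches(uint64_t pfn)
{
	return pfn == toy_lookup_key(pfn, 0);
}

/* Usage example: the raw PFN never equals the encoded order-0 key. */
static void toy_key_demo(void)
{
	assert(!toy_raw_pfn_matches(UINT64_C(0x1234)));
}
```

Under these assumptions the lookup only hits if the insert side runs the same
encoder on the same physical address and order.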

> +}
> +
> +static struct vmcore_cb mshv_vmcore_cb = {
> +	.pfn_is_ram = mshv_vmcore_pfn_is_ram,
> +};
> +#endif
> +
>  static struct notifier_block reboot_notifier = {
>  	.notifier_call = reboot_cb,
>  	.priority = 0,
> @@ -347,6 +399,24 @@ int __init mshv_preserve_init(void)
>  {
>  	int err;
>  
> +#ifdef CONFIG_CRASH_DUMP
> +	if (is_kdump_kernel()) {
> +		/*
> +		 * Crash kernel only needs the pfn_is_ram callback to exclude
> +		 * MSHV-owned pages from the dump.  No page restoration, no
> +		 * reboot notifier — the crash kernel reboots after collection.
> +		 */
> +		err = restore_crash_tree();
> +		if (err) {
> +			pr_err("Could not set up crash page tree: %d; MSHV pages may appear in dump\n",
> +			       err);
> +			return 0;
> +		}
> +		register_vmcore_cb(&mshv_vmcore_cb);
> +		return 0;
> +	}
> +#endif
> +

[Severity: Medium]
Should this path be gated by kho_is_enabled() as well?

The KHO subsystem documentation mandates checking kho_is_enabled() at the
entry point of any code path that uses KHO APIs. Here, restore_crash_tree()
(which calls the KHO API kho_radix_crash_init()) is invoked before checking
if KHO is enabled.

>  	if (!kho_is_enabled()) {
>  		pr_err("KHO is disabled; page deposits will fail.\n");
>  		return 0;

[Severity: High]
If the system kexecs into a new kernel where KHO is disabled (or initialization
fails), this function returns early without clearing the HV_MSR_CRASH_P2 MSR.

Since Hyper-V crash MSRs persist across kexecs, could the new kernel leave a
stale physical address in P2?

If this new kernel panics, the kdump kernel will read the stale MSR. Because
the address is page-aligned and within bounds, it passes sanity checks in
restore_crash_tree(), and kho_radix_crash_init() will recursively map and
parse the arbitrary memory as a radix tree.

By executing node->table[i] = (u64)(uintptr_t)child, might this actively write
virtual addresses into the crashed kernel's physical memory, corrupting the
crash dump or triggering panics in the kdump kernel due to mapping invalid
addresses?
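
As a rough illustration of why the sanity check alone cannot catch this, the
userspace sketch below mirrors the two conditions from restore_crash_tree().
The 4 KiB page size and the 52-bit physical address limit are assumptions
chosen for the example, and sketch_root_pa_plausible() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE     UINT64_C(4096)  /* assumption: 4 KiB pages */
#define SKETCH_PHYSMEM_BITS  52              /* assumption: 52-bit PA limit */

/* Mirrors the conditions checked before the tree is initialized. */
static bool sketch_root_pa_plausible(uint64_t root_pa)
{
	if (!root_pa)
		return false;   /* empty MSR */
	if (root_pa & (SKETCH_PAGE_SIZE - 1))
		return false;   /* not page aligned */
	if (root_pa >> SKETCH_PHYSMEM_BITS)
		return false;   /* beyond the physical address limit */
	return true;
}

/* Usage example: a stale but well-formed value is accepted. */
static void sketch_root_pa_demo(void)
{
	uint64_t stale_pa = UINT64_C(0x12345000);   /* arbitrary old value */

	assert(sketch_root_pa_plausible(stale_pa));
	assert(!sketch_root_pa_plausible(stale_pa + 8));
}
```

Any page-aligned leftover value below the limit passes, so the check filters
out garbage but cannot tell a current root from a stale one.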

>  	}
>  
>  	err = restore_tree();

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=18

^ permalink raw reply

* Re: [PATCH net-next v11 0/6] net: mana: Per-vPort EQ and MSI-X management
From: Jakub Kicinski @ 2026-05-28  2:27 UTC (permalink / raw)
  To: Long Li
  Cc: Konstantin Taranov, David S . Miller, Paolo Abeni, Eric Dumazet,
	Andrew Lunn, Jason Gunthorpe, Leon Romanovsky, Haiyang Zhang,
	K . Y . Srinivasan, Wei Liu, Dexuan Cui, shradhagupta,
	Simon Horman, netdev, linux-rdma, linux-hyperv, linux-kernel
In-Reply-To: <20260523020258.1107742-1-longli@microsoft.com>

On Fri, 22 May 2026 19:02:50 -0700 Long Li wrote:
> The following changes since commit 95fab46aea57d6d7b76b319341acbefe8a9293c8:
> 
>   Merge branch 'net-convert-atm-xdp-af_iucv-l2tp_ppp-rxrpc-tipc-to-getsockopt_iter' (2026-05-22 11:11:12 -0700)
> 
> are available in the Git repository at:
> 
>   https://github.com/longlimsft/linux.git tags/mana-eq-msi-v11
> 
> for you to fetch changes up to a26d11135abba51e81ae8b9689e288718af95088:
> 
>   RDMA/mana_ib: Allocate interrupt contexts on EQs (2026-05-22 20:35:43 +0000)

The branch is no good; it needs to be your patches applied on top
of a commit that is already in Linus's tree. The current branch is on
top of net-next, so RDMA would have to pull in hundreds of networking
commits together with your changes.

^ permalink raw reply

* Re: [PATCH v3 01/10] drm/damage-helper: Do not alter damage clips on modeset, but ignore them
From: Thomas Zimmermann @ 2026-05-28  6:14 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: linux-hyperv
In-Reply-To: <20260527151944.F29571F000E9@smtp.kernel.org>



Am 27.05.26 um 17:19 schrieb sashiko-bot@kernel.org:
[...]
> [Severity: High]
> By no longer setting plane_state->fb_damage_clips to NULL during a modeset,
> will this break drivers that bypass the damage iterator?

No drivers bypass the damage iterator. Doing so would be an error.

>
> Drivers like AMDGPU (in fill_dc_dirty_rects) and vmwgfx directly call
> drm_plane_get_damage_clips_count() and drm_plane_get_damage_clips().
> Because these accessor functions do not check the new ignore_damage_clips flag,
> won't they see a non-zero clip count if userspace provided one?
>
> If they see a non-zero clip count during a modeset, will they incorrectly
> skip their fallback paths and perform a partial update instead of the
> required full framebuffer update?
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply

* Re: [PATCH v3 07/10] drm/damage-helper: Remove old state from drm_atomic_helper_damage_merged()
From: Thomas Zimmermann @ 2026-05-28  6:20 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: linux-hyperv
In-Reply-To: <20260527151003.4F25D1F000E9@smtp.kernel.org>



Am 27.05.26 um 17:10 schrieb sashiko-bot@kernel.org:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] This problem wasn't introduced by this patch, but the plane atomic update callback `gm12u320_pipe_update` executes a path that acquires a mutex. Plane atomic update callbacks run in atomic context during non-blocking commits, making the use of sleeping locks unsafe.
> - [High] This problem wasn't introduced by this patch, but the plane atomic update callback `repaper_pipe_update` executes a path that performs a memory allocation with `GFP_KERNEL`. Plane atomic update callbacks run in atomic context, where allocations that may sleep are strictly prohibited.

Unrelated issues should be fixed separately.

> --
>
> commit da86f07c371224a54756c1f750c01c663a2a7a86
> Author: Thomas Zimmermann <tzimmermann@suse.de>
>
> drm/damage-helper: Remove old state from drm_atomic_helper_damage_merged()
>
> This patch removes the unused old_state parameter from
> drm_atomic_helper_damage_merged() and updates all callers. It also cleans up
> the old_plane_state variable in caller functions where it is no longer
> needed.
>
> [ ... ]
>
>> diff --git a/drivers/gpu/drm/tiny/gm12u320.c b/drivers/gpu/drm/tiny/gm12u320.c
>> --- a/drivers/gpu/drm/tiny/gm12u320.c
>> +++ b/drivers/gpu/drm/tiny/gm12u320.c
>> @@ -582,7 +582,7 @@ static void gm12u320_pipe_update(struct drm_simple_display_pipe *pipe,
>>   	struct drm_shadow_plane_state *shadow_plane_state = to_drm_shadow_plane_state(state);
>>   	struct drm_rect rect;
>>   
>> -	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
>> +	if (drm_atomic_helper_damage_merged(state, &rect))
> [Severity: High]
> This is a pre-existing issue, but does gm12u320_pipe_update() safely execute
> in an atomic context?
>
> This function acts as the simple display pipe's plane atomic update callback,
> which runs in atomic context during non-blocking commits. However, it calls
> gm12u320_fb_mark_dirty(), which acquires a mutex:
>
> drivers/gpu/drm/tiny/gm12u320.c:gm12u320_fb_mark_dirty() {
>      ...
>      mutex_lock(&gm12u320->fb_update.lock);
>      ...
> }
>
> Since mutex_lock() is a sleeping function, could this lead to deadlocks or
> kernel warnings when called from an atomic context?
>
>>   		gm12u320_fb_mark_dirty(state->fb, &shadow_plane_state->data[0], &rect);
>>   }
> [ ... ]
>
>> diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
>> --- a/drivers/gpu/drm/tiny/repaper.c
>> +++ b/drivers/gpu/drm/tiny/repaper.c
>> @@ -837,7 +837,7 @@ static void repaper_pipe_update(struct drm_simple_display_pipe *pipe,
>>   	if (!pipe->crtc.state->active)
>>   		return;
>>   
>> -	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
>> +	if (drm_atomic_helper_damage_merged(state, &rect))
> [Severity: High]
> This isn't a bug introduced by this patch, but is it safe for
> repaper_pipe_update() to trigger allocations with GFP_KERNEL?
>
> Like the issue above, this acts as the plane atomic update callback. It calls
> repaper_fb_dirty(), which performs a memory allocation that may sleep:
>
> drivers/gpu/drm/tiny/repaper.c:repaper_fb_dirty() {
>      ...
>      buf = kmalloc(fb->width * fb->height / 8, GFP_KERNEL);
>      ...
>      ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>      ...
> }
>
> Because plane atomic update callbacks run in atomic context, allocations that
> may sleep (like GFP_KERNEL) and sleeping functions like
> drm_gem_fb_begin_cpu_access() are strictly prohibited. Is there a plan to
> shift these sleeping operations to a workqueue or use non-sleeping
> allocations?
>
>>   		repaper_fb_dirty(state->fb, shadow_plane_state->data,
>>   				 &shadow_plane_state->fmtcnv_state);
>>   }

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply

* Re: [PATCH v3 10/10] drm/vmwgfx: Remove unused field struct vmwgfx_du_update_plane.old_state
From: Thomas Zimmermann @ 2026-05-28  6:21 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: linux-hyperv
In-Reply-To: <20260527152200.9F49F1F000E9@smtp.kernel.org>



Am 27.05.26 um 17:22 schrieb sashiko-bot@kernel.org:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] Plane atomic_update callbacks indirectly call sleeping functions (mutex_lock/mutex_lock_interruptible) from an atomic context.
> --

Preexisting issues should be fixed separately.

>
> commit deccb03fba43ed6006d88ae9933240eb52b52fee
> Author: Thomas Zimmermann <tzimmermann@suse.de>
>
> drm/vmwgfx: Remove unused field struct vmwgfx_du_update_plane.old_state
>
> This commit removes the unused old_state field from the
> vmw_du_update_plane structure and updates all caller functions to
> accommodate this change.
>
>> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
>> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
>> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
> [ ... ]
>> @@ -721,7 +717,6 @@ static void
>>   vmw_sou_primary_plane_atomic_update(struct drm_plane *plane,
>>   				    struct drm_atomic_commit *state)
>>   {
>> -	struct drm_plane_state *old_state = drm_atomic_get_old_plane_state(state, plane);
>>   	struct drm_plane_state *new_state = drm_atomic_get_new_plane_state(state, plane);
>>   	struct drm_crtc *crtc = new_state->crtc;
>>   	struct vmw_fence_obj *fence = NULL;
>> @@ -734,12 +729,9 @@ vmw_sou_primary_plane_atomic_update(struct drm_plane *plane,
>>   			vmw_framebuffer_to_vfb(new_state->fb);
>>   
>>   		if (vfb->bo)
>> -			ret = vmw_sou_plane_update_bo(dev_priv, plane,
>> -						      old_state, vfb, &fence);
>> +			ret = vmw_sou_plane_update_bo(dev_priv, plane, vfb, &fence);
>>   		else
>> -			ret = vmw_sou_plane_update_surface(dev_priv, plane,
>> -							   old_state, vfb,
>> -							   &fence);
>> +			ret = vmw_sou_plane_update_surface(dev_priv, plane, vfb, &fence);
> [Severity: High]
> This is a pre-existing issue, but does this code path end up calling sleeping
> functions from an atomic context?
>
> Looking at the call chain, vmw_sou_primary_plane_atomic_update() is a plane
> atomic update callback, which runs in atomic context where sleeping is
> forbidden. However, it calls vmw_sou_plane_update_surface(), which assigns a
> mutex pointer to the update structure:
>
> vmw_sou_plane_update_surface()
>    srf_update.base.mutex = &dev_priv->cmdbuf_mutex;
>    vmw_du_helper_plane_update(&srf_update.base);
>
> This closure is then passed down to vmw_du_helper_plane_update(), which calls
> vmw_validation_prepare(). That function then takes the lock using
> mutex_lock() or mutex_lock_interruptible().
>
> Could this lead to system instability or deadlocks if the atomic commit runs
> in a non-blocking path where sleeping is not allowed?
>
> [ ... ]
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply

* Re: [PATCH v3 05/10] drm/appletbdrm: Allocate request/response buffers in begin_fb_access
From: Thomas Zimmermann @ 2026-05-28  6:26 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: linux-hyperv, Aditya Garg, Aun-Ali Zaidi
In-Reply-To: <20260527154205.140101F000E9@smtp.kernel.org>

(cc'ing Aditya, Aun-Ali)

Am 27.05.26 um 17:42 schrieb sashiko-bot@kernel.org:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
>
> Pre-existing issues:
> - [Critical] Unbounded accumulation of damage clip sizes causes an integer overflow, resulting in an undersized buffer allocation and subsequent out-of-bounds heap write.
> - [High] The plane `atomic_update` callback incorrectly calls sleeping functions, violating DRM atomic context guidelines.
> - [High] Mismatched coordinate space intersection and inconsistent calculation ordering lead to uninitialized memory gaps and malformed hardware payloads.
> --
>
> commit fc37b98d4bfc3183056fa690c4e2e659894c67e7
> Author: Thomas Zimmermann <tzimmermann@suse.de>
> Date:   Wed May 27 14:46:24 2026 +0000
>
> drm/appletbdrm: Allocate request/response buffers in begin_fb_access
>
> In atomic_check, damage handling is not fully evaluated. Another
> atomic_check helper could trigger a full modeset and thus invalidate
> damage clips.
>
> Allocation of the request/response buffers in appletbdrm depends on
> correct damage information. Otherwise it might allocate incorrectly
> sized buffers. Allocate the buffers in the driver's begin_fb_access
> helper. It runs early during the commit when damage clipping has been
> fully evaluated.
>
>> diff --git a/drivers/gpu/drm/tiny/appletbdrm.c b/drivers/gpu/drm/tiny/appletbdrm.c
>> index cdd35af498920..b683dcb35b0be 100644
>> --- a/drivers/gpu/drm/tiny/appletbdrm.c
>> +++ b/drivers/gpu/drm/tiny/appletbdrm.c
>> @@ -315,33 +315,16 @@ static const u32 appletbdrm_primary_plane_formats[] = {
>>   	DRM_FORMAT_XRGB8888, /* emulated */
>>   };
>>   
>> -static int appletbdrm_primary_plane_helper_atomic_check(struct drm_plane *plane,
>> -						   struct drm_atomic_commit *state)
>> +static int appletbdrm_primary_plane_helper_begin_fb_access(struct drm_plane *plane,
>> +							   struct drm_plane_state *new_plane_state)
>>   {
>>   	struct appletbdrm_plane_state *appletbdrm_state = to_appletbdrm_plane_state(new_plane_state);
>> +	size_t frames_size = 0;
>>   	struct drm_atomic_helper_damage_iter iter;
>>   	struct drm_rect damage;
>>   	size_t request_size;
>>   
>> -	drm_atomic_helper_damage_iter_init(&iter, old_plane_state, new_plane_state);
>> +	drm_atomic_helper_damage_iter_init(&iter, NULL, new_plane_state);
>>   	drm_atomic_for_each_plane_damage(&iter, &damage) {
>>   		frames_size += struct_size((struct appletbdrm_frame *)0, buf, rect_size(&damage));
>>   	}
> [Severity: Critical]
> This is a pre-existing issue, but could the unbounded accumulation of damage
> clip sizes cause an integer overflow?

Pre-existing issues should be fixed separately.

>
> The FB_DAMAGE_CLIPS property allows userspace to pass an unbounded number of
> damage clips, limited only by the maximum DRM property blob size.
>
> If millions of clips are provided, accumulating the sizes with:
>
>      frames_size += struct_size(...)
>
> could wrap around size_t on 32-bit systems. Even if frames_size doesn't wrap,
> the subsequent calculation:
>
>      request_size = ALIGN(sizeof(...) + frames_size + sizeof(...), 16)
>
> might easily wrap around to a small value, causing kvzalloc() to successfully
> allocate an undersized buffer.
>
> When appletbdrm_flush_damage() iteratively processes the exact same clips,
> could it trust the per-clip buf_size and copy pixel data into the undersized
> buffer via drm_fb_memcpy(), resulting in an out-of-bounds heap write?
>
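
For reference, an overflow-checked accumulation could look roughly like the
userspace sketch below. It is not the driver's code: the function names are
invented for illustration, and it relies on the GCC/Clang builtin
__builtin_add_overflow(), where kernel code would more likely use
check_add_overflow() from <linux/overflow.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Add one frame's size to the running total, refusing to wrap around. */
static bool sketch_accumulate(size_t *total, size_t frame_size)
{
	size_t sum;

	if (__builtin_add_overflow(*total, frame_size, &sum))
		return false;   /* caller should reject the request */
	*total = sum;
	return true;
}

/* Sum per-clip frame sizes; returns false if the total would overflow. */
static bool sketch_total_frames_size(const size_t *sizes, size_t count,
				     size_t *out)
{
	size_t total = 0;

	for (size_t i = 0; i < count; i++) {
		if (!sketch_accumulate(&total, sizes[i]))
			return false;
	}
	*out = total;
	return true;
}

/* Usage example: a small sum succeeds, a wrapping sum is rejected. */
static void sketch_overflow_demo(void)
{
	const size_t ok[] = { 16, 32 };
	const size_t bad[] = { SIZE_MAX, 1 };
	size_t total = 0;
	bool res;

	res = sketch_total_frames_size(ok, 2, &total);
	assert(res && total == 48);
	res = sketch_total_frames_size(bad, 2, &total);
	assert(!res);
}
```

The same check would also have to cover the final header, footer and
alignment arithmetic, since that step can wrap even when the sum does not.
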
> [Severity: High]
> This is also a pre-existing issue, but does this allocation properly account
> for the intersection calculations that happen later?
>
> In appletbdrm_flush_damage(), buf_size is calculated before the intersection:
>
>      buf_size = rect_size(&damage);
>      if (!drm_rect_intersect(&dst_clip, &damage))
>          continue;
>
> If the intersection shrinks the damage rectangle, drm_fb_memcpy() copies fewer
> bytes than buf_size, leaving zeroed memory at the end of the frame buffer.
> Since frame->buf_size was already set to the original size, the hardware will
> read these zeroes as pixel data.
>
> Furthermore, dst_clip is in CRTC coordinates while damage is in Framebuffer
> (SRC) coordinates. If userspace utilizes a framebuffer larger than the CRTC
> and applies a source offset, the coordinate spaces will not overlap properly.
> This causes valid damage clips to be skipped entirely.
>
> Since the begin_fb_access() helper factored these skipped clips into the total
> frames_size allocation without performing the intersection, skipping them later
> fails to advance the frame pointer.
>
> Could this leave uninitialized gaps before the footer, corrupting the packed
> payload structure sent to the hardware?
>
> [ ... ]
>
>> @@ -468,7 +474,7 @@ static int appletbdrm_flush_damage(struct appletbdrm_device *adev,
>>   }
>>   
>>   static void appletbdrm_primary_plane_helper_atomic_update(struct drm_plane *plane,
>> -						     struct drm_atomic_commit *old_state)
>> +							  struct drm_atomic_commit *old_state)
>>   {
>>   	struct appletbdrm_device *adev = drm_to_adev(plane->dev);
>>   	struct drm_device *drm = plane->dev;
> [Severity: High]
> This is a pre-existing issue, but does this code execute sleeping functions
> from an atomic context?
>
> According to DRM atomic context guidelines, plane atomic update callbacks
> execute in an atomic context where sleeping is forbidden.
>
> The appletbdrm_primary_plane_helper_atomic_update() callback invokes
> appletbdrm_flush_damage(), which subsequently calls:
>
>      drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>
> This function takes locks and can wait on fences. The flush function then
> performs synchronous USB bulk transfers, which also sleep.
>
> Could this break atomic context rules and cause system instability?
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply

* Re: [PATCH v2 0/9] drm: Limit DRM_IOCTL_WAIT_VBLANK to vblank interrupts
From: Thomas Zimmermann @ 2026-05-28  7:54 UTC (permalink / raw)
  To: Julian Orth
  Cc: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux, amd-gfx, dri-devel,
	wayland-devel, linux-hyperv, virtualization, spice-devel
In-Reply-To: <CAHijbEVZBRTK7yhZy8gaZwb19JMzUD_nA2S1LOKX2NrK19RBsQ@mail.gmail.com>

Hi

Am 27.05.26 um 18:31 schrieb Julian Orth:
> On Wed, May 27, 2026 at 3:39 PM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>> DRM's WAIT_VBLANK ioctl synchronizes user-space clients to display
>> refresh. This is meaningless with vblank timers, which run unrelated
>> to the hardware's vblank.
>>
>> Disable the ioctl for simulated vblanks. Set DRM_VBLANK_FLAG_SIMULATED
>> for CRTCs with simulated vblank events in all such drivers. The vblank
>> timers of these devices still rate-limit the number of page-flip events
>> to match the display refresh.
>>
>> According to maintainers, user-space compositors do not require the ioctl
>> for rate-limiting display output. Weston, Kwin and Mutter rely on completion
>> events. Mutter optionally uses the WAIT_VBLANK ioctl only to optimize the
>> time from input to output.
>>
>> When testing with mutter and weston, the page-flip rate appears correct
>> with the patch set applied.
> To avoid this being a regression, you need to test that this change
> does not regress input latency.

Let me stress that, in the current situation, timing information is
either high-quality, low-quality or not available at all, depending on
the driver and hardware.

>
> As discussed on IRC, compositors use vblank data to predict the time
> of the next flip event. For each device that you are touching here,
> there are two possibilities:
>
> - The vblank data is related to the flip timing, i.e. flip events and
> vblank events are sent at almost the same time. In this case removing
> these apis removes the path for compositors to predict the time of the
> next flip event. Input latency will therefore regress after idle
> periods when the compositor no longer has the time of the last vblank.

User-space compositors seem to operate under this assumption. That, I 
think, makes sense on better hardware with rendering and vblank IRQs. 
Page flips are fast on such systems.

>
> - The vblank data has nothing to do with the time of the next flip
> event. In this case this series could in fact improve latency because
> it removes the incorrect data from the compositor.

Most of the hardware that would use vblank timers falls in this 
category. Page flips often consist of memcpys into video memory, or they 
transfer pixel data over slow peripheral busses. The amount of work per 
page flip varies with the size of the damage rectangles.

Any vblank timing information here is therefore of low quality. For some 
scenarios, it would be common to miss a vblank or even the one after it.


IMHO, the first thing to discuss is whether having possibly low-quality 
timing information is preferable to having either high-quality timing or 
none. I have no strong opinion, but would tend to the latter.

Best regards
Thomas


>
> Whether the times of the flip events correspond to hardware timings is
> not relevant. Everything in wayland compositors is scheduled against
> flip event timings and they are also forwarded to clients for their
> frame scheduling. If the flip timings are wrong/out of sync with the
> hardware, then removing the vblank apis does not improve this
> situation.
>
>> This change has been discussed at length on IRC recently.
>>
>> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-08&show_html=true
>> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-12&show_html=true
>> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-13&show_html=true
>> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-15&show_html=true
>>
>> v2:
>> - add filter to CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls (Michel)
>> - clarify Mutter's behavior in cover letter (Michel)
>>
>> Thomas Zimmermann (9):
>>    drm/vblank: Add drmm_vblank_init() to indicate managed cleanup
>>    drm/vblank: Add DRM_VBLANK_FLAG_SIMULATED
>>    drm/amdgpu: vkms: Set DRM_VBLANK_FLAG_SIMULATED
>>    drm/bochs: Set DRM_VBLANK_FLAG_SIMULATED
>>    drm/cirrus: Set DRM_VBLANK_FLAG_SIMULATED
>>    drm/hypervdrm: Set DRM_VBLANK_FLAG_SIMULATED
>>    drm/qxl: Set DRM_VBLANK_FLAG_SIMULATED
>>    drm/virtgpu: Set DRM_VBLANK_FLAG_SIMULATED
>>    drm/vkms: Set DRM_VBLANK_FLAG_SIMULATED
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c    |  3 ++-
>>   drivers/gpu/drm/drm_vblank.c                | 26 +++++++++++++++------
>>   drivers/gpu/drm/drm_vblank_helper.c         |  2 +-
>>   drivers/gpu/drm/hyperv/hyperv_drm_modeset.c |  2 +-
>>   drivers/gpu/drm/qxl/qxl_display.c           |  2 +-
>>   drivers/gpu/drm/tiny/bochs.c                |  2 +-
>>   drivers/gpu/drm/tiny/cirrus-qemu.c          |  2 +-
>>   drivers/gpu/drm/virtio/virtgpu_display.c    |  2 +-
>>   drivers/gpu/drm/vkms/vkms_drv.c             |  4 ++--
>>   include/drm/drm_crtc.h                      |  2 +-
>>   include/drm/drm_device.h                    |  2 +-
>>   include/drm/drm_vblank.h                    | 15 +++++++++++-
>>   12 files changed, 45 insertions(+), 19 deletions(-)
>>
>>
>> base-commit: 5fb5a9a63cf5ece68e0eeb6fa397da27712bccf0
>> --
>> 2.54.0
>>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply

* Re: [PATCH net v3 2/2] net: mana: Skip redundant detach on already-detached port
From: Paolo Abeni @ 2026-05-28  9:30 UTC (permalink / raw)
  To: Dipayaan Roy, kys, haiyangz, wei.liu, decui, andrew+netdev, davem,
	edumazet, kuba, leon, longli, kotaranov, horms, shradhagupta,
	ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
	john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov,
	pavan.chebbi
In-Reply-To: <20260525081129.1230035-3-dipayanroy@linux.microsoft.com>

On 5/25/26 10:08 AM, Dipayaan Roy wrote:
> When mana_per_port_queue_reset_work_handler() runs after a previous
> detach succeeded but attach failed, the port is left in a detached
> state with apc->tx_qp and apc->rxqs already freed. Calling
> mana_detach() again unconditionally leads to NULL pointer dereferences
> during queue teardown.
> 
> Add an early exit in mana_detach() when the port is already in
> detached state (!netif_device_present) for non-close callers, making
> it safe to call idempotently. This allows the queue reset handler and
> other recovery paths to simply retry mana_attach() without redundant
> teardown.
> 
> Fixes: 3b194343c250 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.")
> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 0582803907a8..1e1ad2795c3c 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3350,6 +3350,12 @@ int mana_detach(struct net_device *ndev, bool from_close)
>  
>  	ASSERT_RTNL();
>  
> +	/* If already detached (indicates detach succeeded but attach failed
> +	 * previously). Now skip mana detach and just retry mana_attach.
> +	 */
> +	if (!from_close && !netif_device_present(ndev))
> +		return 0;
> +
>  	apc->port_st_save = apc->port_is_up;
>  	apc->port_is_up = false;

sashiko(gemini) notes the above can lead to a different race:

---
Can this early return cause state machine corruption by bypassing the
updates to apc->port_st_save?

Consider this sequence:
1. queue_reset_work runs, mana_detach() succeeds (apc->port_st_save = true,
   apc->port_is_up = false), but mana_attach() fails.
2. The admin brings the interface down (ip link set dev eth0 down), skipping
   mana_close() since apc->port_is_up is false.
3. The admin changes the MTU, triggering mana_change_mtu() which calls
   mana_detach() followed by mana_attach().
4. mana_detach() hits this new early return, preserving
   apc->port_st_save == true.

When mana_attach() runs, it sees apc->port_st_save == true and allocates
queues, setting apc->vport_use_count = 1 and apc->port_is_up = true, even
though the interface is administratively down.

If the admin then brings the interface up, mana_open() will unconditionally
call mana_alloc_queues(). That function calls mana_cfg_vport(), which will
return -EBUSY because apc->vport_use_count is already 1.

This leaves mana_open() failing and the interface down. Since the interface
is already down, trying to bring it down again is a no-op, meaning
mana_close() is never called to clean up the orphaned queues.

Does this sequence permanently brick the port until the driver is reloaded?
---
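
The sequence above can be replayed against a small toy model. This is a
userspace sketch, not driver code: struct toy_port and all toy_*() helpers are
invented for illustration and only model the fields and transitions named in
the note, under the assumption that they behave as the note describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for the per-port state named in the note. */
struct toy_port {
	bool port_is_up;      /* apc->port_is_up */
	bool port_st_save;    /* apc->port_st_save */
	bool present;         /* netif_device_present() */
	int vport_use_count;  /* apc->vport_use_count */
};

static int toy_alloc_queues(struct toy_port *p)
{
	if (p->vport_use_count > 0)
		return -EBUSY;            /* vport is already configured */
	p->vport_use_count = 1;
	return 0;
}

static int toy_detach(struct toy_port *p, bool from_close)
{
	if (!from_close && !p->present)
		return 0;                 /* the early return added by the patch */
	p->port_st_save = p->port_is_up;
	p->port_is_up = false;
	if (p->port_st_save)
		p->vport_use_count = 0;   /* queues torn down */
	if (!from_close)
		p->present = false;
	return 0;
}

static int toy_attach(struct toy_port *p, bool inject_failure)
{
	if (inject_failure)
		return -ENOMEM;           /* port stays detached */
	if (p->port_st_save) {
		int err = toy_alloc_queues(p);

		if (err)
			return err;
	}
	p->port_is_up = p->port_st_save;
	p->present = true;
	return 0;
}

static void toy_admin_down(struct toy_port *p)
{
	if (p->port_is_up)                /* close is skipped if not up */
		toy_detach(p, true);
}

static int toy_open(struct toy_port *p)
{
	int err = toy_alloc_queues(p);

	if (!err)
		p->port_is_up = true;
	return err;
}

/* Replays steps 1 to 4 and returns the result of the final open. */
static int toy_replay(void)
{
	struct toy_port p = {
		.port_is_up = true, .present = true, .vport_use_count = 1,
	};
	int err;

	toy_detach(&p, false);            /* step 1: detach succeeds ... */
	err = toy_attach(&p, true);       /* ... but attach fails */
	assert(err != 0);
	toy_admin_down(&p);               /* step 2: close is skipped */
	toy_detach(&p, false);            /* steps 3 and 4: early return */
	err = toy_attach(&p, false);      /* queues allocated while down */
	assert(err == 0 && p.port_is_up);
	return toy_open(&p);              /* admin brings the interface up */
}
```

In this model toy_replay() ends with the open call failing with -EBUSY, which
matches the outcome described in the note.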

I think you need to be more restrictive in the early return check.

/P


^ permalink raw reply

* Re: [PATCH v2 0/9] drm: Limit DRM_IOCTL_WAIT_VBLANK to vblank interrupts
From: Julian Orth @ 2026-05-28 10:01 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: simona, airlied, mdaenzer, pekka.paalanen, jadahl, contact,
	maarten.lankhorst, mripard, mhklinux, amd-gfx, dri-devel,
	wayland-devel, linux-hyperv, virtualization, spice-devel
In-Reply-To: <1d399c2d-b50f-4d19-8170-9db8961e4227@suse.de>

On Thu, May 28, 2026 at 9:54 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 27.05.26 um 18:31 schrieb Julian Orth:
> > On Wed, May 27, 2026 at 3:39 PM Thomas Zimmermann <tzimmermann@suse.de> wrote:
> >> DRM's WAIT_VBLANK ioctl synchronizes user-space clients to display
> >> refresh. This is meaningless with vblank timers, which run unrelated
> >> to the hardware's vblank.
> >>
> >> Disable the ioctl for simulated vblanks. Set DRM_VBLANK_FLAG_SIMULATED
> >> for CRTCs with simulated vblank events in all such drivers. The vblank
> >> timers of these devices still rate-limit the number of page-flip events
> >> to match the display refresh.
> >>
> >> According to maintainers, user-space compositors do not require the ioctl
> >> for rate-limiting display output. Weston, Kwin and Mutter rely on completion
> >> events. Mutter optionally uses the WAIT_VBLANK ioctl only to optimize the
> >> time from input to output.
> >>
> >> When testing with mutter and weston, the page-flip rate appears correct
> >> with the patch set applied.
> > To avoid this being a regression, you need to test that this change
> > does not regress input latency.
>
> Let me stress that, in the current situation, timing information is
> either high-quality, low-quality or not available at all, depending on
> the driver and hardware.
>
> >
> > As discussed on IRC, compositors use vblank data to predict the time
> > of the next flip event. For each device that you are touching here,
> > there are two possibilities:
> >
> > - The vblank data is related to the flip timing, i.e. flip events and
> > vblank events are sent at almost the same time. In this case removing
> > these apis removes the path for compositors to predict the time of the
> > next flip event. Input latency will therefore regress after idle
> > periods when the compositor no longer has the time of the last vblank.
>
> User-space compositors seem to operate under this assumption. That, I
> think, makes sense on better hardware with rendering and vblank IRQs.
> Page flips are fast on such systems.
>
> >
> > - The vblank data has nothing to do with the time of the next flip
> > event. In this case this series could in fact improve latency because
> > it removes the incorrect data from the compositor.
>
> Most of the hardware that would use vblank timers falls in this
> category. Page flips often consist of memcpys into video memory, or they
> transfer pixel data over slow peripheral busses. The amount of work per
> page flip varies with the size of the damage rectangles.
>
> Any vblank timing information here is therefore of low quality. For some
> scenarios, it would be common to miss a vblank or even the one after it.

What matters is whether the flip event will be aligned to _some_ vblank
event. As long as that is the case, the compositor can estimate which
vblank it will hit based on previous frames and can schedule its work
accordingly. I believe KWin and Mutter already support scheduling
frames for multiple vblanks in the future to support low-powered
devices or devices that are under high load. I have not looked into
this myself.

But even on high-powered devices compositors already take per-commit
kernel work into account. For example, by default I aim to commit 1.5
ms before vblank. This grace period is adjusted dynamically if I miss
the expected vblank.

Therefore I don't think this is an argument against exposing vblank
info. Even if the hardware had such an interrupt, the memcpy and
slow-bus issues would continue to apply.
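
As a rough illustration of the grace-period scheduling described above,
here is a minimal C sketch. All names (struct frame_sched,
next_vblank_ns(), commit_deadline_ns(), frame_flipped()) and the
adjustment policy (grow the margin by 0.5 ms after a miss, capped at
half a refresh period) are assumptions made for this example. They are
not taken from any compositor or from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define MARGIN_DEFAULT_NS 1500000ULL /* the 1.5 ms default mentioned above */
#define MARGIN_STEP_NS     500000ULL /* assumed growth step after a miss */

/* Hypothetical per-output scheduling state. */
struct frame_sched {
	uint64_t refresh_ns;   /* refresh period, e.g. 16666667 for 60 Hz */
	uint64_t last_flip_ns; /* timestamp of the most recent flip event */
	uint64_t margin_ns;    /* grace period before the predicted vblank */
};

/* Predict the first vblank after now_ns by extrapolating from the last flip. */
static uint64_t next_vblank_ns(const struct frame_sched *s, uint64_t now_ns)
{
	uint64_t periods;

	if (now_ns < s->last_flip_ns)
		return s->last_flip_ns;

	periods = (now_ns - s->last_flip_ns) / s->refresh_ns + 1;
	return s->last_flip_ns + periods * s->refresh_ns;
}

/* Latest commit time that is still expected to hit target_vblank_ns. */
static uint64_t commit_deadline_ns(const struct frame_sched *s,
				   uint64_t target_vblank_ns)
{
	return target_vblank_ns - s->margin_ns;
}

/*
 * Record a flip event. If the flip arrived more than half a period after
 * the targeted vblank, treat it as a miss and widen the margin (assumed
 * policy). Returns true if the target was missed.
 */
static bool frame_flipped(struct frame_sched *s, uint64_t flip_ns,
			  uint64_t target_vblank_ns)
{
	bool missed = flip_ns > target_vblank_ns + s->refresh_ns / 2;

	s->last_flip_ns = flip_ns;
	if (missed) {
		s->margin_ns += MARGIN_STEP_NS;
		if (s->margin_ns > s->refresh_ns / 2)
			s->margin_ns = s->refresh_ns / 2;
	}
	return missed;
}
```

The sketch relies only on flip timestamps being aligned to some vblank.
It does not depend on whether those timestamps come from a hardware
interrupt or from a timer.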

>
>
> IMHO, the first thing to discuss is whether having possibly low-quality
> timing information is preferable to having either high-quality timing or
> none. I have no strong opinion, but would tend to the latter.

If you want to make userspace aware that vblank events are not backed
by hardware interrupts, then maybe this could be exposed as a driver
cap or a flag in the vblank event. Userspace could then decide on
its own what to do with that information.

Currently I don't think any compositor would use that information
since they target flip times and don't care if those times are driven
by hardware or software (since this is not actionable by userspace
anyway). So maybe the useful flag would be "flip times will not be
aligned to any vblank event" if that applies to any driver.

>
> Best regards
> Thomas
>
>
> >
> > Whether the times of the flip events correspond to hardware timings is
> > not relevant. Everything in wayland compositors is scheduled against
> > flip event timings and they are also forwarded to clients for their
> > frame scheduling. If the flip timings are wrong/out of sync with the
> > hardware, then removing the vblank apis does not improve this
> > situation.
> >
> >> This change has been discussed at length on IRC recently.
> >>
> >> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-08&show_html=true
> >> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-12&show_html=true
> >> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-13&show_html=true
> >> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&highlight_names=&date=2026-05-15&show_html=true
> >>
> >> v2:
> >> - add filter to CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls (Michel)
> >> - clarify Mutter's behavior in cover letter (Michel)
> >>
> >> Thomas Zimmermann (9):
> >>    drm/vblank: Add drmm_vblank_init() to indicate managed cleanup
> >>    drm/vblank: Add DRM_VBLANK_FLAG_SIMULATED
> >>    drm/amdgpu: vkms: Set DRM_VBLANK_FLAG_SIMULATED
> >>    drm/bochs: Set DRM_VBLANK_FLAG_SIMULATED
> >>    drm/cirrus: Set DRM_VBLANK_FLAG_SIMULATED
> >>    drm/hypervdrm: Set DRM_VBLANK_FLAG_SIMULATED
> >>    drm/qxl: Set DRM_VBLANK_FLAG_SIMULATED
> >>    drm/virtgpu: Set DRM_VBLANK_FLAG_SIMULATED
> >>    drm/vkms: Set DRM_VBLANK_FLAG_SIMULATED
> >>
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c    |  3 ++-
> >>   drivers/gpu/drm/drm_vblank.c                | 26 +++++++++++++++------
> >>   drivers/gpu/drm/drm_vblank_helper.c         |  2 +-
> >>   drivers/gpu/drm/hyperv/hyperv_drm_modeset.c |  2 +-
> >>   drivers/gpu/drm/qxl/qxl_display.c           |  2 +-
> >>   drivers/gpu/drm/tiny/bochs.c                |  2 +-
> >>   drivers/gpu/drm/tiny/cirrus-qemu.c          |  2 +-
> >>   drivers/gpu/drm/virtio/virtgpu_display.c    |  2 +-
> >>   drivers/gpu/drm/vkms/vkms_drv.c             |  4 ++--
> >>   include/drm/drm_crtc.h                      |  2 +-
> >>   include/drm/drm_device.h                    |  2 +-
> >>   include/drm/drm_vblank.h                    | 15 +++++++++++-
> >>   12 files changed, 45 insertions(+), 19 deletions(-)
> >>
> >>
> >> base-commit: 5fb5a9a63cf5ece68e0eeb6fa397da27712bccf0
> >> --
> >> 2.54.0
> >>
>
> --
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
> GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
>
>

* [PATCH v2 1/1] drm/hyperv: Replace "hyperv_" with "hv_drm_" as symbol name prefix
From: Michael Kelley @ 2026-05-28 13:51 UTC (permalink / raw)
  To: maarten.lankhorst, mripard, tzimmermann, airlied, simona, decui,
	longli, ssengar
  Cc: dri-devel, linux-kernel, linux-hyperv

From: Michael Kelley <mhklinux@outlook.com>

Function and structure names in the Hyper-V DRM driver currently
use "hyperv_" as the prefix. This conflicts with usage in core Hyper-V
and VMBus code, and incorrectly implies that functions and structures
in this driver apply generically to Hyper-V. A specific conflict arises
for "hyperv_init", which is an initcall for generic Hyper-V
initialization on arm64. The conflict prevents the use of
initcall_blacklist on the kernel boot line to skip loading this driver.

Fix this by substituting "hv_drm_" as the prefix for all functions and
structures in this driver. In most places, this is replacing "hyperv_"
with "hv_drm_". In a few places, the substitution results in
"hv_drm_drm_", which has been collapsed to just "hv_drm_". In one
place, the existing prefix is a bare "hv_", which has been replaced
with "hv_drm_" for consistency.

The changes are all mechanical text substitution in symbol names.
There are no other code or functional changes.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
This patch is built against linux-next20260526.

Changes in v2:
* Use "hv_drm_" as the new prefix instead of "hvdrm_". [Hamza Mahfooz]
* After the new prefix is applied, collapse occurrences of "hv_drm_drm_"
  to just "hv_drm_", such as with hv_drm_device. [Hamza Mahfooz]
* Don't change comments referring to source code filenames. [Dexuan Cui]
* Change hv_fops to hv_drm_fops for consistency.
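
For reference, the substitution rules described in the commit message
can be sketched as a small user-space C helper. This is only an
illustration: rename_symbol() is a hypothetical name, and the patch was
not necessarily produced this way. Names that match none of the rules,
such as the VMBus core's hv_device, are left unchanged.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: map an old symbol name to its new name.
 * Returns 0 on success, -1 if the result does not fit into out.
 */
static int rename_symbol(const char *old, char *out, size_t out_len)
{
	const char *prefix = "hv_drm_";
	const char *rest = old;
	int n;

	if (strcmp(old, "hv_fops") == 0) {
		/* The one bare "hv_" prefix in the driver. */
		rest = old + strlen("hv_");
	} else if (strncmp(old, "hyperv_drm_", strlen("hyperv_drm_")) == 0) {
		/* Would become "hv_drm_drm_"; collapse to "hv_drm_". */
		rest = old + strlen("hyperv_drm_");
	} else if (strncmp(old, "hyperv_", strlen("hyperv_")) == 0) {
		/* The common case. */
		rest = old + strlen("hyperv_");
	} else {
		/* Not a symbol of this driver: keep the name as-is. */
		prefix = "";
	}

	n = snprintf(out, out_len, "%s%s", prefix, rest);
	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With these rules, hyperv_init maps to hv_drm_init, hyperv_drm_device to
hv_drm_device, and hyperv_hv_driver to hv_drm_hv_driver.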

 drivers/gpu/drm/hyperv/hyperv_drm.h         |  16 +--
 drivers/gpu/drm/hyperv/hyperv_drm_drv.c     |  92 ++++++++--------
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c | 110 ++++++++++----------
 drivers/gpu/drm/hyperv/hyperv_drm_proto.c   |  70 ++++++-------
 4 files changed, 144 insertions(+), 144 deletions(-)

diff --git a/drivers/gpu/drm/hyperv/hyperv_drm.h b/drivers/gpu/drm/hyperv/hyperv_drm.h
index 9e776112c03e..fe0bf5d40e48 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm.h
+++ b/drivers/gpu/drm/hyperv/hyperv_drm.h
@@ -8,7 +8,7 @@
 
 #define VMBUS_MAX_PACKET_SIZE 0x4000
 
-struct hyperv_drm_device {
+struct hv_drm_device {
 	/* drm */
 	struct drm_device dev;
 	struct drm_plane plane;
@@ -39,17 +39,17 @@ struct hyperv_drm_device {
 	struct hv_device *hdev;
 };
 
-#define to_hv(_dev) container_of(_dev, struct hyperv_drm_device, dev)
+#define to_hv(_dev) container_of(_dev, struct hv_drm_device, dev)
 
 /* hyperv_drm_modeset */
-int hyperv_mode_config_init(struct hyperv_drm_device *hv);
+int hv_drm_mode_config_init(struct hv_drm_device *hv);
 
 /* hyperv_drm_proto */
-int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp);
-int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
+int hv_drm_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp);
+int hv_drm_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
 			    u32 w, u32 h, u32 pitch);
-int hyperv_hide_hw_ptr(struct hv_device *hdev);
-int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect);
-int hyperv_connect_vsp(struct hv_device *hdev);
+int hv_drm_hide_hw_ptr(struct hv_device *hdev);
+int hv_drm_update_dirt(struct hv_device *hdev, struct drm_rect *rect);
+int hv_drm_connect_vsp(struct hv_device *hdev);
 
 #endif
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
index b6bf6412ae34..b9661f946b7f 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
@@ -24,9 +24,9 @@
 #define DRIVER_MAJOR 1
 #define DRIVER_MINOR 0
 
-DEFINE_DRM_GEM_FOPS(hv_fops);
+DEFINE_DRM_GEM_FOPS(hv_drm_fops);
 
-static struct drm_driver hyperv_driver = {
+static struct drm_driver hv_drm_driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC,
 
 	.name		 = DRIVER_NAME,
@@ -34,22 +34,22 @@ static struct drm_driver hyperv_driver = {
 	.major		 = DRIVER_MAJOR,
 	.minor		 = DRIVER_MINOR,
 
-	.fops		 = &hv_fops,
+	.fops		 = &hv_drm_fops,
 	DRM_GEM_SHMEM_DRIVER_OPS,
 	DRM_FBDEV_SHMEM_DRIVER_OPS,
 };
 
-static int hyperv_pci_probe(struct pci_dev *pdev,
+static int hv_drm_pci_probe(struct pci_dev *pdev,
 			    const struct pci_device_id *ent)
 {
 	return 0;
 }
 
-static void hyperv_pci_remove(struct pci_dev *pdev)
+static void hv_drm_pci_remove(struct pci_dev *pdev)
 {
 }
 
-static const struct pci_device_id hyperv_pci_tbl[] = {
+static const struct pci_device_id hv_drm_pci_tbl[] = {
 	{
 		.vendor = PCI_VENDOR_ID_MICROSOFT,
 		.device = PCI_DEVICE_ID_HYPERV_VIDEO,
@@ -60,14 +60,14 @@ static const struct pci_device_id hyperv_pci_tbl[] = {
 /*
  * PCI stub to support gen1 VM.
  */
-static struct pci_driver hyperv_pci_driver = {
+static struct pci_driver hv_drm_pci_driver = {
 	.name =		KBUILD_MODNAME,
-	.id_table =	hyperv_pci_tbl,
-	.probe =	hyperv_pci_probe,
-	.remove =	hyperv_pci_remove,
+	.id_table =	hv_drm_pci_tbl,
+	.probe =	hv_drm_pci_probe,
+	.remove =	hv_drm_pci_remove,
 };
 
-static int hyperv_setup_vram(struct hyperv_drm_device *hv,
+static int hv_drm_setup_vram(struct hv_drm_device *hv,
 			     struct hv_device *hdev)
 {
 	struct drm_device *dev = &hv->dev;
@@ -102,15 +102,15 @@ static int hyperv_setup_vram(struct hyperv_drm_device *hv,
 	return ret;
 }
 
-static int hyperv_vmbus_probe(struct hv_device *hdev,
+static int hv_drm_vmbus_probe(struct hv_device *hdev,
 			      const struct hv_vmbus_device_id *dev_id)
 {
-	struct hyperv_drm_device *hv;
+	struct hv_drm_device *hv;
 	struct drm_device *dev;
 	int ret;
 
-	hv = devm_drm_dev_alloc(&hdev->device, &hyperv_driver,
-				struct hyperv_drm_device, dev);
+	hv = devm_drm_dev_alloc(&hdev->device, &hv_drm_driver,
+				struct hv_drm_device, dev);
 	if (IS_ERR(hv))
 		return PTR_ERR(hv);
 
@@ -119,15 +119,15 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
 	hv_set_drvdata(hdev, hv);
 	hv->hdev = hdev;
 
-	ret = hyperv_connect_vsp(hdev);
+	ret = hv_drm_connect_vsp(hdev);
 	if (ret) {
 		drm_err(dev, "Failed to connect to vmbus.\n");
 		goto err_hv_set_drv_data;
 	}
 
-	aperture_remove_all_conflicting_devices(hyperv_driver.name);
+	aperture_remove_all_conflicting_devices(hv_drm_driver.name);
 
-	ret = hyperv_setup_vram(hv, hdev);
+	ret = hv_drm_setup_vram(hv, hdev);
 	if (ret)
 		goto err_vmbus_close;
 
@@ -136,11 +136,11 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
 	 * vram location is not fatal. Device will update dirty area till
 	 * preferred resolution only.
 	 */
-	ret = hyperv_update_vram_location(hdev, hv->fb_base);
+	ret = hv_drm_update_vram_location(hdev, hv->fb_base);
 	if (ret)
 		drm_warn(dev, "Failed to update vram location.\n");
 
-	ret = hyperv_mode_config_init(hv);
+	ret = hv_drm_mode_config_init(hv);
 	if (ret)
 		goto err_free_mmio;
 
@@ -168,10 +168,10 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
 	return ret;
 }
 
-static void hyperv_vmbus_remove(struct hv_device *hdev)
+static void hv_drm_vmbus_remove(struct hv_device *hdev)
 {
 	struct drm_device *dev = hv_get_drvdata(hdev);
-	struct hyperv_drm_device *hv = to_hv(dev);
+	struct hv_drm_device *hv = to_hv(dev);
 
 	vmbus_set_skip_unload(false);
 	drm_dev_unplug(dev);
@@ -183,12 +183,12 @@ static void hyperv_vmbus_remove(struct hv_device *hdev)
 	vmbus_free_mmio(hv->mem->start, hv->fb_size);
 }
 
-static void hyperv_vmbus_shutdown(struct hv_device *hdev)
+static void hv_drm_vmbus_shutdown(struct hv_device *hdev)
 {
 	drm_atomic_helper_shutdown(hv_get_drvdata(hdev));
 }
 
-static int hyperv_vmbus_suspend(struct hv_device *hdev)
+static int hv_drm_vmbus_suspend(struct hv_device *hdev)
 {
 	struct drm_device *dev = hv_get_drvdata(hdev);
 	int ret;
@@ -202,67 +202,67 @@ static int hyperv_vmbus_suspend(struct hv_device *hdev)
 	return 0;
 }
 
-static int hyperv_vmbus_resume(struct hv_device *hdev)
+static int hv_drm_vmbus_resume(struct hv_device *hdev)
 {
 	struct drm_device *dev = hv_get_drvdata(hdev);
-	struct hyperv_drm_device *hv = to_hv(dev);
+	struct hv_drm_device *hv = to_hv(dev);
 	int ret;
 
-	ret = hyperv_connect_vsp(hdev);
+	ret = hv_drm_connect_vsp(hdev);
 	if (ret)
 		return ret;
 
-	ret = hyperv_update_vram_location(hdev, hv->fb_base);
+	ret = hv_drm_update_vram_location(hdev, hv->fb_base);
 	if (ret)
 		return ret;
 
 	return drm_mode_config_helper_resume(dev);
 }
 
-static const struct hv_vmbus_device_id hyperv_vmbus_tbl[] = {
+static const struct hv_vmbus_device_id hv_drm_vmbus_tbl[] = {
 	/* Synthetic Video Device GUID */
 	{HV_SYNTHVID_GUID},
 	{}
 };
 
-static struct hv_driver hyperv_hv_driver = {
+static struct hv_driver hv_drm_hv_driver = {
 	.name = KBUILD_MODNAME,
-	.id_table = hyperv_vmbus_tbl,
-	.probe = hyperv_vmbus_probe,
-	.remove = hyperv_vmbus_remove,
-	.shutdown = hyperv_vmbus_shutdown,
-	.suspend = hyperv_vmbus_suspend,
-	.resume = hyperv_vmbus_resume,
+	.id_table = hv_drm_vmbus_tbl,
+	.probe = hv_drm_vmbus_probe,
+	.remove = hv_drm_vmbus_remove,
+	.shutdown = hv_drm_vmbus_shutdown,
+	.suspend = hv_drm_vmbus_suspend,
+	.resume = hv_drm_vmbus_resume,
 	.driver = {
 		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
 	},
 };
 
-static int __init hyperv_init(void)
+static int __init hv_drm_init(void)
 {
 	int ret;
 
 	if (drm_firmware_drivers_only())
 		return -ENODEV;
 
-	ret = pci_register_driver(&hyperv_pci_driver);
+	ret = pci_register_driver(&hv_drm_pci_driver);
 	if (ret != 0)
 		return ret;
 
-	return vmbus_driver_register(&hyperv_hv_driver);
+	return vmbus_driver_register(&hv_drm_hv_driver);
 }
 
-static void __exit hyperv_exit(void)
+static void __exit hv_drm_exit(void)
 {
-	vmbus_driver_unregister(&hyperv_hv_driver);
-	pci_unregister_driver(&hyperv_pci_driver);
+	vmbus_driver_unregister(&hv_drm_hv_driver);
+	pci_unregister_driver(&hv_drm_pci_driver);
 }
 
-module_init(hyperv_init);
-module_exit(hyperv_exit);
+module_init(hv_drm_init);
+module_exit(hv_drm_exit);
 
-MODULE_DEVICE_TABLE(pci, hyperv_pci_tbl);
-MODULE_DEVICE_TABLE(vmbus, hyperv_vmbus_tbl);
+MODULE_DEVICE_TABLE(pci, hv_drm_pci_tbl);
+MODULE_DEVICE_TABLE(vmbus, hv_drm_vmbus_tbl);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Deepak Rawat <drawat.floss@gmail.com>");
 MODULE_DESCRIPTION("DRM driver for Hyper-V synthetic video device");
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
index 793dbbf61893..3f0ab5da0cd5 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
@@ -25,11 +25,11 @@
 
 #include "hyperv_drm.h"
 
-static int hyperv_blit_to_vram_rect(struct drm_framebuffer *fb,
+static int hv_drm_blit_to_vram_rect(struct drm_framebuffer *fb,
 				    const struct iosys_map *vmap,
 				    struct drm_rect *rect)
 {
-	struct hyperv_drm_device *hv = to_hv(fb->dev);
+	struct hv_drm_device *hv = to_hv(fb->dev);
 	struct iosys_map dst = IOSYS_MAP_INIT_VADDR_IOMEM(hv->vram);
 	int idx;
 
@@ -44,9 +44,9 @@ static int hyperv_blit_to_vram_rect(struct drm_framebuffer *fb,
 	return 0;
 }
 
-static int hyperv_connector_get_modes(struct drm_connector *connector)
+static int hv_drm_connector_get_modes(struct drm_connector *connector)
 {
-	struct hyperv_drm_device *hv = to_hv(connector->dev);
+	struct hv_drm_device *hv = to_hv(connector->dev);
 	int count;
 
 	count = drm_add_modes_noedid(connector,
@@ -58,11 +58,11 @@ static int hyperv_connector_get_modes(struct drm_connector *connector)
 	return count;
 }
 
-static const struct drm_connector_helper_funcs hyperv_connector_helper_funcs = {
-	.get_modes = hyperv_connector_get_modes,
+static const struct drm_connector_helper_funcs hv_drm_connector_helper_funcs = {
+	.get_modes = hv_drm_connector_get_modes,
 };
 
-static const struct drm_connector_funcs hyperv_connector_funcs = {
+static const struct drm_connector_funcs hv_drm_connector_funcs = {
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
@@ -70,15 +70,15 @@ static const struct drm_connector_funcs hyperv_connector_funcs = {
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
-static inline int hyperv_conn_init(struct hyperv_drm_device *hv)
+static inline int hv_drm_conn_init(struct hv_drm_device *hv)
 {
-	drm_connector_helper_add(&hv->connector, &hyperv_connector_helper_funcs);
+	drm_connector_helper_add(&hv->connector, &hv_drm_connector_helper_funcs);
 	return drm_connector_init(&hv->dev, &hv->connector,
-				  &hyperv_connector_funcs,
+				  &hv_drm_connector_funcs,
 				  DRM_MODE_CONNECTOR_VIRTUAL);
 }
 
-static int hyperv_check_size(struct hyperv_drm_device *hv, int w, int h,
+static int hv_drm_check_size(struct hv_drm_device *hv, int w, int h,
 			     struct drm_framebuffer *fb)
 {
 	u32 pitch = w * (hv->screen_depth / 8);
@@ -92,25 +92,25 @@ static int hyperv_check_size(struct hyperv_drm_device *hv, int w, int h,
 	return 0;
 }
 
-static const uint32_t hyperv_formats[] = {
+static const uint32_t hv_drm_formats[] = {
 	DRM_FORMAT_XRGB8888,
 };
 
-static const uint64_t hyperv_modifiers[] = {
+static const uint64_t hv_drm_modifiers[] = {
 	DRM_FORMAT_MOD_LINEAR,
 	DRM_FORMAT_MOD_INVALID
 };
 
-static void hyperv_crtc_helper_atomic_enable(struct drm_crtc *crtc,
+static void hv_drm_crtc_helper_atomic_enable(struct drm_crtc *crtc,
 					     struct drm_atomic_commit *state)
 {
-	struct hyperv_drm_device *hv = to_hv(crtc->dev);
+	struct hv_drm_device *hv = to_hv(crtc->dev);
 	struct drm_plane *plane = &hv->plane;
 	struct drm_plane_state *plane_state = plane->state;
 	struct drm_crtc_state *crtc_state = crtc->state;
 
-	hyperv_hide_hw_ptr(hv->hdev);
-	hyperv_update_situation(hv->hdev, 1,  hv->screen_depth,
+	hv_drm_hide_hw_ptr(hv->hdev);
+	hv_drm_update_situation(hv->hdev, 1,  hv->screen_depth,
 				crtc_state->mode.hdisplay,
 				crtc_state->mode.vdisplay,
 				plane_state->fb->pitches[0]);
@@ -118,14 +118,14 @@ static void hyperv_crtc_helper_atomic_enable(struct drm_crtc *crtc,
 	drm_crtc_vblank_on(crtc);
 }
 
-static const struct drm_crtc_helper_funcs hyperv_crtc_helper_funcs = {
+static const struct drm_crtc_helper_funcs hv_drm_crtc_helper_funcs = {
 	.atomic_check = drm_crtc_helper_atomic_check,
 	.atomic_flush = drm_crtc_vblank_atomic_flush,
-	.atomic_enable = hyperv_crtc_helper_atomic_enable,
+	.atomic_enable = hv_drm_crtc_helper_atomic_enable,
 	.atomic_disable = drm_crtc_vblank_atomic_disable,
 };
 
-static const struct drm_crtc_funcs hyperv_crtc_funcs = {
+static const struct drm_crtc_funcs hv_drm_crtc_funcs = {
 	.reset = drm_atomic_helper_crtc_reset,
 	.destroy = drm_crtc_cleanup,
 	.set_config = drm_atomic_helper_set_config,
@@ -135,11 +135,11 @@ static const struct drm_crtc_funcs hyperv_crtc_funcs = {
 	DRM_CRTC_VBLANK_TIMER_FUNCS,
 };
 
-static int hyperv_plane_atomic_check(struct drm_plane *plane,
+static int hv_drm_plane_atomic_check(struct drm_plane *plane,
 				     struct drm_atomic_commit *state)
 {
 	struct drm_plane_state *plane_state = drm_atomic_get_new_plane_state(state, plane);
-	struct hyperv_drm_device *hv = to_hv(plane->dev);
+	struct hv_drm_device *hv = to_hv(plane->dev);
 	struct drm_framebuffer *fb = plane_state->fb;
 	struct drm_crtc *crtc = plane_state->crtc;
 	struct drm_crtc_state *crtc_state = NULL;
@@ -167,10 +167,10 @@ static int hyperv_plane_atomic_check(struct drm_plane *plane,
 	return 0;
 }
 
-static void hyperv_plane_atomic_update(struct drm_plane *plane,
+static void hv_drm_plane_atomic_update(struct drm_plane *plane,
 				       struct drm_atomic_commit *state)
 {
-	struct hyperv_drm_device *hv = to_hv(plane->dev);
+	struct hv_drm_device *hv = to_hv(plane->dev);
 	struct drm_plane_state *old_state = drm_atomic_get_old_plane_state(state, plane);
 	struct drm_plane_state *new_state = drm_atomic_get_new_plane_state(state, plane);
 	struct drm_shadow_plane_state *shadow_plane_state = to_drm_shadow_plane_state(new_state);
@@ -185,15 +185,15 @@ static void hyperv_plane_atomic_update(struct drm_plane *plane,
 		if (!drm_rect_intersect(&dst_clip, &damage))
 			continue;
 
-		hyperv_blit_to_vram_rect(new_state->fb, &shadow_plane_state->data[0], &damage);
-		hyperv_update_dirt(hv->hdev, &damage);
+		hv_drm_blit_to_vram_rect(new_state->fb, &shadow_plane_state->data[0], &damage);
+		hv_drm_update_dirt(hv->hdev, &damage);
 	}
 }
 
-static int hyperv_plane_get_scanout_buffer(struct drm_plane *plane,
+static int hv_drm_plane_get_scanout_buffer(struct drm_plane *plane,
 					   struct drm_scanout_buffer *sb)
 {
-	struct hyperv_drm_device *hv = to_hv(plane->dev);
+	struct hv_drm_device *hv = to_hv(plane->dev);
 	struct iosys_map map = IOSYS_MAP_INIT_VADDR_IOMEM(hv->vram);
 
 	if (plane->state && plane->state->fb) {
@@ -207,9 +207,9 @@ static int hyperv_plane_get_scanout_buffer(struct drm_plane *plane,
 	return -ENODEV;
 }
 
-static void hyperv_plane_panic_flush(struct drm_plane *plane)
+static void hv_drm_plane_panic_flush(struct drm_plane *plane)
 {
-	struct hyperv_drm_device *hv = to_hv(plane->dev);
+	struct hv_drm_device *hv = to_hv(plane->dev);
 	struct drm_rect rect;
 
 	if (plane->state && plane->state->fb) {
@@ -218,32 +218,32 @@ static void hyperv_plane_panic_flush(struct drm_plane *plane)
 		rect.x2 = plane->state->fb->width;
 		rect.y2 = plane->state->fb->height;
 
-		hyperv_update_dirt(hv->hdev, &rect);
+		hv_drm_update_dirt(hv->hdev, &rect);
 	}
 
 	vmbus_initiate_unload(true);
 }
 
-static const struct drm_plane_helper_funcs hyperv_plane_helper_funcs = {
+static const struct drm_plane_helper_funcs hv_drm_plane_helper_funcs = {
 	DRM_GEM_SHADOW_PLANE_HELPER_FUNCS,
-	.atomic_check = hyperv_plane_atomic_check,
-	.atomic_update = hyperv_plane_atomic_update,
-	.get_scanout_buffer = hyperv_plane_get_scanout_buffer,
-	.panic_flush = hyperv_plane_panic_flush,
+	.atomic_check = hv_drm_plane_atomic_check,
+	.atomic_update = hv_drm_plane_atomic_update,
+	.get_scanout_buffer = hv_drm_plane_get_scanout_buffer,
+	.panic_flush = hv_drm_plane_panic_flush,
 };
 
-static const struct drm_plane_funcs hyperv_plane_funcs = {
+static const struct drm_plane_funcs hv_drm_plane_funcs = {
 	.update_plane		= drm_atomic_helper_update_plane,
 	.disable_plane		= drm_atomic_helper_disable_plane,
 	.destroy		= drm_plane_cleanup,
 	DRM_GEM_SHADOW_PLANE_FUNCS,
 };
 
-static const struct drm_encoder_funcs hyperv_drm_simple_encoder_funcs_cleanup = {
+static const struct drm_encoder_funcs hv_drm_simple_encoder_funcs_cleanup = {
 	.destroy = drm_encoder_cleanup,
 };
 
-static inline int hyperv_pipe_init(struct hyperv_drm_device *hv)
+static inline int hv_drm_pipe_init(struct hv_drm_device *hv)
 {
 	struct drm_device *dev = &hv->dev;
 	struct drm_encoder *encoder = &hv->encoder;
@@ -253,29 +253,29 @@ static inline int hyperv_pipe_init(struct hyperv_drm_device *hv)
 	int ret;
 
 	ret = drm_universal_plane_init(dev, plane, 0,
-				       &hyperv_plane_funcs,
-				       hyperv_formats, ARRAY_SIZE(hyperv_formats),
-				       hyperv_modifiers,
+				       &hv_drm_plane_funcs,
+				       hv_drm_formats, ARRAY_SIZE(hv_drm_formats),
+				       hv_drm_modifiers,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret)
 		return ret;
-	drm_plane_helper_add(plane, &hyperv_plane_helper_funcs);
+	drm_plane_helper_add(plane, &hv_drm_plane_helper_funcs);
 	drm_plane_enable_fb_damage_clips(plane);
 
 	ret = drm_crtc_init_with_planes(dev, crtc, plane, NULL,
-					&hyperv_crtc_funcs, NULL);
+					&hv_drm_crtc_funcs, NULL);
 	if (ret)
 		return ret;
-	drm_crtc_helper_add(crtc, &hyperv_crtc_helper_funcs);
+	drm_crtc_helper_add(crtc, &hv_drm_crtc_helper_funcs);
 
 	encoder->possible_crtcs = drm_crtc_mask(crtc);
 	ret = drm_encoder_init(dev, encoder,
-			       &hyperv_drm_simple_encoder_funcs_cleanup,
+			       &hv_drm_simple_encoder_funcs_cleanup,
 			       DRM_MODE_ENCODER_NONE, NULL);
 	if (ret)
 		return ret;
 
-	ret = hyperv_conn_init(hv);
+	ret = hv_drm_conn_init(hv);
 	if (ret) {
 		drm_err(dev, "Failed to initialized connector.\n");
 		return ret;
@@ -285,25 +285,25 @@ static inline int hyperv_pipe_init(struct hyperv_drm_device *hv)
 }
 
 static enum drm_mode_status
-hyperv_mode_valid(struct drm_device *dev,
+hv_drm_mode_valid(struct drm_device *dev,
 		  const struct drm_display_mode *mode)
 {
-	struct hyperv_drm_device *hv = to_hv(dev);
+	struct hv_drm_device *hv = to_hv(dev);
 
-	if (hyperv_check_size(hv, mode->hdisplay, mode->vdisplay, NULL))
+	if (hv_drm_check_size(hv, mode->hdisplay, mode->vdisplay, NULL))
 		return MODE_BAD;
 
 	return MODE_OK;
 }
 
-static const struct drm_mode_config_funcs hyperv_mode_config_funcs = {
+static const struct drm_mode_config_funcs hv_drm_mode_config_funcs = {
 	.fb_create = drm_gem_fb_create_with_dirty,
-	.mode_valid = hyperv_mode_valid,
+	.mode_valid = hv_drm_mode_valid,
 	.atomic_check = drm_atomic_helper_check,
 	.atomic_commit = drm_atomic_helper_commit,
 };
 
-int hyperv_mode_config_init(struct hyperv_drm_device *hv)
+int hv_drm_mode_config_init(struct hv_drm_device *hv)
 {
 	struct drm_device *dev = &hv->dev;
 	int ret;
@@ -322,9 +322,9 @@ int hyperv_mode_config_init(struct hyperv_drm_device *hv)
 	dev->mode_config.preferred_depth = hv->screen_depth;
 	dev->mode_config.prefer_shadow = 0;
 
-	dev->mode_config.funcs = &hyperv_mode_config_funcs;
+	dev->mode_config.funcs = &hv_drm_mode_config_funcs;
 
-	ret = hyperv_pipe_init(hv);
+	ret = hv_drm_pipe_init(hv);
 	if (ret) {
 		drm_err(dev, "Failed to initialized pipe.\n");
 		return ret;
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
index 6e09b0218df4..f0ef627b4898 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
@@ -181,7 +181,7 @@ struct synthvid_msg {
 	};
 } __packed;
 
-static inline bool hyperv_version_ge(u32 ver1, u32 ver2)
+static inline bool hv_drm_version_ge(u32 ver1, u32 ver2)
 {
 	if (SYNTHVID_VER_GET_MAJOR(ver1) > SYNTHVID_VER_GET_MAJOR(ver2) ||
 	    (SYNTHVID_VER_GET_MAJOR(ver1) == SYNTHVID_VER_GET_MAJOR(ver2) &&
@@ -191,10 +191,10 @@ static inline bool hyperv_version_ge(u32 ver1, u32 ver2)
 	return false;
 }
 
-static inline int hyperv_sendpacket(struct hv_device *hdev, struct synthvid_msg *msg)
+static inline int hv_drm_sendpacket(struct hv_device *hdev, struct synthvid_msg *msg)
 {
 	static atomic64_t request_id = ATOMIC64_INIT(0);
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	int ret;
 
 	msg->pipe_hdr.type = PIPE_MSG_DATA;
@@ -211,9 +211,9 @@ static inline int hyperv_sendpacket(struct hv_device *hdev, struct synthvid_msg
 	return ret;
 }
 
-static int hyperv_negotiate_version(struct hv_device *hdev, u32 ver)
+static int hv_drm_negotiate_version(struct hv_device *hdev, u32 ver)
 {
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct synthvid_msg *msg = (struct synthvid_msg *)hv->init_buf;
 	struct drm_device *dev = &hv->dev;
 	unsigned long t;
@@ -223,7 +223,7 @@ static int hyperv_negotiate_version(struct hv_device *hdev, u32 ver)
 	msg->vid_hdr.size = sizeof(struct synthvid_msg_hdr) +
 		sizeof(struct synthvid_version_req);
 	msg->ver_req.version = ver;
-	hyperv_sendpacket(hdev, msg);
+	hv_drm_sendpacket(hdev, msg);
 
 	t = wait_for_completion_timeout(&hv->wait, VMBUS_VSP_TIMEOUT);
 	if (!t) {
@@ -243,9 +243,9 @@ static int hyperv_negotiate_version(struct hv_device *hdev, u32 ver)
 	return 0;
 }
 
-int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
+int hv_drm_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
 {
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct synthvid_msg *msg = (struct synthvid_msg *)hv->init_buf;
 	struct drm_device *dev = &hv->dev;
 	unsigned long t;
@@ -257,7 +257,7 @@ int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
 	msg->vram.user_ctx = vram_pp;
 	msg->vram.vram_gpa = vram_pp;
 	msg->vram.is_vram_gpa_specified = 1;
-	hyperv_sendpacket(hdev, msg);
+	hv_drm_sendpacket(hdev, msg);
 
 	t = wait_for_completion_timeout(&hv->wait, VMBUS_VSP_TIMEOUT);
 	if (!t) {
@@ -272,7 +272,7 @@ int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp)
 	return 0;
 }
 
-int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
+int hv_drm_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
 			    u32 w, u32 h, u32 pitch)
 {
 	struct synthvid_msg msg;
@@ -292,7 +292,7 @@ int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
 	msg.situ.video_output[0].height_pixels = h;
 	msg.situ.video_output[0].pitch_bytes = pitch;
 
-	hyperv_sendpacket(hdev, &msg);
+	hv_drm_sendpacket(hdev, &msg);
 
 	return 0;
 }
@@ -306,11 +306,11 @@ int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
  * the msg.ptr_shape.data. Note: setting msg.ptr_pos.is_visible to 0 doesn't
  * work in tests.
  *
- * The hyperv_hide_hw_ptr() is also called in the handler of the
+ * The hv_drm_hide_hw_ptr() is also called in the handler of the
  * SYNTHVID_FEATURE_CHANGE event, otherwise the host still draws an extra
  * unwanted mouse pointer after the VM Connection window is closed and reopened.
  */
-int hyperv_hide_hw_ptr(struct hv_device *hdev)
+int hv_drm_hide_hw_ptr(struct hv_device *hdev)
 {
 	struct synthvid_msg msg;
 
@@ -322,7 +322,7 @@ int hyperv_hide_hw_ptr(struct hv_device *hdev)
 	msg.ptr_pos.video_output = 0;
 	msg.ptr_pos.image_x = 0;
 	msg.ptr_pos.image_y = 0;
-	hyperv_sendpacket(hdev, &msg);
+	hv_drm_sendpacket(hdev, &msg);
 
 	memset(&msg, 0, sizeof(struct synthvid_msg));
 	msg.vid_hdr.type = SYNTHVID_POINTER_SHAPE;
@@ -338,14 +338,14 @@ int hyperv_hide_hw_ptr(struct hv_device *hdev)
 	msg.ptr_shape.data[1] = 1;
 	msg.ptr_shape.data[2] = 1;
 	msg.ptr_shape.data[3] = 1;
-	hyperv_sendpacket(hdev, &msg);
+	hv_drm_sendpacket(hdev, &msg);
 
 	return 0;
 }
 
-int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
+int hv_drm_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
 {
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct synthvid_msg msg;
 
 	if (!hv->dirt_needed)
@@ -363,14 +363,14 @@ int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
 	msg.dirt.rect[0].x2 = rect->x2;
 	msg.dirt.rect[0].y2 = rect->y2;
 
-	hyperv_sendpacket(hdev, &msg);
+	hv_drm_sendpacket(hdev, &msg);
 
 	return 0;
 }
 
-static int hyperv_get_supported_resolution(struct hv_device *hdev)
+static int hv_drm_get_supported_resolution(struct hv_device *hdev)
 {
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct synthvid_msg *msg = (struct synthvid_msg *)hv->init_buf;
 	struct drm_device *dev = &hv->dev;
 	unsigned long t;
@@ -383,7 +383,7 @@ static int hyperv_get_supported_resolution(struct hv_device *hdev)
 		sizeof(struct synthvid_supported_resolution_req);
 	msg->resolution_req.maximum_resolution_count =
 		SYNTHVID_MAX_RESOLUTION_COUNT;
-	hyperv_sendpacket(hdev, msg);
+	hv_drm_sendpacket(hdev, msg);
 
 	t = wait_for_completion_timeout(&hv->wait, VMBUS_VSP_TIMEOUT);
 	if (!t) {
@@ -420,9 +420,9 @@ static int hyperv_get_supported_resolution(struct hv_device *hdev)
 	return 0;
 }
 
-static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
+static void hv_drm_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
 {
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct synthvid_msg *msg;
 	size_t hdr_size;
 	size_t need;
@@ -486,7 +486,7 @@ static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
 		}
 		hv->dirt_needed = msg->feature_chg.is_dirt_needed;
 		if (hv->dirt_needed)
-			hyperv_hide_hw_ptr(hv->hdev);
+			hv_drm_hide_hw_ptr(hv->hdev);
 		return;
 	default:
 		return;
@@ -508,10 +508,10 @@ static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
 	complete(&hv->wait);
 }
 
-static void hyperv_receive(void *ctx)
+static void hv_drm_receive(void *ctx)
 {
 	struct hv_device *hdev = ctx;
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct synthvid_msg *recv_buf;
 	u32 bytes_recvd;
 	u64 req_id;
@@ -539,19 +539,19 @@ static void hyperv_receive(void *ctx)
 					    ret, bytes_recvd);
 		} else if (bytes_recvd > 0 &&
 			   recv_buf->pipe_hdr.type == PIPE_MSG_DATA) {
-			hyperv_receive_sub(hdev, bytes_recvd);
+			hv_drm_receive_sub(hdev, bytes_recvd);
 		}
 	} while (bytes_recvd > 0 && ret == 0);
 }
 
-int hyperv_connect_vsp(struct hv_device *hdev)
+int hv_drm_connect_vsp(struct hv_device *hdev)
 {
-	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
+	struct hv_drm_device *hv = hv_get_drvdata(hdev);
 	struct drm_device *dev = &hv->dev;
 	int ret;
 
 	ret = vmbus_open(hdev->channel, VMBUS_RING_BUFSIZE, VMBUS_RING_BUFSIZE,
-			 NULL, 0, hyperv_receive, hdev);
+			 NULL, 0, hv_drm_receive, hdev);
 	if (ret) {
 		drm_err(dev, "Unable to open vmbus channel\n");
 		return ret;
@@ -561,16 +561,16 @@ int hyperv_connect_vsp(struct hv_device *hdev)
 	switch (vmbus_proto_version) {
 	case VERSION_WIN10:
 	case VERSION_WIN10_V5:
-		ret = hyperv_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
+		ret = hv_drm_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
 		if (!ret)
 			break;
 		fallthrough;
 	case VERSION_WIN8:
 	case VERSION_WIN8_1:
-		ret = hyperv_negotiate_version(hdev, SYNTHVID_VERSION_WIN8);
+		ret = hv_drm_negotiate_version(hdev, SYNTHVID_VERSION_WIN8);
 		break;
 	default:
-		ret = hyperv_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
+		ret = hv_drm_negotiate_version(hdev, SYNTHVID_VERSION_WIN10);
 		break;
 	}
 
@@ -581,8 +581,8 @@ int hyperv_connect_vsp(struct hv_device *hdev)
 
 	hv->screen_depth = SYNTHVID_DEPTH_WIN8;
 
-	if (hyperv_version_ge(hv->synthvid_version, SYNTHVID_VERSION_WIN10)) {
-		ret = hyperv_get_supported_resolution(hdev);
+	if (hv_drm_version_ge(hv->synthvid_version, SYNTHVID_VERSION_WIN10)) {
+		ret = hv_drm_get_supported_resolution(hdev);
 		if (ret)
 			drm_err(dev, "Failed to get supported resolution from host, use default\n");
 	}
-- 
2.25.1


^ permalink raw reply related

* RE: [PATCH 1/1] drm/hyperv: Replace "hyperv_" with "hvdrm_" as symbol name prefix
From: Michael Kelley @ 2026-05-28 13:54 UTC (permalink / raw)
  To: Dexuan Cui, Michael Kelley, Hamza Mahfooz
  Cc: maarten.lankhorst@linux.intel.com, mripard@kernel.org,
	tzimmermann@suse.de, airlied@gmail.com, simona@ffwll.ch, Long Li,
	ssengar@linux.microsoft.com, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org
In-Reply-To: <SA1PR21MB6921A2C91F67C10F272E9430BF082@SA1PR21MB6921.namprd21.prod.outlook.com>

From: Dexuan Cui <DECUI@microsoft.com> Sent: Wednesday, May 27, 2026 12:55 PM
> 
> > From: Michael Kelley <mhklinux@outlook.com>
> > Sent: Wednesday, May 27, 2026 8:05 AM
> > > >
> > > > Function and structure names in the Hyper-V DRM driver currently
> > > > use "hyperv_" as the prefix. This conflicts with usage in core Hyper-V
> > > > and VMBus code, and incorrectly implies that functions and structures
> > > > in this driver apply generically to Hyper-V. A specific conflict arises
> > > > for "hyperv_init", which is an initcall for generic Hyper-V
> > > > initialization on arm64. The conflict prevents the use of
> > > > initcall_blacklist on the kernel boot line to skip loading this driver.
> 
> I also hit the issue. Thanks for the fix!
> 
> > > > Fix this by substituting "hvdrm_" as the prefix for all functions and
> > >
> > > I would personally prefer "hv_drm_", since it seems clearer.
> >
> > My choice of "hvdrm" mimics the old Hyper-V FBdev driver, which
> > uses "hvfb" as the prefix. However, looking through everything that
> > starts with "hv" in /proc/kallsyms, I also see prefixes with the additional
> > underscore.  "hv_kbd_" in the Hyper-V keyboard driver is an example.
> > The Hyper-V utils drivers have both forms -- I see "hv_vss_", "hv_ptp_",
> > and "hv_kvp_", but also "hvt" (for Hyper-V Transport). So the historical
> > practice is inconsistent.
> >
> > I'm OK going either way.  Does anyone else want to express a
> > preference?
> 
> I also prefer "hv_drm_".
> 
> > > > -struct hyperv_drm_device {
> > > > +struct hvdrm_drm_device {
> > >
> > > "hvdrm_drm_device" looks kinda redundant, perhaps
> > > s/hyperv_drm_device/hv_drm_device would be more sensible.
> 
> s/hyperv_drm_device/hv_drm_dev/ seems better to me.
> 
> 
> > Yes, I'll make this change. And in looking through kallsyms, I
> > see that the Hyper-V DRM driver has "hv_fops", which did not
> > get changed in the mechanical substitution because it doesn't
> > start with "hyperv_".  I'll change it to hv_drm_fops.
> >
> > Michael
> 
> Some comments need to be updated accordingly, e.g.
> /* hvdrm_drm_modeset */
> /* hvdrm_drm_proto */
> 
> This needs to be updated as well:
> +static const struct drm_encoder_funcs hvdrm_drm_simple_encoder_funcs_cleanup
> 

Dexuan and Hamza -- thanks for your feedback! I have incorporated
all of it into the "v2" that I just posted.

Michael

^ permalink raw reply