From: Mukesh R <mrathor@linux.microsoft.com>
To: Michael Kelley <mhklinux@outlook.com>,
	Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Cc: "kys@microsoft.com" <kys@microsoft.com>,
	"haiyangz@microsoft.com" <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"decui@microsoft.com" <decui@microsoft.com>,
	"longli@microsoft.com" <longli@microsoft.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mshv: Make MSHV mutually exclusive with KEXEC
Date: Thu, 29 Jan 2026 18:52:59 -0800
Message-ID: <6e480ee7-683a-e5f1-7448-51f257d58614@linux.microsoft.com>
In-Reply-To: <SN6PR02MB4157EDC69791EF24D5DA8661D491A@SN6PR02MB4157.namprd02.prod.outlook.com>

On 1/28/26 07:53, Michael Kelley wrote:
> From: Mukesh R <mrathor@linux.microsoft.com> Sent: Tuesday, January 27, 2026 11:56 AM
>> To: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>> Cc: kys@microsoft.com; haiyangz@microsoft.com; wei.liu@kernel.org;
>> decui@microsoft.com; longli@microsoft.com; linux-hyperv@vger.kernel.org;
>> linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH] mshv: Make MSHV mutually exclusive with KEXEC
>>
>> On 1/27/26 09:47, Stanislav Kinsburskii wrote:
>>> On Mon, Jan 26, 2026 at 05:39:49PM -0800, Mukesh R wrote:
>>>> On 1/26/26 16:21, Stanislav Kinsburskii wrote:
>>>>> On Mon, Jan 26, 2026 at 03:07:18PM -0800, Mukesh R wrote:
>>>>>> On 1/26/26 12:43, Stanislav Kinsburskii wrote:
>>>>>>> On Mon, Jan 26, 2026 at 12:20:09PM -0800, Mukesh R wrote:
>>>>>>>> On 1/25/26 14:39, Stanislav Kinsburskii wrote:
>>>>>>>>> On Fri, Jan 23, 2026 at 04:16:33PM -0800, Mukesh R wrote:
>>>>>>>>>> On 1/23/26 14:20, Stanislav Kinsburskii wrote:
>>>>>>>>>>> The MSHV driver deposits kernel-allocated pages to the hypervisor during
>>>>>>>>>>> runtime and never withdraws them. This creates a fundamental incompatibility
>>>>>>>>>>> with KEXEC, as these deposited pages remain unavailable to the new kernel
>>>>>>>>>>> loaded via KEXEC, leading to potential system crashes upon kernel accessing
>>>>>>>>>>> hypervisor deposited pages.
>>>>>>>>>>>
>>>>>>>>>>> Make MSHV mutually exclusive with KEXEC until proper page lifecycle
>>>>>>>>>>> management is implemented.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>>>>>>>>>>> ---
>>>>>>>>>>>        drivers/hv/Kconfig |    1 +
>>>>>>>>>>>        1 file changed, 1 insertion(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
>>>>>>>>>>> index 7937ac0cbd0f..cfd4501db0fa 100644
>>>>>>>>>>> --- a/drivers/hv/Kconfig
>>>>>>>>>>> +++ b/drivers/hv/Kconfig
>>>>>>>>>>> @@ -74,6 +74,7 @@ config MSHV_ROOT
>>>>>>>>>>>        	# e.g. When withdrawing memory, the hypervisor gives back 4k pages in
>>>>>>>>>>>        	# no particular order, making it impossible to reassemble larger pages
>>>>>>>>>>>        	depends on PAGE_SIZE_4KB
>>>>>>>>>>> +	depends on !KEXEC
>>>>>>>>>>>        	select EVENTFD
>>>>>>>>>>>        	select VIRT_XFER_TO_GUEST_WORK
>>>>>>>>>>>        	select HMM_MIRROR
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Will this affect CRASH kexec? I see a few CONFIG_CRASH_DUMP in kexec.c
>>>>>>>>>> implying that crash dump might be involved. Or did you test kdump
>>>>>>>>>> and it was fine?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, it will. Crash kexec depends on normal kexec functionality, so it
>>>>>>>>> will be affected as well.
>>>>>>>>
>>>>>>>> So not sure I understand the reason for this patch. We can just block
>>>>>>>> kexec if there are any VMs running, right? Doing this would mean any
>>>>>>>> further development would be without a very important and major feature,
>>>>>>>> right?
>>>>>>>
>>>>>>> This is an option. But until it's implemented and merged, a user mshv
>>>>>>> driver gets into a situation where kexec is broken in a non-obvious way.
>>>>>>> The system may crash at any time after kexec, depending on whether the
>>>>>>> new kernel touches the pages deposited to hypervisor or not. This is a
>>>>>>> bad user experience.
>>>>>>
>>>>>> I understand that. But with this we cannot collect core and debug any
>>>>>> crashes. I was thinking there would be a quick way to prohibit kexec
>>>>>> for update via notifier or some other quick hack. Did you already
>>>>>> explore that and didn't find anything, hence this?
>>>>>>
>>>>>
>>>>> This quick hack you mention isn't quick in the upstream kernel as there
>>>>> is no hook to interrupt kexec process except the live update one.
>>>>
>>>> That's the one we want to interrupt and block right? crash kexec
>>>> is ok and should be allowed. We can document we don't support kexec
>>>> for update for now.
>>>>
>>>>> I sent an RFC for that one, but given today's conversation details it
>>>>> won't be accepted as is.
>>>>
>>>> Are you talking about this?
>>>>
>>>>           "mshv: Add kexec safety for deposited pages"
>>>>
>>>
>>> Yes.
>>>
>>>>> Making mshv mutually exclusive with kexec is the only viable option for
>>>>> now given time constraints.
>>>>> It is intended to be replaced with proper page lifecycle management in
>>>>> the future.
>>>>
>>>> Yeah, that could take a long time and imo we cannot just disable KEXEC
>>>> completely. What we want is to just block kexec for updates from some
>>>> mshv file for now; we can print during boot that kexec for updates is
>>>> not supported on mshv. Hope that makes sense.
>>>>
>>>
>>> The trade-off here is between disabling kexec support and having the
>>> kernel crash after kexec in a non-obvious way. This affects both regular
>>> kexec and crash kexec.
>>
>> crash kexec on baremetal is not affected, hence disabling that
>> doesn't make sense as we can't debug crashes then on bm.
>>
>> Let me think and explore a bit, and if I come up with something, I'll
>> send a patch here. If nothing, then we can do this as last resort.
>>
>> Thanks,
>> -Mukesh
> 
> Maybe you've already looked at this, but there's a sysctl parameter
> kernel.kexec_load_limit_reboot that prevents loading a kexec
> kernel for reboot if the value is zero. Separately, there is
> kernel.kexec_load_limit_panic that controls whether a kexec
> kernel can be loaded for kdump purposes.
> 
> kernel.kexec_load_limit_reboot defaults to -1, which allows a kexec
> kernel to be loaded for reboot an unlimited number of times. But the value
> can be set to zero with this kernel boot line parameter:
> 
> sysctl.kernel.kexec_load_limit_reboot=0
> 
> Alternatively, the mshv driver initialization could add code along
> the lines of process_sysctl_arg() to open
> /proc/sys/kernel/kexec_load_limit_reboot and write a value of zero.
> Then there's no dependency on setting the kernel boot line.
> 
> The downside to either method is that after Linux in the root partition
> is up-and-running, it is possible to change the sysctl to a non-zero value,
> and then load a kexec kernel for reboot. So this approach isn't absolute
> protection against doing a kexec for reboot. But it makes it harder, and
> until there's a mechanism to reclaim the deposited pages, it might be
> a viable compromise to allow kdump to still be used.
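
For reference, the sysctl semantics described above can be sketched in userspace C. This is only an illustration, not driver code: the helper names (parse_limit, kexec_reboot_load_allowed) are made up for this example, and the only detail taken from the message is that -1 means unlimited loads and 0 means none. Treating every other non-zero value as "loads still allowed" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the text form of
 * kernel.kexec_load_limit_reboot (e.g. "0\n" or "-1\n").
 * Returns 0 on success, -1 if the text is not a number.
 */
int parse_limit(const char *text, long *out)
{
	char *end;
	long value;

	assert(text != NULL && out != NULL);

	errno = 0;
	value = strtol(text, &end, 10);
	if (errno != 0 || end == text)
		return -1;

	*out = value;
	return 0;
}

/*
 * Hypothetical helper: per the description above, -1 means an
 * unlimited number of loads and 0 means no load is permitted.
 * Any other non-zero value is treated as "loads remain"
 * (an assumption of this sketch).
 */
int kexec_reboot_load_allowed(long limit)
{
	return limit != 0;
}
```

With sysctl.kernel.kexec_load_limit_reboot=0 on the boot line, reading the proc file and feeding its text through these helpers would report that a reboot kexec load is not allowed, while the separate kexec_load_limit_panic value is left untouched.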

Hmm, well... I think I see a much easier way to do this by
just hijacking __kexec_lock. I will resume my normal work tomorrow (Fri),
so let me test it out. If it works, I will send a patch Monday.

Thanks,
-Mukesh
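
The message above does not say how __kexec_lock would be hijacked, so the following is only one possible reading, modelled in userspace C11 rather than kernel code. It assumes the lock behaves as a try-lock (0 = free, 1 = held) and that a load attempt fails with -EBUSY when the lock cannot be taken. All model_* names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the lock word: 0 = free, 1 = held. */
atomic_int model_kexec_lock = 0;

/* Try to take the lock without blocking; true on success. */
bool model_kexec_trylock(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(
		&model_kexec_lock, &expected, 1,
		memory_order_acquire, memory_order_relaxed);
}

/* Release the lock; only valid while it is held. */
void model_kexec_unlock(void)
{
	assert(atomic_load(&model_kexec_lock) == 1);
	atomic_store_explicit(&model_kexec_lock, 0, memory_order_release);
}

/*
 * Hypothetical load path: refuses with -EBUSY while someone else
 * holds the lock, otherwise takes and releases it around the
 * (elided) image staging.
 */
int model_kexec_load(void)
{
	if (!model_kexec_trylock())
		return -EBUSY;
	/* image staging elided */
	model_kexec_unlock();
	return 0;
}
```

In this model, a driver that takes the lock once and never releases it makes every later model_kexec_load() return -EBUSY. Whether that can be scoped to kexec for reboot only, leaving the crash path usable, is exactly what would need testing.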



> Just a thought ....
> 
> Michael
> 
>>
>>
>>> It's a pity we can't apply a quick hack to disable only regular kexec.
>>> However, since crash kexec would hit the same issues, until we have a
>>> proper state transition for deposited pages, the best workaround for now
>>> is to reset the hypervisor state on every kexec, which needs design,
>>> work, and testing.
>>>
>>> Disabling kexec is the only consistent way to handle this in the
>>> upstream kernel at the moment.
>>>
>>> Thanks, Stanislav

