* RE: [PATCH] Drivers: hv: Allow single instance of hv_util devices
[not found] <1734738938-21274-1-git-send-email-sosha@linux.microsoft.com>
@ 2024-12-29 18:02 ` Michael Kelley
2025-01-07 2:00 ` Wei Liu
0 siblings, 1 reply; 3+ messages in thread
From: Michael Kelley @ 2024-12-29 18:02 UTC (permalink / raw)
To: Sonia Sharma, linux-kernel@vger.kernel.org
Cc: sosha@microsoft.com, decui@microsoft.com,
ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org
From: Sonia Sharma <sosha@linux.microsoft.com> Sent: Friday, December 20, 2024 3:56 PM
>
Please include the "linux-hyperv@vger.kernel.org" mailing list
when submitting patches related to Hyper-V.
> Harden hv_util type device drivers to allow single
> instance of the device be configured at given time.
>
I think the reason for this patch needs more explanation. For several
VMBus devices, a well-behaved Hyper-V is expected to offer only one
instance of the device in a given VM. Linux guests originally assumed
that the Hyper-V host is well-behaved, so the device drivers for many
of these devices were written assuming only a single instance. But
with the introduction of Confidential Computing (CoCo) VMs, Hyper-V
is no longer assumed to be well-behaved. If a compromised & malicious
Hyper-V were to offer multiple instances of such a device, the device
driver assumption about a single instance would be false, and
memory corruption could occur, which has the potential to lead to
compromise of the CoCo VM. The intent is to prevent such a scenario.
Note that this problem extends beyond just "util" devices. Hyper-V
is expected to offer only a single instance of keyboard, mouse, frame
buffer, and balloon devices as well. So this patch should be extended
to include them as well (and your new function names containing
"hv_util" should be adjusted). Interestingly, the Hyper-V keyboard driver
does not assume a single instance, so it should be safe regardless. But
the mouse, frame buffer, and balloon drivers are not safe.
With this understanding, there are two ways to approach the problem:
1) Enforce the expectation that a well-behaved Hyper-V only offers a
single instance of these VMBus devices. That's the approach that this
patch takes.
2) Update the device drivers to remove the assumption of a single
instance. With this approach, if a compromised & malicious Hyper-V
were to offer multiple instances, the extra devices might be bogus,
but memory corruption would not occur and the integrity of the
CoCo VM should not be compromised. As mentioned above, such
is already the case with the keyboard driver.
I've thought about the tradeoffs for the two approaches, and don't
really have a strong opinion either way. In some sense, #2 is the
more correct approach as ideally device drivers shouldn't make
single instance assumptions. But #1 is an easier fix, and perhaps
more robust. Other reviewers might have other reasons to prefer
one over the other, and have a stronger viewpoint on the tradeoffs.
I would be interested in any such comments. But I'm OK with
approach #1 unless someone points out a good reason to
prefer #2.
>
> New function vmbus_is_valid_hvutil_offer() is added.
> It checks if the new offer is for hv_util device type,
> then read the refcount for that device and accept or
> reject the offer accordingly.
>
> Signed-off-by: Sonia Sharma <sonia.sharma@linux.microsoft.com>
> ---
> drivers/hv/channel_mgmt.c | 64 ++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 63 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 3c6011a48dab..1a135cfad81f 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -20,6 +20,7 @@
> #include <linux/delay.h>
> #include <linux/cpu.h>
> #include <linux/hyperv.h>
> +#include <linux/refcount.h>
> #include <asm/mshyperv.h>
> #include <linux/sched/isolation.h>
>
> @@ -156,6 +157,8 @@ const struct vmbus_device vmbus_devs[] = {
> };
> EXPORT_SYMBOL_GPL(vmbus_devs);
>
> +static refcount_t singleton_vmbus_devs[HV_UNKNOWN + 1];
> +
From what I can see, these refcounts always have a value of either
0 or 1. The refcount never goes above 1 because the intent of this
patch is to enforce having only a single instance. The patch reads
the refcount, sets it to 1, or decrements it. I think you could just
use an array of booleans here, and set them to true or false.
READ_ONCE() and WRITE_ONCE() should be used because the
booleans are accessed from multiple threads.
> static const struct {
> guid_t guid;
> } vmbus_unsupported_devs[] = {
> @@ -198,6 +201,25 @@ static bool is_unsupported_vmbus_devs(const guid_t *guid)
> return false;
> }
>
> +static bool is_dev_hv_util(u16 dev_type)
> +{
> + switch (dev_type) {
> + case HV_SHUTDOWN:
> + fallthrough;
> + case HV_TS:
> + fallthrough;
> + case HV_HB:
> + fallthrough;
> + case HV_KVP:
> + fallthrough;
> + case HV_BACKUP:
> + return true;
> +
> + default:
> + return false;
> + }
> +}
> +
Rather than have a big case statement here, we already have
the "vmbus_devs" array that statically specifies various properties
of each VMBus device type. I'd suggest adding a field to those array
entries to indicate whether the device type is expected to be a
singleton. Then you can just do a direct lookup, like with the
"perf_device" and "allowed_in_isolated" properties.
> static u16 hv_get_dev_type(const struct vmbus_channel *channel)
> {
> const guid_t *guid = &channel->offermsg.offer.if_type;
> @@ -1004,6 +1026,26 @@ find_primary_channel_by_offer(const struct vmbus_channel_offer_channel *offer)
> return channel;
> }
>
> +static u16 vmbus_is_valid_hvutil_offer(const struct vmbus_channel_offer_channel *offer)
> +{
> + const guid_t *guid = &offer->offer.if_type;
> + u16 i;
> +
> + if (is_hvsock_offer(offer))
> + return HV_UNKNOWN;
> +
> + for (i = HV_IDE; i < HV_UNKNOWN; i++) {
> + if (guid_equal(guid, &vmbus_devs[i].guid) && is_dev_hv_util(i)) {
Ideally, we should avoid coding yet another case of searching through
the vmbus_devs[] array for a matching GUID. The function hv_get_dev_type()
already does this, and returns the index into the vmbus_devs[] array.
You could probably use that function, and then just pass the index as
the argument to this function.
That index is also stored as the "device_id" (arguably mis-named) in the
struct vmbus_channel, so it's already available in the rescind path.
> + if (refcount_read(&singleton_vmbus_devs[i]))
> + return HV_UNKNOWN;
> + refcount_set(&singleton_vmbus_devs[i], 1);
> + return i;
> + }
> + }
> +
> + return i;
> +}
> +
> static bool vmbus_is_valid_offer(const struct vmbus_channel_offer_channel *offer)
> {
> const guid_t *guid = &offer->offer.if_type;
> @@ -1031,6 +1073,7 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr)
> struct vmbus_channel_offer_channel *offer;
> struct vmbus_channel *oldchannel, *newchannel;
> size_t offer_sz;
> + u16 dev_type;
>
> offer = (struct vmbus_channel_offer_channel *)hdr;
>
> @@ -1115,11 +1158,29 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr)
> return;
> }
>
> + /*
> + * If vmbus_is_valid_offer() returns -
> + * HV_UNKNOWN - Subsequent offer received for hv_util dev, thus reject offer.
> + * HV_SHUTDOWN|HV_TS|HV_KVP|HV_HB|HV-KVP|HV_BACKUP - Increment refcount
> + * Others - Continue as is without doing anything.
> + *
> + * NOTE - We do not want to increase refcount if we resume from hibernation.
> + */
> + dev_type = vmbus_is_valid_hvutil_offer(offer);
> + if (dev_type == HV_UNKNOWN) {
> + pr_err_ratelimited("Invalid hv_util offer %d from the host supporting "
> + "isolation\n", offer->child_relid);
This check for multiple instances of a singleton device is not limited
to just CoCo VMs (a.k.a. "isolated VMs"). So the error message here really
shouldn't reference "host supporting isolation".
> + atomic_dec(&vmbus_connection.offer_in_progress);
> + return;
> + }
> +
> /* Allocate the channel object and save this offer. */
> newchannel = alloc_channel();
> if (!newchannel) {
> vmbus_release_relid(offer->child_relid);
> atomic_dec(&vmbus_connection.offer_in_progress);
> + if (is_dev_hv_util(dev_type))
> + refcount_dec(&singleton_vmbus_devs[dev_type]);
It might be good to have a function that combines the above two lines.
Then the two parallel functions are:
1) vmbus_is_valid_hvutil_offer() which marks a singleton device as
"already present" [and that function probably needs a new name]
2) vmbus_clear_singleton_device(), [or something similar] that clears
the boolean if it is a singleton device.
vmbus_clear_singleton_device() would also be used in the
rescind path and in the vmbus_add_channel_work() error path
that I mention below.
> pr_err("Unable to allocate channel object\n");
> return;
> }
There's another error case in the channel offer path that needs
to be handled. vmbus_add_channel_work() can fail, in which case
the new channel is cleaned up and removed. The accounting of
singleton devices must be updated if the channel is deleted via
this error path.
> @@ -1235,7 +1296,6 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
> /*
> * At this point, the rescind handling can proceed safely.
> */
> -
This is a spurious whitespace change that should be avoided.
> if (channel->device_obj) {
> if (channel->chn_rescind_callback) {
> channel->chn_rescind_callback(channel);
> @@ -1251,6 +1311,8 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
> */
> dev = get_device(&channel->device_obj->device);
> if (dev) {
> + if (is_dev_hv_util(hv_get_dev_type(channel)))
As noted above, the "dev_type" is already stored in the channel structure
as field "device_id" (which is a bit mis-named).
Michael
> + refcount_dec(&singleton_vmbus_devs[hv_get_dev_type(channel)]);
> vmbus_device_unregister(channel->device_obj);
> put_device(dev);
> }
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Drivers: hv: Allow single instance of hv_util devices
2024-12-29 18:02 ` [PATCH] Drivers: hv: Allow single instance of hv_util devices Michael Kelley
@ 2025-01-07 2:00 ` Wei Liu
[not found] ` <CH4PR21MB4613241E591ED702A508C35491112@CH4PR21MB4613.namprd21.prod.outlook.com>
0 siblings, 1 reply; 3+ messages in thread
From: Wei Liu @ 2025-01-07 2:00 UTC (permalink / raw)
To: Michael Kelley
Cc: Sonia Sharma, linux-kernel@vger.kernel.org, sosha@microsoft.com,
decui@microsoft.com, ssengar@linux.microsoft.com,
linux-hyperv@vger.kernel.org, Wei Liu
On Sun, Dec 29, 2024 at 06:02:34PM +0000, Michael Kelley wrote:
> From: Sonia Sharma <sosha@linux.microsoft.com> Sent: Friday, December 20, 2024 3:56 PM
> >
>
> Please include the "linux-hyperv@vger.kernel.org" mailing list
> when submitting patches related to Hyper-V.
>
> > Harden hv_util type device drivers to allow single
> > instance of the device be configured at given time.
> >
Why is this needed? What's the problem that this patch is trying to
solve?
>
> I think the reason for this patch needs more explanation. For several
> VMBus devices, a well-behaved Hyper-V is expected to offer only one
> instance of the device in a given VM. Linux guests originally assumed
> that the Hyper-V host is well-behaved, so the device drivers for many
> of these devices were written assuming only a single instance. But
> with the introduction of Confidential Computing (CoCo) VMs, Hyper-V
> is no longer assumed to be well-behaved. If a compromised & malicious
> Hyper-V were to offer multiple instances of such a device, the device
> driver assumption about a single instance would be false, and
> memory corruption could occur, which has the potential to lead to
> compromise of the CoCo VM. The intent is to prevent such a scenario.
>
> Note that this problem extends beyond just "util" devices. Hyper-V
> is expected to offer only a single instance of keyboard, mouse, frame
> buffer, and balloon devices as well. So this patch should be extended
> to include them as well (and your new function names containing
> "hv_util" should be adjusted). Interestingly, the Hyper-V keyboard driver
> does not assume a single instance, so it should be safe regardless. But
> the mouse, frame buffer, and balloon drivers are not safe.
>
> With this understanding, there are two ways to approach the problem:
>
> 1) Enforce the expectation that a well-behaved Hyper-V only offers a
> single instance of these VMBus devices. That's the approach that this
> patch takes.
>
> 2) Update the device drivers to remove the assumption of a single
> instance. With this approach, if a compromised & malicious Hyper-V
> were to offer multiple instances, the extra devices might be bogus,
> but memory corruption would not occur and the integrity of the
> CoCo VM should not be compromised. As mentioned above, such
> is already the case with the keyboard driver.
>
> I've thought about the tradeoffs for the two approaches, and don't
> really have a strong opinion either way. In some sense, #2 is the
> more correct approach as ideally device drivers shouldn't make
> single instance assumptions. But #1 is an easier fix, and perhaps
> more robust. Other reviewers might have other reasons to prefer
> one over the other, and have a stronger viewpoint on the tradeoffs.
> I would be interested in any such comments. But I'm OK with
> approach #1 unless someone points out a good reason to
> prefer #2.
#2 is preferred. It is frowned upon to make assumptions that only one
instance of a device will be present.
It perhaps takes more work to check and enforce the invariant (as this
patch demonstrates) than to just let the device framework handle
multiple instances.
Thanks,
Wei.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Drivers: hv: Allow single instance of hv_util devices
[not found] ` <CH4PR21MB4613241E591ED702A508C35491112@CH4PR21MB4613.namprd21.prod.outlook.com>
@ 2025-01-09 7:35 ` Sonia Sharma
0 siblings, 0 replies; 3+ messages in thread
From: Sonia Sharma @ 2025-01-09 7:35 UTC (permalink / raw)
To: Wei Liu, Michael Kelley, Sonia Sharma,
linux-kernel@vger.kernel.org, Sonia Sharma, Dexuan Cui,
ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org
On Tue, Jan 7, 2025 at 02:00:48PM +0000, Wei Liu wrote:
>
> On Sun, Dec 29, 2024 at 06:02:34PM +0000, Michael Kelley wrote:
>> From: Sonia Sharma <sosha@linux.microsoft.com> Sent: Friday, December 20, 2024 3:56 PM
>> >
>>
>> Please include the "linux-hyperv@vger.kernel.org" mailing list
>> when submitting patches related to Hyper-V.
>>
>> > Harden hv_util type device drivers to allow single
>> > instance of the device be configured at given time.
>> >
>
> Why is this needed? What's the problem that this patch is trying to
> solve?
>
>>
>> I think the reason for this patch needs more explanation. For several
>> VMBus devices, a well-behaved Hyper-V is expected to offer only one
>> instance of the device in a given VM. Linux guests originally assumed
>> that the Hyper-V host is well-behaved, so the device drivers for many
>> of these devices were written assuming only a single instance. But
>> with the introduction of Confidential Computing (CoCo) VMs, Hyper-V
>> is no longer assumed to be well-behaved. If a compromised & malicious
>> Hyper-V were to offer multiple instances of such a device, the device
>> driver assumption about a single instance would be false, and
>> memory corruption could occur, which has the potential to lead to
>> compromise of the CoCo VM. The intent is to prevent such a scenario.
>>
>> Note that this problem extends beyond just "util" devices. Hyper-V
>> is expected to offer only a single instance of keyboard, mouse, frame
>> buffer, and balloon devices as well. So this patch should be extended
>> to include them as well (and your new function names containing
>> "hv_util" should be adjusted). Interestingly, the Hyper-V keyboard driver
>> does not assume a single instance, so it should be safe regardless. But
>> the mouse, frame buffer, and balloon drivers are not safe.
>>
>> With this understanding, there are two ways to approach the problem:
>>
>> 1) Enforce the expectation that a well-behaved Hyper-V only offers a
>> single instance of these VMBus devices. That's the approach that this
>> patch takes.
>>
>> 2) Update the device drivers to remove the assumption of a single
>> instance. With this approach, if a compromised & malicious Hyper-V
>> were to offer multiple instances, the extra devices might be bogus,
>> but memory corruption would not occur and the integrity of the
>> CoCo VM should not be compromised. As mentioned above, such
>> is already the case with the keyboard driver.
>>
>> I've thought about the tradeoffs for the two approaches, and don't
>> really have a strong opinion either way. In some sense, #2 is the
>> more correct approach as ideally device drivers shouldn't make
>> single instance assumptions. But #1 is an easier fix, and perhaps
>> more robust. Other reviewers might have other reasons to prefer
>> one over the other, and have a stronger viewpoint on the tradeoffs.
>> I would be interested in any such comments. But I'm OK with
>> approach #1 unless someone points out a good reason to
>> prefer #2.
>
> #2 is preferred. It is frowned upon to make assumptions that only one
> instance of a device will be present.
>
> It perhaps takes more work to check and enforce the invariant (as this
> patch demonstrates) than to just let the device framework handle
> multiple instances.
>
> Thanks,
> Wei.
Thanks Michael and Wei for the review.
The intent of the patch is correctly described by Michael. With that, it seems the consensus is to go with approach #2, so I would then work on a new patch series fixing the assumption of singleton driver wherever needed.
Thank you,
Sonia
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-01-09 7:35 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1734738938-21274-1-git-send-email-sosha@linux.microsoft.com>
2024-12-29 18:02 ` [PATCH] Drivers: hv: Allow single instance of hv_util devices Michael Kelley
2025-01-07 2:00 ` Wei Liu
[not found] ` <CH4PR21MB4613241E591ED702A508C35491112@CH4PR21MB4613.namprd21.prod.outlook.com>
2025-01-09 7:35 ` Sonia Sharma
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).