* [PATCH AUTOSEL 5.15 12/41] Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
[not found] <20220412004656.350101-1-sashal@kernel.org>
@ 2022-04-12 0:46 ` Sasha Levin
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 13/41] PCI: hv: Propagate coherence from VMbus device to PCI device Sasha Levin
` (2 subsequent siblings)
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2022-04-12 0:46 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Andrea Parri (Microsoft), Dexuan Cui, Wei Liu, Sasha Levin, kys,
haiyangz, sthemmin, linux-hyperv
From: "Andrea Parri (Microsoft)" <parri.andrea@gmail.com>
[ Upstream commit 9f8b577f7b43b2170628d6c537252785dcc2dcea ]
hv_panic_page might contain guest-sensitive information, do not dump it
over to Hyper-V by default in isolated guests.
While at it, update some comments in hyperv_{panic,die}_event().
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/20220301141135.2232-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/hv/vmbus_drv.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 44bd0b6ff505..75e0a0994619 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -76,8 +76,8 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
/*
* Hyper-V should be notified only once about a panic. If we will be
- * doing hyperv_report_panic_msg() later with kmsg data, don't do
- * the notification here.
+ * doing hv_kmsg_dump() with kmsg data later, don't do the notification
+ * here.
*/
if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
&& hyperv_report_reg()) {
@@ -99,8 +99,8 @@ static int hyperv_die_event(struct notifier_block *nb, unsigned long val,
/*
* Hyper-V should be notified only once about a panic. If we will be
- * doing hyperv_report_panic_msg() later with kmsg data, don't do
- * the notification here.
+ * doing hv_kmsg_dump() with kmsg data later, don't do the notification
+ * here.
*/
if (hyperv_report_reg())
hyperv_report_panic(regs, val, true);
@@ -1545,14 +1545,20 @@ static int vmbus_bus_init(void)
if (ret)
goto err_connect;
+ if (hv_is_isolation_supported())
+ sysctl_record_panic_msg = 0;
+
/*
* Only register if the crash MSRs are available
*/
if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
u64 hyperv_crash_ctl;
/*
- * Sysctl registration is not fatal, since by default
- * reporting is enabled.
+ * Panic message recording (sysctl_record_panic_msg)
+ * is enabled by default in non-isolated guests and
+ * disabled by default in isolated guests; the panic
+ * message recording won't be available in isolated
+ * guests should the following registration fail.
*/
hv_ctl_table_hdr = register_sysctl_table(hv_root_table);
if (!hv_ctl_table_hdr)
--
2.35.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH AUTOSEL 5.15 13/41] PCI: hv: Propagate coherence from VMbus device to PCI device
[not found] <20220412004656.350101-1-sashal@kernel.org>
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 12/41] Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests Sasha Levin
@ 2022-04-12 0:46 ` Sasha Levin
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 14/41] Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer Sasha Levin
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 25/41] Drivers: hv: balloon: Disable balloon and hot-add accordingly Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2022-04-12 0:46 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Michael Kelley, Boqun Feng, Robin Murphy, Wei Liu, Sasha Levin,
kys, haiyangz, sthemmin, decui, lorenzo.pieralisi, bhelgaas,
linux-hyperv, linux-pci
From: Michael Kelley <mikelley@microsoft.com>
[ Upstream commit 8d21732475c637c7efcdb91dc927a4c594e97898 ]
PCI pass-thru devices in a Hyper-V VM are represented as a VMBus
device and as a PCI device. The coherence of the VMbus device is
set based on the VMbus node in ACPI, but the PCI device has no
ACPI node and defaults to not hardware coherent. This results
in extra software coherence management overhead on ARM64 when
devices are hardware coherent.
Fix this by setting up the PCI host bus so that normal
PCI mechanisms will propagate the coherence of the VMbus
device to the PCI device. There's no effect on x86/x64 where
devices are always hardware coherent.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1648138492-2191-3-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/pci/controller/pci-hyperv.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 9dd4502d32a4..5b156c563e3a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3148,6 +3148,15 @@ static int hv_pci_probe(struct hv_device *hdev,
hbus->bridge->domain_nr = dom;
#ifdef CONFIG_X86
hbus->sysdata.domain = dom;
+#elif defined(CONFIG_ARM64)
+ /*
+ * Set the PCI bus parent to be the corresponding VMbus
+ * device. Then the VMbus device will be assigned as the
+ * ACPI companion in pcibios_root_bridge_prepare() and
+ * pci_dma_configure() will propagate device coherence
+ * information to devices created on the bus.
+ */
+ hbus->sysdata.parent = hdev->device.parent;
#endif
hbus->hdev = hdev;
--
2.35.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH AUTOSEL 5.15 14/41] Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
[not found] <20220412004656.350101-1-sashal@kernel.org>
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 12/41] Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests Sasha Levin
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 13/41] PCI: hv: Propagate coherence from VMbus device to PCI device Sasha Levin
@ 2022-04-12 0:46 ` Sasha Levin
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 25/41] Drivers: hv: balloon: Disable balloon and hot-add accordingly Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2022-04-12 0:46 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Michael Kelley, Andrea Parri, Wei Liu, Sasha Levin, kys, haiyangz,
sthemmin, decui, linux-hyperv
From: Michael Kelley <mikelley@microsoft.com>
[ Upstream commit b6cae15b5710c8097aad26a2e5e752c323ee5348 ]
When reading a packet from a host-to-guest ring buffer, there is no
memory barrier between reading the write index (to see if there is
a packet to read) and reading the contents of the packet. The Hyper-V
host uses store-release when updating the write index to ensure that
writes of the packet data are completed first. On the guest side,
the processor can reorder and read the packet data before the write
index, and sometimes get stale packet data. Getting such stale packet
data has been observed in a reproducible case in a VM on ARM64.
Fix this by using virt_load_acquire() to read the write index,
ensuring that reads of the packet data cannot be reordered
before it. Preventing such reordering is logically correct, and
with this change, getting stale data can no longer be reproduced.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/1648394710-33480-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/hv/ring_buffer.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 314015d9e912..f4091143213b 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -408,7 +408,16 @@ int hv_ringbuffer_read(struct vmbus_channel *channel,
static u32 hv_pkt_iter_avail(const struct hv_ring_buffer_info *rbi)
{
u32 priv_read_loc = rbi->priv_read_index;
- u32 write_loc = READ_ONCE(rbi->ring_buffer->write_index);
+ u32 write_loc;
+
+ /*
+ * The Hyper-V host writes the packet data, then uses
+ * store_release() to update the write_index. Use load_acquire()
+ * here to prevent loads of the packet data from being re-ordered
+ * before the read of the write_index and potentially getting
+ * stale data.
+ */
+ write_loc = virt_load_acquire(&rbi->ring_buffer->write_index);
if (write_loc >= priv_read_loc)
return write_loc - priv_read_loc;
--
2.35.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH AUTOSEL 5.15 25/41] Drivers: hv: balloon: Disable balloon and hot-add accordingly
[not found] <20220412004656.350101-1-sashal@kernel.org>
` (2 preceding siblings ...)
2022-04-12 0:46 ` [PATCH AUTOSEL 5.15 14/41] Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer Sasha Levin
@ 2022-04-12 0:46 ` Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2022-04-12 0:46 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Boqun Feng, Michael Kelley, Wei Liu, Sasha Levin, kys, haiyangz,
sthemmin, decui, linux-hyperv
From: Boqun Feng <boqun.feng@gmail.com>
[ Upstream commit be5802795cf8d0b881745fa9ba7790293b382280 ]
Currently there are known potential issues for balloon and hot-add on
ARM64:
* Unballoon requests from Hyper-V should only unballoon ranges
that are guest page size aligned, otherwise guests cannot handle
because it's impossible to partially free a page. This is a
problem when guest page size > 4096 bytes.
* Memory hot-add requests from Hyper-V should provide the NUMA
node id of the added ranges or ARM64 should have a functional
memory_add_physaddr_to_nid(), otherwise the node id is missing
for add_memory().
These issues require discussions on design and implementation. In the
meanwhile, post_status() is working and essential to guest monitoring.
Therefore instead of disabling the entire hv_balloon driver, the
ballooning (when page size > 4096 bytes) and hot-add are disabled
accordingly for now. Once the issues are fixed, they can be re-enable in
these cases.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220325023212.1570049-3-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/hv/hv_balloon.c | 36 ++++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 439f99b8b5de..3cf334c46c31 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1653,6 +1653,38 @@ static void disable_page_reporting(void)
}
}
+static int ballooning_enabled(void)
+{
+ /*
+ * Disable ballooning if the page size is not 4k (HV_HYP_PAGE_SIZE),
+ * since currently it's unclear to us whether an unballoon request can
+ * make sure all page ranges are guest page size aligned.
+ */
+ if (PAGE_SIZE != HV_HYP_PAGE_SIZE) {
+ pr_info("Ballooning disabled because page size is not 4096 bytes\n");
+ return 0;
+ }
+
+ return 1;
+}
+
+static int hot_add_enabled(void)
+{
+ /*
+ * Disable hot add on ARM64, because we currently rely on
+ * memory_add_physaddr_to_nid() to get a node id of a hot add range,
+ * however ARM64's memory_add_physaddr_to_nid() always return 0 and
+ * DM_MEM_HOT_ADD_REQUEST doesn't have the NUMA node information for
+ * add_memory().
+ */
+ if (IS_ENABLED(CONFIG_ARM64)) {
+ pr_info("Memory hot add disabled on ARM64\n");
+ return 0;
+ }
+
+ return 1;
+}
+
static int balloon_connect_vsp(struct hv_device *dev)
{
struct dm_version_request version_req;
@@ -1724,8 +1756,8 @@ static int balloon_connect_vsp(struct hv_device *dev)
* currently still requires the bits to be set, so we have to add code
* to fail the host's hot-add and balloon up/down requests, if any.
*/
- cap_msg.caps.cap_bits.balloon = 1;
- cap_msg.caps.cap_bits.hot_add = 1;
+ cap_msg.caps.cap_bits.balloon = ballooning_enabled();
+ cap_msg.caps.cap_bits.hot_add = hot_add_enabled();
/*
* Specify our alignment requirements as it relates
--
2.35.1
^ permalink raw reply related [flat|nested] 4+ messages in thread