[PATCH RESEND kvmtool 0/4] arm: Payload memory layout change

public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed

* [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change
@ 2024-11-28 15:12 Alexandru Elisei
  2024-11-28 15:12 ` [PATCH kvmtool 1/4] arm: Fix off-by-one errors when computing payload memory layout Alexandru Elisei
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Alexandru Elisei @ 2024-11-28 15:12 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm
  Cc: maz, oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

(resending because I accidently only sent the cover letter, sorry for that)

The first 3 patches are fixes to kvm__arch_load_kernel_image(). I've CC'ed
the riscv maintainer because it looks to me like riscv is similarly
affected.

Patch #4 ("arm64: Increase the payload memory region size to 512MB") might
be controversial. Follows a bug report I received from Abdollahi Sina in
private. Details in the commit message, but the gist of the patch is that
the memory region where kernel + initrd + DTB are copied to are changed
from 256MB to 512MB.  As a result, the DTB and initrd are moved from below
ram_start + 256MB to ram_start + 512MB to accomodate a larger initrd.  If
users rely on finding the DTB and initrd at the current addresses, then I'm
not sure the patch is justified - after all, if someone really wants to use
such a large initrd instead of a disk image with virtio, then replacing
SZ_256M with SZ_512M locally doesn't look like a big ask.

On the other hand, if there are no users that rely on the current payload
layout, increasing the memory region size to 512MB to allow for more
unusual use cases, while still maintaining compatibility with older
kernels, doesn't seem unreasonable to me.

Please comment, I don't feel strongly either way - I'll happy drop the last
patch if there are objections.

Alexandru Elisei (4):
  arm: Fix off-by-one errors when computing payload memory layout
  arm: Check return value for host_to_guest_flat()
  arm64: Use the kernel header image_size when loading into memory
  arm64: Increase the payload memory region size to 512MB

 arm/aarch32/include/kvm/kvm-arch.h |  5 +-
 arm/aarch64/include/kvm/kvm-arch.h |  7 ++-
 arm/aarch64/kvm.c                  | 88 +++++++++++++++++++++++-------
 arm/kvm.c                          | 35 +++++++++---
 4 files changed, 105 insertions(+), 30 deletions(-)

-- 
2.47.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH kvmtool 1/4] arm: Fix off-by-one errors when computing payload memory layout
  2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
@ 2024-11-28 15:12 ` Alexandru Elisei
  2024-11-28 15:12 ` [PATCH kvmtool 2/4] arm: Check return value for host_to_guest_flat() Alexandru Elisei
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Alexandru Elisei @ 2024-11-28 15:12 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm
  Cc: maz, oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

In kvm__arch_load_kernel_image(), 'limit' is computed to be the topmost
byte address where the payload can reside.

In all the read_file() calls, the maximum size of the file being read is
computed as limit - pos, which is incorrect: either limit is inclusive, and
it should be limit - pos + 1, or the maximum size is correct and limit is
incorrectly computed as inclusive.

After reserving space for the DTB, 'limit' is updated to point at the first
byte of the DTB. Which is in contradiction with the way it is initially
calculated, because in theory this makes it possible for the initrd (which
is copied below the DTB) to overwrite the first byte of the DTB. That's
only avoided by accident, and not by design, because, as explained above,
the size of the initrd is smaller by 1 byte (read_file() has the size
parameter limit - pos, instead of limit - pos + 1).

Let's get rid of this confusion and compute 'limit' as exclusive from the
start.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/kvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arm/kvm.c b/arm/kvm.c
index 9f9582326401..da0430c40c36 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -109,7 +109,7 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	 * Linux requires the initrd and dtb to be mapped inside lowmem,
 	 * so we can't just place them at the top of memory.
 	 */
-	limit = kvm->ram_start + min(kvm->ram_size, (u64)SZ_256M) - 1;
+	limit = kvm->ram_start + min(kvm->ram_size, (u64)SZ_256M);

 	pos = kvm->ram_start + kvm__arch_get_kern_offset(kvm, fd_kernel);
 	kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
@@ -139,7 +139,7 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	kvm->arch.dtb_guest_start = guest_addr;
 	pr_debug("Placing fdt at 0x%llx - 0x%llx",
 		 kvm->arch.dtb_guest_start,
-		 host_to_guest_flat(kvm, limit));
+		 host_to_guest_flat(kvm, limit - 1));
 	limit = pos;

 	/* ... and finally the initrd, if we have one. */
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH kvmtool 2/4] arm: Check return value for host_to_guest_flat()
  2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
  2024-11-28 15:12 ` [PATCH kvmtool 1/4] arm: Fix off-by-one errors when computing payload memory layout Alexandru Elisei
@ 2024-11-28 15:12 ` Alexandru Elisei
  2024-11-28 16:06   ` Andrew Jones
  2024-11-28 15:12 ` [PATCH kvmtool 3/4] arm64: Use the kernel header image_size when loading into memory Alexandru Elisei
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Alexandru Elisei @ 2024-11-28 15:12 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm
  Cc: maz, oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

kvmtool, on arm and arm64, puts the kernel, DTB and initrd (if present) in
a 256MB memory region that starts at the bottom of RAM.
kvm__arch_load_kernel_image() copies the kernel at the start of RAM, the
DTB is placed at the top of the region, and immediately below it the
initrd.

When the initrd is specified by the user, kvmtool checks that it doesn't
overlap with the kernel by computing the start address in the host's
address space:

	fstat(fd_initrd, &sb);
	pos = pos - (sb.st_size + INITRD_ALIGN);
	guest_addr = ALIGN(host_to_guest_flat(kvm, pos), INITRD_ALIGN); (a)
	pos = guest_flat_to_host(kvm, guest_addr); (b)

If the initrd is large enough to completely overwrite the kernel and start
below the guest RAM (pos < kvm->ram_start), then kvmtool will omit the
following errors:

  Warning: unable to translate host address 0xfffe849ffffc to guest (1)
  Warning: unable to translate guest address 0x0 to host (2)
  Fatal: initrd overlaps with kernel image. (3)

(1) is because (a) calls host_to_guest_flat(kvm, pos) with a 'pos'
outside any of the memslots.

(2) is because guest_flat_to_host() is called at (b) with guest_addr=0,
which is what host_to_guest_flat() returns if the supplied address is not
found in any of the memslots. This warning is eliminated by this patch.

And finally, (3) is the most useful message, because it tells the user what
the error is.

The issue is a more general pattern in kvm__arch_load_kernel_image():
kvmtool doesn't check if host_to_guest_flat() returns 0, which means that
the host address is not within any of the memslots. Add a check for that,
which will at the very least remove the second warning.

This also fixes the following edge cases:

1. The same warnings being emitted in a similar scenario with the DTB, when
the RAM is smaller than FDT_MAX_SIZE + FDT_ALIGN.

2. When copying the kernel, if the RAM size is smaller than the kernel
offset, the start of the kernel (represented by the variable 'pos') will be
outside the VA space allocated for the guest RAM.  limit - pos will wrap
around, because gcc 14.1.1 wraps the pointers (void pointer arithmetic is
undefined in C99). Then read_file()->..->read() will return -EFAULT because
the destination address is unallocated (as per man 2 read, also reproduced
during testing).

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/kvm.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arm/kvm.c b/arm/kvm.c
index da0430c40c36..4beae69e1fb3 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -113,6 +113,8 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,

 	pos = kvm->ram_start + kvm__arch_get_kern_offset(kvm, fd_kernel);
 	kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
+	if (!kvm->arch.kern_guest_start)
+			die("guest memory too small to contain the kernel");
 	file_size = read_file(fd_kernel, pos, limit - pos);
 	if (file_size < 0) {
 		if (errno == ENOMEM)
@@ -131,7 +133,10 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	 */
 	pos = limit;
 	pos -= (FDT_MAX_SIZE + FDT_ALIGN);
-	guest_addr = ALIGN(host_to_guest_flat(kvm, pos), FDT_ALIGN);
+	guest_addr = host_to_guest_flat(kvm, pos);
+	if (!guest_addr)
+		die("fdt too big to contain in guest memory");
+	guest_addr = ALIGN(guest_addr, FDT_ALIGN);
 	pos = guest_flat_to_host(kvm, guest_addr);
 	if (pos < kernel_end)
 		die("fdt overlaps with kernel image.");
@@ -151,7 +156,10 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 			die_perror("fstat");

 		pos -= (sb.st_size + INITRD_ALIGN);
-		guest_addr = ALIGN(host_to_guest_flat(kvm, pos), INITRD_ALIGN);
+		guest_addr = host_to_guest_flat(kvm, pos);
+		if (!guest_addr)
+			die("initrd too big to fit in the payload memory region");
+		guest_addr = ALIGN(guest_addr, INITRD_ALIGN);
 		pos = guest_flat_to_host(kvm, guest_addr);
 		if (pos < kernel_end)
 			die("initrd overlaps with kernel image.");
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH kvmtool 3/4] arm64: Use the kernel header image_size when loading into memory
  2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
  2024-11-28 15:12 ` [PATCH kvmtool 1/4] arm: Fix off-by-one errors when computing payload memory layout Alexandru Elisei
  2024-11-28 15:12 ` [PATCH kvmtool 2/4] arm: Check return value for host_to_guest_flat() Alexandru Elisei
@ 2024-11-28 15:12 ` Alexandru Elisei
  2024-11-28 15:12 ` [RFC PATCH kvmtool 4/4] arm64: Increase the payload memory region size to 512MB Alexandru Elisei
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Alexandru Elisei @ 2024-11-28 15:12 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm
  Cc: maz, oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

The field 'image_size' from the kernel header encodes the kernel size when
loaded in memory. This includes the BSS section which gets zeroed early
during boot (for example, in early_map_kernel() in Linux v6.12), section
which is not reflected in the file size.

kvmtool, after loading the kernel image into memory, uses the file size,
not the image size, to compute the end of the kernel to check for overlaps.
As a result, kvmtool doesn't detect when the DTB or initrd overlap with the
in memory kernel image as long as they don't overlap with the file, and
this leads to Linux silently overwriting the DTB or the initrd with zeroes
during boot.

This kind of issue, when it happens, is not trivial to debug. kvmtool
already reads the image header to get the kernel offset, so expand on that
to also read the image size, and use it instead of the file size for memory
layout calculations.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/aarch32/include/kvm/kvm-arch.h |  2 +
 arm/aarch64/include/kvm/kvm-arch.h |  5 +-
 arm/aarch64/kvm.c                  | 80 +++++++++++++++++++++++-------
 arm/kvm.c                          | 15 ++++--
 4 files changed, 78 insertions(+), 24 deletions(-)

diff --git a/arm/aarch32/include/kvm/kvm-arch.h b/arm/aarch32/include/kvm/kvm-arch.h
index 467fb09175b8..07d711e2f4c1 100644
--- a/arm/aarch32/include/kvm/kvm-arch.h
+++ b/arm/aarch32/include/kvm/kvm-arch.h
@@ -4,8 +4,10 @@
 #include <linux/sizes.h>
 
 #define kvm__arch_get_kern_offset(...)	0x8000
+#define kvm__arch_get_kernel_size(...)	0
 
 struct kvm;
+static inline void kvm__arch_read_kernel_header(struct kvm *kvm, int fd) {}
 static inline void kvm__arch_enable_mte(struct kvm *kvm) {}
 
 #define MAX_PAGE_SIZE	SZ_4K
diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h
index 02d09a413831..97ab42485158 100644
--- a/arm/aarch64/include/kvm/kvm-arch.h
+++ b/arm/aarch64/include/kvm/kvm-arch.h
@@ -4,7 +4,10 @@
 #include <linux/sizes.h>
 
 struct kvm;
-unsigned long long kvm__arch_get_kern_offset(struct kvm *kvm, int fd);
+void kvm__arch_read_kernel_header(struct kvm *kvm, int fd);
+unsigned long long kvm__arch_get_kern_offset(struct kvm *kvm);
+u64 kvm__arch_get_kernel_size(struct kvm *kvm);
+
 int kvm__arch_get_ipa_limit(struct kvm *kvm);
 void kvm__arch_enable_mte(struct kvm *kvm);
 
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 54200c9eec9d..6fcc828cbe01 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -8,6 +8,8 @@
 
 #include <kvm/util.h>
 
+static struct arm64_image_header *kernel_header;
+
 int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset)
 {
 	struct kvm *kvm = opt->ptr;
@@ -57,50 +59,82 @@ u64 kvm__arch_default_ram_address(void)
 	return ARM_MEMORY_AREA;
 }
 
+void kvm__arch_read_kernel_header(struct kvm *kvm, int fd)
+{
+	const char *debug_str;
+	off_t cur_offset;
+	ssize_t size;
+
+	if (kvm->cfg.arch.aarch32_guest)
+		return;
+
+	kernel_header = malloc(sizeof(*kernel_header));
+	if (!kernel_header)
+		return;
+
+	cur_offset = lseek(fd, 0, SEEK_CUR);
+	if (cur_offset == (off_t)-1 || lseek(fd, 0, SEEK_SET) == (off_t)-1) {
+		debug_str = "Failed to seek in kernel image file";
+		goto fail;
+	}
+
+	size = xread(fd, kernel_header, sizeof(*kernel_header));
+	if (size < 0 || (size_t)size < sizeof(*kernel_header))
+		die("Failed to read kernel image header");
+
+	lseek(fd, cur_offset, SEEK_SET);
+
+	if (memcmp(&kernel_header->magic, ARM64_IMAGE_MAGIC, sizeof(kernel_header->magic))) {
+		debug_str = "Kernel image magic not matching";
+		kernel_header = NULL;
+		goto fail;
+	}
+
+	return;
+
+fail:
+	pr_debug("%s, using defaults", debug_str);
+}
+
 /*
  * Return the TEXT_OFFSET value that the guest kernel expects. Note
  * that pre-3.17 kernels expose this value using the native endianness
  * instead of Little-Endian. BE kernels of this vintage may fail to
  * boot. See Documentation/arm64/booting.rst in your local kernel tree.
  */
-unsigned long long kvm__arch_get_kern_offset(struct kvm *kvm, int fd)
+unsigned long long kvm__arch_get_kern_offset(struct kvm *kvm)
 {
-	struct arm64_image_header header;
-	off_t cur_offset;
-	ssize_t size;
 	const char *debug_str;
 
 	/* the 32bit kernel offset is a well known value */
 	if (kvm->cfg.arch.aarch32_guest)
 		return 0x8000;
 
-	cur_offset = lseek(fd, 0, SEEK_CUR);
-	if (cur_offset == (off_t)-1 ||
-	    lseek(fd, 0, SEEK_SET) == (off_t)-1) {
-		debug_str = "Failed to seek in kernel image file";
+	if (!kernel_header) {
+		debug_str = "Kernel header is missing";
 		goto default_offset;
 	}
 
-	size = xread(fd, &header, sizeof(header));
-	if (size < 0 || (size_t)size < sizeof(header))
-		die("Failed to read kernel image header");
-
-	lseek(fd, cur_offset, SEEK_SET);
-
-	if (memcmp(&header.magic, ARM64_IMAGE_MAGIC, sizeof(header.magic))) {
-		debug_str = "Kernel image magic not matching";
+	if (!le64_to_cpu(kernel_header->image_size)) {
+		debug_str = "Image size is 0";
 		goto default_offset;
 	}
 
-	if (le64_to_cpu(header.image_size))
-		return le64_to_cpu(header.text_offset);
+	return le64_to_cpu(kernel_header->text_offset);
 
-	debug_str = "Image size is 0";
 default_offset:
 	pr_debug("%s, assuming TEXT_OFFSET to be 0x80000", debug_str);
 	return 0x80000;
 }
 
+u64 kvm__arch_get_kernel_size(struct kvm *kvm)
+{
+	if (kvm->cfg.arch.aarch32_guest || !kernel_header)
+		return 0;
+
+	return le64_to_cpu(kernel_header->image_size);
+}
+
 int kvm__arch_get_ipa_limit(struct kvm *kvm)
 {
 	int ret;
@@ -160,3 +194,11 @@ void kvm__arch_enable_mte(struct kvm *kvm)
 
 	pr_debug("MTE capability enabled");
 }
+
+static int kvm__arch_free_kernel_header(struct kvm *kvm)
+{
+	free(kernel_header);
+
+	return 0;
+}
+late_exit(kvm__arch_free_kernel_header);
diff --git a/arm/kvm.c b/arm/kvm.c
index 4beae69e1fb3..9013be489aff 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -104,6 +104,7 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	void *pos, *kernel_end, *limit;
 	unsigned long guest_addr;
 	ssize_t file_size;
+	u64 kernel_size;
 
 	/*
 	 * Linux requires the initrd and dtb to be mapped inside lowmem,
@@ -111,7 +112,9 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	 */
 	limit = kvm->ram_start + min(kvm->ram_size, (u64)SZ_256M);
 
-	pos = kvm->ram_start + kvm__arch_get_kern_offset(kvm, fd_kernel);
+	kvm__arch_read_kernel_header(kvm, fd_kernel);
+
+	pos = kvm->ram_start + kvm__arch_get_kern_offset(kvm);
 	kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
 	if (!kvm->arch.kern_guest_start)
 			die("guest memory too small to contain the kernel");
@@ -122,9 +125,13 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 
 		die_perror("kernel read");
 	}
-	kernel_end = pos + file_size;
-	pr_debug("Loaded kernel to 0x%llx (%zd bytes)",
-		 kvm->arch.kern_guest_start, file_size);
+
+	kernel_size = kvm__arch_get_kernel_size(kvm);
+	if (!kernel_size || kernel_size < (u64)file_size)
+		kernel_size = file_size;
+	kernel_end = pos + kernel_size;
+	pr_debug("Loaded kernel to 0x%llx (%llu bytes)",
+		 kvm->arch.kern_guest_start, kernel_size);
 
 	/*
 	 * Now load backwards from the end of memory so the kernel
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH kvmtool 4/4] arm64: Increase the payload memory region size to 512MB
  2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
                   ` (2 preceding siblings ...)
  2024-11-28 15:12 ` [PATCH kvmtool 3/4] arm64: Use the kernel header image_size when loading into memory Alexandru Elisei
@ 2024-11-28 15:12 ` Alexandru Elisei
  2024-12-11 22:59 ` [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Will Deacon
  2024-12-11 23:44 ` Will Deacon
  5 siblings, 0 replies; 9+ messages in thread
From: Alexandru Elisei @ 2024-11-28 15:12 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm
  Cc: maz, oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

kvmtool uses the same memory map for 64bit and 32bit guests, where it
copies the kernel, the initrd and DTB in the bottom 256MB.

The restriction on placing everything in the bottom 256MB comes from the
aarch32 boot protocol, where the kernel, DTB and initrd must be placed in a
region covered by the low-memory mapping. The size of the lowmem region
varies based on the kernel-userspace split, which is a Kconfig option, and
on the vmalloc area size, which can be specified by the user on the kernel
command line. Hence kvmtool's choice of using the bottom 256MB as a
reasonable compromise which has worked well so far.

Sina has reported in private that they were unable to create a 64bit
virtual machine with a 351MB initrd, and that's due to the 256MB
restriction placed on the arm64 payload layout.

This restriction is not found in the arm64 boot protocol: booting.rst in
the Linux v6.12 source tree specifies that the kernel and initrd must be
placed in the same 32GB window. There is also a mention of kernels prior
to v4.2 requiring the DTB to be placed within a 512MB region starting at
the kernel image minus the kernel offset.

Increase the payload region size to 512MB for arm64, which will provide
maximum compatibility with Linux guests, while allowing for larger initrds
or kernel images. This means that the gap between the DTB (or initrd, if
present) and the kernel is larger now.

For 32 bit guests, the payload region size has been kept unchanged, because
it has proven adequate so far.

Reported-by: Abdollahi Sina <s.abdollahi22@imperial.ac.uk>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/aarch32/include/kvm/kvm-arch.h | 5 +++--
 arm/aarch64/include/kvm/kvm-arch.h | 2 ++
 arm/aarch64/kvm.c                  | 8 ++++++++
 arm/kvm.c                          | 6 ++++--
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arm/aarch32/include/kvm/kvm-arch.h b/arm/aarch32/include/kvm/kvm-arch.h
index 07d711e2f4c1..0333cf4355ac 100644
--- a/arm/aarch32/include/kvm/kvm-arch.h
+++ b/arm/aarch32/include/kvm/kvm-arch.h
@@ -3,8 +3,9 @@
 
 #include <linux/sizes.h>
 
-#define kvm__arch_get_kern_offset(...)	0x8000
-#define kvm__arch_get_kernel_size(...)	0
+#define kvm__arch_get_kern_offset(...)		0x8000
+#define kvm__arch_get_kernel_size(...)		0
+#define kvm__arch_get_payload_region_size(...)	SZ_256M
 
 struct kvm;
 static inline void kvm__arch_read_kernel_header(struct kvm *kvm, int fd) {}
diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h
index 97ab42485158..2d1a4ed8cea4 100644
--- a/arm/aarch64/include/kvm/kvm-arch.h
+++ b/arm/aarch64/include/kvm/kvm-arch.h
@@ -8,6 +8,8 @@ void kvm__arch_read_kernel_header(struct kvm *kvm, int fd);
 unsigned long long kvm__arch_get_kern_offset(struct kvm *kvm);
 u64 kvm__arch_get_kernel_size(struct kvm *kvm);
 
+u64 kvm__arch_get_payload_region_size(struct kvm *kvm);
+
 int kvm__arch_get_ipa_limit(struct kvm *kvm);
 void kvm__arch_enable_mte(struct kvm *kvm);
 
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 6fcc828cbe01..98b24375ee98 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -135,6 +135,14 @@ u64 kvm__arch_get_kernel_size(struct kvm *kvm)
 	return le64_to_cpu(kernel_header->image_size);
 }
 
+u64 kvm__arch_get_payload_region_size(struct kvm *kvm)
+{
+	if (kvm->cfg.arch.aarch32_guest)
+		return SZ_256M;
+
+	return SZ_512M;
+}
+
 int kvm__arch_get_ipa_limit(struct kvm *kvm)
 {
 	int ret;
diff --git a/arm/kvm.c b/arm/kvm.c
index 9013be489aff..7b2b49e21498 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -103,14 +103,16 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 {
 	void *pos, *kernel_end, *limit;
 	unsigned long guest_addr;
+	u64 payload_region_size;
 	ssize_t file_size;
 	u64 kernel_size;
 
+	payload_region_size = kvm__arch_get_payload_region_size(kvm);
 	/*
-	 * Linux requires the initrd and dtb to be mapped inside lowmem,
+	 * Linux for arm requires the initrd and dtb to be mapped inside lowmem,
 	 * so we can't just place them at the top of memory.
 	 */
-	limit = kvm->ram_start + min(kvm->ram_size, (u64)SZ_256M);
+	limit = kvm->ram_start + min(kvm->ram_size, payload_region_size);
 
 	kvm__arch_read_kernel_header(kvm, fd_kernel);
 
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH kvmtool 2/4] arm: Check return value for host_to_guest_flat()
  2024-11-28 15:12 ` [PATCH kvmtool 2/4] arm: Check return value for host_to_guest_flat() Alexandru Elisei
@ 2024-11-28 16:06   ` Andrew Jones
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Jones @ 2024-11-28 16:06 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: will, julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm, maz,
	oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

On Thu, Nov 28, 2024 at 03:12:44PM +0000, Alexandru Elisei wrote:
> kvmtool, on arm and arm64, puts the kernel, DTB and initrd (if present) in
> a 256MB memory region that starts at the bottom of RAM.
> kvm__arch_load_kernel_image() copies the kernel at the start of RAM, the
> DTB is placed at the top of the region, and immediately below it the
> initrd.
> 
> When the initrd is specified by the user, kvmtool checks that it doesn't
> overlap with the kernel by computing the start address in the host's
> address space:
> 
> 	fstat(fd_initrd, &sb);
> 	pos = pos - (sb.st_size + INITRD_ALIGN);
> 	guest_addr = ALIGN(host_to_guest_flat(kvm, pos), INITRD_ALIGN); (a)
> 	pos = guest_flat_to_host(kvm, guest_addr); (b)
> 
> If the initrd is large enough to completely overwrite the kernel and start
> below the guest RAM (pos < kvm->ram_start), then kvmtool will omit the
> following errors:
> 
>   Warning: unable to translate host address 0xfffe849ffffc to guest (1)
>   Warning: unable to translate guest address 0x0 to host (2)
>   Fatal: initrd overlaps with kernel image. (3)
> 
> (1) is because (a) calls host_to_guest_flat(kvm, pos) with a 'pos'
> outside any of the memslots.
> 
> (2) is because guest_flat_to_host() is called at (b) with guest_addr=0,
> which is what host_to_guest_flat() returns if the supplied address is not
> found in any of the memslots. This warning is eliminated by this patch.
> 
> And finally, (3) is the most useful message, because it tells the user what
> the error is.
> 
> The issue is a more general pattern in kvm__arch_load_kernel_image():
> kvmtool doesn't check if host_to_guest_flat() returns 0, which means that
> the host address is not within any of the memslots. Add a check for that,
> which will at the very least remove the second warning.
> 
> This also fixes the following edge cases:
> 
> 1. The same warnings being emitted in a similar scenario with the DTB, when
> the RAM is smaller than FDT_MAX_SIZE + FDT_ALIGN.
> 
> 2. When copying the kernel, if the RAM size is smaller than the kernel
> offset, the start of the kernel (represented by the variable 'pos') will be
> outside the VA space allocated for the guest RAM.  limit - pos will wrap
> around, because gcc 14.1.1 wraps the pointers (void pointer arithmetic is
> undefined in C99). Then read_file()->..->read() will return -EFAULT because
> the destination address is unallocated (as per man 2 read, also reproduced
> during testing).
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/kvm.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arm/kvm.c b/arm/kvm.c
> index da0430c40c36..4beae69e1fb3 100644
> --- a/arm/kvm.c
> +++ b/arm/kvm.c
> @@ -113,6 +113,8 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>  
>  	pos = kvm->ram_start + kvm__arch_get_kern_offset(kvm, fd_kernel);
>  	kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
> +	if (!kvm->arch.kern_guest_start)
> +			die("guest memory too small to contain the kernel");

Just doing a quick drive-by and noticed this indentation issue.

Thanks,
drew

>  	file_size = read_file(fd_kernel, pos, limit - pos);
>  	if (file_size < 0) {
>  		if (errno == ENOMEM)
> @@ -131,7 +133,10 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>  	 */
>  	pos = limit;
>  	pos -= (FDT_MAX_SIZE + FDT_ALIGN);
> -	guest_addr = ALIGN(host_to_guest_flat(kvm, pos), FDT_ALIGN);
> +	guest_addr = host_to_guest_flat(kvm, pos);
> +	if (!guest_addr)
> +		die("fdt too big to contain in guest memory");
> +	guest_addr = ALIGN(guest_addr, FDT_ALIGN);
>  	pos = guest_flat_to_host(kvm, guest_addr);
>  	if (pos < kernel_end)
>  		die("fdt overlaps with kernel image.");
> @@ -151,7 +156,10 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>  			die_perror("fstat");
>  
>  		pos -= (sb.st_size + INITRD_ALIGN);
> -		guest_addr = ALIGN(host_to_guest_flat(kvm, pos), INITRD_ALIGN);
> +		guest_addr = host_to_guest_flat(kvm, pos);
> +		if (!guest_addr)
> +			die("initrd too big to fit in the payload memory region");
> +		guest_addr = ALIGN(guest_addr, INITRD_ALIGN);
>  		pos = guest_flat_to_host(kvm, guest_addr);
>  		if (pos < kernel_end)
>  			die("initrd overlaps with kernel image.");
> -- 
> 2.47.0
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change
  2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
                   ` (3 preceding siblings ...)
  2024-11-28 15:12 ` [RFC PATCH kvmtool 4/4] arm64: Increase the payload memory region size to 512MB Alexandru Elisei
@ 2024-12-11 22:59 ` Will Deacon
  2024-12-11 23:44 ` Will Deacon
  5 siblings, 0 replies; 9+ messages in thread
From: Will Deacon @ 2024-12-11 22:59 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm, maz,
	oliver.upton, apatel, andre.przywara, suzuki.poulose,
	s.abdollahi22

On Thu, Nov 28, 2024 at 03:12:42PM +0000, Alexandru Elisei wrote:
> (resending because I accidently only sent the cover letter, sorry for that)
> 
> The first 3 patches are fixes to kvm__arch_load_kernel_image(). I've CC'ed
> the riscv maintainer because it looks to me like riscv is similarly
> affected.
> 
> Patch #4 ("arm64: Increase the payload memory region size to 512MB") might
> be controversial. Follows a bug report I received from Abdollahi Sina in
> private. Details in the commit message, but the gist of the patch is that
> the memory region where kernel + initrd + DTB are copied to are changed
> from 256MB to 512MB.  As a result, the DTB and initrd are moved from below
> ram_start + 256MB to ram_start + 512MB to accomodate a larger initrd.  If
> users rely on finding the DTB and initrd at the current addresses, then I'm
> not sure the patch is justified - after all, if someone really wants to use
> such a large initrd instead of a disk image with virtio, then replacing
> SZ_256M with SZ_512M locally doesn't look like a big ask.
> 
> On the other hand, if there are no users that rely on the current payload
> layout, increasing the memory region size to 512MB to allow for more
> unusual use cases, while still maintaining compatibility with older
> kernels, doesn't seem unreasonable to me.
> 
> Please comment, I don't feel strongly either way - I'll happy drop the last
> patch if there are objections.

Well, I won't reject an imperial.ac.uk patch on a hunch, so let's go for
the whole lot and see if anybody complains!

I'll fix the indentation issue spotted by Drew when applying (thanks!).

Will


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change
  2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
                   ` (4 preceding siblings ...)
  2024-12-11 22:59 ` [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Will Deacon
@ 2024-12-11 23:44 ` Will Deacon
  2024-12-12  9:41   ` Will Deacon
  5 siblings, 1 reply; 9+ messages in thread
From: Will Deacon @ 2024-12-11 23:44 UTC (permalink / raw)
  To: julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm,
	Alexandru Elisei
  Cc: catalin.marinas, kernel-team, Will Deacon, maz, oliver.upton,
	apatel, andre.przywara, suzuki.poulose, s.abdollahi22

On Thu, 28 Nov 2024 15:12:42 +0000, Alexandru Elisei wrote:
> (resending because I accidently only sent the cover letter, sorry for that)
> 
> The first 3 patches are fixes to kvm__arch_load_kernel_image(). I've CC'ed
> the riscv maintainer because it looks to me like riscv is similarly
> affected.
> 
> Patch #4 ("arm64: Increase the payload memory region size to 512MB") might
> be controversial. Follows a bug report I received from Abdollahi Sina in
> private. Details in the commit message, but the gist of the patch is that
> the memory region where kernel + initrd + DTB are copied to are changed
> from 256MB to 512MB.  As a result, the DTB and initrd are moved from below
> ram_start + 256MB to ram_start + 512MB to accomodate a larger initrd.  If
> users rely on finding the DTB and initrd at the current addresses, then I'm
> not sure the patch is justified - after all, if someone really wants to use
> such a large initrd instead of a disk image with virtio, then replacing
> SZ_256M with SZ_512M locally doesn't look like a big ask.
> 
> [...]

Applied to arm64 (sina), thanks!

[1/4] arm: Fix off-by-one errors when computing payload memory layout
      https://git.kernel.org/arm64/c/167aa1e
[2/4] arm: Check return value for host_to_guest_flat()
      https://git.kernel.org/arm64/c/ca57fb6
[3/4] arm64: Use the kernel header image_size when loading into memory
      https://git.kernel.org/arm64/c/32345de
[4/4] arm64: Increase the payload memory region size to 512MB
      https://git.kernel.org/arm64/c/9b26a8e

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change
  2024-12-11 23:44 ` Will Deacon
@ 2024-12-12  9:41   ` Will Deacon
  0 siblings, 0 replies; 9+ messages in thread
From: Will Deacon @ 2024-12-12  9:41 UTC (permalink / raw)
  To: julien.thierry.kdev, kvm, linux-arm-kernel, kvmarm,
	Alexandru Elisei
  Cc: catalin.marinas, kernel-team, maz, oliver.upton, apatel,
	andre.przywara, suzuki.poulose, s.abdollahi22

On Wed, Dec 11, 2024 at 11:44:14PM +0000, Will Deacon wrote:
> On Thu, 28 Nov 2024 15:12:42 +0000, Alexandru Elisei wrote:
> > (resending because I accidently only sent the cover letter, sorry for that)
> > 
> > The first 3 patches are fixes to kvm__arch_load_kernel_image(). I've CC'ed
> > the riscv maintainer because it looks to me like riscv is similarly
> > affected.
> > 
> > Patch #4 ("arm64: Increase the payload memory region size to 512MB") might
> > be controversial. Follows a bug report I received from Abdollahi Sina in
> > private. Details in the commit message, but the gist of the patch is that
> > the memory region where kernel + initrd + DTB are copied to are changed
> > from 256MB to 512MB.  As a result, the DTB and initrd are moved from below
> > ram_start + 256MB to ram_start + 512MB to accomodate a larger initrd.  If
> > users rely on finding the DTB and initrd at the current addresses, then I'm
> > not sure the patch is justified - after all, if someone really wants to use
> > such a large initrd instead of a disk image with virtio, then replacing
> > SZ_256M with SZ_512M locally doesn't look like a big ask.
> > 
> > [...]
> 
> Applied to arm64 (sina), thanks!
> 
> [1/4] arm: Fix off-by-one errors when computing payload memory layout
>       https://git.kernel.org/arm64/c/167aa1e
> [2/4] arm: Check return value for host_to_guest_flat()
>       https://git.kernel.org/arm64/c/ca57fb6
> [3/4] arm64: Use the kernel header image_size when loading into memory
>       https://git.kernel.org/arm64/c/32345de
> [4/4] arm64: Increase the payload memory region size to 512MB
>       https://git.kernel.org/arm64/c/9b26a8e

Sorry, these links are all broken as I generated the thankyou note
against a testing branch. Rest assured that the changes are pushed to
the kvmtool.git:

https://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git/

Will


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-12-12  9:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-28 15:12 [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Alexandru Elisei
2024-11-28 15:12 ` [PATCH kvmtool 1/4] arm: Fix off-by-one errors when computing payload memory layout Alexandru Elisei
2024-11-28 15:12 ` [PATCH kvmtool 2/4] arm: Check return value for host_to_guest_flat() Alexandru Elisei
2024-11-28 16:06   ` Andrew Jones
2024-11-28 15:12 ` [PATCH kvmtool 3/4] arm64: Use the kernel header image_size when loading into memory Alexandru Elisei
2024-11-28 15:12 ` [RFC PATCH kvmtool 4/4] arm64: Increase the payload memory region size to 512MB Alexandru Elisei
2024-12-11 22:59 ` [PATCH RESEND kvmtool 0/4] arm: Payload memory layout change Will Deacon
2024-12-11 23:44 ` Will Deacon
2024-12-12  9:41   ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox