* [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
@ 2015-09-21 11:50 Igor Mammedov
2015-09-21 12:22 ` Paolo Bonzini
2015-09-21 14:58 ` Eduardo Habkost
0 siblings, 2 replies; 11+ messages in thread
From: Igor Mammedov @ 2015-09-21 11:50 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, ehabkost, mst
it's attempt to workaround virtio bug reported earlier:
http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
where virtio can't handle buffer that crosses border
between 2 DIMM's (i.e. 2 MemoryRegions).
Testing showed that virtio doesn't hit above bug
with 128Mb DIMM's granularity. Also linux memory
hotplug can handle hotplugged memory starting with
128Mb memory sections so lets rise minimum size limit
to 128Mb and align starting DIMM address on 128Mb.
It's certainly not the fix but it reduces risk of
crashing VM till virtio is fixed.
It also could be improved in guest's virtio side if it
would align buffers on 128Mb border and limit max buffer
size to the same value.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
Based on PCI tree as it has patches that add
2.5 machine type.
---
hw/i386/pc.c | 8 +++++---
hw/i386/pc_piix.c | 12 ++++++++++--
hw/i386/pc_q35.c | 12 ++++++++++--
include/hw/i386/pc.h | 5 ++---
4 files changed, 27 insertions(+), 10 deletions(-)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b5107f7..ddb6710 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1645,8 +1645,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
MemoryRegion *mr = ddc->get_memory_region(dimm);
uint64_t align = TARGET_PAGE_SIZE;
- if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
- align = memory_region_get_alignment(mr);
+ if (pcms->enforce_aligned_dimm) {
+ align = MAX(memory_region_get_alignment(mr),
+ pcms->enforce_aligned_dimm);
}
if (!pcms->acpi_dev) {
@@ -1936,7 +1937,8 @@ static void pc_machine_initfn(Object *obj)
"Enable vmport (pc & q35)",
&error_abort);
- pcms->enforce_aligned_dimm = true;
+ /* align DIMM starting address/size by 128Mb */
+ pcms->enforce_aligned_dimm = 1ULL << 27;
object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
pc_machine_get_aligned_dimm,
NULL, &error_abort);
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index caa4edc..7671905 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -301,9 +301,17 @@ static void pc_init1(MachineState *machine,
}
}
+static void pc_compat_2_4(MachineState *machine)
+{
+ PCMachineState *pcms = PC_MACHINE(machine);
+
+ pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
+}
+
static void pc_compat_2_3(MachineState *machine)
{
PCMachineState *pcms = PC_MACHINE(machine);
+ pc_compat_2_4(machine);
savevm_skip_section_footers();
if (kvm_enabled()) {
pcms->smm = ON_OFF_AUTO_OFF;
@@ -326,7 +334,7 @@ static void pc_compat_2_1(MachineState *machine)
pc_compat_2_2(machine);
smbios_uuid_encoded = false;
x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
- pcms->enforce_aligned_dimm = false;
+ pcms->enforce_aligned_dimm = 0;
}
static void pc_compat_2_0(MachineState *machine)
@@ -485,7 +493,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
}
-DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", NULL,
+DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", pc_compat_2_3,
pc_i440fx_2_4_machine_options)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 506b6bf..72b479f 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -284,9 +284,17 @@ static void pc_q35_init(MachineState *machine)
}
}
+static void pc_compat_2_4(MachineState *machine)
+{
+ PCMachineState *pcms = PC_MACHINE(machine);
+
+ pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
+}
+
static void pc_compat_2_3(MachineState *machine)
{
PCMachineState *pcms = PC_MACHINE(machine);
+ pc_compat_2_4(machine);
savevm_skip_section_footers();
if (kvm_enabled()) {
pcms->smm = ON_OFF_AUTO_OFF;
@@ -307,7 +315,7 @@ static void pc_compat_2_1(MachineState *machine)
PCMachineState *pcms = PC_MACHINE(machine);
pc_compat_2_2(machine);
- pcms->enforce_aligned_dimm = false;
+ pcms->enforce_aligned_dimm = 0;
smbios_uuid_encoded = false;
x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
}
@@ -388,7 +396,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
}
-DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
+DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", pc_compat_2_4,
pc_q35_2_4_machine_options);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 6896328..fdcf0ec 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -23,8 +23,7 @@
/**
* PCMachineState:
* @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
- * @enforce_aligned_dimm: check that DIMM's address/size is aligned by
- * backend's alignment value if provided
+ * @enforce_aligned_dimm: minimal DIMM's address/size alignment
*/
struct PCMachineState {
/*< private >*/
@@ -37,9 +36,9 @@ struct PCMachineState {
ISADevice *rtc;
uint64_t max_ram_below_4g;
+ uint64_t enforce_aligned_dimm;
OnOffAuto vmport;
OnOffAuto smm;
- bool enforce_aligned_dimm;
ram_addr_t below_4g_mem_size, above_4g_mem_size;
};
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 11:50 [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb Igor Mammedov
@ 2015-09-21 12:22 ` Paolo Bonzini
2015-09-21 13:05 ` Igor Mammedov
2015-09-23 9:36 ` Igor Mammedov
2015-09-21 14:58 ` Eduardo Habkost
1 sibling, 2 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-09-21 12:22 UTC (permalink / raw)
To: Igor Mammedov, qemu-devel; +Cc: ehabkost, mst
On 21/09/2015 13:50, Igor Mammedov wrote:
> it's attempt to workaround virtio bug reported earlier:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> where virtio can't handle buffer that crosses border
> between 2 DIMM's (i.e. 2 MemoryRegions).
>
> Testing showed that virtio doesn't hit above bug
> with 128Mb DIMM's granularity. Also linux memory
> hotplug can handle hotplugged memory starting with
> 128Mb memory sections so lets rise minimum size limit
> to 128Mb and align starting DIMM address on 128Mb.
>
> It's certainly not the fix but it reduces risk of
> crashing VM till virtio is fixed.
> It also could be improved in guest's virtio side if it
> would align buffers on 128Mb border and limit max buffer
> size to the same value.
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
This seems to be easily handled at a level above QEMU---and the fix
would be available to older machine types as well. This patch would
also make it quite a bit harder to test the real fix with QEMU. It is
not alone a reason to NACK it but should also be kept in mind.
Aligning to 4K makes some sense, since 4K is the page size, but
enforcing an arbitrary alignment above 4K is policy that does not belong
in QEMU.
To some extend, enforcing natural alignment would be okay as a
workaround for the virtio bug as well. It would also make it easier to
ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
guest. Does it make sense to you?
Paolo
> ---
> Based on PCI tree as it has patches that add
> 2.5 machine type.
> ---
> hw/i386/pc.c | 8 +++++---
> hw/i386/pc_piix.c | 12 ++++++++++--
> hw/i386/pc_q35.c | 12 ++++++++++--
> include/hw/i386/pc.h | 5 ++---
> 4 files changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index b5107f7..ddb6710 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1645,8 +1645,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
> MemoryRegion *mr = ddc->get_memory_region(dimm);
> uint64_t align = TARGET_PAGE_SIZE;
>
> - if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
> - align = memory_region_get_alignment(mr);
> + if (pcms->enforce_aligned_dimm) {
> + align = MAX(memory_region_get_alignment(mr),
> + pcms->enforce_aligned_dimm);
> }
>
> if (!pcms->acpi_dev) {
> @@ -1936,7 +1937,8 @@ static void pc_machine_initfn(Object *obj)
> "Enable vmport (pc & q35)",
> &error_abort);
>
> - pcms->enforce_aligned_dimm = true;
> + /* align DIMM starting address/size by 128Mb */
> + pcms->enforce_aligned_dimm = 1ULL << 27;
> object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
> pc_machine_get_aligned_dimm,
> NULL, &error_abort);
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index caa4edc..7671905 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -301,9 +301,17 @@ static void pc_init1(MachineState *machine,
> }
> }
>
> +static void pc_compat_2_4(MachineState *machine)
> +{
> + PCMachineState *pcms = PC_MACHINE(machine);
> +
> + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> +}
> +
> static void pc_compat_2_3(MachineState *machine)
> {
> PCMachineState *pcms = PC_MACHINE(machine);
> + pc_compat_2_4(machine);
> savevm_skip_section_footers();
> if (kvm_enabled()) {
> pcms->smm = ON_OFF_AUTO_OFF;
> @@ -326,7 +334,7 @@ static void pc_compat_2_1(MachineState *machine)
> pc_compat_2_2(machine);
> smbios_uuid_encoded = false;
> x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> - pcms->enforce_aligned_dimm = false;
> + pcms->enforce_aligned_dimm = 0;
> }
>
> static void pc_compat_2_0(MachineState *machine)
> @@ -485,7 +493,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
> SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> }
>
> -DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", NULL,
> +DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", pc_compat_2_3,
> pc_i440fx_2_4_machine_options)
>
>
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 506b6bf..72b479f 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -284,9 +284,17 @@ static void pc_q35_init(MachineState *machine)
> }
> }
>
> +static void pc_compat_2_4(MachineState *machine)
> +{
> + PCMachineState *pcms = PC_MACHINE(machine);
> +
> + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> +}
> +
> static void pc_compat_2_3(MachineState *machine)
> {
> PCMachineState *pcms = PC_MACHINE(machine);
> + pc_compat_2_4(machine);
> savevm_skip_section_footers();
> if (kvm_enabled()) {
> pcms->smm = ON_OFF_AUTO_OFF;
> @@ -307,7 +315,7 @@ static void pc_compat_2_1(MachineState *machine)
> PCMachineState *pcms = PC_MACHINE(machine);
>
> pc_compat_2_2(machine);
> - pcms->enforce_aligned_dimm = false;
> + pcms->enforce_aligned_dimm = 0;
> smbios_uuid_encoded = false;
> x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> }
> @@ -388,7 +396,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> }
>
> -DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
> +DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", pc_compat_2_4,
> pc_q35_2_4_machine_options);
>
>
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 6896328..fdcf0ec 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -23,8 +23,7 @@
> /**
> * PCMachineState:
> * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
> - * @enforce_aligned_dimm: check that DIMM's address/size is aligned by
> - * backend's alignment value if provided
> + * @enforce_aligned_dimm: minimal DIMM's address/size alignment
> */
> struct PCMachineState {
> /*< private >*/
> @@ -37,9 +36,9 @@ struct PCMachineState {
> ISADevice *rtc;
>
> uint64_t max_ram_below_4g;
> + uint64_t enforce_aligned_dimm;
> OnOffAuto vmport;
> OnOffAuto smm;
> - bool enforce_aligned_dimm;
> ram_addr_t below_4g_mem_size, above_4g_mem_size;
> };
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 12:22 ` Paolo Bonzini
@ 2015-09-21 13:05 ` Igor Mammedov
2015-09-21 13:13 ` Paolo Bonzini
2015-09-23 9:36 ` Igor Mammedov
1 sibling, 1 reply; 11+ messages in thread
From: Igor Mammedov @ 2015-09-21 13:05 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: mst, qemu-devel, ehabkost
On Mon, 21 Sep 2015 14:22:23 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 21/09/2015 13:50, Igor Mammedov wrote:
> > it's attempt to workaround virtio bug reported earlier:
> > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > where virtio can't handle buffer that crosses border
> > between 2 DIMM's (i.e. 2 MemoryRegions).
> >
> > Testing showed that virtio doesn't hit above bug
> > with 128Mb DIMM's granularity. Also linux memory
> > hotplug can handle hotplugged memory starting with
> > 128Mb memory sections so lets rise minimum size limit
> > to 128Mb and align starting DIMM address on 128Mb.
> >
> > It's certainly not the fix but it reduces risk of
> > crashing VM till virtio is fixed.
> > It also could be improved in guest's virtio side if it
> > would align buffers on 128Mb border and limit max buffer
> > size to the same value.
> >
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>
> This seems to be easily handled at a level above QEMU---and the fix
> would be available to older machine types as well. This patch would
> also make it quite a bit harder to test the real fix with QEMU. It is
> not alone a reason to NACK it but should also be kept in mind.
>
> Aligning to 4K makes some sense, since 4K is the page size, but
> enforcing an arbitrary alignment above 4K is policy that does not belong
> in QEMU.
>
> To some extend, enforcing natural alignment would be okay as a
> workaround for the virtio bug as well. It would also make it easier to
> ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
> guest. Does it make sense to you?
in current machine types we already enforce backend-s address/size alignment,
which is file's page size for hugetlbfs-backed memory and 2Mb for RAM backend.
So I guess we could try to apply workaround to virtio on guest side,
aligning and limiting max buffer size to 2Mb, it should work for 'old'
machine types as well.
>
> Paolo
>
> > ---
> > Based on PCI tree as it has patches that add
> > 2.5 machine type.
> > ---
> > hw/i386/pc.c | 8 +++++---
> > hw/i386/pc_piix.c | 12 ++++++++++--
> > hw/i386/pc_q35.c | 12 ++++++++++--
> > include/hw/i386/pc.h | 5 ++---
> > 4 files changed, 27 insertions(+), 10 deletions(-)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index b5107f7..ddb6710 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1645,8 +1645,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
> > MemoryRegion *mr = ddc->get_memory_region(dimm);
> > uint64_t align = TARGET_PAGE_SIZE;
> >
> > - if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
> > - align = memory_region_get_alignment(mr);
> > + if (pcms->enforce_aligned_dimm) {
> > + align = MAX(memory_region_get_alignment(mr),
> > + pcms->enforce_aligned_dimm);
> > }
> >
> > if (!pcms->acpi_dev) {
> > @@ -1936,7 +1937,8 @@ static void pc_machine_initfn(Object *obj)
> > "Enable vmport (pc & q35)",
> > &error_abort);
> >
> > - pcms->enforce_aligned_dimm = true;
> > + /* align DIMM starting address/size by 128Mb */
> > + pcms->enforce_aligned_dimm = 1ULL << 27;
> > object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
> > pc_machine_get_aligned_dimm,
> > NULL, &error_abort);
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index caa4edc..7671905 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -301,9 +301,17 @@ static void pc_init1(MachineState *machine,
> > }
> > }
> >
> > +static void pc_compat_2_4(MachineState *machine)
> > +{
> > + PCMachineState *pcms = PC_MACHINE(machine);
> > +
> > + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> > +}
> > +
> > static void pc_compat_2_3(MachineState *machine)
> > {
> > PCMachineState *pcms = PC_MACHINE(machine);
> > + pc_compat_2_4(machine);
> > savevm_skip_section_footers();
> > if (kvm_enabled()) {
> > pcms->smm = ON_OFF_AUTO_OFF;
> > @@ -326,7 +334,7 @@ static void pc_compat_2_1(MachineState *machine)
> > pc_compat_2_2(machine);
> > smbios_uuid_encoded = false;
> > x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> > - pcms->enforce_aligned_dimm = false;
> > + pcms->enforce_aligned_dimm = 0;
> > }
> >
> > static void pc_compat_2_0(MachineState *machine)
> > @@ -485,7 +493,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
> > SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> > }
> >
> > -DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", NULL,
> > +DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", pc_compat_2_3,
> > pc_i440fx_2_4_machine_options)
> >
> >
> > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > index 506b6bf..72b479f 100644
> > --- a/hw/i386/pc_q35.c
> > +++ b/hw/i386/pc_q35.c
> > @@ -284,9 +284,17 @@ static void pc_q35_init(MachineState *machine)
> > }
> > }
> >
> > +static void pc_compat_2_4(MachineState *machine)
> > +{
> > + PCMachineState *pcms = PC_MACHINE(machine);
> > +
> > + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> > +}
> > +
> > static void pc_compat_2_3(MachineState *machine)
> > {
> > PCMachineState *pcms = PC_MACHINE(machine);
> > + pc_compat_2_4(machine);
> > savevm_skip_section_footers();
> > if (kvm_enabled()) {
> > pcms->smm = ON_OFF_AUTO_OFF;
> > @@ -307,7 +315,7 @@ static void pc_compat_2_1(MachineState *machine)
> > PCMachineState *pcms = PC_MACHINE(machine);
> >
> > pc_compat_2_2(machine);
> > - pcms->enforce_aligned_dimm = false;
> > + pcms->enforce_aligned_dimm = 0;
> > smbios_uuid_encoded = false;
> > x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> > }
> > @@ -388,7 +396,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> > SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> > }
> >
> > -DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
> > +DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", pc_compat_2_4,
> > pc_q35_2_4_machine_options);
> >
> >
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 6896328..fdcf0ec 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -23,8 +23,7 @@
> > /**
> > * PCMachineState:
> > * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
> > - * @enforce_aligned_dimm: check that DIMM's address/size is aligned by
> > - * backend's alignment value if provided
> > + * @enforce_aligned_dimm: minimal DIMM's address/size alignment
> > */
> > struct PCMachineState {
> > /*< private >*/
> > @@ -37,9 +36,9 @@ struct PCMachineState {
> > ISADevice *rtc;
> >
> > uint64_t max_ram_below_4g;
> > + uint64_t enforce_aligned_dimm;
> > OnOffAuto vmport;
> > OnOffAuto smm;
> > - bool enforce_aligned_dimm;
> > ram_addr_t below_4g_mem_size, above_4g_mem_size;
> > };
> >
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 13:05 ` Igor Mammedov
@ 2015-09-21 13:13 ` Paolo Bonzini
2015-09-21 13:32 ` Igor Mammedov
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2015-09-21 13:13 UTC (permalink / raw)
To: Igor Mammedov; +Cc: mst, qemu-devel, ehabkost
On 21/09/2015 15:05, Igor Mammedov wrote:
>> > To some extend, enforcing natural alignment would be okay as a
>> > workaround for the virtio bug as well. It would also make it easier to
>> > ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
>> > guest. Does it make sense to you?
> in current machine types we already enforce backend-s address/size alignment,
> which is file's page size for hugetlbfs-backed memory and 2Mb for RAM backend.
Right, but it's not enough if the guest's physical address is not
aligned to 2Mb/1Gb too. This is why we changed i440FX and q35 to have
only 3 and 2 gigabytes of low memory (down from 3.5 and 2 IIRC).
> So I guess we could try to apply workaround to virtio on guest side,
> aligning and limiting max buffer size to 2Mb, it should work for 'old'
> machine types as well.
That would make sense and it would be complementary to natural alignment
of DIMMs in the host. This would give:
host guest
old old fails
old new works (virtio workaround)
new old works (natural alignment)
new new works (choose your favorite workaround)
Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 13:13 ` Paolo Bonzini
@ 2015-09-21 13:32 ` Igor Mammedov
2015-09-21 13:38 ` Paolo Bonzini
0 siblings, 1 reply; 11+ messages in thread
From: Igor Mammedov @ 2015-09-21 13:32 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: mst, qemu-devel, ehabkost
On Mon, 21 Sep 2015 15:13:17 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 21/09/2015 15:05, Igor Mammedov wrote:
> >> > To some extend, enforcing natural alignment would be okay as a
> >> > workaround for the virtio bug as well. It would also make it easier to
> >> > ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
> >> > guest. Does it make sense to you?
> > in current machine types we already enforce backend-s address/size alignment,
> > which is file's page size for hugetlbfs-backed memory and 2Mb for RAM backend.
>
> Right, but it's not enough if the guest's physical address is not
> aligned to 2Mb/1Gb too. This is why we changed i440FX and q35 to have
> only 3 and 2 gigabytes of low memory (down from 3.5 and 2 IIRC).
DIMM's GPA is aligned to backend's alignment since 2.2,
it should be aligned 2Mb/1Gb depending on what hugetlbfs file is used
so that already works as expected
>
> > So I guess we could try to apply workaround to virtio on guest side,
> > aligning and limiting max buffer size to 2Mb, it should work for 'old'
> > machine types as well.
>
> That would make sense and it would be complementary to natural alignment
> of DIMMs in the host. This would give:
>
> host guest
> old old fails
> old new works (virtio workaround)
> new old works (natural alignment)
+ not sure if it would work,
I've though that virtio refactoring, that drops requirement
for buffer to be inside of only one MemoryRegion, would
touch virtio in QEMU and on guest side too.
> new new works (choose your favorite workaround)
>
> Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 13:32 ` Igor Mammedov
@ 2015-09-21 13:38 ` Paolo Bonzini
0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-09-21 13:38 UTC (permalink / raw)
To: Igor Mammedov; +Cc: mst, qemu-devel, ehabkost
On 21/09/2015 15:32, Igor Mammedov wrote:
> On Mon, 21 Sep 2015 15:13:17 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>>
>>
>> On 21/09/2015 15:05, Igor Mammedov wrote:
>>>>> To some extend, enforcing natural alignment would be okay as a
>>>>> workaround for the virtio bug as well. It would also make it easier to
>>>>> ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
>>>>> guest. Does it make sense to you?
>>> in current machine types we already enforce backend-s address/size alignment,
>>> which is file's page size for hugetlbfs-backed memory and 2Mb for RAM backend.
>>
>> Right, but it's not enough if the guest's physical address is not
>> aligned to 2Mb/1Gb too. This is why we changed i440FX and q35 to have
>> only 3 and 2 gigabytes of low memory (down from 3.5 and 2 IIRC).
>
> DIMM's GPA is aligned to backend's alignment since 2.2,
> it should be aligned 2Mb/1Gb depending on what hugetlbfs file is used
> so that already works as expected
Oh, ok. This is what pcms->enforce_aligned_dimm does when true, and
this is also what I was missing. Great!
>>> So I guess we could try to apply workaround to virtio on guest side,
>>> aligning and limiting max buffer size to 2Mb, it should work for 'old'
>>> machine types as well.
>>
>> That would make sense and it would be complementary to natural alignment
>> of DIMMs in the host. This would give:
>>
>> host guest
>> old old fails
>> old new works (virtio workaround)
>> new old works (natural alignment)
> + not sure if it would work,
> I've though that virtio refactoring, that drops requirement
> for buffer to be inside of only one MemoryRegion, would
> touch virtio in QEMU and on guest side too.
Right, it would theoretically not be enough. However, I think it would
work in practice because hot-plugged DIMMs should be bigger than 128 MiB.
Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 11:50 [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb Igor Mammedov
2015-09-21 12:22 ` Paolo Bonzini
@ 2015-09-21 14:58 ` Eduardo Habkost
1 sibling, 0 replies; 11+ messages in thread
From: Eduardo Habkost @ 2015-09-21 14:58 UTC (permalink / raw)
To: Igor Mammedov; +Cc: pbonzini, Marcel Apfelbaum, qemu-devel, mst
On Mon, Sep 21, 2015 at 01:50:23PM +0200, Igor Mammedov wrote:
[...]
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index caa4edc..7671905 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -301,9 +301,17 @@ static void pc_init1(MachineState *machine,
> }
> }
>
> +static void pc_compat_2_4(MachineState *machine)
> +{
> + PCMachineState *pcms = PC_MACHINE(machine);
> +
> + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> +}
> +
I want to stop creating new machine-init-time pc_compat_*() functions
because they make it harder to do any analysis or introspection of
machine class data in the future.
If enforce_aligned_dimm never changes during the lifetime of the machine
object, it can be moved to PCMachineClass. If it can change, you can
allow it to be set using PCMachineClass::default_machine_opts by
providing a property setter.
> static void pc_compat_2_3(MachineState *machine)
> {
> PCMachineState *pcms = PC_MACHINE(machine);
> + pc_compat_2_4(machine);
> savevm_skip_section_footers();
> if (kvm_enabled()) {
> pcms->smm = ON_OFF_AUTO_OFF;
> @@ -326,7 +334,7 @@ static void pc_compat_2_1(MachineState *machine)
> pc_compat_2_2(machine);
> smbios_uuid_encoded = false;
> x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> - pcms->enforce_aligned_dimm = false;
> + pcms->enforce_aligned_dimm = 0;
> }
>
> static void pc_compat_2_0(MachineState *machine)
> @@ -485,7 +493,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
> SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> }
>
> -DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", NULL,
> +DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", pc_compat_2_3,
Did you mean pc_compat_2_4?
> pc_i440fx_2_4_machine_options)
>
>
--
Eduardo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-21 12:22 ` Paolo Bonzini
2015-09-21 13:05 ` Igor Mammedov
@ 2015-09-23 9:36 ` Igor Mammedov
2015-09-23 9:38 ` Paolo Bonzini
1 sibling, 1 reply; 11+ messages in thread
From: Igor Mammedov @ 2015-09-23 9:36 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter, ehabkost, mst, qemu-devel, David, Jiri
On Mon, 21 Sep 2015 14:22:23 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 21/09/2015 13:50, Igor Mammedov wrote:
> > it's attempt to workaround virtio bug reported earlier:
> > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > where virtio can't handle buffer that crosses border
> > between 2 DIMM's (i.e. 2 MemoryRegions).
> >
> > Testing showed that virtio doesn't hit above bug
> > with 128Mb DIMM's granularity. Also linux memory
> > hotplug can handle hotplugged memory starting with
> > 128Mb memory sections so lets rise minimum size limit
> > to 128Mb and align starting DIMM address on 128Mb.
> >
> > It's certainly not the fix but it reduces risk of
> > crashing VM till virtio is fixed.
> > It also could be improved in guest's virtio side if it
> > would align buffers on 128Mb border and limit max buffer
> > size to the same value.
> >
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>
> This seems to be easily handled at a level above QEMU---and the fix
> would be available to older machine types as well. This patch would
> also make it quite a bit harder to test the real fix with QEMU. It is
> not alone a reason to NACK it but should also be kept in mind.
Patch makes it easy to change enforced alignment for future machine types,
so lowering alignment for testing isn't hard when virtio is fixed.
Handling it at libvirt level is a bit hard since currently it doesn't
deal with DIMM.addr allocation and there isn't any interface
to communicate hotplug address range to libvirt. Libvirt doesn't need
to know anything about DIMM.addr except of migration when it needs
to replicate state on target side.
Also it's QEMU bug/fault and pushing workaround to upper layers
doesn't seem right when it's much easier to do it in QEMU itself.
> Aligning to 4K makes some sense, since 4K is the page size, but
> enforcing an arbitrary alignment above 4K is policy that does not belong
> in QEMU.
>
> To some extend, enforcing natural alignment would be okay as a
> workaround for the virtio bug as well. It would also make it easier to
> ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
> guest. Does it make sense to you?
>
> Paolo
>
> > ---
> > Based on PCI tree as it has patches that add
> > 2.5 machine type.
> > ---
> > hw/i386/pc.c | 8 +++++---
> > hw/i386/pc_piix.c | 12 ++++++++++--
> > hw/i386/pc_q35.c | 12 ++++++++++--
> > include/hw/i386/pc.h | 5 ++---
> > 4 files changed, 27 insertions(+), 10 deletions(-)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index b5107f7..ddb6710 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1645,8 +1645,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
> > MemoryRegion *mr = ddc->get_memory_region(dimm);
> > uint64_t align = TARGET_PAGE_SIZE;
> >
> > - if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
> > - align = memory_region_get_alignment(mr);
> > + if (pcms->enforce_aligned_dimm) {
> > + align = MAX(memory_region_get_alignment(mr),
> > + pcms->enforce_aligned_dimm);
> > }
> >
> > if (!pcms->acpi_dev) {
> > @@ -1936,7 +1937,8 @@ static void pc_machine_initfn(Object *obj)
> > "Enable vmport (pc & q35)",
> > &error_abort);
> >
> > - pcms->enforce_aligned_dimm = true;
> > + /* align DIMM starting address/size by 128Mb */
> > + pcms->enforce_aligned_dimm = 1ULL << 27;
> > object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
> > pc_machine_get_aligned_dimm,
> > NULL, &error_abort);
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index caa4edc..7671905 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -301,9 +301,17 @@ static void pc_init1(MachineState *machine,
> > }
> > }
> >
> > +static void pc_compat_2_4(MachineState *machine)
> > +{
> > + PCMachineState *pcms = PC_MACHINE(machine);
> > +
> > + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> > +}
> > +
> > static void pc_compat_2_3(MachineState *machine)
> > {
> > PCMachineState *pcms = PC_MACHINE(machine);
> > + pc_compat_2_4(machine);
> > savevm_skip_section_footers();
> > if (kvm_enabled()) {
> > pcms->smm = ON_OFF_AUTO_OFF;
> > @@ -326,7 +334,7 @@ static void pc_compat_2_1(MachineState *machine)
> > pc_compat_2_2(machine);
> > smbios_uuid_encoded = false;
> > x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> > - pcms->enforce_aligned_dimm = false;
> > + pcms->enforce_aligned_dimm = 0;
> > }
> >
> > static void pc_compat_2_0(MachineState *machine)
> > @@ -485,7 +493,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass *m)
> > SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> > }
> >
> > -DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", NULL,
> > +DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", pc_compat_2_3,
> > pc_i440fx_2_4_machine_options)
> >
> >
> > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > index 506b6bf..72b479f 100644
> > --- a/hw/i386/pc_q35.c
> > +++ b/hw/i386/pc_q35.c
> > @@ -284,9 +284,17 @@ static void pc_q35_init(MachineState *machine)
> > }
> > }
> >
> > +static void pc_compat_2_4(MachineState *machine)
> > +{
> > + PCMachineState *pcms = PC_MACHINE(machine);
> > +
> > + pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> > +}
> > +
> > static void pc_compat_2_3(MachineState *machine)
> > {
> > PCMachineState *pcms = PC_MACHINE(machine);
> > + pc_compat_2_4(machine);
> > savevm_skip_section_footers();
> > if (kvm_enabled()) {
> > pcms->smm = ON_OFF_AUTO_OFF;
> > @@ -307,7 +315,7 @@ static void pc_compat_2_1(MachineState *machine)
> > PCMachineState *pcms = PC_MACHINE(machine);
> >
> > pc_compat_2_2(machine);
> > - pcms->enforce_aligned_dimm = false;
> > + pcms->enforce_aligned_dimm = 0;
> > smbios_uuid_encoded = false;
> > x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> > }
> > @@ -388,7 +396,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> > SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> > }
> >
> > -DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
> > +DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", pc_compat_2_4,
> > pc_q35_2_4_machine_options);
> >
> >
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 6896328..fdcf0ec 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -23,8 +23,7 @@
> > /**
> > * PCMachineState:
> > * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
> > - * @enforce_aligned_dimm: check that DIMM's address/size is aligned by
> > - * backend's alignment value if provided
> > + * @enforce_aligned_dimm: minimal DIMM's address/size alignment
> > */
> > struct PCMachineState {
> > /*< private >*/
> > @@ -37,9 +36,9 @@ struct PCMachineState {
> > ISADevice *rtc;
> >
> > uint64_t max_ram_below_4g;
> > + uint64_t enforce_aligned_dimm;
> > OnOffAuto vmport;
> > OnOffAuto smm;
> > - bool enforce_aligned_dimm;
> > ram_addr_t below_4g_mem_size, above_4g_mem_size;
> > };
> >
> >
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-23 9:36 ` Igor Mammedov
@ 2015-09-23 9:38 ` Paolo Bonzini
2015-09-23 10:25 ` Igor Mammedov
2015-09-23 10:32 ` Dr. David Alan Gilbert
0 siblings, 2 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-09-23 9:38 UTC (permalink / raw)
To: Igor Mammedov; +Cc: Peter, ehabkost, mst, qemu-devel, David, Jiri
On 23/09/2015 11:36, Igor Mammedov wrote:
> Also it's QEMU bug/fault and pushing workaround to upper layers
> doesn't seem right when it's much easier to do it in QEMU itself.
No, it's virtio's bug. It makes sense to push workaround for guest bugs
as far from the hypervisor as possible.
If we want to increase the alignment in QEMU, I would prefer to have
natural alignment which makes some sense and happens to work around the
bug as well. Michael, Eduardo, any third opinions?
Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-23 9:38 ` Paolo Bonzini
@ 2015-09-23 10:25 ` Igor Mammedov
2015-09-23 10:32 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 11+ messages in thread
From: Igor Mammedov @ 2015-09-23 10:25 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter, ehabkost, mst, qemu-devel, David, Jiri
On Wed, 23 Sep 2015 11:38:56 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 23/09/2015 11:36, Igor Mammedov wrote:
> > Also it's QEMU bug/fault and pushing workaround to upper layers
> > doesn't seem right when it's much easier to do it in QEMU itself.
>
> No, it's virtio's bug. It makes sense to push workaround for guest bugs
> as far from the hypervisor as possible.
>
> If we want to increase the alignment in QEMU, I would prefer to have
> natural alignment which makes some sense and happens to work around the
> bug as well. Michael, Eduardo, any third opinions?
Natural alignment we have now is 2Mb, we can leave hipervisor side
as is and just try to align on 2Mb in guest + limit buffer size to 2Mb as we
that would make sure that buffer never crosses DIMM borders.
>
> Paolo
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb
2015-09-23 9:38 ` Paolo Bonzini
2015-09-23 10:25 ` Igor Mammedov
@ 2015-09-23 10:32 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2015-09-23 10:32 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Peter, ehabkost, mst, qemu-devel, Igor Mammedov, Jiri
* Paolo Bonzini (pbonzini@redhat.com) wrote:
>
>
> On 23/09/2015 11:36, Igor Mammedov wrote:
> > Also it's QEMU bug/fault and pushing workaround to upper layers
> > doesn't seem right when it's much easier to do it in QEMU itself.
>
> No, it's virtio's bug. It makes sense to push workaround for guest bugs
> as far from the hypervisor as possible.
But you really don't want to have higher level things having to align
addresses themselves. I could see adding an option for the required
alignment would be OK.
> If we want to increase the alignment in QEMU, I would prefer to have
> natural alignment which makes some sense and happens to work around the
> bug as well. Michael, Eduardo, any third opinions?
>
By natural alignment do you mean that an 'n MB' DIMM gets aligned to 'n MB' ?
Dave
> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-09-23 10:33 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-21 11:50 [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb Igor Mammedov
2015-09-21 12:22 ` Paolo Bonzini
2015-09-21 13:05 ` Igor Mammedov
2015-09-21 13:13 ` Paolo Bonzini
2015-09-21 13:32 ` Igor Mammedov
2015-09-21 13:38 ` Paolo Bonzini
2015-09-23 9:36 ` Igor Mammedov
2015-09-23 9:38 ` Paolo Bonzini
2015-09-23 10:25 ` Igor Mammedov
2015-09-23 10:32 ` Dr. David Alan Gilbert
2015-09-21 14:58 ` Eduardo Habkost
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).