* [PATCH V3 0/1] support >3 level p2m tree
From: Juergen Gross @ 2014-09-09 9:58 UTC
To: ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel; +Cc: Juergen Gross

PV domains write the mfn of a 3 level p2m tree to the arch_shared_info
structure. Consumers of this information are the domain save/restore
functions of the Xen tools. Because the tree is defined as having 3 levels,
the maximum supported size of a 64 bit domain is 512 GB.

The following patch expands the arch_shared_info structure to support
more levels.

The Xen tools are not covered in this patch as the patch series
"Libxl migration v2 support":

http://lists.xen.org/archives/html/xen-devel/2014-09/msg00427.html

should be applied first.

Juergen Gross (1):
  expand x86 arch_shared_info to support >3 level p2m tree

 xen/include/public/arch-x86/xen.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

-- 
1.8.4.5

* [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree
From: Juergen Gross @ 2014-09-09 9:58 UTC
To: ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel; +Cc: Juergen Gross

The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
currently contains the mfn of the top level page frame of the 3 level
p2m tree, which is used by the Xen tools during saving and restoring
(and live migration) of pv domains. With three levels of the p2m tree
it is possible to support up to 512 GB of RAM for a 64 bit pv domain.
A 32 bit pv domain can support more, as each memory page can hold 1024
instead of 512 entries, leading to a limit of 4 TB. To be able to
support more RAM on x86-64 an additional level is to be added.

This patch expands struct arch_shared_info with a new p2m tree root
and the number of levels of the p2m tree. The new information is
indicated by the domain to be valid by storing ~0UL into
pfn_to_mfn_frame_list_list (this should be done only if more than
three levels are needed, of course).

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/include/public/arch-x86/xen.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index f35804b..2ca996c 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -224,7 +224,12 @@ struct arch_shared_info {
     /* Frame containing list of mfns containing list of mfns containing p2m. */
     xen_pfn_t     pfn_to_mfn_frame_list_list;
     unsigned long nmi_reason;
-    uint64_t pad[32];
+    /*
+     * Following two fields are valid if pfn_to_mfn_frame_list_list contains
+     * ~0UL.
+     */
+    unsigned long p2m_levels;   /* number of levels of p2m tree */
+    xen_pfn_t     p2m_root;     /* p2m tree top level mfn */
 };
 typedef struct arch_shared_info arch_shared_info_t;
-- 
1.8.4.5

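The 512 GB and 4 TB limits quoted in the commit message can be checked with a little arithmetic. The following is only a sketch: it assumes 4 KiB pages and one machine-word-sized entry per frame (8 bytes for a 64 bit guest, 4 bytes for a 32 bit guest), and the helper name p2m_max_bytes() is made up for illustration; it is not part of Xen.

```c
#include <stdint.h>

#define P2M_PAGE_SIZE 4096ULL   /* assumed x86 page size in bytes */

/*
 * Bytes of guest RAM covered by a p2m tree with the given number of
 * levels. Each tree page holds P2M_PAGE_SIZE / entry_size entries
 * (entry_size: 8 for a 64 bit guest, 4 for a 32 bit guest), and every
 * leaf entry maps exactly one page.
 * Hypothetical helper for illustration only.
 */
uint64_t p2m_max_bytes(unsigned int levels, unsigned int entry_size)
{
    uint64_t entries_per_page = P2M_PAGE_SIZE / entry_size;
    uint64_t pages = 1;
    unsigned int i;

    for (i = 0; i < levels; i++)
        pages *= entries_per_page;   /* each level multiplies the fan-out */

    return pages * P2M_PAGE_SIZE;
}
```

With these assumptions, three levels give 512 * 512 * 512 pages (512 GiB) for a 64 bit guest and 1024 * 1024 * 1024 pages (4 TiB) for a 32 bit guest; a fourth level on 64 bit would raise the limit by another factor of 512.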
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree
From: Andrew Cooper @ 2014-09-09 10:27 UTC
To: Juergen Gross, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel

On 09/09/14 10:58, Juergen Gross wrote:
> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
> currently contains the mfn of the top level page frame of the 3 level
> p2m tree, which is used by the Xen tools during saving and restoring
> (and live migration) of pv domains. With three levels of the p2m tree
> it is possible to support up to 512 GB of RAM for a 64 bit pv domain.
> A 32 bit pv domain can support more, as each memory page can hold 1024
> instead of 512 entries, leading to a limit of 4 TB. To be able to
> support more RAM on x86-64 an additional level is to be added.
>
> This patch expands struct arch_shared_info with a new p2m tree root
> and the number of levels of the p2m tree. The new information is
> indicated by the domain to be valid by storing ~0UL into
> pfn_to_mfn_frame_list_list (this should be done only if more than
> three levels are needed, of course).

A small domain feeling a little tight on space could easily opt for a 2
or even 1 level p2m. (After all, one advantage of virt is to cram many
small VMs into a server).

How is Xen and toolstack support for n-level p2ms going to be advertised
to guests? Simply assuming the toolstack is capable of dealing with
this new scheme won't work with a new pv guest running on an older Xen.

~Andrew

>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  xen/include/public/arch-x86/xen.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
> index f35804b..2ca996c 100644
> --- a/xen/include/public/arch-x86/xen.h
> +++ b/xen/include/public/arch-x86/xen.h
> @@ -224,7 +224,12 @@ struct arch_shared_info {
>      /* Frame containing list of mfns containing list of mfns containing p2m. */
>      xen_pfn_t     pfn_to_mfn_frame_list_list;
>      unsigned long nmi_reason;
> -    uint64_t pad[32];
> +    /*
> +     * Following two fields are valid if pfn_to_mfn_frame_list_list contains
> +     * ~0UL.
> +     */
> +    unsigned long p2m_levels;   /* number of levels of p2m tree */
> +    xen_pfn_t     p2m_root;     /* p2m tree top level mfn */
>  };
>  typedef struct arch_shared_info arch_shared_info_t;
>

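For reference, the consumer-side check implied by the patch comment (new fields valid only when pfn_to_mfn_frame_list_list holds ~0UL) might look roughly like the sketch below. The struct is a reduced stand-in rather than the real public header, and p2m_info_sketch, p2m_tree_desc and p2m_locate() are hypothetical names; the thread does not show any toolstack code.

```c
typedef unsigned long xen_pfn_t;   /* stand-in for the real Xen typedef */

/* Reduced mirror of the fields touched by the patch; NOT the real header. */
struct p2m_info_sketch {
    xen_pfn_t     pfn_to_mfn_frame_list_list;
    unsigned long p2m_levels;
    xen_pfn_t     p2m_root;
};

/* Hypothetical result type: where the tree starts and how deep it is. */
struct p2m_tree_desc {
    xen_pfn_t     root_mfn;
    unsigned long levels;
};

/*
 * Pick the p2m root the way the patch comment describes: ~0UL in the
 * old field marks the new fields as valid, any other value is the
 * root of the classic 3 level tree.
 */
struct p2m_tree_desc p2m_locate(const struct p2m_info_sketch *info)
{
    struct p2m_tree_desc desc;

    if (info->pfn_to_mfn_frame_list_list == ~0UL) {
        desc.root_mfn = info->p2m_root;
        desc.levels   = info->p2m_levels;
    } else {
        desc.root_mfn = info->pfn_to_mfn_frame_list_list;
        desc.levels   = 3;
    }
    return desc;
}
```

A toolstack that predates the new fields has no such branch and would treat ~0UL as an (invalid) mfn, which is the old/new mismatch raised in the question above.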
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-09 10:27 ` Andrew Cooper @ 2014-09-09 10:49 ` Juergen Gross 2014-09-12 10:31 ` Juergen Gross 0 siblings, 1 reply; 23+ messages in thread From: Juergen Gross @ 2014-09-09 10:49 UTC (permalink / raw) To: Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/09/2014 12:27 PM, Andrew Cooper wrote: > On 09/09/14 10:58, Juergen Gross wrote: >> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >> currently contains the mfn of the top level page frame of the 3 level >> p2m tree, which is used by the Xen tools during saving and restoring >> (and live migration) of pv domains. With three levels of the p2m tree >> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >> A 32 bit pv domain can support more, as each memory page can hold 1024 >> instead of 512 entries, leading to a limit of 4 TB. To be able to >> support more RAM on x86-64 an additional level is to be added. >> >> This patch expands struct arch_shared_info with a new p2m tree root >> and the number of levels of the p2m tree. The new information is >> indicated by the domain to be valid by storing ~0UL into >> pfn_to_mfn_frame_list_list (this should be done only if more than >> three levels are needed, of course). > > A small domain feeling a little tight on space could easily opt for a 2 > or even 1 level p2m. (After all, one advantage of virt is to cram many > small VMs into a server). > > How is xen and toolstack support for n-level p2ms going to be advertised > to guests? Simply assuming the toolstack is capable of dealing with > this new scheme wont work with a new pv guest running on an older Xen. Is it really worth doing such an optimization? This would save only very few pages. If you think it should be done we can add another SIF_* flag to start_info->flags. In this case a domain using this feature could not be migrated to a server with old tools, however. 
So we would probably end with the need to be able to suppress that flag
on a per-domain base.

Juergen

>
> ~Andrew
>
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>  xen/include/public/arch-x86/xen.h | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
>> index f35804b..2ca996c 100644
>> --- a/xen/include/public/arch-x86/xen.h
>> +++ b/xen/include/public/arch-x86/xen.h
>> @@ -224,7 +224,12 @@ struct arch_shared_info {
>>      /* Frame containing list of mfns containing list of mfns containing p2m. */
>>      xen_pfn_t     pfn_to_mfn_frame_list_list;
>>      unsigned long nmi_reason;
>> -    uint64_t pad[32];
>> +    /*
>> +     * Following two fields are valid if pfn_to_mfn_frame_list_list contains
>> +     * ~0UL.
>> +     */
>> +    unsigned long p2m_levels;   /* number of levels of p2m tree */
>> +    xen_pfn_t     p2m_root;     /* p2m tree top level mfn */
>>  };
>>  typedef struct arch_shared_info arch_shared_info_t;
>>
>
>

* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-09 10:49 ` Juergen Gross @ 2014-09-12 10:31 ` Juergen Gross 2014-09-15 8:29 ` Andrew Cooper 0 siblings, 1 reply; 23+ messages in thread From: Juergen Gross @ 2014-09-12 10:31 UTC (permalink / raw) To: Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/09/2014 12:49 PM, Juergen Gross wrote: > On 09/09/2014 12:27 PM, Andrew Cooper wrote: >> On 09/09/14 10:58, Juergen Gross wrote: >>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>> currently contains the mfn of the top level page frame of the 3 level >>> p2m tree, which is used by the Xen tools during saving and restoring >>> (and live migration) of pv domains. With three levels of the p2m tree >>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>> support more RAM on x86-64 an additional level is to be added. >>> >>> This patch expands struct arch_shared_info with a new p2m tree root >>> and the number of levels of the p2m tree. The new information is >>> indicated by the domain to be valid by storing ~0UL into >>> pfn_to_mfn_frame_list_list (this should be done only if more than >>> three levels are needed, of course). >> >> A small domain feeling a little tight on space could easily opt for a 2 >> or even 1 level p2m. (After all, one advantage of virt is to cram many >> small VMs into a server). >> >> How is xen and toolstack support for n-level p2ms going to be advertised >> to guests? Simply assuming the toolstack is capable of dealing with >> this new scheme wont work with a new pv guest running on an older Xen. > > Is it really worth doing such an optimization? This would save only very > few pages. > > If you think it should be done we can add another SIF_* flag to > start_info->flags. 
In this case a domain using this feature could not be
> migrated to a server with old tools, however. So we would probably end
> with the need to be able to suppress that flag on a per-domain base.

Any further comments?

Which way should I go?

Juergen

* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-12 10:31 ` Juergen Gross @ 2014-09-15 8:29 ` Andrew Cooper 2014-09-15 8:52 ` Juergen Gross 0 siblings, 1 reply; 23+ messages in thread From: Andrew Cooper @ 2014-09-15 8:29 UTC (permalink / raw) To: Juergen Gross, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 12/09/2014 11:31, Juergen Gross wrote: > On 09/09/2014 12:49 PM, Juergen Gross wrote: >> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>> On 09/09/14 10:58, Juergen Gross wrote: >>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>> currently contains the mfn of the top level page frame of the 3 level >>>> p2m tree, which is used by the Xen tools during saving and restoring >>>> (and live migration) of pv domains. With three levels of the p2m tree >>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>> support more RAM on x86-64 an additional level is to be added. >>>> >>>> This patch expands struct arch_shared_info with a new p2m tree root >>>> and the number of levels of the p2m tree. The new information is >>>> indicated by the domain to be valid by storing ~0UL into >>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>> three levels are needed, of course). >>> >>> A small domain feeling a little tight on space could easily opt for a 2 >>> or even 1 level p2m. (After all, one advantage of virt is to cram many >>> small VMs into a server). >>> >>> How is xen and toolstack support for n-level p2ms going to be >>> advertised >>> to guests? Simply assuming the toolstack is capable of dealing with >>> this new scheme wont work with a new pv guest running on an older Xen. >> >> Is it really worth doing such an optimization? This would save only very >> few pages. 
>>
>> If you think it should be done we can add another SIF_* flag to
>> start_info->flags. In this case a domain using this feature could not be
>> migrated to a server with old tools, however. So we would probably end
>> with the need to be able to suppress that flag on a per-domain base.
>
> Any further comments?
>
> Which way should I go?
>

There are two approaches, with different up/downsides

1) continue to use the old method, and use the new method only when
absolutely required. This will function, but on old toolstacks, suffer
migration/suspend failures when the toolstack fails to find the p2m.

2) Provide a Xen feature flag indicating the presence of N-level p2m
support. Guests which can see this flag are free to use N-level, and
guests which can't are not.

Ultimately, giving more than 512GB to a current 64bit PV domain is not
going to work, and the choice above depends on which failure mode you
wish a new/old mix to have.

~Andrew

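The two approaches differ only in how a guest decides which p2m format to publish. A rough sketch of that decision, from the guest's point of view, is given below. Everything in it is an assumption for illustration: the enum, both function names and the "advertised" parameter are hypothetical, no such feature flag is defined in this thread, and the 512 * 512 * 512 page limit assumes a 64 bit guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pages covered by the classic 3 level tree of a 64 bit guest (512 GiB). */
#define LEGACY_MAX_PAGES (512ULL * 512 * 512)

enum p2m_format {
    P2M_FORMAT_LEGACY,   /* 3 level tree via pfn_to_mfn_frame_list_list */
    P2M_FORMAT_NLEVEL,   /* ~0UL marker plus p2m_levels / p2m_root */
    P2M_FORMAT_NONE      /* no format the other side could understand */
};

/*
 * Approach 1: stay on the old format whenever it is large enough and
 * switch only when the domain does not fit. An old toolstack then
 * fails to find the p2m of a large domain at migration/suspend time.
 */
enum p2m_format p2m_format_on_demand(uint64_t nr_pages)
{
    return nr_pages <= LEGACY_MAX_PAGES ? P2M_FORMAT_LEGACY
                                        : P2M_FORMAT_NLEVEL;
}

/*
 * Approach 2: use N-level only if support was advertised (hypothetical
 * flag). This sketch always uses it when advertised, although a guest
 * would merely be free to. Without the flag a too-large domain has no
 * valid format at all.
 */
enum p2m_format p2m_format_with_flag(uint64_t nr_pages, bool advertised)
{
    if (advertised)
        return P2M_FORMAT_NLEVEL;
    return nr_pages <= LEGACY_MAX_PAGES ? P2M_FORMAT_LEGACY
                                        : P2M_FORMAT_NONE;
}
```

The P2M_FORMAT_NONE case is where the failure moves under approach 2: the oversized domain is refused up front instead of failing later when it is migrated or suspended.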
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 8:29 ` Andrew Cooper @ 2014-09-15 8:52 ` Juergen Gross 2014-09-15 9:42 ` Jan Beulich 2014-09-15 9:44 ` David Vrabel 0 siblings, 2 replies; 23+ messages in thread From: Juergen Gross @ 2014-09-15 8:52 UTC (permalink / raw) To: Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/15/2014 10:29 AM, Andrew Cooper wrote: > > On 12/09/2014 11:31, Juergen Gross wrote: >> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>> currently contains the mfn of the top level page frame of the 3 level >>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>> (and live migration) of pv domains. With three levels of the p2m tree >>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>>>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>> support more RAM on x86-64 an additional level is to be added. >>>>> >>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>> and the number of levels of the p2m tree. The new information is >>>>> indicated by the domain to be valid by storing ~0UL into >>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>> three levels are needed, of course). >>>> >>>> A small domain feeling a little tight on space could easily opt for a 2 >>>> or even 1 level p2m. (After all, one advantage of virt is to cram many >>>> small VMs into a server). >>>> >>>> How is xen and toolstack support for n-level p2ms going to be >>>> advertised >>>> to guests? Simply assuming the toolstack is capable of dealing with >>>> this new scheme wont work with a new pv guest running on an older Xen. 
>>>
>>> Is it really worth doing such an optimization? This would save only very
>>> few pages.
>>>
>>> If you think it should be done we can add another SIF_* flag to
>>> start_info->flags. In this case a domain using this feature could not be
>>> migrated to a server with old tools, however. So we would probably end
>>> with the need to be able to suppress that flag on a per-domain base.
>>
>> Any further comments?
>>
>> Which way should I go?
>>
>
> There are two approaches, with different up/downsides
>
> 1) continue to use the old method, and use the new method only when
> absolutely required. This will function, but on old toolstacks, suffer
> migration/suspend failures when the toolstack fails to find the p2m.
>
> 2) Provide a Xen feature flag indicating the presence of N-level p2m
> support. Guests which can see this flag are free to use N-level, and
> guests which can't are not.
>
> Ultimately, giving more than 512GB to a current 64bit PV domain is not
> going to work, and the choice above depends on which failure mode you
> wish a new/old mix to have.

I'd prefer solution 1), as it will enable Dom0 with more than 512 GB
without requiring a change of any Xen component. Additionally large
domains can be started by users who don't care for migrating or
suspending them.

So I'd rather keep my patch as posted.

Juergen

* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 8:52 ` Juergen Gross @ 2014-09-15 9:42 ` Jan Beulich 2014-09-15 9:48 ` Juergen Gross 2014-09-15 9:44 ` David Vrabel 1 sibling, 1 reply; 23+ messages in thread From: Jan Beulich @ 2014-09-15 9:42 UTC (permalink / raw) To: Andrew Cooper, Juergen Gross Cc: keir, tim, ian.jackson, ian.campbell, xen-devel >>> On 15.09.14 at 10:52, <JGross@suse.com> wrote: > On 09/15/2014 10:29 AM, Andrew Cooper wrote: >> >> On 12/09/2014 11:31, Juergen Gross wrote: >>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>> currently contains the mfn of the top level page frame of the 3 level >>>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>>> (and live migration) of pv domains. With three levels of the p2m tree >>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>>>>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>> >>>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>>> and the number of levels of the p2m tree. The new information is >>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>> three levels are needed, of course). >>>>> >>>>> A small domain feeling a little tight on space could easily opt for a 2 >>>>> or even 1 level p2m. (After all, one advantage of virt is to cram many >>>>> small VMs into a server). >>>>> >>>>> How is xen and toolstack support for n-level p2ms going to be >>>>> advertised >>>>> to guests? 
Simply assuming the toolstack is capable of dealing with
>>>>> this new scheme wont work with a new pv guest running on an older Xen.
>>>>
>>>> Is it really worth doing such an optimization? This would save only very
>>>> few pages.
>>>>
>>>> If you think it should be done we can add another SIF_* flag to
>>>> start_info->flags. In this case a domain using this feature could not be
>>>> migrated to a server with old tools, however. So we would probably end
>>>> with the need to be able to suppress that flag on a per-domain base.
>>>
>>> Any further comments?
>>>
>>> Which way should I go?
>>>
>>
>> There are two approaches, with different up/downsides
>>
>> 1) continue to use the old method, and use the new method only when
>> absolutely required. This will function, but on old toolstacks, suffer
>> migration/suspend failures when the toolstack fails to find the p2m.
>>
>> 2) Provide a Xen feature flag indicating the presence of N-level p2m
>> support. Guests which can see this flag are free to use N-level, and
>> guests which can't are not.
>>
>> Ultimately, giving more than 512GB to a current 64bit PV domain is not
>> going to work, and the choice above depends on which failure mode you
>> wish a new/old mix to have.
>
> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB
> without requiring a change of any Xen component. Additionally large
> domains can be started by users who don't care for migrating or
> suspending them.

With the hopefully well understood limitation of kexec not working
there (as it, just like migration for DomU, uses this table for Dom0
in at least machine_crash_shutdown()).

Jan

* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 9:42 ` Jan Beulich @ 2014-09-15 9:48 ` Juergen Gross 0 siblings, 0 replies; 23+ messages in thread From: Juergen Gross @ 2014-09-15 9:48 UTC (permalink / raw) To: Jan Beulich, Andrew Cooper Cc: tim, xen-devel, keir, ian.jackson, ian.campbell On 09/15/2014 11:42 AM, Jan Beulich wrote: >>>> On 15.09.14 at 10:52, <JGross@suse.com> wrote: >> On 09/15/2014 10:29 AM, Andrew Cooper wrote: >>> >>> On 12/09/2014 11:31, Juergen Gross wrote: >>>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>> currently contains the mfn of the top level page frame of the 3 level >>>>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>>>> (and live migration) of pv domains. With three levels of the p2m tree >>>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>>>>>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>>> >>>>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>>>> and the number of levels of the p2m tree. The new information is >>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>>> three levels are needed, of course). >>>>>> >>>>>> A small domain feeling a little tight on space could easily opt for a 2 >>>>>> or even 1 level p2m. (After all, one advantage of virt is to cram many >>>>>> small VMs into a server). >>>>>> >>>>>> How is xen and toolstack support for n-level p2ms going to be >>>>>> advertised >>>>>> to guests? 
Simply assuming the toolstack is capable of dealing with
>>>>>> this new scheme wont work with a new pv guest running on an older Xen.
>>>>>
>>>>> Is it really worth doing such an optimization? This would save only very
>>>>> few pages.
>>>>>
>>>>> If you think it should be done we can add another SIF_* flag to
>>>>> start_info->flags. In this case a domain using this feature could not be
>>>>> migrated to a server with old tools, however. So we would probably end
>>>>> with the need to be able to suppress that flag on a per-domain base.
>>>>
>>>> Any further comments?
>>>>
>>>> Which way should I go?
>>>>
>>>
>>> There are two approaches, with different up/downsides
>>>
>>> 1) continue to use the old method, and use the new method only when
>>> absolutely required. This will function, but on old toolstacks, suffer
>>> migration/suspend failures when the toolstack fails to find the p2m.
>>>
>>> 2) Provide a Xen feature flag indicating the presence of N-level p2m
>>> support. Guests which can see this flag are free to use N-level, and
>>> guests which can't are not.
>>>
>>> Ultimately, giving more than 512GB to a current 64bit PV domain is not
>>> going to work, and the choice above depends on which failure mode you
>>> wish a new/old mix to have.
>>
>> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB
>> without requiring a change of any Xen component. Additionally large
>> domains can be started by users who don't care for migrating or
>> suspending them.
>
> With the hopefully well understood limitation of kexec not working
> there (as it, just like migration for DomU, uses this table for Dom0
> in at least machine_crash_shutdown()).

Sure. That's another issue to be addressed.

Juergen

* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 8:52 ` Juergen Gross 2014-09-15 9:42 ` Jan Beulich @ 2014-09-15 9:44 ` David Vrabel 2014-09-15 9:52 ` Juergen Gross 1 sibling, 1 reply; 23+ messages in thread From: David Vrabel @ 2014-09-15 9:44 UTC (permalink / raw) To: Juergen Gross, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 15/09/14 09:52, Juergen Gross wrote: > On 09/15/2014 10:29 AM, Andrew Cooper wrote: >> >> On 12/09/2014 11:31, Juergen Gross wrote: >>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>> currently contains the mfn of the top level page frame of the 3 level >>>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>>> (and live migration) of pv domains. With three levels of the p2m tree >>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>> 1024 >>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>> >>>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>>> and the number of levels of the p2m tree. The new information is >>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>> three levels are needed, of course). >>>>> >>>>> A small domain feeling a little tight on space could easily opt for >>>>> a 2 >>>>> or even 1 level p2m. (After all, one advantage of virt is to cram >>>>> many >>>>> small VMs into a server). >>>>> >>>>> How is xen and toolstack support for n-level p2ms going to be >>>>> advertised >>>>> to guests? 
Simply assuming the toolstack is capable of dealing with
>>>>> this new scheme wont work with a new pv guest running on an older Xen.
>>>>
>>>> Is it really worth doing such an optimization? This would save only
>>>> very
>>>> few pages.
>>>>
>>>> If you think it should be done we can add another SIF_* flag to
>>>> start_info->flags. In this case a domain using this feature could
>>>> not be
>>>> migrated to a server with old tools, however. So we would probably end
>>>> with the need to be able to suppress that flag on a per-domain base.
>>>
>>> Any further comments?
>>>
>>> Which way should I go?
>>>
>>
>> There are two approaches, with different up/downsides
>>
>> 1) continue to use the old method, and use the new method only when
>> absolutely required. This will function, but on old toolstacks, suffer
>> migration/suspend failures when the toolstack fails to find the p2m.
>>
>> 2) Provide a Xen feature flag indicating the presence of N-level p2m
>> support. Guests which can see this flag are free to use N-level, and
>> guests which can't are not.
>>
>> Ultimately, giving more than 512GB to a current 64bit PV domain is not
>> going to work, and the choice above depends on which failure mode you
>> wish a new/old mix to have.
>
> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB
> without requiring a change of any Xen component. Additionally large
> domains can be started by users who don't care for migrating or
> suspending them.
>
> So I'd rather keep my patch as posted.

PV guests can have extra memory added, beyond their initial limit.
Supporting this would require option 2.

David

* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 9:44 ` David Vrabel @ 2014-09-15 9:52 ` Juergen Gross 2014-09-15 10:30 ` David Vrabel 0 siblings, 1 reply; 23+ messages in thread From: Juergen Gross @ 2014-09-15 9:52 UTC (permalink / raw) To: David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/15/2014 11:44 AM, David Vrabel wrote: > On 15/09/14 09:52, Juergen Gross wrote: >> On 09/15/2014 10:29 AM, Andrew Cooper wrote: >>> >>> On 12/09/2014 11:31, Juergen Gross wrote: >>>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>> currently contains the mfn of the top level page frame of the 3 level >>>>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>>>> (and live migration) of pv domains. With three levels of the p2m tree >>>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain. >>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>> 1024 >>>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>>> >>>>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>>>> and the number of levels of the p2m tree. The new information is >>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>>> three levels are needed, of course). >>>>>> >>>>>> A small domain feeling a little tight on space could easily opt for >>>>>> a 2 >>>>>> or even 1 level p2m. (After all, one advantage of virt is to cram >>>>>> many >>>>>> small VMs into a server). >>>>>> >>>>>> How is xen and toolstack support for n-level p2ms going to be >>>>>> advertised >>>>>> to guests? 
Simply assuming the toolstack is capable of dealing with >>>>>> this new scheme wont work with a new pv guest running on an older Xen. >>>>> >>>>> Is it really worth doing such an optimization? This would save only >>>>> very >>>>> few pages. >>>>> >>>>> If you think it should be done we can add another SIF_* flag to >>>>> start_info->flags. In this case a domain using this feature could >>>>> not be >>>>> migrated to a server with old tools, however. So we would probably end >>>>> with the need to be able to suppress that flag on a per-domain base. >>>> >>>> Any further comments? >>>> >>>> Which way should I go? >>>> >>> >>> There are two approaches, with different up/downsides >>> >>> 1) continue to use the old method, and use the new method only when >>> absolutely required. This will function, but on old toolstacks, suffer >>> migration/suspend failures when the toolstack fails to find the p2m. >>> >>> 2) Provide a Xen feature flag indicating the presence of N-level p2m >>> support. Guests which can see this flag are free to use N-level, and >>> guests which can't are not. >>> >>> Ultimately, giving more than 512GB to a current 64bit PV domain is not >>> going to work, and the choice above depends on which failure mode you >>> wish a new/old mix to have. >> >> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB >> without requiring a change of any Xen component. Additionally large >> domains can be started by users who don't care for migrating or >> suspending them. >> >> So I'd rather keep my patch as posted. > > PV guests can have extra memory added, beyond their initial limit. > Supporting this would require option 2. I don't see why this should require option 2. Option 1 only prohibits suspending/migrating a domain with more than 512 GB. Just running it is fine with either option. I should mention, however, that the number of p2m tree levels will be increased on demand when needed. 
The tree won't be created with more than 3 levels if the domain isn't started with more than 512 GB. Juergen ^ permalink raw reply [flat|nested] 23+ messages in thread
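For reference, the 512 GB and 4 TB limits quoted above follow from the tree geometry: every node is one 4 KiB page, and the number of entries per node depends on the entry size (8 bytes for a 64 bit guest, 4 bytes for a 32 bit guest). A minimal sketch of the arithmetic, where the helper name p2m_max_ram_bytes and the constant P2M_PAGE_SIZE are made up for illustration and do not appear in the Xen headers:

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define P2M_PAGE_SIZE 4096ULL

/*
 * Bytes of guest RAM that a p2m tree with `levels` levels can describe
 * when every node is one page holding P2M_PAGE_SIZE / entry_size entries.
 * Each leaf entry stands for one page of guest memory.
 */
static uint64_t p2m_max_ram_bytes(unsigned int levels, unsigned int entry_size)
{
    uint64_t entries_per_page = P2M_PAGE_SIZE / entry_size;
    uint64_t pages = 1;
    unsigned int i;

    for (i = 0; i < levels; i++)
        pages *= entries_per_page;

    return pages * P2M_PAGE_SIZE;
}
```

With 3 levels this gives 512^3 pages (512 GiB) for 8-byte entries and 1024^3 pages (4 TiB) for 4-byte entries; a fourth level on x86-64 would raise the limit to 256 TiB.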
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 9:52 ` Juergen Gross @ 2014-09-15 10:30 ` David Vrabel 2014-09-15 10:46 ` Juergen Gross 0 siblings, 1 reply; 23+ messages in thread From: David Vrabel @ 2014-09-15 10:30 UTC (permalink / raw) To: Juergen Gross, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 15/09/14 10:52, Juergen Gross wrote: > On 09/15/2014 11:44 AM, David Vrabel wrote: >> On 15/09/14 09:52, Juergen Gross wrote: >>> On 09/15/2014 10:29 AM, Andrew Cooper wrote: >>>> >>>> On 12/09/2014 11:31, Juergen Gross wrote: >>>>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>>> level >>>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>>> restoring >>>>>>>> (and live migration) of pv domains. With three levels of the p2m >>>>>>>> tree >>>>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv >>>>>>>> domain. >>>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>>> 1024 >>>>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>>>> >>>>>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>>>>> and the number of levels of the p2m tree. The new information is >>>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>>>> three levels are needed, of course). >>>>>>> >>>>>>> A small domain feeling a little tight on space could easily opt for >>>>>>> a 2 >>>>>>> or even 1 level p2m. (After all, one advantage of virt is to cram >>>>>>> many >>>>>>> small VMs into a server). 
>>>>>>> >>>>>>> How is xen and toolstack support for n-level p2ms going to be >>>>>>> advertised >>>>>>> to guests? Simply assuming the toolstack is capable of dealing with >>>>>>> this new scheme wont work with a new pv guest running on an older >>>>>>> Xen. >>>>>> >>>>>> Is it really worth doing such an optimization? This would save only >>>>>> very >>>>>> few pages. >>>>>> >>>>>> If you think it should be done we can add another SIF_* flag to >>>>>> start_info->flags. In this case a domain using this feature could >>>>>> not be >>>>>> migrated to a server with old tools, however. So we would probably >>>>>> end >>>>>> with the need to be able to suppress that flag on a per-domain base. >>>>> >>>>> Any further comments? >>>>> >>>>> Which way should I go? >>>>> >>>> >>>> There are two approaches, with different up/downsides >>>> >>>> 1) continue to use the old method, and use the new method only when >>>> absolutely required. This will function, but on old toolstacks, suffer >>>> migration/suspend failures when the toolstack fails to find the p2m. >>>> >>>> 2) Provide a Xen feature flag indicating the presence of N-level p2m >>>> support. Guests which can see this flag are free to use N-level, and >>>> guests which can't are not. >>>> >>>> Ultimately, giving more than 512GB to a current 64bit PV domain is not >>>> going to work, and the choice above depends on which failure mode you >>>> wish a new/old mix to have. >>> >>> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB >>> without requiring a change of any Xen component. Additionally large >>> domains can be started by users who don't care for migrating or >>> suspending them. >>> >>> So I'd rather keep my patch as posted. >> >> PV guests can have extra memory added, beyond their initial limit. >> Supporting this would require option 2. > > I don't see why this should require option 2. Um... > Option 1 only prohibits suspending/migrating a domain with more than 512 GB. ...this is the reason. 
With the exception of VMs that have assigned direct access to hardware, migration is an essential feature and must be supported. David ^ permalink raw reply [flat|nested] 23+ messages in thread
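The failure mode being argued about can be sketched from the toolstack's side. The following is only an illustration under stated assumptions: the struct is a trimmed-down stand-in for struct arch_shared_info, the field names p2m_root and p2m_levels are hypothetical (the thread does not name the new fields), and locate_p2m is not a real libxc function. It shows why a save done by tools that predate the extension cannot succeed once the guest has stored ~0UL in pfn_to_mfn_frame_list_list:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t xen_pfn_t;                    /* stand-in for the Xen type */
#define P2M_EXTENDED_MARKER (~(xen_pfn_t)0)    /* ~0UL on x86-64 */

/* Hypothetical, trimmed-down view of struct arch_shared_info. */
struct arch_shared_info_sketch {
    xen_pfn_t pfn_to_mfn_frame_list_list;
    xen_pfn_t p2m_root;          /* hypothetical name for the new root */
    unsigned long p2m_levels;    /* hypothetical name for the level count */
};

enum p2m_lookup { P2M_OK, P2M_UNSUPPORTED };

/*
 * Decide where the p2m tree starts and how deep it is.
 * tool_knows_extension models whether the tools were built with
 * knowledge of the extended layout.
 */
static enum p2m_lookup locate_p2m(const struct arch_shared_info_sketch *si,
                                  bool tool_knows_extension,
                                  xen_pfn_t *root, unsigned long *levels)
{
    if (si->pfn_to_mfn_frame_list_list != P2M_EXTENDED_MARKER) {
        /* Legacy layout: always a 3 level tree. */
        *root = si->pfn_to_mfn_frame_list_list;
        *levels = 3;
        return P2M_OK;
    }

    if (!tool_knows_extension)
        return P2M_UNSUPPORTED;  /* old tools: save/migrate fails here */

    *root = si->p2m_root;
    *levels = si->p2m_levels;
    return P2M_OK;
}
```

The point of contention is when this failure surfaces: with option 1 it is only seen at save or migration time, long after the domain has been started.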
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 10:30 ` David Vrabel @ 2014-09-15 10:46 ` Juergen Gross 2014-09-15 11:29 ` Jan Beulich 2014-09-15 14:30 ` David Vrabel 0 siblings, 2 replies; 23+ messages in thread From: Juergen Gross @ 2014-09-15 10:46 UTC (permalink / raw) To: David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/15/2014 12:30 PM, David Vrabel wrote: > On 15/09/14 10:52, Juergen Gross wrote: >> On 09/15/2014 11:44 AM, David Vrabel wrote: >>> On 15/09/14 09:52, Juergen Gross wrote: >>>> On 09/15/2014 10:29 AM, Andrew Cooper wrote: >>>>> >>>>> On 12/09/2014 11:31, Juergen Gross wrote: >>>>>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>>>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>>>> level >>>>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>>>> restoring >>>>>>>>> (and live migration) of pv domains. With three levels of the p2m >>>>>>>>> tree >>>>>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv >>>>>>>>> domain. >>>>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>>>> 1024 >>>>>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>>>>> >>>>>>>>> This patch expands struct arch_shared_info with a new p2m tree root >>>>>>>>> and the number of levels of the p2m tree. The new information is >>>>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>>>>> three levels are needed, of course). >>>>>>>> >>>>>>>> A small domain feeling a little tight on space could easily opt for >>>>>>>> a 2 >>>>>>>> or even 1 level p2m. 
(After all, one advantage of virt is to cram >>>>>>>> many >>>>>>>> small VMs into a server). >>>>>>>> >>>>>>>> How is xen and toolstack support for n-level p2ms going to be >>>>>>>> advertised >>>>>>>> to guests? Simply assuming the toolstack is capable of dealing with >>>>>>>> this new scheme wont work with a new pv guest running on an older >>>>>>>> Xen. >>>>>>> >>>>>>> Is it really worth doing such an optimization? This would save only >>>>>>> very >>>>>>> few pages. >>>>>>> >>>>>>> If you think it should be done we can add another SIF_* flag to >>>>>>> start_info->flags. In this case a domain using this feature could >>>>>>> not be >>>>>>> migrated to a server with old tools, however. So we would probably >>>>>>> end >>>>>>> with the need to be able to suppress that flag on a per-domain base. >>>>>> >>>>>> Any further comments? >>>>>> >>>>>> Which way should I go? >>>>>> >>>>> >>>>> There are two approaches, with different up/downsides >>>>> >>>>> 1) continue to use the old method, and use the new method only when >>>>> absolutely required. This will function, but on old toolstacks, suffer >>>>> migration/suspend failures when the toolstack fails to find the p2m. >>>>> >>>>> 2) Provide a Xen feature flag indicating the presence of N-level p2m >>>>> support. Guests which can see this flag are free to use N-level, and >>>>> guests which can't are not. >>>>> >>>>> Ultimately, giving more than 512GB to a current 64bit PV domain is not >>>>> going to work, and the choice above depends on which failure mode you >>>>> wish a new/old mix to have. >>>> >>>> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB >>>> without requiring a change of any Xen component. Additionally large >>>> domains can be started by users who don't care for migrating or >>>> suspending them. >>>> >>>> So I'd rather keep my patch as posted. >>> >>> PV guests can have extra memory added, beyond their initial limit. >>> Supporting this would require option 2. 
>> >> I don't see why this should require option 2. > > Um... > >> Option 1 only prohibits suspending/migrating a domain with more than 512 GB. > > ...this is the reason. > > With the exception of VMs that have assigned direct access to hardware, > migration is an essential feature and must be supported. So you'd prefer: 1) >512GB pv-domains (including Dom0) will be supported only with new Xen (4.6?), no matter if the user requires migration to be supported to: 2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can be started on current Xen versions, migration is possible only if Xen is new (4.6?) What is the common use case for migration? I doubt it is used very often for really huge domains. I'm not really opposed to solution 2, but I doubt it is the correct one. Juergen ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 10:46 ` Juergen Gross @ 2014-09-15 11:29 ` Jan Beulich 2014-09-15 14:30 ` David Vrabel 1 sibling, 0 replies; 23+ messages in thread From: Jan Beulich @ 2014-09-15 11:29 UTC (permalink / raw) To: David Vrabel, Juergen Gross Cc: keir, ian.campbell, Andrew Cooper, tim, xen-devel, ian.jackson >>> On 15.09.14 at 12:46, <JGross@suse.com> wrote: > On 09/15/2014 12:30 PM, David Vrabel wrote: >> With the exception of VMs that have assigned direct access to hardware, >> migration is an essential feature and must be supported. > > So you'd prefer: > > 1) >512GB pv-domains (including Dom0) will be supported only with new > Xen (4.6?), no matter if the user requires migration to be supported > > to: > > 2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can > be started on current Xen versions, migration is possible only if Xen > is new (4.6?) > > What is the common use case for migration? I doubt it is used very often > for really huge domains. Even without guessing at the likelihood and usefulness of huge domains being migrated: since 1) clearly causes more reduction in functionality than 2), I'm having a hard time seeing why 1) would be favored by anyone outside of academic/theoretical considerations. Jan ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 10:46 ` Juergen Gross 2014-09-15 11:29 ` Jan Beulich @ 2014-09-15 14:30 ` David Vrabel 2014-09-16 3:52 ` Juergen Gross 1 sibling, 1 reply; 23+ messages in thread From: David Vrabel @ 2014-09-15 14:30 UTC (permalink / raw) To: Juergen Gross, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 15/09/14 11:46, Juergen Gross wrote: > On 09/15/2014 12:30 PM, David Vrabel wrote: >> On 15/09/14 10:52, Juergen Gross wrote: >>> On 09/15/2014 11:44 AM, David Vrabel wrote: >>>> On 15/09/14 09:52, Juergen Gross wrote: >>>>> On 09/15/2014 10:29 AM, Andrew Cooper wrote: >>>>>> >>>>>> On 12/09/2014 11:31, Juergen Gross wrote: >>>>>>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>>>>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>>>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>>>>> level >>>>>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>>>>> restoring >>>>>>>>>> (and live migration) of pv domains. With three levels of the p2m >>>>>>>>>> tree >>>>>>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv >>>>>>>>>> domain. >>>>>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>>>>> 1024 >>>>>>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>>>>>> >>>>>>>>>> This patch expands struct arch_shared_info with a new p2m tree >>>>>>>>>> root >>>>>>>>>> and the number of levels of the p2m tree. The new information is >>>>>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>>>>>> three levels are needed, of course). 
>>>>>>>>> >>>>>>>>> A small domain feeling a little tight on space could easily opt >>>>>>>>> for >>>>>>>>> a 2 >>>>>>>>> or even 1 level p2m. (After all, one advantage of virt is to cram >>>>>>>>> many >>>>>>>>> small VMs into a server). >>>>>>>>> >>>>>>>>> How is xen and toolstack support for n-level p2ms going to be >>>>>>>>> advertised >>>>>>>>> to guests? Simply assuming the toolstack is capable of dealing >>>>>>>>> with >>>>>>>>> this new scheme wont work with a new pv guest running on an older >>>>>>>>> Xen. >>>>>>>> >>>>>>>> Is it really worth doing such an optimization? This would save only >>>>>>>> very >>>>>>>> few pages. >>>>>>>> >>>>>>>> If you think it should be done we can add another SIF_* flag to >>>>>>>> start_info->flags. In this case a domain using this feature could >>>>>>>> not be >>>>>>>> migrated to a server with old tools, however. So we would probably >>>>>>>> end >>>>>>>> with the need to be able to suppress that flag on a per-domain >>>>>>>> base. >>>>>>> >>>>>>> Any further comments? >>>>>>> >>>>>>> Which way should I go? >>>>>>> >>>>>> >>>>>> There are two approaches, with different up/downsides >>>>>> >>>>>> 1) continue to use the old method, and use the new method only when >>>>>> absolutely required. This will function, but on old toolstacks, >>>>>> suffer >>>>>> migration/suspend failures when the toolstack fails to find the p2m. >>>>>> >>>>>> 2) Provide a Xen feature flag indicating the presence of N-level p2m >>>>>> support. Guests which can see this flag are free to use N-level, and >>>>>> guests which can't are not. >>>>>> >>>>>> Ultimately, giving more than 512GB to a current 64bit PV domain is >>>>>> not >>>>>> going to work, and the choice above depends on which failure mode you >>>>>> wish a new/old mix to have. >>>>> >>>>> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB >>>>> without requiring a change of any Xen component. 
Additionally large >>>>> domains can be started by users who don't care for migrating or >>>>> suspending them. >>>>> >>>>> So I'd rather keep my patch as posted. >>>> >>>> PV guests can have extra memory added, beyond their initial limit. >>>> Supporting this would require option 2. >>> >>> I don't see why this should require option 2. >> >> Um... >> >>> Option 1 only prohibits suspending/migrating a domain with more than >>> 512 GB. >> >> ...this is the reason. >> >> With the exception of VMs that have assigned direct access to hardware, >> migration is an essential feature and must be supported. > > So you'd prefer: > > 1) >512GB pv-domains (including Dom0) will be supported only with new > Xen (4.6?), no matter if the user requires migration to be supported Yes. >512 GiB and not being able to migrate are not obviously related from the point of view of the end user (unlike assigning a PCI device). Failing at domain save time is most likely too late for the end user. > to: > > 2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can > be started on current Xen versions, migration is possible only if Xen > is new (4.6?) There's also my preferred option: 3) >512 GiB PV domains are not supported. Large guests must be PVH or PVHVM. > What is the common use case for migration? I doubt it is used very often > for really huge domains. XenServer uses it for server pool upgrades with no VM downtime. Also, today's huge VM is tomorrow's common size. David ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-15 14:30 ` David Vrabel @ 2014-09-16 3:52 ` Juergen Gross 2014-09-16 10:14 ` David Vrabel 0 siblings, 1 reply; 23+ messages in thread From: Juergen Gross @ 2014-09-16 3:52 UTC (permalink / raw) To: David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/15/2014 04:30 PM, David Vrabel wrote: > On 15/09/14 11:46, Juergen Gross wrote: >> On 09/15/2014 12:30 PM, David Vrabel wrote: >>> On 15/09/14 10:52, Juergen Gross wrote: >>>> On 09/15/2014 11:44 AM, David Vrabel wrote: >>>>> On 15/09/14 09:52, Juergen Gross wrote: >>>>>> On 09/15/2014 10:29 AM, Andrew Cooper wrote: >>>>>>> >>>>>>> On 12/09/2014 11:31, Juergen Gross wrote: >>>>>>>> On 09/09/2014 12:49 PM, Juergen Gross wrote: >>>>>>>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote: >>>>>>>>>> On 09/09/14 10:58, Juergen Gross wrote: >>>>>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>>>>>> level >>>>>>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>>>>>> restoring >>>>>>>>>>> (and live migration) of pv domains. With three levels of the p2m >>>>>>>>>>> tree >>>>>>>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv >>>>>>>>>>> domain. >>>>>>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>>>>>> 1024 >>>>>>>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to >>>>>>>>>>> support more RAM on x86-64 an additional level is to be added. >>>>>>>>>>> >>>>>>>>>>> This patch expands struct arch_shared_info with a new p2m tree >>>>>>>>>>> root >>>>>>>>>>> and the number of levels of the p2m tree. The new information is >>>>>>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than >>>>>>>>>>> three levels are needed, of course). 
>>>>>>>>>> >>>>>>>>>> A small domain feeling a little tight on space could easily opt >>>>>>>>>> for >>>>>>>>>> a 2 >>>>>>>>>> or even 1 level p2m. (After all, one advantage of virt is to cram >>>>>>>>>> many >>>>>>>>>> small VMs into a server). >>>>>>>>>> >>>>>>>>>> How is xen and toolstack support for n-level p2ms going to be >>>>>>>>>> advertised >>>>>>>>>> to guests? Simply assuming the toolstack is capable of dealing >>>>>>>>>> with >>>>>>>>>> this new scheme wont work with a new pv guest running on an older >>>>>>>>>> Xen. >>>>>>>>> >>>>>>>>> Is it really worth doing such an optimization? This would save only >>>>>>>>> very >>>>>>>>> few pages. >>>>>>>>> >>>>>>>>> If you think it should be done we can add another SIF_* flag to >>>>>>>>> start_info->flags. In this case a domain using this feature could >>>>>>>>> not be >>>>>>>>> migrated to a server with old tools, however. So we would probably >>>>>>>>> end >>>>>>>>> with the need to be able to suppress that flag on a per-domain >>>>>>>>> base. >>>>>>>> >>>>>>>> Any further comments? >>>>>>>> >>>>>>>> Which way should I go? >>>>>>>> >>>>>>> >>>>>>> There are two approaches, with different up/downsides >>>>>>> >>>>>>> 1) continue to use the old method, and use the new method only when >>>>>>> absolutely required. This will function, but on old toolstacks, >>>>>>> suffer >>>>>>> migration/suspend failures when the toolstack fails to find the p2m. >>>>>>> >>>>>>> 2) Provide a Xen feature flag indicating the presence of N-level p2m >>>>>>> support. Guests which can see this flag are free to use N-level, and >>>>>>> guests which can't are not. >>>>>>> >>>>>>> Ultimately, giving more than 512GB to a current 64bit PV domain is >>>>>>> not >>>>>>> going to work, and the choice above depends on which failure mode you >>>>>>> wish a new/old mix to have. >>>>>> >>>>>> I'd prefer solution 1), as it will enable Dom0 with more than 512 GB >>>>>> without requiring a change of any Xen component. 
Additionally large >>>>>> domains can be started by users who don't care for migrating or >>>>>> suspending them. >>>>>> >>>>>> So I'd rather keep my patch as posted. >>>>> >>>>> PV guests can have extra memory added, beyond their initial limit. >>>>> Supporting this would require option 2. >>>> >>>> I don't see why this should require option 2. >>> >>> Um... >>> >>>> Option 1 only prohibits suspending/migrating a domain with more than >>>> 512 GB. >>> >>> ...this is the reason. >>> >>> With the exception of VMs that have assigned direct access to hardware, >>> migration is an essential feature and must be supported. >> >> So you'd prefer: >> >> 1) >512GB pv-domains (including Dom0) will be supported only with new >> Xen (4.6?), no matter if the user requires migration to be supported > > Yes. >512 GiB and not being able to migrate are not obviously related > from the point of view of the end user (unlike assigning a PCI device). > > Failing at domain save time is most likely too late for the end user. What would you think about following compromise: We add a flag that indicates support of multi-level p2m. Additionally the Linux kernel can ignore the flag not being set either if started as Dom0 or if told so via kernel parameter. > >> to: >> >> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can >> be started on current Xen versions, migration is possible only if Xen >> is new (4.6?) > > There's also my preferred option: > > 3) >512 GiB PV domains are not supported. Large guests must be PVH or > PVHVM. In theory okay, but not right now, I think. PVH Dom0 is not production ready. Juergen > >> What is the common use case for migration? I doubt it is used very often >> for really huge domains. > > XenServer uses it for server pool upgrades with no VM downtime. > > Also, today's huge VM is tomorrow's common size. 
> > David ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-16 3:52 ` Juergen Gross @ 2014-09-16 10:14 ` David Vrabel 2014-09-16 10:38 ` Juergen Gross 0 siblings, 1 reply; 23+ messages in thread From: David Vrabel @ 2014-09-16 10:14 UTC (permalink / raw) To: Juergen Gross, David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 16/09/14 04:52, Juergen Gross wrote: > On 09/15/2014 04:30 PM, David Vrabel wrote: >> On 15/09/14 11:46, Juergen Gross wrote: >>> So you'd prefer: >>> >>> 1) >512GB pv-domains (including Dom0) will be supported only with new >>> Xen (4.6?), no matter if the user requires migration to be supported >> >> Yes. >512 GiB and not being able to migrate are not obviously related >> from the point of view of the end user (unlike assigning a PCI device). >> >> Failing at domain save time is most likely too late for the end user. > > What would you think about following compromise: > > We add a flag that indicates support of multi-level p2m. Additionally > the Linux kernel can ignore the flag not being set either if started as > Dom0 or if told so via kernel parameter. This sounds fine but this override should be via the command line parameter only. Crash dump analysis tools may not understand the 4 level p2m. >>> to: >>> >>> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can >>> be started on current Xen versions, migration is possible only if >>> Xen >>> is new (4.6?) >> >> There's also my preferred option: >> >> 3) >512 GiB PV domains are not supported. Large guests must be PVH or >> PVHVM. > > In theory okay, but not right now, I think. PVH Dom0 is not production > ready. I'm not really seeing the need for such a large dom0. I remain unconvinced that there are sufficient use cases to justify extending the PV only ABI and increasing complexity of the current 3-level p2m code. 
I'm concerned that 4-level p2m support will impact the performance of guests that do not need the 4 levels. It may be necessary to use the alternatives mechanism to select the correct low-level lookup function. I also think a flat array for the p2m might be better (less complex). There's plenty of virtual address space in a 64-bit guest to allow for this. David ^ permalink raw reply [flat|nested] 23+ messages in thread
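The two variants of the compromise discussed above differ only in who may override a missing feature flag. A minimal sketch of that decision, assuming a boolean "Xen advertises multi-level p2m support" input; the enum, the function name and the parameters are invented for illustration and are not taken from Xen or Linux:

```c
#include <stdbool.h>

/* Which override rule applies when Xen does not advertise support. */
enum override_policy {
    OVERRIDE_DOM0_OR_CMDLINE,  /* proposal: Dom0 or kernel parameter */
    OVERRIDE_CMDLINE_ONLY      /* counter-proposal: kernel parameter only */
};

/*
 * May the guest kernel build a p2m tree with more than 3 levels?
 * All inputs are assumptions of this sketch, not existing kernel state.
 */
static bool may_use_multilevel_p2m(bool xen_advertises_support,
                                   bool is_dom0,
                                   bool cmdline_override,
                                   enum override_policy policy)
{
    if (xen_advertises_support)
        return true;           /* tools are known to cope */
    if (cmdline_override)
        return true;           /* admin explicitly accepted the risk */
    return policy == OVERRIDE_DOM0_OR_CMDLINE && is_dom0;
}
```

Under the command-line-only rule a Dom0 on an older Xen stays at 3 levels unless the administrator opts in, which is what keeps crash dump tools from meeting an unexpected 4 level tree.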
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-16 10:14 ` David Vrabel @ 2014-09-16 10:38 ` Juergen Gross 2014-09-16 11:56 ` David Vrabel 0 siblings, 1 reply; 23+ messages in thread From: Juergen Gross @ 2014-09-16 10:38 UTC (permalink / raw) To: David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/16/2014 12:14 PM, David Vrabel wrote: > On 16/09/14 04:52, Juergen Gross wrote: >> On 09/15/2014 04:30 PM, David Vrabel wrote: >>> On 15/09/14 11:46, Juergen Gross wrote: >>>> So you'd prefer: >>>> >>>> 1) >512GB pv-domains (including Dom0) will be supported only with new >>>> Xen (4.6?), no matter if the user requires migration to be supported >>> >>> Yes. >512 GiB and not being able to migrate are not obviously related >>> from the point of view of the end user (unlike assigning a PCI device). >>> >>> Failing at domain save time is most likely too late for the end user. >> >> What would you think about following compromise: >> >> We add a flag that indicates support of multi-level p2m. Additionally >> the Linux kernel can ignore the flag not being set either if started as >> Dom0 or if told so via kernel parameter. > > This sounds fine but this override should be via the command line > parameter only. Crash dump analysis tools may not understand the 4 > level p2m. > >>>> to: >>>> >>>> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can >>>> be started on current Xen versions, migration is possible only if >>>> Xen >>>> is new (4.6?) >>> >>> There's also my preferred option: >>> >>> 3) >512 GiB PV domains are not supported. Large guests must be PVH or >>> PVHVM. >> >> In theory okay, but not right now, I think. PVH Dom0 is not production >> ready. > > I'm not really seeing the need for such a large dom0. Okay, then I'd come back to V1 of my patches. 
This is the minimum required to be able to boot up a system with Xen and more than 512GB memory without having to reduce the Dom0 memory via a Xen boot parameter. Otherwise the hypervisor-built mfn_list mapped into the initial address space will be too large. And no, I don't think setting the boot parameter is the solution here. Dom0 should be usable on a huge machine without special parameters. > > I remain unconvinced that there are sufficient use cases to justify > extending the PV only ABI and increasing complexity of the current > 3-level p2m code. > > I'm concerned that 4-level p2m support will impact the performance of > guests that do not need the 4 levels. It may be necessary to use the > alternatives mechanism to select the correct low-level lookup function. I'll try to get some numbers to post together with a patch. > I also think a flat array for the p2m might be better (less complex). > There's plenty of virtual address space in a 64-bit guest to allow for this. Hmm, do you think we could reserve an area of many GBs for Xen in virtual space? I suspect this would be rejected as another "Xen-ism". BTW: the mfn_list_list will still be required to be built as a tree. Juergen ^ permalink raw reply [flat|nested] 23+ messages in thread
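To put numbers on the "tree versus flat array" question: a tree lookup splits the pfn into one index per level, while a flat array is indexed by the pfn directly but needs one contiguous virtual range of 8 bytes per guest page. A rough sketch, assuming 4 KiB pages and 8-byte entries (64 bit guest); the function names and constants are hypothetical and not kernel or Xen symbols:

```c
#include <stdint.h>

#define P2M_ENTRIES    512u     /* 4 KiB page / 8-byte entry */
#define P2M_PAGE_BYTES 4096ULL
#define P2M_ENTRY_SIZE 8ULL

/*
 * Split a pfn into per-level indices for a tree with `levels` levels.
 * idx[0] is the index into the top level page, idx[levels - 1] the
 * index into the leaf page.
 */
static void p2m_tree_indices(uint64_t pfn, unsigned int levels,
                             unsigned int idx[])
{
    unsigned int l;

    for (l = levels; l-- > 0; ) {
        idx[l] = (unsigned int)(pfn % P2M_ENTRIES);
        pfn /= P2M_ENTRIES;
    }
}

/* Virtual address space a flat p2m array needs for `ram_bytes` of RAM. */
static uint64_t flat_p2m_bytes(uint64_t ram_bytes)
{
    return (ram_bytes / P2M_PAGE_BYTES) * P2M_ENTRY_SIZE;
}
```

By this estimate a flat table covering 512 GiB of RAM occupies 1 GiB of virtual address space and one covering 4 TiB occupies 8 GiB, which is the "many GBs" reservation asked about above.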
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-16 10:38 ` Juergen Gross @ 2014-09-16 11:56 ` David Vrabel 2014-09-16 12:44 ` Juergen Gross 0 siblings, 1 reply; 23+ messages in thread From: David Vrabel @ 2014-09-16 11:56 UTC (permalink / raw) To: Juergen Gross, David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 16/09/14 11:38, Juergen Gross wrote: > On 09/16/2014 12:14 PM, David Vrabel wrote: >> On 16/09/14 04:52, Juergen Gross wrote: >>> On 09/15/2014 04:30 PM, David Vrabel wrote: >>>> On 15/09/14 11:46, Juergen Gross wrote: >>>>> So you'd prefer: >>>>> >>>>> 1) >512GB pv-domains (including Dom0) will be supported only with new >>>>> Xen (4.6?), no matter if the user requires migration to be >>>>> supported >>>> >>>> Yes. >512 GiB and not being able to migrate are not obviously related >>>> from the point of view of the end user (unlike assigning a PCI device). >>>> >>>> Failing at domain save time is most likely too late for the end user. >>> >>> What would you think about following compromise: >>> >>> We add a flag that indicates support of multi-level p2m. Additionally >>> the Linux kernel can ignore the flag not being set either if started as >>> Dom0 or if told so via kernel parameter. >> >> This sounds fine but this override should be via the command line >> parameter only. Crash dump analysis tools may not understand the 4 >> level p2m. >> >>>>> to: >>>>> >>>>> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw >>>>> access) can >>>>> be started on current Xen versions, migration is possible only if >>>>> Xen >>>>> is new (4.6?) >>>> >>>> There's also my preferred option: >>>> >>>> 3) >512 GiB PV domains are not supported. Large guests must be PVH or >>>> PVHVM. >>> >>> In theory okay, but not right now, I think. PVH Dom0 is not production >>> ready. >> >> I'm not really seeing the need for such a large dom0. > > Okay, then I'd come back to V1 of my patches. 
This is the minimum > required to be able to boot up a system with Xen and more than 512GB > memory without having to reduce the Dom0 memory via Xen boot parameter. > > Otherwise the hypervisor built mfn_list mapped into the initial address > space will be too large. > > And no, I don't think setting the boot parameter is the solution here. > Dom0 should be usable on a huge machine without special parameters. Ok. The case where dom0's p2m format matters is pretty specialized. >> I also think a flat array for the p2m might be better (less complex). >> There's plenty of virtual address space in a 64-bit guest to allow for >> this. > > Hmm, do you think we could reserve an area of many GBs for Xen in > virtual space? I suspect this would be rejected as another "Xen-ism". alloc_vm_area() > BTW: the mfn_list_list will still be required to be built as a tree. The tools could be given the guest virtual address and walk the guest page tables. This is probably too much of a difference from the existing ABI to be worth pursuing at this point. David ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree 2014-09-16 11:56 ` David Vrabel @ 2014-09-16 12:44 ` Juergen Gross 2014-09-17 4:25 ` Juergen Gross 0 siblings, 1 reply; 23+ messages in thread From: Juergen Gross @ 2014-09-16 12:44 UTC (permalink / raw) To: David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel On 09/16/2014 01:56 PM, David Vrabel wrote: > On 16/09/14 11:38, Juergen Gross wrote: >> On 09/16/2014 12:14 PM, David Vrabel wrote: >>> On 16/09/14 04:52, Juergen Gross wrote: >>>> On 09/15/2014 04:30 PM, David Vrabel wrote: >>>>> On 15/09/14 11:46, Juergen Gross wrote: >>>>>> So you'd prefer: >>>>>> >>>>>> 1) >512GB pv-domains (including Dom0) will be supported only with new >>>>>> Xen (4.6?), no matter if the user requires migration to be >>>>>> supported >>>>> >>>>> Yes. >512 GiB and not being able to migrate are not obviously related >>>>> from the point of view of the end user (unlike assigning a PCI device). >>>>> >>>>> Failing at domain save time is most likely too late for the end user. >>>> >>>> What would you think about following compromise: >>>> >>>> We add a flag that indicates support of multi-level p2m. Additionally >>>> the Linux kernel can ignore the flag not being set either if started as >>>> Dom0 or if told so via kernel parameter. >>> >>> This sounds fine but this override should be via the command line >>> parameter only. Crash dump analysis tools may not understand the 4 >>> level p2m. >>> >>>>>> to: >>>>>> >>>>>> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw >>>>>> access) can >>>>>> be started on current Xen versions, migration is possible only if >>>>>> Xen >>>>>> is new (4.6?) >>>>> >>>>> There's also my preferred option: >>>>> >>>>> 3) >512 GiB PV domains are not supported. Large guests must be PVH or >>>>> PVHVM. >>>> >>>> In theory okay, but not right now, I think. PVH Dom0 is not production >>>> ready. 
>>>
>>> I'm not really seeing the need for such a large dom0.
>>
>> Okay, then I'd come back to V1 of my patches. This is the minimum
>> required to be able to boot up a system with Xen and more than 512GB
>> memory without having to reduce the Dom0 memory via Xen boot parameter.
>>
>> Otherwise the hypervisor built mfn_list mapped into the initial address
>> space will be too large.
>>
>> And no, I don't think setting the boot parameter is the solution here.
>> Dom0 should be usable on a huge machine without special parameters.
>
> Ok. The case where dom0's p2m format matters is pretty specialized.
>
>>> I also think a flat array for the p2m might be better (less complex).
>>> There's plenty of virtual address space in a 64-bit guest to allow for
>>> this.
>>
>> Hmm, do you think we could reserve an area of many GBs for Xen in
>> virtual space? I suspect this would be rejected as another "Xen-ism".
>
> alloc_vm_area()

Nice idea, but alloc_vm_area() allocates ptes for the whole area.
__get_vm_area() would be better, I think.

>
>> BTW: the mfn_list_list will still be required to be built as a tree.
>
> The tools could be given the guest virtual address and walk the guest
> page tables.
>
> This is probably too much of a difference from the existing ABI to be
> worth pursuing at this point.

Okay, coming back to the main question:

What to do regarding support of >512GB domains:

1. we need another level of the p2m map
2. we are trying the linear p2m table
   a) with a 4 level mfn_list_list
   b) with access to the p2m table via page tables
3. my V1 patches are okay, as they enable Dom0 to start on machines
   with huge memory

Juergen
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree
  2014-09-16 12:44 ` Juergen Gross
@ 2014-09-17  4:25   ` Juergen Gross
From: Juergen Gross @ 2014-09-17 4:25 UTC
To: David Vrabel, Andrew Cooper, ian.campbell, ian.jackson, jbeulich, keir, tim, xen-devel

On 09/16/2014 02:44 PM, Juergen Gross wrote:
> On 09/16/2014 01:56 PM, David Vrabel wrote:
>> On 16/09/14 11:38, Juergen Gross wrote:
>>> On 09/16/2014 12:14 PM, David Vrabel wrote:
>>>> On 16/09/14 04:52, Juergen Gross wrote:
>>>>> On 09/15/2014 04:30 PM, David Vrabel wrote:
>>>>>> On 15/09/14 11:46, Juergen Gross wrote:
>>>>>>> So you'd prefer:
>>>>>>>
>>>>>>> 1) >512GB pv-domains (including Dom0) will be supported only with
>>>>>>>    new Xen (4.6?), no matter if the user requires migration to be
>>>>>>>    supported
>>>>>>
>>>>>> Yes. >512 GiB and not being able to migrate are not obviously
>>>>>> related from the point of view of the end user (unlike assigning a
>>>>>> PCI device).
>>>>>>
>>>>>> Failing at domain save time is most likely too late for the end user.
>>>>>
>>>>> What would you think about following compromise:
>>>>>
>>>>> We add a flag that indicates support of multi-level p2m. Additionally
>>>>> the Linux kernel can ignore the flag not being set either if
>>>>> started as Dom0 or if told so via kernel parameter.
>>>>
>>>> This sounds fine but this override should be via the command line
>>>> parameter only. Crash dump analysis tools may not understand the 4
>>>> level p2m.
>>>>
>>>>>>> to:
>>>>>>>
>>>>>>> 2) >512GB pv-domains (especially Dom0 and VMs with direct hw
>>>>>>>    access) can be started on current Xen versions, migration is
>>>>>>>    possible only if Xen is new (4.6?)
>>>>>>
>>>>>> There's also my preferred option:
>>>>>>
>>>>>> 3) >512 GiB PV domains are not supported. Large guests must be
>>>>>>    PVH or PVHVM.
>>>>>
>>>>> In theory okay, but not right now, I think. PVH Dom0 is not production
>>>>> ready.
>>>>
>>>> I'm not really seeing the need for such a large dom0.
>>>
>>> Okay, then I'd come back to V1 of my patches. This is the minimum
>>> required to be able to boot up a system with Xen and more than 512GB
>>> memory without having to reduce the Dom0 memory via Xen boot parameter.
>>>
>>> Otherwise the hypervisor built mfn_list mapped into the initial address
>>> space will be too large.
>>>
>>> And no, I don't think setting the boot parameter is the solution here.
>>> Dom0 should be usable on a huge machine without special parameters.
>>
>> Ok. The case where dom0's p2m format matters is pretty specialized.
>>
>>>> I also think a flat array for the p2m might be better (less complex).
>>>> There's plenty of virtual address space in a 64-bit guest to allow for
>>>> this.
>>>
>>> Hmm, do you think we could reserve an area of many GBs for Xen in
>>> virtual space? I suspect this would be rejected as another "Xen-ism".
>>
>> alloc_vm_area()
>
> Nice idea, but alloc_vm_area() allocates ptes for the whole area.
> __get_vm_area() would be better, I think.
>
>>
>>> BTW: the mfn_list_list will still be required to be built as a tree.
>>
>> The tools could be given the guest virtual address and walk the guest
>> page tables.
>>
>> This is probably too much of a difference from the existing ABI to be
>> worth pursuing at this point.
>
> Okay, coming back to the main question:
>
> What to do regarding support of >512GB domains:
>
> 1. we need another level of the p2m map
> 2. we are trying the linear p2m table
>    a) with a 4 level mfn_list_list
>    b) with access to the p2m table via page tables
> 3. my V1 patches are okay, as they enable Dom0 to start on machines
>    with huge memory

I thought a little bit more about this. I like the idea of using the
virtually mapped linear p2m list.
It would remove the need to build the p2m tree at an early boot stage,
as the initial mfn_list supplied by the hypervisor can be used until the
kernel builds its own list. I'll try to create a patch doing this.

As this does not affect the initial mapping of initrd and mfn_list, I've
posted V2 of my patches to eliminate some of the limitations of those
initial mappings.

Whether the mfn_list_list should be kept as a tree, or (if a flag
indicates that this is supported) the tools should instead reach the p2m
list via a page table walk, can be decided later.

Juergen
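To make the difference between the two layouts discussed in this message concrete, here is a minimal sketch in C. It is not taken from Xen or Linux: the names `p2m_tree_index`, `p2m_flat_lookup` and `P2M_INVALID_MFN` are hypothetical, and it assumes 4 KiB pages with 8-byte entries (512 entries per page), as on x86-64. With a tree, a pfn has to be split into one index per level; with a virtually mapped linear list, a lookup is a single array access.

```c
#include <stddef.h>
#include <stdint.h>

#define P2M_ENTRIES_PER_PAGE 512u           /* assumed: 4 KiB page / 8-byte entry */
#define P2M_INVALID_MFN      (~(uint64_t)0) /* hypothetical "no mapping" marker */

/*
 * Tree layout: return the index into the page at the given level for a pfn.
 * Level 0 is the leaf page that holds the mfns themselves.
 */
size_t p2m_tree_index(uint64_t pfn, unsigned int level)
{
    while (level-- > 0)
        pfn /= P2M_ENTRIES_PER_PAGE;
    return (size_t)(pfn % P2M_ENTRIES_PER_PAGE);
}

/*
 * Linear layout: the whole p2m is one virtually contiguous array, so a
 * lookup is a bounds check plus a single array access.
 */
uint64_t p2m_flat_lookup(const uint64_t *p2m, uint64_t nr_pfns, uint64_t pfn)
{
    return pfn < nr_pfns ? p2m[pfn] : P2M_INVALID_MFN;
}
```

In the linear case the page tables of the guest do the work that the extra tree levels would otherwise do, which is why the tools would need to walk the guest page tables to find the list.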
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree
  2014-09-09  9:58 ` [PATCH V3 1/1] expand x86 arch_shared_info to " Juergen Gross
  2014-09-09 10:27   ` Andrew Cooper
@ 2014-09-30  8:53   ` Jan Beulich
       [not found]   ` <542A8B93020000780003AE7B@suse.com>
From: Jan Beulich @ 2014-09-30 8:53 UTC
To: David Vrabel, Juergen Gross
Cc: keir, tim, ian.jackson, ian.campbell, xen-devel

>>> On 09.09.14 at 11:58, <"jgross@suse.com".non-mime.internet> wrote:
> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
> currently contains the mfn of the top level page frame of the 3 level
> p2m tree, which is used by the Xen tools during saving and restoring
> (and live migration) of pv domains. With three levels of the p2m tree
> it is possible to support up to 512 GB of RAM for a 64 bit pv domain.
> A 32 bit pv domain can support more, as each memory page can hold 1024
> instead of 512 entries, leading to a limit of 4 TB. To be able to
> support more RAM on x86-64 an additional level is to be added.
>
> This patch expands struct arch_shared_info with a new p2m tree root
> and the number of levels of the p2m tree. The new information is
> indicated by the domain to be valid by storing ~0UL into
> pfn_to_mfn_frame_list_list (this should be done only if more than
> three levels are needed, of course).
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Still having this in my to-be-committed-or-otherwise list, David -
you had reservations here. Did they get addressed by Jürgen?
Is there any alternative proposal? Or are we setting this aside
until after 4.5?
Thanks, Jan

> ---
>  xen/include/public/arch-x86/xen.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
> index f35804b..2ca996c 100644
> --- a/xen/include/public/arch-x86/xen.h
> +++ b/xen/include/public/arch-x86/xen.h
> @@ -224,7 +224,12 @@ struct arch_shared_info {
>      /* Frame containing list of mfns containing list of mfns containing p2m. */
>      xen_pfn_t     pfn_to_mfn_frame_list_list;
>      unsigned long nmi_reason;
> -    uint64_t pad[32];
> +    /*
> +     * Following two fields are valid if pfn_to_mfn_frame_list_list contains
> +     * ~0UL.
> +     */
> +    unsigned long p2m_levels;           /* number of levels of p2m tree */
> +    xen_pfn_t     p2m_root;             /* p2m tree top level mfn */
>  };
>  typedef struct arch_shared_info arch_shared_info_t;
>
> --
> 1.8.4.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
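As a reading aid for the quoted diff, the following is a minimal sketch of how a consumer such as the save/restore tools might pick the p2m root under the proposed convention. It is an assumption-laden illustration, not code from the Xen tree: `struct arch_shared_info_sketch` is a cut-down stand-in that keeps only the three fields involved, `xen_pfn_t` is simplified to `unsigned long`, and `struct p2m_desc` and `p2m_describe` are hypothetical names.

```c
#include <stdint.h>

typedef unsigned long xen_pfn_t; /* simplified stand-in for the Xen typedef */

/* Cut-down stand-in: only the fields the proposed convention touches. */
struct arch_shared_info_sketch {
    xen_pfn_t     pfn_to_mfn_frame_list_list;
    unsigned long p2m_levels; /* valid only if the field above is ~0UL */
    xen_pfn_t     p2m_root;   /* valid only if the field above is ~0UL */
};

/* Hypothetical result type: where the tree starts and how deep it is. */
struct p2m_desc {
    xen_pfn_t     root;
    unsigned long levels;
};

/*
 * If the legacy field holds the marker ~0UL, use the new fields;
 * otherwise fall back to the existing 3 level tree.
 */
struct p2m_desc p2m_describe(const struct arch_shared_info_sketch *info)
{
    struct p2m_desc desc;

    if (info->pfn_to_mfn_frame_list_list == ~0UL) {
        desc.root = info->p2m_root;
        desc.levels = info->p2m_levels;
    } else {
        desc.root = info->pfn_to_mfn_frame_list_list;
        desc.levels = 3;
    }
    return desc;
}
```

The marker value keeps old guests working unchanged, since they never store ~0UL in the legacy field; tools that do not know the marker would, however, misread it as an mfn.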
[parent not found: <542A8B93020000780003AE7B@suse.com>]
* Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree
       [not found]   ` <542A8B93020000780003AE7B@suse.com>
@ 2014-09-30  8:59     ` Juergen Gross
From: Juergen Gross @ 2014-09-30 8:59 UTC
To: Jan Beulich, David Vrabel
Cc: keir, tim, ian.jackson, ian.campbell, xen-devel

On 09/30/2014 10:53 AM, Jan Beulich wrote:
>>>> On 09.09.14 at 11:58, <"jgross@suse.com".non-mime.internet> wrote:
>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
>> currently contains the mfn of the top level page frame of the 3 level
>> p2m tree, which is used by the Xen tools during saving and restoring
>> (and live migration) of pv domains. With three levels of the p2m tree
>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain.
>> A 32 bit pv domain can support more, as each memory page can hold 1024
>> instead of 512 entries, leading to a limit of 4 TB. To be able to
>> support more RAM on x86-64 an additional level is to be added.
>>
>> This patch expands struct arch_shared_info with a new p2m tree root
>> and the number of levels of the p2m tree. The new information is
>> indicated by the domain to be valid by storing ~0UL into
>> pfn_to_mfn_frame_list_list (this should be done only if more than
>> three levels are needed, of course).
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>
> Still having this in my to-be-committed-or-otherwise list, David -
> you had reservations here. Did they get addressed by Jürgen?
> Is there any alternative proposal? Or are we setting this aside
> until after 4.5?

David made the alternative proposal of using a virtually mapped linear
mfn_list, with the tools doing the translation as needed. I'm currently
working on this, so please ignore my patch.

Juergen
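The size limits quoted from the patch description follow directly from the number of entries per page frame. The sketch below reproduces that arithmetic; it assumes 4 KiB pages and binary units (so "512 GB" means 512 GiB), and the names `P2M_PAGE_SIZE` and `p2m_max_bytes` are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

#define P2M_PAGE_SIZE 4096ULL /* assumed x86 page size in bytes */

/*
 * Maximum guest RAM in bytes that a p2m tree with the given number of
 * levels can describe, when every page holds P2M_PAGE_SIZE / entry_size
 * entries and each leaf entry maps one page.
 */
uint64_t p2m_max_bytes(unsigned int levels, unsigned int entry_size)
{
    uint64_t entries_per_page = P2M_PAGE_SIZE / entry_size;
    uint64_t pages = 1;

    for (unsigned int i = 0; i < levels; i++)
        pages *= entries_per_page;

    return pages * P2M_PAGE_SIZE;
}
```

With 8-byte entries (64 bit) three levels give 512^3 pages of 4 KiB, i.e. 512 GiB; with 4-byte entries (32 bit) three levels give 1024^3 pages, i.e. 4 TiB; a fourth level on 64 bit would raise the limit to 256 TiB.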
Thread overview: 23+ messages
2014-09-09 9:58 [PATCH V3 0/1] support >3 level p2m tree Juergen Gross
2014-09-09 9:58 ` [PATCH V3 1/1] expand x86 arch_shared_info to " Juergen Gross
2014-09-09 10:27 ` Andrew Cooper
2014-09-09 10:49 ` Juergen Gross
2014-09-12 10:31 ` Juergen Gross
2014-09-15 8:29 ` Andrew Cooper
2014-09-15 8:52 ` Juergen Gross
2014-09-15 9:42 ` Jan Beulich
2014-09-15 9:48 ` Juergen Gross
2014-09-15 9:44 ` David Vrabel
2014-09-15 9:52 ` Juergen Gross
2014-09-15 10:30 ` David Vrabel
2014-09-15 10:46 ` Juergen Gross
2014-09-15 11:29 ` Jan Beulich
2014-09-15 14:30 ` David Vrabel
2014-09-16 3:52 ` Juergen Gross
2014-09-16 10:14 ` David Vrabel
2014-09-16 10:38 ` Juergen Gross
2014-09-16 11:56 ` David Vrabel
2014-09-16 12:44 ` Juergen Gross
2014-09-17 4:25 ` Juergen Gross
2014-09-30 8:53 ` Jan Beulich
[not found] ` <542A8B93020000780003AE7B@suse.com>
2014-09-30 8:59 ` Juergen Gross