* [PATCH 0/4] support guest virtual mapped p2m list
@ 2014-11-14 9:37 Juergen Gross
2014-11-14 9:37 ` [PATCH 1/4] expand x86 arch_shared_info to support linear " Juergen Gross
` (3 more replies)
0 siblings, 4 replies; 25+ messages in thread
From: Juergen Gross @ 2014-11-14 9:37 UTC (permalink / raw)
To: xen-devel, jbeulich, konrad.wilk, david.vrabel; +Cc: Juergen Gross
The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
currently contains the mfn of the top level page frame of the 3 level
p2m tree, which is used by the Xen tools during saving and restoring
(and live migration) of pv domains and for crash dump analysis. With
three levels of the p2m tree it is possible to support up to 512 GB of
RAM for a 64 bit pv domain.
A 32 bit pv domain can support more, as each memory page can hold 1024
instead of 512 entries, leading to a limit of 4 TB.
To be able to support more RAM on x86-64 switch to a virtual mapped
p2m list.
Juergen Gross (4):
expand x86 arch_shared_info to support linear p2m list
introduce arch_get_features()
introduce boot parameter for setting XENFEAT_virtual_p2m
document new boot parameter virt_p2m
docs/misc/xen-command-line.markdown | 22 ++++++++++
xen/arch/arm/domain.c | 5 +++
xen/arch/x86/domain.c | 80 +++++++++++++++++++++++++++++++++++++
xen/common/kernel.c | 22 +---------
xen/include/public/arch-x86/xen.h | 7 +++-
xen/include/public/features.h | 3 ++
xen/include/xen/domain.h | 2 +
7 files changed, 120 insertions(+), 21 deletions(-)
--
2.1.2
^ permalink raw reply [flat|nested] 25+ messages in thread* [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 9:37 [PATCH 0/4] support guest virtual mapped p2m list Juergen Gross @ 2014-11-14 9:37 ` Juergen Gross 2014-11-14 11:41 ` Andrew Cooper 2014-11-21 12:23 ` Jan Beulich 2014-11-14 9:37 ` [PATCH 2/4] introduce arch_get_features() Juergen Gross ` (2 subsequent siblings) 3 siblings, 2 replies; 25+ messages in thread From: Juergen Gross @ 2014-11-14 9:37 UTC (permalink / raw) To: xen-devel, jbeulich, konrad.wilk, david.vrabel; +Cc: Juergen Gross The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list currently contains the mfn of the top level page frame of the 3 level p2m tree, which is used by the Xen tools during saving and restoring (and live migration) of pv domains and for crash dump analysis. With three levels of the p2m tree it is possible to support up to 512 GB of RAM for a 64 bit pv domain. A 32 bit pv domain can support more, as each memory page can hold 1024 instead of 512 entries, leading to a limit of 4 TB. To be able to support more RAM on x86-64 switch to a virtual mapped p2m list. This patch expands struct arch_shared_info with a new p2m list virtual address and the mfn of the page table root. The new information is indicated by the domain to be valid by storing ~0UL into pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this feature by a new flag XENFEAT_virtual_p2m. --- xen/include/public/arch-x86/xen.h | 7 ++++++- xen/include/public/features.h | 3 +++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h index f35804b..b0f85a9 100644 --- a/xen/include/public/arch-x86/xen.h +++ b/xen/include/public/arch-x86/xen.h @@ -224,7 +224,12 @@ struct arch_shared_info { /* Frame containing list of mfns containing list of mfns containing p2m. */ xen_pfn_t pfn_to_mfn_frame_list_list; unsigned long nmi_reason; - uint64_t pad[32]; + /* + * Following two fields are valid if pfn_to_mfn_frame_list_list contains + * ~0UL. + */ + unsigned long p2m_vaddr; /* virtual address of the p2m list */ + unsigned long p2m_as_root; /* mfn of the top level page table */ }; typedef struct arch_shared_info arch_shared_info_t; diff --git a/xen/include/public/features.h b/xen/include/public/features.h index 16d92aa..ff0b82d 100644 --- a/xen/include/public/features.h +++ b/xen/include/public/features.h @@ -99,6 +99,9 @@ #define XENFEAT_grant_map_identity 12 */ +/* x86: guest may specify virtual address of p2m list */ +#define XENFEAT_virtual_p2m 13 + #define XENFEAT_NR_SUBMAPS 1 #endif /* __XEN_PUBLIC_FEATURES_H__ */ -- 2.1.2 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 9:37 ` [PATCH 1/4] expand x86 arch_shared_info to support linear " Juergen Gross @ 2014-11-14 11:41 ` Andrew Cooper 2014-11-14 12:53 ` Juergen Gross 2014-11-21 12:23 ` Jan Beulich 1 sibling, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2014-11-14 11:41 UTC (permalink / raw) To: Juergen Gross, xen-devel, jbeulich, konrad.wilk, david.vrabel On 14/11/14 09:37, Juergen Gross wrote: > The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list > currently contains the mfn of the top level page frame of the 3 level > p2m tree, which is used by the Xen tools during saving and restoring > (and live migration) of pv domains and for crash dump analysis. With > three levels of the p2m tree it is possible to support up to 512 GB of > RAM for a 64 bit pv domain. > > A 32 bit pv domain can support more, as each memory page can hold 1024 > instead of 512 entries, leading to a limit of 4 TB. > > To be able to support more RAM on x86-64 switch to a virtual mapped > p2m list. > > This patch expands struct arch_shared_info with a new p2m list virtual > address and the mfn of the page table root. The new information is > indicated by the domain to be valid by storing ~0UL into > pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this > feature by a new flag XENFEAT_virtual_p2m. How do you envisage this being used? Are you expecting the tools to do manual pagetable walks using xc_map_foreign_xxx() ? ~Andrew > --- > xen/include/public/arch-x86/xen.h | 7 ++++++- > xen/include/public/features.h | 3 +++ > 2 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h > index f35804b..b0f85a9 100644 > --- a/xen/include/public/arch-x86/xen.h > +++ b/xen/include/public/arch-x86/xen.h > @@ -224,7 +224,12 @@ struct arch_shared_info { > /* Frame containing list of mfns containing list of mfns containing p2m. */ > xen_pfn_t pfn_to_mfn_frame_list_list; > unsigned long nmi_reason; > - uint64_t pad[32]; > + /* > + * Following two fields are valid if pfn_to_mfn_frame_list_list contains > + * ~0UL. > + */ > + unsigned long p2m_vaddr; /* virtual address of the p2m list */ > + unsigned long p2m_as_root; /* mfn of the top level page table */ > }; > typedef struct arch_shared_info arch_shared_info_t; > > diff --git a/xen/include/public/features.h b/xen/include/public/features.h > index 16d92aa..ff0b82d 100644 > --- a/xen/include/public/features.h > +++ b/xen/include/public/features.h > @@ -99,6 +99,9 @@ > #define XENFEAT_grant_map_identity 12 > */ > > +/* x86: guest may specify virtual address of p2m list */ > +#define XENFEAT_virtual_p2m 13 > + > #define XENFEAT_NR_SUBMAPS 1 > > #endif /* __XEN_PUBLIC_FEATURES_H__ */ ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 11:41 ` Andrew Cooper @ 2014-11-14 12:53 ` Juergen Gross 2014-11-14 13:56 ` Andrew Cooper 0 siblings, 1 reply; 25+ messages in thread From: Juergen Gross @ 2014-11-14 12:53 UTC (permalink / raw) To: Andrew Cooper, xen-devel, jbeulich, konrad.wilk, david.vrabel On 11/14/2014 12:41 PM, Andrew Cooper wrote: > On 14/11/14 09:37, Juergen Gross wrote: >> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >> currently contains the mfn of the top level page frame of the 3 level >> p2m tree, which is used by the Xen tools during saving and restoring >> (and live migration) of pv domains and for crash dump analysis. With >> three levels of the p2m tree it is possible to support up to 512 GB of >> RAM for a 64 bit pv domain. >> >> A 32 bit pv domain can support more, as each memory page can hold 1024 >> instead of 512 entries, leading to a limit of 4 TB. >> >> To be able to support more RAM on x86-64 switch to a virtual mapped >> p2m list. >> >> This patch expands struct arch_shared_info with a new p2m list virtual >> address and the mfn of the page table root. The new information is >> indicated by the domain to be valid by storing ~0UL into >> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this >> feature by a new flag XENFEAT_virtual_p2m. > > How do you envisage this being used? Are you expecting the tools to do > manual pagetable walks using xc_map_foreign_xxx() ? Yes. Not very different compared to today's mapping via the 3 level p2m tree. Just another entry format, 4 instead of 3 levels and starting at an offset. Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 12:53 ` Juergen Gross @ 2014-11-14 13:56 ` Andrew Cooper 2014-11-14 14:14 ` Jürgen Groß 0 siblings, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2014-11-14 13:56 UTC (permalink / raw) To: Juergen Gross, xen-devel, jbeulich, konrad.wilk, david.vrabel On 14/11/14 12:53, Juergen Gross wrote: > On 11/14/2014 12:41 PM, Andrew Cooper wrote: >> On 14/11/14 09:37, Juergen Gross wrote: >>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>> currently contains the mfn of the top level page frame of the 3 level >>> p2m tree, which is used by the Xen tools during saving and restoring >>> (and live migration) of pv domains and for crash dump analysis. With >>> three levels of the p2m tree it is possible to support up to 512 GB of >>> RAM for a 64 bit pv domain. >>> >>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>> instead of 512 entries, leading to a limit of 4 TB. >>> >>> To be able to support more RAM on x86-64 switch to a virtual mapped >>> p2m list. >>> >>> This patch expands struct arch_shared_info with a new p2m list virtual >>> address and the mfn of the page table root. The new information is >>> indicated by the domain to be valid by storing ~0UL into >>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this >>> feature by a new flag XENFEAT_virtual_p2m. >> >> How do you envisage this being used? Are you expecting the tools to do >> manual pagetable walks using xc_map_foreign_xxx() ? > > Yes. Not very different compared to today's mapping via the 3 level > p2m tree. Just another entry format, 4 instead of 3 levels and starting > at an offset. Yes - David and I were discussing this over lunch, and it is not actually very different. In reality, how likely is it that the pages backing this virtual linear array change? One issue currently is that, during the live part of migration, the toolstack has no way of working out whether the structure of the p2m has changed (intermediate leaves rearranged, or the length increasing). In the case that the VM does change the structure of the p2m under the feet of the toolstack, migration will either blow up in a non-subtle way with a p2m/m2p mismatch, or in a subtle way with the receiving side copying the new p2m over the wrong part of the new domain. I am wondering whether, with this new p2m method, we can take sufficient steps to be able to guarantee mishaps like this can't occur. ~Andrew ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 13:56 ` Andrew Cooper @ 2014-11-14 14:14 ` Jürgen Groß 2014-11-14 14:59 ` Andrew Cooper 0 siblings, 1 reply; 25+ messages in thread From: Jürgen Groß @ 2014-11-14 14:14 UTC (permalink / raw) To: Andrew Cooper, xen-devel, jbeulich, konrad.wilk, david.vrabel On 11/14/2014 02:56 PM, Andrew Cooper wrote: > On 14/11/14 12:53, Juergen Gross wrote: >> On 11/14/2014 12:41 PM, Andrew Cooper wrote: >>> On 14/11/14 09:37, Juergen Gross wrote: >>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>> currently contains the mfn of the top level page frame of the 3 level >>>> p2m tree, which is used by the Xen tools during saving and restoring >>>> (and live migration) of pv domains and for crash dump analysis. With >>>> three levels of the p2m tree it is possible to support up to 512 GB of >>>> RAM for a 64 bit pv domain. >>>> >>>> A 32 bit pv domain can support more, as each memory page can hold 1024 >>>> instead of 512 entries, leading to a limit of 4 TB. >>>> >>>> To be able to support more RAM on x86-64 switch to a virtual mapped >>>> p2m list. >>>> >>>> This patch expands struct arch_shared_info with a new p2m list virtual >>>> address and the mfn of the page table root. The new information is >>>> indicated by the domain to be valid by storing ~0UL into >>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this >>>> feature by a new flag XENFEAT_virtual_p2m. >>> >>> How do you envisage this being used? Are you expecting the tools to do >>> manual pagetable walks using xc_map_foreign_xxx() ? >> >> Yes. Not very different compared to today's mapping via the 3 level >> p2m tree. Just another entry format, 4 instead of 3 levels and starting >> at an offset. > > Yes - David and I were discussing this over lunch, and it is not > actually very different. > > In reality, how likely is it that the pages backing this virtual linear > array change? Very unlikely, I think. But not impossible. > One issue currently is that, during the live part of migration, the > toolstack has no way of working out whether the structure of the p2m has > changed (intermediate leaves rearranged, or the length increasing). > > In the case that the VM does change the structure of the p2m under the > feet of the toolstack, migration will either blow up in a non-subtle way > with a p2m/m2p mismatch, or in a subtle way with the receiving side > copying the new p2m over the wrong part of the new domain. > > I am wondering whether, with this new p2m method, we can take sufficient > steps to be able to guarantee mishaps like this can't occur. This should be easy: I could add a counter in arch_shared_info which is incremented whenever a p2m mapping is being changed. The toolstack could compare the counter values before start and at end of migration and redo the migration (or fail) if they are different. In order to avoid races I would have to increment the counter before and after changing the mapping. Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 14:14 ` Jürgen Groß @ 2014-11-14 14:59 ` Andrew Cooper 2014-11-14 15:32 ` Juergen Gross 0 siblings, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2014-11-14 14:59 UTC (permalink / raw) To: Jürgen Groß, xen-devel, jbeulich, konrad.wilk, david.vrabel On 14/11/14 14:14, Jürgen Groß wrote: > On 11/14/2014 02:56 PM, Andrew Cooper wrote: >> On 14/11/14 12:53, Juergen Gross wrote: >>> On 11/14/2014 12:41 PM, Andrew Cooper wrote: >>>> On 14/11/14 09:37, Juergen Gross wrote: >>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>> currently contains the mfn of the top level page frame of the 3 level >>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>> (and live migration) of pv domains and for crash dump analysis. With >>>>> three levels of the p2m tree it is possible to support up to 512 >>>>> GB of >>>>> RAM for a 64 bit pv domain. >>>>> >>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>> 1024 >>>>> instead of 512 entries, leading to a limit of 4 TB. >>>>> >>>>> To be able to support more RAM on x86-64 switch to a virtual mapped >>>>> p2m list. >>>>> >>>>> This patch expands struct arch_shared_info with a new p2m list >>>>> virtual >>>>> address and the mfn of the page table root. The new information is >>>>> indicated by the domain to be valid by storing ~0UL into >>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of >>>>> this >>>>> feature by a new flag XENFEAT_virtual_p2m. >>>> >>>> How do you envisage this being used? Are you expecting the tools >>>> to do >>>> manual pagetable walks using xc_map_foreign_xxx() ? >>> >>> Yes. Not very different compared to today's mapping via the 3 level >>> p2m tree. Just another entry format, 4 instead of 3 levels and starting >>> at an offset. >> >> Yes - David and I were discussing this over lunch, and it is not >> actually very different. >> >> In reality, how likely is it that the pages backing this virtual linear >> array change? > > Very unlikely, I think. But not impossible. > >> One issue currently is that, during the live part of migration, the >> toolstack has no way of working out whether the structure of the p2m has >> changed (intermediate leaves rearranged, or the length increasing). >> >> In the case that the VM does change the structure of the p2m under the >> feet of the toolstack, migration will either blow up in a non-subtle way >> with a p2m/m2p mismatch, or in a subtle way with the receiving side >> copying the new p2m over the wrong part of the new domain. >> >> I am wondering whether, with this new p2m method, we can take sufficient >> steps to be able to guarantee mishaps like this can't occur. > > This should be easy: I could add a counter in arch_shared_info which is > incremented whenever a p2m mapping is being changed. The toolstack could > compare the counter values before start and at end of migration and redo > the migration (or fail) if they are different. In order to avoid races > I would have to increment the counter before and after changing the > mapping. > That is insufficient I believe. Consider: * Toolstack walks pagetables and maps the frames containing the linear p2m * Live migration starts * VM remaps a frame in the middle of the linear p2m * Live migration continues, but the toolstack has a stale frame in the middle of its view of the p2m. As the p2m is almost never expected to change, I think it might be better to have a flag the toolstack can set to say "The toolstack is peeking at your p2m behind your back - you must not change its structure." Having just thought this through, I think there is also a race condition between a VM changing an entry in the p2m, and the toolstack doing verifications of frames being sent. ~Andrew ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 14:59 ` Andrew Cooper @ 2014-11-14 15:32 ` Juergen Gross 2014-11-14 16:08 ` Andrew Cooper 0 siblings, 1 reply; 25+ messages in thread From: Juergen Gross @ 2014-11-14 15:32 UTC (permalink / raw) To: Andrew Cooper, xen-devel, jbeulich, konrad.wilk, david.vrabel On 11/14/2014 03:59 PM, Andrew Cooper wrote: > On 14/11/14 14:14, Jürgen Groß wrote: >> On 11/14/2014 02:56 PM, Andrew Cooper wrote: >>> On 14/11/14 12:53, Juergen Gross wrote: >>>> On 11/14/2014 12:41 PM, Andrew Cooper wrote: >>>>> On 14/11/14 09:37, Juergen Gross wrote: >>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>> currently contains the mfn of the top level page frame of the 3 level >>>>>> p2m tree, which is used by the Xen tools during saving and restoring >>>>>> (and live migration) of pv domains and for crash dump analysis. With >>>>>> three levels of the p2m tree it is possible to support up to 512 >>>>>> GB of >>>>>> RAM for a 64 bit pv domain. >>>>>> >>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>> 1024 >>>>>> instead of 512 entries, leading to a limit of 4 TB. >>>>>> >>>>>> To be able to support more RAM on x86-64 switch to a virtual mapped >>>>>> p2m list. >>>>>> >>>>>> This patch expands struct arch_shared_info with a new p2m list >>>>>> virtual >>>>>> address and the mfn of the page table root. The new information is >>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of >>>>>> this >>>>>> feature by a new flag XENFEAT_virtual_p2m. >>>>> >>>>> How do you envisage this being used? Are you expecting the tools >>>>> to do >>>>> manual pagetable walks using xc_map_foreign_xxx() ? >>>> >>>> Yes. Not very different compared to today's mapping via the 3 level >>>> p2m tree. Just another entry format, 4 instead of 3 levels and starting >>>> at an offset. >>> >>> Yes - David and I were discussing this over lunch, and it is not >>> actually very different. >>> >>> In reality, how likely is it that the pages backing this virtual linear >>> array change? >> >> Very unlikely, I think. But not impossible. >> >>> One issue currently is that, during the live part of migration, the >>> toolstack has no way of working out whether the structure of the p2m has >>> changed (intermediate leaves rearranged, or the length increasing). >>> >>> In the case that the VM does change the structure of the p2m under the >>> feet of the toolstack, migration will either blow up in a non-subtle way >>> with a p2m/m2p mismatch, or in a subtle way with the receiving side >>> copying the new p2m over the wrong part of the new domain. >>> >>> I am wondering whether, with this new p2m method, we can take sufficient >>> steps to be able to guarantee mishaps like this can't occur. >> >> This should be easy: I could add a counter in arch_shared_info which is >> incremented whenever a p2m mapping is being changed. The toolstack could >> compare the counter values before start and at end of migration and redo >> the migration (or fail) if they are different. In order to avoid races >> I would have to increment the counter before and after changing the >> mapping. >> > > That is insufficient I believe. > > Consider: > > * Toolstack walks pagetables and maps the frames containing the linear p2m > * Live migration starts > * VM remaps a frame in the middle of the linear p2m > * Live migration continues, but the toolstack has a stale frame in the > middle of its view of the p2m. This would be covered by my suggestion. At the end of the memory transfer (with some bogus contents) the toolstack would discover the change of the p2m structure and either fail the migration or start it from the beginning and thus overwriting the bogus frames. > As the p2m is almost never expected to change, I think it might be > better to have a flag the toolstack can set to say "The toolstack is > peeking at your p2m behind your back - you must not change its structure." Be careful here: changes of the structure can be due to two scenarios: - ballooning (invalid entries being populated): this is no problem, as we can stop the ballooning during live migration. - mapping of grant pages e.g. in a stub domain (first map in an area former marked as invalid): you can't stop this, as the stub domain has to do some work. Here a restart of the migration should work, as the p2m structure change can only happen once for each affected p2m page. > Having just thought this through, I think there is also a race condition > between a VM changing an entry in the p2m, and the toolstack doing > verifications of frames being sent. Okay, so the flag you mentioned should just prohibit changes in the p2m list related to memory frames of the affected domain: ballooning up or down, or rearranging the memory layout (does this happen today?). Mapping and unmapping of grant pages should be still allowed. Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 15:32 ` Juergen Gross @ 2014-11-14 16:08 ` Andrew Cooper 2014-11-18 5:33 ` Juergen Gross 0 siblings, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2014-11-14 16:08 UTC (permalink / raw) To: Juergen Gross, xen-devel, jbeulich, konrad.wilk, david.vrabel On 14/11/14 15:32, Juergen Gross wrote: > On 11/14/2014 03:59 PM, Andrew Cooper wrote: >> On 14/11/14 14:14, Jürgen Groß wrote: >>> On 11/14/2014 02:56 PM, Andrew Cooper wrote: >>>> On 14/11/14 12:53, Juergen Gross wrote: >>>>> On 11/14/2014 12:41 PM, Andrew Cooper wrote: >>>>>> On 14/11/14 09:37, Juergen Gross wrote: >>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>> level >>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>> restoring >>>>>>> (and live migration) of pv domains and for crash dump analysis. >>>>>>> With >>>>>>> three levels of the p2m tree it is possible to support up to 512 >>>>>>> GB of >>>>>>> RAM for a 64 bit pv domain. >>>>>>> >>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>> 1024 >>>>>>> instead of 512 entries, leading to a limit of 4 TB. >>>>>>> >>>>>>> To be able to support more RAM on x86-64 switch to a virtual mapped >>>>>>> p2m list. >>>>>>> >>>>>>> This patch expands struct arch_shared_info with a new p2m list >>>>>>> virtual >>>>>>> address and the mfn of the page table root. The new information is >>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of >>>>>>> this >>>>>>> feature by a new flag XENFEAT_virtual_p2m. >>>>>> >>>>>> How do you envisage this being used? Are you expecting the tools >>>>>> to do >>>>>> manual pagetable walks using xc_map_foreign_xxx() ? >>>>> >>>>> Yes. Not very different compared to today's mapping via the 3 level >>>>> p2m tree. Just another entry format, 4 instead of 3 levels and >>>>> starting >>>>> at an offset. >>>> >>>> Yes - David and I were discussing this over lunch, and it is not >>>> actually very different. >>>> >>>> In reality, how likely is it that the pages backing this virtual >>>> linear >>>> array change? >>> >>> Very unlikely, I think. But not impossible. >>> >>>> One issue currently is that, during the live part of migration, the >>>> toolstack has no way of working out whether the structure of the >>>> p2m has >>>> changed (intermediate leaves rearranged, or the length increasing). >>>> >>>> In the case that the VM does change the structure of the p2m under the >>>> feet of the toolstack, migration will either blow up in a >>>> non-subtle way >>>> with a p2m/m2p mismatch, or in a subtle way with the receiving side >>>> copying the new p2m over the wrong part of the new domain. >>>> >>>> I am wondering whether, with this new p2m method, we can take >>>> sufficient >>>> steps to be able to guarantee mishaps like this can't occur. >>> >>> This should be easy: I could add a counter in arch_shared_info which is >>> incremented whenever a p2m mapping is being changed. The toolstack >>> could >>> compare the counter values before start and at end of migration and >>> redo >>> the migration (or fail) if they are different. In order to avoid races >>> I would have to increment the counter before and after changing the >>> mapping. >>> >> >> That is insufficient I believe. >> >> Consider: >> >> * Toolstack walks pagetables and maps the frames containing the >> linear p2m >> * Live migration starts >> * VM remaps a frame in the middle of the linear p2m >> * Live migration continues, but the toolstack has a stale frame in the >> middle of its view of the p2m. > > This would be covered by my suggestion. At the end of the memory > transfer (with some bogus contents) the toolstack would discover the > change of the p2m structure and either fail the migration or start it > from the beginning and thus overwriting the bogus frames. Checking after pause is too late. The content of the p2m is used verify each frame being sent on the wire, so is in active use for the entire duration of live migration. If the toolstack starts verifying frames being sent using information from a stale p2m, the best that can be hoped for is that the toolstack declares that the p2m and m2p are inconsistent and abort the migrate. > >> As the p2m is almost never expected to change, I think it might be >> better to have a flag the toolstack can set to say "The toolstack is >> peeking at your p2m behind your back - you must not change its >> structure." > > Be careful here: changes of the structure can be due to two scenarios: > - ballooning (invalid entries being populated): this is no problem, as > we can stop the ballooning during live migration. > - mapping of grant pages e.g. in a stub domain (first map in an area > former marked as invalid): you can't stop this, as the stub domain > has to do some work. Here a restart of the migration should work, as > the p2m structure change can only happen once for each affected p2m > page. Migration is not at all possible with a domain referencing foreign frames. The live part can cope with foreign frames referenced in the ptes. As part of the pause handling in the VM, the frontends must unmap any grants they have. After pause, any remaining foreign frames cause a migration failure. > >> Having just thought this through, I think there is also a race condition >> between a VM changing an entry in the p2m, and the toolstack doing >> verifications of frames being sent. > > Okay, so the flag you mentioned should just prohibit changes in the > p2m list related to memory frames of the affected domain: ballooning > up or down, or rearranging the memory layout (does this happen today?). > Mapping and unmapping of grant pages should be still allowed. HVM guests doesn't have any of their p2m updates represented in the logdirty bitmap, so ballooning an HVM guest during migrate leads to unexpected holes or lack of holes on the resuming side, leading to a very confused balloon driver. At the time I had not found a problem with PV guests, but it is now clear that there is a period of time when a guest is altering its p2m where the p2m and m2p are out of sync, which will cause a migration failure if the toolstack observes this artefact. ~Andrew ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 16:08 ` Andrew Cooper @ 2014-11-18 5:33 ` Juergen Gross 2014-11-18 10:51 ` Andrew Cooper 0 siblings, 1 reply; 25+ messages in thread From: Juergen Gross @ 2014-11-18 5:33 UTC (permalink / raw) To: Andrew Cooper, xen-devel, jbeulich, konrad.wilk, david.vrabel On 11/14/2014 05:08 PM, Andrew Cooper wrote: > On 14/11/14 15:32, Juergen Gross wrote: >> On 11/14/2014 03:59 PM, Andrew Cooper wrote: >>> On 14/11/14 14:14, Jürgen Groß wrote: >>>> On 11/14/2014 02:56 PM, Andrew Cooper wrote: >>>>> On 14/11/14 12:53, Juergen Gross wrote: >>>>>> On 11/14/2014 12:41 PM, Andrew Cooper wrote: >>>>>>> On 14/11/14 09:37, Juergen Gross wrote: >>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>>> level >>>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>>> restoring >>>>>>>> (and live migration) of pv domains and for crash dump analysis. >>>>>>>> With >>>>>>>> three levels of the p2m tree it is possible to support up to 512 >>>>>>>> GB of >>>>>>>> RAM for a 64 bit pv domain. >>>>>>>> >>>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>>> 1024 >>>>>>>> instead of 512 entries, leading to a limit of 4 TB. >>>>>>>> >>>>>>>> To be able to support more RAM on x86-64 switch to a virtual mapped >>>>>>>> p2m list. >>>>>>>> >>>>>>>> This patch expands struct arch_shared_info with a new p2m list >>>>>>>> virtual >>>>>>>> address and the mfn of the page table root. The new information is >>>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of >>>>>>>> this >>>>>>>> feature by a new flag XENFEAT_virtual_p2m. >>>>>>> >>>>>>> How do you envisage this being used? Are you expecting the tools >>>>>>> to do >>>>>>> manual pagetable walks using xc_map_foreign_xxx() ? >>>>>> >>>>>> Yes. Not very different compared to today's mapping via the 3 level >>>>>> p2m tree. Just another entry format, 4 instead of 3 levels and >>>>>> starting >>>>>> at an offset. >>>>> >>>>> Yes - David and I were discussing this over lunch, and it is not >>>>> actually very different. >>>>> >>>>> In reality, how likely is it that the pages backing this virtual >>>>> linear >>>>> array change? >>>> >>>> Very unlikely, I think. But not impossible. >>>> >>>>> One issue currently is that, during the live part of migration, the >>>>> toolstack has no way of working out whether the structure of the >>>>> p2m has >>>>> changed (intermediate leaves rearranged, or the length increasing). >>>>> >>>>> In the case that the VM does change the structure of the p2m under the >>>>> feet of the toolstack, migration will either blow up in a >>>>> non-subtle way >>>>> with a p2m/m2p mismatch, or in a subtle way with the receiving side >>>>> copying the new p2m over the wrong part of the new domain. >>>>> >>>>> I am wondering whether, with this new p2m method, we can take >>>>> sufficient >>>>> steps to be able to guarantee mishaps like this can't occur. >>>> >>>> This should be easy: I could add a counter in arch_shared_info which is >>>> incremented whenever a p2m mapping is being changed. The toolstack >>>> could >>>> compare the counter values before start and at end of migration and >>>> redo >>>> the migration (or fail) if they are different. In order to avoid races >>>> I would have to increment the counter before and after changing the >>>> mapping. >>>> >>> >>> That is insufficient I believe. >>> >>> Consider: >>> >>> * Toolstack walks pagetables and maps the frames containing the >>> linear p2m >>> * Live migration starts >>> * VM remaps a frame in the middle of the linear p2m >>> * Live migration continues, but the toolstack has a stale frame in the >>> middle of its view of the p2m. >> >> This would be covered by my suggestion. At the end of the memory >> transfer (with some bogus contents) the toolstack would discover the >> change of the p2m structure and either fail the migration or start it >> from the beginning and thus overwriting the bogus frames. > > Checking after pause is too late. The content of the p2m is used verify > each frame being sent on the wire, so is in active use for the entire > duration of live migration. > > If the toolstack starts verifying frames being sent using information > from a stale p2m, the best that can be hoped for is that the toolstack > declares that the p2m and m2p are inconsistent and abort the migrate. > >> >>> As the p2m is almost never expected to change, I think it might be >>> better to have a flag the toolstack can set to say "The toolstack is >>> peeking at your p2m behind your back - you must not change its >>> structure." >> >> Be careful here: changes of the structure can be due to two scenarios: >> - ballooning (invalid entries being populated): this is no problem, as >> we can stop the ballooning during live migration. >> - mapping of grant pages e.g. in a stub domain (first map in an area >> former marked as invalid): you can't stop this, as the stub domain >> has to do some work. Here a restart of the migration should work, as >> the p2m structure change can only happen once for each affected p2m >> page. > > Migration is not at all possible with a domain referencing foreign frames. > > The live part can cope with foreign frames referenced in the ptes. As > part of the pause handling in the VM, the frontends must unmap any > grants they have. After pause, any remaining foreign frames cause a > migration failure. > >> >>> Having just thought this through, I think there is also a race condition >>> between a VM changing an entry in the p2m, and the toolstack doing >>> verifications of frames being sent. >> >> Okay, so the flag you mentioned should just prohibit changes in the >> p2m list related to memory frames of the affected domain: ballooning >> up or down, or rearranging the memory layout (does this happen today?). >> Mapping and unmapping of grant pages should be still allowed. > > HVM guests doesn't have any of their p2m updates represented in the > logdirty bitmap, so ballooning an HVM guest during migrate leads to > unexpected holes or lack of holes on the resuming side, leading to a > very confused balloon driver. > > At the time I had not found a problem with PV guests, but it is now > clear that there is a period of time when a guest is altering its p2m > where the p2m and m2p are out of sync, which will cause a migration > failure if the toolstack observes this artefact. So ballooning should be disabled during migration. I think this should be handled via callbacks triggered by xenstore: one at start of migration to stop ballooning and one at end to restart it. I wouldn't want to tie this functionality to the p2m list structure, as it is not related to it. Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-18 5:33 ` Juergen Gross @ 2014-11-18 10:51 ` Andrew Cooper 2014-11-18 10:56 ` David Vrabel 0 siblings, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2014-11-18 10:51 UTC (permalink / raw) To: Juergen Gross, xen-devel, jbeulich, konrad.wilk, david.vrabel On 18/11/14 05:33, Juergen Gross wrote: > On 11/14/2014 05:08 PM, Andrew Cooper wrote: >> On 14/11/14 15:32, Juergen Gross wrote: >>> On 11/14/2014 03:59 PM, Andrew Cooper wrote: >>>> On 14/11/14 14:14, Jürgen Groß wrote: >>>>> On 11/14/2014 02:56 PM, Andrew Cooper wrote: >>>>>> On 14/11/14 12:53, Juergen Gross wrote: >>>>>>> On 11/14/2014 12:41 PM, Andrew Cooper wrote: >>>>>>>> On 14/11/14 09:37, Juergen Gross wrote: >>>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list >>>>>>>>> currently contains the mfn of the top level page frame of the 3 >>>>>>>>> level >>>>>>>>> p2m tree, which is used by the Xen tools during saving and >>>>>>>>> restoring >>>>>>>>> (and live migration) of pv domains and for crash dump analysis. >>>>>>>>> With >>>>>>>>> three levels of the p2m tree it is possible to support up to 512 >>>>>>>>> GB of >>>>>>>>> RAM for a 64 bit pv domain. >>>>>>>>> >>>>>>>>> A 32 bit pv domain can support more, as each memory page can hold >>>>>>>>> 1024 >>>>>>>>> instead of 512 entries, leading to a limit of 4 TB. >>>>>>>>> >>>>>>>>> To be able to support more RAM on x86-64 switch to a virtual >>>>>>>>> mapped >>>>>>>>> p2m list. >>>>>>>>> >>>>>>>>> This patch expands struct arch_shared_info with a new p2m list >>>>>>>>> virtual >>>>>>>>> address and the mfn of the page table root. The new >>>>>>>>> information is >>>>>>>>> indicated by the domain to be valid by storing ~0UL into >>>>>>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of >>>>>>>>> this >>>>>>>>> feature by a new flag XENFEAT_virtual_p2m. >>>>>>>> >>>>>>>> How do you envisage this being used? Are you expecting the tools >>>>>>>> to do >>>>>>>> manual pagetable walks using xc_map_foreign_xxx() ? >>>>>>> >>>>>>> Yes. Not very different compared to today's mapping via the 3 level >>>>>>> p2m tree. Just another entry format, 4 instead of 3 levels and >>>>>>> starting >>>>>>> at an offset. >>>>>> >>>>>> Yes - David and I were discussing this over lunch, and it is not >>>>>> actually very different. >>>>>> >>>>>> In reality, how likely is it that the pages backing this virtual >>>>>> linear >>>>>> array change? >>>>> >>>>> Very unlikely, I think. But not impossible. >>>>> >>>>>> One issue currently is that, during the live part of migration, the >>>>>> toolstack has no way of working out whether the structure of the >>>>>> p2m has >>>>>> changed (intermediate leaves rearranged, or the length increasing). >>>>>> >>>>>> In the case that the VM does change the structure of the p2m >>>>>> under the >>>>>> feet of the toolstack, migration will either blow up in a >>>>>> non-subtle way >>>>>> with a p2m/m2p mismatch, or in a subtle way with the receiving side >>>>>> copying the new p2m over the wrong part of the new domain. >>>>>> >>>>>> I am wondering whether, with this new p2m method, we can take >>>>>> sufficient >>>>>> steps to be able to guarantee mishaps like this can't occur. >>>>> >>>>> This should be easy: I could add a counter in arch_shared_info >>>>> which is >>>>> incremented whenever a p2m mapping is being changed. The toolstack >>>>> could >>>>> compare the counter values before start and at end of migration and >>>>> redo >>>>> the migration (or fail) if they are different. In order to avoid >>>>> races >>>>> I would have to increment the counter before and after changing the >>>>> mapping. >>>>> >>>> >>>> That is insufficient I believe. >>>> >>>> Consider: >>>> >>>> * Toolstack walks pagetables and maps the frames containing the >>>> linear p2m >>>> * Live migration starts >>>> * VM remaps a frame in the middle of the linear p2m >>>> * Live migration continues, but the toolstack has a stale frame in the >>>> middle of its view of the p2m. >>> >>> This would be covered by my suggestion. At the end of the memory >>> transfer (with some bogus contents) the toolstack would discover the >>> change of the p2m structure and either fail the migration or start it >>> from the beginning and thus overwriting the bogus frames. >> >> Checking after pause is too late. The content of the p2m is used verify >> each frame being sent on the wire, so is in active use for the entire >> duration of live migration. >> >> If the toolstack starts verifying frames being sent using information >> from a stale p2m, the best that can be hoped for is that the toolstack >> declares that the p2m and m2p are inconsistent and abort the migrate. >> >>> >>>> As the p2m is almost never expected to change, I think it might be >>>> better to have a flag the toolstack can set to say "The toolstack is >>>> peeking at your p2m behind your back - you must not change its >>>> structure." >>> >>> Be careful here: changes of the structure can be due to two scenarios: >>> - ballooning (invalid entries being populated): this is no problem, as >>> we can stop the ballooning during live migration. >>> - mapping of grant pages e.g. in a stub domain (first map in an area >>> former marked as invalid): you can't stop this, as the stub domain >>> has to do some work. Here a restart of the migration should work, as >>> the p2m structure change can only happen once for each affected p2m >>> page. >> >> Migration is not at all possible with a domain referencing foreign >> frames. >> >> The live part can cope with foreign frames referenced in the ptes. As >> part of the pause handling in the VM, the frontends must unmap any >> grants they have. After pause, any remaining foreign frames cause a >> migration failure. >> >>> >>>> Having just thought this through, I think there is also a race >>>> condition >>>> between a VM changing an entry in the p2m, and the toolstack doing >>>> verifications of frames being sent. >>> >>> Okay, so the flag you mentioned should just prohibit changes in the >>> p2m list related to memory frames of the affected domain: ballooning >>> up or down, or rearranging the memory layout (does this happen today?). >>> Mapping and unmapping of grant pages should be still allowed. >> >> HVM guests doesn't have any of their p2m updates represented in the >> logdirty bitmap, so ballooning an HVM guest during migrate leads to >> unexpected holes or lack of holes on the resuming side, leading to a >> very confused balloon driver. >> >> At the time I had not found a problem with PV guests, but it is now >> clear that there is a period of time when a guest is altering its p2m >> where the p2m and m2p are out of sync, which will cause a migration >> failure if the toolstack observes this artefact. > > So ballooning should be disabled during migration. I think this should > be handled via callbacks triggered by xenstore: one at start of > migration to stop ballooning and one at end to restart it. I wouldn't > want to tie this functionality to the p2m list structure, as it is > not related to it. It is not just ballooning. It is any change to the p2m whatsoever. This includes mapping/unmapping grants, XENMEM_exchange, and the guest simply changing the p2m layout. I suspect that the only reason this has not been encountered in practice is that noone has attempted migrating a domain which makes use of foreign mappings. It is typically only the backend drivers which map frontend memory, and dom0 doesn't migrate. ~Andrew ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-18 10:51 ` Andrew Cooper @ 2014-11-18 10:56 ` David Vrabel 0 siblings, 0 replies; 25+ messages in thread From: David Vrabel @ 2014-11-18 10:56 UTC (permalink / raw) To: Andrew Cooper, Juergen Gross, xen-devel, jbeulich, konrad.wilk On 18/11/14 10:51, Andrew Cooper wrote: >> > It is not just ballooning. It is any change to the p2m whatsoever. > This includes mapping/unmapping grants, XENMEM_exchange, and the guest > simply changing the p2m layout. Grants don't matter because they're never saved. David ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-14 9:37 ` [PATCH 1/4] expand x86 arch_shared_info to support linear " Juergen Gross 2014-11-14 11:41 ` Andrew Cooper @ 2014-11-21 12:23 ` Jan Beulich 2014-11-21 12:57 ` Juergen Gross 1 sibling, 1 reply; 25+ messages in thread From: Jan Beulich @ 2014-11-21 12:23 UTC (permalink / raw) To: Juergen Gross; +Cc: xen-devel, david.vrabel >>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: > --- a/xen/include/public/arch-x86/xen.h > +++ b/xen/include/public/arch-x86/xen.h > @@ -224,7 +224,12 @@ struct arch_shared_info { > /* Frame containing list of mfns containing list of mfns containing p2m. */ > xen_pfn_t pfn_to_mfn_frame_list_list; > unsigned long nmi_reason; > - uint64_t pad[32]; > + /* > + * Following two fields are valid if pfn_to_mfn_frame_list_list contains > + * ~0UL. > + */ > + unsigned long p2m_vaddr; /* virtual address of the p2m list */ > + unsigned long p2m_as_root; /* mfn of the top level page table */ xen_pfn_t please. And what does the "as" in the name stand for? It's also kind of unclear in the description what "the page table root" is, as I don't think there are many OSes which use just a single set of page tables (i.e. just a single address space). Not having followed the discussion closely - what is this needed for anyway? > --- a/xen/include/public/features.h > +++ b/xen/include/public/features.h > @@ -99,6 +99,9 @@ > #define XENFEAT_grant_map_identity 12 > */ > > +/* x86: guest may specify virtual address of p2m list */ > +#define XENFEAT_virtual_p2m 13 The name to me suggests something that's not real. Perhaps better XENFEAT_virtually_mapped_p2m or XENFEAT_p2m_va{,ddr}? Jan ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-21 12:23 ` Jan Beulich @ 2014-11-21 12:57 ` Juergen Gross 2014-11-21 13:26 ` Andrew Cooper 0 siblings, 1 reply; 25+ messages in thread From: Juergen Gross @ 2014-11-21 12:57 UTC (permalink / raw) To: Jan Beulich Cc: keir@xen.org, Ian.Campbell@citrix.com, Ian Jackson, Tim Deegan, david.vrabel, xen-devel On 11/21/2014 01:23 PM, Jan Beulich wrote: >>>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: >> --- a/xen/include/public/arch-x86/xen.h >> +++ b/xen/include/public/arch-x86/xen.h >> @@ -224,7 +224,12 @@ struct arch_shared_info { >> /* Frame containing list of mfns containing list of mfns containing p2m. */ >> xen_pfn_t pfn_to_mfn_frame_list_list; >> unsigned long nmi_reason; >> - uint64_t pad[32]; >> + /* >> + * Following two fields are valid if pfn_to_mfn_frame_list_list contains >> + * ~0UL. >> + */ >> + unsigned long p2m_vaddr; /* virtual address of the p2m list */ >> + unsigned long p2m_as_root; /* mfn of the top level page table */ > > xen_pfn_t please. And what does the "as" in the name stand for? "as" is address space. I can rename it to e.g. "p2m_pgd_mfn". > It's also kind of unclear in the description what "the page table root" > is, as I don't think there are many OSes which use just a single set > of page tables (i.e. just a single address space). Not having followed > the discussion closely - what is this needed for anyway? It's a replacement of the pfn_to_mfn_frame_list_list using the same page table as the kernel for accessing the p2m list. We need the root of the page table and the virtual address of the p2m list. > >> --- a/xen/include/public/features.h >> +++ b/xen/include/public/features.h >> @@ -99,6 +99,9 @@ >> #define XENFEAT_grant_map_identity 12 >> */ >> >> +/* x86: guest may specify virtual address of p2m list */ >> +#define XENFEAT_virtual_p2m 13 > > The name to me suggests something that's not real. Perhaps better > XENFEAT_virtually_mapped_p2m or XENFEAT_p2m_va{,ddr}? Yeah, that's better. I'll use XENFEAT_p2m_vaddr. Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-21 12:57 ` Juergen Gross @ 2014-11-21 13:26 ` Andrew Cooper 2014-11-21 13:37 ` Jürgen Groß 0 siblings, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2014-11-21 13:26 UTC (permalink / raw) To: Juergen Gross, Jan Beulich Cc: keir@xen.org, Ian.Campbell@citrix.com, Tim Deegan, Ian Jackson, david.vrabel, xen-devel On 21/11/14 12:57, Juergen Gross wrote: > On 11/21/2014 01:23 PM, Jan Beulich wrote: >>>>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: >>> --- a/xen/include/public/arch-x86/xen.h >>> +++ b/xen/include/public/arch-x86/xen.h >>> @@ -224,7 +224,12 @@ struct arch_shared_info { >>> /* Frame containing list of mfns containing list of mfns >>> containing p2m. */ >>> xen_pfn_t pfn_to_mfn_frame_list_list; >>> unsigned long nmi_reason; >>> - uint64_t pad[32]; >>> + /* >>> + * Following two fields are valid if pfn_to_mfn_frame_list_list >>> contains >>> + * ~0UL. >>> + */ >>> + unsigned long p2m_vaddr; /* virtual address of the p2m list */ >>> + unsigned long p2m_as_root; /* mfn of the top level page table */ >> >> xen_pfn_t please. And what does the "as" in the name stand for? > > "as" is address space. I can rename it to e.g. "p2m_pgd_mfn". That is a linuxism in naming, which is also not accurate. >From my understanding, this frame must be an L4 table for 64bit guests, and an L3 table for 32bit guests. I.e. it is effectively a cr3 with which to use the p2m_vaddr field above? ~Andrew ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-21 13:26 ` Andrew Cooper @ 2014-11-21 13:37 ` Jürgen Groß 2014-11-21 14:04 ` Andrew Cooper 2014-11-21 14:07 ` Jan Beulich 0 siblings, 2 replies; 25+ messages in thread From: Jürgen Groß @ 2014-11-21 13:37 UTC (permalink / raw) To: Andrew Cooper, Jan Beulich Cc: keir@xen.org, Ian.Campbell@citrix.com, Tim Deegan, Ian Jackson, david.vrabel, xen-devel On 11/21/2014 02:26 PM, Andrew Cooper wrote: > On 21/11/14 12:57, Juergen Gross wrote: >> On 11/21/2014 01:23 PM, Jan Beulich wrote: >>>>>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: >>>> --- a/xen/include/public/arch-x86/xen.h >>>> +++ b/xen/include/public/arch-x86/xen.h >>>> @@ -224,7 +224,12 @@ struct arch_shared_info { >>>> /* Frame containing list of mfns containing list of mfns >>>> containing p2m. */ >>>> xen_pfn_t pfn_to_mfn_frame_list_list; >>>> unsigned long nmi_reason; >>>> - uint64_t pad[32]; >>>> + /* >>>> + * Following two fields are valid if pfn_to_mfn_frame_list_list >>>> contains >>>> + * ~0UL. >>>> + */ >>>> + unsigned long p2m_vaddr; /* virtual address of the p2m list */ >>>> + unsigned long p2m_as_root; /* mfn of the top level page table */ >>> >>> xen_pfn_t please. And what does the "as" in the name stand for? >> >> "as" is address space. I can rename it to e.g. "p2m_pgd_mfn". > > That is a linuxism in naming, which is also not accurate. > > From my understanding, this frame must be an L4 table for 64bit guests, > and an L3 table for 32bit guests. I.e. it is effectively a cr3 with > which to use the p2m_vaddr field above? Okay, so p2m_cr3 then? Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-21 13:37 ` Jürgen Groß @ 2014-11-21 14:04 ` Andrew Cooper 2014-11-21 14:07 ` Jan Beulich 1 sibling, 0 replies; 25+ messages in thread From: Andrew Cooper @ 2014-11-21 14:04 UTC (permalink / raw) To: Jürgen Groß, Jan Beulich Cc: keir@xen.org, Ian.Campbell@citrix.com, Tim Deegan, Ian Jackson, david.vrabel, xen-devel On 21/11/14 13:37, Jürgen Groß wrote: > On 11/21/2014 02:26 PM, Andrew Cooper wrote: >> On 21/11/14 12:57, Juergen Gross wrote: >>> On 11/21/2014 01:23 PM, Jan Beulich wrote: >>>>>>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: >>>>> --- a/xen/include/public/arch-x86/xen.h >>>>> +++ b/xen/include/public/arch-x86/xen.h >>>>> @@ -224,7 +224,12 @@ struct arch_shared_info { >>>>> /* Frame containing list of mfns containing list of mfns >>>>> containing p2m. */ >>>>> xen_pfn_t pfn_to_mfn_frame_list_list; >>>>> unsigned long nmi_reason; >>>>> - uint64_t pad[32]; >>>>> + /* >>>>> + * Following two fields are valid if pfn_to_mfn_frame_list_list >>>>> contains >>>>> + * ~0UL. >>>>> + */ >>>>> + unsigned long p2m_vaddr; /* virtual address of the p2m >>>>> list */ >>>>> + unsigned long p2m_as_root; /* mfn of the top level page >>>>> table */ >>>> >>>> xen_pfn_t please. And what does the "as" in the name stand for? >>> >>> "as" is address space. I can rename it to e.g. "p2m_pgd_mfn". >> >> That is a linuxism in naming, which is also not accurate. >> >> From my understanding, this frame must be an L4 table for 64bit guests, >> and an L3 table for 32bit guests. I.e. it is effectively a cr3 with >> which to use the p2m_vaddr field above? > > Okay, so p2m_cr3 then? > > I think that is reasonable, along with the comment explaining that it must reference an L4 or L3 frame, depending on the width of the guest. In the case of a 32bit guest, it should be a magic folded pfn, in the same representation that cr3 has in the vcpu register state. ~Andrew ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list 2014-11-21 13:37 ` Jürgen Groß 2014-11-21 14:04 ` Andrew Cooper @ 2014-11-21 14:07 ` Jan Beulich 1 sibling, 0 replies; 25+ messages in thread From: Jan Beulich @ 2014-11-21 14:07 UTC (permalink / raw) To: Juergen Gross Cc: keir@xen.org, Ian.Campbell@citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, david.vrabel, xen-devel >>> On 21.11.14 at 14:37, <JGross@suse.com> wrote: > On 11/21/2014 02:26 PM, Andrew Cooper wrote: >> On 21/11/14 12:57, Juergen Gross wrote: >>> On 11/21/2014 01:23 PM, Jan Beulich wrote: >>>>>>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: >>>>> --- a/xen/include/public/arch-x86/xen.h >>>>> +++ b/xen/include/public/arch-x86/xen.h >>>>> @@ -224,7 +224,12 @@ struct arch_shared_info { >>>>> /* Frame containing list of mfns containing list of mfns >>>>> containing p2m. */ >>>>> xen_pfn_t pfn_to_mfn_frame_list_list; >>>>> unsigned long nmi_reason; >>>>> - uint64_t pad[32]; >>>>> + /* >>>>> + * Following two fields are valid if pfn_to_mfn_frame_list_list >>>>> contains >>>>> + * ~0UL. >>>>> + */ >>>>> + unsigned long p2m_vaddr; /* virtual address of the p2m list */ >>>>> + unsigned long p2m_as_root; /* mfn of the top level page table */ >>>> >>>> xen_pfn_t please. And what does the "as" in the name stand for? >>> >>> "as" is address space. I can rename it to e.g. "p2m_pgd_mfn". >> >> That is a linuxism in naming, which is also not accurate. >> >> From my understanding, this frame must be an L4 table for 64bit guests, >> and an L3 table for 32bit guests. I.e. it is effectively a cr3 with >> which to use the p2m_vaddr field above? > > Okay, so p2m_cr3 then? I'm okay with that, but would think p2m_pt_root would be more generic. Considering PAE, wouldn't this need to be a 64-bit value then even in the 32-bit interface? Or alternatively be represented as MFN instead? Jan ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 2/4] introduce arch_get_features() 2014-11-14 9:37 [PATCH 0/4] support guest virtual mapped p2m list Juergen Gross 2014-11-14 9:37 ` [PATCH 1/4] expand x86 arch_shared_info to support linear " Juergen Gross @ 2014-11-14 9:37 ` Juergen Gross 2014-11-21 12:26 ` Jan Beulich 2014-11-21 13:21 ` Julien Grall 2014-11-14 9:37 ` [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m Juergen Gross 2014-11-14 9:37 ` [PATCH 4/4] document new boot parameter virt_p2m Juergen Gross 3 siblings, 2 replies; 25+ messages in thread From: Juergen Gross @ 2014-11-14 9:37 UTC (permalink / raw) To: xen-devel, jbeulich, konrad.wilk, david.vrabel; +Cc: Juergen Gross The XENVER_get_features sub command of the xen_version hypercall is handled completely in common/kernel.c despite of some architecture dependant parts. Move the architecture dependant parts in an own function in arch/*/domain.c Signed-off-by: Juergen Gross <jgross@suse.com> --- xen/arch/arm/domain.c | 5 +++++ xen/arch/x86/domain.c | 30 ++++++++++++++++++++++++++++++ xen/common/kernel.c | 22 ++-------------------- xen/include/xen/domain.h | 2 ++ 4 files changed, 39 insertions(+), 20 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 7221bc8..dc5a3fb 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -823,6 +823,11 @@ void vcpu_block_unless_event_pending(struct vcpu *v) vcpu_unblock(current); } +uint32_t arch_get_features(struct domain *d, unsigned int submap_idx) +{ + return 0; +} + /* * Local variables: * mode: C diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index ae0a344..d98aabd 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -2166,6 +2166,36 @@ static int __init init_vcpu_kick_softirq(void) } __initcall(init_vcpu_kick_softirq); +uint32_t arch_get_features(struct domain *d, unsigned int submap_idx) +{ + uint32_t submap = 0; + + switch ( submap_idx ) + { + case 0: + switch ( d->guest_type ) + { + case guest_type_pv: + submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) | + (1U << XENFEAT_highmem_assist) | + (1U << XENFEAT_gnttab_map_avail_bits); + break; + case guest_type_pvh: + submap |= (1U << XENFEAT_hvm_safe_pvclock) | + (1U << XENFEAT_supervisor_mode_kernel) | + (1U << XENFEAT_hvm_callback_vector); + break; + case guest_type_hvm: + submap |= (1U << XENFEAT_hvm_safe_pvclock) | + (1U << XENFEAT_hvm_callback_vector) | + (1U << XENFEAT_hvm_pirqs); + break; + } + break; + } + + return submap; +} /* * Local variables: diff --git a/xen/common/kernel.c b/xen/common/kernel.c index d23c422..d22a860 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -312,31 +312,13 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) fi.submap |= 1U << XENFEAT_supervisor_mode_kernel; if ( is_hardware_domain(current->domain) ) fi.submap |= 1U << XENFEAT_dom0; -#ifdef CONFIG_X86 - switch ( d->guest_type ) - { - case guest_type_pv: - fi.submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) | - (1U << XENFEAT_highmem_assist) | - (1U << XENFEAT_gnttab_map_avail_bits); - break; - case guest_type_pvh: - fi.submap |= (1U << XENFEAT_hvm_safe_pvclock) | - (1U << XENFEAT_supervisor_mode_kernel) | - (1U << XENFEAT_hvm_callback_vector); - break; - case guest_type_hvm: - fi.submap |= (1U << XENFEAT_hvm_safe_pvclock) | - (1U << XENFEAT_hvm_callback_vector) | - (1U << XENFEAT_hvm_pirqs); - break; - } -#endif break; default: return -EINVAL; } + fi.submap |= arch_get_features(d, fi.submap_idx); + if ( copy_to_guest(arg, &fi, 1) ) return -EFAULT; return 0; diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h index 9215b0e..0d12dc0 100644 --- a/xen/include/xen/domain.h +++ b/xen/include/xen/domain.h @@ -80,6 +80,8 @@ extern spinlock_t vcpu_alloc_lock; bool_t domctl_lock_acquire(void); void domctl_lock_release(void); +uint32_t arch_get_features(struct domain *d, unsigned int submap_idx); + /* * Continue the current hypercall via func(data) on specified cpu. * If this function returns 0 then the function is guaranteed to run at some -- 2.1.2 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 2/4] introduce arch_get_features() 2014-11-14 9:37 ` [PATCH 2/4] introduce arch_get_features() Juergen Gross @ 2014-11-21 12:26 ` Jan Beulich 2014-11-21 13:21 ` Julien Grall 1 sibling, 0 replies; 25+ messages in thread From: Jan Beulich @ 2014-11-21 12:26 UTC (permalink / raw) To: Juergen Gross; +Cc: xen-devel, david.vrabel >>> On 14.11.14 at 10:37, <"jgross@suse.com".non-mime.internet> wrote: > The XENVER_get_features sub command of the xen_version hypercall is > handled completely in common/kernel.c despite of some architecture > dependant parts. > > Move the architecture dependant parts in an own function in > arch/*/domain.c > > Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/4] introduce arch_get_features() 2014-11-14 9:37 ` [PATCH 2/4] introduce arch_get_features() Juergen Gross 2014-11-21 12:26 ` Jan Beulich @ 2014-11-21 13:21 ` Julien Grall 1 sibling, 0 replies; 25+ messages in thread From: Julien Grall @ 2014-11-21 13:21 UTC (permalink / raw) To: Juergen Gross, xen-devel, jbeulich, konrad.wilk, david.vrabel Hi Juergen, On 11/14/2014 09:37 AM, Juergen Gross wrote: > The XENVER_get_features sub command of the xen_version hypercall is > handled completely in common/kernel.c despite of some architecture > dependant parts. > > Move the architecture dependant parts in an own function in > arch/*/domain.c > > Signed-off-by: Juergen Gross <jgross@suse.com> For the ARM part: Reviewed-by: Julien Grall <julien.grall@linaro.org> Regards, > --- > xen/arch/arm/domain.c | 5 +++++ > xen/arch/x86/domain.c | 30 ++++++++++++++++++++++++++++++ > xen/common/kernel.c | 22 ++-------------------- > xen/include/xen/domain.h | 2 ++ > 4 files changed, 39 insertions(+), 20 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 7221bc8..dc5a3fb 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -823,6 +823,11 @@ void vcpu_block_unless_event_pending(struct vcpu *v) > vcpu_unblock(current); > } > > +uint32_t arch_get_features(struct domain *d, unsigned int submap_idx) > +{ > + return 0; > +} > + > /* > * Local variables: > * mode: C > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index ae0a344..d98aabd 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -2166,6 +2166,36 @@ static int __init init_vcpu_kick_softirq(void) > } > __initcall(init_vcpu_kick_softirq); > > +uint32_t arch_get_features(struct domain *d, unsigned int submap_idx) > +{ > + uint32_t submap = 0; > + > + switch ( submap_idx ) > + { > + case 0: > + switch ( d->guest_type ) > + { > + case guest_type_pv: > + submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) | > + (1U << XENFEAT_highmem_assist) | > + (1U << XENFEAT_gnttab_map_avail_bits); > + break; > + case guest_type_pvh: > + submap |= (1U << XENFEAT_hvm_safe_pvclock) | > + (1U << XENFEAT_supervisor_mode_kernel) | > + (1U << XENFEAT_hvm_callback_vector); > + break; > + case guest_type_hvm: > + submap |= (1U << XENFEAT_hvm_safe_pvclock) | > + (1U << XENFEAT_hvm_callback_vector) | > + (1U << XENFEAT_hvm_pirqs); > + break; > + } > + break; > + } > + > + return submap; > +} > > /* > * Local variables: > diff --git a/xen/common/kernel.c b/xen/common/kernel.c > index d23c422..d22a860 100644 > --- a/xen/common/kernel.c > +++ b/xen/common/kernel.c > @@ -312,31 +312,13 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > fi.submap |= 1U << XENFEAT_supervisor_mode_kernel; > if ( is_hardware_domain(current->domain) ) > fi.submap |= 1U << XENFEAT_dom0; > -#ifdef CONFIG_X86 > - switch ( d->guest_type ) > - { > - case guest_type_pv: > - fi.submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) | > - (1U << XENFEAT_highmem_assist) | > - (1U << XENFEAT_gnttab_map_avail_bits); > - break; > - case guest_type_pvh: > - fi.submap |= (1U << XENFEAT_hvm_safe_pvclock) | > - (1U << XENFEAT_supervisor_mode_kernel) | > - (1U << XENFEAT_hvm_callback_vector); > - break; > - case guest_type_hvm: > - fi.submap |= (1U << XENFEAT_hvm_safe_pvclock) | > - (1U << XENFEAT_hvm_callback_vector) | > - (1U << XENFEAT_hvm_pirqs); > - break; > - } > -#endif > break; > default: > return -EINVAL; > } > > + fi.submap |= arch_get_features(d, fi.submap_idx); > + > if ( copy_to_guest(arg, &fi, 1) ) > return -EFAULT; > return 0; > diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h > index 9215b0e..0d12dc0 100644 > --- a/xen/include/xen/domain.h > +++ b/xen/include/xen/domain.h > @@ -80,6 +80,8 @@ extern spinlock_t vcpu_alloc_lock; > bool_t domctl_lock_acquire(void); > void domctl_lock_release(void); > > +uint32_t arch_get_features(struct domain *d, unsigned int submap_idx); > + > /* > * Continue the current hypercall via func(data) on specified cpu. > * If this function returns 0 then the function is guaranteed to run at some > -- Julien Grall ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m 2014-11-14 9:37 [PATCH 0/4] support guest virtual mapped p2m list Juergen Gross 2014-11-14 9:37 ` [PATCH 1/4] expand x86 arch_shared_info to support linear " Juergen Gross 2014-11-14 9:37 ` [PATCH 2/4] introduce arch_get_features() Juergen Gross @ 2014-11-14 9:37 ` Juergen Gross 2014-11-19 21:04 ` Konrad Rzeszutek Wilk 2014-11-14 9:37 ` [PATCH 4/4] document new boot parameter virt_p2m Juergen Gross 3 siblings, 1 reply; 25+ messages in thread From: Juergen Gross @ 2014-11-14 9:37 UTC (permalink / raw) To: xen-devel, jbeulich, konrad.wilk, david.vrabel; +Cc: Juergen Gross Introduce a new boot parameter "virt_p2m" to be able to set XENFEAT_virtual_p2m for a pv domain. As long as Xen tools and kdump don't support this new feature it is turned off by default. Signed-off-by: Juergen Gross <jgross@suse.com> --- xen/arch/x86/domain.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index d98aabd..ccb54f6 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -2166,8 +2166,44 @@ static int __init init_vcpu_kick_softirq(void) } __initcall(init_vcpu_kick_softirq); +#define VIRT_P2M_DOM0 0x01 +#define VIRT_P2M_DOM0_LARGE 0x02 +#define VIRT_P2M_DOMU 0x04 +#define VIRT_P2M_DOMU_LARGE 0x08 +static unsigned virt_p2m = 0; + +static void __init parse_virt_p2m(const char *s) +{ + char *ss; + int b; + + do { + ss = strchr(s, ','); + if ( ss ) + *ss = '\0'; + + b = parse_bool(s); + if ( b == 0 ) + virt_p2m = 0; + else if ( b == 1 ) + virt_p2m = VIRT_P2M_DOM0 | VIRT_P2M_DOMU; + else if ( !strcmp(s, "dom0") ) + virt_p2m |= VIRT_P2M_DOM0; + else if ( !strcmp(s, "dom0_large") ) + virt_p2m |= VIRT_P2M_DOM0_LARGE; + else if ( !strcmp(s, "domu") ) + virt_p2m |= VIRT_P2M_DOMU; + else if ( !strcmp(s, "domu_large") ) + virt_p2m |= VIRT_P2M_DOMU_LARGE; + + s = ss + 1; + } while ( ss ); +} +custom_param("virt_p2m", parse_virt_p2m); + uint32_t arch_get_features(struct domain *d, unsigned int submap_idx) { +#define DOM_IS_LARGE(d) ((d)->max_pages > 1U << 27) uint32_t submap = 0; switch ( submap_idx ) @@ -2179,6 +2215,20 @@ uint32_t arch_get_features(struct domain *d, unsigned int submap_idx) submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) | (1U << XENFEAT_highmem_assist) | (1U << XENFEAT_gnttab_map_avail_bits); + if ( is_hardware_domain(d) ) + { + if ( virt_p2m & VIRT_P2M_DOM0 ) + submap |= 1U << XENFEAT_virtual_p2m; + if ( DOM_IS_LARGE(d) && virt_p2m & VIRT_P2M_DOM0_LARGE ) + submap |= 1U << XENFEAT_virtual_p2m; + } + else + { + if ( virt_p2m & VIRT_P2M_DOMU ) + submap |= 1U << XENFEAT_virtual_p2m; + if ( DOM_IS_LARGE(d) && virt_p2m & VIRT_P2M_DOMU_LARGE ) + submap |= 1U << XENFEAT_virtual_p2m; + } break; case guest_type_pvh: submap |= (1U << XENFEAT_hvm_safe_pvclock) | -- 2.1.2 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m 2014-11-14 9:37 ` [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m Juergen Gross @ 2014-11-19 21:04 ` Konrad Rzeszutek Wilk 2014-11-20 4:46 ` Juergen Gross 0 siblings, 1 reply; 25+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-11-19 21:04 UTC (permalink / raw) To: Juergen Gross; +Cc: xen-devel, david.vrabel, jbeulich On Fri, Nov 14, 2014 at 10:37:25AM +0100, Juergen Gross wrote: > Introduce a new boot parameter "virt_p2m" to be able to set > XENFEAT_virtual_p2m for a pv domain. > > As long as Xen tools and kdump don't support this new feature it is > turned off by default. Couldn't the dom0_large and dom0 be detected automatically? That is the dom0 could advertise it can do large-dom0 support and Xen would automatically switch to the right mode? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m 2014-11-19 21:04 ` Konrad Rzeszutek Wilk @ 2014-11-20 4:46 ` Juergen Gross 0 siblings, 0 replies; 25+ messages in thread From: Juergen Gross @ 2014-11-20 4:46 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel, david.vrabel, jbeulich On 11/19/2014 10:04 PM, Konrad Rzeszutek Wilk wrote: > On Fri, Nov 14, 2014 at 10:37:25AM +0100, Juergen Gross wrote: >> Introduce a new boot parameter "virt_p2m" to be able to set >> XENFEAT_virtual_p2m for a pv domain. >> >> As long as Xen tools and kdump don't support this new feature it is >> turned off by default. > > Couldn't the dom0_large and dom0 be detected automatically? That is > the dom0 could advertise it can do large-dom0 support and Xen would > automatically switch to the right mode? No, that's not the problem. Xen has to indicate it is capable to handle the new mode. At dom0 construction time the dom0 kernel can't know about the capability of kdump to handle the new mode. In case the new interface is accepted I'll set up some kdump patches to handle it. We can switch to dom0/dom0_large set on default if they are accepted on time (e.g. at the time the kernel support for the new interface is put in place). Juergen ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 4/4] document new boot parameter virt_p2m 2014-11-14 9:37 [PATCH 0/4] support guest virtual mapped p2m list Juergen Gross ` (2 preceding siblings ...) 2014-11-14 9:37 ` [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m Juergen Gross @ 2014-11-14 9:37 ` Juergen Gross 3 siblings, 0 replies; 25+ messages in thread From: Juergen Gross @ 2014-11-14 9:37 UTC (permalink / raw) To: xen-devel, jbeulich, konrad.wilk, david.vrabel; +Cc: Juergen Gross Add documentation for the new boot parameter "virt_p2m". Signed-off-by: Juergen Gross <jgross@suse.com> --- docs/misc/xen-command-line.markdown | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 0830e5f..c56273d 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1272,6 +1272,28 @@ The optional `keep` parameter causes Xen to continue using the vga console even after dom0 has been started. The default behaviour is to relinquish control to dom0. +### virt\_p2m +> `= List of [ <boolean> | dom0 | dom0\_large | domu | domu\_large ]` + +> Default: `false` + +Allow pv-domains to specify a virtual address for the domain's p2m list which +is used by the Xen tools during domain save and restore and by kdump. + +`dom0` enables this feature for Dom0. + +`dom0\_large` enables this feature for Dom0 with more than 512 GiB of RAM +(the traditional 3 level p2m tree can't map more than that). + +`domu` enables this feature for all pv domains but Dom0. + +`domu\_large` enables this feature for all pv domains with more than 512 GiB +of RAM but Dom0. + +`true` enables this feature for all pv domains. + +`false` disables this feature for all domains. + ### vpid (Intel) > `= <boolean>` -- 2.1.2 ^ permalink raw reply related [flat|nested] 25+ messages in thread
end of thread, other threads:[~2014-11-21 14:07 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-11-14 9:37 [PATCH 0/4] support guest virtual mapped p2m list Juergen Gross 2014-11-14 9:37 ` [PATCH 1/4] expand x86 arch_shared_info to support linear " Juergen Gross 2014-11-14 11:41 ` Andrew Cooper 2014-11-14 12:53 ` Juergen Gross 2014-11-14 13:56 ` Andrew Cooper 2014-11-14 14:14 ` Jürgen Groß 2014-11-14 14:59 ` Andrew Cooper 2014-11-14 15:32 ` Juergen Gross 2014-11-14 16:08 ` Andrew Cooper 2014-11-18 5:33 ` Juergen Gross 2014-11-18 10:51 ` Andrew Cooper 2014-11-18 10:56 ` David Vrabel 2014-11-21 12:23 ` Jan Beulich 2014-11-21 12:57 ` Juergen Gross 2014-11-21 13:26 ` Andrew Cooper 2014-11-21 13:37 ` Jürgen Groß 2014-11-21 14:04 ` Andrew Cooper 2014-11-21 14:07 ` Jan Beulich 2014-11-14 9:37 ` [PATCH 2/4] introduce arch_get_features() Juergen Gross 2014-11-21 12:26 ` Jan Beulich 2014-11-21 13:21 ` Julien Grall 2014-11-14 9:37 ` [PATCH 3/4] introduce boot parameter for setting XENFEAT_virtual_p2m Juergen Gross 2014-11-19 21:04 ` Konrad Rzeszutek Wilk 2014-11-20 4:46 ` Juergen Gross 2014-11-14 9:37 ` [PATCH 4/4] document new boot parameter virt_p2m Juergen Gross
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.