* Linux 4.1 reports wrong number of pages to toolstack @ 2015-09-04 0:40 Wei Liu 2015-09-04 3:38 ` Juergen Gross 2015-09-04 8:53 ` Ian Campbell 0 siblings, 2 replies; 15+ messages in thread From: Wei Liu @ 2015-09-04 0:40 UTC (permalink / raw) To: xen-devel, David Vrabel, Juergen Gross Cc: Ian Jackson, wei.liu2, Ian Campbell, Andrew Cooper

Hi David

This issue is exposed by the introduction of migration v2. The symptom is that
a guest with a 32-bit 4.1 kernel can't be restored because it's asking for too
many pages.

Note that all guests have 512MB memory, which means they have 131072 pages.

Both 3.14 tests [2] [3] get the correct number of pages, like:

    xc: detail: max_pfn 0x1ffff, p2m_frames 256
    ...
    xc: detail: Memory: 2048/131072 1%
    ...

However in both 4.1 tests [0] [1] the number of pages is quite wrong.

4.1 32 bit:

    xc: detail: max_pfn 0xfffff, p2m_frames 1024
    ...
    xc: detail: Memory: 11264/1048576 1%
    ...

It thinks it has 4096MB memory.

4.1 64 bit:

    xc: detail: max_pfn 0x3ffff, p2m_frames 512
    ...
    xc: detail: Memory: 3072/262144 1%
    ...

It thinks it has 1024MB memory.

The total number of pages is determined in libxc by calling
xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from the
hypervisor. And that value is clearly touched by Linux in some way.

I now think this is a bug in the Linux kernel. The biggest suspect is the
introduction of the linear P2M. If you think this is a bug in the toolstack,
please let me know.

I don't know why 4.1 64 bit [0] can still be successfully restored. I don't
have a handy setup to experiment with. The restore path doesn't show enough
information to tell anything. The thing I worry about is that migration v2
somehow makes the guest bigger than it should be. But that's another topic.

Wei.

[0] 4.1 kernel 64 bit save restore:
http://logs.test-lab.xenproject.org/osstest/logs/60785/test-amd64-amd64-xl/16.ts-guest-saverestore.log

[1] 4.1 kernel 32 bit save restore:
http://logs.test-lab.xenproject.org/osstest/logs/60785/test-amd64-i386-xl/14.ts-guest-saverestore.log

[2] 3.14 kernel 64 bit save restore:
http://logs.test-lab.xenproject.org/osstest/logs/61263/test-amd64-amd64-xl/16.ts-guest-saverestore.log

[3] 3.14 kernel 32 bit save restore:
http://logs.test-lab.xenproject.org/osstest/logs/61263/test-amd64-i386-xl/16.ts-guest-saverestore.log

^ permalink raw reply	[flat|nested] 15+ messages in thread
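For readers following the arithmetic: the figures in these logs are consistent with max_pfn describing the width of the guest's p2m list rather than its RAM size. A rough check, assuming 4 KiB pages and p2m entries of 4 bytes (32-bit guest) or 8 bytes (64-bit guest); this is illustrative C only, not toolstack code:

    #include <stdio.h>

    int main(void)
    {
        unsigned long nr_pfns_32 = 0xfffff + 1UL;  /* 4.1 32-bit log: max_pfn 0xfffff */
        unsigned long nr_pfns_64 = 0x3ffff + 1UL;  /* 4.1 64-bit log: max_pfn 0x3ffff */

        /* Entries per 4 KiB p2m frame: 4096/4 = 1024 (32-bit), 4096/8 = 512 (64-bit). */
        printf("p2m_frames: %lu and %lu\n", nr_pfns_32 / 1024, nr_pfns_64 / 512);

        /* Treating the whole pfn range as RAM gives the inflated sizes seen above. */
        printf("apparent MiB: %lu and %lu\n",
               nr_pfns_32 * 4 / 1024, nr_pfns_64 * 4 / 1024);
        return 0;
    }

This prints "p2m_frames: 1024 and 512" and "apparent MiB: 4096 and 1024", matching the p2m_frames values in the 4.1 logs and the 4096MB/1024MB sizes the toolstack appears to believe.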
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 0:40 Linux 4.1 reports wrong number of pages to toolstack Wei Liu @ 2015-09-04 3:38 ` Juergen Gross 2015-09-04 8:28 ` Jan Beulich 2015-09-04 8:53 ` Ian Campbell 1 sibling, 1 reply; 15+ messages in thread From: Juergen Gross @ 2015-09-04 3:38 UTC (permalink / raw) To: Wei Liu, xen-devel, David Vrabel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell On 09/04/2015 02:40 AM, Wei Liu wrote: > Hi David > > This issue is exposed by the introduction of migration v2. The symptom is that > a guest with 32 bit 4.1 kernel can't be restored because it's asking for too > many pages. > > Note that all guests have 512MB memory, which means they have 131072 pages. > > Both 3.14 tests [2] [3] get the correct number of pages. Like: > > xc: detail: max_pfn 0x1ffff, p2m_frames 256 > ... > xc: detail: Memory: 2048/131072 1% > ... > > However in both 4.1 [0] [1] the number of pages are quite wrong. > > 4.1 32 bit: > > xc: detail: max_pfn 0xfffff, p2m_frames 1024 > ... > xc: detail: Memory: 11264/1048576 1% > ... > > It thinks it has 4096MB memory. > > 4.1 64 bit: > > xc: detail: max_pfn 0x3ffff, p2m_frames 512 > ... > xc: detail: Memory: 3072/262144 1% > ... > > It thinks it has 1024MB memory. > > The total number of pages is determined in libxc by calling > xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from > hypervisor. And that value is clearly touched by Linux in some way. Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list can handle. This is not the memory size of the domain. > I now think this is a bug in Linux kernel. The biggest suspect is the > introduction of linear P2M. If you think this is a bug in toolstack, > please let me know. I absolutely think it is a toolstack bug. Even without the linear p2m things would go wrong in case a ballooned down guest would be migrated, as shared_info->arch.max_pfn would hold the upper limit of the guest in this case and not the current size. Juergen ^ permalink raw reply [flat|nested] 15+ messages in thread
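Juergen's distinction shows up as two different numbers the toolstack can ask for: the span of the p2m list versus the pages actually backing the domain. A minimal sketch, written from memory of the libxc API, so exact signatures may differ slightly:

    #include <stdio.h>
    #include <xenctrl.h>

    static void show_sizes(xc_interface *xch, uint32_t domid)
    {
        xen_pfn_t p2m_size = 0;
        xc_dominfo_t info;

        /* For a PV guest this is derived from shared_info->arch.max_pfn:
         * the span of the p2m list, i.e. an upper bound on pfns, not the
         * amount of RAM. */
        if ( xc_domain_nr_gpfns(xch, domid, &p2m_size) < 0 )
            return;

        /* The number of pages actually backing the domain right now. */
        if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 || info.domid != domid )
            return;

        printf("p2m spans %lu pfns, %lu pages populated\n",
               (unsigned long)p2m_size, (unsigned long)info.nr_pages);
    }

For the 32-bit 4.1 guest above this would report a p2m span of 1048576 pfns against roughly 131072 populated pages.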
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 3:38 ` Juergen Gross @ 2015-09-04 8:28 ` Jan Beulich 2015-09-04 9:35 ` Andrew Cooper 2015-09-04 11:40 ` Wei Liu 0 siblings, 2 replies; 15+ messages in thread From: Jan Beulich @ 2015-09-04 8:28 UTC (permalink / raw) To: Wei Liu, Juergen Gross Cc: Andrew Cooper, xen-devel, Ian Jackson, David Vrabel, Ian Campbell >>> On 04.09.15 at 05:38, <JGross@suse.com> wrote: > On 09/04/2015 02:40 AM, Wei Liu wrote: >> This issue is exposed by the introduction of migration v2. The symptom is that >> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too >> many pages. >> >> Note that all guests have 512MB memory, which means they have 131072 pages. >> >> Both 3.14 tests [2] [3] get the correct number of pages. Like: >> >> xc: detail: max_pfn 0x1ffff, p2m_frames 256 >> ... >> xc: detail: Memory: 2048/131072 1% >> ... >> >> However in both 4.1 [0] [1] the number of pages are quite wrong. >> >> 4.1 32 bit: >> >> xc: detail: max_pfn 0xfffff, p2m_frames 1024 >> ... >> xc: detail: Memory: 11264/1048576 1% >> ... >> >> It thinks it has 4096MB memory. >> >> 4.1 64 bit: >> >> xc: detail: max_pfn 0x3ffff, p2m_frames 512 >> ... >> xc: detail: Memory: 3072/262144 1% >> ... >> >> It thinks it has 1024MB memory. >> >> The total number of pages is determined in libxc by calling >> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from >> hypervisor. And that value is clearly touched by Linux in some way. > > Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list > can handle. This is not the memory size of the domain. > >> I now think this is a bug in Linux kernel. The biggest suspect is the >> introduction of linear P2M. If you think this is a bug in toolstack, >> please let me know. > > I absolutely think it is a toolstack bug. Even without the linear p2m > things would go wrong in case a ballooned down guest would be migrated, > as shared_info->arch.max_pfn would hold the upper limit of the guest > in this case and not the current size. I don't think this necessarily is a tool stack bug, at least not in the sense implied above - since (afaik) migrating ballooned guests (at least PV ones) has been working before, there ought to be logic to skip ballooned pages (and I certainly recall having seen migration slowly move up to e.g. 50% and the skip the other half due to being ballooned, albeit that recollection certainly is from before v2). And pages above the highest populated one ought to be considered ballooned just as much. With the information provided by Wei I don't think we can judge about this, since it only shows the values the migration process starts from, not when, why, or how it fails. Jan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 8:28 ` Jan Beulich @ 2015-09-04 9:35 ` Andrew Cooper 2015-09-04 11:35 ` Wei Liu 2015-09-04 11:40 ` Wei Liu 1 sibling, 1 reply; 15+ messages in thread From: Andrew Cooper @ 2015-09-04 9:35 UTC (permalink / raw) To: Jan Beulich, Wei Liu, Juergen Gross Cc: xen-devel, Ian Jackson, David Vrabel, Ian Campbell On 04/09/15 09:28, Jan Beulich wrote: >>>> On 04.09.15 at 05:38, <JGross@suse.com> wrote: >> On 09/04/2015 02:40 AM, Wei Liu wrote: >>> This issue is exposed by the introduction of migration v2. The symptom is that >>> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too >>> many pages. >>> >>> Note that all guests have 512MB memory, which means they have 131072 pages. >>> >>> Both 3.14 tests [2] [3] get the correct number of pages. Like: >>> >>> xc: detail: max_pfn 0x1ffff, p2m_frames 256 >>> ... >>> xc: detail: Memory: 2048/131072 1% >>> ... >>> >>> However in both 4.1 [0] [1] the number of pages are quite wrong. >>> >>> 4.1 32 bit: >>> >>> xc: detail: max_pfn 0xfffff, p2m_frames 1024 >>> ... >>> xc: detail: Memory: 11264/1048576 1% >>> ... >>> >>> It thinks it has 4096MB memory. >>> >>> 4.1 64 bit: >>> >>> xc: detail: max_pfn 0x3ffff, p2m_frames 512 >>> ... >>> xc: detail: Memory: 3072/262144 1% >>> ... >>> >>> It thinks it has 1024MB memory. >>> >>> The total number of pages is determined in libxc by calling >>> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from >>> hypervisor. And that value is clearly touched by Linux in some way. >> Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list >> can handle. This is not the memory size of the domain. >> >>> I now think this is a bug in Linux kernel. The biggest suspect is the >>> introduction of linear P2M. If you think this is a bug in toolstack, >>> please let me know. >> I absolutely think it is a toolstack bug. Even without the linear p2m >> things would go wrong in case a ballooned down guest would be migrated, >> as shared_info->arch.max_pfn would hold the upper limit of the guest >> in this case and not the current size. > I don't think this necessarily is a tool stack bug, at least not in > the sense implied above - since (afaik) migrating ballooned guests > (at least PV ones) has been working before, there ought to be > logic to skip ballooned pages (and I certainly recall having seen > migration slowly move up to e.g. 50% and the skip the other > half due to being ballooned, albeit that recollection certainly is > from before v2). And pages above the highest populated one > ought to be considered ballooned just as much. With the > information provided by Wei I don't think we can judge about > this, since it only shows the values the migration process starts > from, not when, why, or how it fails. Max pfn reported by migration v2 is max pfn, not the number of pages of RAM in the guest. It is used for the size of the bitmaps used by migration v2, including the logdirty op calls. All frames between 0 and max pfn will have their type queried, and acted upon appropriately, including doing nothing if the frame was ballooned out. ~Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
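To make Andrew's point concrete, here is a rough sketch of the scan he describes. It is not the actual migration v2 save loop and the helper names are made up, but XEN_DOMCTL_PFINFO_XTAB is the type the toolstack sees for frames with no backing memory:

    #include <stdint.h>
    #include <xen/domctl.h>   /* XEN_DOMCTL_PFINFO_XTAB; header path may vary */

    /* Hypothetical helpers, standing in for the real libxc machinery. */
    uint32_t query_frame_type(uint64_t pfn);
    void add_frame_to_batch(uint64_t pfn, uint32_t type);

    /* Every frame from 0 to max_pfn has its type queried; ballooned-out or
     * never-populated frames come back as XEN_DOMCTL_PFINFO_XTAB and are
     * skipped, so an over-large max_pfn costs bitmap space and type queries
     * on the sending side rather than extra pages in the stream. */
    void scan_frames(uint64_t max_pfn)
    {
        for ( uint64_t pfn = 0; pfn <= max_pfn; ++pfn )
        {
            uint32_t type = query_frame_type(pfn);

            if ( type == XEN_DOMCTL_PFINFO_XTAB )
                continue;

            add_frame_to_batch(pfn, type);
        }
    }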
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 9:35 ` Andrew Cooper @ 2015-09-04 11:35 ` Wei Liu 2015-09-04 18:39 ` Andrew Cooper 0 siblings, 1 reply; 15+ messages in thread From: Wei Liu @ 2015-09-04 11:35 UTC (permalink / raw) To: Andrew Cooper Cc: Juergen Gross, Wei Liu, Ian Campbell, Ian Jackson, David Vrabel, Jan Beulich, xen-devel On Fri, Sep 04, 2015 at 10:35:52AM +0100, Andrew Cooper wrote: > On 04/09/15 09:28, Jan Beulich wrote: > >>>>On 04.09.15 at 05:38, <JGross@suse.com> wrote: > >>On 09/04/2015 02:40 AM, Wei Liu wrote: > >>>This issue is exposed by the introduction of migration v2. The symptom is that > >>>a guest with 32 bit 4.1 kernel can't be restored because it's asking for too > >>>many pages. > >>> > >>>Note that all guests have 512MB memory, which means they have 131072 pages. > >>> > >>>Both 3.14 tests [2] [3] get the correct number of pages. Like: > >>> > >>> xc: detail: max_pfn 0x1ffff, p2m_frames 256 > >>> ... > >>> xc: detail: Memory: 2048/131072 1% > >>> ... > >>> > >>>However in both 4.1 [0] [1] the number of pages are quite wrong. > >>> > >>>4.1 32 bit: > >>> > >>> xc: detail: max_pfn 0xfffff, p2m_frames 1024 > >>> ... > >>> xc: detail: Memory: 11264/1048576 1% > >>> ... > >>> > >>>It thinks it has 4096MB memory. > >>> > >>>4.1 64 bit: > >>> > >>> xc: detail: max_pfn 0x3ffff, p2m_frames 512 > >>> ... > >>> xc: detail: Memory: 3072/262144 1% > >>> ... > >>> > >>>It thinks it has 1024MB memory. > >>> > >>>The total number of pages is determined in libxc by calling > >>>xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from > >>>hypervisor. And that value is clearly touched by Linux in some way. > >>Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list > >>can handle. This is not the memory size of the domain. > >> > >>>I now think this is a bug in Linux kernel. The biggest suspect is the > >>>introduction of linear P2M. If you think this is a bug in toolstack, > >>>please let me know. > >>I absolutely think it is a toolstack bug. Even without the linear p2m > >>things would go wrong in case a ballooned down guest would be migrated, > >>as shared_info->arch.max_pfn would hold the upper limit of the guest > >>in this case and not the current size. > >I don't think this necessarily is a tool stack bug, at least not in > >the sense implied above - since (afaik) migrating ballooned guests > >(at least PV ones) has been working before, there ought to be > >logic to skip ballooned pages (and I certainly recall having seen > >migration slowly move up to e.g. 50% and the skip the other > >half due to being ballooned, albeit that recollection certainly is > >from before v2). And pages above the highest populated one > >ought to be considered ballooned just as much. With the > >information provided by Wei I don't think we can judge about > >this, since it only shows the values the migration process starts > >from, not when, why, or how it fails. > > Max pfn reported by migration v2 is max pfn, not the number of pages of RAM > in the guest. > I understand that by looking at the code. Just the log itself is very confusing. I propose we rename the log a bit. Maybe change "Memory" to "P2M" or something else? > It is used for the size of the bitmaps used by migration v2, including the > logdirty op calls. > > All frames between 0 and max pfn will have their type queried, and acted > upon appropriately, including doing nothing if the frame was ballooned out. In short, do you think this is a bug in migration v2? 
When I looked at write_batch() I found some snippets that I thought to
be wrong. But I didn't want to make that judgement while I didn't have a
clear head.

Wei.

>
> ~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 11:35 ` Wei Liu @ 2015-09-04 18:39 ` Andrew Cooper 2015-09-04 19:46 ` Wei Liu 0 siblings, 1 reply; 15+ messages in thread From: Andrew Cooper @ 2015-09-04 18:39 UTC (permalink / raw) To: Wei Liu Cc: Juergen Gross, Ian Campbell, Ian Jackson, David Vrabel, Jan Beulich, xen-devel On 04/09/15 12:35, Wei Liu wrote: > On Fri, Sep 04, 2015 at 10:35:52AM +0100, Andrew Cooper wrote: >> On 04/09/15 09:28, Jan Beulich wrote: >>>>>> On 04.09.15 at 05:38, <JGross@suse.com> wrote: >>>> On 09/04/2015 02:40 AM, Wei Liu wrote: >>>>> This issue is exposed by the introduction of migration v2. The symptom is that >>>>> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too >>>>> many pages. >>>>> >>>>> Note that all guests have 512MB memory, which means they have 131072 pages. >>>>> >>>>> Both 3.14 tests [2] [3] get the correct number of pages. Like: >>>>> >>>>> xc: detail: max_pfn 0x1ffff, p2m_frames 256 >>>>> ... >>>>> xc: detail: Memory: 2048/131072 1% >>>>> ... >>>>> >>>>> However in both 4.1 [0] [1] the number of pages are quite wrong. >>>>> >>>>> 4.1 32 bit: >>>>> >>>>> xc: detail: max_pfn 0xfffff, p2m_frames 1024 >>>>> ... >>>>> xc: detail: Memory: 11264/1048576 1% >>>>> ... >>>>> >>>>> It thinks it has 4096MB memory. >>>>> >>>>> 4.1 64 bit: >>>>> >>>>> xc: detail: max_pfn 0x3ffff, p2m_frames 512 >>>>> ... >>>>> xc: detail: Memory: 3072/262144 1% >>>>> ... >>>>> >>>>> It thinks it has 1024MB memory. >>>>> >>>>> The total number of pages is determined in libxc by calling >>>>> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from >>>>> hypervisor. And that value is clearly touched by Linux in some way. >>>> Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list >>>> can handle. This is not the memory size of the domain. >>>> >>>>> I now think this is a bug in Linux kernel. The biggest suspect is the >>>>> introduction of linear P2M. If you think this is a bug in toolstack, >>>>> please let me know. >>>> I absolutely think it is a toolstack bug. Even without the linear p2m >>>> things would go wrong in case a ballooned down guest would be migrated, >>>> as shared_info->arch.max_pfn would hold the upper limit of the guest >>>> in this case and not the current size. >>> I don't think this necessarily is a tool stack bug, at least not in >>> the sense implied above - since (afaik) migrating ballooned guests >>> (at least PV ones) has been working before, there ought to be >>> logic to skip ballooned pages (and I certainly recall having seen >>> migration slowly move up to e.g. 50% and the skip the other >>> half due to being ballooned, albeit that recollection certainly is >> >from before v2). And pages above the highest populated one >>> ought to be considered ballooned just as much. With the >>> information provided by Wei I don't think we can judge about >>> this, since it only shows the values the migration process starts >>> from, not when, why, or how it fails. >> Max pfn reported by migration v2 is max pfn, not the number of pages of RAM >> in the guest. >> > I understand that by looking at the code. Just the log itself > is very confusing. > > I propose we rename the log a bit. Maybe change "Memory" to "P2M" or > something else? P2M would be wrong for HVM guests. Memory was the same term used by the legacy code iirc. "Frames" is probably the best term. > >> It is used for the size of the bitmaps used by migration v2, including the >> logdirty op calls. 
Memory was the same term used by the legacy code iirc.

"Frames" is probably the best term.

>
>> It is used for the size of the bitmaps used by migration v2, including the
>> logdirty op calls.
>>
>> All frames between 0 and max pfn will have their type queried, and acted
>> upon appropriately, including doing nothing if the frame was ballooned out.
> In short, do you think this is a bug in migration v2?

There is insufficient information in this thread to say either way. Maybe.
Maybe a Linux kernel bug.

>
> When I looked at write_batch() I found some snippets that I thought to
> be wrong. But I didn't want to make that judgement while I didn't have a
> clear head.

write_batch() is a complicated function but it can't usefully be split any
further. I would be happy to explain bits or expand the existing comments,
but it is also possible that it is buggy.

~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 18:39 ` Andrew Cooper @ 2015-09-04 19:46 ` Wei Liu 2015-09-04 20:32 ` Andrew Cooper 0 siblings, 1 reply; 15+ messages in thread From: Wei Liu @ 2015-09-04 19:46 UTC (permalink / raw) To: Andrew Cooper Cc: Juergen Gross, Wei Liu, Ian Campbell, Ian Jackson, David Vrabel, Jan Beulich, xen-devel On Fri, Sep 04, 2015 at 07:39:27PM +0100, Andrew Cooper wrote: > > > On 04/09/15 12:35, Wei Liu wrote: > >On Fri, Sep 04, 2015 at 10:35:52AM +0100, Andrew Cooper wrote: > >>On 04/09/15 09:28, Jan Beulich wrote: > >>>>>>On 04.09.15 at 05:38, <JGross@suse.com> wrote: > >>>>On 09/04/2015 02:40 AM, Wei Liu wrote: > >>>>>This issue is exposed by the introduction of migration v2. The symptom is that > >>>>>a guest with 32 bit 4.1 kernel can't be restored because it's asking for too > >>>>>many pages. > >>>>> > >>>>>Note that all guests have 512MB memory, which means they have 131072 pages. > >>>>> > >>>>>Both 3.14 tests [2] [3] get the correct number of pages. Like: > >>>>> > >>>>> xc: detail: max_pfn 0x1ffff, p2m_frames 256 > >>>>> ... > >>>>> xc: detail: Memory: 2048/131072 1% > >>>>> ... > >>>>> > >>>>>However in both 4.1 [0] [1] the number of pages are quite wrong. > >>>>> > >>>>>4.1 32 bit: > >>>>> > >>>>> xc: detail: max_pfn 0xfffff, p2m_frames 1024 > >>>>> ... > >>>>> xc: detail: Memory: 11264/1048576 1% > >>>>> ... > >>>>> > >>>>>It thinks it has 4096MB memory. > >>>>> > >>>>>4.1 64 bit: > >>>>> > >>>>> xc: detail: max_pfn 0x3ffff, p2m_frames 512 > >>>>> ... > >>>>> xc: detail: Memory: 3072/262144 1% > >>>>> ... > >>>>> > >>>>>It thinks it has 1024MB memory. > >>>>> > >>>>>The total number of pages is determined in libxc by calling > >>>>>xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from > >>>>>hypervisor. And that value is clearly touched by Linux in some way. > >>>>Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list > >>>>can handle. This is not the memory size of the domain. > >>>> > >>>>>I now think this is a bug in Linux kernel. The biggest suspect is the > >>>>>introduction of linear P2M. If you think this is a bug in toolstack, > >>>>>please let me know. > >>>>I absolutely think it is a toolstack bug. Even without the linear p2m > >>>>things would go wrong in case a ballooned down guest would be migrated, > >>>>as shared_info->arch.max_pfn would hold the upper limit of the guest > >>>>in this case and not the current size. > >>>I don't think this necessarily is a tool stack bug, at least not in > >>>the sense implied above - since (afaik) migrating ballooned guests > >>>(at least PV ones) has been working before, there ought to be > >>>logic to skip ballooned pages (and I certainly recall having seen > >>>migration slowly move up to e.g. 50% and the skip the other > >>>half due to being ballooned, albeit that recollection certainly is > >>>from before v2). And pages above the highest populated one > >>>ought to be considered ballooned just as much. With the > >>>information provided by Wei I don't think we can judge about > >>>this, since it only shows the values the migration process starts > >>>from, not when, why, or how it fails. > >>Max pfn reported by migration v2 is max pfn, not the number of pages of RAM > >>in the guest. > >> > >I understand that by looking at the code. Just the log itself > >is very confusing. > > > >I propose we rename the log a bit. Maybe change "Memory" to "P2M" or > >something else? > > P2M would be wrong for HVM guests. 
Memory was the same term used by the
> legacy code iirc.
>
> "Frames" is probably the best term.
>
> >
> >> It is used for the size of the bitmaps used by migration v2, including the
> >> logdirty op calls.
> >>
> >> All frames between 0 and max pfn will have their type queried, and acted
> >> upon appropriately, including doing nothing if the frame was ballooned out.
> > In short, do you think this is a bug in migration v2?
>
> There is insufficient information in this thread to say either way. Maybe.
> Maybe a Linux kernel bug.
>
> >
> > When I looked at write_batch() I found some snippets that I thought to
> > be wrong. But I didn't want to make that judgement while I didn't have a
> > clear head.
>
> write_batch() is a complicated function but it can't usefully be split any
> further. I would be happy to explain bits or expand the existing comments,
> but it is also possible that it is buggy.
>

I think write_batch is correct. I overlooked one function call. I'm not
overly happy with the handling of ballooned pages and the use of the deferred
array in non-live transfer, but those things are not buggy in themselves.

See my patch series for the real bug I discovered. Gosh, it took me a whole
day to identify the culprit.

Wei.

> ~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 19:46 ` Wei Liu @ 2015-09-04 20:32 ` Andrew Cooper 0 siblings, 0 replies; 15+ messages in thread From: Andrew Cooper @ 2015-09-04 20:32 UTC (permalink / raw) To: Wei Liu Cc: Juergen Gross, Ian Campbell, Ian Jackson, David Vrabel, Jan Beulich, xen-devel

On 04/09/15 20:46, Wei Liu wrote:
>
>>> When I looked at write_batch() I found some snippets that I thought to
>>> be wrong. But I didn't want to make that judgement while I didn't have a
>>> clear head.
>> write_batch() is a complicated function but it can't usefully be split any
>> further. I would be happy to explain bits or expand the existing comments,
>> but it is also possible that it is buggy.
>>
> I think write_batch is correct. I overlooked one function call. I'm not
> overly happy with the handling of ballooned pages and the use of the
> deferred array in non-live transfer, but those things are not buggy in
> themselves.

Handling of ballooned pages is broken at several layers. This was covered in
my talk at Seattle. Fixing it is non-trivial.

The use of the deferred array is necessary for live migrates, and used in
non-live migrates to avoid diverging the algorithm. Nothing in the non-live
side queries the deferred array (which itself is a contributory factor to the
ballooning issue, as there is no interlock to prevent something else issuing
population/depopulation hypercalls on behalf of the paused domain).

~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 8:28 ` Jan Beulich 2015-09-04 9:35 ` Andrew Cooper @ 2015-09-04 11:40 ` Wei Liu 1 sibling, 0 replies; 15+ messages in thread From: Wei Liu @ 2015-09-04 11:40 UTC (permalink / raw) To: Jan Beulich Cc: Juergen Gross, Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, David Vrabel, xen-devel On Fri, Sep 04, 2015 at 02:28:41AM -0600, Jan Beulich wrote: > >>> On 04.09.15 at 05:38, <JGross@suse.com> wrote: > > On 09/04/2015 02:40 AM, Wei Liu wrote: > >> This issue is exposed by the introduction of migration v2. The symptom is that > >> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too > >> many pages. > >> > >> Note that all guests have 512MB memory, which means they have 131072 pages. > >> > >> Both 3.14 tests [2] [3] get the correct number of pages. Like: > >> > >> xc: detail: max_pfn 0x1ffff, p2m_frames 256 > >> ... > >> xc: detail: Memory: 2048/131072 1% > >> ... > >> > >> However in both 4.1 [0] [1] the number of pages are quite wrong. > >> > >> 4.1 32 bit: > >> > >> xc: detail: max_pfn 0xfffff, p2m_frames 1024 > >> ... > >> xc: detail: Memory: 11264/1048576 1% > >> ... > >> > >> It thinks it has 4096MB memory. > >> > >> 4.1 64 bit: > >> > >> xc: detail: max_pfn 0x3ffff, p2m_frames 512 > >> ... > >> xc: detail: Memory: 3072/262144 1% > >> ... > >> > >> It thinks it has 1024MB memory. > >> > >> The total number of pages is determined in libxc by calling > >> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from > >> hypervisor. And that value is clearly touched by Linux in some way. > > > > Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list > > can handle. This is not the memory size of the domain. > > > >> I now think this is a bug in Linux kernel. The biggest suspect is the > >> introduction of linear P2M. If you think this is a bug in toolstack, > >> please let me know. > > > > I absolutely think it is a toolstack bug. Even without the linear p2m > > things would go wrong in case a ballooned down guest would be migrated, > > as shared_info->arch.max_pfn would hold the upper limit of the guest > > in this case and not the current size. > > I don't think this necessarily is a tool stack bug, at least not in > the sense implied above - since (afaik) migrating ballooned guests > (at least PV ones) has been working before, there ought to be > logic to skip ballooned pages (and I certainly recall having seen Yes, there is. Migration v2 has logic to skip gpfn when the underlying mfn is INVALID_MFN. I'm not too convinced the code that implement that logic is working correctly. I need to have a closer look today. > migration slowly move up to e.g. 50% and the skip the other > half due to being ballooned, albeit that recollection certainly is > from before v2). And pages above the highest populated one > ought to be considered ballooned just as much. With the > information provided by Wei I don't think we can judge about > this, since it only shows the values the migration process starts > from, not when, why, or how it fails. > It fails on the receiving end when helper tries to populate more pages than the guest can have. In the specific case above, helper populates nr 131073 page and fails. Wei. > Jan ^ permalink raw reply [flat|nested] 15+ messages in thread
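For reference, the failure Wei describes is on the restore side: the receiving domain is built for 131072 pages, so the first attempt to populate a page beyond that allocation is refused by the hypervisor. A hedged sketch of the failing operation, not the actual restore helper code:

    #include <stdint.h>
    #include <xenctrl.h>

    /* Populate a single 4 KiB page at gpfn. Once the domain already holds
     * its full allocation (131072 pages in the case above), the hypervisor
     * refuses the over-allocation and this returns non-zero. */
    static int populate_one(xc_interface *xch, uint32_t domid, xen_pfn_t gpfn)
    {
        xen_pfn_t extent = gpfn;

        return xc_domain_populate_physmap_exact(xch, domid,
                                                1 /* nr_extents */,
                                                0 /* extent_order */,
                                                0 /* mem_flags */,
                                                &extent);
    }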
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 0:40 Linux 4.1 reports wrong number of pages to toolstack Wei Liu 2015-09-04 3:38 ` Juergen Gross @ 2015-09-04 8:53 ` Ian Campbell 2015-09-04 9:28 ` Ian Campbell ` (2 more replies) 1 sibling, 3 replies; 15+ messages in thread From: Ian Campbell @ 2015-09-04 8:53 UTC (permalink / raw) To: Wei Liu, xen-devel, David Vrabel, Juergen Gross Cc: Andrew Cooper, Ian Jackson On Fri, 2015-09-04 at 01:40 +0100, Wei Liu wrote: > Hi David > > This issue is exposed by the introduction of migration v2. The symptom is that > a guest with 32 bit 4.1 kernel can't be restored because it's asking for too > many pages. FWIW my adhoc tests overnight gave me: 37858: b953c0d234bc72e8489d3bf51a276c5c4ec85345 v4.1 Fail 37862: 39a8804455fb23f09157341d3ba7db6d7ae6ee76 v4.0 Fail 37860: bfa76d49576599a4b9f9b7a71f23d73d6dcff735 v3.19 Fail 37872: e36f014edff70fc02b3d3d79cead1d58f289332e v3.19-rc7 Fail 37866: 26bc420b59a38e4e6685a73345a0def461136dce v3.19-rc6 Fail 37868: ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc v3.19-rc5 Fail 37864: eaa27f34e91a14cdceed26ed6c6793ec1d186115 v3.19-rc4 Fail * 37867: b1940cd21c0f4abdce101253e860feff547291b0 v3.19-rc3 Pass * 37865: b7392d2247cfe6771f95d256374f1a8e6a6f48d6 v3.19-rc2 Pass 37863: 97bf6af1f928216fd6c5a66e8a57bfa95a659672 v3.19-rc1 Pass 37861: b2776bf7149bddd1f4161f14f79520f17fc1d71d v3.18 Pass I have set the adhoc bisector working on the ~200 commits between rc3 and rc4. It's running in the Citrix instance (which is quieter) so the interim results are only visible within our network at http://osstest.xs.citrite.ne t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386 -xl..html. So far it has confirmed the basis fail and it is now rechecking the basis pass. Slightly strange though is: $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ include/xen/ $ i.e. there are no relevant seeming xen commits in that range. Maybe the last one of this is more relevant? $ git log --grep=[xX][eE][nN] --oneline v3.19-rc3..v3.19-rc4 -- bdec419 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 07ff890 xen-netback: fixing the propagation of the transmit shaper timeout 132978b x86: Fix step size adjustment during initial memory mapping $ I don't think this particular issue is prone to false positives (i.e. passing when it should fail) and the bisector has reconfirmed the fail case already, so I think it is unlikely that the bisector is going to come back and say it can't find a reliable basis for running. Which might mean we have two issues, some as yet unknown issue between v3.19-rc3 and -rc4 and the issue you have observed with the number of pages the toolstack thinks it should be working on, which is masked by the unknown issue (and could very well be a toolstack bug exposed by a change in Linux, not a Linux bug at all). I'm going to leave the bisector going, hopefully it'll tell us something interesting in whatever it fingers... Ian. > > Note that all guests have 512MB memory, which means they have 131072 > pages. > > Both 3.14 tests [2] [3] get the correct number of pages. Like: > > xc: detail: max_pfn 0x1ffff, p2m_frames 256 > ... > xc: detail: Memory: 2048/131072 1% > ... > > However in both 4.1 [0] [1] the number of pages are quite wrong. > > 4.1 32 bit: > > xc: detail: max_pfn 0xfffff, p2m_frames 1024 > ... > xc: detail: Memory: 11264/1048576 1% > ... > > It thinks it has 4096MB memory. > > 4.1 64 bit: > > xc: detail: max_pfn 0x3ffff, p2m_frames 512 > ... 
> xc: detail: Memory: 3072/262144 1% > ... > > It thinks it has 1024MB memory. > > The total number of pages is determined in libxc by calling > xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from > hypervisor. And that value is clearly touched by Linux in some way. > > I now think this is a bug in Linux kernel. The biggest suspect is the > introduction of linear P2M. If you think this is a bug in toolstack, > please let me know. > > I don't know why 4.1 64 bit [0] can still be successfully restored. I > don't have handy setup to experiment. The restore path doesn't show > enough information to tell anything. The thing I worry about is that > migration v2 somehow make the guest bigger than it should be. But that's > another topic. > > > Wei. > > [0] 4.1 kernel 64 bit save restore: > http://logs.test-lab.xenproject.org/osstest/logs/60785/test-amd64-amd64 > -xl/16.ts-guest-saverestore.log > > [1] 4.1 kernel 32 bit save restore: > http://logs.test-lab.xenproject.org/osstest/logs/60785/test-amd64-i386 > -xl/14.ts-guest-saverestore.log > > [2] 3.14 kernel 64 bit save restore: > http://logs.test-lab.xenproject.org/osstest/logs/61263/test-amd64-amd64 > -xl/16.ts-guest-saverestore.log > > [3] 3.14 kernel 32 bit save restore: > http://logs.test-lab.xenproject.org/osstest/logs/61263/test-amd64-i386 > -xl/16.ts-guest-saverestore.log ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 8:53 ` Ian Campbell @ 2015-09-04 9:28 ` Ian Campbell 2015-09-04 14:42 ` David Vrabel 2015-09-07 7:09 ` Jan Beulich 2 siblings, 0 replies; 15+ messages in thread From: Ian Campbell @ 2015-09-04 9:28 UTC (permalink / raw) To: Wei Liu, xen-devel, David Vrabel, Juergen Gross Cc: Andrew Cooper, Ian Jackson On Fri, 2015-09-04 at 09:53 +0100, Ian Campbell wrote: > I have set the adhoc bisector working on the ~200 commits between rc3 and > rc4. It's running in the Citrix instance (which is quieter) so the interim > results are only visible within our network at http://osstest.xs.citrite.ne > t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386 > -xl..html. > > So far it has confirmed the basis fail and it is now rechecking the basis > pass. It's checked the basis and is now actually bisecting. I setup a periodic rsync to http://xenbits.xen.org/people/ianc/tmp/adhoc/test-amd64-i386-xl..html for anyone outside the Citrix network who wants to follow along... The first hash in the tuple is the Linux one, all the others are the same for all nodes (because I arranged the basis flights that way) Ian. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 8:53 ` Ian Campbell 2015-09-04 9:28 ` Ian Campbell @ 2015-09-04 14:42 ` David Vrabel 2015-09-04 14:53 ` Wei Liu 2015-09-07 7:09 ` Jan Beulich 2 siblings, 1 reply; 15+ messages in thread From: David Vrabel @ 2015-09-04 14:42 UTC (permalink / raw) To: Ian Campbell, Wei Liu, xen-devel, Juergen Gross Cc: Juergen Gross, Andrew Cooper, Ian Jackson On 04/09/15 09:53, Ian Campbell wrote: > On Fri, 2015-09-04 at 01:40 +0100, Wei Liu wrote: >> Hi David >> >> This issue is exposed by the introduction of migration v2. The symptom is that >> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too >> many pages. > > FWIW my adhoc tests overnight gave me: > > 37858: b953c0d234bc72e8489d3bf51a276c5c4ec85345 v4.1 Fail > 37862: 39a8804455fb23f09157341d3ba7db6d7ae6ee76 v4.0 Fail > 37860: bfa76d49576599a4b9f9b7a71f23d73d6dcff735 v3.19 Fail > > 37872: e36f014edff70fc02b3d3d79cead1d58f289332e v3.19-rc7 Fail > 37866: 26bc420b59a38e4e6685a73345a0def461136dce v3.19-rc6 Fail > 37868: ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc v3.19-rc5 Fail > 37864: eaa27f34e91a14cdceed26ed6c6793ec1d186115 v3.19-rc4 Fail * > 37867: b1940cd21c0f4abdce101253e860feff547291b0 v3.19-rc3 Pass * > 37865: b7392d2247cfe6771f95d256374f1a8e6a6f48d6 v3.19-rc2 Pass > > 37863: 97bf6af1f928216fd6c5a66e8a57bfa95a659672 v3.19-rc1 Pass > > 37861: b2776bf7149bddd1f4161f14f79520f17fc1d71d v3.18 Pass > > I have set the adhoc bisector working on the ~200 commits between rc3 and > rc4. It's running in the Citrix instance (which is quieter) so the interim > results are only visible within our network at http://osstest.xs.citrite.ne > t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386 > -xl..html. > > So far it has confirmed the basis fail and it is now rechecking the basis > pass. > > Slightly strange though is: > $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ include/xen/ > $ > > i.e. there are no relevant seeming xen commits in that range. Maybe the > last one of this is more relevant? Since this bisect attempt appears to have disappeared into the weeds I did my own and it fingered: 633d6f17cd91ad5bf2370265946f716e42d388c6 (x86/xen: prepare p2m list for memory hotplug) which was introduced in 4.0-rc7. This looks a lot more plausible as the Linux change triggering the migration failures. David ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 14:42 ` David Vrabel @ 2015-09-04 14:53 ` Wei Liu 2015-09-04 14:58 ` David Vrabel 0 siblings, 1 reply; 15+ messages in thread From: Wei Liu @ 2015-09-04 14:53 UTC (permalink / raw) To: David Vrabel Cc: Juergen Gross, Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson, xen-devel On Fri, Sep 04, 2015 at 03:42:06PM +0100, David Vrabel wrote: > On 04/09/15 09:53, Ian Campbell wrote: > > On Fri, 2015-09-04 at 01:40 +0100, Wei Liu wrote: > >> Hi David > >> > >> This issue is exposed by the introduction of migration v2. The symptom is that > >> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too > >> many pages. > > > > FWIW my adhoc tests overnight gave me: > > > > 37858: b953c0d234bc72e8489d3bf51a276c5c4ec85345 v4.1 Fail > > 37862: 39a8804455fb23f09157341d3ba7db6d7ae6ee76 v4.0 Fail > > 37860: bfa76d49576599a4b9f9b7a71f23d73d6dcff735 v3.19 Fail > > > > 37872: e36f014edff70fc02b3d3d79cead1d58f289332e v3.19-rc7 Fail > > 37866: 26bc420b59a38e4e6685a73345a0def461136dce v3.19-rc6 Fail > > 37868: ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc v3.19-rc5 Fail > > 37864: eaa27f34e91a14cdceed26ed6c6793ec1d186115 v3.19-rc4 Fail * > > 37867: b1940cd21c0f4abdce101253e860feff547291b0 v3.19-rc3 Pass * > > 37865: b7392d2247cfe6771f95d256374f1a8e6a6f48d6 v3.19-rc2 Pass > > > > 37863: 97bf6af1f928216fd6c5a66e8a57bfa95a659672 v3.19-rc1 Pass > > > > 37861: b2776bf7149bddd1f4161f14f79520f17fc1d71d v3.18 Pass > > > > I have set the adhoc bisector working on the ~200 commits between rc3 and > > rc4. It's running in the Citrix instance (which is quieter) so the interim > > results are only visible within our network at http://osstest.xs.citrite.ne > > t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386 > > -xl..html. > > > > So far it has confirmed the basis fail and it is now rechecking the basis > > pass. > > > > Slightly strange though is: > > $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ include/xen/ > > $ > > > > i.e. there are no relevant seeming xen commits in that range. Maybe the > > last one of this is more relevant? > > Since this bisect attempt appears to have disappeared into the weeds I > did my own and it fingered: > > 633d6f17cd91ad5bf2370265946f716e42d388c6 (x86/xen: prepare p2m list for > memory hotplug) which was introduced in 4.0-rc7. > > This looks a lot more plausible as the Linux change triggering the > migration failures. > FWIW. Same 32bit kernel, 128MB memory, migration is OK. > David ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 14:53 ` Wei Liu @ 2015-09-04 14:58 ` David Vrabel 0 siblings, 0 replies; 15+ messages in thread From: David Vrabel @ 2015-09-04 14:58 UTC (permalink / raw) To: Wei Liu; +Cc: Juergen Gross, xen-devel, Ian Jackson, Ian Campbell, Andrew Cooper On 04/09/15 15:53, Wei Liu wrote: > On Fri, Sep 04, 2015 at 03:42:06PM +0100, David Vrabel wrote: >> On 04/09/15 09:53, Ian Campbell wrote: >>> On Fri, 2015-09-04 at 01:40 +0100, Wei Liu wrote: >>>> Hi David >>>> >>>> This issue is exposed by the introduction of migration v2. The symptom is that >>>> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too >>>> many pages. >>> >>> FWIW my adhoc tests overnight gave me: >>> >>> 37858: b953c0d234bc72e8489d3bf51a276c5c4ec85345 v4.1 Fail >>> 37862: 39a8804455fb23f09157341d3ba7db6d7ae6ee76 v4.0 Fail >>> 37860: bfa76d49576599a4b9f9b7a71f23d73d6dcff735 v3.19 Fail >>> >>> 37872: e36f014edff70fc02b3d3d79cead1d58f289332e v3.19-rc7 Fail >>> 37866: 26bc420b59a38e4e6685a73345a0def461136dce v3.19-rc6 Fail >>> 37868: ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc v3.19-rc5 Fail >>> 37864: eaa27f34e91a14cdceed26ed6c6793ec1d186115 v3.19-rc4 Fail * >>> 37867: b1940cd21c0f4abdce101253e860feff547291b0 v3.19-rc3 Pass * >>> 37865: b7392d2247cfe6771f95d256374f1a8e6a6f48d6 v3.19-rc2 Pass >>> >>> 37863: 97bf6af1f928216fd6c5a66e8a57bfa95a659672 v3.19-rc1 Pass >>> >>> 37861: b2776bf7149bddd1f4161f14f79520f17fc1d71d v3.18 Pass >>> >>> I have set the adhoc bisector working on the ~200 commits between rc3 and >>> rc4. It's running in the Citrix instance (which is quieter) so the interim >>> results are only visible within our network at http://osstest.xs.citrite.ne >>> t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386 >>> -xl..html. >>> >>> So far it has confirmed the basis fail and it is now rechecking the basis >>> pass. >>> >>> Slightly strange though is: >>> $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ include/xen/ >>> $ >>> >>> i.e. there are no relevant seeming xen commits in that range. Maybe the >>> last one of this is more relevant? >> >> Since this bisect attempt appears to have disappeared into the weeds I >> did my own and it fingered: >> >> 633d6f17cd91ad5bf2370265946f716e42d388c6 (x86/xen: prepare p2m list for >> memory hotplug) which was introduced in 4.0-rc7. >> >> This looks a lot more plausible as the Linux change triggering the >> migration failures. >> > > FWIW. Same 32bit kernel, 128MB memory, migration is OK. This commit is only bad with 64-bit guests -- with a 32-bit guest the maximum p2m size covers only 64 GiB. It will also requires XEN_BALLOON_MEMORY_HOTPLUG to be enabled. This commit is exposing a toolstack bug. David ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Linux 4.1 reports wrong number of pages to toolstack 2015-09-04 8:53 ` Ian Campbell 2015-09-04 9:28 ` Ian Campbell 2015-09-04 14:42 ` David Vrabel @ 2015-09-07 7:09 ` Jan Beulich 2 siblings, 0 replies; 15+ messages in thread From: Jan Beulich @ 2015-09-07 7:09 UTC (permalink / raw) To: Ian Campbell, xen-devel Cc: Ian Jackson, Andrew Cooper, Wei Liu, DavidVrabel, Juergen Gross >>> On 04.09.15 at 10:53, <ian.campbell@citrix.com> wrote: > I have set the adhoc bisector working on the ~200 commits between rc3 and > rc4. It's running in the Citrix instance (which is quieter) so the interim > results are only visible within our network at http://osstest.xs.citrite.ne > t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386 > -xl..html. > > So far it has confirmed the basis fail and it is now rechecking the basis > pass. > > Slightly strange though is: > $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ > include/xen/ > $ > > i.e. there are no relevant seeming xen commits in that range. Maybe the > last one of this is more relevant? > > $ git log --grep=[xX][eE][nN] --oneline v3.19-rc3..v3.19-rc4 -- > bdec419 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net > 07ff890 xen-netback: fixing the propagation of the transmit shaper timeout > 132978b x86: Fix step size adjustment during initial memory mapping > $ So if I'm interpreting the graph right it was indeed the last of these which got fingered, which is mine. Yet having looked at it in close detail just now again I can't see it to be wrong, or even have an effect on post-boot state: All it does is adjust the block sizes in which the 1:1 mapping gets established. The final result ought to still be the same (with - obviously - the exception of which pages may get used for page tables). No change to any global variables afaics. Jan ^ permalink raw reply [flat|nested] 15+ messages in thread