* makedumpfile 1.5.0 takes much more time to dump @ 2012-09-20 20:06 Vivek Goyal 2012-09-21 0:23 ` HATAYAMA Daisuke 0 siblings, 1 reply; 13+ messages in thread From: Vivek Goyal @ 2012-09-20 20:06 UTC (permalink / raw) To: Atsushi Kumagai; +Cc: Dave Young, Kexec Mailing List, chaowang Hi Atsushi san, We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress badly. We reserved 192MB of memory and following are test results. #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 real 3m47.520s user 0m56.543s sys 2m41.631s #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 real 52m25.262s user 32m51.310s sys 18m53.265s #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 real 8m49.107s user 4m34.180s sys 4m8.691s #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 real 46m48.985s user 29m35.203s sys 16m43.149s Thanks Vivek _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-09-20 20:06 makedumpfile 1.5.0 takes much more time to dump Vivek Goyal @ 2012-09-21 0:23 ` HATAYAMA Daisuke 2012-09-21 0:43 ` HATAYAMA Daisuke 2012-09-21 13:32 ` Vivek Goyal 0 siblings, 2 replies; 13+ messages in thread From: HATAYAMA Daisuke @ 2012-09-21 0:23 UTC (permalink / raw) To: vgoyal; +Cc: dyoung, kumagai-atsushi, kexec, chaowang From: Vivek Goyal <vgoyal@redhat.com> Subject: makedumpfile 1.5.0 takes much more time to dump Date: Thu, 20 Sep 2012 16:06:34 -0400 > Hi Atsushi san, > > We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress > badly. We reserved 192MB of memory and following are test results. > > #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 > real 3m47.520s > user 0m56.543s > sys 2m41.631s > > #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 > real 52m25.262s > user 32m51.310s > sys 18m53.265s > > #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 > real 8m49.107s > user 4m34.180s > sys 4m8.691s > > #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 > real 46m48.985s > user 29m35.203s > sys 16m43.149s > Hello Vivek, On v1.5.0 we cannot filter free pages in constant space because we have yet to test it. Instead, the existing method is used here, which repeatedly walks all page frames, once per cycle. As Kumagai-san explains, the number of cycles can be calculated by the following expression: N = physical memory size / (page size * bits per byte(8) * BUFSIZE_CYCLIC) So, N = 2TB / (4KB * 8 * 1MB) = 64 cycles. I guess that on this environment it took about 50 seconds to filter free pages in one cycle. Thanks. HATAYAMA, Daisuke
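The cycle-count arithmetic above can be sketched as follows (an illustrative helper of my own, not makedumpfile code; it just evaluates the expression from the message):

```python
# Each bitmap byte describes 8 pages, so a cyclic buffer of B bytes
# covers B * 8 * PAGE_SIZE bytes of physical memory per cycle.
PAGE_SIZE = 4096          # 4 KB pages, as assumed in the thread
BITS_PER_BYTE = 8

def num_cycles(mem_bytes, cyclic_buffer_bytes):
    covered_per_cycle = cyclic_buffer_bytes * BITS_PER_BYTE * PAGE_SIZE
    return -(-mem_bytes // covered_per_cycle)  # ceiling division

TB = 1 << 40
MB = 1 << 20
print(num_cycles(2 * TB, 1 * MB))  # 64, matching the figure above
print(num_cycles(1 * TB, 1 * MB))  # 32, the corrected figure for the 1TB machine
```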
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-09-21 0:23 ` HATAYAMA Daisuke @ 2012-09-21 0:43 ` HATAYAMA Daisuke 2012-09-21 13:32 ` Vivek Goyal 1 sibling, 0 replies; 13+ messages in thread From: HATAYAMA Daisuke @ 2012-09-21 0:43 UTC (permalink / raw) To: vgoyal; +Cc: kumagai-atsushi, dyoung, kexec, chaowang From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Subject: Re: makedumpfile 1.5.0 takes much more time to dump Date: Fri, 21 Sep 2012 09:23:57 +0900 > From: Vivek Goyal <vgoyal@redhat.com> > Subject: makedumpfile 1.5.0 takes much more time to dump > Date: Thu, 20 Sep 2012 16:06:34 -0400 > >> Hi Atsushi san, >> >> We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress >> badly. We reserved 192MB of memory and following are test results. >> >> #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 >> real 3m47.520s >> user 0m56.543s >> sys 2m41.631s >> >> #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 >> real 52m25.262s >> user 32m51.310s >> sys 18m53.265s >> >> #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 >> real 8m49.107s >> user 4m34.180s >> sys 4m8.691s >> >> #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 >> real 46m48.985s >> user 29m35.203s >> sys 16m43.149s >> > > Hello Vivek, > > On v1.5.0 we cannot filter free pages in constant space becuase we > have yet to test it. Instead, the existing method is used here, which > repeats walking on a whole page frames the number of cycles times. > > As Kumagai-san explains, the number of cycles can be calculated by the > following expression: > > N = physical memory size / (page size * bit per byte(8) * BUFSIZE_CYCLIC) > > So, > > N = 2TB / (4KB * 8 * 1MB) = 64 cycles. > > I guess on this environment, it took about 50 seconds to filter free > pages in one cycle. > I noticed a careless mistake: 1TB is correct in your case. N = 1TB / (4KB * 8 * 1MB) = 32 cycles. So, about 95 seconds for one cycle? Thanks.
HATAYAMA, Daisuke
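As a rough cross-check of the ~95-second estimate (my own arithmetic, not from the thread): subtracting the v1.4.2 -E wall-clock time from the v1.5.0 -E time and dividing by the 32 cycles gives a figure in the same ballpark:

```python
# Wall-clock ("real") times from the benchmark above, in seconds.
v150_real = 52 * 60 + 25   # makedumpfile-1.5.0 -E: 52m25s
v142_real = 3 * 60 + 47    # makedumpfile-1.4.2 -E: 3m47s

# Attribute the entire regression to the 32 redundant free-page scans.
extra_per_cycle = (v150_real - v142_real) / 32
print(round(extra_per_cycle))  # 91 seconds of extra filtering per cycle
```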
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-09-21 0:23 ` HATAYAMA Daisuke 2012-09-21 0:43 ` HATAYAMA Daisuke @ 2012-09-21 13:32 ` Vivek Goyal 2012-09-24 0:51 ` HATAYAMA Daisuke 1 sibling, 1 reply; 13+ messages in thread From: Vivek Goyal @ 2012-09-21 13:32 UTC (permalink / raw) To: HATAYAMA Daisuke; +Cc: dyoung, kumagai-atsushi, kexec, chaowang On Fri, Sep 21, 2012 at 09:23:57AM +0900, HATAYAMA Daisuke wrote: > From: Vivek Goyal <vgoyal@redhat.com> > Subject: makedumpfile 1.5.0 takes much more time to dump > Date: Thu, 20 Sep 2012 16:06:34 -0400 > > > Hi Atsushi san, > > > > We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress > > badly. We reserved 192MB of memory and following are test results. > > > > #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 > > real 3m47.520s > > user 0m56.543s > > sys 2m41.631s > > > > #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 > > real 52m25.262s > > user 32m51.310s > > sys 18m53.265s > > > > #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 > > real 8m49.107s > > user 4m34.180s > > sys 4m8.691s > > > > #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 > > real 46m48.985s > > user 29m35.203s > > sys 16m43.149s > > > > Hello Vivek, > > On v1.5.0 we cannot filter free pages in constant space becuase we > have yet to test it. Instead, the existing method is used here, which > repeats walking on a whole page frames the number of cycles times. > > As Kumagai-san explains, the number of cycles can be calculated by the > following expression: > > N = physical memory size / (page size * bit per byte(8) * BUFSIZE_CYCLIC) > > So, > > N = 2TB / (4KB * 8 * 1MB) = 64 cycles. > > I guess on this environment, it took about 50 seconds to filter free > pages in one cycle. Ok, so once we have your page struct walking patches in, hopefully this problem will be gone? If that's going to take time, can we make use of the new logic conditional on a command line option, so that the user has the option of using the old logic? Thanks Vivek
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-09-21 13:32 ` Vivek Goyal @ 2012-09-24 0:51 ` HATAYAMA Daisuke 2012-09-24 14:51 ` Vivek Goyal 0 siblings, 1 reply; 13+ messages in thread From: HATAYAMA Daisuke @ 2012-09-24 0:51 UTC (permalink / raw) To: vgoyal, kumagai-atsushi; +Cc: kexec, dyoung, chaowang From: Vivek Goyal <vgoyal@redhat.com> Subject: Re: makedumpfile 1.5.0 takes much more time to dump Date: Fri, 21 Sep 2012 09:32:32 -0400 > On Fri, Sep 21, 2012 at 09:23:57AM +0900, HATAYAMA Daisuke wrote: >> From: Vivek Goyal <vgoyal@redhat.com> >> Subject: makedumpfile 1.5.0 takes much more time to dump >> Date: Thu, 20 Sep 2012 16:06:34 -0400 >> >> > Hi Atsushi san, >> > >> > We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress >> > badly. We reserved 192MB of memory and following are test results. >> > >> > #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 >> > real 3m47.520s >> > user 0m56.543s >> > sys 2m41.631s >> > >> > #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 >> > real 52m25.262s >> > user 32m51.310s >> > sys 18m53.265s >> > >> > #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 >> > real 8m49.107s >> > user 4m34.180s >> > sys 4m8.691s >> > >> > #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 >> > real 46m48.985s >> > user 29m35.203s >> > sys 16m43.149s >> > >> >> Hello Vivek, >> >> On v1.5.0 we cannot filter free pages in constant space becuase we >> have yet to test it. Instead, the existing method is used here, which >> repeats walking on a whole page frames the number of cycles times. >> >> As Kumagai-san explains, the number of cycles can be calculated by the >> following expression: >> >> N = physical memory size / (page size * bit per byte(8) * BUFSIZE_CYCLIC) >> >> So, >> >> N = 2TB / (4KB * 8 * 1MB) = 64 cycles. >> >> I guess on this environment, it took about 50 seconds to filter free >> pages in one cycle. 
> > Ok, so once we have your walking through page struct patches in, hopefully > this problem will be gone? > > Yes, free page filtering is then integrated into the rest of the mem_map array logic. You can see how long it takes from the following message. Excluding unnecessary pages : [100 %] STEP [Excluding unnecessary pages] : 0.204574 seconds The new logic now takes no longer than the total of the times shown in this kind of message. > If that's going to take time, can we make using of new logic conditional > on a command line option. So that user has the option of using old > logic. > Kumagai-san should decide this. BTW, Kumagai-san, in what version do you plan to release this new mem_map logic? Around v1.5.2? I want to re-send a new patch set to which I have added small changes and comments for maintainability. Thanks. HATAYAMA, Daisuke
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-09-24 0:51 ` HATAYAMA Daisuke @ 2012-09-24 14:51 ` Vivek Goyal 2012-10-03 7:38 ` Atsushi Kumagai 0 siblings, 1 reply; 13+ messages in thread From: Vivek Goyal @ 2012-09-24 14:51 UTC (permalink / raw) To: HATAYAMA Daisuke; +Cc: dyoung, kumagai-atsushi, kexec, chaowang On Mon, Sep 24, 2012 at 09:51:12AM +0900, HATAYAMA Daisuke wrote: > From: Vivek Goyal <vgoyal@redhat.com> > Subject: Re: makedumpfile 1.5.0 takes much more time to dump > Date: Fri, 21 Sep 2012 09:32:32 -0400 > > > On Fri, Sep 21, 2012 at 09:23:57AM +0900, HATAYAMA Daisuke wrote: > >> From: Vivek Goyal <vgoyal@redhat.com> > >> Subject: makedumpfile 1.5.0 takes much more time to dump > >> Date: Thu, 20 Sep 2012 16:06:34 -0400 > >> > >> > Hi Atsushi san, > >> > > >> > We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress > >> > badly. We reserved 192MB of memory and following are test results. > >> > > >> > #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 > >> > real 3m47.520s > >> > user 0m56.543s > >> > sys 2m41.631s > >> > > >> > #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 > >> > real 52m25.262s > >> > user 32m51.310s > >> > sys 18m53.265s > >> > > >> > #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 > >> > real 8m49.107s > >> > user 4m34.180s > >> > sys 4m8.691s > >> > > >> > #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 > >> > real 46m48.985s > >> > user 29m35.203s > >> > sys 16m43.149s > >> > > >> > >> Hello Vivek, > >> > >> On v1.5.0 we cannot filter free pages in constant space becuase we > >> have yet to test it. Instead, the existing method is used here, which > >> repeats walking on a whole page frames the number of cycles times. > >> > >> As Kumagai-san explains, the number of cycles can be calculated by the > >> following expression: > >> > >> N = physical memory size / (page size * bit per byte(8) * BUFSIZE_CYCLIC) > >> > >> So, > >> > >> N = 2TB / (4KB * 8 * 1MB) = 64 cycles. 
> >> > >> I guess on this environment, it took about 50 seconds to filter free > >> pages in one cycle. > > > > Ok, so once we have your walking through page struct patches in, hopefully > > this problem will be gone? > > > > Yes, then free page filtering is integrated in other mem_map array > logics. You can see how it takes from the following message. > > Excluding unnecessary pages : [100 %] STEP [Excluding unnecessary pages] : 0.204574 seconds > > New logic takes equal to or quicker than the total time indicated by > the above kind of messages now. > > > If that's going to take time, can we make using of new logic conditional > > on a command line option. So that user has the option of using old > > logic. > > > > Kumagai-san should decide this. Ok, thanks a lot for testing and information. Kumagai-san is on vacation till sept 30. So I will wait for the response. Thanks Vivek _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-09-24 14:51 ` Vivek Goyal @ 2012-10-03 7:38 ` Atsushi Kumagai 2012-10-03 12:48 ` Vivek Goyal 2012-10-04 1:15 ` HATAYAMA Daisuke 0 siblings, 2 replies; 13+ messages in thread From: Atsushi Kumagai @ 2012-10-03 7:38 UTC (permalink / raw) To: vgoyal, d.hatayama; +Cc: kexec, dyoung, chaowang Hello, > >> > Hi Atsushi san, > >> > > >> > We tried makedumpfile 1.5.0 on a 1TB machine and it seems to regress > >> > badly. We reserved 192MB of memory and following are test results. > >> > > >> > #1. makedumpfile-1.4.2 -E --message-level 1 -d 31 > >> > real 3m47.520s > >> > user 0m56.543s > >> > sys 2m41.631s > >> > > >> > #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 > >> > real 52m25.262s > >> > user 32m51.310s > >> > sys 18m53.265s > >> > > >> > #3. makedumpfile-1.4.2 -c --message-level 1 -d 31 > >> > real 8m49.107s > >> > user 4m34.180s > >> > sys 4m8.691s > >> > > >> > #4. makedumpfile-1.5.0 -c --message-level 1 -d 31 > >> > real 46m48.985s > >> > user 29m35.203s > >> > sys 16m43.149s > >> > > >> > >> Hello Vivek, > >> > >> On v1.5.0 we cannot filter free pages in constant space becuase we > >> have yet to test it. Instead, the existing method is used here, which > >> repeats walking on a whole page frames the number of cycles times. > >> > >> As Kumagai-san explains, the number of cycles can be calculated by the > >> following expression: > >> > >> N = physical memory size / (page size * bit per byte(8) * BUFSIZE_CYCLIC) > >> > >> So, > >> > >> N = 2TB / (4KB * 8 * 1MB) = 64 cycles. > >> > >> I guess on this environment, it took about 50 seconds to filter free > >> pages in one cycle. > > > > Ok, so once we have your walking through page struct patches in, hopefully > > this problem will be gone? > Yes, then free page filtering is integrated in other mem_map array > logics. You can see how it takes from the following message. 
> > Excluding unnecessary pages : [100 %] STEP [Excluding unnecessary pages] : 0.204574 seconds > > New logic takes equal to or quicker than the total time indicated by > the above kind of messages now. > > > If that's going to take time, can we make using of new logic conditional > > on a command line option. So that user has the option of using old > > logic. > > > > Kumagai-san should decide this. I'm not planning to make the cyclic mode optional. However, I think the performance issue should be improved even without the mem_map array logic. So I will try to reduce the number of cycles to as few as possible for v1.5.1; that should improve the performance issue. To make sure of it, would you re-test with --cyclic-buffer 32768 (32MB), Vivek? If the result of v1.5.0 is still too bad, I will consider using the old logic as the default logic. > BTW, Kumagai-san, in what version do you plan release of this new > mem_map logic? Around v1.5.2? I want to re-send a new patch set to > which I add small changes and comments for maintainability. Yes, I plan to merge it in v1.5.2. Would you re-send the new patch set based on v1.5.1? Additionally, I'm afraid that I haven't reviewed most of the patches for v1.5.1 yet, so I can't tell you when v1.5.2 will be released. Thanks Atsushi Kumagai
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-10-03 7:38 ` Atsushi Kumagai @ 2012-10-03 12:48 ` Vivek Goyal 2012-10-04 1:36 ` HATAYAMA Daisuke 2012-10-04 1:15 ` HATAYAMA Daisuke 1 sibling, 1 reply; 13+ messages in thread From: Vivek Goyal @ 2012-10-03 12:48 UTC (permalink / raw) To: Atsushi Kumagai; +Cc: kexec, d.hatayama, dyoung, chaowang On Wed, Oct 03, 2012 at 04:38:35PM +0900, Atsushi Kumagai wrote: [..] > > > If that's going to take time, can we make using of new logic conditional > > > on a command line option. So that user has the option of using old > > > logic. > > > > > > > Kumagai-san should decide this. > > I'm not planning to make the cyclic mode optional. > However, I think the performance issue should be improved even without > mem_map array logic. > > So, I will try to reduce the number of cycles as few times as possible for v1.5.1, > the performance issue will be improved. > To make sure of it, would you re-test with --cyclic-buffer 32768 (32MB), Vivek ? > Then the result of v1.5.0 is still too bad, I will consider using the old logic > as default logic. Actually chaowang did the testing. In the bug he provided data for 16MB buffer. makedumpfile with 16M cyclic buffer, #1. makedumpfile-1.5.0 -c --message-level 1 -d 31 --cyclic-buffer 16384 real 12m51.886s user 6m30.710s sys 6m11.642s #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 --cyclic-buffer 16384 real 11m24.141s user 4m25.897s sys 6m38.116s Which looks much better than default numbers. Chao, can you please do the testing with 32MB buffer size and provide the data here. Thanks Vivek _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-10-03 12:48 ` Vivek Goyal @ 2012-10-04 1:36 ` HATAYAMA Daisuke 0 siblings, 0 replies; 13+ messages in thread From: HATAYAMA Daisuke @ 2012-10-04 1:36 UTC (permalink / raw) To: vgoyal; +Cc: dyoung, kumagai-atsushi, kexec, chaowang From: Vivek Goyal <vgoyal@redhat.com> Subject: Re: makedumpfile 1.5.0 takes much more time to dump Date: Wed, 3 Oct 2012 08:48:13 -0400 > On Wed, Oct 03, 2012 at 04:38:35PM +0900, Atsushi Kumagai wrote: > > [..] >> > > If that's going to take time, can we make using of new logic conditional >> > > on a command line option. So that user has the option of using old >> > > logic. >> > > >> > >> > Kumagai-san should decide this. >> >> I'm not planning to make the cyclic mode optional. >> However, I think the performance issue should be improved even without >> mem_map array logic. >> >> So, I will try to reduce the number of cycles as few times as possible for v1.5.1, >> the performance issue will be improved. >> To make sure of it, would you re-test with --cyclic-buffer 32768 (32MB), Vivek ? >> Then the result of v1.5.0 is still too bad, I will consider using the old logic >> as default logic. > > Actually chaowang did the testing. In the bug he provided data for 16MB > buffer. > > makedumpfile with 16M cyclic buffer, > #1. makedumpfile-1.5.0 -c --message-level 1 -d 31 --cyclic-buffer 16384 > real 12m51.886s > user 6m30.710s > sys 6m11.642s > #2. makedumpfile-1.5.0 -E --message-level 1 -d 31 --cyclic-buffer 16384 > real 11m24.141s > user 4m25.897s > sys 6m38.116s > > Which looks much better than default numbers. Chao, can you please do the > testing with 32MB buffer size and provide the data here. > This 32MB setting is identical to the old logic in the 1TB case, except that the cyclic buffer is in memory rather than handled via a temporary file. So, free list filtering is done at most once, even in the worst case. But it's necessary to specify a proper size for other memory sizes. Thanks.
HATAYAMA, Daisuke
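The claim that a 32MB buffer collapses a 1TB dump into a single cycle follows from the same coverage formula discussed earlier in the thread; a minimal check (my own sketch, with the 4KB page size the thread assumes):

```python
PAGE_SIZE = 4096

def buffer_for_one_cycle(mem_bytes):
    # Bitmap bytes needed to describe every page of the machine in a single pass:
    # one bit per page, eight pages per byte.
    return mem_bytes // (PAGE_SIZE * 8)

TB = 1 << 40
MB = 1 << 20
# 32 MB of bitmap covers 1 TB, i.e. the --cyclic-buffer 32768 (KB) setting above.
print(buffer_for_one_cycle(1 * TB) // MB)  # 32
```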
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-10-03 7:38 ` Atsushi Kumagai 2012-10-03 12:48 ` Vivek Goyal @ 2012-10-04 1:15 ` HATAYAMA Daisuke 1 sibling, 0 replies; 13+ messages in thread From: HATAYAMA Daisuke @ 2012-10-04 1:15 UTC (permalink / raw) To: kumagai-atsushi; +Cc: kexec, dyoung, chaowang, vgoyal From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Subject: Re: makedumpfile 1.5.0 takes much more time to dump Date: Wed, 3 Oct 2012 16:38:35 +0900 <cut> >> BTW, Kumagai-san, in what version do you plan release of this new >> mem_map logic? Around v1.5.2? I want to re-send a new patch set to >> which I add small changes and comments for maintainability. > > Yes, I plan to merge it in v1.5.2. > Would you re-send the new patch set based on v1.5.1 ? > Yes, I'll do so. > Additionally, I'm afraid that I haven't reviewed the most of patches > for v1.5.1 yet. So I can't tell you when v1.5.2 will be released. > Please let me know when you figure it out. Thanks. HATAYAMA, Daisuke
* Re: makedumpfile 1.5.0 takes much more time to dump [not found] <1350912018.13097.54.camel@lisamlinux.fc.hp.com> @ 2012-10-24 7:45 ` Atsushi Kumagai 2012-10-25 11:09 ` Lisa Mitchell 0 siblings, 1 reply; 13+ messages in thread From: Atsushi Kumagai @ 2012-10-24 7:45 UTC (permalink / raw) To: lisa.mitchell; +Cc: kexec, jerry.hoemann Hello Lisa, On Mon, 22 Oct 2012 07:20:18 -0600 Lisa Mitchell <lisa.mitchell@hp.com> wrote: > Jerry Hoemann and I tested the new makedumpfile 1.5.0 on a DL980 with 4 > TB of memory, which is the maximum supported for this system. We tested > it on top of a 2.6.32 kernel plus patches, had the dump level set to 31 > for smallest dump, and found that the dump would not complete in a > reasonable time frame, basically staying for over 16 hours in the state > where it cycled through "Excluding Free pages" (would go from 0-100%) > and "Excluding unnecessary pages" (0-100%). It just alternated between > these two all night. I did not try waiting longer than 17 hours to see > if it ever completed, because with an earlier makedumpfile on this same > system, the dump would complete in a few hours. Console logs can be > provided if desired. > > Are we are seeing known issues that will be addressed in the next > makedumpfile? > > >From this email chain, it sounds like others see similar issues, but we > want to be sure we are not seeing something different. I think you're seeing the known issue which we discussed, I will address it in v1.5.1 and v1.5.2. > I can arrange for access to a DL980 with 4 TB of memory later when the > new makedumpfile v1.5.1 is available, and we would very much like to > test any fixes on our 4 TB system. Please let me know when it is > available to try. I will release the next version by the end of this year. 
If you need some workarounds now, please use the workaround described in the release note: http://lists.infradead.org/pipermail/kexec/2012-September/006768.html At least in v1.5.0, if you feel the cyclic mode is slow, you can try two workarounds: 1. Use the old running mode with the "--non-cyclic" option. 2. Decrease the number of cycles by increasing BUFSIZE_CYCLIC with the "--cyclic-buffer" option. Please refer to the manual page for how to use these options. > Meanwhile, if there are debug steps we could take to better understand > the performance issue, and help get this new solution working (so dumps > can scale to larger memory, and we can keep crashkernel size limited to > 384 MB), please let me know. First, the behavior of makedumpfile can be described as two steps: Step1. analysis Analyzing the vmcore and creating the bitmap which represents whether each page should be excluded or not. v1.4.4 and earlier save the bitmap to a file, and it grows with the size of the vmcore, while v1.5.0 saves it in memory and its size is constant, based on the BUFSIZE_CYCLIC parameter. The bitmap is the biggest memory footprint, and that's why v1.5.0 can work in constant memory space. Step2. writing Writing each page to disk according to the bitmap created in step1. Second, I show the process image below: a. v1.4.4 or before [process image] cycle 1 +----------------- -----+ vmcore | ... | +----------------- -----+ [execution sequence] cycle | 1 ---------+------- step1 | 1 | step2 | 2 [bitmap] Save the bitmap for the whole of vmcore at a time. b. v1.5.0 [process image] cycle 1 2 3 4 ... N +----------------- -----+ vmcore | | | | | ... | | +----------------- -----+ [execution sequence] cycle | 1 2 3 4 ... N ---------+------------------------------------ step1 | 1 /3 /5 /7 / (2N-1) | | / | / | / | / | step2 | 2/ 4/ 6/ 8/ (2N) [bitmap] Save the bitmap only for a cycle at a time.
Step1 should scan only the region of vmcore corresponding to each cycle, but the current logic needs to scan all free pages every cycle. To sum up, the more cycles there are, the more redundant scans are done. The default BUFSIZE_CYCLIC of v1.5.0 is too small for terabytes of memory, so the number of cycles will be very large. (e.g. N is 32 on 1TB machines.) As a result, a lot of time will be spent in step1. Therefore, I will implement a feature to reduce the number of cycles as much as possible automatically in v1.5.1. For now, you can get the same benefit by allocating enough memory with the --cyclic-buffer option. For 4TB machines, you should specify "--cyclic-buffer 131072" if possible. (In this case, 256MB is actually required. Please see the man page for the details of this option.) Additionally, I will resolve the issue in the free page exclusion logic in v1.5.2. Thanks Atsushi Kumagai
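The execution sequence in the v1.5.0 diagram above can be modeled as a simple loop (an illustrative sketch of my own in Python, not makedumpfile's C code; the function and step names are hypothetical):

```python
def cyclic_dump(total_pages, pages_per_cycle):
    """Model of the v1.5.0 flow: analyze, then write, one window at a time."""
    sequence = []
    start = 0
    while start < total_pages:
        end = min(start + pages_per_cycle, total_pages)
        sequence.append(("step1-analyze", start, end))  # build bitmap for this window
        sequence.append(("step2-write", start, end))    # write pages per that bitmap
        start = end
    return sequence

# 100 pages with a 40-page window -> 3 cycles, hence 2N = 6 steps,
# matching the (2N-1)/(2N) numbering in the diagram.
print(len(cyclic_dump(100, 40)))  # 6
```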
* Re: makedumpfile 1.5.0 takes much more time to dump 2012-10-24 7:45 ` Atsushi Kumagai @ 2012-10-25 11:09 ` Lisa Mitchell 2012-11-06 3:37 ` Atsushi Kumagai 0 siblings, 1 reply; 13+ messages in thread From: Lisa Mitchell @ 2012-10-25 11:09 UTC (permalink / raw) To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org, Hoemann, Jerry On Wed, 2012-10-24 at 07:45 +0000, Atsushi Kumagai wrote: > Hello Lisa, > > On Mon, 22 Oct 2012 07:20:18 -0600 > Lisa Mitchell <lisa.mitchell@hp.com> wrote: > > > Jerry Hoemann and I tested the new makedumpfile 1.5.0 on a DL980 with 4 > > TB of memory, which is the maximum supported for this system. We tested > > it on top of a 2.6.32 kernel plus patches, had the dump level set to 31 > > for smallest dump, and found that the dump would not complete in a > > reasonable time frame, basically staying for over 16 hours in the state > > where it cycled through "Excluding Free pages" (would go from 0-100%) > > and "Excluding unnecessary pages" (0-100%). It just alternated between > > these two all night. I did not try waiting longer than 17 hours to see > > if it ever completed, because with an earlier makedumpfile on this same > > system, the dump would complete in a few hours. Console logs can be > > provided if desired. > > > > Are we are seeing known issues that will be addressed in the next > > makedumpfile? > > > > >From this email chain, it sounds like others see similar issues, but we > > want to be sure we are not seeing something different. > > I think you're seeing the known issue which we discussed, I will address it > in v1.5.1 and v1.5.2. > > > I can arrange for access to a DL980 with 4 TB of memory later when the > > new makedumpfile v1.5.1 is available, and we would very much like to > > test any fixes on our 4 TB system. Please let me know when it is > > available to try. > > I will release the next version by the end of this year. 
> If you need some workarounds now, please use the workaroud described in > the release note: > > http://lists.infradead.org/pipermail/kexec/2012-September/006768.html > > At least in v1.5.0, if you feel the cyclic mode is slow, you can try 2 workaronds: > > 1. Use old running mode with "--non-cyclic" option. > > 2. Decrease the number of cycles by increasing BUFSIZE_CYCLIC with > "--cyclic-buffer" option. > > Please refer to the manual page for how to use these options. > > > Meanwhile, if there are debug steps we could take to better understand > > the performance issue, and help get this new solution working (so dumps > > can scale to larger memory, and we can keep crashkernel size limited to > > 384 MB), please let me know. > > At first, the behavior of makedumpfile can be described as two steps: > > Step1. analysis > Analyzing vmcore and creating the bitmap which represent whether each pages > should be excluded or not. > v1.4.4 or before save the bitmap into a file and it grows with the size of > vmcore, while v1.5.0 saves it in memory and the size of it is constant > based on BUFSIZE_CYCLIC parameter. > The bitmap is the biggest memory footprint and that's why v1.5.0 can work > in constant memory space. > > Step2. writing > Writing each pages to a disk according to the bitmap created in step1. > > Second, I show the process image below: > > a. v1.4.4 or before > > [process image] > > cycle 1 > +----------------- -----+ > vmcore | ... | > +----------------- -----+ > > [execution sequence] > > cycle | 1 > ---------+------- > step1 | 1 > | > step2 | 2 > > [bitmap] > > Save the bitmap for the whole of vmcore at a time. > > > b. v1.5.0 > > [process image] > > cycle 1 2 3 4 ... N > +----------------- -----+ > vmcore | | | | | ... | | > +----------------- -----+ > > [execution sequence] > > cycle | 1 2 3 4 ... 
N > ---------+------------------------------------ > step1 | 1 /3 /5 /7 / (2N-1) > | | / | / | / | / | > step2 | 2/ 4/ 6/ 8/ (2N) > > [bitmap] > > Save the bitmap only for a cycle at a time. > > > Step1 should scan only the constant region of vmcore correspond to each cycle, > but the current logic needs to scan all free pages every cycle. > To sum it up, the more the number of cycle, the more redundant scans will be done. > > The default BUFSIZE_CYCLIC of v1.5.0 is too small for terabytes of memory, > the number of cycle will be so large. (e.g. N is 32 in 1TB machines.) > As a result, a lot of time will be spend for step1. > > Therefore, I will implement the feature to reduce the number of cycle as few as > possible automatically in v1.5.1. > Now, you can get the same benefit by allocating enough memory with --cyclic-buffer > option. For 4TB machines, you should specify "--cyclic-buffer 131072" if it's possible. > (In this case, 256MB is required actually. Please see the man page for the > details of this option.) > > Additionally, I will resolve the issue included in the logic of excluding > free pages in v1.5.2. > > > Thanks > Atsushi Kumagai Thanks, Atsushi! I tried the dump on the 4 TB system with --cyclic-buffer 131072, and the dump completed overnight, and I collected a complete vmcore for dump level 31. It looks like from the console log the system "cycled" twice with this setting, two passes of excluding and copying, before the dump was completed. I am in the process of making a more precise timing measurement of the dump time today. Looks like each cycle takes about 1 hour for this system, with the majority of this time spent in "Excluding unnecessary pages" phase of each cycle. However if I understand what you are doing with the cyclic-buffer parameter, it seems we are taking up 128 MB of the crash kernel memory space for this buffer, and it may have to scale larger to get decent performance on larger memory. Is that conclusion correct? 
I was only successful with the new makedumpfile with cyclic-buffer set
to 128 MB when I set crashkernel=384 MB, but ran out of memory trying to
start the dump (the out-of-memory killer killed makedumpfile) when
crashkernel=256 MB on this system.

Will we be able to dump larger memory systems, up to 12 TB for instance,
with any kind of reasonable performance, with a crashkernel size limited
to 384 MB, as I understand all current upstream kernels are now?

If the ratio of memory size to total bitmap space is assumed linear,
this would predict that a 12 TB system would take about 6 cycles to
dump, and larger memory will need even more cycles. I can see where
performance improvements in getting through each cycle will make this
better, so more cycles will not mean that much increase in dump time
over the copy time, but I am concerned about whether the crashkernel
size can stay at 384 MB and still accommodate a large enough
cyclic-buffer size to maintain a reasonable dump time on future large
memory systems.

What other things on a large system will affect the usable crashkernel
size, and make it insufficient to support a 128 MB cyclic-buffer size?

Or will the cycle performance fixes proposed for future makedumpfile
versions improve things enough that the performance penalty of a large
number of cycles will be small enough not to matter?

Thanks,
Lisa Mitchell
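The cycle arithmetic behind these estimates can be sanity-checked with the formula quoted earlier in the thread, N = memory size / (page size * 8 bits per byte * BUFSIZE_CYCLIC). The sketch below is hypothetical illustration code, not makedumpfile source; it assumes 4 KiB pages and ignores the detail, noted by Atsushi, that keeping two bitmaps roughly doubles the memory actually required:

```python
import math

PAGE_SIZE = 4 * 1024   # assumed 4 KiB page frames (x86_64)
BITS_PER_BYTE = 8
MiB = 1 << 20
TiB = 1 << 40

def cycles(mem_bytes, cyclic_buffer_bytes):
    # Each bitmap byte covers 8 page frames, so one cycle covers
    # cyclic_buffer_bytes * 8 * PAGE_SIZE bytes of physical memory.
    covered = cyclic_buffer_bytes * BITS_PER_BYTE * PAGE_SIZE
    return math.ceil(mem_bytes / covered)

def buffer_for_one_cycle(mem_bytes):
    # Bitmap bytes needed to cover all of memory in a single pass.
    return mem_bytes // (PAGE_SIZE * BITS_PER_BYTE)

print(cycles(1 * TiB, 1 * MiB))               # 32  (the "N is 32" figure)
print(cycles(2 * TiB, 1 * MiB))               # 64  (HATAYAMA-san's example)
print(buffer_for_one_cycle(4 * TiB) // MiB)   # 128 -> "--cyclic-buffer 131072"
print(buffer_for_one_cycle(12 * TiB) // MiB)  # 384
```

On these numbers, a single-cycle dump of a 12 TB machine would need a bitmap as large as the entire 384 MB crashkernel reservation, before even counting the second bitmap, which is the crux of the concern raised above.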
* Re: makedumpfile 1.5.0 takes much more time to dump
  2012-10-25 11:09                     ` Lisa Mitchell
@ 2012-11-06  3:37                       ` Atsushi Kumagai
  0 siblings, 0 replies; 13+ messages in thread
From: Atsushi Kumagai @ 2012-11-06 3:37 UTC (permalink / raw)
  To: lisa.mitchell; +Cc: kexec, jerry.hoemann

Hello Lisa,

On Thu, 25 Oct 2012 05:09:44 -0600
Lisa Mitchell <lisa.mitchell@hp.com> wrote:

> Thanks, Atsushi!
>
> I tried the dump on the 4 TB system with --cyclic-buffer 131072, and the
> dump completed overnight, and I collected a complete vmcore for dump
> level 31. From the console log, it looks like the system "cycled" twice
> with this setting, two passes of excluding and copying, before the dump
> was completed. I am in the process of making a more precise timing
> measurement of the dump time today. It looks like each cycle takes about
> an hour for this system, with the majority of this time spent in the
> "Excluding unnecessary pages" phase of each cycle.

Sorry for my lack of description: the excluding phase runs twice per
cycle. So in your case, the number of cycles is 1.

> However, if I understand what you are doing with the cyclic-buffer
> parameter, it seems we are taking up 128 MB of the crash kernel memory
> space for this buffer, and it may have to scale larger to get decent
> performance on larger memory.
>
> Is that conclusion correct?

Yes, but increasing cyclic-buffer is just a workaround for v1.5.0.
(Additionally, the enhancement in v1.5.1 is just automation of this.)
Ideally, the dump time should be constant regardless of the buffer size,
because the purpose of the cyclic process is to work in constant memory;
increasing cyclic-buffer is putting the cart before the horse.

> I was only successful with the new makedumpfile with cyclic-buffer set
> to 128 MB when I set crashkernel=384 MB, but ran out of memory trying to
> start the dump (the out-of-memory killer killed makedumpfile) when
> crashkernel=256 MB on this system.
>
> Will we be able to dump larger memory systems, up to 12 TB for instance,
> with any kind of reasonable performance, with a crashkernel size limited
> to 384 MB, as I understand all current upstream kernels are now?

I think v1.5.2 can do it, because most of the overhead of the cyclic
process will be removed by the optimization of the free-page exclusion
logic in v1.5.2. I expect v1.5.2 to work in constant time regardless of
the number of cycles.

> If the ratio of memory size to total bitmap space is assumed linear,
> this would predict that a 12 TB system would take about 6 cycles to
> dump, and larger memory will need even more cycles. I can see where
> performance improvements in getting through each cycle will make this
> better, so more cycles will not mean that much increase in dump time
> over the copy time, but I am concerned about whether the crashkernel
> size can stay at 384 MB and still accommodate a large enough
> cyclic-buffer size to maintain a reasonable dump time on future large
> memory systems.
>
> What other things on a large system will affect the usable crashkernel
> size, and make it insufficient to support a 128 MB cyclic-buffer size?
>
> Or will the cycle performance fixes proposed for future makedumpfile
> versions improve things enough that the performance penalty of a large
> number of cycles will be small enough not to matter?

I hope so, but some overhead of the cyclic process may be unavoidable,
and I can't yet estimate how much time it will take. So we need to see
measurements of v1.5.2.

Thanks
Atsushi Kumagai
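The redundancy under discussion, where the v1.5.0 free-page pass walks the whole free list once per cycle, can be illustrated with a toy cost model. This is hypothetical illustration code, not makedumpfile's actual logic:

```python
def cyclic_dump_cost(total_pages, pages_per_cycle, free_list_len):
    """Toy model: count free-list entries visited over a whole dump."""
    num_cycles = 0
    free_visits = 0
    for start in range(0, total_pages, pages_per_cycle):
        num_cycles += 1
        # Step 1 (analysis): the in-memory bitmap covers only the slice
        # [start, start + pages_per_cycle), but the v1.5.0 free-page walk
        # traverses the ENTIRE free list in every cycle.
        free_visits += free_list_len
        # Step 2 (writing) is proportional to the slice size; not modeled.
    return num_cycles, free_visits

# 1 TiB of 4 KiB page frames with the default 1 MiB bitmap buffer:
# one cycle covers 1 MiB * 8 bits = 8 Mi page frames.
total = (1 << 40) // (4 << 10)   # 268435456 page frames
per_cycle = (1 << 20) * 8        # 8388608 frames per cycle
n, visits = cyclic_dump_cost(total, per_cycle, free_list_len=1000)
print(n)       # 32: matches the "N is 32 in 1TB machines" figure
print(visits)  # 32000: free-list work multiplied by the cycle count
```

Making step 1 scan only the free pages that fall inside the current slice, as planned for v1.5.2, would bring the free-list work back down to roughly one traversal regardless of the cycle count.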
Thread overview: 13+ messages
2012-09-20 20:06 makedumpfile 1.5.0 takes much more time to dump Vivek Goyal
2012-09-21 0:23 ` HATAYAMA Daisuke
2012-09-21 0:43 ` HATAYAMA Daisuke
2012-09-21 13:32 ` Vivek Goyal
2012-09-24 0:51 ` HATAYAMA Daisuke
2012-09-24 14:51 ` Vivek Goyal
2012-10-03 7:38 ` Atsushi Kumagai
2012-10-03 12:48 ` Vivek Goyal
2012-10-04 1:36 ` HATAYAMA Daisuke
2012-10-04 1:15 ` HATAYAMA Daisuke
[not found] <1350912018.13097.54.camel@lisamlinux.fc.hp.com>
2012-10-24 7:45 ` Atsushi Kumagai
2012-10-25 11:09 ` Lisa Mitchell
2012-11-06 3:37 ` Atsushi Kumagai