* Ext4 performance regression: Post 2.6.30
From: Keith Mannthey @ 2010-03-29  6:25 UTC
  To: linux-ext4

After 2.6.30 I am seeing large performance regressions on a RAID setup.
I am working to publish a larger amount of data, but I wanted to get some
quick data out about what I am seeing.

The test I am running (from the FFSB test suite) is basically random
direct I/O writes. The data below is from 128 threads all doing these
random writes. The 1- and 32-thread results are not as drastically bad,
but 2.6.30 still has the strongest results.

Under a mailserver workload I see similar performance impacts at this
same kernel change point. I hope to publish better data soon. Several
other workload types do not show this performance regression.

2.6.30:

Total Results
===============
 Op Name   Transactions   Trans/sec    % Trans   % Op Weight   Throughput
 =======   ============   =========   ========   ===========   ==========
 write :        9015040    29561.46   100.000%      100.000%    115MB/sec
-
29561.46 Transactions per Second

Any kernel past 2.6.30 shows the regression. This is from 2.6.31-rc1:

Total Results
===============
 Op Name   Transactions   Trans/sec    % Trans   % Op Weight   Throughput
 =======   ============   =========   ========   ===========   ==========
 write :        3185920    10120.50   100.000%      100.000%   39.5MB/sec
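To put the two FFSB runs above side by side, the Trans/sec column can be pulled out of each "write" row and compared. The following is only a minimal sketch: the helper names `trans_per_sec` and `percent_drop` are made up for illustration, and the regular expression assumes the row layout shown in this message rather than any documented FFSB output format.

```python
import re

# Matches an FFSB per-operation summary row such as:
#   write : 9015040 29561.46 100.000% 100.000% 115MB/sec
# (layout assumed from the results quoted in this thread)
ROW = re.compile(
    r"^\s*(?P<op>\w+)\s*:\s*(?P<transactions>\d+)\s+(?P<tps>\d+(?:\.\d+)?)\s"
)


def trans_per_sec(row: str) -> float:
    """Return the Trans/sec column from one FFSB result row."""
    match = ROW.match(row)
    if match is None:
        raise ValueError(f"not an FFSB result row: {row!r}")
    return float(match.group("tps"))


def percent_drop(baseline: float, current: float) -> float:
    """Throughput lost relative to the baseline, as a percentage."""
    return (baseline - current) / baseline * 100.0


# The two rows reported for 2.6.30 and 2.6.31-rc1.
good = trans_per_sec("write : 9015040 29561.46 100.000% 100.000% 115MB/sec")
bad = trans_per_sec("write : 3185920 10120.50 100.000% 100.000% 39.5MB/sec")
print(f"{good:.2f} -> {bad:.2f} trans/sec ({percent_drop(good, bad):.1f}% drop)")
```

With the figures reported here, the 128-thread run loses roughly two thirds of its throughput between 2.6.30 and 2.6.31-rc1.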
* Re: Ext4 performance regression: Post 2.6.30
From: Greg Freemyer @ 2010-03-29 15:10 UTC
  To: Keith Mannthey; +Cc: linux-ext4

On Mon, Mar 29, 2010 at 2:25 AM, Keith Mannthey <kmannth@us.ibm.com> wrote:
> After 2.6.30 I am seeing large performance regressions on a RAID setup.
> I am working to publish a larger amount of data, but I wanted to get some
> quick data out about what I am seeing.

Is mdraid involved?

They added barrier support for some configs after 2.6.30, I believe.
It can cause a drastic perf change, but it increases reliability and
is "correct".

Greg
* Re: Ext4 performance regression: Post 2.6.30
From: Keith Mannthey @ 2010-03-31  1:56 UTC
  To: Greg Freemyer; +Cc: linux-ext4

On Mon, 2010-03-29 at 11:10 -0400, Greg Freemyer wrote:
> Is mdraid involved?
>
> They added barrier support for some configs after 2.6.30, I believe.
> It can cause a drastic perf change, but it increases reliability and
> is "correct".

LVM and device mapper are involved. The git bisect just took me to:

374bf7e7f6cc38b0483351a2029a97910eadde1b is first bad commit
commit 374bf7e7f6cc38b0483351a2029a97910eadde1b
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Mon Jun 22 10:12:22 2009 +0100

    dm: stripe support flush

    Flush support for the stripe target.

    This sets ti->num_flush_requests to the number of stripes and
    remaps individual flush requests to the appropriate stripe devices.

    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Alasdair G Kergon <agk@redhat.com>

:040000 040000 542f4b9b442d1371c6534f333b7e00714ef98609 d490479b660139fc1b6b0ecd17bb58c9e00e597e M drivers

This may be correct behavior, but the performance penalty in this test
case is pretty high.

I am going to move back to current kernels and start looking into
ext4/dm flushing.

Thanks,
Keith Mannthey
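The message above does not say how the bisect was driven. One way to automate a performance bisect is `git bisect run`, which reads the exit code of a test command: 0 marks the commit good, 125 skips it, and any other code from 1 to 127 marks it bad. The sketch below only shows the verdict step. The function name `bisect_verdict` is hypothetical, and using the midpoint between the two reported Trans/sec figures as the cut-off is an assumption, not something taken from the thread.

```python
# Exit codes understood by `git bisect run`:
#   0 marks the commit good, 125 asks git to skip it,
#   any other code from 1 to 127 marks it bad.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_verdict(trans_per_sec, good_baseline=29561.46, bad_baseline=10120.50):
    """Map one benchmark result to a `git bisect run` exit code.

    trans_per_sec is None when the kernel could not be built or booted,
    so no measurement exists. The cut-off is the midpoint between the two
    baselines reported in this thread (an assumed, illustrative choice).
    """
    if trans_per_sec is None:
        return SKIP
    threshold = (good_baseline + bad_baseline) / 2.0
    return GOOD if trans_per_sec >= threshold else BAD


# Example: a run close to the regressed figure is classified as bad.
print(bisect_verdict(10500.0))
```

A wrapper script would build and boot each candidate kernel, run the benchmark, and exit with the returned code. That part depends entirely on the test rig and is left out here.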
* Re: Ext4 performance regression: Post 2.6.30
From: Eric Sandeen @ 2010-03-31  4:06 UTC
  To: Keith Mannthey; +Cc: Greg Freemyer, linux-ext4

Keith Mannthey wrote:
> This may be correct behavior, but the performance penalty in this test
> case is pretty high.
>
> I am going to move back to current kernels and start looking into
> ext4/dm flushing.

It would probably be interesting to do a mount -o nobarrier to see if
that makes the regression go away.

-Eric
* Re: Ext4 performance regression: Post 2.6.30
From: Keith Mannthey @ 2010-03-31 22:02 UTC
  To: Eric Sandeen; +Cc: Greg Freemyer, linux-ext4

On Tue, 2010-03-30 at 23:06 -0500, Eric Sandeen wrote:
> It would probably be interesting to do a mount -o nobarrier to see if
> that makes the regression go away.

-o nobarrier takes the regression away with 2.6.34-rc3:

Default mount: ~12500

-o nobarrier:  ~27500

Barriers on this setup cost a lot during writes.

Interestingly, the "mailserver" workload regression is also removed by
mounting with "-o nobarrier".

I am going to see what impact is seen on a single-disk setup.

Thanks,
Keith Mannthey
LTC FS-Dev
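When comparing runs like the two above, it helps to confirm which barrier setting a mount actually has. The sketch below inspects a comma-separated option string, such as the options field of a /proc/mounts line. The function name `barriers_enabled` is made up for illustration. It recognises the ext4 spellings `barrier`, `barrier=1`, `barrier=0` and `nobarrier`, and treats a missing option as barriers on, matching the default mount discussed here. How a given kernel version prints the option in /proc/mounts is not checked and should be verified on the system under test.

```python
def barriers_enabled(mount_options: str) -> bool:
    """Return False if the option string turns write barriers off.

    Accepts a comma-separated option string, e.g. the options field of a
    /proc/mounts line. With no barrier option present, barriers are
    assumed to be on (the default mount in this thread).
    """
    enabled = True
    for option in mount_options.split(","):
        option = option.strip()
        if option in ("nobarrier", "barrier=0"):
            enabled = False
        elif option in ("barrier", "barrier=1"):
            enabled = True
    return enabled


# Example option strings (hand-written, not copied from a real system).
print(barriers_enabled("rw,relatime"))
print(barriers_enabled("rw,relatime,nobarrier"))
```

If an option string names the setting more than once, the last occurrence wins in this sketch.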
* Re: Ext4 performance regression: Post 2.6.30
From: Greg Freemyer @ 2010-03-31 22:06 UTC
  To: Keith Mannthey; +Cc: Eric Sandeen, linux-ext4

On Wed, Mar 31, 2010 at 6:02 PM, Keith Mannthey <kmannth@us.ibm.com> wrote:
> -o nobarrier takes the regression away with 2.6.34-rc3:
>
> Default mount: ~12500
>
> -o nobarrier:  ~27500
>
> Barriers on this setup cost a lot during writes.

I'm curious if you're using an internal or external journal?

I'd guess the cost of barriers is much greater with an internal
journal, but I don't recall seeing any benchmarks one way or the
other.

Greg
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Ext4 performance regression: Post 2.6.30
From: Keith Mannthey @ 2010-03-31 22:14 UTC
  To: Greg Freemyer; +Cc: Eric Sandeen, linux-ext4

On Wed, 2010-03-31 at 18:06 -0400, Greg Freemyer wrote:
> I'm curious if you're using an internal or external journal?

I am unsure. How do I tell? I am using defaults except for -o
nobarrier. I know jbd2 is being used.

Thanks,
Keith

> I'd guess the cost of barriers is much greater with an internal
> journal, but I don't recall seeing any benchmarks one way or the
> other.
>
> Greg
* Re: Ext4 performance regression: Post 2.6.30
From: Greg Freemyer @ 2010-03-31 22:55 UTC
  To: Keith Mannthey; +Cc: Eric Sandeen, linux-ext4

On Wed, Mar 31, 2010 at 6:14 PM, Keith Mannthey <kmannth@us.ibm.com> wrote:
> On Wed, 2010-03-31 at 18:06 -0400, Greg Freemyer wrote:
>> I'm curious if you're using an internal or external journal?
>
> I am unsure. How do I tell? I am using defaults except for -o
> nobarrier. I know jbd2 is being used.

The default is internal. An external journal requires that a separate
partition be provided to hold it.

Since journals are typically very small relative to the overall
filesystem, a small RAID 1 partition would be my production
recommendation to hold an external journal.

But for performance testing purposes, if you have a drive that is not
participating in your current RAID setup, you can simply create a small
partition on it and use it to hold the external journal.

I believe you can convert your existing filesystem to an external
journal easily, without having to recreate it.

Greg
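On the "How do I tell?" question, one possible check is to look at the superblock summary printed by `dumpe2fs -h <device>`. The sketch below classifies that text and rests on two assumptions that should be verified against the e2fsprogs version in use: that a filesystem with a journal lists `has_journal` under "Filesystem features", and that an external journal shows up as a "Journal device" or "Journal UUID" field while an internal one shows only "Journal inode". The function name `journal_location` and the sample text are hand-written for illustration and are not output captured from a real system.

```python
def journal_location(dumpe2fs_header: str) -> str:
    """Classify the journal as 'internal', 'external', or 'none'.

    Expects the text printed by `dumpe2fs -h <device>`; the field names
    checked below are assumptions about that output.
    """
    fields = {}
    for line in dumpe2fs_header.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()

    features = fields.get("filesystem features", "").split()
    if "has_journal" not in features:
        return "none"
    # An external journal is recorded by device number (and UUID);
    # an internal one lives in a reserved inode of the filesystem itself.
    if "journal device" in fields or "journal uuid" in fields:
        return "external"
    return "internal"


# Hand-written sample resembling a default (internal journal) filesystem.
sample = """\
Filesystem features:      has_journal ext_attr extent
Journal inode:            8
"""
print(journal_location(sample))
```

On a filesystem created with defaults, as described in this thread, the expectation from Greg's reply is an internal journal.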