* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats [not found] <1202489161.4769650.1429721928815.JavaMail.zimbra@redhat.com> @ 2015-04-22 17:00 ` Bob Peterson 2015-04-23 9:06 ` Steven Whitehouse 0 siblings, 1 reply; 4+ messages in thread From: Bob Peterson @ 2015-04-22 17:00 UTC (permalink / raw) To: cluster-devel.redhat.com Hi, This patch changes function gfs2_rgrp_congested so that it only factors in non-zero values into its average round trip time. If the round-trip time is zero for a particular cpu, that cpu has obviously never dealt with bouncing the resource group in question, so factoring in a zero value will only skew the numbers. It also fixes a compile error on some arches related to division. Regards, Bob Peterson Signed-off-by: Bob Peterson <rpeterso@redhat.com> --- diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c index f39eedc..cb27065 100644 --- a/fs/gfs2/rgrp.c +++ b/fs/gfs2/rgrp.c @@ -1854,15 +1854,19 @@ static bool gfs2_rgrp_congested(const struct gfs2_rgrpd *rgd, int loops) s64 srttb_diff; s64 sqr_diff; s64 var; - int cpu; + int cpu, nonzero = 0; preempt_disable(); for_each_present_cpu(cpu) { st = &per_cpu_ptr(sdp->sd_lkstats, cpu)->lkstats[LM_TYPE_RGRP]; - a_srttb += st->stats[GFS2_LKS_SRTTB]; + if (st->stats[GFS2_LKS_SRTTB]) { + a_srttb += st->stats[GFS2_LKS_SRTTB]; + nonzero++; + } } st = &this_cpu_ptr(sdp->sd_lkstats)->lkstats[LM_TYPE_RGRP]; - a_srttb /= num_present_cpus(); + if (nonzero) + do_div(a_srttb, nonzero); r_dcount = st->stats[GFS2_LKS_DCOUNT]; var = st->stats[GFS2_LKS_SRTTVARB] + gl->gl_stats.stats[GFS2_LKS_SRTTVARB]; ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats 2015-04-22 17:00 ` [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats Bob Peterson @ 2015-04-23 9:06 ` Steven Whitehouse 2015-04-23 16:45 ` Bob Peterson 0 siblings, 1 reply; 4+ messages in thread From: Steven Whitehouse @ 2015-04-23 9:06 UTC (permalink / raw) To: cluster-devel.redhat.com Hi, That looks better. Do you get better results with it compared with the previous version? Steve. On 22/04/15 18:00, Bob Peterson wrote: > Hi, > > This patch changes function gfs2_rgrp_congested so that it only factors > in non-zero values into its average round trip time. If the round-trip > time is zero for a particular cpu, that cpu has obviously never dealt > with bouncing the resource group in question, so factoring in a zero > value will only skew the numbers. It also fixes a compile error on > some arches related to division. > > Regards, > > Bob Peterson > Signed-off-by: Bob Peterson <rpeterso@redhat.com> > --- > diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c > index f39eedc..cb27065 100644 > --- a/fs/gfs2/rgrp.c > +++ b/fs/gfs2/rgrp.c > @@ -1854,15 +1854,19 @@ static bool gfs2_rgrp_congested(const struct gfs2_rgrpd *rgd, int loops) > s64 srttb_diff; > s64 sqr_diff; > s64 var; > - int cpu; > + int cpu, nonzero = 0; > > preempt_disable(); > for_each_present_cpu(cpu) { > st = &per_cpu_ptr(sdp->sd_lkstats, cpu)->lkstats[LM_TYPE_RGRP]; > - a_srttb += st->stats[GFS2_LKS_SRTTB]; > + if (st->stats[GFS2_LKS_SRTTB]) { > + a_srttb += st->stats[GFS2_LKS_SRTTB]; > + nonzero++; > + } > } > st = &this_cpu_ptr(sdp->sd_lkstats)->lkstats[LM_TYPE_RGRP]; > - a_srttb /= num_present_cpus(); > + if (nonzero) > + do_div(a_srttb, nonzero); > r_dcount = st->stats[GFS2_LKS_DCOUNT]; > var = st->stats[GFS2_LKS_SRTTVARB] + > gl->gl_stats.stats[GFS2_LKS_SRTTVARB]; > ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats 2015-04-23 9:06 ` Steven Whitehouse @ 2015-04-23 16:45 ` Bob Peterson 2015-04-23 17:59 ` Steven Whitehouse 0 siblings, 1 reply; 4+ messages in thread From: Bob Peterson @ 2015-04-23 16:45 UTC (permalink / raw) To: cluster-devel.redhat.com ----- Original Message ----- > Hi, > > That looks better. Do you get better results with it compared with the > previous version? > > Steve. > > On 22/04/15 18:00, Bob Peterson wrote: > > Hi, > > > > This patch changes function gfs2_rgrp_congested so that it only factors > > in non-zero values into its average round trip time. If the round-trip > > time is zero for a particular cpu, that cpu has obviously never dealt > > with bouncing the resource group in question, so factoring in a zero > > value will only skew the numbers. It also fixes a compile error on > > some arches related to division. > > > > Regards, > > > > Bob Peterson > > Signed-off-by: Bob Peterson <rpeterso@redhat.com> It's not straightforward because the "preferred rgrps" patch skews the performance results greatly. However, based on my performance tests, yes, the numbers do look better. I temporarily commented out that "preferred rgrps" patch to get a clearer picture. Here are numbers from my testing: This test consists of 5 nodes all simultaneously doing this 'dd' command: dd if=/dev/zero of=/mnt/gfs2/`hostname` bs=1M count=100 (after drop_caches) Stock (nothing disabled): Run 1 -------- pats 716 MB/s jets 615 MB/s bills 669 MB/s dolphins 605 MB/s ravens 735 MB/s -------- Average: 668 MB/s Stock (plus preferred rgrps disabled): AKA 'c3' Run 1 Run 2 -------- -------- pats 766 MB/s 675 MB/s jets 649 MB/s 716 MB/s bills 790 MB/s 735 MB/s dolphins 761 MB/s 727 MB/s ravens 712 MB/s 728 MB/s -------- -------- Average: 736 MB/s 716 MB/s Without latest patch (plus preferred rgrps disabled): AKA 'c2' (In other words, this is the previous patch which was called "GFS2: Use average srttb value in congestion calculations") Run 1 Run 2 Run 3 -------- -------- --------- pats 830 MB/s 688 MB/s 697 MB/s jets 833 MB/s 622 MB/s 645 MB/s bills 831 MB/s 796 MB/s 637 MB/s dolphins 834 MB/s 597 MB/s 690 MB/s ravens 815 MB/s 731 MB/s 734 MB/s -------- -------- --------- Average: 829 MB/s 687 MB/s 681 MB/s Latest patch (plus preferred rgrps disabled): Run 1 Run 2 Run 3 -------- -------- --------- pats 811 MB/s 829 MB/s 652 MB/s jets 825 MB/s 863 MB/s 702 MB/s bills 846 MB/s 825 MB/s 710 MB/s dolphins 845 MB/s 845 MB/s 683 MB/s ravens 820 MB/s 818 MB/s 682 MB/s -------- -------- --------- Average: 829 MB/s 836 MB/s 686 MB/s Latest patch (nothing disabled): Run 1 Run 2 Run 3 -------- -------- --------- pats 834 MB/s 817 MB/s 819 MB/s jets 837 MB/s 835 MB/s 836 MB/s bills 841 MB/s 837 MB/s 834 MB/s dolphins 838 MB/s 851 MB/s 842 MB/s ravens 795 MB/s 808 MB/s 815 MB/s -------- -------- --------- Average: 829 MB/s 830 MB/s 829 MB/s This test (simultaneous dd) is known to be a worst case scenario, so I expect it to show the most improvement. For ordinary block allocations, I don't expect that big of an improvement. Regards, Bob Peterson Red Hat File Systems ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats 2015-04-23 16:45 ` Bob Peterson @ 2015-04-23 17:59 ` Steven Whitehouse 0 siblings, 0 replies; 4+ messages in thread From: Steven Whitehouse @ 2015-04-23 17:59 UTC (permalink / raw) To: cluster-devel.redhat.com Hi, On 23/04/15 17:45, Bob Peterson wrote: > ----- Original Message ----- >> Hi, >> >> That looks better. Do you get better results with it compared with the >> previous version? >> >> Steve. >> >> On 22/04/15 18:00, Bob Peterson wrote: >>> Hi, >>> >>> This patch changes function gfs2_rgrp_congested so that it only factors >>> in non-zero values into its average round trip time. If the round-trip >>> time is zero for a particular cpu, that cpu has obviously never dealt >>> with bouncing the resource group in question, so factoring in a zero >>> value will only skew the numbers. It also fixes a compile error on >>> some arches related to division. >>> >>> Regards, >>> >>> Bob Peterson >>> Signed-off-by: Bob Peterson <rpeterso@redhat.com> > It's not straightforward because the "preferred rgrps" patch skews the > performance results greatly. However, based on my performance tests, yes, > the numbers do look better. I temporarily commented out that "preferred > rgrps" patch to get a clearer picture. Here are numbers from my testing: > This test consists of 5 nodes all simultaneously doing this 'dd' command: > dd if=/dev/zero of=/mnt/gfs2/`hostname` bs=1M count=100 > (after drop_caches) > > Stock (nothing disabled): > Run 1 > -------- > pats 716 MB/s > jets 615 MB/s > bills 669 MB/s > dolphins 605 MB/s > ravens 735 MB/s > -------- > Average: 668 MB/s > > Stock (plus preferred rgrps disabled): AKA 'c3' > Run 1 Run 2 > -------- -------- > pats 766 MB/s 675 MB/s > jets 649 MB/s 716 MB/s > bills 790 MB/s 735 MB/s > dolphins 761 MB/s 727 MB/s > ravens 712 MB/s 728 MB/s > -------- -------- > Average: 736 MB/s 716 MB/s > > Without latest patch (plus preferred rgrps disabled): AKA 'c2' > (In other words, this is the previous patch which was called > "GFS2: Use average srttb value in congestion calculations") > Run 1 Run 2 Run 3 > -------- -------- --------- > pats 830 MB/s 688 MB/s 697 MB/s > jets 833 MB/s 622 MB/s 645 MB/s > bills 831 MB/s 796 MB/s 637 MB/s > dolphins 834 MB/s 597 MB/s 690 MB/s > ravens 815 MB/s 731 MB/s 734 MB/s > -------- -------- --------- > Average: 829 MB/s 687 MB/s 681 MB/s > > Latest patch (plus preferred rgrps disabled): > Run 1 Run 2 Run 3 > -------- -------- --------- > pats 811 MB/s 829 MB/s 652 MB/s > jets 825 MB/s 863 MB/s 702 MB/s > bills 846 MB/s 825 MB/s 710 MB/s > dolphins 845 MB/s 845 MB/s 683 MB/s > ravens 820 MB/s 818 MB/s 682 MB/s > -------- -------- --------- > Average: 829 MB/s 836 MB/s 686 MB/s > > Latest patch (nothing disabled): > Run 1 Run 2 Run 3 > -------- -------- --------- > pats 834 MB/s 817 MB/s 819 MB/s > jets 837 MB/s 835 MB/s 836 MB/s > bills 841 MB/s 837 MB/s 834 MB/s > dolphins 838 MB/s 851 MB/s 842 MB/s > ravens 795 MB/s 808 MB/s 815 MB/s > -------- -------- --------- > Average: 829 MB/s 830 MB/s 829 MB/s > > This test (simultaneous dd) is known to be a worst case scenario, > so I expect it to show the most improvement. For ordinary block > allocations, I don't expect that big of an improvement. > > Regards, > > Bob Peterson > Red Hat File Systems That looks very encouraging I think, definitely a good step forward. Acked-by: Steven Whitehouse <swhiteho@redhat.com> Steve. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-04-23 17:59 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <1202489161.4769650.1429721928815.JavaMail.zimbra@redhat.com> 2015-04-22 17:00 ` [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats Bob Peterson 2015-04-23 9:06 ` Steven Whitehouse 2015-04-23 16:45 ` Bob Peterson 2015-04-23 17:59 ` Steven Whitehouse
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).