cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats
       [not found] <1202489161.4769650.1429721928815.JavaMail.zimbra@redhat.com>
@ 2015-04-22 17:00 ` Bob Peterson
  2015-04-23  9:06   ` Steven Whitehouse
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Peterson @ 2015-04-22 17:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

This patch changes function gfs2_rgrp_congested so that it only factors
in non-zero values into its average round trip time. If the round-trip
time is zero for a particular cpu, that cpu has obviously never dealt
with bouncing the resource group in question, so factoring in a zero
value will only skew the numbers. It also fixes a compile error on
some arches related to division.

Regards,

Bob Peterson
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index f39eedc..cb27065 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1854,15 +1854,19 @@ static bool gfs2_rgrp_congested(const struct gfs2_rgrpd *rgd, int loops)
 	s64 srttb_diff;
 	s64 sqr_diff;
 	s64 var;
-	int cpu;
+	int cpu, nonzero = 0;
 
 	preempt_disable();
 	for_each_present_cpu(cpu) {
 		st = &per_cpu_ptr(sdp->sd_lkstats, cpu)->lkstats[LM_TYPE_RGRP];
-		a_srttb += st->stats[GFS2_LKS_SRTTB];
+		if (st->stats[GFS2_LKS_SRTTB]) {
+			a_srttb += st->stats[GFS2_LKS_SRTTB];
+			nonzero++;
+		}
 	}
 	st = &this_cpu_ptr(sdp->sd_lkstats)->lkstats[LM_TYPE_RGRP];
-	a_srttb /= num_present_cpus();
+	if (nonzero)
+		do_div(a_srttb, nonzero);
 	r_dcount = st->stats[GFS2_LKS_DCOUNT];
 	var = st->stats[GFS2_LKS_SRTTVARB] +
 	      gl->gl_stats.stats[GFS2_LKS_SRTTVARB];



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats
  2015-04-22 17:00 ` [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats Bob Peterson
@ 2015-04-23  9:06   ` Steven Whitehouse
  2015-04-23 16:45     ` Bob Peterson
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Whitehouse @ 2015-04-23  9:06 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

That looks better. Do you get better results with it compared with the 
previous version?

Steve.

On 22/04/15 18:00, Bob Peterson wrote:
> Hi,
>
> This patch changes function gfs2_rgrp_congested so that it only factors
> in non-zero values into its average round trip time. If the round-trip
> time is zero for a particular cpu, that cpu has obviously never dealt
> with bouncing the resource group in question, so factoring in a zero
> value will only skew the numbers. It also fixes a compile error on
> some arches related to division.
>
> Regards,
>
> Bob Peterson
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
> index f39eedc..cb27065 100644
> --- a/fs/gfs2/rgrp.c
> +++ b/fs/gfs2/rgrp.c
> @@ -1854,15 +1854,19 @@ static bool gfs2_rgrp_congested(const struct gfs2_rgrpd *rgd, int loops)
>   	s64 srttb_diff;
>   	s64 sqr_diff;
>   	s64 var;
> -	int cpu;
> +	int cpu, nonzero = 0;
>   
>   	preempt_disable();
>   	for_each_present_cpu(cpu) {
>   		st = &per_cpu_ptr(sdp->sd_lkstats, cpu)->lkstats[LM_TYPE_RGRP];
> -		a_srttb += st->stats[GFS2_LKS_SRTTB];
> +		if (st->stats[GFS2_LKS_SRTTB]) {
> +			a_srttb += st->stats[GFS2_LKS_SRTTB];
> +			nonzero++;
> +		}
>   	}
>   	st = &this_cpu_ptr(sdp->sd_lkstats)->lkstats[LM_TYPE_RGRP];
> -	a_srttb /= num_present_cpus();
> +	if (nonzero)
> +		do_div(a_srttb, nonzero);
>   	r_dcount = st->stats[GFS2_LKS_DCOUNT];
>   	var = st->stats[GFS2_LKS_SRTTVARB] +
>   	      gl->gl_stats.stats[GFS2_LKS_SRTTVARB];
>



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats
  2015-04-23  9:06   ` Steven Whitehouse
@ 2015-04-23 16:45     ` Bob Peterson
  2015-04-23 17:59       ` Steven Whitehouse
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Peterson @ 2015-04-23 16:45 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> Hi,
> 
> That looks better. Do you get better results with it compared with the
> previous version?
> 
> Steve.
> 
> On 22/04/15 18:00, Bob Peterson wrote:
> > Hi,
> >
> > This patch changes function gfs2_rgrp_congested so that it only factors
> > in non-zero values into its average round trip time. If the round-trip
> > time is zero for a particular cpu, that cpu has obviously never dealt
> > with bouncing the resource group in question, so factoring in a zero
> > value will only skew the numbers. It also fixes a compile error on
> > some arches related to division.
> >
> > Regards,
> >
> > Bob Peterson
> > Signed-off-by: Bob Peterson <rpeterso@redhat.com>

It's not straightforward because the "preferred rgrps" patch skews the
performance results greatly. However, based on my performance tests, yes,
the numbers do look better. I temporarily commented out that "preferred
rgrps" patch to get a clearer picture. Here are numbers from my testing:
This test consists of 5 nodes all simultaneously doing this 'dd' command:
dd if=/dev/zero of=/mnt/gfs2/`hostname` bs=1M count=100
(after drop_caches)

Stock (nothing disabled):
            Run 1
         --------
pats     716 MB/s
jets     615 MB/s
bills    669 MB/s
dolphins 605 MB/s
ravens   735 MB/s
         --------
Average: 668 MB/s

Stock (plus preferred rgrps disabled): AKA 'c3'
            Run 1     Run 2
         --------  --------
pats     766 MB/s  675 MB/s  
jets     649 MB/s  716 MB/s  
bills    790 MB/s  735 MB/s  
dolphins 761 MB/s  727 MB/s  
ravens   712 MB/s  728 MB/s  
         --------  --------
Average: 736 MB/s  716 MB/s

Without latest patch (plus preferred rgrps disabled): AKA 'c2'
(In other words, this is the previous patch which was called
"GFS2: Use average srttb value in congestion calculations")
            Run 1     Run 2     Run 3 
         --------  --------  ---------
pats     830 MB/s  688 MB/s  697 MB/s
jets     833 MB/s  622 MB/s  645 MB/s
bills    831 MB/s  796 MB/s  637 MB/s
dolphins 834 MB/s  597 MB/s  690 MB/s
ravens   815 MB/s  731 MB/s  734 MB/s
         --------  --------  ---------
Average: 829 MB/s  687 MB/s  681 MB/s

Latest patch (plus preferred rgrps disabled):
            Run 1     Run 2     Run 3 
         --------  --------  ---------
pats     811 MB/s  829 MB/s  652 MB/s
jets     825 MB/s  863 MB/s  702 MB/s
bills    846 MB/s  825 MB/s  710 MB/s
dolphins 845 MB/s  845 MB/s  683 MB/s
ravens   820 MB/s  818 MB/s  682 MB/s
         --------  --------  ---------
Average: 829 MB/s  836 MB/s  686 MB/s

Latest patch (nothing disabled):
            Run 1     Run 2     Run 3 
         --------  --------  ---------
pats     834 MB/s  817 MB/s  819 MB/s
jets     837 MB/s  835 MB/s  836 MB/s
bills    841 MB/s  837 MB/s  834 MB/s
dolphins 838 MB/s  851 MB/s  842 MB/s
ravens   795 MB/s  808 MB/s  815 MB/s
         --------  --------  ---------
Average: 829 MB/s  830 MB/s  829 MB/s

This test (simultaneous dd) is known to be a worst case scenario,
so I expect it to show the most improvement. For ordinary block
allocations, I don't expect that big of an improvement.

Regards,

Bob Peterson
Red Hat File Systems



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats
  2015-04-23 16:45     ` Bob Peterson
@ 2015-04-23 17:59       ` Steven Whitehouse
  0 siblings, 0 replies; 4+ messages in thread
From: Steven Whitehouse @ 2015-04-23 17:59 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 23/04/15 17:45, Bob Peterson wrote:
> ----- Original Message -----
>> Hi,
>>
>> That looks better. Do you get better results with it compared with the
>> previous version?
>>
>> Steve.
>>
>> On 22/04/15 18:00, Bob Peterson wrote:
>>> Hi,
>>>
>>> This patch changes function gfs2_rgrp_congested so that it only factors
>>> in non-zero values into its average round trip time. If the round-trip
>>> time is zero for a particular cpu, that cpu has obviously never dealt
>>> with bouncing the resource group in question, so factoring in a zero
>>> value will only skew the numbers. It also fixes a compile error on
>>> some arches related to division.
>>>
>>> Regards,
>>>
>>> Bob Peterson
>>> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> It's not straightforward because the "preferred rgrps" patch skews the
> performance results greatly. However, based on my performance tests, yes,
> the numbers do look better. I temporarily commented out that "preferred
> rgrps" patch to get a clearer picture. Here are numbers from my testing:
> This test consists of 5 nodes all simultaneously doing this 'dd' command:
> dd if=/dev/zero of=/mnt/gfs2/`hostname` bs=1M count=100
> (after drop_caches)
>
> Stock (nothing disabled):
>              Run 1
>           --------
> pats     716 MB/s
> jets     615 MB/s
> bills    669 MB/s
> dolphins 605 MB/s
> ravens   735 MB/s
>           --------
> Average: 668 MB/s
>
> Stock (plus preferred rgrps disabled): AKA 'c3'
>              Run 1     Run 2
>           --------  --------
> pats     766 MB/s  675 MB/s
> jets     649 MB/s  716 MB/s
> bills    790 MB/s  735 MB/s
> dolphins 761 MB/s  727 MB/s
> ravens   712 MB/s  728 MB/s
>           --------  --------
> Average: 736 MB/s  716 MB/s
>
> Without latest patch (plus preferred rgrps disabled): AKA 'c2'
> (In other words, this is the previous patch which was called
> "GFS2: Use average srttb value in congestion calculations")
>              Run 1     Run 2     Run 3
>           --------  --------  ---------
> pats     830 MB/s  688 MB/s  697 MB/s
> jets     833 MB/s  622 MB/s  645 MB/s
> bills    831 MB/s  796 MB/s  637 MB/s
> dolphins 834 MB/s  597 MB/s  690 MB/s
> ravens   815 MB/s  731 MB/s  734 MB/s
>           --------  --------  ---------
> Average: 829 MB/s  687 MB/s  681 MB/s
>
> Latest patch (plus preferred rgrps disabled):
>              Run 1     Run 2     Run 3
>           --------  --------  ---------
> pats     811 MB/s  829 MB/s  652 MB/s
> jets     825 MB/s  863 MB/s  702 MB/s
> bills    846 MB/s  825 MB/s  710 MB/s
> dolphins 845 MB/s  845 MB/s  683 MB/s
> ravens   820 MB/s  818 MB/s  682 MB/s
>           --------  --------  ---------
> Average: 829 MB/s  836 MB/s  686 MB/s
>
> Latest patch (nothing disabled):
>              Run 1     Run 2     Run 3
>           --------  --------  ---------
> pats     834 MB/s  817 MB/s  819 MB/s
> jets     837 MB/s  835 MB/s  836 MB/s
> bills    841 MB/s  837 MB/s  834 MB/s
> dolphins 838 MB/s  851 MB/s  842 MB/s
> ravens   795 MB/s  808 MB/s  815 MB/s
>           --------  --------  ---------
> Average: 829 MB/s  830 MB/s  829 MB/s
>
> This test (simultaneous dd) is known to be a worst case scenario,
> so I expect it to show the most improvement. For ordinary block
> allocations, I don't expect that big of an improvement.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems

That looks very encouraging I think, definitely a good step forward.
Acked-by: Steven Whitehouse <swhiteho@redhat.com>

Steve.



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-04-23 17:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1202489161.4769650.1429721928815.JavaMail.zimbra@redhat.com>
2015-04-22 17:00 ` [Cluster-devel] [GFS2 PATCH] GFS2: Average in only non-zero round-trip times for congestion stats Bob Peterson
2015-04-23  9:06   ` Steven Whitehouse
2015-04-23 16:45     ` Bob Peterson
2015-04-23 17:59       ` Steven Whitehouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).