All clients impatient

* All clients impatient
@ 2012-11-13 23:23 Scott Emery
  2012-11-14 17:42 ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Scott Emery @ 2012-11-13 23:23 UTC
  To: fio


	I'm trying to use the client/server feature in fio 2.0.10
vanilla and I'm finding that "All clients" prints too soon. 

---
[semery@xxx2-zzz2 multi2]$ cat run.all.2

export FIO="../fio-2.0.10/fio"
export SERVERS="xxx2-yyy1-ext xxx2-yyy2-ext xxx2-yyy3-ext xxx2-yyy4-ext"
#export SERVERS="xxx2-yyy1-ext"
export BASE=/pdmf/semery/bench
export SCRIPT="write_tape_2"

# parameters used in test file

export ARGS=""

for i in $SERVERS
do
# prep for run
        export HOST=$i
        export DIR=${BASE}/${HOST}
        sed "s_DIR_${DIR}"_ < $SCRIPT > $SCRIPT.$HOST
        mkdir -p $DIR
        ARGS+="--client=$HOST $SCRIPT.$HOST "
done

echo $FIO $ARGS

$FIO $ARGS > $SCRIPT.out &
[semery@xxx2-zzz2 multi2]$ rm /pdmf/semery/bench/xxx2*/*
[semery@xxx2-zzz2 multi2]$ ./run.all.2
../fio-2.0.10/fio --client=xxx2-yyy1-ext write_tape_2.xxx2-yyy1-ext --client=xxx
2-yyy2-ext write_tape_2.xxx2-yyy2-ext --client=xxx2-yyy3-ext write_tape_2.xxx2-y
yy3-ext --client=xxx2-yyy4-ext write_tape_2.xxx2-yyy4-ext
[semery@xxx2-zzz2 multi2]$ egrep 'groupid|bw=' write_tape_2.out
wt2: (groupid=1, jobs=1): err= 0: pid=1456: Tue Nov 13 15:06:11 2012
  read : io=10240MB, bw=277592KB/s, iops=135 , runt= 37774msec
wt2: (groupid=1, jobs=1): err= 0: pid=1457: Tue Nov 13 15:06:11 2012
  read : io=10240MB, bw=278144KB/s, iops=135 , runt= 37699msec
wt2: (groupid=1, jobs=1): err= 0: pid=19876: Tue Nov 13 15:06:12 2012
  read : io=10240MB, bw=276954KB/s, iops=135 , runt= 37861msec
All clients: (groupid=1, jobs=3): err= 0: pid=0: Tue Nov 13 15:06:12 2012
  read : io=32212MB, bw=850803KB/s, iops=405 , runt= 37861msec
wt2: (groupid=1, jobs=1): err= 0: pid=19877: Tue Nov 13 15:06:12 2012
  read : io=10240MB, bw=279003KB/s, iops=136 , runt= 37583msec
wt2: (groupid=1, jobs=1): err= 0: pid=6569: Tue Nov 13 15:06:18 2012
  read : io=10240MB, bw=276443KB/s, iops=134 , runt= 37931msec
wt2: (groupid=1, jobs=1): err= 0: pid=6570: Tue Nov 13 15:06:18 2012
  read : io=10240MB, bw=276530KB/s, iops=135 , runt= 37919msec
wt2: (groupid=1, jobs=1): err= 0: pid=19145: Tue Nov 13 15:06:18 2012
  read : io=10240MB, bw=276501KB/s, iops=135 , runt= 37923msec
wt2: (groupid=1, jobs=1): err= 0: pid=19146: Tue Nov 13 15:06:18 2012
  read : io=10240MB, bw=276669KB/s, iops=135 , runt= 37900msec
[semery@xxx2-zzz2 multi2]$ cat write_tape_2.xxx2-yyy1-ext

[global]
bs=2m
ioengine=libaio
fallocate=posix
rate=270m
iodepth=64
size=10g
direct=1
directory=/pdmf/semery/bench/xxx2-yyy1-ext

[wt2]
rw=read
numjobs=2
stonewall

---

	For a while it was totting things up after four clients,
which kind of made sense... now it's stuck on three.  Back when
it made more sense I thought that maybe handle_ts only dealt with
the information from a single thread for each client rather than
the aggregate of multiple thread jobs.  That part seems to hold
true even when it doesn't have the info for all the clients.

	Is this reporting the aggregate as expected?
	Is there a parameter change that I need to make to get
the output I expect?

Scott Emery
emery@sgi.com


* Re: All clients impatient
  2012-11-13 23:23 All clients impatient Scott Emery
@ 2012-11-14 17:42 ` Jens Axboe
  2012-11-14 17:54   ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2012-11-14 17:42 UTC
  To: Scott Emery; +Cc: fio

On 2012-11-13 16:23, Scott Emery wrote:
> 	I'm trying to use the client/server feature in fio 2.0.10
> vanilla and I'm finding that "All clients" prints too soon. 
> 
> 	For a while it was totting things up after four clients,
> which kind of made sense... now it's stuck on three.  Back when
> it made more sense I thought that maybe handle_ts only dealt with
> the information from a single thread for each client rather than
> the aggregate of multiple thread jobs.  That part seems to hold
> true even when it doesn't have the info for all the clients.
> 
> 	Is this reporting the aggregate as expected?
> 	Is there a parameter change that I need to make to get
> the output I expect?

As the job file is written, it will not aggregate results from a single
"instance" of the server. You would want group_reporting=1 to do that.
However, that will still give you one set of outputs per connection, not
one for all of them. Right now fio does not support collecting outputs
from all connections, a higher level group reporting if you will.

-- 
Jens Axboe



* Re: All clients impatient
  2012-11-14 17:42 ` Jens Axboe
@ 2012-11-14 17:54   ` Jens Axboe
  2012-11-14 18:10     ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2012-11-14 17:54 UTC
  To: Scott Emery; +Cc: fio

On 2012-11-14 10:42, Jens Axboe wrote:
>> 	For a while it was totting things up after four clients,
>> which kind of made sense... now it's stuck on three.  Back when
>> it made more sense I thought that maybe handle_ts only dealt with
>> the information from a single thread for each client rather than
>> the aggregate of multiple thread jobs.  That part seems to hold
>> true even when it doesn't have the info for all the clients.
>>
>> 	Is this reporting the aggregate as expected?
>> 	Is there a parameter change that I need to make to get
>> the output I expect?
> 
> As the job file is written, it will not aggregate results from a single
> "instance" of the server. You would want group_reporting=1 to do that.
> However, that will still give you one set of outputs per connection, not
> one for all of them. Right now fio does not support collecting outputs
> from all connections, a higher level group reporting if you will.

Actually, I misremembered that and didn't check before replying. It
_will_ sum all clients, if it has more than one connection. But there's
a bug where we race on client exit and dec the expected client count.
Does it work better with the below patch?


diff --git a/client.c b/client.c
index bf09d7e..f82bc30 100644
--- a/client.c
+++ b/client.c
@@ -50,6 +50,7 @@ struct fio_client {
 	int error;
 	int ipv6;
 	int sent_job;
+	int did_stat;
 
 	struct flist_head eta_list;
 	struct client_eta *eta_in_flight;
@@ -83,6 +84,7 @@ static struct thread_stat client_ts;
 static struct group_run_stats client_gs;
 static int sum_stat_clients;
 static int sum_stat_nr;
+static int do_output_all_clients;
 
 #define FIO_CLIENT_HASH_BITS	7
 #define FIO_CLIENT_HASH_SZ	(1 << FIO_CLIENT_HASH_BITS)
@@ -159,9 +161,11 @@ static void remove_client(struct fio_client *client)
 	if (client->ini_file)
 		free(client->ini_file);
 
+	if (!client->did_stat)
+		sum_stat_clients--;
+
 	free(client);
 	nr_clients--;
-	sum_stat_clients--;
 }
 
 static void put_client(struct fio_client *client)
@@ -664,7 +668,7 @@ static void convert_gs(struct group_run_stats *dst, struct group_run_stats *src)
 	dst->groupid	= le32_to_cpu(src->groupid);
 }
 
-static void handle_ts(struct fio_net_cmd *cmd)
+static void handle_ts(struct fio_client *client, struct fio_net_cmd *cmd)
 {
 	struct cmd_ts_pdu *p = (struct cmd_ts_pdu *) cmd->payload;
 
@@ -672,8 +676,9 @@ static void handle_ts(struct fio_net_cmd *cmd)
 	convert_gs(&p->rs, &p->rs);
 
 	show_thread_status(&p->ts, &p->rs);
+	client->did_stat = 1;
 
-	if (sum_stat_clients == 1)
+	if (!do_output_all_clients)
 		return;
 
 	sum_thread_stats(&client_ts, &p->ts, sum_stat_nr);
@@ -921,7 +926,7 @@ static int handle_client(struct fio_client *client)
 		free(cmd);
 		break;
 	case FIO_NET_CMD_TS:
-		handle_ts(cmd);
+		handle_ts(client, cmd);
 		free(cmd);
 		break;
 	case FIO_NET_CMD_GS:
@@ -1054,6 +1059,9 @@ int fio_handle_clients(void)
 	pfds = malloc(nr_clients * sizeof(struct pollfd));
 
 	sum_stat_clients = nr_clients;
+	if (sum_stat_clients > 1)
+		do_output_all_clients = 1;
+
 	init_thread_stat(&client_ts);
 	init_group_run_stat(&client_gs);
 

-- 
Jens Axboe



* Re: All clients impatient
  2012-11-14 17:54   ` Jens Axboe
@ 2012-11-14 18:10     ` Jens Axboe
  2012-11-14 19:09       ` emery
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2012-11-14 18:10 UTC
  To: Scott Emery; +Cc: fio

On 2012-11-14 10:54, Jens Axboe wrote:
> Actually, I misremembered that and didn't check before replying. It
> _will_ sum all clients, if it has more than one connection. But there's
> a bug where we race on client exit and dec the expected client count.
> Does it work better with the below patch?

Note that you still need group_reporting=1 in your job file; the "All
clients" report depends on getting only one set of stats from each client.
I'll look into fixing that up, too.
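
For reference, a sketch of what that change could look like for the job file from the original post. The output file name `write_tape_2.grouped` is hypothetical; everything else is copied from the posted job file, with `group_reporting=1` added to the `[wt2]` section.

```shell
# Write a copy of the posted job file with group_reporting=1 added,
# so each server reports one combined set of stats for its two jobs.
cat > write_tape_2.grouped <<'EOF'
[global]
bs=2m
ioengine=libaio
fallocate=posix
rate=270m
iodepth=64
size=10g
direct=1
directory=/pdmf/semery/bench/xxx2-yyy1-ext

[wt2]
rw=read
numjobs=2
stonewall
group_reporting=1
EOF
```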

-- 
Jens Axboe



* Re: All clients impatient
  2012-11-14 18:10     ` Jens Axboe
@ 2012-11-14 19:09       ` emery
  2012-11-14 19:29         ` emery
  0 siblings, 1 reply; 11+ messages in thread
From: emery @ 2012-11-14 19:09 UTC
  To: Jens Axboe; +Cc: Scott Emery, fio, emery

emery@sgi.com:  Jens Axboe <axboe@kernel.dk>
In message <50A3DE92.3020509@kernel.dk>, Jens Axboe writes:
>Note that you still need group_reporting=1 in your job file, the all
>clients report depend on getting only one set of stats from each client.
>I'll look into fixing that up, too.
>
>-- 
>Jens Axboe
>
>--
>To unsubscribe from this list: send the line "unsubscribe fio" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

	Ah, I just sent an email to you noticing that... :-)

Scott Emery
emery@sgi.com



* Re: All clients impatient
  2012-11-14 19:09       ` emery
@ 2012-11-14 19:29         ` emery
  2012-11-14 19:58           ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: emery @ 2012-11-14 19:29 UTC
  To: emery; +Cc: Jens Axboe, fio

emery@sgi.com:  emery@sgi.com
In message <201211141909.qAEJ9HtO29431352@zion.americas.sgi.com>, emery@sgi.com
 writes:
>emery@sgi.com:  Jens Axboe <axboe@kernel.dk>
>In message <50A3DE92.3020509@kernel.dk>, Jens Axboe writes:
>>Note that you still need group_reporting=1 in your job file, the all
>>clients report depend on getting only one set of stats from each client.
>>I'll look into fixing that up, too.


	Works great when I add group_reporting.

[semery@lou2-mds2 multi2]$ egrep 'groupid|bw=' write_tape_2.out
wt2: (groupid=1, jobs=2): err= 0: pid=3070: Wed Nov 14 11:26:17 2012
  read : io=20480MB, bw=554641KB/s, iops=270 , runt= 37811msec
wt2: (groupid=1, jobs=2): err= 0: pid=10638: Wed Nov 14 11:26:17 2012
  read : io=20480MB, bw=556658KB/s, iops=271 , runt= 37674msec
wt2: (groupid=1, jobs=2): err= 0: pid=26969: Wed Nov 14 11:26:17 2012
  read : io=20480MB, bw=553703KB/s, iops=270 , runt= 37875msec
wt2: (groupid=1, jobs=2): err= 0: pid=14154: Wed Nov 14 11:26:17 2012
  read : io=20480MB, bw=553251KB/s, iops=270 , runt= 37906msec
All clients: (groupid=1, jobs=4): err= 0: pid=0: Wed Nov 14 11:26:17 2012
  read : io=85899MB, bw=2266.2MB/s, iops=1080 , runt= 37906msec

Scott Emery
emery@sgi.com



* Re: All clients impatient
  2012-11-14 19:29         ` emery
@ 2012-11-14 19:58           ` Jens Axboe
  2012-11-14 20:57             ` Jens Axboe
  2012-11-14 21:26             ` emery
  0 siblings, 2 replies; 11+ messages in thread
From: Jens Axboe @ 2012-11-14 19:58 UTC
  To: emery; +Cc: fio

On 2012-11-14 12:29, emery@sgi.com wrote:
> 
> 	Works great when I add group_reporting.
> 
> [semery@lou2-mds2 multi2]$ egrep 'groupid|bw=' write_tape_2.out
> wt2: (groupid=1, jobs=2): err= 0: pid=3070: Wed Nov 14 11:26:17 2012
>   read : io=20480MB, bw=554641KB/s, iops=270 , runt= 37811msec
> wt2: (groupid=1, jobs=2): err= 0: pid=10638: Wed Nov 14 11:26:17 2012
>   read : io=20480MB, bw=556658KB/s, iops=271 , runt= 37674msec
> wt2: (groupid=1, jobs=2): err= 0: pid=26969: Wed Nov 14 11:26:17 2012
>   read : io=20480MB, bw=553703KB/s, iops=270 , runt= 37875msec
> wt2: (groupid=1, jobs=2): err= 0: pid=14154: Wed Nov 14 11:26:17 2012
>   read : io=20480MB, bw=553251KB/s, iops=270 , runt= 37906msec
> All clients: (groupid=1, jobs=4): err= 0: pid=0: Wed Nov 14 11:26:17 2012
>   read : io=85899MB, bw=2266.2MB/s, iops=1080 , runt= 37906msec

Should work without group_reporting as well with the below patch. Can
you confirm?

diff --git a/backend.c b/backend.c
index fd73eda..b80c903 100644
--- a/backend.c
+++ b/backend.c
@@ -62,6 +62,7 @@ struct io_log *agg_io_log[DDIR_RWDIR_CNT];
 
 int groupid = 0;
 unsigned int thread_number = 0;
+unsigned int stat_number = 0;
 unsigned int nr_process = 0;
 unsigned int nr_thread = 0;
 int shm_id = 0;
diff --git a/client.c b/client.c
index bf09d7e..e02a33b 100644
--- a/client.c
+++ b/client.c
@@ -50,6 +50,7 @@ struct fio_client {
 	int error;
 	int ipv6;
 	int sent_job;
+	int did_stat;
 
 	struct flist_head eta_list;
 	struct client_eta *eta_in_flight;
@@ -83,6 +84,7 @@ static struct thread_stat client_ts;
 static struct group_run_stats client_gs;
 static int sum_stat_clients;
 static int sum_stat_nr;
+static int do_output_all_clients;
 
 #define FIO_CLIENT_HASH_BITS	7
 #define FIO_CLIENT_HASH_SZ	(1 << FIO_CLIENT_HASH_BITS)
@@ -159,9 +161,11 @@ static void remove_client(struct fio_client *client)
 	if (client->ini_file)
 		free(client->ini_file);
 
+	if (!client->did_stat)
+		sum_stat_clients--;
+
 	free(client);
 	nr_clients--;
-	sum_stat_clients--;
 }
 
 static void put_client(struct fio_client *client)
@@ -664,7 +668,7 @@ static void convert_gs(struct group_run_stats *dst, struct group_run_stats *src)
 	dst->groupid	= le32_to_cpu(src->groupid);
 }
 
-static void handle_ts(struct fio_net_cmd *cmd)
+static void handle_ts(struct fio_client *client, struct fio_net_cmd *cmd)
 {
 	struct cmd_ts_pdu *p = (struct cmd_ts_pdu *) cmd->payload;
 
@@ -672,8 +676,9 @@ static void handle_ts(struct fio_net_cmd *cmd)
 	convert_gs(&p->rs, &p->rs);
 
 	show_thread_status(&p->ts, &p->rs);
+	client->did_stat = 1;
 
-	if (sum_stat_clients == 1)
+	if (!do_output_all_clients)
 		return;
 
 	sum_thread_stats(&client_ts, &p->ts, sum_stat_nr);
@@ -870,6 +875,11 @@ static void handle_start(struct fio_client *client, struct fio_net_cmd *cmd)
 
 	client->state = Client_started;
 	client->jobs = le32_to_cpu(pdu->jobs);
+
+	if (sum_stat_clients > 1)
+		do_output_all_clients = 1;
+
+	sum_stat_clients += le32_to_cpu(pdu->stat_outputs);
 }
 
 static void handle_stop(struct fio_client *client, struct fio_net_cmd *cmd)
@@ -921,7 +931,7 @@ static int handle_client(struct fio_client *client)
 		free(cmd);
 		break;
 	case FIO_NET_CMD_TS:
-		handle_ts(cmd);
+		handle_ts(client, cmd);
 		free(cmd);
 		break;
 	case FIO_NET_CMD_GS:
@@ -1053,7 +1063,6 @@ int fio_handle_clients(void)
 
 	pfds = malloc(nr_clients * sizeof(struct pollfd));
 
-	sum_stat_clients = nr_clients;
 	init_thread_stat(&client_ts);
 	init_group_run_stat(&client_gs);
 
diff --git a/fio.h b/fio.h
index ca1a5f0..f69de0d 100644
--- a/fio.h
+++ b/fio.h
@@ -558,6 +558,7 @@ enum {
 
 extern int exitall_on_terminate;
 extern unsigned int thread_number;
+extern unsigned int stat_number;
 extern unsigned int nr_process, nr_thread;
 extern int shm_id;
 extern int groupid;
diff --git a/init.c b/init.c
index 23be863..a682423 100644
--- a/init.c
+++ b/init.c
@@ -317,6 +317,10 @@ static struct thread_data *get_new_job(int global, struct thread_data *parent,
 	profile_add_hooks(td);
 
 	td->thread_number = thread_number;
+
+	if (!parent || !parent->o.group_reporting)
+		stat_number++;
+
 	return td;
 }
 
diff --git a/server.c b/server.c
index 72def7e..f8c3635 100644
--- a/server.c
+++ b/server.c
@@ -342,6 +342,7 @@ static int handle_job_cmd(struct fio_net_cmd *cmd)
 	}
 
 	spdu.jobs = cpu_to_le32(thread_number);
+	spdu.stat_outputs = cpu_to_le32(stat_number);
 	fio_net_send_cmd(server_fd, FIO_NET_CMD_START, &spdu, sizeof(spdu), 0);
 
 	ret = fio_backend();
diff --git a/server.h b/server.h
index 9bf8907..5b8c7ba 100644
--- a/server.h
+++ b/server.h
@@ -96,6 +96,7 @@ struct cmd_line_pdu {
 
 struct cmd_start_pdu {
 	uint32_t jobs;
+	uint32_t stat_outputs;
 };
 
 struct cmd_end_pdu {

-- 
Jens Axboe



* Re: All clients impatient
  2012-11-14 19:58           ` Jens Axboe
@ 2012-11-14 20:57             ` Jens Axboe
  2012-11-14 21:30               ` emery
  2012-11-14 21:26             ` emery
  1 sibling, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2012-11-14 20:57 UTC
  To: emery; +Cc: fio

On 2012-11-14 12:58, Jens Axboe wrote:
> Should work without group_reporting as well with the below patch. Can
> you confirm?

Committed the patch, with a few modifications. So fio git should work
for you now, regardless of whether group_reporting=1 is used or not. I
suspect you want it on in any case, as you are only interested in the
aggregate results for the clients.

-- 
Jens Axboe


* Re: All clients impatient
  2012-11-14 20:57             ` Jens Axboe
@ 2012-11-14 21:30               ` emery
  2012-11-14 21:31                 ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: emery @ 2012-11-14 21:30 UTC
  To: Jens Axboe; +Cc: fio

emery@sgi.com:  Jens Axboe <axboe@kernel.dk>
In message <50A405C4.50100@kernel.dk>, Jens Axboe writes:
>On 2012-11-14 12:58, Jens Axboe wrote:
>> Should work without group_reporting as well with the below patch. Can
>> you confirm?
>
>Committed the patch, with a few modifications. So fio git should work
>for you now, regardless of whether group_reporting=1 is used or not. I
>suspect you want it on in any case, as you are only interested in the
>aggregate results for the clients.
>
>-- 
>Jens Axboe
>

	I will probably run w/o group_reporting=1. I'm interested
in both the aggregate performance of the cluster filesystem(s), and
the deviation of the individual threads.

	Aggregate performance helps me quickly see whether I'm getting
the performance I expect from the filesystem or filesystems. W/o that
feature I might have to whip out a calculator. :-)

	Individual thread performance lets me know whether one or
more tape drives would be bandwidth starved, which can lead to 
excessive tape wear.
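
As a stand-in for the calculator, the per-job bandwidth lines can be summed directly from the saved output. This is a minimal sketch, not part of fio: the function name `sum_bw_kb` is hypothetical, it assumes per-job lines report `bw=` in KB/s as in the outputs earlier in this thread (lines in other units are ignored), and it skips the bandwidth line that follows an "All clients:" header.

```shell
# Sum the bw= values (KB/s) of per-job result lines in fio output.
# Reads the named files, or standard input when no file is given.
sum_bw_kb() {
    awk '
        /^All clients:/ { skip = 1; next }   # next bw= line is the total
        /groupid=/      { skip = 0; next }   # a per-job header line
        /bw=[0-9.]+KB\/s/ && !skip {
            match($0, /bw=[0-9.]+/)
            total += substr($0, RSTART + 3, RLENGTH - 3)
            n++
        }
        END { printf "%d bw lines, %d KB/s total\n", n, total }
    ' "$@"
}
```

Running `sum_bw_kb write_tape_2.out` gives a figure to compare against the "All clients" line, while the individual `bw=` lines remain available for spotting a slow thread.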

Scott Emery
emery@sgi.com



* Re: All clients impatient
  2012-11-14 21:30               ` emery
@ 2012-11-14 21:31                 ` Jens Axboe
  0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2012-11-14 21:31 UTC
  To: emery; +Cc: fio

On 2012-11-14 14:30, emery@sgi.com wrote:
> emery@sgi.com:  Jens Axboe <axboe@kernel.dk>
> In message <50A405C4.50100@kernel.dk>, Jens Axboe writes:
>> On 2012-11-14 12:58, Jens Axboe wrote:
>>> Should work without group_reporting as well with the below patch. Can
>>> you confirm?
>>
>> Committed the patch, with a few modifications. So fio git should work
>> for you now, regardless of whether group_reporting=1 is used or not. I
>> suspect you want it on in any case, as you are only interested in the
>> aggregate results for the clients.
>>
>> -- 
>> Jens Axboe
>>
> 
> 	I will probably run w/o group_reporting=1. I'm interested
> in both the aggregate performance of the cluster filesystem(s), and
> the deviation of the individual threads.
> 
> 	Aggregate performance helps me quickly see whether I'm getting
> the performance I expect from the filesystem or filesystems. W/o that
> feature I might have to whip out a calculator. :-)
> 
> 	Individual thread performance lets me know whether one or
> more tape drives would be bandwidth starved, which can lead to 
> excessive tape wear.

Ah I see, so you do care about individual thread numbers. Which is fine!
And at least everything should work now, regardless of whether
group_reporting is set or not.

-- 
Jens Axboe



* Re: All clients impatient
  2012-11-14 19:58           ` Jens Axboe
  2012-11-14 20:57             ` Jens Axboe
@ 2012-11-14 21:26             ` emery
  1 sibling, 0 replies; 11+ messages in thread
From: emery @ 2012-11-14 21:26 UTC
  To: Jens Axboe; +Cc: emery, fio, emery

emery@sgi.com:  Jens Axboe <axboe@kernel.dk>
In message <50A3F7C8.4070103@kernel.dk>, Jens Axboe writes:
>Should work without group_reporting as well with the below patch. Can
>you confirm?
>


	Provides "All clients" correctly with and without  group_reporting
in the job description.

	Here is the output of without (plus a little debug code):
[semery@xxx2-zzz2 multi2]$ egrep 'groupid|bw=|sum_stat_clients' write_tape_2.out
wt2: (groupid=1, jobs=2): err= 0: pid=6167: Wed Nov 14 13:20:44 2012
  read : io=20480MB, bw=554905KB/s, iops=270 , runt= 37793msec
 sum_stat_nr 0, sum_stat_clients 4
wt2: (groupid=1, jobs=2): err= 0: pid=13534: Wed Nov 14 13:20:44 2012
  read : io=20480MB, bw=556628KB/s, iops=271 , runt= 37676msec
 sum_stat_nr 1, sum_stat_clients 4
wt2: (groupid=1, jobs=2): err= 0: pid=30390: Wed Nov 14 13:20:44 2012
  read : io=20480MB, bw=553003KB/s, iops=270 , runt= 37923msec
 sum_stat_nr 2, sum_stat_clients 4
wt2: (groupid=1, jobs=2): err= 0: pid=17854: Wed Nov 14 13:20:44 2012
  read : io=20480MB, bw=553003KB/s, iops=270 , runt= 37923msec
 sum_stat_nr 3, sum_stat_clients 4
All clients: (groupid=1, jobs=4): err= 0: pid=0: Wed Nov 14 13:20:44 2012
  read : io=85899MB, bw=2265.1MB/s, iops=1080 , runt= 37923msec




