* Understanding iostat-ms output
@ 2008-01-18 9:37 Fredrik Lindgren
[not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Fredrik Lindgren @ 2008-01-18 9:37 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org
Hello,
I have been messing around with the excellent "iostat-ms" tool in
an effort to understand what is going on with our machines that are
showing lots of IOwait.
Below are two 15 second samples, the first one is taken during peak
and the second one during the night. The obvious difference is in the
"exe" field. Please excuse my ignorance, but what does this field mean?
Where is this additional time spent? On the filer or in the NFS stack
on the host?
stbnas1:/foo mounted on /misc/foo:
op/s rpc bklog
2364.60 0.96
read: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
43.267 659.624 15.246 0 (0.0%) 17.066 293.880
write: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
16.133 60.314 3.738 0 (0.0%) 3.868 255.074
stbnas1:/foo mounted on /misc/foo:
op/s rpc bklog
1446.07 0.26
read: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
105.200 189.155 1.798 0 (0.0%) 8.264 8.702
write: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
105.067 297.774 2.834 0 (0.0%) 1.048 1.310
Regards,
Fredrik Lindgren
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
[not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
@ 2008-01-18 16:34 ` Chuck Lever
2008-01-18 17:02 ` Trond Myklebust
1 sibling, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 16:34 UTC (permalink / raw)
To: Fredrik Lindgren; +Cc: linux-nfs@vger.kernel.org
On Jan 18, 2008, at 4:37 AM, Fredrik Lindgren wrote:
> I have been messing around with the excellent "iostat-ms" tool in
> an effort to understand what is going on with our machines that are
> showing lots of IOwait.
>
> Below are two 15 second samples, the first one is taken during peak
> and the second one during the night. The obvious difference is in the
> "exe" field. Please excuse my ignorance, but what does this field
> mean?
> Where is this additional time spent? On the filer or in the NFS stack
> on the host?
RTT is the time from when the kernel RPC client sends the RPC request
to the time it receives the reply; RTT includes network transit time
and server execution time.
"exe" is the time from when the NFS client dispatches the RPC request
to the kernel RPC client to the time the RPC request is complete;
"exe" includes RTT, and RPC client queuing and processing time.
So, even though the server turns your requests around in under
20msec, the RPC client is taking almost a quarter second, on average,
to process an RPC request.
The RPC client, by default, can process up to 16 requests
concurrently. Once this limit is exceeded, the RPC client queues
requests coming from the NFS client in a backlog queue. Average
execution time "exe" goes up if the requests have to wait in the
backlog queue.
Since your retransmit rate is zero, I'll assume you are using TCP.
Try increasing the default slot table size of the RPC client:
sudo sysctl -w sunrpc.tcp_slot_table_entries=64
then remounting your NFS file systems.
> stbnas1:/foo mounted on /misc/foo:
>
> op/s rpc bklog
> 2364.60 0.96
> read: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 43.267 659.624 15.246 0
> (0.0%) 17.066 293.880
> write: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 16.133 60.314 3.738 0
> (0.0%) 3.868 255.074
>
> stbnas1:/foo mounted on /misc/foo:
>
> op/s rpc bklog
> 1446.07 0.26
> read: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 105.200 189.155 1.798 0
> (0.0%) 8.264 8.702
> write: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 105.067 297.774 2.834 0
> (0.0%) 1.048 1.310
>
>
> Regards,
> Fredrik Lindgren
>
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-
> nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
[not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
2008-01-18 16:34 ` Chuck Lever
@ 2008-01-18 17:02 ` Trond Myklebust
[not found] ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
1 sibling, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 17:02 UTC (permalink / raw)
To: Fredrik Lindgren, Mr. Charles Edward Lever; +Cc: linux-nfs@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2272 bytes --]
On Fri, 2008-01-18 at 10:37 +0100, Fredrik Lindgren wrote:
> Hello,
>
> I have been messing around with the excellent "iostat-ms" tool in
> an effort to understand what is going on with our machines that are
> showing lots of IOwait.
>
> Below are two 15 second samples, the first one is taken during peak
> and the second one during the night. The obvious difference is in the
> "exe" field. Please excuse my ignorance, but what does this field mean?
> Where is this additional time spent? On the filer or in the NFS stack
> on the host?
>
> stbnas1:/foo mounted on /misc/foo:
>
> op/s rpc bklog
> 2364.60 0.96
> read: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
> 43.267 659.624 15.246 0 (0.0%) 17.066 293.880
> write: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
> 16.133 60.314 3.738 0 (0.0%) 3.868 255.074
>
> stbnas1:/foo mounted on /misc/foo:
>
> op/s rpc bklog
> 1446.07 0.26
> read: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
> 105.200 189.155 1.798 0 (0.0%) 8.264 8.702
> write: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
> 105.067 297.774 2.834 0 (0.0%) 1.048 1.310
AFAICS the RTT (i.e. round trip time) is being reported as being of the
order of milliseconds in both cases, so the filer would not appear to be
responsible. The discrepancy has to be on the client side.
It would be interesting to see the 'rpc bklog' parameter in order to
confirm if the requests are being held up in the backlog queue.
Unfortunately, there appears to be a bug in iostat-ms: it appears to be
reading the 'number of requests in flight' where it should be reading
the 'length of backlog queue' parameter.
See if the attached patch doesn't help... (it also fixes a minor issue
of storing unsigned long long counters in the wrong type)
Cheers
Trond
[-- Attachment #2: iostat-ms.dif --]
[-- Type: text/x-patch, Size: 1160 bytes --]
--- iostat-ms.orig 2007-10-15 07:27:24.000000000 -0400
+++ iostat-ms 2008-01-18 11:45:20.832558000 -0500
@@ -121,7 +121,8 @@
self.__rpc_data['rpcsends'] = int(words[4])
self.__rpc_data['rpcreceives'] = int(words[5])
self.__rpc_data['badxids'] = int(words[6])
- self.__rpc_data['backlogutil'] = int(words[7])
+ self.__rpc_data['inflightsends'] = long(words[7])
+ self.__rpc_data['backlogutil'] = long(words[8])
elif words[1] == 'tcp':
self.__rpc_data['port'] = words[2]
self.__rpc_data['bind_count'] = int(words[3])
@@ -131,7 +132,8 @@
self.__rpc_data['rpcsends'] = int(words[7])
self.__rpc_data['rpcreceives'] = int(words[8])
self.__rpc_data['badxids'] = int(words[9])
- self.__rpc_data['backlogutil'] = int(words[10])
+ self.__rpc_data['inflightsends'] = long(words[10])
+ self.__rpc_data['backlogutil'] = long(words[11])
elif words[0] == 'per-op':
self.__rpc_data['per-op'] = words
else:
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
[not found] ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-01-18 17:10 ` Trond Myklebust
2008-01-18 17:37 ` Chuck Lever
1 sibling, 0 replies; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 17:10 UTC (permalink / raw)
To: Fredrik Lindgren; +Cc: Mr. Charles Edward Lever, linux-nfs@vger.kernel.org
On Fri, 2008-01-18 at 12:02 -0500, Trond Myklebust wrote:
> It would be interesting to see the 'rpc bklog' parameter in order to
> confirm if the requests are being held up in the backlog queue.
> Unfortunately, there appears to be a bug in iostat-ms: it appears to be
> reading the 'number of requests in flight' where it should be reading
> the 'length of backlog queue' parameter.
BTW: the same bug exists in the 'mountstats' utility. The same patch
fixes the issue there too...
Trond
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
[not found] ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 17:10 ` Trond Myklebust
@ 2008-01-18 17:37 ` Chuck Lever
2008-01-18 18:06 ` Trond Myklebust
1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 17:37 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org list
On Jan 18, 2008, at 12:02 PM, Trond Myklebust wrote:
> On Fri, 2008-01-18 at 10:37 +0100, Fredrik Lindgren wrote:
>> Hello,
>>
>> I have been messing around with the excellent "iostat-ms" tool in
>> an effort to understand what is going on with our machines that are
>> showing lots of IOwait.
>>
>> Below are two 15 second samples, the first one is taken during peak
>> and the second one during the night. The obvious difference is in the
>> "exe" field. Please excuse my ignorance, but what does this field
>> mean?
>> Where is this additional time spent? On the filer or in the NFS stack
>> on the host?
>>
>> stbnas1:/foo mounted on /misc/foo:
>>
>> op/s rpc bklog
>> 2364.60 0.96
>> read: ops/s Kb/s Kb/op
>> retrans avg RTT (ms) avg exe (ms)
>> 43.267 659.624 15.246 0
>> (0.0%) 17.066 293.880
>> write: ops/s Kb/s Kb/op
>> retrans avg RTT (ms) avg exe (ms)
>> 16.133 60.314 3.738 0
>> (0.0%) 3.868 255.074
>>
>> stbnas1:/foo mounted on /misc/foo:
>>
>> op/s rpc bklog
>> 1446.07 0.26
>> read: ops/s Kb/s Kb/op
>> retrans avg RTT (ms) avg exe (ms)
>> 105.200 189.155 1.798 0
>> (0.0%) 8.264 8.702
>> write: ops/s Kb/s Kb/op
>> retrans avg RTT (ms) avg exe (ms)
>> 105.067 297.774 2.834 0
>> (0.0%) 1.048 1.310
>
> AFAICS the RTT (i.e. round trip time) is being reported as being of
> the
> order of milliseconds in both cases, so the filer would not appear
> to be
> responsible. The discrepancy has to be on the client side.
>
> It would be interesting to see the 'rpc bklog' parameter in order to
> confirm if the requests are being held up in the backlog queue.
> Unfortunately, there appears to be a bug in iostat-ms: it appears
> to be
> reading the 'number of requests in flight' where it should be reading
> the 'length of backlog queue' parameter.
> See if the attached patch doesn't help... (it also fixes a minor issue
> of storing unsigned long long counters in the wrong type)
I've applied Trond's patch, and patches from NetApp to support RDMA
statistics to these two scripts, and posted fresh versions here:
http://oss.oracle.com/~cel/linux-2.6/2.6.24/
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
2008-01-18 17:37 ` Chuck Lever
@ 2008-01-18 18:06 ` Trond Myklebust
[not found] ` <1200679596.7781.37.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 18:06 UTC (permalink / raw)
To: Chuck Lever; +Cc: linux-nfs@vger.kernel.org list
On Fri, 2008-01-18 at 12:37 -0500, Chuck Lever wrote:
> I've applied Trond's patch, and patches from NetApp to support RDMA
> statistics to these two scripts, and posted fresh versions here:
>
> http://oss.oracle.com/~cel/linux-2.6/2.6.24/
Any chance you could put this in a git repository on git.linux-nfs.org
so that we can track changes?
Trond
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
[not found] ` <1200679596.7781.37.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-01-18 18:11 ` Chuck Lever
2008-01-18 19:19 ` Trond Myklebust
0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 18:11 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org list
On Jan 18, 2008, at 1:06 PM, Trond Myklebust wrote:
> On Fri, 2008-01-18 at 12:37 -0500, Chuck Lever wrote:
>> I've applied Trond's patch, and patches from NetApp to support RDMA
>> statistics to these two scripts, and posted fresh versions here:
>>
>> http://oss.oracle.com/~cel/linux-2.6/2.6.24/
>
> Any chance you could put this in a git repository on git.linux-nfs.org
> so that we can track changes?
The plan is to include these scripts in nfs-utils during 2CQ08, after
the IPv6 work is complete. There is some additional work to be done,
such as composing man pages for both scripts, adding appropriate
dependency information in the nfs-utils build scripts, and so on.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
2008-01-18 18:11 ` Chuck Lever
@ 2008-01-18 19:19 ` Trond Myklebust
[not found] ` <1200683992.13348.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 19:19 UTC (permalink / raw)
To: Chuck Lever; +Cc: linux-nfs@vger.kernel.org list
On Fri, 2008-01-18 at 13:11 -0500, Chuck Lever wrote:
> The plan is to include these scripts in nfs-utils during 2CQ08, after
> the IPv6 work is complete. There is some additional work to be done,
> such as composing man pages for both scripts, adding appropriate
> dependency information in the nfs-utils build scripts, and so on.
I'm of two minds about that. I agree that we need an iostat-like utility
in nfs-utils, but I'm wary about also adding in a new language
dependency on python.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
[not found] ` <1200683992.13348.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-01-18 21:06 ` Chuck Lever
2008-01-19 12:51 ` Steve Dickson
0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 21:06 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org list
Hi Trond-
On Jan 18, 2008, at 2:19 PM, Trond Myklebust wrote:
> On Fri, 2008-01-18 at 13:11 -0500, Chuck Lever wrote:
>> The plan is to include these scripts in nfs-utils during 2CQ08, after
>> the IPv6 work is complete. There is some additional work to be done,
>> such as composing man pages for both scripts, adding appropriate
>> dependency information in the nfs-utils build scripts, and so on.
>
> I'm of two minds about that. I agree that we need an iostat-like
> utility
> in nfs-utils, but I'm wary about also adding in a new language
> dependency on python.
Steve has already agreed to include the Python versions. However, if
it presents a challenge, it would be fairly trivial to rewrite these
scripts in C.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Understanding iostat-ms output
2008-01-18 21:06 ` Chuck Lever
@ 2008-01-19 12:51 ` Steve Dickson
0 siblings, 0 replies; 10+ messages in thread
From: Steve Dickson @ 2008-01-19 12:51 UTC (permalink / raw)
To: Chuck Lever; +Cc: Trond Myklebust, linux-nfs@vger.kernel.org list
Chuck Lever wrote:
> Hi Trond-
>
> On Jan 18, 2008, at 2:19 PM, Trond Myklebust wrote:
>> On Fri, 2008-01-18 at 13:11 -0500, Chuck Lever wrote:
>>> The plan is to include these scripts in nfs-utils during 2CQ08, after
>>> the IPv6 work is complete. There is some additional work to be done,
>>> such as composing man pages for both scripts, adding appropriate
>>> dependency information in the nfs-utils build scripts, and so on.
>>
>> I'm of two minds about that. I agree that we need an iostat-like utility
>> in nfs-utils, but I'm wary about also adding in a new language
>> dependency on python.
>
>
> Steve has already agreed to include the Python versions. However, if it
> presents a challenge, it would be fairly trivial to rewrite these
> scripts in C.
Personally I think Python is fine... but let me look into it to how
painful the integration is and what dependencies are needed.
steved.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2008-01-19 12:53 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-18 9:37 Understanding iostat-ms output Fredrik Lindgren
[not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
2008-01-18 16:34 ` Chuck Lever
2008-01-18 17:02 ` Trond Myklebust
[not found] ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 17:10 ` Trond Myklebust
2008-01-18 17:37 ` Chuck Lever
2008-01-18 18:06 ` Trond Myklebust
[not found] ` <1200679596.7781.37.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 18:11 ` Chuck Lever
2008-01-18 19:19 ` Trond Myklebust
[not found] ` <1200683992.13348.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 21:06 ` Chuck Lever
2008-01-19 12:51 ` Steve Dickson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox