Linux NFS development
 help / color / mirror / Atom feed
* Understanding iostat-ms output
@ 2008-01-18  9:37 Fredrik Lindgren
       [not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Fredrik Lindgren @ 2008-01-18  9:37 UTC (permalink / raw)
  To: linux-nfs@vger.kernel.org

Hello,

I have been messing around with the excellent "iostat-ms" tool in
an effort to understand what is going on with our machines that are
showing lots of IOwait.

Below are two 15 second samples, the first one is taken during peak
and the second one during the night. The obvious difference is in the
"exe" field. Please excuse my ignorance, but what does this field mean?
Where is this additional time spent? On the filer or in the NFS stack
on the host?

stbnas1:/foo mounted on /misc/foo:

   op/s         rpc bklog
2364.60            0.96
read:             ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
                 43.267         659.624          15.246        0 (0.0%)          17.066         293.880
write:            ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
                 16.133          60.314           3.738        0 (0.0%)           3.868         255.074

stbnas1:/foo mounted on /misc/foo:

   op/s         rpc bklog
1446.07            0.26
read:             ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
                105.200         189.155           1.798        0 (0.0%)           8.264           8.702
write:            ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
                105.067         297.774           2.834        0 (0.0%)           1.048           1.310


Regards,
  Fredrik Lindgren





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
       [not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
@ 2008-01-18 16:34   ` Chuck Lever
  2008-01-18 17:02   ` Trond Myklebust
  1 sibling, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 16:34 UTC (permalink / raw)
  To: Fredrik Lindgren; +Cc: linux-nfs@vger.kernel.org

On Jan 18, 2008, at 4:37 AM, Fredrik Lindgren wrote:
> I have been messing around with the excellent "iostat-ms" tool in
> an effort to understand what is going on with our machines that are
> showing lots of IOwait.
>
> Below are two 15 second samples, the first one is taken during peak
> and the second one during the night. The obvious difference is in the
> "exe" field. Please excuse my ignorance, but what does this field  
> mean?
> Where is this additional time spent? On the filer or in the NFS stack
> on the host?

RTT is the time from when the kernel RPC client sends the RPC request  
to the time it receives the reply; RTT includes network transit time  
and server execution time.

"exe" is the time from when the NFS client dispatches the RPC request  
to the kernel RPC client to the time the RPC request is complete;  
"exe" includes RTT, and RPC client queuing and processing time.

So, even though the server turns your requests around in under  
20msec, the RPC client is taking almost a quarter second, on average,  
to process an RPC request.

The RPC client, by default, can process up to 16 requests  
concurrently.  Once this limit is exceeded, the RPC client queues  
requests coming from the NFS client in a backlog queue.  Average  
execution time "exe" goes up if the requests have to wait in the  
backlog queue.

Since your retransmit rate is zero, I'll assume you are using TCP.   
Try increasing the default slot table size of the RPC client:

   sudo sysctl -w sunrpc.tcp_slot_table_entries=64

then remounting your NFS file systems.

> stbnas1:/foo mounted on /misc/foo:
>
>    op/s         rpc bklog
> 2364.60            0.96
> read:             ops/s            Kb/s           Kb/op          
> retrans         avg RTT (ms)    avg exe (ms)
>                  43.267         659.624          15.246        0  
> (0.0%)          17.066         293.880
> write:            ops/s            Kb/s           Kb/op          
> retrans         avg RTT (ms)    avg exe (ms)
>                  16.133          60.314           3.738        0  
> (0.0%)           3.868         255.074
>
> stbnas1:/foo mounted on /misc/foo:
>
>    op/s         rpc bklog
> 1446.07            0.26
> read:             ops/s            Kb/s           Kb/op          
> retrans         avg RTT (ms)    avg exe (ms)
>                 105.200         189.155           1.798        0  
> (0.0%)           8.264           8.702
> write:            ops/s            Kb/s           Kb/op          
> retrans         avg RTT (ms)    avg exe (ms)
>                 105.067         297.774           2.834        0  
> (0.0%)           1.048           1.310
>
>
> Regards,
>   Fredrik Lindgren
>
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux- 
> nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
       [not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
  2008-01-18 16:34   ` Chuck Lever
@ 2008-01-18 17:02   ` Trond Myklebust
       [not found]     ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 17:02 UTC (permalink / raw)
  To: Fredrik Lindgren, Mr. Charles Edward Lever; +Cc: linux-nfs@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 2272 bytes --]


On Fri, 2008-01-18 at 10:37 +0100, Fredrik Lindgren wrote:
> Hello,
> 
> I have been messing around with the excellent "iostat-ms" tool in
> an effort to understand what is going on with our machines that are
> showing lots of IOwait.
> 
> Below are two 15 second samples, the first one is taken during peak
> and the second one during the night. The obvious difference is in the
> "exe" field. Please excuse my ignorance, but what does this field mean?
> Where is this additional time spent? On the filer or in the NFS stack
> on the host?
> 
> stbnas1:/foo mounted on /misc/foo:
> 
>    op/s         rpc bklog
> 2364.60            0.96
> read:             ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
>                  43.267         659.624          15.246        0 (0.0%)          17.066         293.880
> write:            ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
>                  16.133          60.314           3.738        0 (0.0%)           3.868         255.074
> 
> stbnas1:/foo mounted on /misc/foo:
> 
>    op/s         rpc bklog
> 1446.07            0.26
> read:             ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
>                 105.200         189.155           1.798        0 (0.0%)           8.264           8.702
> write:            ops/s            Kb/s           Kb/op         retrans         avg RTT (ms)    avg exe (ms)
>                 105.067         297.774           2.834        0 (0.0%)           1.048           1.310

AFAICS the RTT (i.e. round trip time) is being reported as being of the
order of milliseconds in both cases, so the filer would not appear to be
responsible. The discrepancy has to be on the client side.

It would be interesting to see the 'rpc bklog' parameter in order to
confirm if the requests are being held up in the backlog queue.
Unfortunately, there appears to be a bug in iostat-ms: it appears to be
reading the 'number of requests in flight' where it should be reading
the 'length of backlog queue' parameter.
See if the attached patch doesn't help... (it also fixes a minor issue
of storing unsigned long long counters in the wrong type)

Cheers
  Trond


[-- Attachment #2: iostat-ms.dif --]
[-- Type: text/x-patch, Size: 1160 bytes --]

--- iostat-ms.orig	2007-10-15 07:27:24.000000000 -0400
+++ iostat-ms	2008-01-18 11:45:20.832558000 -0500
@@ -121,7 +121,8 @@
                 self.__rpc_data['rpcsends'] = int(words[4])
                 self.__rpc_data['rpcreceives'] = int(words[5])
                 self.__rpc_data['badxids'] = int(words[6])
-                self.__rpc_data['backlogutil'] = int(words[7])
+                self.__rpc_data['inflightsends'] = long(words[7])
+                self.__rpc_data['backlogutil'] = long(words[8])
             elif words[1] == 'tcp':
                 self.__rpc_data['port'] = words[2]
                 self.__rpc_data['bind_count'] = int(words[3])
@@ -131,7 +132,8 @@
                 self.__rpc_data['rpcsends'] = int(words[7])
                 self.__rpc_data['rpcreceives'] = int(words[8])
                 self.__rpc_data['badxids'] = int(words[9])
-                self.__rpc_data['backlogutil'] = int(words[10])
+                self.__rpc_data['inflightsends'] = long(words[10])
+                self.__rpc_data['backlogutil'] = long(words[11])
         elif words[0] == 'per-op':
             self.__rpc_data['per-op'] = words
         else:

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
       [not found]     ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-01-18 17:10       ` Trond Myklebust
  2008-01-18 17:37       ` Chuck Lever
  1 sibling, 0 replies; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 17:10 UTC (permalink / raw)
  To: Fredrik Lindgren; +Cc: Mr. Charles Edward Lever, linux-nfs@vger.kernel.org


On Fri, 2008-01-18 at 12:02 -0500, Trond Myklebust wrote:
> It would be interesting to see the 'rpc bklog' parameter in order to
> confirm if the requests are being held up in the backlog queue.
> Unfortunately, there appears to be a bug in iostat-ms: it appears to be
> reading the 'number of requests in flight' where it should be reading
> the 'length of backlog queue' parameter.

BTW: the same bug exists in the 'mountstats' utility. The same patch
fixes the issue there too...

Trond


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
       [not found]     ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2008-01-18 17:10       ` Trond Myklebust
@ 2008-01-18 17:37       ` Chuck Lever
  2008-01-18 18:06         ` Trond Myklebust
  1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 17:37 UTC (permalink / raw)
  To: linux-nfs@vger.kernel.org list

On Jan 18, 2008, at 12:02 PM, Trond Myklebust wrote:
> On Fri, 2008-01-18 at 10:37 +0100, Fredrik Lindgren wrote:
>> Hello,
>>
>> I have been messing around with the excellent "iostat-ms" tool in
>> an effort to understand what is going on with our machines that are
>> showing lots of IOwait.
>>
>> Below are two 15 second samples, the first one is taken during peak
>> and the second one during the night. The obvious difference is in the
>> "exe" field. Please excuse my ignorance, but what does this field  
>> mean?
>> Where is this additional time spent? On the filer or in the NFS stack
>> on the host?
>>
>> stbnas1:/foo mounted on /misc/foo:
>>
>>    op/s         rpc bklog
>> 2364.60            0.96
>> read:             ops/s            Kb/s           Kb/op          
>> retrans         avg RTT (ms)    avg exe (ms)
>>                  43.267         659.624          15.246        0  
>> (0.0%)          17.066         293.880
>> write:            ops/s            Kb/s           Kb/op          
>> retrans         avg RTT (ms)    avg exe (ms)
>>                  16.133          60.314           3.738        0  
>> (0.0%)           3.868         255.074
>>
>> stbnas1:/foo mounted on /misc/foo:
>>
>>    op/s         rpc bklog
>> 1446.07            0.26
>> read:             ops/s            Kb/s           Kb/op          
>> retrans         avg RTT (ms)    avg exe (ms)
>>                 105.200         189.155           1.798        0  
>> (0.0%)           8.264           8.702
>> write:            ops/s            Kb/s           Kb/op          
>> retrans         avg RTT (ms)    avg exe (ms)
>>                 105.067         297.774           2.834        0  
>> (0.0%)           1.048           1.310
>
> AFAICS the RTT (i.e. round trip time) is being reported as being of  
> the
> order of milliseconds in both cases, so the filer would not appear  
> to be
> responsible. The discrepancy has to be on the client side.
>
> It would be interesting to see the 'rpc bklog' parameter in order to
> confirm if the requests are being held up in the backlog queue.
> Unfortunately, there appears to be a bug in iostat-ms: it appears  
> to be
> reading the 'number of requests in flight' where it should be reading
> the 'length of backlog queue' parameter.
> See if the attached patch doesn't help... (it also fixes a minor issue
> of storing unsigned long long counters in the wrong type)


I've applied Trond's patch, and patches from NetApp to support RDMA  
statistics to these two scripts, and posted fresh versions here:

   http://oss.oracle.com/~cel/linux-2.6/2.6.24/

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
  2008-01-18 17:37       ` Chuck Lever
@ 2008-01-18 18:06         ` Trond Myklebust
       [not found]           ` <1200679596.7781.37.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 18:06 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs@vger.kernel.org list


On Fri, 2008-01-18 at 12:37 -0500, Chuck Lever wrote:

> I've applied Trond's patch, and patches from NetApp to support RDMA  
> statistics to these two scripts, and posted fresh versions here:
> 
>    http://oss.oracle.com/~cel/linux-2.6/2.6.24/

Any chance you could put this in a git repository on git.linux-nfs.org
so that we can track changes?

Trond


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
       [not found]           ` <1200679596.7781.37.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-01-18 18:11             ` Chuck Lever
  2008-01-18 19:19               ` Trond Myklebust
  0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 18:11 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org list

On Jan 18, 2008, at 1:06 PM, Trond Myklebust wrote:
> On Fri, 2008-01-18 at 12:37 -0500, Chuck Lever wrote:
>> I've applied Trond's patch, and patches from NetApp to support RDMA
>> statistics to these two scripts, and posted fresh versions here:
>>
>>    http://oss.oracle.com/~cel/linux-2.6/2.6.24/
>
> Any chance you could put this in a git repository on git.linux-nfs.org
> so that we can track changes?


The plan is to include these scripts in nfs-utils during 2CQ08, after  
the IPv6 work is complete.  There is some additional work to be done,  
such as composing man pages for both scripts, adding appropriate  
dependency information in the nfs-utils build scripts, and so on.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
  2008-01-18 18:11             ` Chuck Lever
@ 2008-01-18 19:19               ` Trond Myklebust
       [not found]                 ` <1200683992.13348.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2008-01-18 19:19 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs@vger.kernel.org list


On Fri, 2008-01-18 at 13:11 -0500, Chuck Lever wrote:
> The plan is to include these scripts in nfs-utils during 2CQ08, after  
> the IPv6 work is complete.  There is some additional work to be done,  
> such as composing man pages for both scripts, adding appropriate  
> dependency information in the nfs-utils build scripts, and so on.

I'm of two minds about that. I agree that we need an iostat-like utility
in nfs-utils, but I'm wary about also adding in a new language
dependency on python.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
       [not found]                 ` <1200683992.13348.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-01-18 21:06                   ` Chuck Lever
  2008-01-19 12:51                     ` Steve Dickson
  0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2008-01-18 21:06 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org list

Hi Trond-

On Jan 18, 2008, at 2:19 PM, Trond Myklebust wrote:
> On Fri, 2008-01-18 at 13:11 -0500, Chuck Lever wrote:
>> The plan is to include these scripts in nfs-utils during 2CQ08, after
>> the IPv6 work is complete.  There is some additional work to be done,
>> such as composing man pages for both scripts, adding appropriate
>> dependency information in the nfs-utils build scripts, and so on.
>
> I'm of two minds about that. I agree that we need an iostat-like  
> utility
> in nfs-utils, but I'm wary about also adding in a new language
> dependency on python.


Steve has already agreed to include the Python versions.  However, if  
it presents a challenge, it would be fairly trivial to rewrite these  
scripts in C.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Understanding iostat-ms output
  2008-01-18 21:06                   ` Chuck Lever
@ 2008-01-19 12:51                     ` Steve Dickson
  0 siblings, 0 replies; 10+ messages in thread
From: Steve Dickson @ 2008-01-19 12:51 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Trond Myklebust, linux-nfs@vger.kernel.org list



Chuck Lever wrote:
> Hi Trond-
> 
> On Jan 18, 2008, at 2:19 PM, Trond Myklebust wrote:
>> On Fri, 2008-01-18 at 13:11 -0500, Chuck Lever wrote:
>>> The plan is to include these scripts in nfs-utils during 2CQ08, after
>>> the IPv6 work is complete.  There is some additional work to be done,
>>> such as composing man pages for both scripts, adding appropriate
>>> dependency information in the nfs-utils build scripts, and so on.
>>
>> I'm of two minds about that. I agree that we need an iostat-like utility
>> in nfs-utils, but I'm wary about also adding in a new language
>> dependency on python.
> 
> 
> Steve has already agreed to include the Python versions.  However, if it
> presents a challenge, it would be fairly trivial to rewrite these
> scripts in C.
Personally I think Python is fine... but let me look into it to how
painful the integration is and what dependencies are needed. 

steved.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-01-19 12:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-18  9:37 Understanding iostat-ms output Fredrik Lindgren
     [not found] ` <2a29d0a3c209d9488b0a3e8263c53e38-FpffG6+3qsA@public.gmane.org>
2008-01-18 16:34   ` Chuck Lever
2008-01-18 17:02   ` Trond Myklebust
     [not found]     ` <1200675753.7781.32.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 17:10       ` Trond Myklebust
2008-01-18 17:37       ` Chuck Lever
2008-01-18 18:06         ` Trond Myklebust
     [not found]           ` <1200679596.7781.37.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 18:11             ` Chuck Lever
2008-01-18 19:19               ` Trond Myklebust
     [not found]                 ` <1200683992.13348.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-01-18 21:06                   ` Chuck Lever
2008-01-19 12:51                     ` Steve Dickson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox