From: Kevin Constantine <Kevin.Constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] nfs-iostat.py: Print Data Cache Statistics
Date: Wed, 29 Apr 2009 18:57:19 -0700 [thread overview]
Message-ID: <49F9057F.40105@disney.com> (raw)
In-Reply-To: <50DACB0A-59C3-4BCF-A470-AAB5525CEF26@oracle.com>
Chuck Lever wrote:
> On Apr 24, 2009, at 11:16 PM, Kevin Constantine wrote:
>> Chuck Lever wrote:
>>> Hi Kevin-
>>> On Apr 21, 2009, at 9:42 PM, Kevin Constantine wrote:
>>>> I rolled the read cache, and write patches into a single patch below.
>>>>
>>>> add --data flag to print data cache statistics
>>>> print read cache stats from __print_data_cache_stats
>>>> print stats about bytes written by NFS
>>>>
>>>> Signed-off-by: Kevin Constantine
>>>> <kevin.constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
>>>> ---
>>>> tools/nfs-iostat/nfs-iostat.py | 28 ++++++++++++++++++++++------
>>>> 1 files changed, 22 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/tools/nfs-iostat/nfs-iostat.py
>>>> b/tools/nfs-iostat/nfs-iostat.py
>>>> index 9626d42..cb3da59 100644
>>>> --- a/tools/nfs-iostat/nfs-iostat.py
>>>> +++ b/tools/nfs-iostat/nfs-iostat.py
>>>> @@ -220,14 +220,22 @@ class DeviceData:
>>>> """Print the data cache hit rate
>>>> """
>>>> nfs_stats = self.__nfs_data
>>>> - app_bytes_read = float(nfs_stats['normalreadbytes'])
>>>> + app_bytes_read = float(nfs_stats['normalreadbytes'] +
>>>> nfs_stats['directreadbytes'])
>>> Again, I object to this particular change. The reason for the
>>> current computation is to retain precision and mitigate the chance of
>>> an overflow. Reversing the sense of this computation (by introducing
>>> the addition above) increases the probability of overflow here.
>>> Remember that over time, these numbers can grow exceptionally large.
>>> If you think you need this change, you do need to provide some
>>> justification for the reorganization in the patch description. Why
>>> is the current code inadequate?
>>
>> I've changed my opinion on how the cache hit ratio should be computed
>> and now agree with you. I was thinking that direct reads should
>> negatively impact the ratio, but that doesn't make a whole lot of sense.
>>
>>>> if app_bytes_read != 0:
>>>> - client_bytes_read = float(nfs_stats['serverreadbytes']
>>>> - nfs_stats['directreadbytes'])
>>>> - ratio = ((app_bytes_read - client_bytes_read) * 100) /
>>>> app_bytes_read
>>>> -
>>>> + cached_read_bytes = float(app_bytes_read -
>>>> float(nfs_stats['serverreadbytes']))
>>>> + ratio = (cached_read_bytes * 100) / app_bytes_read
>>>> print
>>>> - print 'app bytes: %f client bytes %f' %
>>>> (app_bytes_read, client_bytes_read)
>>>> - print 'Data cache hit ratio: %4.2f%%' % ratio
>>> In my opinion, displaying O_DIRECT reads right next to the cache hit
>>> ratio is deceptive. This is why I did not include the O_DIRECT
>>> counts here. They have no impact on the page cache hit ratio, as
>>> O_DIRECT data is never cached in the kernel.
>>
>> I've changed the output considerably this time around. I think it
>> makes more sense, let me know what you think.
>>
>> nani:/vol/pamtest mounted on /data/pamtest:
>>
>> op/s rpc bklog
>> 3452.00 13.16
>> read: ops/s kB/s kB/op
>> retrans avg RTT (ms) avg exe (ms)
>> 3452.000 111407.906 32.273 0
>> (0.0%) 4.633 8.628
>> write: ops/s kB/s kB/op
>> retrans avg RTT (ms) avg exe (ms)
>> 0.000 0.000 0.000 0
>> (0.0%) 0.000 0.000
>>
>> Network Direct
>> I/O Direct % Total
>> Read Write Cache Reads Cache Hit % Read
>> Write Read Write Read Write
>> 107.8750MB 0.0000MB 4.1236MB 3.68% 0.0000MB
>> 0.0000MB 0.00% 0.00% 111.9986MB 0.0000MB
>
> This is a lot better, but I still find the placement of direct I/O
> statistics near the cache hit ratio confusing. Direct reads are a
> totally different beast from normal (cached) reads.
The goal of this output is to display what kind of I/O the client is
performing, and where that data is coming from (be it over the network,
or from cache). I'm not particularly attached to keeping the direct
stats in there, I just figured that it's useful to know that, for
example, 100MB of the 112MB can be attributed to direct i/o. I've
pulled the direct i/o stuff out.
The new output looks like:
Network Total
Read Write Cache Reads Cache Hit % Invalidations Read Write
3585.4648MB 0.0000MB 606.5352MB 14.47% 1 4192.0000MB 0.0000MB
> I also think that
> read caching and write caching behavior are different enough that
> reporting them separately might be warranted.
>
Are you suggesting that instead of printing out Network Reads/writes,
instead there is a way to print out just reads, and just writes? I guess
I'm a little confused by this.
> It occurs to me that another important piece of information here is how
> often the NFS client decides it needs to invalidate page cache pages.
> It doesn't invalidate individual pages; if a file's mtime has changed,
> the client purges all cached data for that file. So, if the client has
> read the data in a page once already, but the cache was invalidated, it
> will read the same page again (whether data in the page has changed on
> the server or not).
>
> I think it would be informative to provide at least the page cache
> invalidation number with the hit ratio, but it could also be interesting
> to see how often the same page is read multiple times. The kernel
> doesn't keep that stat at the moment.
>
> Distinguishing between "cache missed because client had never read the
> page before" and "cache missed because client noticed the file changed"
> might be important, but it might also be difficult to track. For now,
> maybe the best approach is simply to report the page cache invalidation
> stat along with the computed hit ratio?
>
I've printed out the datainvalidate counter, which, as far as I can
tell, is the correct counter.
>>>> + print '%10s %15s %15s %15s %7s' % ("Data Read:", "From
>>>> Server", "From Cache", "Total", "Hit %")
>>>> + print '%10s %13.4fMB %13.4fMB %13.4fMB %6.2f%%' % ("", \
>>>> + float(nfs_stats['serverreadbytes']) /
>>>> 1024.0 / 1024.0, \
>>>> + cached_read_bytes / 1024.0 / 1024.0, \
>>>> + app_bytes_read / 1024.0 / 1024.0, \
>>>> + ratio)
>>> If app_bytes_read is zero, we should print a line here that displays
>>> zeroes, as is done in other areas of this code. I know that nfsstat
>>> quells output lines containing all zeros, but other tools (like
>>> iostat) do not. What ends up happening is that when there is no I/O
>>> activity, no lines are produced at all. That is confusing to users
>>> when nfs-iostat is in the "output one line every N seconds" mode.
>>
>> Agreed. It now prints out zeroes. There are instances where data has
>> been
>> sent through the NFS stack and counted in the serverbytes counter, but
>> haven't made it through the userspace stack and haven't been counted
>> so we
>> get a negative cache hit ratio. In these cases, I'm converting the cache
>> ratio and cache bytes to zero.
>
> I don't see how this can happen. All of the NFS I/O statistics are
> counted in the kernel; there is no "user space stack". "normal" and
> "direct" are counted before the NFS client starts any network operation.
>
> So, maybe there's a bug in the kernel accounting or reporting? Or
> perhaps this behavior is exposed only on SMP?
>
As an example: I mounted a volume, and began a sequential read. The
mountstats proc file bytes counter looks like:
486533058 0 0 0 487096320 0 118920 0
According to this, I've read more from the server than my application
has asked for. Because we calculate cached_byes as application_bytes -
server_bytes, we end up with a negative value.
>>>> + bytes_written_by_nfs = float(nfs_stats['serverwritebytes'])
>>>> + if bytes_written_by_nfs != 0:
>>>> + print
>>>> + print '%13s %12s' % ("Data Written:", "To Server")
>>>> + print '%10s %13.4fMB' % ("", bytes_written_by_nfs / 1024.0
>>>> / 1024.0)
>>> As you are not dividing by bytes_written_by_nfs, there is no
>>> particular reason to gate this output. I don't see the need for the
>>> extra variable here either. You don't have a similar variable in the
>>> read statistics, for example.
>>
>> I was thinking the gate was there to prevent non-zero values from
>> printing.
>> This has been fixed.
>>
>>> You included application and direct read bytes in your read
>>> statistics, but not here. Why not?
>>
>> The new output prints Network reads/writes, direct reads/writes, total
>> reads/writes, cache reads, cache ratio and direct i/o ratios for
>> reads/writes.
>>
>> New Patch below:
[...]
>> nfs_stats['directreadbytes']) / 1024.0 / 1024.0
>> + total_writes_MB = float(nfs_stats['normalwritebytes'] +
>> nfs_stats['directwritebytes']) / \
>> + 1024.0 / 1024.0
>
> Just a nit: the indentation here is uneven. Given this is Python, this
> could be a little more than just a cosmetic problem over the long run.
>
fixed
> I always find it a challenge to avoid using tabs in Python scripts.
>
Out of curiosity, why were nfs-iostats.py and mountstats.py written in
python instead of c like the rest of nfs-utils?
>> +
>> + """ Bytes may have been counted by the NFS stack but haven't
>> been counted by the
[...]
The next iteration of the patch below:
add --data flag to print data cache statistics
--data flag prints Network reads/writes, cache reads, cache hit%, Cache invalidation count, Total reads/Writes
Signed-off-by: Kevin Constantine <kevin.constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
---
tools/nfs-iostat/nfs-iostat.py | 43 ++++++++++++++++++++++++++++++++++++---
1 files changed, 39 insertions(+), 4 deletions(-)
diff --git a/tools/nfs-iostat/nfs-iostat.py b/tools/nfs-iostat/nfs-iostat.py
index 9626d42..e8bb2fb 100644
--- a/tools/nfs-iostat/nfs-iostat.py
+++ b/tools/nfs-iostat/nfs-iostat.py
@@ -220,14 +220,41 @@ class DeviceData:
"""Print the data cache hit rate
"""
nfs_stats = self.__nfs_data
+ client_bytes_read = 0
+ ratio = 0
+ direct_read_ratio = 0
+ direct_write_ratio = 0
+
app_bytes_read = float(nfs_stats['normalreadbytes'])
+ client_bytes_read = float(nfs_stats['serverreadbytes'] - nfs_stats['directreadbytes'])
+ direct_read_bytes = float(nfs_stats['directreadbytes'])
+ direct_write_bytes = float(nfs_stats['directwritebytes'])
if app_bytes_read != 0:
- client_bytes_read = float(nfs_stats['serverreadbytes'] - nfs_stats['directreadbytes'])
ratio = ((app_bytes_read - client_bytes_read) * 100) / app_bytes_read
+ if direct_read_bytes != 0:
+ direct_read_ratio = float(nfs_stats['directreadbytes']) * 100 / \
+ (app_bytes_read + direct_read_bytes)
+ if direct_write_bytes != 0:
+ direct_write_ratio = float(nfs_stats['directwritebytes']) * 100 / \
+ (direct_write_bytes + float(nfs_stats['normalwritebytes']))
+
+ network_reads_MB = float(nfs_stats['serverreadbytes']) / 1024.0 / 1024.0
+ network_writes_MB = float(nfs_stats['serverwritebytes']) / 1024.0 / 1024.0
+ cached_reads_MB = float(app_bytes_read - client_bytes_read) / 1024.0 / 1024.0
+ direct_reads_MB = float(nfs_stats['directreadbytes']) / 1024.0 / 1024.0
+ direct_writes_MB = float(nfs_stats['directwritebytes']) / 1024.0 / 1024.0
+ total_reads_MB = float(app_bytes_read + nfs_stats['directreadbytes']) / 1024.0 / 1024.0
+ total_writes_MB = float(nfs_stats['normalwritebytes'] + nfs_stats['directwritebytes']) / \
+ 1024.0 / 1024.0
- print
- print 'app bytes: %f client bytes %f' % (app_bytes_read, client_bytes_read)
- print 'Data cache hit ratio: %4.2f%%' % ratio
+ print
+ print '%18s %68s' % ("Network", "Total")
+ print '%10s %12s %14s %14s %15s %10s %12s' % \
+ ("Read", "Write", "Cache Reads", "Cache Hit %", \
+ "Invalidations", "Read", "Write")
+ print '%10.4fMB %10.4fMB %10.4fMB %13.2f%% %15d %10.4fMB %10.4fMB' % ( \
+ network_reads_MB, network_writes_MB, cached_reads_MB, ratio, \
+ nfs_stats['datainvalidates'], total_reads_MB, total_writes_MB)
def __print_attr_cache_stats(self, sample_time):
"""Print attribute cache efficiency stats
@@ -390,6 +417,10 @@ class DeviceData:
self.__print_rpc_op_stats('READ', sample_time)
self.__print_rpc_op_stats('WRITE', sample_time)
self.__print_page_stats(sample_time)
+ elif which == 4:
+ self.__print_rpc_op_stats('READ', sample_time)
+ self.__print_rpc_op_stats('WRITE', sample_time)
+ self.__print_data_cache_stats()
#
# Functions
@@ -487,6 +518,10 @@ def iostat_command(name):
if arg in ['-p', '--page']:
which = 3
continue
+
+ if arg in ['--data']:
+ which = 4
+ continue
if arg == sys.argv[0]:
continue
--
1.6.2.1
next prev parent reply other threads:[~2009-04-30 1:57 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-21 2:03 Add data cache statistics to nfs-iostat.py Kevin Constantine
[not found] ` <1240279414-30528-1-git-send-email-kevin.constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
2009-04-21 2:03 ` [PATCH 1/2] nfs-iostat.py: Print Data Cache Statistics Kevin Constantine
[not found] ` <1240279414-30528-2-git-send-email-kevin.constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
2009-04-21 2:03 ` [PATCH 2/2] nfs-iostat.py: Added bytes written output Kevin Constantine
[not found] ` <1240279414-30528-3-git-send-email-kevin.constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
2009-04-21 14:26 ` Chuck Lever
2009-04-21 17:57 ` Kevin Constantine
[not found] ` <49EE0901.2040704-P5ys19MLBK/QT0dZR+AlfA@public.gmane.org>
2009-04-21 19:00 ` Chuck Lever
2009-04-21 20:39 ` Kevin Constantine
2009-04-21 20:46 ` Chuck Lever
2009-04-21 14:26 ` [PATCH 1/2] nfs-iostat.py: Print Data Cache Statistics Chuck Lever
2009-04-21 17:54 ` Kevin Constantine
[not found] ` <49EE083C.6010105-P5ys19MLBK/QT0dZR+AlfA@public.gmane.org>
2009-04-21 18:58 ` Chuck Lever
2009-04-22 1:42 ` [PATCH] " Kevin Constantine
[not found] ` <1240364531-11499-1-git-send-email-kevin.constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
2009-04-22 18:38 ` Chuck Lever
2009-04-25 3:16 ` Kevin Constantine
[not found] ` <49F28096.9060300-P5ys19MLBK/QT0dZR+AlfA@public.gmane.org>
2009-04-27 15:33 ` Chuck Lever
2009-04-30 1:57 ` Kevin Constantine [this message]
2009-04-30 14:43 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49F9057F.40105@disney.com \
--to=kevin.constantine-ffnkgbshergpb8w63bluukeocmrvltnr@public.gmane.org \
--cc=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox