From: Anna Schumaker <Anna.Schumaker@netapp.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3 3/3] NFSD: Add support for encoding multiple segments
Date: Wed, 18 Mar 2015 14:16:29 -0400
Message-ID: <5509C0FD.70309@Netapp.com>
In-Reply-To: <20150317213654.GE29843@fieldses.org>
On 03/17/2015 05:36 PM, J. Bruce Fields wrote:
> On Tue, Mar 17, 2015 at 04:07:38PM -0400, J. Bruce Fields wrote:
>> On Tue, Mar 17, 2015 at 03:56:33PM -0400, J. Bruce Fields wrote:
>>> On Mon, Mar 16, 2015 at 05:18:08PM -0400, Anna Schumaker wrote:
>>>> This patch implements sending an array of segments back to the client.
>>>> Clients should be prepared to handle multiple segment reads to make this
>>>> useful. We try to splice the first data segment into the XDR result,
>>>> and remaining segments are encoded directly.
>>>
>>> I'm still interested in what would happen if we started with an
>>> implementation like:
>>>
>>> - if the entire requested range falls within a hole, return that
>>> single hole.
>>> - otherwise, just treat the thing as one big data segment.
>>>
>>> That would provide a benefit in the case there are large-ish holes
>>> with minimal impact otherwise.
>>>
>>> (Though patches for full support are still useful even if only for
>>> client-testing purposes.)
>>
>> Also, looks like
>>
>> xfs_io -c "fiemap -v" <file>
>>
>> will give hole sizes for a given <file>. (Thanks, esandeen.) Running
>> that on a few of my test vm images shows a fair number of large
>> (hundreds of megs) holes, which suggests identifying only >=rwsize holes
>> might still be useful.
>
> Just for fun.... I wrote the following test program and ran it on my
> collection of testing vm's. Some looked like this:
>
> f21-1.qcow2
> 144784 -rw-------. 1 qemu qemu 8591507456 Mar 16 10:13 f21-1.qcow2
> total hole bytes: 8443252736 (98%)
> in aligned 1MB chunks: 8428453888 (98%)
>
> So, basically, read_plus would save transferring most of the data even
> when only handling 1MB holes.
>
> But some looked like this:
>
> 501524 -rw-------. 1 qemu qemu 8589934592 May 20 2014 rhel6-1-1.img
> total hole bytes: 8077516800 (94%)
> in aligned 1MB chunks: 0 (0%)
>
> So the READ_PLUS that caught every hole might save a lot, the one that
> only caught 1MB holes wouldn't help at all.
>
> And there were lots of examples in between those two extremes.

I tested with three different 512 MB files: 100% data, 100% hole, and
alternating data and hole every megabyte. The results were surprising:

       |  v4.1  |  v4.2
-------+--------+---------
 data  | 0.685s | 0.714s
 hole  | 0.485s | 15.547s
 mixed | 1.283s | 0.448s

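For anyone wanting to reproduce the mixed case, a file that alternates data and hole every megabyte could be generated along these lines. This is only a sketch: make_alternating_file() and the MB macro are names made up for illustration, and whether the unwritten megabytes really end up as holes depends on the filesystem.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MB (1024 * 1024)

/*
 * Hypothetical helper: create a file of total_mb megabytes in which
 * even-numbered megabytes hold data and odd-numbered megabytes are never
 * written (they become holes on filesystems that support sparse files).
 * Returns 0 on success, -1 on error.
 */
int make_alternating_file(const char *path, int total_mb)
{
        static char buf[MB];
        int fd, i;

        memset(buf, 0xab, sizeof(buf)); /* non-zero payload */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1)
                return -1;
        for (i = 0; i < total_mb; i += 2) {
                /* A short write is treated as a failure here. */
                if (pwrite(fd, buf, MB, (off_t)i * MB) != MB)
                        goto fail;
        }
        /* Extend to the full size so that a trailing hole is included. */
        if (ftruncate(fd, (off_t)total_mb * MB) == -1)
                goto fail;
        return close(fd);
fail:
        close(fd);
        return -1;
}
```

Calling it with total_mb = 512 would give a file shaped like the mixed one above. The 100% hole file needs only the ftruncate() step.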
From what I can tell, the 100% hole case takes so long because of the SEEK_DATA call in nfsd4_encode_read_plus_hole(). I took this out to trick the function into thinking that the entire file was already a hole, and the runtime dropped to the levels of v4.1 and v4.2. I wonder if this is filesystem dependent? My server is exporting ext4.
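
One rough way to check the filesystem question would be to time the same kind of SEEK_DATA call from userspace, directly on the server, against files on different filesystems. The sketch below is not the nfsd code path (that goes through the kernel's llseek rather than lseek(2)), and timed_seek_data() is a made-up name, so treat the numbers it gives as a hint only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue one lseek(SEEK_DATA) starting at `off` and
 * record how long it took. Returns the offset of the next data byte, or
 * -1 with errno set (ENXIO means there is no data at or after `off`).
 * The elapsed wall-clock time in seconds is stored in *elapsed.
 */
off_t timed_seek_data(int fd, off_t off, double *elapsed)
{
        struct timespec t0, t1;
        off_t pos;
        int saved;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        pos = lseek(fd, off, SEEK_DATA);
        saved = errno;          /* keep lseek's errno across the next call */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        *elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        errno = saved;
        return pos;
}
```

Running this at offset 0 on a fully sparse 512 MB file on ext4 and on, say, XFS would show whether a single probe is expensive, or whether the cost comes from making the call once per READ_PLUS request.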
Anna
>
> (But, check my math, I haven't tested this carefully.)
>
> --b.
>
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <errno.h>
> #include <err.h>
>
> long round_up(long n, long b)
> {
>         return ((n + b - 1)/b) * b;
> }
>
> long round_down(long n, long b)
> {
>         return (n/b) * b;
> }
>
> long hbytes = 0;
> long rplusbytes = 0;
>
> void do_stats(off_t hole_start, off_t hole_end)
> {
>         off_t hole_start_up, hole_end_down;
>
>         /* Shrink the hole to the 1MB-aligned range it fully covers. */
>         hole_start_up = round_up(hole_start, 1024*1024);
>         hole_end_down = round_down(hole_end, 1024*1024);
>
>         hbytes += hole_end - hole_start;
>         if (hole_start_up < hole_end_down)
>                 rplusbytes += hole_end_down - hole_start_up;
> }
>
> int main(int argc, char *argv[])
> {
>         off_t hole_start, hole_end;
>         int fd;
>         char *name;
>
>         /* Map out holes with SEEK_HOLE, SEEK_DATA */
>         /* Useful statistics:
>          *      - what percentage of file is in holes?
>          *      - what percentage of file would be skipped if we read it
>          *        sequentially in 1MB chunks?
>          */
>
>         if (argc != 2)
>                 errx(1, "usage: %s <filename>", argv[0]);
>         name = argv[1];
>         fd = open(name, O_RDONLY);
>         if (fd == -1)
>                 err(1, "open");
>
>         hole_end = 0;
>         while (1) {
>                 hole_start = lseek(fd, hole_end, SEEK_HOLE);
>                 if (hole_start == -1)
>                         err(1, "lseek");
>                 hole_end = lseek(fd, hole_start, SEEK_DATA);
>                 if (hole_end == -1) {
>                         /* ENXIO: no data after hole_start, so the hole runs to EOF */
>                         if (errno == ENXIO)
>                                 break;
>                         err(1, "lseek");
>                 }
>                 do_stats(hole_start, hole_end);
>         }
>         /* Account for the final hole, which ends at EOF. */
>         hole_end = lseek(fd, 0, SEEK_END);
>         do_stats(hole_start, hole_end);
>         printf("total hole bytes: %ld (%.0f%%)\n", hbytes,
>                 100 * (float)hbytes/hole_end);
>         printf("in aligned 1MB chunks: %ld (%.0f%%)\n", rplusbytes,
>                 100 * (float)rplusbytes/hole_end);
>         return 0;
> }
>
Thread overview: 75+ messages
2015-03-16 21:18 [PATCH v3 0/3] NFSD: Add READ_PLUS support Anna Schumaker
2015-03-16 21:18 ` [PATCH v3 1/3] NFSD: nfsd4_encode_read{v}() should encode eof and maxcount Anna Schumaker
2015-03-16 21:18 ` [PATCH v3 2/3] NFSD: Add basic READ_PLUS support Anna Schumaker
2015-03-16 21:18 ` [PATCH v3 3/3] NFSD: Add support for encoding multiple segments Anna Schumaker
2015-03-17 19:56 ` J. Bruce Fields
2015-03-17 20:07 ` J. Bruce Fields
2015-03-17 21:36 ` J. Bruce Fields
2015-03-18 18:16 ` Anna Schumaker [this message]
2015-03-18 18:55 ` J. Bruce Fields
2015-03-18 20:39 ` Anna Schumaker
2015-03-18 20:55 ` J. Bruce Fields
2015-03-18 21:03 ` Anna Schumaker
2015-03-18 21:11 ` J. Bruce Fields
[not found] ` <OFB111A6D8.016B8BD5-ON88257E0D.001D174D-88257E0D.005268D6@us.ibm.com>
2015-03-19 15:36 ` J. Bruce Fields
2015-03-19 16:28 ` Marc Eshel
2015-03-20 15:17 ` J. Bruce Fields
2015-03-20 16:23 ` Christoph Hellwig
2015-03-20 18:26 ` J. Bruce Fields
2015-03-24 12:43 ` Anna Schumaker
2015-03-24 17:49 ` Christoph Hellwig
2015-03-25 17:15 ` Anna Schumaker
2015-03-26 15:21 ` Anna Schumaker
2015-03-26 15:32 ` Trond Myklebust
2015-03-26 15:36 ` Anna Schumaker
2015-03-26 15:38 ` J. Bruce Fields
2015-03-26 15:47 ` Anna Schumaker
2015-03-26 16:06 ` Trond Myklebust
2015-03-26 16:11 ` Anna Schumaker
2015-03-26 16:13 ` Trond Myklebust
2015-03-26 16:14 ` Anna Schumaker
2015-03-27 19:04 ` Anna Schumaker
2015-03-27 20:22 ` Trond Myklebust
2015-03-27 20:46 ` Anna Schumaker
2015-03-27 20:54 ` J. Bruce Fields
2015-03-27 20:55 ` Anna Schumaker
2015-03-27 21:08 ` J. Bruce Fields
2015-04-15 19:32 ` Anna Schumaker
2015-04-15 19:56 ` J. Bruce Fields
2015-04-15 20:00 ` J. Bruce Fields
2015-04-15 22:50 ` Dave Chinner
2015-04-17 22:07 ` J. Bruce Fields
2015-04-15 22:57 ` Dave Chinner
2015-03-26 16:11 ` J. Bruce Fields
2015-03-26 16:18 ` Anna Schumaker
2015-03-30 14:06 ` Christoph Hellwig