linux-nfs.vger.kernel.org archive mirror
* regression in DIO write behavior
@ 2017-01-24 15:44 Jeff Layton
  2017-01-24 17:23 ` Weston Andros Adamson
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Layton @ 2017-01-24 15:44 UTC
  To: Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 721 bytes --]

I've noticed a probable regression in recent kernels. When I run the
attached program on an older kernel (I used 2.6.32-642.6.2.el6.x86_64),
I see the kernel generate wsize-sized WRITE calls on the wire.

When I run the same program on a more modern kernel (mainline as of
today), it generates a ton of page-sized I/Os instead. I've verified
that iov_iter_get_pages_alloc is returning a wsize-sized array of pages;
it just seems like the request handling code isn't stitching them
together like it should.

Is this an expected change or a regression? I'm guessing the latter, and
that it might have crept in during the pageio rework from a couple of
years ago.

Any idea where the bug might be?
-- 
Jeff Layton <jlayton@redhat.com>

[-- Attachment #2: diotest2.c --]
[-- Type: text/x-csrc, Size: 814 bytes --]

#define _GNU_SOURCE

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>
#include <stdlib.h>

#define NUM_PAGES	(256)

int main(int argc, char **argv)
{
	int ret, fd;
	ssize_t written;
	long pagesize;
	struct iovec	iov;

	if (argc < 2) {
		fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
		return 1;
	}

	pagesize = sysconf(_SC_PAGESIZE);
	if (pagesize < 0) {
		perror("sysconf");
		return 1;
	}

	/* O_DIRECT I/O, hence the page-aligned buffer allocated below. */
	fd = open(argv[1], O_CREAT|O_WRONLY|O_DIRECT, 0666);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* posix_memalign() returns the error number and does not set errno. */
	ret = posix_memalign(&iov.iov_base, pagesize, pagesize * NUM_PAGES);
	if (ret) {
		fprintf(stderr, "posix_memalign: %s\n", strerror(ret));
		return 1;
	}
	iov.iov_len = pagesize * NUM_PAGES;

	/* The buffer is never written to before it is handed to writev(). */
	written = writev(fd, &iov, 1);
	if (written < 0) {
		perror("writev");
		return 1;
	}

	return 0;
}


* Re: regression in DIO write behavior
  2017-01-24 15:44 regression in DIO write behavior Jeff Layton
@ 2017-01-24 17:23 ` Weston Andros Adamson
  2017-01-24 17:50   ` Jeff Layton
  2017-01-24 19:46   ` Jeff Layton
  0 siblings, 2 replies; 4+ messages in thread
From: Weston Andros Adamson @ 2017-01-24 17:23 UTC
  To: Jeffrey Layton; +Cc: linux-nfs list

Hey Jeff,

That sounds like a regression to me. I don't think it's been around since the
pgio rework, but maybe?

-dros




* Re: regression in DIO write behavior
  2017-01-24 17:23 ` Weston Andros Adamson
@ 2017-01-24 17:50   ` Jeff Layton
  2017-01-24 19:46   ` Jeff Layton
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff Layton @ 2017-01-24 17:50 UTC
  To: Weston Andros Adamson; +Cc: linux-nfs list, Scott Mayhew

On Tue, 2017-01-24 at 12:23 -0500, Weston Andros Adamson wrote:
> Hey Jeff,
> 
> That sounds like a regression to me. I don't think it's been around since the
> pgio rework, but maybe?
> 
> -dros
> 

I certainly could be wrong. :)

I did open this bug, and we'll track it down there:

https://bugzilla.redhat.com/show_bug.cgi?id=1416127

Looks like Scott bisected it down in RHEL7 kernels so we should be able
to ID it from there.

Cheers,
Jeff

-- 
Jeff Layton <jlayton@redhat.com>


* Re: regression in DIO write behavior
  2017-01-24 17:23 ` Weston Andros Adamson
  2017-01-24 17:50   ` Jeff Layton
@ 2017-01-24 19:46   ` Jeff Layton
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff Layton @ 2017-01-24 19:46 UTC
  To: Weston Andros Adamson; +Cc: linux-nfs list

On Tue, 2017-01-24 at 12:23 -0500, Weston Andros Adamson wrote:
> Hey Jeff,
> 
> That sounds like a regression to me. I don't think it's been around since the
> pgio rework, but maybe?
> 
> -dros
> 

Ahh, I think I might get it now and it's not as bad as I had originally
feared...

If you dirty all of the pages before writing, it seems to coalesce them
correctly. The reproducer allocates pages, but doesn't actually dirty
them before writing them. Apparently the allocator is setting up the
mapping such that each page offset address in the allocation points to
the same page. I imagine it's then setting up that page for CoW.
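
For anyone wanting to try the "dirty the pages first" variant, here is a
minimal sketch of a replacement for the allocation step in diotest2.c. The
helper name alloc_dirtied_pages and the 0xa5 fill byte are made up for
illustration and are not part of the original attachment. The assumption is
that storing to every page before the write makes the kernel back each
virtual page with its own physical page.

```c
#define _GNU_SOURCE

#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of diotest2.c): allocate npages
 * page-aligned pages and store to every one of them before the buffer
 * is handed to writev(). Returns NULL on failure.
 */
static void *alloc_dirtied_pages(size_t npages, size_t pagesize)
{
	void *buf;

	/* posix_memalign() returns non-zero on failure. */
	if (posix_memalign(&buf, pagesize, npages * pagesize))
		return NULL;

	/* The fill value is arbitrary; the store itself is what matters. */
	memset(buf, 0xa5, npages * pagesize);
	return buf;
}
```

In the reproducer, iov.iov_base would be set from
alloc_dirtied_pages(NUM_PAGES, pagesize) instead of calling posix_memalign
directly, with iov.iov_len left unchanged.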

So we end up in this test in nfs_can_coalesce_requests and hit the
return false:

                if (req->wb_page == prev->wb_page) {
                        if (req->wb_pgbase != prev->wb_pgbase + prev->wb_bytes)
                                return false;

I think that's in place to handle sub-page write requests, but maybe we
should consider doing that a different way for DIO?
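
To make the failure mode concrete, here is a rough user-space model of just
the branch quoted above. struct req_model and same_page_check are
hypothetical stand-ins, the field types are assumptions, and every other
condition in nfs_can_coalesce_requests is assumed to pass. It is a sketch of
the comparison, not the kernel code.

```c
#include <stdbool.h>

/*
 * Simplified stand-in for the kernel's request structure, carrying only
 * the three fields the quoted test looks at. Field types are assumed.
 */
struct req_model {
	const void *wb_page;	/* identity of the backing page */
	unsigned int wb_pgbase;	/* offset of the data within that page */
	unsigned int wb_bytes;	/* number of bytes in the request */
};

/*
 * Models only the same-page branch: two requests on the same page may be
 * merged only if the second starts exactly where the first one ends.
 */
static bool same_page_check(const struct req_model *prev,
			    const struct req_model *req)
{
	if (req->wb_page == prev->wb_page) {
		if (req->wb_pgbase != prev->wb_pgbase + prev->wb_bytes)
			return false;
	}
	return true;
}
```

With two full-page requests that both point at the same page (wb_pgbase 0,
wb_bytes equal to the page size), the second request's wb_pgbase of 0 does
not equal the end of the first, so the check returns false. Two sub-page
requests on one page that sit back to back, or two requests on different
pages, pass this branch.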
-- 
Jeff Layton <jlayton@redhat.com>

