From: Jeff Layton <jlayton@poochiereds.net>
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
	lustre-devel@lists.lustre.org,
	v9fs-developer@lists.sourceforge.net
Subject: Re: [PATCH v4 2/2] ceph: switch DIO code to use iov_iter_get_pages_alloc
Date: Mon, 30 Jan 2017 10:40:44 -0500
Message-ID: <1485790844.2671.8.camel@poochiereds.net>
In-Reply-To: <20170127132451.6601-3-jlayton@redhat.com>

On Fri, 2017-01-27 at 08:24 -0500, Jeff Layton wrote:
> xfstest generic/095 triggers soft lockups in kcephfs. It uses fio to
> drive some I/O via vmsplice and splice. Ceph then ends up trying to
> access an ITER_BVEC type iov_iter as an ITER_IOVEC one. That causes it to
> pick up a wrong offset and get stuck in an infinite loop while trying to
> populate the page array. dio_get_pagev_size has a similar problem.
> 
> Now that iov_iter_get_pages_alloc doesn't stop after the first vector in
> the array, we can just call it instead and dump the old code that tried
> to do the same thing.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/ceph/file.c | 75 +++-------------------------------------------------------
>  1 file changed, 3 insertions(+), 72 deletions(-)
> 
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 045d30d26624..0ce79f1eabbc 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -35,75 +35,6 @@
>   */
>  
>  /*
> - * Calculate the length sum of direct io vectors that can
> - * be combined into one page vector.
> - */
> -static size_t dio_get_pagev_size(const struct iov_iter *it)
> -{
> -    const struct iovec *iov = it->iov;
> -    const struct iovec *iovend = iov + it->nr_segs;
> -    size_t size;
> -
> -    size = iov->iov_len - it->iov_offset;
> -    /*
> -     * An iov can be page vectored when both the current tail
> -     * and the next base are page aligned.
> -     */
> -    while (PAGE_ALIGNED((iov->iov_base + iov->iov_len)) &&
> -           (++iov < iovend && PAGE_ALIGNED((iov->iov_base)))) {
> -        size += iov->iov_len;
> -    }
> -    dout("dio_get_pagevlen len = %zu\n", size);
> -    return size;
> -}
> -
> -/*
> - * Allocate a page vector based on (@it, @nbytes).
> - * The return value is the tuple describing a page vector,
> - * that is (@pages, @page_align, @num_pages).
> - */
> -static struct page **
> -dio_get_pages_alloc(const struct iov_iter *it, size_t nbytes,
> -		    size_t *page_align, int *num_pages)
> -{
> -	struct iov_iter tmp_it = *it;
> -	size_t align;
> -	struct page **pages;
> -	int ret = 0, idx, npages;
> -
> -	align = (unsigned long)(it->iov->iov_base + it->iov_offset) &
> -		(PAGE_SIZE - 1);
> -	npages = calc_pages_for(align, nbytes);
> -	pages = kmalloc(sizeof(*pages) * npages, GFP_KERNEL);
> -	if (!pages) {
> -		pages = vmalloc(sizeof(*pages) * npages);
> -		if (!pages)
> -			return ERR_PTR(-ENOMEM);
> -	}
> -
> -	for (idx = 0; idx < npages; ) {
> -		size_t start;
> -		ret = iov_iter_get_pages(&tmp_it, pages + idx, nbytes,
> -					 npages - idx, &start);
> -		if (ret < 0)
> -			goto fail;
> -
> -		iov_iter_advance(&tmp_it, ret);
> -		nbytes -= ret;
> -		idx += (ret + start + PAGE_SIZE - 1) / PAGE_SIZE;
> -	}
> -
> -	BUG_ON(nbytes != 0);
> -	*num_pages = npages;
> -	*page_align = align;
> -	dout("dio_get_pages_alloc: got %d pages align %zu\n", npages, align);
> -	return pages;
> -fail:
> -	ceph_put_page_vector(pages, idx, false);
> -	return ERR_PTR(ret);
> -}
> -
> -/*
>   * Prepare an open request.  Preallocate ceph_cap to avoid an
>   * inopportune ENOMEM later.
>   */
> @@ -923,7 +854,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
>  	}
>  
>  	while (iov_iter_count(iter) > 0) {
> -		u64 size = dio_get_pagev_size(iter);
> +		u64 size = iov_iter_count(iter);
>  		size_t start = 0;
>  		ssize_t len;
>  
> @@ -943,13 +874,13 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
>  			break;
>  		}
>  
> -		len = size;
> -		pages = dio_get_pages_alloc(iter, len, &start, &num_pages);
> +		len = iov_iter_get_pages_alloc(iter, &pages, size, &start);
>  		if (IS_ERR(pages)) {
>  			ceph_osdc_put_request(req);
>  			ret = PTR_ERR(pages);
>  			break;
>  		}
> +		num_pages = DIV_ROUND_UP(len, PAGE_SIZE);

Sigh, this should be:

    num_pages = DIV_ROUND_UP(len + start, PAGE_SIZE);

The data begins "start" bytes into the first page, so that offset has
to be counted along with the length. It is a simple thing to determine,
but rather easy to get wrong.

Maybe we should have iov_iter_get_pages_alloc also return the number of
pages? Not having to do a DIV_ROUND_UP on every call into it would be
nice, and all of the callers need that value anyway.
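
To make the difference concrete, here is a minimal userspace sketch of
the two calculations. It is not kernel code: EX_PAGE_SIZE (assumed to
be 4096), EX_DIV_ROUND_UP and both helper names are made up for
illustration and only mirror what PAGE_SIZE and DIV_ROUND_UP do.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE is per-architecture. */
#define EX_PAGE_SIZE 4096UL

/* Userspace stand-in for the kernel's DIV_ROUND_UP macro. */
#define EX_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* What the v4 patch computes: ignores the offset into the first page. */
static size_t ex_pages_without_start(size_t len)
{
	return EX_DIV_ROUND_UP(len, EX_PAGE_SIZE);
}

/* Corrected count: the data begins 'start' bytes into the first page. */
static size_t ex_pages_with_start(size_t len, size_t start)
{
	return EX_DIV_ROUND_UP(len + start, EX_PAGE_SIZE);
}
```

With len = 4096 and start = 512, the first helper returns 1 and the
second returns 2, since the transfer straddles a page boundary. The two
only agree when start is small enough that len + start stays within the
same number of pages as len alone.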

>  
>  		/*
>  		 * To simplify error handling, allow AIO when IO within i_size

-- 
Jeff Layton <jlayton@poochiereds.net>
