linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
	lustre-devel@lists.lustre.org,
	v9fs-developer@lists.sourceforge.net,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, Chris Wilson <chris@chris-wilson.co.uk>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Jeff Layton <jlayton@redhat.com>
Subject: Re: [PATCH v3 0/2] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call
Date: Sat, 4 Feb 2017 19:26:55 +0000	[thread overview]
Message-ID: <20170204192655.GA13195@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20170204030842.GL27291@ZenIV.linux.org.uk>

On Sat, Feb 04, 2017 at 03:08:42AM +0000, Al Viro wrote:
> On Thu, Feb 02, 2017 at 09:51:25AM +0000, Al Viro wrote:
> 
> > 	* fuse_copy_fill().  I'm not at all sure that iov_iter_get_pages()
> > is a good idea there - fuse_copy_do() could bloody well just use
> > copy_{to,from}_iter().
> 
> Miklos, could you explain why does lock_request() prohibit page faults until
> the matching unlock_request()?  All it does is setting FR_LOCKED on
> our request and the only thing that even looks at that is fuse_abort_conn(),
> which doesn't (AFAICS) wait for anything.
> 
> Where does the deadlock come from, and if it's not a deadlock - what is
> it?  Or is that comment stale since "fuse: simplify request abort"?

While we are at it...  How can fuse_copy_page() ever get called with
*pagep == NULL?  AFAICS, for that to happen you need either
	i < req->num_pages && req->pages[i] == NULL  
or in fuse_notify_store() have
                err = fuse_copy_page(cs, &page, offset, this_num, 0);
reached with page == NULL.  The latter is flat-out impossible - we have
                if (!page)
                        goto out_iput;

                this_num = min_t(unsigned, num, PAGE_SIZE - offset);
immediately before the call in question.  As for the former...  I don't see
any place where we would increase ->num_pages without having all additional
->pages[] elements guaranteed to be non-NULL.  Stores to ->num_pages are

in cuse_send_init():
	req->num_pages = 1;
with
        req->pages[0] = page;
shortly before that and
        if (!page)
                goto err_put_req;
earlier.

In fuse_retrieve():
                if (!page)
                        break;

                this_num = min_t(unsigned, num, PAGE_SIZE - offset);
                req->pages[req->num_pages] = page;
                req->page_descs[req->num_pages].length = this_num;
                req->num_pages++;

In fuse_readdir():
        req->num_pages = 1;
        req->pages[0] = page;
with
        if (!page) {
                fuse_put_request(fc, req);
                return -ENOMEM;
        }
several lines above.

In fuse_do_readpage():
        req->num_pages = 1;
        req->pages[0] = page;
with page dereferenced earlier.

In fuse_fill_write_pages():
                req->pages[req->num_pages] = page;
                req->page_descs[req->num_pages].length = tmp;
                req->num_pages++;
with
                if (!page)
                        break;
earlier.

In fuse_get_user_pages():
                ret = iov_iter_get_pages(ii, &req->pages[req->num_pages],
                                        *nbytesp - nbytes, 
                                        req->max_pages - req->num_pages,
                                        &start);
                if (ret < 0)
                        break;

                iov_iter_advance(ii, ret);
                nbytes += ret;

                ret += start;
                npages = (ret + PAGE_SIZE - 1) / PAGE_SIZE;

                req->page_descs[req->num_pages].offset = start;
                fuse_page_descs_length_init(req, req->num_pages, npages);

                req->num_pages += npages;
which also shouldn't leave any NULLs in the area in question.

In fuse_writepage_locked():
        req->num_pages = 1;
        req->pages[0] = tmp_page;
with
        if (!tmp_page)
                goto err_free;
done earlier.

In fuse_writepage_in_flight():
	req->num_pages = 1;
with
        BUG_ON(new_req->num_pages != 0);
earlier and
        req->pages[req->num_pages] = tmp_page;
done in the caller (which passes req as new_req).  Earlier in the caller
we have
        if (!tmp_page)
                goto out_unlock;

In fuse_writepages_fill():
		req->num_pages = 0;
is obviously OK and
        req->num_pages++;
in the end of the same function is preceded by the same
        req->pages[req->num_pages] = tmp_page;
(fuse_writepages_fill() is the caller of fuse_writepage_in_flight();
reassignment in fuse_writepage_in_flight() happens only in case when
it returns 1 and in that case we don't reach the increment in
fuse_writepages_fill() at all).

In fuse_do_ioctl():
        memcpy(req->pages, pages, sizeof(req->pages[0]) * num_pages);
        req->num_pages = num_pages;
and all increments of num_pages in there are
                pages[num_pages] = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
                if (!pages[num_pages])
                        goto out;
                num_pages++;
so the array we copy into req->pages has the same property wrt num_pages.

req->pages is assigned only in fuse_request_init(); that gets called
in two places - one is at request allocation time, another (from
put_reserved_req()) passes the current req->pages value, so that leaves it
unchanged.  The contents of req->pages[] is changed only in the aforementioned
places + fuse_request_init(), which zeros ->num_pages + fuse_copy_page()
called from fuse_copy_pages() with &req->pages[i] as argument.  The last
one can modify the damn thing, but only if it hits
                                err = fuse_try_move_page(cs, pagep);
and that sucker never stores a NULL in there -
                *pagep = newpage;
with get_page(newpage) upstream from that point.

What am I missing here?  Looks like those checks in fuse_copy_page() are
dead code...  They had been there since the initial merge, but AFAICS
they were just as useless in 2.6.14.  Rudiments of some prehistorical stuff
that never had been cleaned out, perhaps?

  reply	other threads:[~2017-02-04 19:27 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-24 21:23 [PATCH] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call Jeff Layton
2017-01-25 13:32 ` [PATCH v3 0/2] " Jeff Layton
2017-01-25 13:32   ` [PATCH v3 1/2] " Jeff Layton
2017-01-26 12:35     ` Jeff Layton
2017-01-27 13:24       ` [PATCH v4 0/2] " Jeff Layton
2017-01-27 13:24         ` [PATCH v4 1/2] " Jeff Layton
2017-01-27 13:24         ` [PATCH v4 2/2] ceph: switch DIO code to use iov_iter_get_pages_alloc Jeff Layton
2017-01-30 15:40           ` Jeff Layton
2017-01-25 13:32   ` [PATCH v3 " Jeff Layton
2017-02-02  9:51   ` [PATCH v3 0/2] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call Al Viro
2017-02-02 10:56     ` Christoph Hellwig
2017-02-02 11:16       ` Al Viro
2017-02-02 13:00         ` Jeff Layton
2017-02-03  7:29           ` Al Viro
2017-02-03 18:29             ` Linus Torvalds
2017-02-03 19:08               ` Al Viro
2017-02-03 19:28                 ` Linus Torvalds
2017-02-13  9:56                   ` Steve Capper
2017-02-13 21:40                     ` Linus Torvalds
2017-02-03  7:49           ` Christoph Hellwig
2017-02-03  8:54             ` Al Viro
2017-02-03 11:09               ` Christoph Hellwig
2017-02-02 14:48     ` Jan Kara
2017-02-02 18:28       ` Al Viro
2017-02-03 14:47         ` Jan Kara
2017-02-04  3:08     ` Al Viro
2017-02-04 19:26       ` Al Viro [this message]
2017-02-04 22:12         ` Miklos Szeredi
2017-02-04 22:11       ` Miklos Szeredi
2017-02-05  1:51         ` Al Viro
2017-02-05 20:15           ` Miklos Szeredi
2017-02-05 21:01             ` Al Viro
2017-02-05 21:19               ` Miklos Szeredi
2017-02-05 22:04                 ` Al Viro
2017-02-06  3:05                   ` Al Viro
2017-02-06  9:08                     ` Miklos Szeredi
2017-02-06  9:57                       ` Al Viro
2017-02-06 14:18                         ` Miklos Szeredi
2017-02-07  7:19                           ` Al Viro
2017-02-07 11:35                             ` Miklos Szeredi
2017-02-08  5:54                               ` Al Viro
2017-02-08  9:53                                 ` Miklos Szeredi
2017-02-06  8:37                   ` Miklos Szeredi
2017-02-05 20:56           ` Al Viro
2017-02-16 13:10     ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170204192655.GA13195@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=ceph-devel@vger.kernel.org \
    --cc=chris@chris-wilson.co.uk \
    --cc=jack@suse.cz \
    --cc=jlayton@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=lustre-devel@lists.lustre.org \
    --cc=miklos@szeredi.hu \
    --cc=torvalds@linux-foundation.org \
    --cc=v9fs-developer@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).