public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [git pull] vfs.git get_user_pages_fast() conversion
Date: Fri, 17 Nov 2017 21:32:15 +0000	[thread overview]
Message-ID: <20171117213215.GQ21978@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CA+55aFy+gsHRm+rBmimU3-PS_NUWyAd5Aq12=ygezQdX7_7LAg@mail.gmail.com>

On Fri, Nov 17, 2017 at 12:50:47PM -0800, Linus Torvalds wrote:

> Not because the conversion was wrong, but because the original code is
> so broken.
> 
> In particular, that "1" that is unchanged in the arguments is correct
> in the conversion, but it was completely wrong in the original, even
> if it happened to work.
> 
> it _should_ have been a FOLL_WRITE. Yes, it happens to have that
> value, but it was broken.
> 
> (I note that a bit of grepping shows we have the same issue in a stale
> comment in mm/ksm.c).
> 
> It would have been nice to see things like this mentioned in the commit message.
> 
> Because I'm pretty sure you actually _realized_ that as you made the
> conversion, but there's no sign of that in the logs, because the
> commit message just says
> 
>     atomisp: use get_user_pages_fast()
> 
> without mentioning how broken the old case was (even it if happened to work).

Point...  Frankly, my impression from the whole thing is that get_user_pages()
and get_user_pages_unlocked() calling conventions are overcomplicated, especially
since most of the callers either should be get_user_pages_fast() or happen
to be inside implementations of get_user_pages_fast().  Grepping for the
remaining callers right now yields just this:

arch/cris/arch-v32/drivers/cryptocop.c:2722:    err = get_user_pages((unsigned long int)(oper.indata + prev_ix),
arch/cris/arch-v32/drivers/cryptocop.c:2736:            err = get_user_pages((unsigned long int)oper.cipher_outdata,
arch/ia64/kernel/err_inject.c:145:      ret = get_user_pages(virt_addr, 1, FOLL_WRITE, NULL, NULL);
arch/x86/mm/mpx.c:550:  gup_ret = get_user_pages((unsigned long)addr, nr_pages,
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:646:            r = get_user_pages(userptr, num_pages, flags, p, NULL);
drivers/gpu/drm/radeon/radeon_ttm.c:571:                r = get_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0,
drivers/infiniband/core/umem.c:194:             ret = get_user_pages(cur_base,
drivers/infiniband/hw/mthca/mthca_memfree.c:475:        ret = get_user_pages(uaddr & PAGE_MASK, 1, FOLL_WRITE, pages, NUL
L);
drivers/infiniband/hw/qib/qib_user_pages.c:70:          ret = get_user_pages(start_page + got * PAGE_SIZE,
drivers/infiniband/hw/usnic/usnic_uiom.c:146:           ret = get_user_pages(cur_base,
drivers/media/pci/ivtv/ivtv-udma.c:127: err = get_user_pages_unlocked(user_dma.uaddr, user_dma.page_count,
drivers/media/pci/ivtv/ivtv-yuv.c:78:   y_pages = get_user_pages_unlocked(y_dma.uaddr,
drivers/media/pci/ivtv/ivtv-yuv.c:82:           uv_pages = get_user_pages_unlocked(uv_dma.uaddr,
drivers/media/v4l2-core/videobuf-dma-sg.c:188:  err = get_user_pages(data & PAGE_MASK, dma->nr_pages,
drivers/misc/mic/scif/scif_rma.c:1399:          pinned_pages->nr_pages = get_user_pages(
drivers/misc/sgi-gru/grufault.c:201:    if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
mm/mempolicy.c:829:     err = get_user_pages(addr & PAGE_MASK, 1, 0, &p, NULL);
virt/kvm/kvm_main.c:1326:       return get_user_pages(start, 1, flags, page, NULL);
virt/kvm/kvm_main.c:1333:       rc = get_user_pages(addr, 1, flags, NULL, NULL);
virt/kvm/kvm_main.c:1395:               npages = get_user_pages_unlocked(addr, 1, page, flags);

and even those are dubious - e.g. cris ones could bloody well become a pair of
get_user_pages_fast(), without ->mmap_sem being held across both calls.  Itanic
one is almost certainly buggered - we are not holding ->mmap_sem there.

Hell knows...  I wonder if exposing FOLL_... thing outside of mm/gup.c actually
makes sense.  With the users so heavily skewed towards just two combinations of
flags (0 and FOLL_WRITE)...

And for get_user_pages() itself it's even more ridiculous - vmalist (the last
argument) is non-NULL in only one caller.  Which uses it only to check if all
of the VMAs happen to be hugetlb ones, apparently.

FWIW, I wanted to trim the users of those two suckers and see what remains.
And then go through those with maintainers of subsystems in question, to
see what is really wanted there.  That's for the coming cycle, though...

  reply	other threads:[~2017-11-17 21:32 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-17  3:02 [git pull] vfs.git get_user_pages_fast() conversion Al Viro
2017-11-17 20:50 ` Linus Torvalds
2017-11-17 21:32   ` Al Viro [this message]
2017-11-18 21:45     ` Al Viro
2017-11-21 18:35       ` Andrea Arcangeli
2017-11-18 21:49     ` Dan Williams
2017-11-19 21:23     ` mthca misuse of get_user_pages() (was Re: [git pull] vfs.git get_user_pages_fast() conversion) Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171117213215.GQ21978@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox