public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Al Viro <viro@ftp.linux.org.uk>,
	linux-arch@vger.kernel.org, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCHSET] mremap/mmap mess
Date: Mon, 7 Dec 2009 19:30:48 +0000	[thread overview]
Message-ID: <20091207193048.GI14381@ZenIV.linux.org.uk> (raw)
In-Reply-To: <Pine.LNX.4.64.0912071710540.14445@sister.anvils>

On Mon, Dec 07, 2009 at 06:58:25PM +0000, Hugh Dickins wrote:

> [PATCH 4/19] fix checks for expand-in-place mremap
> 
> A couple of points on vma_expandable() in 4/19:
> +	if (arch_mmap_check(vma->vm_start, end - vma->vm_start, MAP_FIXED))
> +		return 0;
> +	if (get_unmapped_area(NULL, vma->vm_start, end - vma->vm_start,
> +			      0, MAP_FIXED) & ~PAGE_MASK)
>  		return 0;
> 
> I wondered why you don't pass the appropriate filp and pgoff here?
> Maybe it doesn't matter for anything at present (given that you're
> expanding an existing mmap), but even if that's the case, I do think
> it would be more robust to pass the correct filp and pgoff.

Umm...  Can do, but that's a bit of an extra work - I'd need to check
if we want MAP_SHARED passed if we bother to pass file.

> And that arch_mmap_check(,, MAP_FIXED) there: no problem with that,
> but it does become a problem later on (mainly in 9 and 11 and 16):
> one idiocy you haven't yet noticed, is that (a) nothing has been
> using the flags arg to arch_mmap_check(), and (b) do_mmap_pgoff()
> and do_brk() disagree on whether they're MAP_ flags or VM_ flags
> (and MAP_FIXED == VM_MAYREAD).

*ow*

> You're going for MAP_ flags, fine, but then do_brk() will need
> changing once you take notice of those flags (in 9 and 11).

Point taken, but see below.

> [PATCH 8/19] file ->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()
> 
> I kept on finding more interesting things to do than arrive at a
> full understanding of file ->get_unmapped_area()s: they confuse me,
> and obviously that's not your fault.  But, while I agree with your
> principle of apportioning the work appropriately between the different
> helpers, I did wonder (a) what actual benefit this patch brings? you
> don't mention it, and it looks like a rearrangement which is very
> easy to get wrong (which you admit you did), and (b) whether the patch
> is complete, since there are lots of driver ->get_unmapped_area()s
> which are not doing the current->mm->get_unmapped_area thing.
> I think.  Maybe they're all NOMMU, and it doesn't matter there:
> I gave up on trying to work it all out and moved on.

_That_ is one hell of a mess; most of those suckers are NOMMU (and
!MAP_FIXED, while we are at it).  There are very few exceptions:
	* hugetlb
	* shm on top of hugetlb
	* fb (== pci)
	* spufs
That's *all*.  And frankly, hugetlb/shm/spufs look a lot like candidates
for a single mm method; "is this a hugepage mapping" matters in a lot more
places and I'm not at all sure that spufs is correct.  Oh, and there's
pure cargo-cult thing in bad_inode.c - we are not going to get that
file_operations instance as anything->f_op, so *all* methods except
->open() and ->fsync() (the latter due to nfsd playing silly buggers with
sync of directories) are never going to be called.

The benefit of patch... it's a preparation to the next one - we want to
push arch_mmap_check() down into get_unmapped_area() and the less extra
calls we grow and have to analyse, the better...

> [PATCH 9/19] arm: add arch_mmap_check(), get rid of sys_arm_mremap()
> 
> You give arm an arch_mmap_check() which tests MAP_FIXED,
> so now do_brk() needs fixing.  Or can arm's get_unmapped_area()
> handle this, without any arch_mmap_check()?  In the end you move
> the arch_mmap_check() call into get_unmapped_area(), but could it be
> eliminated completely, in favour of code in arch's get_unmapped_area()?

Point, but take a look at actual check there.  do_brk() won't run afoul
of it anyway with existing callers on arm.  But yes, I agree that flags
need to be fixed there.

> [PATCH 12/19] Cut hugetlb case early for 32bit on ia64
> Could you explain this one a bit more, please?  I worry because
> MAP_HUGETLB is a recently added special, there's plenty of hugetlb
> without MAP_HUGETLB, so I wonder if you're really catching early
> all that you want to be catching, whatever that is.

What I want to catch is "do_mmap_pgoff() would create struct file" case.
And MAP_HUGETLB is *exactly* what I want to catch - existing file will
get through to do_mmap_pgoff() and we'll fail due to unsuitable address
(with MAP_FIXED) or address beyond TASK_SIZE (without MAP_FIXED).  That's
fine.

> [PATCH 13/19] Unify sys_mmap*
> Couple of points on this one.
> (a) I didn't understand the arch/score part of it, why you said it
> should almost certainly shift pgoff too, nor why you left in the
> (pgoff & (~PAGE_MASK >> 12)) part, which looked redundant to me.

Exactly because it's redundant ;-)

If they ever grow PAGE_SHIFT > 12, they'll need to shift; until they do
that the check is simply if (0) and gets compiled away.

That check looks like either a pure cargo-cult thing, or a preparation to
different page sizes.  I'd rather give a benefit of doubt and put a warning
in comment...
 
> And (b) I thought you were being perverse in putting sys_mmap_pgoff()
> in mm/util.c, that's never where I'd look for it, we have a file
> mm/mmap.c which is just the place for it, after do_mmap_pgoff().
> Ah, you're trying to avoid duplicating it in mm/nommu.c?  Hmm,
> well, I'd much rather you do duplicate it.  Particularly once
> 14/19 complicates it with the MAP_HUGETLB fix, which we should
> keep in in mm/mmap.c, and shouldn't be needed in mm/nommu.c.

I'm not too happy with mm/util.c, but I really don't like the mmap vs nommu
duplications.  Hell knows; we can always split and move later on.

  reply	other threads:[~2009-12-07 19:30 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-05 19:08 [RFC][PATCHSET] mremap/mmap mess Al Viro
2009-12-05 20:44 ` Linus Torvalds
2009-12-05 23:01 ` Al Viro
2009-12-05 23:58 ` Russell King
2009-12-06 17:22 ` Hugh Dickins
2009-12-06 18:00   ` Linus Torvalds
2009-12-07  3:58 ` Al Viro
2009-12-07 18:58   ` Hugh Dickins
2009-12-07 19:30     ` Al Viro [this message]
2009-12-07 20:05       ` Hugh Dickins
2009-12-08  6:07         ` Al Viro
2009-12-08 11:42           ` Hugh Dickins
2009-12-08 13:03             ` Hugh Dickins
2009-12-08 21:08               ` David Miller
2009-12-08 22:06                 ` Al Viro
2009-12-09 11:43                   ` Hugh Dickins
2009-12-09 12:21                     ` Peter Zijlstra
2009-12-09 13:12                       ` Hugh Dickins
2009-12-09 13:37                         ` Peter Zijlstra
2009-12-09 13:24                       ` Al Viro
2009-12-09 13:39                         ` Peter Zijlstra
2009-12-09 13:46                     ` Al Viro
2009-12-09 14:36                       ` Hugh Dickins
2009-12-09 15:12                     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091207193048.GI14381@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=hugh.dickins@tiscali.co.uk \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ftp.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox