From: Michal Hocko <mhocko@suse.cz>
To: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>, Linux-MM <linux-mm@kvack.org>,
	David Gibson <david@gibson.dropbear.id.au>,
	Ken Chen <kenchen@google.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)
Date: Tue, 24 Jul 2012 10:32:01 +0200
Message-ID: <20120724083201.GA7291@tiehlicka.suse.cz>
In-Reply-To: <alpine.LSU.2.00.1207231702440.1683@eggly.anvils>

On Mon 23-07-12 18:08:05, Hugh Dickins wrote:
> On Mon, 23 Jul 2012, Mel Gorman wrote:
> > On Sun, Jul 22, 2012 at 09:04:33PM -0700, Hugh Dickins wrote:
> > > On Fri, 20 Jul 2012, Mel Gorman wrote:
> > > > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote:
> > 
> > I like it in that it's simple and I can confirm it works for the test case
> > of interest.
> 
> Phew, I'm glad to hear that, thanks.
> 
> > 
> > However, is your patch not vulnerable to truncate issues?
> > madvise()/truncate() issues were the main reason why I was wary of VMA tricks
> > as a solution. As it turns out, madvise(DONTNEED) is not a problem as it is
> > ignored for hugetlbfs but I think truncate is still problematic. Let's say
> > we mmap(MAP_SHARED) a hugetlbfs file and then truncate for whatever reason.
> > 
> > invalidate_inode_pages2
> >   invalidate_inode_pages2_range
> >     unmap_mapping_range_vma
> >       zap_page_range_single
> >         unmap_single_vma
> >           __unmap_hugepage_range (removes VM_MAYSHARE)
> > 
> > The VMA still exists so the consequences for this would be varied but
> > minimally fault is going to be "interesting".
> 
> You had me worried there, I hadn't considered truncation or invalidation2
> at all.
> 
> But actually, I don't think they do pose any problem for my patch.  They
> would indeed if I were removing VM_MAYSHARE in __unmap_hugepage_range()
> as you show above; but no, I'm removing it in unmap_hugepage_range().
> 
> That's only called by unmap_single_vma(): which is called via
> unmap_vmas() by unmap_region() or exit_mmap() just before free_pgtables()
> (the problem cases); or by madvise_dontneed() via zap_page_range(), which
> as you note is disallowed on VM_HUGETLB; or by zap_page_range_single().
> 
> zap_page_range_single() is called by zap_vma_ptes(), which is only
> allowed on VM_PFNMAP; or by unmap_mapping_range_vma(), which looked
> like it was going to deadlock on i_mmap_mutex (with or without my
> patch) until I realized that hugetlbfs has its own hugetlbfs_setattr()
> and hugetlb_vmtruncate() which don't use unmap_mapping_range() at all.
> 
> invalidate_inode_pages2() (and _range()) do use unmap_mapping_range(),
> but hugetlbfs doesn't support direct_IO, and otherwise I think they're
> called by a filesystem directly on its own inodes, which hugetlbfs
> does not.  

Good point, I didn't get this while looking into the code, so I introduced
the `last' parameter, which told me that I was on the correct path.
Thanks for the clarification.

> Anyway, if there's a deadlock on i_mmap_mutex somewhere in there, it's
> not introduced by the proposed patch.

> So, unmap_hugepage_range() is only being called in the problem cases,
> just before free_pgtables(), when unmapping a vma (with mmap_sem held),
> or when exiting (when we have the last reference to mm): in each case,
> the vma is on its way out, and VM_MAYSHARE no longer of interest to others.
> 
> I spent a while concerned that I'd overlooked the truncation case, before
> realizing that it's not a problem: the issue comes when we free_pgtables(),
> which truncation makes no attempt to do.
> 
> So, after a bout of anxiety, I think my &= ~VM_MAYSHARE remains good.

Yes, this is convincing (and subtle ;)) and much less polluting.
You can add my Reviewed-by (with the above reasoning in the patch
description).

Anyway, the patch for mmotm needs an update because there was a
reorganization in the area. First, we need to revert "hugetlb: avoid
taking i_mmap_mutex in unmap_single_vma() for hugetlb" (80f408f5 in
memcg-devel) and then push your code into unmap_single_vma. All the
above is still valid AFAICS.
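
To make the reasoning above concrete, here is a minimal userspace sketch
in C. It is not the kernel code and not the patch itself. The struct, the
flag values and all function names (vma_model, may_share_page_tables,
unmap_for_teardown, teardown_demo) are hypothetical, and the assumption
that the sharing check keys off VM_MAYSHARE is mine. It only models the
ordering argument: the flag is dropped on the teardown-only path, so a
later sharing check no longer treats the vma as a candidate.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; not taken from the kernel headers. */
#define VM_MAYSHARE 0x1UL
#define VM_HUGETLB  0x2UL

/* Hypothetical stand-in for the vma; only the flags matter here. */
struct vma_model {
	unsigned long vm_flags;
};

/* Assumed shape of the sharing check: a hugetlb vma with VM_MAYSHARE set. */
static bool may_share_page_tables(const struct vma_model *vma)
{
	return (vma->vm_flags & VM_HUGETLB) && (vma->vm_flags & VM_MAYSHARE);
}

/*
 * Models the path that runs just before the page tables are freed.
 * The vma is on its way out, so nobody else needs VM_MAYSHARE anymore.
 */
static void unmap_for_teardown(struct vma_model *vma)
{
	/* ... the actual unmapping of pages would happen here ... */
	vma->vm_flags &= ~VM_MAYSHARE;
}

/* Small self-check: returns 0 if the model behaves as described. */
int teardown_demo(void)
{
	struct vma_model vma = { .vm_flags = VM_HUGETLB | VM_MAYSHARE };

	assert(may_share_page_tables(&vma));   /* sharable while mapped */
	unmap_for_teardown(&vma);
	assert(!may_share_page_tables(&vma));  /* not sharable after teardown */
	return 0;
}
```

Truncation does not appear in the sketch on purpose: as argued above, it
never reaches the teardown-only path, so the flag stays set there.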

> 
> Hugh

Thanks a lot Hugh!
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic


