From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Tyler Hicks <tyhicks@canonical.com>
Cc: linux-mm@kvack.org, davej@redhat.com, jboyer@redhat.com,
linux-kernel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Mimi Zohar <zohar@linux.vnet.ibm.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH] hugetlbfs: lockdep annotate root inode properly
Date: Fri, 09 Mar 2012 10:33:24 +0530
Message-ID: <87vcme8ixv.fsf@linux.vnet.ibm.com>
In-Reply-To: <20120308134050.f53a0b2f.akpm@linux-foundation.org>
On Thu, 8 Mar 2012 13:40:50 -0800, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 8 Mar 2012 15:19:27 -0600
> Tyler Hicks <tyhicks@canonical.com> wrote:
>
> > >
> > >
> > > Sigh. Was lockdep_annotate_inode_mutex_key() sufficiently
> > > self-explanatory to justify leaving it undocumented?
> > >
> > > <goes off and reads e096d0c7e2e>
> > >
> > > OK, the patch looks correct given the explanation in e096d0c7e2e, but
> > > I'd like to understand why it becomes necessary only now.
> > >
> > > > NOTE: This patch also requires
> > > > http://thread.gmane.org/gmane.linux.file-systems/58795/focus=59565
> > > > to remove the lockdep warning
> > >
> > > And that patch has been basically ignored.
> >
> > Al commented on it here:
> >
> > https://lkml.org/lkml/2012/2/16/518
> >
> > He said that while my patch is correct, taking i_mutex inside mmap_sem
> > is still wrong.
>
> OK, thanks, yup. Taking i_mutex in file_operations.mmap() is wrong.
>
> Is hugetlbfs actually deadlockable because of this, or is it the case
> that the i_mutex->mmap_sem ordering happens to never happen for this
> filesystem? Although we shouldn't go and create incompatible lock
> ranking rules for different filesystems!
>
> So we need to pull the i_mutex out of hugetlbfs_file_mmap(). What's it
> actually trying to do in there? If we switch to
> i_size_read()/i_size_write() then AFAICT the problem comes down to
> hugetlb_reserve_pages().
>
> hugetlb_reserve_pages() fiddles with i_mapping->private_list and the fs
> owns private_list and is free to use a lock other than i_mutex to
> protect it. (In fact i_mapping.private_lock is the usual lock for
> private_list).
>
>
>
> So from a quick scan here I'm thinking that a decent fix is to remove
> the i_mutex locking from hugetlbfs_file_mmap(), switch
> hugetlbfs_file_mmap() to i_size_read/write then use a hugetlb-private
> lock to protect i_mapping->private_list. region_chg() will do
> GFP_KERNEL allocations under that lock, so some care is needed.
>
But as per commit 7762f5a0b709b415fda132258ad37b9f2a1db994, i_size_write()
should always be called with i_mutex held.
-aneesh
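
To illustrate the constraint raised in the reply above, here is a minimal
userspace sketch, not kernel code. It assumes the 32-bit SMP case, where the
kernel guards i_size with a sequence counter whose increment is a plain
read-modify-write. All names (demo_inode, demo_size_read, demo_size_write,
demo_extend) are hypothetical, and a pthread mutex stands in for i_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for an inode's size field. */
struct demo_inode {
    pthread_mutex_t lock;    /* plays the role of i_mutex */
    atomic_uint seq;         /* odd while a write is in progress */
    _Atomic long long size;  /* value published under the counter */
};

static void demo_inode_init(struct demo_inode *inode)
{
    pthread_mutex_init(&inode->lock, NULL);
    atomic_init(&inode->seq, 0);
    atomic_init(&inode->size, 0);
}

/* Lockless reader: retry while a write is in progress or has intervened. */
static long long demo_size_read(struct demo_inode *inode)
{
    unsigned int start;
    long long size;

    do {
        start = atomic_load(&inode->seq);
        size = atomic_load(&inode->size);
    } while ((start & 1) || start != atomic_load(&inode->seq));
    return size;
}

/*
 * Writer: the caller must hold inode->lock. Each counter bump is a
 * separate load and store, so two unserialized writers could lose an
 * increment, leave seq odd, and make every later reader spin forever.
 */
static void demo_size_write(struct demo_inode *inode, long long size)
{
    /* No other write may be in flight when we start. */
    assert((atomic_load(&inode->seq) & 1) == 0);

    atomic_store(&inode->seq, atomic_load(&inode->seq) + 1); /* now odd */
    atomic_store(&inode->size, size);
    atomic_store(&inode->seq, atomic_load(&inode->seq) + 1); /* now even */
}

/* Grow the size if needed, serializing the write with the mutex. */
static long long demo_extend(struct demo_inode *inode, long long new_size)
{
    pthread_mutex_lock(&inode->lock);
    if (new_size > demo_size_read(inode))
        demo_size_write(inode, new_size);
    pthread_mutex_unlock(&inode->lock);
    return demo_size_read(inode);
}
```

Readers in this sketch never take the lock, which is why switching to the
read side alone is cheap, while the write side still needs some lock that
every writer agrees on.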