linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* i_mutex questions
@ 2011-09-13 18:33 Allison Henderson
  2011-09-13 19:02 ` Joel Becker
  2011-09-14  1:29 ` Yongqiang Yang
  0 siblings, 2 replies; 7+ messages in thread
From: Allison Henderson @ 2011-09-13 18:33 UTC (permalink / raw)
  To: Ext4 Developers List, Ted Ts'o

Hi All,

I have been trying to find a way to synchronize punch hole with read and 
write operations with out the use of i_mutex.  The concern is that after 
punch hole has released the pages inside the hole, another process may 
remap the page to a block before punch has taken i_data_sem.  I think 
putting i_mutex around the punch hole operation would fix this, but 
since we are trying to avoid further improper use of i_mutex, I am 
trying to avoid that solution.

I cannot use i_data_sem to protect the pages because it seems most of 
the code has already established a locking order of pages first, then 
i_data_sem.  So moving i_data_sem up tends to cause a lot of dead locks. 
  I'm thinking that there probably needs to be a another mutex involved 
some where, but I wasnt sure if some one is already working on the idea 
of introducing a replacement for i_mutex.  So I just wanted to know if 
there are any plans already in motion for this, or if any one else could 
suggest some ideas for the punch hole issue.  Thx all!

Allison Henderson


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: i_mutex questions
  2011-09-13 18:33 i_mutex questions Allison Henderson
@ 2011-09-13 19:02 ` Joel Becker
  2011-09-13 22:10   ` Allison Henderson
  2011-09-14  1:29 ` Yongqiang Yang
  1 sibling, 1 reply; 7+ messages in thread
From: Joel Becker @ 2011-09-13 19:02 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Ext4 Developers List, Ted Ts'o

On Tue, Sep 13, 2011 at 11:33:29AM -0700, Allison Henderson wrote:
> Hi All,
> 
> I have been trying to find a way to synchronize punch hole with read
> and write operations with out the use of i_mutex.  The concern is
> that after punch hole has released the pages inside the hole,
> another process may remap the page to a block before punch has taken
> i_data_sem.  I think putting i_mutex around the punch hole operation
> would fix this, but since we are trying to avoid further improper
> use of i_mutex, I am trying to avoid that solution.

Hey Allison,
	Actually, i_mutex is the normal way to handle this.  ocfs2 takes
i_mutex down under its ->fallocate().  Truncate is in the same boat,
which is why do_truncate() takes i_mutex before calling notify_change().
	The read-write paths grab i_mutex for buffered operation.  They
don't for O_DIRECT, which doesn't map to the pagecache.  This is where
i_data_sem should speed things up.  

Joel


-- 

"Anything that is too stupid to be spoken is sung."  
        - Voltaire

			http://www.jlbec.org/
			jlbec@evilplan.org

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: i_mutex questions
  2011-09-13 19:02 ` Joel Becker
@ 2011-09-13 22:10   ` Allison Henderson
  2011-09-14  0:05     ` Joel Becker
  0 siblings, 1 reply; 7+ messages in thread
From: Allison Henderson @ 2011-09-13 22:10 UTC (permalink / raw)
  To: Ext4 Developers List, Ted Ts'o, jlbec

On 09/13/2011 12:02 PM, Joel Becker wrote:
> On Tue, Sep 13, 2011 at 11:33:29AM -0700, Allison Henderson wrote:
>> Hi All,
>>
>> I have been trying to find a way to synchronize punch hole with read
>> and write operations with out the use of i_mutex.  The concern is
>> that after punch hole has released the pages inside the hole,
>> another process may remap the page to a block before punch has taken
>> i_data_sem.  I think putting i_mutex around the punch hole operation
>> would fix this, but since we are trying to avoid further improper
>> use of i_mutex, I am trying to avoid that solution.
>
> Hey Allison,
> 	Actually, i_mutex is the normal way to handle this.  ocfs2 takes
> i_mutex down under its ->fallocate().  Truncate is in the same boat,
> which is why do_truncate() takes i_mutex before calling notify_change().
> 	The read-write paths grab i_mutex for buffered operation.  They
> don't for O_DIRECT, which doesn't map to the pagecache.  This is where
> i_data_sem should speed things up.
>
> Joel
>
Hi Joel,

Well, I actually already had a patch that was trying to use i_mutex to 
solve this ([PATCH 4/6 v7] ext4: Lock i_mutex for punch hole).  But we 
decided not to apply it because of plans to reduce the usage of i_mutex 
in the ext4 code.  So I've been trying to figure out a different way to 
solve this, but so far I haven't had a whole lot of luck finding a 
solution that doesn't involve introducing a new locking mechanism.  So I 
wanted to check back here for more details on what the plan for i_mutex 
is so I dont conflict with anything that might already be going on.  :)

Ted, would you be able to give us some more details on this topic?  Thx!

Allison Henderson

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: i_mutex questions
  2011-09-13 22:10   ` Allison Henderson
@ 2011-09-14  0:05     ` Joel Becker
  0 siblings, 0 replies; 7+ messages in thread
From: Joel Becker @ 2011-09-14  0:05 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Ext4 Developers List, Ted Ts'o

On Tue, Sep 13, 2011 at 03:10:56PM -0700, Allison Henderson wrote:
> Well, I actually already had a patch that was trying to use i_mutex
> to solve this ([PATCH 4/6 v7] ext4: Lock i_mutex for punch hole).
> But we decided not to apply it because of plans to reduce the usage
> of i_mutex in the ext4 code.  So I've been trying to figure out a
> different way to solve this, but so far I haven't had a whole lot of
> luck finding a solution that doesn't involve introducing a new
> locking mechanism.  So I wanted to check back here for more details
> on what the plan for i_mutex is so I dont conflict with anything
> that might already be going on.  :)

	Sure ;-)  If you find another mechanism that reduces contention
but still plays well with read/write et al, please let us all know.
Getting i_mutex out of read/write would be interesting.

Joel

-- 

"A narcissist is someone better looking than you are."  
         - Gore Vidal

			http://www.jlbec.org/
			jlbec@evilplan.org

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: i_mutex questions
  2011-09-13 18:33 i_mutex questions Allison Henderson
  2011-09-13 19:02 ` Joel Becker
@ 2011-09-14  1:29 ` Yongqiang Yang
  2011-09-14  4:23   ` Andreas Dilger
  1 sibling, 1 reply; 7+ messages in thread
From: Yongqiang Yang @ 2011-09-14  1:29 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Ext4 Developers List, Ted Ts'o, Lukas Czerner

On Wed, Sep 14, 2011 at 2:33 AM, Allison Henderson
<achender@linux.vnet.ibm.com> wrote:
> Hi All,
>
> I have been trying to find a way to synchronize punch hole with read and
> write operations with out the use of i_mutex.  The concern is that after
> punch hole has released the pages inside the hole, another process may remap
> the page to a block before punch has taken i_data_sem.  I think putting
> i_mutex around the punch hole operation would fix this, but since we are
> trying to avoid further improper use of i_mutex, I am trying to avoid that
> solution.
>
> I cannot use i_data_sem to protect the pages because it seems most of the
> code has already established a locking order of pages first, then
> i_data_sem.  So moving i_data_sem up tends to cause a lot of dead locks.
>  I'm thinking that there probably needs to be a another mutex involved some
> where, but I wasnt sure if some one is already working on the idea of
> introducing a replacement for i_mutex.  So I just wanted to know if there
> are any plans already in motion for this, or if any one else could suggest
> some ideas for the punch hole issue.  Thx all!

HI,

Lukas sent out a patch ([PATCH] ext4: Make reads/writes atomic with
i_rwlock semaphore) which collected some feedbacks suggesting using
extent lock instead of a read-write semaphore.  If there is extent
lock implementation in ext4, then fallocate can use it, maybe
dioread-nolock can use it as well, e.g. locking a range and unlocking
the range until the extent is converted from unwritten to init.

Yongqiang.


>
> Allison Henderson
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Best Wishes
Yongqiang Yang
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: i_mutex questions
  2011-09-14  1:29 ` Yongqiang Yang
@ 2011-09-14  4:23   ` Andreas Dilger
  2011-09-14 18:36     ` Allison Henderson
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2011-09-14  4:23 UTC (permalink / raw)
  To: Allison Henderson, Yongqiang Yang
  Cc: Ext4 Developers List, Ted Ts'o, Lukas Czerner, Zhen Liang

On 2011-09-13, at 7:29 PM, Yongqiang Yang wrote:
> On Wed, Sep 14, 2011 at 2:33 AM, Allison Henderson
> <achender@linux.vnet.ibm.com> wrote:
>> Hi All,
>> 
>> I have been trying to find a way to synchronize punch hole with read and
>> write operations with out the use of i_mutex.  The concern is that after
>> punch hole has released the pages inside the hole, another process may remap
>> the page to a block before punch has taken i_data_sem.  I think putting
>> i_mutex around the punch hole operation would fix this, but since we are
>> trying to avoid further improper use of i_mutex, I am trying to avoid that
>> solution.
>> 
>> I cannot use i_data_sem to protect the pages because it seems most of the
>> code has already established a locking order of pages first, then
>> i_data_sem.  So moving i_data_sem up tends to cause a lot of dead locks.
>>  I'm thinking that there probably needs to be a another mutex involved some
>> where, but I wasnt sure if some one is already working on the idea of
>> introducing a replacement for i_mutex.  So I just wanted to know if there
>> are any plans already in motion for this, or if any one else could suggest
>> some ideas for the punch hole issue.  Thx all!
> 
> Lukas sent out a patch ([PATCH] ext4: Make reads/writes atomic with
> i_rwlock semaphore) which collected some feedbacks suggesting using
> extent lock instead of a read-write semaphore.  If there is extent
> lock implementation in ext4, then fallocate can use it, maybe
> dioread-nolock can use it as well, e.g. locking a range and unlocking
> the range until the extent is converted from unwritten to init.

We have a prototype patch for extent locking for ext4.  We are planning to
use this for parallel locking of htree directories, but it could potentially
be modified to for extent locking of files.

The current patch is below, but it hasn't gone through a lot of testing:

http://review.whamcloud.com/#patch,sidebyside,375,2,ldiskfs/kernel_patches/patches/ext4_pdirop-rhel6.patch

Cheers, Andreas






^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: i_mutex questions
  2011-09-14  4:23   ` Andreas Dilger
@ 2011-09-14 18:36     ` Allison Henderson
  0 siblings, 0 replies; 7+ messages in thread
From: Allison Henderson @ 2011-09-14 18:36 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Yongqiang Yang, Ext4 Developers List, Ted Ts'o, Lukas Czerner,
	Zhen Liang

On 09/13/2011 09:23 PM, Andreas Dilger wrote:
> On 2011-09-13, at 7:29 PM, Yongqiang Yang wrote:
>> On Wed, Sep 14, 2011 at 2:33 AM, Allison Henderson
>> <achender@linux.vnet.ibm.com>  wrote:
>>> Hi All,
>>>
>>> I have been trying to find a way to synchronize punch hole with read and
>>> write operations with out the use of i_mutex.  The concern is that after
>>> punch hole has released the pages inside the hole, another process may remap
>>> the page to a block before punch has taken i_data_sem.  I think putting
>>> i_mutex around the punch hole operation would fix this, but since we are
>>> trying to avoid further improper use of i_mutex, I am trying to avoid that
>>> solution.
>>>
>>> I cannot use i_data_sem to protect the pages because it seems most of the
>>> code has already established a locking order of pages first, then
>>> i_data_sem.  So moving i_data_sem up tends to cause a lot of dead locks.
>>>   I'm thinking that there probably needs to be a another mutex involved some
>>> where, but I wasnt sure if some one is already working on the idea of
>>> introducing a replacement for i_mutex.  So I just wanted to know if there
>>> are any plans already in motion for this, or if any one else could suggest
>>> some ideas for the punch hole issue.  Thx all!
>>
>> Lukas sent out a patch ([PATCH] ext4: Make reads/writes atomic with
>> i_rwlock semaphore) which collected some feedbacks suggesting using
>> extent lock instead of a read-write semaphore.  If there is extent
>> lock implementation in ext4, then fallocate can use it, maybe
>> dioread-nolock can use it as well, e.g. locking a range and unlocking
>> the range until the extent is converted from unwritten to init.
>
> We have a prototype patch for extent locking for ext4.  We are planning to
> use this for parallel locking of htree directories, but it could potentially
> be modified to for extent locking of files.
>
> The current patch is below, but it hasn't gone through a lot of testing:
>
> http://review.whamcloud.com/#patch,sidebyside,375,2,ldiskfs/kernel_patches/patches/ext4_pdirop-rhel6.patch
>
> Cheers, Andreas
>

Alrighty then, I will have a look at the existing patches.  Thx!

Allison Henderson

>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-09-14 18:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-13 18:33 i_mutex questions Allison Henderson
2011-09-13 19:02 ` Joel Becker
2011-09-13 22:10   ` Allison Henderson
2011-09-14  0:05     ` Joel Becker
2011-09-14  1:29 ` Yongqiang Yang
2011-09-14  4:23   ` Andreas Dilger
2011-09-14 18:36     ` Allison Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).