linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Allison Henderson <achender@linux.vnet.ibm.com>
To: Lukas Czerner <lczerner@redhat.com>
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>,
	"Ted Ts'o" <tytso@mit.edu>, Christoph Hellwig <hch@infradead.org>
Subject: Re: Plan for reducing i_mutex in ext4
Date: Thu, 06 Oct 2011 10:36:46 -0700	[thread overview]
Message-ID: <4E8DE72E.2060103@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.00.1110041024470.7833@dhcp-27-109.brq.redhat.com>

On 10/04/2011 01:38 AM, Lukas Czerner wrote:
> On Mon, 3 Oct 2011, Allison Henderson wrote:
>
>> Hi all,
>>
>> I've been working on locating all the existing uses of i_mutex in the current
>> ext4 code because I know we are planning to reduce the usage of i_mutex in
>> ext4.  So I've gone through the ext4 code and also the vfs code and come up
>> with a list of ext4 items that appear to be protected under i_mutex.  I'm
>> thinking about doing a patch to replace i_mutex with a private ext4 mutex, and
>> I wanted to update folks on this idea and pick up any feed back people might
>> have.
>>
>> I'm thinking maybe we can have a separate mutex for functions that only modify
>> meta data like ext4_ioctl and ext4_setattr to help relieve unneeded
>> contention.  And then the rest of functions that are modifying data can go
>> under a data mutex (including truncate since sometimes ext4_ioctl and
>> ext4_setattr will call ext4_truncate if they modify i_size).
>
> Just the other day I was talking with Christoph (adding him to cc) about
> this, but unfortunately I still did not have time to look at this, but I
> am glad that someone did.
>
> His suggestion was a bit more general than creating separate ext4
> specific mutex. His idea was to change i_mutex to union of plain mutex
> for directories and a rwlock for regular files. Then this union can be
> used in other file systems as well, for example to replace xfs_iolock in
> xfs.
>
> Also it might be nice to do something smarter than just a rwlock for
> regular files. It would be nice to have an structure of extent locks, so
> we can use it for file system using extents, which will improve
> scalability while hammering a single file from different processes.
>
> Note that currently ext4 concurrent read/write are atomic only wrt
> individual pages, but not on the system call as the whole. This might
> cause read() to return data mixed from several different writes, which
> is not posix conform. That could be solved with the generic rwlock for
> files, or even better with the system of extent locking.
>
> But Christoph, can probably describe hi idea a bit better.
>
> Thanks!
> -Lukas

Hi Lukas,

Sorry for the delay, and thanks for the response :)  Alrighty, I will 
have to do some prototyping and see if I can work in some of these 
concepts into a solution.  At the moment, Im trying to make sure I come 
up with something that still provides all the existing functionality so 
I dont introduce any new race problems, but there's certainly a lot of 
room for optimizing too.  Thx!

Allison Henderson
>
>>
>> So these are ext4 functions that currently lock i_mutex:
>>
>> ext4_sync_file
>> ext4_fallocate
>> ext4_move_extents via two helper routines:
>> 	mext_inode_double_lock and mext_inode_double_unlock
>> ext4_ioctl (for the EXT4_IOC_SETFLAGS ioctl)
>> ext4_quota_write
>> ext4_llseek
>> ext4_end_io_work
>> ext4_evict_inode (only while calling ext4_flush_completed_IO)
>> ext4_ind_direct_IO (only while calling ext4_flush_completed_IO)
>>
>>
>> And these are ext4 functions that have i_mutex locked by the vfs layer.  So we
>> will need to lock the new private mutex here too if we want them to be
>> synchronous with the above functions.
>>
>> ext4_setattr
>> ext4_da_writepages
>> ext4_rmdir
>> ext4_unlink
>> ext4_symlink
>> ext4_link
>> ext4_rename
>>
>> And one unique case:
>> ext4_fiemap calls generic_block_fiemap and passes it a function pointer to
>> ext4_get_block.  generic_block_fiemap will lock i_mutex before calling the
>> pointer.  I dont think ext4_get_block needs i_mutex locked all the time, so I
>> think we can just make a wrapper for ext4_get_block that locks the new private
>> mutex and then we can pass a pointer to the wrapper.
>>
>>
>> That's my list so far, if anyone knows of one I missed please let me know, and
>> also if you spot any other places where we can reduce unneeded contention by
>> using a separate lock.  Thx!
>>
>> Allison Henderson
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>


  reply	other threads:[~2011-10-06 17:45 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-03 19:00 Plan for reducing i_mutex in ext4 Allison Henderson
2011-10-04  8:38 ` Lukas Czerner
2011-10-06 17:36   ` Allison Henderson [this message]
2011-10-04  8:57 ` Dmitry Monakhov
2011-10-04 19:13   ` Allison Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E8DE72E.2060103@linux.vnet.ibm.com \
    --to=achender@linux.vnet.ibm.com \
    --cc=hch@infradead.org \
    --cc=lczerner@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).