From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: viro@zeniv.linux.org.uk, tglx@linutronix.de,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-btrfs@vger.kernel.org, hirofumi@mail.parknet.co.jp,
	mfasheh@suse.com, jlbec@evilplan.org
Subject: Re: [PATCH 4/8] fs: kill i_alloc_sem
Date: Tue, 21 Jun 2011 15:40:56 +1000
Message-ID: <20110621054056.GP32466@dastard>
In-Reply-To: <20110620202031.175620498@bombadil.infradead.org>

On Mon, Jun 20, 2011 at 04:15:37PM -0400, Christoph Hellwig wrote:
> i_alloc_sem is a rather special rw_semaphore.  It's the last one that may
> be released by a non-owner, and its write side is always mirrored by
> real exclusion.  Its intended use is to wait for all pending direct I/O
> requests to finish before starting a truncate.
> 
> Replace it with a hand-grown construct:
> 
>  - exclusion for truncates is already guaranteed by i_mutex, so it can
>    simply fall away
>  - the reader side is replaced by an i_dio_count member in struct inode
>    that counts the number of pending direct I/O requests.  Truncate can't
>    proceed as long as it's non-zero
>  - when i_dio_count reaches zero we wake up a pending truncate using
>    wake_up_bit on a new bit in i_state
>  - new references to i_dio_count can't appear while we are waiting for
>    it to read zero because the direct I/O count always needs i_mutex
>    (or an equivalent like XFS's i_iolock) for starting a new operation.
> 
> This scheme is much simpler, and saves the space of a spinlock_t and a
> struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
> system).
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Index: linux-2.6/fs/direct-io.c
> ===================================================================
> --- linux-2.6.orig/fs/direct-io.c	2011-06-20 14:55:31.000000000 +0200
> +++ linux-2.6/fs/direct-io.c	2011-06-20 14:55:34.602490284 +0200
> @@ -136,6 +136,27 @@ struct dio {
>  };
>  
>  /*
> + * Wait for outstanding DIO requests to finish.  Must be locked against
> + * increments of i_dio_count by i_mutex.
> + */
> +void inode_dio_wait(struct inode *inode)
> +{
> +	might_sleep();
> +	while (atomic_read(&inode->i_dio_count)) {
> +		wait_on_bit(&inode->i_state, __I_DIO_WAKEUP, inode_wait,
> +			    TASK_UNINTERRUPTIBLE);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(inode_dio_wait);
> +
> +void inode_dio_wake(struct inode *inode)
> +{
> +	if (atomic_dec_and_test(&inode->i_dio_count))
> +		wake_up_bit(&inode->i_state, __I_DIO_WAKEUP);
> +}
> +EXPORT_SYMBOL_GPL(inode_dio_wake);

Modifying inode->i_state is not safe without holding
inode->i_lock.

This probably needs to be implemented along the lines of the
__I_NEW/__wait_on_freeing_inode() and
__I_SYNC/inode_wait_for_writeback() patterns...
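
As a rough illustration of that pattern, here is a userspace analogue in
C using pthreads. It is a sketch under stated assumptions, not kernel
code: struct model_inode, MODEL_I_DIO_PENDING and the model_dio_*()
helpers are hypothetical names, a pthread mutex stands in for
inode->i_lock, and a condition variable stands in for the bit waitqueue.
What it tries to show is that the state bit is only read and modified
with the lock held, and that the waiter gives up the lock only while it
sleeps.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for the struct inode fields involved. */
struct model_inode {
	pthread_mutex_t i_lock;		/* models inode->i_lock */
	pthread_cond_t dio_wake;	/* models the bit waitqueue */
	unsigned long i_state;		/* only touched with i_lock held */
	int i_dio_count;		/* pending direct I/O requests */
};

/* Hypothetical state bit: set while direct I/O is in flight. */
#define MODEL_I_DIO_PENDING	(1UL << 0)

void model_inode_init(struct model_inode *inode)
{
	pthread_mutex_init(&inode->i_lock, NULL);
	pthread_cond_init(&inode->dio_wake, NULL);
	inode->i_state = 0;
	inode->i_dio_count = 0;
}

/* Account for a new direct I/O request. */
void model_dio_begin(struct model_inode *inode)
{
	pthread_mutex_lock(&inode->i_lock);
	inode->i_dio_count++;
	inode->i_state |= MODEL_I_DIO_PENDING;
	pthread_mutex_unlock(&inode->i_lock);
}

/* Complete one request; the last completion clears the bit and wakes waiters. */
void model_dio_done(struct model_inode *inode)
{
	pthread_mutex_lock(&inode->i_lock);
	assert(inode->i_dio_count > 0);
	if (--inode->i_dio_count == 0) {
		inode->i_state &= ~MODEL_I_DIO_PENDING;
		pthread_cond_broadcast(&inode->dio_wake);
	}
	pthread_mutex_unlock(&inode->i_lock);
}

/* Sleep until no direct I/O is pending. */
void model_dio_wait(struct model_inode *inode)
{
	pthread_mutex_lock(&inode->i_lock);
	/* pthread_cond_wait() drops i_lock while asleep and retakes it on wakeup. */
	while (inode->i_state & MODEL_I_DIO_PENDING)
		pthread_cond_wait(&inode->dio_wake, &inode->i_lock);
	pthread_mutex_unlock(&inode->i_lock);
}
```

A truncate-like caller in this model would call model_dio_wait() before
going ahead. In the kernel, the waiter would presumably have to drop and
retake the spinlock around the sleep by hand, since there is no direct
equivalent of pthread_cond_wait() for a spinlock.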

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com
