public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Steve Lord <lord@xfs.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Miquel van Smoorenburg <miquels@cistron.nl>,
	nathans@sgi.com, linux-kernel@vger.kernel.org
Subject: Re: 2.6.2-rc2 nfsd+xfs spins in i_size_read()
Date: Sat, 31 Jan 2004 10:41:03 -0600	[thread overview]
Message-ID: <401BDA9F.1080001@xfs.org> (raw)
In-Reply-To: <20040201161513.A15966@infradead.org>

Christoph Hellwig wrote:
> On Sat, Jan 31, 2004 at 09:59:06AM -0600, Steve Lord wrote:
> 
>>The vn_revalidate call is called out of linvfs_setattr,
>>which is called with the i_sem held, it is also potentially called out
>>of linvfs_getattr, although since the i_size is always maintained
>>as it is changed, this call should not actually be updating the size.
>>Possibly changing the code in vn_revalidate to do this:
>>
>>	if (i_size_read(inode) < va.va_size))
>>		i_size_write(inode, va.va_size);
>>
>>Would be a good starting point, I suspect those calls from the nfs
>>revalidate call are not really going to change the inode size. My
>>guess is this will make your problem go away.
>>
>>Probably some larger code restructure is needed so that revalidate
>>knows if the i_sem is held or not at this point.
> 
> 
> I think the right fix is to update the Linux i_size always shortly after
> di_size is updated.  There's a lot of updates in the directory code while
> should be handled by an i_size_write in the matching linvfs routines, and
> there's a few more but we should be able to handle those without
> vn_revalidate aswell.
> 

Changing the validate_fields call to use

	if (i_size_read(inode) != va.va_size)
		i_size_write(inode, va.va_size);

should take care of directories, certainly better than the
direct assignment to i_size which is in there now....
This is called from the directory ops which modify the
contents of a directory and should already be under the
i_sem. The vn_revalidate code should use a != test
as well of course...


> 
>>The O_DIRECT write case is the hard one. In XFS's internal view of
>>the world, the inode size is maintained via the XFS_ILOCK, but we
>>only hold that across metadata manipulation within the fs code,
>>not across I/O such as a call to generic_file_aio_write_nolock.
>>Right now the only way I see of dealing with that is to make
>>writes which we know will extent the file hold the i_sem for
>>the duration in the O_DIRECT case.
> 
> 
> That's the more difficult case.  Any reson why you'd hold i_sem
> for the whole O_DIRECT I/O instead of just for updating i_sem?
> 

We worked hard not to hold it, but that i_size_write in
the middle of the async_io code is tough to work around.

> Note that the EOF zeroing code also calls i_size_write.
> 

But that is called out of an extending setattr or a buffered write which
hold the semaphore. Hmm, actually O_DIRECT write can be in there as
well, but that is the same problem as the generic I/O path call.

Steve


  reply	other threads:[~2004-02-01 16:41 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-01-28 17:17 2.6.2-rc2 nfsd+xfs spins in i_size_read() Miquel van Smoorenburg
2004-01-29  6:25 ` Andrew Morton
2004-01-29 23:20   ` Miquel van Smoorenburg
2004-02-04  0:03     ` Christoph Hellwig
2004-02-03 14:13       ` Steve Lord
2004-02-04 15:16         ` Christoph Hellwig
2004-02-04 18:17         ` Christoph Hellwig
2004-02-04  0:06       ` David Weinehall
2004-02-04  0:07         ` Christoph Hellwig
2004-01-30 16:01   ` Miquel van Smoorenburg
2004-01-30 20:21   ` Miquel van Smoorenburg
2004-01-30 22:13     ` Miquel van Smoorenburg
2004-01-30 22:34       ` Andrew Morton
2004-01-30 22:53         ` Christoph Hellwig
2004-01-30 23:13           ` Andrew Morton
2004-01-31  1:25             ` Miquel van Smoorenburg
2004-01-31  1:38               ` Andrew Morton
2004-01-31 11:46                 ` Miquel van Smoorenburg
2004-01-31 15:59                   ` Steve Lord
2004-02-01 16:15                     ` Christoph Hellwig
2004-01-31 16:41                       ` Steve Lord [this message]
2004-01-31 17:07               ` Christoph Hellwig
2004-02-01  1:46               ` Anton Blanchard
2004-01-30 23:07         ` Nathan Scott
2004-01-29  6:30 ` Nathan Scott

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=401BDA9F.1080001@xfs.org \
    --to=lord@xfs.org \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miquels@cistron.nl \
    --cc=nathans@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox