linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	open-osd <osd-dev@open-osd.org>
Subject: Re: [RFC] exofs: New truncate sequence
Date: Tue, 1 Jun 2010 00:33:37 +1000	[thread overview]
Message-ID: <20100531143337.GK9453@laptop> (raw)
In-Reply-To: <4C03C40E.2090800@panasas.com>

On Mon, May 31, 2010 at 05:13:34PM +0300, Boaz Harrosh wrote:
> On 05/31/2010 04:44 PM, Nick Piggin wrote:
> > On Mon, May 31, 2010 at 03:30:02PM +0300, Boaz Harrosh wrote:
> >> ---
> >>  fs/exofs/exofs.h |    1 -
> >>  fs/exofs/file.c  |    1 -
> >>  fs/exofs/inode.c |  115 +++++++++++++++++++++++++++---------------------------
> > 
> > Can you rip out all the rest of the buffer_head stuff too?
> > 
> 
> I hope I don't have any left, that was the last, have I missed
> something?

exofs_invalidatepage, exofs_releasepage, includes of buffer_head.h.
No point to any of that if you never actually map the buffers or
use them for tracking state yourself.


> >> @@ -750,6 +760,10 @@ static int exofs_write_end(struct file *file, struct address_space *mapping,
> >>  	int ret;
> >>  
> >>  	ret = simple_write_end(file, mapping,pos, len, copied, page, fsdata);
> >> +	if (unlikely(ret && pos + len > inode->i_size))
> >> +		truncate_pagecache(inode, pos + len, inode->i_size);
> >> +
> > 
> > So there is no need to do any oi_truncate? Even if _readpage in
> > write_begin has set up some blocks?
> > 
> 
> No blocks are setup. Exofs does not have any blocks.

Sure, I was just using blocks as a placeholder for whatever you call
it.


> If the write
> failed then the object on OSD has the last written offset as objects
> size. If error handling was done right residual is subtracted from IO
> size and should be reflected here.
> 
> If IO was never attempted then object size did not grow and we revert
> in memory i_size here. (size is keept in two places and is checked
> for consistency in fsck)

So long as the _readpage has not altered the object state at all,
then this looks fine to me.

 
> TODO:
> 	eject short writes and see if this works correctly.
> 
> > 
> >> +	/* TODO: once simple_write_end marks inode dirty remove */
> >>  	if (i_size != inode->i_size)
> >>  		mark_inode_dirty(inode);
> >>  	return ret;
> > 
> > Hmm, I suppose simple_write_end probably should mark the inode dirty?
> > 
> 
> I think Christoph has a patch for that.
> 
> > 
> >> @@ -1335,28 +1339,25 @@ void exofs_delete_inode(struct inode *inode)
> >>  
> >>  	truncate_inode_pages(&inode->i_data, 0);
> >>  
> >> +	/* TODO: should do better here */
> >>  	if (is_bad_inode(inode))
> >>  		goto no_delete;
> >>  
> >> -	mark_inode_dirty(inode);
> >> -	exofs_update_inode(inode, inode_needs_sync(inode));
> >> -
> >>  	inode->i_size = 0;
> >> -	if (inode->i_blocks)
> >> -		exofs_truncate(inode);
> > 
> > This guy has gone missing -- I assume exofs_sbi_remove is a more
> > efficient way to do this anyway?
> > 
> 
> You see this is where exofs is different a file is an object_no on
> multiple OSD devices. The inode is kept as an attribute of the
> object. (data as object's data) so a exofs_sbi_remove will just
> obliterate any association to the object. It was historically
> called because exofs_truncate used to do what truncate_inode_pages
> does today. (And some other in memory book keeping.) But with
> your help all this was cleaned up.

OK, I was thinking the underlying object itself needs to be trimmed
to match i_size similarly to just a block based filesystem? Like
exofs_oi_truncate appears to.

 
> Do you see any operation I missed that might need cleaning from the
> generic VFS inode, that might now leak. As far as storage is concerned
> I'm covered.
> 
> [I ran git clone linux; rm -rf linux; 100 times in a loop and the OSD
>  storage stayed constant size. So I presume there is no storage leak.
>  OSD is good in this respect]

I can't see anything off hand. Was just flagging points where 
vmtruncate or truncate had been called and is not now. If you
have all those covered, then you should be OK.

 
> >> -
> >>  	clear_inode(inode);
> >>  
> >> -	ret = exofs_get_io_state(&sbi->layout, &ios);
> >> -	if (unlikely(ret)) {
> >> -		EXOFS_ERR("%s: exofs_get_io_state failed\n", __func__);
> >> -		return;
> >> -	}
> >> -
> >>  	/* if we are deleting an obj that hasn't been created yet, wait */
> >>  	if (!obj_created(oi)) {
> >>  		BUG_ON(!obj_2bcreated(oi));
> >>  		wait_event(oi->i_wq, obj_created(oi));
> >> +		/* ignore the error attempt a remove anyway */
> >> +	}
> >> +
> >> +	/* Now Remove the OSD objects */
> >> +	ret = exofs_get_io_state(&sbi->layout, &ios);
> >> +	if (unlikely(ret)) {
> >> +		EXOFS_ERR("%s: exofs_get_io_state failed\n", __func__);
> >> +		return;
> >>  	}
> >>  
> >>  	ios->obj.id = exofs_oi_objno(oi);
> >> -- 
> >> 1.6.6.1
> 
> Thanks for lookin. And thanks for making this patch possible. I wanted
> this cleaned, long ago, but it was only made easy and simple after your
> changes.

That's OK, thanks for helping with it.


  reply	other threads:[~2010-05-31 14:33 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-31 12:30 [RFC] exofs: New truncate sequence Boaz Harrosh
2010-05-31 13:44 ` Nick Piggin
2010-05-31 14:13   ` Boaz Harrosh
2010-05-31 14:33     ` Nick Piggin [this message]
2010-05-31 14:50       ` Boaz Harrosh
2010-05-31 15:09         ` Nick Piggin
2010-05-31 15:19           ` Boaz Harrosh
2010-06-01 10:08             ` Christoph Hellwig
2010-06-01 10:26               ` Boaz Harrosh
2010-06-01 10:44                 ` Christoph Hellwig
2010-06-01 11:05                   ` Boaz Harrosh
2010-06-01 11:06                     ` Christoph Hellwig
2010-06-01 10:28 ` [PATCH ver2] " Boaz Harrosh
2010-06-01 10:43   ` Christoph Hellwig
2010-06-01 10:59     ` Boaz Harrosh
2010-06-01 11:06       ` Christoph Hellwig
2010-06-01 11:31 ` [PATCH ver3] " Boaz Harrosh
2010-06-01 11:36   ` Christoph Hellwig
2010-06-01 11:52     ` Boaz Harrosh
2010-06-01 15:09   ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100531143337.GK9453@laptop \
    --to=npiggin@suse.de \
    --cc=bharrosh@panasas.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=osd-dev@open-osd.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).