All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	open-osd <osd-dev@open-osd.org>
Subject: Re: [RFC] exofs: New truncate sequence
Date: Tue, 1 Jun 2010 00:33:37 +1000	[thread overview]
Message-ID: <20100531143337.GK9453@laptop> (raw)
In-Reply-To: <4C03C40E.2090800@panasas.com>

On Mon, May 31, 2010 at 05:13:34PM +0300, Boaz Harrosh wrote:
> On 05/31/2010 04:44 PM, Nick Piggin wrote:
> > On Mon, May 31, 2010 at 03:30:02PM +0300, Boaz Harrosh wrote:
> >> ---
> >>  fs/exofs/exofs.h |    1 -
> >>  fs/exofs/file.c  |    1 -
> >>  fs/exofs/inode.c |  115 +++++++++++++++++++++++++++---------------------------
> > 
> > Can you rip out all the rest of the buffer_head stuff too?
> > 
> 
> I hope I don't have any left, that was the last, have I missed
> something?

exofs_invalidatepage, exofs_releasepage, includes of buffer_head.h.
No point to any of that if you never actually map the buffers or
use them for tracking state yourself.


> >> @@ -750,6 +760,10 @@ static int exofs_write_end(struct file *file, struct address_space *mapping,
> >>  	int ret;
> >>  
> >>  	ret = simple_write_end(file, mapping,pos, len, copied, page, fsdata);
> >> +	if (unlikely(ret && pos + len > inode->i_size))
> >> +		truncate_pagecache(inode, pos + len, inode->i_size);
> >> +
> > 
> > So there is no need to do any oi_truncate? Even if _readpage in
> > write_begin has set up some blocks?
> > 
> 
> No blocks are setup. Exofs does not have any blocks.

Sure, I was just using blocks as a placeholder for whatever you call
it.


> If the write
> failed then the object on OSD has the last written offset as objects
> size. If error handling was done right residual is subtracted from IO
> size and should be reflected here.
> 
> If IO was never attempted then object size did not grow and we revert
> in memory i_size here. (size is keept in two places and is checked
> for consistency in fsck)

So long as the _readpage has not altered the object state at all,
then this looks fine to me.

 
> TODO:
> 	eject short writes and see if this works correctly.
> 
> > 
> >> +	/* TODO: once simple_write_end marks inode dirty remove */
> >>  	if (i_size != inode->i_size)
> >>  		mark_inode_dirty(inode);
> >>  	return ret;
> > 
> > Hmm, I suppose simple_write_end probably should mark the inode dirty?
> > 
> 
> I think Christoph has a patch for that.
> 
> > 
> >> @@ -1335,28 +1339,25 @@ void exofs_delete_inode(struct inode *inode)
> >>  
> >>  	truncate_inode_pages(&inode->i_data, 0);
> >>  
> >> +	/* TODO: should do better here */
> >>  	if (is_bad_inode(inode))
> >>  		goto no_delete;
> >>  
> >> -	mark_inode_dirty(inode);
> >> -	exofs_update_inode(inode, inode_needs_sync(inode));
> >> -
> >>  	inode->i_size = 0;
> >> -	if (inode->i_blocks)
> >> -		exofs_truncate(inode);
> > 
> > This guy has gone missing -- I assume exofs_sbi_remove is a more
> > efficient way to do this anyway?
> > 
> 
> You see this is where exofs is different a file is an object_no on
> multiple OSD devices. The inode is kept as an attribute of the
> object. (data as object's data) so a exofs_sbi_remove will just
> obliterate any association to the object. It was historically
> called because exofs_truncate used to do what truncate_inode_pages
> does today. (And some other in memory book keeping.) But with
> your help all this was cleaned up.

OK, I was thinking the underlying object itself needs to be trimmed
to match i_size similarly to just a block based filesystem? Like
exofs_oi_truncate appears to.

 
> Do you see any operation I missed that might need cleaning from the
> generic VFS inode, that might now leak. As far as storage is concerned
> I'm covered.
> 
> [I ran git clone linux; rm -rf linux; 100 times in a loop and the OSD
>  storage stayed constant size. So I presume there is no storage leak.
>  OSD is good in this respect]

I can't see anything off hand. Was just flagging points where 
vmtruncate or truncate had been called and is not now. If you
have all those covered, then you should be OK.

 
> >> -
> >>  	clear_inode(inode);
> >>  
> >> -	ret = exofs_get_io_state(&sbi->layout, &ios);
> >> -	if (unlikely(ret)) {
> >> -		EXOFS_ERR("%s: exofs_get_io_state failed\n", __func__);
> >> -		return;
> >> -	}
> >> -
> >>  	/* if we are deleting an obj that hasn't been created yet, wait */
> >>  	if (!obj_created(oi)) {
> >>  		BUG_ON(!obj_2bcreated(oi));
> >>  		wait_event(oi->i_wq, obj_created(oi));
> >> +		/* ignore the error attempt a remove anyway */
> >> +	}
> >> +
> >> +	/* Now Remove the OSD objects */
> >> +	ret = exofs_get_io_state(&sbi->layout, &ios);
> >> +	if (unlikely(ret)) {
> >> +		EXOFS_ERR("%s: exofs_get_io_state failed\n", __func__);
> >> +		return;
> >>  	}
> >>  
> >>  	ios->obj.id = exofs_oi_objno(oi);
> >> -- 
> >> 1.6.6.1
> 
> Thanks for lookin. And thanks for making this patch possible. I wanted
> this cleaned, long ago, but it was only made easy and simple after your
> changes.

That's OK, thanks for helping with it.


  reply	other threads:[~2010-05-31 14:33 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-31 12:30 [RFC] exofs: New truncate sequence Boaz Harrosh
2010-05-31 13:44 ` Nick Piggin
2010-05-31 14:13   ` Boaz Harrosh
2010-05-31 14:33     ` Nick Piggin [this message]
2010-05-31 14:50       ` Boaz Harrosh
2010-05-31 15:09         ` Nick Piggin
2010-05-31 15:19           ` Boaz Harrosh
2010-06-01 10:08             ` Christoph Hellwig
2010-06-01 10:26               ` Boaz Harrosh
2010-06-01 10:44                 ` Christoph Hellwig
2010-06-01 11:05                   ` Boaz Harrosh
2010-06-01 11:06                     ` Christoph Hellwig
2010-06-01 10:28 ` [PATCH ver2] " Boaz Harrosh
2010-06-01 10:43   ` Christoph Hellwig
2010-06-01 10:59     ` Boaz Harrosh
2010-06-01 11:06       ` Christoph Hellwig
2010-06-01 11:31 ` [PATCH ver3] " Boaz Harrosh
2010-06-01 11:36   ` Christoph Hellwig
2010-06-01 11:52     ` Boaz Harrosh
2010-06-01 15:09   ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100531143337.GK9453@laptop \
    --to=npiggin@suse.de \
    --cc=bharrosh@panasas.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=osd-dev@open-osd.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.