From: Nick Piggin <npiggin@suse.de>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@lst.de>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
open-osd <osd-dev@open-osd.org>
Subject: Re: [RFC] exofs: New truncate sequence
Date: Tue, 1 Jun 2010 00:33:37 +1000
Message-ID: <20100531143337.GK9453@laptop>
In-Reply-To: <4C03C40E.2090800@panasas.com>

On Mon, May 31, 2010 at 05:13:34PM +0300, Boaz Harrosh wrote:
> On 05/31/2010 04:44 PM, Nick Piggin wrote:
> > On Mon, May 31, 2010 at 03:30:02PM +0300, Boaz Harrosh wrote:
> >> ---
> >> fs/exofs/exofs.h | 1 -
> >> fs/exofs/file.c | 1 -
> >> fs/exofs/inode.c | 115 +++++++++++++++++++++++++++---------------------------
> >
> > Can you rip out all the rest of the buffer_head stuff too?
> >
>
> I hope I don't have any left, that was the last, have I missed
> something?
exofs_invalidatepage, exofs_releasepage, includes of buffer_head.h.
No point to any of that if you never actually map the buffers or
use them for tracking state yourself.
> >> @@ -750,6 +760,10 @@ static int exofs_write_end(struct file *file, struct address_space *mapping,
> >> int ret;
> >>
> >> ret = simple_write_end(file, mapping,pos, len, copied, page, fsdata);
> >> + if (unlikely(ret && pos + len > inode->i_size))
> >> + truncate_pagecache(inode, pos + len, inode->i_size);
> >> +
> >
> > So there is no need to do any oi_truncate? Even if _readpage in
> > write_begin has set up some blocks?
> >
>
> No blocks are set up. Exofs does not have any blocks.
Sure, I was just using blocks as a placeholder for whatever you call
it.
> If the write
> failed then the object on OSD has the last written offset as objects
> size. If error handling was done right residual is subtracted from IO
> size and should be reflected here.
>
> If IO was never attempted then the object size did not grow and we revert
> the in-memory i_size here. (The size is kept in two places and is checked
> for consistency in fsck.)
So long as the _readpage has not altered the object state at all,
then this looks fine to me.
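
To make the short-write case concrete, here is a minimal userspace sketch of
the rollback being discussed. It is not exofs or VFS code: toy_inode,
toy_truncate_pagecache and toy_write_end are hypothetical names, and the
model assumes that write_begin has already populated the cache up to
pos + len and that only bytes actually copied may grow i_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are exofs or VFS API. */
struct toy_inode {
	size_t i_size;     /* in-memory file size */
	size_t cached_end; /* end offset of data held in the toy page cache */
};

/* Drop cached data beyond new_size (a stand-in for trimming the page cache). */
static void toy_truncate_pagecache(struct toy_inode *inode, size_t new_size)
{
	if (inode->cached_end > new_size)
		inode->cached_end = new_size;
}

/*
 * Finish a write of `len` bytes at `pos`, of which only `copied` bytes
 * were really transferred. Returns the number of bytes accepted.
 */
static size_t toy_write_end(struct toy_inode *inode, size_t pos,
			    size_t len, size_t copied)
{
	assert(copied <= len);

	/* Assumption: write_begin instantiated cache up to pos + len. */
	if (pos + len > inode->cached_end)
		inode->cached_end = pos + len;

	/* Only bytes that were actually copied may extend the size. */
	if (copied > 0 && pos + copied > inode->i_size)
		inode->i_size = pos + copied;

	/* Short or failed write: discard cache that now lies past i_size. */
	if (copied < len && pos + len > inode->i_size)
		toy_truncate_pagecache(inode, inode->i_size);

	return copied;
}
```

In this model a fully failed write (copied == 0) leaves i_size untouched and
trims the cache back to it, which matches the "revert" behaviour described
above as long as the read side has not changed the object itself.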
> TODO:
> inject short writes and see if this works correctly.
>
> >
> >> + /* TODO: once simple_write_end marks inode dirty remove */
> >> if (i_size != inode->i_size)
> >> mark_inode_dirty(inode);
> >> return ret;
> >
> > Hmm, I suppose simple_write_end probably should mark the inode dirty?
> >
>
> I think Christoph has a patch for that.
>
> >
> >> @@ -1335,28 +1339,25 @@ void exofs_delete_inode(struct inode *inode)
> >>
> >> truncate_inode_pages(&inode->i_data, 0);
> >>
> >> + /* TODO: should do better here */
> >> if (is_bad_inode(inode))
> >> goto no_delete;
> >>
> >> - mark_inode_dirty(inode);
> >> - exofs_update_inode(inode, inode_needs_sync(inode));
> >> -
> >> inode->i_size = 0;
> >> - if (inode->i_blocks)
> >> - exofs_truncate(inode);
> >
> > This guy has gone missing -- I assume exofs_sbi_remove is a more
> > efficient way to do this anyway?
> >
>
> You see, this is where exofs is different: a file is an object_no on
> multiple OSD devices. The inode is kept as an attribute of the
> object (and the data as the object's data), so an exofs_sbi_remove will
> just obliterate any association to the object. exofs_truncate was
> historically called here because it used to do what truncate_inode_pages
> does today (and some other in-memory bookkeeping). But with
> your help all this was cleaned up.
OK, I was thinking the underlying object itself needs to be trimmed
to match i_size, just as in a block-based filesystem, like
exofs_oi_truncate appears to do?
> Do you see any operation I missed that might need cleaning from the
> generic VFS inode, and that might now leak? As far as storage is concerned
> I'm covered.
>
> [I ran git clone linux; rm -rf linux; 100 times in a loop and the OSD
> storage stayed constant size. So I presume there is no storage leak.
> OSD is good in this respect]
I can't see anything offhand. I was just flagging points where
vmtruncate or truncate used to be called and no longer is. If you
have all those covered, then you should be OK.
> >> -
> >> clear_inode(inode);
> >>
> >> - ret = exofs_get_io_state(&sbi->layout, &ios);
> >> - if (unlikely(ret)) {
> >> - EXOFS_ERR("%s: exofs_get_io_state failed\n", __func__);
> >> - return;
> >> - }
> >> -
> >> /* if we are deleting an obj that hasn't been created yet, wait */
> >> if (!obj_created(oi)) {
> >> BUG_ON(!obj_2bcreated(oi));
> >> wait_event(oi->i_wq, obj_created(oi));
> >> + /* ignore the error attempt a remove anyway */
> >> + }
> >> +
> >> + /* Now Remove the OSD objects */
> >> + ret = exofs_get_io_state(&sbi->layout, &ios);
> >> + if (unlikely(ret)) {
> >> + EXOFS_ERR("%s: exofs_get_io_state failed\n", __func__);
> >> + return;
> >> }
> >>
> >> ios->obj.id = exofs_oi_objno(oi);
> >> --
> >> 1.6.6.1
>
> Thanks for looking. And thanks for making this patch possible. I wanted
> this cleaned, long ago, but it was only made easy and simple after your
> changes.
That's OK, thanks for helping with it.