public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Iustin Pop <iusty@k1024.org>
To: David Chinner <dgc@sgi.com>
Cc: Ruben Porras <nahoo82@gmail.com>, xfs@oss.sgi.com, cw@f00f.org
Subject: Re: XFS shrink functionality
Date: Fri, 8 Jun 2007 18:03:18 +0200	[thread overview]
Message-ID: <20070608160318.GA25579@teal.hq.k1024.org> (raw)
In-Reply-To: <20070608151223.GF86004887@sgi.com>

On Sat, Jun 09, 2007 at 01:12:23AM +1000, David Chinner wrote:
> > I took a look at both items since this discussion started. And honestly,
> > I think 1) is harder that 4), so you're welcome to work on it :) The
> > points that make it harder is that, per David's suggestion, there needs
> > to be:
> >  - define two new transaction types
> 
> one new transaction type:
> 
> XFS_TRANS_AGF_FLAGS
> 
> and and extension to xfs_alloc_log_agf(). Is about all that is
> needed there.
> 
> See the patch here:
> 
> http://oss.sgi.com/archives/xfs/2007-04/msg00103.html

Ah, I see now. I was wondering how one can enable the new bits (CVS
xfs_db shows the btreeblks but 'version' cmd doesn't allow to change
them), it seems that manual xfs_db work + xfs_repair allows them.

> For an example of a very simlar transaction to what is needed
> (look at xfs_log_sbcount()) and very similar addition to
> the AGF (xfs_btreeblks).
Just a question: why do you think this per-ag-bit to be persistent? I'm
just curious. When I first thought about this, I was thinking more like
this should be an in-core flag only, like the freeze flag is for the
filesystem. The idea being that you don't need to recover this state
after a crash - there is no actual state, just restart the shrink
operation if you want. And no actual filesystem state (e.g. space
allocation or such) is happenning when you toggle the AGs not
allocatable. This would allow a much simpler implementation of the
'no-alloc' part.

> >  - update the ondisk-format (!), if we want persistence of these flags;
> >    luckily, there are two spare fields in the AGF structure.
> 
> Better to expand, I think. The AGF is a sector in length - we can
> expand the structure as we need to this size without fear, esp. as
> the part of the sector outside the structure is guaranteed to be
> zero.  i.e. we can add a fields flag to the end of the AGF
> structure - old filesystems simple read as "no flags set" and
> old kernels never look at those bits....
Yes, makes sense. Just to make sure: the xfs_agf_t, xfs_agi_t and
xfs_sb_t structures as defined in xfs_sb.h and xfs_ag.h are what
actually is on-disk, right? Adding to them, defining the new bits i.e.
XFS_AGF_FLAGS and bumping up XFS_AGF_ALL_BITS should take care of the
on-disk part?

> > Open questions (re. point 4):
> >  - the filesystem document says the agf->agf_btreeblks is held only in
> >    case we have an extended flag active for the filesystem
> >    (XFS_SB_VERSION2_LAZYSBCOUNTBIT); is this true? without this, I'm not
> >    sure how to calculate this number of blocks nicely
> 
> Yes, that is true. There's a pre-req for shrinking for the moment :/
> 
> >  - or can I assume that an empty AG will *always* have agf_levels = 1
> >    for both Btrees, so there are no extra blocks actually used for the
> >    btrees (except for the two reserved ones at the beggining of the AG
> 
> Yes, that is a valid assumption.
Ok, perfect. This then eliminates the need for LAZYSBCOUNTBIT. Just one
more question: can I *read* from the mp->m_perag structure or do I need
a lock (even for read), i.e. down_read, read the fields, up_read? (As
you can see, I don't have much experience w.r.t. kernel programming).

> >  - can I assume that an AG with agi->icount == agi->ifree == 0 will have
> >    no blocks used for the inode btrees (logically yes, but I'm not sure)
> 
> yes.

Good.

Thanks for your explanations. Patch for shrink if the AGs are empty will
be simpler and nicer then as opposed to what I have now.

iustin

  reply	other threads:[~2007-06-08 16:04 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-01 16:39 XFS shrink functionality Ruben Porras
2007-06-04  0:16 ` David Chinner
2007-06-04  8:41   ` Iustin Pop
2007-06-04  9:21     ` David Chinner
2007-06-05  8:00       ` Iustin Pop
2007-06-06  1:50         ` Nathan Scott
2007-06-07  8:18         ` David Chinner
2007-06-08  8:23     ` Ruben Porras
2007-06-08 10:15       ` Iustin Pop
2007-06-08 15:12         ` David Chinner
2007-06-08 16:03           ` Iustin Pop [this message]
2007-06-09  2:15             ` David Chinner
2007-06-08 19:47           ` Ruben Porras
2007-06-14  8:35           ` Ruben Porras
2007-06-14  9:14             ` David Chinner
2007-06-08 14:44       ` David Chinner
2007-06-19 22:22   ` XFS shrink (step 0) Ruben Porras
2007-06-19 23:42     ` David Chinner
2007-06-28 10:38       ` Ruben Porras
2007-06-29  6:55         ` David Chinner
2007-07-30 17:30           ` Ruben Porras

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070608160318.GA25579@teal.hq.k1024.org \
    --to=iusty@k1024.org \
    --cc=cw@f00f.org \
    --cc=dgc@sgi.com \
    --cc=nahoo82@gmail.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox