public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v3 04/11] xfs: update inode allocation/free transaction reservations for finobt
Date: Fri, 21 Feb 2014 07:50:16 +1100	[thread overview]
Message-ID: <20140220205016.GN4916@dastard> (raw)
In-Reply-To: <53064E26.2050607@redhat.com>

On Thu, Feb 20, 2014 at 01:49:10PM -0500, Brian Foster wrote:
> On 02/19/2014 09:01 PM, Dave Chinner wrote:
> > [patched in the extra case from your subsequent reply]
> > 
> > On Tue, Feb 18, 2014 at 12:10:16PM -0500, Brian Foster wrote:
> >> On 02/11/2014 01:46 AM, Dave Chinner wrote:
> >>> On Tue, Feb 04, 2014 at 12:49:35PM -0500, Brian Foster wrote:
> >>>> Create the xfs_calc_finobt_res() helper to calculate the finobt log
> >>>> reservation for inode allocation and free. Update
> >>>> XFS_IALLOC_SPACE_RES() to reserve blocks for the additional finobt
> >>>> insertion on inode allocation. Create XFS_IFREE_SPACE_RES() to
> >>>> reserve blocks for the potential finobt record insertion on inode
> >>>> free (i.e., if an inode chunk was previously fully allocated).
> >>>>
> >>>> Signed-off-by: Brian Foster <bfoster@redhat.com>
> >>>> ---
> >>>>  fs/xfs/xfs_inode.c       |  4 +++-
> >>>>  fs/xfs/xfs_trans_resv.c  | 47 +++++++++++++++++++++++++++++++++++++++++++----
> >>>>  fs/xfs/xfs_trans_space.h |  7 ++++++-
> >>>>  3 files changed, 52 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> >>>> index 001aa89..57c77ed 100644
> >>>> --- a/fs/xfs/xfs_inode.c
> >>>> +++ b/fs/xfs/xfs_inode.c
> >>>> @@ -1730,7 +1730,9 @@ xfs_inactive_ifree(
> >>>>  	int			error;
> >>>>  
> >>>>  	tp = xfs_trans_alloc(mp, XFS_TRANS_INACTIVE);
> >>>> -	error = xfs_trans_reserve(tp, &M_RES(mp)->tr_ifree, 0, 0);
> >>>> +	tp->t_flags |= XFS_TRANS_RESERVE;
> >>>> +	error = xfs_trans_reserve(tp, &M_RES(mp)->tr_ifree,
> >>>> +				  XFS_IFREE_SPACE_RES(mp), 0);
> >>>
> >>> Can you add a comment explaining why the XFS_TRANS_RESERVE flag is
> >>> used here, and why it's use won't lead to accelerated reserve pool
> >>> depletion?
.....
> >> Perhaps another argument could be made that it's rather unlikely we run
> >> into an fs with as many 0-sized (or sub-inode chunk sized) files as
> >> required to deplete the reserve pool without freeing any space, and we
> >> should just touch up the failure handling. E.g.,
> >>
> >> 1.) Continue to reserve enable the ifree transaction. Consider expanding
> >> the reserve pool on finobt-enabled fs' if appropriate. Note that this is
> >> not guaranteed to provide enough resources to populate the finobt to the
> >> level of the inobt without freeing up more space.
> >> 2.) Attempt a !XFS_TRANS_RESERVE tp reservation in xfs_inactive_ifree().
> >> If fails, xfs_warn()/notice() and enable XFS_TRANS_RESERVE.
> >> 3.) Attempt XFS_TRANS_RESERVE reservation. If fails, xfs_notice() and
> >> shutdown.
> > 
> > I don't think we ned to shut down. Indeed, there's no point in doing
> > an !XFS_TRANS_RESERVE in the first place because a warning will just
> > generate unnecessary noise in the logs.
> > 
> > Realistically, we can leave inodes on the unlinked list
> > indefinitely without causing any significant problems except for
> > there being used space that users can't account for from the
> > namespace. Log recovery cleans them up when it runs, or blows away
> > the unlinked list when it fails, and that results in leaked inodes.
> > If we get to that point, xfs-repair will clean it up just fine
> > unless there's still not enough space. At that point, it's not a
> > problem we can solve with tools - the user has to free up some space
> > in the filesystem....
> 
> Ok, the current failure behavior (as unlikely as it seems to hit) seems
> less hasty given the roadmap for improved unlinked list management.
> 
> I'm not sure how log recovery plays into things unless there is a crash.
> In my experiments, the inodes are simply never freed and linger on the
> unlinked list until repair. Repair moves everything to lost+found as
> opposed to freeing (I presume since the inodes are still "allocated"
> after all), but repairs the fs nonetheless.

With the recovery-without-a-crash case, the recovery of unlinked
inodes happens in the second phase of recovery (i.e via
xlog_recover_process_iunlinks() in xlog_recover_finish()). This is
after we've done the actual log recovery (so that unlinked list
changes have been recovered), and so effectively in not so much a
part of "log recovery" but a part of "unclean shutdown cleanup".
That currently is associates with log recovery, but there's not
reason why we couldn't just do it unconditionally on a mount. If we
are going to leave inodes on the unlinked lists and allocate
directly from there, then it kind of makes sense to have the
unlinked lists cleaned up at mount time, even if it is only by
kicking a background cleaner thread...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2014-02-20 20:50 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-04 17:49 [PATCH v3 00/11] xfs: introduce the free inode btree Brian Foster
2014-02-04 17:49 ` [PATCH v3 01/11] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers Brian Foster
2014-02-04 17:49 ` [PATCH v3 02/11] xfs: reserve v5 superblock read-only compat. feature bit for finobt Brian Foster
2014-02-11  6:07   ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 03/11] xfs: support the XFS_BTNUM_FINOBT free inode btree type Brian Foster
2014-02-11  6:22   ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 04/11] xfs: update inode allocation/free transaction reservations for finobt Brian Foster
2014-02-11  6:46   ` Dave Chinner
2014-02-11 16:22     ` Brian Foster
2014-02-20  1:00       ` Dave Chinner
2014-02-20 16:04         ` Brian Foster
2014-02-18 17:10     ` Brian Foster
2014-02-18 20:34       ` Brian Foster
2014-02-20  2:01       ` Dave Chinner
2014-02-20 18:49         ` Brian Foster
2014-02-20 20:50           ` Dave Chinner [this message]
2014-02-20 21:14           ` Christoph Hellwig
2014-02-20 23:13             ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 05/11] xfs: insert newly allocated inode chunks into the finobt Brian Foster
2014-02-11  6:48   ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 06/11] xfs: use and update the finobt on inode allocation Brian Foster
2014-02-11  7:17   ` Dave Chinner
2014-02-11 16:32     ` Brian Foster
2014-02-14 20:01     ` Brian Foster
2014-02-20  0:38       ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 07/11] xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper Brian Foster
2014-02-11  7:19   ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 08/11] xfs: update the finobt on inode free Brian Foster
2014-02-11  7:31   ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 09/11] xfs: add finobt support to growfs Brian Foster
2014-02-04 17:49 ` [PATCH v3 10/11] xfs: report finobt status in fs geometry Brian Foster
2014-02-11  7:34   ` Dave Chinner
2014-02-04 17:49 ` [PATCH v3 11/11] xfs: enable the finobt feature on v5 superblocks Brian Foster
2014-02-11  7:34   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140220205016.GN4916@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox