public inbox for linux-xfs@vger.kernel.org
From: "Alex Lyakas" <alex@zadarastorage.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: Questions about XFS discard and xfs_free_extent() code (newbie)
Date: Tue, 24 Dec 2013 20:21:50 +0200
Message-ID: <8155F3F9D6094F40B4DA71BD561D2DE8@alyakaslap>
In-Reply-To: <20131219105513.GZ31386@dastard>

Hi Dave,
Reading through the code some more, I see that the extent freed
through xfs_free_extent() can be an XFS metadata extent as well.
For example, xfs_inobt_free_block() frees a block of the AG's inode
allocation btree. Also, xfs_bmbt_free_block() frees a bmap btree block by
putting it onto the cursor's "to-be-freed" list, which will be dropped into
the free-space btree (by xfs_free_extent) in xfs_bmap_finish(). If we discard
such a metadata block before the transaction is committed to the log and we
crash, we might not be able to mount properly after reboot, is that right? I
mean, it's not just that some file's data block will show 0s to the user
instead of the pre-delete data; some XFS btree node (for example) will be
wiped in such a case. Can this happen?
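
To make the two-step free described above concrete, here is a toy Python sketch. It is not XFS code: ToyCursor, free_btree_block() and finish() are hypothetical names that only loosely stand in for the cursor's to-be-freed list and the later pass that moves queued blocks into the free-space index.

```python
class ToyCursor:
    """Hypothetical stand-in for a btree cursor carrying a to-be-freed list."""

    def __init__(self):
        self.to_be_freed = []


def free_btree_block(cursor, blockno):
    # Step 1: only queue the block; the free-space index is not touched yet.
    cursor.to_be_freed.append(blockno)


def finish(cursor, free_space):
    # Step 2: later, move every queued block into the free-space index.
    while cursor.to_be_freed:
        free_space.add(cursor.to_be_freed.pop(0))


free_space = set()
cur = ToyCursor()
free_btree_block(cur, 17)
free_btree_block(cur, 42)

queued_before = list(cur.to_be_freed)   # blocks waiting on the cursor
free_before = set(free_space)           # still empty at this point

finish(cur, free_space)                 # now both blocks count as free
```

The point of the model is only that a metadata block sits on the list for a while before it is actually recorded as free, so a discard issued at queue time would run ahead of the bookkeeping.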

Thanks,
Alex.


-----Original Message----- 
From: Dave Chinner
Sent: 19 December, 2013 12:55 PM
To: Alex Lyakas
Cc: xfs@oss.sgi.com
Subject: Re: Questions about XFS discard and xfs_free_extent() code (newbie)

On Thu, Dec 19, 2013 at 11:24:15AM +0200, Alex Lyakas wrote:
> Hi Dave,
> Thank you for your comments.
> I realize now that what I proposed cannot be done; I need to
> understand more deeply how XFS transactions work (unfortunately, the
> awesome "XFS Filesystem Structure" doc has a TODO in the "Journaling
> Log" section).
>
> Can you please comment on one more question:
> Let's say we had such a fully asynchronous "fire-and-forget" discard
> operation (I can implement one myself for my block device via a
> custom IOCTL). What is wrong if we trigger such an operation in
> xfs_free_ag_extent(), right after we have merged the freed extent
> into a bigger one? I understand that the extent-free-intent is not
> yet committed to the log at this point. But from the user's point of
> view, the extent has been deleted, no? So if the underlying block
> device discards the merged extent right away, before committing to
> the log, what issues can this cause?

Think of what happens when a crash occurs immediately after the
discard completes. The freeing of the extent never made it to the
log, so after recovery, the file still exists and the user can
access it. Except that its contents are now all different to
before the crash occurred.

IOWs, issuing the discard before the transaction that frees the
extent is on stable storage means we are discarding user data or
metadata before we've guaranteed that the extent free transaction
is permanent and that means we violate certain guarantees with
respect to crash recovery...
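
The ordering argument above can be sketched as a toy Python model. Everything here is an assumption made for illustration: ToyDisk, recover() and crash_after() are hypothetical, the "log" is just a list of records that reached stable storage, and a discard is modelled as the block reading back as zeroes.

```python
from dataclasses import dataclass, field

ZERO = b"\x00" * 4


@dataclass
class ToyDisk:
    """Hypothetical model: one data block plus an on-disk log."""
    block: bytes = b"DATA"
    log: list = field(default_factory=list)  # records on stable storage

    def discard(self):
        # Assumption: a discarded block reads back as zeroes.
        self.block = ZERO

    def commit_free(self):
        # The extent-free transaction has reached stable storage.
        self.log.append("free-extent")


def recover(disk):
    """After a crash, the file still exists unless its free was logged."""
    file_exists = "free-extent" not in disk.log
    return file_exists, disk.block


def crash_after(steps):
    """Run the given steps in order, then 'crash' and run recovery."""
    disk = ToyDisk()
    for step in steps:
        step(disk)
    return recover(disk)


# Discard first, crash before the log write: the file survives, its data does not.
unsafe = crash_after([ToyDisk.discard])
# Log write first, crash before the discard: the file is gone, nothing was lost.
safe_1 = crash_after([ToyDisk.commit_free])
# Log write, then discard: the file is gone, so the block contents no longer matter.
safe_2 = crash_after([ToyDisk.commit_free, ToyDisk.discard])
```

Only the first ordering leaves recovery with a file that still exists but whose block has been wiped, which is the case the paragraph above warns about.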

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com 

