[PATCH] handle start_unlink_transaction the same for an exceeded quota limit as an out of space error.

linux-btrfs.vger.kernel.org archive mirror

From: Kevin Brandstatter <icarusthecow@gmail.com>
To: Duncan <1i5t5.duncan@cox.net>, linux-btrfs@vger.kernel.org
Subject: [PATCH] handle start_unlink_transaction the same for an exceeded quota limit as an out of space error.
Date: Mon, 23 Jun 2014 18:36:07 -0500
Message-ID: <53A8B9E7.7060206@gmail.com>
In-Reply-To: <pan$d6c28$7dab67c8$466f1e49$ff456a63@cox.net>

---
 fs/btrfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0ec8766..41209e8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3751,10 +3751,10 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
    * 1 for the inode
    */
   trans = btrfs_start_transaction(root, 5);
-  if (!IS_ERR(trans) || PTR_ERR(trans) != -ENOSPC)
+  if (!IS_ERR(trans) || (PTR_ERR(trans) != -ENOSPC && PTR_ERR(trans) != -EDQUOT))
      return trans;
 
-  if (PTR_ERR(trans) == -ENOSPC) {
+  if (PTR_ERR(trans) == -ENOSPC || PTR_ERR(trans) == -EDQUOT) {
      u64 num_bytes = btrfs_calc_trans_metadata_size(root, 5);
 
      trans = btrfs_start_transaction(root, 0);
-- 
2.0.0
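
For illustration, the changed condition can be exercised in userspace.
This is only a minimal sketch, not kernel code: ERR_PTR(), PTR_ERR() and
IS_ERR() are re-implemented here in the style of include/linux/err.h, and
unlink_should_retry() is a hypothetical helper name that does not exist in
fs/btrfs/inode.c. It shows that -ENOSPC and -EDQUOT both select the
fallback path, while other errors and valid handles are returned as-is.

```c
/* Userspace sketch of the kernel error-pointer idiom; not kernel code. */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer, in the style of ERR_PTR(). */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical helper: true when the first transaction start failed in a
 * way the patch wants to retry with a zero-item reservation, i.e. either
 * out of space or over quota.
 */
static inline bool unlink_should_retry(const void *trans)
{
	if (!IS_ERR(trans))
		return false;
	return PTR_ERR(trans) == -ENOSPC || PTR_ERR(trans) == -EDQUOT;
}
```

With this helper, the early return in the patch corresponds to
!unlink_should_retry(trans), which is the negation of the second
condition.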

On 06/22/2014 08:53 PM, Duncan wrote:
> Kevin Brandstatter posted on Sun, 22 Jun 2014 12:56:30 -0500 as excerpted:
>
>> One thing I note is that I can unlink from a full filesystem.
>> I tested it by writing a file until the device ran out of space and
>> then removing it with rm, the same method that I used to cause the disk
>> quota error, and the file was removed without issue.
> It's worth noting that due to the btrfs separation between data and 
> metadata and the fact that btrfs space allocation happens in two steps 
> but it can only automatically free one of them (with a rebalance normally 
> used to deal with the other), there are three different kinds of "full 
> filesystem", (1) "all space chunk-allocated", which isn't yet /entirely/ 
> full but means a significant loss of flexibility in filling up the rest, 
> (2) "all space chunk-allocated and metadata space ran out of room first 
> but there's still room in the data chunks", which is what happens most of 
> the time in normal usage, and (3) "all space chunk-allocated and data 
> space ran out first but there's still room in the metadata chunks", which 
> can produce decidedly non-intuitive behavior for people used to standard 
> filesystem behavior.
>
> Data/metadata chunk allocation is only one-way.  Once a chunk is 
> allocated to one or the other, the system cannot (yet) reallocate chunks 
> of one type to the other without a rebalance, so once all previously 
> unallocated space is allocated to either data or metadata chunks, it's 
> only a matter of time until one or the other runs out.
>
> In normal usage with a significant amount of file deletion, the spread 
> between data chunk allocation and actual usage tends to get rather large, 
> because file deletion normally frees much more data space than it does 
> metadata.  As such, the most common out-of-space condition is all 
> unallocated space gone, with most of the still actually unused space 
> allocated to data and thus not available to be used for metadata, such 
> that metadata space runs out first.
>
> When metadata space runs out, normal df will likely still report a decent 
> amount of space remaining, but btrfs filesystem df combined with btrfs 
> filesystem show will reveal that it's all locked up in data chunks -- a 
> big spread, often multiple gigabytes between data used and total (which 
> given the 1 GiB data chunk size means multiple data chunks could be 
> freed), a much smaller spread between metadata used and total (the system 
> reserves some metadata space, typically 200-ish MiB, so it should never 
> show as entirely gone, even when it's triggering ENOSPC).
>
> But due to COW, even file deletion requires available metadata space in 
> order to create the new/modified copy of the (normally 4-16 KiB 
> depending on mkfs.btrfs age and parameters supplied) metadata block, and 
> if there's no metadata space left and no more unallocated space to 
> allocate, the result is ENOSPC even on file deletion!
>
> OTOH, in use-cases where there is little file deletion, the spread 
> between data chunk total and data chunk used tends to be much smaller, 
> and it can happen that there's still free metadata chunk space when the 
> last free data space is used and another data chunk needs to be allocated, but 
> there's no more unallocated space to allocate.  Of course btrfs 
> filesystem df (to see how allocated space is used) in combination with 
> btrfs filesystem show (to see whether all space is allocated) should tell 
> the story, in this case, reporting all or nearly all data space used but 
> a larger gap (> 200 MiB) between metadata total and used.
>
> This triggers a much more interesting and non-intuitive failure mode.  In 
> particular, because there's still metadata space available, attempts to 
> create a new file will succeed, but actually putting significant content 
> in that file will fail, often resulting in the creation of zero-length 
> files that won't accept data!  However, because btrfs stores very small 
> files (generally something under 16 KiB, the precise size depends on 
> filesystem parameters) entirely within metadata without actually 
> allocating a data extent for them, attempts to copy small enough files 
> will generally succeed as well -- as long as they're small enough to fit 
> in metadata only and not require a data allocation.
>
> Now I don't deal with quotas here and thus haven't looked into how quotas 
> account for metadata in particular, but it's worth noting that your 
> "write a file until there's no more space" test could well have triggered 
> the latter, all space chunk-allocated and data filled up first, 
> condition.  If that's the case, deleting a file wouldn't be a problem 
> because there's metadata space still available to record the deletion.  
> As I said above, another characteristic would be that attempts to create 
> new files and fill them with data (> 16 KiB at a time) would result in 
> zero-length files, as there's metadata space available to create them, 
> but no data space available to fill them.
>
> So your test may have been testing an *ENTIRELY* different failure 
> condition!
>
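
The three "full filesystem" cases described in the quoted text can be
told apart from the numbers that btrfs filesystem show and btrfs
filesystem df report. The following is a rough userspace sketch under
stated assumptions, not how btrfs itself decides anything: struct
space_info, classify() and the enum names are hypothetical, the 200 MiB
metadata reserve is taken from the "200-ish MiB" figure above, and any
nonzero unallocated space is treated as "a new chunk can still be
allocated", which is a simplification.

```c
/* Userspace sketch; all names here are hypothetical, not btrfs APIs. */
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)
#define GIB (1024ULL * MIB)
/* Assumption: roughly 200 MiB of metadata is held back as a reserve. */
#define METADATA_RESERVE (200ULL * MIB)

enum full_kind {
	NOT_FULL,         /* unallocated space remains for new chunks */
	CHUNKS_ALLOCATED, /* (1) all space chunk-allocated, room in both types */
	METADATA_FULL,    /* (2) metadata ran out first, data chunks have room */
	DATA_FULL,        /* (3) data ran out first, metadata chunks have room */
};

/* Byte counts as read off "filesystem show" and "filesystem df". */
struct space_info {
	uint64_t unallocated;
	uint64_t data_total, data_used;
	uint64_t meta_total, meta_used;
};

static inline enum full_kind classify(const struct space_info *s)
{
	uint64_t data_free = s->data_total - s->data_used;
	uint64_t meta_free = s->meta_total - s->meta_used;

	if (s->unallocated > 0)
		return NOT_FULL;
	/* Metadata is checked first: without it even unlink hits ENOSPC. */
	if (meta_free <= METADATA_RESERVE)
		return METADATA_FULL;
	if (data_free == 0)
		return DATA_FULL;
	return CHUNKS_ALLOCATED;
}
```

Under these assumptions, a "write until ENOSPC" test that ends in
DATA_FULL would still allow rm to succeed, whereas METADATA_FULL would
not.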
