linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Roman Mamedov <rm@romanrm.net>
Cc: Timofey Titovets <nefelim4ag@gmail.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Is it possible to speed up unlink()?
Date: Thu, 20 Oct 2016 11:49:20 -0400
Message-ID: <63eb8a0b-d664-1907-8ad1-7bec19ff80fa@gmail.com>
In-Reply-To: <20161020202635.57c8a72b@natsu>

On 2016-10-20 11:26, Roman Mamedov wrote:
> On Thu, 20 Oct 2016 08:09:14 -0400
> "Austin S. Hemmelgarn" <ahferroin7@gmail.com> wrote:
>
>>> So, it's possible to return unlink() early? or this a bad idea(and why)?
>> I may be completely off about this, but I could have sworn that unlink()
>> returns when enough info is on the disk that both:
>> 1. The file isn't actually visible in the directory.
>> 2. If the system crashes, the filesystem will know to finish the cleanup.
>
> As I understand it there is no fundamental reason why rm of a heavily
> fragmented file couldn't be exactly as fast as deleting a subvolume with
> only that single file in it. Remove the directory reference and instantly
> return success to userspace, continuing to clean up extents in the background.
The tree cleanup is actually a bit easier for a subvolume since it's the
root of its own tree.  This in turn means that less needs to be written
to delete a subvolume containing a single file than to delete that file
by itself, since the write doesn't propagate up quite as many trees.

The thing is though that since the NFS export is set to async mode, the 
unlink should return almost immediately anyway.

The other issue is that the type of file in question is a pathological
case for any COW filesystem, not just BTRFS, and this behavior is pretty
well understood.  Once you get past about 8G for a VM image on BTRFS, you
need to either use real block storage (LVM or something similar, with the
image exported using something like iSCSI or NBD), make absolutely certain
the file is pre-allocated and marked NOCOW, or use a split file format.
>
> However for many uses that could be counter-productive, as scripts might
> expect the disk space to be freed up completely after the rm command returns
> (as they might need to start filling up the partition with new data).
'Might' is an understatement; scripts _do_ expect the disk space to be
freed immediately, and this has caused a number of issues with various
tools on BTRFS.  It's also an issue because just about everything
expects unlink() to be functionally synchronous (i.e., once unlink() has
returned, it shouldn't have an impact on other operations).
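
To see what a script actually observes, something like the following
rough sketch can be used.  It times the unlink() call and compares the
free space reported by statvfs() just before and just after it.  The
name timed_unlink is made up for this example, and the numbers will
depend heavily on the filesystem and on how fragmented the file is.

```python
import os
import tempfile
import time


def timed_unlink(path):
    """Unlink path; return (seconds spent in unlink, change in free bytes)."""
    directory = os.path.dirname(os.path.abspath(path))
    before = os.statvfs(directory)
    start = time.monotonic()
    os.unlink(path)
    elapsed = time.monotonic() - start
    after = os.statvfs(directory)
    # Free space available to unprivileged users, converted to bytes.
    freed = (after.f_bavail - before.f_bavail) * after.f_frsize
    return elapsed, freed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        victim = os.path.join(tmp, "victim.img")
        with open(victim, "wb") as f:
            f.write(b"\0" * (4 * 1024 * 1024))
            f.flush()
            os.fsync(f.fileno())
        elapsed, freed = timed_unlink(victim)
        print(f"unlink() took {elapsed:.6f}s, free space changed by {freed} bytes")
```

A script that starts writing new data right after the call returns is
implicitly relying on the second number already reflecting the deleted
file.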
>
> In snapshot deletion there are various commit modes built in for that purpose,
> but I'm not sure if you can easily extend POSIX file deletion to implement
> synchronous and non-synchronous deletion modes.
There isn't a way to do that.  In theory it could be implemented as a
mount option, but even that gets risky for the same reason that
implementing it globally is potentially problematic.
>
> * Try the 'unlink' program instead of 'rm'; if "just remove the dir entry for
>   now" was implemented anywhere, I'd expect it to be via that.
'rm' just puts a nice UI on the unlink() call and 'unlink' just calls it
directly, so I seriously doubt that it will have any impact.
> * Try doing 'eatmydata rm', but that's more of a crazy idea than anything else,
>   as eatmydata only affects fsyncs, and I don't think rm is necessarily
>   invoking those.
It isn't, so this almost certainly won't help.
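
To illustrate the point about 'rm' versus 'unlink', here is a toy
sketch of both tools.  The names unlink_tool and rm_tool are invented
for this example and only cover a tiny subset of what the real programs
do (the real rm may use the closely related unlinkat() call), but both
end up in the same place.

```python
import os
import sys


def unlink_tool(path):
    """Like unlink(1): one path, one unlink() call, no options."""
    os.unlink(path)


def rm_tool(paths, force=False):
    """A tiny subset of rm(1): loop over paths, optionally ignore missing ones.

    Returns the number of paths that could not be removed.
    """
    failures = 0
    for path in paths:
        try:
            os.unlink(path)  # the same system call unlink_tool() makes
        except FileNotFoundError:
            if not force:
                print(f"rm: cannot remove {path!r}: no such file",
                      file=sys.stderr)
                failures += 1
    return failures
```

Neither version calls fsync(), which is also why wrapping the command
in eatmydata would have nothing to intercept.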

