public inbox for linux-kernel@vger.kernel.org
* Deleting large files
@ 2008-05-07 19:49 Morten Welinder
  2008-05-07 20:10 ` Jan Engelhardt
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Morten Welinder @ 2008-05-07 19:49 UTC (permalink / raw)
  To: linux-kernel

Hi there,

deleting large files, say on the order of 4.6GB, takes approximately forever.
Why is that?  Well, it is because a lot of things need to take place to free
the formerly used space, but my real question is "why does the unlink caller
have to wait for it?"

I.e., could unlink do the directory changes and then hand off the rest of the
task to a kernel thread?

Morten

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Deleting large files
  2008-05-07 19:49 Deleting large files Morten Welinder
@ 2008-05-07 20:10 ` Jan Engelhardt
  2008-05-07 20:17   ` Xavier Bestel
  2008-05-07 22:34 ` linux-os (Dick Johnson)
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Jan Engelhardt @ 2008-05-07 20:10 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-kernel


On Wednesday 2008-05-07 21:49, Morten Welinder wrote:

>Hi there,
>
>deleting large files, say on the order of 4.6GB, takes approximately forever.
>Why is that?  Well, it is because a lot of things need to take place to free
>the formerly used space,

Only for a few filesystems which use that sort of housekeeping.

>but my real question is "why does the unlink caller have to wait for it?"

Same reason your shell waits for your program to complete before
showing the prompt again?

>I.e., could unlink do the directory changes and then hand off the rest of the
>task to a kernel thread?

Say you had one realtime application running that would do lots of new
writes after the unlink finished. When the unlink is put into the
background, you interleave the unlink operation with new writes,
probably causing needless seeks and therefore not hitting the deadlines
anymore.

For your desktop use, `rm -f foobar.avi &` should do. No?


* Re: Deleting large files
  2008-05-07 20:10 ` Jan Engelhardt
@ 2008-05-07 20:17   ` Xavier Bestel
  2008-05-07 20:48     ` Jan Engelhardt
  0 siblings, 1 reply; 16+ messages in thread
From: Xavier Bestel @ 2008-05-07 20:17 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Morten Welinder, linux-kernel

On Wednesday, 7 May 2008 at 22:10 +0200, Jan Engelhardt wrote:
> 
> >I.e., could unlink do the directory changes and then hand off the rest of the
> >task to a kernel thread?
> 
> Say you had one realtime application running that would do lots of new
> writes after the unlink finished. When the unlink is put into the
> background, you interleave the unlink operation with new writes,
> probably causing needless seeks and therefore not hitting the deadlines
> anymore.

Why? The writes are delayed, so the unlink operations could be too.

	Xav




* Re: Deleting large files
  2008-05-07 20:17   ` Xavier Bestel
@ 2008-05-07 20:48     ` Jan Engelhardt
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Engelhardt @ 2008-05-07 20:48 UTC (permalink / raw)
  To: Xavier Bestel; +Cc: Morten Welinder, linux-kernel


On Wednesday 2008-05-07 22:17, Xavier Bestel wrote:
>On Wednesday, 7 May 2008 at 22:10 +0200, Jan Engelhardt wrote:
>> 
>> >I.e., could unlink do the directory changes and then hand off the rest of the
>> >task to a kernel thread?
>> 
>> Say you had one realtime application running that would do lots of new
>> writes after the unlink finished. When the unlink is put into the
>> background, you interleave the unlink operation with new writes,
>> probably causing needless seeks and therefore not hitting the deadlines
>> anymore.
>
>Why? The writes are delayed, so the unlink operations could be too.

Code complexity. But then again, a few good filesystems
don't even need to do such heavy housekeeping, and I
suggest using these if you are worried about unlink speed.


* Re: Deleting large files
  2008-05-07 19:49 Deleting large files Morten Welinder
  2008-05-07 20:10 ` Jan Engelhardt
@ 2008-05-07 22:34 ` linux-os (Dick Johnson)
  2008-05-07 23:14   ` Morten Welinder
  2008-05-08  8:19 ` Matti Aarnio
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: linux-os (Dick Johnson) @ 2008-05-07 22:34 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-kernel


On Wed, 7 May 2008, Morten Welinder wrote:

> Hi there,
>
> deleting large files, say on the order of 4.6GB, takes approximately forever.
> Why is that?  Well, it is because a lot of things need to take place to free
> the formerly used space, but my real question is "why does the unlink caller
> have to wait for it?"
>
> I.e., could unlink do the directory changes and then hand off the rest of the
> task to a kernel thread?
>
> Morten

Suppose you had an N GB file that just filled up the disk. You now
delete it, but get control back before it is really deleted. You
now start to write a new file that will eventually just fill up
the disk. Your task will get a media full error long before
the media is really full because the old file's data space
hasn't been freed yet. So, to "fix" this, you modify the
filesystem to defer your logical writes until all the previous
space has been freed (writes to the physical media are deferred
anyway as long as there is RAM available). The result is that
your new data, which may be precious from a quasi-real-time source,
will fail to be written. To "fix" this, you queue everything.
This will eventually fail because the disk and RAM are of
a finite size. The size of the disk is known, but you don't
know what will be deleted before the queued writes have
completed, so you really don't know when to tell the writer
that there is no more space available.

That's why the task that deletes data can't get control back
until it has been deleted. However, for user applications, at
the user's risk, one can do `rm filename &` and let the shell
do the waiting.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.29 BogoMips).
My book : http://www.AbominableFirebug.com/
_




* Re: Deleting large files
  2008-05-07 22:34 ` linux-os (Dick Johnson)
@ 2008-05-07 23:14   ` Morten Welinder
  2008-05-08 23:01     ` Alan Cox
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Morten Welinder @ 2008-05-07 23:14 UTC (permalink / raw)
  To: linux-os (Dick Johnson); +Cc: linux-kernel

>  Suppose you had an N GB file that just filled up the disk. You now
>  delete it, but get control back before it is really deleted. You
>  now start to write a new file that will eventually just fill up
>  the disk. [...]

That argument ought to stop right there.  If you believe that deleting a
file will necessarily and immediately give you back the space, then you
are already wrong in the current state of affairs.

NFS does not do that -- in fact, I don't believe any file system does that
unless you can guarantee at least that no other process or the kernel has
that file open; AFS did not do that last I looked a decade ago; versioning
file systems do not; journaling file systems might not.  File systems that
support undelete do not do that.  In short: assuming such a thing is a
bug in need of a fix today.

Right now, unlink is a commonly used syscall with unbounded response
time.  If your GUI program deletes a file, the GUI generally locks up until
the kernel feels like returning -- that is certainly not how you get a smooth
user experience.  Forking a process to do the deletion (a) is pathetic,
(b) is not currently done, and (c) does not work: you cannot get a result
right away, i.e., you lose error handling.

Morten


* Re: Deleting large files
  2008-05-07 19:49 Deleting large files Morten Welinder
  2008-05-07 20:10 ` Jan Engelhardt
  2008-05-07 22:34 ` linux-os (Dick Johnson)
@ 2008-05-08  8:19 ` Matti Aarnio
  2008-05-11 11:16   ` Christoph Hellwig
  2008-05-08 17:29 ` Christian Kujau
  2008-05-17 12:15 ` Pavel Machek
  4 siblings, 1 reply; 16+ messages in thread
From: Matti Aarnio @ 2008-05-08  8:19 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-kernel

On Wed, May 07, 2008 at 03:49:30PM -0400, Morten Welinder wrote:
> Hi there,
> 
> deleting large files, say on the order of 4.6GB, takes approximately forever.
> Why is that?  Well, it is because a lot of things need to take place to free
> the formerly used space, but my real question is "why does the unlink caller
> have to wait for it?"

This very question has troubled SQUID developers.  Whatever the system, an unlink()
that really does free disk space does so with an unbounded time limit, and in services
where one millisecond is a long wait time, the solution has been to run a separate
subprocess that actually does the unlinks.

Squid is not threaded software, and it was created long ago, when threads were
rare and implementations differed in subtle details --> no threads at all.

> Morten

  /Matti Aarnio


* Re: Deleting large files
  2008-05-07 19:49 Deleting large files Morten Welinder
                   ` (2 preceding siblings ...)
  2008-05-08  8:19 ` Matti Aarnio
@ 2008-05-08 17:29 ` Christian Kujau
       [not found]   ` <118833cc0805081110u7aad3921v3a1ec4187acc4ef4@mail.gmail.com>
  2008-05-17 12:15 ` Pavel Machek
  4 siblings, 1 reply; 16+ messages in thread
From: Christian Kujau @ 2008-05-08 17:29 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-kernel

On Wed, May 7, 2008 21:49, Morten Welinder wrote:
> deleting large files, say on the order of 4.6GB, takes approximately
> forever.

What filesystem are you talking about? How long is "forever"? Also the
kernel version would be interesting...

C.
-- 
make bzImage, not war



* Re: Deleting large files
       [not found]   ` <118833cc0805081110u7aad3921v3a1ec4187acc4ef4@mail.gmail.com>
@ 2008-05-08 18:54     ` Christian Kujau
  0 siblings, 0 replies; 16+ messages in thread
From: Christian Kujau @ 2008-05-08 18:54 UTC (permalink / raw)
  To: Morten Welinder; +Cc: LKML

On Thu, 8 May 2008, Morten Welinder wrote:
> A quick search of the list or google would have shown you that the
> problem is well known.  My post added a possible solution.
> Knee-jerk reactions asking for more information that does not add
> anything are not always the right solution.

Dude, WTF? I was indeed asking for this information because I thought people
were just guessing[0] about your fs and whether it did this kind of
"housekeeping" or not. So I was just curious about a tiny bit more information.

Sorry for replying....

C.

[0] http://lkml.org/lkml/2008/5/7/269
-- 
BOFH excuse #137:

User was distributing pornography on server; system seized by FBI.


* Re: Deleting large files
  2008-05-07 23:14   ` Morten Welinder
@ 2008-05-08 23:01     ` Alan Cox
  2008-05-11 10:30     ` Jan Engelhardt
  2008-05-20 14:33     ` Pavel Machek
  2 siblings, 0 replies; 16+ messages in thread
From: Alan Cox @ 2008-05-08 23:01 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-os (Dick Johnson), linux-kernel

> user experience.  Forking a process to do the deletion (a) is pathetic,
> (b) is not currently done, and (c) does not work: you cannot get a result
> right away, i.e., you lose error handling.

I wouldn't call it pathetic. Quite a few big media file tools create a
thread to do deletions of big objects.

The error handling isn't usually a problem. Any error you can do
anything meaningful with occurs immediately, or close to it. If
you get errors because of I/O problems a minute later, there isn't a
sensible response and recovery anyway - nor would a kernel-side
asynchronous delete be able to recover any better.

In theory you can push some kind of asynchronous delete threads into the
kernel or extend the AIO interfaces we have to do AIO_delete, but at the
end of the day the implementation would effectively be create thread,
unlink, exit - and you can do that neatly and sanely in user space.


* Re: Deleting large files
  2008-05-07 23:14   ` Morten Welinder
  2008-05-08 23:01     ` Alan Cox
@ 2008-05-11 10:30     ` Jan Engelhardt
  2008-05-11 16:38       ` Enrico Weigelt
  2008-05-20 14:33     ` Pavel Machek
  2 siblings, 1 reply; 16+ messages in thread
From: Jan Engelhardt @ 2008-05-11 10:30 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-os (Dick Johnson), linux-kernel


On Thursday 2008-05-08 01:14, Morten Welinder wrote:

>>  Suppose you had an N GB file that just filled up the disk. You now
>>  delete it, but get control back before it is really deleted. You
>>  now start to write a new file that will eventually just fill up
>>  the disk. [...]
>
>NFS does not do that -- in fact, I don't believe any file system does that
>unless you can guarantee at least that no other process or the kernel has
>that file open;

Iff a process still has the file open, your unlink will succeed immediately
anyway, and the real deallocation takes place when the last process runs
close(). Which shows an interesting fact too: not only unlink can block.


* Re: Deleting large files
  2008-05-08  8:19 ` Matti Aarnio
@ 2008-05-11 11:16   ` Christoph Hellwig
  2008-05-11 16:42     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2008-05-11 11:16 UTC (permalink / raw)
  To: Matti Aarnio; +Cc: Morten Welinder, linux-kernel

On Thu, May 08, 2008 at 11:19:06AM +0300, Matti Aarnio wrote:
> This very question has troubled SQUID developers.  Whatever the system, an unlink()
> that really does free disk space does so with an unbounded time limit, and in services
> where one millisecond is a long wait time, the solution has been to run a separate
> subprocess that actually does the unlinks.
> 
> Squid is not threaded software, and it was created long ago, when threads were
> rare and implementations differed in subtle details --> no threads at all.

I'd call long times for the final unlink a bug in the filesystem.
There's not all that much to do when deleting a file.   What you need to
do is basically return the allocated space to the free space allocator
and mark the inode as unused and return it to the inode allocator.  The
first one may take quite a while with an indirect block scheme, but with
an extent based filesystem it shouldn't be a problem.  The latter
shouldn't take too long either, and with a journaling filesystem it's
even easier because you can intent-log the inode deletion first and then
perform it later e.g. as part of a batched write-back of the inode
cluster.


* Re: Deleting large files
  2008-05-11 10:30     ` Jan Engelhardt
@ 2008-05-11 16:38       ` Enrico Weigelt
  0 siblings, 0 replies; 16+ messages in thread
From: Enrico Weigelt @ 2008-05-11 16:38 UTC (permalink / raw)
  To: linux kernel list

* Jan Engelhardt <jengelh@medozas.de> wrote:

> Iff a process still has the file open, your unlink will succeed immediately
> anyway, and the real deallocation takes place when the last process runs
> close(). Which shows an interesting fact too: not only unlink can block.

Yep, the point is: on *nix there is no delete syscall, but just 
an unlink (decreasing the refcount). The kernel then decides when
to actually remove the file (normally when refcount==0).

So, when refcount==0 the kernel (more precisely: the fs) could
just hand over the inode to some kthread, which does the actual
space-reclaiming. When properly done, the case of powerfail will be
caught by fsck or journal replay, just the same as when several
processes were in the middle of deleting files.

Maybe this could be implemented by an overlaying filesystem,
which essentially moves the file to some special deleted dir instead
of really unlink'ing it - a separate process (which could even run
in userland) would do the actual unlinking. So when a user process
calls unlink(), the inodes don't even have to be touched.


cu
-- 
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


* Re: Deleting large files
  2008-05-11 11:16   ` Christoph Hellwig
@ 2008-05-11 16:42     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2008-05-11 16:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matti Aarnio, Morten Welinder, linux-kernel

On Sun, May 11, 2008 at 07:16:53AM -0400, Christoph Hellwig wrote:
> On Thu, May 08, 2008 at 11:19:06AM +0300, Matti Aarnio wrote:
> > This very question has troubled SQUID developers.  Whatever the system, an unlink()
> > that really does free disk space does so with an unbounded time limit, and in services
> > where one millisecond is a long wait time, the solution has been to run a separate
> > subprocess that actually does the unlinks.
> > 
> > Squid is not threaded software, and it was created long ago, when threads were
> > rare and implementations differed in subtle details --> no threads at all.
> 
> I'd call long times for the final unlink a bug in the filesystem.
> There's not all that much to do when deleting a file.   What you need to
> do is basically return the allocated space to the free space allocator
> and mark the inode as unused and return it to the inode allocator.  The
> first one may take quite a while with an indirect block scheme, but with
> an extent based filesystem it shouldn't be a problem.  The latter
> shouldn't take too long either, and with a journaling filesystem it's
> even easier because you can intent-log the inode deletion first and then
> perform it later e.g. as part of a batched write-back of the inode
> cluster.

The problem with a journalling filesystem like ext3 is that the credits
available in the journal may not be sufficient for a full truncate. In
that case we will have to commit the journal. And that means we will
have to zero fill some of the indirect blocks so that when the
transaction is committed the inode format is a valid one.

For ext3 there are patches from abhishek that actually speed up
meta-data-intensive operations. Eric Sandeen did some measurements
here.

http://people.redhat.com/esandeen/rm_test/

I have patches for Ext4 based on top of the new block allocator for
Ext4. There is some improvement with Ext3 mode.

http://www.radian.org/~kvaneesh/ext4/meta-group/

-aneesh


* Re: Deleting large files
  2008-05-07 19:49 Deleting large files Morten Welinder
                   ` (3 preceding siblings ...)
  2008-05-08 17:29 ` Christian Kujau
@ 2008-05-17 12:15 ` Pavel Machek
  4 siblings, 0 replies; 16+ messages in thread
From: Pavel Machek @ 2008-05-17 12:15 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-kernel

On Wed 2008-05-07 15:49:30, Morten Welinder wrote:
> Hi there,
> 
> deleting large files, say on the order of 4.6GB, takes approximately forever.
> Why is that?  Well, it is because a lot of things need to take place to free
> the formerly used space, but my real question is "why does the unlink caller
> have to wait for it?"
> 
> I.e., could unlink do the directory changes and then hand off the rest of the
> task to a kernel thread?

Yep, but implementation is not going to be trivial. Send a patch ;-).

							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Deleting large files
  2008-05-07 23:14   ` Morten Welinder
  2008-05-08 23:01     ` Alan Cox
  2008-05-11 10:30     ` Jan Engelhardt
@ 2008-05-20 14:33     ` Pavel Machek
  2 siblings, 0 replies; 16+ messages in thread
From: Pavel Machek @ 2008-05-20 14:33 UTC (permalink / raw)
  To: Morten Welinder; +Cc: linux-os (Dick Johnson), linux-kernel

On Wed 2008-05-07 19:14:33, Morten Welinder wrote:
> >  Suppose you had an N GB file that just filled up the disk. You now
> >  delete it, but get control back before it is really deleted. You
> >  now start to write a new file that will eventually just fill up
> >  the disk. [...]
> 
> That argument ought to stop right there.  If you believe that deleting a
> file will necessarily and immediately give you back the space, then you
> are already wrong in the current state of affairs.

Not if you are the only user.

> user experience.  Forking a process to do the deletion (a) is pathetic,
> (b) is not currently done, and (c) does not work: you cannot get a result
> right away, i.e., you lose error handling.

If you fork a kernel thread, you lose error handling, too.

Think -EIO when writing back bitmaps...

(Hmm, you'd have to use O_SYNC to see that, so this is probably
minor).

I guess doing the freeing asynchronously would be okay in the 'close'
case...

							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

