From: Mason <mpeg.blue@free.fr>
To: "Lukáš Czerner" <lczerner@redhat.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: After unlinking a large file on ext4, the process stalls for a long time
Date: Thu, 17 Jul 2014 13:17:11 +0200
Message-ID: <53C7B0B7.9030007@free.fr>
In-Reply-To: <alpine.LFD.2.00.1407171238060.2312@localhost.localdomain>

Lukáš Czerner wrote:

> So it really does not seem to be stalling in fallocate, nor unlink.
> Can you add close() before unlink, just to be sure what's happening
> there ?

Doh! Good catch! Unlinking was fast because the ref count didn't drop
to 0 on unlink; it only did so on the implicit close done at exit, which
would explain why the process stalled "at the end".

If I unlink a closed file, it is indeed unlink that stalls.
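
The ordering effect can be illustrated with a variant that unlinks while the
descriptor is still open. This is only a minimal sketch, not the program used
for the trace in this message; the helper names ms_since and
unlink_then_close are hypothetical. With this ordering, unlink() only removes
the name, and the expensive part is expected to move to close(), which drops
the last reference to the inode.

```c
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds elapsed on CLOCK_MONOTONIC since *t0 (hypothetical helper). */
static long ms_since(const struct timespec *t0)
{
    struct timespec t1;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (long)(t1.tv_sec - t0->tv_sec) * 1000L
         + (t1.tv_nsec - t0->tv_nsec) / 1000000L;
}

/*
 * Hypothetical helper: create path, preallocate size bytes, then unlink
 * the file BEFORE closing it, timing both calls separately.
 * Returns 0 on success and -1 on any failure.
 */
static int unlink_then_close(const char *path, off_t size,
                             long *unlink_ms, long *close_ms)
{
    struct timespec t0;
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;

    /* posix_fallocate returns an error number directly, not -1/errno. */
    if (posix_fallocate(fd, 0, size) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    int u = unlink(path);   /* name is removed; fd still references the inode */
    *unlink_ms = ms_since(&t0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    int c = close(fd);      /* last reference goes away here */
    *close_ms = ms_since(&t0);

    return (u == 0 && c == 0) ? 0 : -1;
}
```

Called as unlink_then_close("/mnt/hdd/xxx", (off_t)300 << 30, &u, &c), this
would be expected to report a small u and a large c if the explanation above
is right.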

[BTW, some of the e2fsprogs devs may be reading this. I suppose you
already know, but the cross-compile build was broken in 1.42.10.
I wrote a trivial patch to fix it (cf. the end of this message)
although I'm not sure I did it the canonical way.]


# time strace -T ./foo /mnt/hdd/xxx 300 2> strace.out
posix_fallocate(fd, 0, size_in_GiB << 30): 0 [412 ms]
close(fd): 0 [0 ms]
unlink(filename): 0 [111481 ms]

open("/mnt/hdd/xxx", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 3 <0.000456>
clock_gettime(CLOCK_MONOTONIC, {82152, 251657385}) = 0 <0.000085>
SYS_4320()                              = 0 <0.411628>
clock_gettime(CLOCK_MONOTONIC, {82152, 664179762}) = 0 <0.000089>
fstat64(1, {st_mode=S_IFCHR|0755, st_rdev=makedev(4, 64), ...}) = 0 <0.000094>
ioctl(1, TIOCNXCL, {B115200 opost isig icanon echo ...}) = 0 <0.000128>
old_mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x773e4000 <0.000195>
write(1, "posix_fallocate(fd, 0, size_in_G"..., 54) = 54 <0.000281>
clock_gettime(CLOCK_MONOTONIC, {82152, 668413115}) = 0 <0.000077>
close(3)                                = 0 <0.000119>
clock_gettime(CLOCK_MONOTONIC, {82152, 669249479}) = 0 <0.000129>
write(1, "close(fd): 0 [0 ms]\n", 20)   = 20 <0.000145>
clock_gettime(CLOCK_MONOTONIC, {82152, 670361133}) = 0 <0.000078>
unlink("/mnt/hdd/xxx")                  = 0 <111.479283>
clock_gettime(CLOCK_MONOTONIC, {82264, 150551496}) = 0 <0.000080>
write(1, "unlink(filename): 0 [111481 ms]\n", 32) = 32 <0.000225>
exit_group(0)                           = ?

0.01user 111.48system 1:51.99elapsed 99%CPU (0avgtext+0avgdata 772maxresident)k
0inputs+0outputs (0major+434minor)pagefaults 0swaps


For reference, here's my minimal test case:

#define _FILE_OFFSET_BITS 64
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <time.h>

/* Run op once; print its return value and its duration in milliseconds. */
#define BENCH(op) do { \
  struct timespec t0; clock_gettime(CLOCK_MONOTONIC, &t0); \
  int err = op; \
  struct timespec t1; clock_gettime(CLOCK_MONOTONIC, &t1); \
  int ms = (t1.tv_sec-t0.tv_sec)*1000 + (t1.tv_nsec-t0.tv_nsec)/1000000; \
  printf("%s: %d [%d ms]\n", #op, err, ms); } while(0)

int main(int argc, char **argv)
{
  if (argc != 3) { puts("Usage: prog filename size"); return 42; }

  char *filename = argv[1];
  int fd = open(filename, O_CREAT | O_EXCL | O_WRONLY, 0600);
  if (fd < 0) { perror("open"); return 1; }

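  /* Size argument is in GiB; long long keeps the shift from overflowing. */
  long long size_in_GiB = atoi(argv[2]);
  BENCH(posix_fallocate(fd, 0, size_in_GiB << 30));
  /* Close first, so that unlink drops the last reference to the inode. */
  BENCH(close(fd));
  BENCH(unlink(filename));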
  return 0;
}
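
The millisecond figures printed by BENCH can be cross-checked against the
clock_gettime values in the strace log. The sketch below repeats the macro's
arithmetic on the timestamps copied from that log; the names elapsed_ms and
the *_T0/*_T1 constants are introduced here only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Same arithmetic as BENCH: seconds delta in ms plus truncated ns delta. */
static long elapsed_ms(struct timespec t0, struct timespec t1)
{
    return (long)(t1.tv_sec - t0.tv_sec) * 1000L
         + (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}

/* CLOCK_MONOTONIC readings around each call, as shown by strace. */
static const struct timespec FALLOC_T0 = { .tv_sec = 82152, .tv_nsec = 251657385 };
static const struct timespec FALLOC_T1 = { .tv_sec = 82152, .tv_nsec = 664179762 };
static const struct timespec CLOSE_T0  = { .tv_sec = 82152, .tv_nsec = 668413115 };
static const struct timespec CLOSE_T1  = { .tv_sec = 82152, .tv_nsec = 669249479 };
static const struct timespec UNLINK_T0 = { .tv_sec = 82152, .tv_nsec = 670361133 };
static const struct timespec UNLINK_T1 = { .tv_sec = 82264, .tv_nsec = 150551496 };
```

elapsed_ms gives 412 for the fallocate pair, 0 for close and 111481 for
unlink, matching the program output. For unlink the nanosecond delta is
negative and C integer division truncates toward zero, so the result is 1 ms
above the floor of the true difference (111480.19 ms).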


$ cat e2fsprogs-1.42.10.patch 
diff -ur a/util/Makefile.in b/util/Makefile.in
--- a/util/Makefile.in	2014-05-15 19:04:08.000000000 +0200
+++ b/util/Makefile.in	2014-07-10 15:31:04.819352596 +0200
@@ -15,7 +15,7 @@
 
 .c.o:
 	$(E) "	CC $<"
-	$(Q) $(BUILD_CC) -c $(BUILD_CFLAGS) $< -o $@
+	$(Q) $(BUILD_CC) $(CPPFLAGS) -c $(BUILD_CFLAGS) $< -o $@
 	$(Q) $(CHECK_CMD) $(ALL_CFLAGS) $<
 
 PROGS=		subst symlinks



-- 
Regards.
