* f2fs stability problems keep me from testing
@ 2015-11-17 17:24 Marc Lehmann
  2015-11-18 10:00 ` Chao Yu
  2015-11-19  8:04 ` Chao Yu
  0 siblings, 2 replies; 10+ messages in thread
From: Marc Lehmann @ 2015-11-17 17:24 UTC
  To: linux-f2fs-devel

Hi!

I have trouble executing the tests I wanted to run with the current 3.18
checkout. This morning, the box was completely unresponsive - I had to
reboot, not knowing the cause (the only difference is that f2fs is in more
or less production use for a few days).

An hour ago, I was awake when similar problems started - interactive
login was impossible, but I was able to execute a few commands in an open
shell, which makes me suspect f2fs to be the culprit. Both times, the
f2fs filesystem was streaming video at low speed (<1mb/s) with no other
activity.

Anyway, here are the four experiments I did, after finding out that the
problem seems to be the f2fs fs (the other 4 filesystems on the box were
responsive, as was the underlying disk itself).

1. ls /cold1, find /cold1 (/cold1 is the f2fs mountpoint) gave empty results
   here is an strace of find /cold1:

   openat(AT_FDCWD, "/cold1", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
   fchdir(5)                               = 0
   getdents(5, /* 0 entries */, 32768)     = 0
   close(5)                                = 0

   so /cold1 is an empty directory. not good.

2. so no files in /cold1, let's see what happens when I list /cold1/var, a
   directory known to exist:

   openat(AT_FDCWD, "/cold1/var", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
   fchdir(5)                               = 0
   getdents(5, /* 0 entries */, 32768)     = 0
   close(5)                                = 0

   so f2fs knows that /cold1/var exists, but readdir gives no results. very
   troubling.

3. "sync&" - this did hang, with no apparent activity

4. cat /proc/<sync-pid>/task/*/stack:

   [<ffffffff8121d4d8>] sync_inodes_sb+0xa8/0x1c0
   [<ffffffff81224249>] sync_inodes_one_sb+0x19/0x20
   [<ffffffff811f6192>] iterate_supers+0xb2/0x110
   [<ffffffff812244d5>] sys_sync+0x35/0x90
   [<ffffffff817a684d>] system_call_fastpath+0x16/0x1b
   [<ffffffffffffffff>] 0xffffffffffffffff

5. dmesg showed no related messages whatsoever - it still had the kernel
   messages generated from boot, and nothing else.

6. at this point I lost my shell and control over the box completely, and
   it had to be rebooted
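
For reference, steps 3 and 4 could be scripted so that the interactive shell
is not the process that ends up stuck. The following is only a minimal
sketch, not what was run on the box: the function name run_and_dump_stacks
and the timeout values are made up for illustration, and reading
/proc/<pid>/task/*/stack usually requires root.

```python
import glob
import subprocess


def run_and_dump_stacks(cmd, timeout=10.0):
    """Run cmd; if it has not finished after `timeout` seconds, collect the
    kernel stacks of its tasks from /proc (Linux only, usually needs root)."""
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout)
        return {"hung": False, "returncode": proc.returncode, "stacks": {}}
    except subprocess.TimeoutExpired:
        pass

    stacks = {}
    for path in glob.glob(f"/proc/{proc.pid}/task/*/stack"):
        try:
            with open(path) as f:
                stacks[path] = f.read()
        except OSError as exc:
            # Typically "permission denied" when not running as root.
            stacks[path] = f"<unreadable: {exc}>"

    # Best-effort cleanup. A task stuck in uninterruptible sleep will not
    # die on SIGKILL, so never wait for it without a timeout.
    proc.kill()
    try:
        proc.wait(timeout=1.0)
    except subprocess.TimeoutExpired:
        pass
    return {"hung": True, "returncode": proc.returncode, "stacks": stacks}
```

Calling run_and_dump_stacks(["sync"], timeout=30.0) would return with
"hung" set and whatever stacks were readable if sync blocks as it did in
step 3, instead of tying up the shell.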

So something in the current f2fs tree (I checked that
/sys/fs/f2fs/dm-17/ra_nid_pages exists, so it is a more or less current
snapshot) is still locking up and/or returning corrupt data. If it was
a simple locking failure, though, I would expect readdir and other
operations to also block, not return bad data.
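
The inconsistency from steps 1 and 2 (a child path that can be opened while
its parent lists as empty) can be checked mechanically. This is a rough
sketch under the assumption that a few child names are known in advance;
the helper name unlisted_children is hypothetical.

```python
import os


def unlisted_children(parent, known_names):
    """Return the names that resolve under `parent` but are missing from
    the directory listing of `parent`."""
    listed = set(os.listdir(parent))
    return [
        name
        for name in known_names
        if os.path.lexists(os.path.join(parent, name)) and name not in listed
    ]
```

On a healthy filesystem this returns an empty list. In the state described
above, unlisted_children("/cold1", ["var"]) would presumably have returned
["var"], since /cold1/var could be opened while readdir on /cold1 returned
no entries.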

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\


