From: Marc Lehmann <schmorp@schmorp.de>
To: Calvin Walton <calvin.walton@kepstin.ca>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: first mount(s) after unclean shutdown always fail, second attempt
Date: Sun, 12 Jul 2020 07:12:41 +0200
Message-ID: <20200712051241.GC491@schmorp.de>
In-Reply-To: <96009f54f7548080513ab2100d420d82f50d4e90.camel@kepstin.ca>

On Wed, Jul 08, 2020 at 01:35:08PM -0400, Calvin Walton <calvin.walton@kepstin.ca> wrote:
> You shared kernel logs, but it would be helpful to see the systemd
> journal. One thing to note is that by default systemd has a timeout on
> mounts! It's entirely possible that as soon as the mount kernel thread
> becomes unblocked, it notices that systemd has sent a SIGTERM/SIGKILL
> and aborts the mount.
> 
> See the documentation (man systemd.mount) and consider increasing or
> disabling the timeout on the affected mount units.

Good idea, systemd indeed KILLed the mount:

   Jul 01 02:02:26 cerebro systemd[1]: cryptlocalvol.mount: Mount process still around after SIGKILL. Ignoring.
   Jul 01 02:02:26 cerebro systemd[1]: cryptlocalvol.mount: Failed with result 'timeout'.
   Jul 01 02:02:26 cerebro systemd[1]: Failed to mount /cryptlocalvol.

However, are btrfs mounts really interruptible? In any case, the
problem happens both for systemd-triggered mounts and for mount
commands entered interactively (and also for volumes that have no
fstab entry or mount unit at all). For example, here is a case where
the initial systemd-controlled mount fails, the interactively entered
"mount /localvol" then fails as well, and only the third attempt
succeeds:

   May 30 03:17:53 cerebro kernel: BTRFS info (device dm-7): disk space caching is enabled
   May 30 03:17:53 cerebro kernel: BTRFS info (device dm-7): has skinny extents
   May 30 03:19:23 cerebro systemd[1]: localvol.mount: Mounting timed out. Terminating.
   May 30 03:20:53 cerebro systemd[1]: localvol.mount: Mount process timed out. Killing.
   May 30 03:20:53 cerebro systemd[1]: localvol.mount: Killing process 1116 (mount) with signal SIGKILL.
   May 30 03:22:23 cerebro systemd[1]: localvol.mount: Mount process still around after SIGKILL. Ignoring.
   May 30 03:22:23 cerebro systemd[1]: localvol.mount: Failed with result 'timeout'.
   May 30 03:22:23 cerebro systemd[1]: Failed to mount /localvol.
   May 30 03:27:53 cerebro kernel: BTRFS error (device dm-7): open_ctree failed
[systemd-initiated mount failed here]
   May 30 03:27:54 cerebro kernel: BTRFS info (device dm-7): turning on discard
   May 30 03:27:54 cerebro kernel: BTRFS info (device dm-7): disk space caching is enabled
   May 30 03:27:54 cerebro kernel: BTRFS info (device dm-7): has skinny extents
   May 30 03:27:54 cerebro kernel: BTRFS error (device dm-7): open_ctree failed
[emergency-shell interactive mount failed here]
   May 30 03:28:14 cerebro kernel: BTRFS info (device dm-7): turning on discard
   May 30 03:28:14 cerebro kernel: BTRFS info (device dm-7): disk space caching is enabled
   May 30 03:28:14 cerebro kernel: BTRFS info (device dm-7): has skinny extents
[third attempt succeeded]
   May 30 03:40:04 cerebro systemd[1]: localvol.mount: Succeeded.

Looking at the case above, I notice that the second mount fails almost
instantly, and that it appears to have been initiated the moment the
previous mount failed.

I think the reason is that systemd dropped me into the emergency
shell long before the (kernel) mount failed, so I likely entered the
interactive mount command while the previous mount was still running.
That would explain why the interactive mount appears to start within
one second of the previous mount failure: it had probably been running
for minutes already, waiting for some lock.
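
As a rough check of this timeline, the gaps between the log lines quoted
above can be computed mechanically. This is only a minimal sketch: the
timestamps are copied from the excerpt, the year is an assumption (the
journal lines omit it), the event labels are my own shorthand, and the
month name is assumed to parse in an English/C locale.

```python
from datetime import datetime

# Assumption: the journal's short timestamp format omits the year;
# 2020 is taken from the date of this thread.
YEAR = 2020

# Timestamps copied from the log excerpt; labels are informal shorthand.
EVENTS = [
    ("May 30 03:17:53", "kernel: first mount starts"),
    ("May 30 03:19:23", "systemd: mounting timed out, terminating"),
    ("May 30 03:20:53", "systemd: mount process timed out, SIGKILL"),
    ("May 30 03:22:23", "systemd: still around after SIGKILL, unit failed"),
    ("May 30 03:27:53", "kernel: open_ctree failed (first mount)"),
    ("May 30 03:27:54", "kernel: open_ctree failed (second mount)"),
    ("May 30 03:28:14", "kernel: third mount starts"),
]


def parse(stamp):
    """Parse a journal-style 'Mon DD HH:MM:SS' stamp, assuming YEAR."""
    return datetime.strptime(f"{YEAR} {stamp}", "%Y %b %d %H:%M:%S")


def gaps(events):
    """Return the seconds elapsed between each pair of consecutive events."""
    times = [parse(stamp) for stamp, _ in events]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for (stamp, label), delta in zip(EVENTS[1:], gaps(EVENTS)):
        print(f"{stamp}  +{delta:>4}s  {label}")
```

The three systemd steps come out 90 seconds apart, which would match
systemd's default start timeout if it was left unchanged on this machine.
The first kernel mount fails 600 seconds after it started, and the second
failure is logged just one second later.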

I have disabled the systemd mount timeout for the time being, to rule
it out as a cause.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\
