Linux NILFS development
From: "Michael L. Semon" <mlsemon35-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: slava-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org
Cc: linux-nilfs <linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Best way to shut down NILFS2? (umount hang issue)...
Date: Thu, 19 Sep 2013 19:19:09 -0400
Message-ID: <523B866D.9060406@gmail.com>
In-Reply-To: <1379571773.2310.5.camel@slavad-ubuntu>

On 09/19/2013 02:22 AM, Vyacheslav Dubeyko wrote:
> On Wed, 2013-09-18 at 12:26 -0400, Michael L. Semon wrote:
> 
> [snip]
>>>
>>> As far as I can see, your NILFS2 file system was remounted in RO mode
>>> because of internal error. Could you confirm my understanding?
>>
>> Yes, but only on reboot.  Other programs crash the PC, and NILFS2 has to
>> recover from that crash.  The PC spends a lot of time running xfstests and
>> LTP with a kernel that is set to panic.  NILFS2 itself seems OK, and its
>> latest xfstests run looked good, using default mkfs.nilfs2 options and
>> mounting with "-o pp=0".
> 
> [snip]
>>
>> It is strictly like this so far:
>>
>> 1) NILFS2 / boots OK
>> 2) no problems
>> 3) shutdown is OK
>> 4) NILFS2 / boots OK
>> 5) computer crashes for some other reason
>> 6) NILFS2 / boots OK, but displays a message that recovery was used
>> 7) no problems
>> 8) here, shutdown may hang on sync or umount (50% chance)
>>
>> In other words, NILFS2 has not had an error to make it remount read-only
>> while the PC is running.  The problem may solve itself over time, or I
>> may have to boot to another partition, then mount and umount the NILFS2
>> partition to get it to recover and umount cleanly again.
>>
> 
> So, maybe it is another issue.
> 
> [snip]
>>
>> I'll try your patches tonight and report back in 1-2 days.
>>
> 
> Ok. Please, inform me about the result anyway. If suggested patches
> don't fix the issue then I will begin investigation.
> 
> But, I begin to suspect presence of another issue after additional
> analysis of provided by you outputs. So, I am waiting results of your
> attempt.
> 
> Thanks,
> Vyacheslav Dubeyko.

The issue still happens.  One patch was already in the kernel, and
the second patch you mentioned did not make much of a difference.
The second patch is still installed, though.

The problem I mentioned above is the one that is easy to explain.
The crash doesn't even have to stress the computer: a simple
SysRq-induced crash should be enough to get the problem started,
though the PC might need to be crashed more than once.
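
A minimal sketch of such a crash trigger, assuming root privileges and a
kernel built with CONFIG_MAGIC_SYSRQ.  The function name and the DRY_RUN
variable are hypothetical and exist only for illustration; the only real
interface used is /proc/sysrq-trigger.  By default the function prints
what it would do instead of crashing the machine.

```shell
# Hypothetical helper: crash the kernel through the SysRq trigger file.
# With DRY_RUN unset or set to 1 it only prints the command it would run.
sysrq_crash() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: echo c > /proc/sysrq-trigger"
        return 0
    fi
    # 'c' requests an immediate kernel crash; nothing is synced first.
    echo c > /proc/sysrq-trigger
}
```

Running `DRY_RUN=0 sysrq_crash` as root crashes the PC at once, so any
unsynced data is lost; that is the kind of unclean stop that forces
recovery at the next mount.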

I've changed / to mount with errors=panic, but there has been no
panic yet.

# ================

Here is where the overall problem becomes hard to explain.  Consider this 
scenario:

/ is NILFS2 (rw,order=strict)
/boot is JFS
/tmp is JFS
/usr/src is JFS

Because I don't want the hung NILFS2 umount to cause problems for /tmp and
/usr/src, I adapted the end of the standard Slackware shutdown script to
look something like this:

/bin/umount -v -a -t noproc,nosysfs,nonilfs2

# This sync can stay here to show the sync problem, or be removed
# to show the umount problem.
sync

/bin/umount -v -a -t nilfs2

echo "Remounting root filesystem read-only."
/bin/mount -v -n -o remount,ro /dev/sdb12 /

[I can get you the exact script next time.]

I choose to build a kernel, which fills memory, exercises a JFS
filesystem, and probably writes temp files to /tmp on JFS.  `make
install` installs the kernel to /boot on JFS.  [BTW, `make install`
can stall when /boot is within a NILFS2 / partition, but that has
not been tested since I started using a separate /boot partition.]

In this scenario, there is a much higher chance that shutdown will hang
before the NILFS2 partitions are umounted.  A simple `mount` placed
before the `sync` shows that umount is honoring the "nonilfs2" flag, and
the NILFS2 partitions are still mounted.  So why would the sync *before*
the umount of NILFS2 partitions get hung between segctord and sync,
when umount supposedly has not unmounted the NILFS2 partitions yet?
This is why I mentioned the sync issue and the umount issue at the
same time.
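
The check described above can be narrowed to NILFS2 only.  This is a
sketch, not the exact command used in the shutdown script: the function
name is hypothetical, and it reads /proc/mounts (the kernel's own view)
rather than the output of `mount`, which may depend on /etc/mtab.

```shell
# Hypothetical helper: print "device mountpoint" for every mounted
# filesystem of the given type.  The second argument is the mount table
# to read and defaults to /proc/mounts.
mounts_of_type() {
    fstype="$1"
    table="${2:-/proc/mounts}"
    # Field 3 of each line is the filesystem type.
    awk -v t="$fstype" '$3 == t { print $1, $2 }' "$table"
}
```

Placing `mounts_of_type nilfs2` just before the `sync` would list the
NILFS2 partitions the kernel still considers mounted at that point.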

Could it be that `umount ... nonilfs2` modifies /etc/mtab, which lives
on the NILFS2 / partition, and that NILFS2 does not finish that update
in time to make sync (or the next `umount ... nilfs2`) happy?  I'm
only speculating on this idea.
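
One way to probe that speculation is to look at what /etc/mtab is on the
machine.  The sketch below uses a hypothetical function name and assumes
nothing about this particular Slackware install: if /etc/mtab is a
regular file on /, each umount that updates it writes to the NILFS2
root; if it is a symlink into /proc, umount has nothing to write there.

```shell
# Hypothetical helper: report whether an mtab path is a symlink, a
# regular file, or missing.  Defaults to /etc/mtab.
mtab_kind() {
    path="${1:-/etc/mtab}"
    if [ -L "$path" ]; then
        echo "symlink -> $(readlink "$path")"
    elif [ -f "$path" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}
```

If it turns out to be a regular file, a possible experiment is to add
umount's `-n` (no-mtab) option to the first umount line and see whether
the hang rate changes.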

Thanks!

Michael

