From: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
To: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Christian Kujau <lists@nerdbynature.de>, Jan Kara <jack@suse.cz>,
Eric Sandeen <sandeen@redhat.com>,
mszeredi@suse.cz, Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: EXT4-fs (dm-1): Couldn't remount RDWR because of unprocessed orphan inode list
Date: Thu, 06 Oct 2011 19:12:33 +0900
Message-ID: <4E8D7F11.8050309@jp.fujitsu.com>
In-Reply-To: <alpine.DEB.2.01.1110051823380.8000@trent.utfs.org>
(2011/10/06 10:34), Christian Kujau wrote:
> On Wed, 5 Oct 2011 at 20:03, Jan Kara wrote:
>>> With Miklos' patches applied to -rc5, this happend again just now :-(
>>>
>> Thanks for careful testing! Hmm, since you are able to reproduce on ppc
>> but not on x86 there might be some memory ordering bug in Miklos' patches
>> or it's simply because of different timing. Miklos, care to debug this
>> further?
>
> Just to be clear: I'm still not entirely sure how to reproduce this at
> will. I *assumed* that the daily remount-rw-and-ro-again routine that left
> some inodes in limbo and eventually lead to those "unprocessed orphan
> inodes". With that in mind I tried to reproduce this with the help of a
> test-script (test-remount.sh, [0]) - but the message did not occur while
> the script was running.
>
> I've ran the script again today on the said powerpc machine on a
> loop-mounted 500MB ext4 partition. But even after 100 iterations no
> such message occured.
>
> So maybe it's caused by something else or my test-script just doesn't get
> the scenario right and there's something subtle to this whole
> remounting-business I haven't figured out yet, leading to those orphan
> inodes.
>
> I'm at 3.1.0-rc9 now and will wait until the errors occur again.
>
> Christian.
>
> [0] nerdbynature.de/bits/3.1-rc4/ext4/
With Miklos' patches applied to -rc8, I could reproduce the message
"Couldn't remount RDWR because of unprocessed orphan inode list"
on my x86_64 machine with my reproducer.

This happens because the actual removal of the inode starts outside the
range between mnt_want_write() and mnt_drop_write(), even though
do_unlinkat() and do_rmdir() call mnt_want_write() and mnt_drop_write()
to prevent the filesystem from being remounted read-only.
My reproducer is as follows:
-----------------------------------------------------------------------------
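[1] go.sh
#!/bin/bash
# The "for ((...))" and "((...))" constructs below require bash, not plain sh.

# Create a sparse image, make an ext4 filesystem on it and mount it.
dd if=/dev/zero of=/tmp/img bs=1k count=1 seek=1000k > /dev/null 2>&1
/sbin/mkfs.ext4 -Fq /tmp/img
mount -o loop /tmp/img /mnt

# Create and remove files in the background while we remount.
./writer.sh /mnt &

# Alternate between read-only and read-write remounts once per second.
LOOP=1000000000
for ((i=0; i<LOOP; i++));
do
	echo "[$i]"
	if ((i%2 == 0));
	then
		mount -o ro,remount,loop /mnt
	else
		mount -o rw,remount,loop /mnt
	fi
	sleep 1
done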
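[2] writer.sh
#!/bin/bash
# The "for ((...))" and "((...))" constructs below require bash, not plain sh.

dir=$1
for ((i=0;i<10000000;i++));
do
	# Create 64 small files in parallel ...
	for ((j=0;j<64;j++));
	do
		filename="$dir/file$((i*64 + j))"
		dd if=/dev/zero of="$filename" bs=1k count=8 > /dev/null 2>&1 &
	done
	# ... and remove them in parallel, racing with the writers above.
	for ((j=0;j<64;j++));
	do
		filename="$dir/file$((i*64 + j))"
		rm -f "$filename" > /dev/null 2>&1 &
	done
	wait
	# Every 100 iterations, clean up any files that survived the race.
	if ((i%100 == 0 && i > 0));
	then
		rm -f "$dir"/file*
	fi
done
exit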
[step to run]
# ./go.sh
-----------------------------------------------------------------------------
Therefore, we need a mechanism to prevent a filesystem from re-mounting
read-only until actual removal finishes.
------------------------------------------------------------------------
[example fix]
do_unlinkat() {
	...
	mnt_want_write()
	vfs_unlink()
	if (inode && inode->i_nlink == 0) {			// added
		atomic_inc(&inode->i_sb->s_unlink_count);	// added
		inode->i_deleting++;				// added
	}							// added
	mnt_drop_write()
	...
	iput()	// usually, the actual removal starts here
	...
}

destroy_inode() {
	...
	if (inode->i_deleting)
		atomic_dec(&inode->i_sb->s_unlink_count);
	...
}

do_remount_sb() {
	...
	else if (!fs_may_remount_ro(sb) || atomic_read(&sb->s_unlink_count))
		return -EBUSY;
	...
}
------------------------------------------------------------------------
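The counting idea in the example fix above can be tried out in userspace.
The following is only a minimal sketch under stated assumptions: struct
toy_sb, struct toy_inode and the toy_*() functions are hypothetical
stand-ins, not kernel code, and the guard that increments the counter at
most once per inode is my simplification. It shows that a read-only
remount is refused with -EBUSY while an unlinked inode has not yet been
destroyed.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for struct super_block / struct inode. */
struct toy_sb {
	atomic_int s_unlink_count;	/* unlinked inodes not yet destroyed */
	bool read_only;
};

struct toy_inode {
	struct toy_sb *i_sb;
	unsigned int i_nlink;
	int i_deleting;
};

/* Models the tail of do_unlinkat(): drop a link, then pin the superblock. */
void toy_unlink(struct toy_inode *inode)
{
	if (inode->i_nlink > 0)
		inode->i_nlink--;
	if (inode->i_nlink == 0 && !inode->i_deleting) {
		atomic_fetch_add(&inode->i_sb->s_unlink_count, 1);
		inode->i_deleting = 1;
	}
}

/* Models destroy_inode(): the actual removal has finished. */
void toy_destroy_inode(struct toy_inode *inode)
{
	if (inode->i_deleting) {
		int old = atomic_fetch_sub(&inode->i_sb->s_unlink_count, 1);

		assert(old > 0);	/* the counter must never go negative */
		inode->i_deleting = 0;
	}
}

/* Models the read-only branch of do_remount_sb(). */
int toy_remount_ro(struct toy_sb *sb)
{
	if (atomic_load(&sb->s_unlink_count) > 0)
		return -EBUSY;
	sb->read_only = true;
	return 0;
}
```

Calling toy_unlink() on an inode with one link and then toy_remount_ro()
returns -EBUSY; after toy_destroy_inode() the same remount succeeds.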
In addition, my reproducer also triggers the following message:
"Ext4-fs (xxx): ext4_da_writepages: jbd2_start: xxx pages, ino xx: err -30"

This is because ext4_remount() cannot guarantee that all ext4 filesystem
data has been written out, due to the delayed allocation feature:
ext4_da_writepages() fails (err -30 is -EROFS) once ext4_remount() has set
MS_RDONLY in sb->s_flags.
Therefore, we must write out all delayed allocation buffers before
ext4_remount() sets MS_RDONLY in sb->s_flags.
------------------------------------------------------------------------
[example fix] // This requires Miklos' patches.
ext4_remount() {
	...
	if (*flags & MS_RDONLY) {
		err = dquot_suspend(sb, -1);
		if (err < 0)
			goto restore_opts;
		sync_filesystem(sb);	// added: write all delayed buffers out
		sb->s_flags |= MS_RDONLY;
		...
	}
	...
}
------------------------------------------------------------------------
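The ordering problem above can also be illustrated with a small userspace
model. Again this is only a sketch under stated assumptions: struct toy_fs
and the toy_*() functions are hypothetical, and toy_writepages() merely
stands in for delayed-allocation writeback that needs a writable
filesystem. It compares setting the read-only flag first with flushing
first.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of a filesystem with delayed allocation. */
struct toy_fs {
	int delalloc_pages;	/* dirty pages with no blocks allocated yet */
	bool read_only;
};

/* Models writeback: allocating blocks is refused on a read-only fs. */
int toy_writepages(struct toy_fs *fs)
{
	assert(fs->delalloc_pages >= 0);
	if (fs->delalloc_pages == 0)
		return 0;
	if (fs->read_only)
		return -EROFS;
	fs->delalloc_pages = 0;
	return 0;
}

/* Ordering from the report: flag first, so later writeback fails. */
int toy_remount_ro_flag_first(struct toy_fs *fs)
{
	fs->read_only = true;
	return toy_writepages(fs);
}

/* Proposed ordering: flush first, then set the flag. */
int toy_remount_ro_sync_first(struct toy_fs *fs)
{
	int err = toy_writepages(fs);	/* stands in for sync_filesystem() */

	if (err < 0)
		return err;
	fs->read_only = true;
	return toy_writepages(fs);	/* nothing left to write */
}
```

With dirty pages pending, toy_remount_ro_flag_first() returns -EROFS and
leaves the pages unwritten, while toy_remount_ro_sync_first() returns 0.
The model ignores writers that dirty new pages between the flush and the
flag update, which is why the example fix depends on Miklos' patches.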
Best Regards,
Toshiyuki Okajima