From: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
To: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Christian Kujau <lists@nerdbynature.de>, Jan Kara <jack@suse.cz>,
Eric Sandeen <sandeen@redhat.com>,
mszeredi@suse.cz, Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: EXT4-fs (dm-1): Couldn't remount RDWR because of unprocessed orphan inode list
Date: Thu, 06 Oct 2011 19:12:33 +0900
Message-ID: <4E8D7F11.8050309@jp.fujitsu.com>
In-Reply-To: <alpine.DEB.2.01.1110051823380.8000@trent.utfs.org>
(2011/10/06 10:34), Christian Kujau wrote:
> On Wed, 5 Oct 2011 at 20:03, Jan Kara wrote:
>>> With Miklos' patches applied to -rc5, this happened again just now :-(
>>>
>> Thanks for careful testing! Hmm, since you are able to reproduce on ppc
>> but not on x86 there might be some memory ordering bug in Miklos' patches
>> or it's simply because of different timing. Miklos, care to debug this
>> further?
>
> Just to be clear: I'm still not entirely sure how to reproduce this at
> will. I *assumed* that the daily remount-rw-and-ro-again routine left
> some inodes in limbo and eventually led to those "unprocessed orphan
> inodes". With that in mind I tried to reproduce this with the help of a
> test-script (test-remount.sh, [0]) - but the message did not occur while
> the script was running.
>
> I've run the script again today on the said powerpc machine on a
> loop-mounted 500MB ext4 partition. But even after 100 iterations no
> such message occurred.
>
> So maybe it's caused by something else or my test-script just doesn't get
> the scenario right and there's something subtle to this whole
> remounting-business I haven't figured out yet, leading to those orphan
> inodes.
>
> I'm at 3.1.0-rc9 now and will wait until the errors occur again.
>
> Christian.
>
> [0] nerdbynature.de/bits/3.1-rc4/ext4/
With Miklos' patches applied to -rc8, I could reproduce the message
"Couldn't remount RDWR because of unprocessed orphan inode list"
on my x86_64 machine with my reproducer.
This happens because the actual removal of an inode starts outside the range
between mnt_want_write() and mnt_drop_write(). do_unlinkat() and do_rmdir()
call mnt_want_write() and mnt_drop_write() to prevent the filesystem from
being remounted read-only, but the final iput(), which usually starts the
actual removal, runs only after mnt_drop_write().
My reproducer is as follows:
-----------------------------------------------------------------------------
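[1] go.sh
#!/bin/bash
# Create a sparse image of about 1 GB, put ext4 on it and mount it.
dd if=/dev/zero of=/tmp/img bs=1k count=1 seek=1000k > /dev/null 2>&1
/sbin/mkfs.ext4 -Fq /tmp/img
mount -o loop /tmp/img /mnt
# Create and remove files in the background while remounting.
./writer.sh /mnt &
LOOP=1000000000
for ((i=0; i<LOOP; i++));
do
        echo "[$i]"
        # Alternate between read-only and read-write remounts, once per second.
        if ((i%2 == 0));
        then
                mount -o ro,remount,loop /mnt
        else
                mount -o rw,remount,loop /mnt
        fi
        sleep 1
done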
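[2] writer.sh
#!/bin/bash
dir=$1
for ((i=0;i<10000000;i++));
do
        # Create 64 files of 8 KB each in parallel ...
        for ((j=0;j<64;j++));
        do
                filename="$dir/file$((i*64 + j))"
                dd if=/dev/zero of="$filename" bs=1k count=8 > /dev/null 2>&1 &
        done
        # ... and remove them in parallel, racing with the writers above.
        for ((j=0;j<64;j++));
        do
                filename="$dir/file$((i*64 + j))"
                rm -f "$filename" > /dev/null 2>&1 &
        done
        wait
        # Every 100 iterations, clean up any files that are left over.
        if ((i%100 == 0 && i > 0));
        then
                rm -f "$dir"/file*
        fi
done
exit
[steps to run]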
# ./go.sh
-----------------------------------------------------------------------------
Therefore, we need a mechanism that prevents a filesystem from being
remounted read-only until the actual removal has finished.
------------------------------------------------------------------------
[example fix]
do_unlinkat() {
        ...
        mnt_want_write()
        vfs_unlink()
        if (inode && inode->i_nlink == 0) {                     //
                atomic_inc(&inode->i_sb->s_unlink_count);       //
                inode->i_deleting++;                            //
        }                                                       //
        mnt_drop_write()
        ...
        iput()          // usually, the actual removal starts here
        ...
}
destroy_inode() {
        ...
        if (inode->i_deleting)
                atomic_dec(&inode->i_sb->s_unlink_count);
        ...
}
do_remount_sb() {
        ...
        else if (!fs_may_remount_ro(sb) || atomic_read(&sb->s_unlink_count))
                return -EBUSY;
        ...
}
------------------------------------------------------------------------
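For anyone who wants to play with the idea outside the kernel, the counter
from the example fix can be modelled in userspace C roughly as follows. This
is only a sketch under my own assumptions: the model_* names are
hypothetical, the field names s_unlink_count and i_deleting are taken from
the pseudocode above, and resetting i_deleting after the decrement is my
addition to keep the model from decrementing twice.

```c
/* Userspace model only; field names follow the pseudocode, not real kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_sb {
    atomic_int s_unlink_count;  /* inodes unlinked but not yet destroyed */
    bool read_only;
};

struct model_inode {
    struct model_sb *i_sb;
    unsigned int i_nlink;
    int i_deleting;
};

/* Models the tail of do_unlinkat(): account for a pending removal. */
static void model_unlink(struct model_inode *inode)
{
    assert(inode->i_nlink > 0);
    inode->i_nlink--;
    if (inode->i_nlink == 0) {
        atomic_fetch_add(&inode->i_sb->s_unlink_count, 1);
        inode->i_deleting++;
    }
}

/* Models destroy_inode(): the actual removal has finished. */
static void model_destroy_inode(struct model_inode *inode)
{
    if (inode->i_deleting) {
        atomic_fetch_sub(&inode->i_sb->s_unlink_count, 1);
        inode->i_deleting = 0;  /* assumption: avoid a second decrement */
    }
}

/* Models the extra check in do_remount_sb(). */
static int model_remount_ro(struct model_sb *sb)
{
    if (atomic_load(&sb->s_unlink_count) > 0)
        return -EBUSY;
    sb->read_only = true;
    return 0;
}
```

With this model, a read-only remount attempted between model_unlink() and
model_destroy_inode() returns -EBUSY, and the same remount succeeds once the
inode has been destroyed.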
Besides, my reproducer also triggers the following message:
"Ext4-fs (xxx): ext4_da_writepages: jbd2_start: xxx pages, ino xx: err -30"
This is because ext4_remount() cannot guarantee that all ext4 filesystem
data has been written out, due to the delayed allocation feature:
ext4_da_writepages() fails with -EROFS (err -30) once ext4_remount() has
set MS_RDONLY in sb->s_flags.
Therefore, we must write out all delayed allocation buffers before
ext4_remount() sets MS_RDONLY in sb->s_flags.
------------------------------------------------------------------------
[example fix] // This requires Miklos' patches.
ext4_remount() {
        ...
        if (*flags & MS_RDONLY) {
                err = dquot_suspend(sb, -1);
                if (err < 0)
                        goto restore_opts;
                sync_filesystem(sb);    // write out all delayed allocation buffers
                sb->s_flags |= MS_RDONLY;
                ...
        }
        ...
}
------------------------------------------------------------------------
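The ordering problem can also be shown with a small userspace model in C.
Again this is a sketch, not ext4 code: the model_* names are hypothetical,
"delayed_pages" stands in for dirty delayed allocation buffers, and the
model assumes that no new writes arrive between the sync and the point where
the read-only flag is set.

```c
/* Userspace model only; all names are hypothetical, not ext4 code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_sb {
    bool read_only;     /* models MS_RDONLY in sb->s_flags */
    int delayed_pages;  /* dirty pages whose blocks are not allocated yet */
};

/* Models delayed allocation writeback: allocating blocks needs a journal
 * transaction, which fails with -EROFS on a read-only filesystem. */
static int model_da_writepages(struct model_sb *sb)
{
    if (sb->delayed_pages == 0)
        return 0;
    if (sb->read_only)
        return -EROFS;
    sb->delayed_pages = 0;
    return 0;
}

/* Current order: the flag is set while delayed pages may still be dirty. */
static void model_remount_ro_without_sync(struct model_sb *sb)
{
    sb->read_only = true;
}

/* Proposed order: write everything out first, then set the flag. */
static int model_remount_ro_sync_first(struct model_sb *sb)
{
    int err = model_da_writepages(sb);
    if (err < 0)
        return err;
    sb->read_only = true;
    return 0;
}
```

In the model, writeback after model_remount_ro_without_sync() fails with
-EROFS and the pages stay dirty, while model_remount_ro_sync_first() leaves
nothing behind for a later writeback to fail on.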
Best Regards,
Toshiyuki Okajima