linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Richard Narron <comet.berkeley@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [git pull] first batch of ufs fixes
Date: Thu, 15 Jun 2017 09:00:09 +0100	[thread overview]
Message-ID: <20170615080009.GE31671@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20170614071133.GD31671@ZenIV.linux.org.uk>

On Wed, Jun 14, 2017 at 08:11:33AM +0100, Al Viro wrote:
> NOTE: all I have is your image *after* it had counters buggered; I don't
> know the exact sequence of operations that fucked it in your case.  One
> way to trigger it is to mount/umount on OpenBSD, then mount/modify/umount
> on Linux, then mount/umount on OpenBSD, then fsck on OpenBSD.  This patch
> apparently fixes that, but your reproducer might be something different.

FWIW, it seems to work here.  Said that, *BSD fsck_ffs is not worth much -
play a bit with redundancy in UFS superblock (starting with fragment
and block sizes, their ratio, logarithms, bitmasks, etc.) and you can
screw at least 10.3 into the ground when mounting an image that passes
their fsck.  Sure, anyone who mounts untrusted images is a cretin who
deserves everything they get, fsck or no fsck, but... no complaints from
fsck is not a reliable indicator of image being in good condition and
that's PITA for testing.

Another pile of fun: "reserve ->s_minfree percents of total" logics had
been broken.
	* using hardwired 5% is wrong - especially for ufs2, where it's
not even the default
	* ufs_freespace() returns u64; testing for <= 0 is not doing the
right thing
	* no capability checks before we need them, TYVM...
	* ufs2 needs 64bit uspi->s_dsize (and ->s_size, while we are at it).
64bit variants were even calculated - and never used.
	* while we are at it, doing "multiply the total data frags by
s_minfree and divide by 100" every time we allocate a block is bloody
dumb - that should be calculated once.

We really need to get the sodding tail unpacking moved up from the place
where it's buried - turns out that my doubts about that code managing to
avoid deadlocks had been correct.  Long-term we need to move that thing
to iomap-based ->write_iter() and do unpacking there and in truncate().
For now I've slapped together something that is easier to backport -
avoiding ->truncate_mutex when possible and not holding ->s_lock over
ufs_change_blocknr().

Another bug in the same area: ufs_get_locked_page() doesn't guarantee
that buffer_heads are attached (race with vmscan trying to evict the
page in question can end with buffer_heads freed and page left alive
and uptodate).  Callers do expect buffer_heads to be there, so we either
need to do create_empty_buffers() in those callers or in ufs_get_locked_page();
I went for the latter for now.

Off-by-one in ufs_truncate_blocks(): the logics when deciding whether
we need to do anything with direct blocks is broken when new size is
within the last direct block.  It's better to find the path to the
last byte _not_ to be removed and use that instead of the path to the
beginning of the first block to be freed.

I've pushed fixes for those into vfs.git#ufs-fixes; they do need more
testing before I send a pull request, though.

  parent reply	other threads:[~2017-06-15  8:00 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-09 21:38 [git pull] first batch of ufs fixes Al Viro
2017-06-10 13:03 ` Richard Narron
2017-06-10 16:07   ` Al Viro
2017-06-10 18:08     ` Al Viro
2017-06-10 18:12       ` Linus Torvalds
2017-06-11 19:47       ` Richard Narron
2017-06-11 21:30         ` Al Viro
2017-06-12  6:14         ` Al Viro
2017-06-13  0:54           ` Richard Narron
2017-06-13  1:43             ` Al Viro
2017-06-13 21:56               ` Richard Narron
2017-06-14  7:11                 ` Al Viro
2017-06-14 20:33                   ` Richard Narron
2017-06-15  8:00                   ` Al Viro [this message]
2017-06-16 14:29                     ` Richard Narron
2017-06-17  2:15                       ` Al Viro
2017-06-18  1:09                         ` Al Viro
2017-06-18 20:45                           ` Richard Narron
2017-06-20  5:17                           ` [git pull] " Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170615080009.GE31671@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=comet.berkeley@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).