linux-fsdevel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Sven Anderson <sven@anderson.de>
Cc: Chris Mason <chris.mason@oracle.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Faulty has_zero()? (was: .. anybody know of any filesystems that depend on the exact VFS 'namehash' implementation?)
Date: Sat, 17 Mar 2012 09:53:37 -0700
Message-ID: <CA+55aFzg7Wm6hNyFLrtPB5HOdu0SSkZrbLXFJrgreEN+LYEsdg@mail.gmail.com>
In-Reply-To: <FA611BE7-A574-444D-9C1E-7E32CB3BD659@anderson.de>

On Sat, Mar 17, 2012 at 5:29 AM, Sven Anderson <sven@anderson.de> wrote:
>
> On 01.03.2012 at 23:42, Linus Torvalds wrote:
>
>> +/* Return the high bit set in the first byte that is a zero */
>> +static inline unsigned long has_zero(unsigned long a)
>> +{
>> +     return ((a - ONEBYTES) & ~a) & HIGHBITS;
>> +}
>
> (I commented this on your google+ posting as well, but I'm not sure if you will notice it there.)
>
> Out of curiosity I studied your code, and if I'm not mistaken your has_zero() function doesn't do what is expected. If there are leading 0x01 bytes in front of a NUL byte, they are also marked in the mask because of the borrow bit.

So has_zero() isn't guaranteed to return a mask of *all* zero bytes -
only of the *first* one. And that's the only one we care about. Any
subsequent zero bytes are suspect, but we don't use that information -
we mask it all away with the "(mask-1) & ~mask" step.

So what has_zero() guarantees is two-fold:

 - if there are no zeroes at all, it will return 0.

 - if there is one or more zero bytes, it will return with the high
bit set in the *first* zero byte (and nothing below that).

But if it returns non-zero, only the first bit is "guaranteed". Any of
the high bits in the higher bytes are indeed suspect, exactly because
of borrow.
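
As a rough illustration of the two guarantees and of the borrow effect,
here is a small standalone sketch. It assumes a 64-bit word and uses
uint64_t instead of unsigned long; the values given for ONEBYTES and
HIGHBITS, and the has_zero_demo() helper, are assumptions made for this
example rather than part of the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit values for the constants named in the quoted code. */
#define ONEBYTES 0x0101010101010101ULL
#define HIGHBITS 0x8080808080808080ULL

static inline uint64_t has_zero(uint64_t a)
{
	return ((a - ONEBYTES) & ~a) & HIGHBITS;
}

/* Hypothetical self-check; byte 0 is the least significant byte. */
static void has_zero_demo(void)
{
	uint64_t r;

	/* No zero byte at all: the result is 0. */
	assert(has_zero(0x6867666564636261ULL) == 0);

	/* Byte 2 is zero, nothing special above it: one bit, in byte 2. */
	assert(has_zero(0x6867666564006261ULL) == 0x0000000000800000ULL);

	/*
	 * Byte 2 is zero and byte 3 is 0x01: the borrow out of byte 2
	 * also sets the high bit of byte 3, although byte 3 is not zero.
	 */
	r = has_zero(0x6867666501006261ULL);
	assert(r == 0x0000000080800000ULL);

	/* The lowest set bit still marks the first zero byte. */
	assert((r & (~r + 1)) == 0x0000000000800000ULL);
}
```

The third case is the one raised in the report: the bit in byte 3 is
bogus, but it sits above the bit for the real zero byte in byte 2.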

But for our use, we simply just don't care - we only ever care about
the *first* NUL byte.
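
The "(mask-1) & ~mask" step can be sketched as follows. This is only an
illustration under the same 64-bit assumption as the constants below;
first_zero_index() and its byte-counting loop are hypothetical and are
not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit values for the constants named in the quoted code. */
#define ONEBYTES 0x0101010101010101ULL
#define HIGHBITS 0x8080808080808080ULL

static inline uint64_t has_zero(uint64_t a)
{
	return ((a - ONEBYTES) & ~a) & HIGHBITS;
}

/*
 * Hypothetical helper: index (0..7, counted from the least significant
 * byte) of the first zero byte in 'a', or 8 if there is none.
 */
static unsigned int first_zero_index(uint64_t a)
{
	uint64_t mask = has_zero(a);
	unsigned int index = 0;

	if (!mask)
		return 8;

	/*
	 * Keep only the bits below the lowest set bit. Any suspect bits
	 * above it (caused by borrow) are thrown away here.
	 */
	mask = (mask - 1) & ~mask;

	/*
	 * The lowest set bit was bit 8*k+7, so after shifting right by 7
	 * exactly k bytes of 0xff remain, one per byte before the zero.
	 */
	mask >>= 7;

	while (mask) {
		index++;
		mask >>= 8;
	}
	return index;
}

static void first_zero_index_demo(void)
{
	assert(first_zero_index(0x6867666564636261ULL) == 8);
	/* The bogus bit in byte 3 (0x01 above the zero) does not matter. */
	assert(first_zero_index(0x6867666501006261ULL) == 2);
	assert(first_zero_index(0x6867666564636200ULL) == 0);
	assert(first_zero_index(0x0067666564636261ULL) == 7);
}
```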

This is why the algorithm is fundamentally little-endian only. You may
have confused your test-case on a big-endian machine. On big-endian,
you would need to do a byte-switching load.
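
To make the endianness point concrete, here is a sketch of a
byte-order-independent load. load_le64() is a hypothetical helper
written for this example; it always places the first byte in memory
into the least significant byte of the word, which is what a
little-endian machine gets for free from a plain load.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit values for the constants named in the quoted code. */
#define ONEBYTES 0x0101010101010101ULL
#define HIGHBITS 0x8080808080808080ULL

static inline uint64_t has_zero(uint64_t a)
{
	return ((a - ONEBYTES) & ~a) & HIGHBITS;
}

/* Hypothetical helper: p[0] ends up in the least significant byte. */
static uint64_t load_le64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

static void endian_demo(void)
{
	/* In memory: 0x01 first, then the NUL, then ordinary bytes. */
	static const unsigned char s[8] = {
		0x01, 0x00, 'c', 'd', 'e', 'f', 'g', 'h'
	};
	uint64_t r;

	/* Little-endian view: the lowest set bit is in byte 1, the NUL. */
	r = has_zero(load_le64(s));
	assert((r & (~r + 1)) == 0x0000000000008000ULL);

	/*
	 * The value a native big-endian load would produce for the same
	 * memory. The first byte in memory is now the *highest* byte, and
	 * that is exactly where borrow leaves a bogus bit: the 0x01 byte
	 * gets flagged as if it were a zero.
	 */
	assert(has_zero(0x0100636465666768ULL) == 0x8080000000000000ULL);
}
```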

And when we do the "NUL or slash" thing, we will OR the two cases
together, which means that we catch the first NUL or slash, and even
though *both* of those are suspect in the higher bits, once more, we
won't care. We will only ever look at the *lowest* of the high bits in
the bytes (that whole "(mask-1) & ~mask" thing), which is exactly why
borrow will never matter for us: it can only affect bits above the
lowest one.
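
A sketch of the "NUL or slash" combination, under the same 64-bit
assumption as the earlier constants. SLASHBYTES and
first_nul_or_slash() are names invented for this example; XORing with
a word full of '/' turns every slash byte into a zero byte, so the same
has_zero() test finds it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit values for the constants named in the quoted code. */
#define ONEBYTES 0x0101010101010101ULL
#define HIGHBITS 0x8080808080808080ULL
/* Hypothetical constant: '/' (0x2f) repeated in every byte. */
#define SLASHBYTES (ONEBYTES * '/')

static inline uint64_t has_zero(uint64_t a)
{
	return ((a - ONEBYTES) & ~a) & HIGHBITS;
}

/*
 * Hypothetical helper: index (0..7, counted from the least significant
 * byte) of the first NUL or '/' byte in 'a', or 8 if there is none.
 */
static unsigned int first_nul_or_slash(uint64_t a)
{
	uint64_t mask = has_zero(a) | has_zero(a ^ SLASHBYTES);
	unsigned int index = 0;

	if (!mask)
		return 8;

	/* Only the lowest set bit is trusted; drop everything above it. */
	mask = ((mask - 1) & ~mask) >> 7;

	while (mask) {
		index++;
		mask >>= 8;
	}
	return index;
}

static void nul_or_slash_demo(void)
{
	/* "usr/bin\0" loaded little-endian: the slash is byte 3. */
	assert(first_nul_or_slash(0x006E69622F727375ULL) == 3);
	/* "ab\0/efgh": the NUL in byte 2 comes before the slash. */
	assert(first_nul_or_slash(0x686766652F006261ULL) == 2);
	/* "abcdefgh": neither a NUL nor a slash. */
	assert(first_nul_or_slash(0x6867666564636261ULL) == 8);
}
```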

                      Linus
