From: Michael Cree <mcree@orcon.net.nz>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>,
	David Miller <davem@davemloft.net>,
	James.Bottomley@hansenpartnership.com, geert@linux-m68k.org,
	linux-arch@vger.kernel.org
Subject: Re: Arch maintainers Ahoy!
Date: Wed, 13 Jun 2012 23:08:20 +1200
Message-ID: <4FD874A4.8060606@orcon.net.nz>
In-Reply-To: <CA+55aFzwZ3mCeALQx7fXX_HO4GZF03q4gtZOFGmdWkJrvwjrJg@mail.gmail.com>

On 25/05/12 03:53, Linus Torvalds wrote:
> First off, the *last* thing you want to do is go to big-endian mode.
> All the bit counting gets *much* more complicated, and your argument
> that it's "free" on some architectures is pointless, since it is only
> free on the architectures that have the *least* users.

On Alpha we can find the zero bytes extremely efficiently, and, yeah,
we have rather few users, so carry bugger-all weight.  Nevertheless I
want to ask about the semantics of the new prep_zero_mask() function
because if we have to implement it exactly as specified in the message
to commit 36126f8f2ed8 then we are forced to take a round-about, thus
less efficient, route in the find_zero() implementation on Alpha.

From commit 36126f8f2ed8 prep_zero_mask() must, and I quote, "generate
an *exact* mask of which byte had the first zero."  But the result of
prep_zero_mask() in all extant usage is passed _only_ to
create_zero_mask().  It seems to me then that current usage is only
constrained by the following:

1) The results of prep_zero_mask() must be bitwise "OR"-able, and the
ORed value must in turn be a valid mask of zero bytes.

2) The result is only ever passed to create_zero_mask() which, like
prep_zero_mask(), is architecture specific.

But nothing in the kernel (other than a commit message) currently
requires the result of prep_zero_mask() to be an *exact* mask of the
zero bytes, only that it be *a* mask of zero bytes.
 The difference is important to Alpha because if we can have a mask
where the lowest eight bits represent each byte (rather than a 64-bit
mask where a whole eight bits are set to represent a byte) we get an
extremely efficient implementation.
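
To make the two representations concrete, here is a minimal sketch in
portable C. It is only an illustration: compact_zero_mask() is a
hypothetical software stand-in for what cmpbge(0, val) yields on Alpha,
and bytewide_zero_mask() assumes the byte-wide form marks each zero byte
by setting that byte's top bit (the exact encoding is an assumption,
not something stated in the commit message).

```c
#include <stdint.h>

/*
 * Compact form (hypothetical stand-in for cmpbge(0, val)):
 * bit i of the result is set when byte i of val is zero.
 */
static uint64_t compact_zero_mask(uint64_t val)
{
        uint64_t mask = 0;
        int i;

        for (i = 0; i < 8; i++) {
                if (((val >> (8 * i)) & 0xff) == 0)
                        mask |= 1ull << i;
        }
        return mask;
}

/*
 * Byte-wide form (assumed encoding): the top bit of byte i of the
 * result is set when byte i of val is zero.
 */
static uint64_t bytewide_zero_mask(uint64_t val)
{
        uint64_t mask = 0;
        int i;

        for (i = 0; i < 8; i++) {
                if (((val >> (8 * i)) & 0xff) == 0)
                        mask |= 0x80ull << (8 * i);
        }
        return mask;
}
```

For val = 0x4141004141004141 (zero bytes at index 2 and 5) the compact
form is 0x24 and the byte-wide form is 0x0000800000800000. Both say the
same thing, and both stay valid masks when ORed with another mask of the
same form.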

So, may I generalise prep_zero_mask() as suggested above?

Below is the Alpha code for word-at-a-time.h that results if I may
(it is running fine on my Alpha):


/*
 * We do not use the word_at_a_time struct on Alpha, but it needs to be
 * implemented to humour the generic code.
 */
struct word_at_a_time {
        const unsigned long unused;
};

#define WORD_AT_A_TIME_CONSTANTS { 0 }

/* Return nonzero if val has a zero */
static inline unsigned long has_zero(unsigned long val,
                unsigned long *bits, const struct word_at_a_time *c)
{
        unsigned long zero_locations = __kernel_cmpbge(0, val);
        *bits = zero_locations;
        return zero_locations;
}

static inline unsigned long prep_zero_mask(unsigned long val,
                unsigned long bits, const struct word_at_a_time *c)
{
        return bits;
}

#define create_zero_mask(bits) (bits)

static inline unsigned long find_zero(unsigned long bits)
{
        return bits & (unsigned long)(-(long)bits);
}
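
For anyone wanting to poke at this without an Alpha, the following is a
rough userspace sketch of how the two constraints listed earlier play
out. Everything in it is an assumption made for illustration:
cmpbge_zero() emulates __kernel_cmpbge(0, val) in software, load_le64()
builds the word in little-endian byte order, and first_nul_or_slash()
is a hypothetical caller, not kernel code. It also assumes callers want
a byte index back, so the hypothetical find_zero_index() counts
trailing zero bits rather than isolating the lowest set bit as the
find_zero() above does.

```c
#include <stdint.h>

#define REPEAT_BYTE(x) ((~0ull / 0xff) * (x))

/* Software stand-in for __kernel_cmpbge(0, val): bit i set if byte i is zero. */
static uint64_t cmpbge_zero(uint64_t val)
{
        uint64_t mask = 0;
        int i;

        for (i = 0; i < 8; i++) {
                if (((val >> (8 * i)) & 0xff) == 0)
                        mask |= 1ull << i;
        }
        return mask;
}

static uint64_t has_zero(uint64_t val, uint64_t *bits)
{
        *bits = cmpbge_zero(val);
        return *bits;
}

/* With the compact mask there is nothing left to prepare or create. */
static uint64_t prep_zero_mask(uint64_t val, uint64_t bits)
{
        (void)val;
        return bits;
}

static uint64_t create_zero_mask(uint64_t bits)
{
        return bits;
}

/* Index of the lowest set bit, i.e. of the first zero byte; mask must be nonzero. */
static unsigned int find_zero_index(uint64_t mask)
{
        unsigned int idx = 0;

        while (!(mask & 1)) {
                mask >>= 1;
                idx++;
        }
        return idx;
}

/* Build a word from 8 bytes so that byte 0 of the word is p[0]. */
static uint64_t load_le64(const char *p)
{
        uint64_t w = 0;
        int i;

        for (i = 0; i < 8; i++)
                w |= (uint64_t)(unsigned char)p[i] << (8 * i);
        return w;
}

/* Hypothetical caller: offset of the first NUL or '/' in p[0..7], or 8 if none. */
static unsigned int first_nul_or_slash(const char *p)
{
        uint64_t a = load_le64(p);
        uint64_t b = a ^ REPEAT_BYTE('/');      /* '/' bytes become zero bytes */
        uint64_t adata, bdata, mask;

        if (!(has_zero(a, &adata) | has_zero(b, &bdata)))
                return 8;

        /* Constraint 1: the two prepared masks are simply ORed together. */
        mask = create_zero_mask(prep_zero_mask(a, adata) |
                                prep_zero_mask(b, bdata));
        return find_zero_index(mask);
}
```

The ORed mask is still one bit per byte, so the first set bit marks
whichever of NUL or '/' comes first, which is all the caller needs.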

Cheers
Michael.

Thread overview: 25+ messages
2012-05-21 16:50 Arch maintainers Ahoy! (was Re: x86: faster strncpy_from_user()) Linus Torvalds
2012-05-23  5:46 ` Arch maintainers Ahoy! David Miller
2012-05-23  8:02   ` Geert Uytterhoeven
2012-05-23  9:40     ` James Bottomley
2012-05-23 15:15       ` Linus Torvalds
2012-05-23 17:21         ` David Miller
2012-05-23 17:32           ` Linus Torvalds
2012-05-23 18:16             ` David Miller
2012-05-23 18:27               ` Linus Torvalds
2012-05-23 18:35                 ` David Miller
2012-05-23 18:44                   ` Linus Torvalds
2012-05-23 18:46                   ` Linus Torvalds
2012-05-23 20:36                     ` David Miller
2012-05-23 21:01                       ` Linus Torvalds
2012-05-24  2:11                         ` David Miller
2012-05-24  5:25                           ` Paul Mackerras
2012-05-24  5:56                             ` David Miller
2012-05-24  9:40                 ` David Howells
2012-05-24 15:53                   ` Linus Torvalds
2012-05-24 16:45                     ` David Howells
2012-05-24 16:56                       ` Linus Torvalds
2012-05-24 17:16                         ` David Howells
2012-06-13 11:08                     ` Michael Cree [this message]
2012-06-13 14:51                       ` Linus Torvalds
2012-05-23 17:19       ` David Miller
