public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: crc32 cleanups
@ 2001-10-13  2:09 Stuart Lynne
  0 siblings, 0 replies; 35+ messages in thread
From: Stuart Lynne @ 2001-10-13  2:09 UTC (permalink / raw)
  To: linux-kernel

vonbrand@sleipnir.valparaiso.cl said:

> Matt_Domsch@Dell.com said:
> > > That leaves (a) unconditionally building 
> > > it into the kernel, or (b) Makefile and Config.in rules.
> 
> > (a) is simple, but needs a 1KB malloc (or alternately, a 1KB static const
> > array - I've taken the approach that the malloc is better)
> 
> Better static (less overhead in size and at runtime), initialized at build
> time (you could compute it then). In case of _dire_ kernel size problems, it
> can be left out anyway. AFAIU, there are now a _lot_ of copies of this
> around, so you'll win overall in any case.
> 
> > (b) isn't that much harder, but requires drivers to be sure to call
> > init_crc32 and cleanup_crc32.  If somehow they manage not to do that, Oops.
> > I don't want to add a runtime check for the existance of the array in
> > crc32().

I have to agree. A single global config that controls if CRC32 is available
as a static table, and an inline function that allows me to use it. 

Any module that tries to use it will either not compile or if will not load 
due an unresolved address.

Least impact. Simplest API. The lowest overhead for kernels that do or do
not need CRC. 

We should probably provide CRC10 and CRC16 as well.

-- 
                                            __O 
Lineo - Where Open  Meets Smart           _-\<,_ 
PGP Fingerprint: 28 E2 A0 15 99 62 9A 00 (_)/ (_) 88 EC A3 EE 2D 1C 15 68
Stuart Lynne <sl@lineo.com>            www.lineo.com         604-461-7532

^ permalink raw reply	[flat|nested] 35+ messages in thread
* RE: crc32 cleanups
@ 2001-10-23 21:19 Matt_Domsch
  0 siblings, 0 replies; 35+ messages in thread
From: Matt_Domsch @ 2001-10-23 21:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgarzik, kaos

Yet another try...

Changes since last time:
* Do away with modules calling init/cleanup_crc32().  This now happens using
module_init().  This avoids refcounting (though I finally got it right with
assistance) and unnecessary pain associated with knowing exactly when in
each module to call init/cleanup_crc32, how many times, etc.

* Each driver that uses crc32 functions now calls request_module("crc32") in
their init or probe function.  This way if crc32.o is a module it gets
loaded and initialized prior to use.  If it's static, no harm.

I'm concerned by Keith's comment:
> __initcall entries are executed in the order that they are linked
> into the kernel.  The linkage order is controlled by the order that
> Makefiles are processed during kbuild and by line order within each
> Makefile.  There is definitely a priority order for __initcall code.

Top-level Makefile has:
SUBDIRS = kernel drivers mm fs net ipc lib

And indeed, drivers and fs get built before lib.  Therefore, drivers that
use any of the crc32 functions as part of their probe or module_init will be
broken.  So, I've moved lib before drivers, both there and in the link
(where it really matters).  Now library __init functions are called after
the file system drivers (which I think is OK, as no file systems, even
initrds, are used before the crc32 functions), and before static drivers are
loaded, which is critical.  Any objections?

Patches at:
http://domsch.com/linux/patches/linux-2.4.13-pre6-crc32-lib-20011023.patch
http://domsch.com/linux/patches/linux-2.4.13-pre6-crc32-drivers-20011023.pat
ch
http://domsch.com/linux/patches/linux-2.4.13-pre6-gpt-20011023.patch
http://domsch.com/linux/patches/linux-2.4.13-pre6-efivars-20011023.patch


Feedback welcome.  I'm hoping to wrap this up soon... :-)

--
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!

^ permalink raw reply	[flat|nested] 35+ messages in thread
* RE: crc32 cleanups
@ 2001-10-16 19:05 Matt_Domsch
  0 siblings, 0 replies; 35+ messages in thread
From: Matt_Domsch @ 2001-10-16 19:05 UTC (permalink / raw)
  To: jgarzik; +Cc: linux-kernel

Changes since last time:
* A lot more drivers were updated.  These had in-lined the CRC32 code
(either BE or LE).  I think I got them all except natsemi, which I was
afraid I'd screw up.
* Incorporated the new crc32 library routines.  Both BE and LE now use a
table.  LE is faster, BE uses less space.  It's trivial to make this
tradeoff now.

> * Still need init_crc32:
>   8139too, au1000_eth, fealnx, smc91c92_cs, xircom_tulip_cb, smc9194,
>   via-rhine, winbond-840

Now that the BE functions use a table, yes.

> * lib/uuid.o needs to be in lib/Makefile's export-objs

Oops, good catch.  I meant to submit uuid as a separate patch, but I'll
include it in the crc32-lib + efivars patches (it's small).

> * in init/cleanup_crc32, don't hold spinlock during kmalloc/kfree.
>   Check out other refcounting schemes in the various fs/*.c files.

OK, hope this is better.

> * Add a Config.in entry, so that people can manually select to compile
>   crc32.o as a module.  This takes care of the 3rd party module case,
>   the module that might for example have been built some time
>   after the kernel itself was built.

OK.  Turns out this then needs to change arch/$arch/config.in to add
lib/Config.in to the menus too. :-(

Here's today's pass at all this.
http://domsch.com/linux/patches/linux-2.4.13-pre3-crc32-lib-20011016.patch
http://domsch.com/linux/patches/linux-2.4.13-pre3-crc32-drivers-20011016.pat
ch
http://domsch.com/linux/patches/linux-2.4.13-pre3-gpt-20011016.patch
http://domsch.com/linux/patches/linux-2.4.13-pre3-efivars-20011016.patch


This is working on my x86 system with CONFIG_EFI_PARTITION enabled, so I
feel pretty good about at least the LE crc32 function.  All the drivers that
could be built on x86 were, but I don't have hardware to test any of them,
nor any non-x86 platforms.

I'm not sure I got the init/cleanup right in all the modules.  I tried to
call init_crc32() once only if/when we know the driver's init routine
actually found one or more cards.  For the hassle of getting this right, I'm
tempted to make crc32.c have module_init/cleanup() that allocates the memory
unconditionally...  It's only 2*1KB at most.

Feedback appreciated.

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!


^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: crc32 cleanups
@ 2001-10-15 11:34 Samium Gromoff
  0 siblings, 0 replies; 35+ messages in thread
From: Samium Gromoff @ 2001-10-15 11:34 UTC (permalink / raw)
  To: bjorn; +Cc: linux-kernel

> > Does not work if all the code that uses crc32 is in a module.  No
> > references from the main kernel so crc32 is not included by the linker.
>
> So make the CRC32 code a module itself ?
   Nasty idea, since its ideologically unclean.
   The idea behind the CONFIG_xxx options is to give user the control over
 the kernel content in the cases he really can win from it.
   I suppose no sane user knows what crc32 is needed by, nor even whats it..   
   Nor he should, since maximum of his knowledge lies in device selection,
 which is the proper source for such information.

   What i`m trying to say is that CONFIG_xxx is just the wrong information
 bearer for that.

Cheers, Samium Gromoff


^ permalink raw reply	[flat|nested] 35+ messages in thread
* RE: crc32 cleanups
@ 2001-10-14  3:19 Matt_Domsch
  0 siblings, 0 replies; 35+ messages in thread
From: Matt_Domsch @ 2001-10-14  3:19 UTC (permalink / raw)
  To: kaih, linux-kernel

> I don't think this does what was intended. Should that perhaps be
> 
> > 		for (i = 0; i < 1; i++)
> 

Good catch.  linux@horizon.com sent me a cleaned up version.

-- 
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!

That'll teach me to clean up code by hand and not test it afterwards.
That's supposed to be "i < 8" in that for() loop, but there are a couple
of other glitches, too.  In particular, I got bits of the
table-initialization
code transposed when I generalized it for less than 8 bits.

Here's a *tested* version, with test harness (if compiled with -DUNITTEST).

#include <stddef.h>	/* For size_t */
typedef unsigned _u32;

#if __GNUC__ >= 3	/* 2.x has "attribute", but only 3.0 has "pure */
#define attribute(x) __attribute__(x)
#else
#define attribute(x)
#endif

/*
 * This code is in the public domain; copyright abandoned.
 * Liability for non-performance of this code is limited to the amount
 * you paid for it.  Since it is distributed for free, your refund will
 * be very very small.  If it breaks, you get to keep both pieces.
 */

/*
 * There are multiple 16-bit CRC polynomials in common use, but this is
 * *the* standard CRC-32 polynomial, first popularized by Ethernet.
 * x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x^1+x^0
 */
#define CRCPOLY_LE 0xedb88320
#define CRCPOLY_BE 0x04c11db7

/* How many bits at a time to use.  Requires a table of 4<<CRC_xx_BITS
bytes. */
#define CRC_LE_BITS 8
#define CRC_BE_BITS 4	/* Less performance-sensitive */

/*
 * Little-endian CRC computation.  Used with serial bit streams sent
 * lsbit-first.  Be sure to use cpu_to_le32() to append the computed CRC.
 */
#if CRC_LE_BITS > 8 || CRC_LE_BITS < 1 || CRC_LE_BITS & CRC_LE_BITS-1
# error CRC_LE_BITS must be a power of 2 between 1 and 8
#endif

#if CRC_LE_BITS == 1
/*
 * In fact, the table-based code will work in this case, but it can be
 * simplified by inlining the table in ?: form.
 */
void
crc32init_le(void)
{/* no-op */;}

_u32 attribute((pure))
crc32_le(_u32 crc, unsigned char const *p, size_t len)
{
	int i;
	while (len--) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}
#else	/* Table-based approach */

_u32 crc32table_le[1<<CRC_LE_BITS];

/*
 * crc is the crc of the byte i; other entries are filled in based on the
 * fact that crctable[i^j] = crctable[i] ^ crctable[j].
 *
 * Note that the _init functions never write anything but the final correct
 * value to each table entry, so they're safe to call repeatedly, even if
 * someone else is currently using the table.
 */
void
crc32init_le(void)
{
	unsigned i, j;
	_u32 crc = 1;

	crc32table_le[0] = 0;

	for (i = 1<<(CRC_LE_BITS-1); i; i >>= 1) {
		crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
		for (j = 0; j < 1<<CRC_LE_BITS; j += 2*i)
			crc32table_le[i+j] = crc ^ crc32table_le[j];
	}
}

_u32 attribute((pure))
crc32_le(_u32 crc, unsigned char const *p, size_t len)
{
	while (len--) {
# if CRC_LE_BITS == 8
		crc = (crc >> 8) ^ crc32table_le[(crc ^ *p++) & 255];
# elif CRC_LE_BITS == 4
		crc ^= *p++;
		crc = (crc >> 4) ^ crc32table_le[crc & 15];
		crc = (crc >> 4) ^ crc32table_le[crc & 15];
# elif CRC_LE_BITS == 2
		crc ^= *p++;
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
# endif
	}
	return crc;
}
#endif

/*
 * Big-endian CRC computation.  Used with serial bit streams sent
 * msbit-first.  Be sure to use cpu_to_be32() to append the computed CRC.
 */
#if CRC_BE_BITS > 8 || CRC_BE_BITS < 1 || CRC_BE_BITS & CRC_BE_BITS-1
# error CRC_BE_BITS must be a power of 2 between 1 and 8
#endif

#if CRC_BE_BITS == 1
/*
 * In fact, the table-based code will work in this case, but it can be
 * simplified by inlining the table in ?: form.
 */
void
crc32init_be(void)
{/*no-op*/;}

_u32 attribute((pure))
crc32_be(_u32 crc, unsigned char const *p, size_t len)
{
	int i;
	while (len--) {
		crc ^= *p++ << 24;
		for (i = 0; i < 8; i++)
			crc = (crc << 1) ^ ((crc & 0x80000000) ? CRCPOLY_BE
: 0);
	}
	return crc;
}

#else	/* Table-based approach */
_u32 crc32table_be[256];

void
crc32init_be(void)
{
	unsigned i, j;
	_u32 crc = 0x80000000;

	crc32table_be[0] = 0;

	for (i = 1 ; i < 1<<CRC_BE_BITS; i <<= 1) {
		crc = (crc << 1) ^ ((crc & 0x80000000) ? CRCPOLY_BE : 0);
		for (j = 0; j < i; j++)
			crc32table_be[i+j] = crc ^ crc32table_be[j];
	}
}

_u32 attribute((pure))
crc32_be(_u32 crc, unsigned char const *p, size_t len)
{
	while (len--) {
# if CRC_BE_BITS == 8
		crc = (crc << 8) ^ crc32table_be[(crc >> 24) ^ *p++];
# elif CRC_BE_BITS == 4
		crc ^= *p++ << 24;
		crc = (crc << 4) ^ crc32table_be[crc >> 28];
		crc = (crc << 4) ^ crc32table_be[crc >> 28];
# elif CRC_BE_BITS == 2
		crc ^= *p++ << 24;
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
# endif
	}
	return crc;
}
#endif

/*
 * A brief CRC tutorial.
 *
 * A CRC is a long-division remainder.  You add the CRC to the message,
 * and the whole thing (message+CRC) is a multiple of the given
 * CRC polynomial.  To check the CRC, you can either check that the
 * CRC matches the recomputed value, *or* you can check that the
 * remainder computed on the message+CRC is 0.  This latter approach
 * is used by a lot of hardware implementations, and is why so many
 * protocols put the end-of-frame flag after the CRC.
 *
 * It's actually the same long division you learned in school, except that
 * - We're working in binary, so the digits are only 0 and 1, and
 * - When dividing polynomials, there are no carries.  Rather than add and
 *   subtract, we just xor.  Thus, we tend to get a bit sloppy about
 *   the difference between adding and subtracting.
 *
 * A 32-bit CRC polynomial is actually 33 bits long.  But since it's
 * 33 bits long, bit 32 is always going to be set, so usually the CRC
 * is written in hex with the most significant bit omitted.  (If you're
 * familiar with the IEEE 754 floating-point format, it's the same idea.)
 *
 * Note that a CRC is computed over a string of *bits*, so you have
 * to decide on the endianness of the bits within each byte.  To get
 * the best error-detecting properties, this should correspond to the
 * order they're actually sent.  For example, standard RS-232 serial is
 * little-endian; the most significant bit (sometimes used for parity)
 * is sent last.  And when appending a CRC word to a message, you should
 * do it in the right order, matching the endianness.
 *
 * Just like with ordinary division, the remainder is always smaller than
 * the divisor (the CRC polynomial) you're dividing by.  Each step of the
 * division, you take one more digit (bit) of the dividend and append it
 * to the current remainder.  Then you figure out the appropriate multiple
 * of the divisor to subtract to being the remainder back into range.
 * In binary, it's easy - it has to be either 0 or 1, and to make the
 * XOR cancel, it's just a copy of bit 32 of the remainder.
 *
 * When computing a CRC, we don't care about the quotient, so we can
 * throw the quotient bit away, but subtract the appropriate multiple of
 * the polynomial from the remainder and we're back to where we started,
 * ready to process the next bit.
 *
 * A big-endian CRC written this way would be coded like:
 * for (i = 0; i < input_bits; i++) {
 * 	multiple = remainder & 0x80000000 ? CRCPOLY : 0;
 * 	remainder = (remainder << 1 | next_input_bit()) ^ multiple;
 * }
 * Notice how, to get at bit 32 of the shifted remainder, we look
 * at bit 31 of the remainder *before* shifting it.
 *
 * But also notice how the next_input_bit() bits we're shifting into
 * the remainder don't actually affect any decision-making until
 * 32 bits later.  Thus, the first 32 cycles of this are pretty boring.
 * Also, to add the CRC to a message, we need a 32-bit-long hole for it at
 * the end, so we have to add 32 extra cycles shifting in zeros at the
 * end of every message,
 *
 * So the standard trick is to rearrage merging in the next_input_bit()
 * until the moment it's needed.  Then the first 32 cycles can be
precomputed,
 * and merging in the final 32 zero bits to make room for the CRC can be
 * skipped entirely.
 * This changes the code to:
 * for (i = 0; i < input_bits; i++) {
 *      remainder ^= next_input_bit() << 31;
 * 	multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
 * 	remainder = (remainder << 1) ^ multiple;
 * }
 * With this optimization, the little-endian code is simpler:
 * for (i = 0; i < input_bits; i++) {
 *      remainder ^= next_input_bit();
 * 	multiple = (remainder & 1) ? CRCPOLY : 0;
 * 	remainder = (remainder >> 1) ^ multiple;
 * }
 *
 * Note that the other details of endianness have been hidden in CRCPOLY
 * (which must be bit-reversed) and next_input_bit().
 *
 * However, as long as next_input_bit is returning the bits in a sensible
 * order, we can actually do the merging 8 or more bits at a time rather
 * than one bit at a time:
 * for (i = 0; i < input_bytes; i++) {
 * 	remainder ^= next_input_byte() << 24;
 * 	for (j = 0; j < 8; j++) {
 * 		multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
 * 		remainder = (remainder << 1) ^ multiple;
 * 	}
 * }
 * Or in little-endian:
 * for (i = 0; i < input_bytes; i++) {
 * 	remainder ^= next_input_byte();
 * 	for (j = 0; j < 8; j++) {
 * 		multiple = (remainder & 1) ? CRCPOLY : 0;
 * 		remainder = (remainder << 1) ^ multiple;
 * 	}
 * }
 * If the input is a multiple of 32 bits, you can even XOR in a 32-bit
 * word at a time and increase the inner loop count to 32.
 *
 * You can also mix and match the two loop styles, for example doing the
 * bulk of a message byte-at-a-time and adding bit-at-a-time processing
 * for any fractional bytes at the end.
 *
 * The only remaining optimization is to the byte-at-a-time table method.
 * Here, rather than just shifting one bit of the remainder to decide
 * in the correct multiple to subtract, we can shift a byte at a time.
 * This produces a 40-bit (rather than a 33-bit) intermediate remainder,
 * but again the multiple of the polynomial to subtract depends only on
 * the high bits, the high 8 bits in this case.  
 *
 * The multile we need in that case is the low 32 bits of a 40-bit
 * value whose high 8 bits are given, and which is a multiple of the
 * generator polynomial.  This is simply the CRC-32 of the given
 * one-byte message.
 *
 * Two more details: normally, appending zero bits to a message which
 * is already a multiple of a polynomial produces a larger multiple of that
 * polynomial.  To enable a CRC to detect this condition, it's common to
 * invert the CRC before appending it.  This makes the remainder of the
 * message+crc come out not as zero, but some fixed non-zero value.
 *
 * The same problem applies to zero bits prepended to the message, and
 * a similar solution is used.  Instead of starting with a remainder of
 * 0, an initial remainder of all ones is used.  As long as you start
 * the same way on decoding, it doesn't make a difference.
 */

#if UNITTEST

#include <stdlib.h>
#include <stdio.h>

#if 0 /*Not used at present */
static void
buf_dump(char const *prefix, unsigned char const *buf, size_t len)
{
	fputs(prefix, stdout);
	while (len--)
		printf(" %02x", *buf++);
	putchar('\n');

}
#endif

static _u32 attribute((const))
bitreverse(_u32 x)
{
	x = (x >> 16) | (x << 16);
	x = (x >> 8 & 0x00ff00ff) | (x << 8 & 0xff00ff00);
	x = (x >> 4 & 0x0f0f0f0f) | (x << 4 & 0xf0f0f0f0);
	x = (x >> 2 & 0x33333333) | (x << 2 & 0xcccccccc);
	x = (x >> 1 & 0x55555555) | (x << 1 & 0xaaaaaaaa);
	return x;
}

static void
bytereverse(unsigned char *buf, size_t len)
{
	while (len--) {
		unsigned char x = *buf;
		x = (x >> 4) | (x << 4);
		x = (x >> 2 & 0x33) | (x << 2 & 0xcc);
		x = (x >> 1 & 0x55) | (x << 1 & 0xaa);
		*buf++ = x;
	}
}

static void
random_garbage(unsigned char *buf, size_t len)
{
	while (len--)
		*buf++ = (unsigned char)random();
}

#if 0 /* Not used at present */
static void
store_le(_u32 x, unsigned char *buf)
{
	buf[0] = (unsigned char)x;
	buf[1] = (unsigned char)(x >> 8);
	buf[2] = (unsigned char)(x >> 16);
	buf[3] = (unsigned char)(x >> 24);
}
#endif

static void
store_be(_u32 x, unsigned char *buf)
{
	buf[0] = (unsigned char)(x >> 24);
	buf[1] = (unsigned char)(x >> 16);
	buf[2] = (unsigned char)(x >> 8);
	buf[3] = (unsigned char)x;
}

/*
 * This checks that CRC(buf + CRC(buf)) = 0, and that
 * CRC commutes with bit-reversal.  This has the side effect
 * of bytewise bit-reversing the input buffer, and returns
 * the CRC of the reversed buffer.
 */
static _u32
test_step(_u32 init, unsigned char *buf, size_t len)
{
	_u32 crc1, crc2;
	size_t i;

	crc1 = crc32_be(init, buf, len);
	store_be(crc1, buf+len);
	crc2 = crc32_be(init, buf, len+4);
	if (crc2)
		printf("\nCRC cancellation fail: 0x%08x should be 0\n",
crc2);

	for (i = 0; i <= len+4; i++) {
		crc2 = crc32_be(init, buf, i);
		crc2 = crc32_be(crc2, buf+i, len+4-i);
		if (crc2)
			printf("\nCRC split fail: 0x%08x\n", crc2);
	}

	/* Now swap it around for the other test */

	bytereverse(buf, len+4);
	init = bitreverse(init);
	crc2 = bitreverse(crc1);
	if (crc1 != bitreverse(crc2))
		printf("\nBit reversal fail: 0x%08x -> %0x08x -> 0x%08x\n",
				crc1, crc2, bitreverse(crc2));
	crc1 = crc32_le(init, buf, len);
	if (crc1 != crc2)
		printf("\nCRC endianness fail: 0x%08x != 0x%08x\n", crc1,
crc2);
	crc2 = crc32_le(init, buf, len+4);
	if (crc2)
		printf("\nCRC cancellation fail: 0x%08x should be 0\n",
crc2);

	for (i = 0; i <= len+4; i++) {
		crc2 = crc32_le(init, buf, i);
		crc2 = crc32_le(crc2, buf+i, len+4-i);
		if (crc2)
			printf("\nCRC split fail: 0x%08x\n", crc2);
	}

	return crc1;
}

#define SIZE 64
#define INIT1 0
#define INIT2 0

int
main(void)
{
	unsigned char buf1[SIZE+4];
	unsigned char buf2[SIZE+4];
	unsigned char buf3[SIZE+4];
	int i, j;
	_u32 crc1, crc2, crc3;

	crc32init_le();
	crc32init_be();

	for (i = 0; i <= SIZE; i++) {
		printf("\rTesting length %d...", i);
		fflush(stdout);
		random_garbage(buf1, i);
		random_garbage(buf2, i);
		for (j = 0; j < i; j++)
			buf3[j] = buf1[j] ^ buf2[j];

		crc1 = test_step(INIT1, buf1, i);
		crc2 = test_step(INIT2, buf2, i);
		/* Now check that CRC(buf1 ^ buf2) = CRC(buf1) ^ CRC(buf2)
*/
		crc3 = test_step(INIT1^INIT2, buf3, i);
		if (crc3 != (crc1 ^ crc2))
			printf("CRC XOR fail: 0x%08x != 0x%08x ^ 0x%08x\n",
					crc3, crc1, crc2);
	}
	printf("\nAll test complete.  No failures expected.\n");
	return 0;
}

#endif /* UNITTEST */

^ permalink raw reply	[flat|nested] 35+ messages in thread
* RE: crc32 cleanups
@ 2001-10-13  3:46 Matt_Domsch
  2001-10-13 11:47 ` Kai Henningsen
  0 siblings, 1 reply; 35+ messages in thread
From: Matt_Domsch @ 2001-10-13  3:46 UTC (permalink / raw)
  To: kaos, jgarzik; +Cc: linux-kernel

And, just when I thought I understood the crc32 stuff, here's an even better
explanation/code/etc.  With thanks.
-Matt


-----Original Message-----
From: linux@horizon.com [mailto:linux@horizon.com]
Sent: Friday, October 12, 2001 10:06 PM
Subject: Here's table-optimized crc32 code for both ways...

I think just having these in the kernel unconditionally is best.
This version allows various space/time tradeoffs.

For a RAM kernel, the initialization code is smaller than the tables,
and can be made initcode.  For a ROM kernel, it would make sense
to precompute the tables and compile them in.

#include <stddef.h>	/* For size_t */
typedef unsigned _u32;

/*
 * This code is in the public domain; copyright abandoned.
 * Liability for non-performance of this code is limited to the amount
 * you paid for it.  Since it is distributed for free, your refund will
 * be very very small.  If it breaks, you get to keep both pieces.
 */

/*
 * There are multiple 16-bit CRC polynomials in common use, but this is
 * *the* standard CRC-32 polynomial, first popularized by Ethernet.
 * x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x^1+x^0
 */
#define CRCPOLY_LE 0xedb88320
#define CRCPOLY_BE 0x04c11db7

/* How many bits at a time to use.  Requires a table of 4<<CRC_xx_BITS
bytes. */
#define CRC_LE_BITS 8
#define CRC_BE_BITS 4	/* Half the speed, 960 bytes less space */

/*
 * Little-endian CRC computation.  Used with serial bit streams sent
 * lsbit-first.  Be sure to use cpu_to_le32() to append the computed CRC.
 */
#if CRC_LE_BITS > 8 || CRC_LE_BITS < 1 || CRC_LE_BITS & CRC_LE_BITS-1
# error CRC_LE_BITS must be a power of 2 between 1 and 8
#endif

#if CRC_LE_BITS == 1
/*
 * In fact, the table-based code will work in this case, but it can be
 * simplified by inlining the table in ?: form.
 */
void
crc32init_le(void)
{/* no-op */;}

_u32
crc32_le(_u32 crc, unsigned char const *p, size_t len)
{
	int i;
	while (len--) {
		crc ^= *p++;
		for (i = 0; i < i; i++)
			crc = (crc >> 1) ^ (crc & 1 ? CRCPOLY_LE : 0);
	}
	return crc;
}
#else	/* Table-based approach */

_u32 crc32table_le[1<<CRC_LE_BITS];

/*
 * crc is the crc of the byte i; other entries are filled in based on the
 * fact that crctable[i^j] = crctable[i] ^ crctable[j].
 *
 * Note that the _init functions never write anything but the final correct
 * value to each table entry, so they're safe to call repeatedly, even if
 * someone else is currently using the table.
 */
void
crc32init_le(void)
{
	unsigned i, j;
	_u32 crc = 1;

	crc32table_le[0] = 0;

	for (i = 1 ; i < 1<<CRC_LE_BITS; i <<= 1) {
		crc = (crc >> 1) ^ (crc & 1 ? CRCPOLY_LE : 0);
		for (j = 0; j < i; j++)
			crc32table_le[i+j] = crc ^ crc32table_le[j];
	}
}

_u32
crc32_le(_u32 crc, unsigned char const *p, size_t len)
{
	while (len--) {
# if CRC_LE_BITS == 8
		crc = (crc >> 8) ^ crc32table_le[(crc ^ *p++) & 255];
# elif CRC_LE_BITS == 4
		crc ^= *p++;
		crc = (crc >> 4) ^ crc32table_le[crc & 15];
		crc = (crc >> 4) ^ crc32table_le[crc & 15];
# elif CRC_LE_BITS == 2
		crc ^= *p++;
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
		crc = (crc >> 2) ^ crc32table_le[crc & 3];
# endif
	}
	return crc;
}
#endif

/*
 * Big-endian CRC computation.  Used with serial bit streams sent
 * msbit-first.  Be sure to use cpu_to_be32() to append the computed CRC.
 */
#if CRC_BE_BITS > 8 || CRC_BE_BITS < 1 || CRC_BE_BITS & CRC_BE_BITS-1
# error CRC_BE_BITS must be a power of 2 between 1 and 8
#endif

#if CRC_BE_BITS == 1
/*
 * In fact, the table-based code will work in this case, but it can be
 * simplified by inlining the table in ?: form.
 */
void
crc32init_be(void)
{/*no-op*/;}

_u32
crc32_be(_u32 crc, unsigned char const *p, size_t len)
{
	int i;
	while (len--) {
		crc ^= *p++ << 24;
		for (i = 0; i < i; i++)
			crc = (crc << 1) ^ (crc & 0x80000000 ? CRCPOLY_BE :
0);
	}
	return crc;
}

#else	/* Table-based approach */
_u32 crc32table_be[256];

void
crc32init_be(void)
{
	unsigned i, j;
	_u32 crc = 0x80000000;

	crc32table_le[0] = 0;

	for (i = 1<<(CRC_BE_BITS-1); i; i >>= 1) {
		crc = (crc << 1) ^ (crc & 0x80000000 ? CRCPOLY_BE : 0);
		for (j = 0; j < 1<<CRC_BE_BITS; j += 2*i)
			crc32table_le[i+j] = crc ^ crc32table_le[j];
	}
}

_u32
crc32_be(_u32 crc, unsigned char const *p, size_t len)
{
	while (len--) {
# if CRC_BE_BITS == 8
		crc = (crc << 8) ^ crc32table_be[(crc >> 24) ^ *p++];
# elif CRC_BE_BITS == 4
		crc ^= *p++ << 24;
		crc = (crc << 4) ^ crc32table_be[crc >> 28];
		crc = (crc << 4) ^ crc32table_be[crc >> 28];
# elif CRC_BE_BITS == 2
		crc ^= *p++ << 24;
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
		crc = (crc << 2) ^ crc32table_be[crc >> 30];
# endif
	}
	return crc;
}
#endif

/*
 * A brief CRC tutorial.
 *
 * A CRC is a long-division remainder.  You add the CRC to the message,
 * and the whole thing (message+CRC) is a multiple of the given
 * CRC polynomial.  To check the CRC, you can either check that the
 * CRC matches the recomputed value, *or* you can check that the
 * remainder computed on the message+CRC is 0.  This latter approach
 * is used by a lot of hardware implementations, and is why so many
 * protocols put the end-of-frame flag after the CRC.
 *
 * It's actually the same long division you learned in school, except that
 * - We're working in binary, so the digits are only 0 and 1, and
 * - When dividing polynomials, there are no carries.  Rather than add and
 *   subtract, we just xor.  Thus, we tend to get a bit sloppy about
 *   the difference between adding and subtracting.
 *
 * A 32-bit CRC polynomial is actually 33 bits long.  But since it's
 * 33 bits long, bit 32 is always going to be set, so usually the CRC
 * is written in hex with the most significant bit omitted.  (If you're
 * familiar with the IEEE 754 floating-point format, it's the same idea.)
 *
 * Note that a CRC is computed over a string of *bits*, so you have
 * to decide on the endianness of the bits within each byte.  To get
 * the best error-detecting properties, this should correspond to the
 * order they're actually sent.  For example, standard RS-232 serial is
 * little-endian; the most significant bit (sometimes used for parity)
 * is sent last.  And when appending a CRC word to a message, you should
 * do it in the right order, matching the endianness.
 *
 * Just like with ordinary division, the remainder is always smaller than
 * the divisor (the CRC polynomial) you're dividing by.  Each step of the
 * division, you take one more digit (bit) of the dividend and append it
 * to the current remainder.  Then you figure out the appropriate multiple
 * of the divisor to subtract to being the remainder back into range.
 * In binary, it's easy - it has to be either 0 or 1, and to make the
 * XOR cancel, it's just a copy of bit 32 of the remainder.
 *
 * When computing a CRC, we don't care about the quotient, so we can
 * throw the quotient bit away, but subtract the appropriate multiple of
 * the polynomial from the remainder and we're back to where we started,
 * ready to process the next bit.
 *
 * A big-endian CRC written this way would be coded like:
 * for (i = 0; i < input_bits; i++) {
 * 	multiple = remainder & 0x80000000 ? CRCPOLY : 0;
 * 	remainder = (remainder << 1 | next_input_bit()) ^ multiple;
 * }
 * Notice how, to get at bit 32 of the shifted remainder, we look
 * at bit 31 of the remainder *before* shifting it.
 *
 * But also notice how the next_input_bit() bits we're shifting into
 * the remainder don't actually affect any decision-making until
 * 32 bits later.  Thus, the first 32 cycles of this are pretty boring.
 * Also, to add the CRC to a message, we need a 32-bit-long hole for it at
 * the end, so we have to add 32 extra cycles shifting in zeros at the
 * end of every message,
 *
 * So the standard trick is to rearrage merging in the next_input_bit()
 * until the moment it's needed.  Then the first 32 cycles can be
precomputed,
 * and merging in the final 32 zero bits to make room for the CRC can be
 * skipped entirely.
 * This changes the code to:
 * for (i = 0; i < input_bits; i++) {
 *      remainder ^= next_input_bit() << 31;
 * 	multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
 * 	remainder = (remainder << 1) ^ multiple;
 * }
 * With this optimization, the little-endian code is simpler:
 * for (i = 0; i < input_bits; i++) {
 *      remainder ^= next_input_bit();
 * 	multiple = (remainder & 1) ? CRCPOLY : 0;
 * 	remainder = (remainder >> 1) ^ multiple;
 * }
 *
 * Note that the other details of endianness have been hidden in CRCPOLY
 * (which must be bit-reversed) and next_input_bit().
 *
 * However, as long as next_input_bit is returning the bits in a sensible
 * order, we can actually do the merging 8 or more bits at a time rather
 * than one bit at a time:
 * for (i = 0; i < input_bytes; i++) {
 * 	remainder ^= next_input_byte() << 24;
 * 	for (j = 0; j < 8; j++) {
 * 		multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
 * 		remainder = (remainder << 1) ^ multiple;
 * 	}
 * }
 * Or in little-endian:
 * for (i = 0; i < input_bytes; i++) {
 * 	remainder ^= next_input_byte();
 * 	for (j = 0; j < 8; j++) {
 * 		multiple = (remainder & 1) ? CRCPOLY : 0;
 * 		remainder = (remainder << 1) ^ multiple;
 * 	}
 * }
 * If the input is a multiple of 32 bits, you can even XOR in a 32-bit
 * word at a time and increase the inner loop count to 32.
 *
 * You can also mix and match the two loop styles, for example doing the
 * bulk of a message byte-at-a-time and adding bit-at-a-time processing
 * for any fractional bytes at the end.
 *
 * The only remaining optimization is to the byte-at-a-time table method.
 * Here, rather than just shifting one bit of the remainder to decide
 * in the correct multiple to subtract, we can shift a byte at a time.
 * This produces a 40-bit (rather than a 33-bit) intermediate remainder,
 * but again the multiple of the polynomial to subtract depends only on
 * the high bits, the high 8 bits in this case.  
 *
 * The multile we need in that case is the low 32 bits of a 40-bit
 * value whose high 8 bits are given, and which is a multiple of the
 * generator polynomial.  This is simply the CRC-32 of the given
 * one-byte message.
 *
 * Two more details: normally, appending zero bits to a message which
 * is already a multiple of a polynomial produces a larger multiple of that
 * polynomial.  To enable a CRC to detect this condition, it's common to
 * invert the CRC before appending it.  This makes the remainder of the
 * message+crc come out not as zero, but some fixed non-zero value.
 *
 * The same problem applies to zero bits prepended to the message, and
 * a similar solution is used.  Instead of starting with a remainder of
 * 0, an initial remainder of all ones is used.  As long as you start
 * the same way on decoding, it doesn't make a difference.
 */

^ permalink raw reply	[flat|nested] 35+ messages in thread
* RE: crc32 cleanups
@ 2001-10-12 22:20 Matt_Domsch
  2001-10-12 22:34 ` Jeff Garzik
  0 siblings, 1 reply; 35+ messages in thread
From: Matt_Domsch @ 2001-10-12 22:20 UTC (permalink / raw)
  To: jgarzik; +Cc: vonbrand, linux-kernel

> I was pondering whether it was ok to unconditionally include the
> lib/crc32.c code, regardless of need.  I am leaning towards 
> "no," which
> implies Makefile and Config.in rules which must be updated for each
> driver that uses crc32.

OK, I'm taking this approach.

> * if ether_crc is always == ether_crc_be, then create a 
> #define instead of patching driver code

Done.

> * no need to inline ether_crc_be, stick it in lib/crc32.c also

Done.

> * using a ref-counting init_crc32 and cleanup_crc32
> (linux/lib/Makefile)
> obj-$(CONFIG_TULIP) += crc32.o
> obj-$(CONFIG_NATSEMI) += crc32.o
> obj-$(CONFIG_DMFE) += crc32.o
> obj-$(CONFIG_ANOTHERDRIVER) += crc32.o

Done.  init_crc32() needs to be called from all drivers which call crc32()
or ether_crc_le().  I think I've got them added, but will verify and report
back after the weekend.  For the curious, I've got one big patch with crc32,
GPT, efivars, and uuid stuff (separates pretty easily into multiple patches,
but for review now, it's just one).
http://domsch.com/linux/patches/big-crc32-20011012.patch

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!

^ permalink raw reply	[flat|nested] 35+ messages in thread
* RE: crc32 cleanups
@ 2001-10-12 20:17 Matt_Domsch
  2001-10-12 20:34 ` Jeff Garzik
  2001-10-13  1:09 ` Horst von Brand
  0 siblings, 2 replies; 35+ messages in thread
From: Matt_Domsch @ 2001-10-12 20:17 UTC (permalink / raw)
  To: jgarzik, vonbrand; +Cc: linux-kernel

> That leaves (a) unconditionally building 
> it into the kernel, or (b) Makefile and Config.in rules.

(a) is simple, but needs a 1KB malloc (or alternately, a 1KB static const
array - I've taken the approach that the malloc is better)
(b) isn't that much harder, but requires drivers to be sure to call
init_crc32 and cleanup_crc32.  If somehow they manage not to do that, Oops.
I don't want to add a runtime check for the existance of the array in
crc32().

--
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!

^ permalink raw reply	[flat|nested] 35+ messages in thread
* crc32 cleanups
@ 2001-10-12 19:11 Matt Domsch
  2001-10-12 19:25 ` Jeff Garzik
                   ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: Matt Domsch @ 2001-10-12 19:11 UTC (permalink / raw)
  To: linux-kernel

In trying to get my EFI GUID Partition Table patch included into the stock
kernel, Andreas Dilger suggested it was time for some crc32 cleanup, as
the GPT patch added yet another copy of the common crc32 function.  So,
here's my first pass at it.  Comments welcome.

First patch:
http://domsch.com/linux/patches/linux-2.4.13-pre1-crc32-20011012.patch
 linux-2.4.13-pre1.crc/include/linux/crc32.h |   62 +++++++++++
 linux-2.4.13-pre1.crc/lib/Makefile          |    4
 linux-2.4.13-pre1.crc/lib/crc32.c           |  151 ++++++++++++++++++++++++++++
 linux-2.4.13-pre1.gpt/init/main.c           |    2
 4 files changed, 217 insertions(+), 2 deletions(-)

This patch (appended below), makes include/linux/crc32.h and
lib/crc32.c.  It generates a table based on the commonly used polynomial
at init time provided a driver needs it.  It changes ether_crc_le() to use
this table.  It renames the commonly seen ether_crc() to be
ether_crc_be(), and puts it in crc32.h (allowing lots of copies to be
deleted).

ether_crc_be() continues to use the calculated method, rather than a table
lookup method, for computing CRC.  Since I don't understand the
differences between big-endian and little-endian CRC32 functions in this
case, and I don't have access to BE systems, I tried to do the safest
thing here.  If someone wishes to make this use a table lookup, please do
so.

Exactly how init_crc32() is called is subject to debate.  The method
presented below avoids allocating and filling the table unless something
that uses it puts a line in lib/crc32.c.  A second method would be to
have each  user call init_crc32() itself, but if a driver neglects to do
such, the kernel would Oops.  I'm sure there's yet a better method I
haven't thought of.


Assuming the above patch is somewhat close to correct, then a
corresponding patch to various network drivers and jffs2 can be applied
(not appended)
http://domsch.com/linux/patches/linux-2.4.13-pre1-crc32-drivers-20011012.patch
 drivers/net/8139too.c                |   24 --------
 drivers/net/at1700.c                 |   22 -------
 drivers/net/atp.c                    |   21 -------
 drivers/net/au1000_eth.c             |   19 ------
 drivers/net/dl2k.c                   |   15 -----
 drivers/net/dl2k.h                   |    1
 drivers/net/dmfe.c                   |   84 ++----------------------------
 drivers/net/epic100.c                |   22 -------
 drivers/net/fealnx.c                 |   24 --------
 drivers/net/pci-skeleton.c           |   25 ---------
 drivers/net/pcmcia/smc91c92_cs.c     |   28 ----------
 drivers/net/pcmcia/xircom_tulip_cb.c |   40 --------------
 drivers/net/smc9194.c                |   35 +-----------
 drivers/net/starfire.c               |   26 ---------
 drivers/net/sundance.c               |   27 ---------
 drivers/net/tulip/tulip_core.c       |   40 +-------------
 drivers/net/via-rhine.c              |   23 --------
 drivers/net/winbond-840.c            |   19 ------
 drivers/net/yellowfin.c              |   24 --------
 fs/jffs2/crc32.c                     |   97 -----------------------------------
 fs/jffs2/crc32.h                     |   21 -------
 fs/jffs2/dir.c                       |    2
 fs/jffs2/erase.c                     |    2
 fs/jffs2/file.c                      |    2
 fs/jffs2/gc.c                        |    2
 fs/jffs2/read.c                      |    2
 fs/jffs2/readinode.c                 |    2
 fs/jffs2/scan.c                      |    2
 fs/jffs2/write.c                     |    2
 29 files changed, 46 insertions(+), 607 deletions(-)



I've got GPT and common uuid patches also at
http://domsch.com/linux/patches/ (see linux-2.4.13-*) to submit pending the
inclusion of the crc32 patches.


Feedback welcomed.

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!




diff -burNp --exclude='*~' --exclude='*.orig' linux-2.4.13-pre1/include/linux/crc32.h linux-2.4.13-pre1.gpt/include/linux/crc32.h
--- linux-2.4.13-pre1/include/linux/crc32.h	Wed Dec 31 18:00:00 1969
+++ linux-2.4.13-pre1.crc/include/linux/crc32.h	Fri Oct 12 12:20:47 2001
@@ -0,0 +1,62 @@
+/*
+ * crc32.h
+ * See linux/lib/crc32.c for license and changes
+ */
+#ifndef _LINUX_CRC32_H
+#define _LINUX_CRC32_H
+
+#include <linux/types.h>
+#include <linux/init.h>
+
+/* This is used only by init/main.c */
+extern void crc32_init(void) __init;
+
+/*
+ * This computes a 32 bit CRC of the data in the buffer, and returns the CRC.
+ * The polynomial used is 0xedb88320.
+ */
+
+extern u32 crc32 (u32 seed, const void *buf, unsigned long len);
+
+/**
+ * ether_crc_le(int length, unsigned char *data)
+ * @length - length of buffer @data
+ * @data   - buffer to generate CRC32 value for
+ *
+ * The little-endian AUTODIN32 ethernet CRC calculation.
+ * ethernet_polynomial_le = 0xedb88320U;
+ * Seed of ~0, no final xor of ~0.
+ */
+static inline unsigned ether_crc_le(int length, unsigned char *data)
+{
+	return crc32(~0, data, length);
+}
+
+/**
+ * ether_crc_be(int length, unsigned char *data)
+ * @length - length of buffer @data
+ * @data   - buffer to generate CRC32 value for
+ *
+ * The big-endian AUTODIN II ethernet CRC calculation.
+ * N.B. Do not use for bulk data, use a table-based routine instead.
+ */
+static inline u32 ether_crc_be(int length, unsigned char *data)
+{
+	int crc = ~0;
+	unsigned const ethernet_polynomial_be = 0x04c11db7U;
+
+	while(--length >= 0) {
+		unsigned char current_octet = *data++;
+		int bit;
+		for (bit = 0; bit < 8; bit++, current_octet >>= 1) {
+			crc = (crc << 1) ^
+				((crc < 0) ^ (current_octet & 1) ?
+				 ethernet_polynomial_be : 0);
+		}
+	}
+	return crc;
+}
+
+
+
+#endif /* _LINUX_CRC32_H */
diff -burNp --exclude='*~' --exclude='*.orig' linux-2.4.13-pre1/lib/Makefile linux-2.4.13-pre1.crc/lib/Makefile
--- linux-2.4.13-pre1/lib/Makefile	Mon Sep 17 17:31:15 2001
+++ linux-2.4.13-pre1.crc/lib/Makefile	Thu Oct 11 15:45:06 2001
@@ -8,9 +8,9 @@

 L_TARGET := lib.a

-export-objs := cmdline.o dec_and_lock.o rwsem-spinlock.o rwsem.o
+export-objs := cmdline.o dec_and_lock.o rwsem-spinlock.o rwsem.o crc32.o

-obj-y := errno.o ctype.o string.o vsprintf.o brlock.o cmdline.o bust_spinlocks.o rbtree.o
+obj-y := errno.o ctype.o string.o vsprintf.o brlock.o cmdline.o bust_spinlocks.o rbtree.o crc32.o

 obj-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += rwsem-spinlock.o
 obj-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
diff -burNp --exclude='*~' --exclude='*.orig' linux-2.4.13-pre1/lib/crc32.c linux-2.4.13-pre1.crc/lib/crc32.c
--- linux-2.4.13-pre1/lib/crc32.c	Wed Dec 31 18:00:00 1969
+++ linux-2.4.13-pre1.crc/lib/crc32.c	Fri Oct 12 11:48:03 2001
@@ -0,0 +1,151 @@
+/*
+ * Oct 12, 2000 Matt Domsch <Matt_Domsch@dell.com>
+ * Same crc32 function was used in 5 other places in the kernel.
+ * I made one version, and deleted the others.
+ * There are various incantations of crc32().  Some use a seed of 0 or ~0.
+ * Some xor at the end with ~0.  The generic crc32() function takes
+ * seed as an argument, and doesn't xor at the end.  Then individual
+ * users can do whatever they need.
+ *   drivers/net/smc9194.c uses seed ~0, doesn't xor with ~0.
+ *   fs/jffs2 uses seed 0, doesn't xor with ~0.
+ *   fs/partitions/efi.c uses seed ~0, xor's with ~0.
+ *
+ * Dec 5, 2000 Matt Domsch <Matt_Domsch@dell.com>
+ * - Copied crc32.c from the linux/drivers/net/cipe directory.
+ * - Now pass seed as an arg
+ * - changed unsigned long to u32, added #include<linux/types.h>
+ * - changed len to be an unsigned long
+ * - changed crc32val to be a register
+ * - License remains unchanged!  It's still GPL-compatable!
+ */
+
+  /* ============================================================= */
+  /*  COPYRIGHT (C) 1986 Gary S. Brown.  You may use this program, or       */
+  /*  code or tables extracted from it, as desired without restriction.     */
+  /*                                                                        */
+  /*  First, the polynomial itself and its table of feedback terms.  The    */
+  /*  polynomial is                                                         */
+  /*  X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0   */
+  /*                                                                        */
+  /*  Note that we take it "backwards" and put the highest-order term in    */
+  /*  the lowest-order bit.  The X^32 term is "implied"; the LSB is the     */
+  /*  X^31 term, etc.  The X^0 term (usually shown as "+1") results in      */
+  /*  the MSB being 1.                                                      */
+  /*                                                                        */
+  /*  Note that the usual hardware shift register implementation, which     */
+  /*  is what we're using (we're merely optimizing it by doing eight-bit    */
+  /*  chunks at a time) shifts bits into the lowest-order term.  In our     */
+  /*  implementation, that means shifting towards the right.  Why do we     */
+  /*  do it this way?  Because the calculated CRC must be transmitted in    */
+  /*  order from highest-order term to lowest-order term.  UARTs transmit   */
+  /*  characters in order from LSB to MSB.  By storing the CRC this way,    */
+  /*  we hand it to the UART in the order low-byte to high-byte; the UART   */
+  /*  sends each low-bit to hight-bit; and the result is transmission bit   */
+  /*  by bit from highest- to lowest-order term without requiring any bit   */
+  /*  shuffling on our part.  Reception works similarly.                    */
+  /*                                                                        */
+  /*  The feedback terms table consists of 256, 32-bit entries.  Notes:     */
+  /*                                                                        */
+  /*      The table can be generated at runtime if desired; code to do so   */
+  /*      is shown later.  It might not be obvious, but the feedback        */
+  /*      terms simply represent the results of eight shift/xor opera-      */
+  /*      tions for all combinations of data and CRC register values.       */
+  /*                                                                        */
+  /*      The values must be right-shifted by eight bits by the "updcrc"    */
+  /*      logic; the shift must be unsigned (bring in zeroes).  On some     */
+  /*      hardware you could probably optimize the shift in assembler by    */
+  /*      using byte-swap instructions.                                     */
+  /*      polynomial $edb88320                                              */
+  /*                                                                        */
+  /*  --------------------------------------------------------------------  */
+
+#include <linux/crc32.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+static u32 *crc_32_tab;
+
+/**
+ * crc32_init(): generates CRC32 table for polynomial $edb88320
+ *
+ * Description: Code to compute the CRC-32 table. Borrowed from
+ * gzip-1.0.3/makecrc.c.
+ */
+#if defined(CONFIG_AT1700)        || \
+    defined(CONFIG_ATP)           || \
+    defined(CONFIG_DL2K)          || \
+    defined(CONFIG_DM9102)        || \
+    defined(CONFIG_EPIC100)       || \
+    defined(CONFIG_SMC9194)       || \
+    defined(CONFIG_STARFIRE)      || \
+    defined(CONFIG_SUNDANCE)      || \
+    defined(CONFIG_TULIP)         || \
+    defined(CONFIG_YELLOWFIN)     || \
+    defined(CONFIG_FS_JFFS2)      || \
+    defined(CONFIG_EFI_PARTITION)
+void __init crc32_init(void)
+{
+/* Not copyrighted 1990 Mark Adler	*/
+
+	u32 c;      /* crc shift register */
+	u32 e;      /* polynomial exclusive-or pattern */
+	int i;      /* counter for all possible eight bit values */
+	int k;      /* byte being shifted into crc apparatus */
+	/* terms of polynomial defining this crc (except x^32): */
+	const int p[] = {0,1,2,4,5,7,8,10,11,12,16,22,23,26};
+
+	if (crc_32_tab) return;
+	else crc_32_tab = kmalloc(256*sizeof(u32), GFP_KERNEL);
+
+	/* Make exclusive-or pattern from polynomial */
+	e = 0;
+	for (i = 0; i < sizeof(p)/sizeof(int); i++)
+		e |= 1L << (31 - p[i]);
+
+	crc_32_tab[0] = 0;
+
+	for (i = 1; i < 256; i++) {
+		c = 0;
+		for (k = i | 256; k != 1; k >>= 1) {
+			c = c & 1 ? (c >> 1) ^ e : c >> 1;
+			if (k & 1)
+				c ^= e;
+		}
+		crc_32_tab[i] = c;
+	}
+}
+#else
+#define init_crc32()
+#define NEED_INIT_CRC32
+#endif
+
+/**
+ * crc32(): Generates CRC32 checksum of buffer
+ * @seed: initial value, often 0 or ~0, or the
+ *        running crc if checksumming multiple buffers.
+ * @buf:  buffer to be checksummed
+ * @len:  length of @buf
+ *
+ */
+
+
+u32
+crc32(u32 seed, const void *buf, unsigned long len)
+{
+	unsigned long i;
+	register u32 crc32val;
+	const u8 *s = buf;
+
+#ifdef NEED_INIT_CRC32
+#error You must add the CONFIG variable for this module to lib/crc32.c:init_crc32().
+#endif
+	crc32val = seed;
+	for (i = 0;  i < len;  i++) {
+		crc32val = crc_32_tab[(crc32val ^ s[i]) & 0xff] ^
+			(crc32val >> 8);
+	}
+	return crc32val;
+}
+
+EXPORT_SYMBOL(crc32);
diff -burNp --exclude='*~' --exclude='*.orig' linux-2.4.13-pre1/init/main.c linux-2.4.13-pre1.gpt/init/main.c
--- linux-2.4.13-pre1/init/main.c	Sat Oct  6 10:49:16 2001
+++ linux-2.4.13-pre1.gpt/init/main.c	Fri Oct 12 10:40:02 2001
@@ -27,6 +27,7 @@
 #include <linux/iobuf.h>
 #include <linux/bootmem.h>
 #include <linux/tty.h>
+#include <linux/crc32.h>

 #include <asm/io.h>
 #include <asm/bugs.h>
@@ -608,6 +609,7 @@ asmlinkage void __init start_kernel(void
 #if defined(CONFIG_SYSVIPC)
 	ipc_init();
 #endif
+	crc32_init();
 	check_bugs();
 	printk("POSIX conformance testing by UNIFIX\n");




^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2001-10-23 21:19 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-10-13  2:09 crc32 cleanups Stuart Lynne
  -- strict thread matches above, loose matches on Subject: below --
2001-10-23 21:19 Matt_Domsch
2001-10-16 19:05 Matt_Domsch
2001-10-15 11:34 Samium Gromoff
2001-10-14  3:19 Matt_Domsch
2001-10-13  3:46 Matt_Domsch
2001-10-13 11:47 ` Kai Henningsen
2001-10-12 22:20 Matt_Domsch
2001-10-12 22:34 ` Jeff Garzik
2001-10-12 20:17 Matt_Domsch
2001-10-12 20:34 ` Jeff Garzik
2001-10-13  1:20   ` Horst von Brand
2001-10-13 15:16     ` Martin Dalecki
2001-10-13  1:09 ` Horst von Brand
2001-10-13  1:20   ` Jeff Garzik
2001-10-12 19:11 Matt Domsch
2001-10-12 19:25 ` Jeff Garzik
2001-10-12 19:37 ` Jeff Garzik
2001-10-13  1:51   ` Keith Owens
2001-10-13  2:36     ` Jeff Garzik
2001-10-13  2:45       ` Keith Owens
2001-10-13  2:56         ` Jeff Garzik
2001-10-13  3:12           ` Keith Owens
2001-10-12 19:45 ` Horst von Brand
2001-10-12 20:07   ` Jeff Garzik
2001-10-13  1:04     ` Horst von Brand
2001-10-13  1:09       ` Jeff Garzik
2001-10-13 12:45         ` Horst von Brand
2001-10-13  9:48 ` Jan-Marek Glogowski
2001-10-13 10:02   ` Andi Kleen
2001-10-13 10:07     ` Jan-Marek Glogowski
2001-10-13 23:09       ` Horst von Brand
2001-10-13 10:25     ` Keith Owens
2001-10-15  8:29       ` Bjorn Wesen
2001-10-15 11:35         ` Keith Owens

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox