* Crc32 Challenge
@ 2015-11-17 16:51 chris holcombe
2015-11-18 9:33 ` Dan van der Ster
2015-11-23 15:47 ` Gregory Farnum
0 siblings, 2 replies; 4+ messages in thread
From: chris holcombe @ 2015-11-17 16:51 UTC (permalink / raw)
To: Ceph Development
Hello Ceph Devs,
I'm almost certain at this point that I have discovered a major bug in
Ceph's crc32c mechanism: http://tracker.ceph.com/issues/13713. I'm
totally open to being proven wrong, and that's what this email is about.
Can someone out there write a piece of code using an outside library
that produces the same crc32c checksums that Ceph does? If they can,
I'll close my bug and stand corrected :). I've tried 3 Python libraries
and 1 Rust library so far, and my conclusions are 1) they all agree with
each other and 2) they all produce different checksums than Ceph's
checksums:
https://github.com/ceph/ceph/blob/83e10f7e2df0a71bd59e6ef2aa06b52b186fddaa/src/test/common/test_crc32c.cc#L21
Start small and see if you can verify the "foo bar baz" checksum and
then try some of the others.
For a known good checksum to test your program against, use this:
http://www.pdl.cmu.edu/mailinglists/ips/mail/msg04970.html In there
Mark Bakke says that a 32-byte array of all 00h should produce a
checksum of 8A9136AA. Printing that with Python in decimal: 2324772522
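A minimal sketch of how that reference value can be checked without any
outside library. It assumes the usual CRC-32C parameters (reflected
Castagnoli polynomial 0x82F63B78, initial value 0xFFFFFFFF, final XOR
0xFFFFFFFF); the function name crc32c is illustrative and is not taken
from Ceph or from any particular package.

```python
# Bit-at-a-time CRC-32C (Castagnoli), written for clarity rather than speed.
# Assumed parameters: reflected polynomial 0x82F63B78, initial value
# 0xFFFFFFFF, final XOR 0xFFFFFFFF.
POLY = 0x82F63B78


def crc32c(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right by one bit; fold in the polynomial if the bit
            # shifted out was set.
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


zeros = bytes(32)  # the 32-byte all-00h test vector
print(hex(crc32c(zeros)), crc32c(zeros))
```

A table-driven or hardware-accelerated implementation should give the
same numbers as long as it uses the same three parameters.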
The implications of this are unfortunately tricky. If I'm right and we
fix Ceph's algorithm, then it won't be able to talk to any previous
version of Ceph past the initial protocol handshake. There would have
to be a mechanism introduced so that version x and older would speak
the previous crc and version y and newer would speak the new one.
Another option is to break Ceph's crc code out into a library, call it
ceph-crc32c, and make that available to everyone.
Thanks!
Chris
* Re: Crc32 Challenge
2015-11-17 16:51 Crc32 Challenge chris holcombe
@ 2015-11-18 9:33 ` Dan van der Ster
2015-11-23 15:47 ` Gregory Farnum
1 sibling, 0 replies; 4+ messages in thread
From: Dan van der Ster @ 2015-11-18 9:33 UTC (permalink / raw)
To: chris holcombe; +Cc: Ceph Development
Hi,
I checked the partial crc after each iteration in Google's Python
implementation and found that the crc after the last iteration matches
Ceph's [1]:
>>> from crc32c import crc
>>> crc('foo bar baz')
crc 1197962378
crc 3599162226
crc 2946501991
crc 2501826906
crc 3132034983
crc 3851841059
crc 2745946046
crc 1047783679
crc 767476524
crc 4269731756
crc 4119623852
After finalize (crc ^ 0xFFFFFFFF) the crc becomes:
175343443L
So it looks like ceph_crc32c just isn't doing that final XOR step.
>>> 4119623852 ^ 0xFFFFFFFF
175343443
Cheers, Dan
[1] From test_crc32c.cc
const char *a = "foo bar baz";
ASSERT_EQ(4119623852u, ceph_crc32c(0, (unsigned char *)a, strlen(a)));
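
The relationship described in this message can be sketched in a few
lines. This is an illustration under stated assumptions, not Ceph's
code: crc32c_raw and finalize are hypothetical names, and crc32c_raw is
assumed to use the seed as-is with no inversion before or after, which
is how the ceph_crc32c(0, ...) call in [1] appears to behave.

```python
POLY = 0x82F63B78  # reflected Castagnoli polynomial


def crc32c_raw(seed: int, data: bytes) -> int:
    """Update `seed` with `data`; no pre- or post-inversion is applied."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc


def finalize(crc: int) -> int:
    """The final XOR step that many libraries apply to the result."""
    return crc ^ 0xFFFFFFFF


raw = crc32c_raw(0, b"foo bar baz")
print(raw, finalize(raw))

# The two figures quoted in this message differ by exactly that XOR.
assert 4119623852 ^ 0xFFFFFFFF == 175343443
```

Many library implementations also start from 0xFFFFFFFF rather than 0,
so the seed may need to be aligned as well before two implementations
agree on a given input.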
* Re: Crc32 Challenge
2015-11-17 16:51 Crc32 Challenge chris holcombe
2015-11-18 9:33 ` Dan van der Ster
@ 2015-11-23 15:47 ` Gregory Farnum
2015-11-23 15:50 ` Sage Weil
1 sibling, 1 reply; 4+ messages in thread
From: Gregory Farnum @ 2015-11-23 15:47 UTC (permalink / raw)
To: chris holcombe; +Cc: Ceph Development
On Tue, Nov 17, 2015 at 10:51 AM, chris holcombe
<chris.holcombe@canonical.com> wrote:
> The implications of this are unfortunately tricky. If I'm right and we fix
> ceph's algorithm then it won't be able to talk to any previous version of
> ceph past the beginning protocol handshake. There would have to be a
> mechanism introduced so that any x and older version would speak the
> previous crc and anything y and newer would speak the new version. Another
> option is we could break ceph's crc code out into a library and make that
> available to everyone and call it ceph-crc32c.
I haven't checked the source for exactly where we use CRC32s, but I
think the basic messenger protocol isn't checksummed — we ought to be
able to use the feature bits exchanged in the protocol handshake to
decide which version of the crc to use?
At least if it's worth changing; I've no idea about that.
-Greg
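
A rough sketch of the feature-bit idea, purely for illustration. The bit
FEATURE_CRC_FINAL_XOR and the helpers common_features and message_crc
are hypothetical names invented here; they are not Ceph feature flags or
messenger APIs. The sketch assumes both peers advertise a feature mask
during the handshake and only use the new crc when both sides set the
bit.

```python
POLY = 0x82F63B78  # reflected Castagnoli polynomial

# Hypothetical feature bit; not an actual Ceph feature flag.
FEATURE_CRC_FINAL_XOR = 1 << 0


def crc32c_raw(seed: int, data: bytes) -> int:
    """Seeded CRC-32C update with no final XOR (assumed legacy behaviour)."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc


def common_features(local: int, remote: int) -> int:
    """Features that both peers advertised during the handshake."""
    return local & remote


def message_crc(features: int, payload: bytes) -> int:
    """Pick the crc variant based on the negotiated feature mask."""
    crc = crc32c_raw(0, payload)
    if features & FEATURE_CRC_FINAL_XOR:
        crc ^= 0xFFFFFFFF  # only when both peers understand the new form
    return crc


old_peer, new_peer = 0, FEATURE_CRC_FINAL_XOR
payload = b"foo bar baz"
mixed = message_crc(common_features(new_peer, old_peer), payload)
both_new = message_crc(common_features(new_peer, new_peer), payload)
print(mixed, both_new)
```

With this shape, a new peer talking to an old one falls back to the
previous crc, and only a pair of new peers switches to the new form.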
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Crc32 Challenge
2015-11-23 15:47 ` Gregory Farnum
@ 2015-11-23 15:50 ` Sage Weil
0 siblings, 0 replies; 4+ messages in thread
From: Sage Weil @ 2015-11-23 15:50 UTC (permalink / raw)
To: Gregory Farnum; +Cc: chris holcombe, Ceph Development
On Mon, 23 Nov 2015, Gregory Farnum wrote:
> On Tue, Nov 17, 2015 at 10:51 AM, chris holcombe
> <chris.holcombe@canonical.com> wrote:
> > The implications of this are unfortunately tricky. If I'm right and we fix
> > ceph's algorithm then it won't be able to talk to any previous version of
> > ceph past the beginning protocol handshake. There would have to be a
> > mechanism introduced so that any x and older version would speak the
> > previous crc and anything y and newer would speak the new version. Another
> > option is we could break ceph's crc code out into a library and make that
> > available to everyone and call it ceph-crc32c.
>
> I haven't checked the source for exactly where we use CRC32s, but I
> think the basic messenger protocol isn't checksummed -- we ought to be
> able to use the feature bits exchanged in the protocol handshake to
> decide which version of the crc to use?
> At least if it's worth changing; I've no idea about that.
The difference turned out to be a reasonably common convention of doing an
xor ~0 with the final result. We don't do that, and I don't think we
should, since it would (1) be a really painful change and (2) make
chaining together the crc values of multiple buffers more error-prone.
So we're off the hook!
sage
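
The chaining point can be illustrated with a small sketch. It assumes a
seeded update function with no final XOR, here given the hypothetical
name crc32c_raw; it is not Ceph's implementation. Without the final XOR,
the crc of one buffer can be passed directly as the seed for the next.
If every result carried a final XOR, the caller would have to undo it
before continuing, and forgetting that step silently gives a different
value.

```python
POLY = 0x82F63B78  # reflected Castagnoli polynomial
MASK = 0xFFFFFFFF


def crc32c_raw(seed: int, data: bytes) -> int:
    """Seeded CRC-32C update with no final XOR."""
    crc = seed & MASK
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc


a, b = b"foo bar ", b"baz"
whole = crc32c_raw(0, a + b)

# No final XOR: the crc of the first buffer is the seed for the second.
chained = crc32c_raw(crc32c_raw(0, a), b)

# Final XOR on every result: the inversion must be undone before chaining.
final_a = crc32c_raw(0, a) ^ MASK
naive = crc32c_raw(final_a, b) ^ MASK          # forgets to undo the XOR
fixed = crc32c_raw(final_a ^ MASK, b) ^ MASK   # undoes it, then continues

# prints: True False True
print(chained == whole, naive == whole ^ MASK, fixed == whole ^ MASK)
```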