From mboxrd@z Thu Jan 1 00:00:00 1970
From: Igor Fedotov
Subject: Re: Adding compression/checksum support for bluestore.
Date: Thu, 31 Mar 2016 19:27:24 +0300
Message-ID: <56FD4FEC.4060000@mirantis.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-lb0-f172.google.com ([209.85.217.172]:33040 "EHLO mail-lb0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752206AbcCaQ11 (ORCPT ); Thu, 31 Mar 2016 12:27:27 -0400
Received: by mail-lb0-f172.google.com with SMTP id u8so55947045lbk.0 for ; Thu, 31 Mar 2016 09:27:27 -0700 (PDT)
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Sage Weil , Allen Samuels
Cc: ceph-devel

On 31.03.2016 1:15, Sage Weil wrote:
> On Wed, 30 Mar 2016, Allen Samuels wrote:
>> [snip]
>>
>> Time to talk about checksums.
>>
>> First let's divide the world into checksums for data and checksums for
>> metadata -- and defer the discussion about checksums for metadata
>> (important, but one at a time...)
>>
>> I believe it's a requirement that when checksums are enabled that 100%
>> of data reads must be validated against their corresponding checksum.
>> This leads you to conclude that you must store a checksum for each
>> independently readable piece of data.
> I'm just worried about the size of metadata if we have 4k checksums but
> have to read big extents anyway; cheaper to store a 4 byte checksum for
> each compressed blob.

But do we really need to store checksums as metadata?
What about prefixing (or postfixing) each 4K-4(?) blob with its checksum
and storing the pair on disk?
IMO we always need the checksum value along with the blob data, so let's
store and read them together.
This immediately eliminates the question of checksum granularity and the
corresponding metadata overhead...

Have I missed something?
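
To make the proposed layout concrete, here is a minimal Python sketch of the
idea. It is only an illustration under my own assumptions, not BlueStore
code: I assume a 4096-byte on-disk block, CRC32 as the 4-byte checksum, and
the checksum placed after the data. The names pack_block and unpack_block
are hypothetical.

```python
import struct
import zlib

BLOCK_SIZE = 4096                      # assumed on-disk block size
CSUM_SIZE = 4                          # CRC32 occupies 4 bytes
PAYLOAD_SIZE = BLOCK_SIZE - CSUM_SIZE  # the "4K-4" data portion


def pack_block(payload: bytes) -> bytes:
    """Return the zero-padded payload followed by its CRC32 (one full block)."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in one block")
    padded = payload.ljust(PAYLOAD_SIZE, b"\0")
    return padded + struct.pack("<I", zlib.crc32(padded))


def unpack_block(block: bytes) -> bytes:
    """Verify the trailing CRC32 and return the padded payload."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("unexpected block size")
    payload = block[:PAYLOAD_SIZE]
    (stored,) = struct.unpack("<I", block[PAYLOAD_SIZE:])
    if zlib.crc32(payload) != stored:
        raise IOError("checksum mismatch")
    return payload


if __name__ == "__main__":
    block = pack_block(b"some object data")
    print(len(block), unpack_block(block)[:16])
```

With this layout every read of a block brings its checksum along in the same
I/O, so validation needs no separate metadata lookup.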