From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: djwong@us.ibm.com
Cc: Greg Freemyer <greg.freemyer@gmail.com>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Theodore Tso <tytso@mit.edu>,
Sunil Mushran <sunil.mushran@oracle.com>,
Amir Goldstein <amir73il@gmail.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Andi Kleen <andi@firstfloor.org>, Mingming Cao <cmm@us.ibm.com>,
Joel Becker <jlbec@evilplan.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-ext4@vger.kernel.org, Coly Li <colyli@gmail.com>
Subject: Re: [PATCH v1 00/16] ext4: Add metadata checksumming
Date: Sun, 04 Sep 2011 07:41:03 -0400
Message-ID: <yq1d3fgwn5c.fsf@sermon.lab.mkp.net>
In-Reply-To: <20110902182214.GC12086@tux1.beaverton.ibm.com> (Darrick J. Wong's message of "Fri, 2 Sep 2011 11:22:14 -0700")
>>>>> "Darrick" == Darrick J Wong <djwong@us.ibm.com> writes:
Darrick,
Darrick> Furthermore, the nice thing about the in-filesystem checksum is
Darrick> that we bake in other things like the FS UUID and the inode
Darrick> number, which gives you a somewhat better assurance that the
Darrick> data block belongs to the fs and the file that the code thinks
Darrick> it belongs to.
Yeah, I view DIF/DIX mostly as in-flight protection for writes. Whereas
FS metadata checksumming is great for problem detection at read time.
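
A minimal sketch of the "bake in the FS UUID and the inode number" idea, assuming a plain bitwise CRC32C and a hypothetical helper named meta_csum(). This is an illustration only, not the on-disk algorithm from the patch series; the seeding order and the little-endian inode encoding are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
 * Takes and returns the raw running CRC so calls can be chained. */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc;
}

/* Hypothetical helper (not from the patches): checksum a metadata block
 * after first feeding in the filesystem UUID and the owning inode number. */
static uint32_t meta_csum(const uint8_t uuid[16], uint32_t ino,
			  const void *block, size_t len)
{
	/* Assumed encoding: inode number as 4 little-endian bytes. */
	uint8_t ino_le[4] = {
		(uint8_t)ino, (uint8_t)(ino >> 8),
		(uint8_t)(ino >> 16), (uint8_t)(ino >> 24)
	};
	uint32_t crc = 0xFFFFFFFFu;

	crc = crc32c_update(crc, uuid, 16);
	crc = crc32c_update(crc, ino_le, sizeof(ino_le));
	crc = crc32c_update(crc, block, len);
	return crc ^ 0xFFFFFFFFu;
}
```

With this shape, a block that is bit-for-bit intact but read back under a different inode number (or from a filesystem with a different UUID) no longer matches its stored checksum, which is the read-time detection described above.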
Another problem with using the DIF app tag to store filesystem metadata
is that many array vendors use it internally and thus only disk drives
are likely to provide the app tag space.
Darrick> The DIX interface allows for a 32-bit block number and a 16-bit
Darrick> application tag ... which is unfortunately small given 64-bit
Darrick> block numbers and 32-bit inode numbers.
I never understood the 32-bit ref tag. Seems silly to have a check that
wraps at the exact boundary where problems are most likely to occur.
I advocated for a DIF Type with 16-bit guard tag and 48-bit ref tag but
that never went anywhere. Too bad - would have been easy for the storage
vendors to implement.
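
To make the wrap concrete, here is a small sketch of the 8-byte protection tuple and of the ref tag truncation. The struct uses host-order integers for readability (the on-wire fields are big-endian), and wide_ref_tag() models the 16-bit guard / 48-bit ref layout mentioned above, which is hypothetical and not an existing DIF type. The function names are assumptions for illustration.

```c
#include <stdint.h>

/* 8 bytes of protection information per sector:
 * 16-bit guard (CRC), 16-bit application tag, 32-bit reference tag. */
struct dif_tuple {
	uint16_t guard_tag;
	uint16_t app_tag;
	uint32_t ref_tag;
};

/* The 32-bit ref tag can only carry the low 32 bits of the block number. */
static uint32_t dif_ref_tag(uint64_t lba)
{
	return (uint32_t)lba;
}

/* Hypothetical 48-bit ref tag, as in the 16+48 layout discussed above. */
static uint64_t wide_ref_tag(uint64_t lba)
{
	return lba & 0xFFFFFFFFFFFFULL;
}
```

With 512-byte sectors, 2^32 blocks is 2 TiB, so block N and block N + 2^32 carry the same 32-bit ref tag and a write misdirected by exactly that distance would pass the check. The 48-bit variant still tells them apart.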
Darrick> As a side note, the crc-t10dif implementation is quite slow --
Darrick> the hardware accelerated crc32c is 15x faster, and the sw
Darrick> implementation is usually 3-6x faster. I suspect somebody will
Darrick> want to fix that before DIF becomes more widespread...
The CRC32C op on Nehalem and beyond is really, really fast. It's
essentially free except for pulling the data through the cache. So it's
not entirely fair to use that as a baseline for a pure software
implementation. Which faster sw implementation are you referring to,
btw?
lib/crc-t10dif is a regular 256-entry table-based CRC implementation. It
is done pretty much like all our other software CRCs. I seem to recall
attempting a bigger table but that yielded worse real life results due
to cache pollution.
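
For reference, a byte-at-a-time, 256-entry table CRC for the T10 DIF polynomial (0x8BB7, MSB-first, initial value 0) looks roughly like the sketch below. It generates the table at runtime for brevity; the names t10dif_init() and t10dif_crc() are illustrative assumptions and this is not a copy of lib/crc-t10dif.

```c
#include <stddef.h>
#include <stdint.h>

#define T10DIF_POLY 0x8BB7u

/* 256 entries x 2 bytes = 512 bytes of lookup table. */
static uint16_t t10dif_table[256];

/* Build the table: entry i is the CRC of the single byte i. */
static void t10dif_init(void)
{
	for (unsigned int i = 0; i < 256; i++) {
		uint16_t crc = (uint16_t)(i << 8);

		for (int k = 0; k < 8; k++)
			crc = (crc & 0x8000u)
				? (uint16_t)((crc << 1) ^ T10DIF_POLY)
				: (uint16_t)(crc << 1);
		t10dif_table[i] = crc;
	}
}

/* One table lookup per input byte, most significant bit first. */
static uint16_t t10dif_crc(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	while (len--)
		crc = (uint16_t)(crc << 8) ^ t10dif_table[(crc >> 8) ^ *buf++];
	return crc;
}
```

Call t10dif_init() once before t10dif_crc(). Each input byte costs one dependent table load, which is why wider tables look tempting; they also occupy more cache, which matches the cache pollution seen in the bigger-table experiment mentioned above.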
On Westmere and beyond it is possible to accelerate generic CRC
calculation using the PCLMULQDQ instruction. Many of our CRC functions
could benefit from this. However, so far Intel has not been willing to
contribute the relevant code to Linux.
Darrick> The good news is that if you're really worried about integrity,
Darrick> metadata_csum and DIF/DIX aren't mutually exclusive features.
Darrick> Rejecting corrupted write commands at write time seems like a
Darrick> useful feature. :)
Yup!
--
Martin K. Petersen Oracle Linux Engineering