public inbox for linux-xfs@vger.kernel.org
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 4/9] xfs: rework dquot CRCs
Date: Thu, 30 May 2013 08:02:12 -0400
Message-ID: <51A73FC4.5080700@redhat.com>
In-Reply-To: <20130530010025.GF29466@dastard>

On 05/29/2013 09:00 PM, Dave Chinner wrote:
> On Wed, May 29, 2013 at 02:58:27PM -0400, Brian Foster wrote:
>> On 05/27/2013 02:38 AM, Dave Chinner wrote:
>>> From: Dave Chinner <dchinner@redhat.com>
>>>
>>> Calculating dquot CRCs when the backing buffer is written back just
>>> doesn't work reliably. There are several places which manipulate
>>> dquots directly in the buffers, and they don't calculate CRCs
>>> appropriately, nor do they always set the buffer up to calculate
>>> CRCs appropriately.
>>>
>>> Firstly, if we log a dquot buffer (e.g. during allocation) it gets
>>> logged without valid CRC, and so on recovery we end up with a dquot
>>> that is not valid.
>>>
>>> Secondly, if we recover/repair a dquot, we don't have a verifier
>>> attached to the buffer and hence CRCs are not calculated on the way
>>> down to disk.
>>>
>>> Thirdly, calculating the CRC after we've changed the contents means
>>> that if we re-read the dquot from the buffer, we cannot verify the
>>> contents of the dquot are valid, as the CRC is invalid.
>>>
>>> So, to avoid all the dquot CRC errors that are being detected by the
>>> read verifier, change to using the same model as for inodes. That
>>> is, dquot CRCs are calculated and written to the backing buffer at
>>> the time the dquot is flushed to the backing buffer. If we modify
>>> the dquot directly in the backing buffer, calculate the CRC
>>> immediately after the modification is complete. Hence the dquot in
>>> the on-disk buffer should always have a valid CRC.
>>>
>>> Signed-off-by: Dave Chinner <dchinner@redhat.com>
>>> ---
>>>  fs/xfs/xfs_dquot.c       |   37 ++++++++++++++++---------------------
>>>  fs/xfs/xfs_log_recover.c |   10 ++++++++++
>>>  fs/xfs/xfs_qm.c          |   36 ++++++++++++++++++++++++++----------
>>>  fs/xfs/xfs_quota.h       |    2 ++
>>>  4 files changed, 54 insertions(+), 31 deletions(-)
>>>
>> ...
>>> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
>>> index f41702b..d181542 100644
>>> --- a/fs/xfs/xfs_qm.c
>>> +++ b/fs/xfs/xfs_qm.c
>>> @@ -41,6 +41,7 @@
>>>  #include "xfs_qm.h"
>>>  #include "xfs_trace.h"
>>>  #include "xfs_icache.h"
>>> +#include "xfs_cksum.h"
>>>  
>>>  /*
>>>   * The global quota manager. There is only one of these for the entire
>>> @@ -839,7 +840,7 @@ xfs_qm_reset_dqcounts(
>>>  	xfs_dqid_t	id,
>>>  	uint		type)
>>>  {
>>> -	xfs_disk_dquot_t	*ddq;
>>> +	struct xfs_dqblk	*dqb;
>>>  	int			j;
>>>  
>>>  	trace_xfs_reset_dqcounts(bp, _RET_IP_);
>>> @@ -853,8 +854,12 @@ xfs_qm_reset_dqcounts(
>>>  	do_div(j, sizeof(xfs_dqblk_t));
>>>  	ASSERT(mp->m_quotainfo->qi_dqperchunk == j);
>>>  #endif
>>> -	ddq = bp->b_addr;
>>> +	dqb = bp->b_addr;
>>>  	for (j = 0; j < mp->m_quotainfo->qi_dqperchunk; j++) {
>>> +		struct xfs_disk_dquot	*ddq;
>>> +
>>> +		ddq =  (struct xfs_disk_dquot *)&dqb[j];
>>> +
>>>  		/*
>>>  		 * Do a sanity check, and if needed, repair the dqblk. Don't
>>>  		 * output any warnings because it's perfectly possible to
>>> @@ -871,7 +876,8 @@ xfs_qm_reset_dqcounts(
>>>  		ddq->d_bwarns = 0;
>>>  		ddq->d_iwarns = 0;
>>>  		ddq->d_rtbwarns = 0;
>>> -		ddq = (xfs_disk_dquot_t *) ((xfs_dqblk_t *)ddq + 1);
>>> +		xfs_update_cksum((char *)&dqb[j], sizeof(struct xfs_dqblk),
>>> +					 XFS_DQUOT_CRC_OFF);
>>
>> Nice cleanup on the cast ugliness even without the crc change. Is there
>> a technical reason for the unconditional crc update here beyond that
>> we're doing a reset? I'm wondering if there's any value in leaving those
>> bits untouched for a filesystem that might have never enabled crc
>> (quotacheck or not).
> 
> The dquot might be zeroed and unused, but the buffer it sits in is
> still allocated and valid. That means if we ever start using that
> dquot again (either by quotacheck or a new uid/gid/prid), it will be
> read straight out of the buffer rather than allocated, and hence the
> constraint that allocated but unused dquots still need to have valid
> CRCs.
> 

The constraint makes sense when CRCs are enabled...

> FWIW, the dquot buffer read validates the CRC on all dquots in the
> buffer when it comes off disk as it has no way of knowing what
> dquots contain valid data or not. Same with the xfs_qm_dqcheck()
> call - an unused dquot still needs to be a valid dquot to pass those
> checks...
> 

Yeah, that part makes sense. I've followed through and grokked most of
the dquot buffer read and dquot CRC validation code, I think.
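
To illustrate the point about the read verifier, here is a minimal
userspace sketch of checking every dquot block in a buffer, used or not.
It is not the kernel code: struct toy_dqblk, toy_crc32c(),
toy_calc_cksum() and toy_verify_buf() are hypothetical names invented
for this example, and the checksum is stored in host byte order for
simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t toy_crc32c(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : crc >> 1;
	}
	return ~crc;
}

/* Hypothetical stand-in for an on-disk dquot block (no padding bytes). */
struct toy_dqblk {
	uint64_t bcount;
	uint16_t bwarns;
	uint16_t iwarns;
	uint32_t crc;
};

/* Checksum the whole block as if the crc field contained zeroes. */
static uint32_t toy_calc_cksum(const void *blk, size_t len, size_t crc_off)
{
	static const unsigned char zero[sizeof(uint32_t)];
	const unsigned char *p = blk;
	uint32_t crc;

	assert(crc_off + sizeof(zero) <= len);
	crc = toy_crc32c(0, p, crc_off);
	crc = toy_crc32c(crc, zero, sizeof(zero));
	return toy_crc32c(crc, p + crc_off + sizeof(zero),
			  len - crc_off - sizeof(zero));
}

/*
 * Verify every block in the buffer. The verifier cannot tell which
 * blocks hold live data, so all of them must carry a matching checksum.
 * Returns the index of the first bad block, or -1 if all are good.
 */
static int toy_verify_buf(const struct toy_dqblk *dqb, int nblks)
{
	for (int j = 0; j < nblks; j++) {
		uint32_t want = toy_calc_cksum(&dqb[j], sizeof(dqb[j]),
					offsetof(struct toy_dqblk, crc));
		if (dqb[j].crc != want)
			return j;
	}
	return -1;
}
```

In this toy model, a block that is modified in the buffer without a
checksum update fails verification on the next read even if nothing
ever used it.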

My question is more: why is the code above (in xfs_qm_reset_dqcounts())
not the following?

	if (xfs_sb_version_hascrc(&mp->m_sb))
		xfs_update_cksum(...);

... because it currently looks like, if CRCs are not enabled, you're
writing to what is effectively a padding area (in terms of the semantics
of the on-disk structure, not necessarily the kernel code). It was never
a valid CRC and won't be as soon as the dquot is used again, no?
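
Spelled out as a userspace sketch, the conditional variant being asked
about might look like the following. This is an illustration under
stated assumptions, not the patch or the kernel code: struct toy_dqblk,
toy_crc32c(), toy_update_cksum() and toy_reset_dqcounts() are
hypothetical names, and the has_crc flag stands in for the
xfs_sb_version_hascrc() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t toy_crc32c(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : crc >> 1;
	}
	return ~crc;
}

/* Hypothetical stand-in for an on-disk dquot block (no padding bytes). */
struct toy_dqblk {
	uint64_t bcount;
	uint16_t bwarns;
	uint16_t iwarns;
	uint32_t crc;	/* only meaningful when CRCs are enabled */
};

/* Checksum the whole block as if the crc field contained zeroes. */
static uint32_t toy_calc_cksum(const void *blk, size_t len, size_t crc_off)
{
	static const unsigned char zero[sizeof(uint32_t)];
	const unsigned char *p = blk;
	uint32_t crc;

	assert(crc_off + sizeof(zero) <= len);
	crc = toy_crc32c(0, p, crc_off);
	crc = toy_crc32c(crc, zero, sizeof(zero));
	return toy_crc32c(crc, p + crc_off + sizeof(zero),
			  len - crc_off - sizeof(zero));
}

/* Store the checksum in the block (host byte order in this sketch). */
static void toy_update_cksum(void *blk, size_t len, size_t crc_off)
{
	uint32_t crc = toy_calc_cksum(blk, len, crc_off);

	memcpy((unsigned char *)blk + crc_off, &crc, sizeof(crc));
}

/* Reset the counters in place; touch the crc field only if has_crc. */
static void toy_reset_dqcounts(struct toy_dqblk *dqb, int nblks, bool has_crc)
{
	for (int j = 0; j < nblks; j++) {
		dqb[j].bcount = 0;
		dqb[j].bwarns = 0;
		dqb[j].iwarns = 0;
		if (has_crc)
			toy_update_cksum(&dqb[j], sizeof(dqb[j]),
					 offsetof(struct toy_dqblk, crc));
	}
}
```

With has_crc false, the bytes at the crc offset are left exactly as
they were; with has_crc true, each block carries a matching checksum as
soon as the reset of that block completes.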

>> FWIW, the rest of this patch looks sane to me (I'm less familiar with
>> the log recovery code, but the changes seem isolated and
>> straightforward) and I couldn't locate anywhere else we modify the
>> backing buffer for the dquot.
> 
> Right, apart from dquot allocation and flushing, there aren't any.
> Resetting the dquots to zero before a quota check is a special case.
> Doing it via the buffer avoids needing to initialise all the dquots
> in memory that *might* be tracked by the quota file. But we don't
> know what quota IDs are tracked in the quota file without first mapping
> the quota file and finding all the offsets that contain blocks. And,
> well, once we have that mapping, why would we do N xfs_qm_dqget()
> calls per buffer to get initialised, cached dquots from the buffer
> when we can simply RMW the buffers we've mapped directly?
> 

I walked through a bit of the quota check code and that makes sense.
Thanks for the explanation.

Brian

> Cheers,
> 
> Dave.
> 

