Update -- so far, we've not managed to gain any confidence that we'll
ever be able to re-mount that disk. The general consensus seems to be
that we should fish all the data off the disk using rsync, and then
move off XFS to ext4. Not a very helpful message for y'all to hear, I
know. But if it's any help in prioritising your future work, I think
the dealbreaker for us was the inescapable quotacheck on mount, which
means that any time a fileserver goes down unexpectedly, we have an
unavoidable, indeterminate-but-long period of downtime...
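In case it's useful context, the rough shape of that evacuation would
be something like the sketch below. Illustrative only -- /dev/drbd0 and
/mnt/log_storage are from our setup, and /mnt/evacuation is a stand-in
for wherever the data lands; we haven't finalised the details:

    # Copy everything off the XFS volume while it's still readable
    # (-a archive, -H hard links, -A ACLs, -X extended attributes,
    # --numeric-ids to avoid uid/gid name mapping).
    rsync -aHAX --numeric-ids /mnt/log_storage/ /mnt/evacuation/

    # Rebuild the volume as ext4, then copy the data back.
    umount /mnt/log_storage
    mkfs.ext4 /dev/drbd0
    mount /dev/drbd0 /mnt/log_storage
    rsync -aHAX --numeric-ids /mnt/evacuation/ /mnt/log_storage/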
hp

On 26/02/15 13:07, Harry wrote:
> Thanks Dave,
>
> * The main filesystem is currently online and seems OK, but quotas
>   are not active.
> * We want to estimate how long the quotacheck will take when we
>   reboot/remount.
> * We're even a bit worried the disk might be in a broken state, such
>   that the quotacheck won't actually complete successfully at all.
>
> A brief description of our setup:
> - we're on AWS
> - we're using mdadm (plus LVM) to make a RAID array out of 8x 200GB
>   SSD EBS drives
> - we're using DRBD to make a live backup of all writes to another
>   instance with a similar RAID array
>
> We're not doing our experiments on our live system. Instead, we're
> using the drives from the DRBD target system. We take DRBD offline,
> so it's no longer writing, then we take snapshots of the drives, then
> remount those elsewhere so we can experiment without disturbing the
> live system.
>
> We've managed to mount the backup drives OK, with the 'noquota'
> option. Files look OK. But, so far, we haven't been able to get a
> quotacheck to complete. We've waited 12 hours+. Do you think it's
> possible DRBD is giving us copies of the live disks that are
> inconsistent somehow?
>
> How can we reassure ourselves that this live disk *will* mount
> successfully if we reboot the machine, and can we estimate how long
> it will take?
>
> mount | grep log_storage
> /dev/drbd0 on /mnt/log_storage type xfs (rw,prjquota,allocsize=64k,_netdev)
>
> df -i /mnt/log_storage
> Filesystem        Inodes    IUsed     IFree IUse% Mounted on
> /dev/drbd0     938210704 72929413 865281291    8% /mnt/log_storage
>
> df -h /mnt/log_storage
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/drbd0      1.6T  1.4T  207G  88% /mnt/log_storage
>
> xfs_info /mnt/log_storage
> meta-data=/dev/drbd0             isize=256    agcount=64, agsize=6553600 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=418906112, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=12800, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> The missing-path errors are, I think, from folders we've deleted but
> not yet removed from the projid/projects files. I *think* they're a
> red herring here.
>
> We've also tried running xfs_repair on the backup drives. It takes
> about 3 hours, and shows a lot of errors about incorrect directory
> flags on inodes. Here's one from the bottom of the log of a recent
> attempt:
>
> directory flags set on non-directory inode 268702898
>
> rgds,
> Confused in London.
>
> On 24/02/15 21:59, Dave Chinner wrote:
>> On Tue, Feb 24, 2015 at 03:15:26PM +0000, Harry wrote:
>>> Hi there,
>>>
>>> We've got a moderately large disk (~2TB) into an inconsistent
>>> state, such that it's going to want a quotacheck the next time we
>>> mount it (it's currently mounted with quota accounting inactive).
>>> Our tests suggest this is going to take several hours, and cause an
>>> outage we can't afford.
>> What tests are you performing to suggest a quotacheck of a small
>> filesystem will take hours? (Yes, 2TB is a *small* filesystem.)
>>
>> (xfs_info, df -i, df -h, storage hardware, etc. are all relevant
>> here.)
>>
>>> We're wondering whether there's a 'nuke the site from orbit' option
>>> that will let us avoid it. The plan would be to:
>>> - switch off quotas and delete them completely, using the commands:
>>>   -- disable
>>>   -- off
>>>   -- remove
>>> - remount the drive with -o prjquota, hoping that there will not be
>>>   a quotacheck, because we've deleted all the old quota data
>> Mounting with a quota enabled *forces* a quota check if quotas
>> aren't currently enabled. You cannot avoid it; it's the way quota
>> consistency is created.
>>
>>> - run a script to gradually restore all the quotas, one by one and
>>>   in good time, from our own external backups (we've got the quotas
>>>   in a database, basically)
>> Can't be done - quotas need to be consistent with what is currently
>> on disk, not what you have in a backup somewhere.
>>
>>> So the questions are:
>>> - is there a way to remove all quota information from a mounted
>>>   drive?
>> Mount with quotas on and turn them off via xfs_quota, or mount
>> without quota options at all. Then run the remove command in
>> xfs_quota.
>>
>>> (the current mount status seems to be that it tried to mount it
>>> with -o prjquota but that quota accounting is *not* active)
>> Not possible.
>>
>>> - will it work and let us remount the drive with -o prjquota
>>>   without causing a quotacheck?
>> No.
>>
>> Cheers,
>>
>> Dave.

Rgds,
Harry + the PythonAnywhere team.

--
Harry Percival
Developer
harry@pythonanywhere.com

PythonAnywhere - a fully browser-based Python development and hosting
environment

PythonAnywhere LLP
17a Clerkenwell Road, London EC1M 5RD, UK
VAT No.: GB 893 5643 79
Registered in England and Wales as company number OC378414.
Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK
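P.S. For anyone who finds this in the archives: the quota-removal
sequence Dave describes above, as we understand it. A sketch only --
we haven't run it end-to-end, and the device name and mount point are
from our own setup:

    # Mount without any quota options (mounting with -o prjquota would
    # itself force the quotacheck we're trying to avoid)...
    umount /mnt/log_storage
    mount -o noquota /dev/drbd0 /mnt/log_storage

    # ...or, alternatively, mount with quotas on and then switch them
    # off first:
    #   xfs_quota -x -c 'off -p' /mnt/log_storage

    # Either way, once quotas are off, remove the on-disk project
    # quota metadata:
    xfs_quota -x -c 'remove -p' /mnt/log_storage

Note that, per Dave's answers above, this doesn't buy you a
quotacheck-free remount: the next mount with -o prjquota will still
force one.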