From: Gionatan Danti <g.danti@assyoma.it>
To: linux-xfs@vger.kernel.org
Cc: g.danti@assyoma.it
Subject: Re: Shutdown filesystem when a thin pool become full
Date: Tue, 23 May 2017 22:05:34 +0200	[thread overview]
Message-ID: <24daa89a452496d2cdffa5512a64ed2e@assyoma.it> (raw)
In-Reply-To: <20170523122753.k7plzg3musc4up73@eorzea.usersys.redhat.com>

On 23-05-2017 14:27, Carlos Maiolino wrote:
> 
> Aha, you are using the sync flag; that's why you are getting IO errors
> instead of ENOSPC. I don't remember off the top of my head why exactly,
> it's been a while since I started working on this XFS and dm-thin
> integration, but IIRC the problem is that XFS reserves the data it
> requires and doesn't expect to get an ENOSPC once the device "has
> space", so when the sync occurs, kaboom. I should take a look at it
> again.

Ok, I tried with a more typical non-sync write and it seems to report 
ENOSPC:

[root@blackhole ~]# dd if=/dev/zero of=/mnt/storage/disk.img bs=1M count=2048
dd: error writing ‘/mnt/storage/disk.img’: No space left on device
2002+0 records in
2001+0 records out
2098917376 bytes (2.1 GB) copied, 7.88216 s, 266 MB/s
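For anyone who wants to reproduce this, the setup can be recreated with something like the sketch below (volume group, LV names, and sizes are hypothetical; adjust to your system). The point is that the thin LV advertises more space than the pool actually has, so filling the filesystem overruns the pool:

```shell
# Create a 1 GiB thin pool in VG "vg0" (names/sizes are examples only)
lvcreate --type thin-pool -L 1G -n thinpool vg0

# Create a 2 GiB thin volume, i.e. 2:1 overprovisioned against the pool
lvcreate -V 2G --thinpool vg0/thinpool -n thinvol vg0

# Put XFS on it and mount it where the dd test writes
mkfs.xfs /dev/vg0/thinvol
mount /dev/vg0/thinvol /mnt/storage

# Writing more than the pool holds drives it into out-of-data-space mode
dd if=/dev/zero of=/mnt/storage/disk.img bs=1M count=2048
```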

With /sys/fs/xfs/dm-6/error/metadata/ENOSPC/max_retries = -1 (default), 
I have the following dmesg output:

[root@blackhole ~]# dmesg
[23152.667198] XFS (dm-6): Mounting V5 Filesystem
[23152.762711] XFS (dm-6): Ending clean mount
[23192.704672] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
[23192.988356] device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode
[23193.046288] Buffer I/O error on dev dm-6, logical block 385299, lost async page write
[23193.046299] Buffer I/O error on dev dm-6, logical block 385300, lost async page write
[23193.046302] Buffer I/O error on dev dm-6, logical block 385301, lost async page write
[23193.046304] Buffer I/O error on dev dm-6, logical block 385302, lost async page write
[23193.046307] Buffer I/O error on dev dm-6, logical block 385303, lost async page write
[23193.046309] Buffer I/O error on dev dm-6, logical block 385304, lost async page write
[23193.046312] Buffer I/O error on dev dm-6, logical block 385305, lost async page write
[23193.046314] Buffer I/O error on dev dm-6, logical block 385306, lost async page write
[23193.046316] Buffer I/O error on dev dm-6, logical block 385307, lost async page write
[23193.046319] Buffer I/O error on dev dm-6, logical block 385308, lost async page write

With /sys/fs/xfs/dm-6/error/metadata/ENOSPC/max_retries = 0, the dmesg 
output is slightly different:

[root@blackhole default]# dmesg
[23557.594502] device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode
[23557.649772] buffer_io_error: 257430 callbacks suppressed
[23557.649784] Buffer I/O error on dev dm-6, logical block 381193, lost async page write
[23557.649805] Buffer I/O error on dev dm-6, logical block 381194, lost async page write
[23557.649811] Buffer I/O error on dev dm-6, logical block 381195, lost async page write
[23557.649818] Buffer I/O error on dev dm-6, logical block 381196, lost async page write
[23557.649862] Buffer I/O error on dev dm-6, logical block 381197, lost async page write
[23557.649871] Buffer I/O error on dev dm-6, logical block 381198, lost async page write
[23557.649880] Buffer I/O error on dev dm-6, logical block 381199, lost async page write
[23557.649888] Buffer I/O error on dev dm-6, logical block 381200, lost async page write
[23557.649897] Buffer I/O error on dev dm-6, logical block 381201, lost async page write
[23557.649903] Buffer I/O error on dev dm-6, logical block 381202, lost async page write

Notice the suppressed buffer_io_error entries: are they related to the 
bug you linked before?
Anyway, in *no* case did I get a filesystem shutdown on these errors.
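For reference, the knob I toggled between the two runs above lives under sysfs; a quick sketch (dm-6 is simply the device name on my system, and the per-class retry knobs are the ones documented in the kernel's XFS admin guide):

```shell
# Show the current ENOSPC metadata retry policy (-1 = retry forever, the default)
cat /sys/fs/xfs/dm-6/error/metadata/ENOSPC/max_retries

# Fail metadata writes immediately on ENOSPC instead of retrying
echo 0 > /sys/fs/xfs/dm-6/error/metadata/ENOSPC/max_retries

# Alternatively, give up after retrying for N seconds
echo 30 > /sys/fs/xfs/dm-6/error/metadata/ENOSPC/retry_timeout_seconds
```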

Trying to be pragmatic, my main concern is to avoid extended filesystem 
and/or data corruption in the case a thin pool inadvertently becomes 
full. For example, with ext4 I can mount the filesystem with 
"errors=remount-ro,data=journal" and *any* filesystem error (due to the 
thin pool or other problems) will put the filesystem into a read-only 
state, avoiding significant damage.

Can I replicate this behavior with XFS, and if so, how? From my 
understanding, XFS does not have a "remount read-only" mode. Moreover, 
as long as its metadata can be safely stored on disk (i.e. it hits 
already-allocated space), it seems to happily continue to run, 
disregarding data writeout problems/errors. As a note, ext4 without 
"data=journal" behaves quite similarly, with a read-only remount 
happening on metadata errors only.
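(For completeness: the closest thing I know of to a forced stop is xfs_io's expert-mode shutdown command, sketched below for testing recovery paths; what I am after is having something like this happen automatically on data writeout errors.)

```shell
# Force an immediate shutdown of the mounted XFS filesystem (expert/debug use)
xfs_io -x -c "shutdown" /mnt/storage

# From this point all writes fail until the filesystem is unmounted and remounted
```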

Surely I am missing something... right?
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


Thread overview: 30+ messages
2017-05-22 14:25 Shutdown filesystem when a thin pool become full Gionatan Danti
2017-05-22 23:09 ` Carlos Maiolino
2017-05-23 10:56   ` Gionatan Danti
2017-05-23 11:01     ` Gionatan Danti
2017-05-23 12:27       ` Carlos Maiolino
2017-05-23 20:05         ` Gionatan Danti [this message]
2017-05-23 21:33           ` Eric Sandeen
2017-05-24 17:52             ` Gionatan Danti
2017-06-13  9:09           ` Gionatan Danti
2017-06-15 11:51             ` Gionatan Danti
2017-06-15 13:14               ` Carlos Maiolino
2017-06-15 14:10                 ` Carlos Maiolino
2017-06-15 15:04                   ` Gionatan Danti
2017-06-20 10:19                     ` Gionatan Danti
2017-06-20 11:05                     ` Carlos Maiolino
2017-06-20 15:03                       ` Gionatan Danti
2017-06-20 15:28                         ` Brian Foster
2017-06-20 15:34                           ` Luis R. Rodriguez
2017-06-20 17:01                             ` Brian Foster
2017-06-20 15:55                           ` Gionatan Danti
2017-06-20 17:02                             ` Brian Foster
2017-06-20 18:43                               ` Gionatan Danti
2017-06-21  9:44                                 ` Carlos Maiolino
2017-06-21 10:39                                   ` Gionatan Danti
2017-06-21  9:53                                 ` Brian Foster
2017-05-23 12:11     ` Carlos Maiolino
2017-05-23 13:24 ` Eric Sandeen
2017-05-23 20:23   ` Gionatan Danti
2017-05-24  7:38     ` Carlos Maiolino
2017-05-24 17:50       ` Gionatan Danti
