Re: btrfs send hung in pipe_wait


From: Stefan Loewen <stefan.loewen@gmail.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs send hung in pipe_wait
Date: Fri, 7 Sep 2018 14:47:41 +0200
Message-ID: <CAHTTHimqg_wgqs0AXt73YzOv3ga7cAEUvbwMOVVT2JUVaNbsFQ@mail.gmail.com>
In-Reply-To: <CAJCQCtS+ZXzGU0AE=C1iA7yNFrXuRAvZkhssxN40=jPd=x6neA@mail.gmail.com>

Well... It seems it's not the hardware.
I ran a long SMART self-test, which completed without errors, and the
reallocation count is still 0.

So I used clonezilla (partclone.btrfs) to mirror the drive to another
drive (same model).
Everything copied over just fine. No I/O errors in dmesg.

The new disk shows the same behavior.
So I created another subvolume, reflinked stuff over and found that it
is enough to reflink one file, create a read-only snapshot and try to
btrfs-send that. It doesn't happen with every file, but definitely
with several different ones. The one I tested with is a 3.8GB
ISO file.
Even better:
'btrfs send --no-data snap-one > /dev/null'
(snap-one containing just one iso file) hangs as well.
Still dmesg shows no IO errors, only "INFO: task btrfs-transacti:541
blocked for more than 120 seconds." with associated call trace.
btrfs-send reads some MB in the beginning, writes a few bytes and then
hangs without further IO.

Copying the same file without --reflink, snapshotting and sending
works without problems.

I guess that pretty much eliminates bad sectors and points towards
some problem with reflinks / btrfs metadata.
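
For anyone who wants to try this on their own filesystem, the steps
above can be sketched roughly as follows. This is a minimal sketch, not
the exact commands I typed: MNT, SRC and the name test-sub are
hypothetical placeholders, and "-f /dev/null" stands in for the
"> /dev/null" redirect so the dry run can print the whole command. By
default it only prints what it would do.

```shell
#!/bin/sh
# Dry run by default: each command is echoed instead of executed.
# MNT, SRC and test-sub are hypothetical placeholders.
MNT=${MNT:-/mnt/btrfs}
SRC=${SRC:-$MNT/data/image.iso}
RUN=${RUN-echo}   # run with RUN= (set but empty) as root to really execute

repro() {
    # New subvolume that will hold a single reflinked file.
    $RUN btrfs subvolume create "$MNT/test-sub"
    # Reflink copy: shares extents with the source instead of copying data.
    $RUN cp --reflink=always "$SRC" "$MNT/test-sub/"
    # btrfs send only accepts read-only snapshots.
    $RUN btrfs subvolume snapshot -r "$MNT/test-sub" "$MNT/snap-one"
    # Metadata-only send; on the affected filesystem this is where it hangs.
    $RUN btrfs send --no-data -f /dev/null "$MNT/snap-one"
}

repro
```

Replacing --reflink=always with --reflink=never in the cp line gives the
control case that sends without problems here.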


Btw: thanks for taking so much time to help me find the problem
here, Chris. Very much appreciated!

On Fri, 7 Sep 2018 at 05:29, Chris Murphy
<lists@colorremedies.com> wrote:
>
> On Thu, Sep 6, 2018 at 2:16 PM, Stefan Loewen <stefan.loewen@gmail.com> wrote:
>
> > Data,single: Size:695.01GiB, Used:653.69GiB
> > /dev/sdb1     695.01GiB
> > Metadata,DUP: Size:4.00GiB, Used:2.25GiB
> > /dev/sdb1       8.00GiB
> > System,DUP: Size:40.00MiB, Used:96.00KiB
>
>
> > Does that mean Metadata is duplicated?
>
> Yes. Single copy for data. Duplicate for metadata+system, and there
> are no single chunks for metadata/system.
>
> >
> > Ok so to summarize and see if I understood you correctly:
> > There are bad sectors on disk. Running an extended selftest (smartctl -t
> > long) could find those and replace them with spare sectors.
>
> More likely if it finds a persistently failing sector, it will just
> record the first failing sector LBA in its log, and then abort. You'll
> see this info with 'smartctl -a' or with -x.
>
> It is possible to resume the test using selective option and picking a
> 4K aligned 512 byte LBA value after the 4K sector with the defect.
> Just because only one is reported in dmesg doesn't mean there isn't
> another bad one.
>
> It's unlikely the long test is going to actually fix anything, it'll
> just give you more ammunition for getting a likely under warranty
> device replaced because it really shouldn't have any issues at this
> age.
>
>
> > If it does not I can try calculating the physical (4K) sector number and
> > write to that to make the drive notice and mark the bad sector.
> > Is there a way to find out which file I will be writing to beforehand?
>
> I'm not sure how to do it easily.
>
> > Or is
> > it easier to just write to the sector and then wait for scrub to tell me
> > (and the sector is broken anyways)?
>
> If it's a persistent read error, then it's lost. So you might as well
> overwrite it. If it's data, scrub will tell you what file is corrupted
> (and restore can help you recover the whole file, of course it'll have
> a 4K hole of zeros in it). If it's metadata, Btrfs will fix up the 4K
> hole with duplicate metadata.
>
> Gotcha is to make certain you've got the right LBA to write to. You
> can use dd to test this, by reading the suspect bad sector, and if
> you've got the right one, you'll get an I/O error in user space and
> dmesg will have a message like before with sector value. Use the dd
> skip= flag for reading, but make *sure* you use seek= when writing
> *and* make sure you always use bs=4096 count=1 so that if you make a
> mistake you limit the damage haha.
>
> >
> > For the drive: Not under warranty anymore. It's an external HDD that I had
> > lying around for years, mostly unused. Now I wanted to use it as part of my
> > small DIY NAS.
>
> Gotcha. Well you can read up on smartctl and smartd, and set it up for
> regular extended tests, and keep an eye on rapidly changing values. It
> might give you a 50/50 chance of an early heads up before it dies.
>
> I've got an old Hitachi/Apple laptop drive that years ago developed
> multiple bad sectors in different zones of the drive. They got
> remapped and I haven't had a problem with that drive since. *shrug*
> And in fact I did get a discrete error message from the drive for one
> of those and Btrfs overwrote that bad sector with a good copy (it's in
> a raid1 volume), so working as designed I guess.
>
> Since you didn't get a fix up message from Btrfs, either the whole
> thing just got confused with hanging tasks, or it's possible it's a
> data block.
>
>
> --
> Chris Murphy
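
The LBA arithmetic and the dd skip=/seek= advice quoted above can be
sketched like this. It is only an illustration under assumptions: DEV
and LBA are hypothetical placeholders (not values from this thread),
the sector number is taken to be a 512-byte LBA relative to the whole
device, and the drive is assumed to use 4K physical sectors aligned at
LBA 0. The script prints the commands and does not run them.

```shell
#!/bin/sh
# Prints commands only; nothing here touches a disk.
# DEV and LBA are hypothetical placeholders, not values from this thread.
DEV=${DEV:-/dev/sdX}
LBA=${LBA:-123456789}   # failing 512-byte sector as reported in dmesg

# Index of the 4K block containing a 512-byte LBA (8 LBAs per 4K block).
blk_of() { echo $(( $1 / 8 )); }

# First 4K-aligned 512-byte LBA after the block containing the given LBA.
next_lba() { echo $(( ($1 / 8 + 1) * 8 )); }

blk=$(blk_of "$LBA")

echo "# read test: expect an I/O error if this is the right block"
echo "dd if=$DEV of=/dev/null bs=4096 count=1 skip=$blk iflag=direct"
echo "# overwrite that one block with zeros (destroys its contents)"
echo "dd if=/dev/zero of=$DEV bs=4096 count=1 seek=$blk oflag=direct"
echo "# resume the SMART self-test after the bad block"
echo "smartctl -t select,$(next_lba "$LBA")-max $DEV"
```

With bs=4096 both skip= and seek= count in 4K blocks, so the same block
index is used for the read test and for the overwrite.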
