From: Mike Maslenkin <mike.maslenkin@gmail.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: John Snow <jsnow@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
"open list:Network Block Dev..." <qemu-block@nongnu.org>,
Thomas Lamprecht <t.lamprecht@proxmox.com>,
Aaron Lauterer <a.lauterer@proxmox.com>
Subject: Re: Lost partition tables on ide-hd + ahci drive
Date: Thu, 16 Feb 2023 17:17:17 +0300 [thread overview]
Message-ID: <CAL77WPAdDyKFWP_Dqsz_xr7OCzHLTkw6VbYDMGobi8kek4e_8A@mail.gmail.com> (raw)
In-Reply-To: <d07bdbc1-065e-f8ec-2a44-ab141ffedd41@proxmox.com>
Would an additional comparison make sense here: check for LBA == 0 and
then check the MBR signature bytes?
Additionally, it's easy to check the buffer_is_zero() result, or even
print the FIS contents under these conditions.
The data looks like part of the guest memory of a 64-bit Windows.
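For illustration, a minimal sketch of the check being suggested. This is a hypothetical debug hook, not actual QEMU code; buffer_is_zero() here is a plain stand-in for QEMU's optimized util function of the same name, and check_sector0_write() is an invented name for wherever such a trace would be wired into the write path:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Plain stand-in for QEMU's buffer_is_zero() (util/bufferiszero.c). */
static bool buffer_is_zero(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical debug hook for a guest write: if the write targets
 * LBA 0 and the buffer lacks the 0x55 0xAA MBR signature at offset
 * 510, log it, noting whether the sector is all zeroes. Returns true
 * when the write looks suspicious, so callers could also dump the
 * FIS contents at that point.
 */
static bool check_sector0_write(uint64_t lba, const uint8_t *buf, size_t len)
{
    if (lba != 0 || len < 512) {
        return false;
    }
    if (buf[510] == 0x55 && buf[511] == 0xAA) {
        return false; /* valid MBR signature, nothing to report */
    }
    if (buffer_is_zero(buf, 512)) {
        fprintf(stderr, "suspicious write: sector 0 zeroed\n");
    } else {
        fprintf(stderr, "suspicious write: sector 0 lacks MBR signature\n");
    }
    return true;
}
```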
On Wed, Feb 15, 2023 at 1:53 PM Fiona Ebner <f.ebner@proxmox.com> wrote:
>
> Am 14.02.23 um 19:21 schrieb John Snow:
> > On Thu, Feb 2, 2023 at 7:08 AM Fiona Ebner <f.ebner@proxmox.com> wrote:
> >>
> >> Hi,
> >> over the years we've got 1-2 dozen reports[0] about suddenly
> >> missing/corrupted MBR/partition tables. The issue seems to be very rare
> >> and there was no success in trying to reproduce it yet. I'm asking here
> >> in the hope that somebody has seen something similar.
> >>
> >> The only commonality seems to be the use of an ide-hd drive with ahci bus.
> >>
> >> It does seem to happen with both Linux and Windows guests (one of the
> >> reports even mentions FreeBSD) and backing storages for the VMs include
> >> ZFS, RBD, LVM-Thin as well as file-based storages.
> >>
> >> Relevant part of an example configuration:
> >>
> >>> -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
> >>> -drive 'file=/dev/zvol/myzpool/vm-168-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
> >>> -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0' \
> >>
> >> The first reports are from before io_uring was used and there are also
> >> reports with writeback cache mode and discard=on,detect-zeroes=unmap.
> >>
> >> Some reports say that the issue occurred under high IO load.
> >>
> >> Many reports suspect backups causing the issue. Our backup mechanism
> >> uses backup_job_create() for each drive and runs the jobs sequentially.
> >> It uses a custom block driver as the backup target which just forwards
> >> the writes to the actual target which can be a file or our backup server.
> >> (If you really want to see the details, apply the patches in [1] and see
> >> pve-backup.c and block/backup-dump.c).
> >>
> >> Of course, the backup job will read sector 0 of the source disk, but I
> >> really can't see where a stray write would happen, why the issue would
> >> trigger so rarely or why seemingly only ide-hd+ahci would be affected.
> >>
> >> So again, just asking if somebody has seen something similar or has a
> >> hunch of what the cause might be.
> >>
> >
> > Hi Fiona,
> >
> > I'm sorry to say that I haven't worked on the block devices (or
> > backup) for a little while now, so I am not immediately sure what
> > might be causing this problem. In general, I advise against using AHCI
> > in production as better performance (and dev support) can be achieved
> > through virtio.
>
> Yes, we also recommend using virtio-{scsi,blk}-pci to our users and most
> do. Still, some use AHCI, I'd guess mostly for Windows, but not only.
>
> > Still, I am not sure why the combination of AHCI with
> > backup_job_create() would be corrupting the early sectors of the disk.
>
> It's not clear that backup itself is causing the issue. Some of the
> reports do correlate it with backup, but there are no precise timestamps
> when the corruption happened. It might be that the additional IO during
> backup is somehow triggering the issue.
>
> > Do you have any analysis on how much data gets corrupted? Is it the
> > first sector only, the first few? Has anyone taken a peek at the
> > backing storage to see if there are any interesting patterns that can
> > be observed? (Zeroes, garbage, old data?)
>
> It does seem to be the first sector only, but it's not entirely clear.
> Many of the affected users said that after fixing the partition table
> with TestDisk, the VMs booted/worked normally again. We only have dumps
> for the first MiB of three images. In this case, all Windows with Ceph
> RBD images.
>
> See below[0] for the dumps. One was a valid MBR and matched the latest
> good backup, so that VM didn't boot for some other reason, not sure if
> even related to this bug. I did not include this one. One was completely
> empty, and one contained other data in the first 512 bytes, then again
> zeroes, but those zeroes are nothing special AFAIK.
>
> > Have any errors or warnings been observed in either the guest or the
> > host that might offer some clues?
>
> There is a single user who seemed to have hardware issues, and I'd be
> inclined to blame those in that case. But none of the other users
> reported any errors or warnings, though I can't say if any checked
> inside the guests.
>
> > Is there any commonality in the storage format being used? Is it
> > qcow2? Is it network-backed?
>
> There are reports with local ZFS volumes, local LVM-Thin volumes, RBD
> images, qcow2 on NFS. So no pattern to be seen.
>
> > Apologies for the "tier 1" questions.
>
> Thank you for your time!
>
> Best Regards,
> Fiona
>
> @Aaron (had access to the broken images): please correct me/add anything
> relevant I missed. Are the broken VMs/backups still present? If yes, can
> we ask the user to check the logs inside?
>
> [0]:
> > febner@enia ~/Downloads % hexdump -C dump-vm-120.raw
> > 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 00100000
> > febner@enia ~/Downloads % hexdump -C dump-vm-130.raw
> > 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 000000c0 00 00 19 03 46 4d 66 6e 00 00 00 00 00 00 00 00 |....FMfn........|
> > 000000d0 04 f2 7a 01 00 00 00 00 00 00 00 00 00 00 00 00 |..z.............|
> > 000000e0 f0 a4 01 00 00 00 00 00 c8 4d 5b 99 0c 81 ff ff |.........M[.....|
> > 000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > 00000100 00 42 e1 38 0d da ff ff 00 bc b4 3b 0d da ff ff |.B.8.......;....|
> > 00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > 00000120 78 00 00 00 01 00 00 00 a8 00 aa 00 00 00 00 00 |x...............|
> > 00000130 a0 71 ba b0 0c 81 ff ff 2e 00 2e 00 00 00 00 00 |.q..............|
> > 00000140 a0 71 ba b0 0c 81 ff ff 00 00 00 00 00 00 00 00 |.q..............|
> > 00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 000001a0 5c 00 44 00 65 00 76 00 69 00 63 00 65 00 5c 00 |\.D.e.v.i.c.e.\.|
> > 000001b0 48 00 61 00 72 00 64 00 64 00 69 00 73 00 6b 00 |H.a.r.d.d.i.s.k.|
> > 000001c0 56 00 6f 00 6c 00 75 00 6d 00 65 00 32 00 5c 00 |V.o.l.u.m.e.2.\.|
> > 000001d0 57 00 69 00 6e 00 64 00 6f 00 77 00 73 00 5c 00 |W.i.n.d.o.w.s.\.|
> > 000001e0 4d 00 69 00 63 00 72 00 6f 00 73 00 6f 00 66 00 |M.i.c.r.o.s.o.f.|
> > 000001f0 74 00 2e 00 4e 00 45 00 54 00 5c 00 46 00 72 00 |t...N.E.T.\.F.r.|
> > 00000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 00100000
>
>
Thread overview: 19+ messages
2023-02-02 12:08 Lost partition tables on ide-hd + ahci drive Fiona Ebner
2023-02-14 18:21 ` John Snow
2023-02-15 10:53 ` Fiona Ebner
2023-02-15 21:47 ` John Snow
2023-02-16 8:58 ` Fiona Ebner
2023-02-16 14:17 ` Mike Maslenkin [this message]
2023-02-16 15:25 ` Fiona Ebner
2023-02-16 16:15 ` Mike Maslenkin
2023-02-17 12:25 ` Fiona Ebner
2023-02-17 13:40 ` Fiona Ebner
2023-02-17 21:22 ` Mike Maslenkin
2023-08-23 8:47 ` Fiona Ebner
2023-08-23 9:17 ` Fiona Ebner
2023-08-26 18:07 ` Mike Maslenkin
2023-02-17 9:44 ` Aaron Lauterer
2023-06-14 14:48 ` Simon J. Rowe
2023-06-15 7:04 ` Fiona Ebner
2023-06-15 8:24 ` Simon Rowe
2023-07-27 13:22 ` Simon Rowe