From: Mike Maslenkin <mike.maslenkin@gmail.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: John Snow <jsnow@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
"open list:Network Block Dev..." <qemu-block@nongnu.org>,
Thomas Lamprecht <t.lamprecht@proxmox.com>,
Aaron Lauterer <a.lauterer@proxmox.com>
Subject: Re: Lost partition tables on ide-hd + ahci drive
Date: Thu, 16 Feb 2023 17:17:17 +0300 [thread overview]
Message-ID: <CAL77WPAdDyKFWP_Dqsz_xr7OCzHLTkw6VbYDMGobi8kek4e_8A@mail.gmail.com> (raw)
In-Reply-To: <d07bdbc1-065e-f8ec-2a44-ab141ffedd41@proxmox.com>
Would an additional comparison make sense here: check for LBA == 0 and
then check the MBR signature bytes?
Additionally, it's easy to check the buffer_is_zero() result, or even
print the FIS contents under these conditions.
The data looks like part of the guest memory of a 64-bit Windows.
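For illustration, a minimal sketch of the check being suggested. This is a hypothetical debug hook, not actual QEMU code; buffer_is_zero() here is a plain stand-in for QEMU's optimized util function of the same name, and check_sector0_write() is an invented name for wherever such a trace would be wired into the write path:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Plain stand-in for QEMU's buffer_is_zero() (util/bufferiszero.c). */
static bool buffer_is_zero(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical debug hook for a guest write: if the write targets
 * LBA 0 and the buffer lacks the 0x55 0xAA MBR signature at offset
 * 510, log it, noting whether the sector is all zeroes. Returns true
 * when the write looks suspicious, so callers could also dump the
 * FIS contents at that point.
 */
static bool check_sector0_write(uint64_t lba, const uint8_t *buf, size_t len)
{
    if (lba != 0 || len < 512) {
        return false;
    }
    if (buf[510] == 0x55 && buf[511] == 0xAA) {
        return false; /* valid MBR signature, nothing to report */
    }
    if (buffer_is_zero(buf, 512)) {
        fprintf(stderr, "suspicious write: sector 0 zeroed\n");
    } else {
        fprintf(stderr, "suspicious write: sector 0 lacks MBR signature\n");
    }
    return true;
}
```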
On Wed, Feb 15, 2023 at 1:53 PM Fiona Ebner <f.ebner@proxmox.com> wrote:
>
> Am 14.02.23 um 19:21 schrieb John Snow:
> > On Thu, Feb 2, 2023 at 7:08 AM Fiona Ebner <f.ebner@proxmox.com> wrote:
> >>
> >> Hi,
> >> over the years we've got 1-2 dozen reports[0] about suddenly
> >> missing/corrupted MBR/partition tables. The issue seems to be very rare
> >> and there was no success in trying to reproduce it yet. I'm asking here
> >> in the hope that somebody has seen something similar.
> >>
> >> The only commonality seems to be the use of an ide-hd drive with ahci bus.
> >>
> >> It does seem to happen with both Linux and Windows guests (one of the
> >> reports even mentions FreeBSD) and backing storages for the VMs include
> >> ZFS, RBD, LVM-Thin as well as file-based storages.
> >>
> >> Relevant part of an example configuration:
> >>
> >>> -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
> >>> -drive 'file=/dev/zvol/myzpool/vm-168-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
> >>> -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0' \
> >>
> >> The first reports are from before io_uring was used and there are also
> >> reports with writeback cache mode and discard=on,detect-zeroes=unmap.
> >>
> >> Some reports say that the issue occurred under high IO load.
> >>
> >> Many reports suspect backups causing the issue. Our backup mechanism
> >> uses backup_job_create() for each drive and runs the jobs sequentially.
> >> It uses a custom block driver as the backup target which just forwards
> >> the writes to the actual target which can be a file or our backup server.
> >> (If you really want to see the details, apply the patches in [1] and see
> >> pve-backup.c and block/backup-dump.c).
> >>
> >> Of course, the backup job will read sector 0 of the source disk, but I
> >> really can't see where a stray write would happen, why the issue would
> >> trigger so rarely or why seemingly only ide-hd+ahci would be affected.
> >>
> >> So again, just asking if somebody has seen something similar or has a
> >> hunch of what the cause might be.
> >>
> >
> > Hi Fiona,
> >
> > I'm sorry to say that I haven't worked on the block devices (or
> > backup) for a little while now, so I am not immediately sure what
> > might be causing this problem. In general, I advise against using AHCI
> > in production as better performance (and dev support) can be achieved
> > through virtio.
>
> Yes, we also recommend using virtio-{scsi,blk}-pci to our users and most
> do. Still, some use AHCI, I'd guess mostly for Windows, but not only.
>
> > Still, I am not sure why the combination of AHCI with
> > backup_job_create() would be corrupting the early sectors of the disk.
>
> It's not clear that backup itself is causing the issue. Some of the
> reports do correlate it with backup, but there are no precise timestamps
> when the corruption happened. It might be that the additional IO during
> backup is somehow triggering the issue.
>
> > Do you have any analysis on how much data gets corrupted? Is it the
> > first sector only, the first few? Has anyone taken a peek at the
> > backing storage to see if there are any interesting patterns that can
> > be observed? (Zeroes, garbage, old data?)
>
> It does seem to be the first sector only, but it's not entirely clear.
> Many of the affected users said that after fixing the partition table
> with TestDisk, the VMs booted/worked normally again. We only have dumps
> for the first MiB of three images. In this case, all Windows with Ceph
> RBD images.
>
> See below[0] for the dumps. One was a valid MBR and matched the latest
> good backup, so that VM didn't boot for some other reason, not sure if
> even related to this bug. I did not include this one. One was completely
> empty, and one contained other data in the first 512 bytes, then again
> zeroes, but those zeroes are nothing special AFAIK.
>
> > Have any errors or warnings been observed in either the guest or the
> > host that might offer some clues?
>
> There is a single user who seemed to have hardware issues, and I'd be
> inclined to blame those in that case. But none of the other users
> reported any errors or warnings, though I can't say if any checked
> inside the guests.
>
> > Is there any commonality in the storage format being used? Is it
> > qcow2? Is it network-backed?
>
> There are reports with local ZFS volumes, local LVM-Thin volumes, RBD
> images, qcow2 on NFS. So no pattern to be seen.
>
> > Apologies for the "tier 1" questions.
>
> Thank you for your time!
>
> Best Regards,
> Fiona
>
> @Aaron (had access to the broken images): please correct me/add anything
> relevant I missed. Are the broken VMs/backups still present? If yes, can
> we ask the user to check the logs inside?
>
> [0]:
> > febner@enia ~/Downloads % hexdump -C dump-vm-120.raw
> > 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 00100000
> > febner@enia ~/Downloads % hexdump -C dump-vm-130.raw
> > 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 000000c0 00 00 19 03 46 4d 66 6e 00 00 00 00 00 00 00 00 |....FMfn........|
> > 000000d0 04 f2 7a 01 00 00 00 00 00 00 00 00 00 00 00 00 |..z.............|
> > 000000e0 f0 a4 01 00 00 00 00 00 c8 4d 5b 99 0c 81 ff ff |.........M[.....|
> > 000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > 00000100 00 42 e1 38 0d da ff ff 00 bc b4 3b 0d da ff ff |.B.8.......;....|
> > 00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > 00000120 78 00 00 00 01 00 00 00 a8 00 aa 00 00 00 00 00 |x...............|
> > 00000130 a0 71 ba b0 0c 81 ff ff 2e 00 2e 00 00 00 00 00 |.q..............|
> > 00000140 a0 71 ba b0 0c 81 ff ff 00 00 00 00 00 00 00 00 |.q..............|
> > 00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 000001a0 5c 00 44 00 65 00 76 00 69 00 63 00 65 00 5c 00 |\.D.e.v.i.c.e.\.|
> > 000001b0 48 00 61 00 72 00 64 00 64 00 69 00 73 00 6b 00 |H.a.r.d.d.i.s.k.|
> > 000001c0 56 00 6f 00 6c 00 75 00 6d 00 65 00 32 00 5c 00 |V.o.l.u.m.e.2.\.|
> > 000001d0 57 00 69 00 6e 00 64 00 6f 00 77 00 73 00 5c 00 |W.i.n.d.o.w.s.\.|
> > 000001e0 4d 00 69 00 63 00 72 00 6f 00 73 00 6f 00 66 00 |M.i.c.r.o.s.o.f.|
> > 000001f0 74 00 2e 00 4e 00 45 00 54 00 5c 00 46 00 72 00 |t...N.E.T.\.F.r.|
> > 00000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 00100000
>
>
Thread overview: 19+ messages
2023-02-02 12:08 Lost partition tables on ide-hd + ahci drive Fiona Ebner
2023-02-14 18:21 ` John Snow
2023-02-15 10:53 ` Fiona Ebner
2023-02-15 21:47 ` John Snow
2023-02-16 8:58 ` Fiona Ebner
2023-02-16 14:17 ` Mike Maslenkin [this message]
2023-02-16 15:25 ` Fiona Ebner
2023-02-16 16:15 ` Mike Maslenkin
2023-02-17 12:25 ` Fiona Ebner
2023-02-17 13:40 ` Fiona Ebner
2023-02-17 21:22 ` Mike Maslenkin
2023-08-23 8:47 ` Fiona Ebner
2023-08-23 9:17 ` Fiona Ebner
2023-08-26 18:07 ` Mike Maslenkin
2023-02-17 9:44 ` Aaron Lauterer
2023-06-14 14:48 ` Simon J. Rowe
2023-06-15 7:04 ` Fiona Ebner
2023-06-15 8:24 ` Simon Rowe
2023-07-27 13:22 ` Simon Rowe