From: John Snow <jsnow@redhat.com>
To: eniac-xw.zhang@hp.com
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] AHCI bug
Date: Thu, 24 Jul 2014 17:13:31 -0400 [thread overview]
Message-ID: <53D176FB.1050907@redhat.com> (raw)
In-Reply-To: <3B22ECA2D19A3D408C83F4F15A9CB7D4508FBC04@G9W0737.americas.hpqcorp.net>
[-- Attachment #1: Type: text/plain, Size: 6111 bytes --]
On 07/24/2014 01:01 PM, Zhang, Eniac wrote:
>
> Hi all,
>
> I found a problem while using following combinations: qemu-1.6.1 +
> large disks (>127GB) + q35(AHCI controller) + int13h disk access.
>
> The AHCI controller code in Qemu has a bug that it will use the wrong
> LBA address when Seabios tries to access LBA>128GB (aka 127.5GB limit
> http://www.hardwaresecrets.com/printpage/Hard-Disk-Drives-Capacity-Limits/482).
> When we needs to access the LBA>0xfffffff, 28bit LBA is not sufficient
> thus AHCI code needs to convert that into an LBA48 command, but it
> didn't set all the flags correctly, so low level code ends up reading
> a sector at different address, here's where I caught it red handed
> (from logging statement I added to qemu):
>
> *set: sector=300050824*
>
> *get: sector=35809672*
>
> *LBA mismatch detected, saved:0x00000011e26988, read:0x00000002226988*
>
> Here's my qemu patch:
>
> diff -r -p ../qemu-1.6.1-org/hw/ide/ahci.c ./hw/ide/ahci.c
>
> *** ../qemu-1.6.1-org/hw/ide/ahci.c 2013-10-09 15:20:32.000000000 -0400
>
> --- ./hw/ide/ahci.c 2014-07-21 10:37:03.734217053 -0400
>
> *************** static int handle_cmd(AHCIState *s, int
>
> *** 929,934 ****
>
> --- 931,954 ----
>
> * do, I simply assume non-used fields as reserved and
> OR everything
>
> * together, independent of the command.
>
> */
>
> +
>
> + #if 1 // eniac
>
> + // enable lba and lba48 mode, otherwise the bit won't
> get set until the command is completed, cause read/write corruption
>
> + ide_state->lba48 = (cmd_fis[2] == WIN_READDMA_EXT
>
> + || cmd_fis[2] == WIN_READ_EXT
>
> + || cmd_fis[2] == WIN_READDMA_QUEUED_EXT
>
> + || cmd_fis[2] == WIN_READ_NATIVE_MAX_EXT
>
> + || cmd_fis[2] == WIN_MULTREAD_EXT
>
> + || cmd_fis[2] == WIN_WRITE_EXT
>
> + || cmd_fis[2] == WIN_WRITEDMA_EXT
>
> + || cmd_fis[2] == WIN_WRITEDMA_QUEUED_EXT
>
> + || cmd_fis[2] == WIN_SET_MAX_EXT
>
> + || cmd_fis[2] == WIN_MULTWRITE_EXT
>
> + || cmd_fis[2] == WIN_VERIFY_EXT
>
> + || cmd_fis[2] == WIN_FLUSH_CACHE_EXT
>
> + );
>
> + ide_state->select |= 0x40;
>
> + #endif // eniac
>
> ide_set_sector(ide_state, ((uint64_t)cmd_fis[10] << 40)
>
> | ((uint64_t)cmd_fis[9] << 32)
>
> /* This is used for LBA48 commands */
>
> diff -r -p ../qemu-1.6.1-org/hw/ide/core.c ./hw/ide/core.c
>
> *** ../qemu-1.6.1-org/hw/ide/core.c 2013-10-09 15:20:32.000000000 -0400
>
> --- ./hw/ide/core.c 2014-07-21 10:37:21.132422670 -0400
>
> *************** void ide_transfer_stop(IDEState *s)
>
> *** 445,450 ****
>
> --- 445,454 ----
>
> s->status &= ~DRQ_STAT;
>
> }
>
> + #if 1 // eniac added to detect LBA skew, quit early to avoid corruption
>
> + int64_t saved_lba = -1;
>
> + #endif // eniac
>
> +
>
> int64_t ide_get_sector(IDEState *s)
>
> {
>
> int64_t sector_num;
>
> *************** int64_t ide_get_sector(IDEState *s)
>
> *** 464,475 ****
>
> --- 468,495 ----
>
> sector_num = ((s->hcyl << 8) | s->lcyl) * s->heads * s->sectors +
>
> (s->select & 0x0f) * s->sectors + (s->sector - 1);
>
> }
>
> +
>
> + #if 1 // eniac
>
> + printf("get: sector=%" PRId64 "\n", sector_num);
>
> + if (sector_num != saved_lba && saved_lba != -1) {
>
> + printf("LBA mismatch detected, saved:%#016llx, read:%#016llx\n",
> (long long unsigned) saved_lba, (long long unsigned) sector_num);
>
> + exit(1);
>
> + } else {
>
> + saved_lba = -1;
>
> + }
>
> + #endif // eniac
>
> return sector_num;
>
> }
>
> void ide_set_sector(IDEState *s, int64_t sector_num)
>
> {
>
> unsigned int cyl, r;
>
> +
>
> +
>
> + #if 1 // eniac
>
> + saved_lba = sector_num;
>
> + printf("set: sector=%" PRId64 "\n", sector_num);
>
> + #endif // eniac
>
> if (s->select & 0x40) {
>
> if (!s->lba48) {
>
> s->select = (s->select & 0xf0) | (sector_num >> 24);
>
> To duplicate the problem you need to turn off the workaround in
> ahci.c, leaving the debug logs in core.c, compile your
> qemu-system-x86_64 and then run:
>
> ./ qemu-system-x86_64 -fda dos622.img -drive
> if=none,file=./blank.qcow2,id=hdc,media=disk -device
> ide-hd,drive=hdc,bus=ide.0 -M q35 -m 256M -vnc :1 -boot a
>
> Blank.qcow2 is a 300GB virtual disk file I pre-created, you can leave
> it blank cause what's on disk doesn't matter in this test. dos622.img
> is the dos622 floppy image with debug.com and a batch file:
>
> # cat int1342.bat
>
> a 100
>
> mov si, 0200
>
> mov ax, 4200
>
> mov dx, 0080
>
> int 13
>
> ret
>
> ; 0x3: length
>
> e 200 10 00 7f 00 00 00 00 50
>
> ; lba
>
> e 208 88 69 e2 11 00 00 00 00
>
> r ip
>
> 100
>
> g
>
> r
>
> d 5000:0
>
> q
>
> Connect vncviewer, once dos boot is completed, type debug<int1342.bat
> to try to read a sector beyond 128GB using int13h.
>
> I tried to fix that by put in check that set the LBA modes manually.
> It worked well for AHCI controllers but I think it looks like a
> bandaid rather than a proper fix. Can someone more knowledgeable in
> disk access and controller code help to figure out what was supposed
> to happen here?
>
> Regards/Eniac
>
Hi Eniac,
(1) Please re-base your patch on top of the latest git source -- if the
latest version does in fact still act erroneously. I recommend running
your reproduce script on the latest upstream code. If you still find a
problem, then:
(2) Please resend this patch using `git format-patch` or an equivalent
format. It will be easier for me and others to review it in that format.
Please see http://qemu-project.org/Contribute/SubmitAPatch for the
submission formatting and process guidelines.
(3) Any non-debugging code should not be behind an #ifdef in a submitted
patch ... unless you meant this to be an RFC, but still please use the
git patch format in those cases, and make sure "RFC" is in your subject
line.
(4) Please use the DEBUG_AHCI flag if there are conditional checks you'd
like to enable in non-production environments.
(5) Use DPRINTF for debug prints instead of printf.
Thank you!
--John
[-- Attachment #2: Type: text/html, Size: 24564 bytes --]
prev parent reply other threads:[~2014-07-24 21:13 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-24 17:01 [Qemu-devel] AHCI bug Zhang, Eniac
2014-07-24 21:13 ` John Snow [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53D176FB.1050907@redhat.com \
--to=jsnow@redhat.com \
--cc=eniac-xw.zhang@hp.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).