qemu-devel.nongnu.org archive mirror
From: dovgaluk <dovgaluk@ispras.ru>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>,
	jsnow@redhat.com, qemu-devel@nongnu.org,
	pavel.dovgaluk@ispras.ru, mreitz@redhat.com
Subject: Re: [PATCH] icount: make dma reads deterministic
Date: Tue, 03 Mar 2020 15:31:29 +0300
Message-ID: <a6067910aa1bb4eb512c50292734b566@ispras.ru>
In-Reply-To: <20200302161903.GF4965@linux.fritz.box>

Kevin Wolf wrote on 2020-03-02 19:19:
> On 02.03.2020 at 13:59, Pavel Dovgalyuk wrote:
>> Windows guest sometimes makes DMA requests with overlapping
>> target addresses. This leads to the following structure of iov for
>> the block driver:
>> 
>> addr size1
>> addr size2
>> addr size3
>> 
>> It means that three adjacent disk blocks should be read into the same
>> memory buffer. Windows does not expect anything from these bytes
>> (should it be data from the first block, or the last one, or some mix),
>> but uses them somehow. It leads to non-determinism of the guest execution,
>> because the block driver does not preserve any order of reading.
>> 
>> This situation was discussed on the mailing list at least twice:
>> https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
>> https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
>> 
>> This patch makes such disk reads deterministic in icount mode.
>> It skips SG parts that were already affected by prior reads
>> within the same request. Parts that are not identical but merely
>> overlap are trimmed.
>> 
>> Examples for different SG part sequences:
>> 
>> 1)
>> A1 1000
>> A1 1000
>> ->
>> A1 1000
>> 
>> 2)
>> A1 1000
>> A2 1000
>> A1 1000
>> A3 1000
>> ->
>> Two requests with different offsets, because the second A1/1000 should be skipped.
>> A1 1000
>> A2 1000
>> --
>> A3 1000
> 
> How is the "--" line represented in the code?
> 
>> 3)
>> A1 800
>> A2 1000
>> A1 1000
>> ->
>> First 800 bytes of third SG are skipped.
>> A1 800
>> A2 1000
>> --
>> A1+800 800
>> 
>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>> ---
>>  dma-helpers.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 53 insertions(+), 4 deletions(-)
>> 
>> diff --git a/dma-helpers.c b/dma-helpers.c
>> index e8a26e81e1..d71512f707 100644
>> --- a/dma-helpers.c
>> +++ b/dma-helpers.c
>> @@ -13,6 +13,7 @@
>>  #include "trace-root.h"
>>  #include "qemu/thread.h"
>>  #include "qemu/main-loop.h"
>> +#include "sysemu/cpus.h"
>> 
>>  /* #define DEBUG_IOMMU */
>> 
>> @@ -139,17 +140,65 @@ static void dma_blk_cb(void *opaque, int ret)
>>      dma_blk_unmap(dbs);
>> 
>>      while (dbs->sg_cur_index < dbs->sg->nsg) {
>> +        bool skip = false;
>>          cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>>          cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
>> -        mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
>> -        if (!mem)
>> -            break;
>> -        qemu_iovec_add(&dbs->iov, mem, cur_len);
>> +
>> +        /*
>> +         * Make reads deterministic in icount mode.
>> +         * Windows sometimes issues disk read requests with
>> +         * overlapping SGs. It leads to non-determinism, because
>> +         * resulting buffer contents may be mixed from several
>> +         * sectors.
>> +         * This code crops SGs that were already read in this request.
>> +         */
> 
> Please make use of the full line length for the commit text, and add
> empty lines between paragraphs.

Ok

> 
>> +        if (use_icount
>> +            && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
> 
> This fits in a single line.

Ok

>> +        }
>> +
>>          dbs->sg_cur_byte += cur_len;
>>          if (dbs->sg_cur_byte == dbs->sg->sg[dbs->sg_cur_index].len) {
>>              dbs->sg_cur_byte = 0;
>>              ++dbs->sg_cur_index;
>>          }
>> +
>> +        /*
>> +         * All remaining SGs were skipped.
>> +         * This is not the reschedule case, because we already
>> +         * performed the reads, and the last SGs were skipped.
>> +         */
>> +        if (dbs->sg_cur_index == dbs->sg->nsg && dbs->iov.size == 0) {
>> +            dma_complete(dbs, ret);
>> +            return;
>> +        }
>>      }
> 
> I think the concept of skipping SG list entries makes this patch
> relatively complex. Maybe one of these would work better:
> 
> 1. Instead of skipping, add a temporary bounce buffer to the iovec.
> 
> 2. Instead of skipping, just exit the loop and effectively split the
>    request in multiple parts (like you already do in one case). Then the
>    memory will still be written to twice, but deterministically so that
>    the later SG list entry always wins.
> 
> I think 2. sounds quite attractive because you don't have to manage any
> additional state. You can even simplify the loop to use ranges_overlap()

Thanks for this idea. Please check the new version.
I couldn't find a way to check SG addresses without making the comparisons
too complex or storing extra data. Therefore I pass the iov pointers
directly to ranges_overlap().
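
To make the "split on overlap" idea (option 2 above) concrete, here is a
minimal standalone sketch. It is not the code from the new patch version:
struct sg_entry, overlaps() and split_batches() are hypothetical names
invented for illustration, overlaps() is only a local stand-in for an
overlap check in the spirit of ranges_overlap(), and the sketch compares
the addresses stored in the entries rather than mapped iov pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter-gather entry (address + length). */
struct sg_entry {
    uint64_t base;
    uint64_t len;
};

/* Local stand-in for an overlap check on two non-empty ranges. */
static bool overlaps(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    assert(alen > 0 && blen > 0);
    uint64_t alast = a + alen - 1;   /* last byte of the first range */
    uint64_t blast = b + blen - 1;   /* last byte of the second range */
    return !(blast < a || alast < b);
}

/*
 * Split sg[0..nsg) into consecutive batches so that no two entries in
 * the same batch overlap. batch_len[i] receives the number of entries
 * in batch i (the array needs room for nsg values). Returns the number
 * of batches. Each batch models one request sent to the block layer;
 * because batches are issued in order, the later entry always wins.
 */
static size_t split_batches(const struct sg_entry *sg, size_t nsg,
                            size_t *batch_len)
{
    size_t nbatches = 0;
    size_t start = 0;                /* first entry of the current batch */

    for (size_t i = 0; i < nsg; i++) {
        for (size_t j = start; j < i; j++) {
            if (overlaps(sg[i].base, sg[i].len, sg[j].base, sg[j].len)) {
                /* Close the current batch; entry i opens the next one. */
                batch_len[nbatches++] = i - start;
                start = i;
                break;
            }
        }
    }
    if (start < nsg) {
        batch_len[nbatches++] = nsg - start;
    }
    return nbatches;
}
```

Applied to example 2 from the commit message (A1, A2, A1, A3), this yields
two batches of two entries each: the second A1 is not skipped but read
again in the second request, so its contents deterministically replace
those from the first request.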


Pavel Dovgalyuk

