Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Laszlo Ersek <lersek@redhat.com>
To: "Gabriel L. Somlo" <somlo@cmu.edu>
Cc: peter.maydell@linaro.org, jordan.l.justen@intel.com,
	qemu-devel@nongnu.org, kraxel@redhat.com, pbonzini@redhat.com,
	markmb@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
Date: Tue, 3 Nov 2015 22:35:36 +0100	[thread overview]
Message-ID: <563928A8.5030907@redhat.com> (raw)
In-Reply-To: <20151103175515.GF10717@HEDWIG.INI.CMU.EDU>

On 11/03/15 18:55, Gabriel L. Somlo wrote:
> On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
>> Thank you for splitting out this patch; it makes it easier to review.
>> However,
>>
>> On 11/03/15 01:35, Gabriel L. Somlo wrote:
>>> Introduce fw_cfg_data_read(), a generic read method which works
>>> on all access widths (1 through 8 bytes, inclusive), and can be
>>> used during both IOPort and MMIO read accesses.
>>>
>>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
>>> data read method) is replaced by this patch. The new method
>>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
>>> combo, but without unnecessarily repeating all the validity
>>> checks performed by the latter on each byte being read.
>>
>> this unwinding caused a bug to creep in.
>>
>> Namely, we have to identify the set of data that remains constant
>> between *all* "size" calls that fw_cfg_data_mem_read() makes to
>> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
>>
>> Specifically,
>>
>>> This patch also modifies the trace_fw_cfg_read prototype to
>>> accept a 64-bit value argument, allowing it to work properly
>>> with the new read method, but also remain backward compatible
>>> with existing call sites.
>>>
>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>> Cc: Marc Marí <markmb@redhat.com>
>>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
>>> ---
>>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
>>>  trace-events      |  2 +-
>>>  2 files changed, 20 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
>>> index c2d3a0a..8aa980c 100644
>>> --- a/hw/nvram/fw_cfg.c
>>> +++ b/hw/nvram/fw_cfg.c
>>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>>>      return ret;
>>>  }
>>>  
>>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>>> +{
>>> +    FWCfgState *s = opaque;
>>
>> This is good.
>>
>>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>
>> Okay too.
>>
>>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>
>> (1) Side point: the conversion here is faithful to the original code in
>> fw_cfg_read(), but even in the original code, the expression uses
>> "s->cur_entry" as a (masked) subscript *before* comparing it against
>> FW_CFG_INVALID. I don't think that's right.
>>
>> The same issue is present in fw_cfg_dma_transfer(). Care to write a
>> patch (before the restructuring) that fixes both?
>>
>> Note, I am aware that the expression in both of the above mentioned
>> functions only calculates the *address* of the nonexistent element
>> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
>>
>>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>
>> But it doesn't matter; it's undefined behavior just the same. Instead,
>> *both* locations should say:
>>
>>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>
>> (I share the blame for not noticing this earlier -- I too reviewed
>> fw_cfg_dma_transfer().)
>>
>> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
>> current address-of expression is not valid even for evaluation. Also, in
>> practice, dereferencing NULL would give us a nice (as in, non-garbage)
>> SIGSEGV.
> 
> Done.
> 
>>
>> Anyway, back to the topic at hand:
>>
>>> +    uint64_t value = 0;
>>> +
>>> +    assert(size <= sizeof(value));
>>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
>>
>> Right, good conversion. (Side note: this does protect against
>> *dereferencing* "e", but it's already too late, as far as undefined
>> behavior is concerned.)
>>
>>> +        while (size-- && s->cur_offset < e->len) {
>>> +            value = (value << 8) | e->data[s->cur_offset++];
>>> +        }
>>
>> (2) So, this is the bug. The pre-conversion code would keep shifting
>> "value" to the left until "size" was reached, regardless of the
>> underlying blob size, and just leave the least significant bytes zeroed
>> if the item ended too early. Whereas this loop *stops shifting* when the
>> blob ends.
> 
> D'OH!!! That should teach me to pay more attention -- thanks for
> catching it!
> 
>> Since the wide data register (which is big-endian) implements a
>> substring-preserving transfer (on top of QEMU's integer preserving
>> device r/w infrastructure), this change breaks the case when the
>> firmware reads, say, 8 bytes from the register in a single access, when
>> only 3 are left in the blob, and then uses only the three *lowest
>> address* bytes from the uint64_t value read. Although no known firmware
>> does this at the moment, it would be valid, and the above hunk would
>> break it.
>>
>> Hence please
>>
>> (2a) either append the missing "cumulative" shift after the loop:
>>
>>     while (size && s->cur_offset < e->len) {
>>         --size;
>>         value = (value << 8) | e->data[s->cur_offset++];
>>     }
>>     value <<= 8 * size;
> 
> I went with 2a. Also added a comment to make things painfully obvious
> to any potential future archaeologists:
> 
> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    FWCfgState *s = opaque;
> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> +    uint64_t value = 0;
> +
> +    assert(size <= sizeof(value));
> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> +        /* The least significant 'size' bytes of the return value are
> +         * expected to contain a string preserving portion of the item
> +         * data, padded with zeros to the right in case we run out early.

Please say "*on* the right" here, just like it reads below (emphasis
added only for review purposes).

Also, while the above seems correct, I prefer my own wording from commit
3c23402d4032:

  The solution is to compose the host-endian representation [...] of
  the big endian interpretation [...] of the fw_cfg string [...]

I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
;)) -- my preference could be harder to interpret for "future
archeologist". So I'll leave it to you whether to keep yours, pick mine,
or run with a mixture / union.

Cheers!
Laszlo

> +         */
> +        while (size && s->cur_offset < e->len) {
> +            value = (value << 8) | e->data[s->cur_offset++];
> +            size--;
> +        }
> +        /* If size is still not zero, we *did* run out early, so finish
> +         * left-shifting to add the appropriate number of padding zeros
> +         * on the right.
> +         */
> +        value <<= 8 * size;
> +    }
> +
> +    trace_fw_cfg_read(s, value);
> +    return value;
> +}
> 
> Version 4 should be out by the end of today.
> 
> Thanks again,
> --Gabriel
> 
>>
>> (2b) or move the offset check from the loop's controlling expression
>> into the value composition:
>>
>>         while (size--) {
>>             value = (value << 8) | (s->cur_offset < e->len ?
>>                                     e->data[s->cur_offset++] :
>>                                     0);
>>         }
>>
>> The rest looks good.
>>
>> Thanks
>> Laszlo
>>
>>> +    }
>>> +
>>> +    trace_fw_cfg_read(s, value);
>>> +    return value;
>>> +}
>>> +
>>>  static uint8_t fw_cfg_read(FWCfgState *s)
>>>  {
>>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>>>      return ret;
>>>  }
>>>  
>>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
>>> -                                     unsigned size)
>>> -{
>>> -    FWCfgState *s = opaque;
>>> -    uint64_t value = 0;
>>> -    unsigned i;
>>> -
>>> -    for (i = 0; i < size; ++i) {
>>> -        value = (value << 8) | fw_cfg_read(s);
>>> -    }
>>> -    return value;
>>> -}
>>> -
>>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
>>>                                    uint64_t value, unsigned size)
>>>  {
>>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
>>>  };
>>>  
>>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
>>> -    .read = fw_cfg_data_mem_read,
>>> +    .read = fw_cfg_data_read,
>>>      .write = fw_cfg_data_mem_write,
>>>      .endianness = DEVICE_BIG_ENDIAN,
>>>      .valid = {
>>> diff --git a/trace-events b/trace-events
>>> index 72136b9..5073040 100644
>>> --- a/trace-events
>>> +++ b/trace-events
>>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
>>>  
>>>  # hw/nvram/fw_cfg.c
>>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
>>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
>>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
>>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
>>>  
>>>  # hw/block/hd-geometry.c
>>>
>>

next prev parent reply	other threads:[~2015-11-03 21:35 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select Gabriel L. Somlo
2015-11-03  9:51   ` Laszlo Ersek
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 3/5] fw_cfg: remove offset argument from callback prototype Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method Gabriel L. Somlo
2015-11-03 10:53   ` Laszlo Ersek
2015-11-03 17:55     ` Gabriel L. Somlo
2015-11-03 21:35       ` Laszlo Ersek [this message]
2015-11-03 21:44         ` Gabriel L. Somlo
2015-11-03 22:03         ` Gabriel L. Somlo
2015-11-04 13:01           ` Laszlo Ersek
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method Gabriel L. Somlo
2015-11-03 11:11   ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=563928A8.5030907@redhat.com \
    --to=lersek@redhat.com \
    --cc=jordan.l.justen@intel.com \
    --cc=kraxel@redhat.com \
    --cc=markmb@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=somlo@cmu.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).