Re: [RFC PATCH 0/3] hw/pflash_cfi01: Reduce memory consumption when flash image is smaller than region

From: "Philippe Mathieu-Daudé" <philmd@redhat.com>
To: David Edmondson <david.edmondson@oracle.com>, qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [RFC PATCH 0/3] hw/pflash_cfi01: Reduce memory consumption when flash image is smaller than region
Date: Tue, 16 Feb 2021 16:03:05 +0100
Message-ID: <df4db595-c2db-4fa8-0a4b-1403117dcc76@redhat.com>
In-Reply-To: <20210216142721.1985543-1-david.edmondson@oracle.com>

On 2/16/21 3:27 PM, David Edmondson wrote:
> As described in
> https://lore.kernel.org/r/20201116104216.439650-1-david.edmondson@oracle.com,
> I'd like to reduce the amount of memory consumed by QEMU mapping UEFI
> images on aarch64.
> 
> To recap:
> 
>> Currently ARM UEFI images are typically built as 2MB/768kB flash
>> images for code and variables respectively. These images are both
>> then padded out to 64MB before being loaded by QEMU.
>>
>> Because the images are 64MB each, QEMU allocates 128MB of memory to
>> read them, and then proceeds to read all 128MB from disk (dirtying
>> the memory). Of this 128MB less than 3MB is useful - the rest is
>> zero padding.
>>
>> On a machine with 100 VMs this wastes over 12GB of memory.
> 
> There were objections to my previous patch because it changed the size
> of the regions reported to the guest via the memory map (the reported
> size depended on the size of the image).
> 
> This is a smaller patch which only helps with read-only flash images,
> as it does so by changing the memory region that covers the entire
> region to be IO rather than RAM, and loads the flash image into a
> smaller sub-region that is the more traditional mixed IO/ROMD type.
> 
> All read/write operations to areas outside of the underlying block
> device are handled directly (reads return 0, writes fail (which is
> okay, because this path only supports read-only devices)).
> 
> This reduces the memory consumption for the read-only AAVMF code image
> from 64MB to around 2MB (presuming that the UEFI image is adjusted
> accordingly). It does nothing to improve the memory consumption caused
> by the read-write AAVMF vars image.

So for each VM this changes from 64 + 64 to 2 + 64 MiB.

100 VMs now use 6.5GB instead of 12.5GB. Quite an improvement already :)
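
To make the arithmetic explicit, here is a small sketch of the per-host totals. It assumes the sizes quoted above (64 MiB code + 64 MiB vars before, roughly 2 MiB code + 64 MiB vars after) and 100 VMs; the helper name total_bytes is only for illustration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def total_bytes(code_mib, vars_mib, vms):
    """Memory used by the two flash regions across all VMs, in bytes."""
    return (code_mib + vars_mib) * MIB * vms


# Figures taken from the thread: 100 VMs, code image shrinks from 64 to ~2 MiB,
# vars image stays at 64 MiB because it is read-write.
before = total_bytes(64, 64, 100)
after = total_bytes(2, 64, 100)

print(f"before: {before / GIB:.2f} GiB")
print(f"after:  {after / GIB:.2f} GiB")
print(f"saved:  {(before - after) / GIB:.2f} GiB")
```

The remaining cost is dominated by the vars image, which this series does not touch.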

> There was a suggestion in a previous thread that perhaps the pflash
> driver could be re-worked to use the block IO interfaces to access the
> underlying device "on demand" rather than reading in the entire image
> at startup (at least, that's how I understood the comment).
> 
> I looked at implementing this and struggled to get it to work for all
> of the required use cases. Specifically, there are several code paths
> that expect to retrieve a pointer to the flat memory image of the
> pflash device and manipulate it directly (examples include the Malta
> board and encrypted memory support on x86), or write the entire image
> to storage (after migration).

IIUC these are specific uses when the machine is paused. For Malta we
can map a ROM instead.

I don't know about encrypted x86 machines.

> My implementation was based around mapping the flash region only for
> IO, which meant that every read or write had to be handled directly by
> the pflash driver (there was no ROMD style operation), which also made
> booting an aarch64 VM noticeably slower - getting through the firmware
> went from under 1 second to around 10 seconds.
> 
> Improving the writeable device support requires some more general
> infrastructure, I think, but I'm not familiar with everything that
> QEMU currently provides, and would be very happy to learn otherwise.

I am not a block expert, but I wonder if something like this could
be used:

- create a raw (parent) block image of 64MiB

- add a raw (child) block with your 768kB of VARS file

- add a null-co (child) block of 63MiB + 256KiB

- pass the parent block to the pflash device
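
As a rough model of the layout I have in mind (not QEMU code, and I have not checked which block driver could actually concatenate two children like this), the padding size and the expected read behaviour would be something like the sketch below. It assumes the padding child reads back as zeros; the read() helper and the constant names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB

REGION_SIZE = 64 * MIB   # size the pflash device expects
VARS_SIZE = 768 * KIB    # size of the real VARS file

# Size of the padding child that fills the rest of the region.
PAD_SIZE = REGION_SIZE - VARS_SIZE


def read(vars_data: bytes, offset: int, length: int) -> bytes:
    """Model a read from the parent: VARS contents first, zeros afterwards."""
    if offset < 0 or length < 0 or offset + length > REGION_SIZE:
        raise ValueError("read outside the 64 MiB region")
    end = offset + length
    # Part of the request served by the VARS child (empty if past its end).
    from_vars = vars_data[offset:min(end, len(vars_data))]
    # Whatever is left falls into the padding child and reads as zeros.
    return from_vars + bytes(length - len(from_vars))
```

This only models reads. What should happen to guest writes that land in the padding area is an open question in this scheme.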

Regards,

Phil.

Thread overview: 9+ messages
2021-02-16 14:27 [RFC PATCH 0/3] hw/pflash_cfi01: Reduce memory consumption when flash image is smaller than region David Edmondson
2021-02-16 14:27 ` [RFC PATCH 1/3] hw/pflash_cfi*: Replace DPRINTF with trace events David Edmondson
2021-02-16 14:27 ` [RFC PATCH 2/3] hw/pflash_cfi01: Correct the type of PFlashCFI01.ro David Edmondson
2021-02-16 14:27 ` [RFC PATCH 3/3] hw/pflash_cfi01: Allow read-only devices to have a smaller backing device David Edmondson
2021-02-16 15:03 ` Philippe Mathieu-Daudé [this message]
2021-02-16 15:22   ` [RFC PATCH 0/3] hw/pflash_cfi01: Reduce memory consumption when flash image is smaller than region David Edmondson
2021-02-16 15:44     ` Philippe Mathieu-Daudé
2021-02-16 15:53       ` David Edmondson
2021-02-18 10:34         ` David Edmondson