From: "Michael S. Tsirkin" <mst@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Gavin Shan <gshan@redhat.com>,
qemu-arm@nongnu.org, qemu-devel@nongnu.org, peterx@redhat.com,
alex@shazbot.org, richard.henderson@linaro.org,
berrange@redhat.com, philmd@oss.qualcomm.com, philmd@mailo.com,
david@kernel.org, clg@redhat.com, pbonzini@redhat.com,
phrdina@redhat.com, jugraham@redhat.com,
liugang24219@sangfor.com.cn, dinghui@sangfor.com.cn,
shan.gavin@gmail.com
Subject: Re: [PATCH v3 1/2] system/memory: Use qemu_ram_{copy, move}() in ram device region accessors
Date: Fri, 26 Jun 2026 06:48:35 -0400
Message-ID: <20260626063730-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAFEAcA8UjtXgmPC27fOKMwbaWs2piwGYudtwCNWQHuu3E_7A-A@mail.gmail.com>
On Thu, Jun 25, 2026 at 07:40:29PM +0100, Peter Maydell wrote:
> On Thu, 25 Jun 2026 at 17:47, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Jun 25, 2026 at 04:23:47PM +0100, Peter Maydell wrote:
> > > On Thu, 25 Jun 2026 at 15:52, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > I think there is exactly 1 kinda reasonable case. A 2 byte read/write at
> > > > offset 0x1 within a dword. This maps nicely to even classical PCI byte
> > > > enable mechanism and so yes it works if your CPU can initiate these
> > > > things, and it's atomic.
> > > >
> > > > I tried reading LEDCTL on e1000e:
> > > >
> > > > byte @ 0xe00: 0x64
> > > > byte @ 0xe01: 0x2a
> > > > byte @ 0xe02: 0x00
> > > > byte @ 0xe03: 0x00
> > > > word @ 0xe00: 0x2a64
> > > > word @ 0xe01: 0x002a
> > > >
> > > > Works fine.
> > >
> > > The e1000e datasheet actually documents what it does in this
> > > case (slightly surprising, since hardware engineers love to
> > > leave this kind of corner case undocumented):
> > >
> > > # For registers that should be accessed as 32-bit double words,
> > > # partial writes (less than a 32-bit double word) does not take
> > > # effect (such as, the write is ignored).
> > > # Partial reads
> > > # return all 32 bits of data regardless of the byte enables.
> > > #
> > > # Note: Partial reads to clear-by-read registers (such as, ICR)
> > > # can have unexpected results since all 32 bits are actually read
> > > # regardless of the byte enables. Partial reads should not be done.
> > >
> > > So for this specific device that access is out-of-spec.
> >
> > You mean that access to clear by read should not be done, right?
>
> The datasheet is ambiguous about whether "Partial reads should
> not be done" is meant to apply generally or only to clear-by-read
> registers.

It's in a note about clear-by-read registers, so it seems unambiguous to me.

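To spell out the byte-enable case from the LEDCTL experiment quoted above, here is a rough sketch. The helper names are hypothetical (they are not QEMU or kernel APIs), the mask is a logical "lane enabled" mask (the C/BE# signals on classical PCI are active-low on the wire), and the lane extraction assumes little-endian byte lanes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: logical byte-enable mask (bit n = byte lane n) for an
 * access of `size` bytes at `addr`. Returns 0 if the access does not fit
 * inside a single aligned dword, i.e. it cannot be one byte-enabled cycle.
 */
static unsigned dword_byte_enables(uint64_t addr, unsigned size)
{
    unsigned lane = addr & 3;

    if (size == 0 || lane + size > 4) {
        return 0;
    }
    return ((1u << size) - 1) << lane;
}

/*
 * Hypothetical helper: the value a sub-dword read would return if the
 * device supplied `dword` and only the enabled lanes were kept
 * (little-endian lane order assumed).
 */
static uint32_t dword_extract(uint32_t dword, uint64_t addr, unsigned size)
{
    unsigned lane = addr & 3;
    uint32_t mask = size >= 4 ? UINT32_MAX : (1u << (8 * size)) - 1;

    return (dword >> (8 * lane)) & mask;
}
```

With the dword 0x00002a64 implied by the byte reads above, a 2-byte read at 0xe01 enables lanes 1 and 2 (mask 0x6) and extracts 0x002a, which matches what I saw. A 2-byte read at 0xe03 would cross the dword boundary, so the helper returns 0 for it.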
> > > I guess what I'm wondering is: can we just have code
> > > that does an aligned exact-width access in the 1/2/4/8
> > > byte aligned case, and the host's best approximation to
> > > an unaligned exact-width access for the 2/4/8 byte
> > > unaligned case?
> >
> > That's my idea, too.
> >
> > > (so on sparc you get multiple smaller
> > > accesses, and on most archs including x86 and arm you
> > > get an unaligned load).
> >
> > I thought unaligned load from uncacheable on arm is
> > also a fault?
>
> For Arm the distinction is not cacheable/uncacheable
> but Normal vs Device. (Device is essentially for things
> which are not RAM; Normal is for RAM and RAM-like things,
> and includes all of Normal Non-cacheable, Normal WT-Cacheable
> and Normal WB-Cacheable.) Things mapped as Normal memory
> don't generate unaligned faults (unless the guest turned them
> on deliberately). For Device memory, it is IMPLEMENTATION
> DEFINED whether you get an alignment fault or not if you
> map something as Device that could have handled unaligned
> accesses if you had mapped it as Normal.

Right. Sorry I was unclear. I was referring to pgprot_noncached in Linux,
which on arm64 gives a Device mapping (Device-nGnRnE), not Normal
Non-cacheable.

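For concreteness, the dispatch discussed above (an exact-width access when aligned, the host's best approximation otherwise) might look roughly like the sketch below. This is not the code from the patch: sketch_region_read and the *_alias typedefs are hypothetical names, only the read side is shown, and the may_alias attribute is a GCC/Clang extension. Note that memcpy() does not guarantee a single access; compilers typically lower a fixed-size memcpy() to one unaligned load where the host allows it and to several smaller accesses on strict-alignment hosts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension: these types may alias the underlying bytes. */
typedef uint16_t __attribute__((__may_alias__)) u16_alias;
typedef uint32_t __attribute__((__may_alias__)) u32_alias;
typedef uint64_t __attribute__((__may_alias__)) u64_alias;

/*
 * Hypothetical helper, not a QEMU API: read `size` bytes at base + offset
 * in host byte order. Returns false for unsupported sizes.
 */
static bool sketch_region_read(const void *base, size_t offset,
                               unsigned size, uint64_t *val)
{
    const uint8_t *p = (const uint8_t *)base + offset;
    bool aligned = ((uintptr_t)p & (size - 1)) == 0;

    switch (size) {
    case 1:
        *val = *(const volatile uint8_t *)p;
        return true;
    case 2:
        if (aligned) {
            /* Aligned: a volatile access of exactly this width. */
            *val = *(const volatile u16_alias *)p;
        } else {
            /* Unaligned: let the compiler pick what the host can do. */
            uint16_t v;
            memcpy(&v, p, sizeof(v));
            *val = v;
        }
        return true;
    case 4:
        if (aligned) {
            *val = *(const volatile u32_alias *)p;
        } else {
            uint32_t v;
            memcpy(&v, p, sizeof(v));
            *val = v;
        }
        return true;
    case 8:
        if (aligned) {
            *val = *(const volatile u64_alias *)p;
        } else {
            uint64_t v;
            memcpy(&v, p, sizeof(v));
            *val = v;
        }
        return true;
    default:
        return false;
    }
}
```

The sketch does nothing about the fault question: if the host mapping is Device on Arm, the unaligned branch can still fault, which is exactly the case where we must not crash QEMU.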
> > > That would mean the guest could
> > > potentially provoke a fault on the load/store on an
> > > access to a passthrough device, but if you give the
> > > guest passthrough access it can very likely provoke
> > > a fault anyway, depending on exactly what the device is.
> > >
> > > I think the most likely reason for an unaligned access
> > > in this codepath is "it's actually RAM, either really
> > > host RAM or else something memory-like in a BAR", and
> > > either way if the guest does a 4-byte unaligned access
> > > then doing a 4-byte unaligned access seems better than
> > > second-guessing it, even on non-x86.
>
> > Right though remember: whether it's RAM doesn't matter. What matters is
> > how we map it. qemu might fault because it maps NC but guest maps
> > cacheable and it's ok.
>
> If QEMU and the guest disagree about the memory attributes
> on Arm then we have already lost, because the architecture
> says that memory attribute mismatches result in a variety of
> undesirable effects including things like loss of cache coherency
> (i.e. read and writes via QEMU's NC mapping disagree with ones
> via the guest's cacheable mapping because the latter are hitting
> in the cache and the former are bypassing it).

But we don't know how the guest mapped it, at least not without
VFIO's help. I guess DMA into a device BAR is often just broken
on Arm unless the guest happens to use the same attributes as
those set up by vfio.

> > But, all this in theory. At a high level, I personally think going with
> > what you propose as a 1st approximation is entirely reasonable, except
> > for one thing: we really should not crash qemu, since access can be from
> > guest userspace.
>
> You can't prevent faults entirely, though -- if the device
> being mapped has e.g. behaviour that says "unaligned accesses
> will fault" and then the x86 guest does an unaligned access,
> then the device will trigger a fault,

I don't get this part. How can a device "trigger a fault" on an x86 CPU?
On non-x86, yes, in theory software could want the fault.
Hopefully it does not.

> and the fault is what
> you want because it's what the guest would see on real h/w.
> Unfortunately we don't have a convenient way to feed the
> fault back to the guest.

I haven't checked, but I'd guess we could add that to KVM if we
wanted to. It would be a lot of work, though, so I would wait for an
actual issue before spending engineering time on this.

> At some level if you pass through
> host hardware you're relying on the guest to not do totally
> stupid things.
Indeed.
> thanks
> -- PMM