Re: [PATCH RFCv1] virtio: Inherit max bounce buffer size from bus parent if possible

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Xu <peterx@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Gavin Shan" <gshan@redhat.com>,
	"Pavel Hrdina" <phrdina@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel@nongnu.org, qemu-arm@nongnu.org, jugraham@redhat.com,
	shan.gavin@gmail.com, "Alex Williamson" <alex@shazbot.org>,
	"David Hildenbrand" <david@kernel.org>
Subject: Re: [PATCH RFCv1] virtio: Inherit max bounce buffer size from bus parent if possible
Date: Thu, 11 Jun 2026 14:20:54 -0400	[thread overview]
Message-ID: <air8hrlhrkaoJ4wi@x1.local> (raw)
In-Reply-To: <20260611130037-mutt-send-email-mst@kernel.org>

On Thu, Jun 11, 2026 at 01:02:34PM -0400, Michael S. Tsirkin wrote:
> On Thu, Jun 11, 2026 at 05:53:54PM +0100, Peter Maydell wrote:
> > On Thu, 11 Jun 2026 at 17:42, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Jun 11, 2026 at 05:16:09PM +0100, Peter Maydell wrote:
> > > > On Thu, 11 Jun 2026 at 16:57, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > Then I'd like to see an example where we have an actual good reason
> > > > > to do execute arbitrary width accesses on device RAM when the
> > > > > driver does not execute them, please.
> > > >
> > > > That's the case we started with, with this GPU where the
> > > > virtio data structures are in this PCI BAR. We want the
> > > > virtio backend to have the freedom to say "I'm just going
> > > > to assume the ring buffer etc is in RAM, and I don't need to
> > > > care about carefully ensuring that I only do word accesses".
> > >
> > > virtio backend does not need to assume anything. any spec
> > > compliant guest guarantees it. any address given to
> > > a virtio device is ram.
> > 
> > OK, but how do we tell if the mmap() PCI BAR we have is RAM
> > or not ? Some of them are, some of them aren't...
> > 
> > -- PMM
> 
> We don't care? If guest asked a virtio device to access
> memory then that memory better support accesses guest
> requested, and on virtio that means any width.

Yes that's also my understanding that QEMU shouldn't care, if on bare metal
if it is a grey area with undefined behavior, then QEMU should also be fine
to make it undefined.

Only one (IMHO, very slight..) concern is, if such "undefined behavior"
operation may crash QEMU rather than causing a sigbus like what normally
would happen on a bare metal.  I'll put that discussion at last since I
don't know if it's that important.

So taking that e1000 bug into account, looks like we have more of such user
that wants explicit control over the memory operations, likely with
attibutes:

- Aligned only

- No vectored inst

- No possible duplication (perhaps it means, only use atomic access??? per
  PeterM's explanation in another email)

I wonder if we should do this by default to all IOs, I think it might have
an unwanted impact on general perf.  One idea is we can introduce a new bit
in MemTxAttrs: say, mmio_strict, which will follow above stricter rule of
doing IO.

Then for ram_device, when doing memory access we should also use the same
set, hence something like:

  if (memory_access_is_direct(mr, ...)) {
    if (MemTxAttrs.mmio_strict || memory_region_is_ram_device(mr)) {
        memmove_strict_mmio(); // or memmove_no_vector(), or ...
    }
  }

I also want to bring us to the same page on differenciating two things:

- about direct access definition: so far, I want to define this almost as
  "it is directly accessible from host virtual address space".  It means
  ram_device definitely falls into this category, so it's a sub-category
  only but enforces strict mmio as above

- bounce buffer: I want to make sure we're on the same page this is
  something totally solving different problem of what we're looking at for
  ram_device.  Essentially this is only needed if mem is not "direct
  access".  Otherwise we shouldn't need it (including ram_device).

I'm not sure if above sounds reasonable.  The hope is with that we should
fix both this GPU and e1000 issues (e1000, and maybe other places, needs to
start passing MemTxAttrs.mmio_strict, though, if we want to keep the
default to be still fast-path to use memcpy()/memmove()?).

Thanks,

=========================

Two cases here on the concern:

(1) if fully emulated device, non-issue afaiu because whatever DMA does
(with vectored ops) will only apply to QEMU process (like a bounce
buffer..) so it won't crash,

(2) if it's a VFIO device (not the GPU case, but when like realtek and some
guest driver registers it as a DMA target), logically the guest should
receive a bus error, but for QEMU's case it may crash QEMU with SIGBUS:
sigbus_handler() doesn't process anything except memory failures so far, it
seems.  The right thing might be that we forward this sigbus to guest, but
I didn't check further.

But I don't know if it would really happen, better verify it, and even if
it will happen I still think it's a very niche security use case to cover
so far.  Doesn't seem to be a blocker to fix a real perf problem first with
RAM-like GPU devices.

-- 
Peter Xu

next prev parent reply	other threads:[~2026-06-11 18:21 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08  0:18 [PATCH RFCv1] virtio: Inherit max bounce buffer size from bus parent if possible Gavin Shan
2026-06-08  8:55 ` Daniel P. Berrangé
2026-06-08 11:11   ` Gavin Shan
2026-06-08 11:38     ` Daniel P. Berrangé
2026-06-09  2:08       ` Gavin Shan
2026-06-09 16:25         ` Peter Xu
2026-06-10  0:32           ` Gavin Shan
2026-06-10  9:54     ` Pavel Hrdina
2026-06-10 10:55       ` Gavin Shan
2026-06-10 12:12         ` Michael S. Tsirkin
2026-06-10 12:19           ` Gavin Shan
2026-06-10 12:27             ` Michael S. Tsirkin
2026-06-10 13:00               ` Gavin Shan
2026-06-10 13:54                 ` Gavin Shan
2026-06-10 14:06                   ` Michael S. Tsirkin
2026-06-10 15:36                     ` Peter Xu
2026-06-10 16:11                       ` Peter Maydell
2026-06-10 16:19                         ` Michael S. Tsirkin
2026-06-10 19:10                           ` Peter Xu
2026-06-10 21:03                             ` Michael S. Tsirkin
2026-06-10 21:27                               ` Peter Xu
2026-06-10 21:44                                 ` Michael S. Tsirkin
2026-06-10 16:18                       ` Michael S. Tsirkin
2026-06-11  4:33                         ` Gavin Shan
2026-06-11  5:31                           ` Michael S. Tsirkin
2026-06-11  6:28                             ` Gavin Shan
2026-06-11  6:34                               ` Michael S. Tsirkin
2026-06-11 12:33                                 ` Gavin Shan
2026-06-11 12:48                                   ` Peter Maydell
2026-06-11 14:10                                     ` Michael S. Tsirkin
2026-06-11 14:55                                       ` Peter Maydell
2026-06-11 15:05                                         ` Michael S. Tsirkin
2026-06-11 15:25                                           ` Michael S. Tsirkin
2026-06-11 15:29                                           ` Peter Maydell
2026-06-11 15:57                                             ` Michael S. Tsirkin
2026-06-11 16:16                                               ` Peter Maydell
2026-06-11 16:42                                                 ` Michael S. Tsirkin
2026-06-11 16:53                                                   ` Peter Maydell
2026-06-11 17:02                                                     ` Michael S. Tsirkin
2026-06-11 18:20                                                       ` Peter Xu [this message]
2026-06-11 20:52                                                         ` Michael S. Tsirkin
2026-06-11 21:20                                                           ` Peter Xu
2026-06-11 21:59                                                             ` Michael S. Tsirkin
2026-06-11 21:15                                                       ` Michael S. Tsirkin
2026-06-11 16:13                                           ` Michael S. Tsirkin
2026-06-11  6:51                               ` Michael S. Tsirkin
2026-06-10 12:23         ` Pavel Hrdina
2026-06-10 14:04           ` Gavin Shan
2026-06-10 14:08             ` Michael S. Tsirkin
2026-06-10  9:49 ` Michael S. Tsirkin
2026-06-10 18:30   ` Stefan Hajnoczi
2026-06-10 21:00     ` Michael S. Tsirkin
2026-06-11 14:20       ` Stefan Hajnoczi
2026-06-11 14:45         ` Michael S. Tsirkin
2026-06-11 15:04           ` Peter Maydell
2026-06-11 15:09             ` Michael S. Tsirkin
2026-06-11 18:37               ` Stefan Hajnoczi
2026-06-11 20:54                 ` Michael S. Tsirkin
2026-06-11  1:19     ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=air8hrlhrkaoJ4wi@x1.local \
    --to=peterx@redhat.com \
    --cc=alex@shazbot.org \
    --cc=berrange@redhat.com \
    --cc=david@kernel.org \
    --cc=gshan@redhat.com \
    --cc=jugraham@redhat.com \
    --cc=mst@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=phrdina@redhat.com \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shan.gavin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.