From: Catalin Marinas <catalin.marinas@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Marc Zyngier <maz@kernel.org>,
	ankita@nvidia.com,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	oliver.upton@linux.dev, suzuki.poulose@arm.com,
	yuzenghui@huawei.com, will@kernel.org, ardb@kernel.org,
	akpm@linux-foundation.org, gshan@redhat.com, aniketa@nvidia.com,
	cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com, mochs@nvidia.com,
	kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	lpieralisi@kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
Date: Wed, 6 Dec 2023 16:31:48 +0000
Message-ID: <ZXCh9N2xp0efHcpE@arm.com>
In-Reply-To: <20231206151603.GR2692119@nvidia.com>

On Wed, Dec 06, 2023 at 11:16:03AM -0400, Jason Gunthorpe wrote:
> On Wed, Dec 06, 2023 at 12:14:18PM +0000, Catalin Marinas wrote:
> > We could do with a pgprot_maybewritecombine() or
> > pgprot_writecombinenospec() (similar to Jason's idea but without
> > changing the semantics of pgprot_device()). For the user mapping on
> > arm64 this would be Device (even _GRE) since it can't disable
> > speculation but stage 2 would leave the decision to the guest since the
> > speculative loads aren't much different from committed loads done
> > wrongly.
> 
> This would be fine, as would a VMA flag. Please pick one :)
> 
> I think a VMA flag is simpler than messing with pgprot.

I guess one could write a patch and see how it goes ;).

> > If we want the VMM to drive this entirely, we could add a new mmap()
> > flag like MAP_WRITECOMBINE or PROT_WRITECOMBINE. They do feel a bit
> 
> As in the other thread, we cannot unconditionally map NORMAL_NC into
> the VMM.

I'm not suggesting this, but rather that the VMM map portions of the BAR
with either Device or Normal-NC, concatenate them (MAP_FIXED) and pass
this range as a memory slot (or multiple slots if a single slot doesn't
allow multiple vmas).
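
A minimal user-space sketch of just the layout part of this (reserve one
contiguous range, then MAP_FIXED each piece into it), under several
assumptions: struct bar_piece, map_pieces_contiguous() and
demo_contiguous_mapping() are made-up names, a temporary file stands in
for the VFIO BAR, and nothing here selects Device vs Normal-NC since
there is no mmap() flag for that today. Handing the resulting range to
KVM as a memory slot is not shown.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical description of one sub-range of a BAR (not a VFIO type). */
struct bar_piece {
	int fd;       /* fd to mmap; a plain file here, a device fd in a VMM */
	off_t offset; /* page-aligned offset of the piece within fd */
	size_t len;   /* page-aligned length of the piece */
};

/*
 * Reserve one contiguous VA range, then place each piece into it back to
 * back with MAP_FIXED. Each piece ends up as its own vma. Returns the base
 * address, or MAP_FAILED on error.
 */
static void *map_pieces_contiguous(const struct bar_piece *pieces, size_t n)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t total = 0, pos = 0;
	uint8_t *base;

	for (size_t i = 0; i < n; i++) {
		assert(pieces[i].len > 0 && pieces[i].len % page == 0);
		total += pieces[i].len;
	}

	/* PROT_NONE reservation: claims address space only. */
	base = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return MAP_FAILED;

	for (size_t i = 0; i < n; i++) {
		/* MAP_FIXED replaces the reserved pages at exactly base + pos. */
		void *p = mmap(base + pos, pieces[i].len, PROT_READ | PROT_WRITE,
			       MAP_SHARED | MAP_FIXED, pieces[i].fd,
			       pieces[i].offset);
		if (p == MAP_FAILED) {
			munmap(base, total);
			return MAP_FAILED;
		}
		pos += pieces[i].len;
	}
	return base;
}

/*
 * Demo: a two-page temp file plays the BAR, split into two pieces. A write
 * through the second piece must land at file offset one page. Returns 0 on
 * success, -1 on any failure.
 */
static int demo_contiguous_mapping(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char path[] = "/tmp/bar-demo-XXXXXX";
	int fd = mkstemp(path);
	uint8_t *base, byte = 0;
	int ok;

	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, 2 * page) != 0) {
		close(fd);
		return -1;
	}

	struct bar_piece pieces[] = {
		{ fd, 0, (size_t)page },
		{ fd, page, (size_t)page },
	};
	base = map_pieces_contiguous(pieces, 2);
	if (base == MAP_FAILED) {
		close(fd);
		return -1;
	}

	base[page] = 0xBB; /* first byte of the second piece */
	ok = pread(fd, &byte, 1, page) == 1 && byte == 0xBB;

	munmap(base, 2 * (size_t)page);
	close(fd);
	return ok ? 0 : -1;
}
```

Reserving the whole range with PROT_NONE first means the MAP_FIXED calls
only ever replace pages this code already owns, rather than clobbering
an unrelated mapping.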

> > The latter has some benefits for DPDK but it's a lot more involved
> > with
> 
> DPDK WC support will be solved with some VFIO-only change if anyone
> ever cares to make it, if that is what you mean.

Yeah. One argument I've heard in private and public discussions is
that KVM device pass-through shouldn't be different from the DPDK
case. So fixing that would cover KVM as well, though we'd need
additional logic in the VMM. BenH had a short talk at Plumbers around
this - https://youtu.be/QLvN3KXCn0k?t=7010. There was some statement in
there that for x86, the guests are allowed to do WC without other KVM
restrictions (not sure whether that's the case, I'm not familiar with it).

> > having to add device-specific knowledge into the VMM. The VMM would also
> > have to present the whole BAR contiguously to the guest even if there
> > are different mapping attributes within the range. So a lot of MAP_FIXED
> > uses. I'd rather leaving this decision with the guest than the VMM, it
> > looks like more hassle to create those mappings. The VMM or the VFIO
> > could only state write-combine and speculation allowed.
> 
> We talked about this already, the guest must decide, the VMM doesn't
> have the information to pre-predict which pages the guest will want to
> use WC on.

Are the Device/Normal offsets within a BAR fixed and documented in e.g.
the spec, or is this something the guest configures via some MMIO?

-- 
Catalin
