Message-ID: <2838d716b08c78ed24fdd3fe392e21222ee70067.camel@kernel.crashing.org>
Subject: VFIO (PCI) and write combine mapping of BARs
From: Benjamin Herrenschmidt
To: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, alex.williamson@redhat.com, osamaabb@amazon.com, linux-pci@vger.kernel.org, Clint Sbisa
Date: Fri, 14 Jul 2023 12:32:49 +1000
X-Mailing-List: linux-pci@vger.kernel.org

Hi Folks!

I'd like to revive an old discussion, as we (Amazon Linux) have been getting requests for it. What's the best interface to provide the option of write-combine mmaps of BARs via VFIO?

The problem isn't so much the low-level implementation (we just have to play with the pgprot); the question is more about what API to present to control this.

One trivial way would be to have an ioctl that sets a flag for a given region/BAR, causing subsequent mmaps to use write-combine. We would have to keep a bitmap for the "legacy" regions, and use a flag in struct vfio_pci_region for the others.

One potentially better way is to make it strictly an attribute of vfio_pci_region, along with an ioctl that creates a "subregion". The idea here is that we would have an ioctl to dynamically create a region from an existing region, representing a subset of the original region (typically a BAR), with potentially different attributes (or we keep the attribute get/set separate).

I like the latter more because it makes it easier to specify that portions of a BAR need different attributes, without causing state/race issues between setting the attribute and mmap. It would also enable attributes other than write-combine if/when the need arises.

Any better ideas? Thoughts? Objections?

This is still quite specific to PCI, but so is the entire regions mechanism, so I don't see an easy path to something more generic at this stage.

Cheers,
Ben.
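
To make the "subregion" option a bit more concrete, here is a rough user-space model of the bookkeeping it implies. It is only a sketch, not kernel code and not an existing VFIO interface: every name in it (sketch_region, sketch_create_subregion, SKETCH_REGION_ATTR_WC and so on) is hypothetical and made up for illustration. The point it tries to show is that the attribute is fixed when the subregion is created, so there is no separate "set flag, then mmap" window to race on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute bit: map this region write-combined. */
#define SKETCH_REGION_ATTR_WC (1u << 0)
#define SKETCH_MAX_REGIONS    16

/* Hypothetical model of a region: a window into a parent region. */
struct sketch_region {
    int      in_use;
    uint32_t parent;  /* parent index; a BAR is its own parent */
    uint64_t offset;  /* offset within the parent */
    uint64_t size;
    uint32_t attrs;   /* fixed at creation time */
};

struct sketch_device {
    struct sketch_region regions[SKETCH_MAX_REGIONS];
};

/* Register a whole BAR at a fixed index with default attributes. */
int sketch_add_bar(struct sketch_device *dev, uint32_t index, uint64_t size)
{
    if (index >= SKETCH_MAX_REGIONS || size == 0 || dev->regions[index].in_use)
        return -1;
    dev->regions[index] = (struct sketch_region){
        .in_use = 1, .parent = index, .offset = 0, .size = size, .attrs = 0,
    };
    return (int)index;
}

/*
 * Create a new region covering [offset, offset + size) of an existing one,
 * with its own attributes. Returns the new region index, or -1 on error.
 */
int sketch_create_subregion(struct sketch_device *dev, uint32_t parent,
                            uint64_t offset, uint64_t size, uint32_t attrs)
{
    const struct sketch_region *p;
    size_t i;

    if (parent >= SKETCH_MAX_REGIONS || !dev->regions[parent].in_use)
        return -1;
    p = &dev->regions[parent];

    /* Overflow-safe containment check against the parent window. */
    if (size == 0 || offset > p->size || size > p->size - offset)
        return -1;

    for (i = 0; i < SKETCH_MAX_REGIONS; i++) {
        if (!dev->regions[i].in_use) {
            dev->regions[i] = (struct sketch_region){
                .in_use = 1, .parent = parent, .offset = offset,
                .size = size, .attrs = attrs,
            };
            return (int)i;
        }
    }
    return -1; /* no free slot */
}

/* Example: carve a 4 KiB write-combine window out of a 64 KiB BAR 0. */
int sketch_demo(void)
{
    struct sketch_device dev = {0};
    int wc;

    assert(sketch_add_bar(&dev, 0, 0x10000) == 0);
    wc = sketch_create_subregion(&dev, 0, 0x8000, 0x1000,
                                 SKETCH_REGION_ATTR_WC);
    assert(wc > 0);
    assert(dev.regions[0].attrs == 0); /* the BAR itself is unchanged */
    return wc;
}
```

In a real implementation, an mmap of the new region index would presumably pick its pgprot from the region's attributes, while mmaps of the original BAR keep their current behaviour. How region indices are allocated and exposed to user space is left open here.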