* Re: [PATCH v6 05/16] dt-bindings: net: wireless: describe the ath12k PCI module
From: Krzysztof Kozlowski @ 2024-03-27 18:21 UTC (permalink / raw)
To: Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Kalle Valo,
Bjorn Andersson, Konrad Dybcio, Liam Girdwood, Mark Brown,
Catalin Marinas, Will Deacon, Bjorn Helgaas, Saravana Kannan,
Geert Uytterhoeven, Arnd Bergmann, Neil Armstrong,
Marek Szyprowski, Alex Elder, Srini Kandagatla,
Greg Kroah-Hartman, Abel Vesa, Manivannan Sadhasivam,
Lukas Wunner, Dmitry Baryshkov
Cc: linux-bluetooth, netdev, devicetree, linux-kernel, linux-wireless,
linux-arm-msm, linux-arm-kernel, linux-pci, linux-pm,
Bartosz Golaszewski
In-Reply-To: <20240325131624.26023-6-brgl@bgdev.pl>
On 25/03/2024 14:16, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Add device-tree bindings for the ATH12K module found in the WCN7850
> package.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
> .../bindings/net/wireless/qcom,ath12k.yaml | 100 ++++++++++++++++++
> 1 file changed, 100 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/net/wireless/qcom,ath12k.yaml
>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Best regards,
Krzysztof
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* [PATCH v1 1/2] arm64: dts: ti: verdin-am62: mallow: fix GPIOs pinctrl
From: Francesco Dolcini @ 2024-03-27 18:28 UTC (permalink / raw)
To: Nishanth Menon, Vignesh Raghavendra, Tero Kristo, Rob Herring,
Krzysztof Kozlowski, Conor Dooley
Cc: Francesco Dolcini, linux-arm-kernel, devicetree, linux-kernel
In-Reply-To: <20240327182801.5997-1-francesco@dolcini.it>
From: Francesco Dolcini <francesco.dolcini@toradex.com>
Generic GPIOs pinctrl nodes are not correct, gpio[1-4] are into the MCU
domain and should be into &mcu_gpio0, gpio[5-8] were missing and are added
in this commit.
Fixes: 7698622fbcf4 ("arm64: dts: ti: Add verdin am62 mallow board")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
---
.../arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi
index 77b1beb638ad..cd81a606c435 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi
@@ -81,10 +81,10 @@ &epwm1 {
&main_gpio0 {
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_ctrl_sleep_moci>,
- <&pinctrl_gpio_1>,
- <&pinctrl_gpio_2>,
- <&pinctrl_gpio_3>,
- <&pinctrl_gpio_4>;
+ <&pinctrl_gpio_5>,
+ <&pinctrl_gpio_6>,
+ <&pinctrl_gpio_7>,
+ <&pinctrl_gpio_8>;
};
/* Verdin I2C_1 */
@@ -149,6 +149,14 @@ &main_uart1 {
status = "okay";
};
+&mcu_gpio0 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_gpio_1>,
+ <&pinctrl_gpio_2>,
+ <&pinctrl_gpio_3>,
+ <&pinctrl_gpio_4>;
+};
+
/* Verdin I2C_3_HDMI */
&mcu_i2c0 {
status = "okay";
--
2.39.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v1 0/2] arm64: dts: ti: k3-am625-verdin: fix GPIOs pinctrl
From: Francesco Dolcini @ 2024-03-27 18:27 UTC (permalink / raw)
To: Nishanth Menon, Vignesh Raghavendra, Tero Kristo, Rob Herring,
Krzysztof Kozlowski, Conor Dooley
Cc: Francesco Dolcini, linux-arm-kernel, devicetree, linux-kernel
From: Francesco Dolcini <francesco.dolcini@toradex.com>
Fix a couple of issues on the Verdin AM62 pinctrl affecting multiple
carrier boards.
The first patch fixes a mistake on the pinctrl of the verdin-am62 mallow
carrier board.
The second patch allows using the mPCIe/M.2 slot available on multiple
carrier boards with cards that use only the USB interface toward the host,
this is achieved with a GPIO hog, given that no PCIe interface exists.
Francesco Dolcini (2):
arm64: dts: ti: verdin-am62: mallow: fix GPIOs pinctrl
arm64: dts: ti: k3-am625-verdin: add PCIe reset gpio hog
.../boot/dts/ti/k3-am62-verdin-dahlia.dtsi | 8 ++++++-
.../arm64/boot/dts/ti/k3-am62-verdin-dev.dtsi | 8 ++++++-
.../boot/dts/ti/k3-am62-verdin-mallow.dtsi | 22 +++++++++++++++----
.../boot/dts/ti/k3-am62-verdin-yavia.dtsi | 8 ++++++-
arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi | 9 ++++++++
5 files changed, 48 insertions(+), 7 deletions(-)
--
2.39.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* [PATCH v1 2/2] arm64: dts: ti: k3-am625-verdin: add PCIe reset gpio hog
From: Francesco Dolcini @ 2024-03-27 18:28 UTC (permalink / raw)
To: Nishanth Menon, Vignesh Raghavendra, Tero Kristo, Rob Herring,
Krzysztof Kozlowski, Conor Dooley
Cc: Francesco Dolcini, linux-arm-kernel, devicetree, linux-kernel
In-Reply-To: <20240327182801.5997-1-francesco@dolcini.it>
From: Francesco Dolcini <francesco.dolcini@toradex.com>
Add a GPIO hog to release PCIe reset on the carrier board, this is
required to use M.2 or mPCIe cards.
Verdin AM62 does not have any PCIe interface, however the Verdin family
has PCIe and normally an M.2 or mPCIe slot is available in the carrier
board that can be used with cards that use only the USB interface toward
the host CPU, for example cellular network modem.
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
---
arch/arm64/boot/dts/ti/k3-am62-verdin-dahlia.dtsi | 8 +++++++-
arch/arm64/boot/dts/ti/k3-am62-verdin-dev.dtsi | 8 +++++++-
arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi | 8 +++++++-
arch/arm64/boot/dts/ti/k3-am62-verdin-yavia.dtsi | 8 +++++++-
arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi | 9 +++++++++
5 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin-dahlia.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin-dahlia.dtsi
index 6c4cec8728e4..e51fda1127ef 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-verdin-dahlia.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-verdin-dahlia.dtsi
@@ -160,7 +160,8 @@ &mcu_gpio0 {
pinctrl-0 = <&pinctrl_gpio_1>,
<&pinctrl_gpio_2>,
<&pinctrl_gpio_3>,
- <&pinctrl_gpio_4>;
+ <&pinctrl_gpio_4>,
+ <&pinctrl_pcie_1_reset>;
};
/* Verdin I2C_3_HDMI */
@@ -211,6 +212,11 @@ &verdin_gpio_keys {
status = "okay";
};
+/* Verdin PCIE_1_RESET# */
+&verdin_pcie_1_reset_hog {
+ status = "okay";
+};
+
/* Verdin UART_2 */
&wkup_uart0 {
status = "okay";
diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin-dev.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin-dev.dtsi
index be62648e7818..74eec1a1abca 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-verdin-dev.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-verdin-dev.dtsi
@@ -181,7 +181,8 @@ &mcu_gpio0 {
pinctrl-0 = <&pinctrl_gpio_1>,
<&pinctrl_gpio_2>,
<&pinctrl_gpio_3>,
- <&pinctrl_gpio_4>;
+ <&pinctrl_gpio_4>,
+ <&pinctrl_pcie_1_reset>;
};
/* Verdin I2C_3_HDMI */
@@ -232,6 +233,11 @@ &verdin_gpio_keys {
status = "okay";
};
+/* Verdin PCIE_1_RESET# */
+&verdin_pcie_1_reset_hog {
+ status = "okay";
+};
+
/* Verdin UART_2 */
&wkup_uart0 {
status = "okay";
diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi
index cd81a606c435..754216d8ac14 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-verdin-mallow.dtsi
@@ -154,7 +154,8 @@ &mcu_gpio0 {
pinctrl-0 = <&pinctrl_gpio_1>,
<&pinctrl_gpio_2>,
<&pinctrl_gpio_3>,
- <&pinctrl_gpio_4>;
+ <&pinctrl_gpio_4>,
+ <&pinctrl_pcie_1_reset>;
};
/* Verdin I2C_3_HDMI */
@@ -200,6 +201,11 @@ &verdin_gpio_keys {
status = "okay";
};
+/* Verdin PCIE_1_RESET# */
+&verdin_pcie_1_reset_hog {
+ status = "okay";
+};
+
/* Verdin UART_2 */
&wkup_uart0 {
status = "okay";
diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin-yavia.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin-yavia.dtsi
index 997dfafd27eb..7372d392ec8a 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-verdin-yavia.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-verdin-yavia.dtsi
@@ -159,7 +159,8 @@ &mcu_gpio0 {
pinctrl-0 = <&pinctrl_gpio_1>,
<&pinctrl_gpio_2>,
<&pinctrl_gpio_3>,
- <&pinctrl_gpio_4>;
+ <&pinctrl_gpio_4>,
+ <&pinctrl_pcie_1_reset>;
};
/* Verdin I2C_3_HDMI */
@@ -205,6 +206,11 @@ &verdin_gpio_keys {
status = "okay";
};
+/* Verdin PCIE_1_RESET# */
+&verdin_pcie_1_reset_hog {
+ status = "okay";
+};
+
/* Verdin UART_2 */
&wkup_uart0 {
status = "okay";
diff --git a/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi b/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
index e8d8857ad51f..e6c10d23d038 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
@@ -1407,6 +1407,15 @@ &mcu_gpio0 {
"",
"",
"";
+
+ verdin_pcie_1_reset_hog: pcie-1-reset-hog {
+ gpio-hog;
+ /* Verdin PCIE_1_RESET# (SODIMM 244) */
+ gpios = <0 GPIO_ACTIVE_LOW>;
+ line-name = "PCIE_1_RESET#";
+ output-low;
+ status = "disabled";
+ };
};
/* Verdin CAN_2 */
--
2.39.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* Re: [PATCH RESEND v6 1/5] dt-bindings: spmi: Add X1E80100 SPMI PMIC ARB schema
From: Krzysztof Kozlowski @ 2024-03-27 18:40 UTC (permalink / raw)
To: Abel Vesa, Stephen Boyd, Matthias Brugger, Bjorn Andersson,
Konrad Dybcio, Dmitry Baryshkov, Neil Armstrong,
AngeloGioacchino Del Regno, Rob Herring, Krzysztof Kozlowski,
Conor Dooley
Cc: Srini Kandagatla, Johan Hovold, linux-kernel, linux-arm-kernel,
linux-arm-msm, linux-mediatek, devicetree
In-Reply-To: <20240326-spmi-multi-master-support-v6-1-1c87d8306c5b@linaro.org>
On 26/03/2024 17:28, Abel Vesa wrote:
> + qcom,channel:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + minimum: 0
> + maximum: 5
> + description: >
> + which of the PMIC Arb provided channels to use for accesses
> +
> +patternProperties:
> + "spmi@[a-f0-9]+$":
Missing '^' in the pattern. I did not notice it earlier because only
since ~2 weeks I have lei filter for ABI based on node names. And your
driver created ABI based on node names, so this must be fixed. Sorry. :(
Best regards,
Krzysztof
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Re: [WIP 0/3] Memory model and atomic API in Rust
From: Kent Overstreet @ 2024-03-27 18:50 UTC (permalink / raw)
To: comex
Cc: Linus Torvalds, Dr. David Alan Gilbert, Philipp Stanner,
Boqun Feng, rust-for-linux, linux-kernel, linux-arch, llvm,
Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
Nathan Chancellor, Nick Desaulniers, kent.overstreet,
Greg Kroah-Hartman, Marco Elver, Mark Rutland, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Catalin Marinas, linux-arm-kernel, linux-fsdevel
In-Reply-To: <160DB953-1588-418E-A490-381009CD8DE0@gmail.com>
On Wed, Mar 27, 2024 at 09:16:09AM -0700, comex wrote:
> Meanwhile, Rust intentionally lacks strict aliasing.
I wasn't aware of this. Given that unrestricted pointers are a real
impediment to compiler optimization, I thought that with Rust we were
finally starting to nail down a concrete enough memory model to tackle
this safely. But I guess not?
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Re: [PATCH v6 01/16] regulator: dt-bindings: describe the PMU module of the QCA6390 package
From: Bartosz Golaszewski @ 2024-03-27 18:55 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Marcel Holtmann, Luiz Augusto von Dentz, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Kalle Valo, Bjorn Andersson,
Konrad Dybcio, Liam Girdwood, Mark Brown, Catalin Marinas,
Will Deacon, Bjorn Helgaas, Saravana Kannan, Geert Uytterhoeven,
Arnd Bergmann, Neil Armstrong, Marek Szyprowski, Alex Elder,
Srini Kandagatla, Greg Kroah-Hartman, Abel Vesa,
Manivannan Sadhasivam, Lukas Wunner, Dmitry Baryshkov,
linux-bluetooth, netdev, devicetree, linux-kernel, linux-wireless,
linux-arm-msm, linux-arm-kernel, linux-pci, linux-pm,
Bartosz Golaszewski
In-Reply-To: <af9def4e-c6d6-49d9-a457-68c40492587a@linaro.org>
On Wed, Mar 27, 2024 at 7:17 PM Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> On 25/03/2024 14:16, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > The QCA6390 package contains discreet modules for WLAN and Bluetooth. They
> > are powered by the Power Management Unit (PMU) that takes inputs from the
> > host and provides LDO outputs. This document describes this module.
> >
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Can you start using b4?
>
> This is a friendly reminder during the review process.
>
> It looks like you received a tag and forgot to add it.
>
> If you do not know the process, here is a short explanation:
> Please add Acked-by/Reviewed-by/Tested-by tags when posting new
> versions, under or above your Signed-off-by tag. Tag is "received", when
> provided in a message replied to you on the mailing list. Tools like b4
> can help here. However, there's no need to repost patches *only* to add
> the tags. The upstream maintainer will do that for tags received on the
> version they apply.
>
> https://elixir.bootlin.com/linux/v6.5-rc3/source/Documentation/process/submitting-patches.rst#L577
>
> If a tag was not added on purpose, please state why and what changed.
>
As per the first sentence of the cover letter: I dropped review tags
from the patches that changed significantly while keeping them for
those that didn't. If there's a way to let your automation know about
this, please let me know/point me in the right direction because I
don't know about it.
Bart
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Re: [PATCH v6 02/16] regulator: dt-bindings: describe the PMU module of the WCN7850 package
From: Bartosz Golaszewski @ 2024-03-27 18:56 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Marcel Holtmann, Luiz Augusto von Dentz, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Kalle Valo, Bjorn Andersson,
Konrad Dybcio, Liam Girdwood, Mark Brown, Catalin Marinas,
Will Deacon, Bjorn Helgaas, Saravana Kannan, Geert Uytterhoeven,
Arnd Bergmann, Neil Armstrong, Marek Szyprowski, Alex Elder,
Srini Kandagatla, Greg Kroah-Hartman, Abel Vesa,
Manivannan Sadhasivam, Lukas Wunner, Dmitry Baryshkov,
linux-bluetooth, netdev, devicetree, linux-kernel, linux-wireless,
linux-arm-msm, linux-arm-kernel, linux-pci, linux-pm,
Bartosz Golaszewski
In-Reply-To: <1614af1c-330d-49ee-aa22-a19de866862e@linaro.org>
On Wed, Mar 27, 2024 at 7:19 PM Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> On 25/03/2024 14:16, Bartosz Golaszewski wrote:
> > + then:
> > + required:
> > + - vdd-supply
> > + - vddio-supply
> > + - vddaon-supply
> > + - vdddig-supply
> > + - vddrfa1p2-supply
> > + - vddrfa1p8-supply
>
> I assume vddio1p2 is not required on purpose.
>
Correct.
Bart
> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>
> Best regards,
> Krzysztof
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* [PATCH v1] arm64: mm: Batch dsb and isb when populating pgtables
From: Ryan Roberts @ 2024-03-27 19:07 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Mark Rutland, Ard Biesheuvel,
David Hildenbrand, Donald Dutile, Eric Chanudet
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
In-Reply-To: <20240326101448.3453626-1-ryan.roberts@arm.com>
After removing uneccessary TLBIs, the next bottleneck when creating the
page tables for the linear map is DSB and ISB, which were previously
issued per-pte in __set_pte(). Since we are writing multiple ptes in a
given pte table, we can elide these barriers and insert them once we
have finished writing to the table.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/pgtable.h | 7 ++++++-
arch/arm64/mm/mmu.c | 13 ++++++++++++-
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index bd5d02f3f0a3..81e427b23b3f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -271,9 +271,14 @@ static inline pte_t pte_mkdevmap(pte_t pte)
return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
}
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void ___set_pte(pte_t *ptep, pte_t pte)
{
WRITE_ONCE(*ptep, pte);
+}
+
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+ ___set_pte(ptep, pte);
/*
* Only if the new pte is valid and kernel, otherwise TLB maintenance
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 1b2a2a2d09b7..c6d5a76732d4 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -301,7 +301,11 @@ static pte_t *init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
do {
pte_t old_pte = __ptep_get(ptep);
- __set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));
+ /*
+ * Required barriers to make this visible to the table walker
+ * are deferred to the end of alloc_init_cont_pte().
+ */
+ ___set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));
/*
* After the PTE entry has been populated once, we
@@ -358,6 +362,13 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
} while (addr = next, addr != end);
ops->unmap(TYPE_PTE);
+
+ /*
+ * Ensure all previous pgtable writes are visible to the table walker.
+ * See init_pte().
+ */
+ dsb(ishst);
+ isb();
}
static pmd_t *init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
--
2.25.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* Re: [WIP 0/3] Memory model and atomic API in Rust
From: Linus Torvalds @ 2024-03-27 19:07 UTC (permalink / raw)
To: Kent Overstreet
Cc: comex, Dr. David Alan Gilbert, Philipp Stanner, Boqun Feng,
rust-for-linux, linux-kernel, linux-arch, llvm, Miguel Ojeda,
Alex Gaynor, Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Alice Ryhl, Alan Stern,
Andrea Parri, Will Deacon, Peter Zijlstra, Nicholas Piggin,
David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
Akira Yokosawa, Daniel Lustig, Joel Fernandes, Nathan Chancellor,
Nick Desaulniers, kent.overstreet, Greg Kroah-Hartman,
Marco Elver, Mark Rutland, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Catalin Marinas, linux-arm-kernel, linux-fsdevel
In-Reply-To: <qyjrex54hbhvhw4gmn7b6l2hr45o56bwt6fazfalykwcp5zzkx@vwt7k3d6kdwt>
On Wed, 27 Mar 2024 at 11:51, Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> On Wed, Mar 27, 2024 at 09:16:09AM -0700, comex wrote:
> > Meanwhile, Rust intentionally lacks strict aliasing.
>
> I wasn't aware of this. Given that unrestricted pointers are a real
> impediment to compiler optimization, I thought that with Rust we were
> finally starting to nail down a concrete enough memory model to tackle
> this safely. But I guess not?
Strict aliasing is a *horrible* mistake.
It's not even *remotely* "tackle this safely". It's the exact
opposite. It's completely broken.
Anybody who thinks strict aliasing is a good idea either
(a) doesn't understand what it means
(b) has been brainwashed by incompetent compiler people.
it's a horrendous crock that was introduced by people who thought it
was too complicated to write out "restrict" keywords, and that thought
that "let's break old working programs and make it harder to write new
programs" was a good idea.
Nobody should ever do it. The fact that Rust doesn't do the C strict
aliasing is a good thing. Really.
I suspect you have been fooled by the name. Because "strict aliasing"
sounds like a good thing. It sounds like "I know these strictly can't
alias". But despite that name, it's the complete opposite of that, and
means "I will ignore actual real aliasing even if it exists, because I
will make aliasing decisions on entirely made-up grounds".
Just say no to strict aliasing. Thankfully, there's an actual compiler
flag for that: -fno-strict-aliasing. It should absolutely have been
the default.
Linus
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Re: [PATCH v1 0/3] Speed up boot with faster linear map creation
From: Ryan Roberts @ 2024-03-27 19:12 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Mark Rutland, Ard Biesheuvel,
David Hildenbrand, Donald Dutile, Eric Chanudet
Cc: linux-arm-kernel, linux-kernel
In-Reply-To: <20240326101448.3453626-1-ryan.roberts@arm.com>
On 26/03/2024 10:14, Ryan Roberts wrote:
> Hi All,
>
> It turns out that creating the linear map can take a significant proportion of
> the total boot time, especially when rodata=full. And a large portion of the
> time it takes to create the linear map is issuing TLBIs. This series reworks the
> kernel pgtable generation code to significantly reduce the number of TLBIs. See
> each patch for details.
>
> The below shows the execution time of map_mem() across a couple of different
> systems with different RAM configurations. We measure after applying each patch
> and show the improvement relative to base (v6.9-rc1):
>
> | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
> | VM, 16G | VM, 64G | VM, 256G | Metal, 512G
> ---------------|-------------|-------------|-------------|-------------
> | ms (%) | ms (%) | ms (%) | ms (%)
> ---------------|-------------|-------------|-------------|-------------
> base | 151 (0%) | 2191 (0%) | 8990 (0%) | 17443 (0%)
> no-cont-remap | 77 (-49%) | 429 (-80%) | 1753 (-80%) | 3796 (-78%)
> no-alloc-remap | 77 (-49%) | 375 (-83%) | 1532 (-83%) | 3366 (-81%)
> lazy-unmap | 63 (-58%) | 330 (-85%) | 1312 (-85%) | 2929 (-83%)
I've just appended an additional patch to this series. This takes us to a ~95%
reduction overall:
| Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
| VM, 16G | VM, 64G | VM, 256G | Metal, 512G
---------------|-------------|-------------|-------------|-------------
| ms (%) | ms (%) | ms (%) | ms (%)
---------------|-------------|-------------|-------------|-------------
base | 151 (0%) | 2191 (0%) | 8990 (0%) | 17443 (0%)
no-cont-remap | 77 (-49%) | 429 (-80%) | 1753 (-80%) | 3796 (-78%)
no-alloc-remap | 77 (-49%) | 375 (-83%) | 1532 (-83%) | 3366 (-81%)
lazy-unmap | 63 (-58%) | 330 (-85%) | 1312 (-85%) | 2929 (-83%)
batch-barriers | 11 (-93%) | 61 (-97%) | 261 (-97%) | 837 (-95%)
Don't believe the intermediate block-based pgtable idea will now be neccessary
so I don't intend to persue that. It might be that we choose to drop the middle
two patchs; I'm keen to hear opinions.
Thanks,
Ryan
>
> This series applies on top of v6.9-rc1. All mm selftests pass. I haven't yet
> tested all VA size configs (although I don't anticipate any issues); I'll do
> this as part of followup.
>
> Thanks,
> Ryan
>
>
> Ryan Roberts (3):
> arm64: mm: Don't remap pgtables per- cont(pte|pmd) block
> arm64: mm: Don't remap pgtables for allocate vs populate
> arm64: mm: Lazily clear pte table mappings from fixmap
>
> arch/arm64/include/asm/fixmap.h | 5 +-
> arch/arm64/include/asm/mmu.h | 8 +
> arch/arm64/include/asm/pgtable.h | 4 -
> arch/arm64/kernel/cpufeature.c | 10 +-
> arch/arm64/mm/fixmap.c | 11 +
> arch/arm64/mm/mmu.c | 364 +++++++++++++++++++++++--------
> include/linux/pgtable.h | 8 +
> 7 files changed, 307 insertions(+), 103 deletions(-)
>
> --
> 2.25.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* [PATCH v6 03/29] iommu/arm-smmu-v3: Do not allow a SVA domain to be set on the wrong PASID
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The SVA code is wired to assume that the SVA is programmed onto the
mm->pasid. The current core code always does this, so it is fine.
Add a check for clarity.
Fixes: 386fa64fd52b ("arm-smmu-v3/sva: Add SVA domain support")
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 2cd433a9c8a0fa..41b44baef15e80 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -569,6 +569,9 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
int ret = 0;
struct mm_struct *mm = domain->mm;
+ if (mm_get_enqcmd_pasid(mm) != id)
+ return -EINVAL;
+
mutex_lock(&sva_lock);
ret = __arm_smmu_sva_bind(dev, id, mm);
mutex_unlock(&sva_lock);
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 02/29] iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
STRTAB_STE_0_V is a CPU value, it needs conversion for sparse to be clean.
The missing annotation was a mistake introduced by splitting the ops out
from the STE writer.
Fixes: 7da51af9125c ("iommu/arm-smmu-v3: Make STE programming independent of the callers")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403011441.5WqGrYjp-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 5ed036225e69bb..fa3f3e7d9b0cba 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1139,7 +1139,8 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
* requires a breaking update, zero the V bit, write all qwords
* but 0, then set qword 0
*/
- unused_update.data[0] = entry->data[0] & (~STRTAB_STE_0_V);
+ unused_update.data[0] = entry->data[0] &
+ cpu_to_le64(~STRTAB_STE_0_V);
entry_set(smmu, sid, entry, &unused_update, 0, 1);
entry_set(smmu, sid, entry, target, 1, num_entry_qwords - 1);
entry_set(smmu, sid, entry, target, 0, 1);
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 07/29] iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
CD table entries and STE's have the same essential programming sequence,
just with different types.
Have arm_smmu_write_ctx_desc() generate a target CD and call
arm_smmu_write_entry() to do the programming. Due to the way the target CD
is generated by modifying the existing CD this alone is not enough for the
CD callers to be freed of the ordering requirements.
The following patches will make the rest of the CD flow mirror the STE
flow with precise CD contents generated in all cases.
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 94 ++++++++++++++++-----
1 file changed, 74 insertions(+), 20 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 19fa511cec2c05..453437ca4bfc2b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -61,6 +61,7 @@ struct arm_smmu_entry_writer_ops {
#define NUM_ENTRY_QWORDS 8
static_assert(sizeof(struct arm_smmu_ste) == NUM_ENTRY_QWORDS * sizeof(u64));
+static_assert(sizeof(struct arm_smmu_cd) == NUM_ENTRY_QWORDS * sizeof(u64));
static phys_addr_t arm_smmu_msi_cfg[ARM_SMMU_MAX_MSIS][3] = {
[EVTQ_MSI_INDEX] = {
@@ -1236,6 +1237,67 @@ static struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
return &l1_desc->l2ptr[idx];
}
+struct arm_smmu_cd_writer {
+ struct arm_smmu_entry_writer writer;
+ unsigned int ssid;
+};
+
+static void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits)
+{
+ used_bits[0] = cpu_to_le64(CTXDESC_CD_0_V);
+ if (!(ent[0] & cpu_to_le64(CTXDESC_CD_0_V)))
+ return;
+ memset(used_bits, 0xFF, sizeof(struct arm_smmu_cd));
+
+ /* EPD0 means T0SZ/TG0/IR0/OR0/SH0/TTB0 are IGNORED */
+ if (ent[0] & cpu_to_le64(CTXDESC_CD_0_TCR_EPD0)) {
+ used_bits[0] &= ~cpu_to_le64(
+ CTXDESC_CD_0_TCR_T0SZ | CTXDESC_CD_0_TCR_TG0 |
+ CTXDESC_CD_0_TCR_IRGN0 | CTXDESC_CD_0_TCR_ORGN0 |
+ CTXDESC_CD_0_TCR_SH0);
+ used_bits[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK);
+ }
+}
+
+static void arm_smmu_cd_writer_sync_entry(struct arm_smmu_entry_writer *writer)
+{
+ struct arm_smmu_cd_writer *cd_writer =
+ container_of(writer, struct arm_smmu_cd_writer, writer);
+
+ arm_smmu_sync_cd(writer->master, cd_writer->ssid, true);
+}
+
+static const struct arm_smmu_entry_writer_ops arm_smmu_cd_writer_ops = {
+ .sync = arm_smmu_cd_writer_sync_entry,
+ .get_used = arm_smmu_get_cd_used,
+ .v_bit = cpu_to_le64(CTXDESC_CD_0_V),
+};
+
+static void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
+ struct arm_smmu_cd *cdptr,
+ const struct arm_smmu_cd *target)
+{
+ struct arm_smmu_cd_writer cd_writer = {
+ .writer = {
+ .ops = &arm_smmu_cd_writer_ops,
+ .master = master,
+ },
+ .ssid = ssid,
+ };
+
+ arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data);
+}
+
+static void arm_smmu_clean_cd_entry(struct arm_smmu_cd *target)
+{
+ struct arm_smmu_cd used = {};
+ int i;
+
+ arm_smmu_get_cd_used(target->data, used.data);
+ for (i = 0; i != ARRAY_SIZE(target->data); i++)
+ target->data[i] &= used.data[i];
+}
+
int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
struct arm_smmu_ctx_desc *cd)
{
@@ -1252,17 +1314,20 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
*/
u64 val;
bool cd_live;
- struct arm_smmu_cd *cdptr;
+ struct arm_smmu_cd target;
+ struct arm_smmu_cd *cdptr = ⌖
+ struct arm_smmu_cd *cd_table_entry;
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
return -E2BIG;
- cdptr = arm_smmu_get_cd_ptr(master, ssid);
- if (!cdptr)
+ cd_table_entry = arm_smmu_get_cd_ptr(master, ssid);
+ if (!cd_table_entry)
return -ENOMEM;
+ target = *cd_table_entry;
val = le64_to_cpu(cdptr->data[0]);
cd_live = !!(val & CTXDESC_CD_0_V);
@@ -1284,13 +1349,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
cdptr->data[2] = 0;
cdptr->data[3] = cpu_to_le64(cd->mair);
- /*
- * STE may be live, and the SMMU might read dwords of this CD in any
- * order. Ensure that it observes valid values before reading
- * V=1.
- */
- arm_smmu_sync_cd(master, ssid, true);
-
val = cd->tcr |
#ifdef __BIG_ENDIAN
CTXDESC_CD_0_ENDI |
@@ -1304,18 +1362,14 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
if (cd_table->stall_enabled)
val |= CTXDESC_CD_0_S;
}
-
+ cdptr->data[0] = cpu_to_le64(val);
/*
- * The SMMU accesses 64-bit values atomically. See IHI0070Ca 3.21.3
- * "Configuration structures and configuration invalidation completion"
- *
- * The size of single-copy atomic reads made by the SMMU is
- * IMPLEMENTATION DEFINED but must be at least 64 bits. Any single
- * field within an aligned 64-bit span of a structure can be altered
- * without first making the structure invalid.
+ * Since the above is updating the CD entry based on the current value
+ * without zeroing unused bits it needs fixing before being passed to
+ * the programming logic.
*/
- WRITE_ONCE(cdptr->data[0], cpu_to_le64(val));
- arm_smmu_sync_cd(master, ssid, true);
+ arm_smmu_clean_cd_entry(&target);
+ arm_smmu_write_cd_entry(master, ssid, cd_table_entry, &target);
return 0;
}
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 18/29] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
We no longer need a master->sva_enable to control what attaches are
allowed. Instead we can tell if the attach is legal based on the current
configuration of the master.
Keep track of the number of valid CD entries for SSID's in the cd_table
and if the cd_table has been installed in the STE directly so we know what
the configuration is.
The attach logic is then made into:
- SVA bind, check if the CD is installed
- RID attach of S2, block if SSIDs are used
- RID attach of IDENTITY/BLOCKING, block if SSIDs are used
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 +++++++++----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++
3 files changed, 20 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index be54dfb89e880e..24e7cf759bbc35 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -626,7 +626,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct arm_smmu_cd target;
int ret;
- if (mm_get_enqcmd_pasid(mm) != id)
+ if (mm_get_enqcmd_pasid(mm) != id || !master->cd_table.in_ste)
return -EINVAL;
mutex_lock(&sva_lock);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 00be1e2ebefaa9..a77467f8837515 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1297,6 +1297,8 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target)
{
+ bool target_valid = target->data[0] & cpu_to_le64(CTXDESC_CD_0_V);
+ bool cur_valid = cdptr->data[0] & cpu_to_le64(CTXDESC_CD_0_V);
struct arm_smmu_cd_writer cd_writer = {
.writer = {
.ops = &arm_smmu_cd_writer_ops,
@@ -1305,6 +1307,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
.ssid = ssid,
};
+ if (ssid != IOMMU_NO_PASID && cur_valid != target_valid) {
+ if (cur_valid)
+ master->cd_table.used_ssids--;
+ else
+ master->cd_table.used_ssids++;
+ }
+
arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data);
}
@@ -2712,16 +2721,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master = dev_iommu_priv_get(dev);
smmu = master->smmu;
- /*
- * Checking that SVA is disabled ensures that this device isn't bound to
- * any mm, and can be safely detached from its old domain. Bonds cannot
- * be removed concurrently since we're holding the group mutex.
- */
- if (arm_smmu_master_sva_enabled(master)) {
- dev_err(dev, "cannot attach - SVA enabled\n");
- return -EBUSY;
- }
-
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu) {
@@ -2737,7 +2736,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
if (!cdptr)
return -ENOMEM;
- }
+ } else if (arm_smmu_ssids_in_use(&master->cd_table))
+ return -EBUSY;
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
@@ -2809,7 +2809,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
.old_domain = iommu_get_domain_for_dev(dev),
};
- if (arm_smmu_master_sva_enabled(master))
+ if (arm_smmu_ssids_in_use(&master->cd_table))
return -EBUSY;
/*
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 5f49a5771ab027..3da131e0173e1f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -602,12 +602,19 @@ struct arm_smmu_ctx_desc_cfg {
dma_addr_t cdtab_dma;
struct arm_smmu_l1_ctx_desc *l1_desc;
unsigned int num_l1_ents;
+ unsigned int used_ssids;
u8 in_ste;
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
};
+/* True if the cd table has SSIDS > 0 in use. */
+static inline bool arm_smmu_ssids_in_use(struct arm_smmu_ctx_desc_cfg *cd_table)
+{
+ return cd_table->used_ssids;
+}
+
struct arm_smmu_s2_cfg {
u16 vmid;
};
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 01/29] iommu: Validate the PASID in iommu_attach_device_pasid()
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The SVA code checks that the PASID is valid for the device when assigning
the PASID to the MM, but the normal PAGING related path does not check it.
Devices that don't support PASID or PASID values too large for the device
should not invoke the driver callback. The drivers should rely on the
core code for this enforcement.
Fixes: 16603704559c ("iommu: Add attach/detach_dev_pasid iommu interfaces")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 098869007c69e5..a95a483def2d2a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3354,6 +3354,7 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
{
/* Caller must be a probed driver on dev */
struct iommu_group *group = dev->iommu_group;
+ struct group_device *device;
void *curr;
int ret;
@@ -3363,10 +3364,18 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
if (!group)
return -ENODEV;
- if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
+ if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner ||
+ pasid == IOMMU_NO_PASID)
return -EINVAL;
mutex_lock(&group->mutex);
+ for_each_group_device(group, device) {
+ if (pasid >= device->dev->iommu->max_pasids) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ }
+
curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL);
if (curr) {
ret = xa_err(curr) ? : -EBUSY;
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 04/29] iommu/arm-smmu-v3: Do not ATC invalidate the entire domain
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
At this point we know which master we are going to change the PCI config
on, this is the only device we need to invalidate. Switch
arm_smmu_atc_inv_domain() for arm_smmu_atc_inv_master().
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fa3f3e7d9b0cba..7480f70701a045 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2412,7 +2412,10 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master,
pdev = to_pci_dev(master->dev);
atomic_inc(&smmu_domain->nr_ats_masters);
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0);
+ /*
+ * ATC invalidation of PASID 0 causes the entire ATC to be flushed.
+ */
+ arm_smmu_atc_inv_master(master);
if (pci_enable_ats(pdev, stu))
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 15/29] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The next patch will need to store the same master twice (with different
SSIDs), so allocate memory for each list element.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 ++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 39 ++++++++++++++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +++-
3 files changed, 47 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 095d11df2a1966..63f98264b646bc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -37,12 +37,13 @@ static DEFINE_MUTEX(sva_lock);
static void
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
- struct arm_smmu_master *master;
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_cd target_cd;
unsigned long flags;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
/* S1 domains only support RID attachment right now */
@@ -299,7 +300,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
- struct arm_smmu_master *master;
+ struct arm_smmu_master_domain *master_domain;
unsigned long flags;
mutex_lock(&sva_lock);
@@ -313,7 +314,9 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* but disable translation.
*/
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3922478799e130..4411706019f036 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2012,10 +2012,10 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size)
{
+ struct arm_smmu_master_domain *master_domain;
int i;
unsigned long flags;
struct arm_smmu_cmdq_ent cmd;
- struct arm_smmu_master *master;
struct arm_smmu_cmdq_batch cmds;
if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
@@ -2043,7 +2043,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
cmds.num = 0;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
+
if (!master->ats_enabled)
continue;
@@ -2539,9 +2542,26 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
pci_disable_pasid(pdev);
}
+static struct arm_smmu_master_domain *
+arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_master *master)
+{
+ struct arm_smmu_master_domain *master_domain;
+
+ lockdep_assert_held(&smmu_domain->devices_lock);
+
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ if (master_domain->master == master)
+ return master_domain;
+ }
+ return NULL;
+}
+
static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_domain *smmu_domain;
unsigned long flags;
@@ -2552,7 +2572,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
arm_smmu_disable_ats(master, smmu_domain);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_del_init(&master->domain_head);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master);
+ if (master_domain) {
+ list_del(&master_domain->devices_elm);
+ kfree(master_domain);
+ }
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
master->ats_enabled = false;
@@ -2566,6 +2590,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2602,6 +2627,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENOMEM;
}
+ master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
+ if (!master_domain)
+ return -ENOMEM;
+ master_domain->master = master;
+
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2615,7 +2645,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master->ats_enabled = arm_smmu_ats_supported(master);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_add(&master->domain_head, &smmu_domain->devices);
+ list_add(&master_domain->devices_elm, &smmu_domain->devices);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
switch (smmu_domain->stage) {
@@ -2929,7 +2959,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
INIT_LIST_HEAD(&master->bonds);
- INIT_LIST_HEAD(&master->domain_head);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index a3b94b839ee927..e4043e48a6e20d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -696,7 +696,6 @@ struct arm_smmu_stream {
struct arm_smmu_master {
struct arm_smmu_device *smmu;
struct device *dev;
- struct list_head domain_head;
struct arm_smmu_stream *streams;
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
@@ -730,12 +729,18 @@ struct arm_smmu_domain {
struct iommu_domain domain;
+ /* List of struct arm_smmu_master_domain */
struct list_head devices;
spinlock_t devices_lock;
struct list_head mmu_notifiers;
};
+struct arm_smmu_master_domain {
+ struct list_head devices_elm;
+ struct arm_smmu_master *master;
+};
+
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
{
return container_of(dom, struct arm_smmu_domain, domain);
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 25/29] iommu/arm-smmu-v3: Move the arm_smmu_asid_xa to per-smmu like vmid
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
The SVA BTM and shared cd code was the only thing keeping this as a global
array. Now that is out of the way we can move it to per-smmu.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 39 +++++++++----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
3 files changed, 22 insertions(+), 24 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 9ec1a5869ac3b2..95d8d2d283c916 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -404,7 +404,7 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
smmu_domain->smmu = smmu;
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
+ ret = xa_alloc(&smmu->asid_map, &asid, smmu_domain,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
if (ret)
goto err_free;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 888972c97f56e1..bdcf9a7039f869 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -86,9 +86,6 @@ struct arm_smmu_option_prop {
const char *prop;
};
-DEFINE_XARRAY_ALLOC1(arm_smmu_asid_xa);
-DEFINE_MUTEX(arm_smmu_asid_lock);
-
static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
@@ -2280,9 +2277,9 @@ void arm_smmu_domain_free_id(struct arm_smmu_domain *smmu_domain)
smmu_domain->domain.type == IOMMU_DOMAIN_SVA) &&
smmu_domain->cd.asid) {
/* Prevent SVA from touching the CD while we're freeing it */
- mutex_lock(&arm_smmu_asid_lock);
- xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_lock(&smmu->asid_lock);
+ xa_erase(&smmu->asid_map, smmu_domain->cd.asid);
+ mutex_unlock(&smmu->asid_lock);
} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 &&
smmu_domain->s2_cfg.vmid) {
ida_free(&smmu->vmid_map, smmu_domain->s2_cfg.vmid);
@@ -2313,11 +2310,11 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
/* Prevent SVA from modifying the ASID until it is written to the CD */
- mutex_lock(&arm_smmu_asid_lock);
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
+ mutex_lock(&smmu->asid_lock);
+ ret = xa_alloc(&smmu->asid_map, &asid, smmu_domain,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
cd->asid = (u16)asid;
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_unlock(&smmu->asid_lock);
return ret;
}
@@ -2609,7 +2606,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_master *master,
* arm_smmu_master_domain contents otherwise it could randomly write one
* or the other to the CD.
*/
- lockdep_assert_held(&arm_smmu_asid_lock);
+ lockdep_assert_held(&master->smmu->asid_lock);
state->want_ats = !state->disable_ats && arm_smmu_ats_supported(master);
@@ -2661,7 +2658,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_master *master,
static void arm_smmu_attach_commit(struct arm_smmu_master *master,
struct attach_state *state)
{
- lockdep_assert_held(&arm_smmu_asid_lock);
+ lockdep_assert_held(&master->smmu->asid_lock);
if (state->want_ats && !master->ats_enabled) {
arm_smmu_enable_ats(master);
@@ -2725,11 +2722,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
* This allows the STE and the smmu_domain->devices list to
* be inconsistent during this routine.
*/
- mutex_lock(&arm_smmu_asid_lock);
+ mutex_lock(&smmu->asid_lock);
ret = arm_smmu_attach_prepare(master, domain, &state);
if (ret) {
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_unlock(&smmu->asid_lock);
return ret;
}
@@ -2753,7 +2750,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
arm_smmu_attach_commit(master, &state);
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_unlock(&smmu->asid_lock);
return 0;
}
@@ -2783,7 +2780,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (!cdptr)
return -ENOMEM;
- mutex_lock(&arm_smmu_asid_lock);
+ mutex_lock(&master->smmu->asid_lock);
ret = arm_smmu_attach_prepare(master, &smmu_domain->domain, &state);
if (ret)
goto out_unlock;
@@ -2793,7 +2790,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
arm_smmu_attach_commit(master, &state);
out_unlock:
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_unlock(&master->smmu->asid_lock);
return ret;
}
@@ -2809,12 +2806,12 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
smmu_domain = to_smmu_domain(domain);
- mutex_lock(&arm_smmu_asid_lock);
+ mutex_lock(&master->smmu->asid_lock);
arm_smmu_clear_cd(master, pasid);
if (master->ats_enabled)
arm_smmu_atc_inv_master(master, pasid);
arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_unlock(&master->smmu->asid_lock);
}
static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
@@ -2833,7 +2830,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
* Do not allow any ASID to be changed while are working on the STE,
* otherwise we could miss invalidations.
*/
- mutex_lock(&arm_smmu_asid_lock);
+ mutex_lock(&master->smmu->asid_lock);
/*
* The SMMU does not support enabling ATS with bypass/abort. When the
@@ -2846,7 +2843,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
arm_smmu_attach_prepare(master, domain, &state);
arm_smmu_install_ste_for_dev(master, ste);
arm_smmu_attach_commit(master, &state);
- mutex_unlock(&arm_smmu_asid_lock);
+ mutex_unlock(&master->smmu->asid_lock);
/*
* This has to be done after removing the master from the
@@ -3508,6 +3505,8 @@ static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
smmu->strtab_cfg.strtab_base = reg;
ida_init(&smmu->vmid_map);
+ xa_init_flags(&smmu->asid_map, XA_FLAGS_ALLOC1);
+ mutex_init(&smmu->asid_lock);
return 0;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index a711a659576a95..97c13f9313dcfe 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -673,6 +673,8 @@ struct arm_smmu_device {
#define ARM_SMMU_MAX_ASIDS (1 << 16)
unsigned int asid_bits;
+ struct xarray asid_map;
+ struct mutex asid_lock;
#define ARM_SMMU_MAX_VMIDS (1 << 16)
unsigned int vmid_bits;
@@ -751,9 +753,6 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
return container_of(dom, struct arm_smmu_domain, domain);
}
-extern struct xarray arm_smmu_asid_xa;
-extern struct mutex arm_smmu_asid_lock;
-
struct arm_smmu_domain *arm_smmu_domain_alloc(void);
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 17/29] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Prepare to allow a S1 domain to be attached to a PASID as well. Keep track
of the SSID the domain is using on each master in the
arm_smmu_master_domain.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 15 ++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 42 +++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++-
3 files changed, 43 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 63f98264b646bc..be54dfb89e880e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -46,13 +46,12 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
- /* S1 domains only support RID attachment right now */
- cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
+ cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
- arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
+ arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target_cd);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
@@ -292,8 +291,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
smmu_domain);
}
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), start,
- size);
+ arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
+ size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -330,7 +329,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
+ arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
smmu_mn->cleared = true;
mutex_unlock(&sva_lock);
@@ -409,8 +408,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
*/
if (!smmu_mn->cleared) {
arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0,
- 0);
+ arm_smmu_atc_inv_domain_sva(smmu_domain,
+ mm_get_enqcmd_pasid(mm), 0, 0);
}
/* Frees smmu_mn */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fb18d9d500aeba..00be1e2ebefaa9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2011,8 +2011,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
- unsigned long iova, size_t size)
+static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
int i;
@@ -2040,8 +2040,6 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
if (!atomic_read(&smmu_domain->nr_ats_masters))
return 0;
- arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
-
cmds.num = 0;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
@@ -2052,6 +2050,16 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
if (!master->ats_enabled)
continue;
+ /*
+ * Non-zero ssid means SVA is co-opting the S1 domain to issue
+ * invalidations for SVA PASIDs.
+ */
+ if (ssid != IOMMU_NO_PASID)
+ arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
+ else
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
+ &cmd);
+
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
@@ -2062,6 +2070,19 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
}
+static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size)
+{
+ return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
+ size);
+}
+
+int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size)
+{
+ return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
+}
+
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
@@ -2083,7 +2104,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
@@ -2181,7 +2202,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
* Unfortunately, this can't be leaf-only since we may have
* zapped an entire table.
*/
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, size);
+ arm_smmu_atc_inv_domain(smmu_domain, iova, size);
}
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
@@ -2524,7 +2545,8 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static struct arm_smmu_master_domain *
arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_master *master)
+ struct arm_smmu_master *master,
+ ioasid_t ssid)
{
struct arm_smmu_master_domain *master_domain;
@@ -2532,7 +2554,8 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
list_for_each_entry(master_domain, &smmu_domain->devices,
devices_elm) {
- if (master_domain->master == master)
+ if (master_domain->master == master &&
+ master_domain->ssid == ssid)
return master_domain;
}
return NULL;
@@ -2565,7 +2588,8 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
return;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master,
+ IOMMU_NO_PASID);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index e4043e48a6e20d..5f49a5771ab027 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -739,6 +739,7 @@ struct arm_smmu_domain {
struct arm_smmu_master_domain {
struct list_head devices_elm;
struct arm_smmu_master *master;
+ ioasid_t ssid;
};
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
@@ -770,8 +771,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
- unsigned long iova, size_t size);
+int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 26/29] iommu/arm-smmu-v3: Bring back SVA BTM support
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
BTM support is a feature where the CPU TLB invalidation can be forwarded
to the IOMMU and also invalidate the IOTLB. For this to work the CPU and
IOMMU ASID must be the same.
Retain the prior SVA design here of keeping the ASID allocator for the
IOMMU private to SMMU and force SVA domains to set an ASID that matches
the CPU ASID.
This requires changing the ASID assigned to a S1 domain if it happens to
be overlapping with the required CPU ASID. We hold on to the CPU ASID so
long as the SVA iommu_domain exists, so SVA domain conflict is not
possible.
With the asid per-smmu we no longer have a problem that two per-smmu
iommu_domain's would need to share a CPU ASID entry in the IOMMU's xarray.
Use the same ASID move algorithm for the S1 domains as before with some
streamlining around how the xarray is being used. Do not synchronize the
ASID's if BTM mode is not supported. Just leave BTM features off
everywhere.
Audit all the places that touch cd->asid and think carefully about how the
locking works with the change to the cd->asid by the move algorithm. Use
xarray internal locking during xa_alloc() instead of double locking. Add a
note that concurrent S1 invalidation doesn't fully work.
Note that this is all still dead code, ARM_SMMU_FEAT_BTM is never set.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 133 ++++++++++++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
3 files changed, 129 insertions(+), 21 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 95d8d2d283c916..11e295f70a89b4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -14,12 +14,33 @@
static DEFINE_MUTEX(sva_lock);
-static void __maybe_unused
-arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
+static int arm_smmu_realloc_s1_domain_asid(struct arm_smmu_device *smmu,
+ struct arm_smmu_domain *smmu_domain)
{
struct arm_smmu_master_domain *master_domain;
+ u32 old_asid = smmu_domain->cd.asid;
struct arm_smmu_cd target_cd;
unsigned long flags;
+ int ret;
+
+ lockdep_assert_held(&smmu->asid_lock);
+
+ /*
+ * FIXME: The unmap and invalidation path doesn't take any locks but
+ * this is not fully safe. Since updating the CD tables is not atomic
+ * there is always a hole where invalidating only one ASID of two active
+ * ASIDs during unmap will cause the IOTLB to become stale.
+ *
+ * This approach is to hopefully shift the racing CPUs to the new ASID
+ * before we start programming the HW. This increases the chance that
+ * racing IOPTE changes will pick up an invalidation for the new ASID
+ * and we achieve eventual consistency. For the brief period where the
+ * old ASID is still in the CD entries it will become incoherent.
+ */
+ ret = xa_alloc(&smmu->asid_map, &smmu_domain->cd.asid, smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
+ if (ret)
+ return ret;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) {
@@ -35,6 +56,10 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
&target_cd);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
+ /* Clean the ASID we are about to assign to a new translation */
+ arm_smmu_tlb_inv_asid(smmu, old_asid);
+ return 0;
}
static u64 page_size_to_cd(void)
@@ -147,12 +172,12 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
}
if (!smmu_domain->btm_invalidation) {
+ ioasid_t asid = READ_ONCE(smmu_domain->cd.asid);
+
if (!size)
- arm_smmu_tlb_inv_asid(smmu_domain->smmu,
- smmu_domain->cd.asid);
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, asid);
else
- arm_smmu_tlb_inv_range_asid(start, size,
- smmu_domain->cd.asid,
+ arm_smmu_tlb_inv_range_asid(start, size, asid,
PAGE_SIZE, false,
smmu_domain);
}
@@ -181,6 +206,8 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
+
+ /* An SVA ASID never changes, no asid_lock required */
arm_smmu_make_sva_cd(&target, master, NULL,
smmu_domain->cd.asid,
smmu_domain->btm_invalidation);
@@ -374,6 +401,8 @@ static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
* unnecessary invalidation.
*/
arm_smmu_domain_free_id(smmu_domain);
+ if (smmu_domain->btm_invalidation)
+ arm64_mm_context_put(domain->mm);
/*
* Actual free is defered to the SRCU callback
@@ -387,13 +416,97 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
.free = arm_smmu_sva_domain_free
};
+static int arm_smmu_share_asid(struct arm_smmu_device *smmu,
+ struct arm_smmu_domain *smmu_domain,
+ struct mm_struct *mm)
+{
+ struct arm_smmu_domain *old_s1_domain;
+ int ret;
+
+ /*
+ * Notice that BTM is never currently enabled, this is all dead code.
+ * The specification cautions:
+ *
+ * Note: Arm expects that SMMU stage 2 address spaces are generally
+ * shared with their respective PE virtual machine stage 2
+ * configuration. If broadcast invalidation is required to be avoided
+ * for a particular SMMU stage 2 address space, Arm recommends that a
+ * hypervisor configures the STE with a VMID that is not allocated for
+ * virtual machine use on the PEs
+ *
+ * However, in Linux, both KVM and SMMU think they own the VMID pool.
+ * Unfortunately the ARM design is problematic for Linux as we do not
+ * currently share the S2 table with KVM. This creates a situation where
+ * the S2 needs to have the same VMID as KVM in order to allow the guest
+ * to use BTM, however we must still invalidate the S2 directly since it
+ * is a different radix tree. What Linux would like is something like
+ * ASET for the STE to disable BTM only for the S2.
+ *
+ * Arguably in a system with BTM the driver should prefer to use a S1
+ * table in all cases execpt when explicitly asked to create a nesting
+ * parent. Then it should use the VMID of KVM to enable BTM in the
+ * guest. We cannot optimize away the resulting double invalidation of
+ * the S2 :( Or we simply ignore BTM entirely as we are doing now.
+ */
+ if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
+ return xa_alloc(&smmu->asid_map, &smmu_domain->cd.asid,
+ smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1),
+ GFP_KERNEL);
+
+ /* At this point the caller ensures we have a mmget() */
+ smmu_domain->cd.asid = arm64_mm_context_get(mm);
+
+ mutex_lock(&smmu->asid_lock);
+ old_s1_domain = xa_store(&smmu->asid_map, smmu_domain->cd.asid,
+ smmu_domain, GFP_KERNEL);
+ if (xa_err(old_s1_domain)) {
+ ret = xa_err(old_s1_domain);
+ goto out_put_asid;
+ }
+
+ /*
+ * In BTM mode the CPU ASID and the IOMMU ASID have to be the same.
+ * Unfortunately we run separate allocators for this and the IOMMU
+ * ASID can already have been assigned to a S1 domain. SVA domains
+ * always align to their CPU ASIDs. In this case we change
+ * the S1 domain's ASID, update the CD entry and flush the caches.
+ *
+ * This is a bit tricky, all the places writing to a S1 CD, reading the
+ * S1 ASID, or doing xa_erase must hold the asid_lock or xa_lock to
+ * avoid IOTLB incoherence.
+ */
+ if (old_s1_domain) {
+ if (WARN_ON(old_s1_domain->domain.type == IOMMU_DOMAIN_SVA)) {
+ ret = -EINVAL;
+ goto out_restore_s1;
+ }
+ ret = arm_smmu_realloc_s1_domain_asid(smmu, old_s1_domain);
+ if (ret)
+ goto out_restore_s1;
+ }
+
+ smmu_domain->btm_invalidation = true;
+
+ ret = 0;
+ goto out_unlock;
+
+out_restore_s1:
+ xa_store(&smmu->asid_map, smmu_domain->cd.asid, old_s1_domain,
+ GFP_KERNEL);
+out_put_asid:
+ arm64_mm_context_put(mm);
+out_unlock:
+ mutex_unlock(&smmu->asid_lock);
+ return ret;
+}
+
struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_domain *smmu_domain;
- u32 asid;
int ret;
smmu_domain = arm_smmu_domain_alloc();
@@ -404,12 +517,10 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
smmu_domain->smmu = smmu;
- ret = xa_alloc(&smmu->asid_map, &asid, smmu_domain,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
+ ret = arm_smmu_share_asid(smmu, smmu_domain, mm);
if (ret)
goto err_free;
- smmu_domain->cd.asid = asid;
smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops;
ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm);
if (ret)
@@ -419,6 +530,8 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
err_asid:
arm_smmu_domain_free_id(smmu_domain);
+ if (smmu_domain->btm_invalidation)
+ arm64_mm_context_put(mm);
err_free:
kfree(smmu_domain);
return ERR_PTR(ret);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bdcf9a7039f869..5a2c6d099008ed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1324,6 +1324,8 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr =
&pgtbl_cfg->arm_lpae_s1_cfg.tcr;
+ lockdep_assert_held(&master->smmu->asid_lock);
+
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
@@ -2061,7 +2063,7 @@ static void arm_smmu_tlb_inv_context(struct arm_smmu_domain *smmu_domain)
if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1 ||
smmu_domain->domain.type == IOMMU_DOMAIN_SVA)) {
- arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
+ arm_smmu_tlb_inv_asid(smmu, READ_ONCE(smmu_domain->cd.asid));
} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2) {
cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
@@ -2305,17 +2307,10 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
struct arm_smmu_domain *smmu_domain)
{
- int ret;
- u32 asid;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
- /* Prevent SVA from modifying the ASID until it is written to the CD */
- mutex_lock(&smmu->asid_lock);
- ret = xa_alloc(&smmu->asid_map, &asid, smmu_domain,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- cd->asid = (u16)asid;
- mutex_unlock(&smmu->asid_lock);
- return ret;
+ return xa_alloc(&smmu->asid_map, &cd->asid, smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
}
static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 97c13f9313dcfe..12eabafbb70c9c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -586,7 +586,7 @@ struct arm_smmu_strtab_l1_desc {
};
struct arm_smmu_ctx_desc {
- u16 asid;
+ u32 asid;
};
struct arm_smmu_l1_ctx_desc {
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 06/29] iommu/arm-smmu-v3: Add an ops indirection to the STE code
From: Jason Gunthorpe @ 2024-03-27 18:07 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
Prepare to put the CD code into the same mechanism. Add an ops indirection
around all the STE specific code and make the worker functions independent
of the entry content being processed.
get_used and sync ops are provided to hook the correct code.
Signed-off-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 170 ++++++++++++--------
1 file changed, 102 insertions(+), 68 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6f62a38052b504..19fa511cec2c05 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -47,8 +47,20 @@ enum arm_smmu_msi_index {
ARM_SMMU_MAX_MSIS,
};
-static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu,
- ioasid_t sid);
+struct arm_smmu_entry_writer_ops;
+struct arm_smmu_entry_writer {
+ const struct arm_smmu_entry_writer_ops *ops;
+ struct arm_smmu_master *master;
+};
+
+struct arm_smmu_entry_writer_ops {
+ __le64 v_bit;
+ void (*get_used)(const __le64 *entry, __le64 *used);
+ void (*sync)(struct arm_smmu_entry_writer *writer);
+};
+
+#define NUM_ENTRY_QWORDS 8
+static_assert(sizeof(struct arm_smmu_ste) == NUM_ENTRY_QWORDS * sizeof(u64));
static phys_addr_t arm_smmu_msi_cfg[ARM_SMMU_MAX_MSIS][3] = {
[EVTQ_MSI_INDEX] = {
@@ -977,43 +989,42 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
* would be nice if this was complete according to the spec, but minimally it
* has to capture the bits this driver uses.
*/
-static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
- struct arm_smmu_ste *used_bits)
+static void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
{
- unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]));
+ unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent[0]));
- used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
- if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
+ used_bits[0] = cpu_to_le64(STRTAB_STE_0_V);
+ if (!(ent[0] & cpu_to_le64(STRTAB_STE_0_V)))
return;
- used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
+ used_bits[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
/* S1 translates */
if (cfg & BIT(0)) {
- used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
- STRTAB_STE_0_S1CTXPTR_MASK |
- STRTAB_STE_0_S1CDMAX);
- used_bits->data[1] |=
+ used_bits[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
+ STRTAB_STE_0_S1CTXPTR_MASK |
+ STRTAB_STE_0_S1CDMAX);
+ used_bits[1] |=
cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
STRTAB_STE_1_EATS);
- used_bits->data[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
+ used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
}
/* S2 translates */
if (cfg & BIT(1)) {
- used_bits->data[1] |=
+ used_bits[1] |=
cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
- used_bits->data[2] |=
+ used_bits[2] |=
cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
- used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
+ used_bits[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
}
if (cfg == STRTAB_STE_0_CFG_BYPASS)
- used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+ used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
}
/*
@@ -1022,57 +1033,55 @@ static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
* unused_update is an intermediate value of entry that has unused bits set to
* their new values.
*/
-static u8 arm_smmu_entry_qword_diff(const struct arm_smmu_ste *entry,
- const struct arm_smmu_ste *target,
- struct arm_smmu_ste *unused_update)
+static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer,
+ const __le64 *entry, const __le64 *target,
+ __le64 *unused_update)
{
- struct arm_smmu_ste target_used = {};
- struct arm_smmu_ste cur_used = {};
+ __le64 target_used[NUM_ENTRY_QWORDS] = {};
+ __le64 cur_used[NUM_ENTRY_QWORDS] = {};
u8 used_qword_diff = 0;
unsigned int i;
- arm_smmu_get_ste_used(entry, &cur_used);
- arm_smmu_get_ste_used(target, &target_used);
+ writer->ops->get_used(entry, cur_used);
+ writer->ops->get_used(target, target_used);
- for (i = 0; i != ARRAY_SIZE(target_used.data); i++) {
+ for (i = 0; i != NUM_ENTRY_QWORDS; i++) {
/*
* Check that masks are up to date, the make functions are not
* allowed to set a bit to 1 if the used function doesn't say it
* is used.
*/
- WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
+ WARN_ON_ONCE(target[i] & ~target_used[i]);
/* Bits can change because they are not currently being used */
- unused_update->data[i] = (entry->data[i] & cur_used.data[i]) |
- (target->data[i] & ~cur_used.data[i]);
+ unused_update[i] = (entry[i] & cur_used[i]) |
+ (target[i] & ~cur_used[i]);
/*
* Each bit indicates that a used bit in a qword needs to be
* changed after unused_update is applied.
*/
- if ((unused_update->data[i] & target_used.data[i]) !=
- target->data[i])
+ if ((unused_update[i] & target_used[i]) != target[i])
used_qword_diff |= 1 << i;
}
return used_qword_diff;
}
-static bool entry_set(struct arm_smmu_device *smmu, ioasid_t sid,
- struct arm_smmu_ste *entry,
- const struct arm_smmu_ste *target, unsigned int start,
+static bool entry_set(struct arm_smmu_entry_writer *writer, __le64 *entry,
+ const __le64 *target, unsigned int start,
unsigned int len)
{
bool changed = false;
unsigned int i;
for (i = start; len != 0; len--, i++) {
- if (entry->data[i] != target->data[i]) {
- WRITE_ONCE(entry->data[i], target->data[i]);
+ if (entry[i] != target[i]) {
+ WRITE_ONCE(entry[i], target[i]);
changed = true;
}
}
if (changed)
- arm_smmu_sync_ste_for_sid(smmu, sid);
+ writer->ops->sync(writer);
return changed;
}
@@ -1102,17 +1111,14 @@ static bool entry_set(struct arm_smmu_device *smmu, ioasid_t sid,
* V=0 process. This relies on the IGNORED behavior described in the
* specification.
*/
-static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
- struct arm_smmu_ste *entry,
- const struct arm_smmu_ste *target)
+static void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer,
+ __le64 *entry, const __le64 *target)
{
- unsigned int num_entry_qwords = ARRAY_SIZE(target->data);
- struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ste unused_update;
+ __le64 unused_update[NUM_ENTRY_QWORDS];
u8 used_qword_diff;
used_qword_diff =
- arm_smmu_entry_qword_diff(entry, target, &unused_update);
+ arm_smmu_entry_qword_diff(writer, entry, target, unused_update);
if (hweight8(used_qword_diff) == 1) {
/*
* Only one qword needs its used bits to be changed. This is a
@@ -1128,22 +1134,21 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
* writing it in the next step anyways. This can save a sync
* when the only change is in that qword.
*/
- unused_update.data[critical_qword_index] =
- entry->data[critical_qword_index];
- entry_set(smmu, sid, entry, &unused_update, 0, num_entry_qwords);
- entry_set(smmu, sid, entry, target, critical_qword_index, 1);
- entry_set(smmu, sid, entry, target, 0, num_entry_qwords);
+ unused_update[critical_qword_index] =
+ entry[critical_qword_index];
+ entry_set(writer, entry, unused_update, 0, NUM_ENTRY_QWORDS);
+ entry_set(writer, entry, target, critical_qword_index, 1);
+ entry_set(writer, entry, target, 0, NUM_ENTRY_QWORDS);
} else if (used_qword_diff) {
/*
* At least two qwords need their inuse bits to be changed. This
* requires a breaking update, zero the V bit, write all qwords
* but 0, then set qword 0
*/
- unused_update.data[0] = entry->data[0] &
- cpu_to_le64(~STRTAB_STE_0_V);
- entry_set(smmu, sid, entry, &unused_update, 0, 1);
- entry_set(smmu, sid, entry, target, 1, num_entry_qwords - 1);
- entry_set(smmu, sid, entry, target, 0, 1);
+ unused_update[0] = entry[0] & (~writer->ops->v_bit);
+ entry_set(writer, entry, unused_update, 0, 1);
+ entry_set(writer, entry, target, 1, NUM_ENTRY_QWORDS - 1);
+ entry_set(writer, entry, target, 0, 1);
} else {
/*
* No inuse bit changed. Sanity check that all unused bits are 0
@@ -1151,18 +1156,7 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
* compute_qword_diff().
*/
WARN_ON_ONCE(
- entry_set(smmu, sid, entry, target, 0, num_entry_qwords));
- }
-
- /* It's likely that we'll want to use the new STE soon */
- if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) {
- struct arm_smmu_cmdq_ent
- prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG,
- .prefetch = {
- .sid = sid,
- } };
-
- arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+ entry_set(writer, entry, target, 0, NUM_ENTRY_QWORDS));
}
}
@@ -1435,17 +1429,57 @@ arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc)
WRITE_ONCE(*dst, cpu_to_le64(val));
}
-static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
+struct arm_smmu_ste_writer {
+ struct arm_smmu_entry_writer writer;
+ u32 sid;
+};
+
+static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer)
{
+ struct arm_smmu_ste_writer *ste_writer =
+ container_of(writer, struct arm_smmu_ste_writer, writer);
struct arm_smmu_cmdq_ent cmd = {
.opcode = CMDQ_OP_CFGI_STE,
.cfgi = {
- .sid = sid,
+ .sid = ste_writer->sid,
.leaf = true,
},
};
- arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
+ arm_smmu_cmdq_issue_cmd_with_sync(writer->master->smmu, &cmd);
+}
+
+static const struct arm_smmu_entry_writer_ops arm_smmu_ste_writer_ops = {
+ .sync = arm_smmu_ste_writer_sync_entry,
+ .get_used = arm_smmu_get_ste_used,
+ .v_bit = cpu_to_le64(STRTAB_STE_0_V),
+};
+
+static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid,
+ struct arm_smmu_ste *ste,
+ const struct arm_smmu_ste *target)
+{
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_ste_writer ste_writer = {
+ .writer = {
+ .ops = &arm_smmu_ste_writer_ops,
+ .master = master,
+ },
+ .sid = sid,
+ };
+
+ arm_smmu_write_entry(&ste_writer.writer, ste->data, target->data);
+
+ /* It's likely that we'll want to use the new STE soon */
+ if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) {
+ struct arm_smmu_cmdq_ent
+ prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG,
+ .prefetch = {
+ .sid = sid,
+ } };
+
+ arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+ }
}
static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target)
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 28/29] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
If the STE doesn't point to the CD table we can upgrade it by
reprogramming the STE with the appropriate S1DSS. We may also need to turn
on ATS at the same time.
Keep track if the installed STE is pointing at the cd_table and the ATS
state to trigger this path.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49 ++++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +-
2 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 69b628c4aaacdf..f87525225c8a50 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2432,6 +2432,9 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
master->cd_table.in_ste =
FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) ==
STRTAB_STE_0_CFG_S1_TRANS;
+ master->ste_ats_enabled =
+ FIELD_GET(STRTAB_STE_1_EATS, le64_to_cpu(target->data[1])) ==
+ STRTAB_STE_1_EATS_TRANS;
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
@@ -2762,10 +2765,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+static void arm_smmu_update_ste(struct arm_smmu_master *master,
+ struct iommu_domain *sid_domain,
+ bool want_ats)
+{
+ unsigned int s1dss = STRTAB_STE_1_S1DSS_TERMINATE;
+ struct arm_smmu_ste ste;
+
+ if (master->cd_table.in_ste && master->ste_ats_enabled == want_ats)
+ return;
+
+ if (sid_domain->type == IOMMU_DOMAIN_IDENTITY)
+ s1dss = STRTAB_STE_1_S1DSS_BYPASS;
+ else
+ WARN_ON(sid_domain->type != IOMMU_DOMAIN_BLOCKED);
+
+ /*
+ * Change the STE into a cdtable one with SID IDENTITY/BLOCKED behavior
+ * using s1dss if necessary. The cd_table is already installed then
+ * the S1DSS is correct and this will just update the EATS. Otherwise
+ * it installs the entire thing. This will be hitless.
+ */
+ arm_smmu_make_cdtable_ste(&ste, master, want_ats, s1dss);
+ arm_smmu_install_ste_for_dev(master, &ste);
+}
+
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd)
{
+ struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct attach_state state = {
/*
* For now the core code prevents calling this when a domain is
@@ -2781,8 +2810,10 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (smmu_domain->smmu != master->smmu)
return -EINVAL;
- if (!master->cd_table.in_ste)
- return -ENODEV;
+ if (!master->cd_table.in_ste &&
+ sid_domain->type != IOMMU_DOMAIN_IDENTITY &&
+ sid_domain->type != IOMMU_DOMAIN_BLOCKED)
+ return -EINVAL;
cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
if (!cdptr)
@@ -2794,6 +2825,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
goto out_unlock;
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
+ arm_smmu_update_ste(master, sid_domain, state.want_ats);
arm_smmu_attach_commit(master, &state);
@@ -2820,6 +2852,19 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
arm_smmu_atc_inv_master(master, pasid);
arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
mutex_unlock(&master->smmu->asid_lock);
+
+ /*
+ * When the last user of the CD table goes away downgrade the STE back
+ * to a non-cd_table one.
+ */
+ if (!arm_smmu_ssids_in_use(&master->cd_table)) {
+ struct iommu_domain *sid_domain =
+ iommu_get_domain_for_dev(master->dev);
+
+ if (domain->type == IOMMU_DOMAIN_IDENTITY ||
+ domain->type == IOMMU_DOMAIN_BLOCKED)
+ sid_domain->ops->attach_dev(sid_domain, dev);
+ }
}
static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 12eabafbb70c9c..853cd17d06e671 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -706,7 +706,8 @@ struct arm_smmu_master {
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
unsigned int num_streams;
- bool ats_enabled;
+ bool ats_enabled : 1;
+ bool ste_ats_enabled : 1;
bool stall_enabled;
bool sva_enabled;
bool iopf_enabled;
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* [PATCH v6 23/29] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
From: Jason Gunthorpe @ 2024-03-27 18:08 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Lu Baolu, Eric Auger, Jean-Philippe Brucker, Joerg Roedel,
Kevin Tian, kernel test robot, Moritz Fischer, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameer Kolothum,
Mostafa Saleh, Tony Zhu, Yi Liu, Zhangfei Gao
In-Reply-To: <0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com>
This removes all the notifier de-duplication logic in the driver and
relies on the core code to de-duplicate and allocate only one SVA domain
per mm per smmu instance. This naturally gives a 1:1 relationship between
SVA domain and mmu notifier.
Remove all of the previous mmu_notifier, bond, shared cd, and cd refcount
logic entirely.
For the purpose of organizing patches lightly remove BTM support. The next
patches will add it back in. BTM is a performance optimization so this is
bisection friendly functionally invisible change. Note that the current
code never even enables ARM_SMMU_FEAT_BTM so none of this code is ever
even used.
The bond/shared_cd/btm/asid allocator are tightly wound together and
changing them all at once would make this patch too big. The core issue is
that having a single SVA domain per-smmu instance conflicts with the
design of having a global ASID table that BTM currently needs, as we would
end up having to assign multiple SVA domains to the same ASID.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 396 ++++--------------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 78 +---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 14 +-
3 files changed, 97 insertions(+), 391 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index e337e40ac5de31..fd0b9f230f89e3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -12,29 +12,9 @@
#include "arm-smmu-v3.h"
#include "../../io-pgtable-arm.h"
-struct arm_smmu_mmu_notifier {
- struct mmu_notifier mn;
- struct arm_smmu_ctx_desc *cd;
- bool cleared;
- refcount_t refs;
- struct list_head list;
- struct arm_smmu_domain *domain;
-};
-
-#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn)
-
-struct arm_smmu_bond {
- struct mm_struct *mm;
- struct arm_smmu_mmu_notifier *smmu_mn;
- struct list_head list;
-};
-
-#define sva_to_bond(handle) \
- container_of(handle, struct arm_smmu_bond, sva)
-
static DEFINE_MUTEX(sva_lock);
-static void
+static void __maybe_unused
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
struct arm_smmu_master_domain *master_domain;
@@ -57,58 +37,6 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
}
-/*
- * Check if the CPU ASID is available on the SMMU side. If a private context
- * descriptor is using it, try to replace it.
- */
-static struct arm_smmu_ctx_desc *
-arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
-{
- int ret;
- u32 new_asid;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_device *smmu;
- struct arm_smmu_domain *smmu_domain;
-
- cd = xa_load(&arm_smmu_asid_xa, asid);
- if (!cd)
- return NULL;
-
- if (cd->mm) {
- if (WARN_ON(cd->mm != mm))
- return ERR_PTR(-EINVAL);
- /* All devices bound to this mm use the same cd struct. */
- refcount_inc(&cd->refs);
- return cd;
- }
-
- smmu_domain = container_of(cd, struct arm_smmu_domain, cd);
- smmu = smmu_domain->smmu;
-
- ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- if (ret)
- return ERR_PTR(-ENOSPC);
- /*
- * Race with unmap: TLB invalidations will start targeting the new ASID,
- * which isn't assigned yet. We'll do an invalidate-all on the old ASID
- * later, so it doesn't matter.
- */
- cd->asid = new_asid;
- /*
- * Update ASID and invalidate CD in all associated masters. There will
- * be some overlap between use of both ASIDs, until we invalidate the
- * TLB.
- */
- arm_smmu_update_s1_domain_cd_entry(smmu_domain);
-
- /* Invalidate TLB entries previously associated with that context */
- arm_smmu_tlb_inv_asid(smmu, asid);
-
- xa_erase(&arm_smmu_asid_xa, asid);
- return NULL;
-}
-
static u64 page_size_to_cd(void)
{
static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K ||
@@ -122,7 +50,8 @@ static u64 page_size_to_cd(void)
static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
- struct mm_struct *mm, u16 asid)
+ struct mm_struct *mm, u16 asid,
+ bool btm_invalidation)
{
u64 par;
@@ -143,7 +72,7 @@ static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
(master->stall_enabled ? CTXDESC_CD_0_S : 0) |
CTXDESC_CD_0_R |
CTXDESC_CD_0_A |
- CTXDESC_CD_0_ASET |
+ (btm_invalidation ? 0 : CTXDESC_CD_0_ASET) |
FIELD_PREP(CTXDESC_CD_0_ASID, asid));
/*
@@ -185,69 +114,6 @@ static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
target->data[3] = cpu_to_le64(read_sysreg(mair_el1));
}
-static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
-{
- u16 asid;
- int err = 0;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_ctx_desc *ret = NULL;
-
- /* Don't free the mm until we release the ASID */
- mmgrab(mm);
-
- asid = arm64_mm_context_get(mm);
- if (!asid) {
- err = -ESRCH;
- goto out_drop_mm;
- }
-
- cd = kzalloc(sizeof(*cd), GFP_KERNEL);
- if (!cd) {
- err = -ENOMEM;
- goto out_put_context;
- }
-
- refcount_set(&cd->refs, 1);
-
- mutex_lock(&arm_smmu_asid_lock);
- ret = arm_smmu_share_asid(mm, asid);
- if (ret) {
- mutex_unlock(&arm_smmu_asid_lock);
- goto out_free_cd;
- }
-
- err = xa_insert(&arm_smmu_asid_xa, asid, cd, GFP_KERNEL);
- mutex_unlock(&arm_smmu_asid_lock);
-
- if (err)
- goto out_free_asid;
-
- cd->asid = asid;
- cd->mm = mm;
-
- return cd;
-
-out_free_asid:
- arm_smmu_free_asid(cd);
-out_free_cd:
- kfree(cd);
-out_put_context:
- arm64_mm_context_put(mm);
-out_drop_mm:
- mmdrop(mm);
- return err < 0 ? ERR_PTR(err) : ret;
-}
-
-static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
-{
- if (arm_smmu_free_asid(cd)) {
- /* Unpin ASID */
- arm64_mm_context_put(cd->mm);
- mmdrop(cd->mm);
- kfree(cd);
- }
-}
-
/*
* Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
* is used as a threshold to replace per-page TLBI commands to issue in the
@@ -262,8 +128,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
unsigned long start,
unsigned long end)
{
- struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
size_t size;
/*
@@ -280,34 +146,27 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
size = 0;
}
- if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
+ if (!smmu_domain->btm_invalidation) {
if (!size)
arm_smmu_tlb_inv_asid(smmu_domain->smmu,
- smmu_mn->cd->asid);
+ smmu_domain->cd.asid);
else
arm_smmu_tlb_inv_range_asid(start, size,
- smmu_mn->cd->asid,
+ smmu_domain->cd.asid,
PAGE_SIZE, false,
smmu_domain);
}
- arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
- size);
+ arm_smmu_atc_inv_domain(smmu_domain, start, size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
- struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
struct arm_smmu_master_domain *master_domain;
unsigned long flags;
- mutex_lock(&sva_lock);
- if (smmu_mn->cleared) {
- mutex_unlock(&sva_lock);
- return;
- }
-
/*
* DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
* but disable translation.
@@ -319,25 +178,27 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
- cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm));
+ cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
- arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid);
- arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr,
+ arm_smmu_make_sva_cd(&target, master, NULL,
+ smmu_domain->cd.asid,
+ smmu_domain->btm_invalidation);
+ arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
- arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
-
- smmu_mn->cleared = true;
- mutex_unlock(&sva_lock);
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
{
- kfree(mn_to_smmu(mn));
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
+
+ kfree(smmu_domain);
}
static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
@@ -346,115 +207,6 @@ static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
.free_notifier = arm_smmu_mmu_notifier_free,
};
-/* Allocate or get existing MMU notifier for this {domain, mm} pair */
-static struct arm_smmu_mmu_notifier *
-arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain,
- struct mm_struct *mm)
-{
- int ret;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_mmu_notifier *smmu_mn;
-
- list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) {
- if (smmu_mn->mn.mm == mm) {
- refcount_inc(&smmu_mn->refs);
- return smmu_mn;
- }
- }
-
- cd = arm_smmu_alloc_shared_cd(mm);
- if (IS_ERR(cd))
- return ERR_CAST(cd);
-
- smmu_mn = kzalloc(sizeof(*smmu_mn), GFP_KERNEL);
- if (!smmu_mn) {
- ret = -ENOMEM;
- goto err_free_cd;
- }
-
- refcount_set(&smmu_mn->refs, 1);
- smmu_mn->cd = cd;
- smmu_mn->domain = smmu_domain;
- smmu_mn->mn.ops = &arm_smmu_mmu_notifier_ops;
-
- ret = mmu_notifier_register(&smmu_mn->mn, mm);
- if (ret) {
- kfree(smmu_mn);
- goto err_free_cd;
- }
-
- list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers);
- return smmu_mn;
-
-err_free_cd:
- arm_smmu_free_shared_cd(cd);
- return ERR_PTR(ret);
-}
-
-static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
-{
- struct mm_struct *mm = smmu_mn->mn.mm;
- struct arm_smmu_ctx_desc *cd = smmu_mn->cd;
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
-
- if (!refcount_dec_and_test(&smmu_mn->refs))
- return;
-
- list_del(&smmu_mn->list);
-
- /*
- * If we went through clear(), we've already invalidated, and no
- * new TLB entry can have been formed.
- */
- if (!smmu_mn->cleared) {
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
- arm_smmu_atc_inv_domain_sva(smmu_domain,
- mm_get_enqcmd_pasid(mm), 0, 0);
- }
-
- /* Frees smmu_mn */
- mmu_notifier_put(&smmu_mn->mn);
- arm_smmu_free_shared_cd(cd);
-}
-
-static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
- struct mm_struct *mm)
-{
- int ret;
- struct arm_smmu_bond *bond;
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
- struct arm_smmu_domain *smmu_domain;
-
- if (!(domain->type & __IOMMU_DOMAIN_PAGING))
- return ERR_PTR(-ENODEV);
- smmu_domain = to_smmu_domain(domain);
- if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
- return ERR_PTR(-ENODEV);
-
- if (!master || !master->sva_enabled)
- return ERR_PTR(-ENODEV);
-
- bond = kzalloc(sizeof(*bond), GFP_KERNEL);
- if (!bond)
- return ERR_PTR(-ENOMEM);
-
- bond->mm = mm;
-
- bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
- if (IS_ERR(bond->smmu_mn)) {
- ret = PTR_ERR(bond->smmu_mn);
- goto err_free_bond;
- }
-
- list_add(&bond->list, &master->bonds);
- return bond;
-
-err_free_bond:
- kfree(bond);
- return ERR_PTR(ret);
-}
-
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
{
unsigned long reg, fld;
@@ -571,11 +323,6 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
{
mutex_lock(&sva_lock);
- if (!list_empty(&master->bonds)) {
- dev_err(master->dev, "cannot disable SVA, device is bound\n");
- mutex_unlock(&sva_lock);
- return -EBUSY;
- }
arm_smmu_master_sva_disable_iopf(master);
master->sva_enabled = false;
mutex_unlock(&sva_lock);
@@ -592,66 +339,52 @@ void arm_smmu_sva_notifier_synchronize(void)
mmu_notifier_synchronize();
}
-void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id)
-{
- struct mm_struct *mm = domain->mm;
- struct arm_smmu_bond *bond = NULL, *t;
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-
- arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
-
- mutex_lock(&sva_lock);
- list_for_each_entry(t, &master->bonds, list) {
- if (t->mm == mm) {
- bond = t;
- break;
- }
- }
-
- if (!WARN_ON(!bond)) {
- list_del(&bond->list);
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
- kfree(bond);
- }
- mutex_unlock(&sva_lock);
-}
-
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id)
{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct mm_struct *mm = domain->mm;
- struct arm_smmu_bond *bond;
struct arm_smmu_cd target;
int ret;
- if (mm_get_enqcmd_pasid(mm) != id || !master->cd_table.in_ste)
+ /* Prevent arm_smmu_mm_release from being called while we are attaching */
+ if (!mmget_not_zero(domain->mm))
return -EINVAL;
- mutex_lock(&sva_lock);
- bond = __arm_smmu_sva_bind(dev, mm);
- if (IS_ERR(bond)) {
- mutex_unlock(&sva_lock);
- return PTR_ERR(bond);
- }
+ /*
+ * This does not need the arm_smmu_asid_lock because SVA domains never
+ * get reassigned
+ */
+ arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid,
+ smmu_domain->btm_invalidation);
+ ret = arm_smmu_set_pasid(master, smmu_domain, id, &target);
- arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
- if (ret) {
- list_del(&bond->list);
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
- kfree(bond);
- mutex_unlock(&sva_lock);
- return ret;
- }
- mutex_unlock(&sva_lock);
- return 0;
+ mmput(domain->mm);
+ return ret;
}
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
- kfree(to_smmu_domain(domain));
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+ /*
+ * Ensure the ASID is empty in the iommu cache before allowing reuse.
+ */
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+
+ /*
+ * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
+ * still be called/running at this point. We allow the ASID to be
+ * reused, and if there is a race then it just suffers harmless
+ * unnecessary invalidation.
+ */
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+
+ /*
+ * Actual free is defered to the SRCU callback
+ * arm_smmu_mmu_notifier_free()
+ */
+ mmu_notifier_put(&smmu_domain->mmu_notifier);
}
static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
@@ -665,6 +398,8 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_domain *smmu_domain;
+ u32 asid;
+ int ret;
smmu_domain = arm_smmu_domain_alloc();
if (IS_ERR(smmu_domain))
@@ -674,5 +409,22 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
smmu_domain->smmu = smmu;
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
+ if (ret)
+ goto err_free;
+
+ smmu_domain->cd.asid = asid;
+ smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops;
+ ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm);
+ if (ret)
+ goto err_asid;
+
return &smmu_domain->domain;
+
+err_asid:
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+err_free:
+ kfree(smmu_domain);
+ return ERR_PTR(ret);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7b001afda17aa8..a82a5e4a13bb44 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1446,22 +1446,6 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
cd_table->cdtab = NULL;
}
-bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
-{
- bool free;
- struct arm_smmu_ctx_desc *old_cd;
-
- if (!cd->asid)
- return false;
-
- free = refcount_dec_and_test(&cd->refs);
- if (free) {
- old_cd = xa_erase(&arm_smmu_asid_xa, cd->asid);
- WARN_ON(old_cd != cd);
- }
- return free;
-}
-
/* Stream table manipulation functions */
static void
arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc)
@@ -2021,8 +2005,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size)
+int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
int i;
@@ -2060,15 +2044,7 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
if (!master->ats_enabled)
continue;
- /*
- * Non-zero ssid means SVA is co-opting the S1 domain to issue
- * invalidations for SVA PASIDs.
- */
- if (ssid != IOMMU_NO_PASID)
- arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
- else
- arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
- &cmd);
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
@@ -2080,19 +2056,6 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
}
-static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- unsigned long iova, size_t size)
-{
- return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
- size);
-}
-
-int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size)
-{
- return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
-}
-
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
@@ -2281,7 +2244,6 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void)
mutex_init(&smmu_domain->init_mutex);
INIT_LIST_HEAD(&smmu_domain->devices);
spin_lock_init(&smmu_domain->devices_lock);
- INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
return smmu_domain;
}
@@ -2324,7 +2286,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
/* Prevent SVA from touching the CD while we're freeing it */
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_free_asid(&smmu_domain->cd);
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
mutex_unlock(&arm_smmu_asid_lock);
} else {
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
@@ -2342,11 +2304,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
u32 asid;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
- refcount_set(&cd->refs, 1);
-
/* Prevent SVA from modifying the ASID until it is written to the CD */
mutex_lock(&arm_smmu_asid_lock);
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd,
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
cd->asid = (u16)asid;
mutex_unlock(&arm_smmu_asid_lock);
@@ -2805,6 +2765,9 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
/* The core code validates pasid */
+ if (smmu_domain->smmu != master->smmu)
+ return -EINVAL;
+
if (!master->cd_table.in_ste)
return -ENODEV;
@@ -2826,9 +2789,18 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
return ret;
}
-void arm_smmu_remove_pasid(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
+static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_domain *smmu_domain;
+ struct iommu_domain *domain;
+
+ domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
+ if (WARN_ON(IS_ERR(domain)) || !domain)
+ return;
+
+ smmu_domain = to_smmu_domain(domain);
+
mutex_lock(&arm_smmu_asid_lock);
arm_smmu_clear_cd(master, pasid);
if (master->ats_enabled)
@@ -3105,7 +3077,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
- INIT_LIST_HEAD(&master->bonds);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
@@ -3287,17 +3258,6 @@ static int arm_smmu_def_domain_type(struct device *dev)
return 0;
}
-static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
-{
- struct iommu_domain *domain;
-
- domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
- if (WARN_ON(IS_ERR(domain)) || !domain)
- return;
-
- arm_smmu_sva_remove_dev_pasid(domain, dev, pasid);
-}
-
static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 107a39f1dfe869..3516869954ea33 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc {
struct arm_smmu_ctx_desc {
u16 asid;
-
- refcount_t refs;
- struct mm_struct *mm;
};
struct arm_smmu_l1_ctx_desc {
@@ -711,7 +708,6 @@ struct arm_smmu_master {
bool stall_enabled;
bool sva_enabled;
bool iopf_enabled;
- struct list_head bonds;
unsigned int ssid_bits;
};
@@ -740,7 +736,8 @@ struct arm_smmu_domain {
struct list_head devices;
spinlock_t devices_lock;
- struct list_head mmu_notifiers;
+ struct mmu_notifier mmu_notifier;
+ bool btm_invalidation;
};
struct arm_smmu_master_domain {
@@ -779,9 +776,8 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
-bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
-int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size);
+int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
@@ -793,8 +789,6 @@ bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
-void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
{
--
2.43.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
* Re: [PATCH v2 1/2] dt-bindings: iommu: arm,smmu-v3: Add SC8280XP compatible
From: Konrad Dybcio @ 2024-03-27 19:23 UTC (permalink / raw)
To: Robin Murphy, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Will Deacon, Joerg Roedel, Rob Herring
Cc: Marijn Suijten, linux-arm-msm, devicetree, linux-kernel,
linux-arm-kernel, iommu, Johan Hovold
In-Reply-To: <9b2a681e-1191-4cf7-8da7-14aa2c1fa455@arm.com>
On 19.03.2024 2:53 PM, Robin Murphy wrote:
> On 2024-03-09 1:31 pm, Konrad Dybcio wrote:
>> The smmu-v3 binding currently doesn't differentiate the SoCs it's
>> implemented on. This is a poor design choice that may bite in the future,
>> should any quirks surface.
>
> That doesn't seem entirely fair to say - the vast majority of bindings don't have separate compatibles for every known integration of the same implementation in different SoCs. And in this case we don't have per-implementation compatibles for quirks and errata because the implementation is architecturally discoverable from the SMMU_IIDR register.
>
> We have the whole mess for QCom SMMUv2 because the effective *implementation* is a mix of hardware and hypervisor, whose behaviour does seem to vary on almost a per-SoC basis. I'm not at all keen to start repeating that here without very good reason, and that of "documenting" a device which we typically expect to not even be accessible isn't really convincing me...
From my POV as an arch dts maintainer, this often ends up being the only
way to retroactively add some conditional action into the code - the kernel
is supposed to be backwards compatible with older device trees.
And so far it's been almost by luck that all of the smmuv3 implementations
have been a straight copy-and-paste of the reference design (or close enough),
I don't believe this will be for much longer.
Konrad
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox