LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 2/2] powerpc/pseries: Pass MSI affinity to irq_create_mapping()
From: Laurent Vivier @ 2020-11-26  8:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
	linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201126082852.1178497-1-lvivier@redhat.com>

With virtio multiqueue, normally each queue IRQ is mapped to a CPU.

Commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity") exposed
an existing shortcoming of the arch code by moving virtio_scsi to
the automatic IRQ affinity assignment.

The affinity is correctly computed in msi_desc but this is not applied
to the system IRQs.

It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
lost at this point and never passed to irq_domain_alloc_descs()
(see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
because irq_create_mapping() doesn't take an affinity parameter.

As the previous patch has added the affinity parameter to
irq_create_mapping() we can forward the affinity from rtas_setup_msi_irqs()
to irq_domain_alloc_descs().

With this change, the virtqueues are correctly dispatched between the CPUs
on pseries.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/platforms/pseries/msi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 133f6adcb39c..b3ac2455faad 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}
 
-		virq = irq_create_mapping(NULL, hwirq);
+		virq = irq_create_mapping_affinity(NULL, hwirq,
+						   entry->affinity);
 
 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-- 
2.28.0


^ permalink raw reply related

* [PATCH v4 1/2] genirq/irqdomain: Add an irq_create_mapping_affinity() function
From: Laurent Vivier @ 2020-11-26  8:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
	linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201126082852.1178497-1-lvivier@redhat.com>

There is currently no way to convey the affinity of an interrupt
via irq_create_mapping(), which creates issues for devices that
expect that affinity to be managed by the kernel.

In order to sort this out, rename irq_create_mapping() to
irq_create_mapping_affinity() with an additional affinity parameter
that can conveniently passed down to irq_domain_alloc_descs().

irq_create_mapping() is then re-implemented as a wrapper around
irq_create_mapping_affinity().

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
---
 include/linux/irqdomain.h | 12 ++++++++++--
 kernel/irq/irqdomain.c    | 13 ++++++++-----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 71535e87109f..ea5a337e0f8b 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -384,11 +384,19 @@ extern void irq_domain_associate_many(struct irq_domain *domain,
 extern void irq_domain_disassociate(struct irq_domain *domain,
 				    unsigned int irq);
 
-extern unsigned int irq_create_mapping(struct irq_domain *host,
-				       irq_hw_number_t hwirq);
+extern unsigned int irq_create_mapping_affinity(struct irq_domain *host,
+				      irq_hw_number_t hwirq,
+				      const struct irq_affinity_desc *affinity);
 extern unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec);
 extern void irq_dispose_mapping(unsigned int virq);
 
+static inline unsigned int irq_create_mapping(struct irq_domain *host,
+					      irq_hw_number_t hwirq)
+{
+	return irq_create_mapping_affinity(host, hwirq, NULL);
+}
+
+
 /**
  * irq_linear_revmap() - Find a linux irq from a hw irq number.
  * @domain: domain owning this hardware interrupt
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cf8b374b892d..e4ca69608f3b 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -624,17 +624,19 @@ unsigned int irq_create_direct_mapping(struct irq_domain *domain)
 EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 
 /**
- * irq_create_mapping() - Map a hardware interrupt into linux irq space
+ * irq_create_mapping_affinity() - Map a hardware interrupt into linux irq space
  * @domain: domain owning this hardware interrupt or NULL for default domain
  * @hwirq: hardware irq number in that domain space
+ * @affinity: irq affinity
  *
  * Only one mapping per hardware interrupt is permitted. Returns a linux
  * irq number.
  * If the sense/trigger is to be specified, set_irq_type() should be called
  * on the number returned from that call.
  */
-unsigned int irq_create_mapping(struct irq_domain *domain,
-				irq_hw_number_t hwirq)
+unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
+				       irq_hw_number_t hwirq,
+				       const struct irq_affinity_desc *affinity)
 {
 	struct device_node *of_node;
 	int virq;
@@ -660,7 +662,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node), NULL);
+	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node),
+				      affinity);
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -676,7 +679,7 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 
 	return virq;
 }
-EXPORT_SYMBOL_GPL(irq_create_mapping);
+EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
 
 /**
  * irq_create_strict_mappings() - Map a range of hw irqs to fixed linux irqs
-- 
2.28.0


^ permalink raw reply related

* [PATCH v4 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries
From: Laurent Vivier @ 2020-11-26  8:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
	linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig

With virtio, in multiqueue case, each queue IRQ is normally
bound to a different CPU using the affinity mask.

This works fine on x86_64 but totally ignored on pseries.

This is not obvious at first look because irqbalance is doing
some balancing to improve that.

It appears that the "managed" flag set in the MSI entry
is never copied to the system IRQ entry.

This series passes the affinity mask from rtas_setup_msi_irqs()
to irq_domain_alloc_descs() by adding an affinity parameter to
irq_create_mapping().

The first patch adds the parameter (no functional change), the
second patch passes the actual affinity mask to irq_create_mapping()
in rtas_setup_msi_irqs().

For instance, with 32 CPUs VM and 32 queues virtio-scsi interface:

... -smp 32 -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=32

for IRQ in $(grep virtio2-request /proc/interrupts |cut -d: -f1); do
    for file in /proc/irq/$IRQ/ ; do
        echo -n "IRQ: $(basename $file) CPU: " ; cat $file/smp_affinity_list
    done
done

Without the patch (and without irqbalanced)

IRQ: 268 CPU: 0-31
IRQ: 269 CPU: 0-31
IRQ: 270 CPU: 0-31
IRQ: 271 CPU: 0-31
IRQ: 272 CPU: 0-31
IRQ: 273 CPU: 0-31
IRQ: 274 CPU: 0-31
IRQ: 275 CPU: 0-31
IRQ: 276 CPU: 0-31
IRQ: 277 CPU: 0-31
IRQ: 278 CPU: 0-31
IRQ: 279 CPU: 0-31
IRQ: 280 CPU: 0-31
IRQ: 281 CPU: 0-31
IRQ: 282 CPU: 0-31
IRQ: 283 CPU: 0-31
IRQ: 284 CPU: 0-31
IRQ: 285 CPU: 0-31
IRQ: 286 CPU: 0-31
IRQ: 287 CPU: 0-31
IRQ: 288 CPU: 0-31
IRQ: 289 CPU: 0-31
IRQ: 290 CPU: 0-31
IRQ: 291 CPU: 0-31
IRQ: 292 CPU: 0-31
IRQ: 293 CPU: 0-31
IRQ: 294 CPU: 0-31
IRQ: 295 CPU: 0-31
IRQ: 296 CPU: 0-31
IRQ: 297 CPU: 0-31
IRQ: 298 CPU: 0-31
IRQ: 299 CPU: 0-31

With the patch:

IRQ: 265 CPU: 0
IRQ: 266 CPU: 1
IRQ: 267 CPU: 2
IRQ: 268 CPU: 3
IRQ: 269 CPU: 4
IRQ: 270 CPU: 5
IRQ: 271 CPU: 6
IRQ: 272 CPU: 7
IRQ: 273 CPU: 8
IRQ: 274 CPU: 9
IRQ: 275 CPU: 10
IRQ: 276 CPU: 11
IRQ: 277 CPU: 12
IRQ: 278 CPU: 13
IRQ: 279 CPU: 14
IRQ: 280 CPU: 15
IRQ: 281 CPU: 16
IRQ: 282 CPU: 17
IRQ: 283 CPU: 18
IRQ: 284 CPU: 19
IRQ: 285 CPU: 20
IRQ: 286 CPU: 21
IRQ: 287 CPU: 22
IRQ: 288 CPU: 23
IRQ: 289 CPU: 24
IRQ: 290 CPU: 25
IRQ: 291 CPU: 26
IRQ: 292 CPU: 27
IRQ: 293 CPU: 28
IRQ: 294 CPU: 29
IRQ: 295 CPU: 30
IRQ: 299 CPU: 31

This matches what we have on an x86_64 system.

v4: udate changelog of PATCH 2, add Michael's Acked-by
v3: update changelog of PATCH 1 with comments from Thomas Gleixner and
    Marc Zyngier.
v2: add a wrapper around original irq_create_mapping() with the
    affinity parameter. Update comments

Laurent Vivier (2):
  genirq/irqdomain: Add an irq_create_mapping_affinity() function
  powerpc/pseries: Pass MSI affinity to irq_create_mapping()

 arch/powerpc/platforms/pseries/msi.c |  3 ++-
 include/linux/irqdomain.h            | 12 ++++++++++--
 kernel/irq/irqdomain.c               | 13 ++++++++-----
 3 files changed, 20 insertions(+), 8 deletions(-)

-- 
2.28.0



^ permalink raw reply

* Re: [PATCH v6 16/22] powerpc/book3s64/kuap: Improve error reporting with KUAP
From: Aneesh Kumar K.V @ 2020-11-26  7:44 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev, mpe
In-Reply-To: <bd854266-6cb5-3a04-ae80-a53e03f1e1d3@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:

> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
>> With hash translation use DSISR_KEYFAULT to identify a wrong access.
>> With Radix we look at the AMR value and type of fault.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   arch/powerpc/include/asm/book3s/32/kup.h     |  4 +--
>>   arch/powerpc/include/asm/book3s/64/kup.h     | 27 ++++++++++++++++----
>>   arch/powerpc/include/asm/kup.h               |  4 +--
>>   arch/powerpc/include/asm/nohash/32/kup-8xx.h |  4 +--
>>   arch/powerpc/mm/fault.c                      |  2 +-
>>   5 files changed, 29 insertions(+), 12 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
>> index 32fd4452e960..b18cd931e325 100644
>> --- a/arch/powerpc/include/asm/book3s/32/kup.h
>> +++ b/arch/powerpc/include/asm/book3s/32/kup.h
>> @@ -177,8 +177,8 @@ static inline void restore_user_access(unsigned long flags)
>>   		allow_user_access(to, to, end - addr, KUAP_READ_WRITE);
>>   }
>>   
>> -static inline bool
>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>> +				  bool is_write, unsigned long error_code)
>>   {
>>   	unsigned long begin = regs->kuap & 0xf0000000;
>>   	unsigned long end = regs->kuap << 28;
>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
>> index 4a3d0d601745..2922c442a218 100644
>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>> @@ -301,12 +301,29 @@ static inline void set_kuap(unsigned long value)
>>   	isync();
>>   }
>>   
>> -static inline bool
>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>> +#define RADIX_KUAP_BLOCK_READ	UL(0x4000000000000000)
>> +#define RADIX_KUAP_BLOCK_WRITE	UL(0x8000000000000000)
>> +
>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>> +				  bool is_write, unsigned long error_code)
>>   {
>> -	return WARN(mmu_has_feature(MMU_FTR_KUAP) &&
>> -		    (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)),
>> -		    "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>> +	if (!mmu_has_feature(MMU_FTR_KUAP))
>> +		return false;
>> +
>> +	if (radix_enabled()) {
>> +		/*
>> +		 * Will be a storage protection fault.
>> +		 * Only check the details of AMR[0]
>> +		 */
>> +		return WARN((regs->kuap & (is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ)),
>> +			    "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>
> I think it is pointless to keep the WARN() here.
>
> I have a series aiming at removing them. See 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/cc9129bdda1dbc2f0a09cf45fece7d0b0e690784.1605541983.git.christophe.leroy@csgroup.eu/

Can we do this as a spearate patch as you posted above? We can drop the
WARN in that while keeping the hash branch to look at DSISR value.

-aneesh

^ permalink raw reply

* Re: [PATCH v6 09/22] powerpc/exec: Set thread.regs early during exec
From: Christophe Leroy @ 2020-11-26  7:43 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe
In-Reply-To: <87k0u8tvoj.fsf@linux.ibm.com>



Le 26/11/2020 à 08:38, Aneesh Kumar K.V a écrit :
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> 
>> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
> ....
> 
>> +++ b/arch/powerpc/kernel/process.c
>>> @@ -1530,10 +1530,32 @@ void flush_thread(void)
>>>    #ifdef CONFIG_PPC_BOOK3S_64
>>>    void arch_setup_new_exec(void)
>>>    {
>>> -	if (radix_enabled())
>>> -		return;
>>> -	hash__setup_new_exec();
>>> +	if (!radix_enabled())
>>> +		hash__setup_new_exec();
>>> +
>>> +	/*
>>> +	 * If we exec out of a kernel thread then thread.regs will not be
>>> +	 * set.  Do it now.
>>> +	 */
>>> +	if (!current->thread.regs) {
>>> +		struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>>> +		current->thread.regs = regs - 1;
>>> +	}
>>> +
>>> +}
>>> +#else
>>> +void arch_setup_new_exec(void)
>>> +{
>>> +	/*
>>> +	 * If we exec out of a kernel thread then thread.regs will not be
>>> +	 * set.  Do it now.
>>> +	 */
>>> +	if (!current->thread.regs) {
>>> +		struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>>> +		current->thread.regs = regs - 1;
>>> +	}
>>>    }
>>> +
>>>    #endif
>>
>> No need to duplicate arch_setup_new_exec() I think. radix_enabled() is defined at all time so the
>> first function should be valid at all time.
>>
> 
> arch/powerpc/kernel/process.c: In function ‘arch_setup_new_exec’:
> arch/powerpc/kernel/process.c:1529:3: error: implicit declaration of function ‘hash__setup_new_exec’; did you mean ‘arch_setup_new_exec’? [-Werror=implicit-function-declaration]
>   1529 |   hash__setup_new_exec();
>        |   ^~~~~~~~~~~~~~~~~~~~
>        |   arch_setup_new_exec
> 
> 
> That requires us to have hash__setup_new_exec prototype for all platforms.

Yes indeed.

So maybe, just enclose that part in the #ifdef instead of duplicating the common part ?

Christophe

^ permalink raw reply

* Re: [PATCH v6 09/22] powerpc/exec: Set thread.regs early during exec
From: Aneesh Kumar K.V @ 2020-11-26  7:38 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev, mpe
In-Reply-To: <f5960226-f451-41ed-2992-bbe0acf9d190@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:

> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
....

> +++ b/arch/powerpc/kernel/process.c
>> @@ -1530,10 +1530,32 @@ void flush_thread(void)
>>   #ifdef CONFIG_PPC_BOOK3S_64
>>   void arch_setup_new_exec(void)
>>   {
>> -	if (radix_enabled())
>> -		return;
>> -	hash__setup_new_exec();
>> +	if (!radix_enabled())
>> +		hash__setup_new_exec();
>> +
>> +	/*
>> +	 * If we exec out of a kernel thread then thread.regs will not be
>> +	 * set.  Do it now.
>> +	 */
>> +	if (!current->thread.regs) {
>> +		struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>> +		current->thread.regs = regs - 1;
>> +	}
>> +
>> +}
>> +#else
>> +void arch_setup_new_exec(void)
>> +{
>> +	/*
>> +	 * If we exec out of a kernel thread then thread.regs will not be
>> +	 * set.  Do it now.
>> +	 */
>> +	if (!current->thread.regs) {
>> +		struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>> +		current->thread.regs = regs - 1;
>> +	}
>>   }
>> +
>>   #endif
>
> No need to duplicate arch_setup_new_exec() I think. radix_enabled() is defined at all time so the 
> first function should be valid at all time.
>

arch/powerpc/kernel/process.c: In function ‘arch_setup_new_exec’:
arch/powerpc/kernel/process.c:1529:3: error: implicit declaration of function ‘hash__setup_new_exec’; did you mean ‘arch_setup_new_exec’? [-Werror=implicit-function-declaration]
 1529 |   hash__setup_new_exec();
      |   ^~~~~~~~~~~~~~~~~~~~
      |   arch_setup_new_exec


That requires us to have hash__setup_new_exec prototype for all platforms.

-aneesh

^ permalink raw reply

* [PATCH] ASoC: fsl: Fix config name of CONFIG_ARCH_MXC
From: Shengjiu Wang @ 2020-11-26  6:14 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
	alsa-devel
  Cc: linuxppc-dev, linux-kernel

CONFIG_ARCH_MXC should be ARCH_MXC

Fixes: 674226db62ec ("ASoC: fsl: SND_SOC_FSL_AUD2HTX should depend on ARCH_MXC")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
index 7d48f4f98e8b..835a14821360 100644
--- a/sound/soc/fsl/Kconfig
+++ b/sound/soc/fsl/Kconfig
@@ -107,7 +107,7 @@ config SND_SOC_FSL_XCVR
 
 config SND_SOC_FSL_AUD2HTX
 	tristate "AUDIO TO HDMI TX module support"
-	depends on CONFIG_ARCH_MXC || COMPILE_TEST
+	depends on ARCH_MXC || COMPILE_TEST
 	help
 	  Say Y if you want to add AUDIO TO HDMI TX support for NXP.
 
-- 
2.27.0


^ permalink raw reply related

* Re: [PATCH] tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()
From: Jarkko Sakkinen @ 2020-11-26  3:35 UTC (permalink / raw)
  To: Wang Hai, mpe, benh, paulus, peterhuewe, jgg, stefanb, nayna
  Cc: linux-integrity, linuxppc-dev, linux-kernel
In-Reply-To: <20201124135244.31932-1-wanghai38@huawei.com>

On Tue, 2020-11-24 at 21:52 +0800, Wang Hai wrote:
> Fix to return a negative error code from the error handling
> case instead of 0, as done elsewhere in this function.
> 
> Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before
> proceeding")
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Wang Hai <wanghai38@huawei.com>

Provide a reasoning for -ETIMEOUT in the commit message.

/Jarkko

> ---
>  drivers/char/tpm/tpm_ibmvtpm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/char/tpm/tpm_ibmvtpm.c
> b/drivers/char/tpm/tpm_ibmvtpm.c
> index 994385bf37c0..813eb2cac0ce 100644
> --- a/drivers/char/tpm/tpm_ibmvtpm.c
> +++ b/drivers/char/tpm/tpm_ibmvtpm.c
> @@ -687,6 +687,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev
> *vio_dev,
>                                 ibmvtpm->rtce_buf != NULL,
>                                 HZ)) {
>                 dev_err(dev, "CRQ response timed out\n");
> +               rc = -ETIMEDOUT;
>                 goto init_irq_cleanup;
>         }
>  



^ permalink raw reply

* Re: [PATCH v6 04/22] powerpc/book3s64/kuap/kuep: Move uamor setup to pkey init
From: Michael Ellerman @ 2020-11-26  3:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-5-aneesh.kumar@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> This patch consolidates UAMOR update across pkey, kuap and kuep features.
> The boot cpu initialize UAMOR via pkey init and both radix/hash do the
> secondary cpu UAMOR init in early_init_mmu_secondary.
>
> We don't check for mmu_feature in radix secondary init because UAMOR
> is a supported SPRN with all CPUs supporting radix translation.
> The old code was not updating UAMOR if we had smap disabled and smep enabled.
> This change handles that case.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/radix_pgtable.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 3adcf730f478..bfe441af916a 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -620,9 +620,6 @@ void setup_kuap(bool disabled)
>  		cur_cpu_spec->mmu_features |= MMU_FTR_RADIX_KUAP;
>  	}
>  
> -	/* Make sure userspace can't change the AMR */
> -	mtspr(SPRN_UAMOR, 0);
> -
>  	/*
>  	 * Set the default kernel AMR values on all cpus.
>  	 */
> @@ -721,6 +718,11 @@ void radix__early_init_mmu_secondary(void)
>  
>  	radix__switch_mmu_context(NULL, &init_mm);
>  	tlbiel_all();
> +
> +#ifdef CONFIG_PPC_PKEY
> +	/* Make sure userspace can't change the AMR */
> +	mtspr(SPRN_UAMOR, 0);
> +#endif

If PPC_PKEY is disabled I think this leaves UAMOR unset, which means it
could potentially allow AMR to be used as a covert channel between
processes.

cheers

^ permalink raw reply

* Re: [PATCH v6 03/22] powerpc/book3s64/kuap/kuep: Make KUAP and KUEP a subfeature of PPC_MEM_KEYS
From: Michael Ellerman @ 2020-11-26  3:25 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-4-aneesh.kumar@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> The next set of patches adds support for kuap with hash translation.
> Hence make KUAP a BOOK3S_64 feature. Also make it a subfeature of
> PPC_MEM_KEYS. Hash translation is going to use pkeys to support
> KUAP/KUEP. Adding this dependency reduces the code complexity and
> enables us to move some of the initialization code to pkeys.c

The subject and change log don't really match the patch anymore since
you incorporated my changes.

This adds a new CONFIG called PPC_PKEY which is enabled if either PKEY
or KUAP/KUEP is enabled etc.

cheers

> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  .../powerpc/include/asm/book3s/64/kup-radix.h |  4 ++--
>  arch/powerpc/include/asm/book3s/64/mmu.h      |  2 +-
>  arch/powerpc/include/asm/ptrace.h             |  7 +++++-
>  arch/powerpc/kernel/asm-offsets.c             |  3 +++
>  arch/powerpc/mm/book3s64/Makefile             |  2 +-
>  arch/powerpc/mm/book3s64/pkeys.c              | 24 ++++++++++++-------
>  arch/powerpc/platforms/Kconfig.cputype        |  5 ++++
>  7 files changed, 33 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h
> index 28716e2f13e3..68eaa2fac3ab 100644
> --- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
> @@ -16,7 +16,7 @@
>  #ifdef CONFIG_PPC_KUAP
>  	BEGIN_MMU_FTR_SECTION_NESTED(67)
>  	mfspr	\gpr1, SPRN_AMR
> -	ld	\gpr2, STACK_REGS_KUAP(r1)
> +	ld	\gpr2, STACK_REGS_AMR(r1)
>  	cmpd	\gpr1, \gpr2
>  	beq	998f
>  	isync
> @@ -48,7 +48,7 @@
>  	bne	\msr_pr_cr, 99f
>  	.endif
>  	mfspr	\gpr1, SPRN_AMR
> -	std	\gpr1, STACK_REGS_KUAP(r1)
> +	std	\gpr1, STACK_REGS_AMR(r1)
>  	li	\gpr2, (AMR_KUAP_BLOCKED >> AMR_KUAP_SHIFT)
>  	sldi	\gpr2, \gpr2, AMR_KUAP_SHIFT
>  	cmpd	\use_cr, \gpr1, \gpr2
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index e0b52940e43c..a2a015066bae 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -199,7 +199,7 @@ extern int mmu_io_psize;
>  void mmu_early_init_devtree(void);
>  void hash__early_init_devtree(void);
>  void radix__early_init_devtree(void);
> -#ifdef CONFIG_PPC_MEM_KEYS
> +#ifdef CONFIG_PPC_PKEY
>  void pkey_early_init_devtree(void);
>  #else
>  static inline void pkey_early_init_devtree(void) {}
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index e2c778c176a3..e7f1caa007a4 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -53,9 +53,14 @@ struct pt_regs
>  #ifdef CONFIG_PPC64
>  			unsigned long ppr;
>  #endif
> +			union {
>  #ifdef CONFIG_PPC_KUAP
> -			unsigned long kuap;
> +				unsigned long kuap;
>  #endif
> +#ifdef CONFIG_PPC_PKEY
> +				unsigned long amr;
> +#endif
> +			};
>  		};
>  		unsigned long __pad[2];	/* Maintain 16 byte interrupt stack alignment */
>  	};
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index c2722ff36e98..418a0b314a33 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -354,6 +354,9 @@ int main(void)
>  	STACK_PT_REGS_OFFSET(_PPR, ppr);
>  #endif /* CONFIG_PPC64 */
>  
> +#ifdef CONFIG_PPC_PKEY
> +	STACK_PT_REGS_OFFSET(STACK_REGS_AMR, amr);
> +#endif
>  #ifdef CONFIG_PPC_KUAP
>  	STACK_PT_REGS_OFFSET(STACK_REGS_KUAP, kuap);
>  #endif
> diff --git a/arch/powerpc/mm/book3s64/Makefile b/arch/powerpc/mm/book3s64/Makefile
> index fd393b8be14f..1b56d3af47d4 100644
> --- a/arch/powerpc/mm/book3s64/Makefile
> +++ b/arch/powerpc/mm/book3s64/Makefile
> @@ -17,7 +17,7 @@ endif
>  obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += hash_hugepage.o
>  obj-$(CONFIG_PPC_SUBPAGE_PROT)	+= subpage_prot.o
>  obj-$(CONFIG_SPAPR_TCE_IOMMU)	+= iommu_api.o
> -obj-$(CONFIG_PPC_MEM_KEYS)	+= pkeys.o
> +obj-$(CONFIG_PPC_PKEY)	+= pkeys.o
>  
>  # Instrumenting the SLB fault path can lead to duplicate SLB entries
>  KCOV_INSTRUMENT_slb.o := n
> diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
> index b1d091a97611..7dc71f85683d 100644
> --- a/arch/powerpc/mm/book3s64/pkeys.c
> +++ b/arch/powerpc/mm/book3s64/pkeys.c
> @@ -89,12 +89,14 @@ static int scan_pkey_feature(void)
>  		}
>  	}
>  
> +#ifdef CONFIG_PPC_MEM_KEYS
>  	/*
>  	 * Adjust the upper limit, based on the number of bits supported by
>  	 * arch-neutral code.
>  	 */
>  	pkeys_total = min_t(int, pkeys_total,
>  			    ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) + 1));
> +#endif
>  	return pkeys_total;
>  }
>  
> @@ -102,6 +104,7 @@ void __init pkey_early_init_devtree(void)
>  {
>  	int pkeys_total, i;
>  
> +#ifdef CONFIG_PPC_MEM_KEYS
>  	/*
>  	 * We define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
>  	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
> @@ -117,7 +120,7 @@ void __init pkey_early_init_devtree(void)
>  	BUILD_BUG_ON(__builtin_clzl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) +
>  		     __builtin_popcountl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)
>  				!= (sizeof(u64) * BITS_PER_BYTE));
> -
> +#endif
>  	/*
>  	 * Only P7 and above supports SPRN_AMR update with MSR[PR] = 1
>  	 */
> @@ -223,14 +226,6 @@ void __init pkey_early_init_devtree(void)
>  	return;
>  }
>  
> -void pkey_mm_init(struct mm_struct *mm)
> -{
> -	if (!mmu_has_feature(MMU_FTR_PKEY))
> -		return;
> -	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> -	mm->context.execute_only_pkey = execute_only_key;
> -}
> -
>  static inline u64 read_amr(void)
>  {
>  	return mfspr(SPRN_AMR);
> @@ -257,6 +252,15 @@ static inline void write_iamr(u64 value)
>  	mtspr(SPRN_IAMR, value);
>  }
>  
> +#ifdef CONFIG_PPC_MEM_KEYS
> +void pkey_mm_init(struct mm_struct *mm)
> +{
> +	if (!mmu_has_feature(MMU_FTR_PKEY))
> +		return;
> +	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> +	mm->context.execute_only_pkey = execute_only_key;
> +}
> +
>  static inline void init_amr(int pkey, u8 init_bits)
>  {
>  	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> @@ -445,3 +449,5 @@ void arch_dup_pkeys(struct mm_struct *oldmm, struct mm_struct *mm)
>  	mm_pkey_allocation_map(mm) = mm_pkey_allocation_map(oldmm);
>  	mm->context.execute_only_pkey = oldmm->context.execute_only_pkey;
>  }
> +
> +#endif /* CONFIG_PPC_MEM_KEYS */
> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
> index c194c4ae8bc7..f255e8f32155 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -395,6 +395,11 @@ config PPC_KUAP_DEBUG
>  	  Add extra debugging for Kernel Userspace Access Protection (KUAP)
>  	  If you're unsure, say N.
>  
> +config PPC_PKEY
> +	def_bool y
> +	depends on PPC_BOOK3S_64
> +	depends on PPC_MEM_KEYS || PPC_KUAP || PPC_KUEP
> +
>  config ARCH_ENABLE_HUGEPAGE_MIGRATION
>  	def_bool y
>  	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
> -- 
> 2.28.0

^ permalink raw reply

* Re: [PATCH v3 2/2] powerpc/pseries: pass MSI affinity to irq_create_mapping()
From: Michael Ellerman @ 2020-11-26  3:22 UTC (permalink / raw)
  To: Laurent Vivier, linux-kernel
  Cc: Laurent Vivier, Marc Zyngier, Michael S . Tsirkin, linux-pci,
	Greg Kurz, linux-block, Paul Mackerras, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201125150932.1150619-3-lvivier@redhat.com>

Laurent Vivier <lvivier@redhat.com> writes:
> With virtio multiqueue, normally each queue IRQ is mapped to a CPU.
>
> But since commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity")
> this is broken on pseries.
>
> The affinity is correctly computed in msi_desc but this is not applied
> to the system IRQs.
>
> It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
> lost at this point and never passed to irq_domain_alloc_descs()
> (see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
> because irq_create_mapping() doesn't take an affinity parameter.
>
> As the previous patch has added the affinity parameter to
> irq_create_mapping() we can forward the affinity from rtas_setup_msi_irqs()
> to irq_domain_alloc_descs().
>
> With this change, the virtqueues are correctly dispatched between the CPUs
> on pseries.
>
> BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1702939
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> Reviewed-by: Greg Kurz <groug@kaod.org>
> ---
>  arch/powerpc/platforms/pseries/msi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

cheers

> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index 133f6adcb39c..b3ac2455faad 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
>  			return hwirq;
>  		}
>  
> -		virq = irq_create_mapping(NULL, hwirq);
> +		virq = irq_create_mapping_affinity(NULL, hwirq,
> +						   entry->affinity);
>  
>  		if (!virq) {
>  			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
> -- 
> 2.28.0

^ permalink raw reply

* Re: [PATCH v3 2/2] powerpc/pseries: pass MSI affinity to irq_create_mapping()
From: Michael Ellerman @ 2020-11-26  3:22 UTC (permalink / raw)
  To: Marc Zyngier, Laurent Vivier
  Cc: Michael S . Tsirkin, linux-pci, Denis Kirjanov, Greg Kurz,
	linux-kernel, linux-block, Paul Mackerras, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <5419d1790c9ea0d9d7791ae887794285@kernel.org>

Marc Zyngier <maz@kernel.org> writes:
> On 2020-11-25 16:24, Laurent Vivier wrote:
>> On 25/11/2020 17:05, Denis Kirjanov wrote:
>>> On 11/25/20, Laurent Vivier <lvivier@redhat.com> wrote:
>>>> With virtio multiqueue, normally each queue IRQ is mapped to a CPU.
>>>> 
>>>> But since commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ 
>>>> affinity")
>>>> this is broken on pseries.
>>> 
>>> Please add "Fixes" tag.
>> 
>> In fact, the code in commit 0d9f0a52c8b9f is correct.
>> 
>> The problem is with MSI/X irq affinity and pseries. So this patch
>> fixes more than virtio_scsi. I put this information because this
>> commit allows to clearly show the problem. Perhaps I should remove
>> this line in fact?
>
> This patch does not fix virtio_scsi at all, which as you noticed, is
> correct. It really fixes the PPC MSI setup, which is starting to show
> its age. So getting rid of the reference seems like the right thing to 
> do.

It's still useful to refer to that commit if the code worked prior to
that commit. But you should make it clearer that 0d9f0a52c8b9f wasn't in
error, it just exposed an existing shortcoming of the arch code.

cheers

^ permalink raw reply

* Re: [PATCH v6 07/22] powerpc/book3s64/kuap: Rename MMU_FTR_RADIX_KUAP to MMU_FTR_KUAP
From: Michael Ellerman @ 2020-11-26  3:16 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-8-aneesh.kumar@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index 255a1837e9f7..f5c7a17c198a 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -28,6 +28,11 @@
>   * Individual features below.
>   */
>  
> +/*
> + * Supports KUAP (key 0 controlling userspace addresses) on radix
> + */

That comment needs updating.

I think this feature now means we have either key 0 controlling uaccess
on radix OR we're using the AMR to manually implement KUAP.

> +#define MMU_FTR_KUAP			ASM_CONST(0x00000200)

I agree with Christophe that this name is now too generic.

With that name one would expect it to be enabled on the 32-bit CPUs that
implement KUAP.

Maybe MMU_FTR_BOOK3S_KUAP ?


If in future the other MMUs want an MMU feature for KUAP then we could
rename it to MMU_FTR_KUAP, but we'd need to be careful with ifdefs to
make sure it guards the right things.

cheers

^ permalink raw reply

* [PATCH 11/13] ibmvfc: set and track hw queue in ibmvfc_event struct
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Extract the hwq id from a SCSI command and store it in the ibmvfc_event
structure to identify which Sub-CRQ to send the command down when
channels are being utilized.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 5 +++++
 drivers/scsi/ibmvscsi/ibmvfc.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 55893d09f883..f686c2cb0de2 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1387,6 +1387,7 @@ static void ibmvfc_init_event(struct ibmvfc_event *evt,
 	evt->crq.format = format;
 	evt->done = done;
 	evt->eh_comp = NULL;
+	evt->hwq = 0;
 }
 
 /**
@@ -1738,6 +1739,8 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
 	struct ibmvfc_cmd *vfc_cmd;
 	struct ibmvfc_fcp_cmd_iu *iu;
 	struct ibmvfc_event *evt;
+	u32 tag_and_hwq = blk_mq_unique_tag(cmnd->request);
+	u16 hwq = blk_mq_unique_tag_to_hwq(tag_and_hwq);
 	int rc;
 
 	if (unlikely((rc = fc_remote_port_chkready(rport))) ||
@@ -1765,6 +1768,8 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
 	}
 
 	vfc_cmd->correlation = cpu_to_be64(evt);
+	if (vhost->using_channels)
+		evt->hwq = hwq % vhost->scsi_scrqs.active_queues;
 
 	if (likely(!(rc = ibmvfc_map_sg_data(cmnd, evt, vfc_cmd, vhost->dev))))
 		return ibmvfc_send_event(evt, vhost, 0);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 04086ffbfca7..abda910ae33d 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -781,6 +781,7 @@ struct ibmvfc_event {
 	struct completion comp;
 	struct completion *eh_comp;
 	struct timer_list timer;
+	u16 hwq;
 };
 
 /* a pool of event structs for use */
-- 
2.27.0


^ permalink raw reply related

* [PATCH 13/13] ibmvfc: register Sub-CRQ handles with VIOS during channel setup
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

If the ibmvfc client adapter requests channels it must submit a number
of Sub-CRQ handles matching the number of channels being requested. The
VIOS in its response will overwrite the actual number of channel
resources allocated which may be less than what was requested. The
client then must store the VIOS Sub-CRQ handle for each queue. This VIOS
handle is needed as a parameter with  h_send_sub_crq().

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 897e3236534d..6bb1028bbe44 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4494,15 +4494,35 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost)
 static void ibmvfc_channel_setup_done(struct ibmvfc_event *evt)
 {
 	struct ibmvfc_host *vhost = evt->vhost;
+	struct ibmvfc_channel_setup *setup = vhost->channel_setup_buf;
+	struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs;
 	u32 mad_status = be16_to_cpu(evt->xfer_iu->channel_setup.common.status);
 	int level = IBMVFC_DEFAULT_LOG_LEVEL;
+	int flags, active_queues, i;
 
 	ibmvfc_free_event(evt);
 
 	switch (mad_status) {
 	case IBMVFC_MAD_SUCCESS:
 		ibmvfc_dbg(vhost, "Channel Setup succeded\n");
+		flags = be32_to_cpu(setup->flags);
 		vhost->do_enquiry = 0;
+		active_queues = be32_to_cpu(setup->num_scsi_subq_channels);
+		scrqs->active_queues = active_queues;
+
+		if (flags & IBMVFC_CHANNELS_CANCELED) {
+			ibmvfc_dbg(vhost, "Channels Canceled\n");
+			vhost->using_channels = 0;
+		} else {
+			if (active_queues)
+				vhost->using_channels = 1;
+			for (i = 0; i < active_queues; i++)
+				scrqs->scrqs[i].vios_cookie =
+					be64_to_cpu(setup->channel_handles[i]);
+
+			ibmvfc_dbg(vhost, "Using %u channels\n",
+				   vhost->scsi_scrqs.active_queues);
+		}
 		break;
 	case IBMVFC_MAD_FAILED:
 		level += ibmvfc_retry_host_init(vhost);
@@ -4526,9 +4546,19 @@ static void ibmvfc_channel_setup(struct ibmvfc_host *vhost)
 	struct ibmvfc_channel_setup_mad *mad;
 	struct ibmvfc_channel_setup *setup_buf = vhost->channel_setup_buf;
 	struct ibmvfc_event *evt = ibmvfc_get_event(vhost);
+	struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs;
+	unsigned int num_channels =
+		min(vhost->client_scsi_channels, vhost->max_vios_scsi_channels);
+	int i;
 
 	memset(setup_buf, 0, sizeof(*setup_buf));
-	setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+	if (num_channels == 0)
+		setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+	else {
+		setup_buf->num_scsi_subq_channels = cpu_to_be32(num_channels);
+		for (i = 0; i < num_channels; i++)
+			setup_buf->channel_handles[i] = cpu_to_be64(scrqs->scrqs[i].cookie);
+	}
 
 	ibmvfc_init_event(evt, ibmvfc_channel_setup_done, IBMVFC_MAD_FORMAT);
 	mad = &evt->iu.channel_setup;
-- 
2.27.0


^ permalink raw reply related

* [PATCH 12/13] ibmvfc: send commands down HW Sub-CRQ when channelized
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

When the client has negotiated the use of channels all vfcFrames are
required to go down a Sub-CRQ channel or it is a protocoal violation. If
the adapter state is channelized submit vfcFrames to the appropriate
Sub-CRQ via the h_send_sub_crq() helper.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index f686c2cb0de2..897e3236534d 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -701,6 +701,15 @@ static int ibmvfc_send_crq(struct ibmvfc_host *vhost, u64 word1, u64 word2)
 	return plpar_hcall_norets(H_SEND_CRQ, vdev->unit_address, word1, word2);
 }
 
+static int ibmvfc_send_sub_crq(struct ibmvfc_host *vhost, u64 cookie, u64 word1,
+			       u64 word2, u64 word3, u64 word4)
+{
+	struct vio_dev *vdev = to_vio_dev(vhost->dev);
+
+	return plpar_hcall_norets(H_SEND_SUB_CRQ, vdev->unit_address, cookie,
+				  word1, word2, word3, word4);
+}
+
 /**
  * ibmvfc_send_crq_init - Send a CRQ init message
  * @vhost:	ibmvfc host struct
@@ -1524,8 +1533,17 @@ static int ibmvfc_send_event(struct ibmvfc_event *evt,
 
 	mb();
 
-	if ((rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]),
-				  be64_to_cpu(crq_as_u64[1])))) {
+	if (vhost->using_channels && evt->crq.format == IBMVFC_CMD_FORMAT)
+		rc = ibmvfc_send_sub_crq(vhost,
+					 vhost->scsi_scrqs.scrqs[evt->hwq].vios_cookie,
+					 be64_to_cpu(crq_as_u64[0]),
+					 be64_to_cpu(crq_as_u64[1]),
+					 0, 0);
+	else
+		rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]),
+				     be64_to_cpu(crq_as_u64[1]));
+
+	if (rc) {
 		list_del(&evt->queue);
 		del_timer(&evt->timer);
 
-- 
2.27.0


^ permalink raw reply related

* [PATCH 01/13] ibmvfc: add vhost fields and defaults for MQ enablement
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Introduce several new vhost fields for managing MQ state of the adapter
as well as initial defaults for MQ enablement.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 7 +++++++
 drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 42e4d35e0d35..cd609d19e6a1 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5167,6 +5167,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	shost->max_sectors = IBMVFC_MAX_SECTORS;
 	shost->max_cmd_len = IBMVFC_MAX_CDB_LEN;
 	shost->unique_id = shost->host_no;
+	shost->nr_hw_queues = IBMVFC_SCSI_HW_QUEUES;
 
 	vhost = shost_priv(shost);
 	INIT_LIST_HEAD(&vhost->sent);
@@ -5178,6 +5179,12 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	vhost->partition_number = -1;
 	vhost->log_level = log_level;
 	vhost->task_set = 1;
+
+	vhost->mq_enabled = IBMVFC_MQ;
+	vhost->client_scsi_channels = IBMVFC_SCSI_CHANNELS;
+	vhost->using_channels = 0;
+	vhost->do_enquiry = 1;
+
 	strcpy(vhost->partition_name, "UNKNOWN");
 	init_waitqueue_head(&vhost->work_wait_q);
 	init_waitqueue_head(&vhost->init_wait_q);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 9d58cfd774d3..8225bdbb127e 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -41,6 +41,11 @@
 #define IBMVFC_DEFAULT_LOG_LEVEL	2
 #define IBMVFC_MAX_CDB_LEN		16
 #define IBMVFC_CLS3_ERROR		0
+#define IBMVFC_MQ			0
+#define IBMVFC_SCSI_CHANNELS		0
+#define IBMVFC_SCSI_HW_QUEUES		1
+#define IBMVFC_MIG_NO_SUB_TO_CRQ	0
+#define IBMVFC_MIG_NO_N_TO_M		0
 
 /*
  * Ensure we have resources for ERP and initialization:
@@ -826,6 +831,10 @@ struct ibmvfc_host {
 	int delay_init;
 	int scan_complete;
 	int logged_in;
+	int mq_enabled;
+	int using_channels;
+	int do_enquiry;
+	int client_scsi_channels;
 	int aborting_passthru;
 	int events_to_log;
 #define IBMVFC_AE_LINKUP	0x0001
-- 
2.27.0


^ permalink raw reply related

* [PATCH 05/13] ibmvfc: add Sub-CRQ IRQ enable/disable routine
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Each Sub-CRQ has its own interrupt. A hypercall is required to toggle
the IRQ state. Provide the necessary mechanism via a helper function.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 571abdb48384..6eaedda4917a 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -3351,6 +3351,26 @@ static void ibmvfc_tasklet(void *data)
 	spin_unlock_irqrestore(vhost->host->host_lock, flags);
 }
 
+static int ibmvfc_toggle_scrq_irq(struct ibmvfc_sub_queue *scrq, int enable)
+{
+	struct device *dev = scrq->vhost->dev;
+	struct vio_dev *vdev = to_vio_dev(dev);
+	unsigned long rc;
+	int irq_action = H_ENABLE_VIO_INTERRUPT;
+
+	if (!enable)
+		irq_action = H_DISABLE_VIO_INTERRUPT;
+
+	rc = plpar_hcall_norets(H_VIOCTL, vdev->unit_address, irq_action,
+				scrq->hw_irq, 0, 0);
+
+	if (rc)
+		dev_err(dev, "Couldn't %s sub-crq[%lu] irq. rc=%ld\n",
+			enable ? "enable" : "disable", scrq->hwq_id, rc);
+
+	return rc;
+}
+
 /**
  * ibmvfc_init_tgt - Set the next init job step for the target
  * @tgt:		ibmvfc target struct
-- 
2.27.0


^ permalink raw reply related

* [PATCH 09/13] ibmvfc: implement channel enquiry and setup commands
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

New NPIV_ENQUIRY_CHANNEL and NPIV_SETUP_CHANNEL management datagrams
(MADs) were defined in a previous patchset. If the client advertises a
desire to use channels and the partner VIOS is channel capable then the
client must proceed with channel enquiry to determine the maximum number
of channels the VIOS is capable of providing, and registering SubCRQs
via channel setup with the VIOS immediately following NPIV Login. This
handshaking should not be performed for subsequent NPIV Logins unless
the CRQ connection has been reset.

Implement these two new MADs and issue them following a successful NPIV
login where the VIOS has set the SUPPORT_CHANNELS capability bit in the
NPIV Login response.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 135 ++++++++++++++++++++++++++++++++-
 drivers/scsi/ibmvscsi/ibmvfc.h |   3 +
 2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 53db6da20923..40a945712bdb 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -804,6 +804,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
 	spin_lock_irqsave(vhost->host->host_lock, flags);
 	vhost->state = IBMVFC_NO_CRQ;
 	vhost->logged_in = 0;
+	vhost->do_enquiry = 1;
+	vhost->using_channels = 0;
 
 	/* Clean out the queue */
 	memset(crq->msgs, 0, PAGE_SIZE);
@@ -4462,6 +4464,118 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost)
 		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
 }
 
+static void ibmvfc_channel_setup_done(struct ibmvfc_event *evt)
+{
+	struct ibmvfc_host *vhost = evt->vhost;
+	u32 mad_status = be16_to_cpu(evt->xfer_iu->channel_setup.common.status);
+	int level = IBMVFC_DEFAULT_LOG_LEVEL;
+
+	ibmvfc_free_event(evt);
+
+	switch (mad_status) {
+	case IBMVFC_MAD_SUCCESS:
+		ibmvfc_dbg(vhost, "Channel Setup succeded\n");
+		vhost->do_enquiry = 0;
+		break;
+	case IBMVFC_MAD_FAILED:
+		level += ibmvfc_retry_host_init(vhost);
+		ibmvfc_log(vhost, level, "Channel Setup failed\n");
+		fallthrough;
+	case IBMVFC_MAD_DRIVER_FAILED:
+		return;
+	default:
+		dev_err(vhost->dev, "Invalid Channel Setup response: 0x%x\n",
+			mad_status);
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+		return;
+	}
+
+	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
+	wake_up(&vhost->work_wait_q);
+}
+
+static void ibmvfc_channel_setup(struct ibmvfc_host *vhost)
+{
+	struct ibmvfc_channel_setup_mad *mad;
+	struct ibmvfc_channel_setup *setup_buf = vhost->channel_setup_buf;
+	struct ibmvfc_event *evt = ibmvfc_get_event(vhost);
+
+	memset(setup_buf, 0, sizeof(*setup_buf));
+	setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+
+	ibmvfc_init_event(evt, ibmvfc_channel_setup_done, IBMVFC_MAD_FORMAT);
+	mad = &evt->iu.channel_setup;
+	memset(mad, 0, sizeof(*mad));
+	mad->common.version = cpu_to_be32(1);
+	mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_SETUP);
+	mad->common.length = cpu_to_be16(sizeof(*mad));
+	mad->buffer.va = cpu_to_be64(vhost->channel_setup_dma);
+	mad->buffer.len = cpu_to_be32(sizeof(*vhost->channel_setup_buf));
+
+	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
+
+	if (!ibmvfc_send_event(evt, vhost, default_timeout))
+		ibmvfc_dbg(vhost, "Sent channel setup\n");
+	else
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DOWN);
+}
+
+static void ibmvfc_channel_enquiry_done(struct ibmvfc_event *evt)
+{
+	struct ibmvfc_host *vhost = evt->vhost;
+	struct ibmvfc_channel_enquiry *rsp = &evt->xfer_iu->channel_enquiry;
+	u32 mad_status = be16_to_cpu(rsp->common.status);
+	int level = IBMVFC_DEFAULT_LOG_LEVEL;
+
+	switch (mad_status) {
+	case IBMVFC_MAD_SUCCESS:
+		ibmvfc_dbg(vhost, "Channel Enquiry succeeded\n");
+		vhost->max_vios_scsi_channels = be32_to_cpu(rsp->num_scsi_subq_channels);
+		break;
+	case IBMVFC_MAD_FAILED:
+		level += ibmvfc_retry_host_init(vhost);
+		ibmvfc_log(vhost, level, "Channel Enquiry failed\n");
+		ibmvfc_free_event(evt);
+		fallthrough;
+	case IBMVFC_MAD_DRIVER_FAILED:
+		ibmvfc_free_event(evt);
+		return;
+	default:
+		dev_err(vhost->dev, "Invalid Channel Enquiry response: 0x%x\n",
+			mad_status);
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+		ibmvfc_free_event(evt);
+		return;
+	}
+
+	ibmvfc_channel_setup(vhost);
+}
+
+static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost)
+{
+	struct ibmvfc_channel_enquiry *mad;
+	struct ibmvfc_event *evt = ibmvfc_get_event(vhost);
+
+	ibmvfc_init_event(evt, ibmvfc_channel_enquiry_done, IBMVFC_MAD_FORMAT);
+	mad = &evt->iu.channel_enquiry;
+	memset(mad, 0, sizeof(*mad));
+	mad->common.version = cpu_to_be32(1);
+	mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_ENQUIRY);
+	mad->common.length = cpu_to_be16(sizeof(*mad));
+
+	if (IBMVFC_MIG_NO_SUB_TO_CRQ)
+		mad->flags |= cpu_to_be32(IBMVFC_NO_CHANNELS_TO_CRQ_SUPPORT);
+	if (IBMVFC_MIG_NO_N_TO_M)
+		mad->flags |= cpu_to_be32(IBMVFC_NO_N_TO_M_CHANNELS_SUPPORT);
+
+	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
+
+	if (!ibmvfc_send_event(evt, vhost, default_timeout))
+		ibmvfc_dbg(vhost, "Send channel enquiry\n");
+	else
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+}
+
 /**
  * ibmvfc_npiv_login_done - Completion handler for NPIV Login
  * @evt:	ibmvfc event struct
@@ -4543,8 +4657,14 @@ static void ibmvfc_npiv_login_done(struct ibmvfc_event *evt)
 
 	vhost->host->can_queue = be32_to_cpu(rsp->max_cmds) - IBMVFC_NUM_INTERNAL_REQ;
 	vhost->host->max_sectors = npiv_max_sectors;
-	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
-	wake_up(&vhost->work_wait_q);
+
+	if (ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPORT_CHANNELS) && vhost->do_enquiry) {
+		ibmvfc_channel_enquiry(vhost);
+	} else {
+		vhost->do_enquiry = 0;
+		ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
+		wake_up(&vhost->work_wait_q);
+	}
 }
 
 /**
@@ -5313,9 +5433,20 @@ static int ibmvfc_alloc_mem(struct ibmvfc_host *vhost)
 		goto free_trace;
 	}
 
+	vhost->channel_setup_buf = dma_alloc_coherent(dev, sizeof(*vhost->channel_setup_buf),
+						      &vhost->channel_setup_dma,
+						      GFP_KERNEL);
+
+	if (!vhost->channel_setup_buf) {
+		dev_err(dev, "Couldn't allocate Channel Setup buffer\n");
+		goto free_tgt_pool;
+	}
+
 	LEAVE;
 	return 0;
 
+free_tgt_pool:
+	mempool_destroy(vhost->tgt_pool);
 free_trace:
 	kfree(vhost->trace);
 free_disc_buffer:
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index d80c7eeef409..04086ffbfca7 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -842,10 +842,13 @@ struct ibmvfc_host {
 	struct ibmvfc_npiv_login login_info;
 	union ibmvfc_npiv_login_data *login_buf;
 	dma_addr_t login_buf_dma;
+	struct ibmvfc_channel_setup *channel_setup_buf;
+	dma_addr_t channel_setup_dma;
 	int disc_buf_sz;
 	int log_level;
 	struct ibmvfc_discover_targets_entry *disc_buf;
 	struct mutex passthru_mutex;
+	int max_vios_scsi_channels;
 	int task_set;
 	int init_retries;
 	int discovery_threads;
-- 
2.27.0


^ permalink raw reply related

* [PATCH 03/13] ibmvfc: add Subordinate CRQ definitions
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Subordinate Command Response Queues (Sub CRQ) are used in conjunction
with the primary CRQ when more than one queue is needed by the virtual
IO adapter. Recent phyp firmware versions support Sub CRQ's with ibmvfc
adapters. This feature is a prerequisite for supporting multiple
hardware backed submission queues in the vfc adapter.

The Sub CRQ command element differs from the standard CRQ in that it is
32bytes long as opposed to 16bytes for the latter. Despite this extra
16bytes the ibmvfc protocol will use the original CRQ command element
mapped to the first 16bytes of the Sub CRQ element initially.

Add definitions for the Sub CRQ command element and queue.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 8225bdbb127e..084ecdfe51ea 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -656,6 +656,29 @@ struct ibmvfc_crq_queue {
 	dma_addr_t msg_token;
 };
 
+struct ibmvfc_sub_crq {
+	struct ibmvfc_crq crq;
+	__be64 reserved[2];
+} __packed __aligned(8);
+
+struct ibmvfc_sub_queue {
+	struct ibmvfc_sub_crq *msgs;
+	dma_addr_t msg_token;
+	int size, cur;
+	struct ibmvfc_host *vhost;
+	unsigned long cookie;
+	unsigned long vios_cookie;
+	unsigned long hw_irq;
+	unsigned long irq;
+	unsigned long hwq_id;
+	char name[32];
+};
+
+struct ibmvfc_scsi_channels {
+	struct ibmvfc_sub_queue *scrqs;
+	unsigned int active_queues;
+};
+
 enum ibmvfc_ae_link_state {
 	IBMVFC_AE_LS_LINK_UP		= 0x01,
 	IBMVFC_AE_LS_LINK_BOUNCED	= 0x02,
-- 
2.27.0


^ permalink raw reply related

* [PATCH 08/13] ibmvfc: map/request irq and register Sub-CRQ interrupt handler
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Create an irq mapping for the hw_irq number provided from phyp firmware.
Request an irq assigned our Sub-CRQ interrupt handler.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 4fb782fa2c66..53db6da20923 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5119,12 +5119,34 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
 		goto reg_failed;
 	}
 
+	scrq->irq = irq_create_mapping(NULL, scrq->hw_irq);
+
+	if (!scrq->irq) {
+		rc = -EINVAL;
+		dev_err(dev, "Error mapping sub-crq[%d] irq\n", index);
+		goto irq_failed;
+	}
+
+	snprintf(scrq->name, sizeof(scrq->name), "ibmvfc-%x-scsi%d",
+		 vdev->unit_address, index);
+	rc = request_irq(scrq->irq, ibmvfc_interrupt_scsi, 0, scrq->name, scrq);
+
+	if (rc) {
+		dev_err(dev, "Couldn't register sub-crq[%d] irq\n", index);
+		irq_dispose_mapping(scrq->irq);
+		goto irq_failed;
+	}
+
 	scrq->hwq_id = index;
 	scrq->vhost = vhost;
 
 	LEAVE;
 	return 0;
 
+irq_failed:
+	do {
+		plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address, scrq->cookie);
+	} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
 reg_failed:
 	dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
 dma_map_failed:
-- 
2.27.0


^ permalink raw reply related

* [PATCH 04/13] ibmvfc: add alloc/dealloc routines for SCSI Sub-CRQ Channels
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Allocate a set of Sub-CRQs in advance. During channel setup the client
and VIOS negotiate the number of queues the VIOS supports and the number
that the client desires to request. Its possible that the final channel
resources allocated is less than requested, but the client is still
responsible for sending handles for every queue it is hoping for.

Also, provide deallocation cleanup routines.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 115 +++++++++++++++++++++++++++++++++
 drivers/scsi/ibmvscsi/ibmvfc.h |   1 +
 2 files changed, 116 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 260b82e3cc01..571abdb48384 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4983,6 +4983,114 @@ static int ibmvfc_init_crq(struct ibmvfc_host *vhost)
 	return retrc;
 }
 
+static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
+				  int index)
+{
+	struct device *dev = vhost->dev;
+	struct vio_dev *vdev = to_vio_dev(dev);
+	struct ibmvfc_sub_queue *scrq = &vhost->scsi_scrqs.scrqs[index];
+	int rc = -ENOMEM;
+
+	ENTER;
+
+	scrq->msgs = (struct ibmvfc_sub_crq *)get_zeroed_page(GFP_KERNEL);
+	if (!scrq->msgs)
+		return rc;
+
+	scrq->size = PAGE_SIZE / sizeof(*scrq->msgs);
+	scrq->msg_token = dma_map_single(dev, scrq->msgs, PAGE_SIZE,
+					 DMA_BIDIRECTIONAL);
+
+	if (dma_mapping_error(dev, scrq->msg_token))
+		goto dma_map_failed;
+
+	rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
+			   &scrq->cookie, &scrq->hw_irq);
+
+	if (rc) {
+		dev_warn(dev, "Error registering sub-crq: %d\n", rc);
+		dev_warn(dev, "Firmware may not support MQ\n");
+		goto reg_failed;
+	}
+
+	scrq->hwq_id = index;
+	scrq->vhost = vhost;
+
+	LEAVE;
+	return 0;
+
+reg_failed:
+	dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
+dma_map_failed:
+	free_page((unsigned long)scrq->msgs);
+	LEAVE;
+	return rc;
+}
+
+static void ibmvfc_deregister_scsi_channel(struct ibmvfc_host *vhost, int index)
+{
+	struct device *dev = vhost->dev;
+	struct vio_dev *vdev = to_vio_dev(dev);
+	struct ibmvfc_sub_queue *scrq = &vhost->scsi_scrqs.scrqs[index];
+	long rc;
+
+	ENTER;
+
+	do {
+		rc = plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address,
+					scrq->cookie);
+	} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
+
+	if (rc)
+		dev_err(dev, "Failed to free sub-crq[%d]: rc=%ld\n", index, rc);
+
+	dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
+	free_page((unsigned long)scrq->msgs);
+	LEAVE;
+}
+
+static int ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
+{
+	int i, j;
+
+	ENTER;
+
+	vhost->scsi_scrqs.scrqs = kcalloc(vhost->client_scsi_channels,
+					  sizeof(*vhost->scsi_scrqs.scrqs),
+					  GFP_KERNEL);
+	if (!vhost->scsi_scrqs.scrqs)
+		return -1;
+
+	for (i = 0; i < vhost->client_scsi_channels; i++) {
+		if (ibmvfc_register_scsi_channel(vhost, i)) {
+			for (j = i; j > 0; j--)
+				ibmvfc_deregister_scsi_channel(vhost, j - 1);
+			kfree(vhost->scsi_scrqs.scrqs);
+			LEAVE;
+			return -1;
+		}
+	}
+
+	LEAVE;
+	return 0;
+}
+
+static void ibmvfc_release_sub_crqs(struct ibmvfc_host *vhost)
+{
+	int i;
+
+	ENTER;
+	if (!vhost->scsi_scrqs.scrqs)
+		return;
+
+	for (i = 0; i < vhost->client_scsi_channels; i++)
+		ibmvfc_deregister_scsi_channel(vhost, i);
+
+	vhost->scsi_scrqs.active_queues = 0;
+	kfree(vhost->scsi_scrqs.scrqs);
+	LEAVE;
+}
+
 /**
  * ibmvfc_free_mem - Free memory for vhost
  * @vhost:	ibmvfc host struct
@@ -5239,6 +5347,12 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 		goto remove_shost;
 	}
 
+	if (vhost->mq_enabled) {
+		rc = ibmvfc_init_sub_crqs(vhost);
+		if (rc)
+			dev_warn(dev, "Failed to allocate Sub-CRQs. rc=%d\n", rc);
+	}
+
 	if (shost_to_fc_host(shost)->rqst_q)
 		blk_queue_max_segments(shost_to_fc_host(shost)->rqst_q, 1);
 	dev_set_drvdata(dev, vhost);
@@ -5296,6 +5410,7 @@ static int ibmvfc_remove(struct vio_dev *vdev)
 	ibmvfc_purge_requests(vhost, DID_ERROR);
 	spin_unlock_irqrestore(vhost->host->host_lock, flags);
 	ibmvfc_free_event_pool(vhost);
+	ibmvfc_release_sub_crqs(vhost);
 
 	ibmvfc_free_mem(vhost);
 	spin_lock(&ibmvfc_driver_lock);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 084ecdfe51ea..d80c7eeef409 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -838,6 +838,7 @@ struct ibmvfc_host {
 	mempool_t *tgt_pool;
 	struct ibmvfc_crq_queue crq;
 	struct ibmvfc_async_crq_queue async_crq;
+	struct ibmvfc_scsi_channels scsi_scrqs;
 	struct ibmvfc_npiv_login login_info;
 	union ibmvfc_npiv_login_data *login_buf;
 	dma_addr_t login_buf_dma;
-- 
2.27.0


^ permalink raw reply related

* [PATCH 02/13] ibmvfc: define hcall wrapper for registering a Sub-CRQ
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Sub-CRQs are registred with firmware via a hypercall. Abstract that
interface into a simpler helper function.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index cd609d19e6a1..260b82e3cc01 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -138,6 +138,20 @@ static void ibmvfc_tgt_move_login(struct ibmvfc_target *);
 
 static const char *unknown_error = "unknown error";
 
+static long h_reg_sub_crq(unsigned long unit_address, unsigned long ioba,
+			  unsigned long length, unsigned long *cookie,
+			  unsigned long *irq)
+{
+	unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+	long rc;
+
+	rc = plpar_hcall(H_REG_SUB_CRQ, retbuf, unit_address, ioba, length);
+	*cookie = retbuf[0];
+	*irq = retbuf[1];
+
+	return rc;
+}
+
 static int ibmvfc_check_caps(struct ibmvfc_host *vhost, unsigned long cap_flags)
 {
 	u64 host_caps = be64_to_cpu(vhost->login_buf->resp.capabilities);
-- 
2.27.0


^ permalink raw reply related

* [PATCH 00/13] ibmvfc: initial MQ development
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev

Recent updates in pHyp Firmware and VIOS releases provide new infrastructure
towards enabling Subordinate Command Response Queues (Sub-CRQs) such that each
Sub-CRQ is a channel backed by an actual hardware queue in the FC stack on the
partner VIOS. Sub-CRQs are registered with the firmware via hypercalls and then
negotiated with the VIOS via new Management Datagrams (MADs) for channel setup.

This initial implementation adds the necessary Sub-CRQ framework and implements
the new MADs for negotiating and assigning a set of Sub-CRQs to associated VIOS
HW backed channels. The event pool and locking still leverages the legacy single
queue implementation, and as such lock contention is problematic when increasing
the number of queues. However, this initial work demonstrates a 1.2x factor
increase in IOPs when configured with two HW queues despite lock contention.

Tyrel Datwyler (13):
  ibmvfc: add vhost fields and defaults for MQ enablement
  ibmvfc: define hcall wrapper for registering a Sub-CRQ
  ibmvfc: add Subordinate CRQ definitions
  ibmvfc: add alloc/dealloc routines for SCSI Sub-CRQ Channels
  ibmvfc: add Sub-CRQ IRQ enable/disable routine
  ibmvfc: add handlers to drain and complete Sub-CRQ responses
  ibmvfc: define Sub-CRQ interrupt handler routine
  ibmvfc: map/request irq and register Sub-CRQ interrupt handler
  ibmvfc: implement channel enquiry and setup commands
  ibmvfc: advertise client support for using hardware channels
  ibmvfc: set and track hw queue in ibmvfc_event struct
  ibmvfc: send commands down HW Sub-CRQ when channelized
  ibmvfc: register Sub-CRQ handles with VIOS during channel setup

 drivers/scsi/ibmvscsi/ibmvfc.c | 460 ++++++++++++++++++++++++++++++++-
 drivers/scsi/ibmvscsi/ibmvfc.h |  37 +++
 2 files changed, 493 insertions(+), 4 deletions(-)

-- 
2.27.0


^ permalink raw reply

* [PATCH 07/13] ibmvfc: define Sub-CRQ interrupt handler routine
From: Tyrel Datwyler @ 2020-11-26  1:48 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20201126014824.123831-1-tyreld@linux.ibm.com>

Simple handler that calls Sub-CRQ drain routine directly.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index a8730522920e..4fb782fa2c66 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -3443,6 +3443,20 @@ static void ibmvfc_drain_sub_crq(struct ibmvfc_sub_queue *scrq)
 	}
 }
 
+static irqreturn_t ibmvfc_interrupt_scsi(int irq, void *scrq_instance)
+{
+	struct ibmvfc_sub_queue *scrq = (struct ibmvfc_sub_queue *)scrq_instance;
+	struct ibmvfc_host *vhost = scrq->vhost;
+	unsigned long flags;
+
+	spin_lock_irqsave(vhost->host->host_lock, flags);
+	ibmvfc_toggle_scrq_irq(scrq, 0);
+	ibmvfc_drain_sub_crq(scrq);
+	spin_unlock_irqrestore(vhost->host->host_lock, flags);
+
+	return IRQ_HANDLED;
+}
+
 /**
  * ibmvfc_init_tgt - Set the next init job step for the target
  * @tgt:		ibmvfc target struct
-- 
2.27.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox