LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH 00/10] sound: convert tasklets to use new tasklet_setup()
From: Takashi Iwai @ 2020-09-01 10:09 UTC (permalink / raw)
  To: Allen
  Cc: alsa-devel, Kees Cook, timur, Xiubo.Lee,
	Linux Kernel Mailing List, clemens, tiwai, o-takashi,
	nicoleotsuka, Allen Pais, Mark Brown, perex, linuxppc-dev
In-Reply-To: <CAOMdWSJ2VKhbnRDTNVuTKSL12k0qhryO7yznstAk8k_nBGp2=Q@mail.gmail.com>

On Tue, 01 Sep 2020 12:04:53 +0200,
Allen wrote:
> 
> Takashi,
> > > > > These patches which I wasn't CCed on and which need their subject lines
> > > > > fixing :( .  With the subject lines fixed I guess so so
> > >
> > > > Extremely sorry. I thought I had it covered. How would you like it
> > > > worded?
> > >
> > > ASoC:
> >
> > To be more exact, "ASoC:" prefix is for sound/soc/*, and for the rest
> > sound/*, use "ALSA:" prefix please.
> 
> I could not get the generic API accepted upstream. We would stick to
> from_tasklet() or container_of(). Could I go ahead and send out V2 using
> from_tasklet() with subject line fixed?

Yes, please submit whatever should go into 5.9.


thanks,

Takashi
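
The from_tasklet()/container_of() approach mentioned above can be sketched in
userspace as follows. This is a minimal illustration, not code from the series:
struct demo_chip, demo_tasklet_fn() and demo_run_once() are hypothetical names,
and tasklet_struct is reduced to a stand-in holding only the callback pointer.
The from_tasklet() macro is written in the same shape as the kernel helper,
which wraps container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the kernel's own definitions live in its headers. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as the kernel helper: derive the owner type from `var`. */
#define from_tasklet(var, callback_tasklet, tasklet_fieldname) \
	container_of(callback_tasklet, __typeof__(*var), tasklet_fieldname)

/* Reduced stand-in: only the new-style callback pointer is modelled. */
struct tasklet_struct {
	void (*callback)(struct tasklet_struct *t);
};

/* Hypothetical driver state that embeds its tasklet. */
struct demo_chip {
	int irq_count;
	struct tasklet_struct tq;
};

/* New-style callback: receives the tasklet and recovers its owner. */
static void demo_tasklet_fn(struct tasklet_struct *t)
{
	struct demo_chip *chip = from_tasklet(chip, t, tq);

	chip->irq_count++;
}

/* Invoke the callback once, the way the softirq code would. */
static int demo_run_once(struct demo_chip *chip)
{
	assert(chip != NULL);
	chip->tq.callback = demo_tasklet_fn;
	chip->tq.callback(&chip->tq);
	return chip->irq_count;
}
```

The point of the pattern is that the callback no longer needs an unsigned long
cookie: the owning structure is recovered from the address of the embedded
tasklet itself.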


* Re: [PATCH v2] powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc
From: Anshuman Khandual @ 2020-09-01 10:48 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe
In-Reply-To: <20200901094423.100149-1-aneesh.kumar@linux.ibm.com>



On 09/01/2020 03:14 PM, Aneesh Kumar K.V wrote:
> The test is broken w.r.t page table update rules and results in kernel
> crash as below. Disable the support until we get the tests updated.
> 
> [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
>     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
>     lr: c0000000005eeeec: pte_update+0x11c/0x190
>     sp: c000000c6d1e7950
>    msr: 8000000002029033
>   current = 0xc000000c6d172c80
>   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
>     pid   = 1, comm = swapper/0
> kernel BUG at arch/powerpc/mm/pgtable.c:304!
> [link register   ] c0000000005eeeec pte_update+0x11c/0x190
> [c000000c6d1e7950] 0000000000000001 (unreliable)
> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> 
> With DEBUG_VM disabled
> 
> [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
> [   20.530183] Faulting instruction address: 0xc0000000000df330
> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
>     pc: c0000000000df330: memset+0x68/0x104
>     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>     sp: c000000c6d19f990
>    msr: 8000000002009033
>    dar: 0
>   current = 0xc000000c6d177480
>   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
>     pid   = 1, comm = swapper/0
> [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> 33:mon>
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/Kconfig | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 65bed1fdeaad..787e829b6f25 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -116,7 +116,6 @@ config PPC
>  	#
>  	select ARCH_32BIT_OFF_T if PPC32
>  	select ARCH_HAS_DEBUG_VIRTUAL
> -	select ARCH_HAS_DEBUG_VM_PGTABLE
>  	select ARCH_HAS_DEVMEM_IS_ALLOWED
>  	select ARCH_HAS_ELF_RANDOMIZE
>  	select ARCH_HAS_FORTIFY_SOURCE
> 

If support for powerpc is being dropped, please update the features file
here as well. They should be in sync.

Documentation/features/debug/debug-vm-pgtable/arch-support.txt


* Re: [PATCH] arch: vdso: add vdso linker script to 'targets' instead of extra-y
From: Greentime Hu @ 2020-09-01 11:29 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-s390, Catalin Marinas, Vasily Gorbik, Nick Hu,
	Linux Kbuild mailing list, Heiko Carstens, linuxppc-dev,
	Linux Kernel Mailing List, Christian Borntraeger, Paul Mackerras,
	Vincent Chen, Will Deacon, linux-arm-kernel
In-Reply-To: <20200831182239.480317-1-masahiroy@kernel.org>

On Tue, Sep 1, 2020 at 2:23 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> The vdso linker script is preprocessed on demand.
> Adding it to 'targets' is enough to include the .cmd file.
>
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
>  arch/arm64/kernel/vdso/Makefile     | 2 +-
>  arch/arm64/kernel/vdso32/Makefile   | 2 +-
>  arch/nds32/kernel/vdso/Makefile     | 2 +-
>  arch/powerpc/kernel/vdso32/Makefile | 2 +-
>  arch/powerpc/kernel/vdso64/Makefile | 2 +-
>  arch/s390/kernel/vdso64/Makefile    | 2 +-
>  6 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> index 45d5cfe46429..7cd8aafbe96e 100644
> --- a/arch/arm64/kernel/vdso/Makefile
> +++ b/arch/arm64/kernel/vdso/Makefile
> @@ -54,7 +54,7 @@ endif
>  GCOV_PROFILE := n
>
>  obj-y += vdso.o
> -extra-y += vdso.lds
> +targets += vdso.lds
>  CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
>
>  # Force dependency (incbin is bad)
> diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile
> index d6adb4677c25..572475b7b7ed 100644
> --- a/arch/arm64/kernel/vdso32/Makefile
> +++ b/arch/arm64/kernel/vdso32/Makefile
> @@ -155,7 +155,7 @@ asm-obj-vdso := $(addprefix $(obj)/, $(asm-obj-vdso))
>  obj-vdso := $(c-obj-vdso) $(c-obj-vdso-gettimeofday) $(asm-obj-vdso)
>
>  obj-y += vdso.o
> -extra-y += vdso.lds
> +targets += vdso.lds
>  CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
>
>  # Force dependency (vdso.s includes vdso.so through incbin)
> diff --git a/arch/nds32/kernel/vdso/Makefile b/arch/nds32/kernel/vdso/Makefile
> index 7c3c1ccb196e..55df25ef0057 100644
> --- a/arch/nds32/kernel/vdso/Makefile
> +++ b/arch/nds32/kernel/vdso/Makefile
> @@ -20,7 +20,7 @@ GCOV_PROFILE := n
>
>
>  obj-y += vdso.o
> -extra-y += vdso.lds
> +targets += vdso.lds
>  CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
>
>  # Force dependency
> diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile
> index 87ab1152d5ce..fd5072a4c73c 100644
> --- a/arch/powerpc/kernel/vdso32/Makefile
> +++ b/arch/powerpc/kernel/vdso32/Makefile
> @@ -29,7 +29,7 @@ ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
>  asflags-y := -D__VDSO32__ -s
>
>  obj-y += vdso32_wrapper.o
> -extra-y += vdso32.lds
> +targets += vdso32.lds
>  CPPFLAGS_vdso32.lds += -P -C -Upowerpc
>
>  # Force dependency (incbin is bad)
> diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile
> index 38c317f25141..c737b3ea3207 100644
> --- a/arch/powerpc/kernel/vdso64/Makefile
> +++ b/arch/powerpc/kernel/vdso64/Makefile
> @@ -17,7 +17,7 @@ ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
>  asflags-y := -D__VDSO64__ -s
>
>  obj-y += vdso64_wrapper.o
> -extra-y += vdso64.lds
> +targets += vdso64.lds
>  CPPFLAGS_vdso64.lds += -P -C -U$(ARCH)
>
>  # Force dependency (incbin is bad)
> diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile
> index 4a66a1cb919b..d0d406cfffa9 100644
> --- a/arch/s390/kernel/vdso64/Makefile
> +++ b/arch/s390/kernel/vdso64/Makefile
> @@ -25,7 +25,7 @@ $(targets:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_64)
>  $(targets:%=$(obj)/%.dbg): KBUILD_AFLAGS = $(KBUILD_AFLAGS_64)
>
>  obj-y += vdso64_wrapper.o
> -extra-y += vdso64.lds
> +targets += vdso64.lds
>  CPPFLAGS_vdso64.lds += -P -C -U$(ARCH)
>
>  # Disable gcov profiling, ubsan and kasan for VDSO code

For nds32:

Acked-by: Greentime Hu <green.hu@gmail.com>


* [PATCH] ASoC: fsl_sai: Support multiple data channel enable bits
From: Shengjiu Wang @ 2020-09-01 11:01 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
	alsa-devel, lgirdwood
  Cc: linuxppc-dev, linux-kernel

One data channel is one data line. Starting with imx7ulp, the SAI IP is
enhanced to support multiple data channels.

If there are only two input channels and slots is 2, then enabling one
data channel is enough for data transfer. So set the TCE/RCE bits and
the transmit/receive mask register according to the input channels and
slots configuration.

Move the data channel enablement from startup() to hw_params().

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/fsl_sai.c | 30 ++++++++++++------------------
 sound/soc/fsl/fsl_sai.h |  2 +-
 2 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 62c5fdb678fc..38c7bcbb361d 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -443,6 +443,7 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 	u32 slots = (channels == 1) ? 2 : channels;
 	u32 slot_width = word_width;
 	int adir = tx ? RX : TX;
+	u32 pins;
 	int ret;
 
 	if (sai->slots)
@@ -451,6 +452,8 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 	if (sai->slot_width)
 		slot_width = sai->slot_width;
 
+	pins = DIV_ROUND_UP(channels, slots);
+
 	if (!sai->is_slave_mode) {
 		if (sai->bclk_ratio)
 			ret = fsl_sai_set_bclk(cpu_dai, tx,
@@ -501,13 +504,17 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 				   FSL_SAI_CR5_FBT_MASK, val_cr5);
 	}
 
+	regmap_update_bits(sai->regmap, FSL_SAI_xCR3(tx, ofs),
+			   FSL_SAI_CR3_TRCE_MASK,
+			   FSL_SAI_CR3_TRCE((1 << pins) - 1));
 	regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
 			   FSL_SAI_CR4_SYWD_MASK | FSL_SAI_CR4_FRSZ_MASK,
 			   val_cr4);
 	regmap_update_bits(sai->regmap, FSL_SAI_xCR5(tx, ofs),
 			   FSL_SAI_CR5_WNW_MASK | FSL_SAI_CR5_W0W_MASK |
 			   FSL_SAI_CR5_FBT_MASK, val_cr5);
-	regmap_write(sai->regmap, FSL_SAI_xMR(tx), ~0UL - ((1 << channels) - 1));
+	regmap_write(sai->regmap, FSL_SAI_xMR(tx),
+		     ~0UL - ((1 << min(channels, slots)) - 1));
 
 	return 0;
 }
@@ -517,6 +524,10 @@ static int fsl_sai_hw_free(struct snd_pcm_substream *substream,
 {
 	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
 	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
+	unsigned int ofs = sai->soc_data->reg_offset;
+
+	regmap_update_bits(sai->regmap, FSL_SAI_xCR3(tx, ofs),
+			   FSL_SAI_CR3_TRCE_MASK, 0);
 
 	if (!sai->is_slave_mode &&
 			sai->mclk_streams & BIT(substream->stream)) {
@@ -651,14 +662,9 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream,
 		struct snd_soc_dai *cpu_dai)
 {
 	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
-	unsigned int ofs = sai->soc_data->reg_offset;
 	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
 	int ret;
 
-	regmap_update_bits(sai->regmap, FSL_SAI_xCR3(tx, ofs),
-			   FSL_SAI_CR3_TRCE_MASK,
-			   FSL_SAI_CR3_TRCE);
-
 	/*
 	 * EDMA controller needs period size to be a multiple of
 	 * tx/rx maxburst
@@ -675,17 +681,6 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream,
 	return ret;
 }
 
-static void fsl_sai_shutdown(struct snd_pcm_substream *substream,
-		struct snd_soc_dai *cpu_dai)
-{
-	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
-	unsigned int ofs = sai->soc_data->reg_offset;
-	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
-
-	regmap_update_bits(sai->regmap, FSL_SAI_xCR3(tx, ofs),
-			   FSL_SAI_CR3_TRCE_MASK, 0);
-}
-
 static const struct snd_soc_dai_ops fsl_sai_pcm_dai_ops = {
 	.set_bclk_ratio	= fsl_sai_set_dai_bclk_ratio,
 	.set_sysclk	= fsl_sai_set_dai_sysclk,
@@ -695,7 +690,6 @@ static const struct snd_soc_dai_ops fsl_sai_pcm_dai_ops = {
 	.hw_free	= fsl_sai_hw_free,
 	.trigger	= fsl_sai_trigger,
 	.startup	= fsl_sai_startup,
-	.shutdown	= fsl_sai_shutdown,
 };
 
 static int fsl_sai_dai_probe(struct snd_soc_dai *cpu_dai)
diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
index 6aba7d28f5f3..5f630be74853 100644
--- a/sound/soc/fsl/fsl_sai.h
+++ b/sound/soc/fsl/fsl_sai.h
@@ -109,7 +109,7 @@
 #define FSL_SAI_CR2_DIV_MASK	0xff
 
 /* SAI Transmit and Receive Configuration 3 Register */
-#define FSL_SAI_CR3_TRCE	BIT(16)
+#define FSL_SAI_CR3_TRCE(x)     ((x) << 16)
 #define FSL_SAI_CR3_TRCE_MASK	GENMASK(23, 16)
 #define FSL_SAI_CR3_WDFL(x)	(x)
 #define FSL_SAI_CR3_WDFL_MASK	0x1f
-- 
2.27.0
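
The register arithmetic in the patch above can be checked in isolation with a
small userspace sketch. The helper names sai_pins(), sai_trce_bits() and
sai_word_mask() are hypothetical; the macro values are copied from the
fsl_sai.h hunk, and the registers are assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Same rounding as the kernel's DIV_ROUND_UP(), redefined for userspace. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Values taken from the fsl_sai.h hunk in the patch. */
#define FSL_SAI_CR3_TRCE(x)	((uint32_t)(x) << 16)
#define FSL_SAI_CR3_TRCE_MASK	0x00ff0000u	/* GENMASK(23, 16) */

/* Number of data lines needed to carry `channels` at `slots` per line. */
static uint32_t sai_pins(uint32_t channels, uint32_t slots)
{
	assert(slots > 0);
	return DIV_ROUND_UP(channels, slots);
}

/* TCE/RCE field: one enable bit per data line, starting at bit 16. */
static uint32_t sai_trce_bits(uint32_t channels, uint32_t slots)
{
	uint32_t pins = sai_pins(channels, slots);

	assert(pins <= 8);	/* the field is 8 bits wide */
	return FSL_SAI_CR3_TRCE((1u << pins) - 1) & FSL_SAI_CR3_TRCE_MASK;
}

/* Mask register value: the low min(channels, slots) bits are cleared. */
static uint32_t sai_word_mask(uint32_t channels, uint32_t slots)
{
	uint32_t used = channels < slots ? channels : slots;

	assert(used < 32);	/* keep the shift well defined */
	return ~0u - ((1u << used) - 1);
}
```

With two channels and two slots this yields a single data line (TCE/RCE value
0x00010000), which matches the case described in the commit message; four
channels at two slots per line would enable two data lines.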



* Re: [PATCH 4/4] powerpc/64s/radix: Fix mm_cpumask trimming race vs kthread_use_mm
From: Michael Ellerman @ 2020-09-01 12:00 UTC (permalink / raw)
  To: Nicholas Piggin, linux-mm
  Cc: linux-arch, Jens Axboe, Peter Zijlstra, Aneesh Kumar K.V,
	linux-kernel, Nicholas Piggin, Andrew Morton, linuxppc-dev,
	David S. Miller
In-Reply-To: <20200828100022.1099682-5-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:
> Commit 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of
> single-threaded mm_cpumask") added a mechanism to trim the mm_cpumask of
> a process under certain conditions. One of the assumptions is that
> mm_users would not be incremented via a reference outside the process
> context with mmget_not_zero() then go on to kthread_use_mm() via that
> reference.
>
> That invariant was broken by io_uring code (see previous sparc64 fix),
> but I'll point Fixes: to the original powerpc commit because we are
> changing that assumption going forward, so this will make backports
> match up.
>
> Fix this by no longer relying on that assumption, but by having each CPU
> check the mm is not being used, and clearing their own bit from the mask
> if it's okay. This fix relies on commit 38cf307c1f20 ("mm: fix
> kthread_use_mm() vs TLB invalidate") to disable irqs over the mm switch,
> and ARCH_WANT_IRQS_OFF_ACTIVATE_MM to be enabled.

You could use:

Depends-on: 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB invalidate")

> Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/include/asm/tlb.h       | 13 -------------
>  arch/powerpc/mm/book3s64/radix_tlb.c | 23 ++++++++++++++++-------
>  2 files changed, 16 insertions(+), 20 deletions(-)

One minor nit below if you're respinning anyway.

You know this stuff better than me, but I still reviewed it and it seems
good to me.

Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>

> diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
> index fbc6f3002f23..d97f061fecac 100644
> --- a/arch/powerpc/include/asm/tlb.h
> +++ b/arch/powerpc/include/asm/tlb.h
> @@ -66,19 +66,6 @@ static inline int mm_is_thread_local(struct mm_struct *mm)
>  		return false;
>  	return cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm));
>  }
> -static inline void mm_reset_thread_local(struct mm_struct *mm)
> -{
> -	WARN_ON(atomic_read(&mm->context.copros) > 0);
> -	/*
> -	 * It's possible for mm_access to take a reference on mm_users to
> -	 * access the remote mm from another thread, but it's not allowed
> -	 * to set mm_cpumask, so mm_users may be > 1 here.
> -	 */
> -	WARN_ON(current->mm != mm);
> -	atomic_set(&mm->context.active_cpus, 1);
> -	cpumask_clear(mm_cpumask(mm));
> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> -}
>  #else /* CONFIG_PPC_BOOK3S_64 */
>  static inline int mm_is_thread_local(struct mm_struct *mm)
>  {
> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
> index 0d233763441f..a421a0e3f930 100644
> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
> @@ -645,19 +645,29 @@ static void do_exit_flush_lazy_tlb(void *arg)
>  	struct mm_struct *mm = arg;
>  	unsigned long pid = mm->context.id;
>  
> +	/*
> +	 * A kthread could have done a mmget_not_zero() after the flushing CPU
> +	 * checked mm_users == 1, and be in the process of kthread_use_mm when
                                ^
                                in mm_is_singlethreaded()

Adding that reference would help join the dots for a new reader I think.

cheers

> +	 * interrupted here. In that case, current->mm will be set to mm,
> +	 * because kthread_use_mm() setting ->mm and switching to the mm is
> +	 * done with interrupts off.
> +	 */
>  	if (current->mm == mm)
> -		return; /* Local CPU */
> +		goto out_flush;
>  
>  	if (current->active_mm == mm) {
> -		/*
> -		 * Must be a kernel thread because sender is single-threaded.
> -		 */
> -		BUG_ON(current->mm);
> +		WARN_ON_ONCE(current->mm != NULL);
> +		/* Is a kernel thread and is using mm as the lazy tlb */
>  		mmgrab(&init_mm);
> -		switch_mm(mm, &init_mm, current);
>  		current->active_mm = &init_mm;
> +		switch_mm_irqs_off(mm, &init_mm, current);
>  		mmdrop(mm);
>  	}
> +
> +	atomic_dec(&mm->context.active_cpus);
> +	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(mm));
> +
> +out_flush:
>  	_tlbiel_pid(pid, RIC_FLUSH_ALL);
>  }
>  
> @@ -672,7 +682,6 @@ static void exit_flush_lazy_tlbs(struct mm_struct *mm)
>  	 */
>  	smp_call_function_many(mm_cpumask(mm), do_exit_flush_lazy_tlb,
>  				(void *)mm, 1);
> -	mm_reset_thread_local(mm);
>  }
>  
>  void radix__flush_tlb_mm(struct mm_struct *mm)


* [PATCH] selftests/powerpc: Skip PROT_SAO test in guests/LPARS
From: Michael Ellerman @ 2020-09-01 12:46 UTC (permalink / raw)
  To: linuxppc-dev

In commit 9b725a90a8f1 ("powerpc/64s: Disallow PROT_SAO in LPARs by
default") PROT_SAO was disabled in guests/LPARs by default. So skip
the test if we are running in a guest to avoid a spurious failure.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 tools/testing/selftests/powerpc/mm/prot_sao.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/mm/prot_sao.c b/tools/testing/selftests/powerpc/mm/prot_sao.c
index e0cf8ebbf8cd..30b71b1d78d5 100644
--- a/tools/testing/selftests/powerpc/mm/prot_sao.c
+++ b/tools/testing/selftests/powerpc/mm/prot_sao.c
@@ -7,6 +7,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <sys/mman.h>
+#include <unistd.h>
 
 #include <asm/cputable.h>
 
@@ -18,9 +19,13 @@ int test_prot_sao(void)
 {
 	char *p;
 
-	/* SAO was introduced in 2.06 and removed in 3.1 */
+	/*
+	 * SAO was introduced in 2.06 and removed in 3.1. It's disabled in
+	 * guests/LPARs by default, so also skip if we are running in a guest.
+	 */
 	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06) ||
-		have_hwcap2(PPC_FEATURE2_ARCH_3_1));
+		have_hwcap2(PPC_FEATURE2_ARCH_3_1) ||
+		access("/proc/device-tree/rtas/ibm,hypertas-functions", F_OK) == 0);
 
 	/*
 	 * Ensure we can ask for PROT_SAO.
-- 
2.25.1
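
The guest check added by the patch boils down to an existence test on a
device-tree path. A minimal sketch of that test, assuming a POSIX system, is
shown here; path_exists() and running_in_guest() are hypothetical helper names
and are not part of the selftest.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Path tested by the patch to decide whether to skip the test. */
#define HYPERTAS_PATH "/proc/device-tree/rtas/ibm,hypertas-functions"

/* Hypothetical helper: true if `path` exists (F_OK tests existence only). */
static bool path_exists(const char *path)
{
	assert(path != NULL);
	return access(path, F_OK) == 0;
}

/* Mirrors the condition added to SKIP_IF() in the patch. */
static bool running_in_guest(void)
{
	return path_exists(HYPERTAS_PATH);
}
```

Using access() with F_OK avoids opening the file, so the check only depends on
the node being present in the device tree.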



* Re: [RESEND][PATCH 1/7] powerpc/iommu: Avoid overflow at boundary_size
From: Michael Ellerman @ 2020-09-01 13:27 UTC (permalink / raw)
  To: Nicolin Chen, benh, paulus, rth, ink, mattst88, tony.luck,
	fenghua.yu, schnelle, gerald.schaefer, hca, gor, borntraeger,
	davem, tglx, mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, linux-ia64, linux-parisc, linux-s390, linux-kernel,
	linux-alpha, sparclinux, linuxppc-dev, hch
In-Reply-To: <20200831203811.8494-2-nicoleotsuka@gmail.com>

Nicolin Chen <nicoleotsuka@gmail.com> writes:
> The boundary_size might be as large as ULONG_MAX, which means
> that a device has no specific boundary limit. So either "+ 1"
> or passing it to ALIGN() would potentially overflow.
>
> According to kernel defines:
>     #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
>     #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)
>
> We can simplify the logic here:
>   ALIGN(boundary + 1, 1 << shift) >> shift
> = ALIGN_MASK(b + 1, (1 << s) - 1) >> s
> = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
> = [b + 1 + (1 << s) - 1] >> s
> = [b + (1 << s)] >> s
> = (b >> s) + 1
>
> So fixing a potential overflow with the safer shortcut.
>
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> Cc: Christoph Hellwig <hch@lst.de>
> ---
>  arch/powerpc/kernel/iommu.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)

Are you asking for acks, or for maintainers to merge the patches
individually?

> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 9704f3f76e63..c01ccbf8afdd 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -236,15 +236,14 @@ static unsigned long iommu_range_alloc(struct device *dev,
>  		}
>  	}
>  
> -	if (dev)
> -		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> -				      1 << tbl->it_page_shift);
> -	else
> -		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
>  	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
> +	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;

Is there any path that passes a NULL dev anymore?

Both iseries_hv_alloc() and iseries_hv_map() were removed years ago.
See:
  8ee3e0d69623 ("powerpc: Remove the main legacy iSerie platform code")


So maybe we should do a lead-up patch that drops the NULL dev support,
which will then make this patch simpler.

cheers


> +	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> +	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
>  
>  	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
> -			     boundary_size >> tbl->it_page_shift, align_mask);
> +			     boundary_size, align_mask);
>  	if (n == -1) {
>  		if (likely(pass == 0)) {
>  			/* First try the pool from the start */
> -- 
> 2.17.1
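
The simplification derived in the commit message can be verified with a short
userspace sketch. The two helper functions are hypothetical names; the macros
are copies of the kernel definitions quoted above, with typeof(x) replaced by
unsigned long so the example builds without kernel headers.

```c
#include <assert.h>
#include <limits.h>

/* Userspace copies of the macros quoted in the commit message. */
#define ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
#define ALIGN(x, a)		ALIGN_MASK(x, (unsigned long)(a) - 1)

/* Original form: "boundary + 1" wraps to 0 when boundary == ULONG_MAX. */
static unsigned long boundary_pages_align(unsigned long boundary,
					  unsigned int shift)
{
	return ALIGN(boundary + 1, 1UL << shift) >> shift;
}

/* Shortcut from the patch: no intermediate value exceeds ULONG_MAX. */
static unsigned long boundary_pages_shortcut(unsigned long boundary,
					     unsigned int shift)
{
	assert(shift > 0 && shift < sizeof(unsigned long) * CHAR_BIT);
	return (boundary >> shift) + 1;
}
```

Both forms agree whenever boundary + 1 does not wrap. For boundary ==
ULONG_MAX the ALIGN() form collapses to 0, while the shortcut still returns
the intended number of pages.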


* Re: [PATCH] powerpc/powernv/pci: Drop pnv_phb->initialized
From: Michael Ellerman @ 2020-09-01 13:52 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: Oliver O'Halloran
In-Reply-To: <20200831061500.1646445-1-oohall@gmail.com>

Oliver O'Halloran <oohall@gmail.com> writes:

> The pnv_phb->initialized flag is an odd beast. It was added back in 2012 in
> commit db1266c85261 ("powerpc/powernv: Skip check on PE if necessary") to
> allow devices to be enabled even if their PE assignments hadn't been
> completed yet. I can't think of any situation where we would (or should)
> have PCI devices being enabled before their PEs are assigned, so I can only
> assume it was a workaround for a bug or some other undesirable behaviour
> from the PCI core.
>
> Since commit dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE
> configuration") the PE setup occurs before the PCI core allows a driver to
> attach to the device, so the problem should no longer exist. Even if it does,
> allowing the device to be enabled before we have assigned the device to a
> PE is almost certainly broken and will cause spurious EEH events, so we
> should probably just remove it.
>
> It's also worth pointing out that ->initialized flag is set in
> pnv_pci_ioda_create_dbgfs() which has the entire function body wrapped in
> flag.

"body wrapped in flag." ?

I guess you meant:

"wrapped in #ifdef CONFIG_DEBUG_FS" ?

> That has the fun side effect of bypassing any other checks in
> pnv_pci_enable_device_hook() which is probably not what anybody wants.

That would only be true for CONFIG_DEBUG_FS=n builds though.

cheers

> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 023a4f987bb2..6ac3c637b313 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2410,9 +2410,6 @@ static void pnv_pci_ioda_create_dbgfs(void)
>  	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
>  		phb = hose->private_data;
>  
> -		/* Notify initialization of PHB done */
> -		phb->initialized = 1;
> -
>  		sprintf(name, "PCI%04x", hose->global_number);
>  		phb->dbgfs = debugfs_create_dir(name, powerpc_debugfs_root);
>  
> @@ -2609,17 +2606,8 @@ static resource_size_t pnv_pci_default_alignment(void)
>   */
>  static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
>  {
> -	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
>  	struct pci_dn *pdn;
>  
> -	/* The function is probably called while the PEs have
> -	 * not be created yet. For example, resource reassignment
> -	 * during PCI probe period. We just skip the check if
> -	 * PEs isn't ready.
> -	 */
> -	if (!phb->initialized)
> -		return true;
> -
>  	pdn = pci_get_pdn(dev);
>  	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
>  		return false;
> @@ -2629,14 +2617,9 @@ static bool pnv_pci_enable_device_hook(struct pci_dev *dev)
>  
>  static bool pnv_ocapi_enable_device_hook(struct pci_dev *dev)
>  {
> -	struct pci_controller *hose = pci_bus_to_host(dev->bus);
> -	struct pnv_phb *phb = hose->private_data;
>  	struct pci_dn *pdn;
>  	struct pnv_ioda_pe *pe;
>  
> -	if (!phb->initialized)
> -		return true;
> -
>  	pdn = pci_get_pdn(dev);
>  	if (!pdn)
>  		return false;
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 739a0b3b72e1..36d22920f5a3 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -119,7 +119,6 @@ struct pnv_phb {
>  	int			flags;
>  	void __iomem		*regs;
>  	u64			regs_phys;
> -	int			initialized;
>  	spinlock_t		lock;
>  
>  #ifdef CONFIG_DEBUG_FS
> -- 
> 2.26.2


* [PATCH] cpuidle-pseries: Fix CEDE latency conversion from tb to us
From: Gautham R. Shenoy @ 2020-09-01 14:08 UTC (permalink / raw)
  To: Michael Ellerman, Rafael J. Wysocki, Vaidyanathan Srinivasan
  Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel, linux-pm

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for
CEDE(0)") sets the exit latency of CEDE(0) based on the latency values
of the Extended CEDE states advertised by the platform. The values
advertised by the platform are in timebase ticks. However the cpuidle
framework requires the latency values in microseconds.

If the tb-ticks value advertised by the platform corresponds to a value
smaller than 1us, the current code truncates the result to zero during
the conversion from tb-ticks to microseconds. This is incorrect as it
puts a CEDE state on par with the snooze state.

This patch fixes this by rounding up the result obtained while
converting the latency value from tb-ticks to microseconds.

Fixes: d947fb4c965c ("cpuidle: pseries: Fixup exit latency for CEDE(0)")

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpuidle/cpuidle-pseries.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index ff6d99e..9043358 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -361,7 +361,7 @@ static void __init fixup_cede0_latency(void)
 	for (i = 0; i < nr_xcede_records; i++) {
 		struct xcede_latency_record *record = &payload->records[i];
 		u64 latency_tb = be64_to_cpu(record->latency_ticks);
-		u64 latency_us = tb_to_ns(latency_tb) / NSEC_PER_USEC;
+		u64 latency_us = DIV_ROUND_UP_ULL(tb_to_ns(latency_tb), NSEC_PER_USEC);
 
 		if (latency_us < min_latency_us)
 			min_latency_us = latency_us;
-- 
1.9.4
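
The effect of the rounding change can be seen with a minimal userspace sketch.
It assumes the latency has already been converted from timebase ticks to
nanoseconds (the tb_to_ns() step is left out), and the two function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* Truncating conversion, as in the code before the patch. */
static uint64_t ns_to_us_trunc(uint64_t ns)
{
	return ns / NSEC_PER_USEC;
}

/* Rounding-up conversion, mirroring DIV_ROUND_UP_ULL() in the patch. */
static uint64_t ns_to_us_round_up(uint64_t ns)
{
	assert(ns <= UINT64_MAX - (NSEC_PER_USEC - 1));	/* avoid wrap-around */
	return (ns + NSEC_PER_USEC - 1) / NSEC_PER_USEC;
}
```

A 500 ns latency truncates to 0 us but rounds up to 1 us, so a CEDE state with
a sub-microsecond exit latency no longer looks identical to snooze.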



* [PATCH v3 16/23] powerpc: use asm-generic/mmu_context.h for no-op implementations
From: Nicholas Piggin @ 2020-09-01 14:15 UTC (permalink / raw)
  To: linux-arch
  Cc: Arnd Bergmann, linux-kernel, Nicholas Piggin, linux-mm,
	linuxppc-dev
In-Reply-To: <20200901141539.1757549-1-npiggin@gmail.com>

Cc: linuxppc-dev@lists.ozlabs.org
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/mmu_context.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 7f3658a97384..a3a12a8341b2 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -14,7 +14,9 @@
 /*
  * Most if the context management is out of line
  */
+#define init_new_context init_new_context
 extern int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
+#define destroy_context destroy_context
 extern void destroy_context(struct mm_struct *mm);
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 struct mm_iommu_table_group_mem_t;
@@ -235,27 +237,26 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 }
 #define switch_mm_irqs_off switch_mm_irqs_off
 
-
-#define deactivate_mm(tsk,mm)	do { } while (0)
-
 /*
  * After we have set current->mm to a new value, this activates
  * the context for the new mm so we see the new mappings.
  */
+#define activate_mm activate_mm
 static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next)
 {
 	switch_mm(prev, next, current);
 }
 
 /* We don't currently use enter_lazy_tlb() for anything */
+#ifdef CONFIG_PPC_BOOK3E_64
+#define enter_lazy_tlb enter_lazy_tlb
 static inline void enter_lazy_tlb(struct mm_struct *mm,
 				  struct task_struct *tsk)
 {
 	/* 64-bit Book3E keeps track of current PGD in the PACA */
-#ifdef CONFIG_PPC_BOOK3E_64
 	get_paca()->pgd = NULL;
-#endif
 }
+#endif
 
 extern void arch_exit_mmap(struct mm_struct *mm);
 
@@ -298,5 +299,7 @@ static inline int arch_dup_mmap(struct mm_struct *oldmm,
 	return 0;
 }
 
+#include <asm-generic/mmu_context.h>
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
-- 
2.23.0



* Re: [PATCH] selftests/powerpc: Skip PROT_SAO test in guests/LPARS
From: Sachin Sant @ 2020-09-01 15:33 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev
In-Reply-To: <20200901124653.523182-1-mpe@ellerman.id.au>



> On 01-Sep-2020, at 6:16 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> In commit 9b725a90a8f1 ("powerpc/64s: Disallow PROT_SAO in LPARs by
> default") PROT_SAO was disabled in guests/LPARs by default. So skip
> the test if we are running in a guest to avoid a spurious failure.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---

Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>

With the fix, the test is skipped when running in a guest:

# ./prot_sao 
test: prot-sao
tags: git_version:unknown
[SKIP] Test skipped on line 25
skip: prot-sao
#



* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v2
From: Christophe Leroy @ 2020-09-01 17:13 UTC (permalink / raw)
  To: Christoph Hellwig, Linus Torvalds, Al Viro, Michael Ellerman, x86
  Cc: linux-fsdevel, linux-arch, linuxppc-dev, Kees Cook, linux-kernel
In-Reply-To: <20200827150030.282762-1-hch@lst.de>

Hi Christoph,

On 27/08/2020 at 17:00, Christoph Hellwig wrote:
> Hi all,
> 
> this series removes the last set_fs() used to force a kernel address
> space for the uaccess code in the kernel read/write/splice code, and then
> stops implementing the address space overrides entirely for x86 and
> powerpc.
> 
> The file system part has been posted a few times, and the read/write side
> has been pretty much unchanged.  For splice this series drops the
> conversion of the seq_file and sysctl code to the iter ops, and thus loses
> the splice support for them.  The reason for that is that it caused a lot
> of churn for not much use - splice for these small files really isn't much
> of a win, even if existing userspace uses it.  All callers I found do the
> proper fallback, but if this turns out to be an issue the conversion can
> be resurrected.
> 
> Besides x86 and powerpc I plan to eventually convert all other
> architectures, although this will be a slow process, starting with the
> easier ones once the infrastructure is merged.  The process to convert
> architectures is roughly:
> 
>   (1) ensure there is no set_fs(KERNEL_DS) left in arch specific code
>   (2) implement __get_kernel_nofault and __put_kernel_nofault
>   (3) remove the arch specific address limitation functionality
> 
> Changes since v1:
>   - drop the patch to remove the non-iter ops for /dev/zero and
>     /dev/null as they caused a performance regression
>   - don't enable user access in __get_kernel on powerpc
>   - xfail the set_fs() based lkdtm tests
> 
> Diffstat:
> 


I'm still sceptical about the results I get.

With 5.9-rc2:

root@vgoippro:~# time dd if=/dev/zero of=/dev/null count=1M
1048576+0 records in
1048576+0 records out
536870912 bytes (512.0MB) copied, 5.585880 seconds, 91.7MB/s
real    0m 5.59s
user    0m 1.40s
sys     0m 4.19s


With your series:

root@vgoippro:/tmp# time dd if=/dev/zero of=/dev/null count=1M
1048576+0 records in
1048576+0 records out
536870912 bytes (512.0MB) copied, 7.780540 seconds, 65.8MB/s
real    0m 7.79s
user    0m 2.12s
sys     0m 5.66s
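
As a cross-check of the two dd runs above: the reported rates match
512 MiB divided by the elapsed time, and the elapsed time grows by
roughly 39% with the series. A minimal sketch of that arithmetic, using
only the byte count and timings printed by dd (the function names are
made up for illustration):

```c
#include <assert.h>

/* Throughput in MiB/s for `bytes` copied in `seconds`. */
double mib_per_sec(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / (1024.0 * 1024.0) / seconds;
}

/* Increase in elapsed time, as a percentage of the baseline time. */
double slowdown_pct(double base_seconds, double new_seconds)
{
	assert(base_seconds > 0.0);
	return (new_seconds / base_seconds - 1.0) * 100.0;
}
```

Calling mib_per_sec(536870912.0, 5.585880) and
mib_per_sec(536870912.0, 7.780540) gives the two rates, and
slowdown_pct(5.585880, 7.780540) gives the relative increase in run time.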




Top of perf report of a standard perf record:

With 5.9-rc2:

     20.31%  dd       [kernel.kallsyms]  [k] __arch_clear_user
      8.37%  dd       [kernel.kallsyms]  [k] transfer_to_syscall
      7.37%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
      6.95%  dd       [kernel.kallsyms]  [k] iov_iter_zero
      5.72%  dd       [kernel.kallsyms]  [k] new_sync_read
      4.87%  dd       [kernel.kallsyms]  [k] vfs_write
      4.47%  dd       [kernel.kallsyms]  [k] vfs_read
      3.07%  dd       [kernel.kallsyms]  [k] ksys_write
      2.77%  dd       [kernel.kallsyms]  [k] ksys_read
      2.65%  dd       [kernel.kallsyms]  [k] __fget_light
      2.37%  dd       [kernel.kallsyms]  [k] __fdget_pos
      2.35%  dd       [kernel.kallsyms]  [k] memset
      1.53%  dd       [kernel.kallsyms]  [k] rw_verify_area
      1.52%  dd       [kernel.kallsyms]  [k] read_iter_zero

With your series:
     19.60%  dd       [kernel.kallsyms]  [k] __arch_clear_user
     10.92%  dd       [kernel.kallsyms]  [k] iov_iter_zero
      9.50%  dd       [kernel.kallsyms]  [k] vfs_write
      8.97%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
      5.46%  dd       [kernel.kallsyms]  [k] transfer_to_syscall
      5.42%  dd       [kernel.kallsyms]  [k] vfs_read
      3.58%  dd       [kernel.kallsyms]  [k] ksys_read
      2.84%  dd       [kernel.kallsyms]  [k] read_iter_zero
      2.24%  dd       [kernel.kallsyms]  [k] ksys_write
      1.80%  dd       [kernel.kallsyms]  [k] __fget_light
      1.34%  dd       [kernel.kallsyms]  [k] __fdget_pos
      0.91%  dd       [kernel.kallsyms]  [k] memset
      0.91%  dd       [kernel.kallsyms]  [k] rw_verify_area

Christophe

^ permalink raw reply

* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v2
From: Al Viro @ 2020-09-01 17:25 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linux-arch, Kees Cook, x86, linuxppc-dev, linux-kernel,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig
In-Reply-To: <a8bb0319-0928-4687-9e9c-777c5860dbdd@csgroup.eu>

On Tue, Sep 01, 2020 at 07:13:00PM +0200, Christophe Leroy wrote:

>     10.92%  dd       [kernel.kallsyms]  [k] iov_iter_zero

Interesting...  Could you get an instruction-level profile inside iov_iter_zero(),
along with the disassembly of that sucker?

^ permalink raw reply

* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v2
From: Matthew Wilcox @ 2020-09-01 17:42 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, Kees Cook, x86, linuxppc-dev, linux-kernel,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig
In-Reply-To: <20200901172512.GI1236603@ZenIV.linux.org.uk>

On Tue, Sep 01, 2020 at 06:25:12PM +0100, Al Viro wrote:
> On Tue, Sep 01, 2020 at 07:13:00PM +0200, Christophe Leroy wrote:
> 
> >     10.92%  dd       [kernel.kallsyms]  [k] iov_iter_zero
> 
> Interesting...  Could you get an instruction-level profile inside iov_iter_zero(),
> along with the disassembly of that sucker?

Also, does [1] make any difference?  Probably not since it's translating
O flags into IOCB flags instead of RWF flags into IOCB flags.  I wonder
if there's a useful trick we can play here ... something like:

static inline int iocb_flags(struct file *file)
{
        int res = 0;

        if (likely(!(file->f_flags & (O_APPEND | O_DIRECT | O_DSYNC | __O_SYNC))) &&
            !IS_SYNC(file->f_mapping->host))
                return res;
        if (file->f_flags & O_APPEND)
                res |= IOCB_APPEND;
        if (file->f_flags & O_DIRECT)
                res |= IOCB_DIRECT;
        if ((file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host))
                res |= IOCB_DSYNC;
        if (file->f_flags & __O_SYNC)
                res |= IOCB_SYNC;
        return res;
}

Can we do something like force O_DSYNC to be set if the inode IS_SYNC()
at the time of open?  Or is setting the sync bit on the inode required
to affect currently-open files?

[1] https://lore.kernel.org/linux-fsdevel/95de7ce4-9254-39f1-304f-4455f66bf0f4@kernel.dk/ 
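
For illustration only, the fast-path idea in the snippet above can be
tried in user space. The sketch below is not kernel code: the flag
values, struct sk_file, and the inode_is_sync field (standing in for
IS_SYNC(file->f_mapping->host)) are all hypothetical. It tests the whole
mask once, with the mask parenthesised so that `!` applies to the result
of the `&` and not to f_flags alone:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, chosen for the sketch; not the kernel's. */
enum {
	SK_O_APPEND = 1 << 0,
	SK_O_DIRECT = 1 << 1,
	SK_O_DSYNC  = 1 << 2,
	SK_O_SYNC   = 1 << 3,	/* plays the role of __O_SYNC */
};

enum {
	SK_IOCB_APPEND = 1 << 0,
	SK_IOCB_DIRECT = 1 << 1,
	SK_IOCB_DSYNC  = 1 << 2,
	SK_IOCB_SYNC   = 1 << 3,
};

struct sk_file {
	unsigned int f_flags;
	bool inode_is_sync;	/* stand-in for IS_SYNC() on the inode */
};

int sk_iocb_flags(const struct sk_file *file)
{
	const unsigned int slow_mask =
		SK_O_APPEND | SK_O_DIRECT | SK_O_DSYNC | SK_O_SYNC;
	int res = 0;

	assert(file);

	/* Common case: none of the flags set and the inode is not sync. */
	if (!(file->f_flags & slow_mask) && !file->inode_is_sync)
		return 0;

	if (file->f_flags & SK_O_APPEND)
		res |= SK_IOCB_APPEND;
	if (file->f_flags & SK_O_DIRECT)
		res |= SK_IOCB_DIRECT;
	if ((file->f_flags & SK_O_DSYNC) || file->inode_is_sync)
		res |= SK_IOCB_DSYNC;
	if (file->f_flags & SK_O_SYNC)
		res |= SK_IOCB_SYNC;
	return res;
}
```

The early return gives the same result as falling through the four
tests; whether it saves anything measurable on the dd workload is not
established here.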

^ permalink raw reply

* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v2
From: Christophe Leroy @ 2020-09-01 18:39 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, Kees Cook, x86, linuxppc-dev, linux-kernel,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig
In-Reply-To: <20200901172512.GI1236603@ZenIV.linux.org.uk>



On 01/09/2020 at 19:25, Al Viro wrote:
> On Tue, Sep 01, 2020 at 07:13:00PM +0200, Christophe Leroy wrote:
> 
>>      10.92%  dd       [kernel.kallsyms]  [k] iov_iter_zero
> 
> Interesting...  Could you get an instruction-level profile inside iov_iter_zero(),
> along with the disassembly of that sucker?
> 

Output of perf annotate:


  Percent |	Source code & Disassembly of vmlinux for cpu-clock (3579 samples)
---------------------------------------------------------------------------------
          :
          :
          :
          :	Disassembly of section .text:
          :
          :	c02cb3a4 <iov_iter_zero>:
          :	iov_iter_zero():
     2.24 :	  c02cb3a4:       stwu    r1,-80(r1)
     0.31 :	  c02cb3a8:       stw     r30,72(r1)
     0.00 :	  c02cb3ac:       mr      r30,r4
     0.11 :	  c02cb3b0:       stw     r31,76(r1)
     0.00 :	  c02cb3b4:       mr      r31,r3
     1.06 :	  c02cb3b8:       stw     r27,60(r1)
          :	iov_iter_type():
     0.03 :	  c02cb3bc:       lwz     r10,0(r4)
     0.06 :	  c02cb3c0:       rlwinm  r9,r10,0,0,30
          :	iov_iter_zero():
     0.03 :	  c02cb3c4:       cmpwi   r9,32
     0.00 :	  c02cb3c8:       lwz     r9,624(r2)
     2.15 :	  c02cb3cc:       stw     r9,28(r1)
     0.00 :	  c02cb3d0:       li      r9,0
     0.00 :	  c02cb3d4:       beq     c02cb520 <iov_iter_zero+0x17c>
     0.14 :	  c02cb3d8:       lwz     r9,8(r4)
     0.08 :	  c02cb3dc:       cmplw   r9,r3
     0.00 :	  c02cb3e0:       mr      r27,r9
     0.03 :	  c02cb3e4:       bgt     c02cb4fc <iov_iter_zero+0x158>
     1.34 :	  c02cb3e8:       cmpwi   r9,0
     0.00 :	  c02cb3ec:       beq     c02cb4d0 <iov_iter_zero+0x12c>
     0.11 :	  c02cb3f0:       andi.   r8,r10,16
     0.17 :	  c02cb3f4:       lwz     r31,4(r30)
     1.79 :	  c02cb3f8:       bne     c02cb61c <iov_iter_zero+0x278>
     0.00 :	  c02cb3fc:       andi.   r8,r10,8
     0.06 :	  c02cb400:       bne     c02cb770 <iov_iter_zero+0x3cc>
     0.22 :	  c02cb404:       andi.   r10,r10,64
     0.03 :	  c02cb408:       bne     c02cb88c <iov_iter_zero+0x4e8>
     0.11 :	  c02cb40c:       stw     r29,68(r1)
     1.59 :	  c02cb410:       stw     r28,64(r1)
     0.03 :	  c02cb414:       lwz     r28,12(r30)
     0.00 :	  c02cb418:       lwz     r7,4(r28)
     1.87 :	  c02cb41c:       subf    r29,r31,r7
     0.28 :	  c02cb420:       cmplw   r29,r27
     0.03 :	  c02cb424:       bgt     c02cb50c <iov_iter_zero+0x168>
     0.03 :	  c02cb428:       cmpwi   r29,0
     0.00 :	  c02cb42c:       beq     c02cb898 <iov_iter_zero+0x4f4>
     1.34 :	  c02cb430:       lwz     r3,0(r28)
          :	__access_ok():
     0.00 :	  c02cb434:       lis     r10,-16384
          :	iov_iter_zero():
     0.36 :	  c02cb438:       add     r3,r3,r31
          :	__access_ok():
     0.03 :	  c02cb43c:       cmplw   r3,r10
     1.79 :	  c02cb440:       bge     c02cb514 <iov_iter_zero+0x170>
    13.19 :	  c02cb444:       subf    r10,r3,r10
          :	clear_user():
     0.00 :	  c02cb448:       cmplw   r29,r10
     4.41 :	  c02cb44c:       mflr    r0
     0.00 :	  c02cb450:       stw     r0,84(r1)
     0.00 :	  c02cb454:       bgt     c02cb8c4 <iov_iter_zero+0x520>
     0.00 :	  c02cb458:       mr      r4,r29
     0.00 :	  c02cb45c:       bl      c001a41c <__arch_clear_user>
          :	iov_iter_zero():
     0.70 :	  c02cb460:       add     r31,r31,r29
     0.00 :	  c02cb464:       cmpwi   r3,0
    17.13 :	  c02cb468:       subf    r29,r29,r27
     0.00 :	  c02cb46c:       subf    r31,r3,r31
     1.20 :	  c02cb470:       add     r29,r29,r3
     0.00 :	  c02cb474:       beq     c02cb8b8 <iov_iter_zero+0x514>
     0.00 :	  c02cb478:       lwz     r9,8(r30)
     0.00 :	  c02cb47c:       subf    r10,r27,r29
     0.00 :	  c02cb480:       lwz     r0,84(r1)
     0.00 :	  c02cb484:       subf    r27,r29,r27
     0.00 :	  c02cb488:       add     r9,r10,r9
     0.00 :	  c02cb48c:       lwz     r7,4(r28)
     0.00 :	  c02cb490:       lwz     r10,12(r30)
     0.00 :	  c02cb494:       mtlr    r0
     1.65 :	  c02cb498:       cmplw   r31,r7
    14.61 :	  c02cb49c:       bne     c02cb4a8 <iov_iter_zero+0x104>
     1.65 :	  c02cb4a0:       addi    r28,r28,8
     0.00 :	  c02cb4a4:       li      r31,0
    14.92 :	  c02cb4a8:       lwz     r8,16(r30)
     0.00 :	  c02cb4ac:       subf    r10,r10,r28
     1.12 :	  c02cb4b0:       srawi   r10,r10,3
     0.56 :	  c02cb4b4:       stw     r28,12(r30)
     0.00 :	  c02cb4b8:       subf    r10,r10,r8
     1.23 :	  c02cb4bc:       stw     r10,16(r30)
     0.00 :	  c02cb4c0:       lwz     r28,64(r1)
     0.56 :	  c02cb4c4:       lwz     r29,68(r1)
     0.00 :	  c02cb4c8:       stw     r9,8(r30)
     2.12 :	  c02cb4cc:       stw     r31,4(r30)
     0.00 :	  c02cb4d0:       lwz     r9,28(r1)
     0.61 :	  c02cb4d4:       lwz     r10,624(r2)
     0.00 :	  c02cb4d8:       xor.    r9,r9,r10
     0.00 :	  c02cb4dc:       li      r10,0
     0.00 :	  c02cb4e0:       bne     c02cb9a8 <iov_iter_zero+0x604>
     0.00 :	  c02cb4e4:       mr      r3,r27
     0.00 :	  c02cb4e8:       lwz     r30,72(r1)
     1.73 :	  c02cb4ec:       lwz     r27,60(r1)
     0.50 :	  c02cb4f0:       lwz     r31,76(r1)
     0.00 :	  c02cb4f4:       addi    r1,r1,80
     0.00 :	  c02cb4f8:       blr
     0.00 :	  c02cb4fc:       cmpwi   r9,0
     0.00 :	  c02cb500:       mr      r27,r3
     0.00 :	  c02cb504:       beq     c02cb4d0 <iov_iter_zero+0x12c>
     0.00 :	  c02cb508:       b       c02cb3f0 <iov_iter_zero+0x4c>
     0.00 :	  c02cb50c:       mr      r29,r27
     0.00 :	  c02cb510:       b       c02cb428 <iov_iter_zero+0x84>
          :	__access_ok():
     0.00 :	  c02cb514:       li      r27,0
     0.00 :	  c02cb518:       mr      r10,r28
     0.00 :	  c02cb51c:       b       c02cb498 <iov_iter_zero+0xf4>
          :	pipe_zero():
     0.00 :	  c02cb520:       mflr    r0
     0.00 :	  c02cb524:       stw     r26,56(r1)
     0.00 :	  c02cb528:       stw     r0,84(r1)
     0.00 :	  c02cb52c:       mr      r3,r4
     0.00 :	  c02cb530:       stw     r28,64(r1)
     0.00 :	  c02cb534:       lwz     r28,12(r4)
     0.00 :	  c02cb538:       lwz     r26,40(r28)
     0.00 :	  c02cb53c:       bl      c02c8e48 <sanity>
     0.00 :	  c02cb540:       cmpwi   r3,0
     0.00 :	  c02cb544:       bne     c02cb560 <iov_iter_zero+0x1bc>
     0.00 :	  c02cb548:       lwz     r0,84(r1)
     0.00 :	  c02cb54c:       li      r27,0
     0.00 :	  c02cb550:       lwz     r26,56(r1)
     0.00 :	  c02cb554:       lwz     r28,64(r1)
     0.00 :	  c02cb558:       mtlr    r0
     0.00 :	  c02cb55c:       b       c02cb4d0 <iov_iter_zero+0x12c>
     0.00 :	  c02cb560:       mr      r4,r31
     0.00 :	  c02cb564:       addi    r6,r1,24
     0.00 :	  c02cb568:       addi    r5,r1,20
     0.00 :	  c02cb56c:       mr      r3,r30
     0.00 :	  c02cb570:       bl      c02c9030 <push_pipe>
     0.00 :	  c02cb574:       mr.     r27,r3
     0.00 :	  c02cb578:       beq     c02cb548 <iov_iter_zero+0x1a4>
     0.00 :	  c02cb57c:       lwz     r4,24(r1)
     0.00 :	  c02cb580:       addi    r26,r26,-1
     0.00 :	  c02cb584:       lwz     r9,20(r1)
     0.00 :	  c02cb588:       stw     r25,52(r1)
     0.00 :	  c02cb58c:       li      r25,0
     0.00 :	  c02cb590:       stw     r29,68(r1)
     0.00 :	  c02cb594:       mr      r29,r27
     0.00 :	  c02cb598:       subfic  r31,r4,4096
     0.00 :	  c02cb59c:       cmplw   r31,r29
     0.00 :	  c02cb5a0:       ble     c02cb5a8 <iov_iter_zero+0x204>
     0.00 :	  c02cb5a4:       mr      r31,r29
     0.00 :	  c02cb5a8:       and     r9,r26,r9
     0.00 :	  c02cb5ac:       lwz     r8,80(r28)
     0.00 :	  c02cb5b0:       rlwinm  r10,r9,1,0,30
     0.00 :	  c02cb5b4:       add     r9,r10,r9
     0.00 :	  c02cb5b8:       rlwinm  r9,r9,3,0,28
     0.00 :	  c02cb5bc:       lwzx    r3,r8,r9
     0.00 :	  c02cb5c0:       mr      r5,r31
     0.00 :	  c02cb5c4:       bl      c02c92ec <memzero_page>
     0.00 :	  c02cb5c8:       subf.   r29,r31,r29
     0.00 :	  c02cb5cc:       lwz     r9,20(r1)
     0.00 :	  c02cb5d0:       li      r4,0
     0.00 :	  c02cb5d4:       lwz     r10,24(r1)
     0.00 :	  c02cb5d8:       stw     r9,16(r30)
     0.00 :	  c02cb5dc:       addi    r9,r9,1
     0.00 :	  c02cb5e0:       add     r10,r10,r31
     0.00 :	  c02cb5e4:       stw     r9,20(r1)
     0.00 :	  c02cb5e8:       stw     r10,4(r30)
     0.00 :	  c02cb5ec:       stw     r25,24(r1)
     0.00 :	  c02cb5f0:       bne     c02cb598 <iov_iter_zero+0x1f4>
     0.00 :	  c02cb5f4:       lwz     r9,8(r30)
     0.00 :	  c02cb5f8:       subf    r9,r27,r9
     0.00 :	  c02cb5fc:       stw     r9,8(r30)
          :	iov_iter_zero():
     0.00 :	  c02cb600:       lwz     r0,84(r1)
     0.00 :	  c02cb604:       lwz     r25,52(r1)
     0.00 :	  c02cb608:       lwz     r26,56(r1)
     0.00 :	  c02cb60c:       mtlr    r0
     0.00 :	  c02cb610:       lwz     r28,64(r1)
     0.00 :	  c02cb614:       lwz     r29,68(r1)
     0.00 :	  c02cb618:       b       c02cb4d0 <iov_iter_zero+0x12c>
     0.00 :	  c02cb61c:       stw     r23,44(r1)
     0.00 :	  c02cb620:       cmpwi   r27,0
     0.00 :	  c02cb624:       stw     r28,64(r1)
     0.00 :	  c02cb628:       mr      r23,r27
     0.00 :	  c02cb62c:       stw     r24,48(r1)
     0.00 :	  c02cb630:       li      r28,0
     0.00 :	  c02cb634:       lwz     r24,12(r30)
     0.00 :	  c02cb638:       mr      r8,r24
     0.00 :	  c02cb63c:       beq     c02cb714 <iov_iter_zero+0x370>
     0.00 :	  c02cb640:       mflr    r0
     0.00 :	  c02cb644:       stw     r25,52(r1)
     0.00 :	  c02cb648:       stw     r0,84(r1)
     0.00 :	  c02cb64c:       stw     r26,56(r1)
     0.00 :	  c02cb650:       stw     r29,68(r1)
     0.00 :	  c02cb654:       rlwinm  r25,r28,1,0,30
     0.00 :	  c02cb658:       add     r25,r25,r28
     0.00 :	  c02cb65c:       rlwinm  r25,r25,2,0,29
     0.00 :	  c02cb660:       add     r10,r8,r25
     0.00 :	  c02cb664:       lwz     r26,4(r10)
     0.00 :	  c02cb668:       mr      r29,r25
     0.00 :	  c02cb66c:       lwz     r9,8(r10)
     0.00 :	  c02cb670:       subf    r26,r31,r26
     0.00 :	  c02cb674:       cmplw   r26,r23
     0.00 :	  c02cb678:       add     r9,r31,r9
     0.00 :	  c02cb67c:       clrlwi  r4,r9,20
     0.00 :	  c02cb680:       ble     c02cb688 <iov_iter_zero+0x2e4>
     0.00 :	  c02cb684:       mr      r26,r23
     0.00 :	  c02cb688:       subfic  r7,r4,4096
     0.00 :	  c02cb68c:       cmplw   r26,r7
     0.00 :	  c02cb690:       ble     c02cb698 <iov_iter_zero+0x2f4>
     0.00 :	  c02cb694:       mr      r26,r7
     0.00 :	  c02cb698:       cmpwi   r26,0
     0.00 :	  c02cb69c:       beq     c02cb6c0 <iov_iter_zero+0x31c>
     0.00 :	  c02cb6a0:       lwz     r3,0(r10)
     0.00 :	  c02cb6a4:       rlwinm  r9,r9,25,7,26
     0.00 :	  c02cb6a8:       mr      r5,r26
     0.00 :	  c02cb6ac:       add     r3,r3,r9
     0.00 :	  c02cb6b0:       bl      c02c92ec <memzero_page>
          :	bvec_iter_advance():
     0.00 :	  c02cb6b4:       cmplw   r23,r26
          :	iov_iter_zero():
     0.00 :	  c02cb6b8:       lwz     r8,12(r30)
          :	bvec_iter_advance():
     0.00 :	  c02cb6bc:       blt     c02cb850 <iov_iter_zero+0x4ac>
     0.00 :	  c02cb6c0:       add.    r31,r31,r26
     0.00 :	  c02cb6c4:       subf    r23,r26,r23
     0.00 :	  c02cb6c8:       addi    r10,r8,4
     0.00 :	  c02cb6cc:       bne     c02cb6e4 <iov_iter_zero+0x340>
     0.00 :	  c02cb6d0:       b       c02cb6f0 <iov_iter_zero+0x34c>
     0.00 :	  c02cb6d4:       subf.   r31,r9,r31
     0.00 :	  c02cb6d8:       addi    r28,r28,1
     0.00 :	  c02cb6dc:       addi    r29,r29,12
     0.00 :	  c02cb6e0:       beq     c02cb760 <iov_iter_zero+0x3bc>
     0.00 :	  c02cb6e4:       lwzx    r9,r10,r29
     0.00 :	  c02cb6e8:       cmplw   r31,r9
     0.00 :	  c02cb6ec:       bge     c02cb6d4 <iov_iter_zero+0x330>
          :	iov_iter_zero():
     0.00 :	  c02cb6f0:       cmpwi   r23,0
     0.00 :	  c02cb6f4:       bne     c02cb654 <iov_iter_zero+0x2b0>
     0.00 :	  c02cb6f8:       add     r8,r8,r29
     0.00 :	  c02cb6fc:       lwz     r0,84(r1)
     0.00 :	  c02cb700:       lwz     r9,8(r30)
     0.00 :	  c02cb704:       lwz     r25,52(r1)
     0.00 :	  c02cb708:       mtlr    r0
     0.00 :	  c02cb70c:       lwz     r26,56(r1)
     0.00 :	  c02cb710:       lwz     r29,68(r1)
     0.00 :	  c02cb714:       subf    r24,r24,r8
     0.00 :	  c02cb718:       stw     r8,12(r30)
     0.00 :	  c02cb71c:       srawi   r6,r24,2
     0.00 :	  c02cb720:       lwz     r7,16(r30)
     0.00 :	  c02cb724:       rlwinm  r10,r24,0,0,29
     0.00 :	  c02cb728:       add     r10,r10,r6
     0.00 :	  c02cb72c:       rlwinm  r8,r10,4,0,27
     0.00 :	  c02cb730:       add     r10,r10,r8
     0.00 :	  c02cb734:       rlwinm  r8,r10,8,0,23
     0.00 :	  c02cb738:       add     r10,r10,r8
     0.00 :	  c02cb73c:       rlwinm  r8,r10,16,0,15
     0.00 :	  c02cb740:       add     r10,r10,r8
     0.00 :	  c02cb744:       add     r10,r7,r10
     0.00 :	  c02cb748:       stw     r10,16(r30)
     0.00 :	  c02cb74c:       subf    r9,r27,r9
     0.00 :	  c02cb750:       lwz     r23,44(r1)
     0.00 :	  c02cb754:       lwz     r24,48(r1)
     0.00 :	  c02cb758:       lwz     r28,64(r1)
     0.00 :	  c02cb75c:       b       c02cb4c8 <iov_iter_zero+0x124>
     0.00 :	  c02cb760:       rlwinm  r29,r28,1,0,30
     0.00 :	  c02cb764:       add     r29,r29,r28
     0.00 :	  c02cb768:       rlwinm  r29,r29,2,0,29
     0.00 :	  c02cb76c:       b       c02cb6f0 <iov_iter_zero+0x34c>
     0.00 :	  c02cb770:       mflr    r0
     0.00 :	  c02cb774:       stw     r26,56(r1)
     0.00 :	  c02cb778:       stw     r0,84(r1)
     0.00 :	  c02cb77c:       stw     r28,64(r1)
     0.00 :	  c02cb780:       stw     r29,68(r1)
     0.00 :	  c02cb784:       lwz     r28,12(r30)
     0.00 :	  c02cb788:       lwz     r29,4(r28)
     0.00 :	  c02cb78c:       subf    r29,r31,r29
     0.00 :	  c02cb790:       cmplw   r29,r27
     0.00 :	  c02cb794:       ble     c02cb79c <iov_iter_zero+0x3f8>
     0.00 :	  c02cb798:       mr      r29,r27
     0.00 :	  c02cb79c:       cmpwi   r29,0
     0.00 :	  c02cb7a0:       beq     c02cb8d8 <iov_iter_zero+0x534>
     0.00 :	  c02cb7a4:       lwz     r3,0(r28)
     0.00 :	  c02cb7a8:       mr      r5,r29
     0.00 :	  c02cb7ac:       li      r4,0
     0.00 :	  c02cb7b0:       add     r3,r3,r31
     0.00 :	  c02cb7b4:       subf    r26,r29,r27
     0.00 :	  c02cb7b8:       bl      c001999c <memset>
     0.00 :	  c02cb7bc:       add     r31,r31,r29
     0.00 :	  c02cb7c0:       cmpwi   r26,0
     0.00 :	  c02cb7c4:       bne     c02cb818 <iov_iter_zero+0x474>
     0.00 :	  c02cb7c8:       lwz     r9,4(r28)
     0.00 :	  c02cb7cc:       cmpw    r9,r31
     0.00 :	  c02cb7d0:       bne     c02cb7dc <iov_iter_zero+0x438>
     0.00 :	  c02cb7d4:       addi    r28,r28,8
     0.00 :	  c02cb7d8:       li      r31,0
     0.00 :	  c02cb7dc:       lwz     r9,12(r30)
     0.00 :	  c02cb7e0:       lwz     r8,16(r30)
     0.00 :	  c02cb7e4:       subf    r10,r9,r28
     0.00 :	  c02cb7e8:       stw     r28,12(r30)
     0.00 :	  c02cb7ec:       srawi   r10,r10,3
     0.00 :	  c02cb7f0:       lwz     r9,8(r30)
     0.00 :	  c02cb7f4:       subf    r10,r10,r8
     0.00 :	  c02cb7f8:       stw     r10,16(r30)
     0.00 :	  c02cb7fc:       subf    r9,r27,r9
     0.00 :	  c02cb800:       lwz     r0,84(r1)
     0.00 :	  c02cb804:       lwz     r26,56(r1)
     0.00 :	  c02cb808:       lwz     r28,64(r1)
     0.00 :	  c02cb80c:       mtlr    r0
     0.00 :	  c02cb810:       lwz     r29,68(r1)
     0.00 :	  c02cb814:       b       c02cb4c8 <iov_iter_zero+0x124>
     0.00 :	  c02cb818:       lwz     r31,12(r28)
     0.00 :	  c02cb81c:       addi    r28,r28,8
     0.00 :	  c02cb820:       cmplw   r31,r26
     0.00 :	  c02cb824:       ble     c02cb82c <iov_iter_zero+0x488>
     0.00 :	  c02cb828:       mr      r31,r26
     0.00 :	  c02cb82c:       cmpwi   r31,0
     0.00 :	  c02cb830:       beq     c02cb818 <iov_iter_zero+0x474>
     0.00 :	  c02cb834:       lwz     r3,0(r28)
     0.00 :	  c02cb838:       mr      r5,r31
     0.00 :	  c02cb83c:       li      r4,0
     0.00 :	  c02cb840:       bl      c001999c <memset>
     0.00 :	  c02cb844:       subf.   r26,r31,r26
     0.00 :	  c02cb848:       beq     c02cb7c8 <iov_iter_zero+0x424>
     0.00 :	  c02cb84c:       b       c02cb818 <iov_iter_zero+0x474>
          :	bvec_iter_advance():
     0.00 :	  c02cb850:       lis     r9,-16236
     0.00 :	  c02cb854:       lbz     r10,-20170(r9)
     0.00 :	  c02cb858:       cmpwi   r10,0
     0.00 :	  c02cb85c:       beq     c02cb868 <iov_iter_zero+0x4c4>
          :	iov_iter_zero():
     0.00 :	  c02cb860:       add     r8,r8,r25
     0.00 :	  c02cb864:       b       c02cb6fc <iov_iter_zero+0x358>
          :	bvec_iter_advance():
     0.00 :	  c02cb868:       lis     r3,-16253
     0.00 :	  c02cb86c:       li      r10,1
     0.00 :	  c02cb870:       addi    r3,r3,7580
     0.00 :	  c02cb874:       stb     r10,-20170(r9)
     0.00 :	  c02cb878:       bl      c0029b1c <__warn_printk>
     0.00 :	  c02cb87c:       twui    r0,0
          :	iov_iter_zero():
     0.00 :	  c02cb880:       lwz     r8,12(r30)
     0.00 :	  c02cb884:       add     r8,r8,r25
     0.00 :	  c02cb888:       b       c02cb6fc <iov_iter_zero+0x358>
     0.00 :	  c02cb88c:       add     r31,r31,r27
     0.00 :	  c02cb890:       subf    r9,r27,r9
     0.00 :	  c02cb894:       b       c02cb4c8 <iov_iter_zero+0x124>
     0.00 :	  c02cb898:       mr      r29,r27
     0.00 :	  c02cb89c:       cmpwi   r29,0
     1.65 :	  c02cb8a0:       bne     c02cb8e0 <iov_iter_zero+0x53c>
     0.00 :	  c02cb8a4:       lwz     r9,8(r30)
     0.53 :	  c02cb8a8:       lwz     r7,4(r28)
     0.00 :	  c02cb8ac:       lwz     r10,12(r30)
     0.00 :	  c02cb8b0:       subf    r9,r27,r9
     0.00 :	  c02cb8b4:       b       c02cb498 <iov_iter_zero+0xf4>
     0.25 :	  c02cb8b8:       lwz     r0,84(r1)
     2.26 :	  c02cb8bc:       mtlr    r0
     0.00 :	  c02cb8c0:       b       c02cb89c <iov_iter_zero+0x4f8>
          :	clear_user():
     0.00 :	  c02cb8c4:       lwz     r0,84(r1)
     0.00 :	  c02cb8c8:       li      r27,0
     0.00 :	  c02cb8cc:       mr      r10,r28
     0.00 :	  c02cb8d0:       mtlr    r0
     0.00 :	  c02cb8d4:       b       c02cb498 <iov_iter_zero+0xf4>
          :	iov_iter_zero():
     0.00 :	  c02cb8d8:       mr      r26,r27
     0.00 :	  c02cb8dc:       b       c02cb7c0 <iov_iter_zero+0x41c>
     0.00 :	  c02cb8e0:       stw     r26,56(r1)
     0.00 :	  c02cb8e4:       stw     r25,52(r1)
          :	__access_ok():
     0.00 :	  c02cb8e8:       lis     r25,-16384
          :	iov_iter_zero():
     0.00 :	  c02cb8ec:       lwz     r7,12(r28)
     0.00 :	  c02cb8f0:       addi    r26,r28,8
     0.00 :	  c02cb8f4:       mr      r31,r29
     0.00 :	  c02cb8f8:       cmplw   r29,r7
     0.00 :	  c02cb8fc:       ble     c02cb904 <iov_iter_zero+0x560>
     0.00 :	  c02cb900:       mr      r31,r7
     0.00 :	  c02cb904:       cmpwi   r31,0
     0.00 :	  c02cb908:       beq     c02cba04 <iov_iter_zero+0x660>
     0.00 :	  c02cb90c:       lwz     r3,0(r26)
          :	__access_ok():
     0.00 :	  c02cb910:       cmplw   r3,r25
     0.00 :	  c02cb914:       bge     c02cb980 <iov_iter_zero+0x5dc>
     0.00 :	  c02cb918:       subf    r9,r3,r25
          :	clear_user():
     0.00 :	  c02cb91c:       cmplw   r31,r9
     0.00 :	  c02cb920:       mflr    r0
     0.00 :	  c02cb924:       stw     r0,84(r1)
     0.00 :	  c02cb928:       bgt     c02cb978 <iov_iter_zero+0x5d4>
     0.00 :	  c02cb92c:       mr      r4,r31
     0.00 :	  c02cb930:       bl      c001a41c <__arch_clear_user>
          :	iov_iter_zero():
     0.00 :	  c02cb934:       subf    r29,r31,r29
     0.00 :	  c02cb938:       cmpwi   r3,0
     0.00 :	  c02cb93c:       subf    r31,r3,r31
     0.00 :	  c02cb940:       add     r29,r3,r29
     0.00 :	  c02cb944:       beq     c02cb9cc <iov_iter_zero+0x628>
     0.00 :	  c02cb948:       lwz     r9,8(r30)
     0.00 :	  c02cb94c:       subf    r8,r27,r29
     0.00 :	  c02cb950:       lwz     r0,84(r1)
     0.00 :	  c02cb954:       subf    r27,r29,r27
     0.00 :	  c02cb958:       lwz     r7,12(r28)
     0.00 :	  c02cb95c:       add     r9,r8,r9
     0.00 :	  c02cb960:       mr      r28,r26
     0.00 :	  c02cb964:       lwz     r10,12(r30)
     0.00 :	  c02cb968:       lwz     r25,52(r1)
     0.00 :	  c02cb96c:       mtlr    r0
     0.00 :	  c02cb970:       lwz     r26,56(r1)
     0.00 :	  c02cb974:       b       c02cb498 <iov_iter_zero+0xf4>
     0.00 :	  c02cb978:       lwz     r0,84(r1)
     0.00 :	  c02cb97c:       mtlr    r0
     0.00 :	  c02cb980:       lwz     r9,8(r30)
     0.00 :	  c02cb984:       subf    r8,r27,r29
     0.00 :	  c02cb988:       mr      r28,r26
     0.00 :	  c02cb98c:       lwz     r10,12(r30)
     0.00 :	  c02cb990:       lwz     r25,52(r1)
     0.00 :	  c02cb994:       add     r9,r8,r9
     0.00 :	  c02cb998:       lwz     r26,56(r1)
     0.00 :	  c02cb99c:       subf    r27,r29,r27
     0.00 :	  c02cb9a0:       li      r31,0
     0.00 :	  c02cb9a4:       b       c02cb498 <iov_iter_zero+0xf4>
     0.00 :	  c02cb9a8:       mflr    r0
     0.00 :	  c02cb9ac:       stw     r23,44(r1)
     0.00 :	  c02cb9b0:       stw     r0,84(r1)
     0.00 :	  c02cb9b4:       stw     r24,48(r1)
     0.00 :	  c02cb9b8:       stw     r25,52(r1)
     0.00 :	  c02cb9bc:       stw     r26,56(r1)
     0.00 :	  c02cb9c0:       stw     r28,64(r1)
     0.00 :	  c02cb9c4:       stw     r29,68(r1)
     0.00 :	  c02cb9c8:       bl      c071db48 <__stack_chk_fail>
     0.00 :	  c02cb9cc:       cmpwi   r29,0
     0.00 :	  c02cb9d0:       bne     c02cb9fc <iov_iter_zero+0x658>
     0.00 :	  c02cb9d4:       lwz     r9,8(r30)
     0.00 :	  c02cb9d8:       lwz     r0,84(r1)
     0.00 :	  c02cb9dc:       lwz     r7,12(r28)
     0.00 :	  c02cb9e0:       subf    r9,r27,r9
     0.00 :	  c02cb9e4:       mr      r28,r26
     0.00 :	  c02cb9e8:       lwz     r10,12(r30)
     0.00 :	  c02cb9ec:       lwz     r25,52(r1)
     0.00 :	  c02cb9f0:       mtlr    r0
     0.00 :	  c02cb9f4:       lwz     r26,56(r1)
     0.00 :	  c02cb9f8:       b       c02cb498 <iov_iter_zero+0xf4>
     0.00 :	  c02cb9fc:       lwz     r0,84(r1)
     0.00 :	  c02cba00:       mtlr    r0
     0.00 :	  c02cba04:       mr      r28,r26
     0.00 :	  c02cba08:       b       c02cb8ec <iov_iter_zero+0x548>
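
Reading the hot path back from the listing above: the inlined
__access_ok() compares the start address against a constant
(lis r10,-16384 loads 0xc0000000), and the compare attributed to
clear_user() then checks that the length fits below it. A rough C
rendering of those two compares for a 32-bit address space might look
like the sketch below. The names are hypothetical, and treating
0xc0000000 as the user address limit is an inference from the
disassembly, not something stated in this thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit: the value materialised by "lis r10,-16384". */
#define SK_USER_LIMIT 0xc0000000u

bool sk_access_ok(uint32_t addr, uint32_t size)
{
	/* cmplw r3,r10 ; bge -> reject addresses at or above the limit */
	if (addr >= SK_USER_LIMIT)
		return false;

	/* subf r10,r3,r10 ; cmplw r29,r10 ; bgt -> reject if too long */
	return size <= SK_USER_LIMIT - addr;
}

/* Boundary cases that follow directly from the two compares. */
void sk_access_ok_selfcheck(void)
{
	assert(sk_access_ok(0x1000u, 0x100u));
	assert(!sk_access_ok(SK_USER_LIMIT, 1u));
	assert(sk_access_ok(SK_USER_LIMIT - 1u, 1u));
	assert(!sk_access_ok(SK_USER_LIMIT - 1u, 2u));
}
```

Subtracting before comparing avoids computing addr + size, which could
wrap around in 32 bits.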

Christophe

^ permalink raw reply

* Re: [PATCH 05/10] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS
From: Kees Cook @ 2020-09-01 18:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, linuxppc-dev, the arch/x86 maintainers,
	Linux Kernel Mailing List, Al Viro, linux-fsdevel, Linus Torvalds
In-Reply-To: <20200829092406.GB8833@lst.de>

On Sat, Aug 29, 2020 at 11:24:06AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 27, 2020 at 11:06:28AM -0700, Linus Torvalds wrote:
> > On Thu, Aug 27, 2020 at 8:00 AM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > Once we can't manipulate the address limit, we also can't test what
> > > happens when the manipulation is abused.
> > 
> > Just remove these tests entirely.
> > 
> > Once set_fs() doesn't exist on x86, the tests no longer make any sense
> > what-so-ever, because test coverage will be basically zero.
> > 
> > So don't make the code uglier just to maintain a fiction that
> > something is tested when it isn't really.
> 
> Sure fine with me unless Kees screams.

If we don't have set_fs, we don't need the tests. :)

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 05/10] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS
From: Kees Cook @ 2020-09-01 18:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, linuxppc-dev, the arch/x86 maintainers,
	Linux Kernel Mailing List, Al Viro, linux-fsdevel, Linus Torvalds
In-Reply-To: <20200829092406.GB8833@lst.de>

On Sat, Aug 29, 2020 at 11:24:06AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 27, 2020 at 11:06:28AM -0700, Linus Torvalds wrote:
> > On Thu, Aug 27, 2020 at 8:00 AM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > Once we can't manipulate the address limit, we also can't test what
> > > happens when the manipulation is abused.
> > 
> > Just remove these tests entirely.
> > 
> > Once set_fs() doesn't exist on x86, the tests no longer make any sense
> > what-so-ever, because test coverage will be basically zero.
> > 
> > So don't make the code uglier just to maintain a fiction that
> > something is tested when it isn't really.
> 
> Sure fine with me unless Kees screams.

To clarify: if any of x86, arm64, arm, powerpc, riscv, and s390 are
using set_fs(), I want to keep this test. "ugly" is fine in lkdtm. :)

-- 
Kees Cook

^ permalink raw reply

* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v2
From: Christophe Leroy @ 2020-09-01 19:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, Kees Cook, x86, linuxppc-dev, linux-kernel,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig
In-Reply-To: <20200901172512.GI1236603@ZenIV.linux.org.uk>



On 01/09/2020 at 19:25, Al Viro wrote:
> On Tue, Sep 01, 2020 at 07:13:00PM +0200, Christophe Leroy wrote:
> 
>>      10.92%  dd       [kernel.kallsyms]  [k] iov_iter_zero
> 
> Interesting...  Could you get an instruction-level profile inside iov_iter_zero(),
> along with the disassembly of that sucker?
> 

For comparison, here is the perf annotate output for 5.9-rc2 without
the series:

  Percent |	Source code & Disassembly of vmlinux for cpu-clock (2581 samples)
---------------------------------------------------------------------------------
          :
          :
          :
          :	Disassembly of section .text:
          :
          :	c02cbb80 <iov_iter_zero>:
          :	iov_iter_zero():
     3.22 :	  c02cbb80:       stwu    r1,-80(r1)
     3.25 :	  c02cbb84:       stw     r30,72(r1)
     0.00 :	  c02cbb88:       mr      r30,r4
     2.91 :	  c02cbb8c:       stw     r31,76(r1)
     0.00 :	  c02cbb90:       mr      r31,r3
     0.19 :	  c02cbb94:       stw     r27,60(r1)
          :	iov_iter_type():
     1.82 :	  c02cbb98:       lwz     r10,0(r4)
     0.54 :	  c02cbb9c:       rlwinm  r9,r10,0,0,30
          :	iov_iter_zero():
     1.98 :	  c02cbba0:       cmpwi   r9,32
     0.00 :	  c02cbba4:       lwz     r9,624(r2)
     0.35 :	  c02cbba8:       stw     r9,28(r1)
     0.00 :	  c02cbbac:       li      r9,0
     0.00 :	  c02cbbb0:       beq     c02cbd00 <iov_iter_zero+0x180>
     2.67 :	  c02cbbb4:       lwz     r9,8(r4)
     1.98 :	  c02cbbb8:       cmplw   r9,r3
     0.00 :	  c02cbbbc:       mr      r27,r9
     0.00 :	  c02cbbc0:       bgt     c02cbce8 <iov_iter_zero+0x168>
     0.31 :	  c02cbbc4:       cmpwi   r9,0
     0.00 :	  c02cbbc8:       beq     c02cbcbc <iov_iter_zero+0x13c>
     3.22 :	  c02cbbcc:       andi.   r8,r10,16
     1.70 :	  c02cbbd0:       lwz     r31,4(r30)
     0.00 :	  c02cbbd4:       bne     c02cbe10 <iov_iter_zero+0x290>
     0.31 :	  c02cbbd8:       andi.   r8,r10,8
     0.00 :	  c02cbbdc:       bne     c02cbf64 <iov_iter_zero+0x3e4>
     1.82 :	  c02cbbe0:       andi.   r10,r10,64
     0.00 :	  c02cbbe4:       bne     c02cc080 <iov_iter_zero+0x500>
     0.27 :	  c02cbbe8:       stw     r29,68(r1)
     1.94 :	  c02cbbec:       stw     r28,64(r1)
     1.98 :	  c02cbbf0:       lwz     r28,12(r30)
     0.31 :	  c02cbbf4:       lwz     r7,4(r28)
     2.13 :	  c02cbbf8:       subf    r29,r31,r7
     1.78 :	  c02cbbfc:       cmplw   r29,r27
     0.08 :	  c02cbc00:       bgt     c02cbcf8 <iov_iter_zero+0x178>
    28.24 :	  c02cbc04:       cmpwi   r29,0
     0.00 :	  c02cbc08:       beq     c02cc08c <iov_iter_zero+0x50c>
     2.01 :	  c02cbc0c:       lwz     r3,0(r28)
     3.10 :	  c02cbc10:       lwz     r10,1208(r2)
     0.00 :	  c02cbc14:       add     r3,r3,r31
          :	__access_ok():
     0.00 :	  c02cbc18:       cmplw   r3,r10
     0.00 :	  c02cbc1c:       bgt     c02cbc7c <iov_iter_zero+0xfc>
     3.37 :	  c02cbc20:       subf    r10,r3,r10
     0.00 :	  c02cbc24:       addi    r8,r29,-1
     3.14 :	  c02cbc28:       cmplw   r8,r10
     0.08 :	  c02cbc2c:       mflr    r0
     0.00 :	  c02cbc30:       stw     r0,84(r1)
     0.00 :	  c02cbc34:       bgt     c02cbd40 <iov_iter_zero+0x1c0>
          :	clear_user():
     0.00 :	  c02cbc38:       mr      r4,r29
     2.40 :	  c02cbc3c:       bl      c001a428 <__arch_clear_user>
          :	iov_iter_zero():
     1.55 :	  c02cbc40:       add     r31,r31,r29
     0.00 :	  c02cbc44:       cmpwi   r3,0
     1.94 :	  c02cbc48:       subf    r29,r29,r27
     0.00 :	  c02cbc4c:       subf    r31,r3,r31
     0.00 :	  c02cbc50:       add     r29,r29,r3
     0.00 :	  c02cbc54:       beq     c02cc0ac <iov_iter_zero+0x52c>
     0.00 :	  c02cbc58:       lwz     r9,8(r30)
     0.00 :	  c02cbc5c:       subf    r10,r27,r29
     0.00 :	  c02cbc60:       lwz     r0,84(r1)
     0.00 :	  c02cbc64:       subf    r27,r29,r27
     0.00 :	  c02cbc68:       add     r9,r10,r9
     0.00 :	  c02cbc6c:       lwz     r7,4(r28)
     0.00 :	  c02cbc70:       lwz     r10,12(r30)
     0.00 :	  c02cbc74:       mtlr    r0
     0.00 :	  c02cbc78:       b       c02cbc84 <iov_iter_zero+0x104>
          :	__access_ok():
     0.00 :	  c02cbc7c:       li      r27,0
     0.00 :	  c02cbc80:       mr      r10,r28
          :	iov_iter_zero():
     0.00 :	  c02cbc84:       cmplw   r31,r7
     0.00 :	  c02cbc88:       bne     c02cbc94 <iov_iter_zero+0x114>
     0.93 :	  c02cbc8c:       addi    r28,r28,8
     0.00 :	  c02cbc90:       li      r31,0
     1.28 :	  c02cbc94:       lwz     r8,16(r30)
     0.00 :	  c02cbc98:       subf    r10,r10,r28
     1.05 :	  c02cbc9c:       srawi   r10,r10,3
     0.00 :	  c02cbca0:       stw     r28,12(r30)
     0.00 :	  c02cbca4:       subf    r10,r10,r8
     0.93 :	  c02cbca8:       stw     r10,16(r30)
     0.04 :	  c02cbcac:       lwz     r28,64(r1)
     0.00 :	  c02cbcb0:       lwz     r29,68(r1)
     1.05 :	  c02cbcb4:       stw     r9,8(r30)
     0.00 :	  c02cbcb8:       stw     r31,4(r30)
     1.39 :	  c02cbcbc:       lwz     r9,28(r1)
     0.00 :	  c02cbcc0:       lwz     r10,624(r2)
     1.08 :	  c02cbcc4:       xor.    r9,r9,r10
     0.00 :	  c02cbcc8:       li      r10,0
     0.00 :	  c02cbccc:       bne     c02cc180 <iov_iter_zero+0x600>
     1.08 :	  c02cbcd0:       mr      r3,r27
     0.00 :	  c02cbcd4:       lwz     r30,72(r1)
     0.08 :	  c02cbcd8:       lwz     r27,60(r1)
     1.01 :	  c02cbcdc:       lwz     r31,76(r1)
     0.00 :	  c02cbce0:       addi    r1,r1,80
     0.04 :	  c02cbce4:       blr
     0.00 :	  c02cbce8:       cmpwi   r9,0
     0.00 :	  c02cbcec:       mr      r27,r3
     0.00 :	  c02cbcf0:       beq     c02cbcbc <iov_iter_zero+0x13c>
     0.00 :	  c02cbcf4:       b       c02cbbcc <iov_iter_zero+0x4c>
     0.00 :	  c02cbcf8:       mr      r29,r27
     0.00 :	  c02cbcfc:       b       c02cbc04 <iov_iter_zero+0x84>
          :	pipe_zero():
     0.00 :	  c02cbd00:       mflr    r0
     0.00 :	  c02cbd04:       stw     r26,56(r1)
     0.00 :	  c02cbd08:       stw     r0,84(r1)
     0.00 :	  c02cbd0c:       mr      r3,r4
     0.00 :	  c02cbd10:       stw     r28,64(r1)
     0.00 :	  c02cbd14:       lwz     r28,12(r4)
     0.00 :	  c02cbd18:       lwz     r26,40(r28)
     0.00 :	  c02cbd1c:       bl      c02c95d0 <sanity>
     0.00 :	  c02cbd20:       cmpwi   r3,0
     0.00 :	  c02cbd24:       bne     c02cbd54 <iov_iter_zero+0x1d4>
     0.00 :	  c02cbd28:       lwz     r0,84(r1)
     0.00 :	  c02cbd2c:       li      r27,0
     0.00 :	  c02cbd30:       lwz     r26,56(r1)
     0.00 :	  c02cbd34:       lwz     r28,64(r1)
     0.00 :	  c02cbd38:       mtlr    r0
     0.00 :	  c02cbd3c:       b       c02cbcbc <iov_iter_zero+0x13c>
          :	__access_ok():
     0.00 :	  c02cbd40:       lwz     r0,84(r1)
     0.00 :	  c02cbd44:       li      r27,0
     0.00 :	  c02cbd48:       mr      r10,r28
     0.00 :	  c02cbd4c:       mtlr    r0
     0.00 :	  c02cbd50:       b       c02cbc84 <iov_iter_zero+0x104>
          :	pipe_zero():
     0.00 :	  c02cbd54:       mr      r4,r31
     0.00 :	  c02cbd58:       addi    r6,r1,24
     0.00 :	  c02cbd5c:       addi    r5,r1,20
     0.00 :	  c02cbd60:       mr      r3,r30
     0.00 :	  c02cbd64:       bl      c02c97ac <push_pipe>
     0.00 :	  c02cbd68:       mr.     r27,r3
     0.00 :	  c02cbd6c:       beq     c02cbd28 <iov_iter_zero+0x1a8>
     0.00 :	  c02cbd70:       lwz     r4,24(r1)
     0.00 :	  c02cbd74:       addi    r26,r26,-1
     0.00 :	  c02cbd78:       lwz     r9,20(r1)
     0.00 :	  c02cbd7c:       stw     r25,52(r1)
     0.00 :	  c02cbd80:       li      r25,0
     0.00 :	  c02cbd84:       stw     r29,68(r1)
     0.00 :	  c02cbd88:       mr      r29,r27
     0.00 :	  c02cbd8c:       subfic  r31,r4,4096
     0.00 :	  c02cbd90:       cmplw   r31,r29
     0.00 :	  c02cbd94:       ble     c02cbd9c <iov_iter_zero+0x21c>
     0.00 :	  c02cbd98:       mr      r31,r29
     0.00 :	  c02cbd9c:       and     r9,r26,r9
     0.00 :	  c02cbda0:       lwz     r8,80(r28)
     0.00 :	  c02cbda4:       rlwinm  r10,r9,1,0,30
     0.00 :	  c02cbda8:       add     r9,r10,r9
     0.00 :	  c02cbdac:       rlwinm  r9,r9,3,0,28
     0.00 :	  c02cbdb0:       lwzx    r3,r8,r9
     0.00 :	  c02cbdb4:       mr      r5,r31
     0.00 :	  c02cbdb8:       bl      c02c99d0 <memzero_page>
     0.00 :	  c02cbdbc:       subf.   r29,r31,r29
     0.00 :	  c02cbdc0:       lwz     r9,20(r1)
     0.00 :	  c02cbdc4:       li      r4,0
     0.00 :	  c02cbdc8:       lwz     r10,24(r1)
     0.00 :	  c02cbdcc:       stw     r9,16(r30)
     0.00 :	  c02cbdd0:       addi    r9,r9,1
     0.00 :	  c02cbdd4:       add     r10,r10,r31
     0.00 :	  c02cbdd8:       stw     r9,20(r1)
     0.00 :	  c02cbddc:       stw     r10,4(r30)
     0.00 :	  c02cbde0:       stw     r25,24(r1)
     0.00 :	  c02cbde4:       bne     c02cbd8c <iov_iter_zero+0x20c>
     0.00 :	  c02cbde8:       lwz     r9,8(r30)
     0.00 :	  c02cbdec:       subf    r9,r27,r9
     0.00 :	  c02cbdf0:       stw     r9,8(r30)
          :	iov_iter_zero():
     0.00 :	  c02cbdf4:       lwz     r0,84(r1)
     0.00 :	  c02cbdf8:       lwz     r25,52(r1)
     0.00 :	  c02cbdfc:       lwz     r26,56(r1)
     0.00 :	  c02cbe00:       mtlr    r0
     0.00 :	  c02cbe04:       lwz     r28,64(r1)
     0.00 :	  c02cbe08:       lwz     r29,68(r1)
     0.00 :	  c02cbe0c:       b       c02cbcbc <iov_iter_zero+0x13c>
     0.00 :	  c02cbe10:       stw     r23,44(r1)
     0.00 :	  c02cbe14:       cmpwi   r27,0
     0.00 :	  c02cbe18:       stw     r28,64(r1)
     0.00 :	  c02cbe1c:       mr      r23,r27
     0.00 :	  c02cbe20:       stw     r24,48(r1)
     0.00 :	  c02cbe24:       li      r28,0
     0.00 :	  c02cbe28:       lwz     r24,12(r30)
     0.00 :	  c02cbe2c:       mr      r8,r24
     0.00 :	  c02cbe30:       beq     c02cbf08 <iov_iter_zero+0x388>
     0.00 :	  c02cbe34:       mflr    r0
     0.00 :	  c02cbe38:       stw     r25,52(r1)
     0.00 :	  c02cbe3c:       stw     r0,84(r1)
     0.00 :	  c02cbe40:       stw     r26,56(r1)
     0.00 :	  c02cbe44:       stw     r29,68(r1)
     0.00 :	  c02cbe48:       rlwinm  r25,r28,1,0,30
     0.00 :	  c02cbe4c:       add     r25,r25,r28
     0.00 :	  c02cbe50:       rlwinm  r25,r25,2,0,29
     0.00 :	  c02cbe54:       add     r10,r8,r25
     0.00 :	  c02cbe58:       lwz     r26,4(r10)
     0.00 :	  c02cbe5c:       mr      r29,r25
     0.00 :	  c02cbe60:       lwz     r9,8(r10)
     0.00 :	  c02cbe64:       subf    r26,r31,r26
     0.00 :	  c02cbe68:       cmplw   r26,r23
     0.00 :	  c02cbe6c:       add     r9,r31,r9
     0.00 :	  c02cbe70:       clrlwi  r4,r9,20
     0.00 :	  c02cbe74:       ble     c02cbe7c <iov_iter_zero+0x2fc>
     0.00 :	  c02cbe78:       mr      r26,r23
     0.00 :	  c02cbe7c:       subfic  r7,r4,4096
     0.00 :	  c02cbe80:       cmplw   r26,r7
     0.00 :	  c02cbe84:       ble     c02cbe8c <iov_iter_zero+0x30c>
     0.00 :	  c02cbe88:       mr      r26,r7
     0.00 :	  c02cbe8c:       cmpwi   r26,0
     0.00 :	  c02cbe90:       beq     c02cbeb4 <iov_iter_zero+0x334>
     0.00 :	  c02cbe94:       lwz     r3,0(r10)
     0.00 :	  c02cbe98:       rlwinm  r9,r9,25,7,26
     0.00 :	  c02cbe9c:       mr      r5,r26
     0.00 :	  c02cbea0:       add     r3,r3,r9
     0.00 :	  c02cbea4:       bl      c02c99d0 <memzero_page>
          :	bvec_iter_advance():
     0.00 :	  c02cbea8:       cmplw   r23,r26
          :	iov_iter_zero():
     0.00 :	  c02cbeac:       lwz     r8,12(r30)
          :	bvec_iter_advance():
     0.00 :	  c02cbeb0:       blt     c02cc044 <iov_iter_zero+0x4c4>
     0.00 :	  c02cbeb4:       add.    r31,r31,r26
     0.00 :	  c02cbeb8:       subf    r23,r26,r23
     0.00 :	  c02cbebc:       addi    r10,r8,4
     0.00 :	  c02cbec0:       bne     c02cbed8 <iov_iter_zero+0x358>
     0.00 :	  c02cbec4:       b       c02cbee4 <iov_iter_zero+0x364>
     0.00 :	  c02cbec8:       subf.   r31,r9,r31
     0.00 :	  c02cbecc:       addi    r28,r28,1
     0.00 :	  c02cbed0:       addi    r29,r29,12
     0.00 :	  c02cbed4:       beq     c02cbf54 <iov_iter_zero+0x3d4>
     0.00 :	  c02cbed8:       lwzx    r9,r10,r29
     0.00 :	  c02cbedc:       cmplw   r31,r9
     0.00 :	  c02cbee0:       bge     c02cbec8 <iov_iter_zero+0x348>
          :	iov_iter_zero():
     0.00 :	  c02cbee4:       cmpwi   r23,0
     0.00 :	  c02cbee8:       bne     c02cbe48 <iov_iter_zero+0x2c8>
     0.00 :	  c02cbeec:       add     r8,r8,r29
     0.00 :	  c02cbef0:       lwz     r0,84(r1)
     0.00 :	  c02cbef4:       lwz     r9,8(r30)
     0.00 :	  c02cbef8:       lwz     r25,52(r1)
     0.00 :	  c02cbefc:       mtlr    r0
     0.00 :	  c02cbf00:       lwz     r26,56(r1)
     0.00 :	  c02cbf04:       lwz     r29,68(r1)
     0.00 :	  c02cbf08:       subf    r24,r24,r8
     0.00 :	  c02cbf0c:       stw     r8,12(r30)
     0.00 :	  c02cbf10:       srawi   r6,r24,2
     0.00 :	  c02cbf14:       lwz     r7,16(r30)
     0.00 :	  c02cbf18:       rlwinm  r10,r24,0,0,29
     0.00 :	  c02cbf1c:       add     r10,r10,r6
     0.00 :	  c02cbf20:       rlwinm  r8,r10,4,0,27
     0.00 :	  c02cbf24:       add     r10,r10,r8
     0.00 :	  c02cbf28:       rlwinm  r8,r10,8,0,23
     0.00 :	  c02cbf2c:       add     r10,r10,r8
     0.00 :	  c02cbf30:       rlwinm  r8,r10,16,0,15
     0.00 :	  c02cbf34:       add     r10,r10,r8
     0.00 :	  c02cbf38:       add     r10,r7,r10
     0.00 :	  c02cbf3c:       stw     r10,16(r30)
     0.00 :	  c02cbf40:       subf    r9,r27,r9
     0.00 :	  c02cbf44:       lwz     r23,44(r1)
     0.00 :	  c02cbf48:       lwz     r24,48(r1)
     0.00 :	  c02cbf4c:       lwz     r28,64(r1)
     0.00 :	  c02cbf50:       b       c02cbcb4 <iov_iter_zero+0x134>
     0.00 :	  c02cbf54:       rlwinm  r29,r28,1,0,30
     0.00 :	  c02cbf58:       add     r29,r29,r28
     0.00 :	  c02cbf5c:       rlwinm  r29,r29,2,0,29
     0.00 :	  c02cbf60:       b       c02cbee4 <iov_iter_zero+0x364>
     0.00 :	  c02cbf64:       mflr    r0
     0.00 :	  c02cbf68:       stw     r26,56(r1)
     0.00 :	  c02cbf6c:       stw     r0,84(r1)
     0.00 :	  c02cbf70:       stw     r28,64(r1)
     0.00 :	  c02cbf74:       stw     r29,68(r1)
     0.00 :	  c02cbf78:       lwz     r28,12(r30)
     0.00 :	  c02cbf7c:       lwz     r29,4(r28)
     0.00 :	  c02cbf80:       subf    r29,r31,r29
     0.00 :	  c02cbf84:       cmplw   r29,r27
     0.00 :	  c02cbf88:       ble     c02cbf90 <iov_iter_zero+0x410>
     0.00 :	  c02cbf8c:       mr      r29,r27
     0.00 :	  c02cbf90:       cmpwi   r29,0
     0.00 :	  c02cbf94:       beq     c02cc0b8 <iov_iter_zero+0x538>
     0.00 :	  c02cbf98:       lwz     r3,0(r28)
     0.00 :	  c02cbf9c:       mr      r5,r29
     0.00 :	  c02cbfa0:       li      r4,0
     0.00 :	  c02cbfa4:       add     r3,r3,r31
     0.00 :	  c02cbfa8:       subf    r26,r29,r27
     0.00 :	  c02cbfac:       bl      c001999c <memset>
     0.00 :	  c02cbfb0:       add     r31,r31,r29
     0.00 :	  c02cbfb4:       cmpwi   r26,0
     0.00 :	  c02cbfb8:       bne     c02cc00c <iov_iter_zero+0x48c>
     0.00 :	  c02cbfbc:       lwz     r9,4(r28)
     0.00 :	  c02cbfc0:       cmpw    r9,r31
     0.00 :	  c02cbfc4:       bne     c02cbfd0 <iov_iter_zero+0x450>
     0.00 :	  c02cbfc8:       addi    r28,r28,8
     0.00 :	  c02cbfcc:       li      r31,0
     0.00 :	  c02cbfd0:       lwz     r9,12(r30)
     0.00 :	  c02cbfd4:       lwz     r8,16(r30)
     0.00 :	  c02cbfd8:       subf    r10,r9,r28
     0.00 :	  c02cbfdc:       stw     r28,12(r30)
     0.00 :	  c02cbfe0:       srawi   r10,r10,3
     0.00 :	  c02cbfe4:       lwz     r9,8(r30)
     0.00 :	  c02cbfe8:       subf    r10,r10,r8
     0.00 :	  c02cbfec:       stw     r10,16(r30)
     0.00 :	  c02cbff0:       subf    r9,r27,r9
     0.00 :	  c02cbff4:       lwz     r0,84(r1)
     0.00 :	  c02cbff8:       lwz     r26,56(r1)
     0.00 :	  c02cbffc:       lwz     r28,64(r1)
     0.00 :	  c02cc000:       mtlr    r0
     0.00 :	  c02cc004:       lwz     r29,68(r1)
     0.00 :	  c02cc008:       b       c02cbcb4 <iov_iter_zero+0x134>
     0.00 :	  c02cc00c:       lwz     r31,12(r28)
     0.00 :	  c02cc010:       addi    r28,r28,8
     0.00 :	  c02cc014:       cmplw   r31,r26
     0.00 :	  c02cc018:       ble     c02cc020 <iov_iter_zero+0x4a0>
     0.00 :	  c02cc01c:       mr      r31,r26
     0.00 :	  c02cc020:       cmpwi   r31,0
     0.00 :	  c02cc024:       beq     c02cc00c <iov_iter_zero+0x48c>
     0.00 :	  c02cc028:       lwz     r3,0(r28)
     0.00 :	  c02cc02c:       mr      r5,r31
     0.00 :	  c02cc030:       li      r4,0
     0.00 :	  c02cc034:       bl      c001999c <memset>
     0.00 :	  c02cc038:       subf.   r26,r31,r26
     0.00 :	  c02cc03c:       beq     c02cbfbc <iov_iter_zero+0x43c>
     0.00 :	  c02cc040:       b       c02cc00c <iov_iter_zero+0x48c>
          :	bvec_iter_advance():
     0.00 :	  c02cc044:       lis     r9,-16236
     0.00 :	  c02cc048:       lbz     r10,-20202(r9)
     0.00 :	  c02cc04c:       cmpwi   r10,0
     0.00 :	  c02cc050:       beq     c02cc05c <iov_iter_zero+0x4dc>
          :	iov_iter_zero():
     0.00 :	  c02cc054:       add     r8,r8,r25
     0.00 :	  c02cc058:       b       c02cbef0 <iov_iter_zero+0x370>
          :	bvec_iter_advance():
     0.00 :	  c02cc05c:       lis     r3,-16253
     0.00 :	  c02cc060:       li      r10,1
     0.00 :	  c02cc064:       addi    r3,r3,7692
     0.00 :	  c02cc068:       stb     r10,-20202(r9)
     0.00 :	  c02cc06c:       bl      c0029bc0 <__warn_printk>
     0.00 :	  c02cc070:       twui    r0,0
          :	iov_iter_zero():
     0.00 :	  c02cc074:       lwz     r8,12(r30)
     0.00 :	  c02cc078:       add     r8,r8,r25
     0.00 :	  c02cc07c:       b       c02cbef0 <iov_iter_zero+0x370>
     0.00 :	  c02cc080:       add     r31,r31,r27
     0.00 :	  c02cc084:       subf    r9,r27,r9
     0.00 :	  c02cc088:       b       c02cbcb4 <iov_iter_zero+0x134>
     0.00 :	  c02cc08c:       mr      r29,r27
     0.00 :	  c02cc090:       cmpwi   r29,0
     0.00 :	  c02cc094:       bne     c02cc0c0 <iov_iter_zero+0x540>
     1.51 :	  c02cc098:       lwz     r9,8(r30)
     0.00 :	  c02cc09c:       lwz     r7,4(r28)
     0.00 :	  c02cc0a0:       lwz     r10,12(r30)
     0.00 :	  c02cc0a4:       subf    r9,r27,r9
     0.00 :	  c02cc0a8:       b       c02cbc84 <iov_iter_zero+0x104>
     1.47 :	  c02cc0ac:       lwz     r0,84(r1)
     6.47 :	  c02cc0b0:       mtlr    r0
     0.00 :	  c02cc0b4:       b       c02cc090 <iov_iter_zero+0x510>
     0.00 :	  c02cc0b8:       mr      r26,r27
     0.00 :	  c02cc0bc:       b       c02cbfb4 <iov_iter_zero+0x434>
     0.00 :	  c02cc0c0:       stw     r26,56(r1)
     0.00 :	  c02cc0c4:       lwz     r7,12(r28)
     0.00 :	  c02cc0c8:       addi    r26,r28,8
     0.00 :	  c02cc0cc:       mr      r31,r29
     0.00 :	  c02cc0d0:       cmplw   r29,r7
     0.00 :	  c02cc0d4:       ble     c02cc0dc <iov_iter_zero+0x55c>
     0.00 :	  c02cc0d8:       mr      r31,r7
     0.00 :	  c02cc0dc:       cmpwi   r31,0
     0.00 :	  c02cc0e0:       beq     c02cc1d8 <iov_iter_zero+0x658>
     0.00 :	  c02cc0e4:       lwz     r3,0(r26)
          :	clear_user():
     0.00 :	  c02cc0e8:       lwz     r9,1208(r2)
          :	__access_ok():
     0.00 :	  c02cc0ec:       cmplw   r3,r9
     0.00 :	  c02cc0f0:       bgt     c02cc114 <iov_iter_zero+0x594>
     0.00 :	  c02cc0f4:       subf    r9,r3,r9
     0.00 :	  c02cc0f8:       addi    r10,r31,-1
     0.00 :	  c02cc0fc:       cmplw   r10,r9
     0.00 :	  c02cc100:       mflr    r0
     0.00 :	  c02cc104:       stw     r0,84(r1)
     0.00 :	  c02cc108:       ble     c02cc138 <iov_iter_zero+0x5b8>
     0.00 :	  c02cc10c:       lwz     r0,84(r1)
     0.00 :	  c02cc110:       mtlr    r0
          :	iov_iter_zero():
     0.00 :	  c02cc114:       lwz     r9,8(r30)
     0.00 :	  c02cc118:       subf    r8,r27,r29
     0.00 :	  c02cc11c:       mr      r28,r26
     0.00 :	  c02cc120:       lwz     r10,12(r30)
     0.00 :	  c02cc124:       lwz     r26,56(r1)
     0.00 :	  c02cc128:       add     r9,r8,r9
     0.00 :	  c02cc12c:       subf    r27,r29,r27
     0.00 :	  c02cc130:       li      r31,0
     0.00 :	  c02cc134:       b       c02cbc84 <iov_iter_zero+0x104>
          :	clear_user():
     0.00 :	  c02cc138:       mr      r4,r31
     0.00 :	  c02cc13c:       bl      c001a428 <__arch_clear_user>
          :	iov_iter_zero():
     0.00 :	  c02cc140:       subf    r29,r31,r29
     0.00 :	  c02cc144:       cmpwi   r3,0
     0.00 :	  c02cc148:       subf    r31,r3,r31
     0.00 :	  c02cc14c:       add     r29,r3,r29
     0.00 :	  c02cc150:       beq     c02cc1a4 <iov_iter_zero+0x624>
     0.00 :	  c02cc154:       lwz     r9,8(r30)
     0.00 :	  c02cc158:       subf    r8,r27,r29
     0.00 :	  c02cc15c:       lwz     r0,84(r1)
     0.00 :	  c02cc160:       subf    r27,r29,r27
     0.00 :	  c02cc164:       lwz     r7,12(r28)
     0.00 :	  c02cc168:       add     r9,r8,r9
     0.00 :	  c02cc16c:       mr      r28,r26
     0.00 :	  c02cc170:       lwz     r10,12(r30)
     0.00 :	  c02cc174:       lwz     r26,56(r1)
     0.00 :	  c02cc178:       mtlr    r0
     0.00 :	  c02cc17c:       b       c02cbc84 <iov_iter_zero+0x104>
     0.00 :	  c02cc180:       mflr    r0
     0.00 :	  c02cc184:       stw     r23,44(r1)
     0.00 :	  c02cc188:       stw     r0,84(r1)
     0.00 :	  c02cc18c:       stw     r24,48(r1)
     0.00 :	  c02cc190:       stw     r25,52(r1)
     0.00 :	  c02cc194:       stw     r26,56(r1)
     0.00 :	  c02cc198:       stw     r28,64(r1)
     0.00 :	  c02cc19c:       stw     r29,68(r1)
     0.00 :	  c02cc1a0:       bl      c071e2b0 <__stack_chk_fail>
     0.00 :	  c02cc1a4:       cmpwi   r29,0
     0.00 :	  c02cc1a8:       bne     c02cc1d0 <iov_iter_zero+0x650>
     0.00 :	  c02cc1ac:       lwz     r9,8(r30)
     0.00 :	  c02cc1b0:       lwz     r0,84(r1)
     0.00 :	  c02cc1b4:       lwz     r7,12(r28)
     0.00 :	  c02cc1b8:       subf    r9,r27,r9
     0.00 :	  c02cc1bc:       mr      r28,r26
     0.00 :	  c02cc1c0:       lwz     r10,12(r30)
     0.00 :	  c02cc1c4:       lwz     r26,56(r1)
     0.00 :	  c02cc1c8:       mtlr    r0
     0.00 :	  c02cc1cc:       b       c02cbc84 <iov_iter_zero+0x104>
     0.00 :	  c02cc1d0:       lwz     r0,84(r1)
     0.00 :	  c02cc1d4:       mtlr    r0
     0.00 :	  c02cc1d8:       mr      r28,r26
     0.00 :	  c02cc1dc:       b       c02cc0c4 <iov_iter_zero+0x544>


Christophe

^ permalink raw reply

* Re: [RESEND][PATCH 1/7] powerpc/iommu: Avoid overflow at boundary_size
From: Nicolin Chen @ 2020-09-01 20:53 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-ia64, James.Bottomley, paulus, hpa, sparclinux, hch, sfr,
	deller, x86, borntraeger, mingo, mattst88, fenghua.yu, gor,
	schnelle, hca, ink, tglx, gerald.schaefer, rth, tony.luck,
	linux-parisc, linux-s390, linux-kernel, linux-alpha, bp,
	linuxppc-dev, davem
In-Reply-To: <87lfht1vav.fsf@mpe.ellerman.id.au>

On Tue, Sep 01, 2020 at 11:27:36PM +1000, Michael Ellerman wrote:
> Nicolin Chen <nicoleotsuka@gmail.com> writes:
> > The boundary_size might be as large as ULONG_MAX, which means
> > that a device has no specific boundary limit. So either "+ 1"
> > or passing it to ALIGN() would potentially overflow.
> >
> > According to kernel defines:
> >     #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
> >     #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)
> >
> > We can simplify the logic here:
> >   ALIGN(boundary + 1, 1 << shift) >> shift
> > = ALIGN_MASK(b + 1, (1 << s) - 1) >> s
> > = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
> > = [b + 1 + (1 << s) - 1] >> s
> > = [b + (1 << s)] >> s
> > = (b >> s) + 1
> >
> > So fixing a potential overflow with the safer shortcut.
> >
> > Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> > Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > ---
> >  arch/powerpc/kernel/iommu.c | 11 +++++------
> >  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> Are you asking for acks, or for maintainers to merge the patches
> individually?

I was expecting that, but Christoph just suggested that I squash them
into one so that he could merge it: https://lkml.org/lkml/2020/9/1/159

Though I feel it'd be nice to get maintainers' acks before merging.

> > diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> > index 9704f3f76e63..c01ccbf8afdd 100644
> > --- a/arch/powerpc/kernel/iommu.c
> > +++ b/arch/powerpc/kernel/iommu.c
> > @@ -236,15 +236,14 @@ static unsigned long iommu_range_alloc(struct device *dev,
> >  		}
> >  	}
> >  
> > -	if (dev)
> > -		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> > -				      1 << tbl->it_page_shift);
> > -	else
> > -		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
> >  	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
> > +	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> 
> Is there any path that passes a NULL dev anymore?
> 
> Both iseries_hv_alloc() and iseries_hv_map() were removed years ago.
> See:
>   8ee3e0d69623 ("powerpc: Remove the main legacy iSerie platform code")
> 
> 
> So maybe we should do a lead-up patch that drops the NULL dev support,
> which will then make this patch simpler.

The next version of this change will follow Christoph's suggestion
by having a helper function that takes care of !dev internally.

Thanks
Nic

> 
> 
> > +	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> > +	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
> >  
> >  	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
> > -			     boundary_size >> tbl->it_page_shift, align_mask);
> > +			     boundary_size, align_mask);
> >  	if (n == -1) {
> >  		if (likely(pass == 0)) {
> >  			/* First try the pool from the start */
> > -- 
> > 2.17.1

^ permalink raw reply

* Re: [PATCH v1 01/10] powerpc/pseries/iommu: Replace hard-coded page shift
From: Leonardo Bras @ 2020-09-01 21:38 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Oliver O'Halloran
  Cc: Christophe Leroy, David Dai, Ram Pai, Linux Kernel Mailing List,
	Murilo Fossa Vicentini, Paul Mackerras, Joel Stanley, Brian King,
	linuxppc-dev, Thiago Jung Bauermann
In-Reply-To: <1bba12c6-f1ec-9f1e-1d3e-c1efa5ceb7c7@ozlabs.ru>

On Mon, 2020-08-31 at 13:48 +1000, Alexey Kardashevskiy wrote:
> > > > Well, I created this TCE_RPN_BITS = 52 because the previous mask was a
> > > > hardcoded 40-bit mask (0xfffffffffful), for hard-coded 12-bit (4k)
> > > > pagesize, and on PAPR+/LoPAR also defines TCE as having bits 0-51
> > > > described as RPN, as described before.
> > > > 
> > > > IODA3 Revision 3.0_prd1 (OpenPowerFoundation), Figure 3.4 and 3.5.
> > > > shows system memory mapping into a TCE, and the TCE also has bits 0-51
> > > > for the RPN (52 bits). "Table 3.6. TCE Definition" also shows it.
> > > > In fact, by the looks of those figures, the RPN_MASK should always be a
> > > > 52-bit mask, and RPN = (page >> tceshift) & RPN_MASK.
> > > 
> > > I suspect the mask is there in the first place for extra protection
> > > against too big addresses going to the TCE table (or/and for virtual vs
> > > physical addresses). Using 52bit mask makes no sense for anything, you
> > > could just drop the mask and let c compiler deal with 64bit "uint" as it
> > > is basically a 4K page address anywhere in the 64bit space. Thanks,
> > 
> > Assuming 4K pages you need 52 RPN bits to cover the whole 64bit
> > physical address space. The IODA3 spec does explicitly say the upper
> > bits are optional and the implementation only needs to support enough
> > to cover up to the physical address limit, which is 56bits of P9 /
> > PHB4. If you want to validate that the address will fit inside of
> > MAX_PHYSMEM_BITS then fine, but I think that should be done as a
> > WARN_ON or similar rather than just silently masking off the bits.
> 
> We can do this and probably should anyway but I am also pretty sure we
> can just ditch the mask and have the hypervisor return an error which
> will show up in dmesg.

Ok then, ditching the mask.
Thanks!
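
To make the trade-off discussed above concrete, here is a rough userspace
sketch, not code from this series. It contrasts silently masking the RPN
with rejecting an address that does not fit. The names tce_rpn_masked and
tce_rpn_checked and the max_phys_bits parameter are hypothetical; in the
kernel the check would more likely be a WARN_ON against MAX_PHYSMEM_BITS,
or be left to the hypervisor as Alexey suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Silent variant: address bits above the mask are dropped without notice. */
static uint64_t tce_rpn_masked(uint64_t phys_addr, unsigned int tceshift,
			       uint64_t rpn_mask)
{
	return (phys_addr >> tceshift) & rpn_mask;
}

/*
 * Checked variant (hypothetical): refuse addresses that need more than
 * max_phys_bits bits. max_phys_bits is assumed to be in the range 1..63.
 */
static bool tce_rpn_checked(uint64_t phys_addr, unsigned int tceshift,
			    unsigned int max_phys_bits, uint64_t *rpn)
{
	if (phys_addr >> max_phys_bits)
		return false;	/* caller can warn instead of writing a bad TCE */
	*rpn = phys_addr >> tceshift;
	return true;
}
```

With a 40-bit mask and a 4K page shift, the masked variant maps an address
with bit 53 set onto a low page number, while the checked variant reports
the problem to its caller.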


^ permalink raw reply

* Re: [PATCH] soc: fsl: Remove bogus packed attributes from qman.h
From: Li Yang @ 2020-09-01 21:40 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linuxppc-dev@lists.ozlabs.org, Linux Kernel Mailing List,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <20200901015630.GA9065@gondor.apana.org.au>

On Mon, Aug 31, 2020 at 8:57 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Sep 01, 2020 at 01:50:38AM +0000, Leo Li wrote:
> >
> > Sorry for the late response.  I missed this email previously.
> >
> > These structures are descriptors used by hardware, we cannot have _ANY_ padding from the compiler.  The compiled result might be the same with or without the __packed attribute for now, but I think keeping it there is probably safer for dealing with unexpected alignment requirements from the compiler in the future.
> >
> > Having conflicting alignment requirements warning might mean something is wrong with the structure in certain scenarios.  I just tried an ARM64 build but didn't see the warnings.  Could you share the warning you got and the build setup?  Thanks.
>
> Just do a COMPILE_TEST build on x86-64:
>
> In file included from ../drivers/crypto/caam/qi.c:12:

Looks like the CAAM driver and the dependent QBMAN driver don't support
COMPILE_TEST yet.  Are you trying to add support for it?

I changed the Kconfig to enable COMPILE_TEST anyway and updated my
toolchain to gcc-10 to try to reproduce the issue.  The warnings can
only be reproduced with "W=1".

> ../include/soc/fsl/qman.h:259:1: warning: alignment 1 of ‘struct qm_dqrr_entry’ is less than 8 [-Wpacked-not-aligned]
>  } __packed;
>  ^
> ../include/soc/fsl/qman.h:292:2: warning: alignment 1 of ‘struct <anonymous>’ is less than 8 [-Wpacked-not-aligned]
>   } __packed ern;
>   ^

I think this is a valid concern: if the parent structure doesn't
meet certain alignment requirements, the alignment of the
sub-structure cannot be guaranteed.  If we just remove the __packed
attribute from the parent structure, the compiler could try to add
padding in the parent structure to fulfill the alignment requirements
of the sub-structure, which is not good.  I think the following changes
are a better fix for the warnings:

diff --git a/include/soc/fsl/qman.h b/include/soc/fsl/qman.h
index cfe00e08e85b..9f484113cfda 100644
--- a/include/soc/fsl/qman.h
+++ b/include/soc/fsl/qman.h
@@ -256,7 +256,7 @@ struct qm_dqrr_entry {
        __be32 context_b;
        struct qm_fd fd;
        u8 __reserved4[32];
-} __packed;
+} __packed __aligned(64);
 #define QM_DQRR_VERB_VBIT              0x80
 #define QM_DQRR_VERB_MASK              0x7f    /* where the verb contains; */
 #define QM_DQRR_VERB_FRAME_DEQUEUE     0x60    /* "this format" */
@@ -289,7 +289,7 @@ union qm_mr_entry {
                __be32 tag;
                struct qm_fd fd;
                u8 __reserved1[32];
-       } __packed ern;
+       } __packed __aligned(64) ern;
        struct {
                u8 verb;
                u8 fqs;         /* Frame Queue Status */
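
As a rough illustration of why combining the two attributes helps, here is a
small standalone sketch. It does not use the real qman.h definitions: the
demo_* structures and their fields are hypothetical stand-ins, and the
attribute spellings assume GCC or Clang with C11. It suggests that packed
alone keeps the 64-byte layout but lowers the type's alignment to 1, while
packed plus aligned(64) keeps the same layout and restores the alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte payload standing in for a frame descriptor. */
struct demo_fd {
	uint64_t addr;
	uint32_t len;
	uint32_t cmd;
};

/* Packed only: no padding, but the type's alignment drops to 1. */
struct demo_entry_packed {
	uint8_t  verb;
	uint8_t  stat;
	uint16_t seqnum;
	uint8_t  tok;
	uint8_t  reserved1[3];
	uint32_t fqid;
	uint32_t context_b;
	struct demo_fd fd;
	uint8_t  reserved2[32];
} __attribute__((packed));

/* Packed and aligned: same 64-byte layout, alignment raised to 64. */
struct demo_entry_aligned {
	uint8_t  verb;
	uint8_t  stat;
	uint16_t seqnum;
	uint8_t  tok;
	uint8_t  reserved1[3];
	uint32_t fqid;
	uint32_t context_b;
	struct demo_fd fd;
	uint8_t  reserved2[32];
} __attribute__((packed, aligned(64)));

/* Compile-time checks of the claims made in the comments above. */
_Static_assert(sizeof(struct demo_entry_packed) == 64, "no padding expected");
_Static_assert(sizeof(struct demo_entry_aligned) == 64, "no padding expected");
_Static_assert(_Alignof(struct demo_entry_packed) == 1, "packed lowers alignment");
_Static_assert(_Alignof(struct demo_entry_aligned) == 64, "aligned(64) restores it");
_Static_assert(offsetof(struct demo_entry_aligned, fd) == 16, "fd lands on an 8-byte offset");
```

Because the aligned variant starts on a 64-byte boundary and fd sits at
offset 16, the embedded payload ends up 8-byte aligned without any padding
being added.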


Regards,
Leo

^ permalink raw reply related

* [PATCH 1/2] dma-mapping: introduce dma_get_seg_boundary_nr_pages()
From: Nicolin Chen @ 2020-09-01 22:16 UTC (permalink / raw)
  To: hch
  Cc: linux-ia64, James.Bottomley, paulus, hpa, sparclinux, sfr, deller,
	x86, borntraeger, mingo, mattst88, fenghua.yu, gor, schnelle, hca,
	ink, tglx, gerald.schaefer, rth, tony.luck, linux-parisc,
	linux-s390, linux-kernel, linux-alpha, bp, linuxppc-dev, davem
In-Reply-To: <20200901221646.26491-1-nicoleotsuka@gmail.com>

We found that callers of dma_get_seg_boundary mostly do an ALIGN
with page mask and then do a page shift to get number of pages:
    ALIGN(boundary + 1, 1 << shift) >> shift

However, the boundary might be as large as ULONG_MAX, which means
that a device has no specific boundary limit. So either "+ 1" or
passing it to ALIGN() would potentially overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here into a helper function doing:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

This patch introduces and applies dma_get_seg_boundary_nr_pages()
as an overflow-free helper for the dma_get_seg_boundary() callers
to get the number of pages. It also takes care of the NULL dev case
for non-DMA API callers.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
 arch/alpha/kernel/pci_iommu.c    |  7 +------
 arch/ia64/hp/common/sba_iommu.c  |  3 +--
 arch/powerpc/kernel/iommu.c      |  9 ++-------
 arch/s390/pci/pci_dma.c          |  6 ++----
 arch/sparc/kernel/iommu-common.c | 10 +++-------
 arch/sparc/kernel/iommu.c        |  3 +--
 arch/sparc/kernel/pci_sun4v.c    |  3 +--
 arch/x86/kernel/amd_gart_64.c    |  3 +--
 drivers/parisc/ccio-dma.c        |  3 +--
 drivers/parisc/sba_iommu.c       |  3 +--
 include/linux/dma-mapping.h      | 19 +++++++++++++++++++
 11 files changed, 33 insertions(+), 36 deletions(-)
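
For illustration only (not part of this patch): a minimal userspace sketch of
the arithmetic derived in the commit message. The names ALIGN_UL,
nr_pages_naive, nr_pages_safe and nr_pages_selfcheck are hypothetical. It
suggests that the two forms agree when nothing wraps, and that only the
shortcut gives a usable result for a boundary of ULONG_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UL(x, a) (((x) + ((a) - 1UL)) & ~((a) - 1UL))

/* Original form: "boundary + 1" wraps to 0 when boundary == ULONG_MAX. */
static unsigned long nr_pages_naive(unsigned long boundary, unsigned int shift)
{
	return ALIGN_UL(boundary + 1UL, 1UL << shift) >> shift;
}

/* Shortcut from the derivation; it cannot wrap as long as shift >= 1. */
static unsigned long nr_pages_safe(unsigned long boundary, unsigned int shift)
{
	return (boundary >> shift) + 1UL;
}

/* Hypothetical self-check: both forms agree for boundaries that don't wrap. */
static void nr_pages_selfcheck(void)
{
	assert(nr_pages_naive(0UL, 12) == nr_pages_safe(0UL, 12));
	assert(nr_pages_naive(0xfffUL, 12) == nr_pages_safe(0xfffUL, 12));
	assert(nr_pages_naive(0xffffUL, 12) == nr_pages_safe(0xffffUL, 12));
	assert(nr_pages_naive(0x12345UL, 12) == nr_pages_safe(0x12345UL, 12));
}
```

With boundary == ULONG_MAX the original form yields 0 pages, because the
"+ 1" wraps before ALIGN is applied, whereas the shortcut yields
(ULONG_MAX >> shift) + 1.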

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 81037907268d..6f7de4f4e191 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -141,12 +141,7 @@ iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena,
 	unsigned long boundary_size;
 
 	base = arena->dma_base >> PAGE_SHIFT;
-	if (dev) {
-		boundary_size = dma_get_seg_boundary(dev) + 1;
-		boundary_size >>= PAGE_SHIFT;
-	} else {
-		boundary_size = 1UL << (32 - PAGE_SHIFT);
-	}
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
 
 	/* Search forward for the first mask-aligned sequence of N free ptes */
 	ptes = arena->ptes;
diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 656a4888c300..b49b73a95067 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -485,8 +485,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
 	ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) == 0);
 	ASSERT(res_ptr < res_end);
 
-	boundary_size = (unsigned long long)dma_get_seg_boundary(dev) + 1;
-	boundary_size = ALIGN(boundary_size, 1ULL << iovp_shift) >> iovp_shift;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, iovp_shift);
 
 	BUG_ON(ioc->ibase & ~iovp_mask);
 	shift = ioc->ibase >> iovp_shift;
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 9704f3f76e63..cbc2e62db597 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -236,15 +236,10 @@ static unsigned long iommu_range_alloc(struct device *dev,
 		}
 	}
 
-	if (dev)
-		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				      1 << tbl->it_page_shift);
-	else
-		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
-	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, tbl->it_page_shift);
 
 	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
-			     boundary_size >> tbl->it_page_shift, align_mask);
+			     boundary_size, align_mask);
 	if (n == -1) {
 		if (likely(pass == 0)) {
 			/* First try the pool from the start */
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 64b1399a73f0..4a37d8f4de9d 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -261,13 +261,11 @@ static unsigned long __dma_alloc_iommu(struct device *dev,
 				       unsigned long start, int size)
 {
 	struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
-	unsigned long boundary_size;
 
-	boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-			      PAGE_SIZE) >> PAGE_SHIFT;
 	return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages,
 				start, size, zdev->start_dma >> PAGE_SHIFT,
-				boundary_size, 0);
+				dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT),
+				0);
 }
 
 static dma_addr_t dma_alloc_address(struct device *dev, int size)
diff --git a/arch/sparc/kernel/iommu-common.c b/arch/sparc/kernel/iommu-common.c
index 59cb16691322..23ca75f09277 100644
--- a/arch/sparc/kernel/iommu-common.c
+++ b/arch/sparc/kernel/iommu-common.c
@@ -166,13 +166,6 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
 		}
 	}
 
-	if (dev)
-		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				      1 << iommu->table_shift);
-	else
-		boundary_size = ALIGN(1ULL << 32, 1 << iommu->table_shift);
-
-	boundary_size = boundary_size >> iommu->table_shift;
 	/*
 	 * if the skip_span_boundary_check had been set during init, we set
 	 * things up so that iommu_is_span_boundary() merely checks if the
@@ -181,6 +174,9 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
 	if ((iommu->flags & IOMMU_NO_SPAN_BOUND) != 0) {
 		shift = 0;
 		boundary_size = iommu->poolsize * iommu->nr_pools;
+	} else {
+		boundary_size = dma_get_seg_boundary_nr_pages(dev,
+					iommu->table_shift);
 	}
 	n = iommu_area_alloc(iommu->map, limit, start, npages, shift,
 			     boundary_size, align_mask);
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 4ae7388b1bff..c3e4e2df26a8 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -472,8 +472,7 @@ static int dma_4u_map_sg(struct device *dev, struct scatterlist *sglist,
 	outs->dma_length = 0;
 
 	max_seg_size = dma_get_max_seg_size(dev);
-	seg_boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				  IO_PAGE_SIZE) >> IO_PAGE_SHIFT;
+	seg_boundary_size = dma_get_seg_boundary_nr_pages(dev, IO_PAGE_SHIFT);
 	base_shift = iommu->tbl.table_map_base >> IO_PAGE_SHIFT;
 	for_each_sg(sglist, s, nelems, i) {
 		unsigned long paddr, npages, entry, out_entry = 0, slen;
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 14b93c5564e3..6b92dd51c002 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -508,8 +508,7 @@ static int dma_4v_map_sg(struct device *dev, struct scatterlist *sglist,
 	iommu_batch_start(dev, prot, ~0UL);
 
 	max_seg_size = dma_get_max_seg_size(dev);
-	seg_boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				  IO_PAGE_SIZE) >> IO_PAGE_SHIFT;
+	seg_boundary_size = dma_get_seg_boundary_nr_pages(dev, IO_PAGE_SHIFT);
 
 	mask = *dev->dma_mask;
 	if (!iommu_use_atu(iommu, mask))
diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index e89031e9c847..bccc5357bffd 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -96,8 +96,7 @@ static unsigned long alloc_iommu(struct device *dev, int size,
 
 	base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
 			   PAGE_SIZE) >> PAGE_SHIFT;
-	boundary_size = ALIGN((u64)dma_get_seg_boundary(dev) + 1,
-			      PAGE_SIZE) >> PAGE_SHIFT;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
 
 	spin_lock_irqsave(&iommu_bitmap_lock, flags);
 	offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, next_bit,
diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index a5507f75b524..ba16b7f8f806 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -356,8 +356,7 @@ ccio_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
 	** ggg sacrifices another 710 to the computer gods.
 	*/
 
-	boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
-			      1ULL << IOVP_SHIFT) >> IOVP_SHIFT;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, IOVP_SHIFT);
 
 	if (pages_needed <= 8) {
 		/*
diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
index d4314fba0269..959bda193b96 100644
--- a/drivers/parisc/sba_iommu.c
+++ b/drivers/parisc/sba_iommu.c
@@ -342,8 +342,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
 	unsigned long shift;
 	int ret;
 
-	boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
-			      1ULL << IOVP_SHIFT) >> IOVP_SHIFT;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, IOVP_SHIFT);
 
 #if defined(ZX1_SUPPORT)
 	BUG_ON(ioc->ibase & ~IOVP_MASK);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 52635e91143b..faab0a8210b9 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -632,6 +632,25 @@ static inline unsigned long dma_get_seg_boundary(struct device *dev)
 	return DMA_BIT_MASK(32);
 }
 
+/**
+ * dma_get_seg_boundary_nr_pages - return the segment boundary in "page" units
+ * @dev: device to query the boundary for
+ * @page_shift: ilog() of the IOMMU page size
+ *
+ * Return the segment boundary in IOMMU page units (which may be different from
+ * the CPU page size) for the passed in device.
+ *
+ * If @dev is NULL a boundary of U32_MAX is assumed, this case is just for
+ * non-DMA API callers.
+ */
+static inline unsigned long dma_get_seg_boundary_nr_pages(struct device *dev,
+		unsigned int page_shift)
+{
+	if (!dev)
+		return (U32_MAX >> page_shift) + 1;
+	return (dma_get_seg_boundary(dev) >> page_shift) + 1;
+}
+
 static inline int dma_set_seg_boundary(struct device *dev, unsigned long mask)
 {
 	if (dev->dma_parms) {
-- 
2.17.1


^ permalink raw reply related

* [PATCH 0/2] dma-mapping: update default segment_boundary_mask
From: Nicolin Chen @ 2020-09-01 22:16 UTC (permalink / raw)
  To: hch
  Cc: linux-ia64, James.Bottomley, paulus, hpa, sparclinux, sfr, deller,
	x86, borntraeger, mingo, mattst88, fenghua.yu, gor, schnelle, hca,
	ink, tglx, gerald.schaefer, rth, tony.luck, linux-parisc,
	linux-s390, linux-kernel, linux-alpha, bp, linuxppc-dev, davem

These two patches update the default segment_boundary_mask.

PATCH-1 fixes overflow issues in callers of dma_get_seg_boundary().
The previous version was a series: https://lkml.org/lkml/2020/8/31/1026

PATCH-2 then sets the default segment_boundary_mask to ULONG_MAX.

Nicolin Chen (2):
  dma-mapping: introduce dma_get_seg_boundary_nr_pages()
  dma-mapping: set default segment_boundary_mask to ULONG_MAX

 arch/alpha/kernel/pci_iommu.c    |  7 +------
 arch/ia64/hp/common/sba_iommu.c  |  3 +--
 arch/powerpc/kernel/iommu.c      |  9 ++-------
 arch/s390/pci/pci_dma.c          |  6 ++----
 arch/sparc/kernel/iommu-common.c | 10 +++-------
 arch/sparc/kernel/iommu.c        |  3 +--
 arch/sparc/kernel/pci_sun4v.c    |  3 +--
 arch/x86/kernel/amd_gart_64.c    |  3 +--
 drivers/parisc/ccio-dma.c        |  3 +--
 drivers/parisc/sba_iommu.c       |  3 +--
 include/linux/dma-mapping.h      | 21 ++++++++++++++++++++-
 11 files changed, 34 insertions(+), 37 deletions(-)

-- 
2.17.1


^ permalink raw reply

* [PATCH 2/2] dma-mapping: set default segment_boundary_mask to ULONG_MAX
From: Nicolin Chen @ 2020-09-01 22:16 UTC (permalink / raw)
  To: hch
  Cc: linux-ia64, James.Bottomley, paulus, hpa, sparclinux, sfr, deller,
	x86, borntraeger, mingo, mattst88, fenghua.yu, gor, schnelle, hca,
	ink, tglx, gerald.schaefer, rth, tony.luck, linux-parisc,
	linux-s390, linux-kernel, linux-alpha, bp, linuxppc-dev, davem
In-Reply-To: <20200901221646.26491-1-nicoleotsuka@gmail.com>

The default segment_boundary_mask was set to DMA_BIT_MASK(32)
a decade ago by referencing the SCSI/block subsystem, as a 32-bit
mask was good enough for most devices.

Now more and more drivers set dma_masks above DMA_BIT_MASK(32)
while only a handful of them call dma_set_seg_boundary(). This
means that most drivers have a 4GB segmentation boundary because
the DMA API returns a 32-bit default value, though they might not
really have such a limit.

The default segment_boundary_mask should mean "no limit" since
the device doesn't explicitly set the mask. But a 32-bit mask
certainly limits devices capable of addressing more than 32 bits.

So this patch sets the default segment_boundary_mask to ULONG_MAX.
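
As an illustration only, the effect of the new default can be modelled
in userspace. The structures below are minimal stand-ins for the
kernel ones, and seg_boundary_with_default() is a hypothetical name
that takes the fallback as a parameter so the old and new defaults
can be compared side by side:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel structures; only the fields used here. */
struct device_dma_parameters {
	unsigned long segment_boundary_mask;
};

struct device {
	struct device_dma_parameters *dma_parms;
};

/*
 * Mirrors the logic of dma_get_seg_boundary(): an explicitly set,
 * nonzero mask wins; otherwise the fallback is returned.
 */
static unsigned long seg_boundary_with_default(const struct device *dev,
					       unsigned long fallback)
{
	if (dev->dma_parms && dev->dma_parms->segment_boundary_mask)
		return dev->dma_parms->segment_boundary_mask;
	return fallback;
}

/* A device that never set a boundary sees only the fallback change. */
static void example(void)
{
	struct device_dma_parameters parms = { 0xffffUL };
	struct device unset = { NULL };
	struct device explicit_mask = { &parms };

	/* Old default, standing in for DMA_BIT_MASK(32). */
	assert(seg_boundary_with_default(&unset, 0xffffffffUL) == 0xffffffffUL);
	/* New default. */
	assert(seg_boundary_with_default(&unset, ULONG_MAX) == ULONG_MAX);
	/* An explicit mask is unaffected by the default. */
	assert(seg_boundary_with_default(&explicit_mask, ULONG_MAX) == 0xffffUL);
}
```

Drivers that call dma_set_seg_boundary() keep their own mask in this
model; only devices relying on the fallback see a different value.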

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
 include/linux/dma-mapping.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index faab0a8210b9..df0bff2ea750 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -629,7 +629,7 @@ static inline unsigned long dma_get_seg_boundary(struct device *dev)
 {
 	if (dev->dma_parms && dev->dma_parms->segment_boundary_mask)
 		return dev->dma_parms->segment_boundary_mask;
-	return DMA_BIT_MASK(32);
+	return ULONG_MAX;
 }
 
 /**
-- 
2.17.1


^ permalink raw reply related

* [PATCH 0/2] link vdso with linker
From: Nick Desaulniers @ 2020-09-01 22:25 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin
  Cc: Christophe Leroy, Joe Lawrence, Kees Cook, Fangrui Song,
	Nick Desaulniers, linux-kernel, clang-built-linux, Paul Mackerras,
	linuxppc-dev

Kees Cook is working on a series that adds --orphan-handling=warn to
arm, arm64, and x86.  I noticed that the ppc vdsos were still using
cc-ldoption for these, which I had removed.  It seems this results in
that flag being silently dropped.

I'm very confident in the first patch, but the second needs closer
review around the error mentioned below the fold related to the .got
section.

Nick Desaulniers (2):
  powerpc/vdso64: link vdso64 with linker
  powerpc/vdso32: link vdso64 with linker

 arch/powerpc/include/asm/vdso.h         | 17 ++---------------
 arch/powerpc/kernel/vdso32/Makefile     |  7 +++++--
 arch/powerpc/kernel/vdso32/vdso32.lds.S |  3 ++-
 arch/powerpc/kernel/vdso64/Makefile     |  8 ++++++--
 arch/powerpc/kernel/vdso64/vdso64.lds.S |  1 -
 5 files changed, 15 insertions(+), 21 deletions(-)

-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply

