LinuxPPC-Dev Archive on lore.kernel.org


* Re: [PATCH V4 2/3] ASoC: fsl_asrc: replace the process_option table with function
From: Nicolin Chen @ 2019-04-19 18:22 UTC (permalink / raw)
  To: S.j. Wang
  Cc: alsa-devel@alsa-project.org, timur@kernel.org,
	Xiubo.Lee@gmail.com, festevam@gmail.com,
	linux-kernel@vger.kernel.org, broonie@kernel.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <0f7a6907c73e110c797b478fedaba2fc47b5e994.1555669068.git.shengjiu.wang@nxp.com>

On Fri, Apr 19, 2019 at 10:23:53AM +0000, S.j. Wang wrote:

> @@ -289,6 +318,12 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>  		return -EINVAL;
>  	}
>  
> +	ret = fsl_asrc_sel_proc(inrate, outrate, &pre_proc, &post_proc);

Since the function always returns 0, I am thinking of treating
this function as a lookup function, and then moving this call
right before the register settings -- as we have already made
sure that both inrate and outrate are supported.

> +	if (ret) {
> +		pair_err("No supported pre-processing options\n");
> +		return ret;
> +	}

And we probably no longer need this error-out. If there's a new
limitation related to this function, I believe we can add it
to the rate validation section as we are doing now -- better
to have the rate validation code in one place.

> @@ -380,8 +415,8 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>  	/* Apply configurations for pre- and post-processing */

Here:
-  	/* Apply configurations for pre- and post-processing */
+  	/* Select and apply configurations for pre- and post-processing */
+	fsl_asrc_sel_proc(inrate, outrate, &pre_proc, &post_proc);
>  	regmap_update_bits(asrc_priv->regmap, REG_ASRCFG,
>  			   ASRCFG_PREMODi_MASK(index) |	ASRCFG_POSTMODi_MASK(index),
> -			   ASRCFG_PREMOD(index, process_option[in][out][0]) |
> -			   ASRCFG_POSTMOD(index, process_option[in][out][1]));
> +			   ASRCFG_PREMOD(index, pre_proc) |
> +			   ASRCFG_POSTMOD(index, post_proc));


* Re: [PATCH V4 1/3] ASoC: fsl_asrc: Fix the issue about unsupported rate
From: Nicolin Chen @ 2019-04-19 18:10 UTC (permalink / raw)
  To: S.j. Wang
  Cc: alsa-devel@alsa-project.org, timur@kernel.org,
	Xiubo.Lee@gmail.com, festevam@gmail.com,
	linux-kernel@vger.kernel.org, broonie@kernel.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <06c3e420b9fabfbec67becc2f9de009ce79a1d4b.1555669068.git.shengjiu.wang@nxp.com>

On Fri, Apr 19, 2019 at 10:23:50AM +0000, S.j. Wang wrote:
> When the output sample rate is [8kHz, 30kHz], the limitation
> of the supported ratio range is (1/24, 8). In the driver
> we use (8kHz, 30kHz) instead of [8kHz, 30kHz].
> So this patch is to fix this issue and the potential rounding
> issue with divider.
> 
> Fixes: fff6e03c7b65 ("ASoC: fsl_asrc: add support for 8-30kHz
> output sample rate")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> ---
>  sound/soc/fsl/fsl_asrc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
> index 0b937924d2e4..5b8adc7fb117 100644
> --- a/sound/soc/fsl/fsl_asrc.c
> +++ b/sound/soc/fsl/fsl_asrc.c
> @@ -282,10 +282,10 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>  		return -EINVAL;
>  	}
>  
> -	if ((outrate > 8000 && outrate < 30000) &&
> -	    (outrate/inrate > 24 || inrate/outrate > 8)) {
> -		pair_err("exceed supported ratio range [1/24, 8] for \
> -				inrate/outrate: %d/%d\n", inrate, outrate);
> +	if ((outrate >= 8000 && outrate <= 30000) &&
> +	    (outrate > 24 * inrate || inrate > 8 * outrate)) {
> +		pair_err("exceed supported ratio range (1/24, 8) for inrate/outrate: %d/%d\n",

Using one of the conditions:
	if (inrate > 8 * outrate)
		pair_err();

This means:
	if (inrate <= 8 * outrate)
		/* Everything is fine */

So the supported ratio range is still [1/24, 8] right?
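
A minimal user-space sketch of the two conditions in the hunk above may help when checking the boundary cases. It is not driver code: the function names are hypothetical, and only the two `if` conditions are copied from the removed and added lines.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition from the removed lines: integer division truncates the ratio,
 * and the output-rate bounds are exclusive. */
static bool ratio_rejected_old(int inrate, int outrate)
{
	assert(inrate > 0 && outrate > 0);
	return (outrate > 8000 && outrate < 30000) &&
	       (outrate / inrate > 24 || inrate / outrate > 8);
}

/* Condition from the added lines: multiplication instead of division,
 * and inclusive output-rate bounds. */
static bool ratio_rejected_new(int inrate, int outrate)
{
	assert(inrate > 0 && outrate > 0);
	return (outrate >= 8000 && outrate <= 30000) &&
	       (outrate > 24 * inrate || inrate > 8 * outrate);
}
```

With these helpers, inrate = 64000 and outrate = 8000 (a ratio of exactly 8) is accepted by the new condition, so the ratio bound itself stays inclusive. A pair such as 95999/11025 shows the truncation problem: the old condition computes 95999 / 11025 as 8 and accepts it, while the new one rejects it.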

Thanks


* Re: Linux 5.1-rc5
From: Linus Torvalds @ 2019-04-19 17:27 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Christoph Hellwig, linuxppc-dev, Linux List Kernel Mailing,
	linux-s390
In-Reply-To: <20190419153307.4f2911b5@mschwideX1>

On Fri, Apr 19, 2019 at 6:33 AM Martin Schwidefsky
<schwidefsky@de.ibm.com> wrote:
>
> That problem got stuck in my head and I thought more about it. Why not
> emulate the static folding sequence in the s390 page table code?

So this model seems much closer to what x86 does in its folding, where
the pattern is basically

> static inline pX-1d_t *pXd_offset(pXd_t *pXd, unsigned long address)
> {
>         if (pXd_folded(pXd))
>                 return (pX-1d_t *) pXd;
>         return (pX-1d_t *) pXd_deref(*pXd) + pXd_index(address);
> }

which is really how the code is designed to work (ie the folded entry
doesn't actually do anything to the page directory pointer, it just
says "ok, we'll use this exact page directory pointer for the next
lower level instead").

And that's very much what allows the generic gup code to load the
entry once, and use a temporary, and as you walk down the chain, if it
is folded it just then uses that (previous) temporary value for the
next level instead. IOW, the lower level page table is hidden inside
the upper level one, and folding just means "don't do any offsets,
don't change any values, just use the entry as-is for the next lower
level".

So I think that's the right thing to do.

Looking at the s390 code, it seems to fold things the other way,
conceptually hiding the upper level inside the lower one, and always
doing the offset thing (but just avoiding the dereference).

Maybe there's some reason why the s390 code does it that way, but I
think your new model is the right one, and hopefully means you can use
the generic page table walking more easily.

Of course, the s390 folding is very different from the x86 one (or the
generic fixed 3-level or 4-level cases). The x86 folding doesn't
depend on the contents of the page tables, it's just entirely static
(well, the 5th level is conditional, but it's conditional on a static
key, not on what is in the page tables). So maybe the old model of
s390 made more sense in that context, but I look at your new suggested
pXd_offset() functions and I go "yeah, that's the way it's supposed to
work".
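
A toy user-space model of that folding pattern, purely as a sketch: the `toy_entry` type, the `folded` flag passed by the caller, and the four-entry tables are all assumptions made for illustration and do not correspond to any architecture's real page table layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4u

/* One toy table entry: it only points at the next lower-level table. */
typedef struct toy_entry {
	struct toy_entry *next;
} toy_entry;

/* Index into a table of ENTRIES slots for the given address bits. */
static unsigned int level_index(uintptr_t addr, unsigned int shift)
{
	return (unsigned int)((addr >> shift) & (ENTRIES - 1u));
}

/*
 * Descend one level. When the level is folded, hand back the very same
 * entry pointer: no dereference, no offset. Otherwise dereference the
 * entry and index into the lower-level table.
 */
static toy_entry *level_offset(toy_entry *upper, bool folded,
			       uintptr_t addr, unsigned int shift)
{
	assert(upper != NULL);
	if (folded)
		return upper;
	return upper->next + level_index(addr, shift);
}
```

A walker can keep the entry it loaded in a temporary and pass it down; for a folded level the temporary is simply reused unchanged.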

                 Linus


* Re: [PATCH v12 09/31] mm: VMA sequence count
From: Laurent Dufour @ 2019-04-19 15:45 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: jack, sergey.senozhatsky.work, peterz, Will Deacon, mhocko,
	linux-mm, paulus, Punit Agrawal, hpa, Michel Lespinasse,
	Alexei Starovoitov, Andrea Arcangeli, ak, Minchan Kim,
	aneesh.kumar, x86, Matthew Wilcox, Daniel Jordan, Ingo Molnar,
	David Rientjes, paulmck, Haiyan Song, npiggin, sj38.park, dave,
	kemi.wang, kirill, Thomas Gleixner, zhong jiang, Ganesh Mahendran,
	Yang Shi, Mike Rapoport, linuxppc-dev, linux-kernel,
	Sergey Senozhatsky, vinayak menon, akpm, Tim Chen, haren
In-Reply-To: <20190418224857.GI11645@redhat.com>

Hi Jerome,

Thanks a lot for reviewing this series.

On 19/04/2019 at 00:48, Jerome Glisse wrote:
> On Tue, Apr 16, 2019 at 03:45:00PM +0200, Laurent Dufour wrote:
>> From: Peter Zijlstra <peterz@infradead.org>
>>
>> Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence
>> counts such that we can easily test if a VMA is changed.
>>
>> The calls to vm_write_begin/end() in unmap_page_range() are
>> used to detect when a VMA is being unmap and thus that new page fault
>> should not be satisfied for this VMA. If the seqcount hasn't changed when
>> the page table are locked, this means we are safe to satisfy the page
>> fault.
>>
>> The flip side is that we cannot distinguish between a vma_adjust() and
>> the unmap_page_range() -- where with the former we could have
>> re-checked the vma bounds against the address.
>>
>> The VMA's sequence counter is also used to detect change to various VMA's
>> fields used during the page fault handling, such as:
>>   - vm_start, vm_end
>>   - vm_pgoff
>>   - vm_flags, vm_page_prot
>>   - vm_policy
> 
> ^ All above are under mmap write lock ?

Yes, changes are still made under the protection of the mmap_sem.

> 
>>   - anon_vma
> 
> ^ This is either under mmap write lock or under page table lock
> 
> So my question is do we need the complexity of seqcount_t for this ?

The sequence counter is used to detect write operations done while a
reader (the SPF handler) is running.

The implementation is quite simple (here without the lockdep checks):

static inline void raw_write_seqcount_begin(seqcount_t *s)
{
	s->sequence++;
	smp_wmb();
}

I can't see why this is too complex here. Would you elaborate on this?
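
For readers less familiar with the pattern, here is a user-space sketch of the begin/end/compare idea. It is only an illustration: the `seq_sketch` type and `sketch_*` names are hypothetical, and it uses sequentially consistent C11 atomics for simplicity rather than reproducing the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct seq_sketch {
	atomic_uint sequence;
};

/* Writer side: the counter is odd while a modification is in progress. */
static void sketch_write_begin(struct seq_sketch *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->sequence, 1);
}

static void sketch_write_end(struct seq_sketch *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->sequence, 1);
}

/* Reader side: take a snapshot before using the protected fields. */
static unsigned int sketch_read_begin(struct seq_sketch *s)
{
	return atomic_load(&s->sequence);
}

/*
 * Reader side: an odd snapshot means a writer was active when it was taken;
 * a different current value means a writer has run since. The reader never
 * waits on the counter, it only compares values.
 */
static bool sketch_changed(struct seq_sketch *s, unsigned int snap)
{
	return (snap & 1u) || atomic_load(&s->sequence) != snap;
}
```

A speculative reader would call sketch_read_begin(), read the fields it needs, and back off if sketch_changed() returns true.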

> 
> It seems that using regular int as counter and also relying on vm_flags
> when vma is unmap should do the trick.

vm_flags is not enough, I guess, as some operations do not affect
vm_flags at all (resizing, for instance).
Am I missing something?

> 
> vma_delete(struct vm_area_struct *vma)
> {
>      ...
>      /*
>       * Make sure the vma is mark as invalid ie neither read nor write
>       * so that speculative fault back off. A racing speculative fault
>       * will either see the flags as 0 or the new seqcount.
>       */
>      vma->vm_flags = 0;
>      smp_wmb();
>      vma->seqcount++;
>      ...
> }

Well, I don't think we can safely clear vm_flags this way when the
VMA is unmapped; I think it is used later when the cleanup is done.

Later in this series, the VMA deletion is managed when the VMA is 
unlinked from the RB Tree. That is checked using the vm_rb field's 
value, and managed using RCU.

> Then:
> speculative_fault_begin(struct vm_area_struct *vma,
>                          struct spec_vmf *spvmf)
> {
>      ...
>      spvmf->seqcount = vma->seqcount;
>      smp_rmb();
>      spvmf->vm_flags = vma->vm_flags;
>      if (!spvmf->vm_flags) {
>          // Back off the vma is dying ...
>          ...
>      }
> }
> 
> bool speculative_fault_commit(struct vm_area_struct *vma,
>                                struct spec_vmf *spvmf)
> {
>      ...
>      seqcount = vma->seqcount;
>      smp_rmb();
>      vm_flags = vma->vm_flags;
> 
>      if (spvmf->vm_flags != vm_flags || seqcount != spvmf->seqcount) {
>          // Something did change for the vma
>          return false;
>      }
>      return true;
> }
> 
> This would also avoid the lockdep issue described below. But maybe what
> i propose is stupid and i will see it after further reviewing thing.

It's true that lockdep is quite annoying here. But it is still
useful to keep it in the loop to catch two subsequent
write_seqcount_begin() calls being made in the same context (which would
lead to an even sequence counter value while a write operation is in
progress). So I think it is still a good thing to have lockdep
available here.
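
The parity problem can be seen with a plain counter. This is a single-threaded toy with hypothetical `toy_*` names and no barriers, not the kernel's seqcount_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain counter standing in for a sequence count; single-threaded only. */
struct toy_seq {
	unsigned int sequence;
};

static void toy_write_begin(struct toy_seq *s)
{
	assert(s != NULL);
	s->sequence++;
}

static void toy_write_end(struct toy_seq *s)
{
	assert(s != NULL);
	s->sequence++;
}

/* Readers treat an odd value as "a write is in progress". */
static bool toy_write_in_progress(const struct toy_seq *s)
{
	return (s->sequence & 1u) != 0;
}
```

After one toy_write_begin() the counter is odd, as expected. A second toy_write_begin() without an intervening toy_write_end() makes it even again, so a reader would wrongly conclude that no write is in progress.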



> 
> Cheers,
> Jérôme
> 
> 
>>
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>
>> [Port to 4.12 kernel]
>> [Build depends on CONFIG_SPECULATIVE_PAGE_FAULT]
>> [Introduce vm_write_* inline function depending on
>>   CONFIG_SPECULATIVE_PAGE_FAULT]
>> [Fix lock dependency between mapping->i_mmap_rwsem and vma->vm_sequence by
>>   using vm_raw_write* functions]
>> [Fix a lock dependency warning in mmap_region() when entering the error
>>   path]
>> [move sequence initialisation INIT_VMA()]
>> [Review the patch description about unmap_page_range()]
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>>   include/linux/mm.h       | 44 ++++++++++++++++++++++++++++++++++++++++
>>   include/linux/mm_types.h |  3 +++
>>   mm/memory.c              |  2 ++
>>   mm/mmap.c                | 30 +++++++++++++++++++++++++++
>>   4 files changed, 79 insertions(+)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 2ceb1d2869a6..906b9e06f18e 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1410,6 +1410,9 @@ struct zap_details {
>>   static inline void INIT_VMA(struct vm_area_struct *vma)
>>   {
>>   	INIT_LIST_HEAD(&vma->anon_vma_chain);
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +	seqcount_init(&vma->vm_sequence);
>> +#endif
>>   }
>>   
>>   struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
>> @@ -1534,6 +1537,47 @@ static inline void unmap_shared_mapping_range(struct address_space *mapping,
>>   	unmap_mapping_range(mapping, holebegin, holelen, 0);
>>   }
>>   
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +static inline void vm_write_begin(struct vm_area_struct *vma)
>> +{
>> +	write_seqcount_begin(&vma->vm_sequence);
>> +}
>> +static inline void vm_write_begin_nested(struct vm_area_struct *vma,
>> +					 int subclass)
>> +{
>> +	write_seqcount_begin_nested(&vma->vm_sequence, subclass);
>> +}
>> +static inline void vm_write_end(struct vm_area_struct *vma)
>> +{
>> +	write_seqcount_end(&vma->vm_sequence);
>> +}
>> +static inline void vm_raw_write_begin(struct vm_area_struct *vma)
>> +{
>> +	raw_write_seqcount_begin(&vma->vm_sequence);
>> +}
>> +static inline void vm_raw_write_end(struct vm_area_struct *vma)
>> +{
>> +	raw_write_seqcount_end(&vma->vm_sequence);
>> +}
>> +#else
>> +static inline void vm_write_begin(struct vm_area_struct *vma)
>> +{
>> +}
>> +static inline void vm_write_begin_nested(struct vm_area_struct *vma,
>> +					 int subclass)
>> +{
>> +}
>> +static inline void vm_write_end(struct vm_area_struct *vma)
>> +{
>> +}
>> +static inline void vm_raw_write_begin(struct vm_area_struct *vma)
>> +{
>> +}
>> +static inline void vm_raw_write_end(struct vm_area_struct *vma)
>> +{
>> +}
>> +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
>> +
>>   extern int access_process_vm(struct task_struct *tsk, unsigned long addr,
>>   		void *buf, int len, unsigned int gup_flags);
>>   extern int access_remote_vm(struct mm_struct *mm, unsigned long addr,
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index fd7d38ee2e33..e78f72eb2576 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -337,6 +337,9 @@ struct vm_area_struct {
>>   	struct mempolicy *vm_policy;	/* NUMA policy for the VMA */
>>   #endif
>>   	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +	seqcount_t vm_sequence;
>> +#endif
>>   } __randomize_layout;
>>   
>>   struct core_thread {
>> diff --git a/mm/memory.c b/mm/memory.c
>> index d5bebca47d98..423fa8ea0569 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -1256,6 +1256,7 @@ void unmap_page_range(struct mmu_gather *tlb,
>>   	unsigned long next;
>>   
>>   	BUG_ON(addr >= end);
>> +	vm_write_begin(vma);
>>   	tlb_start_vma(tlb, vma);
>>   	pgd = pgd_offset(vma->vm_mm, addr);
>>   	do {
>> @@ -1265,6 +1266,7 @@ void unmap_page_range(struct mmu_gather *tlb,
>>   		next = zap_p4d_range(tlb, vma, pgd, addr, next, details);
>>   	} while (pgd++, addr = next, addr != end);
>>   	tlb_end_vma(tlb, vma);
>> +	vm_write_end(vma);
>>   }
>>   
>>   
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 5ad3a3228d76..a4e4d52a5148 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -726,6 +726,30 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>   	long adjust_next = 0;
>>   	int remove_next = 0;
>>   
>> +	/*
>> +	 * Why using vm_raw_write*() functions here to avoid lockdep's warning ?
>> +	 *
>> +	 * Locked is complaining about a theoretical lock dependency, involving
>> +	 * 3 locks:
>> +	 *   mapping->i_mmap_rwsem --> vma->vm_sequence --> fs_reclaim
>> +	 *
>> +	 * Here are the major path leading to this dependency :
>> +	 *  1. __vma_adjust() mmap_sem  -> vm_sequence -> i_mmap_rwsem
>> +	 *  2. move_vmap() mmap_sem -> vm_sequence -> fs_reclaim
>> +	 *  3. __alloc_pages_nodemask() fs_reclaim -> i_mmap_rwsem
>> +	 *  4. unmap_mapping_range() i_mmap_rwsem -> vm_sequence
>> +	 *
>> +	 * So there is no way to solve this easily, especially because in
>> +	 * unmap_mapping_range() the i_mmap_rwsem is grab while the impacted
>> +	 * VMAs are not yet known.
>> +	 * However, the way the vm_seq is used is guarantying that we will
>> +	 * never block on it since we just check for its value and never wait
>> +	 * for it to move, see vma_has_changed() and handle_speculative_fault().
>> +	 */
>> +	vm_raw_write_begin(vma);
>> +	if (next)
>> +		vm_raw_write_begin(next);
>> +
>>   	if (next && !insert) {
>>   		struct vm_area_struct *exporter = NULL, *importer = NULL;
>>   
>> @@ -950,6 +974,8 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>   			 * "vma->vm_next" gap must be updated.
>>   			 */
>>   			next = vma->vm_next;
>> +			if (next)
>> +				vm_raw_write_begin(next);
>>   		} else {
>>   			/*
>>   			 * For the scope of the comment "next" and
>> @@ -996,6 +1022,10 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>   	if (insert && file)
>>   		uprobe_mmap(insert);
>>   
>> +	if (next && next != vma)
>> +		vm_raw_write_end(next);
>> +	vm_raw_write_end(vma);
>> +
>>   	validate_mm(mm);
>>   
>>   	return 0;
>> -- 
>> 2.21.0
>>
> 



* Re: [PATCH v2 06/11] MIPS: mark __fls() as __always_inline
From: Mathieu Malaterre @ 2019-04-19 15:45 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-arch, linux-s390, Arnd Bergmann, x86, Heiko Carstens,
	linux-mips, LKML, Ingo Molnar, linux-mtd, Andrew Morton,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-7-yamada.masahiro@socionext.com>

Hi,

On Fri, Apr 19, 2019 at 12:06 PM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
> place. We need to eliminate potential issues beforehand.
>
> If it is enabled for mips, the following errors are reported:
>
> arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
> sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
> arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined references to `mips_gcr_base'

Tested with success on ppc32/G4. But on CI20 (ci20_defconfig from
master), I get:

  MODPOST vmlinux.o
mipsel-linux-gnu-ld: arch/mips/kernel/traps.o: in function
`addr_gcr_err_control':
/home/mathieu/tmp/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:169:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:169:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld: arch/mips/mm/sc-mips.o: in function
`addr_gcr_l2_pft_control':
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
arch/mips/mm/sc-mips.o:/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
more undefined references to `mips_gcr_base' follow


> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
>
> Changes in v2:
>   - new patch
>
>  arch/mips/include/asm/bitops.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
> index 830c93a010c3..6a26ead1c2b6 100644
> --- a/arch/mips/include/asm/bitops.h
> +++ b/arch/mips/include/asm/bitops.h
> @@ -482,7 +482,7 @@ static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *
>   * Return the bit position (0..63) of the most significant 1 bit in a word
>   * Returns -1 if no 1 bit exists
>   */
> -static inline unsigned long __fls(unsigned long word)
> +static __always_inline unsigned long __fls(unsigned long word)
>  {
>         int num;
>
> --
> 2.17.1
>


* [PATCH] vfio-pci/nvlink2: Fix potential VMA leak
From: Greg Kurz @ 2019-04-19 15:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alexey Kardashevskiy, Alex Williamson, linuxppc-dev

If vfio_pci_register_dev_region() fails then we should roll back
previous changes, i.e. unmap the ATSD registers.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 drivers/vfio/pci/vfio_pci_nvlink2.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
index 32f695ffe128..50fe3c4f7feb 100644
--- a/drivers/vfio/pci/vfio_pci_nvlink2.c
+++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
@@ -472,6 +472,8 @@ int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
 	return 0;
 
 free_exit:
+	if (data->base)
+		memunmap(data->base);
 	kfree(data);
 
 	return ret;



* [PATCH] powerpc/powernv/npu: Fix reference leak
From: Greg Kurz @ 2019-04-19 15:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alexey Kardashevskiy, linuxppc-dev, Alistair Popple

Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
has the effect of incrementing the reference count of the PCI device, as
explained in drivers/pci/search.c:

 * Given a PCI domain, bus, and slot/function number, the desired PCI
 * device is located in the list of PCI devices. If the device is
 * found, its reference count is increased and this function returns a
 * pointer to its data structure.  The caller must decrement the
 * reference count by calling pci_dev_put().  If no device is found,
 * %NULL is returned.

Nothing was done to call pci_dev_put() and the reference count of GPU and
NPU PCI devices rockets up.

A natural way to fix this would be to teach the callers about the change,
so that they call pci_dev_put() when done with the pointer. This turns
out to be quite intrusive, as it affects many paths in npu-dma.c,
pci-ioda.c and vfio_pci_nvlink2.c. Also, the issue appeared in 4.16 and
some affected code got moved around since then: it would be problematic
to backport the fix to stable releases.

All that code never cared for reference counting anyway. Call pci_dev_put()
from get_pci_dev() to revert to the previous behavior.

Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
Cc: stable@vger.kernel.org # v4.16
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 arch/powerpc/platforms/powernv/npu-dma.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index e713ade30087..d8f3647e8fb2 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
 static struct pci_dev *get_pci_dev(struct device_node *dn)
 {
 	struct pci_dn *pdn = PCI_DN(dn);
+	struct pci_dev *pdev;

-	return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
+	pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
 					   pdn->busno, pdn->devfn);
+
+	/*
+	 * pci_get_domain_bus_and_slot() increased the reference count of
+	 * the PCI device, but callers don't need that actually as the PE
+	 * already holds a reference to the device. Since callers aren't
+	 * aware of the reference count change, call pci_dev_put() now to
+	 * avoid leaks.
+	 */
+	if (pdev)
+		pci_dev_put(pdev);
+
+	return pdev;
 }

 /* Given a NPU device get the associated PCI device. */
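
The effect of the change can be modelled in user space with a bare reference counter. This is only a sketch: `toy_dev` and the `toy_*` helpers are hypothetical stand-ins, not the PCI core API, and the "fixed" variant is safe only under the assumption stated in the patch comment, namely that something else already holds a reference for as long as callers use the pointer.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dev {
	int refcount;
};

/* Models a "get" style lookup: returns the device with one extra reference. */
static struct toy_dev *toy_lookup_get(struct toy_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Drops one reference. */
static void toy_put(struct toy_dev *dev)
{
	assert(dev != NULL && dev->refcount > 0);
	dev->refcount--;
}

/* Leaky shape: callers do not know about the extra reference and never drop it. */
static struct toy_dev *toy_get_pci_dev_leaky(struct toy_dev *dev)
{
	return toy_lookup_get(dev);
}

/* Patched shape: drop the extra reference at once and return a borrowed pointer. */
static struct toy_dev *toy_get_pci_dev_fixed(struct toy_dev *dev)
{
	struct toy_dev *found = toy_lookup_get(dev);

	if (found)
		toy_put(found);
	return found;
}
```

Calling the leaky variant three times on a device that starts with one reference leaves the count at four; the fixed variant leaves it at one.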


* Re: [alsa-devel] [PATCH V2] ASoC: fsl_esai: Add pm runtime function
From: Mark Brown @ 2019-04-19 15:25 UTC (permalink / raw)
  To: S.j. Wang
  Cc: alsa-devel@alsa-project.org, timur@kernel.org,
	Xiubo.Lee@gmail.com, festevam@gmail.com,
	linux-kernel@vger.kernel.org, nicoleotsuka@gmail.com,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <VE1PR04MB6479A85982748088CC42901DE3270@VE1PR04MB6479.eurprd04.prod.outlook.com>


On Fri, Apr 19, 2019 at 11:01:21AM +0000, S.j. Wang wrote:

> > fsl_esai_probe(struct platform_device *pdev)
> >                 return ret;
> >         }
> > 
> > +       pm_runtime_enable(&pdev->dev);
> > +

> I just have a question, do I need to add pm_runtime_idle(&pdev->dev)?

It gets used to help drivers get into the correct state on startup; if
you're unsure whether it's 100% required, it shouldn't hurt.



* Re: [PATCH] kernel/crash: make parse_crashkernel()'s return value more indicant
From: Pingfan Liu @ 2019-04-19 14:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Rich Felker, linux-ia64, Julien Thierry, Yangtao Li,
	Palmer Dabbelt, Heiko Carstens, x86, Stefan Agner, linux-mips,
	Paul Mackerras, H. Peter Anvin, linux-s390, Florian Fainelli,
	Yoshinori Sato, linux-sh, David Hildenbrand, Russell King,
	Ingo Molnar, linux-arm-kernel, Catalin Marinas, James Hogan,
	Dave Young, Fenghua Yu, Will Deacon, linuxppc-dev,
	Greg Kroah-Hartman, Borislav Petkov, Hari Bathini, Jens Axboe,
	Tony Luck, Baoquan He, Ard Biesheuvel, Robin Murphy, LKML,
	Ralf Baechle, Thomas Bogendoerfer, Paul Burton, Johannes Weiner,
	Martin Schwidefsky, Andrew Morton, Logan Gunthorpe, Greg Hackmann
In-Reply-To: <alpine.DEB.2.21.1904191009040.3174@nanos.tec.linutronix.de>

On Fri, Apr 19, 2019 at 4:19 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, 19 Apr 2019, Pingfan Liu wrote:
>
> > At present, both return and crash_size should be checked to guarantee the
> > success of parse_crashkernel().
> > Simplify the way by returning negative if fail, positive if success. In
> > case of failure, -EINVAL for bad syntax, -1 for the parsing results in
> > crash_size=0.
>
> I'm not entirely sure what you are trying to say here, but '-1' is not an
> improvement at all. We surely are not short of proper error codes, right?
>
The different negative return values are only used by x86. The option
"crashkernel=X,high", which is only used on x86, causes parse_crashkernel()
to return -EINVAL, which then lets parse_crashkernel_high() have a try.

When parsing crashkernel=size@offset and crashkernel=range1:size1,
there are other failure cases for which it is not worth calling
parse_crashkernel_high() to have a try. That is what "-1" is aiming for.

First, in parse_crashkernel_mem(), the demanded size may be bigger than
system ram; this looks like -ENOMEM, but -ENOMEM is normally used
for allocation failures. Second, in parse_crashkernel_mem(), system ram
may not be inside any range listed by "crashkernel=". Third, crashkernel=0MB
may be given in the option (not in practice, but we cannot forbid the user
from doing so).

All of these cases can be treated as -EINVAL, but it is hard to define
distinct error codes for them.
> Also I don't see any positive return value > 0. So what is this about:
>
Yes. 0 is enough for success. I had thought about returning 1 if
@offset is specified in crashkernel, but at present there is no use case
for it.

> > --- a/arch/ia64/kernel/setup.c
> > +++ b/arch/ia64/kernel/setup.c
> > @@ -277,7 +277,7 @@ static void __init setup_crashkernel(unsigned long total, int *n)
> >
> >       ret = parse_crashkernel(boot_command_line, total,
> >                       &size, &base);
> > -     if (ret == 0 && size > 0) {
> > +     if (ret >= 0) {
>
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^  ????
>
> >       if (!memory_region_available(crash_base, crash_size)) {
> > diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> > index 45a8d0b..0b626e2 100644
> > --- a/arch/powerpc/kernel/fadump.c
> > +++ b/arch/powerpc/kernel/fadump.c
> > @@ -376,7 +376,7 @@ static inline unsigned long fadump_calculate_reserve_size(void)
> >        */
> >       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >                               &size, &base);
> > -     if (ret == 0 && size > 0) {
> > +     if (ret >= 0) {
>
> and this ?
>
> >               unsigned long max_size;
> >
> >               if (fw_dump.reserve_bootvar)
> > diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> > index 63f5a93..9f3e61a 100644
> > --- a/arch/powerpc/kernel/machine_kexec.c
> > +++ b/arch/powerpc/kernel/machine_kexec.c
> > @@ -122,7 +122,7 @@ void __init reserve_crashkernel(void)
> >       /* use common parsing */
> >       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >                       &crash_size, &crash_base);
> > -     if (ret == 0 && crash_size > 0) {
> > +     if (ret >= 0) {
>
> Again.
>
> >               crashk_res.start = crash_base;
> >               crashk_res.end = crash_base + crash_size - 1;
> >       }
> > --- a/arch/sh/kernel/machine_kexec.c
> > +++ b/arch/sh/kernel/machine_kexec.c
> > @@ -157,7 +157,7 @@ void __init reserve_crashkernel(void)
> >
> >       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >                       &crash_size, &crash_base);
> > -     if (ret == 0 && crash_size > 0) {
> > +     if (ret >= 0) {
>
> And some more.
>
> >               crashk_res.start = crash_base;
> >               crashk_res.end = crash_base + crash_size - 1;
> >       }
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 3d872a5..62d07d4 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -526,11 +526,11 @@ static void __init reserve_crashkernel(void)
> >
> >       /* crashkernel=XM */
> >       ret = parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base);
> > -     if (ret != 0 || crash_size <= 0) {
> > +     if (ret == -EINVAL) {
>
> Without an explanation why this proceedes on error codes other than EINVAL
> this is uncomprehensible. Comments exist for a reason.
>
As explained above, this decides whether to let parse_crashkernel_high() try.

> >               /* crashkernel=X,high */
> >               ret = parse_crashkernel_high(boot_command_line, total_mem,
> >                                            &crash_size, &crash_base);
> > -             if (ret != 0 || crash_size <= 0)
> > +             if (ret < 0)
> >                       return;
> >               high = true;
>
> > @@ -87,7 +87,7 @@ static int __init parse_crashkernel_mem(char *cmdline,
> >               cur = tmp;
> >               if (size >= system_ram) {
> >                       pr_warn("crashkernel: invalid size\n");
> > -                     return -EINVAL;
> > +                     return -1;
>
> Well, this is incomprehensible as well. The pr_warn() says invalid and then
> you change the error code to something magic.
>
As explained above, I want to know whether it is worth letting
parse_crashkernel_high() try.

What about the following alternative method? Treat crash_size=0 as
-EINVAL. Then on x86, just call parse_crashkernel_high() blindly to
have a try. Thanks for your review.
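
A rough user-space sketch of that alternative, under loud assumptions: `toy_parse()` and `toy_reserve()` are hypothetical stand-ins that take already-parsed sizes, not the kernel's command-line parsers, and the only rules modelled are "size 0 is -EINVAL" and "size >= system ram is -EINVAL".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a parse result: every failure, including a
 * resulting size of 0, maps to -EINVAL; success returns 0. */
static int toy_parse(unsigned long long requested,
		     unsigned long long system_ram,
		     unsigned long long *crash_size)
{
	assert(crash_size != NULL);
	*crash_size = 0;
	if (requested == 0 || requested >= system_ram)
		return -EINVAL;
	*crash_size = requested;
	return 0;
}

/* x86-style caller: on any failure of the first parse, blindly try the
 * ",high" request; other callers would just check the return value. */
static int toy_reserve(unsigned long long low_req,
		       unsigned long long high_req,
		       unsigned long long system_ram,
		       unsigned long long *crash_size)
{
	int ret = toy_parse(low_req, system_ram, crash_size);

	if (ret < 0)
		ret = toy_parse(high_req, system_ram, crash_size);
	return ret;
}
```

With this shape, callers test only the return value instead of `ret == 0 && size > 0`, and no second magic error code is needed.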

Regards,
Pingfan


* RE: [PATCH 02/13] soc/fsl/bman: map FBPR area in the iommu
From: Laurentiu Tudor @ 2019-04-19 13:41 UTC (permalink / raw)
  To: Robin Murphy, netdev@vger.kernel.org, Madalin-cristian Bucur,
	Roy Pledge, Camelia Alexandra Groza, Leo Li
  Cc: iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, davem@davemloft.net
In-Reply-To: <b747c418-a495-56e0-c106-a8ec8e82ccb9@arm.com>

Hi Robin,

> -----Original Message-----
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Friday, March 29, 2019 4:51 PM
> 
> On 29/03/2019 14:00, laurentiu.tudor@nxp.com wrote:
> > From: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> >
> > Add a one-to-one iommu mapping for bman private data memory (FBPR).
> > This is required for BMAN to work without faults behind an iommu.
> >
> > Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> > ---
> >   drivers/soc/fsl/qbman/bman_ccsr.c | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/bman_ccsr.c
> b/drivers/soc/fsl/qbman/bman_ccsr.c
> > index 7c3cc968053c..b209c79511bb 100644
> > --- a/drivers/soc/fsl/qbman/bman_ccsr.c
> > +++ b/drivers/soc/fsl/qbman/bman_ccsr.c
> > @@ -29,6 +29,7 @@
> >    */
> >
> >   #include "bman_priv.h"
> > +#include <linux/iommu.h>
> >
> >   u16 bman_ip_rev;
> >   EXPORT_SYMBOL(bman_ip_rev);
> > @@ -178,6 +179,7 @@ static int fsl_bman_probe(struct platform_device
> *pdev)
> >   	int ret, err_irq;
> >   	struct device *dev = &pdev->dev;
> >   	struct device_node *node = dev->of_node;
> > +	struct iommu_domain *domain;
> >   	struct resource *res;
> >   	u16 id, bm_pool_cnt;
> >   	u8 major, minor;
> > @@ -225,6 +227,15 @@ static int fsl_bman_probe(struct platform_device
> *pdev)
> >
> >   	dev_dbg(dev, "Allocated FBPR 0x%llx 0x%zx\n", fbpr_a, fbpr_sz);
> >
> > +	/* Create an 1-to-1 iommu mapping for FBPR area */
> > +	domain = iommu_get_domain_for_dev(dev);
> 
> If that's expected to be the default domain that you're grabbing, then
> this is *incredibly* fragile. There's nothing to stop the IOVA that you
> forcibly map from being automatically allocated later and causing some
> other DMA mapping to fail noisily and unexpectedly. Furthermore, have
> you tried this with "iommu.passthrough=1"?
> 
> That said, I really don't understand what's going on here anyway :/
> 
> As far as I can tell from qbman_init_private_mem(), fbpr_a comes from
> dma_alloc_coherent() and thus would already be a mapped IOVA - isn't
> this the stuff that Roy converted to nicely use shared-dma-pool regions
> a while ago?
> 

Finally found some time to look into this, sorry for the delay. It seems
that on the code path taken in our case (dma_alloc_coherent() ->
dma_alloc_attrs() -> dma_alloc_from_dev_coherent() ->
__dma_alloc_from_coherent()) there's no call into the iommu layer, and
thus no mapping in the smmu. I plan to come up with an RFC patch early
next week so we have something concrete to discuss.

---
Best Regards, Laurentiu

^ permalink raw reply

* Re: Linux 5.1-rc5
From: Martin Schwidefsky @ 2019-04-19 13:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, linuxppc-dev, Linux List Kernel Mailing,
	linux-s390
In-Reply-To: <20190418204144.16adf2a0@mschwideX1>

On Thu, 18 Apr 2019 20:41:44 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Thu, 18 Apr 2019 08:49:32 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Thu, Apr 18, 2019 at 1:02 AM Martin Schwidefsky
> > <schwidefsky@de.ibm.com> wrote:  
> > >
> > > The problematic lines in the generic gup code are these three:
> > >
> > > 1845:   pmdp = pmd_offset(&pud, addr);
> > > 1888:   pudp = pud_offset(&p4d, addr);
> > > 1916:   p4dp = p4d_offset(&pgd, addr);
> > >
> > > Passing the pointer of a *copy* of a page table entry to pxd_offset() does
> > > not work with the page table folding on s390.    
> > 
> > Hmm. I wonder why. x86 too does the folding thing for the p4d and pud case.
> > 
> > The folding works with the local copy just the same way it works with
> > > the original value.
> 
> The difference is that with the static page table folding pgd_offset()
> does the index calculation of the actual hardware top-level table. With
> dynamic page table folding as s390 is doing it, if the task does not use
> a 5-level page table pgd_offset() will see a pgd_index() of 0, the indexing
> of the actual top-level table is done later with p4d_offset(), pud_offset()
> or pmd_offset(). 
> 
> As an example, with a three level page table we have three indexes x/y/z.
> The common code "thinks" 5 indexing steps, with static folding the index
> sequence is x 0 0 y z. With dynamic folding the sequence is 0 0 x y z.
> By moving the first indexing operation to pgd_offset the static sequence
> does not add an index to a non-dereferenced pointer to a stack variable,
> the dynamic sequence does.

That problem got stuck in my head and I thought more about it. Why not
emulate the static folding sequence in the s390 page table code?

As the table type is encoded in every entry for the region and segment
tables, pgd_offset() can look at the first entry to find the table type
and then do the correct index calculation for the given top-level table.
Like this:

static inline pgd_t *pgd_offset_raw(pgd_t *pgd, unsigned long address)
{
        unsigned long rste;
        unsigned int shift;

        /* Get the first entry of the top level table */
        rste = pgd_val(*pgd);
        /* Pick up the shift from the table type of the first entry */
        shift = ((rste & _REGION_ENTRY_TYPE_MASK) >> 2) * 11 + 20;
        return pgd + ((address >> shift) & (PTRS_PER_PGD - 1));
}

#define pgd_offset(mm, address) pgd_offset_raw((mm)->pgd, address)
#define pgd_offset_k(address) pgd_offset(&init_mm, address)

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
        if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R1)
                return (p4d_t *) pgd;
        return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
}

static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
{
        if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R2)
                return (pud_t *) p4d;
        return (pud_t *) p4d_deref(*p4d) + pud_index(address);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R3)
                return (pmd_t *) pud;
        return (pmd_t *) pud_deref(*pud) + pmd_index(address);
}

This needs more thorough testing but in principle it does work. The kernel
boots and survives a kernel compile. The only thing that is slightly off is
that pgd_offset() now has to look at the first table entry to do its job.
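
The shift calculation above can be checked outside the kernel with a
small standalone sketch. The macro values (type mask 0x0c, R1/R2/R3 =
0x0c/0x08/0x04, 2048 entries per table) are assumptions meant to mirror
the s390 definitions, and top_level_shift() and top_level_index() are
hypothetical names used only for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the s390 definitions; not copied from the kernel. */
#define REGION_ENTRY_TYPE_MASK	0x0cULL
#define REGION_ENTRY_TYPE_R1	0x0cULL	/* region-first table */
#define REGION_ENTRY_TYPE_R2	0x08ULL	/* region-second table */
#define REGION_ENTRY_TYPE_R3	0x04ULL	/* region-third table */
#define PTRS_PER_TABLE		2048ULL

/* Table type 0..3 maps to shift 20, 31, 42, 53 (11 index bits per level). */
static unsigned int top_level_shift(uint64_t rste)
{
	return (unsigned int)((rste & REGION_ENTRY_TYPE_MASK) >> 2) * 11 + 20;
}

/* Index into the top-level table, whatever its type turns out to be. */
static uint64_t top_level_index(uint64_t rste, uint64_t address)
{
	return (address >> top_level_shift(rste)) & (PTRS_PER_TABLE - 1);
}

/* Hypothetical self-check covering the three region table types. */
void top_level_self_check(void)
{
	assert(top_level_shift(REGION_ENTRY_TYPE_R1) == 53);
	assert(top_level_shift(REGION_ENTRY_TYPE_R2) == 42);
	assert(top_level_shift(REGION_ENTRY_TYPE_R3) == 31);
	assert(top_level_index(REGION_ENTRY_TYPE_R3, 5ULL << 31) == 5);
}
```

With a region-third table as the top level, the first indexing step
uses address bits 31 and up, which is the "x 0 0 y z" static sequence
described earlier in the thread.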

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply

* [alsa-devel] [PATCH V2] ASoC: fsl_esai: Add pm runtime function
From: S.j. Wang @ 2019-04-19 11:01 UTC (permalink / raw)
  To: S.j. Wang, timur@kernel.org, nicoleotsuka@gmail.com,
	Xiubo.Lee@gmail.com, festevam@gmail.com, broonie@kernel.org,
	alsa-devel@alsa-project.org
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org


Hi

> 
> 
> Add pm runtime support and move clock handling there.
> fsl_esai_suspend is replaced by pm_runtime_force_suspend.
> fsl_esai_resume is replaced by pm_runtime_force_resume.
> 
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> ---
> Changes in v2
> -refine the commit comments.
> -move regcache_mark_dirty to runtime suspend.
> 
>  sound/soc/fsl/fsl_esai.c | 141 ++++++++++++++++++++++++++--------------------
> -
>  1 file changed, 77 insertions(+), 64 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c index
> bad0dfed6b68..10d2210c91ef 100644
> --- a/sound/soc/fsl/fsl_esai.c
> +++ b/sound/soc/fsl/fsl_esai.c
> @@ -9,6 +9,7 @@
>  #include <linux/module.h>
>  #include <linux/of_irq.h>
>  #include <linux/of_platform.h>
> +#include <linux/pm_runtime.h>
>  #include <sound/dmaengine_pcm.h>
>  #include <sound/pcm_params.h>
> 
> @@ -466,30 +467,6 @@ static int fsl_esai_startup(struct
> snd_pcm_substream *substream,
>                             struct snd_soc_dai *dai)  {
>         struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
> -       int ret;
> -
> -       /*
> -        * Some platforms might use the same bit to gate all three or two of
> -        * clocks, so keep all clocks open/close at the same time for safety
> -        */
> -       ret = clk_prepare_enable(esai_priv->coreclk);
> -       if (ret)
> -               return ret;
> -       if (!IS_ERR(esai_priv->spbaclk)) {
> -               ret = clk_prepare_enable(esai_priv->spbaclk);
> -               if (ret)
> -                       goto err_spbaclk;
> -       }
> -       if (!IS_ERR(esai_priv->extalclk)) {
> -               ret = clk_prepare_enable(esai_priv->extalclk);
> -               if (ret)
> -                       goto err_extalck;
> -       }
> -       if (!IS_ERR(esai_priv->fsysclk)) {
> -               ret = clk_prepare_enable(esai_priv->fsysclk);
> -               if (ret)
> -                       goto err_fsysclk;
> -       }
> 
>         if (!dai->active) {
>                 /* Set synchronous mode */ @@ -506,16 +483,6 @@ static int
> fsl_esai_startup(struct snd_pcm_substream *substream,
> 
>         return 0;
> 
> -err_fsysclk:
> -       if (!IS_ERR(esai_priv->extalclk))
> -               clk_disable_unprepare(esai_priv->extalclk);
> -err_extalck:
> -       if (!IS_ERR(esai_priv->spbaclk))
> -               clk_disable_unprepare(esai_priv->spbaclk);
> -err_spbaclk:
> -       clk_disable_unprepare(esai_priv->coreclk);
> -
> -       return ret;
>  }
> 
>  static int fsl_esai_hw_params(struct snd_pcm_substream *substream, @@
> -576,20 +543,6 @@ static int fsl_esai_hw_params(struct
> snd_pcm_substream *substream,
>         return 0;
>  }
> 
> -static void fsl_esai_shutdown(struct snd_pcm_substream *substream,
> -                             struct snd_soc_dai *dai)
> -{
> -       struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
> -
> -       if (!IS_ERR(esai_priv->fsysclk))
> -               clk_disable_unprepare(esai_priv->fsysclk);
> -       if (!IS_ERR(esai_priv->extalclk))
> -               clk_disable_unprepare(esai_priv->extalclk);
> -       if (!IS_ERR(esai_priv->spbaclk))
> -               clk_disable_unprepare(esai_priv->spbaclk);
> -       clk_disable_unprepare(esai_priv->coreclk);
> -}
> -
>  static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
>                             struct snd_soc_dai *dai)  { @@ -658,7 +611,6 @@ static int
> fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
> 
>  static const struct snd_soc_dai_ops fsl_esai_dai_ops = {
>         .startup = fsl_esai_startup,
> -       .shutdown = fsl_esai_shutdown,
>         .trigger = fsl_esai_trigger,
>         .hw_params = fsl_esai_hw_params,
>         .set_sysclk = fsl_esai_set_dai_sysclk, @@ -947,6 +899,10 @@ static int
> fsl_esai_probe(struct platform_device *pdev)
>                 return ret;
>         }
> 
> +       pm_runtime_enable(&pdev->dev);
> +

I just have a question, do I need to add pm_runtime_idle(&pdev->dev)?

Best regards
Wang shengjiu

^ permalink raw reply

* [PATCH V4 3/3] ASoC: fsl_asrc: Unify the supported input and output rate
From: S.j. Wang @ 2019-04-19 10:23 UTC (permalink / raw)
  To: timur@kernel.org, nicoleotsuka@gmail.com, Xiubo.Lee@gmail.com,
	festevam@gmail.com, broonie@kernel.org,
	alsa-devel@alsa-project.org
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
In-Reply-To: <cover.1555669068.git.shengjiu.wang@nxp.com>

Unify the supported input and output rates, and add
12kHz/24kHz/128kHz to the supported list.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
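For illustration only, the unified list can be exercised outside the
kernel with a plain membership check. This is a sketch, not driver code:
asrc_rate_supported() is a hypothetical helper, and the rate values are
copied from the hunk below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Rates copied from the unified supported_asrc_rate[] in this patch. */
static const unsigned int supported_asrc_rate[] = {
	5512, 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000,
	64000, 88200, 96000, 128000, 176400, 192000,
};

/* Hypothetical helper: same linear scan the driver uses for validation. */
static bool asrc_rate_supported(unsigned int rate)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(supported_asrc_rate); i++)
		if (rate == supported_asrc_rate[i])
			return true;

	return false;
}

/* Hypothetical self-check: the newly added rates are accepted. */
void asrc_rate_self_check(void)
{
	assert(asrc_rate_supported(12000));
	assert(asrc_rate_supported(24000));
	assert(asrc_rate_supported(128000));
	assert(!asrc_rate_supported(30000));
}
```

Since one list now serves both directions, 5512Hz is accepted as an
output rate as well as an input rate.
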
 sound/soc/fsl/fsl_asrc.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 2c4bbc3499db..0d06e738264a 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -27,13 +27,14 @@
 	dev_dbg(&asrc_priv->pdev->dev, "Pair %c: " fmt, 'A' + index, ##__VA_ARGS__)
 
 /* Corresponding to process_option */
-static int supported_input_rate[] = {
-	5512, 8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200,
-	96000, 176400, 192000,
+static unsigned int supported_asrc_rate[] = {
+	5512, 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000,
+	64000, 88200, 96000, 128000, 176400, 192000,
 };
 
-static int supported_asrc_rate[] = {
-	8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200, 96000, 176400, 192000,
+static struct snd_pcm_hw_constraint_list fsl_asrc_rate_constraints = {
+	.count = ARRAY_SIZE(supported_asrc_rate),
+	.list = supported_asrc_rate,
 };
 
 /**
@@ -293,11 +294,11 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
 	ideal = config->inclk == INCLK_NONE;
 
 	/* Validate input and output sample rates */
-	for (in = 0; in < ARRAY_SIZE(supported_input_rate); in++)
-		if (inrate == supported_input_rate[in])
+	for (in = 0; in < ARRAY_SIZE(supported_asrc_rate); in++)
+		if (inrate == supported_asrc_rate[in])
 			break;
 
-	if (in == ARRAY_SIZE(supported_input_rate)) {
+	if (in == ARRAY_SIZE(supported_asrc_rate)) {
 		pair_err("unsupported input sample rate: %dHz\n", inrate);
 		return -EINVAL;
 	}
@@ -311,7 +312,7 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
 		return -EINVAL;
 	}
 
-	if ((outrate >= 8000 && outrate <= 30000) &&
+	if ((outrate >= 5512 && outrate <= 30000) &&
 	    (outrate > 24 * inrate || inrate > 8 * outrate)) {
 		pair_err("exceed supported ratio range (1/24, 8) for inrate/outrate: %d/%d\n",
 				inrate, outrate);
@@ -490,7 +491,9 @@ static int fsl_asrc_dai_startup(struct snd_pcm_substream *substream,
 		snd_pcm_hw_constraint_step(substream->runtime, 0,
 					   SNDRV_PCM_HW_PARAM_CHANNELS, 2);
 
-	return 0;
+
+	return snd_pcm_hw_constraint_list(substream->runtime, 0,
+			SNDRV_PCM_HW_PARAM_RATE, &fsl_asrc_rate_constraints);
 }
 
 static int fsl_asrc_dai_hw_params(struct snd_pcm_substream *substream,
@@ -603,7 +606,6 @@ static int fsl_asrc_dai_probe(struct snd_soc_dai *dai)
 	return 0;
 }
 
-#define FSL_ASRC_RATES		 SNDRV_PCM_RATE_8000_192000
 #define FSL_ASRC_FORMATS	(SNDRV_PCM_FMTBIT_S24_LE | \
 				 SNDRV_PCM_FMTBIT_S16_LE | \
 				 SNDRV_PCM_FMTBIT_S20_3LE)
@@ -614,14 +616,18 @@ static int fsl_asrc_dai_probe(struct snd_soc_dai *dai)
 		.stream_name = "ASRC-Playback",
 		.channels_min = 1,
 		.channels_max = 10,
-		.rates = FSL_ASRC_RATES,
+		.rate_min = 5512,
+		.rate_max = 192000,
+		.rates = SNDRV_PCM_RATE_KNOT,
 		.formats = FSL_ASRC_FORMATS,
 	},
 	.capture = {
 		.stream_name = "ASRC-Capture",
 		.channels_min = 1,
 		.channels_max = 10,
-		.rates = FSL_ASRC_RATES,
+		.rate_min = 5512,
+		.rate_max = 192000,
+		.rates = SNDRV_PCM_RATE_KNOT,
 		.formats = FSL_ASRC_FORMATS,
 	},
 	.ops = &fsl_asrc_dai_ops,
-- 
1.9.1


^ permalink raw reply related

* [PATCH V4 2/3] ASoC: fsl_asrc: replace the process_option table with function
From: S.j. Wang @ 2019-04-19 10:23 UTC (permalink / raw)
  To: timur@kernel.org, nicoleotsuka@gmail.com, Xiubo.Lee@gmail.com,
	festevam@gmail.com, broonie@kernel.org,
	alsa-devel@alsa-project.org
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
In-Reply-To: <cover.1555669068.git.shengjiu.wang@nxp.com>

To support more sample rates, for example 12kHz/24kHz, we need to
update the process_option table, and the table would need to be
updated again each time another sample rate is added, which is not
flexible.

Replace the table with a function, fsl_asrc_sel_proc, which gives the
pre-processing and post-processing options according to the sample
rates.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
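As a rough sanity check, the new function can be compared against a few
entries of the removed table in a standalone build. This is only a
sketch: the function body is copied from the hunk below, the sampled
{inrate, outrate, pre, post} tuples are read from the removed
process_option table, and count_mismatches() is a hypothetical helper.
Only these sampled entries are checked; nothing is claimed about the
rest of the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copy of fsl_asrc_sel_proc() from this patch, made standalone. */
static int fsl_asrc_sel_proc(int inrate, int outrate,
			     int *pre_proc, int *post_proc)
{
	bool post_proc_cond2;
	bool post_proc_cond0;

	/* select pre_proc between [0, 2] */
	if (inrate * 8 > 33 * outrate)
		*pre_proc = 2;
	else if (inrate * 8 > 15 * outrate) {
		if (inrate > 152000)
			*pre_proc = 2;
		else
			*pre_proc = 1;
	} else if (inrate < 76000)
		*pre_proc = 0;
	else if (inrate > 152000)
		*pre_proc = 2;
	else
		*pre_proc = 1;

	/* Condition for selection of post-processing */
	post_proc_cond2 = (inrate * 15 > outrate * 16 && outrate < 56000) ||
			  (inrate > 56000 && outrate < 56000);
	post_proc_cond0 = inrate * 23 < outrate * 8;

	if (post_proc_cond2)
		*post_proc = 2;
	else if (post_proc_cond0)
		*post_proc = 0;
	else
		*post_proc = 1;

	return 0;
}

/* Sampled entries of the removed table: {inrate, outrate, pre, post}. */
static const int sampled[][4] = {
	{   8000,   8000, 0, 1 },
	{  16000,   8000, 1, 2 },
	{  44100,  48000, 0, 1 },
	{  48000,  44100, 0, 2 },
	{  96000,  48000, 1, 2 },
	{ 192000,   8000, 2, 2 },
	{   8000, 192000, 0, 0 },
};

/* Hypothetical helper: number of sampled entries not reproduced. */
int count_mismatches(void)
{
	int pre, post, bad = 0;
	size_t i;

	for (i = 0; i < sizeof(sampled) / sizeof(sampled[0]); i++) {
		fsl_asrc_sel_proc(sampled[i][0], sampled[i][1], &pre, &post);
		if (pre != sampled[i][2] || post != sampled[i][3])
			bad++;
	}

	return bad;
}
```

The largest intermediate product here is 192000 * 23, which fits
comfortably in a 32-bit int.
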
 sound/soc/fsl/fsl_asrc.c | 75 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 5b8adc7fb117..2c4bbc3499db 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -26,24 +26,6 @@
 #define pair_dbg(fmt, ...) \
 	dev_dbg(&asrc_priv->pdev->dev, "Pair %c: " fmt, 'A' + index, ##__VA_ARGS__)
 
-/* Sample rates are aligned with that defined in pcm.h file */
-static const u8 process_option[][12][2] = {
-	/* 8kHz 11.025kHz 16kHz 22.05kHz 32kHz 44.1kHz 48kHz   64kHz   88.2kHz 96kHz   176kHz  192kHz */
-	{{0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},	/* 5512Hz */
-	{{0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},	/* 8kHz */
-	{{0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},	/* 11025Hz */
-	{{1, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},	/* 16kHz */
-	{{1, 2}, {1, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},	/* 22050Hz */
-	{{1, 2}, {2, 1}, {2, 1}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0},},	/* 32kHz */
-	{{2, 2}, {2, 2}, {2, 1}, {2, 1}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0},},	/* 44.1kHz */
-	{{2, 2}, {2, 2}, {2, 1}, {2, 1}, {0, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0},},	/* 48kHz */
-	{{2, 2}, {2, 2}, {2, 2}, {2, 1}, {1, 2}, {0, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0},},	/* 64kHz */
-	{{2, 2}, {2, 2}, {2, 2}, {2, 2}, {1, 2}, {1, 2}, {1, 2}, {1, 1}, {1, 1}, {1, 1}, {1, 1}, {1, 1},},	/* 88.2kHz */
-	{{2, 2}, {2, 2}, {2, 2}, {2, 2}, {1, 2}, {1, 2}, {1, 2}, {1, 1}, {1, 1}, {1, 1}, {1, 1}, {1, 1},},	/* 96kHz */
-	{{2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 1}, {2, 1}, {2, 1}, {2, 1}, {2, 1},},	/* 176kHz */
-	{{2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 1}, {2, 1}, {2, 1}, {2, 1}, {2, 1},},	/* 192kHz */
-};
-
 /* Corresponding to process_option */
 static int supported_input_rate[] = {
 	5512, 8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200,
@@ -80,6 +62,51 @@
 static unsigned char *clk_map[2];
 
 /**
+ * Select the pre-processing and post-processing options
+ * Unsupport cases: Tsout > 8.125 * Tsin, Tsout > 16.125 * Tsin
+ *
+ * inrate: input sample rate
+ * outrate: output sample rate
+ * pre_proc: return value for pre-processing option
+ * post_proc: return value for post-processing option
+ */
+static int fsl_asrc_sel_proc(int inrate, int outrate,
+			     int *pre_proc, int *post_proc)
+{
+	bool post_proc_cond2;
+	bool post_proc_cond0;
+
+	/* select pre_proc between [0, 2] */
+	if (inrate * 8 > 33 * outrate)
+		*pre_proc = 2;
+	else if (inrate * 8 > 15 * outrate) {
+		if (inrate > 152000)
+			*pre_proc = 2;
+		else
+			*pre_proc = 1;
+	} else if (inrate < 76000)
+		*pre_proc = 0;
+	else if (inrate > 152000)
+		*pre_proc = 2;
+	else
+		*pre_proc = 1;
+
+	/* Condition for selection of post-processing */
+	post_proc_cond2 = (inrate * 15 > outrate * 16 && outrate < 56000) ||
+			  (inrate > 56000 && outrate < 56000);
+	post_proc_cond0 = inrate * 23 < outrate * 8;
+
+	if (post_proc_cond2)
+		*post_proc = 2;
+	else if (post_proc_cond0)
+		*post_proc = 0;
+	else
+		*post_proc = 1;
+
+	return 0;
+}
+
+/**
  * Request ASRC pair
  *
  * It assigns pair by the order of A->C->B because allocation of pair B,
@@ -239,8 +266,10 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
 	u32 inrate, outrate, indiv, outdiv;
 	u32 clk_index[2], div[2];
 	int in, out, channels;
+	int pre_proc, post_proc;
 	struct clk *clk;
 	bool ideal;
+	int ret;
 
 	if (!config) {
 		pair_err("invalid pair config\n");
@@ -289,6 +318,12 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
 		return -EINVAL;
 	}
 
+	ret = fsl_asrc_sel_proc(inrate, outrate, &pre_proc, &post_proc);
+	if (ret) {
+		pair_err("No supported pre-processing options\n");
+		return ret;
+	}
+
 	/* Validate input and output clock sources */
 	clk_index[IN] = clk_map[IN][config->inclk];
 	clk_index[OUT] = clk_map[OUT][config->outclk];
@@ -380,8 +415,8 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
 	/* Apply configurations for pre- and post-processing */
 	regmap_update_bits(asrc_priv->regmap, REG_ASRCFG,
 			   ASRCFG_PREMODi_MASK(index) |	ASRCFG_POSTMODi_MASK(index),
-			   ASRCFG_PREMOD(index, process_option[in][out][0]) |
-			   ASRCFG_POSTMOD(index, process_option[in][out][1]));
+			   ASRCFG_PREMOD(index, pre_proc) |
+			   ASRCFG_POSTMOD(index, post_proc));
 
 	return fsl_asrc_set_ideal_ratio(pair, inrate, outrate);
 }
-- 
1.9.1


^ permalink raw reply related

* [PATCH V4 1/3] ASoC: fsl_asrc: Fix the issue about unsupported rate
From: S.j. Wang @ 2019-04-19 10:23 UTC (permalink / raw)
  To: timur@kernel.org, nicoleotsuka@gmail.com, Xiubo.Lee@gmail.com,
	festevam@gmail.com, broonie@kernel.org,
	alsa-devel@alsa-project.org
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
In-Reply-To: <cover.1555669068.git.shengjiu.wang@nxp.com>

When the output sample rate is in [8kHz, 30kHz], the supported
ratio range is limited to (1/24, 8). The driver checks
(8kHz, 30kHz) instead of [8kHz, 30kHz].
Fix this, along with the potential rounding issue caused by
integer division.

Fixes: fff6e03c7b65 ("ASoC: fsl_asrc: add support for 8-30kHz
output sample rate")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
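A standalone sketch of the two checks, to make the difference concrete.
The function names old_ratio_rejected() and new_ratio_rejected() are
hypothetical; the conditions are copied from the removed and added
lines of the hunk below. Rates are assumed to be non-zero, as they are
validated earlier in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Check before this patch: open interval and integer division. */
static bool old_ratio_rejected(unsigned int inrate, unsigned int outrate)
{
	return (outrate > 8000 && outrate < 30000) &&
	       (outrate / inrate > 24 || inrate / outrate > 8);
}

/* Check after this patch: closed interval and multiplication. */
static bool new_ratio_rejected(unsigned int inrate, unsigned int outrate)
{
	return (outrate >= 8000 && outrate <= 30000) &&
	       (outrate > 24 * inrate || inrate > 8 * outrate);
}

/* Hypothetical self-check for the two cases the patch changes. */
void ratio_self_check(void)
{
	/* 96000 / 11025 truncates to 8, so the old check let it through. */
	assert(!old_ratio_rejected(96000, 11025));
	assert(new_ratio_rejected(96000, 11025));

	/* outrate == 8000 fell outside the old open interval. */
	assert(!old_ratio_rejected(192000, 8000));
	assert(new_ratio_rejected(192000, 8000));
}
```

A ratio of exactly 8, such as 88200/11025, is accepted by both
versions.
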
 sound/soc/fsl/fsl_asrc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 0b937924d2e4..5b8adc7fb117 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -282,10 +282,10 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
 		return -EINVAL;
 	}
 
-	if ((outrate > 8000 && outrate < 30000) &&
-	    (outrate/inrate > 24 || inrate/outrate > 8)) {
-		pair_err("exceed supported ratio range [1/24, 8] for \
-				inrate/outrate: %d/%d\n", inrate, outrate);
+	if ((outrate >= 8000 && outrate <= 30000) &&
+	    (outrate > 24 * inrate || inrate > 8 * outrate)) {
+		pair_err("exceed supported ratio range (1/24, 8) for inrate/outrate: %d/%d\n",
+				inrate, outrate);
 		return -EINVAL;
 	}
 
-- 
1.9.1


^ permalink raw reply related

* [PATCH V4 0/3] Support more sample rate in asrc
From: S.j. Wang @ 2019-04-19 10:23 UTC (permalink / raw)
  To: timur@kernel.org, nicoleotsuka@gmail.com, Xiubo.Lee@gmail.com,
	festevam@gmail.com, broonie@kernel.org,
	alsa-devel@alsa-project.org
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org

Support more sample rates in asrc

Shengjiu Wang (3):
  ASoC: fsl_asrc: Fix the issue about unsupported rate
  ASoC: fsl_asrc: replace the process_option table with function
  ASoC: fsl_asrc: Unify the supported input and output rate

Changes in v4
- add patch to Fix the [8kHz, 30kHz] open set issue.

Changes in v3
- remove FSL_ASRC_RATES
- refine fsl_asrc_sel_proc according to comments

Changes in v2
- add more comments in code
- add commit "Unify the supported input and output rate"

 sound/soc/fsl/fsl_asrc.c | 113 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 77 insertions(+), 36 deletions(-)

-- 
1.9.1


^ permalink raw reply

* Re: [PATCH V3 1/2] ASoC: fsl_asrc: replace the process_option table with function
From: S.j. Wang @ 2019-04-19 10:21 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: alsa-devel@alsa-project.org, timur@kernel.org,
	Xiubo.Lee@gmail.com, festevam@gmail.com,
	linux-kernel@vger.kernel.org, broonie@kernel.org,
	linuxppc-dev@lists.ozlabs.org

Hi

> 
> 
> On Thu, Apr 18, 2019 at 09:37:06AM +0000, S.j. Wang wrote:
> > > > > And this is according to IMX6DQRM:
> > > > >     Limited support for the case when output sampling rates is
> > > > >     between 8kHz and 30kHz. The limitation is the supported ratio
> > > > >     (Fsin/Fsout) range as between 1/24 to 8
> > > > >
> > > > > This should cover your 8.125 condition already, even if having
> > > > > an outrate range between [8KHz, 30KHz] check, since an outrate
> > > > > above 30KHz will not have an inrate bigger than 8.125 times of
> > > > > it, given the maximum input rate is 192KHz.
> > > > >
> > > > > So I think that we can just drop that 8.125 condition from your
> > > > > change and there's no need to error out any more.
> > > > >
> > > > No, if outrate=8kHz,  inrate > 88.2kHz, these cases are not supported.
> > > > This is not covered by
> > > >
> > > >         if ((outrate > 8000 && outrate < 30000) &&
> > > >             (outrate/inrate > 24 || inrate/outrate > 8)) {
> > >
> > > Good catch. The range should be [8KHz, 30KHz] vs. (8KHz, 32KHz) in
> > > the code. Then I think the fix should be at both lines:
> > >
> > > -         if ((outrate > 8000 && outrate < 30000) &&
> > > -             (outrate/inrate > 24 || inrate/outrate > 8)) {
> > > +         if ((outrate >= 8000 && outrate <= 30000) &&
> > > +             (outrate > 24 * inrate || inrate > 8 * outrate)) {
> > >
> > > Overall, I think we should fix this instead of adding an extra one,
> > > since it is very likely saying the same thing.
> >
> > Actually if outrate < 8kHz, there will be issue too.
> 
> Here is the thing, the RM doesn't explicitly state that ASRC can support a
> lower output sample rate than 8KHz. And I actually had a concern when
> reviewing your PATCH-2, as the table of supported output sample rate no
> longer matches RM.
> 
> If you've verified a lower output sample rate working solid with the
> process_option function, that means our driver can go beyond the
> limitation mentioned in the RM, then I believe [8KHz, 32KHz] should be
> updated too -- that says we can do:
> -       if ((outrate > 8000 && outrate < 30000) &&
> -           (outrate/inrate > 24 || inrate/outrate > 8)) {
> +       if ((outrate >= 5512 && outrate <= 30000) &&
> +           (outrate > 24 * inrate || inrate > 8 * outrate)) {
> 
> Actually "outrate > 24 * inrate" is kind of pointless for range [5KHz, 32KHz]
> but we can keep it since it matches RM.

Ok, will send v4.

Best regards
Wang shengjiu

^ permalink raw reply

* Re: [PATCH] powerpc/dts/fsl: add crypto node alias for B4
From: Horia Geanta @ 2019-04-19  9:54 UTC (permalink / raw)
  To: Rob Herring, Mark Rutland
  Cc: Scott Wood, devicetree@vger.kernel.org, Paul Mackerras,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20190320125516.12277-1-horia.geanta@nxp.com>

Has this slipped through?

On 3/20/2019 2:57 PM, Horia Geantă wrote:
> crypto node alias is needed by U-boot to identify the node and
> perform fix-ups, like adding "fsl,sec-era" property.
> 
> Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
> ---
>  arch/powerpc/boot/dts/fsl/b4qds.dtsi | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/boot/dts/fsl/b4qds.dtsi b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
> index 999efd3bc167..05be919f3545 100644
> --- a/arch/powerpc/boot/dts/fsl/b4qds.dtsi
> +++ b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
> @@ -40,6 +40,7 @@
>  	interrupt-parent = <&mpic>;
>  
>  	aliases {
> +		crypto = &crypto;
>  		phy_sgmii_10 = &phy_sgmii_10;
>  		phy_sgmii_11 = &phy_sgmii_11;
>  		phy_sgmii_1c = &phy_sgmii_1c;
> 

^ permalink raw reply

* [PATCH v2 11/11] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.

The idea is obviously arch-agnostic. This commit moves the config
entry from arch/x86/Kconfig.debug to lib/Kconfig.debug so that all
architectures can benefit from it.

This can make a huge difference in kernel image size, especially when
CONFIG_CC_OPTIMIZE_FOR_SIZE is enabled.

For example, I got 3.5% smaller arm64 kernel for v5.1-rc1.

  dec       file
  18983424  arch/arm64/boot/Image.before
  18321920  arch/arm64/boot/Image.after

This also slightly improves the "Kernel hacking" Kconfig menu as
e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen") suggested;
this config option would be a good fit in the "compiler option" menu.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - split into a separate patch
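
For readers unfamiliar with the mechanism, here is a minimal userspace
sketch of what the compiler_types.h hunk below toggles. It assumes GCC
or Clang. DEMO_OPTIMIZE_INLINING, demo_inline and highest_set_bit() are
hypothetical names; the kernel redefines the inline keyword itself,
which this sketch avoids.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for CONFIG_OPTIMIZE_INLINING. Without it,
 * every "inline" function is forced inline; with it (build with
 * -DDEMO_OPTIMIZE_INLINING), the compiler is free to decide.
 */
#ifndef DEMO_OPTIMIZE_INLINING
#define demo_inline inline __attribute__((__always_inline__))
#else
#define demo_inline inline
#endif

/* Position of the most significant set bit, or -1 if word is 0. */
static demo_inline int highest_set_bit(unsigned long word)
{
	int pos = -1;

	while (word) {
		word >>= 1;
		pos++;
	}

	return pos;
}

/* Hypothetical self-check; the result is the same in both modes. */
void highest_set_bit_self_check(void)
{
	assert(highest_set_bit(0UL) == -1);
	assert(highest_set_bit(1UL) == 0);
	assert(highest_set_bit(0x80UL) == 7);
}
```

The result does not depend on the macro; only the generated code does,
which is why functions that must be inlined for correctness need an
explicit __always_inline once the option is available everywhere.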

 arch/x86/Kconfig               |  3 ---
 arch/x86/Kconfig.debug         | 14 --------------
 include/linux/compiler_types.h |  3 +--
 lib/Kconfig.debug              | 14 ++++++++++++++
 4 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5ad92419be19..9e93d109a6cb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -310,9 +310,6 @@ config ZONE_DMA32
 config AUDIT_ARCH
 	def_bool y if X86_64
 
-config ARCH_SUPPORTS_OPTIMIZED_INLINING
-	def_bool y
-
 config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	def_bool y
 
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 15d0fbe27872..f730680dc818 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -266,20 +266,6 @@ config CPA_DEBUG
 	---help---
 	  Do change_page_attr() self-tests every 30 seconds.
 
-config OPTIMIZE_INLINING
-	bool "Allow gcc to uninline functions marked 'inline'"
-	---help---
-	  This option determines if the kernel forces gcc to inline the functions
-	  developers have marked 'inline'. Doing so takes away freedom from gcc to
-	  do what it thinks is best, which is desirable for the gcc 3.x series of
-	  compilers. The gcc 4.x series have a rewritten inlining algorithm and
-	  enabling this option will generate a smaller kernel there. Hopefully
-	  this algorithm is so good that allowing gcc 4.x and above to make the
-	  decision will become the default in the future. Until then this option
-	  is there to test gcc for this.
-
-	  If unsure, say N.
-
 config DEBUG_ENTRY
 	bool "Debug low-level entry code"
 	depends on DEBUG_KERNEL
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index ba814f18cb4c..19e58b9138a0 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -140,8 +140,7 @@ struct ftrace_likely_data {
  * Do not use __always_inline here, since currently it expands to inline again
  * (which would break users of __always_inline).
  */
-#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
-	!defined(CONFIG_OPTIMIZE_INLINING)
+#if !defined(CONFIG_OPTIMIZE_INLINING)
 #define inline inline __attribute__((__always_inline__)) __gnu_inline \
 	__maybe_unused notrace
 #else
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0d9e81779e37..f8f284f46c85 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -310,6 +310,20 @@ config HEADERS_CHECK
 	  exported to $(INSTALL_HDR_PATH) (usually 'usr/include' in
 	  your build tree), to make sure they're suitable.
 
+config OPTIMIZE_INLINING
+	bool "Allow compiler to uninline functions marked 'inline'"
+	help
+	  This option determines if the kernel forces gcc to inline the functions
+	  developers have marked 'inline'. Doing so takes away freedom from gcc to
+	  do what it thinks is best, which is desirable for the gcc 3.x series of
+	  compilers. The gcc 4.x series have a rewritten inlining algorithm and
+	  enabling this option will generate a smaller kernel there. Hopefully
+	  this algorithm is so good that allowing gcc 4.x and above to make the
+	  decision will become the default in the future. Until then this option
+	  is there to test gcc for this.
+
+	  If unsure, say N.
+
 config DEBUG_SECTION_MISMATCH
 	bool "Enable full Section mismatch analysis"
 	help
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 06/11] MIPS: mark __fls() as __always_inline
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for mips, the following errors are reported:

arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined references to `mips_gcr_base'

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - new patch
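
For readers unfamiliar with the helper touched here, the contract stated in
the comment of the hunk below (bit position of the most significant 1 bit,
-1 if none) can be sketched in portable C. demo_fls() is a hypothetical
name and this is not the MIPS implementation; it returns int so that -1 is
representable.

```c
/*
 * Index of the most significant set bit in 'word', or -1 when no bit is
 * set. Plain shift loop, no architecture-specific instructions.
 */
static inline int demo_fls(unsigned long word)
{
	int pos = -1;

	while (word) {
		word >>= 1;
		pos++;
	}
	return pos;
}
```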

 arch/mips/include/asm/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 830c93a010c3..6a26ead1c2b6 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -482,7 +482,7 @@ static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *
  * Return the bit position (0..63) of the most significant 1 bit in a word
  * Returns -1 if no 1 bit exists
  */
-static inline unsigned long __fls(unsigned long word)
+static __always_inline unsigned long __fls(unsigned long word)
 {
 	int num;
 
-- 
2.17.1



* [PATCH v2 10/11] powerpc/mm/radix: mark __tlbie_pid() and friends as __always_inline
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following errors are reported:

arch/powerpc/mm/tlb-radix.c: In function '__tlbie_lpid':
arch/powerpc/mm/tlb-radix.c:148:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
  ^~~
arch/powerpc/mm/tlb-radix.c:148:2: error: impossible constraint in 'asm'
arch/powerpc/mm/tlb-radix.c: In function '__tlbie_pid':
arch/powerpc/mm/tlb-radix.c:118:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
  ^~~
arch/powerpc/mm/tlb-radix.c: In function '__tlbiel_pid':
arch/powerpc/mm/tlb-radix.c:104:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  ^~~

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - new patch
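
The "impossible constraint" diagnostics above are what GCC emits when an
operand with an immediate constraint (such as "i") is not a compile-time
constant. Assuming that is the situation for operand 3 here, the value only
becomes constant once the helper is inlined into a caller that passes a
literal, which is why forcing inlining helps. The sketch below shows the
constraint with a literal, where it can always be satisfied.
DEMO_RIC_FLUSH_ALL and DEMO_TLBIE are hypothetical names, and the asm
template is intentionally empty so that it builds on any architecture.

```c
/* Hypothetical value, for illustration only. */
#define DEMO_RIC_FLUSH_ALL 2

/*
 * Empty asm template: nothing is executed, but the "i" constraint still
 * requires 'ric' to be a compile-time constant (GCC/Clang extension).
 */
#define DEMO_TLBIE(ric) __asm__ volatile("" : : "i"(ric) : "memory")

static inline int demo_flush_all(void)
{
	/*
	 * A literal always satisfies "i". A plain function parameter would
	 * only do so after being inlined into a caller passing a constant.
	 */
	DEMO_TLBIE(DEMO_RIC_FLUSH_ALL);
	return DEMO_RIC_FLUSH_ALL;
}
```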

 arch/powerpc/mm/tlb-radix.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index a2b2848f0ae3..14ff414d1545 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -90,8 +90,8 @@ void radix__tlbiel_all(unsigned int action)
 	asm volatile(PPC_INVALIDATE_ERAT "; isync" : : :"memory");
 }
 
-static inline void __tlbiel_pid(unsigned long pid, int set,
-				unsigned long ric)
+static __always_inline void __tlbiel_pid(unsigned long pid, int set,
+					 unsigned long ric)
 {
 	unsigned long rb,rs,prs,r;
 
@@ -106,7 +106,7 @@ static inline void __tlbiel_pid(unsigned long pid, int set,
 	trace_tlbie(0, 1, rb, rs, ric, prs, r);
 }
 
-static inline void __tlbie_pid(unsigned long pid, unsigned long ric)
+static __always_inline void __tlbie_pid(unsigned long pid, unsigned long ric)
 {
 	unsigned long rb,rs,prs,r;
 
@@ -136,7 +136,7 @@ static inline void __tlbiel_lpid(unsigned long lpid, int set,
 	trace_tlbie(lpid, 1, rb, rs, ric, prs, r);
 }
 
-static inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
+static __always_inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
 {
 	unsigned long rb,rs,prs,r;
 
@@ -239,7 +239,7 @@ static inline void fixup_tlbie_lpid(unsigned long lpid)
 /*
  * We use 128 set in radix mode and 256 set in hpt mode.
  */
-static inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
+static __always_inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
 {
 	int set;
 
-- 
2.17.1



* [PATCH v2 05/11] mtd: rawnand: vf610_nfc: add initializer to avoid -Wmaybe-uninitialized
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

The kbuild test robot has never reported a -Wmaybe-uninitialized warning
for this, probably because vf610_nfc_run() is inlined by the x86
compiler's inlining heuristic.

If CONFIG_OPTIMIZE_INLINING is enabled for a different architecture
and vf610_nfc_run() is not inlined, the following warning is reported:

drivers/mtd/nand/raw/vf610_nfc.c: In function ‘vf610_nfc_cmd’:
drivers/mtd/nand/raw/vf610_nfc.c:455:3: warning: ‘offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   vf610_nfc_rd_from_sram(instr->ctx.data.buf.in + offset,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            nfc->regs + NFC_MAIN_AREA(0) + offset,
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            trfr_sz, !nfc->data_access);
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - split into a separate patch
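
The pattern behind this kind of warning can be sketched as follows: a
variable is written on only some paths and read under a condition that
always coincides with those paths, but the compiler cannot always prove
that. This is a simplified stand-in, not the driver code; demo_cmd() and
the enum are hypothetical.

```c
#include <stddef.h>

enum demo_op { DEMO_OP_CMD, DEMO_OP_DATA_IN };

/*
 * 'offset' is assigned only when a data instruction is seen and read only
 * when 'trfr_sz' is non-zero. The two always go together, but a compiler
 * may fail to prove that, so 'offset' gets an initializer.
 */
static int demo_cmd(const enum demo_op *ops, size_t n)
{
	int trfr_sz = 0, offset = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ops[i] == DEMO_OP_DATA_IN) {
			trfr_sz = 4;
			offset = (int)i;
		}
	}

	if (trfr_sz)
		return offset;	/* defined on every path thanks to the initializer */
	return -1;
}
```

Dropping the "= 0" on 'offset' leaves the behaviour of this sketch
unchanged, which is the sense in which the warning is a false positive;
whether a given compiler warns depends on its version and flags.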

 drivers/mtd/nand/raw/vf610_nfc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/raw/vf610_nfc.c b/drivers/mtd/nand/raw/vf610_nfc.c
index a662ca1970e5..19792d725ec2 100644
--- a/drivers/mtd/nand/raw/vf610_nfc.c
+++ b/drivers/mtd/nand/raw/vf610_nfc.c
@@ -364,7 +364,7 @@ static int vf610_nfc_cmd(struct nand_chip *chip,
 {
 	const struct nand_op_instr *instr;
 	struct vf610_nfc *nfc = chip_to_nfc(chip);
-	int op_id = -1, trfr_sz = 0, offset;
+	int op_id = -1, trfr_sz = 0, offset = 0;
 	u32 col = 0, row = 0, cmd1 = 0, cmd2 = 0, code = 0;
 	bool force8bit = false;
 
-- 
2.17.1



* [PATCH v2 09/11] powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following error is reported:

arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't match constraints [-Werror]
  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  ^~~
arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - split into a separate patch

 arch/powerpc/mm/tlb-radix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 6a23b9ebd2a1..a2b2848f0ae3 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -928,7 +928,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
 	tlb->need_flush_all = 0;
 }
 
-static inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
+static __always_inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
 				unsigned long start, unsigned long end,
 				int psize, bool also_pwc)
 {
-- 
2.17.1



* [PATCH v2 07/11] ARM: mark setup_machine_tags() stub as __init __noreturn
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for arm, a Clang build results in the following modpost
warning:

WARNING: vmlinux.o(.text+0x1124): Section mismatch in reference from the function setup_machine_tags() to the function .init.text:early_print()
The function setup_machine_tags() references
the function __init early_print().
This is often because setup_machine_tags lacks a __init
annotation or the annotation of early_print is wrong.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - new patch

 arch/arm/kernel/atags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
index 201100226301..067e12edc341 100644
--- a/arch/arm/kernel/atags.h
+++ b/arch/arm/kernel/atags.h
@@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
 const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
 	unsigned int machine_nr);
 #else
-static inline const struct machine_desc *
+static inline const struct machine_desc * __init __noreturn
 setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
 {
 	early_print("no ATAGS support: can't continue\n");
-- 
2.17.1


* [PATCH v2 08/11] powerpc/prom_init: mark prom_getprop() and prom_getproplen() as __init
From: Masahiro Yamada @ 2019-04-19  9:47 UTC (permalink / raw)
  To: Andrew Morton, linux-arch
  Cc: linux-s390, Arnd Bergmann, x86, Heiko Carstens, linux-mips,
	linux-kernel, Masahiro Yamada, Ingo Molnar, linux-mtd,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190419094754.24667-1-yamada.masahiro@socionext.com>

This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following modpost warnings are
reported:

WARNING: vmlinux.o(.text.unlikely+0x20): Section mismatch in reference from the function .prom_getprop() to the function .init.text:.call_prom()
The function .prom_getprop() references
the function __init .call_prom().
This is often because .prom_getprop lacks a __init
annotation or the annotation of .call_prom is wrong.

WARNING: vmlinux.o(.text.unlikely+0x3c): Section mismatch in reference from the function .prom_getproplen() to the function .init.text:.call_prom()
The function .prom_getproplen() references
the function __init .call_prom().
This is often because .prom_getproplen lacks a __init
annotation or the annotation of .call_prom is wrong.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - split into a separate patch
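
The modpost warnings above arise when an out-of-line copy of a helper ends
up in a regular text section while the function it calls lives in
.init.text, which the kernel frees after boot. A minimal user-space sketch
of section placement follows, assuming an ELF target built with GCC or
Clang. demo_init and the function names are hypothetical, the macro only
approximates what an __init-style annotation does, and nothing is actually
discarded here.

```c
#if defined(__ELF__)
/* Roughly what an __init-style annotation does: pick the output section. */
#define demo_init __attribute__((__section__(".init.text")))
#else
#define demo_init
#endif

/* Lives in .init.text (on ELF). */
static int demo_init demo_call_prom(int node)
{
	return node + 1;
}

/*
 * Annotated as well, so that an out-of-line copy also lands in .init.text
 * instead of sitting in the regular text section next to long-lived code.
 */
static inline int demo_init demo_getproplen(int node)
{
	return demo_call_prom(node);
}
```

On an ELF system, "objdump -t" on the resulting object should list any
emitted copies of both functions under .init.text.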

 arch/powerpc/kernel/prom_init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index f33ff4163a51..241fe6b7a8cc 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -501,14 +501,14 @@ static int __init prom_next_node(phandle *nodep)
 	}
 }

-static inline int prom_getprop(phandle node, const char *pname,
-			       void *value, size_t valuelen)
+static inline int __init prom_getprop(phandle node, const char *pname,
+				      void *value, size_t valuelen)
 {
 	return call_prom("getprop", 4, 1, node, ADDR(pname),
 			 (u32)(unsigned long) value, (u32) valuelen);
 }

-static inline int prom_getproplen(phandle node, const char *pname)
+static inline int __init prom_getproplen(phandle node, const char *pname)
 {
 	return call_prom("getproplen", 2, 1, node, ADDR(pname));
 }
-- 
2.17.1

