LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v2] powerpc/32: remove bogus ppc_select syscall
From: Christophe Leroy @ 2021-03-04 15:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linuxppc-dev, linux-kernel@vger.kernel.org, Paul Mackerras,
	halesh.sadashiv
In-Reply-To: <CAK8P3a2=qOG1iLhw2fi=r128bRMdfNx4BseXONiS7vrnbVvr6w@mail.gmail.com>



Le 04/03/2021 à 16:17, Arnd Bergmann a écrit :
> On Thu, Mar 4, 2021 at 1:51 PM Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>> From: Arnd Bergmann <arnd@arndb.de>
>>
>> The ppc_select function was introduced in linux-2.3.48 in order to support
>> code confusing the legacy select() calling convention with the standard one.
>> Even 24 years ago, all correctly built code should not have done this and
>> could have easily been phased out. Nothing that was compiled later should
>> actually try to use the old_select interface, and it would have been broken
>> already on all ppc64 kernels with the syscall emulation layer.
>>
>> This patch brings the 32 bit compat ABI and the native 32 bit ABI for
>> powerpc into a consistent state, by removing support for both the
>> old_select system call number and the handler for it.
>>
>> The bug report triggering this came from
>> Halesh Sadashiiv <halesh.sadashiv@ap.sony.com>, who discovered that the
>> 32 bit implementation of ppc_select would in case of a negative number
>> of file descriptors incorrectly return -EFAULT instead of -EINVAL.
>> There seems to be no way to fix this problem in a way that would
>> keep broken pre-1997 binaries running.
>>
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> Cc: Halesh Sadashiiv <halesh.sadashiv@ap.sony.com>
>> [chleroy: Rebased and updated the number of years elapsed in the commit message]
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>> First version was in 2008, at that time it was rejected, see
>> http://patchwork.ozlabs.org/project/linuxppc-dev/patch/200809240839.14902.arnd@arndb.de/
> 
> The patch from 2008 did two things:
> 
> - it removed the ppc32 specific 'select' syscall at #82
> - it fixed the generic '_newselect' syscall at #142
> 
> Back then, the decision was to only address the second issue, which
> got merged in commit dad2f2fb0fc7 ("powerpc: Fix wrong error code from
> ppc32 select syscall").
> 
> It is probably ok to remove the old select system call now, but
> my changelog text no longer makes sense, as the patch has nothing
> to do with the bug that was reported back then.
> 

I understood that the original reported bug was that calling that version of select() with a 
negative value as first parametre would lead to a -EFAULT instead of -EINVAL. That's exactly the 
case here, if you set n = -1 you get into this (unsigned long)n > 4096 then the buffer is at 
0xffffffff and access_ok() won't grand access to it so the return value will be -EFAULT instead of 
-EINVAL.

Am I missing something ?

Christophe

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Marco Elver @ 2021-03-04 15:30 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linuxppc-dev, Linux ARM
In-Reply-To: <20210304145730.GC54534@C02TD0UTHF1T.local>

On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> [adding Mark Brown]
>
> On Wed, Mar 03, 2021 at 04:20:43PM +0100, Marco Elver wrote:
> > On Wed, Mar 03, 2021 at 03:52PM +0100, Christophe Leroy wrote:
> > > Le 03/03/2021 ï¿½ 15:38, Marco Elver a ï¿½critï¿½:
> > > > On Wed, 3 Mar 2021 at 15:09, Christophe Leroy
> > > > <christophe.leroy@csgroup.eu> wrote:
> > > > >
> > > > > It seems like all other sane architectures, namely x86 and arm64
> > > > > at least, include the running function as top entry when saving
> > > > > stack trace.
> > > > >
> > > > > Functionnalities like KFENCE expect it.
> > > > >
> > > > > Do the same on powerpc, it allows KFENCE to properly identify the faulting
> > > > > function as depicted below. Before the patch KFENCE was identifying
> > > > > finish_task_switch.isra as the faulting function.
> > > > >
> > > > > [   14.937370] ==================================================================
> > > > > [   14.948692] BUG: KFENCE: invalid read in test_invalid_access+0x54/0x108
> > > > > [   14.948692]
> > > > > [   14.956814] Invalid read at 0xdf98800a:
> > > > > [   14.960664]  test_invalid_access+0x54/0x108
> > > > > [   14.964876]  finish_task_switch.isra.0+0x54/0x23c
> > > > > [   14.969606]  kunit_try_run_case+0x5c/0xd0
> > > > > [   14.973658]  kunit_generic_run_threadfn_adapter+0x24/0x30
> > > > > [   14.979079]  kthread+0x15c/0x174
> > > > > [   14.982342]  ret_from_kernel_thread+0x14/0x1c
> > > > > [   14.986731]
> > > > > [   14.988236] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: G    B             5.12.0-rc1-01537-g95f6e2088d7e-dirty #4682
> > > > > [   14.999795] NIP:  c016ec2c LR: c02f517c CTR: c016ebd8
> > > > > [   15.004851] REGS: e2449d90 TRAP: 0301   Tainted: G    B              (5.12.0-rc1-01537-g95f6e2088d7e-dirty)
> > > > > [   15.015274] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 22000004  XER: 00000000
> > > > > [   15.022043] DAR: df98800a DSISR: 20000000
> > > > > [   15.022043] GPR00: c02f517c e2449e50 c1142080 e100dd24 c084b13c 00000008 c084b32b c016ebd8
> > > > > [   15.022043] GPR08: c0850000 df988000 c0d10000 e2449eb0 22000288
> > > > > [   15.040581] NIP [c016ec2c] test_invalid_access+0x54/0x108
> > > > > [   15.046010] LR [c02f517c] kunit_try_run_case+0x5c/0xd0
> > > > > [   15.051181] Call Trace:
> > > > > [   15.053637] [e2449e50] [c005a68c] finish_task_switch.isra.0+0x54/0x23c (unreliable)
> > > > > [   15.061338] [e2449eb0] [c02f517c] kunit_try_run_case+0x5c/0xd0
> > > > > [   15.067215] [e2449ed0] [c02f648c] kunit_generic_run_threadfn_adapter+0x24/0x30
> > > > > [   15.074472] [e2449ef0] [c004e7b0] kthread+0x15c/0x174
> > > > > [   15.079571] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> > > > > [   15.085798] Instruction dump:
> > > > > [   15.088784] 8129d608 38e7ebd8 81020280 911f004c 39000000 995f0024 907f0028 90ff001c
> > > > > [   15.096613] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 812a4b98 3d40c02f
> > > > > [   15.104612] ==================================================================
> > > > >
> > > > > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > > >
> > > > Acked-by: Marco Elver <elver@google.com>
> > > >
> > > > Thank you, I think this looks like the right solution. Just a question below:
> > > >
> > > ...
> > >
> > > > > @@ -59,23 +70,26 @@ void save_stack_trace(struct stack_trace *trace)
> > > > >
> > > > >          sp = current_stack_frame();
> > > > >
> > > > > -       save_context_stack(trace, sp, current, 1);
> > > > > +       save_context_stack(trace, sp, (unsigned long)save_stack_trace, current, 1);
> > > >
> > > > This causes ip == save_stack_trace and also below for
> > > > save_stack_trace_tsk. Does this mean save_stack_trace() is included in
> > > > the trace? Looking at kernel/stacktrace.c, I think the library wants
> > > > to exclude itself from the trace, as it does '.skip = skipnr + 1' (and
> > > > '.skip   = skipnr + (current == tsk)' for the _tsk variant).
> > > >
> > > > If the arch-helper here is included, should this use _RET_IP_ instead?
> > > >
> > >
> > > Don't really know, I was inspired by arm64 which has:
> > >
> > > void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> > >                  struct task_struct *task, struct pt_regs *regs)
> > > {
> > >     struct stackframe frame;
> > >
> > >     if (regs)
> > >             start_backtrace(&frame, regs->regs[29], regs->pc);
> > >     else if (task == current)
> > >             start_backtrace(&frame,
> > >                             (unsigned long)__builtin_frame_address(0),
> > >                             (unsigned long)arch_stack_walk);
> > >     else
> > >             start_backtrace(&frame, thread_saved_fp(task),
> > >                             thread_saved_pc(task));
> > >
> > >     walk_stackframe(task, &frame, consume_entry, cookie);
> > > }
> > >
> > > But looking at x86 you may be right, so what should be done really ?
> >
> > x86:
> >
> > [    2.843292] calling stack_trace_save:
> > [    2.843705]  test_func+0x6c/0x118
> > [    2.844184]  do_one_initcall+0x58/0x270
> > [    2.844618]  kernel_init_freeable+0x1da/0x23a
> > [    2.845110]  kernel_init+0xc/0x166
> > [    2.845494]  ret_from_fork+0x22/0x30
> >
> > [    2.867525] calling stack_trace_save_tsk:
> > [    2.868017]  test_func+0xa9/0x118
> > [    2.868530]  do_one_initcall+0x58/0x270
> > [    2.869003]  kernel_init_freeable+0x1da/0x23a
> > [    2.869535]  kernel_init+0xc/0x166
> > [    2.869957]  ret_from_fork+0x22/0x30
> >
> > arm64:
> >
> > [    3.786911] calling stack_trace_save:
> > [    3.787147]  stack_trace_save+0x50/0x78
> > [    3.787443]  test_func+0x84/0x13c
> > [    3.787738]  do_one_initcall+0x5c/0x310
> > [    3.788099]  kernel_init_freeable+0x214/0x294
> > [    3.788363]  kernel_init+0x18/0x164
> > [    3.788585]  ret_from_fork+0x10/0x30
> >
> > [    3.803615] calling stack_trace_save_tsk:
> > [    3.804266]  stack_trace_save_tsk+0x9c/0x100
> > [    3.804541]  test_func+0xc4/0x13c
> > [    3.804803]  do_one_initcall+0x5c/0x310
> > [    3.805031]  kernel_init_freeable+0x214/0x294
> > [    3.805284]  kernel_init+0x18/0x164
> > [    3.805505]  ret_from_fork+0x10/0x30
> >
> > +Cc arm64 folks.
> >
> > So I think the arm64 version also has a bug, because I think a user of
> > <linux/stacktrace.h> really doesn't care about the library function
> > itself. And from reading kernel/stacktrace.c I think it wants to exclude
> > itself entirely.
> >
> > It's a shame that <linux/stacktrace.h> isn't better documented, but I'm
> > pretty sure that including the library functions in the trace is not
> > useful.
>
> I agree this behaviour isn't desireable, and that the lack of
> documentation is unfortunate.
>
> It looks like GCC is happy to give us the function-entry-time FP if we use
> __builtin_frame_address(1), and assuming clang is similarly happy we can do:
>
> | diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> | index ad20981dfda4..5dfbf915eb7f 100644
> | --- a/arch/arm64/kernel/stacktrace.c
> | +++ b/arch/arm64/kernel/stacktrace.c
> | @@ -203,8 +203,8 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> |                 start_backtrace(&frame, regs->regs[29], regs->pc);
> |         else if (task == current)
> |                 start_backtrace(&frame,
> | -                               (unsigned long)__builtin_frame_address(0),
> | -                               (unsigned long)arch_stack_walk);
> | +                               (unsigned long)__builtin_frame_address(1),
> | +                               (unsigned long)__builtin_return_address(0));
> |         else
> |                 start_backtrace(&frame, thread_saved_fp(task),
> |                                 thread_saved_pc(task));
>
> ... such that arch_stack_walk() will try to avoid including itself in a
> trace, and so the existing skipping should (w/ caveats below) skip
> stack_trace_save() or stack_trace_save_tsk().

Thank you! Yes, that works.

> If that works for you, I can spin that as a patch, though we'll need to
> check that doesn't introduce a new fencepost error elsewhere.
>
> The bigger problem here is that skipping is dodgy to begin with, and
> this is still liable to break in some cases. One big concern is that
> (especially with LTO) we cannot guarantee the compiler will not inline
> or outline functions, causing the skipp value to be too large or too
> small. That's liable to happen to callers, and in theory (though
> unlikely in practice), portions of arch_stack_walk() or
> stack_trace_save() could get outlined too.
>
> Unless we can get some strong guarantees from compiler folk such that we
> can guarantee a specific function acts boundary for unwinding (and
> doesn't itself get split, etc), the only reliable way I can think to
> solve this requires an assembly trampoline. Whatever we do is liable to
> need some invasive rework.

Will LTO and friends respect 'noinline'? One thing I also noticed is
that tail calls would also cause the stack trace to appear somewhat
incomplete (for some of my tests I've disabled tail call
optimizations). Is there a way to also mark a function
non-tail-callable? But I'm also not sure if with all that we'd be
guaranteed the code we want, even though in practice it might.

Thanks,
-- Marco

^ permalink raw reply

* Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
From: Robin Murphy @ 2021-03-04 15:25 UTC (permalink / raw)
  To: Christoph Hellwig, Joerg Roedel, Will Deacon, Li Yang
  Cc: kvm, linux-arm-msm, linuxppc-dev, dri-devel, virtualization,
	iommu, netdev, freedreno, David Woodhouse, linux-arm-kernel
In-Reply-To: <20210301084257.945454-15-hch@lst.de>

On 2021-03-01 08:42, Christoph Hellwig wrote:
> Use explicit methods for setting and querying the information instead.

Now that everyone's using iommu-dma, is there any point in bouncing this 
through the drivers at all? Seems like it would make more sense for the 
x86 drivers to reflect their private options back to iommu_dma_strict 
(and allow Intel's caching mode to override it as well), then have 
iommu_dma_init_domain just test !iommu_dma_strict && 
domain->ops->flush_iotlb_all.

Robin.

> Also remove the now unused iommu_domain_get_attr functionality.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/amd/iommu.c                   | 23 ++-------
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 ++++++-----------
>   drivers/iommu/arm/arm-smmu/arm-smmu.c       | 56 +++++----------------
>   drivers/iommu/dma-iommu.c                   |  8 ++-
>   drivers/iommu/intel/iommu.c                 | 27 ++--------
>   drivers/iommu/iommu.c                       | 19 +++----
>   include/linux/iommu.h                       | 17 ++-----
>   7 files changed, 51 insertions(+), 146 deletions(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a69a8b573e40d0..37a8e51db17656 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1771,24 +1771,11 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
>   	return acpihid_device_group(dev);
>   }
>   
> -static int amd_iommu_domain_get_attr(struct iommu_domain *domain,
> -		enum iommu_attr attr, void *data)
> +static bool amd_iommu_dma_use_flush_queue(struct iommu_domain *domain)
>   {
> -	switch (domain->type) {
> -	case IOMMU_DOMAIN_UNMANAGED:
> -		return -ENODEV;
> -	case IOMMU_DOMAIN_DMA:
> -		switch (attr) {
> -		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			*(int *)data = !amd_iommu_unmap_flush;
> -			return 0;
> -		default:
> -			return -ENODEV;
> -		}
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> +	if (domain->type != IOMMU_DOMAIN_DMA)
> +		return false;
> +	return !amd_iommu_unmap_flush;
>   }
>   
>   /*****************************************************************************
> @@ -2257,7 +2244,7 @@ const struct iommu_ops amd_iommu_ops = {
>   	.release_device = amd_iommu_release_device,
>   	.probe_finalize = amd_iommu_probe_finalize,
>   	.device_group = amd_iommu_device_group,
> -	.domain_get_attr = amd_iommu_domain_get_attr,
> +	.dma_use_flush_queue = amd_iommu_dma_use_flush_queue,
>   	.get_resv_regions = amd_iommu_get_resv_regions,
>   	.put_resv_regions = generic_iommu_put_resv_regions,
>   	.is_attach_deferred = amd_iommu_is_attach_deferred,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 8594b4a8304375..bf96172e8c1f71 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2449,33 +2449,21 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
>   	return group;
>   }
>   
> -static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> -				    enum iommu_attr attr, void *data)
> +static bool arm_smmu_dma_use_flush_queue(struct iommu_domain *domain)
>   {
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   
> -	switch (domain->type) {
> -	case IOMMU_DOMAIN_UNMANAGED:
> -		switch (attr) {
> -		case DOMAIN_ATTR_NESTING:
> -			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
> -			return 0;
> -		default:
> -			return -ENODEV;
> -		}
> -		break;
> -	case IOMMU_DOMAIN_DMA:
> -		switch (attr) {
> -		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			*(int *)data = smmu_domain->non_strict;
> -			return 0;
> -		default:
> -			return -ENODEV;
> -		}
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> +	if (domain->type != IOMMU_DOMAIN_DMA)
> +		return false;
> +	return smmu_domain->non_strict;
> +}
> +
> +
> +static void arm_smmu_dma_enable_flush_queue(struct iommu_domain *domain)
> +{
> +	if (domain->type != IOMMU_DOMAIN_DMA)
> +		return;
> +	to_smmu_domain(domain)->non_strict = true;
>   }
>   
>   static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
> @@ -2505,13 +2493,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>   		}
>   		break;
>   	case IOMMU_DOMAIN_DMA:
> -		switch(attr) {
> -		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			smmu_domain->non_strict = *(int *)data;
> -			break;
> -		default:
> -			ret = -ENODEV;
> -		}
> +		ret = -ENODEV;
>   		break;
>   	default:
>   		ret = -EINVAL;
> @@ -2619,7 +2601,8 @@ static struct iommu_ops arm_smmu_ops = {
>   	.probe_device		= arm_smmu_probe_device,
>   	.release_device		= arm_smmu_release_device,
>   	.device_group		= arm_smmu_device_group,
> -	.domain_get_attr	= arm_smmu_domain_get_attr,
> +	.dma_use_flush_queue	= arm_smmu_dma_use_flush_queue,
> +	.dma_enable_flush_queue	= arm_smmu_dma_enable_flush_queue,
>   	.domain_set_attr	= arm_smmu_domain_set_attr,
>   	.of_xlate		= arm_smmu_of_xlate,
>   	.get_resv_regions	= arm_smmu_get_resv_regions,
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index d8c6bfde6a6158..e7893e96f5177a 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -1481,42 +1481,20 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
>   	return group;
>   }
>   
> -static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> -				    enum iommu_attr attr, void *data)
> +static bool arm_smmu_dma_use_flush_queue(struct iommu_domain *domain)
>   {
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   
> -	switch(domain->type) {
> -	case IOMMU_DOMAIN_UNMANAGED:
> -		switch (attr) {
> -		case DOMAIN_ATTR_NESTING:
> -			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
> -			return 0;
> -		case DOMAIN_ATTR_IO_PGTABLE_CFG: {
> -			struct io_pgtable_domain_attr *pgtbl_cfg = data;
> -			*pgtbl_cfg = smmu_domain->pgtbl_cfg;
> +	if (domain->type != IOMMU_DOMAIN_DMA)
> +		return false;
> +	return smmu_domain->pgtbl_cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT;
> +}
>   
> -			return 0;
> -		}
> -		default:
> -			return -ENODEV;
> -		}
> -		break;
> -	case IOMMU_DOMAIN_DMA:
> -		switch (attr) {
> -		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: {
> -			bool non_strict = smmu_domain->pgtbl_cfg.quirks &
> -					  IO_PGTABLE_QUIRK_NON_STRICT;
> -			*(int *)data = non_strict;
> -			return 0;
> -		}
> -		default:
> -			return -ENODEV;
> -		}
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> +static void arm_smmu_dma_enable_flush_queue(struct iommu_domain *domain)
> +{
> +	if (domain->type != IOMMU_DOMAIN_DMA)
> +		return;
> +	to_smmu_domain(domain)->pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>   }
>   
>   static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
> @@ -1557,16 +1535,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>   		}
>   		break;
>   	case IOMMU_DOMAIN_DMA:
> -		switch (attr) {
> -		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			if (*(int *)data)
> -				smmu_domain->pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> -			else
> -				smmu_domain->pgtbl_cfg.quirks &= ~IO_PGTABLE_QUIRK_NON_STRICT;
> -			break;
> -		default:
> -			ret = -ENODEV;
> -		}
> +		ret = -ENODEV;
>   		break;
>   	default:
>   		ret = -EINVAL;
> @@ -1631,7 +1600,8 @@ static struct iommu_ops arm_smmu_ops = {
>   	.probe_device		= arm_smmu_probe_device,
>   	.release_device		= arm_smmu_release_device,
>   	.device_group		= arm_smmu_device_group,
> -	.domain_get_attr	= arm_smmu_domain_get_attr,
> +	.dma_use_flush_queue	= arm_smmu_dma_use_flush_queue,
> +	.dma_enable_flush_queue	= arm_smmu_dma_enable_flush_queue,
>   	.domain_set_attr	= arm_smmu_domain_set_attr,
>   	.of_xlate		= arm_smmu_of_xlate,
>   	.get_resv_regions	= arm_smmu_get_resv_regions,
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 9ab6ee22c11088..d3fe5aad9d6ecf 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -305,8 +305,8 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
>   	cookie = container_of(iovad, struct iommu_dma_cookie, iovad);
>   	domain = cookie->fq_domain;
>   	/*
> -	 * The IOMMU driver supporting DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
> -	 * implies that ops->flush_iotlb_all must be non-NULL.
> +	 * The IOMMU driver supporting a DMA flush queue implies that
> +	 * ops->flush_iotlb_all must be non-NULL.
>   	 */
>   	domain->ops->flush_iotlb_all(domain);
>   }
> @@ -329,7 +329,6 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   	unsigned long order, base_pfn;
>   	struct iova_domain *iovad;
> -	int attr;
>   
>   	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
>   		return -EINVAL;
> @@ -365,8 +364,7 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>   
>   	init_iova_domain(iovad, 1UL << order, base_pfn);
>   
> -	if (!cookie->fq_domain && !iommu_domain_get_attr(domain,
> -			DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, &attr) && attr) {
> +	if (!cookie->fq_domain && iommu_dma_use_flush_queue(domain)) {
>   		if (init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all,
>   					  iommu_dma_entry_dtor))
>   			pr_warn("iova flush queue initialization failed\n");
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index ee0932307d646b..eaa80c33f4bc91 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -5453,13 +5453,13 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
>   	return ret;
>   }
>   
> -static bool domain_use_flush_queue(void)
> +static bool intel_iommu_dma_use_flush_queue(struct iommu_domain *domain)
>   {
>   	struct dmar_drhd_unit *drhd;
>   	struct intel_iommu *iommu;
>   	bool r = true;
>   
> -	if (intel_iommu_strict)
> +	if (domain->type != IOMMU_DOMAIN_DMA || intel_iommu_strict)
>   		return false;
>   
>   	/*
> @@ -5483,27 +5483,6 @@ static bool domain_use_flush_queue(void)
>   	return r;
>   }
>   
> -static int
> -intel_iommu_domain_get_attr(struct iommu_domain *domain,
> -			    enum iommu_attr attr, void *data)
> -{
> -	switch (domain->type) {
> -	case IOMMU_DOMAIN_UNMANAGED:
> -		return -ENODEV;
> -	case IOMMU_DOMAIN_DMA:
> -		switch (attr) {
> -		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			*(int *)data = domain_use_flush_queue();
> -			return 0;
> -		default:
> -			return -ENODEV;
> -		}
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -}
> -
>   /*
>    * Check that the device does not live on an external facing PCI port that is
>    * marked as untrusted. Such devices should not be able to apply quirks and
> @@ -5576,7 +5555,7 @@ const struct iommu_ops intel_iommu_ops = {
>   	.capable		= intel_iommu_capable,
>   	.domain_alloc		= intel_iommu_domain_alloc,
>   	.domain_free		= intel_iommu_domain_free,
> -	.domain_get_attr        = intel_iommu_domain_get_attr,
> +	.dma_use_flush_queue	= intel_iommu_dma_use_flush_queue,
>   	.domain_set_attr	= intel_iommu_domain_set_attr,
>   	.attach_dev		= intel_iommu_attach_device,
>   	.detach_dev		= intel_iommu_detach_device,
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 23daaea7883b75..0f12c4d58cdc42 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1512,12 +1512,8 @@ static int iommu_group_alloc_default_domain(struct bus_type *bus,
>   	if (!group->domain)
>   		group->domain = dom;
>   
> -	if (!iommu_dma_strict) {
> -		int attr = 1;
> -		iommu_domain_set_attr(dom,
> -				      DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> -				      &attr);
> -	}
> +	if (!iommu_dma_strict && dom->ops->dma_enable_flush_queue)
> +		dom->ops->dma_enable_flush_queue(dom);
>   
>   	return 0;
>   }
> @@ -2664,14 +2660,13 @@ static int __init iommu_init(void)
>   }
>   core_initcall(iommu_init);
>   
> -int iommu_domain_get_attr(struct iommu_domain *domain,
> -			  enum iommu_attr attr, void *data)
> +bool iommu_dma_use_flush_queue(struct iommu_domain *domain)
>   {
> -	if (!domain->ops->domain_get_attr)
> -		return -EINVAL;
> -	return domain->ops->domain_get_attr(domain, attr, data);
> +	if (!domain->ops->dma_use_flush_queue)
> +		return false;
> +	return domain->ops->dma_use_flush_queue(domain);
>   }
> -EXPORT_SYMBOL_GPL(iommu_domain_get_attr);
> +EXPORT_SYMBOL_GPL(iommu_dma_use_flush_queue);
>   
>   int iommu_domain_set_attr(struct iommu_domain *domain,
>   			  enum iommu_attr attr, void *data)
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index c15a8658daad64..f30de33c6ff56e 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -108,7 +108,6 @@ enum iommu_cap {
>   
>   enum iommu_attr {
>   	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> -	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
>   	DOMAIN_ATTR_IO_PGTABLE_CFG,
>   	DOMAIN_ATTR_MAX,
>   };
> @@ -194,7 +193,8 @@ struct iommu_iotlb_gather {
>    * @probe_finalize: Do final setup work after the device is added to an IOMMU
>    *                  group and attached to the groups domain
>    * @device_group: find iommu group for a particular device
> - * @domain_get_attr: Query domain attributes
> + * @dma_use_flush_queue: Returns %true if a DMA flush queue is used
> + * @dma_enable_flush_queue: Try to enable the DMA flush queue
>    * @domain_set_attr: Change domain attributes
>    * @get_resv_regions: Request list of reserved regions for a device
>    * @put_resv_regions: Free list of reserved regions for a device
> @@ -244,8 +244,8 @@ struct iommu_ops {
>   	void (*release_device)(struct device *dev);
>   	void (*probe_finalize)(struct device *dev);
>   	struct iommu_group *(*device_group)(struct device *dev);
> -	int (*domain_get_attr)(struct iommu_domain *domain,
> -			       enum iommu_attr attr, void *data);
> +	bool (*dma_use_flush_queue)(struct iommu_domain *domain);
> +	void (*dma_enable_flush_queue)(struct iommu_domain *domain);
>   	int (*domain_set_attr)(struct iommu_domain *domain,
>   			       enum iommu_attr attr, void *data);
>   
> @@ -491,8 +491,7 @@ extern int iommu_page_response(struct device *dev,
>   extern int iommu_group_id(struct iommu_group *group);
>   extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
>   
> -extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr,
> -				 void *data);
> +bool iommu_dma_use_flush_queue(struct iommu_domain *domain);
>   extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
>   				 void *data);
>   
> @@ -861,12 +860,6 @@ static inline int iommu_group_id(struct iommu_group *group)
>   	return -ENODEV;
>   }
>   
> -static inline int iommu_domain_get_attr(struct iommu_domain *domain,
> -					enum iommu_attr attr, void *data)
> -{
> -	return -EINVAL;
> -}
> -
>   static inline int iommu_domain_set_attr(struct iommu_domain *domain,
>   					enum iommu_attr attr, void *data)
>   {
> 

^ permalink raw reply

* Re: [PATCH v2] powerpc/32: remove bogus ppc_select syscall
From: Arnd Bergmann @ 2021-03-04 15:30 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linuxppc-dev, linux-kernel@vger.kernel.org, Paul Mackerras,
	halesh.sadashiv
In-Reply-To: <aca143f6-fcfc-d89f-bb00-26e90257fbf6@csgroup.eu>

On Thu, Mar 4, 2021 at 4:24 PM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
> Le 04/03/2021 à 16:17, Arnd Bergmann a écrit :
> > On Thu, Mar 4, 2021 at 1:51 PM Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> >> ---
> >> First version was in 2008, at that time it was rejected, see
> >> http://patchwork.ozlabs.org/project/linuxppc-dev/patch/200809240839.14902.arnd@arndb.de/
> >
> > The patch from 2008 did two things:
> >
> > - it removed the ppc32 specific 'select' syscall at #82
> > - it fixed the generic '_newselect' syscall at #142
> >
> > Back then, the decision was to only address the second issue, which
> > got merged in commit dad2f2fb0fc7 ("powerpc: Fix wrong error code from
> > ppc32 select syscall").
> >
> > It is probably ok to remove the old select system call now, but
> > my changelog text no longer makes sense, as the patch has nothing
> > to do with the bug that was reported back then.
> >
>
> I understood that the original reported bug was that calling that version of select() with a
> negative value as first parametre would lead to a -EFAULT instead of -EINVAL. That's exactly the
> case here, if you set n = -1 you get into this (unsigned long)n > 4096 then the buffer is at
> 0xffffffff and access_ok() won't grand access to it so the return value will be -EFAULT instead of
> -EINVAL.
>
> Am I missing something ?

This is the behavior of the ppc_select() implementation, but as far as
I can tell,
the bug report was for the problem that this behavior would happen for both
syscall #82 and syscall #142 when the correct behavior would have been to
only do it for #82 but not for #142.

       Arnd

^ permalink raw reply

* Re: [PATCH 16/17] iommu: remove DOMAIN_ATTR_IO_PGTABLE_CFG
From: Robin Murphy @ 2021-03-04 15:48 UTC (permalink / raw)
  To: Christoph Hellwig, Joerg Roedel, Will Deacon, Li Yang
  Cc: kvm, linux-arm-msm, linuxppc-dev, dri-devel, virtualization,
	iommu, netdev, freedreno, David Woodhouse, linux-arm-kernel
In-Reply-To: <20210301084257.945454-17-hch@lst.de>

On 2021-03-01 08:42, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Moreso than the previous patch, where the feature is at least relatively 
generic (note that there's a bunch of in-flight development around 
DOMAIN_ATTR_NESTING), I'm really not convinced that it's beneficial to 
bloat the generic iommu_ops structure with private driver-specific 
interfaces. The attribute interface is a great compromise for these 
kinds of things, and you can easily add type-checked wrappers around it 
for external callers (maybe even make the actual attributes internal 
between the IOMMU core and drivers) if that's your concern.

Robin.

> ---
>   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  2 +-
>   drivers/iommu/arm/arm-smmu/arm-smmu.c   | 40 +++++++------------------
>   drivers/iommu/iommu.c                   |  9 ++++++
>   include/linux/iommu.h                   |  9 +++++-
>   4 files changed, 29 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 0f184c3dd9d9ec..78d98ab2ee3a68 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -191,7 +191,7 @@ void adreno_set_llc_attributes(struct iommu_domain *iommu)
>   	struct io_pgtable_domain_attr pgtbl_cfg;
>   
>   	pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_ARM_OUTER_WBWA;
> -	iommu_domain_set_attr(iommu, DOMAIN_ATTR_IO_PGTABLE_CFG, &pgtbl_cfg);
> +	iommu_domain_set_pgtable_attr(iommu, &pgtbl_cfg);
>   }
>   
>   struct msm_gem_address_space *
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 2e17d990d04481..2858999c86dfd1 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -1515,40 +1515,22 @@ static int arm_smmu_domain_enable_nesting(struct iommu_domain *domain)
>   	return ret;
>   }
>   
> -static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
> -				    enum iommu_attr attr, void *data)
> +static int arm_smmu_domain_set_pgtable_attr(struct iommu_domain *domain,
> +		struct io_pgtable_domain_attr *pgtbl_cfg)
>   {
> -	int ret = 0;
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	int ret = -EPERM;
>   
> -	mutex_lock(&smmu_domain->init_mutex);
> -
> -	switch(domain->type) {
> -	case IOMMU_DOMAIN_UNMANAGED:
> -		switch (attr) {
> -		case DOMAIN_ATTR_IO_PGTABLE_CFG: {
> -			struct io_pgtable_domain_attr *pgtbl_cfg = data;
> -
> -			if (smmu_domain->smmu) {
> -				ret = -EPERM;
> -				goto out_unlock;
> -			}
> +	if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> +		return -EINVAL;
>   
> -			smmu_domain->pgtbl_cfg = *pgtbl_cfg;
> -			break;
> -		}
> -		default:
> -			ret = -ENODEV;
> -		}
> -		break;
> -	case IOMMU_DOMAIN_DMA:
> -		ret = -ENODEV;
> -		break;
> -	default:
> -		ret = -EINVAL;
> +	mutex_lock(&smmu_domain->init_mutex);
> +	if (!smmu_domain->smmu) {
> +		smmu_domain->pgtbl_cfg = *pgtbl_cfg;
> +		ret = 0;
>   	}
> -out_unlock:
>   	mutex_unlock(&smmu_domain->init_mutex);
> +
>   	return ret;
>   }
>   
> @@ -1609,7 +1591,7 @@ static struct iommu_ops arm_smmu_ops = {
>   	.device_group		= arm_smmu_device_group,
>   	.dma_use_flush_queue	= arm_smmu_dma_use_flush_queue,
>   	.dma_enable_flush_queue	= arm_smmu_dma_enable_flush_queue,
> -	.domain_set_attr	= arm_smmu_domain_set_attr,
> +	.domain_set_pgtable_attr = arm_smmu_domain_set_pgtable_attr,
>   	.domain_enable_nesting	= arm_smmu_domain_enable_nesting,
>   	.of_xlate		= arm_smmu_of_xlate,
>   	.get_resv_regions	= arm_smmu_get_resv_regions,
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 2e9e058501a953..8490aefd4b41f8 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2693,6 +2693,15 @@ int iommu_domain_enable_nesting(struct iommu_domain *domain)
>   }
>   EXPORT_SYMBOL_GPL(iommu_domain_enable_nesting);
>   
> +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain,
> +		struct io_pgtable_domain_attr *pgtbl_cfg)
> +{
> +	if (!domain->ops->domain_set_pgtable_attr)
> +		return -EINVAL;
> +	return domain->ops->domain_set_pgtable_attr(domain, pgtbl_cfg);
> +}
> +EXPORT_SYMBOL_GPL(iommu_domain_set_pgtable_attr);
> +
>   void iommu_get_resv_regions(struct device *dev, struct list_head *list)
>   {
>   	const struct iommu_ops *ops = dev->bus->iommu_ops;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index aed88aa3bd3edf..39d3ed4d2700ac 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -40,6 +40,7 @@ struct iommu_domain;
>   struct notifier_block;
>   struct iommu_sva;
>   struct iommu_fault_event;
> +struct io_pgtable_domain_attr;
>   
>   /* iommu fault flags */
>   #define IOMMU_FAULT_READ	0x0
> @@ -107,7 +108,6 @@ enum iommu_cap {
>    */
>   
>   enum iommu_attr {
> -	DOMAIN_ATTR_IO_PGTABLE_CFG,
>   	DOMAIN_ATTR_MAX,
>   };
>   
> @@ -196,6 +196,7 @@ struct iommu_iotlb_gather {
>    * @dma_enable_flush_queue: Try to enable the DMA flush queue
>    * @domain_set_attr: Change domain attributes
>    * @domain_enable_nesting: Enable nesting
> + * @domain_set_pgtable_attr: Set io page table attributes
>    * @get_resv_regions: Request list of reserved regions for a device
>    * @put_resv_regions: Free list of reserved regions for a device
>    * @apply_resv_region: Temporary helper call-back for iova reserved ranges
> @@ -249,6 +250,8 @@ struct iommu_ops {
>   	int (*domain_set_attr)(struct iommu_domain *domain,
>   			       enum iommu_attr attr, void *data);
>   	int (*domain_enable_nesting)(struct iommu_domain *domain);
> +	int (*domain_set_pgtable_attr)(struct iommu_domain *domain,
> +			struct io_pgtable_domain_attr *pgtbl_cfg);
>   
>   	/* Request/Free a list of reserved regions for a device */
>   	void (*get_resv_regions)(struct device *dev, struct list_head *list);
> @@ -493,9 +496,13 @@ extern int iommu_group_id(struct iommu_group *group);
>   extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
>   
>   bool iommu_dma_use_flush_queue(struct iommu_domain *domain);
> +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain,
> +		struct io_pgtable_domain_attr *pgtbl_cfg);
>   extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
>   				 void *data);
>   int iommu_domain_enable_nesting(struct iommu_domain *domain);
> +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain,
> +		struct io_pgtable_domain_attr *pgtbl_cfg);
>   
>   extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev,
>   			      unsigned long iova, int flags);
> 

^ permalink raw reply

* Re: [PATCH v2 1/3] powerpc: sstep: Fix load and update emulation
From: Segher Boessenkool @ 2021-03-04 15:45 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: ravi.bangoria, ananth, jniethe5, paulus, Sandipan Das,
	linuxppc-dev, dja
In-Reply-To: <20210303163127.GE1913@DESKTOP-TDPLP67.localdomain>

On Wed, Mar 03, 2021 at 10:01:27PM +0530, Naveen N. Rao wrote:
> On 2021/03/01 08:37PM, Segher Boessenkool wrote:
> > > And, r6 always ends up with 0xaea. It changes with the value I put into 
> > > r6 though.
> > 
> > That is exactly the behaviour specified for p8.  0aaa+0040=0aea.
> > 
> > > Granted, this is all up in the air, but it does look like there is more 
> > > going on and the value isn't the EA or the value at the address.
> > 
> > That *is* the EA.  The EA is the address the insn does the access at.
> 
> I'm probably missing something here. 0xaaa is the value I stored at an 
> offset of 64 bytes from the stack pointer (r1 is copied into r6). In the 
> ldu instruction above, the EA is 64(r6), which should translate to 
> r1+64.  The data returned by the load would be 0xaaa, which should be 
> discarded per the description you provided above. So, I would expect to 
> see a 0xc0.. address in r6.

Yes, I misread your code it seems.

> In fact, this looks to be the behavior documented for P9:
> 
> > > Power9 does:
> > >
> > >   Load with Update Instructions (RA = 0)
> > >     EA is placed into R0.
> > >   Load with Update Instructions (RA = RT)
> > >     The storage operand addressed by EA is accessed. The 
> > >     displacement
> > >     field is added to the data returned by the load and placed into 
> > >     RT.

Yup.  So on what cpu did you test?

Either way, the kernel should not emulate any particular cpu here, I'd
say, esp. since recent cpus do different things for this invalid form.


Segher

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Mark Rutland @ 2021-03-04 16:59 UTC (permalink / raw)
  To: Marco Elver
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linuxppc-dev, Linux ARM
In-Reply-To: <CANpmjNOSpFbbDaH9hNucXrpzG=HpsoQpk5w-24x8sU_G-6cz0Q@mail.gmail.com>

On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > [adding Mark Brown]
> >
> > The bigger problem here is that skipping is dodgy to begin with, and
> > this is still liable to break in some cases. One big concern is that
> > (especially with LTO) we cannot guarantee the compiler will not inline
> > or outline functions, causing the skipp value to be too large or too
> > small. That's liable to happen to callers, and in theory (though
> > unlikely in practice), portions of arch_stack_walk() or
> > stack_trace_save() could get outlined too.
> >
> > Unless we can get some strong guarantees from compiler folk such that we
> > can guarantee a specific function acts boundary for unwinding (and
> > doesn't itself get split, etc), the only reliable way I can think to
> > solve this requires an assembly trampoline. Whatever we do is liable to
> > need some invasive rework.
> 
> Will LTO and friends respect 'noinline'?

I hope so (and suspect we'd have more problems otherwise), but I don't
know whether they actually so.

I suspect even with 'noinline' the compiler is permitted to outline
portions of a function if it wanted to (and IIUC it could still make
specialized copies in the absence of 'noclone').

> One thing I also noticed is that tail calls would also cause the stack
> trace to appear somewhat incomplete (for some of my tests I've
> disabled tail call optimizations).

I assume you mean for a chain A->B->C where B tail-calls C, you get a
trace A->C? ... or is A going missing too?

> Is there a way to also mark a function non-tail-callable?

I think this can be bodged using __attribute__((optimize("$OPTIONS")))
on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
function-local optimization options), but I don't expect there's any way
to mark a callee as not being tail-callable.

Accoding to the GCC documentation, GCC won't TCO noreturn functions, but
obviously that's not something we can use generally.

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

> But I'm also not sure if with all that we'd be guaranteed the code we
> want, even though in practice it might.

True! I'd just like to be on the least dodgy ground we can be.

Thanks,
Mark.

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Marco Elver @ 2021-03-04 17:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linux-toolchains, linuxppc-dev, Linux ARM
In-Reply-To: <20210304165923.GA60457@C02TD0UTHF1T.local>

On Thu, Mar 04, 2021 at 04:59PM +0000, Mark Rutland wrote:
> On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > > [adding Mark Brown]
> > >
> > > The bigger problem here is that skipping is dodgy to begin with, and
> > > this is still liable to break in some cases. One big concern is that
> > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > or outline functions, causing the skipp value to be too large or too
> > > small. That's liable to happen to callers, and in theory (though
> > > unlikely in practice), portions of arch_stack_walk() or
> > > stack_trace_save() could get outlined too.
> > >
> > > Unless we can get some strong guarantees from compiler folk such that we
> > > can guarantee a specific function acts boundary for unwinding (and
> > > doesn't itself get split, etc), the only reliable way I can think to
> > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > need some invasive rework.
> > 
> > Will LTO and friends respect 'noinline'?
> 
> I hope so (and suspect we'd have more problems otherwise), but I don't
> know whether they actually so.
> 
> I suspect even with 'noinline' the compiler is permitted to outline
> portions of a function if it wanted to (and IIUC it could still make
> specialized copies in the absence of 'noclone').
> 
> > One thing I also noticed is that tail calls would also cause the stack
> > trace to appear somewhat incomplete (for some of my tests I've
> > disabled tail call optimizations).
> 
> I assume you mean for a chain A->B->C where B tail-calls C, you get a
> trace A->C? ... or is A going missing too?

Correct, it's just the A->C outcome.

> > Is there a way to also mark a function non-tail-callable?
> 
> I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> function-local optimization options), but I don't expect there's any way
> to mark a callee as not being tail-callable.

I don't think this is reliable. It'd be
__attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
work if applied to the function we do not want to tail-call-optimize,
but would have to be applied to the function that does the tail-calling.
So it's a bit backwards, even if it worked.

> Accoding to the GCC documentation, GCC won't TCO noreturn functions, but
> obviously that's not something we can use generally.
> 
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

Perhaps we can ask the toolchain folks to help add such an attribute. Or
maybe the feature already exists somewhere, but hidden.

+Cc linux-toolchains@vger.kernel.org

> > But I'm also not sure if with all that we'd be guaranteed the code we
> > want, even though in practice it might.
> 
> True! I'd just like to be on the least dodgy ground we can be.

It's been dodgy for a while, and I'd welcome any low-cost fixes to make
it less dodgy in the short-term at least. :-)

Thanks,
-- Marco

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Nick Desaulniers @ 2021-03-04 17:54 UTC (permalink / raw)
  To: Marco Elver
  Cc: Mark Rutland, Catalin Marinas, Will Deacon, LKML, Mark Brown,
	Paul Mackerras, kasan-dev, linux-toolchains, linuxppc-dev,
	Linux ARM
In-Reply-To: <YEEYDSJeLPvqRAHZ@elver.google.com>

On Thu, Mar 4, 2021 at 9:42 AM Marco Elver <elver@google.com> wrote:
>
> On Thu, Mar 04, 2021 at 04:59PM +0000, Mark Rutland wrote:
> > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > > > [adding Mark Brown]
> > > >
> > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > this is still liable to break in some cases. One big concern is that
> > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > or outline functions, causing the skipp value to be too large or too
> > > > small. That's liable to happen to callers, and in theory (though
> > > > unlikely in practice), portions of arch_stack_walk() or
> > > > stack_trace_save() could get outlined too.
> > > >
> > > > Unless we can get some strong guarantees from compiler folk such that we
> > > > can guarantee a specific function acts boundary for unwinding (and
> > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > > need some invasive rework.
> > >
> > > Will LTO and friends respect 'noinline'?
> >
> > I hope so (and suspect we'd have more problems otherwise), but I don't
> > know whether they actually so.
> >
> > I suspect even with 'noinline' the compiler is permitted to outline
> > portions of a function if it wanted to (and IIUC it could still make
> > specialized copies in the absence of 'noclone').
> >
> > > One thing I also noticed is that tail calls would also cause the stack
> > > trace to appear somewhat incomplete (for some of my tests I've
> > > disabled tail call optimizations).
> >
> > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > trace A->C? ... or is A going missing too?
>
> Correct, it's just the A->C outcome.
>
> > > Is there a way to also mark a function non-tail-callable?
> >
> > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > function-local optimization options), but I don't expect there's any way
> > to mark a callee as not being tail-callable.
>
> I don't think this is reliable. It'd be
> __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> work if applied to the function we do not want to tail-call-optimize,
> but would have to be applied to the function that does the tail-calling.
> So it's a bit backwards, even if it worked.
>
> > Accoding to the GCC documentation, GCC won't TCO noreturn functions, but
> > obviously that's not something we can use generally.
> >
> > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

include/linux/compiler.h:246:
prevent_tail_call_optimization

commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")

>
> Perhaps we can ask the toolchain folks to help add such an attribute. Or
> maybe the feature already exists somewhere, but hidden.
>
> +Cc linux-toolchains@vger.kernel.org
>
> > > But I'm also not sure if with all that we'd be guaranteed the code we
> > > want, even though in practice it might.
> >
> > True! I'd just like to be on the least dodgy ground we can be.
>
> It's been dodgy for a while, and I'd welcome any low-cost fixes to make
> it less dodgy in the short-term at least. :-)
>
> Thanks,
> -- Marco



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Mark Rutland @ 2021-03-04 18:01 UTC (permalink / raw)
  To: Marco Elver
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linux-toolchains, linuxppc-dev, Linux ARM
In-Reply-To: <YEEYDSJeLPvqRAHZ@elver.google.com>

On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> On Thu, Mar 04, 2021 at 04:59PM +0000, Mark Rutland wrote:
> > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > > > [adding Mark Brown]
> > > >
> > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > this is still liable to break in some cases. One big concern is that
> > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > or outline functions, causing the skipp value to be too large or too
> > > > small. That's liable to happen to callers, and in theory (though
> > > > unlikely in practice), portions of arch_stack_walk() or
> > > > stack_trace_save() could get outlined too.
> > > >
> > > > Unless we can get some strong guarantees from compiler folk such that we
> > > > can guarantee a specific function acts boundary for unwinding (and
> > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > > need some invasive rework.
> > > 
> > > Will LTO and friends respect 'noinline'?
> > 
> > I hope so (and suspect we'd have more problems otherwise), but I don't
> > know whether they actually so.
> > 
> > I suspect even with 'noinline' the compiler is permitted to outline
> > portions of a function if it wanted to (and IIUC it could still make
> > specialized copies in the absence of 'noclone').
> > 
> > > One thing I also noticed is that tail calls would also cause the stack
> > > trace to appear somewhat incomplete (for some of my tests I've
> > > disabled tail call optimizations).
> > 
> > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > trace A->C? ... or is A going missing too?
> 
> Correct, it's just the A->C outcome.

I'd assumed that those cases were benign, e.g. for livepatching what
matters is what can be returned to, so B disappearing from the trace
isn't a problem there.

Is the concern debugability, or is there a functional issue you have in
mind?

> > > Is there a way to also mark a function non-tail-callable?
> > 
> > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > function-local optimization options), but I don't expect there's any way
> > to mark a callee as not being tail-callable.
> 
> I don't think this is reliable. It'd be
> __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> work if applied to the function we do not want to tail-call-optimize,
> but would have to be applied to the function that does the tail-calling.

Yup; that's what I meant then I said you could do that on the caller but
not the callee.

I don't follow why you'd want to put this on the callee, though, so I
think I'm missing something. Considering a set of functions in different
compilation units:

  A->B->C->D->E->F->G->H->I->J->K

... if K were marked in this way, and J was compiled with visibility of
this, J would stick around, but J's callers might not, and so the a
trace might see:

  A->J->K

... do you just care about the final caller, i.e. you just need
certainty that J will be in the trace?

If so, we can somewhat bodge that by having K have an __always_inline
wrapper which has a barrier() or similar after the real call to K, so
the call couldn't be TCO'd.

Otherwise I'd expect we'd probably need to disable TCO generally.

> So it's a bit backwards, even if it worked.
> 
> > Accoding to the GCC documentation, GCC won't TCO noreturn functions, but
> > obviously that's not something we can use generally.
> > 
> > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
> 
> Perhaps we can ask the toolchain folks to help add such an attribute. Or
> maybe the feature already exists somewhere, but hidden.
> 
> +Cc linux-toolchains@vger.kernel.org
> 
> > > But I'm also not sure if with all that we'd be guaranteed the code we
> > > want, even though in practice it might.
> > 
> > True! I'd just like to be on the least dodgy ground we can be.
> 
> It's been dodgy for a while, and I'd welcome any low-cost fixes to make
> it less dodgy in the short-term at least. :-)

:)

Thanks,
Mark.

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Marco Elver @ 2021-03-04 18:22 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linux-toolchains, linuxppc-dev, Linux ARM
In-Reply-To: <20210304180154.GD60457@C02TD0UTHF1T.local>

On Thu, 4 Mar 2021 at 19:02, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > On Thu, Mar 04, 2021 at 04:59PM +0000, Mark Rutland wrote:
> > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > [adding Mark Brown]
> > > > >
> > > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > > this is still liable to break in some cases. One big concern is that
> > > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > > or outline functions, causing the skipp value to be too large or too
> > > > > small. That's liable to happen to callers, and in theory (though
> > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > stack_trace_save() could get outlined too.
> > > > >
> > > > > Unless we can get some strong guarantees from compiler folk such that we
> > > > > can guarantee a specific function acts boundary for unwinding (and
> > > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > > > need some invasive rework.
> > > >
> > > > Will LTO and friends respect 'noinline'?
> > >
> > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > know whether they actually so.
> > >
> > > I suspect even with 'noinline' the compiler is permitted to outline
> > > portions of a function if it wanted to (and IIUC it could still make
> > > specialized copies in the absence of 'noclone').
> > >
> > > > One thing I also noticed is that tail calls would also cause the stack
> > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > disabled tail call optimizations).
> > >
> > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > trace A->C? ... or is A going missing too?
> >
> > Correct, it's just the A->C outcome.
>
> I'd assumed that those cases were benign, e.g. for livepatching what
> matters is what can be returned to, so B disappearing from the trace
> isn't a problem there.
>
> Is the concern debugability, or is there a functional issue you have in
> mind?

For me, it's just been debuggability, and reliable test cases.

> > > > Is there a way to also mark a function non-tail-callable?
> > >
> > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > function-local optimization options), but I don't expect there's any way
> > > to mark a callee as not being tail-callable.
> >
> > I don't think this is reliable. It'd be
> > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > work if applied to the function we do not want to tail-call-optimize,
> > but would have to be applied to the function that does the tail-calling.
>
> Yup; that's what I meant then I said you could do that on the caller but
> not the callee.
>
> I don't follow why you'd want to put this on the callee, though, so I
> think I'm missing something. Considering a set of functions in different
> compilation units:
>
>   A->B->C->D->E->F->G->H->I->J->K

I was having this problem with KCSAN, where the compiler would
tail-call-optimize __tsan_X instrumentation. This would mean that
KCSAN runtime functions ended up in the trace, but the function where
the access happened would not. However, I don't care about the runtime
functions, and instead want to see the function where the access
happened. In that case, I'd like to just mark __tsan_X and any other
kcsan instrumentation functions as do-not-tail-call-optimize, which
would solve the problem.

The solution today is that when you compile a kernel with KCSAN, every
instrumented TU is compiled with -fno-optimize-sibling-calls. The
better solution would be to just mark KCSAN runtime functions somehow,
but permit tail calling other things. Although, I probably still want
to see the full trace, and would decide that having
-fno-optimize-sibling-calls is a small price to pay in a
debug-only-kernel to get complete traces.

> ... if K were marked in this way, and J was compiled with visibility of
> this, J would stick around, but J's callers might not, and so the a
> trace might see:
>
>   A->J->K
>
> ... do you just care about the final caller, i.e. you just need
> certainty that J will be in the trace?

Yes. But maybe it's a special problem that only sanitizers have.

> If so, we can somewhat bodge that by having K have an __always_inline
> wrapper which has a barrier() or similar after the real call to K, so
> the call couldn't be TCO'd.
>
> Otherwise I'd expect we'd probably need to disable TCO generally.

Thanks,
-- Marco

^ permalink raw reply

* Re: [PATCH] scsi: ibmvfc: Switch to using the new API kobj_to_dev()
From: Tyrel Datwyler @ 2021-03-04 18:48 UTC (permalink / raw)
  To: Jiapeng Chong
  Cc: martin.petersen, linux-scsi, jejb, linux-kernel, paulus,
	linuxppc-dev
In-Reply-To: <1614850124-54111-1-git-send-email-jiapeng.chong@linux.alibaba.com>

On 3/4/21 1:28 AM, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3483:60-61: WARNING opportunity for
> kobj_to_dev().
> 
> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>

Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>

> ---
>  drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> index 755313b..e5f1ca7 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> @@ -3480,7 +3480,7 @@ static ssize_t ibmvfc_read_trace(struct file *filp, struct kobject *kobj,
>  				 struct bin_attribute *bin_attr,
>  				 char *buf, loff_t off, size_t count)
>  {
> -	struct device *dev = container_of(kobj, struct device, kobj);
> +	struct device *dev = kobj_to_dev(kobj);
>  	struct Scsi_Host *shost = class_to_shost(dev);
>  	struct ibmvfc_host *vhost = shost_priv(shost);
>  	unsigned long flags = 0;
> 


^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Mark Rutland @ 2021-03-04 18:51 UTC (permalink / raw)
  To: Marco Elver
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linux-toolchains, linuxppc-dev, Linux ARM
In-Reply-To: <CANpmjNOZWuhqXATDjH3F=DMbpg2xOy0XppVJ+Wv2XjFh_crJJg@mail.gmail.com>

On Thu, Mar 04, 2021 at 07:22:53PM +0100, Marco Elver wrote:
> On Thu, 4 Mar 2021 at 19:02, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > > On Thu, Mar 04, 2021 at 04:59PM +0000, Mark Rutland wrote:
> > > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > [adding Mark Brown]
> > > > > >
> > > > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > > > this is still liable to break in some cases. One big concern is that
> > > > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > > > or outline functions, causing the skipp value to be too large or too
> > > > > > small. That's liable to happen to callers, and in theory (though
> > > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > > stack_trace_save() could get outlined too.
> > > > > >
> > > > > > Unless we can get some strong guarantees from compiler folk such that we
> > > > > > can guarantee a specific function acts boundary for unwinding (and
> > > > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > > > > need some invasive rework.
> > > > >
> > > > > Will LTO and friends respect 'noinline'?
> > > >
> > > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > > know whether they actually so.
> > > >
> > > > I suspect even with 'noinline' the compiler is permitted to outline
> > > > portions of a function if it wanted to (and IIUC it could still make
> > > > specialized copies in the absence of 'noclone').
> > > >
> > > > > One thing I also noticed is that tail calls would also cause the stack
> > > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > > disabled tail call optimizations).
> > > >
> > > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > > trace A->C? ... or is A going missing too?
> > >
> > > Correct, it's just the A->C outcome.
> >
> > I'd assumed that those cases were benign, e.g. for livepatching what
> > matters is what can be returned to, so B disappearing from the trace
> > isn't a problem there.
> >
> > Is the concern debugability, or is there a functional issue you have in
> > mind?
> 
> For me, it's just been debuggability, and reliable test cases.
> 
> > > > > Is there a way to also mark a function non-tail-callable?
> > > >
> > > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > > function-local optimization options), but I don't expect there's any way
> > > > to mark a callee as not being tail-callable.
> > >
> > > I don't think this is reliable. It'd be
> > > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > > work if applied to the function we do not want to tail-call-optimize,
> > > but would have to be applied to the function that does the tail-calling.
> >
> > Yup; that's what I meant then I said you could do that on the caller but
> > not the callee.
> >
> > I don't follow why you'd want to put this on the callee, though, so I
> > think I'm missing something. Considering a set of functions in different
> > compilation units:
> >
> >   A->B->C->D->E->F->G->H->I->J->K
> 
> I was having this problem with KCSAN, where the compiler would
> tail-call-optimize __tsan_X instrumentation.

Those are compiler-generated calls, right? When those are generated the
compilation unit (and whatever it has included) might not have provided
a prototype anyway, and the compiler has special knowledge of the
functions, so it feels like the compiler would need to inhibit TCO here
for this to be robust. For their intended usage subjecting them to TCO
doesn't seem to make sense AFAICT.

I suspect that compilers have some way of handling that; otherwise I'd
expect to have heard stories of mcount/fentry calls getting TCO'd and
causing problems. So maybe there's an easy fix there?

> This would mean that KCSAN runtime functions ended up in the trace,
> but the function where the access happened would not. However, I don't
> care about the runtime functions, and instead want to see the function
> where the access happened. In that case, I'd like to just mark
> __tsan_X and any other kcsan instrumentation functions as
> do-not-tail-call-optimize, which would solve the problem.

I understand why we don't want to TCO these calls, but given the calls
are implicitly generated, I strongly suspect it's better to fix the
implicit call generation to not be TCO'd to begin with.

> The solution today is that when you compile a kernel with KCSAN, every
> instrumented TU is compiled with -fno-optimize-sibling-calls. The
> better solution would be to just mark KCSAN runtime functions somehow,
> but permit tail calling other things. Although, I probably still want
> to see the full trace, and would decide that having
> -fno-optimize-sibling-calls is a small price to pay in a
> debug-only-kernel to get complete traces.
> 
> > ... if K were marked in this way, and J was compiled with visibility of
> > this, J would stick around, but J's callers might not, and so the a
> > trace might see:
> >
> >   A->J->K
> >
> > ... do you just care about the final caller, i.e. you just need
> > certainty that J will be in the trace?
> 
> Yes. But maybe it's a special problem that only sanitizers have.

I reckon for basically any instrumentation we don't want calls to be
TCO'd, though I'm not immediately sure of cases beyond sanitizers and
mcount/fentry.

Thanks,
Mark.

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Marco Elver @ 2021-03-04 19:01 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, LKML, broonie, Paul Mackerras,
	kasan-dev, linux-toolchains, linuxppc-dev, Linux ARM
In-Reply-To: <20210304185148.GE60457@C02TD0UTHF1T.local>

On Thu, 4 Mar 2021 at 19:51, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 04, 2021 at 07:22:53PM +0100, Marco Elver wrote:
> > On Thu, 4 Mar 2021 at 19:02, Mark Rutland <mark.rutland@arm.com> wrote:
> > > On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > > > On Thu, Mar 04, 2021 at 04:59PM +0000, Mark Rutland wrote:
> > > > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > > [adding Mark Brown]
> > > > > > >
> > > > > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > > > > this is still liable to break in some cases. One big concern is that
> > > > > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > > > > or outline functions, causing the skipp value to be too large or too
> > > > > > > small. That's liable to happen to callers, and in theory (though
> > > > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > > > stack_trace_save() could get outlined too.
> > > > > > >
> > > > > > > Unless we can get some strong guarantees from compiler folk such that we
> > > > > > > can guarantee a specific function acts boundary for unwinding (and
> > > > > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > > > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > > > > > need some invasive rework.
> > > > > >
> > > > > > Will LTO and friends respect 'noinline'?
> > > > >
> > > > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > > > know whether they actually so.
> > > > >
> > > > > I suspect even with 'noinline' the compiler is permitted to outline
> > > > > portions of a function if it wanted to (and IIUC it could still make
> > > > > specialized copies in the absence of 'noclone').
> > > > >
> > > > > > One thing I also noticed is that tail calls would also cause the stack
> > > > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > > > disabled tail call optimizations).
> > > > >
> > > > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > > > trace A->C? ... or is A going missing too?
> > > >
> > > > Correct, it's just the A->C outcome.
> > >
> > > I'd assumed that those cases were benign, e.g. for livepatching what
> > > matters is what can be returned to, so B disappearing from the trace
> > > isn't a problem there.
> > >
> > > Is the concern debugability, or is there a functional issue you have in
> > > mind?
> >
> > For me, it's just been debuggability, and reliable test cases.
> >
> > > > > > Is there a way to also mark a function non-tail-callable?
> > > > >
> > > > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > > > function-local optimization options), but I don't expect there's any way
> > > > > to mark a callee as not being tail-callable.
> > > >
> > > > I don't think this is reliable. It'd be
> > > > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > > > work if applied to the function we do not want to tail-call-optimize,
> > > > but would have to be applied to the function that does the tail-calling.
> > >
> > > Yup; that's what I meant then I said you could do that on the caller but
> > > not the callee.
> > >
> > > I don't follow why you'd want to put this on the callee, though, so I
> > > think I'm missing something. Considering a set of functions in different
> > > compilation units:
> > >
> > >   A->B->C->D->E->F->G->H->I->J->K
> >
> > I was having this problem with KCSAN, where the compiler would
> > tail-call-optimize __tsan_X instrumentation.
>
> Those are compiler-generated calls, right? When those are generated the
> compilation unit (and whatever it has included) might not have provided
> a prototype anyway, and the compiler has special knowledge of the
> functions, so it feels like the compiler would need to inhibit TCO here
> for this to be robust. For their intended usage subjecting them to TCO
> doesn't seem to make sense AFAICT.
>
> I suspect that compilers have some way of handling that; otherwise I'd
> expect to have heard stories of mcount/fentry calls getting TCO'd and
> causing problems. So maybe there's an easy fix there?

I agree, the compiler builtins should be handled by the compiler
directly, perhaps that was a bad example. But we also have "explicit
instrumentation", e.g. everything that's in <linux/instrumented.h>.

> > This would mean that KCSAN runtime functions ended up in the trace,
> > but the function where the access happened would not. However, I don't
> > care about the runtime functions, and instead want to see the function
> > where the access happened. In that case, I'd like to just mark
> > __tsan_X and any other kcsan instrumentation functions as
> > do-not-tail-call-optimize, which would solve the problem.
>
> I understand why we don't want to TCO these calls, but given the calls
> are implicitly generated, I strongly suspect it's better to fix the
> implicit call generation to not be TCO'd to begin with.
>
> > The solution today is that when you compile a kernel with KCSAN, every
> > instrumented TU is compiled with -fno-optimize-sibling-calls. The
> > better solution would be to just mark KCSAN runtime functions somehow,
> > but permit tail calling other things. Although, I probably still want
> > to see the full trace, and would decide that having
> > -fno-optimize-sibling-calls is a small price to pay in a
> > debug-only-kernel to get complete traces.
> >
> > > ... if K were marked in this way, and J was compiled with visibility of
> > > this, J would stick around, but J's callers might not, and so the a
> > > trace might see:
> > >
> > >   A->J->K
> > >
> > > ... do you just care about the final caller, i.e. you just need
> > > certainty that J will be in the trace?
> >
> > Yes. But maybe it's a special problem that only sanitizers have.
>
> I reckon for basically any instrumentation we don't want calls to be
> TCO'd, though I'm not immediately sure of cases beyond sanitizers and
> mcount/fentry.

Thinking about this more, I think it's all debugging tools. E.g.
lockdep, if you lock/unlock at the end of a function, you might tail
call into lockdep. If the compiler applies TCO, and lockdep determines
there's a bug and then shows a trace, you'll have no idea where the
actual bug is. The kernel has lots of debugging facilities that add
instrumentation in this way. So perhaps it's a general debugging-tool
problem (rather than just sanitizers).

Thanks,
-- Marco

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Segher Boessenkool @ 2021-03-04 19:24 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Mark Rutland, Marco Elver, Catalin Marinas, linuxppc-dev, LKML,
	kasan-dev, Mark Brown, Paul Mackerras, linux-toolchains,
	Will Deacon, Linux ARM
In-Reply-To: <CAKwvOd=wBArMwvtDC8zV-QjQa5UuwWoxksQ8j+hUCZzbEAn+Fw@mail.gmail.com>

On Thu, Mar 04, 2021 at 09:54:44AM -0800, Nick Desaulniers wrote:
> On Thu, Mar 4, 2021 at 9:42 AM Marco Elver <elver@google.com> wrote:
> include/linux/compiler.h:246:
> prevent_tail_call_optimization
> 
> commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")

That is much heavier than needed (an mb()).  You can just put an empty
inline asm after a call before a return, and that call cannot be
optimised to a sibling call: (the end of a function is an implicit
return:)

Instead of:

void g(void);
void f(int x)
	if (x)
		g();
}

Do:

void g(void);
void f(int x)
	if (x)
		g();
	asm("");
}

This costs no extra instructions, and certainly not something as heavy
as an mb()!  It works without the "if" as well, of course, but with it
it is a more interesting example of a tail call.

Segher

^ permalink raw reply

* Re: [PATCH v2 7/7] ASoC: dt-bindings: imx-rpmsg: Add binding doc for rpmsg machine driver
From: Rob Herring @ 2021-03-04 20:04 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Linux-ALSA, Timur Tabi, Xiubo Li, Fabio Estevam, Shengjiu Wang,
	Takashi Iwai, Liam Girdwood, linux-kernel, Nicolin Chen,
	Mark Brown, linuxppc-dev
In-Reply-To: <CAA+D8ANOv91jr4381Acz1B2mZ6=Mx2J_2CMTGXmPKztv7bMjPA@mail.gmail.com>

On Thu, Feb 18, 2021 at 1:23 AM Shengjiu Wang <shengjiu.wang@gmail.com> wrote:
>
> On Thu, Feb 11, 2021 at 6:18 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Sun, Feb 07, 2021 at 06:23:55PM +0800, Shengjiu Wang wrote:
> > > Imx-rpmsg is a new added machine driver for supporting audio on Cortex-M
> > > core. The Cortex-M core will control the audio interface, DMA and audio
> > > codec, setup the pipeline, the audio driver on Cortex-A core side is just
> > > to communitcate with M core, it is a virtual sound card and don't touch
> > > the hardware.
> >
> > I don't understand why there are 2 nodes for this other than you happen
> > to want to split this into 2 Linux drivers. It's 1 h/w thing.
>
> This one is for the sound card machine driver.  Another one is
> for the sound card cpu dai driver. so there are 2 nodes.

You are explaining this to me in terms of drivers. Explain it in terms
of h/w blocks.

Rob

^ permalink raw reply

* Re: [PATCH 2/5] CMDLINE: drivers: of: ifdef out cmdline section
From: Daniel Walker @ 2021-03-04 20:48 UTC (permalink / raw)
  To: Rob Herring
  Cc: devicetree, Ruslan Ruslichenko, Daniel Gimpelevich, Frank Rowand,
	linuxppc-dev, X86 ML, open list:MIPS,
	linux-kernel@vger.kernel.org, xe-linux-external, Andrew Morton,
	Will Deacon
In-Reply-To: <CAL_JsqKnAMp0bkXzU-B8b8xx5fPC1R1NdOBn9Kpk=SONJL5paQ@mail.gmail.com>

On Thu, Mar 04, 2021 at 08:32:37AM -0600, Rob Herring wrote:
> On Wed, Mar 3, 2021 at 10:48 PM Daniel Walker <danielwa@cisco.com> wrote:
> >
> > It looks like there's some seepage of cmdline stuff into
> > the generic device tree code. This conflicts with the
> > generic cmdline implementation so I remove it in the case
> > when that's enabled.
> >
> > Cc: xe-linux-external@cisco.com
> > Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com>
> > Signed-off-by: Daniel Walker <danielwa@cisco.com>
> > ---
> >  drivers/of/fdt.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > index feb0f2d67fc5..cfe4f8d3c9f5 100644
> > --- a/drivers/of/fdt.c
> > +++ b/drivers/of/fdt.c
> > @@ -25,6 +25,7 @@
> >  #include <linux/serial_core.h>
> >  #include <linux/sysfs.h>
> >  #include <linux/random.h>
> > +#include <linux/cmdline.h>
> >
> >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> >  #include <asm/page.h>
> > @@ -1048,8 +1049,18 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
> >
> >         early_init_dt_check_for_initrd(node);
> >
> > +#ifdef CONFIG_GENERIC_CMDLINE
> 
> What I like about Christophe's version is it removes the old DT
> implementation. Who's going to convert the rest of the DT based
> arches? That's arm, arm64, hexagon, microblaze, nios2, openrisc,
> riscv, sh, and xtensa. Either separate the common code from the config
> like Christophe's version or these all need converting. Though it's
> fine to hash out patch 1 with a couple of arches first.
 
I'm limited in what I can test, so I can't know for sure that I have something
which works on those architectures. Even powerpc 64 is part of this series but
I can't really test it at this time. Also Cisco's needs out strip the
implementation of extend or override.

I have un-tested conversions for arm32, arm64, c6x, microblaze, nios2, and
openrisc. These could go into -next and we can see who complains. The
implementation on these architectures isn't all uniform.

> >         /* Retrieve command line */
> >         p = of_get_flat_dt_prop(node, "bootargs", &l);
> 
> This needs to be outside the ifdef.

Ok ..

Daniel

^ permalink raw reply

* Re: [PATCH 1/5] CMDLINE: add generic builtin command line
From: Daniel Walker @ 2021-03-04 21:20 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ob Herring, Ruslan Bilovol, Daniel Gimpelevich, linuxppc-dev, x86,
	linux-mips, linux-kernel, xe-linux-external, Andrew Morton,
	Will Deacon
In-Reply-To: <da33aa9e-ffd8-b012-0f2d-c9ad05f32b8f@csgroup.eu>

On Thu, Mar 04, 2021 at 08:00:49AM +0100, Christophe Leroy wrote:
> 
> 
> Le 04/03/2021 à 05:47, Daniel Walker a écrit :
> > This code allows architectures to use a generic builtin command line.
> > The state of the builtin command line options across architecture is
> > diverse. On x86 and mips they have pretty much the same code and the
> > code prepends the builtin command line onto the boot loader provided
> > one. On powerpc there is only a builtin override and nothing else.
> 
> This is not exact. powerpc has:
> CONFIG_FROM_BOOTLOADER
> CONFIG_EXTEND
> CONFIG_FORCE
 
I don't currently have ppc64 to test on, but CONFIG_FROM_BOOTLOADER should likely
stay, but the other two can come from the generic code.


> > 
> > The code in this commit unifies the code into a generic
> > header file under the CONFIG_GENERIC_CMDLINE option. When this
> > option is enabled the architecture can call the cmdline_add_builtin()
> > to add the builtin command line.
> > 
> > Cc: xe-linux-external@cisco.com
> > Signed-off-by: Ruslan Bilovol <rbilovol@cisco.com>
> > Signed-off-by: Daniel Walker <danielwa@cisco.com>
> > ---
> >   include/linux/cmdline.h | 75 +++++++++++++++++++++++++++++++++++++++++
> >   init/Kconfig            | 68 +++++++++++++++++++++++++++++++++++++
> >   2 files changed, 143 insertions(+)
> >   create mode 100644 include/linux/cmdline.h
> > 
> > diff --git a/include/linux/cmdline.h b/include/linux/cmdline.h
> > new file mode 100644
> > index 000000000000..f44011d1a9ee
> > --- /dev/null
> > +++ b/include/linux/cmdline.h
> > @@ -0,0 +1,75 @@
> 
> Missing the SPDX Licence Identifier
> 
> > +#ifndef _LINUX_CMDLINE_H
> > +#define _LINUX_CMDLINE_H
> > +
> > +/*
> > + *
> > + * Copyright (C) 2006,2021. Cisco Systems, Inc.
> > + *
> > + * Generic Append/Prepend cmdline support.
> > + */
> > +
> > +#if defined(CONFIG_GENERIC_CMDLINE) && defined(CONFIG_CMDLINE_BOOL)
> 
> I think it would be better if we can avoid the CONFIG_CMDLINE_BOOL.
> By making the CMDLINEs default to "" at all time, I think we can about that BOOL.

Wouldn't it be annoying if you have to deleted all the characters from two text
boxes vs. just disabling a single option ? What if you leave a space
accidentally , woops.

> > +
> > +#ifndef CONFIG_CMDLINE_OVERRIDE
> > +/*
> > + * This function will append or prepend a builtin command line to the command
> 
> As far as I understand, it doesn't "append _or_ prepend" but it does "append _and_ prepend"

I think the end results is accurately , no need to get pedantic.

> > + * line provided by the bootloader. Kconfig options can be used to alter
> > + * the behavior of this builtin command line.
> > + * @dest: The destination of the final appended/prepended string
> > + * @src: The starting string or NULL if there isn't one.
> > + * @tmp: temporary space used for prepending
> > + * @length: the maximum length of the strings above.
> 
> Missing some parameters here, but I think we should avoid those 'strlcpy'
> and 'strlcat', see later comment.
> 
> > + */
> > +static inline void
> > +__cmdline_add_builtin(char *dest, const char *src, char *tmp, unsigned long length,
> > +		size_t (*strlcpy)(char *dest, const char *src, size_t size),
> > +		size_t (*strlcat)(char *dest, const char *src, size_t count)
> 
> Don't use names that overide names of existing functions.
> 
> 'count' is __kernel_size_t not size_t
 
It's type checking all the parameters at compile time, it doesn't complain about
this that I've seen.


> > +		)
> > +{
> > +	if (src != dest && src != NULL) {
> > +		strlcpy(dest, " ", length);
> 
> Why do you need a space up front in that case ? Why not just copy the source to the destination ?

There may not be a space between them, it doesn't cost anything to have one.

> > +		strlcat(dest, src, length);
> > +	}
> > +
> > +	if (sizeof(CONFIG_CMDLINE_APPEND) > 1)
> > +		strlcat(dest, " " CONFIG_CMDLINE_APPEND, length);
> > +
> > +	if (sizeof(CONFIG_CMDLINE_PREPEND) > 1) {
> > +		strlcpy(tmp, CONFIG_CMDLINE_PREPEND " ", length);
> > +		strlcat(tmp, dest, length);
> > +		strlcpy(dest, tmp, length);
> 
> Could we use memmove(), or implement strmove() and avoid the temporary buffer at all ?

I don't really want to make drastic alteration like this, unless there is a
better reason for it. Most of this hasn't change inside Cisco's tree for almost a decade.

> > +	}
> > +}
> > +
> > +#define cmdline_add_builtin_custom(dest, src, length, label, strlcpy, strlcat) 			\
> 
> It is misleading to call parameters 'strlcpy' or 'strlcat', it hides that they are overriden.

I can change the names, it's not a big deal.

> > +{ 												\
> > +	if (sizeof(CONFIG_CMDLINE_PREPEND) > 1) { 						\
> > +		static label char cmdline_tmp_space[length]; 					\
> 
> Let the architecture define the temporary space when using the custom
> variant instead of just asking the architecture to provide the name of the
> section to use. powerpc already have prom_scratch for that.
 
How would it use this space exactly ? Is it large enough ? How is it managed?


> > +		__cmdline_add_builtin(dest, src, cmdline_tmp_space, length, strlcpy, strlcat); 	\
> > +	} else if (sizeof(CONFIG_CMDLINE_APPEND) > 1) { 					\
> > +		__cmdline_add_builtin(dest, src, NULL, length, strlcpy, strlcat); 		\
> > +	} 											\
> 
> Ah, so if I understand correctly, the user can set both
> CONFIG_CMDLINE_PREPEND and CONFIG_CMDLINE_APPEND but one of them is silently
> ignored.
 
Nothing should be ignored. Either one set gets you into the function, just one
has to create a variable.

> Then I think we should just offer the user to set one, name it
> CONFIG_CMDLINE then ask him to choose between FORCE, APPEND or PREPEND.

No, this doesn't work for Cisco. We need to functionality of this solution,
nothing less..

> > +}
> > +#define cmdline_add_builtin(dest, src, length)	                           \
> > +	cmdline_add_builtin_custom(dest, src, length, __initdata, &strlcpy, &strlcat)
> > +#else
> > +#define cmdline_add_builtin(dest, src, length)				   \
> > +{								  	   \
> > +	strlcpy(dest, CONFIG_CMDLINE_PREPEND " " CONFIG_CMDLINE_APPEND,    \
> > +		length);		   				   \
> > +}
> > +#endif /* !CONFIG_CMDLINE_OVERRIDE */
> > +
> > +#else
> > +#define cmdline_add_builtin_custom(dest, src, length, label, strlcpy, strlcat) { \
> > +	if (src != NULL) 							 \
> > +		strlcpy(dest, src, length);	 				 \
> > +}
> > +
> > +#define cmdline_add_builtin(dest, src, length) { 				\
> > +	cmdline_add_builtin_custom(dest, src, length, strlcpy, strlcat); 	\
> > +}
> > +#endif /* CONFIG_GENERIC_CMDLINE */
> 
> I'd rather avoid all those macros and use static inline functions instead.
 
The last two in the off case might be able to be converted.

> For the strlcpy() and strlcat(), use another name, for instance
> cmdline_strlcpy and cmdline_strlcat. Then at the begining of the file,
> define them as strlcpy ad strlcat unless they are already defined to
> something else (by the architecture before including cmdline.h).
 

Your duplicating your comments.


> > +
> > +
> > +#endif /* _LINUX_CMDLINE_H */
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 29ad68325028..28363ab07cd4 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -2032,6 +2032,74 @@ config PROFILING
> >   config TRACEPOINTS
> >   	bool
> > +config GENERIC_CMDLINE
> > +	bool
> > +
> > +if GENERIC_CMDLINE
> > +
> > +config CMDLINE_BOOL
> > +	bool "Built-in kernel command line"
> 
> We don't need the CMDLINE_BOOL, just have CMDLINE always "" by default.
 
I think it's more usable as explained above.


> > +	help
> > +	  Allow for specifying boot arguments to the kernel at
> > +	  build time.  On some systems (e.g. embedded ones), it is
> > +	  necessary or convenient to provide some or all of the
> > +	  kernel boot arguments with the kernel itself (that is,
> > +	  to not rely on the boot loader to provide them.)
> > +
> > +	  To compile command line arguments into the kernel,
> > +	  set this option to 'Y', then fill in the
> > +	  the boot arguments in CONFIG_CMDLINE.
> > +
> > +	  Systems with fully functional boot loaders (i.e. non-embedded)
> > +	  should leave this option set to 'N'.
> > +
> > +config CMDLINE_APPEND
> 
> As far as I understand, the generic code will only take CMDLINE_APPEND into
> account if CMDLINE_PREPEND doesn't exist, otherwise it will silently ignore
> it.

No, that's not how that works.

> Only offer one string: CONFIG_CMDLINE, and make the use choose between APPEND, EXTEND or OVERRIDE

No. That's not how this works.

> > +	string "Built-in kernel command string append"
> > +	depends on CMDLINE_BOOL
> > +	default ""
> > +	help
> > +	  Enter arguments here that should be compiled into the kernel
> > +	  image and used at boot time.  If the boot loader provides a
> > +	  command line at boot time, this string is appended to it to
> > +	  form the full kernel command line, when the system boots.
> > +
> > +	  However, you can use the CONFIG_CMDLINE_OVERRIDE option to
> > +	  change this behavior.
> > +
> > +	  In most cases, the command line (whether built-in or provided
> > +	  by the boot loader) should specify the device for the root
> > +	  file system.
> > +
> > +config CMDLINE_PREPEND
> > +	string "Built-in kernel command string prepend"
> > +	depends on CMDLINE_BOOL
> > +	default ""
> > +	help
> > +	  Enter arguments here that should be compiled into the kernel
> > +	  image and used at boot time.  If the boot loader provides a
> > +	  command line at boot time, this string is prepended to it to
> > +	  form the full kernel command line, when the system boots.
> > +
> > +	  However, you can use the CONFIG_CMDLINE_OVERRIDE option to
> > +	  change this behavior.
> > +
> > +	  In most cases, the command line (whether built-in or provided
> > +	  by the boot loader) should specify the device for the root
> > +	  file system.
> > +
> > +config CMDLINE_OVERRIDE
> > +	bool "Built-in command line overrides boot loader arguments"
> > +	depends on CMDLINE_BOOL
> > +	help
> > +	  Set this option to 'Y' to have the kernel ignore the boot loader
> > +	  command line, and use ONLY the built-in command line. In this case
> > +	  append and prepend strings are concatenated to form the full
> > +	  command line.
> > +
> > +	  This is used to work around broken boot loaders.  This should
> > +	  be set to 'N' under normal conditions.
> > +endif
> > +
> >   endmenu		# General setup
> >   source "arch/Kconfig"
> > 
> 
> Christophe


Most of your comments are the kind of things this code went thru on it's first
implementation, and were discarded for a reason during usage and testing.

Daniel


^ permalink raw reply

* Re: [PATCH 2/5] CMDLINE: drivers: of: ifdef out cmdline section
From: Daniel Walker @ 2021-03-04 21:24 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: ob Herring, Ruslan Ruslichenko, Daniel Gimpelevich, Frank Rowand,
	devicetree, linuxppc-dev, x86, linux-mips, linux-kernel,
	Rob Herring, xe-linux-external, Andrew Morton, Will Deacon
In-Reply-To: <2b0081aa-52af-a4ab-7481-6e125bd103d6@csgroup.eu>

On Thu, Mar 04, 2021 at 08:09:52AM +0100, Christophe Leroy wrote:
> 
> 
> Le 04/03/2021 à 05:47, Daniel Walker a écrit :
> > It looks like there's some seepage of cmdline stuff into
> > the generic device tree code. This conflicts with the
> > generic cmdline implementation so I remove it in the case
> > when that's enabled.
> > 
> > Cc: xe-linux-external@cisco.com
> > Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com>
> > Signed-off-by: Daniel Walker <danielwa@cisco.com>
> > ---
> >   drivers/of/fdt.c | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> > 
> > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > index feb0f2d67fc5..cfe4f8d3c9f5 100644
> > --- a/drivers/of/fdt.c
> > +++ b/drivers/of/fdt.c
> > @@ -25,6 +25,7 @@
> >   #include <linux/serial_core.h>
> >   #include <linux/sysfs.h>
> >   #include <linux/random.h>
> > +#include <linux/cmdline.h>
> >   #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> >   #include <asm/page.h>
> > @@ -1048,8 +1049,18 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
> >   	early_init_dt_check_for_initrd(node);
> > +#ifdef CONFIG_GENERIC_CMDLINE
> >   	/* Retrieve command line */
> >   	p = of_get_flat_dt_prop(node, "bootargs", &l);
> > +
> > +	/*
> > +	 * The builtin command line will be added here, or it can override
> > +	 * with the DT bootargs.
> > +	 */
> > +	cmdline_add_builtin(data,
> > +			    ((p != NULL && l > 0) ? p : NULL), /* This is sanity checking */
> 
> Can we do more simple ? If p is NULL, p is already NULL, so (l > 0 ? p : NULL) should be enough.


I believe Rob gave me this line. Maybe he can comment on it.

Daniel

^ permalink raw reply

* Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends
From: Segher Boessenkool @ 2021-03-04 21:54 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marco Elver, Catalin Marinas, linuxppc-dev, LKML, kasan-dev,
	broonie, Paul Mackerras, Will Deacon, linux-arm-kernel
In-Reply-To: <20210304145730.GC54534@C02TD0UTHF1T.local>

Hi!

On Thu, Mar 04, 2021 at 02:57:30PM +0000, Mark Rutland wrote:
> It looks like GCC is happy to give us the function-entry-time FP if we use
> __builtin_frame_address(1),

From the GCC manual:
     Calling this function with a nonzero argument can have
     unpredictable effects, including crashing the calling program.  As
     a result, calls that are considered unsafe are diagnosed when the
     '-Wframe-address' option is in effect.  Such calls should only be
     made in debugging situations.

It *does* warn (the warning is in -Wall btw), on both powerpc and
aarch64.  Furthermore, using this builtin causes lousy code (it forces
the use of a frame pointer, which we normally try very hard to optimise
away, for good reason).

And, that warning is not an idle warning.  Non-zero arguments to
__builtin_frame_address can crash the program.  It won't on simpler
functions, but there is no real definition of what a simpler function
*is*.  It is meant for debugging, not for production use (this is also
why no one has bothered to make it faster).

On Power it should work, but on pretty much any other arch it won't.

> Unless we can get some strong guarantees from compiler folk such that we
> can guarantee a specific function acts boundary for unwinding (and
> doesn't itself get split, etc), the only reliable way I can think to
> solve this requires an assembly trampoline. Whatever we do is liable to
> need some invasive rework.

You cannot get such a guarantee, other than not letting the compiler
see into the routine at all, like with assembler code (not inline asm,
real assembler code).

The real way forward is to bite the bullet and to no longer pretend you
can do a full backtrace from just the stack contents.  You cannot.

Segher

^ permalink raw reply

* Re: [PATCH 2/5] CMDLINE: drivers: of: ifdef out cmdline section
From: Rob Herring @ 2021-03-04 22:29 UTC (permalink / raw)
  To: Daniel Walker
  Cc: devicetree, Ruslan Ruslichenko, Daniel Gimpelevich, Frank Rowand,
	linuxppc-dev, X86 ML, open list:MIPS,
	linux-kernel@vger.kernel.org, xe-linux-external, Andrew Morton,
	Will Deacon
In-Reply-To: <20210304212448.GK109100@zorba>

On Thu, Mar 4, 2021 at 3:24 PM Daniel Walker <danielwa@cisco.com> wrote:
>
> On Thu, Mar 04, 2021 at 08:09:52AM +0100, Christophe Leroy wrote:
> >
> >
> > Le 04/03/2021 à 05:47, Daniel Walker a écrit :
> > > It looks like there's some seepage of cmdline stuff into
> > > the generic device tree code. This conflicts with the
> > > generic cmdline implementation so I remove it in the case
> > > when that's enabled.
> > >
> > > Cc: xe-linux-external@cisco.com
> > > Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com>
> > > Signed-off-by: Daniel Walker <danielwa@cisco.com>
> > > ---
> > >   drivers/of/fdt.c | 12 ++++++++++++
> > >   1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > index feb0f2d67fc5..cfe4f8d3c9f5 100644
> > > --- a/drivers/of/fdt.c
> > > +++ b/drivers/of/fdt.c
> > > @@ -25,6 +25,7 @@
> > >   #include <linux/serial_core.h>
> > >   #include <linux/sysfs.h>
> > >   #include <linux/random.h>
> > > +#include <linux/cmdline.h>
> > >   #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > >   #include <asm/page.h>
> > > @@ -1048,8 +1049,18 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
> > >     early_init_dt_check_for_initrd(node);
> > > +#ifdef CONFIG_GENERIC_CMDLINE
> > >     /* Retrieve command line */
> > >     p = of_get_flat_dt_prop(node, "bootargs", &l);
> > > +
> > > +   /*
> > > +    * The builtin command line will be added here, or it can override
> > > +    * with the DT bootargs.
> > > +    */
> > > +   cmdline_add_builtin(data,
> > > +                       ((p != NULL && l > 0) ? p : NULL), /* This is sanity checking */
> >
> > Can we do more simple ? If p is NULL, p is already NULL, so (l > 0 ? p : NULL) should be enough.
>
>
> I believe Rob gave me this line. Maybe he can comment on it.

It's an obvious improvement and LGTM.

^ permalink raw reply

* Re: [Freedreno] [PATCH 16/17] iommu: remove DOMAIN_ATTR_IO_PGTABLE_CFG
From: Rob Clark @ 2021-03-04 23:11 UTC (permalink / raw)
  To: Robin Murphy
  Cc: freedreno, Sai Prakash Ranjan, kvm, Joerg Roedel, linuxppc-dev,
	dri-devel, Li Yang,
	list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>, ,
	netdev, David Woodhouse, linux-arm-msm, virtualization,
	Will Deacon, Christoph Hellwig,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
In-Reply-To: <d567ad5c-5f89-effa-7260-88c6d86b4695@arm.com>

On Thu, Mar 4, 2021 at 7:48 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2021-03-01 08:42, Christoph Hellwig wrote:
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Moreso than the previous patch, where the feature is at least relatively
> generic (note that there's a bunch of in-flight development around
> DOMAIN_ATTR_NESTING), I'm really not convinced that it's beneficial to
> bloat the generic iommu_ops structure with private driver-specific
> interfaces. The attribute interface is a great compromise for these
> kinds of things, and you can easily add type-checked wrappers around it
> for external callers (maybe even make the actual attributes internal
> between the IOMMU core and drivers) if that's your concern.

I suppose if this is *just* for the GPU we could move it into adreno_smmu_priv..

But one thing I'm not sure about is whether
IO_PGTABLE_QUIRK_ARM_OUTER_WBWA is something that other devices
*should* be using as well, but just haven't gotten around to yet.

BR,
-R

> Robin.
>
> > ---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  2 +-
> >   drivers/iommu/arm/arm-smmu/arm-smmu.c   | 40 +++++++------------------
> >   drivers/iommu/iommu.c                   |  9 ++++++
> >   include/linux/iommu.h                   |  9 +++++-
> >   4 files changed, 29 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > index 0f184c3dd9d9ec..78d98ab2ee3a68 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > @@ -191,7 +191,7 @@ void adreno_set_llc_attributes(struct iommu_domain *iommu)
> >       struct io_pgtable_domain_attr pgtbl_cfg;
> >
> >       pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_ARM_OUTER_WBWA;
> > -     iommu_domain_set_attr(iommu, DOMAIN_ATTR_IO_PGTABLE_CFG, &pgtbl_cfg);
> > +     iommu_domain_set_pgtable_attr(iommu, &pgtbl_cfg);
> >   }
> >
> >   struct msm_gem_address_space *
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > index 2e17d990d04481..2858999c86dfd1 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -1515,40 +1515,22 @@ static int arm_smmu_domain_enable_nesting(struct iommu_domain *domain)
> >       return ret;
> >   }
> >
> > -static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
> > -                                 enum iommu_attr attr, void *data)
> > +static int arm_smmu_domain_set_pgtable_attr(struct iommu_domain *domain,
> > +             struct io_pgtable_domain_attr *pgtbl_cfg)
> >   {
> > -     int ret = 0;
> >       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > +     int ret = -EPERM;
> >
> > -     mutex_lock(&smmu_domain->init_mutex);
> > -
> > -     switch(domain->type) {
> > -     case IOMMU_DOMAIN_UNMANAGED:
> > -             switch (attr) {
> > -             case DOMAIN_ATTR_IO_PGTABLE_CFG: {
> > -                     struct io_pgtable_domain_attr *pgtbl_cfg = data;
> > -
> > -                     if (smmu_domain->smmu) {
> > -                             ret = -EPERM;
> > -                             goto out_unlock;
> > -                     }
> > +     if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> > +             return -EINVAL;
> >
> > -                     smmu_domain->pgtbl_cfg = *pgtbl_cfg;
> > -                     break;
> > -             }
> > -             default:
> > -                     ret = -ENODEV;
> > -             }
> > -             break;
> > -     case IOMMU_DOMAIN_DMA:
> > -             ret = -ENODEV;
> > -             break;
> > -     default:
> > -             ret = -EINVAL;
> > +     mutex_lock(&smmu_domain->init_mutex);
> > +     if (!smmu_domain->smmu) {
> > +             smmu_domain->pgtbl_cfg = *pgtbl_cfg;
> > +             ret = 0;
> >       }
> > -out_unlock:
> >       mutex_unlock(&smmu_domain->init_mutex);
> > +
> >       return ret;
> >   }
> >
> > @@ -1609,7 +1591,7 @@ static struct iommu_ops arm_smmu_ops = {
> >       .device_group           = arm_smmu_device_group,
> >       .dma_use_flush_queue    = arm_smmu_dma_use_flush_queue,
> >       .dma_enable_flush_queue = arm_smmu_dma_enable_flush_queue,
> > -     .domain_set_attr        = arm_smmu_domain_set_attr,
> > +     .domain_set_pgtable_attr = arm_smmu_domain_set_pgtable_attr,
> >       .domain_enable_nesting  = arm_smmu_domain_enable_nesting,
> >       .of_xlate               = arm_smmu_of_xlate,
> >       .get_resv_regions       = arm_smmu_get_resv_regions,
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 2e9e058501a953..8490aefd4b41f8 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -2693,6 +2693,15 @@ int iommu_domain_enable_nesting(struct iommu_domain *domain)
> >   }
> >   EXPORT_SYMBOL_GPL(iommu_domain_enable_nesting);
> >
> > +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain,
> > +             struct io_pgtable_domain_attr *pgtbl_cfg)
> > +{
> > +     if (!domain->ops->domain_set_pgtable_attr)
> > +             return -EINVAL;
> > +     return domain->ops->domain_set_pgtable_attr(domain, pgtbl_cfg);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_domain_set_pgtable_attr);
> > +
> >   void iommu_get_resv_regions(struct device *dev, struct list_head *list)
> >   {
> >       const struct iommu_ops *ops = dev->bus->iommu_ops;
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index aed88aa3bd3edf..39d3ed4d2700ac 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -40,6 +40,7 @@ struct iommu_domain;
> >   struct notifier_block;
> >   struct iommu_sva;
> >   struct iommu_fault_event;
> > +struct io_pgtable_domain_attr;
> >
> >   /* iommu fault flags */
> >   #define IOMMU_FAULT_READ    0x0
> > @@ -107,7 +108,6 @@ enum iommu_cap {
> >    */
> >
> >   enum iommu_attr {
> > -     DOMAIN_ATTR_IO_PGTABLE_CFG,
> >       DOMAIN_ATTR_MAX,
> >   };
> >
> > @@ -196,6 +196,7 @@ struct iommu_iotlb_gather {
> >    * @dma_enable_flush_queue: Try to enable the DMA flush queue
> >    * @domain_set_attr: Change domain attributes
> >    * @domain_enable_nesting: Enable nesting
> > + * @domain_set_pgtable_attr: Set io page table attributes
> >    * @get_resv_regions: Request list of reserved regions for a device
> >    * @put_resv_regions: Free list of reserved regions for a device
> >    * @apply_resv_region: Temporary helper call-back for iova reserved ranges
> > @@ -249,6 +250,8 @@ struct iommu_ops {
> >       int (*domain_set_attr)(struct iommu_domain *domain,
> >                              enum iommu_attr attr, void *data);
> >       int (*domain_enable_nesting)(struct iommu_domain *domain);
> > +     int (*domain_set_pgtable_attr)(struct iommu_domain *domain,
> > +                     struct io_pgtable_domain_attr *pgtbl_cfg);
> >
> >       /* Request/Free a list of reserved regions for a device */
> >       void (*get_resv_regions)(struct device *dev, struct list_head *list);
> > @@ -493,9 +496,13 @@ extern int iommu_group_id(struct iommu_group *group);
> >   extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
> >
> >   bool iommu_dma_use_flush_queue(struct iommu_domain *domain);
> > +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain,
> > +             struct io_pgtable_domain_attr *pgtbl_cfg);
> >   extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
> >                                void *data);
> >   int iommu_domain_enable_nesting(struct iommu_domain *domain);
> > +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain,
> > +             struct io_pgtable_domain_attr *pgtbl_cfg);
> >
> >   extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev,
> >                             unsigned long iova, int flags);
> >
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply

* linux-next: manual merge of the tty tree with the powerpc-fixes tree
From: Stephen Rothwell @ 2021-03-05  1:05 UTC (permalink / raw)
  To: Greg KH, Michael Ellerman, PowerPC
  Cc: Uwe Kleine-König, Jiri Slaby, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Linux Next Mailing List, Jiri Slaby

[-- Attachment #1: Type: text/plain, Size: 802 bytes --]

Hi all,

Today's linux-next merge of the tty tree got a conflict in:

  drivers/tty/hvc/hvcs.c

between commit:

  386a966f5ce7 ("vio: make remove callback return void")

from the powerpc-fixes tree and commit:

  fb8d350c291c ("tty: hvc, drop unneeded forward declarations")

from the tty tree.

I fixed it up (they both removed the forward decalrartion of
hvcs_remove(), but the latter removed more) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH] ibmvnic: remove excessive irqsave
From: angkery @ 2021-03-05  1:43 UTC (permalink / raw)
  To: mpe, benh, paulus, drt, ljp, sukadev, davem, kuba
  Cc: netdev, linuxppc-dev, linux-kernel, Junlin Yang

From: Junlin Yang <yangjunlin@yulong.com>

ibmvnic_remove locks multiple spinlocks while disabling interrupts:
spin_lock_irqsave(&adapter->state_lock, flags);
spin_lock_irqsave(&adapter->rwi_lock, flags);

there is no need for the second irqsave,since interrupts are disabled
at that point, so remove the second irqsave:
spin_lock_irqsave(&adapter->state_lock, flags);
spin_lock(&adapter->rwi_lock);

Generated by: ./scripts/coccinelle/locks/flags.cocci
./drivers/net/ethernet/ibm/ibmvnic.c:5413:1-18:
ERROR: nested lock+irqsave that reuses flags from line 5404.

Signed-off-by: Junlin Yang <yangjunlin@yulong.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 2464c8a..a52668d 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5408,9 +5408,9 @@ static void ibmvnic_remove(struct vio_dev *dev)
 	 * after setting state, so __ibmvnic_reset() which is called
 	 * from the flush_work() below, can make progress.
 	 */
-	spin_lock_irqsave(&adapter->rwi_lock, flags);
+	spin_lock(&adapter->rwi_lock);
 	adapter->state = VNIC_REMOVING;
-	spin_unlock_irqrestore(&adapter->rwi_lock, flags);
+	spin_unlock(&adapter->rwi_lock);
 
 	spin_unlock_irqrestore(&adapter->state_lock, flags);
 
-- 
1.9.1



^ permalink raw reply related

* Re: [PATCH v2 7/7] ASoC: dt-bindings: imx-rpmsg: Add binding doc for rpmsg machine driver
From: Shengjiu Wang @ 2021-03-05  2:55 UTC (permalink / raw)
  To: Rob Herring
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Linux-ALSA, Timur Tabi, Xiubo Li, Fabio Estevam, Shengjiu Wang,
	Takashi Iwai, Liam Girdwood, linux-kernel, Nicolin Chen,
	Mark Brown, linuxppc-dev
In-Reply-To: <CAL_JsqK1uc82hfdE4yj0ye-D6vygiqWkDVW96NOb-8kEFVqHMg@mail.gmail.com>

Hi

On Fri, Mar 5, 2021 at 4:05 AM Rob Herring <robh@kernel.org> wrote:
>
> On Thu, Feb 18, 2021 at 1:23 AM Shengjiu Wang <shengjiu.wang@gmail.com> wrote:
> >
> > On Thu, Feb 11, 2021 at 6:18 AM Rob Herring <robh@kernel.org> wrote:
> > >
> > > On Sun, Feb 07, 2021 at 06:23:55PM +0800, Shengjiu Wang wrote:
> > > > Imx-rpmsg is a new added machine driver for supporting audio on Cortex-M
> > > > core. The Cortex-M core will control the audio interface, DMA and audio
> > > > codec, setup the pipeline, the audio driver on Cortex-A core side is just
> > > > to communitcate with M core, it is a virtual sound card and don't touch
> > > > the hardware.
> > >
> > > I don't understand why there are 2 nodes for this other than you happen
> > > to want to split this into 2 Linux drivers. It's 1 h/w thing.
> >
> > This one is for the sound card machine driver.  Another one is
> > for the sound card cpu dai driver. so there are 2 nodes.
>
> You are explaining this to me in terms of drivers. Explain it in terms
> of h/w blocks.
>

Yes, there is only 1 h/w block, which is (MU) message unit

As the sound card needs a cpu dai node and sound card node,
so from the driver's perspective, I use two nodes.

Seems It is hard to only use one node for my case.
or do you have any suggestions?

Best regards
Wang shengjiu

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox