The Linux Kernel Mailing List


* 2 BUG_ON crashes
From: shankarapailoor @ 2018-05-29 13:34 UTC (permalink / raw)
  To: clm, jbacik; +Cc: linux-btrfs, linux-kernel

Hi,

I have been fuzzing linux 4.17-rc4 with Syzkaller and it triggered the
following BUG_ONs.

1. kernel BUG at fs/btrfs/file.c:997
(https://elixir.bootlin.com/linux/v4.17-rc4/source/fs/btrfs/file.c#L997)

Stack Trace: https://pastebin.com/QZeDnRvm

2. kernel BUG at fs/btrfs/extent_io.c:4049
(https://elixir.bootlin.com/linux/v4.17-rc4/source/fs/btrfs/extent_io.c#L4049)

Stack Trace: https://pastebin.com/qZN5qF3e

Both bugs appear to be triggered by fault injection. Could the
BUG_ONs be changed to warnings?

Regards,
Shankara Pailoor

* RE: [PATCH 13/15] arm: dts: r8a7743: Add missing OPP properties for CPUs
From: Biju Das @ 2018-05-29 13:33 UTC (permalink / raw)
  To: Simon Horman, Viresh Kumar
  Cc: arm@kernel.org, Magnus Damm, Rob Herring, Mark Rutland,
	Vincent Guittot, ionela.voinescu@arm.com, Daniel Lezcano,
	chris.redpath@arm.com, linux-renesas-soc@vger.kernel.org,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20180528115832.b2wovvanypxkgalj@verge.net.au>

Hi All,

I have tested this patch on RZ/G1M and I didn't find any issues. r8a7743 is similar to r8a7791. So I assume you will apply the same patch for other R-SoC devices as well.

Apart from this, maybe we need to update the OPP binding documentation, i.e., extend the operating-points usage to other cores in the cluster (Binding 1: operating-points).

Regards,
Biju

> -----Original Message-----
> From: Simon Horman [mailto:horms@verge.net.au]
> Sent: 28 May 2018 12:59
> To: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: arm@kernel.org; Magnus Damm <magnus.damm@gmail.com>; Rob
> Herring <robh+dt@kernel.org>; Mark Rutland <mark.rutland@arm.com>;
> Vincent Guittot <vincent.guittot@linaro.org>; ionela.voinescu@arm.com;
> Daniel Lezcano <daniel.lezcano@linaro.org>; chris.redpath@arm.com; linux-
> renesas-soc@vger.kernel.org; devicetree@vger.kernel.org; linux-
> kernel@vger.kernel.org; Biju Das <biju.das@bp.renesas.com>
> Subject: Re: [PATCH 13/15] arm: dts: r8a7743: Add missing OPP properties for
> CPUs
>
> On Mon, May 28, 2018 at 04:28:31PM +0530, Viresh Kumar wrote:
> > On 28-05-18, 11:23, Simon Horman wrote:
> > > [Cc Biju Das]
> > >
> > > On Fri, May 25, 2018 at 04:01:59PM +0530, Viresh Kumar wrote:
> > > > The OPP properties, like "operating-points", should either be
> > > > present for all the CPUs of a cluster or none. If these are
> > > > present only for a subset of CPUs of a cluster then things will
> > > > start falling apart as soon as the CPUs are brought online in a
> > > > different order. For example, this will happen because the
> > > > operating system looks for such properties in the CPU node it is
> > > > trying to bring up, so that it can create an OPP table.
> > > >
> > > > Add such missing properties.
> > > >
> > > > Fix other missing property (clock latency) as well to make it all
> > > > work.
> > > >
> > > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > >
> > > Thanks, this looks good to me and it looks like it should have:
> > >
> > > Fixes: 0417814ea140 ("ARM: dts: r8a7743: Add OPP table for frequency
> > > scaling")
> >
> > Sure.
> >
> > Will you be picking this patch directly and send it part of your pull
> > request ? Maybe add Fixes tag then only ?
>
> Yes, that is my plan. I can handle adding the Fixes tag.
> But I'll wait to see if Biju has any feedback first.



* Re: [RFT v3 0/4] Perf script: Add python script for CoreSight trace disassembler
From: Arnaldo Carvalho de Melo @ 2018-05-29 13:32 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Leo Yan, Jonathan Corbet, Robert Walker, Mike Leach, Kim Phillips,
	Tor Jeremiassen, Peter Zijlstra, Ingo Molnar, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	open list:DOCUMENTATION, Linux Kernel Mailing List, coresight
In-Reply-To: <CANLsYkzn5qyzjxMiCPQ1GxyNjhHJp-2H6Lds11HP9rG5xug0FA@mail.gmail.com>

On Mon, May 28, 2018 at 03:53:42PM -0600, Mathieu Poirier wrote:
> On 28 May 2018 at 14:03, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > On Mon, May 28, 2018 at 04:44:59PM +0800, Leo Yan wrote:
> >> This patch series adds support for using 'perf script' as a CoreSight
> >> trace disassembler. To that end, it adds a new python script that
> >> parses CoreSight tracing events and uses the 'objdump' command to
> >> produce disassembled lines, generating a readable program execution
> >> flow for reviewing tracing data.
> >>
> >> Patch 0001 is one fixing patch to generate samples for the start packet
> >> and exception packets.
> >>
> >> Patch 0002 is the prerequisite to add addr into sample dict, so this
> >> value can be used by python script to analyze instruction range.
> >>
> >> Patch 0003 is to add python script for trace disassembler.
> >>
> >> Patch 0004 is to add doc to explain python script usage and give
> >> example for it.
> >>
> >> This patch series has been rebased on acme git tree [1] with the commit
> >> 19422a9f2a3b ("perf tools: Fix kernel_start for PTI on x86") and tested
> >> on Hikey (ARM64 octa CA53 cores).
> >
> > Thanks, applied to perf/core.
> 
> Please hold off on that Arnaldo - I'm currently reviewing the set and
> I think some things can be improved.

Ok, I dropped all but the one adding sample->addr to the python
dictionary, that is ok to cherry pick.

- Arnaldo

* Re: [PATCH v5 02/10] sched/rt: add rt_rq utilization tracking
From: Vincent Guittot @ 2018-05-29 13:29 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Rafael J. Wysocki,
	Juri Lelli, Dietmar Eggemann, Morten Rasmussen, viresh kumar,
	Valentin Schneider, Quentin Perret
In-Reply-To: <20180525155437.GE30654@e110439-lin>

Hi Patrick,

On 25 May 2018 at 17:54, Patrick Bellasi <patrick.bellasi@arm.com> wrote:
> On 25-May 15:12, Vincent Guittot wrote:
>> schedutil governor relies on cfs_rq's util_avg to choose the OPP when cfs
>                                                                        ^
>                                                                      only
> otherwise, while RT tasks are running we go to max.
>
>> tasks are running.
>> When the CPU is overloaded by cfs and rt tasks, cfs tasks
>                   ^^^^^^^^^^
> I would say we always have the problem whenever an RT task preempts a
> CFS one, even just for a few ms, don't we?

The problem only happens when there is not enough time to run all
tasks (rt and cfs). If the cfs task is preempted for a few ms and the main
impact is only a delay in its execution, but there is still enough time
to do cfs jobs (the cpu goes back to idle from time to time), there is no
"real" problem. For now, it means that it's not a problem as long as
the rt task doesn't take more than the margin that schedutil uses to
select a frequency: (max_freq + (max_freq >> 2)) * util / max_capacity
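
As a rough illustration of that margin (a sketch under assumptions, not the
kernel's code): schedutil asks for roughly 1.25 times the frequency that
utilization alone would need, so the request reaches the maximum frequency
once utilization passes about 80% of capacity. The function names and the
example numbers below are made up for illustration.

```c
#include <assert.h>

/*
 * Illustrative model of the frequency request described above:
 * 1.25 * max_freq * util / max_capacity, using a shift for the 25%.
 */
static unsigned long next_freq(unsigned long max_freq, unsigned long util,
			       unsigned long max_capacity)
{
	assert(max_capacity > 0);
	return (max_freq + (max_freq >> 2)) * util / max_capacity;
}

/*
 * Returns 1 when the request already reaches max_freq, i.e. when the
 * utilization has used up the 25% headroom.
 */
static int saturates(unsigned long max_freq, unsigned long util,
		     unsigned long max_capacity)
{
	return next_freq(max_freq, util, max_capacity) >= max_freq;
}
```

With max_freq = 2000000 kHz and max_capacity = 1024, a utilization of 512
requests 1250000 kHz, and the request saturates at a utilization of 820.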

>
>> are preempted by rt tasks and in this case util_avg reflects the remaining
>> capacity but not what cfs want to use. In such case, schedutil can select a
>> lower OPP whereas the CPU is overloaded. In order to have a more accurate
>> view of the utilization of the CPU, we track the utilization that is
>> "stolen" by rt tasks.
>>
>> rt_rq uses rq_clock_task and cfs_rq uses cfs_rq_clock_task but they are
>> the same at the root group level, so the PELT windows of the util_sum are
>> aligned.
>>
>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>> ---
>>  kernel/sched/fair.c  | 15 ++++++++++++++-
>>  kernel/sched/pelt.c  | 23 +++++++++++++++++++++++
>>  kernel/sched/pelt.h  |  7 +++++++
>>  kernel/sched/rt.c    |  8 ++++++++
>>  kernel/sched/sched.h |  7 +++++++
>>  5 files changed, 59 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 6390c66..fb18bcc 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7290,6 +7290,14 @@ static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq)
>>       return false;
>>  }
>>
>> +static inline bool rt_rq_has_blocked(struct rq *rq)
>> +{
>> +     if (rq->avg_rt.util_avg)
>
> Should use READ_ONCE?

I was expecting that there will be only one read by default but I can
add READ_ONCE

>
>> +             return true;
>> +
>> +     return false;
>
> What about just:
>
>        return READ_ONCE(rq->avg_rt.util_avg);
>
> ?

This function is renamed and extended with other tracking in the
following patches, so we have to test several values in the function.
That's also why there is an if test: additional if tests are
going to be added.

>
>> +}
>> +
>>  #ifdef CONFIG_FAIR_GROUP_SCHED
>>
>>  static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
>> @@ -7349,6 +7357,10 @@ static void update_blocked_averages(int cpu)
>>               if (cfs_rq_has_blocked(cfs_rq))
>>                       done = false;
>>       }
>> +     update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
>> +     /* Don't need periodic decay once load/util_avg are null */
>> +     if (rt_rq_has_blocked(rq))
>> +             done = false;
>>
>>  #ifdef CONFIG_NO_HZ_COMMON
>>       rq->last_blocked_load_update_tick = jiffies;
>> @@ -7414,9 +7426,10 @@ static inline void update_blocked_averages(int cpu)
>>       rq_lock_irqsave(rq, &rf);
>>       update_rq_clock(rq);
>>       update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
>> +     update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
>>  #ifdef CONFIG_NO_HZ_COMMON
>>       rq->last_blocked_load_update_tick = jiffies;
>> -     if (!cfs_rq_has_blocked(cfs_rq))
>> +     if (!cfs_rq_has_blocked(cfs_rq) && !rt_rq_has_blocked(rq))
>>               rq->has_blocked_load = 0;
>>  #endif
>>       rq_unlock_irqrestore(rq, &rf);
>> diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
>> index e6ecbb2..213b922 100644
>> --- a/kernel/sched/pelt.c
>> +++ b/kernel/sched/pelt.c
>> @@ -309,3 +309,26 @@ int __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq)
>>
>>       return 0;
>>  }
>> +
>> +/*
>> + * rt_rq:
>> + *
>> + *   util_sum = \Sum se->avg.util_sum but se->avg.util_sum is not tracked
>> + *   util_sum = cpu_scale * load_sum
>> + *   runnable_load_sum = load_sum
>> + *
>> + */
>> +
>> +int update_rt_rq_load_avg(u64 now, struct rq *rq, int running)
>> +{
>> +     if (___update_load_sum(now, rq->cpu, &rq->avg_rt,
>> +                             running,
>> +                             running,
>> +                             running)) {
>> +
>
> Not needed empty line?

yes probably.

This empty line comes from the copy/paste of __update_load_avg_cfs_rq().
I will consolidate this in the next version.

>
>> +             ___update_load_avg(&rq->avg_rt, 1, 1);
>> +             return 1;
>> +     }
>> +
>> +     return 0;
>> +}
>> diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
>> index 9cac73e..b2983b7 100644
>> --- a/kernel/sched/pelt.h
>> +++ b/kernel/sched/pelt.h
>> @@ -3,6 +3,7 @@
>>  int __update_load_avg_blocked_se(u64 now, int cpu, struct sched_entity *se);
>>  int __update_load_avg_se(u64 now, int cpu, struct cfs_rq *cfs_rq, struct sched_entity *se);
>>  int __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq);
>> +int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
>>
>>  /*
>>   * When a task is dequeued, its estimated utilization should not be update if
>> @@ -38,6 +39,12 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
>>       return 0;
>>  }
>>
>> +static inline int
>> +update_rt_rq_load_avg(u64 now, struct rq *rq, int running)
>> +{
>> +     return 0;
>> +}
>> +
>>  #endif
>>
>>
>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> index ef3c4e6..b4148a9 100644
>> --- a/kernel/sched/rt.c
>> +++ b/kernel/sched/rt.c
>> @@ -5,6 +5,8 @@
>>   */
>>  #include "sched.h"
>>
>> +#include "pelt.h"
>> +
>>  int sched_rr_timeslice = RR_TIMESLICE;
>>  int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
>>
>> @@ -1572,6 +1574,9 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>>
>>       rt_queue_push_tasks(rq);
>>
>> +     update_rt_rq_load_avg(rq_clock_task(rq), rq,
>> +             rq->curr->sched_class == &rt_sched_class);
>> +
>>       return p;
>>  }
>>
>> @@ -1579,6 +1584,8 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
>>  {
>>       update_curr_rt(rq);
>>
>> +     update_rt_rq_load_avg(rq_clock_task(rq), rq, 1);
>> +
>>       /*
>>        * The previous task needs to be made eligible for pushing
>>        * if it is still active
>> @@ -2308,6 +2315,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
>>       struct sched_rt_entity *rt_se = &p->rt;
>>
>>       update_curr_rt(rq);
>> +     update_rt_rq_load_avg(rq_clock_task(rq), rq, 1);
>
> Mmm... not entirely sure... can't we fold
>    update_rt_rq_load_avg() into update_curr_rt() ?
>
> Currently update_curr_rt() is used in:
>    dequeue_task_rt
>    pick_next_task_rt
>    put_prev_task_rt
>    task_tick_rt
>
> while we update_rt_rq_load_avg() only in:
>    pick_next_task_rt
>    put_prev_task_rt
>    task_tick_rt
> and
>    update_blocked_averages
>
> Why don't we need to update at dequeue_task_rt() time?

We are tracking the rt rq and not sched entities, so we want to know when
rt is or is not the running sched class on the rq. Tracking
dequeue_task_rt is useless.

>
>>
>>       watchdog(rq, p);
>>
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 757a3ee..7a16de9 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -592,6 +592,7 @@ struct rt_rq {
>>       unsigned long           rt_nr_total;
>>       int                     overloaded;
>>       struct plist_head       pushable_tasks;
>> +
>>  #endif /* CONFIG_SMP */
>>       int                     rt_queued;
>>
>> @@ -847,6 +848,7 @@ struct rq {
>>
>>       u64                     rt_avg;
>>       u64                     age_stamp;
>> +     struct sched_avg        avg_rt;
>>       u64                     idle_stamp;
>>       u64                     avg_idle;
>>
>> @@ -2205,4 +2207,9 @@ static inline unsigned long cpu_util_cfs(struct rq *rq)
>>
>>       return util;
>>  }
>> +
>> +static inline unsigned long cpu_util_rt(struct rq *rq)
>> +{
>> +     return rq->avg_rt.util_avg;
>
> READ_ONCE?
>
>> +}
>>  #endif
>> --
>> 2.7.4
>>
>
> --
> #include <best/regards.h>
>
> Patrick Bellasi

* [PATCH] Staging:media:imx Fix multiple assignments in a line
From: Janani Sankara Babu @ 2018-05-29 23:08 UTC (permalink / raw)
  To: gregkh; +Cc: linux-media, devel, linux-kernel, Janani Sankara Babu

This patch fixes the multiple-assignments warning reported by the
checkpatch script.

Signed-off-by: Janani Sankara Babu <jananis37@gmail.com>
---
 drivers/staging/media/imx/imx-media-csi.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/imx/imx-media-csi.c b/drivers/staging/media/imx/imx-media-csi.c
index aeab05f..15068f7 100644
--- a/drivers/staging/media/imx/imx-media-csi.c
+++ b/drivers/staging/media/imx/imx-media-csi.c
@@ -1191,10 +1191,12 @@ static int csi_enum_frame_size(struct v4l2_subdev *sd,
 	} else {
 		crop = __csi_get_crop(priv, cfg, fse->which);

-		fse->min_width = fse->max_width = fse->index & 1 ?
+		fse->min_width = fse->index & 1 ?
 			crop->width / 2 : crop->width;
-		fse->min_height = fse->max_height = fse->index & 2 ?
+		fse->max_width = fse->min_width;
+		fse->min_height = fse->index & 2 ?
 			crop->height / 2 : crop->height;
+		fse->max_height = fse->min_height;
 	}

 	mutex_unlock(&priv->lock);
--
1.9.1

* Re: [PATCH 0/2] x86/boot/KASLR: Skip specified number of 1GB huge pages when do physical randomization
From: Luiz Capitulino @ 2018-05-29 13:27 UTC (permalink / raw)
  To: Baoquan He
  Cc: Ingo Molnar, linux-kernel, keescook, tglx, x86, hpa, fanc.fnst,
	yasu.isimatu, indou.takao, douly.fnst
In-Reply-To: <20180528095418.GD31261@MiWiFi-R3L-srv>

On Mon, 28 May 2018 17:54:18 +0800
Baoquan He <bhe@redhat.com> wrote:

> On 05/23/18 at 03:10pm, Luiz Capitulino wrote:
> > On Fri, 18 May 2018 19:28:36 +0800
> > Baoquan He <bhe@redhat.com> wrote:
> >   
> > > > Note that it's not KASLR specific: if we had some other kernel feature that tried 
> > > > to allocate a piece of memory from what appears to be perfectly usable generic RAM 
> > > > we'd have the same problems!    
> > > 
> > > Hmm, this may not be the situation for 1GB huge pages. For 1GB huge
> > > pages, the bug is that on KVM guest with 4GB ram, when user adds
> > > 'default_hugepagesz=1G hugepagesz=1G hugepages=1' to kernel
> > > command-line, if 'nokaslr' is specified, they can get 1GB huge page
> > > allocated successfully. If remove 'nokaslr', namely KASLR is enabled,
> > > the 1GB huge page allocation failed.  
> > 
> > Let me clarify that this issue is not specific to KVM in any way. The same
> > issue happens on bare-metal, but if you have lots of memory you'll hardly
> > notice it. On the other hand, it's common to create KVM guests with a few
> > GBs of memory. In those guests, you may not be able to get a 1GB hugepage
> > at all if kaslr is enabled.
> > 
> > This series is a simple fix for this bug. It hooks up into already existing
> > KASLR code that scans memory regions to be avoided. The memory hotplug
> > issue is left for another day.  
> 
> Exactly. 
> 
> This issue is about the kernel being randomized into good 1GB huge pages,
> which breaks later huge page allocation; we can only scan memory to find
> where the 1GB huge pages are located and avoid them.
> 
> The memory hotplug issue is about the kernel being randomized into movable
> memory regions; we need to read the ACPI SRAT table to retrieve the
> attributes of memory regions to know if a region is movable, then avoid
> it if so.

Makes sense. Since the KASLR code already scans memory regions looking
for regions to skip and since this series just uses that, I think this
is a good solution to the problem:

Reviewed-and-Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
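
To make the "scan memory regions" idea concrete, here is a minimal sketch of
counting how many whole, 1 GiB-aligned chunks fit in a memory region. It is
not the code from this series; count_gb_pages() and GB_SIZE are hypothetical
names, and the region is assumed not to wrap around the address space.

```c
#include <assert.h>
#include <stdint.h>

#define GB_SIZE (1ULL << 30)

/* Count whole 1 GiB-aligned chunks inside [start, start + size). */
static uint64_t count_gb_pages(uint64_t start, uint64_t size)
{
	uint64_t end = start + size;
	/* Round start up to the next 1 GiB boundary. */
	uint64_t first = (start + GB_SIZE - 1) & ~(GB_SIZE - 1);

	assert(end >= start);	/* assumption: no address wrap-around */

	if (first >= end)
		return 0;
	return (end - first) / GB_SIZE;
}
```

A region that starts at 1 MiB and spans 3 GiB holds only two such chunks,
because the first usable boundary is at 1 GiB.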

> 
> > 
> > Now, if I understand what Ingo is saying is that he wants to see all problems
> > solved with a generic solution vs. a specific solution for each problem.  
> 
> Hmm, if we understand Ingo's words correctly, for these two issues,
> seems there isn't a generic solution to solve both of them. We can only
> fix them separately.
> 
> Hi Ingo,
> 
> Ping!
> 
> Not sure if my above understanding is correct. Could you confirm if I
> have understood your comments and if the solution of this patchset is
> right?
> 
> Thanks
> Baoquan
> 

* Re: [PATCH] floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl
From: Andy Whitcroft @ 2018-05-29 13:27 UTC (permalink / raw)
  To: Brian Belleville; +Cc: Jiri Kosina, linux-kernel
In-Reply-To: <1520467365-7194-1-git-send-email-bbellevi@uci.edu>

On Wed, Mar 07, 2018 at 04:02:45PM -0800, Brian Belleville wrote:
> The final field of a floppy_struct is the field "name", which is a
> pointer to a string in kernel memory. The kernel pointer should not be
> copied to user memory. The FDGETPRM ioctl copies a floppy_struct to
> user memory, including the "name" field. This pointer cannot be used
> by the user, and it will leak a kernel address to user-space, which
> will reveal the location of kernel code and data and undermine KASLR
> protection. Instead, copy the floppy_struct except for the "name"
> field.
> 
> Signed-off-by: Brian Belleville <bbellevi@uci.edu>
> ---
>  drivers/block/floppy.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
> index eae484a..4d4a422 100644
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -3470,6 +3470,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
>  					  (struct floppy_struct **)&outparam);
>  		if (ret)
>  			return ret;
> +		size = offsetof(struct floppy_struct, name);
>  		break;
>  	case FDMSGON:
>  		UDP->flags |= FTD_MSG;

I am not sure it is reasonable to simply set size here to the length of the
valid data.  Though in the real world everyone should be using the defines,
and those should include the full length, the code itself does not require
this; it only prevents overly long reads.  So I think it is possible to do
this read with a shorter userspace buffer; with this change we would
then write beyond the end of the buffer.

This also seems to introduce a slight behavioural difference between the
primary and compat calls.  The compat call already elides the name but it
also is copying into a new structure for return and this is pre-cleared,
so the name will always be null for the compat case and undefined for
the primary ioctl.

Perhaps the below patch would be more appropriate.

-apw

From ddb8c77229a9507fa5575c910d2847e123a9c94c Mon Sep 17 00:00:00 2001
From: Andy Whitcroft <apw@canonical.com>
Date: Tue, 29 May 2018 13:04:15 +0100
Subject: [PATCH 1/1] floppy: Do not copy a kernel pointer to user memory in
 FDGETPRM ioctl

The final field of a floppy_struct is the field "name", which is a pointer
to a string in kernel memory.  The kernel pointer should not be copied to
user memory.  The FDGETPRM ioctl copies a floppy_struct to user memory,
including this "name" field.  This pointer cannot be used by the user
and it will leak a kernel address to user-space, which will reveal the
location of kernel code and data and undermine KASLR protection.

Model this code after the compat ioctl which copies the returned data
to a previously cleared temporary structure on the stack (excluding the
name pointer) and copy out to userspace from there.  As we already have
an inparam union with an appropriate member and that memory is already
cleared even for read only calls make use of that as a temporary store.

Based on an initial patch by Brian Belleville.

CVE-2018-7755
Signed-off-by: Andy Whitcroft <apw@canonical.com>
---
 drivers/block/floppy.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 8ec7235fc93b..7512f6ff7c43 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3470,6 +3470,8 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
 					  (struct floppy_struct **)&outparam);
 		if (ret)
 			return ret;
+		memcpy(&inparam.g, outparam, offsetof(struct floppy_struct, name));
+		outparam = &inparam.g;
 		break;
 	case FDMSGON:
 		UDP->flags |= FTD_MSG;
-- 
2.17.0
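
The approach in the patch above (copy everything before the trailing pointer
into a cleared temporary) can be sketched in plain C as follows. This is a
userspace illustration only: struct params is a hypothetical stand-in for
struct floppy_struct, and copy_without_name() is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in structure: data fields followed by a trailing pointer. */
struct params {
	unsigned int size;
	unsigned int sect;
	const char *name;	/* must not be exposed to the reader */
};

/*
 * Fill *out with a sanitized copy of *src: every field before name is
 * copied, and name is left NULL because *out is cleared first.
 */
static void copy_without_name(struct params *out, const struct params *src)
{
	assert(out && src);
	memset(out, 0, sizeof(*out));
	memcpy(out, src, offsetof(struct params, name));
}
```

Clearing the destination first means the full structure size can still be
handed back, with the pointer field reading as NULL instead of being
undefined.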

* Re: [PATCH] net: davinci: fix building davinci mdio code without CONFIG_OF
From: Sekhar Nori @ 2018-05-29 13:25 UTC (permalink / raw)
  To: Arnd Bergmann, David S. Miller
  Cc: Grygorii Strashko, Florian Fainelli, linux-omap, netdev,
	linux-kernel
In-Reply-To: <20180528155059.2736080-1-arnd@arndb.de>

Hi Arnd,

On Monday 28 May 2018 09:20 PM, Arnd Bergmann wrote:
> Test-building this driver on targets without CONFIG_OF revealed a build
> failure:
> 
> drivers/net/ethernet/ti/davinci_mdio.c: In function 'davinci_mdio_probe':
> drivers/net/ethernet/ti/davinci_mdio.c:380:9: error: implicit declaration of function 'davinci_mdio_probe_dt'; did you mean 'davinci_mdio_probe'? [-Werror=implicit-function-declaration]
> 
> This adjusts the #ifdef logic in the driver to make it build in
> all configurations.
> 
> Fixes: 2652113ff043 ("net: ethernet: ti: Allow most drivers with COMPILE_TEST")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Your patch fixes the issue.

Acked-by: Sekhar Nori <nsekhar@ti.com>

One question below:

> ---
>  drivers/net/ethernet/ti/davinci_mdio.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ti/davinci_mdio.c b/drivers/net/ethernet/ti/davinci_mdio.c
> index 8ac72831af05..a98aedae1b41 100644
> --- a/drivers/net/ethernet/ti/davinci_mdio.c
> +++ b/drivers/net/ethernet/ti/davinci_mdio.c
> @@ -321,7 +321,6 @@ static int davinci_mdio_write(struct mii_bus *bus, int phy_id,
>  	return ret;
>  }
>  
> -#if IS_ENABLED(CONFIG_OF)
>  static int davinci_mdio_probe_dt(struct mdio_platform_data *data,
>  			 struct platform_device *pdev)
>  {
> @@ -339,7 +338,6 @@ static int davinci_mdio_probe_dt(struct mdio_platform_data *data,
>  
>  	return 0;
>  }
> -#endif
>  
>  #if IS_ENABLED(CONFIG_OF)
>  static const struct davinci_mdio_of_param of_cpsw_mdio_data = {
> @@ -374,7 +372,7 @@ static int davinci_mdio_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  	}
>  
> -	if (dev->of_node) {
> +	if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
>  		const struct of_device_id	*of_id;
>  
>  		ret = davinci_mdio_probe_dt(&data->pdata, pdev);

I was expecting this one change to fix the issue since the if() block
should be compiled away removing references to davinci_mdio_probe_dt().

The code does get compiled out and there are no references to
davinci_mdio_probe_dt() in the final object when !CONFIG_OF.

But the compile error remains if the #ifdefs you removed above are
put back. Not sure why.

Thanks,
Sekhar
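
One likely explanation, offered as a sketch and not as a confirmed diagnosis
of this driver: C needs a declaration in scope for every call the compiler
parses, including calls inside a branch that is later discarded as dead
code. A constant-false condition removes the call from the object file, but
only after the call has been parsed and type-checked. In the example below,
FEATURE_ENABLED, probe_dt() and probe() are hypothetical stand-ins.

```c
#include <assert.h>

/* Hypothetical stand-in for a config test that evaluates to 0. */
#define FEATURE_ENABLED 0

/*
 * Defined unconditionally: the call in probe() must still compile even
 * though it can never execute, so a declaration has to be visible.
 * Wrapping this definition in "#if FEATURE_ENABLED" would leave the call
 * below without a declaration.
 */
static int probe_dt(int *data)
{
	*data = 42;
	return 0;
}

static int probe(int has_node, int *data)
{
	assert(data);
	*data = 0;

	/* Parsed and type-checked first, then dropped as dead code. */
	if (FEATURE_ENABLED && has_node)
		return probe_dt(data);

	return 0;
}
```

This is why the in-code condition and the unconditional definition tend to
go together: the condition removes the reference from the final object, and
the visible definition keeps the call site compilable.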

* [PATCH] can: m_can: Fix runtime resume call
From: Faiz Abbas @ 2018-05-29 13:24 UTC (permalink / raw)
  To: linux-can, netdev, linux-kernel, linux-omap; +Cc: mkl, wg, faiz_abbas

pm_runtime_get_sync() returns a 1 if the state of the device is already
'active'. This is not a failure case and should return a success.

Therefore fix error handling for pm_runtime_get_sync() call such that
it returns success when the value is 1.

Also clean up the TODO for using runtime PM for sleep mode, as that is
implemented.

Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 drivers/net/can/m_can/m_can.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index a9fbf81ac3d4..04c48371ab2a 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -634,10 +634,12 @@ static int m_can_clk_start(struct m_can_priv *priv)
 	int err;
 
 	err = pm_runtime_get_sync(priv->device);
-	if (err)
+	if (err < 0) {
 		pm_runtime_put_noidle(priv->device);
+		return err;
+	}
 
-	return err;
+	return 0;
 }
 
 static void m_can_clk_stop(struct m_can_priv *priv)
@@ -1687,8 +1689,6 @@ static int m_can_plat_probe(struct platform_device *pdev)
 	return ret;
 }
 
-/* TODO: runtime PM with power down or sleep mode  */
-
 static __maybe_unused int m_can_suspend(struct device *dev)
 {
 	struct net_device *ndev = dev_get_drvdata(dev);
-- 
2.17.0
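
The return-value handling in the patch above can be modeled outside the
kernel as follows. This is a sketch under assumptions: struct fake_dev,
fake_get_sync() and fake_put_noidle() are hypothetical stand-ins that only
mimic the described behavior (negative on error, 0 on resume, 1 when already
active, with the usage count taken in every case).

```c
#include <assert.h>

struct fake_dev {
	int resume_result;	/* value the fake get_sync returns */
	int usage_count;	/* references currently held */
};

/* Takes a reference even on failure, then reports the outcome. */
static int fake_get_sync(struct fake_dev *dev)
{
	dev->usage_count++;
	return dev->resume_result;
}

/* Drops a reference without triggering any idle handling. */
static void fake_put_noidle(struct fake_dev *dev)
{
	dev->usage_count--;
}

/* Only negative values are errors; 0 and 1 both mean success. */
static int clk_start(struct fake_dev *dev)
{
	int err;

	assert(dev);
	err = fake_get_sync(dev);
	if (err < 0) {
		fake_put_noidle(dev);
		return err;
	}

	return 0;
}
```

Testing "if (err)" in place of "if (err < 0)" would treat the already-active
case as a failure and drop a reference the caller still expects to hold.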

* [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-05-29 13:21 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-unionfs

Hi Al,

I'm sending this pull request to you instead of Linus, because a bigger than
usual chunk involves the VFS.

Please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git for-viro

This update contains the following:

 - Deal with vfs_mkdir() not instantiating dentry.

 - Stack file operations.  This solves the ro/rw file descriptor inconsistency,
   weirdness with ioctl, as well as removing a bunch of overlay specific hacks
   from the VFS.

 - Allow metadata-only copy-up when data is unchanged.

 - Various cleanups in VFS and overlayfs.

Thanks,
Miklos

---
Amir Goldstein (8):
      ovl: update documentation for unionmount-testsuite
      ovl: remove WARN_ON() real inode attributes mismatch
      ovl: strip debug argument from ovl_do_ helpers
      ovl: struct cattr cleanups
      ovl: return dentry from ovl_create_real()
      ovl: create helper ovl_create_temp()
      ovl: make ovl_create_real() cope with vfs_mkdir() safely
      ovl: use inode_insert5() to hash a newly created inode

Miklos Szeredi (41):
      ovl: clean up copy-up error paths
      vfs: factor out inode_insert5()
      vfs: dedpue: return loff_t
      vfs: dedupe: rationalize args
      vfs: dedupe: extract helper for a single dedup
      vfs: add path_open()
      vfs: optionally don't account file in nr_files
      vfs: add f_op->pre_mmap()
      vfs: export vfs_ioctl() to modules
      vfs: export vfs_dedupe_file_range_one() to modules
      ovl: copy up times
      ovl: copy up inode flags
      Revert "Revert "ovl: get_write_access() in truncate""
      ovl: copy up file size as well
      ovl: deal with overlay files in ovl_d_real()
      ovl: stack file ops
      ovl: add helper to return real file
      ovl: add ovl_read_iter()
      ovl: add ovl_write_iter()
      ovl: add ovl_fsync()
      ovl: add ovl_mmap()
      ovl: add ovl_fallocate()
      ovl: add lsattr/chattr support
      ovl: add ovl_fiemap()
      ovl: add O_DIRECT support
      ovl: add reflink/copyfile/dedup support
      vfs: don't open real
      ovl: copy-up on MAP_SHARED
      ovl: obsolete "check_copy_up" module option
      ovl: fix documentation of non-standard behavior
      vfs: simplify dentry_open()
      Revert "ovl: fix may_write_real() for overlayfs directories"
      Revert "ovl: don't allow writing ioctl on lower layer"
      vfs: fix freeze protection in mnt_want_write_file() for overlayfs
      Revert "ovl: fix relatime for directories"
      Revert "vfs: update ovl inode before relatime check"
      Revert "vfs: add flags to d_real()"
      Revert "vfs: do get_write_access() on upper layer of overlayfs"
      Partially revert "locks: fix file locking on overlayfs"
      Revert "fsnotify: support overlayfs"
      vfs: remove open_flags from d_real()

Vivek Goyal (29):
      ovl: Pass argument to ovl_get_inode() in a structure
      ovl: Initialize ovl_inode->redirect in ovl_get_inode()
      ovl: Move the copy up helpers to copy_up.c
      ovl: Provide a mount option metacopy=on/off for metadata copyup
      ovl: During copy up, first copy up metadata and then data
      ovl: Copy up only metadata during copy up where it makes sense
      ovl: Add helper ovl_already_copied_up()
      ovl: A new xattr OVL_XATTR_METACOPY for file on upper
      ovl: Use out_err instead of out_nomem
      ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
      ovl: Copy up meta inode data from lowest data inode
      ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
      ovl: Fix ovl_getattr() to get number of blocks from lower
      ovl: Store lower data inode in ovl_inode
      ovl: Add helper ovl_inode_realdata()
      ovl: Open file with data except for the case of fsync
      ovl: Do not expose metacopy only dentry from d_real()
      ovl: Move some dir related ovl_lookup_single() code in else block
      ovl: Check redirects for metacopy files
      ovl: Treat metacopy dentries as type OVL_PATH_MERGE
      ovl: Add an inode flag OVL_CONST_INO
      ovl: Do not set dentry type ORIGIN for broken hardlinks
      ovl: Set redirect on metacopy files upon rename
      ovl: Set redirect on upper inode when it is linked
      ovl: Check redirect on index as well
      ovl: Disbale metacopy for MAP_SHARED mmap()
      ovl: Do not do metadata only copy-up for truncate operation
      ovl: Do not do metacopy only for ioctl modifying file attr
      ovl: Enable metadata only feature

---
 Documentation/filesystems/Locking       |   4 +-
 Documentation/filesystems/overlayfs.txt |  97 ++++--
 Documentation/filesystems/vfs.txt       |  19 +-
 fs/btrfs/ctree.h                        |   5 +-
 fs/btrfs/ioctl.c                        |   7 +-
 fs/file_table.c                         |  13 +-
 fs/inode.c                              | 210 +++++--------
 fs/internal.h                           |  17 +-
 fs/ioctl.c                              |   1 +
 fs/locks.c                              |  20 +-
 fs/namei.c                              |   2 +-
 fs/namespace.c                          |  69 +----
 fs/ocfs2/file.c                         |  10 +-
 fs/open.c                               |  74 ++---
 fs/overlayfs/Kconfig                    |  40 +++
 fs/overlayfs/Makefile                   |   4 +-
 fs/overlayfs/copy_up.c                  | 273 +++++++++-------
 fs/overlayfs/dir.c                      | 312 +++++++++++++------
 fs/overlayfs/export.c                   |  11 +-
 fs/overlayfs/file.c                     | 530 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c                    | 203 ++++++++----
 fs/overlayfs/namei.c                    | 205 +++++++-----
 fs/overlayfs/overlayfs.h                | 119 ++++---
 fs/overlayfs/ovl_entry.h                |   7 +-
 fs/overlayfs/super.c                    | 134 +++++---
 fs/overlayfs/util.c                     | 252 ++++++++++++++-
 fs/read_write.c                         |  91 +++---
 fs/xattr.c                              |   9 +-
 fs/xfs/xfs_file.c                       |   8 +-
 include/linux/dcache.h                  |  15 +-
 include/linux/fs.h                      |  36 ++-
 include/linux/fsnotify.h                |  14 +-
 include/uapi/linux/fs.h                 |   1 -
 mm/util.c                               |   5 +
 34 files changed, 1981 insertions(+), 836 deletions(-)
 create mode 100644 fs/overlayfs/file.c

^ permalink raw reply

* [PATCH 04/19] Bluetooth: hci_nokia: Add serdev_id_table
From: Ricardo Ribalda Delgado @ 2018-05-29 13:09 UTC (permalink / raw)
  To: linux-kernel, linux-serial
  Cc: Ricardo Ribalda Delgado, Marcel Holtmann, Johan Hedberg,
	Rob Herring, Johan Hovold, linux-bluetooth
In-Reply-To: <20180529131014.18641-1-ricardo.ribalda@gmail.com>

Describe which hardware is supported by the current driver.

Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Johan Hovold <johan@kernel.org>
Cc: linux-bluetooth@vger.kernel.org
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
---
 drivers/bluetooth/hci_nokia.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/bluetooth/hci_nokia.c b/drivers/bluetooth/hci_nokia.c
index 3539fd03f47e..e32dfcd56b8d 100644
--- a/drivers/bluetooth/hci_nokia.c
+++ b/drivers/bluetooth/hci_nokia.c
@@ -801,6 +801,11 @@ static const struct of_device_id nokia_bluetooth_of_match[] = {
 MODULE_DEVICE_TABLE(of, nokia_bluetooth_of_match);
 #endif
 
+static struct serdev_device_id nokia_bluetooth_serdev_id[] = {
+	{ "hp4-bluetooth", },
+	{},
+};
+
 static struct serdev_device_driver nokia_bluetooth_serdev_driver = {
 	.probe = nokia_bluetooth_serdev_probe,
 	.remove = nokia_bluetooth_serdev_remove,
@@ -809,6 +814,7 @@ static struct serdev_device_driver nokia_bluetooth_serdev_driver = {
 		.pm = &nokia_bluetooth_pm_ops,
 		.of_match_table = of_match_ptr(nokia_bluetooth_of_match),
 	},
+	.id_table = nokia_bluetooth_serdev_id,
 };
 
 module_serdev_device_driver(nokia_bluetooth_serdev_driver);
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH v4 4/8] PCI: Replace dev_node parameter of of_pci_get_host_bridge_resources with device
From: Bjorn Helgaas @ 2018-05-29 13:20 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Vladimir Zapolskiy, Bjorn Helgaas, Linux Kernel Mailing List,
	linux-pci, linux-arm-kernel, Jingoo Han, Joao Pinto,
	Lorenzo Pieralisi
In-Reply-To: <eb9b477a-33c9-db45-9e62-101e3606e8ff@siemens.com>

On Mon, May 28, 2018 at 12:46:35PM +0200, Jan Kiszka wrote:
> On 2018-05-28 12:00, Vladimir Zapolskiy wrote:
> > Hi Jan, Bjorn,
> > 
> > On 05/15/2018 12:07 PM, Jan Kiszka wrote:
> >> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>
> >> Another step towards a managed version of
> >> of_pci_get_host_bridge_resources(): Feed in the underlying device,
> >> rather than just the OF node. This will allow to use managed resource
> >> allocation internally later on.
> >>
> >> CC: Jingoo Han <jingoohan1@gmail.com>
> >> CC: Joao Pinto <Joao.Pinto@synopsys.com>
> >> CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> > 
> > [snip]
> > 
> >> diff --git a/drivers/pci/host/pcie-altera.c b/drivers/pci/host/pcie-altera.c
> >> index a6af62e0256d..61802e55a00c 100644
> >> --- a/drivers/pci/host/pcie-altera.c
> >> +++ b/drivers/pci/host/pcie-altera.c
> >> @@ -488,11 +488,10 @@ static int altera_pcie_parse_request_of_pci_ranges(struct altera_pcie *pcie)
> >>  {
> >>  	int err, res_valid = 0;
> >>  	struct device *dev = &pcie->pdev->dev;
> >> -	struct device_node *np = dev->of_node;
> >>  	struct resource_entry *win;
> >>  
> >> -	err = of_pci_get_host_bridge_resources(np, 0, 0xff, &pcie->resources,
> >> -					       NULL);
> >> +	err = of_pci_get_host_bridge_resources(dev, 0, 0xff
> >> +						    &pcie->resources, NULL);
> >>  	if (err)
> >>  		return err;
> >>  
> > 
> > In case it is an undiscovered issue: a comma was mistakenly removed,
> > which will result in a compilation error.
> > 
> > The problem is also found in pci/next, see commit 88e3909aa125.
> 
> Yes, that's known. We have a bisection breakage: The issue was fixed
> again by patch 6 in that series.

I updated 88e3909aa125 to fix the bisection issue.  I'll rebuild
pci/next later today or tomorrow.

^ permalink raw reply

* [PATCH v2 00/12] coresight: tmc-etr Transparent buffer management
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose

This series is split off from the Coresight ETR perf support patches posted
here [0]. The CATU support and perf backend support will be posted as
separate series for better management and review of the patches.

This series adds support for the TMC ETR scatter-gather mode, which allows
a physically non-contiguous buffer to hold the trace data. It
also adds a layer to handle the buffer management in a transparent
manner, independent of the underlying mode used by the TMC ETR.
The layer chooses the ETR mode based on different parameters (size,
re-using a set of pages, presence of an SMMU etc.).

Finally we add a sysfs parameter to tune the buffer size for ETR in
sysfs-mode.

During the testing, we found out that if the TMC ETR is not properly
connected to the memory subsystem, the ETR could lock-up the system
while waiting for the "read" transactions to complete in scatter-gather
mode. So, we do not use the mode on a system unless it is safe to do
so. This is specified by a DT property "arm,scatter-gather".
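
As a rough illustration of that policy (a userspace sketch, not the driver
code): SG mode is only considered when the hardware does not report the
"no scatter-gather" restriction and the platform has opted in through the DT
property. The bit position used for DEVID_NOSCAT and the helper name
etr_sg_allowed() are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the "no scatter-gather" bit in the DEVID register. */
#define DEVID_NOSCAT	(1u << 24)

/*
 * Hypothetical helper: dt_opt_in stands for the presence of the
 * "arm,scatter-gather" property on the TMC node.
 */
static bool etr_sg_allowed(uint32_t devid, bool dt_opt_in)
{
	/* Both conditions must hold: hardware capable and platform opted in. */
	return !(devid & DEVID_NOSCAT) && dt_opt_in;
}
```

With this shape, a platform that omits the property never uses SG mode, even
if the hardware advertises it.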

Applies on the coresight-next tree from Mathieu.

Changes since previous version [1]:
 - Rebased to Mathieu's coresight-next tree to resolve a conflict.
 - Added tags for DT changes from Rob and Mathieu
 - Split the SG mode backend support patch from the
   ETR-BUF patch.
 - Address other comments from Mathieu

Changes since the split series [0]:
 - Split the series in [0]
 - Address comments on v2
 - Rename DT property "scatter-gather" to "arm,scatter-gather"
 - Add ETM PID for Cortex-A35, use macros to make the listing easier

[0] - http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/574875.html
[1] - http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/579135.html

Suzuki K Poulose (12):
  coresight: ETM: Add support for Arm Cortex-A73 and Cortex-A35
  coresight: tmc: Hide trace buffer handling for file read
  coresight: tmc-etr: Do not clean trace buffer
  coresight: tmc-etr: Disallow perf mode
  coresight: Add helper for inserting synchronization packets
  dts: bindings: Restrict coresight tmc-etr scatter-gather mode
  dts: juno: Add scatter-gather support for all revisions
  coresight: Add generic TMC sg table framework
  coresight: Add support for TMC ETR SG unit
  coresight: tmc-etr: Add transparent buffer management
  coresight: tmc-etr buf: Add TMC scatter gather mode backend
  coresight: tmc: Add configuration support for trace buffer size

 .../ABI/testing/sysfs-bus-coresight-devices-tmc    |    8 +
 .../devicetree/bindings/arm/coresight.txt          |    5 +-
 arch/arm64/boot/dts/arm/juno-base.dtsi             |    1 +
 drivers/hwtracing/coresight/coresight-etb10.c      |   12 +-
 drivers/hwtracing/coresight/coresight-etm4x.c      |   31 +-
 drivers/hwtracing/coresight/coresight-priv.h       |   10 +-
 drivers/hwtracing/coresight/coresight-tmc-etf.c    |   45 +-
 drivers/hwtracing/coresight/coresight-tmc-etr.c    | 1010 ++++++++++++++++++--
 drivers/hwtracing/coresight/coresight-tmc.c        |   83 +-
 drivers/hwtracing/coresight/coresight-tmc.h        |  110 ++-
 drivers/hwtracing/coresight/coresight.c            |    3 +-
 11 files changed, 1144 insertions(+), 174 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH v2 03/12] coresight: tmc-etr: Do not clean trace buffer
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

We zero out the entire trace buffer used for ETR before it is enabled,
for helping with debugging. With the addition of scatter-gather mode,
the buffer could be bigger and non-contiguous.

Get rid of this step; if someone wants to debug, they can always add it
as and when needed.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index f88342d..1de05c9 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -13,9 +13,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
 
-	/* Zero out the memory to help with debug */
-	memset(drvdata->vaddr, 0, drvdata->size);
-
 	CS_UNLOCK(drvdata->base);
 
 	/* Wait for TMCSReady bit to be set */
@@ -340,9 +337,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 	if (drvdata->mode == CS_MODE_SYSFS) {
 		/*
 		 * The trace run will continue with the same allocated trace
-		 * buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
-		 * so we don't have to explicitly clear it. Also, since the
-		 * tracer is still enabled drvdata::buf can't be NULL.
+		 * buffer. Since the tracer is still enabled drvdata::buf can't
+		 * be NULL.
 		 */
 		tmc_etr_enable_hw(drvdata);
 	} else {
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH v10 01/16] videobuf2: Make struct vb2_buffer refcounted
From: Ezequiel Garcia @ 2018-05-29 13:17 UTC (permalink / raw)
  To: sathyam panda
  Cc: linux-media, kernel, Hans Verkuil, Mauro Carvalho Chehab,
	Shuah Khan, Pawel Osciak, Alexandre Courbot, Sakari Ailus,
	Brian Starkey, linux-kernel, Gustavo Padovan
In-Reply-To: <CAE6UAyx81nZDQEHuNn0BK5EkB-KmNdSnkiNF+NJTmiUkz72CrA@mail.gmail.com>

On Fri, 2018-05-25 at 12:11 +0530, sathyam panda wrote:
> Hello,
> 
> On 5/21/18, Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > The in-fence implementation involves having a per-buffer fence callback,
> > that triggers on the fence signal. The fence callback is called
> > asynchronously
> > and needs a valid reference to the associated videobuf2 buffer.
> > 
> > Allow this by making the vb2_buffer refcounted, so it can be passed
> > to other contexts.
> > 
> 
> -Is it really required, because when a queued buffer with an in_fence
> is deallocated, firstly queue is cancelled.
> -And __vb2_dqbuf is called which calls dma_fence_remove_callback.
> -So if fence callback has been called -__vb2_dqbuf will wait to
> acquire fence lock.
> -So during execution of fence callback, buffers and queue are still valid.
> -And if __vb2_dqbuf remove callback first ,then dma_fence_signal will
> wait for lock
> - so there won't be any fence callback to call for that buffer when
> dma_fence_signal resumes.
> 

Hi Sathyam,

Thanks for your review! The refcount is definitely required,
as the fence callback only schedules a workqueue, which is
completely asynchronous with respect to the rest of the
ioctls.

In particular, the workqueue is not synchronized with
vb2_core_queue_release.

Also, another subtle detail, dma_fence_remove_callback
can fail to remove the callback.
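
To illustrate the lifetime problem, here is a minimal userspace sketch,
assuming a plain atomic counter in place of the kernel's kref. The names
struct buf, buf_alloc(), buf_get() and buf_put() are hypothetical and are
not the videobuf2 API. The deferred work takes its own reference, so the
buffer survives a queue release that happens before the work runs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct buf {
	atomic_int refcount;
	bool *freed;	/* set once the buffer is actually released */
};

/* Allocate a buffer holding one reference (the queue's). */
static struct buf *buf_alloc(bool *freed)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	atomic_init(&b->refcount, 1);
	b->freed = freed;
	*freed = false;
	return b;
}

/* Take an extra reference, e.g. before handing the buffer to deferred work. */
static void buf_get(struct buf *b)
{
	atomic_fetch_add(&b->refcount, 1);
}

/* Drop a reference; the last put frees the buffer. */
static void buf_put(struct buf *b)
{
	if (atomic_fetch_sub(&b->refcount, 1) == 1) {
		*b->freed = true;
		free(b);
	}
}
```

If the queue drops its reference first, the buffer is only freed when the
deferred work drops the second one.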

Thanks,
Eze

^ permalink raw reply

* Re: [PATCH 07/15] arm: dts: exynos: Add missing cooling device properties for CPUs
From: Krzysztof Kozlowski @ 2018-05-29 13:18 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: arm, Rob Herring, Mark Rutland, Kukjin Kim, Vincent Guittot,
	ionela.voinescu, Daniel Lezcano, chris.redpath, devicetree,
	linux-arm-kernel, linux-samsung-soc@vger.kernel.org, linux-kernel
In-Reply-To: <cfd8de4b2ad1cfd7b0f1a0706c0b6b918e1ebffc.1527244201.git.viresh.kumar@linaro.org>

On Fri, May 25, 2018 at 12:31 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> The cooling device properties, like "#cooling-cells" and
> "dynamic-power-coefficient", should either be present for all the CPUs
> of a cluster or none. If these are present only for a subset of CPUs of
> a cluster then things will start falling apart as soon as the CPUs are
> brought online in a different order.

Thanks for the patch.

In the case of Exynos, the booting CPU always has this information in DT
and the booting CPU cannot be changed (it is chosen by firmware/hardware
configuration). Therefore there is no real risk of things falling apart,
although for correctness of the DT your change makes sense.

It is too late for this cycle for me so I'll pick it up after merge window.
Alternatively, arm-soc guys can pick it up directly with my tag:
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>

Best regards,
Krzysztof


> For example, this will happen
> because the operating system looks for such properties in the CPU node
> it is trying to bring up, so that it can register a cooling device.
>
> Add such missing properties.
>
> Fix other missing properties (clocks, OPP, clock latency) as well to
> make it all work.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  arch/arm/boot/dts/exynos3250.dtsi | 16 ++++++++++++++++
>  arch/arm/boot/dts/exynos4210.dtsi | 13 +++++++++++++
>  arch/arm/boot/dts/exynos4412.dtsi |  9 +++++++++
>  arch/arm/boot/dts/exynos5250.dtsi | 23 +++++++++++++++++++++++
>  4 files changed, 61 insertions(+)
>

^ permalink raw reply

* Re: [PATCH v9 00/12] Support PPTT for ARM64
From: Sudeep Holla @ 2018-05-29 13:18 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Sudeep Holla, Catalin Marinas, Jeremy Linton,
	ACPI Devel Maling List, Mark Rutland, austinwc, tnowicki,
	Palmer Dabbelt, Will Deacon, linux-riscv, Morten.Rasmussen,
	vkilari, Lorenzo Pieralisi, jhugo, Al Stone, Len Brown,
	John Garry, wangxiongfeng2, Dietmar Eggemann, Linux ARM,
	Ard Biesheuvel, Greg KH, Rafael J. Wysocki,
	Linux Kernel Mailing List, Hanjun Guo, Linux-Renesas
In-Reply-To: <CAMuHMdXgiMeD4uF+j8W+CpNwYYK2W_8xqk_=vGBiW=bUvKeq7w@mail.gmail.com>



On 29/05/18 12:56, Geert Uytterhoeven wrote:
> Hi Sudeep,
> 
> On Tue, May 29, 2018 at 1:14 PM, Sudeep Holla <sudeep.holla@arm.com> wrote:
>> On 29/05/18 11:48, Geert Uytterhoeven wrote:
>>> On Thu, May 17, 2018 at 7:05 PM, Catalin Marinas
>>> <catalin.marinas@arm.com> wrote:
>>>> On Fri, May 11, 2018 at 06:57:55PM -0500, Jeremy Linton wrote:
>>>>> Jeremy Linton (12):
>>>>>   drivers: base: cacheinfo: move cache_setup_of_node()
>>>>>   drivers: base: cacheinfo: setup DT cache properties early
>>>>>   cacheinfo: rename of_node to fw_token
>>>>>   arm64/acpi: Create arch specific cpu to acpi id helper
>>>>>   ACPI/PPTT: Add Processor Properties Topology Table parsing
>>>>>   ACPI: Enable PPTT support on ARM64
>>>>>   drivers: base cacheinfo: Add support for ACPI based firmware tables
>>>>>   arm64: Add support for ACPI based firmware tables
>>>>>   arm64: topology: rename cluster_id
>>>>>   arm64: topology: enable ACPI/PPTT based CPU topology
>>>>>   ACPI: Add PPTT to injectable table list
>>>>>   arm64: topology: divorce MC scheduling domain from core_siblings
>>>>
>>>> Queued for 4.18 (without Sudeep's latest property_read_u64 cacheinfo
>>>> patch - http://lkml.kernel.org/r/20180517154701.GA20281@e107155-lin; I
>>>> can add it separately).
>>>
>>> This is now commit 37c3ec2d810f87ea ("arm64: topology: divorce MC
>>> scheduling domain from core_siblings") in arm64/for-next/core, causing
>>> system suspend on big.LITTLE systems to hang after shutting down the first
>>> CPU:
>>>
>>>     $ echo mem > /sys/power/state
>>>     PM: suspend entry (deep)
>>>     PM: Syncing filesystems ... done.
>>>     Freezing user space processes ... (elapsed 0.001 seconds) done.
>>>     OOM killer disabled.
>>>     Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>>     Disabling non-boot CPUs ...
>>>     CPU1: shutdown
>>>     psci: CPU1 killed.
>>>
>>
>> Is it OK to assume the suspend failed just after shutting down one CPU
>> or it's failing during resume ? It depends on whether you had console
>> disabled or not.
> 
> I have no-console-suspend enabled.
> It's failing during suspend, the next lines should be:
> 
>     CPU2: shutdown
>     psci: CPU2 killed.
>     ...
> 

OK, I was hoping it would be something during resume, as nothing in this
patch is executed during suspend. Do you see any change in topology before
and after this patch is applied? I am interested in the output of:

$ grep "" /sys/devices/system/cpu/cpu*/topology/*

>>> For me, it fails on the following big.LITTLE systems:
>>>
>>>     R-Car H3 ES2.0 (4xCA57 + 4xCA53)
>>>     R-Car M3-W (2xCA57 + 4xCA53)
>>>
>>
>> Interesting, is it PSCI based system suspend ?
> 
> Yes it is.
> 
> Suspend-to-idle, which doesn't offline CPUs, still works.
> 

From DT, I guess this platform doesn't have any idle states.
Does this use genpd power domains? I see power-domains in the DT, so
I'm asking to get more info. Do you have any out-of-tree patches, especially
ones that depend on some topology cpumasks?

>>> System suspend still works fine on systems with big cores only:
>>>
>>>     R-Car H3 ES1.0 (4xCA57 (4xCA53 disabled in firmware))
>>>     R-Car M3-N (2xCA57)
>>>
>>> Reverting this commit fixes the issue for me.
>>
>> I can't find anything that relates to system suspend in these patches
>> unless they are messing with something during CPU hot plug-in back
>> during resume.
> 
> It's only the last patch that introduces the breakage.
> 

As specified in the commit log, it won't change any behavior for DT
systems if it's non-NUMA or single node system. So I am still wondering
what could trigger this regression.

-- 
Regards,
Sudeep

^ permalink raw reply

* [PATCH v2 05/12] coresight: Add helper for inserting synchronization packets
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

Right now we open code filling the trace buffer with synchronization
packets when the circular buffer wraps around in different drivers.
Move this to a common place. While at it, clean up the barrier_pkt
array to strip off the trailing '\0'.
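
The idea can be sketched in userspace as follows. This is only an
illustration under stated assumptions: dump_trace() is a hypothetical
stand-in for a driver dump routine, and the buffer is assumed to be at least
as large as the barrier packet. Once the array has no terminating zero, its
size is taken from sizeof() and the packet is copied in one go.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four-word barrier packet, without a terminating zero word. */
static const uint32_t barrier_pkt[4] = {
	0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff
};
#define BARRIER_PKT_SIZE	(sizeof(barrier_pkt))

/* Overwrite the start of buf with the barrier packet; NULL is ignored. */
static void insert_barrier_packet(void *buf)
{
	if (buf)
		memcpy(buf, barrier_pkt, BARRIER_PKT_SIZE);
}

/*
 * Hypothetical dump routine: copy nwords of trace data, then mark the
 * start of the buffer if data was lost due to a wrap around.
 * Returns the number of bytes copied.
 */
static size_t dump_trace(uint32_t *dst, const uint32_t *src,
			 size_t nwords, int lost)
{
	assert(nwords * sizeof(*dst) >= BARRIER_PKT_SIZE);
	memcpy(dst, src, nwords * sizeof(*dst));
	if (lost)
		insert_barrier_packet(dst);
	return nwords * sizeof(*dst);
}
```

The copy loop no longer needs to know about barrier packets; the overwrite
happens once, after the data is in place.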

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etb10.c   | 12 ++++-------
 drivers/hwtracing/coresight/coresight-priv.h    | 10 ++++++++-
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 27 ++++++++-----------------
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 +-----------
 drivers/hwtracing/coresight/coresight.c         |  3 +--
 5 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 9b6c555..78e71bf 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -195,7 +195,6 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
 	bool lost = false;
 	int i;
 	u8 *buf_ptr;
-	const u32 *barrier;
 	u32 read_data, depth;
 	u32 read_ptr, write_ptr;
 	u32 frame_off, frame_endoff;
@@ -226,19 +225,16 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
 
 	depth = drvdata->buffer_depth;
 	buf_ptr = drvdata->buf;
-	barrier = barrier_pkt;
 	for (i = 0; i < depth; i++) {
 		read_data = readl_relaxed(drvdata->base +
 					  ETB_RAM_READ_DATA_REG);
-		if (lost && *barrier) {
-			read_data = *barrier;
-			barrier++;
-		}
-
 		*(u32 *)buf_ptr = read_data;
 		buf_ptr += 4;
 	}
 
+	if (lost)
+		coresight_insert_barrier_packet(drvdata->buf);
+
 	if (frame_off) {
 		buf_ptr -= (frame_endoff * 4);
 		for (i = 0; i < frame_endoff; i++) {
@@ -447,7 +443,7 @@ static void etb_update_buffer(struct coresight_device *csdev,
 		buf_ptr = buf->data_pages[cur] + offset;
 		read_data = readl_relaxed(drvdata->base +
 					  ETB_RAM_READ_DATA_REG);
-		if (lost && *barrier) {
+		if (lost && i < CORESIGHT_BARRIER_PKT_SIZE) {
 			read_data = *barrier;
 			barrier++;
 		}
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index 0e5a74d..1a6cf35 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -57,7 +57,8 @@ static DEVICE_ATTR_RO(name)
 #define coresight_simple_reg64(type, name, lo_off, hi_off)		\
 	__coresight_simple_func(type, NULL, name, lo_off, hi_off)
 
-extern const u32 barrier_pkt[5];
+extern const u32 barrier_pkt[4];
+#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt))
 
 enum etm_addr_type {
 	ETM_ADDR_TYPE_NONE,
@@ -91,6 +92,13 @@ struct cs_buffers {
 	void			**data_pages;
 };
 
+static inline void coresight_insert_barrier_packet(void *buf)
+{
+	if (buf)
+		memcpy(buf, barrier_pkt, CORESIGHT_BARRIER_PKT_SIZE);
+}
+
+
 static inline void CS_LOCK(void __iomem *addr)
 {
 	do {
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 73160cd..0549249 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -32,39 +32,28 @@ static void tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
 
 static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
 {
-	bool lost = false;
 	char *bufp;
-	const u32 *barrier;
-	u32 read_data, status;
+	u32 read_data, lost;
 	int i;
 
-	/*
-	 * Get a hold of the status register and see if a wrap around
-	 * has occurred.
-	 */
-	status = readl_relaxed(drvdata->base + TMC_STS);
-	if (status & TMC_STS_FULL)
-		lost = true;
-
+	/* Check if the buffer wrapped around. */
+	lost = readl_relaxed(drvdata->base + TMC_STS) & TMC_STS_FULL;
 	bufp = drvdata->buf;
 	drvdata->len = 0;
-	barrier = barrier_pkt;
 	while (1) {
 		for (i = 0; i < drvdata->memwidth; i++) {
 			read_data = readl_relaxed(drvdata->base + TMC_RRD);
 			if (read_data == 0xFFFFFFFF)
-				return;
-
-			if (lost && *barrier) {
-				read_data = *barrier;
-				barrier++;
-			}
-
+				goto done;
 			memcpy(bufp, &read_data, 4);
 			bufp += 4;
 			drvdata->len += 4;
 		}
 	}
+done:
+	if (lost)
+		coresight_insert_barrier_packet(drvdata->buf);
+	return;
 }
 
 static void tmc_etb_disable_hw(struct tmc_drvdata *drvdata)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 18c9a18..04206ff 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -91,9 +91,7 @@ ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 
 static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 {
-	const u32 *barrier;
 	u32 val;
-	u32 *temp;
 	u64 rwp;
 
 	rwp = tmc_read_rwp(drvdata);
@@ -106,16 +104,7 @@ static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 	if (val & TMC_STS_FULL) {
 		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
 		drvdata->len = drvdata->size;
-
-		barrier = barrier_pkt;
-		temp = (u32 *)drvdata->buf;
-
-		while (*barrier) {
-			*temp = *barrier;
-			temp++;
-			barrier++;
-		}
-
+		coresight_insert_barrier_packet(drvdata->buf);
 	} else {
 		drvdata->buf = drvdata->vaddr;
 		drvdata->len = rwp - drvdata->paddr;
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index 29e834a..4969b32 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -51,8 +51,7 @@ static struct list_head *stm_path;
  * beginning of the data collected in a buffer.  That way the decoder knows that
  * it needs to look for another sync sequence.
  */
-const u32 barrier_pkt[5] = {0x7fffffff, 0x7fffffff,
-			    0x7fffffff, 0x7fffffff, 0x0};
+const u32 barrier_pkt[4] = {0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff};
 
 static int coresight_id_match(struct device *dev, void *data)
 {
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 06/12] dts: bindings: Restrict coresight tmc-etr scatter-gather mode
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose,
	Mark Rutland, John Horley
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

We are about to add support for the ETR built-in scatter-gather mode
for dealing with large trace buffers. However, on some platforms,
using the ETR SG mode can lock up the system due to the way the
ETR is connected to the memory subsystem.

In SG mode, the ETR performs a READ from the scatter-gather table to
fetch the next page and a regular WRITE of trace data. If the READ
operation doesn't complete (due to memory subsystem issues,
which we have seen on a couple of platforms), the trace WRITE
cannot proceed, leading to issues. So, by default we do not
use the SG mode unless it is known to be safe on the platform.
We define a DT property for the TMC node to specify whether we
have a proper SG mode.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: John Horley <john.horley@arm.com>
Cc: Robert Walker <robert.walker@arm.com>
Cc: devicetree@vger.kernel.org
Cc: frowand.list@gmail.com
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 Documentation/devicetree/bindings/arm/coresight.txt | 2 ++
 drivers/hwtracing/coresight/coresight-tmc.c         | 9 ++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/arm/coresight.txt b/Documentation/devicetree/bindings/arm/coresight.txt
index 15ac8e8..603d3c6 100644
--- a/Documentation/devicetree/bindings/arm/coresight.txt
+++ b/Documentation/devicetree/bindings/arm/coresight.txt
@@ -86,6 +86,8 @@ its hardware characteristcs.
 	* arm,buffer-size: size of contiguous buffer space for TMC ETR
 	 (embedded trace router)
 
+	* arm,scatter-gather: boolean. Indicates that the TMC-ETR can safely
+	  use the SG mode on this system.
 
 Example:
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index bb57e7f..bc8fc86 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -12,6 +12,7 @@
 #include <linux/err.h>
 #include <linux/fs.h>
 #include <linux/miscdevice.h>
+#include <linux/property.h>
 #include <linux/uaccess.h>
 #include <linux/slab.h>
 #include <linux/dma-mapping.h>
@@ -296,6 +297,12 @@ const struct attribute_group *coresight_tmc_groups[] = {
 	NULL,
 };
 
+static inline bool tmc_etr_can_use_sg(struct tmc_drvdata *drvdata)
+{
+	return fwnode_property_present(drvdata->dev->fwnode,
+				       "arm,scatter-gather");
+}
+
 /* Detect and initialise the capabilities of a TMC ETR */
 static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
 			     u32 devid, void *dev_caps)
@@ -305,7 +312,7 @@ static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
 	/* Set the unadvertised capabilities */
 	tmc_etr_init_caps(drvdata, (u32)(unsigned long)dev_caps);
 
-	if (!(devid & TMC_DEVID_NOSCAT))
+	if (!(devid & TMC_DEVID_NOSCAT) && tmc_etr_can_use_sg(drvdata))
 		tmc_etr_set_cap(drvdata, TMC_ETR_SG);
 
 	/* Check if the AXI address width is available */
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 07/12] dts: juno: Add scatter-gather support for all revisions
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose,
	Sudeep Holla, Liviu Dudau, Lorenzo Pieralisi
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

Advertise that scatter-gather is properly integrated on
all revisions of the Juno board.

Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/boot/dts/arm/juno-base.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/arm/juno-base.dtsi b/arch/arm64/boot/dts/arm/juno-base.dtsi
index eb749c5..6ce9090 100644
--- a/arch/arm64/boot/dts/arm/juno-base.dtsi
+++ b/arch/arm64/boot/dts/arm/juno-base.dtsi
@@ -198,6 +198,7 @@
 		clocks = <&soc_smc50mhz>;
 		clock-names = "apb_pclk";
 		power-domains = <&scpi_devpd 0>;
+		arm,scatter-gather;
 		port {
 			etr_in_port: endpoint {
 				slave-mode;
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 09/12] coresight: Add support for TMC ETR SG unit
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

This patch adds support for setting up an SG table used by the
TMC ETR inbuilt SG unit. The TMC ETR uses 4K page sized tables
to hold pointers to the 4K data pages, with the last entry in a
table pointing to the next table of entries, forming a chain.
The 2 LSBs determine the type of the table entry, which is
one of:

 Normal - Points to a 4KB data page.
 Last   - Points to a 4KB data page, but is the last entry in the
          page table.
 Link   - Points to another 4KB table page with pointers to data.

The code takes care of handling the system page size which could
be different than 4K. So we could end up putting multiple ETR
SG tables in a single system page, vice versa for the data pages.
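
The entry layout can be tried out with a small userspace sketch. It follows
the ETR_SG_ENTRY() and ETR_SG_ADDR() macros from the diff in this mail, but
the lower-case helper names and the use of uint64_t for the address are
assumptions made so that the example builds outside the kernel.

```c
#include <stdint.h>

typedef uint32_t sgte_t;

#define SG_PAGE_SHIFT	12	/* ETR SG pages are always 4K */
#define SG_ADDR_SHIFT	4	/* address field starts at bit 4 */

#define SG_ET_MASK	0x3u
#define SG_ET_LAST	0x1u	/* last entry, points to a data page */
#define SG_ET_NORMAL	0x2u	/* normal entry, points to a data page */
#define SG_ET_LINK	0x3u	/* points to the next table page */

/* Pack address bits [39:12] and the 2-bit entry type into one entry. */
static sgte_t sg_entry(uint64_t addr, uint32_t type)
{
	return (sgte_t)(((addr >> SG_PAGE_SHIFT) << SG_ADDR_SHIFT) |
			(type & SG_ET_MASK));
}

/* Recover the 4K-aligned address from an entry. */
static uint64_t sg_addr(sgte_t entry)
{
	return ((uint64_t)entry >> SG_ADDR_SHIFT) << SG_PAGE_SHIFT;
}

/* Recover the entry type from an entry. */
static uint32_t sg_type(sgte_t entry)
{
	return entry & SG_ET_MASK;
}
```

For example, a normal entry for the page at 0x80001000 is encoded as
0x800012, and decoding it gives back the same address and type.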

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 263 ++++++++++++++++++++++++
 1 file changed, 263 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 402b061..67b4117 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -11,6 +11,87 @@
 #include "coresight-tmc.h"
 
 /*
+ * The TMC ETR SG has a page size of 4K. The SG table contains pointers
+ * to 4KB buffers. However, the OS may use a PAGE_SIZE different from
+ * 4K (i.e, 16KB or 64KB). This implies that a single OS page could
+ * contain more than one SG buffer and tables.
+ *
+ * A table entry has the following format:
+ *
+ * ---Bit31------------Bit4-------Bit1-----Bit0--
+ * |     Address[39:12]    | SBZ |  Entry Type  |
+ * ----------------------------------------------
+ *
+ * Address: Bits [39:12] of a physical page address. Bits [11:0] are
+ *	    always zero.
+ *
+ * Entry type:
+ *	b00 - Reserved.
+ *	b01 - Last entry in the tables, points to 4K page buffer.
+ *	b10 - Normal entry, points to 4K page buffer.
+ *	b11 - Link. The address points to the base of next table.
+ */
+
+typedef u32 sgte_t;
+
+#define ETR_SG_PAGE_SHIFT		12
+#define ETR_SG_PAGE_SIZE		(1UL << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_PAGES_PER_SYSPAGE	(PAGE_SIZE / ETR_SG_PAGE_SIZE)
+#define ETR_SG_PTRS_PER_PAGE		(ETR_SG_PAGE_SIZE / sizeof(sgte_t))
+#define ETR_SG_PTRS_PER_SYSPAGE		(PAGE_SIZE / sizeof(sgte_t))
+
+#define ETR_SG_ET_MASK			0x3
+#define ETR_SG_ET_LAST			0x1
+#define ETR_SG_ET_NORMAL		0x2
+#define ETR_SG_ET_LINK			0x3
+
+#define ETR_SG_ADDR_SHIFT		4
+
+#define ETR_SG_ENTRY(addr, type) \
+	(sgte_t)((((addr) >> ETR_SG_PAGE_SHIFT) << ETR_SG_ADDR_SHIFT) | \
+		 (type & ETR_SG_ET_MASK))
+
+#define ETR_SG_ADDR(entry) \
+	(((dma_addr_t)(entry) >> ETR_SG_ADDR_SHIFT) << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_ET(entry)		((entry) & ETR_SG_ET_MASK)
+
+/*
+ * struct etr_sg_table : ETR SG Table
+ * @sg_table:		Generic SG Table holding the data/table pages.
+ * @hwaddr:		hwaddress used by the TMC, which is the base
+ *			address of the table.
+ */
+struct etr_sg_table {
+	struct tmc_sg_table	*sg_table;
+	dma_addr_t		hwaddr;
+};
+
+/*
+ * tmc_etr_sg_table_entries: Total number of table entries required to map
+ * @nr_pages system pages.
+ *
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
+ * with the last entry pointing to another page of table entries.
+ * If we spill over to a new page for mapping 1 entry, we could as
+ * well replace the link entry of the previous page with the last entry.
+ */
+static inline unsigned long __attribute_const__
+tmc_etr_sg_table_entries(int nr_pages)
+{
+	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
+	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
+	/*
+	 * If we spill over to a new page for 1 entry, we could as well
+	 * make it the LAST entry in the previous page, skipping the Link
+	 * address.
+	 */
+	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+		nr_sglinks--;
+	return nr_sgpages + nr_sglinks;
+}
+
+/*
  * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
  * and map the device address @addr to an offset within the virtual
  * contiguous buffer.
@@ -277,6 +358,188 @@ ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
 	return len;
 }
 
+#ifdef ETR_SG_DEBUG
+/* Map a dma address to virtual address */
+static unsigned long
+tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
+		      dma_addr_t addr, bool table)
+{
+	long offset;
+	unsigned long base;
+	struct tmc_pages *tmc_pages;
+
+	if (table) {
+		tmc_pages = &sg_table->table_pages;
+		base = (unsigned long)sg_table->table_vaddr;
+	} else {
+		tmc_pages = &sg_table->data_pages;
+		base = (unsigned long)sg_table->data_vaddr;
+	}
+
+	offset = tmc_pages_get_offset(tmc_pages, addr);
+	if (offset < 0)
+		return 0;
+	return base + offset;
+}
+
+/* Dump the given sg_table */
+static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
+{
+	sgte_t *ptr;
+	int i = 0;
+	dma_addr_t addr;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+
+	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+					      etr_table->hwaddr, true);
+	while (ptr) {
+		addr = ETR_SG_ADDR(*ptr);
+		switch (ETR_SG_ET(*ptr)) {
+		case ETR_SG_ET_NORMAL:
+			dev_dbg(sg_table->dev,
+				"%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
+			ptr++;
+			break;
+		case ETR_SG_ET_LINK:
+			dev_dbg(sg_table->dev,
+				"%05d: *** %p\t:{L} 0x%llx ***\n",
+				 i, ptr, addr);
+			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+							      addr, true);
+			break;
+		case ETR_SG_ET_LAST:
+			dev_dbg(sg_table->dev,
+				"%05d: ### %p\t:[L] 0x%llx ###\n",
+				 i, ptr, addr);
+			return;
+		default:
+			dev_dbg(sg_table->dev,
+				"%05d: xxx %p\t:[INVALID] 0x%llx xxx\n",
+				 i, ptr, addr);
+			return;
+		}
+		i++;
+	}
+	dev_dbg(sg_table->dev, "******* End of Table *****\n");
+}
+#else
+static inline void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table) {}
+#endif
+
+/*
+ * Populate the SG Table page table entries from table/data
+ * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
+ * So does a Table page. So we keep track of indices of the tables
+ * in each system page and move the pointers accordingly.
+ */
+#define INC_IDX_ROUND(idx, size) ((idx) = ((idx) + 1) % (size))
+static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
+{
+	dma_addr_t paddr;
+	int i, type, nr_entries;
+	int tpidx = 0; /* index to the current system table_page */
+	int sgtidx = 0;	/* index to the sg_table within the current syspage */
+	int sgtentry = 0; /* the entry within the sg_table */
+	int dpidx = 0; /* index to the current system data_page */
+	int spidx = 0; /* index to the SG page within the current data page */
+	sgte_t *ptr; /* pointer to the table entry to fill */
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
+	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
+
+	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
+	/*
+	 * Use the contiguous virtual address of the table to update entries.
+	 */
+	ptr = sg_table->table_vaddr;
+	/*
+	 * Fill all the entries, except the last entry to avoid special
+	 * checks within the loop.
+	 */
+	for (i = 0; i < nr_entries - 1; i++) {
+		if (sgtentry == ETR_SG_PTRS_PER_PAGE - 1) {
+			/*
+			 * Last entry in a sg_table page is a link address to
+			 * the next table page. If this sg_table is the last
+			 * one in the system page, it links to the first
+			 * sg_table in the next system page. Otherwise, it
+			 * links to the next sg_table page within the system
+			 * page.
+			 */
+			if (sgtidx == ETR_SG_PAGES_PER_SYSPAGE - 1) {
+				paddr = table_daddrs[tpidx + 1];
+			} else {
+				paddr = table_daddrs[tpidx] +
+					(ETR_SG_PAGE_SIZE * (sgtidx + 1));
+			}
+			type = ETR_SG_ET_LINK;
+		} else {
+			/*
+			 * Update the indices to the data_pages to point to the
+			 * next sg_page in the data buffer.
+			 */
+			type = ETR_SG_ET_NORMAL;
+			paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
+			if (!INC_IDX_ROUND(spidx, ETR_SG_PAGES_PER_SYSPAGE))
+				dpidx++;
+		}
+		*ptr++ = ETR_SG_ENTRY(paddr, type);
+		/*
+		 * Move to the next table pointer, moving the table page index
+		 * if necessary
+		 */
+		if (!INC_IDX_ROUND(sgtentry, ETR_SG_PTRS_PER_PAGE)) {
+			if (!INC_IDX_ROUND(sgtidx, ETR_SG_PAGES_PER_SYSPAGE))
+				tpidx++;
+		}
+	}
+
+	/* Set up the last entry, which is always a data pointer */
+	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
+	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
+}
+
+/*
+ * tmc_init_etr_sg_table: Allocate a TMC ETR SG table, data buffer of @size and
+ * populate the table.
+ *
+ * @dev		- Device pointer for the TMC
+ * @node	- NUMA node where the memory should be allocated
+ * @size	- Total size of the data buffer
+ * @pages	- Optional list of page virtual address
+ */
+static struct etr_sg_table __maybe_unused *
+tmc_init_etr_sg_table(struct device *dev, int node,
+		  unsigned long size, void **pages)
+{
+	int nr_entries, nr_tpages;
+	int nr_dpages = size >> PAGE_SHIFT;
+	struct tmc_sg_table *sg_table;
+	struct etr_sg_table *etr_table;
+
+	etr_table = kzalloc(sizeof(*etr_table), GFP_KERNEL);
+	if (!etr_table)
+		return ERR_PTR(-ENOMEM);
+	nr_entries = tmc_etr_sg_table_entries(nr_dpages);
+	nr_tpages = DIV_ROUND_UP(nr_entries, ETR_SG_PTRS_PER_SYSPAGE);
+
+	sg_table = tmc_alloc_sg_table(dev, node, nr_tpages, nr_dpages, pages);
+	if (IS_ERR(sg_table)) {
+		kfree(etr_table);
+		return ERR_PTR(PTR_ERR(sg_table));
+	}
+
+	etr_table->sg_table = sg_table;
+	/* TMC should use table base address for DBA */
+	etr_table->hwaddr = sg_table->table_daddr;
+	tmc_etr_sg_table_populate(etr_table);
+	/* Sync the table pages for the HW */
+	tmc_sg_table_sync_table(sg_table);
+	tmc_etr_sg_table_dump(etr_table);
+
+	return etr_table;
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
-- 
2.7.4

* [PATCH v2 10/12] coresight: tmc-etr: Add transparent buffer management
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

The TMC-ETR can use the target trace buffer in two different modes:
normal physically contiguous mode, and Scatter-Gather mode, which uses
a discontiguous list of pages. In addition, Coresight SoC-600 has a
dedicated component, CATU (Coresight Address Translation Unit), which
provides an improved scatter-gather mode. This complicates the
management of the buffer used for trace, depending on the mode in
which the ETR is configured.

So, this patch adds a transparent layer for managing the ETR buffer.
The layer abstracts the basic operations on the buffer (alloc, free,
sync and retrieve the data) and uses mode-specific helpers to do the
actual work. This also allows the ETR driver to choose the best mode
for a given use case and to fall back to a different mode without
duplicating code.
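
The general shape of such a layer can be sketched in user-space C as
below. This is not the driver code: the types, the two stub backends and
the 1MB limit are hypothetical stand-ins, and malloc() takes the place of
the real DMA allocations. The point is only the per-mode operations table
and the fallback to the next mode when one backend fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

enum buf_mode { MODE_FLAT, MODE_SG, MODE_COUNT };

struct trace_buf;

struct trace_buf_ops {
	int  (*alloc)(struct trace_buf *buf);
	void (*free)(struct trace_buf *buf);
};

struct trace_buf {
	enum buf_mode mode;
	size_t size;
	const struct trace_buf_ops *ops;	/* set once a backend succeeds */
	void *priv;				/* backend specific data */
};

/* Illustrative only: pretend contiguous memory above 1MB is unavailable. */
#define FLAT_LIMIT	(1u << 20)

static int flat_alloc(struct trace_buf *buf)
{
	if (buf->size > FLAT_LIMIT)
		return -ENOMEM;
	buf->priv = malloc(buf->size);
	return buf->priv ? 0 : -ENOMEM;
}

static int sg_alloc(struct trace_buf *buf)
{
	buf->priv = malloc(buf->size);	/* stand-in for a list of pages */
	return buf->priv ? 0 : -ENOMEM;
}

static void common_free(struct trace_buf *buf)
{
	free(buf->priv);
	buf->priv = NULL;
}

static const struct trace_buf_ops flat_ops = { .alloc = flat_alloc, .free = common_free };
static const struct trace_buf_ops sg_ops   = { .alloc = sg_alloc,   .free = common_free };

static const struct trace_buf_ops *const mode_ops[MODE_COUNT] = {
	[MODE_FLAT] = &flat_ops,
	[MODE_SG]   = &sg_ops,
};

/* Try each mode in order and bind the ops of the first one that works. */
static int trace_buf_alloc(struct trace_buf *buf, size_t size)
{
	int rc = -EINVAL;

	buf->size = size;
	for (int mode = 0; mode < MODE_COUNT; mode++) {
		rc = mode_ops[mode]->alloc(buf);
		if (!rc) {
			buf->mode = (enum buf_mode)mode;
			buf->ops = mode_ops[mode];
			return 0;
		}
	}
	return rc;
}

/* Callers never need to know which backend owns the buffer. */
static void trace_buf_free(struct trace_buf *buf)
{
	assert(buf->ops && buf->ops->free);
	buf->ops->free(buf);
}
```

Callers only ever go through trace_buf_alloc() and trace_buf_free(), so
adding another mode means adding one more entry to mode_ops.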

The patch also adds the "normal" flat memory mode and switches
the sysfs driver to use the new layer.
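
Because the ETR runs in circular buffer mode, the trace data in a flat
buffer may start anywhere and wrap around the end. The sketch below shows
one way to turn a logical read position into a buffer offset and a
linear length, in the spirit of tmc_etr_get_sysfs_trace() and
tmc_etr_buf_get_data() in the diff below. The struct and function names
are hypothetical and used only for this example.

```c
#include <assert.h>
#include <stddef.h>

struct ring_view {
	size_t size;	/* total buffer size */
	size_t offset;	/* buffer offset of the oldest trace byte */
	size_t len;	/* number of valid trace bytes */
};

/*
 * For logical position pos within the trace data, store the matching
 * buffer offset in *phys and return how many of the want bytes can be
 * read linearly from there. Returns 0 once pos reaches the end of data.
 */
static size_t ring_read_span(const struct ring_view *rv, size_t pos,
			     size_t want, size_t *phys)
{
	size_t off;

	assert(rv->offset < rv->size && rv->len <= rv->size);

	if (pos >= rv->len)
		return 0;
	/* Limit the request to the trace data that is left. */
	if (want > rv->len - pos)
		want = rv->len - pos;

	/* Wrap the starting offset back into the buffer. */
	off = rv->offset + pos;
	if (off >= rv->size)
		off -= rv->size;

	/* Never read past the end of the buffer in one go. */
	if (want > rv->size - off)
		want = rv->size - off;

	*phys = off;
	return want;
}
```

A reader that gets fewer bytes than it asked for simply calls again with
the advanced position, and the second call starts from offset 0.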

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since last version:
 - Split the SG mode support into a separate patch
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 342 ++++++++++++++++++------
 drivers/hwtracing/coresight/coresight-tmc.h     |  55 +++-
 2 files changed, 308 insertions(+), 89 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 67b4117..a0e504a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -6,10 +6,18 @@
 
 #include <linux/coresight.h>
 #include <linux/dma-mapping.h>
+#include <linux/iommu.h>
 #include <linux/slab.h>
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
 
+struct etr_flat_buf {
+	struct device	*dev;
+	dma_addr_t	daddr;
+	void		*vaddr;
+	size_t		size;
+};
+
 /*
  * The TMC ETR SG has a page size of 4K. The SG table contains pointers
  * to 4KB buffers. However, the OS may use a PAGE_SIZE different from
@@ -540,16 +548,207 @@ tmc_init_etr_sg_table(struct device *dev, int node,
 	return etr_table;
 }
 
+/*
+ * tmc_etr_alloc_flat_buf: Allocate a contiguous DMA buffer.
+ */
+static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
+				  struct etr_buf *etr_buf, int node,
+				  void **pages)
+{
+	struct etr_flat_buf *flat_buf;
+
+	/* We cannot reuse existing pages for flat buf */
+	if (pages)
+		return -EINVAL;
+
+	flat_buf = kzalloc(sizeof(*flat_buf), GFP_KERNEL);
+	if (!flat_buf)
+		return -ENOMEM;
+
+	flat_buf->vaddr = dma_alloc_coherent(drvdata->dev, etr_buf->size,
+					     &flat_buf->daddr, GFP_KERNEL);
+	if (!flat_buf->vaddr) {
+		kfree(flat_buf);
+		return -ENOMEM;
+	}
+
+	flat_buf->size = etr_buf->size;
+	flat_buf->dev = drvdata->dev;
+	etr_buf->hwaddr = flat_buf->daddr;
+	etr_buf->mode = ETR_MODE_FLAT;
+	etr_buf->private = flat_buf;
+	return 0;
+}
+
+static void tmc_etr_free_flat_buf(struct etr_buf *etr_buf)
+{
+	struct etr_flat_buf *flat_buf = etr_buf->private;
+
+	if (flat_buf && flat_buf->daddr)
+		dma_free_coherent(flat_buf->dev, flat_buf->size,
+				  flat_buf->vaddr, flat_buf->daddr);
+	kfree(flat_buf);
+}
+
+static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
+{
+	/*
+	 * Adjust the buffer to point to the beginning of the trace data
+	 * and update the available trace data.
+	 */
+	etr_buf->offset = rrp - etr_buf->hwaddr;
+	if (etr_buf->full)
+		etr_buf->len = etr_buf->size;
+	else
+		etr_buf->len = rwp - rrp;
+}
+
+static ssize_t tmc_etr_get_data_flat_buf(struct etr_buf *etr_buf,
+					 u64 offset, size_t len, char **bufpp)
+{
+	struct etr_flat_buf *flat_buf = etr_buf->private;
+
+	*bufpp = (char *)flat_buf->vaddr + offset;
+	/*
+	 * tmc_etr_buf_get_data already adjusts the length to handle
+	 * buffer wrapping around.
+	 */
+	return len;
+}
+
+static const struct etr_buf_operations etr_flat_buf_ops = {
+	.alloc = tmc_etr_alloc_flat_buf,
+	.free = tmc_etr_free_flat_buf,
+	.sync = tmc_etr_sync_flat_buf,
+	.get_data = tmc_etr_get_data_flat_buf,
+};
+
+static const struct etr_buf_operations *etr_buf_ops[] = {
+	[ETR_MODE_FLAT] = &etr_flat_buf_ops,
+};
+
+static inline int tmc_etr_mode_alloc_buf(int mode,
+					 struct tmc_drvdata *drvdata,
+					 struct etr_buf *etr_buf, int node,
+					 void **pages)
+{
+	int rc;
+
+	switch (mode) {
+	case ETR_MODE_FLAT:
+		rc = etr_buf_ops[mode]->alloc(drvdata, etr_buf, node, pages);
+		if (!rc)
+			etr_buf->ops = etr_buf_ops[mode];
+		return rc;
+	default:
+		return -EINVAL;
+	}
+}
+
+/*
+ * tmc_alloc_etr_buf: Allocate a buffer used by ETR.
+ * @drvdata	: ETR device details.
+ * @size	: size of the requested buffer.
+ * @flags	: Required properties for the buffer.
+ * @node	: Node for memory allocations.
+ * @pages	: An optional list of pages.
+ */
+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
+					 ssize_t size, int flags,
+					 int node, void **pages)
+{
+	int rc = 0;
+	struct etr_buf *etr_buf;
+
+	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
+	if (!etr_buf)
+		return ERR_PTR(-ENOMEM);
+
+	etr_buf->size = size;
+
+	rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
+				    etr_buf, node, pages);
+	if (rc) {
+		kfree(etr_buf);
+		return ERR_PTR(rc);
+	}
+
+	return etr_buf;
+}
+
+static void tmc_free_etr_buf(struct etr_buf *etr_buf)
+{
+	WARN_ON(!etr_buf->ops || !etr_buf->ops->free);
+	etr_buf->ops->free(etr_buf);
+	kfree(etr_buf);
+}
+
+/*
+ * tmc_etr_buf_get_data: Get the pointer to the trace data at @offset
+ * with a maximum of @len bytes.
+ * Returns: The size of the linear data available at @offset, with *bufpp
+ * updated to point to the buffer.
+ */
+static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf,
+				    u64 offset, size_t len, char **bufpp)
+{
+	/* Adjust the length to limit this transaction to end of buffer */
+	len = (len < (etr_buf->size - offset)) ? len : etr_buf->size - offset;
+
+	return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp);
+}
+
+static inline s64
+tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
+{
+	ssize_t len;
+	char *bufp;
+
+	len = tmc_etr_buf_get_data(etr_buf, offset,
+				   CORESIGHT_BARRIER_PKT_SIZE, &bufp);
+	if (WARN_ON(len < 0 || len < CORESIGHT_BARRIER_PKT_SIZE))
+		return -EINVAL;
+	coresight_insert_barrier_packet(bufp);
+	return offset + CORESIGHT_BARRIER_PKT_SIZE;
+}
+
+/*
+ * tmc_sync_etr_buf: Sync the trace buffer availability with drvdata.
+ * Makes sure the trace data is synced to the memory for consumption.
+ * @etr_buf->offset will hold the offset to the beginning of the trace data
+ * within the buffer, with @etr_buf->len bytes to consume.
+ */
+static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
+{
+	struct etr_buf *etr_buf = drvdata->etr_buf;
+	u64 rrp, rwp;
+	u32 status;
+
+	rrp = tmc_read_rrp(drvdata);
+	rwp = tmc_read_rwp(drvdata);
+	status = readl_relaxed(drvdata->base + TMC_STS);
+	etr_buf->full = status & TMC_STS_FULL;
+
+	WARN_ON(!etr_buf->ops || !etr_buf->ops->sync);
+
+	etr_buf->ops->sync(etr_buf, rrp, rwp);
+
+	/* Insert barrier packets at the beginning, if there was an overflow */
+	if (etr_buf->full)
+		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
+	struct etr_buf *etr_buf = drvdata->etr_buf;
 
 	CS_UNLOCK(drvdata->base);
 
 	/* Wait for TMCSReady bit to be set */
 	tmc_wait_for_tmcready(drvdata);
 
-	writel_relaxed(drvdata->size / 4, drvdata->base + TMC_RSZ);
+	writel_relaxed(etr_buf->size / 4, drvdata->base + TMC_RSZ);
 	writel_relaxed(TMC_MODE_CIRCULAR_BUFFER, drvdata->base + TMC_MODE);
 
 	axictl = readl_relaxed(drvdata->base + TMC_AXICTL);
@@ -563,15 +762,15 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 	}
 
 	writel_relaxed(axictl, drvdata->base + TMC_AXICTL);
-	tmc_write_dba(drvdata, drvdata->paddr);
+	tmc_write_dba(drvdata, etr_buf->hwaddr);
 	/*
 	 * If the TMC pointers must be programmed before the session,
 	 * we have to set it properly (i.e, RRP/RWP to base address and
 	 * STS to "not full").
 	 */
 	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
-		tmc_write_rrp(drvdata, drvdata->paddr);
-		tmc_write_rwp(drvdata, drvdata->paddr);
+		tmc_write_rrp(drvdata, etr_buf->hwaddr);
+		tmc_write_rwp(drvdata, etr_buf->hwaddr);
 		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
 		writel_relaxed(sts, drvdata->base + TMC_STS);
 	}
@@ -587,59 +786,48 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 }
 
 /*
- * Return the available trace data in the buffer @pos, with a maximum
- * limit of @len, also updating the @bufpp on where to find it.
+ * Return the available trace data in the buffer (starts at etr_buf->offset,
+ * limited by etr_buf->len) from @pos, with a maximum limit of @len,
+ * also updating the @bufpp on where to find it. Since the trace data
+ * can start anywhere in the buffer, depending on the RRP, we adjust the
+ * @len returned to handle buffer wrapping around.
  */
 ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 				loff_t pos, size_t len, char **bufpp)
 {
+	s64 offset;
 	ssize_t actual = len;
-	char *bufp = drvdata->buf + pos;
-	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
-
-	/* Adjust the len to available size @pos */
-	if (pos + actual > drvdata->len)
-		actual = drvdata->len - pos;
+	struct etr_buf *etr_buf = drvdata->etr_buf;
 
+	if (pos + actual > etr_buf->len)
+		actual = etr_buf->len - pos;
 	if (actual <= 0)
 		return actual;
 
-	/*
-	 * Since we use a circular buffer, with trace data starting
-	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
-	 * wrap the current @pos to within the buffer.
-	 */
-	if (bufp >= bufend)
-		bufp -= drvdata->size;
-	/*
-	 * For simplicity, avoid copying over a wrapped around buffer.
-	 */
-	if ((bufp + actual) > bufend)
-		actual = bufend - bufp;
-	*bufpp = bufp;
-	return actual;
+	/* Compute the offset from which we read the data */
+	offset = etr_buf->offset + pos;
+	if (offset >= etr_buf->size)
+		offset -= etr_buf->size;
+	return tmc_etr_buf_get_data(etr_buf, offset, actual, bufpp);
 }
 
-static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
+static struct etr_buf *
+tmc_etr_setup_sysfs_buf(struct tmc_drvdata *drvdata)
 {
-	u32 val;
-	u64 rwp;
+	return tmc_alloc_etr_buf(drvdata, drvdata->size,
+				 0, cpu_to_node(0), NULL);
+}
 
-	rwp = tmc_read_rwp(drvdata);
-	val = readl_relaxed(drvdata->base + TMC_STS);
+static void
+tmc_etr_free_sysfs_buf(struct etr_buf *buf)
+{
+	if (buf)
+		tmc_free_etr_buf(buf);
+}
 
-	/*
-	 * Adjust the buffer to point to the beginning of the trace data
-	 * and update the available trace data.
-	 */
-	if (val & TMC_STS_FULL) {
-		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
-		drvdata->len = drvdata->size;
-		coresight_insert_barrier_packet(drvdata->buf);
-	} else {
-		drvdata->buf = drvdata->vaddr;
-		drvdata->len = rwp - drvdata->paddr;
-	}
+static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
+{
+	tmc_sync_etr_buf(drvdata);
 }
 
 static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
@@ -652,7 +840,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 	 * read before the TMC is disabled.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
-		tmc_etr_dump_hw(drvdata);
+		tmc_etr_sync_sysfs_buf(drvdata);
+
 	tmc_disable_hw(drvdata);
 
 	CS_LOCK(drvdata->base);
@@ -661,35 +850,32 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 {
 	int ret = 0;
-	bool used = false;
 	unsigned long flags;
-	void __iomem *vaddr = NULL;
-	dma_addr_t paddr = 0;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_buf *new_buf = NULL, *free_buf = NULL;
 
 	/*
-	 * If we don't have a buffer release the lock and allocate memory.
-	 * Otherwise keep the lock and move along.
+	 * If we are enabling the ETR from disabled state, we need to make
+	 * sure we have a buffer with the right size. The etr_buf is not reset
+	 * immediately after we stop the tracing in SYSFS mode as we wait for
+	 * the user to collect the data. We may be able to reuse the existing
+	 * buffer, provided the size matches. Any allocation has to be done
+	 * with the lock released.
 	 */
 	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (!drvdata->vaddr) {
+	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-		/*
-		 * Contiguous  memory can't be allocated while a spinlock is
-		 * held.  As such allocate memory here and free it if a buffer
-		 * has already been allocated (from a previous session).
-		 */
-		vaddr = dma_alloc_coherent(drvdata->dev, drvdata->size,
-					   &paddr, GFP_KERNEL);
-		if (!vaddr)
-			return -ENOMEM;
+		/* Allocate memory with the locks released */
+		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
+		if (IS_ERR(new_buf))
+			return PTR_ERR(new_buf);
 
 		/* Let's try again */
 		spin_lock_irqsave(&drvdata->spinlock, flags);
 	}
 
-	if (drvdata->reading) {
+	if (drvdata->reading || drvdata->mode == CS_MODE_PERF) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -697,21 +883,19 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	/*
 	 * In sysFS mode we can have multiple writers per sink.  Since this
 	 * sink is already enabled no memory is needed and the HW need not be
-	 * touched.
+	 * touched, even if the buffer size has changed.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		goto out;
 
 	/*
-	 * If drvdata::vaddr == NULL, use the memory allocated above.
-	 * Otherwise a buffer still exists from a previous session, so
-	 * simply use that.
+	 * If we don't have a buffer or it doesn't match the requested size,
+	 * use the buffer allocated above. Otherwise reuse the existing buffer.
 	 */
-	if (drvdata->vaddr == NULL) {
-		used = true;
-		drvdata->vaddr = vaddr;
-		drvdata->paddr = paddr;
-		drvdata->buf = drvdata->vaddr;
+	if (!drvdata->etr_buf ||
+	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
+		free_buf = drvdata->etr_buf;
+		drvdata->etr_buf = new_buf;
 	}
 
 	drvdata->mode = CS_MODE_SYSFS;
@@ -720,8 +904,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free memory outside the spinlock if need be */
-	if (!used && vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (free_buf)
+		tmc_etr_free_sysfs_buf(free_buf);
 
 	if (!ret)
 		dev_info(drvdata->dev, "TMC-ETR enabled\n");
@@ -800,8 +984,8 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}
 
-	/* If drvdata::buf is NULL the trace data has been read already */
-	if (drvdata->buf == NULL) {
+	/* If drvdata::etr_buf is NULL the trace data has been read already */
+	if (drvdata->etr_buf == NULL) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -820,8 +1004,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 {
 	unsigned long flags;
-	dma_addr_t paddr;
-	void __iomem *vaddr = NULL;
+	struct etr_buf *etr_buf = NULL;
 
 	/* config types are set a boot time and never change */
 	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
@@ -842,17 +1025,16 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 		 * The ETR is not tracing and the buffer was just read.
 		 * As such prepare to free the trace buffer.
 		 */
-		vaddr = drvdata->vaddr;
-		paddr = drvdata->paddr;
-		drvdata->buf = drvdata->vaddr = NULL;
+		etr_buf = drvdata->etr_buf;
+		drvdata->etr_buf = NULL;
 	}
 
 	drvdata->reading = false;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free allocated memory out side of the spinlock */
-	if (vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (etr_buf)
+		tmc_free_etr_buf(etr_buf);
 
 	return 0;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index cdb668b..39ed306 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -123,6 +123,34 @@ enum tmc_mem_intf_width {
 #define CORESIGHT_SOC_600_ETR_CAPS	\
 	(TMC_ETR_SAVE_RESTORE | TMC_ETR_AXI_ARCACHE)
 
+enum etr_mode {
+	ETR_MODE_FLAT,		/* Uses contiguous flat buffer */
+};
+
+struct etr_buf_operations;
+
+/**
+ * struct etr_buf - Details of the buffer used by ETR
+ * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
+ * @full	: Trace data overflow
+ * @size	: Size of the buffer.
+ * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
+ * @offset	: Offset of the trace data in the buffer for consumption.
+ * @len		: Available trace data @buf (may wrap around to the beginning).
+ * @ops		: ETR buffer operations for the mode.
+ * @private	: Backend specific information for the buf
+ */
+struct etr_buf {
+	enum etr_mode			mode;
+	bool				full;
+	ssize_t				size;
+	dma_addr_t			hwaddr;
+	unsigned long			offset;
+	s64				len;
+	const struct etr_buf_operations	*ops;
+	void				*private;
+};
+
 /**
  * struct tmc_drvdata - specifics associated to an TMC component
  * @base:	memory mapped base address for this component.
@@ -130,11 +158,10 @@ enum tmc_mem_intf_width {
  * @csdev:	component vitals needed by the framework.
  * @miscdev:	specifics to handle "/dev/xyz.tmc" entry.
  * @spinlock:	only one at a time pls.
- * @buf:	area of memory where trace data get sent.
- * @paddr:	DMA start location in RAM.
- * @vaddr:	virtual representation of @paddr.
- * @size:	trace buffer size.
- * @len:	size of the available trace.
+ * @buf:	Snapshot of the trace data for ETF/ETB.
+ * @etr_buf:	details of buffer used in TMC-ETR
+ * @len:	size of the available trace for ETF/ETB.
+ * @size:	trace buffer size for this TMC (common for all modes).
  * @mode:	how this TMC is being used.
  * @config_type: TMC variant, must be of type @tmc_config_type.
  * @memwidth:	width of the memory interface databus, in bytes.
@@ -149,11 +176,12 @@ struct tmc_drvdata {
 	struct miscdevice	miscdev;
 	spinlock_t		spinlock;
 	bool			reading;
-	char			*buf;
-	dma_addr_t		paddr;
-	void __iomem		*vaddr;
-	u32			size;
+	union {
+		char		*buf;		/* TMC ETB */
+		struct etr_buf	*etr_buf;	/* TMC ETR */
+	};
 	u32			len;
+	u32			size;
 	u32			mode;
 	enum tmc_config_type	config_type;
 	enum tmc_mem_intf_width	memwidth;
@@ -161,6 +189,15 @@ struct tmc_drvdata {
 	u32			etr_caps;
 };
 
+struct etr_buf_operations {
+	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
+			int node, void **pages);
+	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
+	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
+				char **bufpp);
+	void (*free)(struct etr_buf *etr_buf);
+};
+
 /**
  * struct tmc_pages - Collection of pages used for SG.
  * @nr_pages:		Number of pages in the list.
-- 
2.7.4

* [PATCH v2 11/12] coresight: tmc-etr buf: Add TMC scatter gather mode backend
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

Add support for Scatter-Gather mode to the etr-buf layer.
Since we now have two different modes, we choose the backend
based on a set of conditions, documented in the code.
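
The conditions can be summarised as a small predicate. The sketch below
is a user-space restatement of the check in tmc_alloc_etr_buf() from the
diff below; the enum and the function name are hypothetical, and it only
reports which mode would be attempted first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SZ_1M	(1024u * 1024u)

enum etr_mode_choice { CHOICE_FLAT, CHOICE_SG, CHOICE_NONE };

/*
 * Pick the mode to try first.
 *  has_sg:     the ETR supports its built-in Scatter-Gather unit
 *  has_iommu:  the device sits behind an IOMMU
 *  have_pages: the caller supplied an existing list of pages
 */
static enum etr_mode_choice
first_choice(bool has_sg, bool has_iommu, bool have_pages, size_t size)
{
	assert(size > 0);

	/* A caller-supplied page list can never back a flat buffer. */
	if (!have_pages && (!has_sg || has_iommu || size < SZ_1M))
		return CHOICE_FLAT;
	/* Otherwise SG is the only candidate, if the hardware has it. */
	return has_sg ? CHOICE_SG : CHOICE_NONE;
}
```

In the patch, a failed flat allocation is followed by an attempt in SG
mode when the hardware supports it, so CHOICE_FLAT here is only the
first attempt and not necessarily the final mode.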

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since last version:
 - New in this version, split out from the original patch.
 - No functional changes.
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 114 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-tmc.h     |   1 +
 2 files changed, 111 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index a0e504a..19955cf 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -516,7 +516,7 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
  * @size	- Total size of the data buffer
  * @pages	- Optional list of page virtual address
  */
-static struct etr_sg_table __maybe_unused *
+static struct etr_sg_table *
 tmc_init_etr_sg_table(struct device *dev, int node,
 		  unsigned long size, void **pages)
 {
@@ -623,8 +623,86 @@ static const struct etr_buf_operations etr_flat_buf_ops = {
 	.get_data = tmc_etr_get_data_flat_buf,
 };
 
+/*
+ * tmc_etr_alloc_sg_buf: Allocate an SG buf @etr_buf. Setup the parameters
+ * appropriately.
+ */
+static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata,
+				struct etr_buf *etr_buf, int node,
+				void **pages)
+{
+	struct etr_sg_table *etr_table;
+
+	etr_table = tmc_init_etr_sg_table(drvdata->dev, node,
+					  etr_buf->size, pages);
+	if (IS_ERR(etr_table))
+		return -ENOMEM;
+	etr_buf->hwaddr = etr_table->hwaddr;
+	etr_buf->mode = ETR_MODE_ETR_SG;
+	etr_buf->private = etr_table;
+	return 0;
+}
+
+static void tmc_etr_free_sg_buf(struct etr_buf *etr_buf)
+{
+	struct etr_sg_table *etr_table = etr_buf->private;
+
+	if (etr_table) {
+		tmc_free_sg_table(etr_table->sg_table);
+		kfree(etr_table);
+	}
+}
+
+static ssize_t tmc_etr_get_data_sg_buf(struct etr_buf *etr_buf, u64 offset,
+				       size_t len, char **bufpp)
+{
+	struct etr_sg_table *etr_table = etr_buf->private;
+
+	return tmc_sg_table_get_data(etr_table->sg_table, offset, len, bufpp);
+}
+
+static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
+{
+	long r_offset, w_offset;
+	struct etr_sg_table *etr_table = etr_buf->private;
+	struct tmc_sg_table *table = etr_table->sg_table;
+
+	/* Convert hw address to offset in the buffer */
+	r_offset = tmc_sg_get_data_page_offset(table, rrp);
+	if (r_offset < 0) {
+		dev_warn(table->dev,
+			 "Unable to map RRP %llx to offset\n", rrp);
+		etr_buf->len = 0;
+		return;
+	}
+
+	w_offset = tmc_sg_get_data_page_offset(table, rwp);
+	if (w_offset < 0) {
+		dev_warn(table->dev,
+			 "Unable to map RWP %llx to offset\n", rwp);
+		etr_buf->len = 0;
+		return;
+	}
+
+	etr_buf->offset = r_offset;
+	if (etr_buf->full)
+		etr_buf->len = etr_buf->size;
+	else
+		etr_buf->len = ((w_offset < r_offset) ? etr_buf->size : 0) +
+				w_offset - r_offset;
+	tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
+}
+
+static const struct etr_buf_operations etr_sg_buf_ops = {
+	.alloc = tmc_etr_alloc_sg_buf,
+	.free = tmc_etr_free_sg_buf,
+	.sync = tmc_etr_sync_sg_buf,
+	.get_data = tmc_etr_get_data_sg_buf,
+};
+
 static const struct etr_buf_operations *etr_buf_ops[] = {
 	[ETR_MODE_FLAT] = &etr_flat_buf_ops,
+	[ETR_MODE_ETR_SG] = &etr_sg_buf_ops,
 };
 
 static inline int tmc_etr_mode_alloc_buf(int mode,
@@ -636,6 +714,7 @@ static inline int tmc_etr_mode_alloc_buf(int mode,
 
 	switch (mode) {
 	case ETR_MODE_FLAT:
+	case ETR_MODE_ETR_SG:
 		rc = etr_buf_ops[mode]->alloc(drvdata, etr_buf, node, pages);
 		if (!rc)
 			etr_buf->ops = etr_buf_ops[mode];
@@ -657,17 +736,38 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
 					 ssize_t size, int flags,
 					 int node, void **pages)
 {
-	int rc = 0;
+	int rc = -ENOMEM;
+	bool has_etr_sg, has_iommu;
 	struct etr_buf *etr_buf;
 
+	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
+	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
+
 	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
 	if (!etr_buf)
 		return ERR_PTR(-ENOMEM);
 
 	etr_buf->size = size;
 
-	rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
-				    etr_buf, node, pages);
+	/*
+	 * If we have to use an existing list of pages, we cannot reliably
+	 * use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
+	 * we use the contiguous DMA memory if at least one of the following
+	 * conditions is true:
+	 *  a) The ETR cannot use Scatter-Gather.
+	 *  b) We have a backing IOMMU.
+	 *  c) The requested memory size is small (< 1M).
+	 *
+	 * Fall back to the available mechanisms.
+	 *
+	 */
+	if (!pages &&
+	    (!has_etr_sg || has_iommu || size < SZ_1M))
+		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
+					    etr_buf, node, pages);
+	if (rc && has_etr_sg)
+		rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
+					    etr_buf, node, pages);
 	if (rc) {
 		kfree(etr_buf);
 		return ERR_PTR(rc);
@@ -761,6 +861,12 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 		axictl |= TMC_AXICTL_ARCACHE_OS;
 	}
 
+	if (etr_buf->mode == ETR_MODE_ETR_SG) {
+		if (WARN_ON(!tmc_etr_has_cap(drvdata, TMC_ETR_SG)))
+			return;
+		axictl |= TMC_AXICTL_SCT_GAT_MODE;
+	}
+
 	writel_relaxed(axictl, drvdata->base + TMC_AXICTL);
 	tmc_write_dba(drvdata, etr_buf->hwaddr);
 	/*
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 39ed306..eeeba48 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -125,6 +125,7 @@ enum tmc_mem_intf_width {
 
 enum etr_mode {
 	ETR_MODE_FLAT,		/* Uses contiguous flat buffer */
+	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
 };
 
 struct etr_buf_operations;
-- 
2.7.4

^ permalink raw reply related
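
Editor's note on the patch above: the length computation in
tmc_etr_sync_sg_buf() can be modelled outside the kernel. The sketch
below is illustrative only. The name ring_data_len() and its parameters
are hypothetical and not part of the driver, and it assumes both
offsets have already been validated to lie inside the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the length calculation done in
 * tmc_etr_sync_sg_buf(). This is not driver code.
 *
 * r_off: offset of the read pointer within the buffer
 * w_off: offset of the write pointer within the buffer
 * size:  total buffer size in bytes
 * full:  true if the buffer was reported as full
 */
static long ring_data_len(long r_off, long w_off, long size, bool full)
{
	/* Assumption: the caller already validated both offsets. */
	assert(r_off >= 0 && r_off < size);
	assert(w_off >= 0 && w_off < size);

	if (full)
		return size;

	/* Write pointer behind the read pointer: the data wraps around. */
	if (w_off < r_off)
		return size + w_off - r_off;

	return w_off - r_off;
}
```

With an 8192 byte buffer, r_off = 6144 and w_off = 2048 give 4096 bytes
of data, which spans the end of the buffer. This is why the sync helper
in the patch has to handle wrap-around when walking the data pages.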

* [PATCH v2 12/12] coresight: tmc: Add configuration support for trace buffer size
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

Now that we can dynamically switch between contiguous memory and
an SG table depending on the trace buffer size, add a sysfs
attribute (buffer_size) for selecting an appropriate buffer size.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 .../ABI/testing/sysfs-bus-coresight-devices-tmc    |  8 ++++++
 .../devicetree/bindings/arm/coresight.txt          |  3 +-
 drivers/hwtracing/coresight/coresight-tmc.c        | 33 ++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
index 4fe677e..ea78714 100644
--- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
@@ -83,3 +83,11 @@ KernelVersion:	4.7
 Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
 Description:	(R) Indicates the capabilities of the Coresight TMC.
 		The value is read directly from the DEVID register, 0xFC8,
+
+What:		/sys/bus/coresight/devices/<memory_map>.tmc/buffer_size
+Date:		August 2018
+KernelVersion:	4.18
+Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
+Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
+		mode. Writable only for TMC-ETR configurations. The value
+		must be aligned to the kernel page size.
diff --git a/Documentation/devicetree/bindings/arm/coresight.txt b/Documentation/devicetree/bindings/arm/coresight.txt
index 603d3c6..9aa30a1 100644
--- a/Documentation/devicetree/bindings/arm/coresight.txt
+++ b/Documentation/devicetree/bindings/arm/coresight.txt
@@ -84,7 +84,8 @@ its hardware characteristcs.
 * Optional property for TMC:
 
 	* arm,buffer-size: size of contiguous buffer space for TMC ETR
-	 (embedded trace router)
+	  (embedded trace router). This property is obsolete. The buffer size
+	  can be configured dynamically via the buffer_size attribute in sysfs.
 
 	* arm,scatter-gather: boolean. Indicates that the TMC-ETR can safely
 	  use the SG mode on this system.
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index bc8fc86..1b817ec 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -277,8 +277,41 @@ static ssize_t trigger_cntr_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(trigger_cntr);
 
+static ssize_t buffer_size_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	return sprintf(buf, "%#x\n", drvdata->size);
+}
+
+static ssize_t buffer_size_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t size)
+{
+	int ret;
+	unsigned long val;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	/* Only permitted for TMC-ETRs */
+	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
+		return -EPERM;
+
+	ret = kstrtoul(buf, 0, &val);
+	if (ret)
+		return ret;
+	/* The buffer size should be page aligned */
+	if (val & (PAGE_SIZE - 1))
+		return -EINVAL;
+	drvdata->size = val;
+	return size;
+}
+
+static DEVICE_ATTR_RW(buffer_size);
+
 static struct attribute *coresight_tmc_attrs[] = {
 	&dev_attr_trigger_cntr.attr,
+	&dev_attr_buffer_size.attr,
 	NULL,
 };
 
-- 
2.7.4

^ permalink raw reply related
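
Editor's note on the patch above: the validation in buffer_size_store()
can be tried out in user space. The sketch below is an approximation,
not the driver code. parse_buffer_size() and SKETCH_PAGE_SIZE are
hypothetical names, a 4 KiB page size is assumed, and strtoul() stands
in for kstrtoul(), which is stricter about what it accepts (strtoul()
tolerates leading whitespace and a sign, for example).

```c
#include <errno.h>
#include <stdlib.h>

/* Assumption: 4 KiB pages. The kernel checks against PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical user-space model of the checks in buffer_size_store().
 * Returns 0 and stores the parsed value in *out on success, or a
 * negative errno value on failure.
 */
static int parse_buffer_size(const char *s, unsigned long *out)
{
	char *end;
	unsigned long val;

	errno = 0;
	/* Base 0 accepts decimal, octal (0...) and hex (0x...) input. */
	val = strtoul(s, &end, 0);
	if (errno)
		return -errno;
	if (end == s)
		return -EINVAL;
	/* Allow one trailing newline, as writes from "echo" carry one. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* The size must be a multiple of the page size. */
	if (val & (SKETCH_PAGE_SIZE - 1))
		return -EINVAL;

	*out = val;
	return 0;
}
```

Under these assumptions "0x100000" is accepted, while "4097" is
rejected with -EINVAL because it is not a multiple of the page size.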

* [PATCH v2 08/12] coresight: Add generic TMC sg table framework
From: Suzuki K Poulose @ 2018-05-29 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, mathieu.poirier, mike.leach, robert.walker,
	coresight, devicetree, robh, frowand.list, Suzuki K Poulose
In-Reply-To: <1527599737-28408-1-git-send-email-suzuki.poulose@arm.com>

This patch introduces a generic SG table data structure and
associated operations. An SG table can be used to map a set
of data pages where the trace data could be stored by the TMC
ETR. The information about the data pages could be stored in
different formats, depending on the type of the underlying
SG mechanism (e.g., TMC ETR SG vs. CoreSight CATU). The generic
structure provides bookkeeping of the pages used for the data
as well as the table contents. The table should be filled by
the user of the infrastructure.
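
Editor's note: one use of this bookkeeping is translating a device
address reported by the hardware back into an offset in the logically
contiguous data buffer, as tmc_pages_get_offset() does in the diff
below. A minimal user-space sketch of that lookup follows. The names
sketch_addr_to_offset() and SKETCH_PAGE_SIZE are hypothetical, a 4 KiB
page size is assumed, and plain uint64_t values stand in for
dma_addr_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical model of tmc_pages_get_offset(): find the page that
 * contains the device address @addr and return the offset of @addr
 * within the logically contiguous buffer, or -1 if no page matches.
 */
static long sketch_addr_to_offset(const uint64_t *daddrs, size_t nr_pages,
				  uint64_t addr)
{
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		uint64_t start = daddrs[i];

		if (addr >= start && addr < start + SKETCH_PAGE_SIZE)
			return (long)(i * SKETCH_PAGE_SIZE + (addr - start));
	}

	return -1;
}
```

The pages need not be physically adjacent or sorted. The index of the
matching entry in the array, not the device address itself, determines
the offset in the buffer.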

A table can be created by specifying the number of data pages
as well as the number of table pages required to hold the
pointers, where the latter could be different for different
types of tables. The pages are mapped in the appropriate DMA
data direction mode (i.e., DMA_TO_DEVICE for table pages
and DMA_FROM_DEVICE for data pages). The framework can optionally
accept a set of allocated data pages (e.g., perf ring buffer) and
map them accordingly. The table and data pages are vmap'ed to allow
easier access by the drivers. The framework also provides helpers to
sync the data written to the pages with appropriate directions.

This will later be used by the TMC ETR SG unit and CATU.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since previous version:
 - Drop helpers for table vaddr/paddr and data vaddr, as we don't use
   them anyway.
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 268 ++++++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-tmc.h     |  50 +++++
 2 files changed, 318 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 04206ff..402b061 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -6,9 +6,277 @@
 
 #include <linux/coresight.h>
 #include <linux/dma-mapping.h>
+#include <linux/slab.h>
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
 
+/*
+ * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
+ * and map the device address @addr to an offset within the virtually
+ * contiguous buffer.
+ */
+static long
+tmc_pages_get_offset(struct tmc_pages *tmc_pages, dma_addr_t addr)
+{
+	int i;
+	dma_addr_t page_start;
+
+	for (i = 0; i < tmc_pages->nr_pages; i++) {
+		page_start = tmc_pages->daddrs[i];
+		if (addr >= page_start && addr < (page_start + PAGE_SIZE))
+			return i * PAGE_SIZE + (addr - page_start);
+	}
+
+	return -EINVAL;
+}
+
+/*
+ * tmc_pages_free : Unmap and free the pages used by tmc_pages.
+ * If the pages were not allocated in tmc_pages_alloc(), we would
+ * simply drop the refcount.
+ */
+static void tmc_pages_free(struct tmc_pages *tmc_pages,
+			   struct device *dev, enum dma_data_direction dir)
+{
+	int i;
+
+	for (i = 0; i < tmc_pages->nr_pages; i++) {
+		if (tmc_pages->daddrs && tmc_pages->daddrs[i])
+			dma_unmap_page(dev, tmc_pages->daddrs[i],
+					 PAGE_SIZE, dir);
+		if (tmc_pages->pages && tmc_pages->pages[i])
+			__free_page(tmc_pages->pages[i]);
+	}
+
+	kfree(tmc_pages->pages);
+	kfree(tmc_pages->daddrs);
+	tmc_pages->pages = NULL;
+	tmc_pages->daddrs = NULL;
+	tmc_pages->nr_pages = 0;
+}
+
+/*
+ * tmc_pages_alloc : Allocate and map pages for a given @tmc_pages.
+ * If @pages is not NULL, the list of page virtual addresses is
+ * used as the data pages. The pages are then dma_map'ed for @dev
+ * with dma_direction @dir.
+ *
+ * Returns 0 upon success, else the error number.
+ */
+static int tmc_pages_alloc(struct tmc_pages *tmc_pages,
+			   struct device *dev, int node,
+			   enum dma_data_direction dir, void **pages)
+{
+	int i, nr_pages;
+	dma_addr_t paddr;
+	struct page *page;
+
+	nr_pages = tmc_pages->nr_pages;
+	tmc_pages->daddrs = kcalloc(nr_pages, sizeof(*tmc_pages->daddrs),
+					 GFP_KERNEL);
+	if (!tmc_pages->daddrs)
+		return -ENOMEM;
+	tmc_pages->pages = kcalloc(nr_pages, sizeof(*tmc_pages->pages),
+					 GFP_KERNEL);
+	if (!tmc_pages->pages) {
+		kfree(tmc_pages->daddrs);
+		tmc_pages->daddrs = NULL;
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		if (pages && pages[i]) {
+			page = virt_to_page(pages[i]);
+			/* Hold a refcount on the page */
+			get_page(page);
+		} else {
+			page = alloc_pages_node(node,
+						GFP_KERNEL | __GFP_ZERO, 0);
+		}
+		paddr = dma_map_page(dev, page, 0, PAGE_SIZE, dir);
+		if (dma_mapping_error(dev, paddr))
+			goto err;
+		tmc_pages->daddrs[i] = paddr;
+		tmc_pages->pages[i] = page;
+	}
+	return 0;
+err:
+	tmc_pages_free(tmc_pages, dev, dir);
+	return -ENOMEM;
+}
+
+static inline long
+tmc_sg_get_data_page_offset(struct tmc_sg_table *sg_table, dma_addr_t addr)
+{
+	return tmc_pages_get_offset(&sg_table->data_pages, addr);
+}
+
+static inline void tmc_free_table_pages(struct tmc_sg_table *sg_table)
+{
+	if (sg_table->table_vaddr)
+		vunmap(sg_table->table_vaddr);
+	tmc_pages_free(&sg_table->table_pages, sg_table->dev, DMA_TO_DEVICE);
+}
+
+static void tmc_free_data_pages(struct tmc_sg_table *sg_table)
+{
+	if (sg_table->data_vaddr)
+		vunmap(sg_table->data_vaddr);
+	tmc_pages_free(&sg_table->data_pages, sg_table->dev, DMA_FROM_DEVICE);
+}
+
+void tmc_free_sg_table(struct tmc_sg_table *sg_table)
+{
+	tmc_free_table_pages(sg_table);
+	tmc_free_data_pages(sg_table);
+}
+
+/*
+ * Allocate pages for the table. Since this will be used by the device,
+ * allocate the pages closer to the device (i.e., dev_to_node(dev)
+ * rather than the CPU node).
+ */
+static int tmc_alloc_table_pages(struct tmc_sg_table *sg_table)
+{
+	int rc;
+	struct tmc_pages *table_pages = &sg_table->table_pages;
+
+	rc = tmc_pages_alloc(table_pages, sg_table->dev,
+			     dev_to_node(sg_table->dev),
+			     DMA_TO_DEVICE, NULL);
+	if (rc)
+		return rc;
+	sg_table->table_vaddr = vmap(table_pages->pages,
+				     table_pages->nr_pages,
+				     VM_MAP,
+				     PAGE_KERNEL);
+	if (!sg_table->table_vaddr)
+		rc = -ENOMEM;
+	else
+		sg_table->table_daddr = table_pages->daddrs[0];
+	return rc;
+}
+
+static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages)
+{
+	int rc;
+
+	/* Allocate data pages on the node requested by the caller */
+	rc = tmc_pages_alloc(&sg_table->data_pages,
+			     sg_table->dev, sg_table->node,
+			     DMA_FROM_DEVICE, pages);
+	if (!rc) {
+		sg_table->data_vaddr = vmap(sg_table->data_pages.pages,
+					    sg_table->data_pages.nr_pages,
+					    VM_MAP,
+					    PAGE_KERNEL);
+		if (!sg_table->data_vaddr)
+			rc = -ENOMEM;
+	}
+	return rc;
+}
+
+/*
+ * tmc_alloc_sg_table: Allocate and setup dma pages for the TMC SG table
+ * and data buffers. TMC writes to the data buffers and reads from the SG
+ * Table pages.
+ *
+ * @dev		- Device to which page should be DMA mapped.
+ * @node	- Numa node for mem allocations
+ * @nr_tpages	- Number of pages for the table entries.
+ * @nr_dpages	- Number of pages for Data buffer.
+ * @pages	- Optional list of virtual address of pages.
+ */
+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages)
+{
+	long rc;
+	struct tmc_sg_table *sg_table;
+
+	sg_table = kzalloc(sizeof(*sg_table), GFP_KERNEL);
+	if (!sg_table)
+		return ERR_PTR(-ENOMEM);
+	sg_table->data_pages.nr_pages = nr_dpages;
+	sg_table->table_pages.nr_pages = nr_tpages;
+	sg_table->node = node;
+	sg_table->dev = dev;
+
+	rc  = tmc_alloc_data_pages(sg_table, pages);
+	if (!rc)
+		rc = tmc_alloc_table_pages(sg_table);
+	if (rc) {
+		tmc_free_sg_table(sg_table);
+		kfree(sg_table);
+		return ERR_PTR(rc);
+	}
+
+	return sg_table;
+}
+
+/*
+ * tmc_sg_table_sync_data_range: Sync the data buffer written
+ * by the device from @offset up to @size bytes.
+ */
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size)
+{
+	int i, index, start;
+	int npages = DIV_ROUND_UP(size, PAGE_SIZE);
+	struct device *dev = table->dev;
+	struct tmc_pages *data = &table->data_pages;
+
+	start = offset >> PAGE_SHIFT;
+	for (i = start; i < (start + npages); i++) {
+		index = i % data->nr_pages;
+		dma_sync_single_for_cpu(dev, data->daddrs[index],
+					PAGE_SIZE, DMA_FROM_DEVICE);
+	}
+}
+
+/* tmc_sg_table_sync_table: Sync the page table */
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table)
+{
+	int i;
+	struct device *dev = sg_table->dev;
+	struct tmc_pages *table_pages = &sg_table->table_pages;
+
+	for (i = 0; i < table_pages->nr_pages; i++)
+		dma_sync_single_for_device(dev, table_pages->daddrs[i],
+					   PAGE_SIZE, DMA_TO_DEVICE);
+}
+
+/*
+ * tmc_sg_table_get_data: Get the buffer pointer for data @offset
+ * in the SG buffer. The @bufpp is updated to point to the buffer.
+ * Returns :
+ *	the length of linear data available at @offset.
+ *	or
+ *	<= 0 if no data is available.
+ */
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+			      u64 offset, size_t len, char **bufpp)
+{
+	size_t size;
+	int pg_idx = offset >> PAGE_SHIFT;
+	int pg_offset = offset & (PAGE_SIZE - 1);
+	struct tmc_pages *data_pages = &sg_table->data_pages;
+
+	size = tmc_sg_table_buf_size(sg_table);
+	if (offset >= size)
+		return -EINVAL;
+
+	/* Make sure we don't go beyond the end */
+	len = (len < (size - offset)) ? len : size - offset;
+	/* Respect the page boundaries */
+	len = (len < (PAGE_SIZE - pg_offset)) ? len : (PAGE_SIZE - pg_offset);
+	if (len > 0)
+		*bufpp = page_address(data_pages->pages[pg_idx]) + pg_offset;
+	return len;
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 1d7cd58..cdb668b 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -7,6 +7,7 @@
 #ifndef _CORESIGHT_TMC_H
 #define _CORESIGHT_TMC_H
 
+#include <linux/dma-mapping.h>
 #include <linux/miscdevice.h>
 
 #define TMC_RSZ			0x004
@@ -160,6 +161,38 @@ struct tmc_drvdata {
 	u32			etr_caps;
 };
 
+/**
+ * struct tmc_pages - Collection of pages used for SG.
+ * @nr_pages:		Number of pages in the list.
+ * @daddrs:		Array of DMA addresses of the pages.
+ * @pages:		Array of pages for the buffer.
+ */
+struct tmc_pages {
+	int nr_pages;
+	dma_addr_t	*daddrs;
+	struct page	**pages;
+};
+
+/*
+ * struct tmc_sg_table - Generic SG table for TMC
+ * @dev:		Device for DMA allocations
+ * @table_vaddr:	Contiguous Virtual address for PageTable
+ * @data_vaddr:		Contiguous Virtual address for Data Buffer
+ * @table_daddr:	DMA address of the PageTable base
+ * @node:		Node for Page allocations
+ * @table_pages:	List of pages & dma address for Table
+ * @data_pages:		List of pages & dma address for Data
+ */
+struct tmc_sg_table {
+	struct device *dev;
+	void *table_vaddr;
+	void *data_vaddr;
+	dma_addr_t table_daddr;
+	int node;
+	struct tmc_pages table_pages;
+	struct tmc_pages data_pages;
+};
+
 /* Generic functions */
 void tmc_wait_for_tmcready(struct tmc_drvdata *drvdata);
 void tmc_flush_and_stop(struct tmc_drvdata *drvdata);
@@ -215,4 +248,21 @@ static inline bool tmc_etr_has_cap(struct tmc_drvdata *drvdata, u32 cap)
 	return !!(drvdata->etr_caps & cap);
 }
 
+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages);
+void tmc_free_sg_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size);
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+			      u64 offset, size_t len, char **bufpp);
+static inline unsigned long
+tmc_sg_table_buf_size(struct tmc_sg_table *sg_table)
+{
+	return sg_table->data_pages.nr_pages << PAGE_SHIFT;
+}
+
 #endif
-- 
2.7.4

^ permalink raw reply related
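
Editor's note on the patch above: tmc_sg_table_get_data() only ever
hands out data that is linear within one page, so a reader has to call
it repeatedly to walk the buffer. The clamping it applies can be
modelled as follows. This is a sketch under stated assumptions, not the
driver code: sketch_linear_len() and SKETCH_PAGE_SIZE are hypothetical
names and a 4 KiB page size is assumed.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical model of the clamping in tmc_sg_table_get_data():
 * returns how many bytes can be read linearly starting at @offset,
 * or -1 if @offset lies outside the buffer.
 */
static long sketch_linear_len(unsigned long buf_size, unsigned long offset,
			      unsigned long len)
{
	unsigned long pg_offset = offset & (SKETCH_PAGE_SIZE - 1);

	if (offset >= buf_size)
		return -1;

	/* Do not go beyond the end of the buffer. */
	if (len > buf_size - offset)
		len = buf_size - offset;
	/* Do not cross a page boundary. */
	if (len > SKETCH_PAGE_SIZE - pg_offset)
		len = SKETCH_PAGE_SIZE - pg_offset;

	return (long)len;
}
```

For an 8192 byte buffer, a request for 1000 bytes at offset 4000 yields
96 bytes, since only 96 bytes remain before the page boundary at 4096.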
