Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [RFT v3 1/4] perf cs-etm: Generate branch sample for missed packets
From: Mike Leach @ 2018-05-30 15:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5efe804f-364b-6ae1-3fca-ec7f6d8a383d@arm.com>

Hi Leo,



On 30 May 2018 at 10:45, Robert Walker <robert.walker@arm.com> wrote:
>
>
> On 28/05/18 23:13, Mathieu Poirier wrote:
>>
>> Leo and/or Robert,
>>
>> On Mon, May 28, 2018 at 04:45:00PM +0800, Leo Yan wrote:
>>>
>>> Commit e573e978fb12 ("perf cs-etm: Inject capabilitity for CoreSight
>>> traces") reworks the samples generation flow from CoreSight trace to
>>> match the correct format so Perf report tool can display the samples
>>> properly.
>>>
>>> But the change has side effect for branch packet handling, it only
>>> generate branch samples by checking previous packet flag
>>> 'last_instr_taken_branch' is true, this results in below three kinds
>>> packets are missed to generate branch samples:
>>>
>>> - The start tracing packet at the beginning of tracing data;
>>> - The exception handling packet;
>>> - If one CS_ETM_TRACE_ON packet is inserted, we also miss to handle it
>>>    for branch samples.  CS_ETM_TRACE_ON packet itself can give the info
>>>    that there have a discontinuity in the trace, on the other hand we
>>>    also miss to generate proper branch sample for packets before and
>>>    after CS_ETM_TRACE_ON packet.
>>>
>>> This patch is to add branch sample handling for up three kinds packets:
>>>
>>> - In function cs_etm__sample(), check if 'prev_packet->sample_type' is
>>>    zero and in this case it generates branch sample for the start tracing
>>>    packet; furthermore, we also need to handle the condition for
>>>    prev_packet::end_addr is zero in the cs_etm__last_executed_instr();
>>>
>>> - In function cs_etm__sample(), check if 'prev_packet->exc' is true and
>>>    generate branch sample for exception handling packet;
>>>
>>> - If there has one CS_ETM_TRACE_ON packet is coming, we firstly generate
>>>    branch sample in the function cs_etm__flush(), this can save complete
>>>    info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON
>>>    packet.  We also generate branch sample for the new CS_ETM_RANGE
>>>    packet after CS_ETM_TRACE_ON packet, this have two purposes, the
>>>    first one purpose is to save the info for the new CS_ETM_RANGE packet,
>>>    the second purpose is to save CS_ETM_TRACE_ON packet info so we can
>>>    have hint for a discontinuity in the trace.
>>>
>>>    For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and
>>>    'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in
>>>    the decoder layer as dummy value.  This patch is to convert these
>>>    values to zeros for more readable; this is accomplished by functions
>>>    cs_etm__last_executed_instr() and cs_etm__first_executed_instr().  The
>>>    later one is a new function introduced by this patch.
>>>
>>> Reviewed-by: Robert Walker <robert.walker@arm.com>
>>> Fixes: e573e978fb12 ("perf cs-etm: Inject capabilitity for CoreSight
>>> traces")
>>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>>> ---
>>>   tools/perf/util/cs-etm.c | 93
>>> +++++++++++++++++++++++++++++++++++++-----------
>>>   1 file changed, 73 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>>> index 822ba91..8418173 100644
>>> --- a/tools/perf/util/cs-etm.c
>>> +++ b/tools/perf/util/cs-etm.c
>>> @@ -495,6 +495,20 @@ static inline void
>>> cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq)
>>>   static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet
>>> *packet)
>>>   {
>>>         /*
>>> +        * The packet is the start tracing packet if the end_addr is
>>> zero,
>>> +        * returns 0 for this case.
>>> +        */
>>> +       if (!packet->end_addr)
>>> +               return 0;
>>
>>
>> What is considered to be the "start tracing packet"?  Right now the only
>> two
>> kind of packets inserted in the decoder packet buffer queue are INST_RANGE
>> and
>> TRACE_ON.  How can we hit a condition where packet->end-addr == 0?
>>
>>
>>> +
>>> +       /*
>>> +        * The packet is the CS_ETM_TRACE_ON packet if the end_addr is
>>> +        * magic number 0xdeadbeefdeadbeefUL, returns 0 for this case.
>>> +        */
>>> +       if (packet->end_addr == 0xdeadbeefdeadbeefUL)
>>> +               return 0;
>>
>>
>> As it is with the above, I find triggering on addresses to be brittle and
>> hard
>> to maintain on the long run.  Packets all have a sample_type field that
>> should
>> be used in cases like this one.  That way we know exactly the condition
>> that is
>> targeted.
>>
>> While working on this set, please spin-off another patch that defines
>> CS_ETM_INVAL_ADDR 0xdeadbeefdeadbeefUL and replace all the cases where the
>> numeral is used.  That way we stop using the hard coded value.
>>
>>> +
>>> +       /*
>>>          * The packet records the execution range with an exclusive end
>>> address
>>>          *
>>>          * A64 instructions are constant size, so the last executed
>>> @@ -505,6 +519,18 @@ static inline u64 cs_etm__last_executed_instr(struct
>>> cs_etm_packet *packet)
>>>         return packet->end_addr - A64_INSTR_SIZE;
>>>   }
>>>   +static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet
>>> *packet)
>>> +{
>>> +       /*
>>> +        * The packet is the CS_ETM_TRACE_ON packet if the start_addr is
>>> +        * magic number 0xdeadbeefdeadbeefUL, returns 0 for this case.
>>> +        */
>>> +       if (packet->start_addr == 0xdeadbeefdeadbeefUL)
>>> +               return 0;
>>
>>
>> Same comment as above.
>>
>>> +
>>> +       return packet->start_addr;
>>> +}
>>> +
>>>   static inline u64 cs_etm__instr_count(const struct cs_etm_packet
>>> *packet)
>>>   {
>>>         /*
>>> @@ -546,7 +572,7 @@ static void cs_etm__update_last_branch_rb(struct
>>> cs_etm_queue *etmq)
>>>         be       = &bs->entries[etmq->last_branch_pos];
>>>         be->from = cs_etm__last_executed_instr(etmq->prev_packet);
>>> -       be->to   = etmq->packet->start_addr;
>>> +       be->to   = cs_etm__first_executed_instr(etmq->packet);
>>>         /* No support for mispredict */
>>>         be->flags.mispred = 0;
>>>         be->flags.predicted = 1;
>>> @@ -701,7 +727,7 @@ static int cs_etm__synth_branch_sample(struct
>>> cs_etm_queue *etmq)
>>>         sample.ip = cs_etm__last_executed_instr(etmq->prev_packet);
>>>         sample.pid = etmq->pid;
>>>         sample.tid = etmq->tid;
>>> -       sample.addr = etmq->packet->start_addr;
>>> +       sample.addr = cs_etm__first_executed_instr(etmq->packet);
>>>         sample.id = etmq->etm->branches_id;
>>>         sample.stream_id = etmq->etm->branches_id;
>>>         sample.period = 1;
>>> @@ -897,13 +923,28 @@ static int cs_etm__sample(struct cs_etm_queue
>>> *etmq)
>>>                 etmq->period_instructions = instrs_over;
>>>         }
>>>   -     if (etm->sample_branches &&
>>> -           etmq->prev_packet &&
>>> -           etmq->prev_packet->sample_type == CS_ETM_RANGE &&
>>> -           etmq->prev_packet->last_instr_taken_branch) {
>>> -               ret = cs_etm__synth_branch_sample(etmq);
>>> -               if (ret)
>>> -                       return ret;
>>> +       if (etm->sample_branches && etmq->prev_packet) {
>>> +               bool generate_sample = false;
>>> +
>>> +               /* Generate sample for start tracing packet */
>>> +               if (etmq->prev_packet->sample_type == 0 ||
>>
>>
>> What kind of packet is sample_type == 0 ?
>>
>>> +                   etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
>>> +                       generate_sample = true;
>>> +
>>> +               /* Generate sample for exception packet */
>>> +               if (etmq->prev_packet->exc == true)
>>> +                       generate_sample = true;
>>
>>
>> Please don't do that.  Exception packets have a type of their own and can
>> be
>> added to the decoder packet queue the same way INST_RANGE and TRACE_ON
>> packets
>> are.  Moreover exception packet containt an address that, if I'm reading
>> the
>> documenation properly, can be used to keep track of instructions that were
>> executed between the last address of the previous range packet and the
>> address
>> executed just before the exception occurred.  Mike and Rob will have to
>> confirm
>> this as the decoder may be doing all that hard work for us.
>>

clarification on the exception packets....

The Opencsd output exception packet gives you the exception number,
and optionally the preferred return address. If this address is
present does depend a lot on the underlying protocol - will normally
be there with ETMv4.
Exceptions are marked differently in the underlying protocol - the
OCSD packets abstract away these differences.

consider the code:

0x1000: <some instructions>
0x1100: BR 0x2000
....
0x2000: <some instructions>
0x2020  BZ r4

Without an exception this would result in the packets

OCSD_RANGE(0x1000,0x1104, Last instr type=Br, taken)   // recall that
range packets have start addr inclusive, end addr exclusive.
OCSD_RANGE(0x2000,0x2024, Last instr type=Br, <taken / not taken -
depends on condition>

Now consider an exception occurring before the BR 0x2000

this will result in:-
OCSD_RANGE(0x1000, 0x1100, Last instr type=Other)
OCSD_EXECEPTION(IRQ, ret-addr 0x1100)
OCSD_RANGE(IRQ_START, IRQ_END+4, Last instr type = BR, taken)    //
this is more likely to have multiple ranges / branches before any
return, but simplified here.
OCSD_EXCEPTION_RETURN()  // present if exception returns are
explicitly marked in underlying trace - may not always be depending on
circumstances.
OCSD_RANGE(0x1100,0x1104, Last=BR, taken) // continue on with short
range - just the branch
OCSD_RANGE(0x2000,0x2024, Last instr type=Br, <taken / not taken -
depends on condition>

Now consider the exception occurring after the BR, but before any
other instructions are executed.

OCSD_RANGE(0x1000,0x1104, Last instr type=Br, taken)   // recall that
range packets have start addr inclusive, end addr exclusive.
OCSD_EXECEPTION(IRQ, ret-addr 0x2000)     // here the preferred return
address is actually the target of the branch.
OCSD_RANGE(IRQ_START, IRQ_END+4, Last instr type = BR, taken)    //
this is more likely to have multiple ranges / branches before any
return, but simplified here.
OCSD_RANGE(0x2000,0x2024, Last instr type=Br, <taken / not taken -
depends on condition>

So in general it is possible to arrive in the IRQ_START range with the
previous packet having been either a taken branch, a not taken branch,
or not a branch.
Care must be taken - whether AutoFDO or normal trace disassembly not
to assume that having the last range packet as a taken branch means
that the next range packet is the target, if there is an intervening
exception packet.

Regards

Mike

>>> +
>>> +               /* Generate sample for normal branch packet */
>>> +               if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
>>> +                   etmq->prev_packet->last_instr_taken_branch)
>>> +                       generate_sample = true;
>>> +
>>> +               if (generate_sample) {
>>> +                       ret = cs_etm__synth_branch_sample(etmq);
>>> +                       if (ret)
>>> +                               return ret;
>>> +               }
>>>         }
>>>         if (etm->sample_branches || etm->synth_opts.last_branch) {
>>> @@ -922,11 +963,16 @@ static int cs_etm__sample(struct cs_etm_queue
>>> *etmq)
>>>   static int cs_etm__flush(struct cs_etm_queue *etmq)
>>>   {
>>>         int err = 0;
>>> +       struct cs_etm_auxtrace *etm = etmq->etm;
>>>         struct cs_etm_packet *tmp;
>>>   -     if (etmq->etm->synth_opts.last_branch &&
>>> -           etmq->prev_packet &&
>>> -           etmq->prev_packet->sample_type == CS_ETM_RANGE) {
>>> +       if (!etmq->prev_packet)
>>> +               return 0;
>>> +
>>> +       if (etmq->prev_packet->sample_type != CS_ETM_RANGE)
>>> +               return 0;
>>> +
>>> +       if (etmq->etm->synth_opts.last_branch) {
>>
>>
>> If you add:
>>
>>          if (!etmq->etm->synth_opts.last_branch)
>>                  return 0;
>>
>> You can avoid indenting the whole block.
>>
>>>                 /*
>>>                  * Generate a last branch event for the branches left in
>>> the
>>>                  * circular buffer at the end of the trace.
>>> @@ -939,18 +985,25 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
>>>                 err = cs_etm__synth_instruction_sample(
>>>                         etmq, addr,
>>>                         etmq->period_instructions);
>>> +               if (err)
>>> +                       return err;
>>>                 etmq->period_instructions = 0;
>>> +       }
>>>   -             /*
>>> -                * Swap PACKET with PREV_PACKET: PACKET becomes
>>> PREV_PACKET for
>>> -                * the next incoming packet.
>>> -                */
>>> -               tmp = etmq->packet;
>>> -               etmq->packet = etmq->prev_packet;
>>> -               etmq->prev_packet = tmp;
>>> +       if (etm->sample_branches) {
>>> +               err = cs_etm__synth_branch_sample(etmq);
>>> +               if (err)
>>> +                       return err;
>>>         }
>>>   -     return err;
>>> +       /*
>>> +        * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
>>> +        * the next incoming packet.
>>> +        */
>>> +       tmp = etmq->packet;
>>> +       etmq->packet = etmq->prev_packet;
>>> +       etmq->prev_packet = tmp;
>>
>>
>> Robert, I remember noticing that when you first submitted the code but
>> forgot to
>> go back to it.  What is the point of swapping the packets?  I understand
>>
>> etmq->prev_packet = etmq->packet;
>>
>> But not
>>
>> etmq->packet = tmp;
>>
>> After all etmq->packet will be clobbered as soon as
>> cs_etm_decoder__get_packet()
>> is called, which is alwasy right after either cs_etm__sample() or
>> cs_etm__flush().
>>
>
> This is code I inherited from the original versions of these patches, but it
> works because:
> - etmq->packet and etmq->prev_packet are pointers to struct cs_etm_packet
> allocated by zalloc() in cs_etm__alloc_queue()
> - cs_etm_decoder__get_packet() takes a pointer to struct cs_etm_packet and
> copies the contents of the first packet from the queue into the passed
> location with:
>    *packet = decoder->packet_buffer[decoder->head]
>
> So the swap code is only swapping the pointers over, not the contents of the
> packets.
>
> Regards
>
> Rob
>
>
>
>> Thanks,
>> Mathieu
>>
>>
>>
>>> +       return 0;
>>>   }
>>>     static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
>>> --
>>> 2.7.4
>>>
>



-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

^ permalink raw reply

* [PATCH v2 5/6] ARM: dts: Add generic interconnect target module node for MCAN
From: Tony Lindgren @ 2018-05-30 15:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-6-faiz_abbas@ti.com>

* Faiz Abbas <faiz_abbas@ti.com> [180530 14:12]:
> The ti-sysc driver provides support for manipulating the idlemodes
> and interconnect level resets.
...
> --- a/arch/arm/boot/dts/dra76x.dtsi
> +++ b/arch/arm/boot/dts/dra76x.dtsi
> @@ -11,6 +11,25 @@
>  / {
>  	compatible = "ti,dra762", "ti,dra7";
>  
> +	ocp {
> +
> +		target-module at 0x42c00000 {
> +			compatible = "ti,sysc-dra7-mcan";
> +			ranges = <0x0 0x42c00000 0x2000>;
> +			#address-cells = <1>;
> +			#size-cells = <1>;
> +			reg = <0x42c01900 0x4>,
> +			      <0x42c01904 0x4>,
> +			      <0x42c01908 0x4>;
> +			reg-names = "rev", "sysc", "syss";
> +			ti,sysc-mask = <(SYSC_OMAP4_SOFTRESET |
> +					 SYSC_DRA7_MCAN_ENAWAKEUP)>;
> +			ti,syss-mask = <1>;
> +			clocks = <&wkupaon_clkctrl DRA7_ADC_CLKCTRL 0>;
> +			clock-names = "fck";
> +		};
> +	};
> +
>  };

Looks good to me except I think the reset won't do anything currently
with ti-sysc.c unless you specfify also "ti,hwmods" for the module?

Can you please check? It might be worth adding the reset function to
ti-sysc.c for non "ti,hwmods" case and that just might remove the
need for any hwmod code for this module.

Regards,

Tony

^ permalink raw reply

* [PATCH v3 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings
From: Mark Brown @ 2018-05-30 15:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAD=FV=W-MLU9c-SvfLgKAKGP1eHN5xO-013f2M09XUa+QeHzrA@mail.gmail.com>

On Wed, May 30, 2018 at 07:46:50AM -0700, Doug Anderson wrote:
> On Wed, May 30, 2018 at 2:37 AM, Mark Brown <broonie@kernel.org> wrote:

> >> Linux vote for the lowest voltage it's comfortable with.  Linux keeps
> >> track of the true voltage that the driver wants and will always change
> >> its vote back to that before enabling.  Thus (assuming Linux is OK
> >> with 1.2 V - 1.4 V for a rail):

> > That's pretty much what it should do anyway with normally designed
> > hardware.

> I guess the question is: do we insist that the driver include this
> workaround, or are we OK with letting the hardware behave as the
> hardware does?

What you're describing sounds like what we should be doing normally, if
we're not doing that we should probably be fixing the core.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180530/a848ca01/attachment.sig>

^ permalink raw reply

* [PATCH v2 4/6] bus: ti-sysc: Add support for using ti-sysc for MCAN on dra76x
From: Tony Lindgren @ 2018-05-30 14:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-5-faiz_abbas@ti.com>

* Faiz Abbas <faiz_abbas@ti.com> [180530 14:12]:
> The dra76x MCAN generic interconnect module has a its own
> format for the bits in the control registers.
...
> --- a/drivers/bus/ti-sysc.c
> +++ b/drivers/bus/ti-sysc.c
> @@ -1262,6 +1262,22 @@ static const struct sysc_capabilities sysc_omap4_usb_host_fs = {
>  	.regbits = &sysc_regbits_omap4_usb_host_fs,
>  };
>  
> +static const struct sysc_regbits sysc_regbits_dra7_mcan = {
> +	.dmadisable_shift = -ENODEV,
> +	.midle_shift = -ENODEV,
> +	.sidle_shift = -ENODEV,
> +	.clkact_shift = -ENODEV,
> +	.enwkup_shift = 4,
> +	.srst_shift = 0,
> +	.emufree_shift = -ENODEV,
> +	.autoidle_shift = -ENODEV,
> +};
> +
> +static const struct sysc_capabilities sysc_dra7_mcan = {
> +	.type = TI_SYSC_DRA7_MCAN,
> +	.regbits = &sysc_regbits_dra7_mcan,
> +};
> +
>  static int sysc_init_pdata(struct sysc *ddata)
>  {
>  	struct ti_sysc_platform_data *pdata = dev_get_platdata(ddata->dev);
> @@ -1441,6 +1457,7 @@ static const struct of_device_id sysc_match[] = {
>  	{ .compatible = "ti,sysc-mcasp", .data = &sysc_omap4_mcasp, },
>  	{ .compatible = "ti,sysc-usb-host-fs",
>  	  .data = &sysc_omap4_usb_host_fs, },
> +	{ .compatible = "ti,sysc-dra7-mcan", .data = &sysc_dra7_mcan, },
>  	{  },
>  };

Looks good to me. And presumably you checked that no other dra7 modules
use sysconfig register bit layout like this?

Regards,

Tony

^ permalink raw reply

* [PATCH] PCI: Add pci=safemode option
From: Bjorn Helgaas @ 2018-05-30 14:56 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <6dfe2db8f974d94c9867f30ec83d9333@codeaurora.org>

On Wed, May 30, 2018 at 12:44:29AM -0700, okaya at codeaurora.org wrote:
> On 2018-05-30 00:37, Christoph Hellwig wrote:
> > On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote:
> > > Bjorn and I discussed the need for such a "safe" mode feature when you
> > > want to bring up PCI for a platform. You want to turn off everything
> > > as
> > > a starter and just stick to bare minimum.
> > 
> > Can we please make it a config option the instead of adding code
> > to every kernel?  Also maybe the bringup should be in the name
> > to make this more clear?
> 
> One other requirement was to have a runtime option rather than compile time
> option.
> 
> When someone reported a problem, we wanted to be able to say "use this
> option and see if system boots" without doing any bisects or recompilation.
> 
> This would be the first step in troubleshooting a system to see if
> fundamental features are working.
> 
> I don't mind changing the name
> Bjorn mentioned safe option. I made it safemode. I am looking at Bjorn for
> suggestions at this moment.

This *was* my idea, but I'm starting to think it was a bad idea.

I don't want people to use pci= parameters as the normal way to get a
system to boot.  That would be a huge support hassle (putting things
in release notes, diagnosing problems when people forget it, etc).

But the parameters *are* useful for debugging.  If we had a
"pci=safemode" and it avoided some problem, the next step would be to
narrow it down by using the more specific flags (pci=nomsi, pci=noari,
pci=no_ext_tags, etc).  So I think 95% of the value is in the specific
flags, and a "pci=safemode" might add a little bit of value but at the
cost of a small but nagging maintenance concern and code clutter.

Bjorn

^ permalink raw reply

* [PATCH v4 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings
From: Doug Anderson @ 2018-05-30 14:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530103720.GH6920@sirena.org.uk>

Hi,

On Wed, May 30, 2018 at 3:37 AM, Mark Brown <broonie@kernel.org> wrote:
> On Tue, May 29, 2018 at 10:23:20PM -0700, Doug Anderson wrote:
>
>> > +                       qcom,drms-mode-max-microamps = <10000 1000000>;
>
>> Things look pretty good to me now.  I'm still hesitant about the whole
>> need to list the modes twice (once using the unordered
>> "regulator-allowed-modes" and once to match up against the ordered
>> "qcom,drms-mode-max-microamps").  I'm also still of the opinion that
>> the whole "drms-mode-max-microamps" ought to be a standard property
>> (not a qcom specific one) and handled in the regulator core.
>
> I'm confused as to why we are specifying the maximum current the device
> can deliver in a given mode in the DT - surely that's a fixed property
> of the hardware?

Said another way: you're saying that you'd expect the "max-microamps"
table to have one fewer element than the listed modes?  So in the
above example you'd have:

drms-modes: LPM, HPM
max-microamps: 10000

...or in a more complicated case, you could have:

drms-modes: RET, LPM, AUTO, HPM
max-microamps: 1, 100, 10000


Did I understand you correctly?

I'm personally OK with that color for the shed.  I kinda like the
symmetry of having the same number of elements in both lists (and
being able to print an error message if someone ends up asking for too
much current), but it's not a big deal for me either way.

-Doug

^ permalink raw reply

* Regression in Linux next again
From: Naresh Kamboju @ 2018-05-30 14:53 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530143338.GM6920@sirena.org.uk>

On 30 May 2018 at 20:03, Mark Brown <broonie@kernel.org> wrote:
> On Wed, May 30, 2018 at 04:03:15PM +0200, Maciej Purski wrote:
>
>> I'm afraid, I have no idea, how to fix it quickly. You can revert it and
>> in the next version I'll fix the build error and split the last patch even
>> more, so we could perform a more precise bisection. I'd be grateful if you
>> could push it on your test coupled branch and Tony could test it again before
>> merging it with next again.
>
> Yeah, if we could get testing from Tony first that'd be ideal.

Linux next 4.17.0-rc7-next-20180529 boot failed on x15 device.
Manually reproduced boot failed problem on x15 device.
The qemu_arm32 boots successfully.

Ref bug:
LKFT: linux-next: Boot failed on beagle board x15
https://bugs.linaro.org/show_bug.cgi?id=3863

^ permalink raw reply

* [PATCH v2 1/6] ARM: dra762: hwmod: Add MCAN support
From: Tony Lindgren @ 2018-05-30 14:50 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-2-faiz_abbas@ti.com>

* Faiz Abbas <faiz_abbas@ti.com> [180530 14:12]:
> From: Lokesh Vutla <lokeshvutla@ti.com>
> 
> Add MCAN hwmod data and register it for dra762 silicons.
> 
> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> ---
>  arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 32 +++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
> index 62352d1e6361..a2cd7f865a60 100644
> --- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
> +++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
> @@ -1355,6 +1355,29 @@ static struct omap_hwmod dra7xx_mailbox13_hwmod = {
>  	},
>  };
>  
> +/*
> + * 'mcan' class
> + *
> + */
> +static struct omap_hwmod_class dra76x_mcan_hwmod_class = {
> +	.name	= "mcan",
> +};
> +
> +/* mcan */
> +static struct omap_hwmod dra76x_mcan_hwmod = {
> +	.name		= "mcan",
> +	.class		= &dra76x_mcan_hwmod_class,
> +	.clkdm_name	= "wkupaon_clkdm",
> +	.main_clk	= "mcan_clk",
> +	.prcm = {
> +		.omap4 = {
> +			.clkctrl_offs = DRA7XX_CM_WKUPAON_ADC_CLKCTRL_OFFSET,
> +			.context_offs = DRA7XX_RM_WKUPAON_ADC_CONTEXT_OFFSET,
> +			.modulemode   = MODULEMODE_SWCTRL,
> +		},
> +	},
> +};

You should be now able to leave out at least the clkctrl_offs and modulemode
here. Please also check if leaving out clkdm_name and main_clk now works :)

> @@ -3818,6 +3841,14 @@ static struct omap_hwmod_ocp_if dra7xx_l4_per2__epwmss2 = {
>  	.user		= OCP_USER_MPU,
>  };
>  
> +/* l3_main_1 -> mcan */
> +static struct omap_hwmod_ocp_if dra76x_l3_main_1__mcan = {
> +	.master		= &dra7xx_l3_main_1_hwmod,
> +	.slave		= &dra76x_mcan_hwmod,
> +	.clk		= "l3_iclk_div",
> +	.user		= OCP_USER_MPU | OCP_USER_SDMA,
> +};

I think this we still need though for the clk. Tero, do you have any comments
on what all clocks can now be left out?

Regards,

Tony

^ permalink raw reply

* [RFT v3 1/4] perf cs-etm: Generate branch sample for missed packets
From: Leo Yan @ 2018-05-30 14:49 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CANLsYkzPShMj1cbXZPb7Lzio+gKcmvhD2+KTYr7vP9ec1d6bmg@mail.gmail.com>

On Wed, May 30, 2018 at 08:45:46AM -0600, Mathieu Poirier wrote:
> On 29 May 2018 at 18:28, Leo Yan <leo.yan@linaro.org> wrote:
> > Hi Mathieu,
> >
> > On Tue, May 29, 2018 at 10:04:49AM -0600, Mathieu Poirier wrote:
> >
> > [...]
> >
> >> > As now this patch is big with more complex logic, so I consider to
> >> > split it into small patches:
> >> >
> >> > - Define CS_ETM_INVAL_ADDR;
> >> > - Fix for CS_ETM_TRACE_ON packet;
> >> > - Fix for exception packet;
> >> >
> >> > Does this make sense for you?  I have concern that this patch is a
> >> > fixing patch, so not sure after spliting patches will introduce
> >> > trouble for applying them for other stable kernels ...
> >>
> >> Reverse the order:
> >>
> >> - Fix for CS_ETM_TRACE_ON packet;
> >> - Fix for exception packet;
> >> - Define CS_ETM_INVAL_ADDR;
> >>
> >> But you may not need to - see next comment.
> >
> > From the discussion context, I think here 'you may not need to' is
> > referring to my concern for applying patches on stable kernel, so I
> > should take this patch series as an enhancement and don't need to
> > consider much for stable kernel.
> 
> Yes, that is what I meant.

Thanks for confirmation, will send new patch series according to the
discussion.

[...]

Thanks,
Leo Yan

^ permalink raw reply

* [PATCH v3 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings
From: Doug Anderson @ 2018-05-30 14:46 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530093701.GD6920@sirena.org.uk>

Hi,

On Wed, May 30, 2018 at 2:37 AM, Mark Brown <broonie@kernel.org> wrote:
> On Tue, May 29, 2018 at 10:30:33PM -0700, Doug Anderson wrote:
>> On Wed, May 23, 2018 at 8:56 AM, Mark Brown <broonie@kernel.org> wrote:
>
>> > Yes, that's definitely not what's expected but it's unfortunately what
>> > the firmware chose to implement so we may well be stuck with it
>> > unfortunately.
>
>> We're not really stuck with it if we do what I was suggesting.  I was
>> suggesting that every time we disable the regulator in Linux we have
>> Linux vote for the lowest voltage it's comfortable with.  Linux keeps
>> track of the true voltage that the driver wants and will always change
>> its vote back to that before enabling.  Thus (assuming Linux is OK
>> with 1.2 V - 1.4 V for a rail):
>
> That's pretty much what it should do anyway with normally designed
> hardware.

I guess the question is: do we insist that the driver include this
workaround, or are we OK with letting the hardware behave as the
hardware does?

-Doug

^ permalink raw reply

* [RFT v3 1/4] perf cs-etm: Generate branch sample for missed packets
From: Mathieu Poirier @ 2018-05-30 14:45 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530002837.GB11923@leoy-ThinkPad-X240s>

On 29 May 2018 at 18:28, Leo Yan <leo.yan@linaro.org> wrote:
> Hi Mathieu,
>
> On Tue, May 29, 2018 at 10:04:49AM -0600, Mathieu Poirier wrote:
>
> [...]
>
>> > As now this patch is big with more complex logic, so I consider to
>> > split it into small patches:
>> >
>> > - Define CS_ETM_INVAL_ADDR;
>> > - Fix for CS_ETM_TRACE_ON packet;
>> > - Fix for exception packet;
>> >
>> > Does this make sense for you?  I have concern that this patch is a
>> > fixing patch, so not sure after spliting patches will introduce
>> > trouble for applying them for other stable kernels ...
>>
>> Reverse the order:
>>
>> - Fix for CS_ETM_TRACE_ON packet;
>> - Fix for exception packet;
>> - Define CS_ETM_INVAL_ADDR;
>>
>> But you may not need to - see next comment.
>
> From the discussion context, I think here 'you may not need to' is
> referring to my concern for applying patches on stable kernel, so I
> should take this patch series as an enhancement and don't need to
> consider much for stable kernel.

Yes, that is what I meant.

>
> On the other hand, your suggestion is possible to mean 'not need
> to' split into small patches (though I guess this is misunderstanding
> for your meaning).
>
> Could you clarify which is your meaning?
>
>> >> > +
>> >> > +   /*
>> >> >      * The packet records the execution range with an exclusive end address
>> >> >      *
>> >> >      * A64 instructions are constant size, so the last executed
>> >> > @@ -505,6 +519,18 @@ static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet)
>> >> >     return packet->end_addr - A64_INSTR_SIZE;
>> >> >  }
>> >> >
>> >> > +static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
>> >> > +{
>> >> > +   /*
>> >> > +    * The packet is the CS_ETM_TRACE_ON packet if the start_addr is
>> >> > +    * magic number 0xdeadbeefdeadbeefUL, returns 0 for this case.
>> >> > +    */
>> >> > +   if (packet->start_addr == 0xdeadbeefdeadbeefUL)
>> >> > +           return 0;
>> >>
>> >> Same comment as above.
>> >
>> > Will do this.
>> >
>> >> > +
>> >> > +   return packet->start_addr;
>> >> > +}
>> >> > +
>> >> >  static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet)
>> >> >  {
>> >> >     /*
>> >> > @@ -546,7 +572,7 @@ static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq)
>> >> >
>> >> >     be       = &bs->entries[etmq->last_branch_pos];
>> >> >     be->from = cs_etm__last_executed_instr(etmq->prev_packet);
>> >> > -   be->to   = etmq->packet->start_addr;
>> >> > +   be->to   = cs_etm__first_executed_instr(etmq->packet);
>> >> >     /* No support for mispredict */
>> >> >     be->flags.mispred = 0;
>> >> >     be->flags.predicted = 1;
>> >> > @@ -701,7 +727,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq)
>> >> >     sample.ip = cs_etm__last_executed_instr(etmq->prev_packet);
>> >> >     sample.pid = etmq->pid;
>> >> >     sample.tid = etmq->tid;
>> >> > -   sample.addr = etmq->packet->start_addr;
>> >> > +   sample.addr = cs_etm__first_executed_instr(etmq->packet);
>> >> >     sample.id = etmq->etm->branches_id;
>> >> >     sample.stream_id = etmq->etm->branches_id;
>> >> >     sample.period = 1;
>> >> > @@ -897,13 +923,28 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
>> >> >             etmq->period_instructions = instrs_over;
>> >> >     }
>> >> >
>> >> > -   if (etm->sample_branches &&
>> >> > -       etmq->prev_packet &&
>> >> > -       etmq->prev_packet->sample_type == CS_ETM_RANGE &&
>> >> > -       etmq->prev_packet->last_instr_taken_branch) {
>> >> > -           ret = cs_etm__synth_branch_sample(etmq);
>> >> > -           if (ret)
>> >> > -                   return ret;
>> >> > +   if (etm->sample_branches && etmq->prev_packet) {
>> >> > +           bool generate_sample = false;
>> >> > +
>> >> > +           /* Generate sample for start tracing packet */
>> >> > +           if (etmq->prev_packet->sample_type == 0 ||
>> >>
>> >> What kind of packet is sample_type == 0 ?
>> >
>> > Just as explained above, sample_type == 0 is the packet which
>> > initialized in the function cs_etm__alloc_queue().
>> >
>> >> > +               etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
>> >> > +                   generate_sample = true;
>> >> > +
>> >> > +           /* Generate sample for exception packet */
>> >> > +           if (etmq->prev_packet->exc == true)
>> >> > +                   generate_sample = true;
>> >>
>> >> Please don't do that.  Exception packets have a type of their own and can be
>> >> added to the decoder packet queue the same way INST_RANGE and TRACE_ON packets
>> >> are.  Moreover exception packet containt an address that, if I'm reading the
>> >> documenation properly, can be used to keep track of instructions that were
>> >> executed between the last address of the previous range packet and the address
>> >> executed just before the exception occurred.  Mike and Rob will have to confirm
>> >> this as the decoder may be doing all that hard work for us.
>> >
>> > Sure, will wait for Rob and Mike to confirm for this.
>> >
>> > At my side, I dump the packet, the exception packet isn't passed to
>> > cs-etm.c layer, the decoder layer only sets the flag
>> > 'packet->exc = true' when exception packet is coming [1].
>>
>> That's because we didn't need the information.  Now that we do a
>> function that will insert a packet in the decoder packet queue and
>> deal with the new packet type in the main decoder loop [2].  At that
>> point your work may not be eligible for stable anymore and I think it
>> is fine.  Robert's work was an enhancement over mine and yours is an
>> enhancement over his.
>>
>> [2]. https://elixir.bootlin.com/linux/v4.17-rc7/source/tools/perf/util/cs-etm.c#L999
>
> Agree, will look into for exception packet and try to add new packet
> type for this.
>
> [...]
>
> Thanks,
> Leo Yan

^ permalink raw reply

* Regression in Linux next again
From: Tony Lindgren @ 2018-05-30 14:45 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530143338.GM6920@sirena.org.uk>

* Mark Brown <broonie@kernel.org> [180530 14:36]:
> On Wed, May 30, 2018 at 04:03:15PM +0200, Maciej Purski wrote:
> 
> > I'm afraid, I have no idea, how to fix it quickly. You can revert it and
> > in the next version I'll fix the build error and split the last patch even
> > more, so we could perform a more precise bisection. I'd be grateful if you
> > could push it on your test coupled branch and Tony could test it again before
> > merging it with next again.
> 
> Yeah, if we could get testing from Tony first that'd be ideal.

I can boot test the patches easily no problem. Maciej, maybe email
me some debug version and I'll email you back the dmesg output.

Regards,

Tony

^ permalink raw reply

* Regression in Linux next again
From: Mark Brown @ 2018-05-30 14:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530140318eucas1p2af2d01b9678e8914c2b8e099c584ec11~zcQzVKtj30757607576eucas1p2V@eucas1p2.samsung.com>

On Wed, May 30, 2018 at 04:03:15PM +0200, Maciej Purski wrote:

> I'm afraid, I have no idea, how to fix it quickly. You can revert it and
> in the next version I'll fix the build error and split the last patch even
> more, so we could perform a more precise bisection. I'd be grateful if you
> could push it on your test coupled branch and Tony could test it again before
> merging it with next again.

Yeah, if we could get testing from Tony first that'd be ideal.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180530/597b2dc3/attachment.sig>

^ permalink raw reply

* [PATCH v2 09/10] ARM: dts: exynos5250: add DSI node
From: Krzysztof Kozlowski @ 2018-05-30 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1527682561-1386-10-git-send-email-m.purski@samsung.com>

On Wed, May 30, 2018 at 2:16 PM, Maciej Purski <m.purski@samsung.com> wrote:
> From: Andrzej Hajda <a.hajda@samsung.com>
>
> The patch adds common part of DSI node for Exynos5250 platforms
> and a required mipi-phy node.
>
> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
> Signed-off-by: Maciej Purski <m.purski@samsung.com>
> ---
>  arch/arm/boot/dts/exynos5250.dtsi | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)

Thanks for changes. I'll take both DTS patches after bindings get
accepted and after upcomming merge window.

Best regards,
Krzysztof

^ permalink raw reply

* [PATCH v2 9/9] PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
From: Jon Hunter @ 2018-05-30 14:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAPDyKFoiEctBuyw1nVS6ruQ92-jOsMMSAUsWKew42dkmk64_8A@mail.gmail.com>


On 30/05/18 12:31, Ulf Hansson wrote:
> On 30 May 2018 at 11:19, Jon Hunter <jonathanh@nvidia.com> wrote:
>> Hi Ulf,
>>
>> On 29/05/18 11:04, Ulf Hansson wrote:
>>> The existing dev_pm_domain_attach() function, allows a single PM domain to
>>> be attached per device. To be able to support devices that are partitioned
>>> across multiple PM domains, let's introduce a new interface,
>>> dev_pm_domain_attach_by_id().
>>>
>>> The dev_pm_domain_attach_by_id() returns a new allocated struct device with
>>> the corresponding attached PM domain. This enables for example a driver to
>>> operate on the new device from a power management point of view. The driver
>>> may then also benefit from using the received device, to set up so called
>>> device-links towards its original device. Depending on the situation, these
>>> links may then be dynamically changed.
>>
>> I have given this series a go with Tegra updating the XHCI driver to make
>> use of these new APIs. Good news it does appear to work fine for Tegra,
>> however, initially when looking at the device_link_add() API ...
>>
>> /**
>>  * device_link_add - Create a link between two devices.
>>  * @consumer: Consumer end of the link.
>>  * @supplier: Supplier end of the link.
>>  * @flags: Link flags.
>>
>>  ... I had assumed that the 'consumer' device would be the actual XHCI
>> device (in the case of Tegra) and the 'supplier' device would be the new
>> genpd device. However, this did not work and I got the following WARN on
>> boot ...
>>
>> [    2.050929] ---[ end trace eff0b5265e530c92 ]---
>> [    2.055567] WARNING: CPU: 2 PID: 1 at drivers/base/core.c:446 device_links_driver_bound+0xc0/0xd0
>> [    2.064422] Modules linked in:
>> [    2.067471] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G        W         4.17.0-rc7-next-20180529-00011-g4faf0dc0ebf3-dirty #32
>> [    2.078667] Hardware name: Google Pixel C (DT)
>> [    2.083101] pstate: 80000005 (Nzcv daif -PAN -UAO)
>> [    2.087881] pc : device_links_driver_bound+0xc0/0xd0
>> [    2.092832] lr : device_links_driver_bound+0x20/0xd0
>>
>> Switching the Tegra XHCI device to be the 'supplier' and genpd device to
>> be the 'consumer' does work, but is this correct? Seems to be opposite to
> 
> It shall be the opposite. The Tegra XHCI device shall be the consumer.
> 
>> what I expected. Maybe I am missing something?
> 
> The problem you get is because the device returned from
> dev_pm_domain_attach_by_id(), let's call it genpd_dev, doesn't have
> the a driver.
> 
> You need to use a couple of device links flag, something like this:
> 
> link = device_link_add(dev, genpd_dev, DL_FLAG_STATELESS |
> DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);

Thanks, adding the DL_FLAG_STATELESS flag resolved the issue. I already added
the DL_FLAG_PM_RUNTIME but I did not bother with the DL_FLAG_RPM_ACTIVE as
from the description it appears that this will always keep it on? Seems to
work fine without it.

> Moreover, you also need these commits, depending if you are running on
> something else than Rafael's tree.
> 
> a0504aecba76 PM / runtime: Drop usage count for suppliers at device link removal
> 1e8378619841 PM / runtime: Fixup reference counting of device link
> suppliers at probe

Yes these are currently in -next and so I have these.
 
>>
>>> The new interface is typically called by drivers during their probe phase,
>>> in case they manages devices which uses multiple PM domains. If that is the
>>> case, the driver also becomes responsible of managing the detaching of the
>>> PM domains, which typically should be done at the remove phase. Detaching
>>> is done by calling the existing dev_pm_domain_detach() function and for
>>> each of the received devices from dev_pm_domain_attach_by_id().
>>>
>>> Note, currently its only genpd that supports multiple PM domains per
>>> device, but dev_pm_domain_attach_by_id() can easily by extended to cover
>>> other PM domain types, if/when needed.
>>>
>>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>>> ---
>>>
>>> Changes in v2:
>>>       - Fixed comments from Jon. Clarified function descriptions/changelog and
>>>       return ERR_PTR(-EEXIST) in case a PM domain is already assigned.
>>>       - Fix build error when CONFIG_PM is unset.
>>>
>>> ---
>>>  drivers/base/power/common.c | 43 ++++++++++++++++++++++++++++++++++---
>>>  include/linux/pm_domain.h   |  7 ++++++
>>>  2 files changed, 47 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/base/power/common.c b/drivers/base/power/common.c
>>> index 7ae62b6355b8..5e5ea0c239de 100644
>>> --- a/drivers/base/power/common.c
>>> +++ b/drivers/base/power/common.c
>>> @@ -116,14 +116,51 @@ int dev_pm_domain_attach(struct device *dev, bool power_on)
>>>  }
>>>  EXPORT_SYMBOL_GPL(dev_pm_domain_attach);
>>>
>>> +/**
>>> + * dev_pm_domain_attach_by_id - Attach a device to one of its PM domains.
>>> + * @dev: Device to attach.
>>
>> Nit ... I still don't think this is the device we are attaching to, but the
>> device the PM domains are associated with. IOW we are using this device to
>> lookup the PM domains.
> 
> Right. I forgot to update that part of the description.
> 
> What about:
> 
> dev_pm_domain_attach_by_id - Associate a device with one of its PM domains.
> @dev: The device used to lookup the PM domain.

Perfect.
 
>>
>>> + * @index: The index of the PM domain.
>>> + *
>>> + * As @dev may only be attached to a single PM domain, the backend PM domain
>>> + * provider creates a virtual device to attach instead. If attachment succeeds,
>>> + * the ->detach() callback in the struct dev_pm_domain are assigned by the
>>> + * corresponding backend attach function, as to deal with detaching of the
>>> + * created virtual device.
>>> + *
>>> + * This function should typically be invoked by a driver during the probe phase,
>>> + * in case its device requires power management through multiple PM domains. The
>>> + * driver may benefit from using the received device, to configure device-links
>>> + * towards its original device. Depending on the use-case and if needed, the
>>> + * links may be dynamically changed by the driver, which allows it to control
>>> + * the power to the PM domains independently from each other.
>>> + *
>>> + * Callers must ensure proper synchronization of this function with power
>>> + * management callbacks.
>>> + *
>>> + * Returns the virtual created device when successfully attached to its PM
>>> + * domain, NULL in case @dev don't need a PM domain, else an ERR_PTR().
>>> + * Note that, to detach the returned virtual device, the driver shall call
>>> + * dev_pm_domain_detach() on it, typically during the remove phase.
>>> + */
>>> +struct device *dev_pm_domain_attach_by_id(struct device *dev,
>>> +                                       unsigned int index)
>>> +{
>>> +     if (dev->pm_domain)
>>> +             return ERR_PTR(-EEXIST);
>>> +
>>> +     return genpd_dev_pm_attach_by_id(dev, index);
>>> +}
>>> +EXPORT_SYMBOL_GPL(dev_pm_domain_attach_by_id);
>>> +
>>>  /**
>>>   * dev_pm_domain_detach - Detach a device from its PM domain.
>>>   * @dev: Device to detach.
>>>   * @power_off: Used to indicate whether we should power off the device.
>>>   *
>>> - * This functions will reverse the actions from dev_pm_domain_attach() and thus
>>> - * try to detach the @dev from its PM domain. Typically it should be invoked
>>> - * from subsystem level code during the remove phase.
>>> + * This functions will reverse the actions from dev_pm_domain_attach() and
>>> + * dev_pm_domain_attach_by_id(), thus it detaches @dev from its PM domain.
>>> + * Typically it should be invoked during the remove phase, either from
>>> + * subsystem level code or from drivers.
>>>   *
>>>   * Callers must ensure proper synchronization of this function with power
>>>   * management callbacks.
>>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>>> index 82458e8e2e01..9206a4fef9ac 100644
>>> --- a/include/linux/pm_domain.h
>>> +++ b/include/linux/pm_domain.h
>>> @@ -299,6 +299,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np)
>>>
>>>  #ifdef CONFIG_PM
>>>  int dev_pm_domain_attach(struct device *dev, bool power_on);
>>> +struct device *dev_pm_domain_attach_by_id(struct device *dev,
>>> +                                       unsigned int index);
>>>  void dev_pm_domain_detach(struct device *dev, bool power_off);
>>>  void dev_pm_domain_set(struct device *dev, struct dev_pm_domain *pd);
>>>  #else
>>> @@ -306,6 +308,11 @@ static inline int dev_pm_domain_attach(struct device *dev, bool power_on)
>>>  {
>>>       return 0;
>>>  }
>>> +static inline struct device *dev_pm_domain_attach_by_id(struct device *dev,
>>> +                                                     unsigned int index)
>>> +{
>>> +     return NULL;
>>> +}
>>>  static inline void dev_pm_domain_detach(struct device *dev, bool power_off) {}
>>>  static inline void dev_pm_domain_set(struct device *dev,
>>>                                    struct dev_pm_domain *pd) {}
>>>
>>
>> My only other comments on this series are ...
>>
>> 1. I think it would be nice to have an dev_pm_domain_attach_by_name() and
>>    that the DT binding has a 'power-domain-names' property.
> 
> I think it makes sense, but my plan was to do that as second step on
> top. Are you okay with that as well?

Yes that is fine with me.

>> 2. I am wondering if there could be value in having a
>>    dev_pm_domain_attach_link_all() helper which would attach and link all
>>    PM domains at once.
> 
> Perhaps it can be useful, yes! However, maybe we can postpone that to
> after this series. I want to keep the series as simple as possible,
> then we can build upon it.

That's fine too and I can always add these if you prefer.

Cheers
Jon

-- 
nvpublic

^ permalink raw reply

* [PATCH v2 6/6] ARM: dts: dra76x: Add MCAN node
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-1-faiz_abbas@ti.com>

From: Franklin S Cooper Jr <fcooper@ti.com>

Add support for the MCAN peripheral which supports both classic
CAN messages along with the new CAN-FD message.

Add MCAN node to evm and enable it with a maximum datarate of 5 mbps

Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 arch/arm/boot/dts/dra76-evm.dts |  7 +++++++
 arch/arm/boot/dts/dra76x.dtsi   | 15 +++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/arch/arm/boot/dts/dra76-evm.dts b/arch/arm/boot/dts/dra76-evm.dts
index 2deb96405d06..277765257410 100644
--- a/arch/arm/boot/dts/dra76-evm.dts
+++ b/arch/arm/boot/dts/dra76-evm.dts
@@ -404,3 +404,10 @@
 	phys = <&pcie1_phy>, <&pcie2_phy>;
 	phy-names = "pcie-phy0", "pcie-phy1";
 };
+
+&m_can0 {
+	status = "okay";
+	can-transceiver {
+		max-bitrate = <5000000>;
+	};
+};
diff --git a/arch/arm/boot/dts/dra76x.dtsi b/arch/arm/boot/dts/dra76x.dtsi
index 57b8dc0fe719..d7a8cc569808 100644
--- a/arch/arm/boot/dts/dra76x.dtsi
+++ b/arch/arm/boot/dts/dra76x.dtsi
@@ -27,6 +27,21 @@
 			ti,syss-mask = <1>;
 			clocks = <&wkupaon_clkctrl DRA7_ADC_CLKCTRL 0>;
 			clock-names = "fck";
+
+			m_can0: mcan at 42c01a00 {
+				compatible = "bosch,m_can";
+				reg = <0x1a00 0x4000>, <0x0 0x18FC>;
+				reg-names = "m_can", "message_ram";
+				interrupt-parent = <&gic>;
+				interrupts = <GIC_SPI 67 IRQ_TYPE_LEVEL_HIGH>,
+					     <GIC_SPI 68 IRQ_TYPE_LEVEL_HIGH>;
+				interrupt-names = "int0", "int1";
+				ti,hwmods = "mcan";
+				clocks = <&mcan_clk>, <&l3_iclk_div>;
+				clock-names = "cclk", "hclk";
+				bosch,mram-cfg = <0x0 0 0 32 0 0 1 1>;
+				status = "disabled";
+			};
 		};
 	};
 
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 5/6] ARM: dts: Add generic interconnect target module node for MCAN
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-1-faiz_abbas@ti.com>

The ti-sysc driver provides support for manipulating the idlemodes
and interconnect level resets.

Add the generic interconnect target module node for MCAN to support
the same.

CC: Tony Lindgren <tony@atomide.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 arch/arm/boot/dts/dra76x.dtsi | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/dra76x.dtsi b/arch/arm/boot/dts/dra76x.dtsi
index bfc82636999c..57b8dc0fe719 100644
--- a/arch/arm/boot/dts/dra76x.dtsi
+++ b/arch/arm/boot/dts/dra76x.dtsi
@@ -11,6 +11,25 @@
 / {
 	compatible = "ti,dra762", "ti,dra7";
 
+	ocp {
+
+		target-module at 0x42c00000 {
+			compatible = "ti,sysc-dra7-mcan";
+			ranges = <0x0 0x42c00000 0x2000>;
+			#address-cells = <1>;
+			#size-cells = <1>;
+			reg = <0x42c01900 0x4>,
+			      <0x42c01904 0x4>,
+			      <0x42c01908 0x4>;
+			reg-names = "rev", "sysc", "syss";
+			ti,sysc-mask = <(SYSC_OMAP4_SOFTRESET |
+					 SYSC_DRA7_MCAN_ENAWAKEUP)>;
+			ti,syss-mask = <1>;
+			clocks = <&wkupaon_clkctrl DRA7_ADC_CLKCTRL 0>;
+			clock-names = "fck";
+		};
+	};
+
 };
 
 /* MCAN interrupts are hard-wired to irqs 67, 68 */
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 4/6] bus: ti-sysc: Add support for using ti-sysc for MCAN on dra76x
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-1-faiz_abbas@ti.com>

The dra76x MCAN generic interconnect module has a its own
format for the bits in the control registers.

Therefore add a new module type, new regbits and new capabilities
specific to the MCAN module.

CC: Tony Lindgren <tony@atomide.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 .../devicetree/bindings/bus/ti-sysc.txt         |  1 +
 drivers/bus/ti-sysc.c                           | 17 +++++++++++++++++
 include/dt-bindings/bus/ti-sysc.h               |  2 ++
 include/linux/platform_data/ti-sysc.h           |  1 +
 4 files changed, 21 insertions(+)

diff --git a/Documentation/devicetree/bindings/bus/ti-sysc.txt b/Documentation/devicetree/bindings/bus/ti-sysc.txt
index 2957a9ae291f..ebbb11144b7b 100644
--- a/Documentation/devicetree/bindings/bus/ti-sysc.txt
+++ b/Documentation/devicetree/bindings/bus/ti-sysc.txt
@@ -36,6 +36,7 @@ Required standard properties:
 		"ti,sysc-omap-aes"
 		"ti,sysc-mcasp"
 		"ti,sysc-usb-host-fs"
+		"ti,sysc-dra7-mcan"
 
 - reg		shall have register areas implemented for the interconnect
 		target module in question such as revision, sysc and syss
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
index 7cd2fd04b212..83b47974567a 100644
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -1262,6 +1262,22 @@ static const struct sysc_capabilities sysc_omap4_usb_host_fs = {
 	.regbits = &sysc_regbits_omap4_usb_host_fs,
 };
 
+static const struct sysc_regbits sysc_regbits_dra7_mcan = {
+	.dmadisable_shift = -ENODEV,
+	.midle_shift = -ENODEV,
+	.sidle_shift = -ENODEV,
+	.clkact_shift = -ENODEV,
+	.enwkup_shift = 4,
+	.srst_shift = 0,
+	.emufree_shift = -ENODEV,
+	.autoidle_shift = -ENODEV,
+};
+
+static const struct sysc_capabilities sysc_dra7_mcan = {
+	.type = TI_SYSC_DRA7_MCAN,
+	.regbits = &sysc_regbits_dra7_mcan,
+};
+
 static int sysc_init_pdata(struct sysc *ddata)
 {
 	struct ti_sysc_platform_data *pdata = dev_get_platdata(ddata->dev);
@@ -1441,6 +1457,7 @@ static const struct of_device_id sysc_match[] = {
 	{ .compatible = "ti,sysc-mcasp", .data = &sysc_omap4_mcasp, },
 	{ .compatible = "ti,sysc-usb-host-fs",
 	  .data = &sysc_omap4_usb_host_fs, },
+	{ .compatible = "ti,sysc-dra7-mcan", .data = &sysc_dra7_mcan, },
 	{  },
 };
 MODULE_DEVICE_TABLE(of, sysc_match);
diff --git a/include/dt-bindings/bus/ti-sysc.h b/include/dt-bindings/bus/ti-sysc.h
index 2c005376ac0e..7138384e2ef9 100644
--- a/include/dt-bindings/bus/ti-sysc.h
+++ b/include/dt-bindings/bus/ti-sysc.h
@@ -15,6 +15,8 @@
 /* SmartReflex sysc found on 36xx and later */
 #define SYSC_OMAP3_SR_ENAWAKEUP		(1 << 26)
 
+#define SYSC_DRA7_MCAN_ENAWAKEUP	(1 << 4)
+
 /* SYSCONFIG STANDBYMODE/MIDLEMODE/SIDLEMODE supported by hardware */
 #define SYSC_IDLE_FORCE			0
 #define SYSC_IDLE_NO			1
diff --git a/include/linux/platform_data/ti-sysc.h b/include/linux/platform_data/ti-sysc.h
index 80ce28d40832..1ea3aab972b4 100644
--- a/include/linux/platform_data/ti-sysc.h
+++ b/include/linux/platform_data/ti-sysc.h
@@ -14,6 +14,7 @@ enum ti_sysc_module_type {
 	TI_SYSC_OMAP4_SR,
 	TI_SYSC_OMAP4_MCASP,
 	TI_SYSC_OMAP4_USB_HOST_FS,
+	TI_SYSC_DRA7_MCAN,
 };
 
 struct ti_sysc_cookie {
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 3/6] clk: ti: dra7: Add clkctrl clock data for the mcan clocks
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-1-faiz_abbas@ti.com>

Add clkctrl data for the m_can clocks and register it within the
clkctrl driver

CC: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 drivers/clk/ti/clk-7xx.c         | 1 +
 include/dt-bindings/clock/dra7.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/clk/ti/clk-7xx.c b/drivers/clk/ti/clk-7xx.c
index fb249a1637a5..71a122b2dc67 100644
--- a/drivers/clk/ti/clk-7xx.c
+++ b/drivers/clk/ti/clk-7xx.c
@@ -708,6 +708,7 @@ static const struct omap_clkctrl_reg_data dra7_wkupaon_clkctrl_regs[] __initcons
 	{ DRA7_COUNTER_32K_CLKCTRL, NULL, 0, "wkupaon_iclk_mux" },
 	{ DRA7_UART10_CLKCTRL, dra7_uart10_bit_data, CLKF_SW_SUP, "wkupaon_cm:clk:0060:24" },
 	{ DRA7_DCAN1_CLKCTRL, dra7_dcan1_bit_data, CLKF_SW_SUP, "wkupaon_cm:clk:0068:24" },
+	{ DRA7_ADC_CLKCTRL, NULL, CLKF_SW_SUP, "mcan_clk" },
 	{ 0 },
 };
 
diff --git a/include/dt-bindings/clock/dra7.h b/include/dt-bindings/clock/dra7.h
index 5e1061b15aed..d7549c57cac3 100644
--- a/include/dt-bindings/clock/dra7.h
+++ b/include/dt-bindings/clock/dra7.h
@@ -168,5 +168,6 @@
 #define DRA7_COUNTER_32K_CLKCTRL	DRA7_CLKCTRL_INDEX(0x50)
 #define DRA7_UART10_CLKCTRL	DRA7_CLKCTRL_INDEX(0x80)
 #define DRA7_DCAN1_CLKCTRL	DRA7_CLKCTRL_INDEX(0x88)
+#define DRA7_ADC_CLKCTRL	DRA7_CLKCTRL_INDEX(0xa0)
 
 #endif
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 2/6] ARM: dts: dra762: Add MCAN clock support
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-1-faiz_abbas@ti.com>

From: Lokesh Vutla <lokeshvutla@ti.com>

MCAN is clocked by H14 divider of DPLL_GMAC. Unlike other
DPLL dividers this DPLL_GMAC H14 divider is controlled by
control module. Adding support for these clocks.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 arch/arm/boot/dts/dra76x.dtsi | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/arch/arm/boot/dts/dra76x.dtsi b/arch/arm/boot/dts/dra76x.dtsi
index 1c88c581ff18..bfc82636999c 100644
--- a/arch/arm/boot/dts/dra76x.dtsi
+++ b/arch/arm/boot/dts/dra76x.dtsi
@@ -17,3 +17,36 @@
 &crossbar_mpu {
 	ti,irqs-skip = <10 67 68 133 139 140>;
 };
+
+&scm_conf_clocks {
+	dpll_gmac_h14x2_ctrl_ck: dpll_gmac_h14x2_ctrl_ck at 3fc {
+		#clock-cells = <0>;
+		compatible = "ti,divider-clock";
+		clocks = <&dpll_gmac_x2_ck>;
+		ti,max-div = <63>;
+		reg = <0x03fc>;
+		ti,bit-shift=<20>;
+		ti,latch-bit=<26>;
+		assigned-clocks = <&dpll_gmac_h14x2_ctrl_ck>;
+		assigned-clock-rates = <80000000>;
+	};
+
+	dpll_gmac_h14x2_ctrl_mux_ck: dpll_gmac_h14x2_ctrl_mux_ck at 3fc {
+		#clock-cells = <0>;
+		compatible = "ti,mux-clock";
+		clocks = <&dpll_gmac_ck>, <&dpll_gmac_h14x2_ctrl_ck>;
+		reg = <0x3fc>;
+		ti,bit-shift = <29>;
+		ti,latch-bit=<26>;
+		assigned-clocks = <&dpll_gmac_h14x2_ctrl_mux_ck>;
+		assigned-clock-parents = <&dpll_gmac_h14x2_ctrl_ck>;
+	};
+
+	mcan_clk: mcan_clk at 3fc {
+		#clock-cells = <0>;
+		compatible = "ti,gate-clock";
+		clocks = <&dpll_gmac_h14x2_ctrl_mux_ck>;
+		ti,bit-shift = <27>;
+		reg = <0x3fc>;
+	};
+};
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 1/6] ARM: dra762: hwmod: Add MCAN support
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530141133.3711-1-faiz_abbas@ti.com>

From: Lokesh Vutla <lokeshvutla@ti.com>

Add MCAN hwmod data and register it for dra762 silicons.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
---
 arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 32 +++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
index 62352d1e6361..a2cd7f865a60 100644
--- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
@@ -1355,6 +1355,29 @@ static struct omap_hwmod dra7xx_mailbox13_hwmod = {
 	},
 };
 
+/*
+ * 'mcan' class
+ *
+ */
+static struct omap_hwmod_class dra76x_mcan_hwmod_class = {
+	.name	= "mcan",
+};
+
+/* mcan */
+static struct omap_hwmod dra76x_mcan_hwmod = {
+	.name		= "mcan",
+	.class		= &dra76x_mcan_hwmod_class,
+	.clkdm_name	= "wkupaon_clkdm",
+	.main_clk	= "mcan_clk",
+	.prcm = {
+		.omap4 = {
+			.clkctrl_offs = DRA7XX_CM_WKUPAON_ADC_CLKCTRL_OFFSET,
+			.context_offs = DRA7XX_RM_WKUPAON_ADC_CONTEXT_OFFSET,
+			.modulemode   = MODULEMODE_SWCTRL,
+		},
+	},
+};
+
 /*
  * 'mcspi' class
  *
@@ -3818,6 +3841,14 @@ static struct omap_hwmod_ocp_if dra7xx_l4_per2__epwmss2 = {
 	.user		= OCP_USER_MPU,
 };
 
+/* l3_main_1 -> mcan */
+static struct omap_hwmod_ocp_if dra76x_l3_main_1__mcan = {
+	.master		= &dra7xx_l3_main_1_hwmod,
+	.slave		= &dra76x_mcan_hwmod,
+	.clk		= "l3_iclk_div",
+	.user		= OCP_USER_MPU | OCP_USER_SDMA,
+};
+
 static struct omap_hwmod_ocp_if *dra7xx_hwmod_ocp_ifs[] __initdata = {
 	&dra7xx_l3_main_1__dmm,
 	&dra7xx_l3_main_2__l3_instr,
@@ -3958,6 +3989,7 @@ static struct omap_hwmod_ocp_if *dra7xx_gp_hwmod_ocp_ifs[] __initdata = {
 /* SoC variant specific hwmod links */
 static struct omap_hwmod_ocp_if *dra76x_hwmod_ocp_ifs[] __initdata = {
 	&dra7xx_l4_per3__usb_otg_ss4,
+	&dra76x_l3_main_1__mcan,
 	NULL,
 };
 
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 0/6] Add MCAN Support for dra76x
From: Faiz Abbas @ 2018-05-30 14:11 UTC (permalink / raw)
  To: linux-arm-kernel

The following patches add dts, hwmod and sysconfig support
for MCAN on TI's dra76 SOCs

The patches depend on the following series:
https://patchwork.kernel.org/patch/10221105/

Changes in v2:
 1. Added Support for mcan in the ti-sysc driver
    Also added the target-module node for the same

 2. Added clkctrl data for mcan clocks

Faiz Abbas (3):
  clk: ti: dra7: Add clkctrl clock data for the mcan clocks
  bus: ti-sysc: Add support for using ti-sysc for MCAN on dra76x
  ARM: dts: Add generic interconnect target module node for MCAN

Franklin S Cooper Jr (1):
  ARM: dts: dra76x: Add MCAN node

Lokesh Vutla (2):
  ARM: dra762: hwmod: Add MCAN support
  ARM: dts: dra762: Add MCAN clock support

 .../devicetree/bindings/bus/ti-sysc.txt       |  1 +
 arch/arm/boot/dts/dra76-evm.dts               |  7 ++
 arch/arm/boot/dts/dra76x.dtsi                 | 67 +++++++++++++++++++
 arch/arm/mach-omap2/omap_hwmod_7xx_data.c     | 32 +++++++++
 drivers/bus/ti-sysc.c                         | 17 +++++
 drivers/clk/ti/clk-7xx.c                      |  1 +
 include/dt-bindings/bus/ti-sysc.h             |  2 +
 include/dt-bindings/clock/dra7.h              |  1 +
 include/linux/platform_data/ti-sysc.h         |  1 +
 9 files changed, 129 insertions(+)

-- 
2.17.0

^ permalink raw reply

* [PATCH v3 1/2] ARM: dma-mapping: Implement arm_dma_iommu_detach_device()
From: Thierry Reding @ 2018-05-30 14:07 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <d3121aeb-1d75-2ba6-681a-f1e0681e290a@arm.com>

On Wed, May 30, 2018 at 02:42:50PM +0100, Robin Murphy wrote:
> On 30/05/18 14:12, Thierry Reding wrote:
> > On Wed, May 30, 2018 at 02:54:46PM +0200, Thierry Reding wrote:
> > > On Wed, May 30, 2018 at 10:59:30AM +0100, Robin Murphy wrote:
> > > > On 30/05/18 09:03, Thierry Reding wrote:
> > > > > From: Thierry Reding <treding@nvidia.com>
> > > > > 
> > > > > Implement this function to enable drivers from detaching from any IOMMU
> > > > > domains that architecture code might have attached them to so that they
> > > > > can take exclusive control of the IOMMU via the IOMMU API.
> > > > > 
> > > > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > > ---
> > > > > Changes in v3:
> > > > > - make API 32-bit ARM specific
> > > > > - avoid extra local variable
> > > > > 
> > > > > Changes in v2:
> > > > > - fix compilation
> > > > > 
> > > > >    arch/arm/include/asm/dma-mapping.h |  3 +++
> > > > >    arch/arm/mm/dma-mapping-nommu.c    |  4 ++++
> > > > >    arch/arm/mm/dma-mapping.c          | 16 ++++++++++++++++
> > > > >    3 files changed, 23 insertions(+)
> > > > > 
> > > > > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > > > > index 8436f6ade57d..5960e9f3a9d0 100644
> > > > > --- a/arch/arm/include/asm/dma-mapping.h
> > > > > +++ b/arch/arm/include/asm/dma-mapping.h
> > > > > @@ -103,6 +103,9 @@ extern void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > > > >    #define arch_teardown_dma_ops arch_teardown_dma_ops
> > > > >    extern void arch_teardown_dma_ops(struct device *dev);
> > > > > +#define arm_dma_iommu_detach_device arm_dma_iommu_detach_device
> > > > > +extern void arm_dma_iommu_detach_device(struct device *dev);
> > > > > +
> > > > >    /* do not use this function in a driver */
> > > > >    static inline bool is_device_dma_coherent(struct device *dev)
> > > > >    {
> > > > > diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
> > > > > index f448a0663b10..eb781369377b 100644
> > > > > --- a/arch/arm/mm/dma-mapping-nommu.c
> > > > > +++ b/arch/arm/mm/dma-mapping-nommu.c
> > > > > @@ -241,3 +241,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > > > >    void arch_teardown_dma_ops(struct device *dev)
> > > > >    {
> > > > >    }
> > > > > +
> > > > > +void arm_dma_iommu_detach_device(struct device *dev)
> > > > > +{
> > > > > +}
> > > > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> > > > > index af27f1c22d93..6d8af08b3e7d 100644
> > > > > --- a/arch/arm/mm/dma-mapping.c
> > > > > +++ b/arch/arm/mm/dma-mapping.c
> > > > > @@ -2400,3 +2400,19 @@ void arch_teardown_dma_ops(struct device *dev)
> > > > >    	arm_teardown_iommu_dma_ops(dev);
> > > > >    }
> > > > > +
> > > > > +void arm_dma_iommu_detach_device(struct device *dev)
> > > > > +{
> > > > > +#ifdef CONFIG_ARM_DMA_USE_IOMMU
> > > > > +	struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> > > > > +
> > > > > +	if (!mapping)
> > > > > +		return;
> > > > > +
> > > > > +	arm_iommu_release_mapping(mapping);
> > > > 
> > > > Potentially freeing the mapping before you try to operate on it is never the
> > > > best idea. Plus arm_iommu_detach_device() already releases a reference
> > > > appropriately anyway, so it's a double-free.
> > > 
> > > But the reference released by arm_iommu_detach_device() is to balance
> > > out the reference acquired by arm_iommu_attach_device(), isn't it? In
> > > the above, the arm_iommu_release_mapping() is supposed to drop the
> > > final reference which was obtained by arm_iommu_create_mapping(). The
> > > mapping shouldn't go away irrespective of the order in which these
> > > will be called.
> > 
> > Going over the DMA/IOMMU code I just remembered that I drew inspiration
> > from arm_teardown_iommu_dma_ops() for the initial proposal which also
> > calls both arm_iommu_detach_device() and arm_iommu_release_mapping().
> > That said, one other possibility to implement this would be to export
> > the 32-bit and 64-bit ARM implementations of arch_teardown_dma_ops()
> > and use that instead. linux/dma-mapping.h implements a stub for
> > architectures that don't provide one, so it should work without any
> > #ifdef guards.
> > 
> > That combined with the set_dma_ops() fix in arm_iommu_detach_device()
> > should fix this pretty nicely.
> 
> OK, having a second look at the ARM code I see I had indeed overlooked that
> extra reference held until arm_teardown_iommu_dma_ops() - mea culpa - but
> frankly that looks wrong anyway, as it basically defeats the point of
> refcounting the mapping at all. AFAICS arm_setup_iommu_dma_ops() should just
> be made to behave 'normally' by unconditionally dropping the initial
> reference after calling __arm_iommu_attach_device(), then we don't need all
> these odd and confusing release calls dotted around at all.

Good point. I can follow up with a series to fix the reference counting.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180530/547e8bef/attachment.sig>

^ permalink raw reply

* [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
From: Thierry Reding @ 2018-05-30 14:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530140625.21247-1-thierry.reding@gmail.com>

From: Thierry Reding <treding@nvidia.com>

Depending on the kernel configuration, early ARM architecture setup code
may have attached the GPU to a DMA/IOMMU mapping that transparently uses
the IOMMU to back the DMA API. Tegra requires special handling for IOMMU
backed buffers (a special bit in the GPU's MMU page tables indicates the
memory path to take: via the SMMU or directly to the memory controller).
Transparently backing DMA memory with an IOMMU prevents Nouveau from
properly handling such memory accesses and causes memory access faults.

As a side-note: buffers other than those allocated in instance memory
don't need to be physically contiguous from the GPU's perspective since
the GPU can map them into contiguous buffers using its own MMU. Mapping
these buffers through the IOMMU is unnecessary and will even lead to
performance degradation because of the additional translation. One
exception to this are compressible buffers which need large pages. In
order to enable these large pages, multiple small pages will have to be
combined into one large (I/O virtually contiguous) mapping via the
IOMMU. However, that is a topic outside the scope of this fix and isn't
currently supported. An implementation will want to explicitly create
these large pages in the Nouveau driver, so detaching from a DMA/IOMMU
mapping would still be required.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v4:
- use existing APIs to detach from a DMA/IOMMU mapping

Changes in v3:
- clarify the use of IOMMU mapping for compressible buffers
- squash multiple patches into this

 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 78597da6313a..0e372a190d3f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -23,6 +23,10 @@
 #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
 #include "priv.h"
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+#include <asm/dma-iommu.h>
+#endif
+
 static int
 nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
 {
@@ -105,6 +109,15 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
 	unsigned long pgsize_bitmap;
 	int ret;
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+	if (dev->archdata.mapping) {
+		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+
+		arm_iommu_detach_device(dev);
+		arm_iommu_release_mapping(mapping);
+	}
+#endif
+
 	if (!tdev->func->iommu_bit)
 		return;
 
-- 
2.17.0

^ permalink raw reply related

* [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
From: Thierry Reding @ 2018-05-30 14:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180530140625.21247-1-thierry.reding@gmail.com>

From: Thierry Reding <treding@nvidia.com>

Instead of setting the DMA ops pointer to NULL, set the correct,
non-IOMMU ops depending on the device's coherency setting.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v4:
- new patch to fix existing arm_iommu_detach_device() to do what we need

 arch/arm/mm/dma-mapping.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index af27f1c22d93..87a0037574e4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask)
 	return __dma_supported(dev, mask, false);
 }
 
+static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
+{
+	return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
+}
+
 #ifdef CONFIG_ARM_DMA_USE_IOMMU
 
 static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
@@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev)
 	iommu_detach_device(mapping->domain, dev);
 	kref_put(&mapping->kref, release_iommu_mapping);
 	to_dma_iommu_mapping(dev) = NULL;
-	set_dma_ops(dev, NULL);
+	set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
 
 	pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
 }
@@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
 
 #endif	/* CONFIG_ARM_DMA_USE_IOMMU */
 
-static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
-{
-	return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
-}
-
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			const struct iommu_ops *iommu, bool coherent)
 {
-- 
2.17.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox