linux-perf-users.vger.kernel.org archive mirror
From: Leo Yan <leo.yan@arm.com>
To: Tanmay Jagdale <tanmay@marvell.com>
Cc: suzuki.poulose@arm.com, mike.leach@linaro.org,
	james.clark@linaro.org, john.g.garry@oracle.com,
	leo.yan@linux.dev, will@kernel.org, acme@kernel.org,
	adrian.hunter@intel.com, linux-arm-kernel@lists.infradead.org,
	linux-perf-users@vger.kernel.org, coresight@lists.linaro.org,
	linux-kernel@vger.kernel.org, sgoutham@marvell.com,
	gcherian@marvell.com
Subject: Re: [PATCH V3 1/2] perf: cs-etm: Fixes in instruction sample synthesis
Date: Thu, 27 Mar 2025 15:35:30 +0000
Message-ID: <20250327153530.GF604566@e132581.arm.com>
In-Reply-To: <20250327111149.461012-2-tanmay@marvell.com>

Hi Tanmay,

On Thu, Mar 27, 2025 at 04:41:48PM +0530, Tanmay Jagdale wrote:
> The existing method to synthesize instruction samples has the
> following issues:
> 1. Branch instruction mnemonics were being added to non-branch
>    instructions too.
> 2. Branch target address was missing
> 
> To fix the issues, start synthesizing the instructions from the
> previous packet (tidq->prev_packet) instead of current packet
> (tidq->packet). This way it's easy to figure out the target
> address of the branch instruction in tidq->prev_packet which
> is the current packet's (tidq->packet) first executed instruction.
> 
> Since we have now switched to processing the previous packet
> first, we need not swap the packets during cs_etm__flush().
> 
> Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
> Reviewed-by: James Clark <james.clark@arm.com>

I saw James's Reviewed-by tag, but I still have several comments.

Sorry for jumping in late.

> ---
>  tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++-------
>  1 file changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 0bf9e5c27b59..ebed5b98860e 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -1576,10 +1576,26 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,

It seems to me the problem is that cs_etm__synth_instruction_sample()
is invoked from multiple callers.

Both the previous packet and the current packet are valid for the flow:
  cs_etm__sample()
    `> cs_etm__synth_instruction_sample()

Only the previous packet is valid and the current packet stores stale
data for the flows:

  cs_etm__flush()
    `> cs_etm__synth_instruction_sample()

  cs_etm__end_block()
    `> cs_etm__synth_instruction_sample()

First, as a prerequisite, I think we should resolve the stale data in
the packet.  So we need a fix like:

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c                
index 0bf9e5c27b59..b7b17c0e4806 100644                                         
--- a/tools/perf/util/cs-etm.c                                                  
+++ b/tools/perf/util/cs-etm.c                                                  
@@ -741,6 +741,9 @@ static void cs_etm__packet_swap(struct cs_etm_auxtrace *etm,
                                                                                
        if (etm->synth_opts.branches || etm->synth_opts.last_branch ||          
            etm->synth_opts.instructions) {                                     
+               /* The previous packet will not be used, cleanup it */          
+               memset(tidq->prev_packet, 0x0, sizeof(*tidq->packet));          
+                                                                               
                /*                                                              
                 * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for 
                 * the next incoming packet.                                    
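
The effect of this cleanup can be seen in a small stand-alone model. This
is only a sketch under assumptions: `struct pkt`, `struct queue` and
`queue_swap()` are hypothetical stand-ins, not the real perf structures.
Only the ordering (clear the old previous packet, then swap) mirrors the
diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a decoded trace packet. */
struct pkt {
	uint64_t start_addr;
	uint64_t end_addr;
	int cpu;
};

/* Hypothetical stand-in for the per-thread queue holding two packets. */
struct queue {
	struct pkt *packet;      /* packet currently being filled */
	struct pkt *prev_packet; /* last fully decoded packet */
};

/*
 * Clear the old previous packet, then swap the pointers, so that the
 * buffer which becomes "packet" holds no stale data from earlier use.
 */
static void queue_swap(struct queue *q)
{
	struct pkt *tmp;

	assert(q && q->packet && q->prev_packet);

	memset(q->prev_packet, 0, sizeof(*q->prev_packet));

	tmp = q->packet;
	q->packet = q->prev_packet;
	q->prev_packet = tmp;
}
```

After one call, the buffer that is now `packet` is all zeroes, so a later
read of its start address gives 0 rather than an address left over from an
older packet.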

>  	sample.stream_id = etmq->etm->instructions_id;
>  	sample.period = period;
>  	sample.cpu = tidq->packet->cpu;

Should we use "prev_packet->cpu" here?

Even for a branch instruction, as its IP address is from the previous
packet, we should use "prev_packet->cpu" for CPU ID as well.

> -	sample.flags = tidq->prev_packet->flags;
>  	sample.cpumode = event->sample.header.misc;
>  
> -	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
> +	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->prev_packet, &sample);
> +
> +	/* Populate branch target information only when we encounter
> +	 * branch instruction, which is at the end of tidq->prev_packet.
> +	 */
> +	if (addr == (tidq->prev_packet->end_addr - 4)) {

  if (addr && addr == cs_etm__last_executed_instr(tidq->prev_packet))
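
A minimal sketch of why a helper is preferable to the hard-coded
`end_addr - 4`: the constant assumes every instruction is 4 bytes, which
does not hold for 16-bit T32 instructions. Everything below is a
hypothetical model (`struct pkt`, `last_executed_instr()`,
`is_last_instr()`); it assumes the helper subtracts the size of the last
instruction from the end address, and it treats a zero address as "no
instruction", so the check only fires for a non-zero addr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet: an address range plus the size of its final instruction. */
struct pkt {
	uint64_t start_addr;
	uint64_t end_addr;       /* exclusive end of the executed range */
	uint8_t last_instr_size; /* 4 for A64/A32, 2 or 4 for T32 */
};

/* Address of the last executed instruction, or 0 for an empty packet. */
static uint64_t last_executed_instr(const struct pkt *p)
{
	if (p->end_addr == 0)
		return 0;
	return p->end_addr - p->last_instr_size;
}

/* True only when addr is non-zero and is the final instruction of p. */
static bool is_last_instr(const struct pkt *p, uint64_t addr)
{
	return addr && addr == last_executed_instr(p);
}
```

For a range ending at 0x2006 whose last instruction is 2 bytes, the last
instruction is at 0x2004, whereas `end_addr - 4` would point at 0x2002.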

> +		/* Update the perf_sample flags using the prev_packet
> +		 * since that is the queue we are synthesizing.
> +		 */
> +		sample.flags = tidq->prev_packet->flags;
> +
> +		/* The last instruction of the previous queue would be a
> +		 * branch operation. Get the target of that branch by looking
> +		 * into the first executed instruction of the current packet
> +		 * queue.
> +		 */
> +		sample.addr = cs_etm__first_executed_instr(tidq->packet);

Combined with the change suggested above for cleaning up the packet in
cs_etm__packet_swap(), when we reach this point, a valid "tidq->packet"
yields the branch target address; otherwise the call returns 0.

> +	}
>  
>  	if (etm->synth_opts.last_branch)
>  		sample.branch_stack = tidq->last_branch;
> @@ -1771,7 +1787,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
>  	/* Get instructions remainder from previous packet */
>  	instrs_prev = tidq->period_instructions;
>  
> -	tidq->period_instructions += tidq->packet->instr_count;
> +	tidq->period_instructions += tidq->prev_packet->instr_count;

A side effect of this change is that we defer synthesizing instruction
samples for the _current_ packet: it is handled either when a new
packet arrives or at the end of a trace chunk.

The problem is the latter case: cs_etm__end_block() and cs_etm__flush()
both handle only the previous packet. As a result, the last packet is
ignored.

I suggest we first fix this issue in cs_etm__end_block() and
cs_etm__flush() (maybe we should consider consolidating the code with
cs_etm__sample()).
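
The deferred handling described above can be modelled in a few lines.
This is a sketch under assumptions, not the perf code: `struct stream`,
`stream_push()` and `stream_flush()` are hypothetical names, and a plain
instruction counter stands in for sample synthesis. It shows that a
one-behind consumer loses the final packet unless the flush path drains
it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical one-behind consumer: a packet is only accounted for
 * once its successor arrives.
 */
struct stream {
	uint32_t pending;   /* instr_count of the packet not yet processed */
	int has_pending;    /* non-zero while a packet is parked */
	uint64_t processed; /* total instructions accounted for so far */
};

/* A new packet arrives: process the parked one, then park the new one. */
static void stream_push(struct stream *s, uint32_t instr_count)
{
	assert(s);

	if (s->has_pending)
		s->processed += s->pending;
	s->pending = instr_count;
	s->has_pending = 1;
}

/* End of a trace chunk: the parked packet has no successor, so drain it here. */
static void stream_flush(struct stream *s)
{
	assert(s);

	if (s->has_pending) {
		s->processed += s->pending;
		s->pending = 0;
		s->has_pending = 0;
	}
}
```

Pushing packets of 10, 20 and 5 instructions accounts for only 30 until
`stream_flush()` runs; the remaining 5 correspond to the "last packet"
that would otherwise be ignored.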

>  	/*
>  	 * Record a branch when the last instruction in
> @@ -1851,8 +1867,11 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
>  			 * been executed, but PC has not advanced to next
>  			 * instruction)
>  			 */
> +			/* Get address from prev_packet since we are synthesizing
> +			 * that in cs_etm__synth_instruction_sample()
> +			 */
>  			addr = cs_etm__instr_addr(etmq, trace_chan_id,
> -						  tidq->packet, offset - 1);
> +						  tidq->prev_packet, offset - 1);
>  			ret = cs_etm__synth_instruction_sample(
>  				etmq, tidq, addr,
>  				etm->instructions_sample_period);
> @@ -1916,7 +1935,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
>  
>  	/* Handle start tracing packet */
>  	if (tidq->prev_packet->sample_type == CS_ETM_EMPTY)
> -		goto swap_packet;
> +		goto reset_last_br;
>  
>  	if (etmq->etm->synth_opts.last_branch &&
>  	    etmq->etm->synth_opts.instructions &&
> @@ -1952,8 +1971,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
>  			return err;
>  	}
>  
> -swap_packet:
> -	cs_etm__packet_swap(etm, tidq);
> +reset_last_br:

As said, if we consolidate cs_etm__flush() to process both the
previous packet and the current packet, then we don't need to remove
cs_etm__packet_swap() here, right?

Thanks,
Leo

>  
>  	/* Reset last branches after flush the trace */
>  	if (etm->synth_opts.last_branch)
> -- 
> 2.43.0
> 

WARNING: multiple messages have this Message-ID
From: Tanmay Jagdale <tanmay@marvell.com>
To: <leo.yan@arm.com>, Tanmay Jagdale <tanmay@marvell.com>
Cc: <suzuki.poulose@arm.com>, <mike.leach@linaro.org>,
	<james.clark@linaro.org>, <john.g.garry@oracle.com>,
	<leo.yan@linux.dev>, <will@kernel.org>, <acme@kernel.org>,
	<adrian.hunter@intel.com>, <linux-arm-kernel@lists.infradead.org>,
	<linux-perf-users@vger.kernel.org>, <coresight@lists.linaro.org>,
	<linux-kernel@vger.kernel.org>, <sgoutham@marvell.com>,
	<gcherian@marvell.com>
Subject: Re: [PATCH V3 1/2] perf: cs-etm: Fixes in instruction sample synthesis
Date: Tue, 1 Apr 2025 22:28:45 +0530
Message-ID: <20250327153530.GF604566@e132581.arm.com>
Message-ID: <20250401165845.FSFZOeApFpD3i5d47cIieM2-5QRNtNSGk5FIJMUC7Y8@z>
In-Reply-To: <20250327111149.461012-2-tanmay@marvell.com>

Hi Leo,

I was on vacation, so I could not get back to you earlier.

> Hi Tanmay,
>
> On Thu, Mar 27, 2025 at 04:41:48PM +0530, Tanmay Jagdale wrote:
>>> The existing method to synthesize instruction samples has the
>>> following issues:
>>> 1. Branch instruction mnemonics were being added to non-branch
>>>    instructions too.
>>> 2. Branch target address was missing
>>> 
>>> To fix the issues, start synthesizing the instructions from the
>>> previous packet (tidq->prev_packet) instead of current packet
>>> (tidq->packet). This way it's easy to figure out the target
>>> address of the branch instruction in tidq->prev_packet which
>>> is the current packet's (tidq->packet) first executed instruction.
>>> 
>>> Since we have now switched to processing the previous packet
>>> first, we need not swap the packets during cs_etm__flush().
>>> 
>>> Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
>>> Reviewed-by: James Clark <james.clark@arm.com>
>
> I saw James's Reviewed-by tag, but I still have several comments.
>
> Sorry for jumping in late.
No problem, thanks for the review.

> 
>>> ---
>>>  tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++-------
>>>  1 file changed, 25 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>>> index 0bf9e5c27b59..ebed5b98860e 100644
>>> --- a/tools/perf/util/cs-etm.c
>>> +++ b/tools/perf/util/cs-etm.c
>>> @@ -1576,10 +1576,26 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
> 
> It seems to me the problem is that cs_etm__synth_instruction_sample()
> is invoked from multiple callers.
>
> Both the previous packet and the current packet are valid for the flow:
>   cs_etm__sample()
>     `> cs_etm__synth_instruction_sample()
> 
> Only the previous packet is valid and the current packet stores stale
> data for the flows:
> 
>   cs_etm__flush()
>     `> cs_etm__synth_instruction_sample()
> 
>  cs_etm__end_block()
>    `> cs_etm__synth_instruction_sample()
> 
> First, as a prerequisite, I think we should resolve the stale data in
> the packet.  So we need a fix like:
Agree.

> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c                
> index 0bf9e5c27b59..b7b17c0e4806 100644                                         
> --- a/tools/perf/util/cs-etm.c                                                  
> +++ b/tools/perf/util/cs-etm.c                                                  
> @@ -741,6 +741,9 @@ static void cs_etm__packet_swap(struct cs_etm_auxtrace *etm,
>                                                                                 
>         if (etm->synth_opts.branches || etm->synth_opts.last_branch ||          
>             etm->synth_opts.instructions) {                                     
> +               /* The previous packet will not be used, cleanup it */          
> +               memset(tidq->prev_packet, 0x0, sizeof(*tidq->packet));          
> +                                                                               
>                 /*                                                              
>                  * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for 
>                  * the next incoming packet.                                    
> 
Thanks for pointing this out, I'll include this fix.

>>>  	sample.stream_id = etmq->etm->instructions_id;
>>>  	sample.period = period;
>>>  	sample.cpu = tidq->packet->cpu;
> 
> Should we use "prev_packet->cpu" here?
> 
> Even for a branch instruction, as its IP address is from the previous
> packet, we should use "prev_packet->cpu" for CPU ID as well.
ACK.

> 
>>> -	sample.flags = tidq->prev_packet->flags;
>>>  	sample.cpumode = event->sample.header.misc;
>>>  
>>> -	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
>>> +	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->prev_packet, &sample);
>>> +
>>> +	/* Populate branch target information only when we encounter
>>> +	 * branch instruction, which is at the end of tidq->prev_packet.
>>> +	 */
>>> +	if (addr == (tidq->prev_packet->end_addr - 4)) {
> 
>   if (addr && addr == cs_etm__last_executed_instr(tidq->prev_packet))
> 
>>> +		/* Update the perf_sample flags using the prev_packet
>>> +		 * since that is the queue we are synthesizing.
>>> +		 */
>>> +		sample.flags = tidq->prev_packet->flags;
>>> +
>>> +		/* The last instruction of the previous queue would be a
>>> +		 * branch operation. Get the target of that branch by looking
>>> +		 * into the first executed instruction of the current packet
>>> +		 * queue.
>>> +		 */
>>> +		sample.addr = cs_etm__first_executed_instr(tidq->packet);
> 
> Combined with the change suggested above for cleaning up the packet in
> cs_etm__packet_swap(), when we reach this point, a valid "tidq->packet"
> yields the branch target address; otherwise the call returns 0.
> 
>>> +	}
>>>  
>>>  	if (etm->synth_opts.last_branch)
>>>  		sample.branch_stack = tidq->last_branch;
>>> @@ -1771,7 +1787,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
>>>  	/* Get instructions remainder from previous packet */
>>>  	instrs_prev = tidq->period_instructions;
>>>  
>>> -	tidq->period_instructions += tidq->packet->instr_count;
>>> +	tidq->period_instructions += tidq->prev_packet->instr_count;
> 
> A side effect of this change is that we defer synthesizing instruction
> samples for the _current_ packet: it is handled either when a new
> packet arrives or at the end of a trace chunk.
>
> The problem is the latter case: cs_etm__end_block() and cs_etm__flush()
> both handle only the previous packet. As a result, the last packet is
> ignored.
Yes, I agree, this is a side effect of the patch. The last packet's
instructions are not handled.

> 
> I suggest we first fix this issue in cs_etm__end_block() and
> cs_etm__flush() (maybe we should consider consolidating the code with
> cs_etm__sample()).
Okay, sure. I will take a look at consolidating the code and post it in
the next version.

> 
>>>  	/*
>>>  	 * Record a branch when the last instruction in
>>> @@ -1851,8 +1867,11 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
>>>  			 * been executed, but PC has not advanced to next
>>>  			 * instruction)
>>>  			 */
>>> +			/* Get address from prev_packet since we are synthesizing
>>> +			 * that in cs_etm__synth_instruction_sample()
>>> +			 */
>>>  			addr = cs_etm__instr_addr(etmq, trace_chan_id,
>>> -						  tidq->packet, offset - 1);
>>> +						  tidq->prev_packet, offset - 1);
>>>  			ret = cs_etm__synth_instruction_sample(
>>>  				etmq, tidq, addr,
>>>  				etm->instructions_sample_period);
>>> @@ -1916,7 +1935,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
>>>  
>>>  	/* Handle start tracing packet */
>>>  	if (tidq->prev_packet->sample_type == CS_ETM_EMPTY)
>>> -		goto swap_packet;
>>> +		goto reset_last_br;
>>>  
>>>  	if (etmq->etm->synth_opts.last_branch &&
>>>  	    etmq->etm->synth_opts.instructions &&
>>> @@ -1952,8 +1971,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
>>>  			return err;
>>>  	}
>>>  
>>> -swap_packet:
>>> -	cs_etm__packet_swap(etm, tidq);
>>> +reset_last_br:
> 
> As said, if we consolidate cs_etm__flush() to process both the
> previous packet and the current packet, then we don't need to remove
> cs_etm__packet_swap() here, right?
Yes, I think so too.

Thanks,
Tanmay
> 
> Thanks,
> Leo
> 

Thread overview: 5+ messages
2025-03-27 11:11 [PATCH V3 0/2] Fix Coresight instruction synthesis logic Tanmay Jagdale
2025-03-27 11:11 ` [PATCH V3 1/2] perf: cs-etm: Fixes in instruction sample synthesis Tanmay Jagdale
2025-03-27 15:35   ` Leo Yan [this message]
2025-04-01 16:58     ` Tanmay Jagdale
2025-03-27 11:11 ` [PATCH V3 2/2] perf: cs-etm: Store previous timestamp in packet queue Tanmay Jagdale
