From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F86DDDBB; Sat, 1 Jun 2024 06:10:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.158.5 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717222227; cv=none; b=ByepZSph3KxAOrKqM0g4SSM3En7t0hetzd7KcvlTjcTL9W7lAI/JyrGeSDxT/vAwviYj9dmF3akaV4sAmap35fu9HTHlcb2h4C7WIrBmUMgnaMRNt05dsTIzkCWenZMxc7VraSa9o/2N7/t59mlY9tq6XRwOU4yHG4Aq+5Ldk/w= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717222227; c=relaxed/simple; bh=N0M4/MkjnxhgT4Bc3srykDwJUPq8yCDRrTuSL92Rgq4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=pdG3MaaYaU2cuva/cgZ+tzuxv8UywRO4ZY9O+AYUIwgpl8sYzUXqcaQyrnIVJk66PoBaOq3Ey9svEgB2TpI+37GkRILZfwTRqsPaASMLnwWFaMu2msS37umpdGP62qxfesHhWDlWHEw8vrKID2YK6fwts0fRlrg0X7qJFO51Cis= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.vnet.ibm.com; spf=none smtp.mailfrom=linux.vnet.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=Sa1+WP06; arc=none smtp.client-ip=148.163.158.5 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.vnet.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.vnet.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="Sa1+WP06" Received: from pps.filterd (m0353723.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 4515td6B001106; Sat, 1 Jun 2024 06:10:08 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc : content-transfer-encoding : date : from : in-reply-to : message-id : mime-version : references : subject : to; s=pp1; bh=flTmuPpkS21F/LStRYU2ERHgpHEfWDn00gvA5pZl7kg=; b=Sa1+WP06ixI8hRgZA9sL25yP9ie61wedmzO2Bid9/mhXgDkILKgn0209YDjbQ6tQ9KZH 5SwMf6xiN+OMva9BB1oGU3Gy3BMqWpvgiJs1Tw/xsNeHqemJfQEuleJ2Z62v19Lk4yOA m01Fv/fvEncdrPphpJ+fadwuLYNi9/1E1OYEdP1ig+lidikHUyix0tCNtvgxaaG1YRKL YcCViuzpIhGSQqy9vgnvcs86vHguKRfWTBe8LcAQ9oK7RALETxjdNzOhD22QmrIxWmpD lqwovh5aok9AynHxhGvJUPNy5YdXHoJYUbh5elpOKCMnPMpacRWOhMlphM0Yqms5dM6j Jg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yfw56g3n9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sat, 01 Jun 2024 06:10:08 +0000 Received: from m0353723.ppops.net (m0353723.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 4516A7fl027950; Sat, 1 Jun 2024 06:10:07 GMT Received: from ppma12.dal12v.mail.ibm.com (dc.9e.1632.ip4.static.sl-reverse.com [50.22.158.220]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3yfw56g3n7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sat, 01 Jun 2024 06:10:07 +0000 Received: from pps.filterd (ppma12.dal12v.mail.ibm.com [127.0.0.1]) by ppma12.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 4514JCO1022532; Sat, 1 Jun 2024 06:10:06 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma12.dal12v.mail.ibm.com (PPS) with ESMTPS id 3yfv180jxe-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sat, 01 Jun 2024 06:10:06 +0000 Received: from smtpav03.fra02v.mail.ibm.com (smtpav03.fra02v.mail.ibm.com [10.20.54.102]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 4516A1VC56295788 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sat, 1 Jun 2024 06:10:03 GMT Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5C42C20040; Sat, 1 Jun 2024 06:10:01 +0000 (GMT) Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 596DF20043; Sat, 1 Jun 2024 06:09:58 +0000 (GMT) Received: from localhost.localdomain (unknown [9.43.41.43]) by smtpav03.fra02v.mail.ibm.com (Postfix) with ESMTP; Sat, 1 Jun 2024 06:09:58 +0000 (GMT) From: Athira Rajeev To: acme@kernel.org, jolsa@kernel.org, adrian.hunter@intel.com, irogers@google.com, namhyung@kernel.org, segher@kernel.crashing.org, christophe.leroy@csgroup.eu Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, akanksha@linux.ibm.com, maddy@linux.ibm.com, atrajeev@linux.vnet.ibm.com, kjain@linux.ibm.com, disgoel@linux.vnet.ibm.com Subject: [PATCH V3 03/14] tools/perf: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility Date: Sat, 1 Jun 2024 11:39:30 +0530 Message-Id: <20240601060941.13692-4-atrajeev@linux.vnet.ibm.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20240601060941.13692-1-atrajeev@linux.vnet.ibm.com> References: <20240601060941.13692-1-atrajeev@linux.vnet.ibm.com> Precedence: bulk X-Mailing-List: linux-perf-users@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: xOYi-esjCos4aw8Nm4vARECxk6_L5j8X X-Proofpoint-ORIG-GUID: QmEfPaGs6EMKOt-r_JEE5J3AAGdWJ7H2 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.650,FMLib:17.12.28.16 definitions=2024-06-01_01,2024-05-30_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 priorityscore=1501 adultscore=0 malwarescore=0 phishscore=0 clxscore=1015 spamscore=0 impostorscore=0 mlxlogscore=999 lowpriorityscore=0 suspectscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2405010000 definitions=main-2406010045 Add support to capture and parse raw instruction in powerpc. Currently, the perf tool infrastructure uses two ways to disassemble and understand the instruction. One is objdump and other option is via libcapstone. Currently, the perf tool infrastructure uses "--no-show-raw-insn" option with "objdump" while disassemble. Example from powerpc with this option for an instruction address is: Snippet from: objdump --start-address=
--stop-address=
-d --no-show-raw-insn -C c0000000010224b4: lwz r10,0(r9) This line "lwz r10,0(r9)" is parsed to extract instruction name, registers names and offset. Also to find whether there is a memory reference in the operands, "memory_ref_char" field of objdump is used. For x86, "(" is used as memory_ref_char to tackle instructions of the form "mov (%rax), %rcx". In case of powerpc, not all instructions using "(" are the only memory instructions. Example, above instruction can also be of extended form (X form) "lwzx r10,0,r19". Inorder to easy identify the instruction category and extract the source/target registers, patch adds support to use raw instruction for powerpc. Approach used is to read the raw instruction directly from the DSO file using "dso__data_read_offset" utility which is already implemented in perf infrastructure in "util/dso.c". Example: 38 01 81 e8 ld r4,312(r1) Here "38 01 81 e8" is the raw instruction representation. In powerpc, this translates to instruction form: "ld RT,DS(RA)" and binary code as: _____________________________________ | 58 | RT | RA | DS | | ------------------------------------- 0 6 11 16 30 31 Function "symbol__disassemble_dso" is updated to read raw instruction directly from DSO using dso__data_read_offset utility. In case of above example, this captures: line: 38 01 81 e8 Signed-off-by: Athira Rajeev --- tools/perf/util/disasm.c | 98 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 98 insertions(+) diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c index b5fe3a7508bb..89a9e4136c09 100644 --- a/tools/perf/util/disasm.c +++ b/tools/perf/util/disasm.c @@ -1586,6 +1586,91 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym, } #endif +static int symbol__disassemble_dso(char *filename, struct symbol *sym, + struct annotate_args *args) +{ + struct annotation *notes = symbol__annotation(sym); + struct map *map = args->ms.map; + struct dso *dso = map__dso(map); + u64 start = map__rip_2objdump(map, sym->start); + u64 end = map__rip_2objdump(map, sym->end); + u64 len = end - start; + u64 offset; + int i, count; + u8 *buf = NULL; + char disasm_buf[512]; + struct disasm_line *dl; + u32 *line; + + /* Return if objdump is specified explicitly */ + if (args->options->objdump_path) + return -1; + + pr_debug("Reading raw instruction from : %s using dso__data_read_offset\n", filename); + + buf = malloc(len); + if (buf == NULL) + goto err; + + count = dso__data_read_offset(dso, NULL, sym->start, buf, len); + + line = (u32 *)buf; + + if ((u64)count != len) + goto err; + + /* add the function address and name */ + scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:", + start, sym->name); + + args->offset = -1; + args->line = disasm_buf; + args->line_nr = 0; + args->fileloc = NULL; + args->ms.sym = sym; + + dl = disasm_line__new(args); + if (dl == NULL) + goto err; + + annotation_line__add(&dl->al, ¬es->src->source); + + /* Each raw instruction is 4 byte */ + count = len/4; + + for (i = 0, offset = 0; i < count; i++) { + args->offset = offset; + sprintf(args->line, "%x", line[i]); + dl = disasm_line__new(args); + if (dl == NULL) + goto err; + + annotation_line__add(&dl->al, ¬es->src->source); + offset += 4; + } + + /* It failed in the middle */ + if (offset != len) { + struct list_head *list = ¬es->src->source; + + /* Discard all lines and fallback to objdump */ + while (!list_empty(list)) { + dl = list_first_entry(list, struct disasm_line, al.node); + + list_del_init(&dl->al.node); + disasm_line__free(dl); + } + count = -1; + } + +out: + free(buf); + return count < 0 ? count : 0; + +err: + count = -1; + goto out; +} /* * Possibly create a new version of line with tabs expanded. Returns the * existing or new line, storage is updated if a new line is allocated. If @@ -1710,6 +1795,19 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args) strcpy(symfs_filename, tmp); } + /* + * For powerpc data type profiling, use the dso__data_read_offset + * to read raw instruction directly and interpret the binary code + * to understand instructions and register fields. For sort keys as + * type and typeoff, disassemble to mnemonic notation is + * not required in case of powerpc. + */ + if (arch__is(args->arch, "powerpc")) { + err = symbol__disassemble_dso(symfs_filename, sym, args); + if (err == 0) + goto out_remove_tmp; + } + #ifdef HAVE_LIBCAPSTONE_SUPPORT err = symbol__disassemble_capstone(symfs_filename, sym, args); if (err == 0) -- 2.43.0