From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-dl1-f74.google.com (mail-dl1-f74.google.com [74.125.82.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4BC7D3B6BF9 for ; Thu, 23 Apr 2026 16:11:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.74 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776960711; cv=none; b=u8ZHbza2izVzoGu4XCbUOr4sDEqusn3UQBwFQGrrbHPOVwsnsMsGbRJ5LNs5w3MjYpN3Sz8CX7cE/Fix2hhmBrAPY/1tCtGYkOdIkbR8RpKPJozi4z6upgmwCmI+/VGdRvDdyGF5IBUUzZlmWktLXd8Ro6nD899avQItZ3gQxo0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776960711; c=relaxed/simple; bh=EDdMWl1pNyTo9rDguhd3UHXjCzEsoqT1I2acDtFwOfo=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Oyw+S7I4hsd+09xTttVX7IOiHzERwxGXt4r4rpEEMXz/rDs0lPs0r/e7IsqAQLdoqHrHmVMBO+eAoRihuCERScNKtrT7qFg4oUJY11PbTGXIIn9LpYgzc+UNOPl+yu9HD6ptzUEeffaOJ01mAXsCg1Fl5EpgmHpmZluRWxJDtW8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=uxknnoht; arc=none smtp.client-ip=74.125.82.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="uxknnoht" Received: by mail-dl1-f74.google.com with SMTP id a92af1059eb24-12db218e265so2383032c88.0 for ; Thu, 23 Apr 2026 09:11:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1776960709; x=1777565509; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Xxs24a0xlOd8OFcipwO4jOIDyCfdhCvvWdSR1+KlfUE=; b=uxknnohtRDL85Yc4NKpz86a+bPiYetciA3knNFER+vynQJ6IIi6nh3inMBJVt8p+dV BGnP3kdWqy0B7YD1UBNZ/W7DQlI2r3UojCbgvS+omMJtm6bdroFwFblaGt2lfyzMAWkL QFJIkE+VchN/eLSUzmw9A5ojXnuirKnsEOEkMBDmH/FcVZMHJy8eJYPwup0p8tdkfLcH KeyF1dCMMzvFQJOueEAW4GDwW1DsBzy0SR55iC5tkfpBPWn8sdaSVR0pXhOBY+O6X7cH UeX9a2dlq0ZFKEN9phFg8Dgb6b0qWeASfDZJbcVMS+uKlE/mqvL3J47aw/Rg06m7UYql 1JLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776960709; x=1777565509; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Xxs24a0xlOd8OFcipwO4jOIDyCfdhCvvWdSR1+KlfUE=; b=bSujBOV/oZSaINex7Znmrg+ulVw4LazRrmzOWY1RJi4rnwcc9m8Db0mZgo8JEBVpsn gUIMmhX6za8iCtQtuhq9n5eR7Yw5ewhvbperpz7JFMRAwLu1urODbmbyVXhsGJtHNR9w yNA2pHePHfD/otRY5HRoS2VC21aPfcgFTvZA8uaEwPGjgSt7flNEdEUHVGg8kdsXNN6L ojmt6IzfSbVo4CljcOj06NDqrj2h4P/FeEjhgC/ffrhqsryyX/20e4N0F1C+UTBPgiOg 580thzjFTvwCzmBBCP+oukWkzo/MoNPrWyWqunejND3SSyVbeSW8+16xFhrLLBY/X9CV BpUA== X-Forwarded-Encrypted: i=1; AFNElJ8RAmyafELRD2CA0iifrRfTrrYuFX8XIGvpPozjdhJNKYaYG5ylUzYHXjFYZKPqRK8tIZMC5Lf0ileyZrU=@vger.kernel.org X-Gm-Message-State: AOJu0Yz0Y2zokQBdl6OUXHQaefBnZHUDHroH4fZD6DAKQfzAyEMW9DZn jsBFNJjp6wUm/sWGOPwHyt74wmMkrJrns9cmUWqNQL9/wU1IF8h/cgW4pdPqML85W+CGNjIn6Ur btSPLPmKUmg== X-Received: from dlbur6-n2.prod.google.com ([2002:a05:7022:ea46:20b0:12d:b32f:d690]) (user=irogers job=prod-delivery.src-stubby-dispatcher) by 2002:a05:7022:69a:b0:128:d6ed:e898 with SMTP id a92af1059eb24-12c73fa5903mr16065387c88.29.1776960709170; Thu, 23 Apr 2026 09:11:49 -0700 (PDT) Date: Thu, 23 Apr 2026 09:09:55 -0700 In-Reply-To: <20260423161006.1762700-1-irogers@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260423035526.1537178-1-irogers@google.com> <20260423161006.1762700-1-irogers@google.com> X-Mailer: git-send-email 2.54.0.rc2.533.g4f5dca5207-goog Message-ID: <20260423161006.1762700-49-irogers@google.com> Subject: [PATCH v3 48/58] perf rw-by-pid: Port rw-by-pid to use python module From: Ian Rogers To: irogers@google.com, acme@kernel.org, adrian.hunter@intel.com, james.clark@linaro.org, leo.yan@linux.dev, namhyung@kernel.org, tmricht@linux.ibm.com Cc: 9erthalion6@gmail.com, adityab1@linux.ibm.com, alexandre.chartre@oracle.com, alice.mei.rogers@gmail.com, ankur.a.arora@oracle.com, ashelat@redhat.com, atrajeev@linux.ibm.com, blakejones@google.com, changbin.du@huawei.com, chuck.lever@oracle.com, collin.funk1@gmail.com, coresight@lists.linaro.org, ctshao@google.com, dapeng1.mi@linux.intel.com, derek.foreman@collabora.com, dsterba@suse.com, gautam@linux.ibm.com, howardchu95@gmail.com, john.g.garry@oracle.com, jolsa@kernel.org, jonathan.cameron@huawei.com, justinstitt@google.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, mike.leach@arm.com, mingo@redhat.com, morbo@google.com, nathan@kernel.org, nichen@iscas.ac.cn, nick.desaulniers+lkml@gmail.com, pan.deng@intel.com, peterz@infradead.org, ravi.bangoria@amd.com, ricky.ringler@proton.me, stephen.s.brennan@oracle.com, sun.jian.kdev@gmail.com, suzuki.poulose@arm.com, swapnil.sapkal@amd.com, tanze@kylinos.cn, terrelln@fb.com, thomas.falcon@intel.com, tianyou.li@intel.com, tycho@kernel.org, wangyang.guo@intel.com, xiaqinxin@huawei.com, yang.lee@linux.alibaba.com, yuzhuo@google.com, zhiguo.zhou@intel.com, zli94@ncsu.edu Content-Type: text/plain; charset="UTF-8" Port the legacy Perl script rw-by-pid.pl to a python script using the perf module in tools/perf/python. The new script uses a class-based architecture and leverages the perf.session API for event processing. It tracks read and write activity by PID for all processes, aggregating bytes requested, bytes read, total reads, and errors. Complications: - Refactored process_event to extract helper methods (_handle_sys_enter_read, etc.) to reduce the number of branches and satisfy pylint. - Split long lines to comply with line length limits. - pylint warns about the module name not being snake_case, but it is kept for consistency with the original script name. Assisted-by: Gemini:gemini-3.1-pro-preview Signed-off-by: Ian Rogers --- v2: - Fixed Substring Matching: Replaced loose substring checks like if "sys_enter_read" in event_name: with exact matches against syscalls:sys_enter_read and raw_syscalls:sys_enter_read using sample.evsel.name . This prevents unrelated syscalls with similar names (like readahead ) from being incorrectly aggregated. Similar fixes were applied for exit events and write events. - Inlined Handlers and Tracked Errors: Inlined the _handle_sys_* helper methods into process_event() to make error handling easier. Now, if a sample lacks expected fields (raising AttributeError ), it is added to the self.unhandled tracker instead of being silently dropped, providing better visibility to the user. - Code Cleanup: Fixed trailing whitespace and added a pylint disable comment for too-many-branches caused by the inlining. --- tools/perf/python/rw-by-pid.py | 158 +++++++++++++++++++++++++++++++++ 1 file changed, 158 insertions(+) create mode 100755 tools/perf/python/rw-by-pid.py diff --git a/tools/perf/python/rw-by-pid.py b/tools/perf/python/rw-by-pid.py new file mode 100755 index 000000000000..53e16b373111 --- /dev/null +++ b/tools/perf/python/rw-by-pid.py @@ -0,0 +1,158 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0-only +"""Display r/w activity for all processes.""" + +import argparse +from collections import defaultdict +import sys +from typing import Optional, Dict, List, Tuple, Any +import perf + +class RwByPid: + """Tracks and displays read/write activity by PID.""" + def __init__(self) -> None: + self.reads: Dict[int, Dict[str, Any]] = defaultdict( + lambda: { + "bytes_requested": 0, + "bytes_read": 0, + "total_reads": 0, + "comm": "", + "errors": defaultdict(int), + } + ) + self.writes: Dict[int, Dict[str, Any]] = defaultdict( + lambda: { + "bytes_written": 0, + "total_writes": 0, + "comm": "", + "errors": defaultdict(int), + } + ) + self.unhandled: Dict[str, int] = defaultdict(int) + self.session: Optional[perf.session] = None + + def process_event(self, sample: perf.sample_event) -> None: # pylint: disable=too-many-branches + """Process events.""" + event_name = sample.evsel.name # type: ignore + pid = sample.sample_pid + + assert self.session is not None + try: + comm = self.session.process(pid).comm() + except Exception: # pylint: disable=broad-except + comm = "unknown" + + if event_name in ("syscalls:sys_enter_read", "raw_syscalls:sys_enter_read"): + try: + count = sample.count + self.reads[pid]["bytes_requested"] += count + self.reads[pid]["total_reads"] += 1 + self.reads[pid]["comm"] = comm + except AttributeError: + self.unhandled[event_name] += 1 + elif event_name in ("syscalls:sys_exit_read", "raw_syscalls:sys_exit_read"): + try: + ret = sample.ret + if ret > 0: + self.reads[pid]["bytes_read"] += ret + else: + self.reads[pid]["errors"][ret] += 1 + except AttributeError: + self.unhandled[event_name] += 1 + elif event_name in ("syscalls:sys_enter_write", "raw_syscalls:sys_enter_write"): + try: + count = sample.count + self.writes[pid]["bytes_written"] += count + self.writes[pid]["total_writes"] += 1 + self.writes[pid]["comm"] = comm + except AttributeError: + self.unhandled[event_name] += 1 + elif event_name in ("syscalls:sys_exit_write", "raw_syscalls:sys_exit_write"): + try: + ret = sample.ret + if ret <= 0: + self.writes[pid]["errors"][ret] += 1 + except AttributeError: + self.unhandled[event_name] += 1 + else: + self.unhandled[event_name] += 1 + + def print_totals(self) -> None: + """Print summary tables.""" + print("read counts by pid:\n") + print( + f"{'pid':>6s} {'comm':<20s} {'# reads':>10s} " + f"{'bytes_requested':>15s} {'bytes_read':>10s}" + ) + print(f"{'-'*6} {'-'*20} {'-'*10} {'-'*15} {'-'*10}") + + for pid, data in sorted(self.reads.items(), + key=lambda kv: kv[1]["bytes_read"], reverse=True): + print( + f"{pid:6d} {data['comm']:<20s} {data['total_reads']:10d} " + f"{data['bytes_requested']:15d} {data['bytes_read']:10d}" + ) + + print("\nfailed reads by pid:\n") + print(f"{'pid':>6s} {'comm':<20s} {'error #':>6s} {'# errors':>10s}") + print(f"{'-'*6} {'-'*20} {'-'*6} {'-'*10}") + + errcounts: List[Tuple[int, str, int, int]] = [] + for pid, data in self.reads.items(): + for error, count in data["errors"].items(): + errcounts.append((pid, data["comm"], error, count)) + + for pid, comm, error, count in sorted(errcounts, key=lambda x: x[3], reverse=True): + print(f"{pid:6d} {comm:<20s} {error:6d} {count:10d}") + + print("\nwrite counts by pid:\n") + print(f"{'pid':>6s} {'comm':<20s} {'# writes':>10s} {'bytes_written':>15s}") + print(f"{'-'*6} {'-'*20} {'-'*10} {'-'*15}") + + for pid, data in sorted(self.writes.items(), + key=lambda kv: kv[1]["bytes_written"], reverse=True): + print( + f"{pid:6d} {data['comm']:<20s} " + f"{data['total_writes']:10d} {data['bytes_written']:15d}" + ) + + print("\nfailed writes by pid:\n") + print(f"{'pid':>6s} {'comm':<20s} {'error #':>6s} {'# errors':>10s}") + print(f"{'-'*6} {'-'*20} {'-'*6} {'-'*10}") + + errcounts = [] + for pid, data in self.writes.items(): + for error, count in data["errors"].items(): + errcounts.append((pid, data["comm"], error, count)) + + for pid, comm, error, count in sorted(errcounts, key=lambda x: x[3], reverse=True): + print(f"{pid:6d} {comm:<20s} {error:6d} {count:10d}") + + if self.unhandled: + print("\nunhandled events:\n") + print(f"{'event':<40s} {'count':>10s}") + print(f"{'-'*40} {'-'*10}") + for event_name, count in self.unhandled.items(): + print(f"{event_name:<40s} {count:10d}") + + def run(self, input_file: str) -> None: + """Run the session.""" + self.session = perf.session(perf.data(input_file), sample=self.process_event) + self.session.process_events() + self.print_totals() + +def main() -> None: + """Main function.""" + parser = argparse.ArgumentParser(description="Trace r/w activity by PID") + parser.add_argument("-i", "--input", default="perf.data", help="Input file") + args = parser.parse_args() + + analyzer = RwByPid() + try: + analyzer.run(args.input) + except IOError as e: + print(e, file=sys.stderr) + sys.exit(1) + +if __name__ == "__main__": + main() -- 2.54.0.rc2.533.g4f5dca5207-goog