public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: Kalesh Singh <kaleshsingh@google.com>
Cc: ilkos@google.com, tjmercier@google.com, surenb@google.com,
	kernel-team@android.com, Jonathan Corbet <corbet@lwn.net>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>,
	Kees Cook <keescook@chromium.org>,
	Mike Rapoport <rppt@kernel.org>, Colin Cross <ccross@google.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org
Subject: Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
Date: Fri, 20 May 2022 08:29:18 +0200	[thread overview]
Message-ID: <4e35dc30-1157-50b3-e3b6-954481a0524d@amd.com> (raw)
In-Reply-To: <20220519214021.3572840-1-kaleshsingh@google.com>

Am 19.05.22 um 23:40 schrieb Kalesh Singh:
> Processes can pin shared memory by keeping a handle to it through a
> file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).
>
> In the case of a memory leak, to identify the process pinning the
> memory, userspace needs to:
>    - Iterate the /proc/<pid>/fd/* for each process
>    - Do a readlink on each entry to identify the type of memory from
>      the file path.
>    - stat() each entry to get the size of the memory.
>
> The file permissions on /proc/<pid>/fd/* only allows for the owner
> or root to perform the operations above; and so is not suitable for
> capturing the system-wide state in a production environment.
>
> This issue was addressed for dmabufs by making /proc/*/fdinfo/*
> accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
> To allow the same kind of tracking for other types of shared memory,
> add the following fields to /proc/<pid>/fdinfo/<fd>:
>
> path - This allows identifying the type of memory based on common
>         prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."
>
>         This was not an issued when dmabuf tracking was introduced
>         because the exp_name field of dmabuf fdinfo could be used
>         to distinguish dmabuf fds from other types.
>
> size - To track the amount of memory that is being pinned.
>
>         dmabufs expose size as an additional field in fdinfo. Remove
>         this and make it a common field for all fds.
>
> Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
> -- the same as for /proc/<pid>/maps which also exposes the path and
> size for mapped memory regions.
>
> This allows for a system process with PTRACE_MODE_READ_FSCREDS to
> account the pinned per-process memory via fdinfo.

I think this should be split into two patches, one adding the size and 
one adding the path.

Adding the size is completely unproblematic, but the path might raise 
some eyebrows.

>
> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20210308170651.919148-1-kaleshsingh%40google.com%2F&amp;data=05%7C01%7Cchristian.koenig%40amd.com%7C95ee7bf71c2c4aa342fa08da39e03398%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637885932392014544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=kf%2B2es12hV3z5zjOFhx3EyxI1XEMeHexqTLNpNoDhAY%3D&amp;reserved=0
>
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> ---
>   Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
>   drivers/dma-buf/dma-buf.c          |  1 -
>   fs/proc/fd.c                       |  9 +++++++--
>   3 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d9..ad66d78aca51 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1922,13 +1922,16 @@ if precise results are needed.
>   3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
>   ---------------------------------------------------------------
>   This file provides information associated with an opened file. The regular
> -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> +and 'path'.
> +
>   The 'pos' represents the current offset of the opened file in decimal
>   form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
>   file has been created with [see open(2) for details] and 'mnt_id' represents
>   mount ID of the file system containing the opened file [see 3.5
>   /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> -the file.
> +the file, 'size' represents the size of the file in bytes, and 'path'
> +represents the file path.
>   
>   A typical output is::
>   
> @@ -1936,6 +1939,8 @@ A typical output is::
>   	flags:	0100002
>   	mnt_id:	19
>   	ino:	63107
> +        size:   0
> +        path:   /dev/null
>   
>   All locks associated with a file descriptor are shown in its fdinfo too::
>   
> @@ -1953,6 +1958,8 @@ Eventfd files
>   	flags:	04002
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventfd]
>   	eventfd-count:	5a
>   
>   where 'eventfd-count' is hex value of a counter.
> @@ -1966,6 +1973,8 @@ Signalfd files
>   	flags:	04002
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[signalfd]
>   	sigmask:	0000000000000200
>   
>   where 'sigmask' is hex value of the signal mask associated
> @@ -1980,6 +1989,8 @@ Epoll files
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventpoll]
>   	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
>   
>   where 'tfd' is a target file descriptor number in decimal form,
> @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
>   	flags:	02000000
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:inotify
>   	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
>   
>   where 'wd' is a watch descriptor in decimal form, i.e. a target file
> @@ -2021,6 +2034,8 @@ For fanotify files the format is::
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[fanotify]
>   	fanotify flags:10 event-flags:0
>   	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
>   	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> @@ -2046,6 +2061,8 @@ Timerfd files
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[timerfd]
>   	clockid: 0
>   	ticks: 0
>   	settime flags: 01
> @@ -2070,6 +2087,7 @@ DMA Buffer files
>   	mnt_id:	9
>   	ino:	63107
>   	size:   32768
> +        path:   /dmabuf:
>   	count:  2
>   	exp_name:  system-heap
>   
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index b1e25ae98302..d61183ff3c30 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
>   {
>   	struct dma_buf *dmabuf = file->private_data;
>   
> -	seq_printf(m, "size:\t%zu\n", dmabuf->size);
>   	/* Don't count the temporary reference taken inside procfs seq_show */
>   	seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
>   	seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
> diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> index 913bef0d2a36..a8a968bc58f0 100644
> --- a/fs/proc/fd.c
> +++ b/fs/proc/fd.c
> @@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
>   	if (ret)
>   		return ret;
>   
> -	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
> +	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
>   		   (long long)file->f_pos, f_flags,
>   		   real_mount(file->f_path.mnt)->mnt_id,
> -		   file_inode(file)->i_ino);
> +		   file_inode(file)->i_ino,
> +		   file_inode(file)->i_size);

We might consider splitting this into multiple seq_printf calls, one for 
each printed attribute.

It becomes a bit unreadable and the minimal additional overhead 
shouldn't matter that much.

Regards,
Christian.

> +
> +	seq_puts(m, "path:\t");
> +	seq_file_path(m, file, "\n");
> +	seq_putc(m, '\n');
>   
>   	/* show_fd_locks() never deferences files so a stale value is safe */
>   	show_fd_locks(m, file, files);
>
> base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5


  parent reply	other threads:[~2022-05-20  6:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-19 21:40 [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo Kalesh Singh
2022-05-19 21:46 ` Randy Dunlap
2022-05-19 21:53   ` Kalesh Singh
2022-05-20  1:49 ` Kees Cook
2022-05-20  4:01   ` Kalesh Singh
2022-05-20  6:29 ` Christian König [this message]
2022-05-20 16:12   ` Kalesh Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4e35dc30-1157-50b3-e3b6-954481a0524d@amd.com \
    --to=christian.koenig@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=ccross@google.com \
    --cc=corbet@lwn.net \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ilkos@google.com \
    --cc=kaleshsingh@google.com \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=mail@christoph.anton.mitterer.name \
    --cc=rdunlap@infradead.org \
    --cc=rppt@kernel.org \
    --cc=sumit.semwal@linaro.org \
    --cc=surenb@google.com \
    --cc=tjmercier@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox