From: SeongJae Park <sj@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, viro@zeniv.linux.org.uk,
	brauner@kernel.org, jack@suse.cz, dchinner@redhat.com,
	casey@schaufler-ca.com, ben.wolsieffer@hefring.com,
	paulmck@kernel.org, david@redhat.com, avagin@google.com,
	usama.anjum@collabora.com, peterx@redhat.com, hughd@google.com,
	ryan.roberts@arm.com, wangkefeng.wang@huawei.com,
	Liam.Howlett@Oracle.com, yuzhao@google.com,
	axelrasmussen@google.com, lstoakes@gmail.com,
	talumbau@google.com, willy@infradead.org, vbabka@suse.cz,
	mgorman@techsingularity.net, jhubbard@nvidia.com,
	vishal.moola@gmail.com, mathieu.desnoyers@efficios.com,
	dhowells@redhat.com, jgg@ziepe.ca, sidhartha.kumar@oracle.com,
	andriy.shevchenko@linux.intel.com, yangxingui@huawei.com,
	keescook@chromium.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	kernel-team@android.com
Subject: Re: [PATCH 3/3] mm/maps: read proc/pid/maps under RCU
Date: Mon, 22 Jan 2024 21:36:29 -0800
Message-ID: <20240123053629.365673-1-sj@kernel.org>
In-Reply-To: <20240122071324.2099712-3-surenb@google.com>

Hi Suren,

On Sun, 21 Jan 2024 23:13:24 -0800 Suren Baghdasaryan <surenb@google.com> wrote:

> With maple_tree supporting vma tree traversal under RCU and per-vma locks
> making vma access RCU-safe, /proc/pid/maps can be read under RCU and
> without the need to read-lock mmap_lock. However vma content can change
> from under us, therefore we make a copy of the vma and we pin pointer
> fields used when generating the output (currently only vm_file and
> anon_name). Afterwards we check for concurrent address space
> modifications, wait for them to end and retry. That last check is needed
> to avoid possibility of missing a vma during concurrent maple_tree
> node replacement, which might report a NULL when a vma is replaced
> with another one. While we take the mmap_lock for reading during such
> contention, we do that momentarily only to record new mm_wr_seq counter.
> This change is designed to reduce mmap_lock contention and prevent a
> process reading /proc/pid/maps files (often a low priority task, such as
> monitoring/data collection services) from blocking address space updates.
> 
> Note that this change has a userspace visible disadvantage: it allows for
> sub-page data tearing as opposed to the previous mechanism where data
> tearing could happen only between pages of generated output data.
> Since current userspace considers data tearing between pages to be
> acceptable, we assume is will be able to handle sub-page data tearing
> as well.
> 
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  fs/proc/internal.h |   2 +
>  fs/proc/task_mmu.c | 114 ++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 109 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index a71ac5379584..e0247225bb68 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -290,6 +290,8 @@ struct proc_maps_private {
>  	struct task_struct *task;
>  	struct mm_struct *mm;
>  	struct vma_iterator iter;
> +	unsigned long mm_wr_seq;
> +	struct vm_area_struct vma_copy;
>  #ifdef CONFIG_NUMA
>  	struct mempolicy *task_mempolicy;
>  #endif
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 3f78ebbb795f..3886d04afc01 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -126,11 +126,96 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
>  }
>  #endif
>  
> -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
> -						loff_t *ppos)
> +#ifdef CONFIG_PER_VMA_LOCK
> +
> +static const struct seq_operations proc_pid_maps_op;
> +/*
> + * Take VMA snapshot and pin vm_file and anon_name as they are used by
> + * show_map_vma.
> + */
> +static int get_vma_snapshow(struct proc_maps_private *priv, struct vm_area_struct *vma)
>  {
> +	struct vm_area_struct *copy = &priv->vma_copy;
> +	int ret = -EAGAIN;
> +
> +	memcpy(copy, vma, sizeof(*vma));
> +	if (copy->vm_file && !get_file_rcu(&copy->vm_file))
> +		goto out;
> +
> +	if (copy->anon_name && !anon_vma_name_get_rcu(copy))
> +		goto put_file;

With today's updated mm-unstable, which contains this patch, I'm getting the
build errors below when CONFIG_ANON_VMA_NAME is not set.  It seems this patch
needs to handle that case?

    .../linux/fs/proc/task_mmu.c: In function ‘get_vma_snapshow’:
    .../linux/fs/proc/task_mmu.c:145:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
      145 |         if (copy->anon_name && !anon_vma_name_get_rcu(copy))
          |                   ^~~~~~~~~
          |                   anon_vma
    .../linux/fs/proc/task_mmu.c:161:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
      161 |         if (copy->anon_name)
          |                   ^~~~~~~~~
          |                   anon_vma
    .../linux/fs/proc/task_mmu.c:162:41: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
      162 |                 anon_vma_name_put(copy->anon_name);
          |                                         ^~~~~~~~~
          |                                         anon_vma
    .../linux/fs/proc/task_mmu.c: In function ‘put_vma_snapshot’:
    .../linux/fs/proc/task_mmu.c:174:18: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
      174 |         if (vma->anon_name)
          |                  ^~~~~~~~~
          |                  anon_vma
    .../linux/fs/proc/task_mmu.c:175:40: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
      175 |                 anon_vma_name_put(vma->anon_name);
          |                                        ^~~~~~~~~
          |                                        anon_vma

[...]
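
One way the unset-config case could be handled, shown as a minimal userspace
sketch and not as the actual fix: keep every anon_name access behind helpers
that compile to no-ops when CONFIG_ANON_VMA_NAME is unset.  The struct layout,
the plain integer refcount, and the snapshot_pin_anon_name() /
snapshot_unpin_anon_name() names are hypothetical mock-ups for illustration
only; they are not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Remove this define to model a build with CONFIG_ANON_VMA_NAME unset. */
#define CONFIG_ANON_VMA_NAME 1

/* Mock of the refcounted name object (assumption: plain int refcount). */
struct anon_vma_name {
	int refcount;
};

/* Mock VMA: the anon_name field only exists when the option is enabled. */
struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
#ifdef CONFIG_ANON_VMA_NAME
	struct anon_vma_name *anon_name;
#endif
};

#ifdef CONFIG_ANON_VMA_NAME
/* Pin the name referenced by the snapshot; returns 1 on success. */
static int snapshot_pin_anon_name(struct vm_area_struct *copy)
{
	if (copy->anon_name)
		copy->anon_name->refcount++;
	return 1;
}

/* Drop the reference taken by snapshot_pin_anon_name(). */
static void snapshot_unpin_anon_name(struct vm_area_struct *copy)
{
	if (copy->anon_name) {
		assert(copy->anon_name->refcount > 0);
		copy->anon_name->refcount--;
	}
}
#else
/* Nothing to pin: the field does not exist in this configuration. */
static int snapshot_pin_anon_name(struct vm_area_struct *copy)
{
	(void)copy;
	return 1;
}

static void snapshot_unpin_anon_name(struct vm_area_struct *copy)
{
	(void)copy;
}
#endif
```

With the define removed, the struct loses the field and both helpers still
compile, so the snapshot and release paths would need no #ifdef of their own.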


Thanks,
SJ
