linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Josef Bacik <josef@toxicpanda.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: drop mmap_sem before calling balance_dirty_pages() in write fault
Date: Tue, 24 Sep 2019 10:48:09 -0700	[thread overview]
Message-ID: <20190924174809.GH1855@bombadil.infradead.org> (raw)
In-Reply-To: <20190924171518.26682-1-hannes@cmpxchg.org>

On Tue, Sep 24, 2019 at 01:15:18PM -0400, Johannes Weiner wrote:
> +static int fault_dirty_shared_page(struct vm_fault *vmf)

vm_fault_t, shirley?

> @@ -2239,16 +2241,36 @@ static void fault_dirty_shared_page(struct vm_area_struct *vma,
>  	mapping = page_rmapping(page);
>  	unlock_page(page);
>  
> +	if (!page_mkwrite)
> +		file_update_time(vma->vm_file);
> +
> +	/*
> +	 * Throttle page dirtying rate down to writeback speed.
> +	 *
> +	 * mapping may be NULL here because some device drivers do not
> +	 * set page.mapping but still dirty their pages
> +	 *
> +	 * Drop the mmap_sem before waiting on IO, if we can. The file
> +	 * is pinning the mapping, as per above.
> +	 */
>  	if ((dirtied || page_mkwrite) && mapping) {
> -		/*
> -		 * Some device drivers do not set page.mapping
> -		 * but still dirty their pages
> -		 */
> +		struct file *fpin = NULL;
> +
> +		if ((vmf->flags &
> +		     (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT)) ==
> +		    FAULT_FLAG_ALLOW_RETRY) {
> +			fpin = get_file(vma->vm_file);
> +			up_read(&vma->vm_mm->mmap_sem);
> +			ret = VM_FAULT_RETRY;
> +		}
> +
>  		balance_dirty_pages_ratelimited(mapping);
> +
> +		if (fpin)
> +			fput(fpin);
>  	}
>  
> -	if (!page_mkwrite)
> -		file_update_time(vma->vm_file);
> +	return ret;
>  }

I'm not a fan of moving file_update_time() to _before_ the
balance_dirty_pages call.  Also, this is now the third place that needs
maybe_unlock_mmap_for_io, see
https://lore.kernel.org/linux-mm/20190917120852.x6x3aypwvh573kfa@box/

How about:

+	struct file *fpin = NULL;

 	if ((dirtied || page_mkwrite) && mapping) {
		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
 		balance_dirty_pages_ratelimited(mapping);
	}
+
+	if (fpin) {
+		file_update_time(fpin);
+		fput(fpin);
+		return VM_FAULT_RETRY;
+	}

 	if (!page_mkwrite)
 		file_update_time(vma->vm_file);
+	return 0;
 }

>  /*
> @@ -2491,6 +2513,7 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf)
>  	__releases(vmf->ptl)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
> +	int ret = VM_FAULT_WRITE;

vm_fault_t again.

> @@ -3576,7 +3599,6 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>  static vm_fault_t do_fault(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
> -	struct mm_struct *vm_mm = vma->vm_mm;
>  	vm_fault_t ret;
>  
>  	/*
> @@ -3617,7 +3639,12 @@ static vm_fault_t do_fault(struct vm_fault *vmf)
>  
>  	/* preallocated pagetable is unused: free it */
>  	if (vmf->prealloc_pte) {
> -		pte_free(vm_mm, vmf->prealloc_pte);
> +		/*
> +		 * XXX: Accessing vma->vm_mm now is not safe. The page
> +		 * fault handler may have dropped the mmap_sem a long
> +		 * time ago. Only s390 derefs that parameter.
> +		 */
> +		pte_free(vma->vm_mm, vmf->prealloc_pte);

I'm confused.  This code looks like it was safe before (as it was caching
vma->vm_mm in a local variable), and now you've made it unsafe ... ?


  reply	other threads:[~2019-09-24 17:48 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-24 17:15 [PATCH] mm: drop mmap_sem before calling balance_dirty_pages() in write fault Johannes Weiner
2019-09-24 17:48 ` Matthew Wilcox [this message]
2019-09-24 19:42   ` Johannes Weiner
2019-09-24 20:46     ` Matthew Wilcox
2019-09-24 21:43       ` Johannes Weiner
2019-09-26 13:49         ` Kirill A. Shutemov
2019-09-26 18:50           ` Matthew Wilcox
2019-09-26 19:26           ` Johannes Weiner
2019-09-27  8:39             ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190924174809.GH1855@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).