All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: James Houghton <jthoughton@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yang Shi <shy828301@gmail.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] hugetlbfs: don't delete error page from pagecache
Date: Wed, 19 Oct 2022 17:02:02 -0700	[thread overview]
Message-ID: <Y1CP+jEO+cIEOejb@monkey> (raw)
In-Reply-To: <20221018200125.848471-1-jthoughton@google.com>

On 10/18/22 20:01, James Houghton wrote:
> This change is very similar to the change that was made for shmem [1],
> and it solves the same problem but for HugeTLBFS instead.
> 
> Currently, when poison is found in a HugeTLB page, the page is removed
> from the page cache. That means that attempting to map or read that
> hugepage in the future will result in a new hugepage being allocated
> instead of notifying the user that the page was poisoned. As [1] states,
> this is effectively memory corruption.
> 
> The fix is to leave the page in the page cache. If the user attempts to
> use a poisoned HugeTLB page with a syscall, the syscall will fail with
> EIO, the same error code that shmem uses. For attempts to map the page,
> the thread will get a BUS_MCEERR_AR SIGBUS.
> 
> [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> 
> Signed-off-by: James Houghton <jthoughton@google.com>
> ---
>  fs/hugetlbfs/inode.c | 13 ++++++-------
>  mm/hugetlb.c         |  4 ++++
>  mm/memory-failure.c  |  5 ++++-
>  3 files changed, 14 insertions(+), 8 deletions(-)

Thanks James!

Code looks correct.  One observation below, but I do not suggest changing.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index fef5165b73a5..7f836f8f9db1 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -328,6 +328,12 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  		} else {
>  			unlock_page(page);
>  
> +			if (PageHWPoison(page)) {
> +				put_page(page);
> +				retval = -EIO;
> +				break;
> +			}
> +
>  			/*
>  			 * We have the page, copy it to user space buffer.
>  			 */
> @@ -1111,13 +1117,6 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
>  static int hugetlbfs_error_remove_page(struct address_space *mapping,
>  				struct page *page)
>  {
> -	struct inode *inode = mapping->host;
> -	pgoff_t index = page->index;
> -
> -	hugetlb_delete_from_page_cache(page_folio(page));
> -	if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> -		hugetlb_fix_reserve_counts(inode);
> -
>  	return 0;
>  }
>  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 97896165fd3f..5120a9ccbf5b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6101,6 +6101,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  
>  	ptl = huge_pte_lock(h, dst_mm, dst_pte);
>  
> +	ret = -EIO;
> +	if (PageHWPoison(page))
> +		goto out_release_unlock;
> +
>  	/*
>  	 * We allow to overwrite a pte marker: consider when both MISSING|WP
>  	 * registered, we firstly wr-protect a none pte which has no page cache
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 145bb561ddb3..bead6bccc7f2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1080,6 +1080,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>  	int res;
>  	struct page *hpage = compound_head(p);
>  	struct address_space *mapping;
> +	bool extra_pins = false;
>  
>  	if (!PageHuge(hpage))
>  		return MF_DELAYED;
> @@ -1087,6 +1088,8 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>  	mapping = page_mapping(hpage);
>  	if (mapping) {
>  		res = truncate_error_page(hpage, page_to_pfn(p), mapping);
> +		/* The page is kept in page cache. */

That looks a bit silly as we are know truncate_error_page() is a noop and call
it anyway.  I suppose we could just set 'res = MF_RECOVERED'.  However, it
really should be left as is in case more non-hugetlb functionality is added to
truncate_error_page() in the future.
-- 
Mike Kravetz

> +		extra_pins = true;
>  		unlock_page(hpage);
>  	} else {
>  		unlock_page(hpage);
> @@ -1104,7 +1107,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>  		}
>  	}
>  
> -	if (has_extra_refcount(ps, p, false))
> +	if (has_extra_refcount(ps, p, extra_pins))
>  		res = MF_FAILED;
>  
>  	return res;
> -- 
> 2.38.0.413.g74048e4d9e-goog
> 


  parent reply	other threads:[~2022-10-20  0:02 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-18 20:01 [PATCH] hugetlbfs: don't delete error page from pagecache James Houghton
2022-10-19 18:31 ` Yang Shi
2022-10-19 18:42   ` James Houghton
2022-10-20 18:27     ` Yang Shi
2022-10-19 18:55   ` Mike Kravetz
2022-10-20 18:42     ` Yang Shi
2022-10-20  0:02 ` Mike Kravetz [this message]
2022-11-04  2:10 ` HORIGUCHI NAOYA(堀口 直也)
2022-11-04 13:27   ` Mike Kravetz
2022-11-07  4:25     ` HORIGUCHI NAOYA(堀口 直也)
2022-11-07 16:55       ` Mike Kravetz
2022-11-07 19:41         ` Andrew Morton
2022-11-07 18:50       ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1CP+jEO+cIEOejb@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=jthoughton@google.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=shy828301@gmail.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.