All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Mina Almasry <almasrymina@google.com>,
	Michal Privoznik <mprivozn@redhat.com>,
	Michal Hocko <mhocko@kernel.org>,
	Muchun Song <songmuchun@bytedance.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	stable@vger.kernel.org
Subject: Re: [PATCH] hugetlb_cgroup: fix reservation accounting
Date: Thu, 22 Oct 2020 08:21:07 -0400	[thread overview]
Message-ID: <20201022081538-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20201021204426.36069-1-mike.kravetz@oracle.com>

On Wed, Oct 21, 2020 at 01:44:26PM -0700, Mike Kravetz wrote:
> Michal Privoznik was using "free page reporting" in QEMU/virtio-balloon
> with hugetlbfs and hit the warning below.  QEMU with free page hinting
> uses fallocate(FALLOC_FL_PUNCH_HOLE) to discard pages that are reported
> as free by a VM. The reporting granularity is in pageblock granularity.
> So when the guest reports 2M chunks, we fallocate(FALLOC_FL_PUNCH_HOLE)
> one huge page in QEMU.
> 
> [  315.251417] ------------[ cut here ]------------
> [  315.251424] WARNING: CPU: 7 PID: 6636 at mm/page_counter.c:57 page_counter_uncharge+0x4b/0x50
> [  315.251425] Modules linked in: ...
> [  315.251466] CPU: 7 PID: 6636 Comm: qemu-system-x86 Not tainted 5.9.0 #137
> [  315.251467] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO/X570 AORUS PRO, BIOS F21 07/31/2020
> [  315.251469] RIP: 0010:page_counter_uncharge+0x4b/0x50
> ...
> [  315.251479] Call Trace:
> [  315.251485]  hugetlb_cgroup_uncharge_file_region+0x4b/0x80
> [  315.251487]  region_del+0x1d3/0x300
> [  315.251489]  hugetlb_unreserve_pages+0x39/0xb0
> [  315.251492]  remove_inode_hugepages+0x1a8/0x3d0
> [  315.251495]  ? tlb_finish_mmu+0x7a/0x1d0
> [  315.251497]  hugetlbfs_fallocate+0x3c4/0x5c0
> [  315.251519]  ? kvm_arch_vcpu_ioctl_run+0x614/0x1700 [kvm]
> [  315.251522]  ? file_has_perm+0xa2/0xb0
> [  315.251524]  ? inode_security+0xc/0x60
> [  315.251525]  ? selinux_file_permission+0x4e/0x120
> [  315.251527]  vfs_fallocate+0x146/0x290
> [  315.251529]  __x64_sys_fallocate+0x3e/0x70
> [  315.251531]  do_syscall_64+0x33/0x40
> [  315.251533]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ...
> [  315.251542] ---[ end trace 4c88c62ccb1349c9 ]---
> 
> Investigation of the issue uncovered bugs in hugetlb cgroup reservation
> accounting.  This patch addresses the found issues.
> 
> Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings")
> Cc: <stable@vger.kernel.org>
> Reported-by: Michal Privoznik <mprivozn@redhat.com>
> Co-developed-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  mm/hugetlb.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 67fc6383995b..b853a11de14f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -655,6 +655,8 @@ static long region_del(struct resv_map *resv, long f, long t)
>  			}
>  
>  			del += t - f;
> +			hugetlb_cgroup_uncharge_file_region(
> +				resv, rg, t - f);
>  
>  			/* New entry for end of split region */
>  			nrg->from = t;
> @@ -667,9 +669,6 @@ static long region_del(struct resv_map *resv, long f, long t)
>  			/* Original entry is trimmed */
>  			rg->to = f;
>  
> -			hugetlb_cgroup_uncharge_file_region(
> -				resv, rg, nrg->to - nrg->from);
> -
>  			list_add(&nrg->link, &rg->link);
>  			nrg = NULL;
>  			break;
> @@ -685,17 +684,17 @@ static long region_del(struct resv_map *resv, long f, long t)
>  		}
>  
>  		if (f <= rg->from) {	/* Trim beginning of region */
> -			del += t - rg->from;
> -			rg->from = t;
> -
>  			hugetlb_cgroup_uncharge_file_region(resv, rg,
>  							    t - rg->from);
> -		} else {		/* Trim end of region */
> -			del += rg->to - f;
> -			rg->to = f;
>  
> +			del += t - rg->from;
> +			rg->from = t;
> +		} else {		/* Trim end of region */
>  			hugetlb_cgroup_uncharge_file_region(resv, rg,
>  							    rg->to - f);
> +
> +			del += rg->to - f;
> +			rg->to = f;
>  		}
>  	}
>  
> @@ -2454,6 +2453,9 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
>  
>  		rsv_adjust = hugepage_subpool_put_pages(spool, 1);
>  		hugetlb_acct_memory(h, -rsv_adjust);
> +		if (deferred_reserve)
> +			hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
> +					pages_per_huge_page(h), page);
>  	}
>  	return page;
>  
> -- 
> 2.25.4



  parent reply	other threads:[~2020-10-22 12:21 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-21 20:44 [PATCH] hugetlb_cgroup: fix reservation accounting Mike Kravetz
2020-10-22  7:42 ` Michal Privoznik
2020-10-22 12:21 ` Michael S. Tsirkin [this message]
2020-10-28 19:37   ` Mina Almasry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201022081538-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=mprivozn@redhat.com \
    --cc=songmuchun@bytedance.com \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.