Date: Mon, 6 Jan 2025 15:55:58 -0500
From: Peter Xu
To: Ackerley Tng
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, riel@surriel.com,
	leitao@debian.org, akpm@linux-foundation.org, muchun.song@linux.dev,
	osalvador@suse.de, roman.gushchin@linux.dev, nao.horiguchi@gmail.com
Subject: Re: [PATCH 4/7] mm/hugetlb: Clean up map/global resv accounting when allocate
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 06, 2025 at 02:48:12PM +0000, Ackerley Tng wrote:
> Peter Xu writes:
>
> > On Sat, Dec 28, 2024 at 12:06:34AM +0000, Ackerley Tng wrote:
> >> >
> >> > -	/* If this allocation is not consuming a reservation, charge it now.
> >> > +	/*
> >> > +	 * If this allocation is not consuming a per-vma reservation,
> >> > +	 * charge the hugetlb cgroup now.
> >> >  	 */
> >> > -	deferred_reserve = map_chg || cow_from_owner;
> >> > -	if (deferred_reserve) {
> >> > +	if (map_chg) {
> >> >  		ret = hugetlb_cgroup_charge_cgroup_rsvd(
> >> >  			idx, pages_per_huge_page(h), &h_cg);
> >>
> >> Should hugetlb_cgroup_charge_cgroup_rsvd() be called when map_chg == MAP_CHG_ENFORCED?
> >
> > This looks like a pretty niche use case, though I would say yes.
> >
> > I don't think I take a lot of consideration here when drafting the patch,
> > as the change here should have kept the old behavior: map_chg grows into
> > the tristate so that we can drop deferred_reserve, OTOH nothing should
> > change from such behavior of cgroup charging.
> >
> > When it happens, it means the owner process CoWed a private hugetlb folio
> > which will enforce bypassing the vma reservation. Here bypassing the vma
> > check makes sense to me, because the new to-be-cowed folio X will replace
> > another folio Y, which should have consumed the private vma resv at this
> > specific index. So there's no way the to-be-cowed folio X can have anything
> > to do with the vma reservation..
> >
> > Besides the vma reservation, I don't see why this folio allocation needs to
> > be any more special. IOW, it should still go through all rest checks and
> > fail the process properly if the check fails, that should include any form
> > of cgroups (either hugetlb or memcg), IMHO.
> >
> > Do you have any specific thought on this path?
>
> I re-read the code, and I hope this understanding is right:
>
> When a user sets "rsvd.max_usage_in_bytes" to X, the user is saying that
> within this cgroup, the maximum memory that can be reserved in the vma
> reservation is X.

Right, and the allocation may or may not attach to a vma reservation at all.
In this case it skips the vma reservation, but it will still need to be
accounted; there should be other similar cases where the vma resv doesn't
count, e.g. MAP_NORESERVE. For those, we do the reservation accounting only
at allocation time.

>
> Hence even when this CoW is performed, this should count towards the
> cgroup's "rsvd.max_usage_in_bytes" and so yes, it should be charged.
>
> I think I misunderstood the context on cgroup charging earlier and hence
> I thought it shouldn't be charged, but I agree with you after
> re-reading.

Thanks.
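
To make the charging rule discussed above concrete, here is a rough
userspace sketch, not the kernel code. It only models the two conditions
from the quoted diff. MAP_CHG_ENFORCED is the name used in this thread; the
other two state names, all function names, and the mapping from the old
inputs onto the tristate are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Userspace sketch only; not the kernel implementation.
 * MAP_CHG_ENFORCED appears in the thread. MAP_CHG_REUSE, MAP_CHG_NEEDED
 * and every function name below are assumed for illustration.
 */
enum map_chg_state {
	MAP_CHG_REUSE = 0,	/* an existing vma reservation is consumed */
	MAP_CHG_NEEDED = 1,	/* no vma reservation covers this index */
	MAP_CHG_ENFORCED = 2,	/* vma reservation bypassed (CoW by the owner) */
};

/* Old rule from the quoted diff: deferred_reserve = map_chg || cow_from_owner. */
static bool old_should_charge_rsvd(bool map_chg, bool cow_from_owner)
{
	return map_chg || cow_from_owner;
}

/* New rule from the quoted diff: if (map_chg), i.e. any nonzero state charges. */
static bool new_should_charge_rsvd(enum map_chg_state map_chg)
{
	return map_chg != MAP_CHG_REUSE;
}

/* Assumed mapping of the old boolean inputs onto the tristate. */
static enum map_chg_state to_tristate(bool map_chg, bool cow_from_owner)
{
	if (cow_from_owner)
		return MAP_CHG_ENFORCED;
	return map_chg ? MAP_CHG_NEEDED : MAP_CHG_REUSE;
}
```

Under these assumptions the old and new conditions agree for all four
combinations of the old inputs, and the ENFORCED (owner CoW) case is still
charged against the rsvd counter.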

I'll hold off for another 1-2 days, then I'll respin.

-- 
Peter Xu