Date: Fri, 28 Apr 2023 11:29:36 +0100
From: Mel Gorman
To: Johannes Weiner
Cc: linux-mm@kvack.org, Kaiyang Zhao, Vlastimil Babka, David Rientjes, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [RFC PATCH 05/26] mm: page_alloc: per-migratetype pcplist for THPs
Message-ID: <20230428102936.7qsaskyjkpiyapgq@techsingularity.net>
References: <20230418191313.268131-1-hannes@cmpxchg.org> <20230418191313.268131-6-hannes@cmpxchg.org> <20230421124744.skrxvziwg3bx7rgt@techsingularity.net> <20230421150648.GB320347@cmpxchg.org>
In-Reply-To: <20230421150648.GB320347@cmpxchg.org>

On Fri, Apr 21, 2023 at 11:06:48AM -0400, Johannes Weiner wrote:
> On Fri, Apr 21, 2023 at 01:47:44PM +0100, Mel Gorman wrote:
> > On Tue, Apr 18, 2023 at 03:12:52PM -0400, Johannes Weiner wrote:
> > > Right now, there is only one pcplist for THP allocations. However,
> > > while most THPs are movable, the huge zero page is not. This means a
> > > movable THP allocation can grab an unmovable block from the pcplist,
> > > and a subsequent THP split, partial free, and reallocation of the
> > > remainder will mix movable and unmovable pages in the block.
> > >
> > > While this isn't a huge source of block pollution in practice, it
> > > happens often enough to trigger debug warnings fairly quickly under
> > > load. In the interest of tightening up pageblock hygiene, make the THP
> > > pcplists fully migratetype-aware, just like the lower order ones.
> > >
> > > Signed-off-by: Johannes Weiner
> >
> > Split out :P
> >
> > Take special care of this one because, while I didn't check this, I
> > suspect it'll push the PCP structure size into the next cache line and
> > increase overhead.
> >
> > The changelog makes it unclear why exactly this happens or why the
> > patch fixes it.
>
> Before this, I'd see warnings from the last patch in the series about
> received migratetype not matching requested mt.
>
> The way it happens is that the zero page gets freed and the unmovable
> block put on the pcplist. A regular THP allocation is subsequently
> served from an unmovable block.
>
> Mental note, I think this can happen the other way around too: a
> regular THP on the pcp being served to a MIGRATE_UNMOVABLE zero
> THP. It's not supposed to, but it looks like there is a bug in the
> code that's meant to prevent that from happening in rmqueue():
>
> 	if (likely(pcp_allowed_order(order))) {
> 		/*
> 		 * MIGRATE_MOVABLE pcplist could have the pages on CMA area and
> 		 * we need to skip it when CMA area isn't allowed.
> 		 */
> 		if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> 				migratetype != MIGRATE_MOVABLE) {
> 			page = rmqueue_pcplist(preferred_zone, zone, order,
> 					migratetype, alloc_flags);
> 			if (likely(page))
> 				goto out;
> 		}
> 	}
>
> Surely that last condition should be migratetype == MIGRATE_MOVABLE?
>

It should be. It would have been missed for ages because it would need a
test case based on a machine configuration that requires CMA for functional
correctness and is using THP, which is an unlikely combination.

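As an aside, the two forms of that check can be compared outside the
kernel. The following is a minimal userspace sketch, not kernel code:
enum mt, pcp_ok_current(), pcp_ok_proposed() and pcp_ok_selfcheck() are
hypothetical names invented for illustration, and the two booleans stand
in for IS_ENABLED(CONFIG_CMA) and (alloc_flags & ALLOC_CMA). It only
evaluates the boolean condition quoted above, once with != and once with
==, so that the cases in which each form lets rmqueue() try the pcplist
can be listed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel migratetypes (illustration only). */
enum mt { MT_UNMOVABLE, MT_MOVABLE };

/* Condition as quoted from rmqueue(), with "!=" on the migratetype. */
static bool pcp_ok_current(bool cma_enabled, bool alloc_cma, enum mt mt)
{
	return !cma_enabled || alloc_cma || mt != MT_MOVABLE;
}

/* Same condition with the "==" suggested in the thread. */
static bool pcp_ok_proposed(bool cma_enabled, bool alloc_cma, enum mt mt)
{
	return !cma_enabled || alloc_cma || mt == MT_MOVABLE;
}

/* Lists the cases where CMA is enabled and ALLOC_CMA is not set. */
static void pcp_ok_selfcheck(void)
{
	/* "!=" form: pcplist skipped for movable, used for unmovable. */
	assert(!pcp_ok_current(true, false, MT_MOVABLE));
	assert(pcp_ok_current(true, false, MT_UNMOVABLE));

	/* "==" form: pcplist used for movable, skipped for unmovable. */
	assert(pcp_ok_proposed(true, false, MT_MOVABLE));
	assert(!pcp_ok_proposed(true, false, MT_UNMOVABLE));
}
```

Calling pcp_ok_selfcheck() from a small test main() exercises those four
cases. When CMA is disabled or ALLOC_CMA is set, both forms return true,
so they only differ when CMA is enabled and ALLOC_CMA is not set.
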
> > The huge zero page strips GFP_MOVABLE (so unmovable)
> > but at allocation time, it doesn't really matter what the movable type
> > is because it's a full pageblock. It doesn't appear to be a hazard until
> > the split happens. Assuming that's the case, it should be ok to always
> > set the pageblock movable for THP allocations regardless of GFP flags at
> > allocation time or else set the pageblock MOVABLE at THP split (always
> > MOVABLE at allocation time makes more sense).
>
> The regular allocator compaction skips over compound pages anyway, so
> the migratetype should indeed not matter there.
>
> The bigger issue is CMA. alloc_contig_range() will try to move THPs to
> free a larger range. We have to be careful not to place an unmovable
> zero THP into a CMA region. That means we can not play games with MT -
> we really do have to physically keep unmovable and movable THPs apart.
>

Fair point.

> Another option would be not to use pcp for the zero THP. It's cached
> anyway in the caller. But it would add branches to the THP alloc and
> free fast paths (pcp_allowed_order() also checking migratetype).

And this is probably the most straightforward option. The intent behind
caching some THPs on PCP was faulting large mappings of normal THPs and
reducing the contention on the zone lock a little. The zero THP is somewhat
special because it should not be allocated at high frequency.

-- 
Mel Gorman
SUSE Labs