Date: Mon, 7 Nov 2022 08:58:17 +0200
From: Mike Rapoport
To: Luis Chamberlain
Cc: Song Liu, bpf@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, x86@kernel.org, peterz@infradead.org,
	hch@lst.de, rick.p.edgecombe@intel.com, dave.hansen@intel.com,
	zhengjun.xing@linux.intel.com, kbusch@kernel.org, p.raghav@samsung.com,
	dave@stgolabs.net, vbabka@suse.cz, mgorman@suse.de, willy@infradead.org,
	torvalds@linux-foundation.org, a.manzanares@samsung.com
Subject: Re: [PATCH bpf-next v1 RESEND 1/5] vmalloc: introduce vmalloc_exec,
	vfree_exec, and vcopy_exec
References: <20221031222541.1773452-1-song@kernel.org>
	<20221031222541.1773452-2-song@kernel.org>

On Thu, Nov 03, 2022 at 11:59:48AM -0700, Luis Chamberlain wrote:
> On Thu, Nov 03, 2022 at 05:51:57PM +0200, Mike Rapoport wrote:
>
> > I had to put this project on a backburner for $VARIOUS_REASONS, but I still
> > think that we need a generic allocator for memory with non-default
> > permissions in the direct map and that code allocation should build on that
> > allocator.
>
> It seems this generalization of the bpf prog pack to possibly be used
> for modules / kprobes / ftrace is a small step in that direction.
>
> > All that said, the direct map fragmentation problem is currently relevant
> > only to x86 because it's the only architecture that supports splitting of
> > the large pages in the direct map.
>
> I was thinking even more long term too, using this as a proof of concept. If
> this practice in general helps with fragmentation, could it be used for
> experimentation with compound pages later, as a way to reduce possible
> fragmentation.

As Rick already mentioned, these patches help with the direct map
fragmentation only indirectly.
With these patches memory is freed in PMD_SIZE chunks, which makes the
changes to the direct map in vm_remove_mappings() happen in PMD_SIZE units,
and this is pretty much the only effect of this series on the direct map
layout.

A bit unrelated, but I'm wondering now if we want to have the direct map
alias of the pages allocated for code also to be read-only...

> > Whenever a large page in the direct map is split, all
> > kernel accesses via the direct map will use small pages which requires
> > dealing with 512 page table entries instead of one for 2M range.
> >
> > Since small pages in the direct map are never collapsed back to large
> > pages, long living system that heavily uses eBPF programs will have its
> > direct map severely fragmented, higher TLB miss rate and worse overall
> > performance.
>
> Shouldn't compaction help with those situations?

Compaction helps to reduce fragmentation of the physical memory: it tries
to bring free physical pages next to each other to create large contiguous
chunks, but it does not change the virtual addresses that the users of the
underlying data see.

Changing permissions of a small page in the direct map causes a
"discontinuity" in the virtual space. E.g. if we have 2M mapped RW with a
single PMD, changing several pages in the middle of those 2M to R+X will
require remapping that range with 512 PTEs.

> Thanks!
>
>   Luis

-- 
Sincerely yours,
Mike.
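
To make the arithmetic in the last paragraph concrete, here is a toy
user-space model, not kernel code. It assumes x86-64 with 4K base pages and
2M PMD mappings; PAGE_SIZE, PMD_SIZE and PTRS_PER_PTE borrow the kernel's
macro names, while entries_after_permission_change() is a hypothetical
helper made up for this illustration. It only counts how many page table
entries are needed to map one 2M range after a permission change inside it.

```python
PAGE_SIZE = 4 * 1024            # assumed 4K base page (x86-64)
PMD_SIZE = 2 * 1024 * 1024      # assumed 2M range covered by one PMD entry
PTRS_PER_PTE = PMD_SIZE // PAGE_SIZE


def entries_after_permission_change(start, length):
    """Count the entries mapping one 2M range after the permissions of
    [start, start + length) inside that range are changed.

    Toy model only: a change that covers the whole range keeps the single
    PMD entry; any partial change splits it into PTRS_PER_PTE small-page
    entries, and the split is never undone.
    """
    if start % PAGE_SIZE or length % PAGE_SIZE:
        raise ValueError("range must be page aligned")
    if start < 0 or length <= 0 or start + length > PMD_SIZE:
        raise ValueError("range must lie inside the 2M mapping")
    if start == 0 and length == PMD_SIZE:
        return 1
    return PTRS_PER_PTE


if __name__ == "__main__":
    # Several pages in the middle of the 2M range become R+X.
    print(entries_after_permission_change(1024 * 1024, 4 * PAGE_SIZE))  # 512
    # The whole PMD_SIZE chunk is changed at once.
    print(entries_after_permission_change(0, PMD_SIZE))                 # 1
```

The second call is the case this series moves towards: when memory is
handled in whole PMD_SIZE chunks, the model keeps the single large entry
instead of 512 small ones.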