From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from 69-171-232-181.mail-mxout.facebook.com (69-171-232-181.mail-mxout.facebook.com [69.171.232.181])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8725DC
	for ; Tue, 12 Dec 2023 14:31:08 -0800 (PST)
Received: by devbig309.ftw3.facebook.com (Postfix, from userid 128203)
	id 9F2422B68D7A0; Tue, 12 Dec 2023 14:30:55 -0800 (PST)
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau
Subject: [PATCH bpf-next 3/5] bpf: Refill only one percpu element in memalloc
Date: Tue, 12 Dec 2023 14:30:55 -0800
Message-Id: <20231212223055.2138132-1-yonghong.song@linux.dev>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231212223040.2135547-1-yonghong.song@linux.dev>
References: <20231212223040.2135547-1-yonghong.song@linux.dev>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

Typically, once a percpu map element or data structure is allocated,
most operations are lookups or in-place updates. Deletions are really
rare. Currently, for a percpu data structure, 4 elements will be
refilled if the size is <= 256. Let us refill just one element for
percpu data. For example, for size 256 and 128 cpus, the potential
saving will be 3 * 256 * 128 * 128 = 12MB.

Signed-off-by: Yonghong Song
---
 kernel/bpf/memalloc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 84987e97fd0a..a1d718ee264d 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -483,11 +483,15 @@ static void init_refill_work(struct bpf_mem_cache *c)
 
 static void prefill_mem_cache(struct bpf_mem_cache *c, int cpu)
 {
+	int cnt = 1;
+
 	/* To avoid consuming memory assume that 1st run of bpf
 	 * prog won't be doing more than 4 map_update_elem from
 	 * irq disabled region
 	 */
-	alloc_bulk(c, c->unit_size <= 256 ? 4 : 1, cpu_to_node(cpu), false);
+	if (!c->percpu_size && c->unit_size <= 256)
+		cnt = 4;
+	alloc_bulk(c, cnt, cpu_to_node(cpu), false);
 }
 
 static int check_obj_size(struct bpf_mem_cache *c, unsigned int idx)
-- 
2.34.1
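
The count selection and the 12MB figure from the commit message can be
checked with a small user-space sketch. This is not kernel code:
struct mem_cache_model and the three helpers are hypothetical names that
only mirror the two fields the patch reads. The sketch assumes, as the
commit message arithmetic appears to, that each percpu element costs
unit_size bytes on every cpu and that every cpu has its own prefilled
cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct bpf_mem_cache. */
struct mem_cache_model {
	size_t unit_size;   /* bytes per element (per cpu for percpu data) */
	size_t percpu_size; /* non-zero when the cache serves percpu data */
};

/* Prefill count before the patch: size alone decides. */
static int prefill_count_old(const struct mem_cache_model *c)
{
	return c->unit_size <= 256 ? 4 : 1;
}

/* Prefill count after the patch: percpu caches always get one element. */
static int prefill_count_new(const struct mem_cache_model *c)
{
	int cnt = 1;

	if (!c->percpu_size && c->unit_size <= 256)
		cnt = 4;
	return cnt;
}

/* Bytes no longer prefilled across nr_cpus caches (assumed model). */
static size_t prefill_saving(const struct mem_cache_model *c, size_t nr_cpus)
{
	int old_cnt = prefill_count_old(c);
	int new_cnt = prefill_count_new(c);
	/* A percpu element is assumed to occupy unit_size on every cpu. */
	size_t elem_bytes = c->percpu_size ? c->unit_size * nr_cpus
					   : c->unit_size;

	assert(old_cnt >= new_cnt);
	return (size_t)(old_cnt - new_cnt) * elem_bytes * nr_cpus;
}
```

Under these assumptions, a percpu cache with unit_size 256 on 128 cpus
yields 3 * 256 * 128 * 128 = 12582912 bytes (12MB), while a non-percpu
cache of the same size keeps its 4-element prefill and saves nothing.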