From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 983FEC76192 for ; Tue, 16 Jul 2019 08:12:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 513E92145D for ; Tue, 16 Jul 2019 08:12:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=netronome-com.20150623.gappssmtp.com header.i=@netronome-com.20150623.gappssmtp.com header.b="F8iQBWTx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728118AbfGPIMV (ORCPT ); Tue, 16 Jul 2019 04:12:21 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:51565 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726463AbfGPIMV (ORCPT ); Tue, 16 Jul 2019 04:12:21 -0400 Received: by mail-wm1-f68.google.com with SMTP id 207so17670931wma.1 for ; Tue, 16 Jul 2019 01:12:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=netronome-com.20150623.gappssmtp.com; s=20150623; h=references:user-agent:from:to:cc:subject:in-reply-to:date :message-id:mime-version; bh=FObK9tcFkxCYQnUNgTDBb17fYC3+kC7ROYtn+Ghk2e0=; b=F8iQBWTxIcTwj8644z4H9x3wxCIsufv07w/KNA8wdn7lQ2Aq9IA8v+gDLdjKe0SKQZ ec4BazlQAWeg7sY0IG1XKKvPZqCTqQVSGFk++ApAlbCOcBHmAkbXTIH/gkhVukoQ1ZXM avmGTxF+MeZ+B7HbL44EdNpL2ks18XTPyd7Z324g6nkCjQEQexW8hWIvB1U7o14lzgCS g6AJbDpVXc0kvkOxeTN/A/Avx8sVyzEGWtlT+IvWP8pX/njbVoSk1u4ET4wDX0Dne26h l3iUBSogzmHI5+COvHGR8SIB/3kQmqTOI+IJr1sxj0Y41ctnUPGII+3cyjzrOVm50rCa WNog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:references:user-agent:from:to:cc:subject :in-reply-to:date:message-id:mime-version; bh=FObK9tcFkxCYQnUNgTDBb17fYC3+kC7ROYtn+Ghk2e0=; b=WHEHKN8AZUHTZr6lTmuB2ROUlN5n2w4cv3bMqGsQ5/HTtDEsLsKvQL9TMs9bMlM6pZ Q1uuxi4o++Db7i7ofBqpVwNwjz3byloA3C2zxGnf3kmSuEfTJlA2KXucK0o+UPA//0Iu 0B3+ZL7sy0YeU56FWMt8qJ7a7dGZ9PFI9+AfAtqbHWlKiZPsbhHoHbPj+50Y5PuSxSDa vc3CWZVkVS4E/MP193ONpfbAeWq+NVpVUy1IBWqZfgvC2kEmtHCOZfI5nVSqTjPumXxu aRoAeXY7aXUbS9BPGKE1mhAVHy/cdFV6U1Ixb+k2QjaKfRMCfz3Nb2R9oBTZ0q9UDQ0T oT+Q== X-Gm-Message-State: APjAAAXti5Dzd9RpuZqDiYK29CtsDp04dxwyEaXNFCoTaCcZKkTGROxn 3sTlRW8u0v/ajgcZNmHVIOo1Gg== X-Google-Smtp-Source: APXvYqwwIUiIx6LtKoWgK095oqc/afm+EuA52sze9Fx7FExwcCbOl+3ckZoCB2ONyhc/tCudT2Cv1w== X-Received: by 2002:a1c:4184:: with SMTP id o126mr28326710wma.68.1563264738368; Tue, 16 Jul 2019 01:12:18 -0700 (PDT) Received: from LAPTOP-V3S7NLPL ([217.38.71.146]) by smtp.gmail.com with ESMTPSA id u13sm23451068wrq.62.2019.07.16.01.12.14 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 16 Jul 2019 01:12:14 -0700 (PDT) References: <1562275611-31790-1-git-send-email-jiong.wang@netronome.com> <1562275611-31790-3-git-send-email-jiong.wang@netronome.com> <87o920235d.fsf@netronome.com> <87muhk2264.fsf@netronome.com> <87h87n39aj.fsf@netronome.com> User-agent: mu4e 0.9.18; emacs 25.2.2 From: Jiong Wang To: Andrii Nakryiko Cc: Jiong Wang , Andrii Nakryiko , Alexei Starovoitov , Daniel Borkmann , Edward Cree , "Naveen N. Rao" , Jakub Kicinski , bpf , Networking , oss-drivers@netronome.com Subject: Re: [oss-drivers] Re: [RFC bpf-next 2/8] bpf: extend list based insn patching infra to verification layer In-reply-to: Date: Tue, 16 Jul 2019 09:12:13 +0100 Message-ID: <87y30ytn36.fsf@netronome.com> MIME-Version: 1.0 Content-Type: text/plain Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Andrii Nakryiko writes: > On Mon, Jul 15, 2019 at 3:02 AM Jiong Wang wrote: >> >> >> Andrii Nakryiko writes: >> >> > On Thu, Jul 11, 2019 at 5:20 AM Jiong Wang wrote: >> >> >> >> >> >> Jiong Wang writes: >> >> >> >> > Andrii Nakryiko writes: >> >> > >> >> >> On Thu, Jul 4, 2019 at 2:32 PM Jiong Wang wrote: >> >> >>> >> >> >>> Verification layer also needs to handle auxiliar info as well as adjusting >> >> >>> subprog start. >> >> >>> >> >> >>> At this layer, insns inside patch buffer could be jump, but they should >> >> >>> have been resolved, meaning they shouldn't jump to insn outside of the >> >> >>> patch buffer. Lineration function for this layer won't touch insns inside >> >> >>> patch buffer. >> >> >>> >> >> >>> Adjusting subprog is finished along with adjusting jump target when the >> >> >>> input will cover bpf to bpf call insn, re-register subprog start is cheap. >> >> >>> But adjustment when there is insn deleteion is not considered yet. >> >> >>> >> >> >>> Signed-off-by: Jiong Wang >> >> >>> --- >> >> >>> kernel/bpf/verifier.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++++++ >> >> >>> 1 file changed, 150 insertions(+) >> >> >>> >> >> >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> >> >>> index a2e7637..2026d64 100644 >> >> >>> --- a/kernel/bpf/verifier.c >> >> >>> +++ b/kernel/bpf/verifier.c >> >> >>> @@ -8350,6 +8350,156 @@ static void opt_hard_wire_dead_code_branches(struct bpf_verifier_env *env) >> >> >>> } >> >> >>> } >> >> >>> >> >> >>> +/* Linearize bpf list insn to array (verifier layer). */ >> >> >>> +static struct bpf_verifier_env * >> >> >>> +verifier_linearize_list_insn(struct bpf_verifier_env *env, >> >> >>> + struct bpf_list_insn *list) >> >> >> >> >> >> It's unclear why this returns env back? It's not allocating a new env, >> >> >> so it's weird and unnecessary. Just return error code. >> >> > >> >> > The reason is I was thinking we have two layers in BPF, the core and the >> >> > verifier. >> >> > >> >> > For core layer (the relevant file is core.c), when doing patching, the >> >> > input is insn list and bpf_prog, the linearization should linearize the >> >> > insn list into insn array, and also whatever others affect inside bpf_prog >> >> > due to changing on insns, for example line info inside prog->aux. So the >> >> > return value is bpf_prog for core layer linearization hook. >> >> > >> >> > For verifier layer, it is similar, but the context if bpf_verifier_env, the >> >> > linearization hook should linearize the insn list, and also those affected >> >> > inside env, for example bpf_insn_aux_data, so the return value is >> >> > bpf_verifier_env, meaning returning an updated verifier context >> >> > (bpf_verifier_env) after insn list linearization. >> >> >> >> Realized your point is no new env is allocated, so just return error >> >> code. Yes, the env pointer is not changed, just internal data is >> >> updated. Return bpf_verifier_env mostly is trying to make the hook more >> >> clear that it returns an updated "context" where the linearization happens, >> >> for verifier layer, it is bpf_verifier_env, and for core layer, it is >> >> bpf_prog, so return value was designed to return these two types. >> > >> > Oh, I missed that core layer returns bpf_prog*. I think this is >> > confusing as hell and is very contrary to what one would expect. If >> > the function doesn't allocate those objects, it shouldn't return them, >> > except for rare cases of some accessor functions. Me reading this, >> > I'll always be suprised and will have to go skim code just to check >> > whether those functions really return new bpf_prog or >> > bpf_verifier_env, respectively. >> >> bpf_prog_realloc do return new bpf_prog, so we will need to return bpf_prog >> * for core layer. > > Ah, I see, then it would make sense for core layer, but still is very > confusing for verifier_linearize_list_insn. > I still hope for unified solution, so it shouldn't matter. But it > pointed me to a bug in your code, see below. Yeah, thanks! > >> >> > >> > Please change them both to just return error code. >> > >> >> >> >> > >> >> > Make sense? >> >> > >> >> > Regards, >> >> > Jiong >> >> > >> >> >> >> >> >>> +{ >> >> >>> + u32 *idx_map, idx, orig_cnt, fini_cnt = 0; >> >> >>> + struct bpf_subprog_info *new_subinfo; >> >> >>> + struct bpf_insn_aux_data *new_data; >> >> >>> + struct bpf_prog *prog = env->prog; >> >> >>> + struct bpf_verifier_env *ret_env; >> >> >>> + struct bpf_insn *insns, *insn; >> >> >>> + struct bpf_list_insn *elem; >> >> >>> + int ret; >> >> >>> + >> >> >>> + /* Calculate final size. */ >> >> >>> + for (elem = list; elem; elem = elem->next) >> >> >>> + if (!(elem->flag & LIST_INSN_FLAG_REMOVED)) >> >> >>> + fini_cnt++; >> >> >>> + >> >> >>> + orig_cnt = prog->len; >> >> >>> + insns = prog->insnsi; >> >> >>> + /* If prog length remains same, nothing else to do. */ >> >> >>> + if (fini_cnt == orig_cnt) { >> >> >>> + for (insn = insns, elem = list; elem; elem = elem->next, insn++) >> >> >>> + *insn = elem->insn; >> >> >>> + return env; >> >> >>> + } >> >> >>> + /* Realloc insn buffer when necessary. */ >> >> >>> + if (fini_cnt > orig_cnt) >> >> >>> + prog = bpf_prog_realloc(prog, bpf_prog_size(fini_cnt), >> >> >>> + GFP_USER); >> >> >>> + if (!prog) >> >> >>> + return ERR_PTR(-ENOMEM); > > On realloc failure, prog will be non-NULL, so you need to handle error > properly (and propagate it, instead of returning -ENOMEM): > > if (IS_ERR(prog)) > return ERR_PTR(prog); > > >> >> >>> + insns = prog->insnsi; >> >> >>> + prog->len = fini_cnt; >> >> >>> + ret_env = env; >> >> >>> + >> >> >>> + /* idx_map[OLD_IDX] = NEW_IDX */ >> >> >>> + idx_map = kvmalloc(orig_cnt * sizeof(u32), GFP_KERNEL); >> >> >>> + if (!idx_map) >> >> >>> + return ERR_PTR(-ENOMEM); >> >> >>> + memset(idx_map, 0xff, orig_cnt * sizeof(u32)); >> >> >>> + >> >> >>> + /* Use the same alloc method used when allocating env->insn_aux_data. */ >> >> >>> + new_data = vzalloc(array_size(sizeof(*new_data), fini_cnt)); >> >> >>> + if (!new_data) { >> >> >>> + kvfree(idx_map); >> >> >>> + return ERR_PTR(-ENOMEM); >> >> >>> + } >> >> >>> + >> >> >>> + /* Copy over insn + calculate idx_map. */ >> >> >>> + for (idx = 0, elem = list; elem; elem = elem->next) { >> >> >>> + int orig_idx = elem->orig_idx - 1; >> >> >>> + >> >> >>> + if (orig_idx >= 0) { >> >> >>> + idx_map[orig_idx] = idx; >> >> >>> + >> >> >>> + if (elem->flag & LIST_INSN_FLAG_REMOVED) >> >> >>> + continue; >> >> >>> + >> >> >>> + new_data[idx] = env->insn_aux_data[orig_idx]; >> >> >>> + >> >> >>> + if (elem->flag & LIST_INSN_FLAG_PATCHED) >> >> >>> + new_data[idx].zext_dst = >> >> >>> + insn_has_def32(env, &elem->insn); >> >> >>> + } else { >> >> >>> + new_data[idx].seen = true; >> >> >>> + new_data[idx].zext_dst = insn_has_def32(env, >> >> >>> + &elem->insn); >> >> >>> + } >> >> >>> + insns[idx++] = elem->insn; >> >> >>> + } >> >> >>> + >> >> >>> + new_subinfo = kvzalloc(sizeof(env->subprog_info), GFP_KERNEL); >> >> >>> + if (!new_subinfo) { >> >> >>> + kvfree(idx_map); >> >> >>> + vfree(new_data); >> >> >>> + return ERR_PTR(-ENOMEM); >> >> >>> + } >> >> >>> + memcpy(new_subinfo, env->subprog_info, sizeof(env->subprog_info)); >> >> >>> + memset(env->subprog_info, 0, sizeof(env->subprog_info)); >> >> >>> + env->subprog_cnt = 0; >> >> >>> + env->prog = prog; >> >> >>> + ret = add_subprog(env, 0); >> >> >>> + if (ret < 0) { >> >> >>> + ret_env = ERR_PTR(ret); >> >> >>> + goto free_all_ret; >> >> >>> + } >> >> >>> + /* Relocate jumps using idx_map. >> >> >>> + * old_dst = jmp_insn.old_target + old_pc + 1; >> >> >>> + * new_dst = idx_map[old_dst] = jmp_insn.new_target + new_pc + 1; >> >> >>> + * jmp_insn.new_target = new_dst - new_pc - 1; >> >> >>> + */ >> >> >>> + for (idx = 0, elem = list; elem; elem = elem->next) { >> >> >>> + int orig_idx = elem->orig_idx; >> >> >>> + >> >> >>> + if (elem->flag & LIST_INSN_FLAG_REMOVED) >> >> >>> + continue; >> >> >>> + if ((elem->flag & LIST_INSN_FLAG_PATCHED) || !orig_idx) { >> >> >>> + idx++; >> >> >>> + continue; >> >> >>> + } >> >> >>> + >> >> >>> + ret = bpf_jit_adj_imm_off(&insns[idx], orig_idx - 1, idx, >> >> >>> + idx_map); >> >> >>> + if (ret < 0) { >> >> >>> + ret_env = ERR_PTR(ret); >> >> >>> + goto free_all_ret; >> >> >>> + } >> >> >>> + /* Recalculate subprog start as we are at bpf2bpf call insn. */ >> >> >>> + if (ret > 0) { >> >> >>> + ret = add_subprog(env, idx + insns[idx].imm + 1); >> >> >>> + if (ret < 0) { >> >> >>> + ret_env = ERR_PTR(ret); >> >> >>> + goto free_all_ret; >> >> >>> + } >> >> >>> + } >> >> >>> + idx++; >> >> >>> + } >> >> >>> + if (ret < 0) { >> >> >>> + ret_env = ERR_PTR(ret); >> >> >>> + goto free_all_ret; >> >> >>> + } >> >> >>> + >> >> >>> + env->subprog_info[env->subprog_cnt].start = fini_cnt; >> >> >>> + for (idx = 0; idx <= env->subprog_cnt; idx++) >> >> >>> + new_subinfo[idx].start = env->subprog_info[idx].start; >> >> >>> + memcpy(env->subprog_info, new_subinfo, sizeof(env->subprog_info)); >> >> >>> + >> >> >>> + /* Adjust linfo. >> >> >>> + * FIXME: no support for insn removal at the moment. >> >> >>> + */ >> >> >>> + if (prog->aux->nr_linfo) { >> >> >>> + struct bpf_line_info *linfo = prog->aux->linfo; >> >> >>> + u32 nr_linfo = prog->aux->nr_linfo; >> >> >>> + >> >> >>> + for (idx = 0; idx < nr_linfo; idx++) >> >> >>> + linfo[idx].insn_off = idx_map[linfo[idx].insn_off]; >> >> >>> + } >> >> >>> + vfree(env->insn_aux_data); >> >> >>> + env->insn_aux_data = new_data; >> >> >>> + goto free_mem_list_ret; >> >> >>> +free_all_ret: >> >> >>> + vfree(new_data); >> >> >>> +free_mem_list_ret: >> >> >>> + kvfree(new_subinfo); >> >> >>> + kvfree(idx_map); >> >> >>> + return ret_env; >> >> >>> +} >> >> >>> + >> >> >>> static int opt_remove_dead_code(struct bpf_verifier_env *env) >> >> >>> { >> >> >>> struct bpf_insn_aux_data *aux_data = env->insn_aux_data; >> >> >>> -- >> >> >>> 2.7.4 >> >> >>> >> >> >>