* [PATCH bpf-next 0/2] bpf: add cg_skb_is_valid_access
From: Song Liu @ 2018-10-17 5:56 UTC
To: netdev; +Cc: ast, daniel, kernel-team, Song Liu

This set enables BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to access
__sk_buff->len/data/data_end directly.

Song Liu (2):
  bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
  bpf: add tests for direct packet access from CGROUP_SKB

 kernel/bpf/cgroup.c                         |  4 +++
 net/core/filter.c                           | 26 +++++++++++++++++-
 tools/testing/selftests/bpf/test_verifier.c | 30 +++++++++++++++++++++
 3 files changed, 59 insertions(+), 1 deletion(-)
* [PATCH bpf-next 1/2] bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
From: Song Liu @ 2018-10-17 5:56 UTC
To: netdev; +Cc: ast, daniel, kernel-team, Song Liu

BPF programs of BPF_PROG_TYPE_CGROUP_SKB need to access headers in the
skb. This patch enables direct access of skb for these programs.

In __cgroup_bpf_run_filter_skb(), bpf_compute_data_pointers() is called
to compute proper data_end for the BPF program.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 kernel/bpf/cgroup.c |  4 ++++
 net/core/filter.c   | 26 +++++++++++++++++++++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 00f6ed2e4f9a..340d496f35bd 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -566,6 +566,10 @@ int __cgroup_bpf_run_filter_skb(struct sock *sk,
 	save_sk = skb->sk;
 	skb->sk = sk;
 	__skb_push(skb, offset);
+
+	/* compute pointers for the bpf prog */
+	bpf_compute_data_pointers(skb);
+
 	ret = BPF_PROG_RUN_ARRAY(cgrp->bpf.effective[type], skb,
 				 bpf_prog_run_save_cb);
 	__skb_pull(skb, offset);
diff --git a/net/core/filter.c b/net/core/filter.c
index 1a3ac6c46873..8b5a502e241f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5346,6 +5346,30 @@ static bool sk_filter_is_valid_access(int off, int size,
 	return bpf_skb_is_valid_access(off, size, type, prog, info);
 }
 
+static bool cg_skb_is_valid_access(int off, int size,
+				   enum bpf_access_type type,
+				   const struct bpf_prog *prog,
+				   struct bpf_insn_access_aux *info)
+{
+	if (type == BPF_WRITE)
+		return false;
+
+	switch (off) {
+	case bpf_ctx_range(struct __sk_buff, len):
+		break;
+	case bpf_ctx_range(struct __sk_buff, data):
+		info->reg_type = PTR_TO_PACKET;
+		break;
+	case bpf_ctx_range(struct __sk_buff, data_end):
+		info->reg_type = PTR_TO_PACKET_END;
+		break;
+	default:
+		return false;
+	}
+
+	return bpf_skb_is_valid_access(off, size, type, prog, info);
+}
+
 static bool lwt_is_valid_access(int off, int size,
 				enum bpf_access_type type,
 				const struct bpf_prog *prog,
@@ -7038,7 +7062,7 @@ const struct bpf_prog_ops xdp_prog_ops = {
 
 const struct bpf_verifier_ops cg_skb_verifier_ops = {
 	.get_func_proto		= cg_skb_func_proto,
-	.is_valid_access	= sk_filter_is_valid_access,
+	.is_valid_access	= cg_skb_is_valid_access,
 	.convert_ctx_access	= bpf_convert_ctx_access,
 };
 
-- 
2.17.1
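Editorial sketch (not part of the patch): the cgroup.c hunk computes the data pointers after __skb_push(), so the program sees the headers that the push exposed. The userspace model below illustrates that ordering. It assumes data_end is the end of the linear area (data plus total length minus non-linear length), which is my understanding of the kernel helper rather than something the patch states. struct mock_skb and every mock_* function are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct mock_skb {
	uint8_t *data;         /* start of the linear data */
	unsigned int len;      /* total packet length */
	unsigned int data_len; /* bytes held outside the linear area */
	uint8_t *data_end;     /* what a BPF program would see as data_end */
};

/* Models __skb_push(): expose n more bytes in front of data. */
static void mock_skb_push(struct mock_skb *skb, unsigned int n)
{
	skb->data -= n;
	skb->len += n;
}

/* Models __skb_pull(): hide the first n bytes again. */
static void mock_skb_pull(struct mock_skb *skb, unsigned int n)
{
	skb->len -= n;
	skb->data += n;
}

/* Assumed behaviour of bpf_compute_data_pointers(): data_end marks the
 * end of the linear area, measured from the current data pointer. */
static void mock_compute_data_pointers(struct mock_skb *skb)
{
	skb->data_end = skb->data + (skb->len - skb->data_len);
}

/* Walks through the push / compute / run / pull order used in the hunk. */
void mock_run_filter_demo(void)
{
	uint8_t buf[64] = { 0 };
	/* 20 bytes of headers sit in front of a 44-byte payload. */
	struct mock_skb skb = { buf + 20, 44, 0, NULL };

	mock_skb_push(&skb, 20);          /* headers become visible */
	mock_compute_data_pointers(&skb); /* must follow the push */
	assert(skb.data == buf);
	assert(skb.data_end == buf + 64);

	/* ... the BPF program would run here ... */

	mock_skb_pull(&skb, 20);          /* restore the original view */
	assert(skb.data == buf + 20);
	assert(skb.len == 44);
}
```

In this model, computing the pointers before the push would leave data_end 20 bytes short of the real end of the linear area, because the pushed bytes would not yet be counted in len.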
* Re: [PATCH bpf-next 1/2] bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
From: Alexei Starovoitov @ 2018-10-17 17:26 UTC
To: Song Liu; +Cc: netdev, ast, daniel, kernel-team

On Tue, Oct 16, 2018 at 10:56:05PM -0700, Song Liu wrote:
> BPF programs of BPF_PROG_TYPE_CGROUP_SKB need to access headers in the
> skb. This patch enables direct access of skb for these programs.

The lack of direct packet access in CGROUP_SKB progs was
an unpleasant surprise to me, so thank you for fixing it,
but there are a few issues with the patch. See below.

> In __cgroup_bpf_run_filter_skb(), bpf_compute_data_pointers() is called
> to compute proper data_end for the BPF program.
>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  kernel/bpf/cgroup.c |  4 ++++
>  net/core/filter.c   | 26 +++++++++++++++++++++++++-
>  2 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 00f6ed2e4f9a..340d496f35bd 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -566,6 +566,10 @@ int __cgroup_bpf_run_filter_skb(struct sock *sk,
>  	save_sk = skb->sk;
>  	skb->sk = sk;
>  	__skb_push(skb, offset);
> +
> +	/* compute pointers for the bpf prog */
> +	bpf_compute_data_pointers(skb);
> +
>  	ret = BPF_PROG_RUN_ARRAY(cgrp->bpf.effective[type], skb,
>  				 bpf_prog_run_save_cb);
>  	__skb_pull(skb, offset);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1a3ac6c46873..8b5a502e241f 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5346,6 +5346,30 @@ static bool sk_filter_is_valid_access(int off, int size,
>  	return bpf_skb_is_valid_access(off, size, type, prog, info);
>  }
>
> +static bool cg_skb_is_valid_access(int off, int size,
> +				   enum bpf_access_type type,
> +				   const struct bpf_prog *prog,
> +				   struct bpf_insn_access_aux *info)
> +{
> +	if (type == BPF_WRITE)
> +		return false;

this disables writes into cb[0..4] that were allowed for cgroup_inet_* before.
One can argue that this may break existing progs,
but looking at the place where BPF_CGROUP_RUN_PROG_INET_INGRESS is called
it seems it's actually not correct in all cases to access cb there.
Just a few lines down we call bpf_prog_run_save_cb(), which saves/restores
these 24 bytes.
So we have two options: either add save/restore for INET_INGRESS only
or disable read and write access to cb[0..4] for CGROUP_SKB progs.
I prefer the former.

> +
> +	switch (off) {
> +	case bpf_ctx_range(struct __sk_buff, len):
> +		break;
> +	case bpf_ctx_range(struct __sk_buff, data):
> +		info->reg_type = PTR_TO_PACKET;
> +		break;
> +	case bpf_ctx_range(struct __sk_buff, data_end):
> +		info->reg_type = PTR_TO_PACKET_END;
> +		break;
> +	default:
> +		return false;
> +	}

this also enables access to a range of fields family..local_port.
It's ok to do for egress, but not for ingress unless we
add code similar to the bottom of sk_filter_trim_cap() that
inits skb->sk.

above change also allows access to data_meta and flow_keys
which is not correct.

Considering all that I'm proposing to fix INET_INGRESS call site
similar to code below it in sk_filter_trim_cap().
In particular to do:
  struct sock *save_sk = skb->sk;
  skb->sk = sk;
  save and clear cb
  BPF_CGROUP_RUN_PROG_INET_INGRESS
  restore cb
  skb->sk = save_sk;

all of above can probably be inside BPF_CGROUP_RUN_PROG_INET_INGRESS macro.
Then in this cg_skb_is_valid_access() allow access to data/data_end
and family..local_port range as well,
while disallowing access to flow_keys and data_meta.

In patch 2 we gotta have tests for all these fields.

Thoughts?

> +
> +	return bpf_skb_is_valid_access(off, size, type, prog, info);
> +}
> +
>  static bool lwt_is_valid_access(int off, int size,
>  				enum bpf_access_type type,
>  				const struct bpf_prog *prog,
> @@ -7038,7 +7062,7 @@ const struct bpf_prog_ops xdp_prog_ops = {
>
>  const struct bpf_verifier_ops cg_skb_verifier_ops = {
>  	.get_func_proto		= cg_skb_func_proto,
> -	.is_valid_access	= sk_filter_is_valid_access,
> +	.is_valid_access	= cg_skb_is_valid_access,
>  	.convert_ctx_access	= bpf_convert_ctx_access,
>  };
>
> --
> 2.17.1
>
* Re: [PATCH bpf-next 1/2] bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
From: Alexei Starovoitov @ 2018-10-17 19:02 UTC
To: Alexei Starovoitov, Song Liu
Cc: netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, Kernel Team

On 10/17/18 10:26 AM, Alexei Starovoitov wrote:
> On Tue, Oct 16, 2018 at 10:56:05PM -0700, Song Liu wrote:
[...]
>> +static bool cg_skb_is_valid_access(int off, int size,
>> +				   enum bpf_access_type type,
>> +				   const struct bpf_prog *prog,
>> +				   struct bpf_insn_access_aux *info)
>> +{
>> +	if (type == BPF_WRITE)
>> +		return false;
>
> this disables writes into cb[0..4] that were allowed for cgroup_inet_* before.
> One can argue that this may break existing progs,
> but looking at the place where BPF_CGROUP_RUN_PROG_INET_INGRESS is called
> it seems it's actually not correct in all cases to access cb there.
> Just a few lines down we call bpf_prog_run_save_cb(), which saves/restores
> these 24 bytes.
> So we have two options: either add save/restore for INET_INGRESS only
> or disable read and write access to cb[0..4] for CGROUP_SKB progs.
> I prefer the former.
>
>> +
>> +	switch (off) {
>> +	case bpf_ctx_range(struct __sk_buff, len):
>> +		break;
>> +	case bpf_ctx_range(struct __sk_buff, data):
>> +		info->reg_type = PTR_TO_PACKET;
>> +		break;
>> +	case bpf_ctx_range(struct __sk_buff, data_end):
>> +		info->reg_type = PTR_TO_PACKET_END;
>> +		break;
>> +	default:
>> +		return false;
>> +	}
>
> this also enables access to a range of fields family..local_port.
> It's ok to do for egress, but not for ingress unless we
> add code similar to the bottom of sk_filter_trim_cap() that
> inits skb->sk.
>
> above change also allows access to data_meta and flow_keys
> which is not correct.
>
> Considering all that I'm proposing to fix INET_INGRESS call site
> similar to code below it in sk_filter_trim_cap().
> In particular to do:
>   struct sock *save_sk = skb->sk;
>   skb->sk = sk;
>   save and clear cb
>   BPF_CGROUP_RUN_PROG_INET_INGRESS
>   restore cb
>   skb->sk = save_sk;
>
> all of above can probably be inside BPF_CGROUP_RUN_PROG_INET_INGRESS macro.
> Then in this cg_skb_is_valid_access() allow access to data/data_end
> and family..local_port range as well,
> while disallowing access to flow_keys and data_meta.
>
> In patch 2 we gotta have tests for all these fields.
>
> Thoughts?

chatted with Song offline.
I completely misread 'return false' in the above as 'break'.
The patch actually disables access to pkt_type, mark, queue_mapping
and so on, which is not correct either.
Since tests were not failing we really need to improve this aspect
of test coverage in test_verifier.c

Also I missed that __cgroup_bpf_run_filter_skb() already
does save_sk = skb->sk; skb->sk = sk;
and bpf_prog_run_save_cb()
So no issue in the existing code. That was a false alarm.
Revising the proposal...
I think cg_skb_is_valid_access() can be made similar to
lwt_is_valid_access(),
allowing writes into mark, priority, cb[0..4]
and read of data/data_end.
In addition it's also ok to allow family..local_port range
(unlike lwt where sk may not be present),
and no access to data_meta and flow_keys.
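Editorial sketch (not from the thread): one reading of the revised proposal above, written as a self-contained userspace model. It allows writes only to mark, priority and cb[0..4], rejects every access to data_meta and flow_keys, and allows reads of the remaining fields, including data/data_end and the family..local_port range. struct mock_sk_buff is a hypothetical, trimmed-down context whose layout does not match the real struct __sk_buff, and mock_cg_skb_is_valid_access() is an illustration, not the kernel function or the eventual v2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct __sk_buff; order and set of fields
 * are illustrative only. */
struct mock_sk_buff {
	uint32_t len;
	uint32_t pkt_type;
	uint32_t mark;
	uint32_t queue_mapping;
	uint32_t priority;
	uint32_t cb[5];
	uint32_t data;
	uint32_t data_end;
	uint32_t family;
	uint32_t remote_ip4;
	uint32_t local_ip4;
	uint32_t remote_port;
	uint32_t local_port;
	uint32_t data_meta;
	uint32_t flow_keys;
};

enum mock_access_type { MOCK_READ, MOCK_WRITE };

#define FIELD_START(f) offsetof(struct mock_sk_buff, f)
#define FIELD_END(f) \
	(offsetof(struct mock_sk_buff, f) + sizeof(((struct mock_sk_buff *)0)->f))

/* True if off falls inside the half-open byte range [start, end). */
static bool in_range(size_t off, size_t start, size_t end)
{
	return off >= start && off < end;
}

bool mock_cg_skb_is_valid_access(size_t off, enum mock_access_type type)
{
	if (off >= sizeof(struct mock_sk_buff))
		return false;

	/* No access at all to data_meta and flow_keys. */
	if (in_range(off, FIELD_START(data_meta), FIELD_END(data_meta)) ||
	    in_range(off, FIELD_START(flow_keys), FIELD_END(flow_keys)))
		return false;

	/* Writes only into mark, priority and cb[0..4]. */
	if (type == MOCK_WRITE)
		return in_range(off, FIELD_START(mark), FIELD_END(mark)) ||
		       in_range(off, FIELD_START(priority), FIELD_END(priority)) ||
		       in_range(off, FIELD_START(cb), FIELD_END(cb));

	/* Reads of everything else, including data/data_end and the
	 * family..local_port range. */
	return true;
}

/* A few spot checks of the policy. */
void mock_policy_self_check(void)
{
	assert(mock_cg_skb_is_valid_access(FIELD_START(data), MOCK_READ));
	assert(mock_cg_skb_is_valid_access(FIELD_START(pkt_type), MOCK_READ));
	assert(mock_cg_skb_is_valid_access(FIELD_START(local_port), MOCK_READ));
	assert(mock_cg_skb_is_valid_access(FIELD_START(cb) + 16, MOCK_WRITE));
	assert(!mock_cg_skb_is_valid_access(FIELD_START(data), MOCK_WRITE));
	assert(!mock_cg_skb_is_valid_access(FIELD_START(flow_keys), MOCK_READ));
}
```

The model leaves out what the real verifier callback also has to do, such as checking access size and alignment and tagging data/data_end with packet-pointer register types.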
* Re: [PATCH bpf-next 1/2] bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
From: Song Liu @ 2018-10-17 19:07 UTC
To: Alexei Starovoitov
Cc: Alexei Starovoitov, netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, Kernel Team

> On Oct 17, 2018, at 12:02 PM, Alexei Starovoitov <ast@fb.com> wrote:
>
> [...]
>
> chatted with Song offline.
> I completely misread 'return false' in the above as 'break'.
> The patch actually disables access to pkt_type, mark, queue_mapping
> and so on, which is not correct either.
> Since tests were not failing we really need to improve this aspect
> of test coverage in test_verifier.c
>
> Also I missed that __cgroup_bpf_run_filter_skb() already
> does save_sk = skb->sk; skb->sk = sk;
> and bpf_prog_run_save_cb()
> So no issue in the existing code. That was a false alarm.
> Revising the proposal...
> I think cg_skb_is_valid_access() can be made similar to
> lwt_is_valid_access(),
> allowing writes into mark, priority, cb[0..4]
> and read of data/data_end.
> In addition it's also ok to allow family..local_port range
> (unlike lwt where sk may not be present),
> and no access to data_meta and flow_keys.

Thanks Alexei! I will send v2 shortly.

Song
* [PATCH bpf-next 2/2] bpf: add tests for direct packet access from CGROUP_SKB
From: Song Liu @ 2018-10-17 5:56 UTC
To: netdev; +Cc: ast, daniel, kernel-team, Song Liu

Tests are added to make sure CGROUP_SKB can directly access len, data,
and data_end in __sk_buff, but not other fields.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/testing/selftests/bpf/test_verifier.c | 30 +++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index cf4cd32b6772..aaf2ceba83dd 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -4862,6 +4862,36 @@ static struct bpf_test tests[] = {
 		.result = REJECT,
 		.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
 	},
+	{
+		"direct packet read for CGROUP_SKB",
+		.insns = {
+		BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+			    offsetof(struct __sk_buff, data)),
+		BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
+			    offsetof(struct __sk_buff, data_end)),
+		BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1,
+			    offsetof(struct __sk_buff, len)),
+		BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 8),
+		BPF_JMP_REG(BPF_JGT, BPF_REG_0, BPF_REG_3, 1),
+		BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_2, 0),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+	},
+	{
+		"invalid access of tc_classid for CGROUP_SKB",
+		.insns = {
+		BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+			    offsetof(struct __sk_buff, tc_classid)),
+		BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "invalid bpf_context access",
+		.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+	},
 	{
 		"valid cgroup storage access",
 		.insns = {
-- 
2.17.1
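Editorial sketch (not part of the patch): the accepted test program loads data and data_end, adds 8 to a copy of data, and skips the one-byte load when the result is past data_end. The userspace function below is a rough C rendering of that control flow. mock_direct_packet_read() and its return convention are hypothetical; the test program itself always returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte at data[0] was read into *out, or 0 if the
 * bounds check skipped the load. Assumes data <= data_end. */
int mock_direct_packet_read(const uint8_t *data, const uint8_t *data_end,
			    uint8_t *out)
{
	/* Mirrors "r0 = data; r0 += 8; if r0 > data_end goto skip".
	 * Compared as lengths so no out-of-range pointer is formed in C. */
	if ((size_t)(data_end - data) < 8)
		return 0;

	*out = data[0]; /* mirrors the BPF_B load at data + 0 */
	return 1;
}

/* Exercises both branches of the check. */
void mock_direct_packet_read_demo(void)
{
	uint8_t pkt[16] = { 0x45 };
	uint8_t first = 0;

	/* 16 bytes available: the load happens. */
	assert(mock_direct_packet_read(pkt, pkt + 16, &first) == 1);
	assert(first == 0x45);

	/* Only 4 bytes available: the load is skipped. */
	first = 0;
	assert(mock_direct_packet_read(pkt, pkt + 4, &first) == 0);
	assert(first == 0);
}
```

In the BPF version, the verifier only accepts the load because the comparison against data_end comes first. This C model has no such enforcement and only shows the shape of the check.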