Date: Mon, 1 Jul 2019 13:48:14 -0700
In-Reply-To: <20190701204821.44230-1-sdf@google.com>
Message-Id: <20190701204821.44230-2-sdf@google.com>
References: <20190701204821.44230-1-sdf@google.com>
Subject: [PATCH bpf-next 1/8] bpf: add BPF_CGROUP_SOCK_OPS callback that is executed on every RTT
From: Stanislav Fomichev
To: netdev@vger.kernel.org, bpf@vger.kernel.org
Cc: davem@davemloft.net, ast@kernel.org, daniel@iogearbox.net, Stanislav Fomichev, Eric Dumazet, Priyaranjan Jha, Yuchung Cheng, Soheil Hassas Yeganeh

Performance impact should be minimal because it's under a new
BPF_SOCK_OPS_RTT_CB_FLAG flag that has to be explicitly enabled.

Suggested-by: Eric Dumazet
Cc: Eric Dumazet
Cc: Priyaranjan Jha
Cc: Yuchung Cheng
Cc: Soheil Hassas Yeganeh
Signed-off-by: Stanislav Fomichev
---
 include/net/tcp.h        | 8 ++++++++
 include/uapi/linux/bpf.h | 6 +++++-
 net/ipv4/tcp_input.c     | 4 ++++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 9d36cc88d043..e16d8a3fd3b4 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2221,6 +2221,14 @@ static inline bool tcp_bpf_ca_needs_ecn(struct sock *sk)
 	return (tcp_call_bpf(sk, BPF_SOCK_OPS_NEEDS_ECN, 0, NULL) == 1);
 }
 
+static inline void tcp_bpf_rtt(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_RTT_CB_FLAG))
+		tcp_call_bpf(sk, BPF_SOCK_OPS_RTT_CB, 0, NULL);
+}
+
 #if IS_ENABLED(CONFIG_SMC)
 extern struct static_key_false tcp_have_smc;
 #endif
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index cffea1826a1f..9cdd0aaeba06 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1770,6 +1770,7 @@ union bpf_attr {
  * 		* **BPF_SOCK_OPS_RTO_CB_FLAG** (retransmission time out)
  * 		* **BPF_SOCK_OPS_RETRANS_CB_FLAG** (retransmission)
  * 		* **BPF_SOCK_OPS_STATE_CB_FLAG** (TCP state change)
+ * 		* **BPF_SOCK_OPS_RTT_CB_FLAG** (every RTT)
  *
  * 		Therefore, this function can be used to clear a callback flag by
  * 		setting the appropriate bit to zero. e.g. to disable the RTO
@@ -3314,7 +3315,8 @@ struct bpf_sock_ops {
 #define BPF_SOCK_OPS_RTO_CB_FLAG	(1<<0)
 #define BPF_SOCK_OPS_RETRANS_CB_FLAG	(1<<1)
 #define BPF_SOCK_OPS_STATE_CB_FLAG	(1<<2)
-#define BPF_SOCK_OPS_ALL_CB_FLAGS       0x7		/* Mask of all currently
+#define BPF_SOCK_OPS_RTT_CB_FLAG	(1<<3)
+#define BPF_SOCK_OPS_ALL_CB_FLAGS       0xF		/* Mask of all currently
 							 * supported cb flags
 							 */
 
@@ -3369,6 +3371,8 @@ enum {
 	BPF_SOCK_OPS_TCP_LISTEN_CB,	/* Called on listen(2), right after
 					 * socket transition to LISTEN state.
 					 */
+	BPF_SOCK_OPS_RTT_CB,		/* Called on every RTT.
+					 */
 };
 
 /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b71efeb0ae5b..c21e8a22fb3b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -778,6 +778,8 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us)
 				tp->rttvar_us -= (tp->rttvar_us - tp->mdev_max_us) >> 2;
 			tp->rtt_seq = tp->snd_nxt;
 			tp->mdev_max_us = tcp_rto_min_us(sk);
+
+			tcp_bpf_rtt(sk);
 		}
 	} else {
 		/* no previous measure. */
@@ -786,6 +788,8 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us)
 		tp->rttvar_us = max(tp->mdev_us, tcp_rto_min_us(sk));
 		tp->mdev_max_us = tp->rttvar_us;
 		tp->rtt_seq = tp->snd_nxt;
+
+		tcp_bpf_rtt(sk);
 	}
 	tp->srtt_us = max(1U, srtt);
 }
-- 
2.22.0.410.gd8fdbe21b5-goog
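
For readers who want to see the gating in isolation, here is a minimal userspace sketch of the logic the patch adds. It is not kernel code. The flag values are copied from the bpf.h hunk above, but struct mock_tcp_sock, mock_cb_flags_set() and mock_tcp_bpf_rtt() are hypothetical stand-ins, assumed here to behave like the flag byte in struct tcp_sock, the bpf_sock_ops_cb_flags_set() helper, and tcp_bpf_rtt() respectively. The counter takes the place of the real tcp_call_bpf() invocation.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined by the patch in include/uapi/linux/bpf.h. */
#define BPF_SOCK_OPS_RTO_CB_FLAG     (1 << 0)
#define BPF_SOCK_OPS_RETRANS_CB_FLAG (1 << 1)
#define BPF_SOCK_OPS_STATE_CB_FLAG   (1 << 2)
#define BPF_SOCK_OPS_RTT_CB_FLAG     (1 << 3)
#define BPF_SOCK_OPS_ALL_CB_FLAGS    0xF

/* The mask has to grow with the new flag, otherwise the bit could never be set. */
static_assert((BPF_SOCK_OPS_RTT_CB_FLAG & BPF_SOCK_OPS_ALL_CB_FLAGS) != 0,
	      "RTT flag must be covered by the all-flags mask");

/* Hypothetical stand-in for struct tcp_sock: only the flag byte is modeled. */
struct mock_tcp_sock {
	uint8_t bpf_sock_ops_cb_flags;
	unsigned int rtt_cb_calls;	/* simulated BPF_SOCK_OPS_RTT_CB invocations */
};

/*
 * Hypothetical analogue of the flag-setting helper (an assumption, not the
 * kernel implementation): store the supported bits, return the rejected ones.
 */
int mock_cb_flags_set(struct mock_tcp_sock *tp, int flags)
{
	tp->bpf_sock_ops_cb_flags = (uint8_t)(flags & BPF_SOCK_OPS_ALL_CB_FLAGS);
	return flags & ~BPF_SOCK_OPS_ALL_CB_FLAGS;
}

/*
 * Hypothetical analogue of tcp_bpf_rtt(): the callback fires only when the
 * socket has opted in, so sockets without the flag pay one bit test.
 */
void mock_tcp_bpf_rtt(struct mock_tcp_sock *tp)
{
	if (tp->bpf_sock_ops_cb_flags & BPF_SOCK_OPS_RTT_CB_FLAG)
		tp->rtt_cb_calls++;
}
```

Calling mock_tcp_bpf_rtt() on a zero-initialized mock leaves the counter untouched. After mock_cb_flags_set(&tp, BPF_SOCK_OPS_RTT_CB_FLAG), each call increments it once, which is the opt-in behaviour the commit message relies on for its performance argument.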