Date: Thu, 3 Feb 2022 18:46:37 +0100
From: Florian Westphal
To: Kishen Maloor
Cc: mptcp@lists.linux.dev
Subject: Re: [PATCH mptcp-next v5 5/8] mptcp: netlink: store per namespace list of refcounted listen socks
Message-ID: <20220203174637.GC4901@breakpoint.cc>
References: <20220203072508.3072309-1-kishen.maloor@intel.com> <20220203072508.3072309-6-kishen.maloor@intel.com>
In-Reply-To: <20220203072508.3072309-6-kishen.maloor@intel.com>

Kishen Maloor wrote:
> The kernel can create listening sockets bound to announced addresses
> via the ADD_ADDR option for receiving MP_JOIN requests. Path
> managers may further choose to advertise the same addr+port over multiple
> MPTCP connections. So this change provides a simple framework to
> manage a list of all distinct listening sockets created in the kernel
> over a namespace by encapsulating the socket in a structure that is
> ref counted and can be shared across multiple connections. The sockets
> are released when there are no more references.

I think it makes sense to work on a hook in the tcp v4/v6 input path
that gets called for the th->syn && !th->ack && no-listener-found case.

The hook would:
1. retrieve join token, fetch mptcp_sock and allow 3whs to continue
   if things look ok from mptcp p.o.v.
2. return "go ahead and send tcp rst" or "mptcp magic, skb stolen"
   to the tcp stack.

This also makes sure that plain tcp or mptcp connect requests will not
work for addresses that did not go through the socket/bind/listen API.

I will try to prototype something next week.

Given that the hook lives in an error path (from tcp point of view) I
think it's going to be OK from an upstreaming point of view.

It hopefully avoids the need for "magic listener sockets", and avoids
the kernel fighting with userspace applications over which address:port
pairs are really usable.

The latter is a concern IMO, esp. with reuseport and other round-robin
schemes: I don't want the mptcp layer to interfere with other
applications running on the same host.
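
To make the two-step hook described above a bit more concrete, here is a
rough userspace sketch of the decision it would take. This is not kernel
code and not the prototype mentioned above: struct fake_segment,
enum join_verdict, token_known() and nolistener_hook() are all made-up
names for illustration, and the token table is a plain array standing in
for whatever lookup the mptcp layer would really use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts handed back to the tcp input path. */
enum join_verdict {
	JOIN_SEND_RST,   /* "go ahead and send tcp rst" */
	JOIN_SKB_STOLEN, /* "mptcp magic, skb stolen" */
};

/* Simplified stand-in for the fields the hook would read from an skb. */
struct fake_segment {
	bool syn;
	bool ack;
	bool has_mp_join;    /* segment carries an MP_JOIN option */
	uint32_t join_token; /* token taken from that option */
};

/* Assumption: a linear scan replaces the real token -> mptcp_sock lookup. */
static bool token_known(const uint32_t *tokens, size_t n, uint32_t token)
{
	for (size_t i = 0; i < n; i++)
		if (tokens[i] == token)
			return true;
	return false;
}

/*
 * Models the hook for the "SYN without ACK, no listener found" case.
 * Anything that is not a join SYN with a known token is left to the
 * normal tcp error handling, i.e. a reset.
 */
static enum join_verdict nolistener_hook(const struct fake_segment *seg,
					 const uint32_t *tokens, size_t n)
{
	assert(seg != NULL);

	/* Only th->syn && !th->ack is of interest here. */
	if (!seg->syn || seg->ack)
		return JOIN_SEND_RST;

	/* Plain tcp or non-join mptcp connect request: no listener, no luck. */
	if (!seg->has_mp_join)
		return JOIN_SEND_RST;

	/* Step 1: retrieve the join token and look up the connection. */
	if (!token_known(tokens, n, seg->join_token))
		return JOIN_SEND_RST;

	/* Step 2: mptcp takes over, the 3whs continues on its side. */
	return JOIN_SKB_STOLEN;
}
```

In this model a join SYN with a known token is stolen, while a plain SYN
or a join SYN with an unknown token gets the usual reset, which is the
"did not go through socket/bind/listen" behaviour described above.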