From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9097e78d-0e7d-43bd-bafd-e53a4872a4d1@davidwei.uk>
Date: Wed, 22 May 2024 23:02:54 -0700
Precedence: bulk
X-Mailing-List: linux-parisc@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH net-next v9 11/14] tcp: RX path for devmem TCP
Content-Language: en-GB
To: Mina Almasry, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
 sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org, bpf@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
 dri-devel@lists.freedesktop.org
Cc: Donald Hunter, Jakub Kicinski, "David S. Miller", Eric Dumazet,
 Paolo Abeni, Jonathan Corbet, Richard Henderson, Ivan Kokshaysky,
 Matt Turner, Thomas Bogendoerfer, "James E.J.
 Bottomley", Helge Deller, Andreas Larsson, Jesper Dangaard Brouer,
 Ilias Apalodimas, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
 Arnd Bergmann, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
 Steffen Klassert, Herbert Xu, David Ahern, Willem de Bruijn, Shuah Khan,
 Sumit Semwal, Christian König, Pavel Begunkov, Jason Gunthorpe,
 Yunsheng Lin, Shailend Chand, Harshitha Ramamurthy, Shakeel Butt,
 Jeroen de Borst, Praveen Kaligineedi, Willem de Bruijn, Kaiyuan Zhang
References: <20240510232128.1105145-1-almasrymina@google.com>
 <20240510232128.1105145-12-almasrymina@google.com>
From: David Wei
In-Reply-To: <20240510232128.1105145-12-almasrymina@google.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 2024-05-10 16:21, Mina Almasry wrote:
> +/* On error, returns the -errno. On success, returns number of bytes sent to the
> + * user. May not consume all of @remaining_len.
> + */
> +static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
> +			      unsigned int offset, struct msghdr *msg,
> +			      int remaining_len)
> +{
> +	struct dmabuf_cmsg dmabuf_cmsg = { 0 };
> +	struct tcp_xa_pool tcp_xa_pool;
> +	unsigned int start;
> +	int i, copy, n;
> +	int sent = 0;
> +	int err = 0;
> +
> +	tcp_xa_pool.max = 0;
> +	tcp_xa_pool.idx = 0;
> +	do {
> +		start = skb_headlen(skb);
> +
> +		if (skb_frags_readable(skb)) {
> +			err = -ENODEV;
> +			goto out;
> +		}
> +
> +		/* Copy header. */
> +		copy = start - offset;
> +		if (copy > 0) {
> +			copy = min(copy, remaining_len);
> +
> +			n = copy_to_iter(skb->data + offset, copy,
> +					 &msg->msg_iter);
> +			if (n != copy) {
> +				err = -EFAULT;
> +				goto out;
> +			}
> +
> +			offset += copy;
> +			remaining_len -= copy;
> +
> +			/* First a dmabuf_cmsg for # bytes copied to user
> +			 * buffer.
> +			 */
> +			memset(&dmabuf_cmsg, 0, sizeof(dmabuf_cmsg));
> +			dmabuf_cmsg.frag_size = copy;
> +			err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_LINEAR,
> +				       sizeof(dmabuf_cmsg), &dmabuf_cmsg);
> +			if (err || msg->msg_flags & MSG_CTRUNC) {
> +				msg->msg_flags &= ~MSG_CTRUNC;
> +				if (!err)
> +					err = -ETOOSMALL;
> +				goto out;
> +			}
> +
> +			sent += copy;
> +
> +			if (remaining_len == 0)
> +				goto out;
> +		}
> +
> +		/* after that, send information of dmabuf pages through a
> +		 * sequence of cmsg
> +		 */
> +		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> +			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
> +			struct net_iov *niov;
> +			u64 frag_offset;
> +			int end;
> +
> +			/* !skb_frags_readable() should indicate that ALL the
> +			 * frags in this skb are dmabuf net_iovs. We're checking
> +			 * for that flag above, but also check individual frags
> +			 * here. If the tcp stack is not setting
> +			 * skb_frags_readable() correctly, we still don't want
> +			 * to crash here.
> +			 */
> +			if (!skb_frag_net_iov(frag)) {
> +				net_err_ratelimited("Found non-dmabuf skb with net_iov");
> +				err = -ENODEV;
> +				goto out;
> +			}
> +
> +			niov = skb_frag_net_iov(frag);

Sorry if we've already discussed this.

We have this additional hunk:

+			if (niov->pp->mp_ops != &dmabuf_devmem_ops) {
+				err = -ENODEV;
+				goto out;
+			}

This is in case one of our skbs ends up here, since for those
skb_frag_is_net_iov() and !skb_frags_readable() also hold. Does this
even matter? And if so, is there a better way to distinguish between
our two types of net_iovs?
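
To make the idea behind the hunk concrete, here is a minimal userspace
sketch of telling providers apart by comparing the ops pointer. The
struct layouts, other_provider_ops and devmem_check_niov() are
simplified stand-ins invented for illustration and are not the kernel
definitions; only the niov->pp->mp_ops comparison against
dmabuf_devmem_ops comes from the hunk above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel types. */
struct memory_provider_ops {
	const char *name;
};

struct page_pool {
	const struct memory_provider_ops *mp_ops;
};

struct net_iov {
	struct page_pool *pp;
};

/* One ops table per provider; the address acts as the provider's identity. */
static const struct memory_provider_ops dmabuf_devmem_ops = { "dmabuf-devmem" };
static const struct memory_provider_ops other_provider_ops = { "other" };

/*
 * Hypothetical helper: returns 0 if the net_iov belongs to the devmem
 * provider and -ENODEV otherwise. The NULL checks are extra caution in
 * this sketch and are not part of the hunk.
 */
static int devmem_check_niov(const struct net_iov *niov)
{
	if (!niov || !niov->pp)
		return -ENODEV;
	if (niov->pp->mp_ops != &dmabuf_devmem_ops)
		return -ENODEV;
	return 0;
}
```

With this model, a net_iov whose pool points at other_provider_ops is
rejected with -ENODEV even though it is a net_iov and would also be
unreadable, which is the case the hunk is meant to catch.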