From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sowmini Varadhan
Subject: Re: [PATCH net-next 4/6] kcm: Kernel Connection Multiplexor module
Date: Fri, 20 Nov 2015 17:50:12 -0500
Message-ID: <20151120225012.GB10508@oracle.com>
References: <1448054520-1464587-1-git-send-email-tom@herbertland.com>
 <1448054520-1464587-5-git-send-email-tom@herbertland.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: davem@davemloft.net, netdev@vger.kernel.org, kernel-team@fb.com,
 davewatson@fb.com, alexei.starovoitov@gmail.com
To: Tom Herbert
Return-path:
Received: from userp1040.oracle.com ([156.151.31.81]:36175 "EHLO
 userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752130AbbKTWuV (ORCPT ); Fri, 20 Nov 2015 17:50:21 -0500
Content-Disposition: inline
In-Reply-To: <1448054520-1464587-5-git-send-email-tom@herbertland.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On (11/20/15 13:21), Tom Herbert wrote:
> +static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
  :
> +
> +	if (msg->msg_flags & MSG_BATCH) {
> +		kcm->tx_wait_more = true;
> +	} else if (kcm->tx_wait_more || not_busy) {
> +		err = kcm_write_msgs(kcm);
> +		if (err < 0) {
> +			/* We got a hard error in write_msgs but have
> +			 * already queued this message. Report an error
> +			 * in the socket, but don't affect return value
> +			 * from sendmsg
> +			 */
> +			pr_warn("KCM: Hard failure on kcm_write_msgs\n");
> +			report_csk_error(&kcm->sk, -err);
> +		}
> +	}

It's interesting that kcm copies the user data to a skb and then
invokes kernel_sendpage on the frag_list in that skb - was this
specifically done with some perf goals in mind? If yes, do you happen
to have some estimate of how much this approach buys you, as opposed
to just setting up a sglist and calling tcp_sendpage later?
(RDS uses the latter approach, and I've tried to use the changes
introduced by Eric's commit 5640f76; it helps slightly, but I think
there may be other bottlenecks to overcome first for the specific
req-resp patterns that are common in DB workloads.)

The other question I had when reading this code is: what if the
application never sends that last MSG_BATCH-less message, e.g., it
lies about how it's going to send more messages? Will something
eventually time out and send the data?

Any estimates for a good batch size?

--Sowmini