From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sowmini Varadhan
Subject: Re: [PATCH RFC net-next 1/6] sock: MSG_PEEK support for sk_error_queue
Date: Thu, 18 Jan 2018 06:02:07 -0500
Message-ID: <20180118110207.GA24920@oracle.com>
References: <05d060dc1169649d84c37ad51b0f8fe54a2a3185.1516147540.git.sowmini.varadhan@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Network Development, David Miller, rds-devel@oss.oracle.com, santosh.shilimkar@oracle.com
To: Willem de Bruijn
Return-path:
Received: from userp2120.oracle.com ([156.151.31.85]:51232 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755343AbeARLE3 (ORCPT ); Thu, 18 Jan 2018 06:04:29 -0500
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On (01/17/18 18:50), Willem de Bruijn wrote:
>
> This can cause reordering with parallel readers. Can we avoid the need
> for peeking? It also caused a slew of subtle bugs previously.

Yes, I did notice the potential for re-ordering when writing the patch,
but these are not actually messages from the wire, so is re-ordering
fatal?

In general, I'm not particularly attached to this solution: in my
testing, I'm seeing that it's possible to reduce the latency and still
take a hit on the throughput if the application does not reap the
completion notification (and send out new data) efficiently.

Some (radically different) alternatives that were suggested to me:

- send up all the cookies as ancillary data with recvmsg (i.e., send
  it as cmsg data along with actual data from the wire). In most cases,
  the application has data to read anyway. If it doesn't (pure sender),
  we could wake up recvmsg with 0 bytes of data, but with the cookie
  info in the ancillary data. This feels not-so-elegant to me, but I
  suppose it would have the benefit of optimizing on the syscall
  overhead (and you could use MSG_CTRUNC to handle the case of
  insufficient buffer for cookies, sending the rest on the next call).

- allow the application to use a setsockopt on the rds socket, with
  some shmem region into which the kernel could write the cookies.
  Let the application reap cookies from that shmem region without
  syscall overhead.

> How about just define a max number of cookies and require the caller
> to always read with sufficient room to hold them?

This may be "good enough" as well: maybe allow a max of (say) 16
cookies, and set up the skb's in the error queue to send up batches of
16 cookies at a time?

--Sowmini
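
For illustration only, here is a rough userspace sketch of what the
ancillary-data alternative combined with a fixed cap of 16 cookies might
look like to the reader. The cmsg level and type (HYP_CMSG_LEVEL,
HYP_CMSG_COOKIES), the helper name collect_cookies() and the 32-bit
cookie width are all assumptions made up for the example; nothing in
this thread defines them. Only the standard CMSG_* macros and MSG_CTRUNC
are real.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define MAX_COOKIES      16      /* cap floated in the thread ("say, 16") */
#define HYP_CMSG_LEVEL   0x7fff  /* hypothetical: not a real SOL_* value */
#define HYP_CMSG_COOKIES 1       /* hypothetical cmsg type for cookie batches */

/* Control buffer big enough for one full batch of MAX_COOKIES cookies. */
#define COOKIE_CTRL_LEN  CMSG_SPACE(MAX_COOKIES * sizeof(uint32_t))

/*
 * Walk the ancillary data of a msghdr already filled in by recvmsg() and
 * copy out any cookies. Returns the number of cookies stored in out[].
 * *more is set to 1 when MSG_CTRUNC says the control buffer was too
 * small, i.e. the caller should expect the rest on a later call.
 */
static int collect_cookies(struct msghdr *msg, uint32_t *out, int max_out,
                           int *more)
{
    struct cmsghdr *c;
    int n = 0;

    *more = (msg->msg_flags & MSG_CTRUNC) ? 1 : 0;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        size_t i, cnt;

        if (c->cmsg_len < CMSG_LEN(0))
            break;              /* malformed header, stop walking */
        if (c->cmsg_level != HYP_CMSG_LEVEL ||
            c->cmsg_type != HYP_CMSG_COOKIES)
            continue;           /* some other ancillary message */

        /* payload bytes = total length minus the cmsg header */
        cnt = (c->cmsg_len - CMSG_LEN(0)) / sizeof(uint32_t);
        for (i = 0; i < cnt && n < max_out; i++) {
            /* memcpy avoids any alignment assumption on CMSG_DATA() */
            memcpy(&out[n], CMSG_DATA(c) + i * sizeof(uint32_t),
                   sizeof(uint32_t));
            n++;
        }
    }
    return n;
}
```

A caller would point msg_control at a buffer of COOKIE_CTRL_LEN bytes
before each recvmsg(), so a full batch always fits; MSG_CTRUNC then only
shows up if the caller undersized the buffer.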