From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sowmini Varadhan
Subject: Re: [PATCH RFC net-next 1/6] sock: MSG_PEEK support for sk_error_queue
Date: Thu, 18 Jan 2018 18:20:23 -0500
Message-ID: <20180118232023.GH24553@oracle.com>
References: <05d060dc1169649d84c37ad51b0f8fe54a2a3185.1516147540.git.sowmini.varadhan@oracle.com>
 <20180118110207.GA24920@oracle.com>
 <1516290887.3606.21.camel@gmail.com>
 <20180118161048.GB24553@oracle.com>
 <1516294395.3606.23.camel@gmail.com>
 <20180118171251.GD24553@oracle.com>
 <20180118230341.GG24553@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Eric Dumazet, Network Development, David Miller,
 rds-devel@oss.oracle.com, santosh.shilimkar@oracle.com
To: Willem de Bruijn
Return-path:
Received: from aserp2130.oracle.com ([141.146.126.79]:56462 "EHLO
 aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
 ESMTP id S1753881AbeARXUd (ORCPT ); Thu, 18 Jan 2018 18:20:33 -0500
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On (01/18/18 18:09), Willem de Bruijn wrote:
> If that is true in general for PF_RDS, then it is a reasonable approach.
> How about treating it as a (follow-on) optimization path. Opportunistic
> piggybacking of notifications on data reads is more widely applicable.

sounds good.

> > that's similar to what I have, except that it does not have the
> > MSG_PEEK part (you'd need to enforce that the data portion
> > is upper-bounded, and that the application has the responsibility
> > of sending down "enough" buffer with recvmsg).
>
> Right. I think that an upper bound is the simplest solution here.
>
> By the way, if you allocate an skb immediately on page pinning, then
> there are always sufficient skbs to store all notifications. On errqueue
> enqueue just drop the new skb and copy its notification to the body of
> the skb already on the queue, if one exists and it has room. That is
> essentially what the tcp zerocopy code does with the [data, info] range.

ok, I'll give that a shot (I'm working through the other review
comments as well)

fwiw, the data-corruption issue I mentioned turned out to be a day-one
bug in rds-tcp (patched in http://patchwork.ozlabs.org/patch/863183/).
The buffer reaping with zcopy (and aggressiveness of rds-stress)
brought this one out..

--Sowmini
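
For concreteness, a rough, self-contained sketch of the coalescing idea
discussed above: a queued notification covers an inclusive [lo, hi] range
of completed copy IDs (mirroring the ee_info/ee_data pair that the tcp
zerocopy completions expose), and a new completion is folded into that
range when it is contiguous and the range still has room, so the freshly
allocated skb can simply be dropped. This is illustrative user-space code,
not the kernel implementation; the struct and function names are made up.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one queued zerocopy notification: it covers
 * the completed copy IDs in the inclusive range [lo, hi]. */
struct zcopy_notif {
	uint32_t lo;
	uint32_t hi;
};

/* Try to fold completion 'id' into an existing notification.  Returns
 * true if the range was extended (the would-be new entry can be
 * dropped), false if the completion needs its own queued entry. */
static bool zcopy_notif_extend(struct zcopy_notif *n, uint32_t id)
{
	/* only coalesce completions contiguous with the current range */
	if (id != n->hi + 1)
		return false;
	/* and only while the 32-bit range still has room */
	if (n->hi == UINT32_MAX)
		return false;
	n->hi = id;
	return true;
}

int main(void)
{
	struct zcopy_notif n = { .lo = 1, .hi = 1 };
	uint32_t id;

	/* completions 2..5 arrive in order and all coalesce into one entry */
	for (id = 2; id <= 5; id++)
		if (!zcopy_notif_extend(&n, id))
			printf("would queue a new notification for id %u\n", id);

	printf("coalesced range: [%u, %u]\n", n.lo, n.hi);
	return 0;
}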