From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sowmini Varadhan
Subject: Re: [PATCH RFC net-next 1/6] sock: MSG_PEEK support for sk_error_queue
Date: Thu, 18 Jan 2018 18:20:23 -0500
Message-ID: <20180118232023.GH24553@oracle.com>
References: <05d060dc1169649d84c37ad51b0f8fe54a2a3185.1516147540.git.sowmini.varadhan@oracle.com>
 <20180118110207.GA24920@oracle.com>
 <1516290887.3606.21.camel@gmail.com>
 <20180118161048.GB24553@oracle.com>
 <1516294395.3606.23.camel@gmail.com>
 <20180118171251.GD24553@oracle.com>
 <20180118230341.GG24553@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Eric Dumazet, Network Development, David Miller,
 rds-devel@oss.oracle.com, santosh.shilimkar@oracle.com
To: Willem de Bruijn
Return-path:
Received: from aserp2130.oracle.com ([141.146.126.79]:56462 "EHLO
 aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
 ESMTP id S1753881AbeARXUd (ORCPT ); Thu, 18 Jan 2018 18:20:33 -0500
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On (01/18/18 18:09), Willem de Bruijn wrote:
> If that is true in general for PF_RDS, then it is a reasonable approach.
> How about treating it as a (follow-on) optimization path. Opportunistic
> piggybacking of notifications on data reads is more widely applicable.

sounds good.

> > that's similar to what I have, except that it does not have the
> > MSG_PEEK part (you'd need to enforce that the data portion
> > is upper-bounded, and that the application has the responsibility
> > of sending down "enough" buffer with recvmsg).
>
> Right. I think that an upper bound is the simplest solution here.
>
> By the way, if you allocate an skb immediately on page pinning, then
> there are always sufficient skbs to store all notifications. On errqueue
> enqueue just drop the new skb and copy its notification to the body of
> the skb already on the queue, if one exists and it has room. That is
> essentially what the tcp zerocopy code does with the [data, info] range.

ok, I'll give that a shot (I'm working through the other review
comments as well)

fwiw, the data-corruption issue I mentioned turned out to be a day-one
bug in rds-tcp (patched in http://patchwork.ozlabs.org/patch/863183/).
The buffer reaping with zcopy (and aggressiveness of rds-stress)
brought this one out..

--Sowmini
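
For concreteness, a rough, self-contained sketch of the coalescing idea
discussed above: a queued notification covers an inclusive [lo, hi] range
of completed copy IDs (mirroring the ee_info/ee_data pair that the tcp
zerocopy completions expose), and a new completion is folded into that
range when it is contiguous and the range still has room, so the freshly
allocated skb can simply be dropped. This is illustrative user-space code,
not the kernel implementation; the struct and function names are made up.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one queued zerocopy notification: it covers
 * the completed copy IDs in the inclusive range [lo, hi]. */
struct zcopy_notif {
	uint32_t lo;
	uint32_t hi;
};

/* Try to fold completion 'id' into an existing notification.  Returns
 * true if the range was extended (the would-be new entry can be
 * dropped), false if the completion needs its own queued entry. */
static bool zcopy_notif_extend(struct zcopy_notif *n, uint32_t id)
{
	/* only coalesce completions contiguous with the current range */
	if (id != n->hi + 1)
		return false;
	/* and only while the 32-bit range still has room */
	if (n->hi == UINT32_MAX)
		return false;
	n->hi = id;
	return true;
}

int main(void)
{
	struct zcopy_notif n = { .lo = 1, .hi = 1 };
	uint32_t id;

	/* completions 2..5 arrive in order and all coalesce into one entry */
	for (id = 2; id <= 5; id++)
		if (!zcopy_notif_extend(&n, id))
			printf("would queue a new notification for id %u\n", id);

	printf("coalesced range: [%u, %u]\n", n.lo, n.hi);
	return 0;
}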