From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Network Development <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>,
rds-devel@oss.oracle.com, santosh.shilimkar@oracle.com
Subject: Re: [PATCH net-next 4/7] rds: support for zcopy completion notification
Date: Sun, 28 Jan 2018 11:15:35 -0500 [thread overview]
Message-ID: <20180128161513.GA22885@oracle.com> (raw)
In-Reply-To: <CAF=yD-J99GuA4ndHU8aeQjNcLEZjhVZ7ViWcL8bW2jqVhStNXg@mail.gmail.com>
thanks for taking the time to go through the code!
>
> An alternative that does not require a timer is to batch on the sk
> error queue itself, like tcp zerocopy. That queues the first notification
> skb on the error queue without any notification latency.
>
> Then, if a subsequent notification comes in while another is pending
> with < MAX zcookies, it coalesces the new notification onto the pending
> skb and consumes the other. For RDS notifications, the implementation
> is an extra skb_put + uint32_t assignment.
This is an interesting idea, let me give it a try.
> Optionally, the socket can trigger another sk_error_report on each
> new notification.
I was trying to avoid that- an upcall for each message is not a good
idea when you have high traffic and are able to free multiple rds_messages
at a time.
> > +static void rds_rm_zerocopy_callback(struct rds_sock *rs,
> > + struct rds_znotifier *znotifier,
> > + bool force)
> > +{
> > + struct sock *sk = rds_rs_to_sk(rs);
> > + struct sk_buff *skb;
> > + struct sock_exterr_skb *serr;
> > + unsigned long flags;
> > + u32 *ptr;
> > + int ncookies = 0, i;
> > + struct rds_znotifier *znotif, *ztmp, *first;
> > + LIST_HEAD(tmp_list);
> > +
> > + spin_lock_irqsave(&rs->rs_lock, flags);
> > + ncookies = rs->rs_ncookies;
> > + if (ncookies < SO_EE_ORIGIN_MAX_ZCOOKIES && !force) {
> > + if (znotifier) { /* add this cookie to the list and return */
>
> can be checked before taking lock.
>
> More importantly, when is this ever NULL?
It is null when invoked from tthe timer callback (rs_zcopy_notify()
going off because havent had any traffic for the expiration interval
so we want to send out pending notifications, but dont have any znotifier
in this case). But you are right in that:
> This function is a callback
> for a zerocopy struct of type znotifier. Is it doing double duty to flush
> any outstanding if znotifier == NULL && force == true? If so, the first
> condition probably never occurs unless force == true and thus the
> second is redundant.
yes, force can simply be !znotifier.
I dont quite follow the "can be checked before taking the lock
comment though"- the lock is needed to make sure we atomically do
the lists "add new entry and potentially flush" operation.
the check is for the "potentially" part of that operation, so
I'm not seeing how it would help to move it out of the lock.
having said all that, I like the earlier suggestion of just
batching on the error_queue itself. If that works out without any
issues, all of this stuff may not be needed, so let me give that
a shot first.
> This adds unnecessary notification latency to delivery of current
> notification. The latest notification can be appended to tmp_list and
> sent up immediately.
true it if fits within the MAX cookies limit.
> > + (void)mod_timer(&rs->rs_cookie_timer, RDS_REAP_TIMEOUT);
>
> Resetting timeout on each queued notification causes unbound
> notification latency for previous notifications on the queue.
I'm not sure I get that comment. RDS_REAP_TIMEOUT is the upper bound
for sending notification. If, in the interim, we get back a TCP
ack that lets us reap a bunch of messages, we'd go and flush the
queue anyway, so the RDS_REAP_TIMEOUT will not matter. Can you
elaborate on your concern?
> > +
> > + sock_put(rds_rs_to_sk(rs));
> > + rm->m_rs = NULL;
>
> These two lines are now called only if znotifier is true, but
> used to be called whenever rm->m_rs != NULL. Intentional?
good catch, no that's a bug. Thanks for flagging.
Also, ACK to all the comments for patch 5/7. Will fix for next round.
next prev parent reply other threads:[~2018-01-28 16:15 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-24 11:45 [PATCH net-next 0/7] RDS zerocopy support Sowmini Varadhan
2018-01-24 11:45 ` [PATCH net-next 1/7] skbuff: export mm_[un]account_pinned_pages for other modules Sowmini Varadhan
2018-01-24 11:45 ` [PATCH net-next 2/7] rds: hold a sock ref from rds_message to the rds_sock Sowmini Varadhan
2018-01-25 14:44 ` Willem de Bruijn
2018-01-25 15:35 ` Sowmini Varadhan
2018-01-28 13:51 ` Willem de Bruijn
2018-01-28 16:18 ` Sowmini Varadhan
2018-01-28 18:54 ` Willem de Bruijn
2018-01-24 11:45 ` [PATCH net-next 3/7] sock: permit SO_ZEROCOPY on PF_RDS socket Sowmini Varadhan
2018-01-24 11:45 ` [PATCH net-next 4/7] rds: support for zcopy completion notification Sowmini Varadhan
2018-01-28 13:56 ` Willem de Bruijn
2018-01-28 16:15 ` Sowmini Varadhan [this message]
2018-01-28 18:46 ` Willem de Bruijn
2018-01-24 11:46 ` [PATCH net-next 5/7] rds: zerocopy Tx support Sowmini Varadhan
2018-01-28 13:57 ` Willem de Bruijn
2018-01-24 11:46 ` [PATCH net-next 6/7] selftests/net: add support for PF_RDS sockets Sowmini Varadhan
2018-01-28 13:58 ` Willem de Bruijn
2018-01-24 11:46 ` [PATCH net-next 7/7] selftests/net: add zerocopy support for PF_RDS test case Sowmini Varadhan
2018-01-28 14:00 ` Willem de Bruijn
2018-01-28 16:18 ` Sowmini Varadhan
2018-01-28 18:39 ` Willem de Bruijn
2018-01-28 19:57 ` Sowmini Varadhan
2018-01-25 16:41 ` [PATCH net-next 0/7] RDS zerocopy support Santosh Shilimkar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180128161513.GA22885@oracle.com \
--to=sowmini.varadhan@oracle.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=rds-devel@oss.oracle.com \
--cc=santosh.shilimkar@oracle.com \
--cc=willemdebruijn.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).