netdev.vger.kernel.org archive mirror
From: "santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org" <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Wengang Wang
	<wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] RDS: sync congestion map updating
Date: Fri, 1 Apr 2016 21:30:54 -0700
Message-ID: <56FF4AFE.9080606@oracle.com>
In-Reply-To: <20160402011459.GC8565-2ukJVAZIZ/Y@public.gmane.org>



On 4/1/16 6:14 PM, Leon Romanovsky wrote:
> On Fri, Apr 01, 2016 at 12:47:24PM -0700, santosh shilimkar wrote:
>> (cc-ing netdev)
>> On 3/30/2016 7:59 PM, Wengang Wang wrote:
>>>
>>>
>>> On 2016-03-31 09:51, Wengang Wang wrote:
>>>>
>>>>
>>>> On 2016-03-31 01:16, santosh shilimkar wrote:
>>>>> Hi Wengang,
>>>>>
>>>>> On 3/30/2016 9:19 AM, Leon Romanovsky wrote:
>>>>>> On Wed, Mar 30, 2016 at 05:08:22PM +0800, Wengang Wang wrote:
>>>>>>> We found that some of many parallel RDS communications hang.
>>>>>>> In my test, about ten of 33 communications hang. The send
>>>>>>> requests got an -ENOBUFS error, meaning the peer socket (port)
>>>>>>> is congested, but the peer socket (port) is in fact not congested.
>>>>>>>
>>>>>>> The congestion map can be updated in two paths: one is the
>>>>>>> rds_recvmsg path and the other is when packets are received from
>>>>>>> the hardware. There is no synchronization when updating the
>>>>>>> congestion map, so a bit operation (clearing) in the rds_recvmsg
>>>>>>> path can be lost to another bit operation (setting) in the
>>>>>>> hardware packet receiving path.
>>>>>>>
>>>
>>> In more detail: the two paths here (a user calling recvmsg and the
>>> hardware receiving data) are for different rds socks, so
>>> rds_sock->rs_recv_lock does not help to synchronize updates to the
>>> congestion map.
>>>
>> For the archive, let me try to conclude the thread. I synced
>> with Wengang off-list and came up with the fix below. I was under
>> the impression that __set_bit_le() was the atomic version. After
>> fixing it as in the patch (end of the email), the bug is addressed.
>>
>> I will probably send this as a fix for stable as well.
>>
>>
>>  From 5614b61f6fdcd6ae0c04e50b97efd13201762294 Mon Sep 17 00:00:00 2001
>> From: Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>> Date: Wed, 30 Mar 2016 23:26:47 -0700
>> Subject: [PATCH] RDS: Fix the atomicity for congestion map update
>>
>> Two different threads with different rds sockets may be in
>> rds_recv_rcvbuf_delta() via the receive path. If their ports
>> both map to the same word in the congestion map, then
>> using non-atomic ops to update it could leave the map
>> incorrect. Let's use atomics to avoid such an issue.
>>
>> Full credit to Wengang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> for
>> finding the issue, analysing it and also pointing out
>> the offending code with a spin-lock-based fix.
>
> I'm glad that you solved the issue without spinlocks.
> Out of curiosity, I see that this patch needs to be sent
> to Dave and applied by him. Is that right?
>
Right. I was planning to send this one together with one more fix
on netdev for Dave to pick up.

> ➜  linus-tree git:(master) ./scripts/get_maintainer.pl -f net/rds/cong.c
> Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> (supporter:RDS -
> RELIABLE DATAGRAM SOCKETS)
> "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> (maintainer:NETWORKING
> [GENERAL])
> netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org (open list:RDS - RELIABLE DATAGRAM SOCKETS)
> linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org (open list:RDS - RELIABLE DATAGRAM SOCKETS)
> rds-devel-N0ozoZBvEnrZJqsBc5GL+g@public.gmane.org (moderated list:RDS - RELIABLE DATAGRAM
> SOCKETS)
> linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org (open list)
>
>>
>> Signed-off-by: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>> Signed-off-by: Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>
> Reviewed-by: Leon Romanovsky <leon-2ukJVAZIZ/Y@public.gmane.org>
>
Thanks for the review.
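
For readers of the archive, the lost-update problem discussed in this thread can be sketched in userspace C11. This is only an illustration under assumptions, not the kernel patch: `cong_map`, `cong_set_bit()`, `cong_clear_bit()`, `cong_test_bit()` and `cong_set_bit_racy()` are hypothetical names, and C11 `<stdatomic.h>` operations stand in for the kernel's atomic bit helpers. It shows why two ports that share a word need an indivisible read-modify-write.

```c
#include <assert.h>     /* static_assert */
#include <limits.h>     /* CHAR_BIT */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a congestion map: one bit per 16-bit port. */
#define CONG_PORTS    65536u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define CONG_WORDS    (CONG_PORTS / BITS_PER_WORD)

static_assert(CONG_PORTS % BITS_PER_WORD == 0, "map must be whole words");

static _Atomic unsigned long cong_map[CONG_WORDS];

/* Neighbouring ports share a word, which is what makes the race possible. */
static inline size_t cong_word(unsigned int port)
{
    return port / BITS_PER_WORD;
}

static inline unsigned long cong_mask(unsigned int port)
{
    return 1UL << (port % BITS_PER_WORD);
}

/*
 * Racy variant: a separate load and store. If another thread updates a
 * different bit of the same word between the two steps, that update is
 * overwritten (lost) by the store below.
 */
static inline void cong_set_bit_racy(unsigned int port)
{
    _Atomic unsigned long *w = &cong_map[cong_word(port)];
    unsigned long old = atomic_load(w);

    atomic_store(w, old | cong_mask(port));
}

/* Atomic variants: one indivisible read-modify-write per update. */
static inline void cong_set_bit(unsigned int port)
{
    atomic_fetch_or(&cong_map[cong_word(port)], cong_mask(port));
}

static inline void cong_clear_bit(unsigned int port)
{
    atomic_fetch_and(&cong_map[cong_word(port)], ~cong_mask(port));
}

static inline bool cong_test_bit(unsigned int port)
{
    return (atomic_load(&cong_map[cong_word(port)]) & cong_mask(port)) != 0;
}
```

With the atomic helpers, a clear for one port and a set for a neighbouring port in the same word can run concurrently without either update being lost, and no lock is taken. The racy variant gives the same result only when nothing else touches the word between its load and its store.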

