From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marcelo Ricardo Leitner Subject: Re: [PATCH net] sctp: fix a success return may hide an error Date: Tue, 16 Aug 2016 15:33:30 -0300 Message-ID: <20160816183330.GD3110@localhost.localdomain> References: <31f3b581258d0458edcf30f65ef9513bdc41acc1.1470919978.git.lucien.xin@gmail.com> <20160812.211113.1099230839994600349.davem@davemloft.net> <063D6719AE5E284EB5DD2968C1650D6DB00DFD69@AcuExch.aculab.com> <063D6719AE5E284EB5DD2968C1650D6DB00E036B@AcuExch.aculab.com> <20160816172447.GC3110@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: David Laight , David Miller , network dev , "linux-sctp@vger.kernel.org" , Vladislav Yasevich , "daniel@iogearbox.net" To: Xin Long Return-path: Received: from mx1.redhat.com ([209.132.183.28]:53606 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750822AbcHPSdf (ORCPT ); Tue, 16 Aug 2016 14:33:35 -0400 Content-Disposition: inline In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Wed, Aug 17, 2016 at 02:24:19AM +0800, Xin Long wrote: > >> > This err returns back to sctp_sendmsg, there sctp will abort asoc. > > > > That's not right I think. sctp_sendmsg will only free the asoc if it was > > created to send that specific chunk. And in this case, this change > > should have no effect as it can't have sctp_outq_flush() touching > > several transports in a row. > > > > I'm basing on: > > out_free: > > if (new_asoc) > > sctp_association_free(asoc); > > > > and sctp_recvmsg will just fetch, return and clear the error via > > sctp_skb_recv_datagram, but not free it. > > > > Do you see any other place freeing it? > Sorry, you are right, it free assoc just for new_asoc. > > > > >> > >> That doesn't seem a good idea. > >> You don't want to abort the association if there is a transient > >> memory allocation failure. > >> You also can't drop data chunks. > > > > From a system-wise POV, this behavior - to free the new asoc in case of > > transient memory allocation failure - doesn't seem bad to me. > > That's what will have to happen if any allocation before it failed and > > also it helps the system to reduce the stress a little bit. I don't see > > any inconsistency/problems here because we are not dropping a single > > random chunk but instead we are actually refusing to initialize a new > > asoc in such conditions. > > > > Nevertheless, I agree that letting the application see ENOMEM errors when > > the data actually got queued and is being fully handled, as in, it will > > be retransmitted later, is not be wise, as the application probably > > won't be able to distinguish from ENOMEMs that it should retry or not. > > Here I see a problem, yet it's not due to this specific change, perhaps > > it just got attention because of it. In this situation, we should handle > > ENOMEMs internally if possible so the application can know that if it > > hits an ENOMEM, it's real and it has to retry. > If letting the application see ENOMEM errors, and sctp has to drop this > chunk, instead of retransmiting the ENOMEM chunk, but the ENOMEM > chunk may not be the chunk from current msg, as it flush all the queue. > even if users get an ENOMEM error, they may re-send a chunk that is same > with the one that is still in retransmit queue. Yep, one more reason to handle those internally when safe.