From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Graf
Subject: Re: [PATCH 1/4] [NETLINK]: Handle NLM_F_ECHO in netlink_rcv_skb()
Date: Fri, 11 Aug 2006 23:47:44 +0200
Message-ID: <20060811214744.GO14627@postel.suug.ch>
References: <20060809204821.216122988@postel.suug.ch> <20060809205439.434010049@postel.suug.ch> <20060810155120.GA494@ms2.inr.ac.ru> <20060810190210.GH14627@postel.suug.ch> <20060810203252.GA6414@ms2.inr.ac.ru> <20060810211833.GK14627@postel.suug.ch> <20060811153549.GA16351@ms2.inr.ac.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: davem@davemloft.net, netdev@vger.kernel.org
Return-path:
Received: from postel.suug.ch ([194.88.212.233]:38113 "EHLO postel.suug.ch") by vger.kernel.org with ESMTP id S1751094AbWHKVr0 (ORCPT ); Fri, 11 Aug 2006 17:47:26 -0400
To: Alexey Kuznetsov
Content-Disposition: inline
In-Reply-To: <20060811153549.GA16351@ms2.inr.ac.ru>
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

* Alexey Kuznetsov 2006-08-11 19:35
> Well, tc was supposed to use it, but this did not happen and
> it remained deficient.

Makes sense, especially for auto-generated handles. I've been listening
to the notifications on a separate socket for this purpose. It would
make sense, however, to extend a wait_for_ack() function to report back
any echoed objects, so that a blocking operation is available as well.

> Actually, it was supposed to be done everywhere, but originator info
> did not propagate deep enough in many cases, especially in IPv6.
> So, this is not a hack, it is a good work. :-)

It does make sense; the way it has been implemented, if at all, is
creepy. Even worse, IPv6 is using current->pid, while some other code
has been using the pid from NETLINK_CREDS() :-)

> Each socket, which subscribes to multicasts becomes sensitive
> to rcvbuf overflows. F.e. when you do control operations on a socket,
> which is subscribed to multicasts, the response can be lost in stream
> of events and -ENOBUFS generated instead.
> If it is a daemon, it can resync
> the state, but if it is a simple utility, it cannot recover.

Yes, for that reason it is recommended to use a separate socket when
receiving multicasts. Another reason is that some of the multicast code
is buggy and passes the pid of the requestor's socket to
netlink_broadcast(), which excludes that socket from the broadcast.

> Probably, unicasts sent due to NLM_F_ECHO should somehow override
> rcvbuf limits.
>
> This reminded me about a capital problem, found by openvz people.
> Frankly speaking, I still have no idea how to repair this, probably you
> will find a solution.
>
> Look: while a dump, skb allocation can fail (because of many reasons,
> the most obvious is that rcvbuf space was eaten by multicasts).
> But error is not reported! Oops. The worst thing is that even if an error
> is reported, iproute would ignore it.

I'm not sure I understand this correctly. If rcvbuf space was eaten by
multicasts, a subsequent recvmsg() will follow, invoking netlink_dump()
again, and the dump continues.
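
For illustration, the separate-socket recommendation above could look
roughly like the following from userspace. This is only a sketch, not
code from the patch series: open_rtnl(), events_were_lost() and
read_events() are hypothetical helper names, and the choice of
NETLINK_ROUTE is an assumption made for the example. One socket is kept
unsubscribed for requests and acks; the other is bound to multicast
groups and is the only one expected to see ENOBUFS from an event flood.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: open a NETLINK_ROUTE socket bound to the given
 * multicast group mask (0 means unicast only, i.e. a request socket).
 * Returns the fd, or -1 with errno set. */
int open_rtnl(unsigned int groups)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = groups;	/* nl_pid 0: kernel assigns an id */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Hypothetical helper: 1 if a failed recv() means the receive buffer
 * overflowed and messages were dropped, 0 otherwise. */
int events_were_lost(int err)
{
	return err == ENOBUFS;
}

/* Hypothetical helper: one non-blocking read from the event socket.
 * Returns bytes read, 0 if nothing is pending, -1 if events were lost
 * (the caller should resync its state), -2 on any other error. */
ssize_t read_events(int fd, void *buf, size_t len)
{
	ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);

	if (n >= 0)
		return n;
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return 0;
	if (events_were_lost(errno))
		return -1;
	return -2;
}
```

A caller would open the request socket with open_rtnl(0) and the event
socket with, for example, open_rtnl(RTMGRP_LINK). Replies to control
operations then never compete with the event stream for rcvbuf space on
the request socket, and a -1 from read_events() tells a daemon to
resync.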