From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from zimbra13.linbit.com (zimbra.linbit.com [212.69.161.123])
	by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 24F22101E062
	for ; Wed, 24 Sep 2014 23:50:48 +0200 (CEST)
Date: Wed, 24 Sep 2014 23:50:47 +0200
From: Lars Ellenberg
To: PaX Team
Message-ID: <20140924215047.GI7118@soda.linbit>
References: <20140919094909.GA21578@schiffbauer.net>
	<5422E9F6.18603.4C8D74DA@pageexec.freemail.hu>
	<20140924163106.GH7118@soda.linbit>
	<5423084F.5168.4D040014@pageexec.freemail.hu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5423084F.5168.4D040014@pageexec.freemail.hu>
Cc: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] drbd 8.4.3: refcounter overflow on re-sync
List-Id: "*Coordination* of development, patches, contributions -- *Questions* \(even to developers\) go to drbd-user, please."

On Wed, Sep 24, 2014 at 08:07:11PM +0200, PaX Team wrote:
> > > perhaps it's a consequence of the reaction from the kernel on the overflow
> > > which is equivalent to a SIGKILL with all that it implies (files and network
> > > connections get closed, etc).
> >
> > That would be the result of the _ASM_EXTABLE()?
> > or what causes that "reaction"?
>
> no, the extable mechanism is only used to re-enter the kernel in a known
> way to be able to report back on the detected refcount overflow. the actual
> reaction is in pax_report_refcount_overflow

Which is registered in the corresponding place in the exception table.
So yes.

> (you'll need a grsec or PaX tree

I browsed some PaX patch instead.

> to see its body, it's not in the upstream kernel). it basically logs details
> about the overflow (registers, process info, etc) then forces a SIGKILL into
> the task.
>
> you can see its output in the original report in this thread in fact, this
> is what enabled me to figure out which atomic variable was involved and start
> a discussion about this case (FYI, i've since turned both variables into the
> 'unchecked' type).
>
> > As the process in question in *this* case is a drbd kernel thread, it
> > does not much care about that KILL. It notices, clears it, and lives on.
>
> grsecurity handles kernel tasks too via gr_handle_kernel_exploit but for
> the refcount overflow detection we specifically chose to ignore them for
> two reasons. first, in the typical exploit scenario of these kinds of bugs
> it's a userland process in whose context the refcount overflow triggers.

Then I guess Marc is a very lucky guy...
Otherwise you would have killed the whole box
just because it managed to sync the first TiB ;-)

> second, since this is an early detection (i.e., before any damage could
> have been done by an attack), the kernel state isn't corrupted yet and is
> thus recoverable, so it's not urgent to halt the system (which is otherwise
> necessary when unrecoverable state change occurs, think various forms of
> memory corruption, etc).
>
> > But how would KILL'ing an innocent userland process improve the overall
> > situation? Being a user land process, it cannot possibly be blamed for
> > an in-kernel counter overflow, so why even kill it?
>
> notwithstanding the very few false positives that arise due to our 'secure
> by default' choice in handling atomic_t accessors (i actually blame the
> kernel's lack of a proper abstraction layer on top atomic_t ;), an exploit
> is anything but an innocent userland process and the proper way to handle
> it is to kill it and also ban the user account (all this is a configurable
> choice in grsecurity).

Carefully crafted exploits may be able to exploit PaX for a nice DoS,
provoking it to kill someone else instead, no?

I predicted earlier that this would not be a fruitful discussion.
Because where you come from, a dead system is better than "suspicious
behavior", and anyone that even only happens to be in the vicinity of
"suspicious behavior" will get shot as a precautionary measure --
"collateral damage, should not have been there in the first place,
really his own fault, what was he thinking" ;-)

(For arbitrary values^W^W empirically sampled values of suspicious)

And even though I sure can flex my mind, go those places, think that way,
I'd rather not.

Anyways, if it helps make the world a better place...
At least it's all just bits and entropy :)

Cheers,

	Lars
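
For readers following the thread, the "checked" versus "unchecked" counter
distinction can be sketched in plain userspace C. This is only an
illustration under assumptions: the type and function names below are
hypothetical, it is not the PaX or kernel implementation, and where the
real mechanism logs and sends a signal, this sketch merely returns false.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's atomic_t
 * and not the PaX types. No atomicity is attempted here. */
typedef struct { int value; } checked_counter;
typedef struct { unsigned int value; } unchecked_counter;

/* Reference-count style increment: refuses to go past INT_MAX.
 * Returns false and leaves the counter unchanged on overflow, which is
 * the point where a real implementation would report and react. */
static bool checked_inc(checked_counter *c)
{
    if (c->value == INT_MAX)
        return false;
    c->value++;
    return true;
}

/* Statistics style increment: wraps silently. Unsigned arithmetic
 * wraps modulo 2^N, which is well defined in C. */
static void unchecked_inc(unchecked_counter *c)
{
    c->value++;
}
```

A counter that only tracks something like the amount of data synced can
wrap harmlessly and suits the unchecked variant; a counter that guards an
object's lifetime is the case the checked variant is meant for.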