Message-ID: <1504154298.23109.23.camel@gmx.de>
Subject: Re: tip -ENOBOOT - bisected to locking/refcounts, x86/asm: Implement fast refcount overflow protection
From: Mike Galbraith
To: Kees Cook, "David S. Miller", Peter Zijlstra
Cc: LKML, Ingo Molnar, "Reshetova, Elena", Network Development
Date: Thu, 31 Aug 2017 06:38:18 +0200
References: <1503996623.8323.20.camel@gmx.de>
 <1504025721.6024.25.camel@gmx.de>
 <1504030207.6560.0.camel@gmx.de>
 <1504069332.8352.3.camel@gmx.de>
 <1504113212.5852.6.camel@gmx.de>
 <1504115735.5852.11.camel@gmx.de>
 <1504145389.23109.4.camel@gmx.de>
 <1504149176.23109.9.camel@gmx.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2017-08-30 at 21:10 -0700, Kees Cook wrote:
> On Wed, Aug 30, 2017 at 9:01 PM, Kees Cook wrote:
> > On Wed, Aug 30, 2017 at 8:12 PM, Mike Galbraith wrote:
> >> On Wed, 2017-08-30 at 19:27 -0700, Kees Cook wrote:
> >>
> >>> Interesting! Can you try with 633547973ffc3 ("net: convert
> >>> sk_buff.users from atomic_t to refcount_t") reverted? I'll see if
> >>> running haveged will help me trigger this on my system...
> >>
> >> With that (plus 230cd1279d001 fix to it) reverted, vbox boots.
> >
> > Wonderful! Thank you so much for helping track this down.
> >
> > So, it seems that sk_buff.users will need some more special attention
> > before we can convert it to refcount.
> >
> > x86-refcount will saturate with refcount_dec_and_test() if the result
> > is negative. But that would mean at least starting at 0. FULL should
> > have WARNed in this case, so I remain slightly confused why it was
> > missed by FULL.
>
> Actually, if this is a race condition it's possible that FULL is slow
> enough to miss it...
>
> I bet something briefly takes the refcount negative, and with
> unchecked atomics, it come back up positive again during the race.
> FULL may miss the race, and x86-refcount will catch it and saturate...

Hm, I'll go have a stare.. not that that's likely to turn anything up,
memory ordering stares usually inducing a zombie like state.

	-Mike
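
The difference Kees describes between unchecked atomics and a saturating
refcount can be sketched in userspace C. This is only a rough illustration
built on C11 <stdatomic.h>, not the kernel's refcount_t or the x86
fast-refcount code: every function name and the SKETCH_SATURATED value are
made up for this example, and the sequence runs single-threaded, so it shows
the order of operations in the suspected race rather than a real race.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical pin value for this sketch; any clearly negative value works. */
#define SKETCH_SATURATED (INT_MIN / 2)

/* Unchecked counter: plain atomic arithmetic, no sanity checks at all. */
bool unchecked_dec_and_test(atomic_int *r)
{
    /* atomic_fetch_sub() returns the old value; old == 1 means we hit 0. */
    return atomic_fetch_sub(r, 1) == 1;
}

void unchecked_inc(atomic_int *r)
{
    atomic_fetch_add(r, 1);
}

/* Saturating counter: a negative result pins the counter for good. */
bool saturating_dec_and_test(atomic_int *r)
{
    int result = atomic_fetch_sub(r, 1) - 1;

    if (result < 0) {
        atomic_store(r, SKETCH_SATURATED);
        return false;           /* never report "last reference dropped" */
    }
    return result == 0;
}

void saturating_inc(atomic_int *r)
{
    int result = atomic_fetch_add(r, 1) + 1;

    if (result < 0)             /* already pinned: stay pinned */
        atomic_store(r, SKETCH_SATURATED);
}

/*
 * Suspected sequence: one stray decrement followed by two increments.
 * Each helper returns the final counter value.
 */
int demo_unchecked(int start)
{
    atomic_int r;

    atomic_init(&r, start);
    (void)unchecked_dec_and_test(&r);   /* briefly negative if start == 0 */
    unchecked_inc(&r);
    unchecked_inc(&r);
    return atomic_load(&r);
}

int demo_saturating(int start)
{
    atomic_int r;

    atomic_init(&r, start);
    (void)saturating_dec_and_test(&r);  /* pins if the result is negative */
    saturating_inc(&r);
    saturating_inc(&r);
    return atomic_load(&r);
}
```

Starting from 0, demo_unchecked() ends at 1 and the transient negative value
leaves no trace, while demo_saturating() ends pinned at SKETCH_SATURATED. In
this sketch, that is the sense in which a saturating counter would "catch" a
dip that unchecked atomics recover from silently.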