From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752071AbdIANKW convert rfc822-to-8bit (ORCPT ); Fri, 1 Sep 2017 09:10:22 -0400 Received: from mout.gmx.net ([212.227.17.20]:55406 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751778AbdIANKT (ORCPT ); Fri, 1 Sep 2017 09:10:19 -0400 Message-ID: <1504271369.332.29.camel@gmx.de> Subject: Re: tip -ENOBOOT - bisected to locking/refcounts, x86/asm: Implement fast refcount overflow protection From: Mike Galbraith To: Kees Cook Cc: "David S. Miller" , Peter Zijlstra , LKML , Ingo Molnar , "Reshetova, Elena" , Network Development Date: Fri, 01 Sep 2017 15:09:29 +0200 In-Reply-To: <1504249070.17604.20.camel@gmx.de> References: <1503996623.8323.20.camel@gmx.de> <1504025721.6024.25.camel@gmx.de> <1504030207.6560.0.camel@gmx.de> <1504069332.8352.3.camel@gmx.de> <1504113212.5852.6.camel@gmx.de> <1504115735.5852.11.camel@gmx.de> <1504145389.23109.4.camel@gmx.de> <1504149176.23109.9.camel@gmx.de> <1504187918.27500.16.camel@gmx.de> <1504199967.666.16.camel@gmx.de> <1504249070.17604.20.camel@gmx.de> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.20.5 Mime-Version: 1.0 Content-Transfer-Encoding: 8BIT X-Provags-ID: V03:K0:yOVa2Z9psE7USR2gZ9m6zkqwFChGXKZ/N9QYj5XAL6aiLgb4KVJ xUl/mku8hH7EzuK+ZBg9U9VLhUwEbnPBeES2Wcwhuq13+yImWzFmGiJ9eUslcr3QNhQISre 0+N8sppBuOFaf/1kW+4/psr91TsE8aGu8BvyJ02/rI+tz8GhzOLod2OBI5Suav03Ew0/AYS 59Z9cl8LtbVdJVst8lbxg== X-UI-Out-Filterresults: notjunk:1;V01:K0:ufMor2WQ9Cs=:tLA15pZv9TS9Wc1mdM39Gk nyTZ40StmfWeJFkpUodC0C7z50P1gD1T8Jq6J8otPhuhYFIIKwrcUP+AMwDfRRs4JwRyuyoLF vhetLYvv0S3OGumKzosrGRhYD9Kp1OlEzWRUUk4bquGA0lnYRKnOC/PuB+2CRhsiXeOIwkR5U bLZAi4+iuVX89cjPiRN2jZgTUm1+hAiSBM8B5f8fmrpcyEK22iL0jMyDGRKrh7m1eQM6uouCS BTJ4AvU93xeibVgwRDFpObMkoHixh55UXEqMsMDhUrch5k0E4vgqgiPGLdIJDJAji2SVsuaNu uazIG1gko7Lfg8q0SDUgTfz9ga+G7M9IU0ZGUMh7SjJ7MzPZW4004tP3MJq1F5JAeXVFSIQBU P3AmcTAGQ/n+C5ePYe/hWhtr1MyXHTAI8gKrHvEe/PyHvZy6Wok4FcnMbybYdcbTf0HwGkUP1 6bY6IJBQe11/bJPu3CQCaQBy85JglM9eTUMBh6VbzJTLMUyWy4284C45QXDrZKoHtElFXXH7u N4zWW1hkCLJG//tSfATvDJKy7MQyNSlaYQXBW6cbN+gGh8aYC87zqujsg+W7G+QsjoeGlrPpN 4iwwwiA8m4HYcCpyql2lDAQD2mmlHrsFySBJsVfkWm2Qd7tcEmMX9lBGy3G9I0JOGNOpE91Wg nI2LmQiZg7JXR00/DMOthub2JhRNO/ByCGDtE5H3+WEh3xp99nlOgBMIsavYRE2akuvBRMR3t qxogp93pyebbv6C+wGK31Ec8DvXoXEOSsClj2ucx29Lk99F9gwX+iGyOllP5cIfMRLzUvzuu2 s4pPpN4xMl/zimMKfZ2g2s1R0q3E+YMsQB2DL+cNukCaMbGx0M= Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 2017-09-01 at 08:57 +0200, Mike Galbraith wrote: > On Thu, 2017-08-31 at 11:45 -0700, Kees Cook wrote: > > On Thu, Aug 31, 2017 at 10:19 AM, Mike Galbraith wrote: > > > On Thu, 2017-08-31 at 10:00 -0700, Kees Cook wrote: > > >> > > >> Oh! So it's gcc-version sensitive? That's alarming. Is this mapping correct: > > >> > > >> 4.8.5: WARN, eventual kernel hang > > >> 6.3.1, 7.0.1: WARN, but continues working > > > > > > Yeah, that's correct. I find that troubling, simply because this gcc > > > version has been through one hell of a lot of kernels with me. Yeah, I > > > know, that doesn't exempt it from having bugs, but color me suspicious. > > > > I still can't hit this with a 4.8.5 build. :( > > > > With _RATELIMIT removed, this should, in theory, report whatever goes > > negative first... > > I applied the other patch you posted, and built with gcc-6.3.1 to > remove the gcc-4.8.5 aspect.  Look below the resulting splat. Grr, that one has a in6_dev_getx() line missing for the first increment, where things go pear shaped. With that added, looking at counter both before, and after incl, with a trace_printk() in the exception handler showing it doing its saturate thing, irqs disabled across the whole damn refcount_inc(), and even booting box nr_cpus=1 for extra credit... HTH can that first refcount_inc() get there? # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | systemd-1 [000] d..1 1.937284: in6_dev_getx: PRE refs.counter:3 systemd-1 [000] d..1 1.937295: ex_handler_refcount: *(int *)regs->cx = -1073741824 systemd-1 [000] d..1 1.937296: in6_dev_getx: POST refs.counter:-1073741824 systemd-1 [000] d..1 1.937296: in6_dev_getx: PRE refs.counter:-1073741824 systemd-1 [000] d..1 1.937297: ex_handler_refcount: *(int *)regs->cx = -1073741824 systemd-1 [000] d..1 1.937297: in6_dev_getx: POST refs.counter:-1073741824 systemd-1 [000] d..1 1.937297: in6_dev_getx: PRE refs.counter:-1073741824 systemd-1 [000] d..1 1.937298: ex_handler_refcount: *(int *)regs->cx = -1073741824 systemd-1 [000] d..1 1.937299: in6_dev_getx: POST refs.counter:-1073741824 --- arch/x86/include/asm/refcount.h | 14 ++++++++++++++ arch/x86/mm/extable.c | 1 + include/net/addrconf.h | 12 ++++++++++++ net/ipv6/route.c | 6 +++--- 4 files changed, 30 insertions(+), 3 deletions(-) --- a/arch/x86/include/asm/refcount.h +++ b/arch/x86/include/asm/refcount.h @@ -55,6 +55,20 @@ static __always_inline void refcount_inc : : "cc", "cx"); } +static __always_inline void refcount_inc_x(refcount_t *r) +{ + unsigned long flags; + + local_irq_save(flags); + trace_printk("PRE refs.counter:%d\n", r->refs.counter); + asm volatile(LOCK_PREFIX "incl %0\n\t" + REFCOUNT_CHECK_LT_ZERO + : [counter] "+m" (r->refs.counter) + : : "cc", "cx"); + trace_printk("POST refs.counter:%d\n", r->refs.counter); + local_irq_restore(flags); +} + static __always_inline void refcount_dec(refcount_t *r) { asm volatile(LOCK_PREFIX "decl %0\n\t" --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -45,6 +45,7 @@ bool ex_handler_refcount(const struct ex { /* First unconditionally saturate the refcount. */ *(int *)regs->cx = INT_MIN / 2; + trace_printk("*(int *)regs->cx = %d\n", *(int *)regs->cx); /* * Strictly speaking, this reports the fixup destination, not --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -321,6 +321,18 @@ static inline struct inet6_dev *in6_dev_ return idev; } +static inline struct inet6_dev *in6_dev_getx(const struct net_device *dev) +{ + struct inet6_dev *idev; + + rcu_read_lock(); + idev = rcu_dereference(dev->ip6_ptr); + if (idev) + refcount_inc_x(&idev->refcnt); + rcu_read_unlock(); + return idev; +} + static inline struct neigh_parms *__in6_dev_nd_parms_get_rcu(const struct net_device *dev) { struct inet6_dev *idev = __in6_dev_get(dev); --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -4041,12 +4041,12 @@ void __init ip6_route_init_special_entri * the loopback reference in rt6_info will not be taken, do it * manually for init_net */ init_net.ipv6.ip6_null_entry->dst.dev = init_net.loopback_dev; - init_net.ipv6.ip6_null_entry->rt6i_idev = in6_dev_get(init_net.loopback_dev); + init_net.ipv6.ip6_null_entry->rt6i_idev = in6_dev_getx(init_net.loopback_dev); #ifdef CONFIG_IPV6_MULTIPLE_TABLES init_net.ipv6.ip6_prohibit_entry->dst.dev = init_net.loopback_dev; - init_net.ipv6.ip6_prohibit_entry->rt6i_idev = in6_dev_get(init_net.loopback_dev); + init_net.ipv6.ip6_prohibit_entry->rt6i_idev = in6_dev_getx(init_net.loopback_dev); init_net.ipv6.ip6_blk_hole_entry->dst.dev = init_net.loopback_dev; - init_net.ipv6.ip6_blk_hole_entry->rt6i_idev = in6_dev_get(init_net.loopback_dev); + init_net.ipv6.ip6_blk_hole_entry->rt6i_idev = in6_dev_getx(init_net.loopback_dev); #endif }