From: Niklas Cassel <niklas.cassel@axis.com>
To: <tj@kernel.org>, <peterz@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [BUG] sched: leaf_cfs_rq_list use after free
Date: Fri, 4 Mar 2016 11:41:17 +0100
Message-ID: <56D9664D.8080503@axis.com>

Hello

I've stumbled upon a use-after-free bug related to
CONFIG_FAIR_GROUP_SCHED / rq->cfs_rq->leaf_cfs_rq_list in v4.4.


Normally, a cfs_rq is immediately removed from the leaf_cfs_rq_list
and cfs_rq->on_list is set to 0; the cfs_rq is then freed at a later
time by call_rcu(&tg->rcu, free_sched_group_rcu).


In the crashing case, the cfs_rq is likewise removed from the
leaf_cfs_rq_list and cfs_rq->on_list is set to 0. However, the cfs_rq
is then re-added to the list and cfs_rq->on_list is set back to 1,
and only after that comes the call to
call_rcu(&tg->rcu, free_sched_group_rcu).

Now the cfs_rq is freed, filled with 0x6b6b6b6b by SLUB_DEBUG,
and still on the leaf_cfs_rq_list. Since the cfs_rq is still on
the list, the next call to update_blocked_averages will iterate
the list and will try to access members of the cfs_rq object,
an object which has already been freed.
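
The ordering described above can be replayed in a small userspace
model. This is only a sketch under stated assumptions, not the kernel
code: every model_* name is hypothetical, a plain singly linked list
stands in for leaf_cfs_rq_list, and a "freed" flag stands in for the
real free so that the model itself never touches released memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct cfs_rq (not the kernel type). */
struct model_cfs_rq {
	struct model_cfs_rq *next;	/* stand-in for the leaf_cfs_rq_list linkage */
	int on_list;
	bool freed;			/* models the free; memory is never released */
};

/* Hypothetical stand-in for the per-CPU rq that owns the leaf list. */
struct model_rq {
	struct model_cfs_rq *leaf_list;
};

static void model_list_add_leaf(struct model_rq *rq, struct model_cfs_rq *cfs_rq)
{
	if (cfs_rq->on_list)
		return;
	cfs_rq->next = rq->leaf_list;
	rq->leaf_list = cfs_rq;
	cfs_rq->on_list = 1;
}

static void model_list_del_leaf(struct model_rq *rq, struct model_cfs_rq *cfs_rq)
{
	struct model_cfs_rq **pp;

	if (!cfs_rq->on_list)
		return;
	for (pp = &rq->leaf_list; *pp; pp = &(*pp)->next) {
		if (*pp == cfs_rq) {
			*pp = cfs_rq->next;
			break;
		}
	}
	cfs_rq->next = NULL;
	cfs_rq->on_list = 0;
}

/* Count the entries a list walk would visit that were already "freed". */
static int model_count_stale(const struct model_rq *rq)
{
	int stale = 0;

	for (const struct model_cfs_rq *p = rq->leaf_list; p; p = p->next)
		if (p->freed)
			stale++;
	return stale;
}

/*
 * Replay: add, remove (offline), optionally re-add, then "free".
 * Returns how many freed entries are still reachable from the list.
 */
int model_replay(bool readd_after_offline)
{
	struct model_rq rq = { .leaf_list = NULL };
	struct model_cfs_rq cfs_rq = { 0 };

	model_list_add_leaf(&rq, &cfs_rq);		/* first enqueue */
	model_list_del_leaf(&rq, &cfs_rq);		/* removal, on_list = 0 */
	if (readd_after_offline)
		model_list_add_leaf(&rq, &cfs_rq);	/* late re-add, on_list = 1 */
	cfs_rq.freed = true;				/* the deferred free runs */
	return model_count_stale(&rq);
}
```

In this model, model_replay(false) leaves nothing stale on the list,
while model_replay(true) leaves one freed entry reachable, which is the
state a later list walk would then dereference.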



[   27.531374] Unable to handle kernel paging request at virtual address 6b6b706b
[   27.538596] pgd = 8cea8000
[   27.541295] [6b6b706b] *pgd=00000000
[   27.544870] Internal error: Oops: 1 [#1] PREEMPT SMP ARM
[   27.564025] CPU: 1 PID: 1252 Comm: logger Tainted: G           O    4.4.0 #2
[   27.571064] Hardware name: Axis ARTPEC-6 Platform
[   27.575759] task: b9586540 ti: 8c84c000 task.ti: 8c84c000
[   27.581155] PC is at update_blocked_averages+0xcc/0x748
[   27.586372] LR is at update_blocked_averages+0xbc/0x748
[   27.591589] pc : [<80051d78>]    lr : [<80051d68>]    psr: 200c0193
               sp : 8c84dce8  ip : 00000500  fp : 8efb1680
[   27.603056] r10: 00000006  r9 : 80847788  r8 : 6b6b6b6b
[   27.608271] r7 : 00000007  r6 : ffff958a  r5 : 00000007  r4 : ffff958a
[   27.614789] r3 : 6b6b6b6b  r2 : 00000101  r1 : 00000000  r0 : 00000003
[   27.621308] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   27.628521] Control: 10c5387d  Table: 0cea804a  DAC: 00000055
[   27.634257] Process logger (pid: 1252, stack limit = 0x8c84c210)
[   27.640254] Stack: (0x8c84dce8 to 0x8c84e000)
[   27.644604] dce0:                   6b6b6b6b 00000103 bad39440 80048250 00000000 bad398d0
[   27.652774] dd00: bf6cf0d0 00000001 807e2c48 bad398d0 00000000 8054e7c8 ffff4582 bf6cec00
[   27.660944] dd20: 00000001 8004825c 00000100 807dc400 8c84de40 bf6cb340 bad87ebc 00000100
[   27.669114] dd40: afb50401 200c0113 00000200 807dc400 807e2100 ffff958a 00000007 8083916c
[   27.677283] dd60: 00000100 00000006 0000001c 80058748 bf6cb340 8054e810 00000000 00000001
[   27.685452] dd80: 807dc400 bf6cec00 00000001 bf6cec00 8083916c 00000001 c0803100 807dc400
[   27.693622] dda0: 807e209c 000000a0 00000007 8083916c 00000100 00000006 0000001c 800282a0
[   27.701791] ddc0: 00000001 bf6d2a80 b95fac00 0000000a ffff958b 00400000 bacf7000 807dc400
[   27.709961] dde0: 00000000 00000000 0000001b bf0188c0 00000001 c0803100 b95fac00 80028830
[   27.718130] de00: 807dc400 8006ca14 c0802100 c080210c 807e2db0 8081a140 8c84de40 80009420
[   27.726300] de20: 8054e780 80122048 800c0013 ffffffff 8c84de74 00000001 00100073 800142c0
[   27.734469] de40: b95ace70 b9586540 00000000 00000000 600c0013 00000000 024080c0 8010e8e0
[   27.742639] de60: 00000001 00000001 00100073 b95fac00 00000000 8c84de90 8054e780 80122048
[   27.750808] de80: 800c0013 ffffffff b95fac00 80122044 bad00640 8011c418 000001f6 b95acb70
[   27.758978] dea0: 76f42000 b95acb70 76f42000 b95acb68 76f43000 8ce48780 00100073 8010e8e0
[   27.767148] dec0: 00100073 00000000 b95fac00 00000000 00000000 00000001 b9421000 00000001
[   27.775317] dee0: 00000000 76f46000 00000000 00000000 8001e8b8 76f42000 00000003 00000003
[   27.783486] df00: b95fac00 8ce48780 00000001 00001000 807e2c64 8010efb4 00000000 00000000
[   27.791656] df20: 0000004d 00000073 8c84df50 8ce487c4 b95fac00 00000003 00000013 00000000
[   27.799825] df40: 8c84c000 b95fac00 7ece0b44 800faf84 00000002 00000000 00000000 8c84df64
[   27.807995] df60: b95fac00 00000000 00000002 00000003 00000013 00000000 00000000 8010d4e8
[   27.816163] df80: 00000002 00000000 00000003 00000003 00000000 00000003 000000c0 800104e4
[   27.824333] dfa0: 00000020 800104b0 00000003 00000000 00000000 00000013 00000003 00000002
[   27.832502] dfc0: 00000003 00000000 00000003 000000c0 0007ecd0 76f45958 76f45574 7ece0b44
[   27.840671] dfe0: 00000000 7ece09fc 76f2e814 76f368d8 400c0010 00000000 00000000 00000000
[   27.848847] [<80051d78>] (update_blocked_averages) from [<80058748>] (rebalance_domains+0x38/0x2cc)
[   27.857889] [<80058748>] (rebalance_domains) from [<800282a0>] (__do_softirq+0x98/0x354)
[   27.865975] [<800282a0>] (__do_softirq) from [<80028830>] (irq_exit+0xb0/0x11c)
[   27.873281] [<80028830>] (irq_exit) from [<8006ca14>] (__handle_domain_irq+0x60/0xb8)
[   27.881106] [<8006ca14>] (__handle_domain_irq) from [<80009420>] (gic_handle_irq+0x48/0x94)
[   27.889452] [<80009420>] (gic_handle_irq) from [<800142c0>] (__irq_svc+0x40/0x74)
[   27.896924] Exception stack(0x8c84de40 to 0x8c84de88)
[   27.901969] de40: b95ace70 b9586540 00000000 00000000 600c0013 00000000 024080c0 8010e8e0
[   27.910139] de60: 00000001 00000001 00100073 b95fac00 00000000 8c84de90 8054e780 80122048
[   27.918306] de80: 800c0013 ffffffff
[   27.921793] [<800142c0>] (__irq_svc) from [<80122048>] (__slab_alloc.constprop.9+0x28/0x2c)
[   27.930139] [<80122048>] (__slab_alloc.constprop.9) from [<8011c418>] (kmem_cache_alloc+0x14c/0x204)
[   27.939265] [<8011c418>] (kmem_cache_alloc) from [<8010e8e0>] (mmap_region+0x29c/0x680)
[   27.947262] [<8010e8e0>] (mmap_region) from [<8010efb4>] (do_mmap+0x2f0/0x378)
[   27.954481] [<8010efb4>] (do_mmap) from [<800faf84>] (vm_mmap_pgoff+0x74/0xa4)
[   27.961699] [<800faf84>] (vm_mmap_pgoff) from [<8010d4e8>] (SyS_mmap_pgoff+0x94/0xf0)
[   27.969524] [<8010d4e8>] (SyS_mmap_pgoff) from [<800104b0>] (__sys_trace_return+0x0/0x10)
[   27.977694] Code: e59b8078 e59b309c e3a0cc05 e3580000 (e18300dc) 
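
The faulting address in the oops is consistent with a load through a
poisoned pointer. A hedged reading, based on the register dump alone:
r3 and r8 hold 0x6b6b6b6b (the 0x6b byte that SLUB uses to poison freed
objects), ip holds 0x500, and their sum is the reported address
0x6b6b706b. That the faulting instruction adds ip to r3 is an
assumption; the helper names below are hypothetical and only check the
arithmetic.

```c
#include <stdint.h>

/* Poison byte written over freed objects (0x6b); the macro name is hypothetical. */
#define MODEL_POISON_FREE 0x6bu

/* Build the 32-bit word seen when a pointer is read out of poisoned memory. */
uint32_t model_poisoned_word(void)
{
	uint32_t b = MODEL_POISON_FREE;

	return (b << 24) | (b << 16) | (b << 8) | b;
}

/* Address formed by a base register plus an offset register (assumed r3 + ip). */
uint32_t model_fault_address(uint32_t base, uint32_t offset)
{
	return base + offset;
}
```

With base = model_poisoned_word() and offset = 0x500, the result is
0x6b6b706b, matching the "Unable to handle kernel paging request" line.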

Below is a snippet of the output from the trace_printks I added while
analyzing the problem. The prints show that a certain cfs_rq gets
re-added after it has been removed, and that update_blocked_averages
uses the cfs_rq after it has already been freed:

         systemd-1     [000]    22.664453: bprint:               alloc_fair_sched_group: allocated cfs_rq 0x8efb0780 tg 0x8efb1800 tg->css.id 0
         systemd-1     [000]    22.664479: bprint:               alloc_fair_sched_group: allocated cfs_rq 0x8efb1680 tg 0x8efb1800 tg->css.id 0
         systemd-1     [000]    22.664481: bprint:               cpu_cgroup_css_alloc: tg 0x8efb1800 tg->css.id 0
         systemd-1     [000]    22.664547: bprint:               cpu_cgroup_css_online: tg 0x8efb1800 tg->css.id 80
         systemd-874   [001]    27.389000: bprint:               list_add_leaf_cfs_rq: cfs_rq 0x8efb1680 cpu 1 on_list 0x0
    migrate_cert-820   [001]    27.421337: bprint:               update_blocked_averages: cfs_rq 0x8efb1680 cpu 1 on_list 0x1
     kworker/0:1-24    [000]    27.421356: bprint:               cpu_cgroup_css_offline: tg 0x8efb1800 tg->css.id 80
     kworker/0:1-24    [000]    27.421445: bprint:               list_del_leaf_cfs_rq: cfs_rq 0x8efb1680 cpu 1 on_list 0x1
    migrate_cert-820   [001]    27.421506: bprint:               list_add_leaf_cfs_rq: cfs_rq 0x8efb1680 cpu 1 on_list 0x0
   system-status-815   [001]    27.491358: bprint:               update_blocked_averages: cfs_rq 0x8efb1680 cpu 1 on_list 0x1
     kworker/0:1-24    [000]    27.501561: bprint:               cpu_cgroup_css_free: tg 0x8efb1800 tg->css.id 80
    migrate_cert-820   [001]    27.511337: bprint:               update_blocked_averages: cfs_rq 0x8efb1680 cpu 1 on_list 0x1
     ksoftirqd/0-3     [000]    27.521830: bprint:               free_fair_sched_group: freeing cfs_rq 0x8efb0780 tg 0x8efb1800 tg->css.id 80
     ksoftirqd/0-3     [000]    27.521857: bprint:               free_fair_sched_group: freeing cfs_rq 0x8efb1680 tg 0x8efb1800 tg->css.id 80
          logger-1252  [001]    27.531355: bprint:               update_blocked_averages: cfs_rq 0x8efb1680 cpu 1 on_list 0x6b6b6b6b


I've reproduced this on v4.4, but I've also managed to reproduce the bug
after cherry-picking the following patches
(all but one were marked for v4.4 stable):

6fe1f34 sched/cgroup: Fix cgroup entity load tracking tear-down
d6e022f workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
041bd12 Revert "workqueue: make sure delayed work run in local cpu"
8bb5ef7 cgroup: make sure a parent css isn't freed before its children
aa226ff cgroup: make sure a parent css isn't offlined before its children
e93ad19 cpuset: make mm migration asynchronous

