From: Vlastimil Babka <vbabka@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Sasha Levin <sasha.levin@oracle.com>,
ast@kernel.org, "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Christoph Lameter <cl@linux.com>,
Linux-MM layout <linux-mm@kvack.org>,
marco.gra@gmail.com
Subject: Re: bpf: use-after-free in array_map_alloc
Date: Tue, 24 May 2016 10:40:54 +0200
Message-ID: <57441396.2050607@suse.cz>
In-Reply-To: <20160523213501.GA5383@mtj.duckdns.org>

[+CC Marco, who reported the CVE; I forgot to do that earlier]

On 05/23/2016 11:35 PM, Tejun Heo wrote:
> Hello,
>
> Can you please test whether this patch resolves the issue? While
> adding support for atomic allocations, I reduced alloc_mutex covered
> region too much.
>
> Thanks.
Ugh, this makes the code even more head-spinning than it was.
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 0c59684..bd2df70 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -162,7 +162,7 @@ static struct pcpu_chunk *pcpu_reserved_chunk;
> static int pcpu_reserved_chunk_limit;
>
> static DEFINE_SPINLOCK(pcpu_lock); /* all internal data structures */
> -static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop */
> +static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop, map extension */
>
> static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */
>
> @@ -435,6 +435,8 @@ static int pcpu_extend_area_map(struct pcpu_chunk *chunk, int new_alloc)
> size_t old_size = 0, new_size = new_alloc * sizeof(new[0]);
> unsigned long flags;
>
> + lockdep_assert_held(&pcpu_alloc_mutex);
I don't see where the mutex gets locked when this is called via
pcpu_map_extend_workfn()? (Except via the new cancel_work_sync() call below?)
Also, what protects chunks with scheduled work items from being removed?
> +
> new = pcpu_mem_zalloc(new_size);
> if (!new)
> return -ENOMEM;
> @@ -895,6 +897,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
> return NULL;
> }
>
> + if (!is_atomic)
> + mutex_lock(&pcpu_alloc_mutex);
BTW, I noticed that

	bool is_atomic = (gfp & GFP_KERNEL) != GFP_KERNEL;

is too pessimistic IMHO. Reclaim is possible even without __GFP_FS
and __GFP_IO. Could you just use gfpflags_allow_blocking(gfp) here?
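
The difference can be seen in a small userspace model. This is only a sketch, not kernel code: the M_GFP_* bit values and the helper names is_atomic_current(), allow_blocking() and is_atomic_proposed() are made up for illustration, and the GFP_ATOMIC stand-in is simplified. It assumes only that GFP_KERNEL is reclaim plus IO plus FS, and that gfpflags_allow_blocking() tests the direct-reclaim bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real ones are in include/linux/gfp.h. */
typedef unsigned int gfp_model_t;

#define M_GFP_IO             0x01u
#define M_GFP_FS             0x02u
#define M_GFP_DIRECT_RECLAIM 0x04u
#define M_GFP_KSWAPD_RECLAIM 0x08u
#define M_GFP_HIGH           0x10u

#define M_GFP_RECLAIM (M_GFP_DIRECT_RECLAIM | M_GFP_KSWAPD_RECLAIM)
#define M_GFP_KERNEL  (M_GFP_RECLAIM | M_GFP_IO | M_GFP_FS)
#define M_GFP_NOFS    (M_GFP_RECLAIM | M_GFP_IO)
#define M_GFP_NOIO    (M_GFP_RECLAIM)
/* Simplified stand-in for GFP_ATOMIC: no direct reclaim allowed. */
#define M_GFP_ATOMIC  (M_GFP_HIGH | M_GFP_KSWAPD_RECLAIM)

/* The existing test: anything short of full GFP_KERNEL counts as atomic. */
static inline bool is_atomic_current(gfp_model_t gfp)
{
	return (gfp & M_GFP_KERNEL) != M_GFP_KERNEL;
}

/* Model of gfpflags_allow_blocking(): may sleep iff direct reclaim is set. */
static inline bool allow_blocking(gfp_model_t gfp)
{
	return (gfp & M_GFP_DIRECT_RECLAIM) != 0;
}

/* The suggested test: atomic only when the caller really cannot block. */
static inline bool is_atomic_proposed(gfp_model_t gfp)
{
	return !allow_blocking(gfp);
}

/* Both tests agree on the two extremes and differ for NOFS/NOIO. */
static inline void check_model(void)
{
	assert(!is_atomic_current(M_GFP_KERNEL));
	assert(!is_atomic_proposed(M_GFP_KERNEL));

	assert(is_atomic_current(M_GFP_ATOMIC));
	assert(is_atomic_proposed(M_GFP_ATOMIC));

	/* These may block, yet the existing test treats them as atomic. */
	assert(is_atomic_current(M_GFP_NOFS));
	assert(!is_atomic_proposed(M_GFP_NOFS));
	assert(is_atomic_current(M_GFP_NOIO));
	assert(!is_atomic_proposed(M_GFP_NOIO));
}
```

In this model only the NOFS and NOIO cases change: they are allowed to sleep, so they would no longer be pushed onto the atomic path.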
> +
> spin_lock_irqsave(&pcpu_lock, flags);
>
> /* serve reserved allocations from the reserved chunk if available */
> @@ -967,12 +972,11 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
> if (is_atomic)
> goto fail;
>
> - mutex_lock(&pcpu_alloc_mutex);
> + lockdep_assert_held(&pcpu_alloc_mutex);
>
> if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) {
> chunk = pcpu_create_chunk();
> if (!chunk) {
> - mutex_unlock(&pcpu_alloc_mutex);
> err = "failed to allocate new chunk";
> goto fail;
> }
> @@ -983,7 +987,6 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
> spin_lock_irqsave(&pcpu_lock, flags);
> }
>
> - mutex_unlock(&pcpu_alloc_mutex);
> goto restart;
>
> area_found:
> @@ -993,8 +996,6 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
> if (!is_atomic) {
> int page_start, page_end, rs, re;
>
> - mutex_lock(&pcpu_alloc_mutex);
> -
> page_start = PFN_DOWN(off);
> page_end = PFN_UP(off + size);
>
> @@ -1005,7 +1006,6 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
>
> spin_lock_irqsave(&pcpu_lock, flags);
> if (ret) {
> - mutex_unlock(&pcpu_alloc_mutex);
> pcpu_free_area(chunk, off, &occ_pages);
> err = "failed to populate";
> goto fail_unlock;
> @@ -1045,6 +1045,8 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
> /* see the flag handling in pcpu_blance_workfn() */
> pcpu_atomic_alloc_failed = true;
> pcpu_schedule_balance_work();
> + } else {
> + mutex_unlock(&pcpu_alloc_mutex);
> }
> return NULL;
> }
> @@ -1137,6 +1139,8 @@ static void pcpu_balance_workfn(struct work_struct *work)
> list_for_each_entry_safe(chunk, next, &to_free, list) {
> int rs, re;
>
> + cancel_work_sync(&chunk->map_extend_work);
This deserves a comment explaining why the work is cancelled here?
> +
> pcpu_for_each_pop_region(chunk, rs, re, 0, pcpu_unit_pages) {
> pcpu_depopulate_chunk(chunk, rs, re);
> spin_lock_irq(&pcpu_lock);
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>