* Building tools/perf fails on next
@ 2015-06-17 6:09 Riku Voipio
2015-06-17 9:17 ` [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: Riku Voipio @ 2015-06-17 6:09 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: mingo, acme, linux-kernel
Hi,
The commit:
commit d72da4a4d973d8a0a0d3c97e7cdebf287fbe3a99
Author: Peter Zijlstra <peterz@infradead.org>
Date: Wed May 27 11:09:36 2015 +0930
rbtree: Make lockless searches non-fatal
Adds <linux/rcupdate.h> to rbtree.h, which in turn is included from perf userspace
headers. Now building tools/perf will fail with hundreds of lines of gcc complaining
about kernel defines not available. Reverting the patch makes perf build again.
This is with gcc-4.9 from debian but I don't think it's compiler specific.
Beginning of build error:
$ make -C tools/perf
make: Entering directory '/build/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... glibc: [ on ]
... gtk2: [ OFF ]
... libaudit: [ on ]
... libbfd: [ OFF ]
... libelf: [ on ]
... libnuma: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ OFF ]
config/Makefile:395: GTK2 not found, disables GTK2 support. Please install gtk2-devel or libgtk2.0-dev
config/Makefile:527: No bfd.h/libbfd found, please install binutils-dev[el]/zlib-static/libiberty-dev to gain symbol demangling
config/Makefile:556: No liblzma found, disables xz kernel module decompression, please install xz-devel/liblzma-dev
CC util/abspath.o
CC event-parse.o
CC fd/array.o
LD fd/libapi-in.o
CC fs/fs.o
CC fs/debugfs.o
CC event-plugin.o
CC fs/findfs.o
CC trace-seq.o
CC util/alias.o
CC parse-filter.o
CC fs/tracefs.o
CC util/annotate.o
LD fs/libapi-in.o
LD libapi-in.o
AR libapi.a
CC parse-utils.o
GEN common-cmds.h
PERF_VERSION = 4.1.rc8.gca3cfa
CC kbuffer-parse.o
LD libtraceevent-in.o
CC arch/common.o
LINK libtraceevent.a
CC ui/setup.o
In file included from /build/linux/include/uapi/linux/kernel.h:4:0,
from /build/linux/include/linux/cache.h:4,
from /build/linux/include/linux/rcupdate.h:37,
from /build/linux/tools/perf/util/include/../../../../include/linux/rbtree.h:34,
from /build/linux/tools/perf/util/include/linux/rbtree.h:4,
from arch/../util/map.h:7,
from arch/../util/event.h:8,
from arch/../util/session.h:5,
from arch/common.h:4,
from arch/common.c:3:
/build/linux/include/uapi/linux/sysinfo.h:8:2: error: unknown type name ‘__kernel_long_t’
__kernel_long_t uptime; /* Seconds since boot */
^
/build/linux/include/uapi/linux/sysinfo.h:9:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t loads[3]; /* 1, 5, and 15 minute load averages */
^
/build/linux/include/uapi/linux/sysinfo.h:10:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t totalram; /* Total usable main memory size */
^
/build/linux/include/uapi/linux/sysinfo.h:11:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t freeram; /* Available memory size */
^
/build/linux/include/uapi/linux/sysinfo.h:12:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t sharedram; /* Amount of shared memory */
^
/build/linux/include/uapi/linux/sysinfo.h:13:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t bufferram; /* Memory used by buffers */
^
/build/linux/include/uapi/linux/sysinfo.h:14:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t totalswap; /* Total swap space size */
^
/build/linux/include/uapi/linux/sysinfo.h:15:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t freeswap; /* swap space still available */
^
/build/linux/include/uapi/linux/sysinfo.h:18:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t totalhigh; /* Total high memory size */
^
/build/linux/include/uapi/linux/sysinfo.h:19:2: error: unknown type name ‘__kernel_ulong_t’
__kernel_ulong_t freehigh; /* Available high memory size */
^
/build/linux/include/uapi/linux/sysinfo.h:21:22: error: ‘__kernel_ulong_t’ undeclared here (not in a function)
char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
....
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
2015-06-17 6:09 Building tools/perf fails on next Riku Voipio
@ 2015-06-17 9:17 ` Ingo Molnar
2015-06-17 9:33 ` Riku Voipio
2015-07-06 8:13 ` Stephen Rothwell
0 siblings, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2015-06-17 9:17 UTC (permalink / raw)
To: Riku Voipio, Rusty Russell, Stephen Rothwell
Cc: Peter Zijlstra, mingo, acme, linux-kernel, Paul E. McKenney
* Riku Voipio <riku.voipio@iki.fi> wrote:
> Hi,
>
> The commit:
>
> commit d72da4a4d973d8a0a0d3c97e7cdebf287fbe3a99
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Wed May 27 11:09:36 2015 +0930
>
> rbtree: Make lockless searches non-fatal
>
> Adds <linux/rcupdate.h> to rbtree.h, which in turn is included from perf userspace
> headers. Now building tools/perf will fail with hundreds of lines of gcc complaining
> about kernel defines not available. Reverting the patch makes perf build again.
> This is with gcc-4.9 from debian but I don't think it's compiler specific.
Does the patch below make things work?
This fix could go into the modules tree, as this commit came via Rusty.
Stephen, feel free to add:
make -C tools/perf
to the linux-next build tests. It's always supposed to build without failure, in
pretty much whatever x86 distro you run your build tests on.
Thanks,
Ingo
===================>
>From 62c251255f07ede8efa356d4ea9ab51827ffa0d0 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Wed, 17 Jun 2015 11:07:11 +0200
Subject: [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
Reported-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
tools/perf/util/include/linux/rcupdate.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/perf/util/include/linux/rcupdate.h b/tools/perf/util/include/linux/rcupdate.h
new file mode 100644
index 000000000000..3e022dd9a69b
--- /dev/null
+++ b/tools/perf/util/include/linux/rcupdate.h
@@ -0,0 +1,9 @@
+#ifndef PERF_LINUX_RCUPDATE_H_
+#define PERF_LINUX_RCUPDATE_H_
+
+/* Simple trivial wrappers for now, we don't use RCU in perf user-space (yet): */
+#define WRITE_ONCE(var, val) ((var) = (val))
+#define rcu_assign_pointer(ptr, val) WRITE_ONCE(ptr, val)
+
+#endif
+
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
2015-06-17 9:17 ` [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space Ingo Molnar
@ 2015-06-17 9:33 ` Riku Voipio
2015-07-06 8:13 ` Stephen Rothwell
1 sibling, 0 replies; 12+ messages in thread
From: Riku Voipio @ 2015-06-17 9:33 UTC (permalink / raw)
To: Ingo Molnar
Cc: Rusty Russell, Stephen Rothwell, Peter Zijlstra, mingo, acme,
linux-kernel, Paul E. McKenney
On Wednesday, June 17, 2015 12:17:04 PM EEST, Ingo Molnar wrote:
> * Riku Voipio <riku.voipio@iki.fi> wrote:
>
>> Hi,
>>
>> The commit:
>>
>> commit d72da4a4d973d8a0a0d3c97e7cdebf287fbe3a99
>> Author: Peter Zijlstra <peterz@infradead.org>
>> Date: Wed May 27 11:09:36 2015 +0930 ...
>
> Does the patch below make things work?
It does,
Tested-by: Riku Voipio <riku.voipio@iki.fi>
> This fix could go into the modules tree, as this commit came via Rusty.
>
> Stephen, feel free to add:
>
> make -C tools/perf
>
> to the linux-next build tests. It's always supposed to build
> without failure, in
> pretty much whatever x86 distro you run your build tests on.
>
> Thanks,
>
> Ingo
>
>
> ===================>
> From 62c251255f07ede8efa356d4ea9ab51827ffa0d0 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@kernel.org>
> Date: Wed, 17 Jun 2015 11:07:11 +0200
> Subject: [PATCH] tools/perf, rbtree: Add RCU wrappers to make
> rbtree.h usable in user-space
>
> Reported-by: Riku Voipio <riku.voipio@iki.fi>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
> tools/perf/util/include/linux/rcupdate.h | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/perf/util/include/linux/rcupdate.h
> b/tools/perf/util/include/linux/rcupdate.h
> new file mode 100644
> index 000000000000..3e022dd9a69b
> --- /dev/null
> +++ b/tools/perf/util/include/linux/rcupdate.h
> @@ -0,0 +1,9 @@
> +#ifndef PERF_LINUX_RCUPDATE_H_
> +#define PERF_LINUX_RCUPDATE_H_
> +
> +/* Simple trivial wrappers for now, we don't use RCU in perf
> user-space (yet): */
> +#define WRITE_ONCE(var, val) ((var) = (val))
> +#define rcu_assign_pointer(ptr, val) WRITE_ONCE(ptr, val)
> +
> +#endif
> +
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
2015-06-17 9:17 ` [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space Ingo Molnar
2015-06-17 9:33 ` Riku Voipio
@ 2015-07-06 8:13 ` Stephen Rothwell
2015-07-06 9:41 ` Ingo Molnar
1 sibling, 1 reply; 12+ messages in thread
From: Stephen Rothwell @ 2015-07-06 8:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: Riku Voipio, Rusty Russell, Peter Zijlstra, mingo, acme,
linux-kernel, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 2528 bytes --]
Hi Ingo,
On Wed, 17 Jun 2015 11:17:04 +0200 Ingo Molnar <mingo@kernel.org> wrote:
>
> * Riku Voipio <riku.voipio@iki.fi> wrote:
>
> > The commit:
> >
> > commit d72da4a4d973d8a0a0d3c97e7cdebf287fbe3a99
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date: Wed May 27 11:09:36 2015 +0930
> >
> > rbtree: Make lockless searches non-fatal
> >
> > Adds <linux/rcupdate.h> to rbtree.h, which in turn is included from perf userspace
> > headers. Now building tools/perf will fail with hundreds of lines of gcc complaining
> > about kernel defines not available. Reverting the patch makes perf build again.
> > This is with gcc-4.9 from debian but I don't think it's compiler specific.
>
> Does the patch below make things work?
>
> This fix could go into the modules tree, as this commit came via Rusty.
>
> Stephen, feel free to add:
>
> make -C tools/perf
>
> to the linux-next build tests. It's always supposed to build without failure, in
> pretty much whatever x86 distro you run your build tests on.
OK, I have started doing that, but int order to make it build at all, I
have added the patch below to my fixes tree (since the breakage is now
in Linus' tree).
This means that it will get a conflict with the tip tree tomorrow, but
I will just fix the conflict by using the version from the tip tree.
Once the tip tree fixes are sent to Linus, I will drop this patch from
my fixes tree.
> From 62c251255f07ede8efa356d4ea9ab51827ffa0d0 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@kernel.org>
> Date: Wed, 17 Jun 2015 11:07:11 +0200
> Subject: [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
>
> Reported-by: Riku Voipio <riku.voipio@iki.fi>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
> tools/perf/util/include/linux/rcupdate.h | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/perf/util/include/linux/rcupdate.h b/tools/perf/util/include/linux/rcupdate.h
> new file mode 100644
> index 000000000000..3e022dd9a69b
> --- /dev/null
> +++ b/tools/perf/util/include/linux/rcupdate.h
> @@ -0,0 +1,9 @@
> +#ifndef PERF_LINUX_RCUPDATE_H_
> +#define PERF_LINUX_RCUPDATE_H_
> +
> +/* Simple trivial wrappers for now, we don't use RCU in perf user-space (yet): */
> +#define WRITE_ONCE(var, val) ((var) = (val))
> +#define rcu_assign_pointer(ptr, val) WRITE_ONCE(ptr, val)
> +
> +#endif
> +
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
2015-07-06 8:13 ` Stephen Rothwell
@ 2015-07-06 9:41 ` Ingo Molnar
2015-07-06 22:39 ` Stephen Rothwell
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2015-07-06 9:41 UTC (permalink / raw)
To: Stephen Rothwell
Cc: Riku Voipio, Rusty Russell, Peter Zijlstra, mingo, acme,
linux-kernel, Paul E. McKenney
* Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> Hi Ingo,
>
> On Wed, 17 Jun 2015 11:17:04 +0200 Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Riku Voipio <riku.voipio@iki.fi> wrote:
> >
> > > The commit:
> > >
> > > commit d72da4a4d973d8a0a0d3c97e7cdebf287fbe3a99
> > > Author: Peter Zijlstra <peterz@infradead.org>
> > > Date: Wed May 27 11:09:36 2015 +0930
> > >
> > > rbtree: Make lockless searches non-fatal
> > >
> > > Adds <linux/rcupdate.h> to rbtree.h, which in turn is included from perf userspace
> > > headers. Now building tools/perf will fail with hundreds of lines of gcc complaining
> > > about kernel defines not available. Reverting the patch makes perf build again.
> > > This is with gcc-4.9 from debian but I don't think it's compiler specific.
> >
> > Does the patch below make things work?
> >
> > This fix could go into the modules tree, as this commit came via Rusty.
> >
> > Stephen, feel free to add:
> >
> > make -C tools/perf
> >
> > to the linux-next build tests. It's always supposed to build without failure, in
> > pretty much whatever x86 distro you run your build tests on.
>
> OK, I have started doing that, but int order to make it build at all, I
> have added the patch below to my fixes tree (since the breakage is now
> in Linus' tree).
>
> This means that it will get a conflict with the tip tree tomorrow, but
> I will just fix the conflict by using the version from the tip tree.
> Once the tip tree fixes are sent to Linus, I will drop this patch from
> my fixes tree.
On next linux-next iteration you should be able to just drop your merge conflict
resolution, and use the -tip tree as-is with no extra fixups.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space
2015-07-06 9:41 ` Ingo Molnar
@ 2015-07-06 22:39 ` Stephen Rothwell
0 siblings, 0 replies; 12+ messages in thread
From: Stephen Rothwell @ 2015-07-06 22:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: Riku Voipio, Rusty Russell, Peter Zijlstra, mingo, acme,
linux-kernel, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 589 bytes --]
Hi Ingo,
On Mon, 6 Jul 2015 11:41:17 +0200 Ingo Molnar <mingo@kernel.org> wrote:
>
> On next linux-next iteration you should be able to just drop your merge conflict
> resolution, and use the -tip tree as-is with no extra fixups.
I dropped the patch I had yesterday and just merged
perf-urgent-for-linus from the tip tree into my fixes tree (since I
merge my fixes tree immediately on top of Linus' tree and before I try
to do any builds). I will drop that merge when Linus takes your pull
request.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2 block/for-linus] writeback: don't embed root bdi_writeback_congested in bdi_writeback
@ 2015-07-02 0:52 Tejun Heo
2015-07-02 0:53 ` [PATCH 2/2 block/for-linus] writeback: don't drain bdi_writeback_congested on bdi destruction Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2015-07-02 0:52 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel, Jon Christopherson
52ebea749aae ("writeback: make backing_dev_info host cgroup-specific
bdi_writebacks") made bdi (backing_dev_info) host per-cgroup wb's
(bdi_writeback's). As the congested state needs to be per-wb and
referenced from blkcg side and multiple wbs, the patch made all
non-root cong's (bdi_writeback_congested's) reference counted and
indexed on bdi.
When a bdi is destroyed, cgwb_bdi_destroy() tries to drain all
non-root cong's; however, this can hang indefinitely because wb's can
also be referenced from blkcg_gq's which are destroyed after bdi
destruction is complete.
To fix the bug, bdi destruction will be updated to not wait for cong's
to drain, which naturally means that cong's may outlive the associated
bdi. This is fine for non-root cong's but is problematic for the root
cong's which are embedded in their bdi's as they may end up getting
dereferenced after the containing bdi's are freed.
This patch makes root cong's behave the same as non-root cong's. They
are no longer embedded in their bdi's but allocated separately during
bdi initialization, indexed and reference counted the same way.
* As cong handling is the same for all wb's, wb->congested
initialization is moved into wb_init().
* When !CONFIG_CGROUP_WRITEBACK, there was no indexing or refcnting.
bdi->wb_congested is now a pointer pointing to the root cong
allocated during bdi init and minimal refcnting operations are
implemented.
* The above makes root wb init paths diverge depending on
CONFIG_CGROUP_WRITEBACK. root wb init is moved to cgwb_bdi_init().
This patch in itself shouldn't cause any consequential behavior
differences but prepares for the actual fix.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jon Christopherson <jon@jons.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=100681
---
Hello,
These two patches are to fix a deadlock during bdi destruction which
was introduced by cgroup writeback changes. Jens, can you please
route these two once Jon confirms the fix?
Thanks.
include/linux/backing-dev-defs.h | 5 +-
include/linux/backing-dev.h | 5 +-
mm/backing-dev.c | 87 ++++++++++++++++++++-------------------
3 files changed, 53 insertions(+), 44 deletions(-)
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -50,10 +50,10 @@ enum wb_stat_item {
*/
struct bdi_writeback_congested {
unsigned long state; /* WB_[a]sync_congested flags */
+ atomic_t refcnt; /* nr of attached wb's and blkg */
#ifdef CONFIG_CGROUP_WRITEBACK
struct backing_dev_info *bdi; /* the associated bdi */
- atomic_t refcnt; /* nr of attached wb's and blkg */
int blkcg_id; /* ID of the associated blkcg */
struct rb_node rb_node; /* on bdi->cgwb_congestion_tree */
#endif
@@ -150,11 +150,12 @@ struct backing_dev_info {
atomic_long_t tot_write_bandwidth;
struct bdi_writeback wb; /* the root writeback info for this bdi */
- struct bdi_writeback_congested wb_congested; /* its congested state */
#ifdef CONFIG_CGROUP_WRITEBACK
struct radix_tree_root cgwb_tree; /* radix tree of active cgroup wbs */
struct rb_root cgwb_congested_tree; /* their congested states */
atomic_t usage_cnt; /* counts both cgwbs and cgwb_contested's */
+#else
+ struct bdi_writeback_congested *wb_congested;
#endif
wait_queue_head_t wb_waitq;
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -465,11 +465,14 @@ static inline bool inode_cgwb_enabled(st
static inline struct bdi_writeback_congested *
wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
{
- return bdi->wb.congested;
+ atomic_inc(&bdi->wb_congested->refcnt);
+ return bdi->wb_congested;
}
static inline void wb_congested_put(struct bdi_writeback_congested *congested)
{
+ if (atomic_dec_and_test(&congested->refcnt))
+ kfree(congested);
}
static inline struct bdi_writeback *wb_find_current(struct backing_dev_info *bdi)
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -287,7 +287,7 @@ void wb_wakeup_delayed(struct bdi_writeb
#define INIT_BW (100 << (20 - PAGE_SHIFT))
static int wb_init(struct bdi_writeback *wb, struct backing_dev_info *bdi,
- gfp_t gfp)
+ int blkcg_id, gfp_t gfp)
{
int i, err;
@@ -311,21 +311,29 @@ static int wb_init(struct bdi_writeback
INIT_LIST_HEAD(&wb->work_list);
INIT_DELAYED_WORK(&wb->dwork, wb_workfn);
+ wb->congested = wb_congested_get_create(bdi, blkcg_id, gfp);
+ if (!wb->congested)
+ return -ENOMEM;
+
err = fprop_local_init_percpu(&wb->completions, gfp);
if (err)
- return err;
+ goto out_put_cong;
for (i = 0; i < NR_WB_STAT_ITEMS; i++) {
err = percpu_counter_init(&wb->stat[i], 0, gfp);
- if (err) {
- while (--i)
- percpu_counter_destroy(&wb->stat[i]);
- fprop_local_destroy_percpu(&wb->completions);
- return err;
- }
+ if (err)
+ goto out_destroy_stat;
}
return 0;
+
+out_destroy_stat:
+ while (--i)
+ percpu_counter_destroy(&wb->stat[i]);
+ fprop_local_destroy_percpu(&wb->completions);
+out_put_cong:
+ wb_congested_put(wb->congested);
+ return err;
}
/*
@@ -361,6 +369,7 @@ static void wb_exit(struct bdi_writeback
percpu_counter_destroy(&wb->stat[i]);
fprop_local_destroy_percpu(&wb->completions);
+ wb_congested_put(wb->congested);
}
#ifdef CONFIG_CGROUP_WRITEBACK
@@ -392,9 +401,6 @@ wb_congested_get_create(struct backing_d
struct bdi_writeback_congested *new_congested = NULL, *congested;
struct rb_node **node, *parent;
unsigned long flags;
-
- if (blkcg_id == 1)
- return &bdi->wb_congested;
retry:
spin_lock_irqsave(&cgwb_lock, flags);
@@ -453,9 +459,6 @@ void wb_congested_put(struct bdi_writeba
struct backing_dev_info *bdi = congested->bdi;
unsigned long flags;
- if (congested->blkcg_id == 1)
- return;
-
local_irq_save(flags);
if (!atomic_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
local_irq_restore(flags);
@@ -480,7 +483,6 @@ static void cgwb_release_workfn(struct w
css_put(wb->memcg_css);
css_put(wb->blkcg_css);
- wb_congested_put(wb->congested);
fprop_local_destroy_percpu(&wb->memcg_completions);
percpu_ref_exit(&wb->refcnt);
@@ -541,7 +543,7 @@ static int cgwb_create(struct backing_de
if (!wb)
return -ENOMEM;
- ret = wb_init(wb, bdi, gfp);
+ ret = wb_init(wb, bdi, blkcg_css->id, gfp);
if (ret)
goto err_free;
@@ -553,12 +555,6 @@ static int cgwb_create(struct backing_de
if (ret)
goto err_ref_exit;
- wb->congested = wb_congested_get_create(bdi, blkcg_css->id, gfp);
- if (!wb->congested) {
- ret = -ENOMEM;
- goto err_fprop_exit;
- }
-
wb->memcg_css = memcg_css;
wb->blkcg_css = blkcg_css;
INIT_WORK(&wb->release_work, cgwb_release_workfn);
@@ -588,12 +584,10 @@ static int cgwb_create(struct backing_de
if (ret) {
if (ret == -EEXIST)
ret = 0;
- goto err_put_congested;
+ goto err_fprop_exit;
}
goto out_put;
-err_put_congested:
- wb_congested_put(wb->congested);
err_fprop_exit:
fprop_local_destroy_percpu(&wb->memcg_completions);
err_ref_exit:
@@ -662,14 +656,20 @@ struct bdi_writeback *wb_get_create(stru
return wb;
}
-static void cgwb_bdi_init(struct backing_dev_info *bdi)
+static int cgwb_bdi_init(struct backing_dev_info *bdi)
{
- bdi->wb.memcg_css = mem_cgroup_root_css;
- bdi->wb.blkcg_css = blkcg_root_css;
- bdi->wb_congested.blkcg_id = 1;
+ int ret;
+
INIT_RADIX_TREE(&bdi->cgwb_tree, GFP_ATOMIC);
bdi->cgwb_congested_tree = RB_ROOT;
atomic_set(&bdi->usage_cnt, 1);
+
+ ret = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
+ if (!ret) {
+ bdi->wb.memcg_css = mem_cgroup_root_css;
+ bdi->wb.blkcg_css = blkcg_root_css;
+ }
+ return ret;
}
static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
@@ -732,15 +732,28 @@ void wb_blkcg_offline(struct blkcg *blkc
#else /* CONFIG_CGROUP_WRITEBACK */
-static void cgwb_bdi_init(struct backing_dev_info *bdi) { }
+static int cgwb_bdi_init(struct backing_dev_info *bdi)
+{
+ int err;
+
+ bdi->wb_congested = kzalloc(sizeof(*bdi->wb_congested), GFP_KERNEL);
+ if (!bdi->wb_congested)
+ return -ENOMEM;
+
+ err = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
+ if (err) {
+ kfree(bdi->wb_congested);
+ return err;
+ }
+ return 0;
+}
+
static void cgwb_bdi_destroy(struct backing_dev_info *bdi) { }
#endif /* CONFIG_CGROUP_WRITEBACK */
int bdi_init(struct backing_dev_info *bdi)
{
- int err;
-
bdi->dev = NULL;
bdi->min_ratio = 0;
@@ -749,15 +762,7 @@ int bdi_init(struct backing_dev_info *bd
INIT_LIST_HEAD(&bdi->bdi_list);
init_waitqueue_head(&bdi->wb_waitq);
- err = wb_init(&bdi->wb, bdi, GFP_KERNEL);
- if (err)
- return err;
-
- bdi->wb_congested.state = 0;
- bdi->wb.congested = &bdi->wb_congested;
-
- cgwb_bdi_init(bdi);
- return 0;
+ return cgwb_bdi_init(bdi);
}
EXPORT_SYMBOL(bdi_init);
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH 2/2 block/for-linus] writeback: don't drain bdi_writeback_congested on bdi destruction
2015-07-02 0:52 [PATCH 1/2 block/for-linus] writeback: don't embed root bdi_writeback_congested in bdi_writeback Tejun Heo
@ 2015-07-02 0:53 ` Tejun Heo
2015-07-02 3:02 ` Jon Christopherson
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2015-07-02 0:53 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel, Jon Christopherson
52ebea749aae ("writeback: make backing_dev_info host cgroup-specific
bdi_writebacks") made bdi (backing_dev_info) host per-cgroup wb's
(bdi_writeback's). As the congested state needs to be per-wb and
referenced from blkcg side and multiple wbs, the patch made all
non-root cong's (bdi_writeback_congested's) reference counted and
indexed on bdi.
When a bdi is destroyed, cgwb_bdi_destroy() tries to drain all
non-root cong's; however, this can hang indefinitely because wb's can
also be referenced from blkcg_gq's which are destroyed after bdi
destruction is complete.
This patch fixes the bug by updating bdi destruction to not wait for
cong's to drain. A cong is unlinked from bdi->cgwb_congested_tree on
bdi destuction regardless of its reference count as the bdi may go
away any point after destruction. wb_congested_put() checks whether
the cong is already unlinked on release.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jon Christopherson <jon@jons.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=100681
Fixes: 52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks")
---
mm/backing-dev.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -425,7 +425,6 @@ retry:
new_congested = NULL;
rb_link_node(&congested->rb_node, parent, node);
rb_insert_color(&congested->rb_node, &bdi->cgwb_congested_tree);
- atomic_inc(&bdi->usage_cnt);
goto found;
}
@@ -456,7 +455,6 @@ found:
*/
void wb_congested_put(struct bdi_writeback_congested *congested)
{
- struct backing_dev_info *bdi = congested->bdi;
unsigned long flags;
local_irq_save(flags);
@@ -465,12 +463,15 @@ void wb_congested_put(struct bdi_writeba
return;
}
- rb_erase(&congested->rb_node, &congested->bdi->cgwb_congested_tree);
+ /* bdi might already have been destroyed leaving @congested unlinked */
+ if (congested->bdi) {
+ rb_erase(&congested->rb_node,
+ &congested->bdi->cgwb_congested_tree);
+ congested->bdi = NULL;
+ }
+
spin_unlock_irqrestore(&cgwb_lock, flags);
kfree(congested);
-
- if (atomic_dec_and_test(&bdi->usage_cnt))
- wake_up_all(&cgwb_release_wait);
}
static void cgwb_release_workfn(struct work_struct *work)
@@ -675,13 +676,22 @@ static int cgwb_bdi_init(struct backing_
static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
{
struct radix_tree_iter iter;
+ struct bdi_writeback_congested *congested, *congested_n;
void **slot;
WARN_ON(test_bit(WB_registered, &bdi->wb.state));
spin_lock_irq(&cgwb_lock);
+
radix_tree_for_each_slot(slot, &bdi->cgwb_tree, &iter, 0)
cgwb_kill(*slot);
+
+ rbtree_postorder_for_each_entry_safe(congested, congested_n,
+ &bdi->cgwb_congested_tree, rb_node) {
+ rb_erase(&congested->rb_node, &bdi->cgwb_congested_tree);
+ congested->bdi = NULL; /* mark @congested unlinked */
+ }
+
spin_unlock_irq(&cgwb_lock);
/*
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-07-06 22:40 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-06-17 6:09 Building tools/perf fails on next Riku Voipio
2015-06-17 9:17 ` [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space Ingo Molnar
2015-06-17 9:33 ` Riku Voipio
2015-07-06 8:13 ` Stephen Rothwell
2015-07-06 9:41 ` Ingo Molnar
2015-07-06 22:39 ` Stephen Rothwell
-- strict thread matches above, loose matches on Subject: below --
2015-07-02 0:52 [PATCH 1/2 block/for-linus] writeback: don't embed root bdi_writeback_congested in bdi_writeback Tejun Heo
2015-07-02 0:53 ` [PATCH 2/2 block/for-linus] writeback: don't drain bdi_writeback_congested on bdi destruction Tejun Heo
2015-07-02 3:02 ` Jon Christopherson
[not found] ` <5594AD98.4050402@jons.org>
2015-07-02 13:21 ` [PATCH] tools/perf, rbtree: Add RCU wrappers to make rbtree.h usable in user-space Tejun Heo
2015-07-02 20:51 ` Rusty Russell
2015-07-03 7:14 ` Peter Zijlstra
2015-07-03 10:00 ` Jon Christopherson
2015-07-04 16:15 ` Arnaldo Carvalho de Melo
2015-07-06 10:57 ` Peter Zijlstra
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).