From: Greg KH <greg@kroah.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, Jens Axboe <jens.axboe@oracle.com>,
Fengguang Wu <fengguang.wu@gmail.com>,
Trond Myklebust <trond.myklebust@fys.uio.no>,
Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: per BDI dirty limit (was Re: -mm merge plans for 2.6.24)
Date: Sat, 27 Oct 2007 09:02:03 -0700
Message-ID: <20071027160203.GA5709@kroah.com>
In-Reply-To: <1193474399.27652.15.camel@twins>

On Sat, Oct 27, 2007 at 10:39:59AM +0200, Peter Zijlstra wrote:
> On Fri, 2007-10-26 at 19:40 -0700, Greg KH wrote:
> > On Sat, Oct 27, 2007 at 03:18:08AM +0200, Peter Zijlstra wrote:
> > >
> > > On Fri, 2007-10-26 at 22:04 +0200, Peter Zijlstra wrote:
> > > > This crashes and burns on bootup, but I'm too tired to figure out what I
> > > > did wrong... will give it another try tomorrow..
> > >
> > > Ok, can't sleep.. took a look. I have several problems here.
> > >
> > > The thing that makes it go *boom* is the __ATTR_NULL. Removing that
> > > makes it boot. Albeit it then warns me of multiple duplicate sysfs
> > > objects, all named "bdi".
> > >
> > > For some obscure reason this device interface insists on using the
> > > bus_id as name (?!), and further reduces usability by limiting that to
> > > 20 odd characters.
> > >
> > > This makes it quite useless. I tried fudging around that limit by using
> > > device_rename and kobject_rename, but to no avail.
> > >
> > > Really, it should not be this hard to use, trying to expose a handfull
> > > of simple integers to userspace should not take 8h+ and still not work.
> > >
> > > Peter, who thinks sysfs is contorted mess beyond his skill. I'll stick
> > > to VM and scheduler code, that actually makes sense.
> >
> > Heh, that's funny :)
> >
> > I'll look at this and see what I can come up with. Would you just like
> > a whole new patch, or one against this one?
>
> Sorry for the grumpy note, I get that way at 3.30 am. Maybe I ought not
> have mailed :-/
>
> This is the code I had at that time.

Ah, I see a few problems.  Here, try this version instead.  It's
compile-tested only, and should be a lot simpler.

Note, we are still not setting the parent of the new bdi device
properly, so the devices will show up in /sys/devices/virtual/ instead
of in their proper location.  To fix that we need the parent of the
device, and I'm not sure what that should be (block device?  block
device controller?)

Let me know if this works better.  I'm off to a kid's birthday party for
the day, but will be around this evening...

thanks,

greg k-h
---
block/genhd.c | 2
fs/fuse/inode.c | 2
fs/nfs/client.c | 2
include/linux/backing-dev.h | 19 +++++++
include/linux/string.h | 4 +
include/linux/writeback.h | 3 +
mm/backing-dev.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
mm/page-writeback.c | 2
mm/util.c | 42 ++++++++++++++++
9 files changed, 183 insertions(+), 3 deletions(-)
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -182,6 +182,7 @@ void add_disk(struct gendisk *disk)
disk->minors, NULL, exact_match, exact_lock, disk);
register_disk(disk);
blk_register_queue(disk);
+ bdi_register(&disk->queue->backing_dev_info, "bdi-%s", disk->disk_name);
}
EXPORT_SYMBOL(add_disk);
@@ -190,6 +191,7 @@ EXPORT_SYMBOL(del_gendisk); /* in partit
void unlink_gendisk(struct gendisk *disk)
{
blk_unregister_queue(disk);
+ bdi_unregister(&disk->queue->backing_dev_info);
blk_unregister_region(MKDEV(disk->major, disk->first_minor),
disk->minors);
}
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -467,7 +467,7 @@ static struct fuse_conn *new_conn(void)
atomic_set(&fc->num_waiting, 0);
fc->bdi.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
fc->bdi.unplug_io_fn = default_unplug_io_fn;
- err = bdi_init(&fc->bdi);
+ err = bdi_init_fmt(&fc->bdi, "bdi-fuse-%llu", (unsigned long long)fc->id);
if (err) {
kfree(fc);
fc = NULL;
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -678,7 +678,7 @@ static int nfs_probe_fsinfo(struct nfs_s
goto out_error;
nfs_server_set_fsinfo(server, &fsinfo);
- error = bdi_init(&server->backing_dev_info);
+ error = bdi_init_fmt(&server->backing_dev_info, "bdi-nfs-%s-%p", clp->cl_hostname, server);
if (error)
goto out_error;
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -11,6 +11,8 @@
#include <linux/percpu_counter.h>
#include <linux/log2.h>
#include <linux/proportions.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
#include <asm/atomic.h>
struct page;
@@ -48,11 +50,28 @@ struct backing_dev_info {
struct prop_local_percpu completions;
int dirty_exceeded;
+
+ struct device *dev;
};
int bdi_init(struct backing_dev_info *bdi);
void bdi_destroy(struct backing_dev_info *bdi);
+int bdi_register(struct backing_dev_info *bdi, const char *fmt, ...);
+void bdi_unregister(struct backing_dev_info *bdi);
+
+#define bdi_init_fmt(bdi, fmt...) \
+ ({ \
+ int ret; \
+ ret = bdi_init(bdi); \
+ if (!ret) { \
+ ret = bdi_register(bdi, ##fmt); \
+ if (ret) \
+ bdi_destroy(bdi); \
+ } \
+ ret; \
+ })
+
static inline void __add_bdi_stat(struct backing_dev_info *bdi,
enum bdi_stat_item item, s64 amount)
{
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -8,6 +8,7 @@
#include <linux/compiler.h> /* for inline */
#include <linux/types.h> /* for size_t */
#include <linux/stddef.h> /* for NULL */
+#include <stdarg.h>
#ifdef __cplusplus
extern "C" {
@@ -111,6 +112,9 @@ extern void *kmemdup(const void *src, si
extern char **argv_split(gfp_t gfp, const char *str, int *argcp);
extern void argv_free(char **argv);
+char *kvprintf(const char *fmt, va_list args);
+char *kprintf(const char *fmt, ...);
+
#ifdef __cplusplus
}
#endif
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -113,6 +113,9 @@ struct file;
int dirty_writeback_centisecs_handler(struct ctl_table *, int, struct file *,
void __user *, size_t *, loff_t *);
+void get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
+ struct backing_dev_info *bdi);
+
void page_writeback_init(void);
void balance_dirty_pages_ratelimited_nr(struct address_space *mapping,
unsigned long nr_pages_dirtied);
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -4,12 +4,119 @@
#include <linux/fs.h>
#include <linux/sched.h>
#include <linux/module.h>
+#include <linux/writeback.h>
+#include <linux/device.h>
+
+
+static struct class *bdi_class;
+
+static ssize_t readahead_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ char *end;
+
+ bdi->ra_pages = simple_strtoul(buf, &end, 10);
+
+ return count;
+}
+
+#define BDI_SHOW(name, expr) \
+static ssize_t name##_show(struct device *dev, \
+ struct device_attribute *attr, char *page) \
+{ \
+ struct backing_dev_info *bdi = dev_get_drvdata(dev); \
+ \
+ return snprintf(page, PAGE_SIZE-1, "%lld\n", (long long)expr); \
+}
+
+BDI_SHOW(readahead, bdi->ra_pages)
+
+BDI_SHOW(reclaimable, bdi_stat(bdi, BDI_RECLAIMABLE))
+BDI_SHOW(writeback, bdi_stat(bdi, BDI_WRITEBACK))
+
+static inline unsigned long get_dirty(struct backing_dev_info *bdi, int i)
+{
+ long thresh[3];
+
+ get_dirty_limits(&thresh[0], &thresh[1], &thresh[2], bdi);
+
+ return thresh[i];
+}
+
+BDI_SHOW(dirty, get_dirty(bdi, 1))
+BDI_SHOW(bdi_dirty, get_dirty(bdi, 2))
+
+static struct device_attribute bdi_dev_attrs[] = {
+ __ATTR(readahead, 0644, readahead_show, readahead_store),
+ __ATTR_RO(reclaimable),
+ __ATTR_RO(writeback),
+ __ATTR_RO(dirty),
+ __ATTR_RO(bdi_dirty),
+};
+
+static __init int bdi_class_init(void)
+{
+ bdi_class = class_create(THIS_MODULE, "bdi");
+ return 0;
+}
+
+__initcall(bdi_class_init);
+
+int bdi_register(struct backing_dev_info *bdi, const char *fmt, ...)
+{
+ char *name;
+ va_list args;
+ int ret = -ENOMEM;
+ int i;
+
+ va_start(args, fmt);
+ name = kvprintf(fmt, args);
+ va_end(args);
+
+ if (!name)
+ return -ENOMEM;
+
+ bdi->dev = device_create(bdi_class, NULL, MKDEV(0,0), "%s", name);
+ if (IS_ERR(bdi->dev))
+ goto exit;
+
+ dev_set_drvdata(bdi->dev, bdi);
+
+ for (i = 0; i < ARRAY_SIZE(bdi_dev_attrs); i++) {
+ ret = device_create_file(bdi->dev, &bdi_dev_attrs[i]);
+ if (ret)
+ break;
+ }
+ if (ret) {
+ while (--i >= 0)
+ device_remove_file(bdi->dev, &bdi_dev_attrs[i]);
+ device_unregister(bdi->dev);
+ bdi->dev = NULL;
+ }
+
+exit:
+ kfree(name);
+
+ return ret;
+}
+
+void bdi_unregister(struct backing_dev_info *bdi)
+{
+ device_unregister(bdi->dev);
+}
+
+EXPORT_SYMBOL(bdi_register);
+EXPORT_SYMBOL(bdi_unregister);
int bdi_init(struct backing_dev_info *bdi)
{
int i, j;
int err;
+ memset(bdi, 0, sizeof(*bdi));
+
for (i = 0; i < NR_BDI_STAT_ITEMS; i++) {
err = percpu_counter_init_irq(&bdi->bdi_stat[i], 0);
if (err)
@@ -33,6 +140,8 @@ void bdi_destroy(struct backing_dev_info
{
int i;
+ bdi_unregister(bdi);
+
for (i = 0; i < NR_BDI_STAT_ITEMS; i++)
percpu_counter_destroy(&bdi->bdi_stat[i]);
@@ -90,3 +199,4 @@ long congestion_wait(int rw, long timeou
}
EXPORT_SYMBOL(congestion_wait);
+
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -291,7 +291,7 @@ static unsigned long determine_dirtyable
return x + 1; /* Ensure that we never return 0 */
}
-static void
+void
get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
struct backing_dev_info *bdi)
{
--- a/mm/util.c
+++ b/mm/util.c
@@ -136,3 +136,45 @@ char *strndup_user(const char __user *s,
return p;
}
EXPORT_SYMBOL(strndup_user);
+
+char *kvprintf(const char *fmt, va_list args)
+{
+ char c;
+ char *buf;
+ int need;
+ int limit;
+ va_list args1;
+
+ va_copy(args1, args);
+ need = vsnprintf(&c, 1, fmt, args1);
+ va_end(args1);
+
+ /* Allocate the new space and copy the string in */
+ limit = need + 1;
+ buf = kmalloc(limit, GFP_KERNEL);
+ if (!buf)
+ return NULL;
+ need = vsnprintf(buf, limit, fmt, args);
+
+ /* something wrong with the string we copied? */
+ if (need >= limit) {
+ kfree(buf);
+ return NULL;
+ }
+
+ return buf;
+}
+EXPORT_SYMBOL(kvprintf);
+
+char *kprintf(const char *fmt, ...)
+{
+ char *buf;
+ va_list args;
+
+ va_start(args, fmt);
+ buf = kvprintf(fmt, args);
+ va_end(args);
+
+ return buf;
+}
+EXPORT_SYMBOL(kprintf);
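
For anyone who wants to poke at the sizing logic in kvprintf() outside
the kernel, here is a rough userspace sketch of the same two-pass idea:
measure with vsnprintf() on a copy of the va_list, allocate need + 1
bytes, then format for real.  The names alloc_vprintf() and
alloc_printf() are made up for this illustration, malloc() stands in
for kmalloc(), and the measuring pass uses the C99 NULL/0 form rather
than the one-byte scratch buffer in the patch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for kvprintf(): returns a malloc'd
 * string, or NULL on allocation or formatting failure. */
static char *alloc_vprintf(const char *fmt, va_list args)
{
	va_list probe;
	char *buf;
	int need;

	/* Pass 1: measure only.  A va_list cannot be walked twice, so the
	 * measuring pass consumes a copy and leaves 'args' untouched. */
	va_copy(probe, args);
	need = vsnprintf(NULL, 0, fmt, probe);
	va_end(probe);
	if (need < 0)
		return NULL;

	/* Pass 2: allocate room for the text plus the trailing NUL. */
	buf = malloc((size_t)need + 1);
	if (!buf)
		return NULL;

	/* The second pass must produce exactly the measured length. */
	if (vsnprintf(buf, (size_t)need + 1, fmt, args) != need) {
		free(buf);
		return NULL;
	}
	return buf;
}

/* Hypothetical variadic wrapper, mirroring kprintf() in the patch. */
static char *alloc_printf(const char *fmt, ...)
{
	va_list args;
	char *buf;

	va_start(args, fmt);
	buf = alloc_vprintf(fmt, args);
	va_end(args);
	return buf;
}
```

A caller would do something like alloc_printf("bdi-%s", "sda"), use the
result, and free() it afterwards, much as bdi_register() kfree()s the
name once device_create() has copied it.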