public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] block fixes for 2.6.36-rc5
@ 2010-09-22  7:49 Jens Axboe
  2010-09-22 15:06 ` Linus Torvalds
  0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2010-09-22  7:49 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel@vger.kernel.org

Hi Linus,

A small batch of fixes. Two important fixes, correcting a bug in the
unified request/bio command flags causing blk_rq_map_kern() to
flag writes incorrectly, and the oops in CFQ with cgroups and a usb key.
The small 3-patch series from Jan fixes a long standing regression
dating back to 2.6.32.

Please pull.

  git://git.kernel.dk/linux-2.6-block.git for-linus

Benny Halevy (1):
      block: fix blk_rq_map_kern bio direction flag

Dan Carpenter (1):
      cciss: freeing uninitialized data on error path

Jan Kara (3):
      bdi: Initialize noop_backing_dev_info properly
      char: Mark /dev/zero and /dev/kmem as not capable of writeback
      bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends

Vivek Goyal (1):
      cfq-iosched: fix a kernel OOPs when usb key is inserted

 block/blk-map.c       |    2 +-
 block/cfq-iosched.c   |   16 +++++++++++++---
 drivers/block/cciss.c |    2 +-
 drivers/char/mem.c    |    3 ++-
 fs/char_dev.c         |    4 +++-
 fs/fs-writeback.c     |   23 +++++++++++++++++++++--
 mm/backing-dev.c      |    2 ++
 7 files changed, 43 insertions(+), 9 deletions(-)


diff --git a/block/blk-map.c b/block/blk-map.c
index c65d759..ade0a08 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -307,7 +307,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 		return PTR_ERR(bio);
 
 	if (rq_data_dir(rq) == WRITE)
-		bio->bi_rw |= (1 << REQ_WRITE);
+		bio->bi_rw |= REQ_WRITE;
 
 	if (do_copy)
 		rq->cmd_flags |= REQ_COPY_USER;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index f65c6f0..9eba291 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1019,10 +1019,20 @@ cfq_find_alloc_cfqg(struct cfq_data *cfqd, struct cgroup *cgroup, int create)
 	 */
 	atomic_set(&cfqg->ref, 1);
 
-	/* Add group onto cgroup list */
-	sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
-	cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
+	/*
+	 * Add group onto cgroup list. It might happen that bdi->dev is
+	 * not initiliazed yet. Initialize this new group without major
+	 * and minor info and this info will be filled in once a new thread
+	 * comes for IO. See code above.
+	 */
+	if (bdi->dev) {
+		sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
+		cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
 					MKDEV(major, minor));
+	} else
+		cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
+					0);
+
 	cfqg->weight = blkcg_get_weight(blkcg, cfqg->blkg.dev);
 
 	/* Add group on cfqd list */
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 6124c2f..5e4fadc 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -4792,7 +4792,7 @@ static int __devinit cciss_init_one(struct pci_dev *pdev,
 clean4:
 	kfree(h->cmd_pool_bits);
 	/* Free up sg elements */
-	for (k = 0; k < h->nr_cmds; k++)
+	for (k-- ; k >= 0; k--)
 		kfree(h->scatter_list[k]);
 	kfree(h->scatter_list);
 	cciss_free_sg_chain_blocks(h->cmd_sg_list, h->nr_cmds);
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index a398ecd..1f528fa 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -788,10 +788,11 @@ static const struct file_operations zero_fops = {
 /*
  * capabilities for /dev/zero
  * - permits private mappings, "copies" are taken of the source of zeros
+ * - no writeback happens
  */
 static struct backing_dev_info zero_bdi = {
 	.name		= "char/mem",
-	.capabilities	= BDI_CAP_MAP_COPY,
+	.capabilities	= BDI_CAP_MAP_COPY | BDI_CAP_NO_ACCT_AND_WRITEBACK,
 };
 
 static const struct file_operations full_fops = {
diff --git a/fs/char_dev.c b/fs/char_dev.c
index f80a4f2..143d393 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -40,7 +40,9 @@ struct backing_dev_info directly_mappable_cdev_bdi = {
 #endif
 		/* permit direct mmap, for read, write or exec */
 		BDI_CAP_MAP_DIRECT |
-		BDI_CAP_READ_MAP | BDI_CAP_WRITE_MAP | BDI_CAP_EXEC_MAP),
+		BDI_CAP_READ_MAP | BDI_CAP_WRITE_MAP | BDI_CAP_EXEC_MAP |
+		/* no writeback happens */
+		BDI_CAP_NO_ACCT_AND_WRITEBACK),
 };
 
 static struct kobj_map *cdev_map;
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 81e086d..5581122 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -52,8 +52,6 @@ struct wb_writeback_work {
 #define CREATE_TRACE_POINTS
 #include <trace/events/writeback.h>
 
-#define inode_to_bdi(inode)	((inode)->i_mapping->backing_dev_info)
-
 /*
  * We don't actually have pdflush, but this one is exported though /proc...
  */
@@ -71,6 +69,27 @@ int writeback_in_progress(struct backing_dev_info *bdi)
 	return test_bit(BDI_writeback_running, &bdi->state);
 }
 
+static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+	struct backing_dev_info *bdi = inode->i_mapping->backing_dev_info;
+
+	/*
+	 * For inodes on standard filesystems, we use superblock's bdi. For
+	 * inodes on virtual filesystems, we want to use inode mapping's bdi
+	 * because they can possibly point to something useful (think about
+	 * block_dev filesystem).
+	 */
+	if (sb->s_bdi && sb->s_bdi != &noop_backing_dev_info) {
+		/* Some device inodes could play dirty tricks. Catch them... */
+		WARN(bdi != sb->s_bdi && bdi_cap_writeback_dirty(bdi),
+			"Dirtiable inode bdi %s != sb bdi %s\n",
+			bdi->name, sb->s_bdi->name);
+		return sb->s_bdi;
+	}
+	return bdi;
+}
+
 static void bdi_queue_work(struct backing_dev_info *bdi,
 		struct wb_writeback_work *work)
 {
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index c2bf86f..65d4204 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -30,6 +30,7 @@ EXPORT_SYMBOL_GPL(default_backing_dev_info);
 
 struct backing_dev_info noop_backing_dev_info = {
 	.name		= "noop",
+	.capabilities	= BDI_CAP_NO_ACCT_AND_WRITEBACK,
 };
 EXPORT_SYMBOL_GPL(noop_backing_dev_info);
 
@@ -243,6 +244,7 @@ static int __init default_bdi_init(void)
 	err = bdi_init(&default_backing_dev_info);
 	if (!err)
 		bdi_register(&default_backing_dev_info, NULL, "default");
+	err = bdi_init(&noop_backing_dev_info);
 
 	return err;
 }
-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [GIT PULL] block fixes for 2.6.36-rc5
  2010-09-22  7:49 [GIT PULL] block fixes for 2.6.36-rc5 Jens Axboe
@ 2010-09-22 15:06 ` Linus Torvalds
  2010-09-22 15:24   ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: Linus Torvalds @ 2010-09-22 15:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel@vger.kernel.org

Gaah. This is just _incredibly_ ugly:

On Wed, Sep 22, 2010 at 12:49 AM, Jens Axboe <jaxboe@fusionio.com> wrote:
>
> -       /* Add group onto cgroup list */
> -       sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
> -       cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
> +       /*
> +        * Add group onto cgroup list. It might happen that bdi->dev is
> +        * not initiliazed yet. Initialize this new group without major
> +        * and minor info and this info will be filled in once a new thread
> +        * comes for IO. See code above.
> +        */
> +       if (bdi->dev) {
> +               sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
> +               cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
>                                        MKDEV(major, minor));
> +       } else
> +               cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
> +                                       0);
> +

and quite frankly, anything that does that kind of thing is total
sh*t. Not only is the sscanf() just broken (really? figuring out
things from some internal string? Using dev_t in this time and age for
kernel internal stuff?) to begin with, but if you have to then do it
conditionally, for chrissake do it _cleanly_.

Make a small helper function that does "get me the dev_t of this
'dev'", and make that one do

   static unsigned int device_dev_t(struct device *dev)
   {
      unsigned int major = 0, minor = 0;

      if (dev)
         sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);

      return MKDEV(major, minor);
   }

and then just have a single 'cfq_blkiocg_add_blkio_group()' there.

But more seriously, why the hell does anything internal to cfq use a
'dev_t' in the first place? Why isn't that 'struct blkio_group' using
a pointer to the 'struct device' or something like that instead (or
the pointer to the queue, or whatever)? It's just damn wrong to use
dev_t in this day and age, and the fact that you need to make it up
using sscanf() should have clued people into that fact.

I hate seeing obvious crap-workarounds this late in an -rc.

                            Linus

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GIT PULL] block fixes for 2.6.36-rc5
  2010-09-22 15:06 ` Linus Torvalds
@ 2010-09-22 15:24   ` Jens Axboe
  0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2010-09-22 15:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel@vger.kernel.org

On 2010-09-22 17:06, Linus Torvalds wrote:
> Gaah. This is just _incredibly_ ugly:
> 
> On Wed, Sep 22, 2010 at 12:49 AM, Jens Axboe <jaxboe@fusionio.com> wrote:
>>
>> -       /* Add group onto cgroup list */
>> -       sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
>> -       cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
>> +       /*
>> +        * Add group onto cgroup list. It might happen that bdi->dev is
>> +        * not initiliazed yet. Initialize this new group without major
>> +        * and minor info and this info will be filled in once a new thread
>> +        * comes for IO. See code above.
>> +        */
>> +       if (bdi->dev) {
>> +               sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
>> +               cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
>>                                        MKDEV(major, minor));
>> +       } else
>> +               cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg, (void *)cfqd,
>> +                                       0);
>> +
> 
> and quite frankly, anything that does that kind of thing is total
> sh*t. Not only is the sscanf() just broken (really? figuring out
> things from some internal string? Using dev_t in this time and age for
> kernel internal stuff?) to begin with, but if you have to then do it
> conditionally, for chrissake do it _cleanly_.
> 
> Make a small helper function that does "get me the dev_t of this
> 'dev'", and make that one do
> 
>    static unsigned int device_dev_t(struct device *dev)
>    {
>       unsigned int major = 0, minor = 0;
> 
>       if (dev)
>          sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
> 
>       return MKDEV(major, minor);
>    }
> 
> and then just have a single 'cfq_blkiocg_add_blkio_group()' there.
> 
> But more seriously, why the hell does anything internal to cfq use a
> 'dev_t' in the first place? Why isn't that 'struct blkio_group' using
> a pointer to the 'struct device' or something like that instead (or
> the pointer to the queue, or whatever)? It's just damn wrong to use
> dev_t in this day and age, and the fact that you need to make it up
> using sscanf() should have clued people into that fact.
> 
> I hate seeing obvious crap-workarounds this late in an -rc.

I don't think anybody will disagree. The major problem here is that we
still have some drivers using multiple devices per queue, one of them
will be gone for .37 at least and I hope we can get mtd blk converted
too. When that happens, we can place a dev in the queue and get rid of
this hack and similar hacks on double bdi etc registrations.

So we can put some makeup on the corpse like you describe, or we can
just live with this until we can kill off abominations like this for
2.6.37.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-09-22 15:24 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-22  7:49 [GIT PULL] block fixes for 2.6.36-rc5 Jens Axboe
2010-09-22 15:06 ` Linus Torvalds
2010-09-22 15:24   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox