public inbox for linux-kernel@vger.kernel.org
* hfsplus mount regression in 2.6.38
@ 2011-05-25 14:25 Seth Forshee
  2011-05-27  9:25 ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Seth Forshee @ 2011-05-25 14:25 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Anton Salikhmetov, linux-kernel

We've been receiving reports of problems mounting HFS+ formatted volumes
(iPods mostly, but not exclusively) starting with the 2.6.38 kernel,
with messages like the following in dmesg.

  usb 2-2: new high speed USB device using ehci_hcd and address 12
  scsi7 : usb-storage 2-2:1.0
  scsi 7:0:0:0: Direct-Access Apple iPod 1.62 PQ: 0 ANSI: 0
  sd 7:0:0:0: Attached scsi generic sg2 type 0
  sd 7:0:0:0: [sdb] 1982464 2048-byte logical blocks: (4.06 GB/3.78 GiB)
  sd 7:0:0:0: [sdb] Write Protect is off
  sd 7:0:0:0: [sdb] Mode Sense: 68 00 00 08
  sd 7:0:0:0: [sdb] No Caching mode page present
  sd 7:0:0:0: [sdb] Assuming drive cache: write through
  sd 7:0:0:0: [sdb] 1982464 2048-byte logical blocks: (4.06 GB/3.78 GiB)
  sd 7:0:0:0: [sdb] No Caching mode page present
  sd 7:0:0:0: [sdb] Assuming drive cache: write through
  sdb: [mac] sdb1 sdb2 sdb3
  sd 7:0:0:0: [sdb] 1982464 2048-byte logical blocks: (4.06 GB/3.78 GiB)
  sd 7:0:0:0: [sdb] No Caching mode page present
  sd 7:0:0:0: [sdb] Assuming drive cache: write through
  sd 7:0:0:0: [sdb] Attached SCSI removable disk
  sd 7:0:0:0: [sdb] Bad block number requested
  hfs: unable to find HFS+ superblock
  sd 7:0:0:0: [sdb] Bad block number requested
  hfs: unable to find HFS+ superblock

Reverting commits 52399b1 (hfsplus: use raw bio access for the volume
headers) and 358f26d5 (hfsplus: use raw bio access for partition tables)
fixes the problems. It appears the problems are due to hfsplus
submitting 512-byte bios to a block device whose sector size is larger
than 512 bytes (2 KB in the log above), and the block driver is quite
reasonably rejecting any requests without proper sector alignment.

How would you suggest fixing this?  Most file systems are using
sb_bread() for this sort of thing, but since the offending patches are
intended to stop using buffer_heads I'm assuming that's not an option.

Thanks,
Seth


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: hfsplus mount regression in 2.6.38
  2011-05-25 14:25 hfsplus mount regression in 2.6.38 Seth Forshee
@ 2011-05-27  9:25 ` Christoph Hellwig
  2011-05-27 13:23   ` Seth Forshee
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2011-05-27  9:25 UTC (permalink / raw)
  To: Anton Salikhmetov, linux-kernel

On Wed, May 25, 2011 at 09:25:21AM -0500, Seth Forshee wrote:
> Reverting commits 52399b1 (hfsplus: use raw bio access for the volume
> headers) and 358f26d5 (hfsplus: use raw bio access for partition tables)
> fixes the problems. It appears the problems are due to hfsplus
> submitting 512-byte bios to a block device whose sector size is larger
> than 512 bytes (2 KB in the log above), and the block driver is quite
> reasonably rejecting any requests without proper sector alignment.
> 
> How would you suggest fixing this?  Most file systems are using
> sb_bread() for this sort of thing, but since the offending patches are
> intended to stop using buffer_heads I'm assuming that's not an option.

Basically all hardcoded uses of HFSPLUS_SECTOR_SIZE need to be replaced
with a use of bdev_logical_block_size, or a per-sb variable derived from
it, and the addressing needs to be accommodated to fit it.  I'd need to
look in a bit more detail at what form the sectors we pass into it
are in - we might have to convert them from 512-byte to larger units,
or they might already be in them.  If they happen to be in 512-byte units
we might have to do read-modify-write cycles.

What large sector size device do you have, a CDROM?


* Re: hfsplus mount regression in 2.6.38
  2011-05-27  9:25 ` Christoph Hellwig
@ 2011-05-27 13:23   ` Seth Forshee
  2011-05-27 18:24     ` Seth Forshee
  0 siblings, 1 reply; 8+ messages in thread
From: Seth Forshee @ 2011-05-27 13:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Anton Salikhmetov, linux-kernel

On Fri, May 27, 2011 at 05:25:22AM -0400, Christoph Hellwig wrote:
> On Wed, May 25, 2011 at 09:25:21AM -0500, Seth Forshee wrote:
> > Reverting commits 52399b1 (hfsplus: use raw bio access for the volume
> > headers) and 358f26d5 (hfsplus: use raw bio access for partition tables)
> > fixes the problems. It appears the problems are due to hfsplus
> > submitting 512-byte bios to a block device whose sector size is larger
> > than 512 bytes (2 KB in the log above), and the block driver is quite
> > reasonably rejecting any requests without proper sector alignment.
> > 
> > How would you suggest fixing this?  Most file systems are using
> > sb_bread() for this sort of thing, but since the offending patches are
> > intended to stop using buffer_heads I'm assuming that's not an option.
> 
> Basically all hardcoded uses of HFSPLUS_SECTOR_SIZE need to be replaced
> with a use of bdev_logical_block_size, or a per-sb variable derived from
> it, and the addressing needs to be accommodated to fit it.  I'd need to
> look in a bit more detail at what form the sectors we pass into it
> are in - we might have to convert them from 512-byte to larger units,
> or they might already be in them.  If they happen to be in 512-byte units
> we might have to do read-modify-write cycles.

It seems reasonable to use sb->s_blocksize for this, as it shouldn't get
set to anything smaller than the logical block size. That's what
sb_bread() uses, in fact.

> What large sector size device do you have, a CDROM?

I don't have any large-sector devices personally, these are reports
coming in from users. Many of them are seeing this problem with iPods,
and some have reported it with HDDs. As far as I know none of the
reports have been with a CDROM. Here's a link to the bug.

  http://bugs.launchpad.net/bugs/734883


* Re: hfsplus mount regression in 2.6.38
  2011-05-27 13:23   ` Seth Forshee
@ 2011-05-27 18:24     ` Seth Forshee
  2011-06-02 21:58       ` Seth Forshee
  0 siblings, 1 reply; 8+ messages in thread
From: Seth Forshee @ 2011-05-27 18:24 UTC (permalink / raw)
  To: Christoph Hellwig, Anton Salikhmetov, linux-kernel

On Fri, May 27, 2011 at 08:23:56AM -0500, Seth Forshee wrote:
> On Fri, May 27, 2011 at 05:25:22AM -0400, Christoph Hellwig wrote:
> > On Wed, May 25, 2011 at 09:25:21AM -0500, Seth Forshee wrote:
> > > Reverting commits 52399b1 (hfsplus: use raw bio access for the volume
> > > headers) and 358f26d5 (hfsplus: use raw bio access for partition tables)
> > > fixes the problems. It appears the problems are due to hfsplus
> > > submitting 512-byte bios to a block device whose sector size is larger
> > > than 512 bytes (2 KB in the log above), and the block driver is quite
> > > reasonably rejecting any requests without proper sector alignment.
> > > 
> > > How would you suggest fixing this?  Most file systems are using
> > > sb_bread() for this sort of thing, but since the offending patches are
> > > intended to stop using buffer_heads I'm assuming that's not an option.
> > 
> > Basically all hardcoded uses of HFSPLUS_SECTOR_SIZE need to be replaced
> > with a use of bdev_logical_block_size, or a per-sb variable derived from
> > it, and the addressing needs to be accommodated to fit it.  I'd need to
> > look in a bit more detail at what form the sectors we pass into it
> > are in - we might have to convert them from 512-byte to larger units,
> > or they might already be in them.  If they happen to be in 512-byte units
> > we might have to do read-modify-write cycles.
> 
> It seems reasonable to use sb->s_blocksize for this, as it shouldn't get
> set to anything smaller than the logical block size. That's what
> sb_bread() uses in fact.

I started looking into this, and it seems like one aspect of it is a
little nasty. My knowledge here is rudimentary at best though, so maybe
you can comment on whether or not this situation is possible.

It looks like some of the metadata accessed via bio can reside in blocks
adjacent to blocks containing file data, in which case it would be
possible for some of this metadata to be present in the page cache.
This could potentially lead to races where metadata changes are
committed and later overwritten by stale data in the page cache. Is this
a valid concern?

I'm also not fully understanding why using buffer_heads caused the issue
these patches were intended to fix. It looks to me like the bios
submitted by sb_bread() to read the alternate volume header should be
sized to the maximum of 512 bytes and the logical block size set by the
block driver. So unless the block driver is setting this size incorrectly,
I don't see how the I/O size gets forced to 4 KB. Am I missing something?

Thanks,
Seth


* Re: hfsplus mount regression in 2.6.38
  2011-05-27 18:24     ` Seth Forshee
@ 2011-06-02 21:58       ` Seth Forshee
  2011-06-30 11:35         ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Seth Forshee @ 2011-06-02 21:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Anton Salikhmetov, linux-kernel

On Fri, May 27, 2011 at 01:24:06PM -0500, Seth Forshee wrote:
> > > Basically all hardcoded uses of HFSPLUS_SECTOR_SIZE need to be replaced
> > > with a use of bdev_logical_block_size, or a per-sb variable derived from
> > > it, and the addressing needs to be accommodated to fit it.  I'd need to
> > > look in a bit more detail at what form the sectors we pass into it
> > > are in - we might have to convert them from 512-byte to larger units,
> > > or they might already be in them.  If they happen to be in 512-byte units
> > > we might have to do read-modify-write cycles.
> > 
> > It seems reasonable to use sb->s_blocksize for this, as it shouldn't get
> > set to anything smaller than the logical block size. That's what
> > sb_bread() uses in fact.
> 
> I started looking into this, and it seems like one aspect of it is a
> little nasty. My knowledge here is rudimentary at best though, so maybe
> you can comment on whether or not this situation is possible.
> 
> It looks like some of the metadata accessed via bio can reside in blocks
> adjacent to blocks containing file data, in which case it would be
> possible for some of this metadata to be present in the page cache.
> This could potentially lead to races where metadata changes are
> committed and later overwritten by stale data in the page cache. Is this
> a valid concern?

I took a crack at converting the users of direct bio to use
bdev_logical_block_size instead of HFSPLUS_SECTOR_SIZE. sb->s_blocksize
doesn't turn out to work because it may change after reading the
volume header. The patch is below; feedback is appreciated.

So far I've only done light testing, and no testing with large-sector
devices since I don't have any to test with. I'm still concerned about
duplicating data also in the page cache with this approach. Any thoughts
on whether or not this is something to be worried about?


diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index d685752..4e7f64b 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -13,6 +13,7 @@
 #include <linux/fs.h>
 #include <linux/mutex.h>
 #include <linux/buffer_head.h>
+#include <linux/blkdev.h>
 #include "hfsplus_raw.h"
 
 #define DBG_BNODE_REFS	0x00000001
@@ -110,7 +111,9 @@ struct hfsplus_vh;
 struct hfs_btree;
 
 struct hfsplus_sb_info {
+	void *s_vhdr_buf;
 	struct hfsplus_vh *s_vhdr;
+	void *s_backup_vhdr_buf;
 	struct hfsplus_vh *s_backup_vhdr;
 	struct hfs_btree *ext_tree;
 	struct hfs_btree *cat_tree;
@@ -258,6 +261,15 @@ struct hfsplus_readdir_data {
 	struct hfsplus_cat_key key;
 };
 
+/*
+ * Find the minimum acceptable I/O size for an hfsplus sb.
+ */
+static inline unsigned short hfsplus_min_io_size(struct super_block *sb)
+{
+	return max_t(unsigned short, bdev_logical_block_size(sb->s_bdev),
+		     HFSPLUS_SECTOR_SIZE);
+}
+
 #define hfs_btree_open hfsplus_btree_open
 #define hfs_btree_close hfsplus_btree_close
 #define hfs_btree_write hfsplus_btree_write
@@ -436,8 +448,8 @@ int hfsplus_compare_dentry(const struct dentry *parent,
 /* wrapper.c */
 int hfsplus_read_wrapper(struct super_block *);
 int hfs_part_find(struct super_block *, sector_t *, sector_t *);
-int hfsplus_submit_bio(struct block_device *bdev, sector_t sector,
-		void *data, int rw);
+int hfsplus_submit_bio(struct super_block *sb, sector_t sector,
+		void *buf, void **data, int rw);
 
 /* time macros */
 #define __hfsp_mt2ut(t)		(be32_to_cpu(t) - 2082844800U)
diff --git a/fs/hfsplus/part_tbl.c b/fs/hfsplus/part_tbl.c
index 40ad88c..eb355d8 100644
--- a/fs/hfsplus/part_tbl.c
+++ b/fs/hfsplus/part_tbl.c
@@ -88,11 +88,12 @@ static int hfs_parse_old_pmap(struct super_block *sb, struct old_pmap *pm,
 	return -ENOENT;
 }
 
-static int hfs_parse_new_pmap(struct super_block *sb, struct new_pmap *pm,
-		sector_t *part_start, sector_t *part_size)
+static int hfs_parse_new_pmap(struct super_block *sb, void *buf,
+		struct new_pmap *pm, sector_t *part_start, sector_t *part_size)
 {
 	struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb);
 	int size = be32_to_cpu(pm->pmMapBlkCnt);
+	int buf_size = hfsplus_min_io_size(sb);
 	int res;
 	int i = 0;
 
@@ -107,11 +108,14 @@ static int hfs_parse_new_pmap(struct super_block *sb, struct new_pmap *pm,
 		if (++i >= size)
 			return -ENOENT;
 
-		res = hfsplus_submit_bio(sb->s_bdev,
-					 *part_start + HFS_PMAP_BLK + i,
-					 pm, READ);
-		if (res)
-			return res;
+		pm = (struct new_pmap *)((u8 *)pm + HFSPLUS_SECTOR_SIZE);
+		if ((u8 *)pm - (u8 *)buf >= buf_size) {
+			res = hfsplus_submit_bio(sb,
+						 *part_start + HFS_PMAP_BLK + i,
+						 buf, (void **)&pm, READ);
+			if (res)
+				return res;
+		}
 	} while (pm->pmSig == cpu_to_be16(HFS_NEW_PMAP_MAGIC));
 
 	return -ENOENT;
@@ -124,15 +128,15 @@ static int hfs_parse_new_pmap(struct super_block *sb, struct new_pmap *pm,
 int hfs_part_find(struct super_block *sb,
 		sector_t *part_start, sector_t *part_size)
 {
-	void *data;
+	void *buf, *data;
 	int res;
 
-	data = kmalloc(HFSPLUS_SECTOR_SIZE, GFP_KERNEL);
-	if (!data)
+	buf = kmalloc(hfsplus_min_io_size(sb), GFP_KERNEL);
+	if (!buf)
 		return -ENOMEM;
 
-	res = hfsplus_submit_bio(sb->s_bdev, *part_start + HFS_PMAP_BLK,
-				 data, READ);
+	res = hfsplus_submit_bio(sb, *part_start + HFS_PMAP_BLK,
+				 buf, &data, READ);
 	if (res)
 		goto out;
 
@@ -141,13 +145,13 @@ int hfs_part_find(struct super_block *sb,
 		res = hfs_parse_old_pmap(sb, data, part_start, part_size);
 		break;
 	case HFS_NEW_PMAP_MAGIC:
-		res = hfs_parse_new_pmap(sb, data, part_start, part_size);
+		res = hfs_parse_new_pmap(sb, buf, data, part_start, part_size);
 		break;
 	default:
 		res = -ENOENT;
 		break;
 	}
 out:
-	kfree(data);
+	kfree(buf);
 	return res;
 }
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index b49b555..3add787 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -197,17 +197,17 @@ int hfsplus_sync_fs(struct super_block *sb, int wait)
 		write_backup = 1;
 	}
 
-	error2 = hfsplus_submit_bio(sb->s_bdev,
+	error2 = hfsplus_submit_bio(sb,
 				   sbi->part_start + HFSPLUS_VOLHEAD_SECTOR,
-				   sbi->s_vhdr, WRITE_SYNC);
+				   sbi->s_vhdr_buf, NULL, WRITE_SYNC);
 	if (!error)
 		error = error2;
 	if (!write_backup)
 		goto out;
 
-	error2 = hfsplus_submit_bio(sb->s_bdev,
+	error2 = hfsplus_submit_bio(sb,
 				  sbi->part_start + sbi->sect_count - 2,
-				  sbi->s_backup_vhdr, WRITE_SYNC);
+				  sbi->s_backup_vhdr_buf, NULL, WRITE_SYNC);
 	if (!error)
 		error = error2;
 out:
diff --git a/fs/hfsplus/wrapper.c b/fs/hfsplus/wrapper.c
index 4ac88ff..0f6a89d 100644
--- a/fs/hfsplus/wrapper.c
+++ b/fs/hfsplus/wrapper.c
@@ -31,25 +31,65 @@ static void hfsplus_end_io_sync(struct bio *bio, int err)
 	complete(bio->bi_private);
 }
 
-int hfsplus_submit_bio(struct block_device *bdev, sector_t sector,
-		void *data, int rw)
+/*
+ * hfsplus_submit_bio - Perform block I/O
+ * @sb: super block of volume for I/O
+ * @sector: block to read or write, for blocks of HFSPLUS_SECTOR_SIZE bytes
+ * @buf: buffer for I/O
+ * @data: output pointer for location of requested data
+ * @rw: direction of I/O
+ *
+ * The unit of I/O is hfsplus_min_io_size(sb), which may be bigger than
+ * HFSPLUS_SECTOR_SIZE, and @buf must be sized accordingly. On reads
+ * @data will return a pointer to the start of the requested sector,
+ * which may not be the same location as @buf.
+ *
+ * If @sector is not aligned to the bdev logical block size it will
+ * be rounded down. For writes this means that @buf should contain data
+ * that starts at the rounded-down address. As long as the data was
+ * read using hfsplus_submit_bio() and the same buffer is used things
+ * will work correctly.
+ */
+int hfsplus_submit_bio(struct super_block *sb, sector_t sector,
+		void *buf, void **data, int rw)
 {
 	DECLARE_COMPLETION_ONSTACK(wait);
 	struct bio *bio;
 	int ret = 0;
+	unsigned int io_size;
+	loff_t start;
+	int offset;
+
+	/*
+	 * Align sector to hardware sector size and find offset.
+	 */
+	io_size = hfsplus_min_io_size(sb);
+	start = (loff_t)sector << HFSPLUS_SECTOR_SHIFT;
+	offset = start & (io_size - 1);
+	sector &= ~((io_size >> HFSPLUS_SECTOR_SHIFT) - 1);
 
 	bio = bio_alloc(GFP_NOIO, 1);
 	bio->bi_sector = sector;
-	bio->bi_bdev = bdev;
+	bio->bi_bdev = sb->s_bdev;
 	bio->bi_end_io = hfsplus_end_io_sync;
 	bio->bi_private = &wait;
 
-	/*
-	 * We always submit one sector at a time, so bio_add_page must not fail.
-	 */
-	if (bio_add_page(bio, virt_to_page(data), HFSPLUS_SECTOR_SIZE,
-			 offset_in_page(data)) != HFSPLUS_SECTOR_SIZE)
-		BUG();
+	if (!(rw & WRITE) && data)
+		*data = (u8 *)buf + offset;
+
+	while (io_size > 0) {
+		unsigned int page_offset = offset_in_page(buf);
+		unsigned int len = min_t(unsigned int, PAGE_SIZE - page_offset,
+					 io_size);
+
+		ret = bio_add_page(bio, virt_to_page(buf), len, page_offset);
+		if (ret != len) {
+			ret = -EIO;
+			goto out;
+		}
+		io_size -= len;
+		buf = (u8 *)buf + len;
+	}
 
 	submit_bio(rw, bio);
 	wait_for_completion(&wait);
@@ -57,8 +97,9 @@ int hfsplus_submit_bio(struct block_device *bdev, sector_t sector,
 	if (!bio_flagged(bio, BIO_UPTODATE))
 		ret = -EIO;
 
+out:
 	bio_put(bio);
-	return ret;
+	return ret < 0 ? ret : 0;
 }
 
 static int hfsplus_read_mdb(void *bufptr, struct hfsplus_wd *wd)
@@ -147,17 +188,17 @@ int hfsplus_read_wrapper(struct super_block *sb)
 	}
 
 	error = -ENOMEM;
-	sbi->s_vhdr = kmalloc(HFSPLUS_SECTOR_SIZE, GFP_KERNEL);
-	if (!sbi->s_vhdr)
+	sbi->s_vhdr_buf = kmalloc(hfsplus_min_io_size(sb), GFP_KERNEL);
+	if (!sbi->s_vhdr_buf)
 		goto out;
-	sbi->s_backup_vhdr = kmalloc(HFSPLUS_SECTOR_SIZE, GFP_KERNEL);
-	if (!sbi->s_backup_vhdr)
+	sbi->s_backup_vhdr_buf = kmalloc(hfsplus_min_io_size(sb), GFP_KERNEL);
+	if (!sbi->s_backup_vhdr_buf)
 		goto out_free_vhdr;
 
 reread:
-	error = hfsplus_submit_bio(sb->s_bdev,
-				   part_start + HFSPLUS_VOLHEAD_SECTOR,
-				   sbi->s_vhdr, READ);
+	error = hfsplus_submit_bio(sb, part_start + HFSPLUS_VOLHEAD_SECTOR,
+				   sbi->s_vhdr_buf, (void **)&sbi->s_vhdr,
+				   READ);
 	if (error)
 		goto out_free_backup_vhdr;
 
@@ -186,9 +227,9 @@ reread:
 		goto reread;
 	}
 
-	error = hfsplus_submit_bio(sb->s_bdev,
-				   part_start + part_size - 2,
-				   sbi->s_backup_vhdr, READ);
+	error = hfsplus_submit_bio(sb, part_start + part_size - 2,
+				   sbi->s_backup_vhdr_buf,
+				   (void **)&sbi->s_backup_vhdr, READ);
 	if (error)
 		goto out_free_backup_vhdr;
 


* Re: hfsplus mount regression in 2.6.38
  2011-06-02 21:58       ` Seth Forshee
@ 2011-06-30 11:35         ` Christoph Hellwig
  2011-06-30 13:10           ` Seth Forshee
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2011-06-30 11:35 UTC (permalink / raw)
  To: Christoph Hellwig, Anton Salikhmetov, linux-kernel

On Thu, Jun 02, 2011 at 04:58:21PM -0500, Seth Forshee wrote:
> I took a crack at converting the users of direct bio to use
> bdev_logical_block_size instead of HFSPLUS_SECTOR_SIZE. sb->s_blocksize
> doesn't turn out to work because it may change after reading the
> volume header. The patch is below; feedback is appreciated.
> 
> So far I've only done light testing, and no testing with large-sector
> devices since I don't have any to test with. I'm still concerned about
> duplicating data also in the page cache with this approach. Any thoughts
> on whether or not this is something to be worried about?

Did you manage to test it on a large sector device?

I'd be rather surprised if we actually need the read-modify-write
cycles.  I've not yet seen a filesystem that doesn't align its metadata
to the sector size.



* Re: hfsplus mount regression in 2.6.38
  2011-06-30 11:35         ` Christoph Hellwig
@ 2011-06-30 13:10           ` Seth Forshee
  2011-06-30 13:12             ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Seth Forshee @ 2011-06-30 13:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Anton Salikhmetov, linux-kernel

On Thu, Jun 30, 2011 at 07:35:20AM -0400, Christoph Hellwig wrote:
> On Thu, Jun 02, 2011 at 04:58:21PM -0500, Seth Forshee wrote:
> > I took a crack at converting the users of direct bio to use
> > bdev_logical_block_size instead of HFSPLUS_SECTOR_SIZE. sb->s_blocksize
> > doesn't turn out to work because it may change after reading the
> > volume header. The patch is below; feedback is appreciated.
> > 
> > So far I've only done light testing, and no testing with large-sector
> > devices since I don't have any to test with. I'm still concerned about
> > duplicating data also in the page cache with this approach. Any thoughts
> > on whether or not this is something to be worried about?
> 
> Did you manage to test it on a large sector device?

I've gotten some test results back, but the results are mixed. I haven't
had much time yet to try to find out what's going wrong, but I hope to
do so soon.

> I'd be rather surprised if we actually need the read-modify-write
> cycles.  I've not yet seen a filesystem that doesn't align its metadata
> to the sector size.

I agree. What I was more concerned about was that if you had e.g. a 2 KB
sector device, metadata could possibly end up in the filesystem cache
along with some file data in an adjacent block, and that as a result you
could overwrite that data structure with stale data.


* Re: hfsplus mount regression in 2.6.38
  2011-06-30 13:10           ` Seth Forshee
@ 2011-06-30 13:12             ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2011-06-30 13:12 UTC (permalink / raw)
  To: Christoph Hellwig, Anton Salikhmetov, linux-kernel

On Thu, Jun 30, 2011 at 02:10:12PM +0100, Seth Forshee wrote:
> > I'd be rather surprised if we actually need the read-modify-write
> > cycles.  I've not yet seen a filesystem that doesn't align its metadata
> > to the sector size.
> 
> I agree. What I was more concerned about was that if you had e.g. a 2 KB
> sector device, metadata could possibly end up in the filesystem cache
> along with some file data in an adjacent block, and that as a result you
> could overwrite that data structure with stale data.

Given that all other metadata in hfsplus is part of an inode with its
own address space, we would have hit that problem already.


end of thread, other threads:[~2011-06-30 13:12 UTC | newest]

Thread overview: 8+ messages
2011-05-25 14:25 hfsplus mount regression in 2.6.38 Seth Forshee
2011-05-27  9:25 ` Christoph Hellwig
2011-05-27 13:23   ` Seth Forshee
2011-05-27 18:24     ` Seth Forshee
2011-06-02 21:58       ` Seth Forshee
2011-06-30 11:35         ` Christoph Hellwig
2011-06-30 13:10           ` Seth Forshee
2011-06-30 13:12             ` Christoph Hellwig
