linux-fsdevel.vger.kernel.org archive mirror
From: Nathan Scott <nathans@sgi.com>
To: Badari Pulavarty <pbadari@us.ibm.com>, Christoph Hellwig <hch@lst.de>
Cc: mcao@us.ibm.com, akpm@osdl.org,
	lkml <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 0/3] map multiple blocks in get_block() and mpage_readpages()
Date: Thu, 23 Feb 2006 12:40:04 +1100
Message-ID: <20060223014004.GA900@frodo>
In-Reply-To: <20060222165942.GA25167@lst.de>

On Wed, Feb 22, 2006 at 05:59:42PM +0100, christoph wrote:
> On Wed, Feb 22, 2006 at 08:58:30AM -0800, Badari Pulavarty wrote:
> > Thanks. Only current issue Nathan raised is, he wants to see
> > b_size change to u64 [size_t] (instead of u32) to support
> > really-huge-IO requests.

And also to not go backwards on what the current DIO mapping
interface provides us.

> Does this sound reasonable to you ?
> 
> I know that we didn't want to increase b_size at some point because
> of concerns about the number of objects fitting into a page in the

There are four extra bytes on an 88 byte structure (with sector_t
CONFIG'd at 64 bits) - so, yes, there'll be a small decrease in
the number that fit in a page on 64 bit platforms.  Perhaps it's
not worth it, but it would be good to sort out these pesky size
mismatches cos they introduce very subtle bugs.

> slab allocator.  If the same number of bigger heads fit into the
> same page I see no problems with the increase.

Heh, bigger heads.  Well, the same number fit in the page for 32
bit platforms, does that count?

Taking a quick look at the struct definition, there are some oddities
that have crept in since the 2.4 version - looks like sct had arranged
the fields in nice 32-bit-platform-cache-aligned groups, with several
cache-alignment related comments, some of which now remain and make
very little sense on their own (& with the now-incorrect grouping).

Anyway, here's a patch Badari - I reckon it's worth it, but then I
am a bit biased (as it's likely only XFS is going to be seeing this
size of I/O for some time, and as someone who has hunted down some
of these size mismatch bugs...)

cheers.

-- 
Nathan


Increase the size of the buffer_head b_size field for 64 bit
platforms.  Update some old and moldy comments in and around
the structure as well.

The b_size increase allows us to perform larger mappings and
allocations for large I/O requests from userspace, which tie
in with other changes allowing the get_block_t() interface to
map multiple blocks at once.

Signed-off-by: Nathan Scott <nathans@sgi.com>

--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -46,20 +46,25 @@ struct address_space;
 typedef void (bh_end_io_t)(struct buffer_head *bh, int uptodate);
 
 /*
- * Keep related fields in common cachelines.  The most commonly accessed
- * field (b_state) goes at the start so the compiler does not generate
- * indexed addressing for it.
+ * Historically, a buffer_head was used to map a single block
+ * within a page, and of course as the unit of I/O through the
+ * filesystem and block layers.  Nowadays the basic I/O unit
+ * is the bio, and buffer_heads are used for extracting block
+ * mappings (via a get_block_t call), for tracking state within
+ * a page (via a page_mapping) and for wrapping bio submission
+ * for backward compatibility reasons (e.g. submit_bh).  There
+ * may be one or two other uses too (I used it for drying the
+ * dishes the other night when I couldn't find a tea towel...).
  */
 struct buffer_head {
-	/* First cache line: */
 	unsigned long b_state;		/* buffer state bitmap (see above) */
 	struct buffer_head *b_this_page;/* circular list of page's buffers */
 	struct page *b_page;		/* the page this bh is mapped to */
-	atomic_t b_count;		/* users using this block */
-	u32 b_size;			/* block size */
+	atomic_t b_count;		/* users using this buffer_head */
 
-	sector_t b_blocknr;		/* block number */
-	char *b_data;			/* pointer to data block */
+	size_t b_size;			/* size of mapping */
+	sector_t b_blocknr;		/* start block number */
+	char *b_data;			/* pointer to data within the page */
 
 	struct block_device *b_bdev;
 	bh_end_io_t *b_end_io;		/* I/O completion */
