All of lore.kernel.org
 help / color / mirror / Atom feed
From: LA Walsh <law@sgi.com>
To: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Jesse Pollard <pollard@tomcat.admin.navo.hpc.mil>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: 64-bit block sizes on 32-bit systems
Date: Tue, 27 Mar 2001 13:55:48 -0800	[thread overview]
Message-ID: <3AC10C64.108BCFD3@sgi.com> (raw)
In-Reply-To: <200103271957.NAA13547@tomcat.admin.navo.hpc.mil> <20010327152011.A1354@cs.cmu.edu>

Jan Harkes wrote:
> 
> On Tue, Mar 27, 2001 at 01:57:42PM -0600, Jesse Pollard wrote:
> > > Using similar numbers as presented. If we are working our way through
> > > every single block in a Pentabyte filesystem, and the blocksize is 512
> > > bytes. Then the 1us in extra CPU cycles because of 64-bit operations
> > > would add, according to by back of the envelope calculation, 2199023
> > > seconds of CPU time a bit more than 25 days.
> >
> > Ummm... I don't think it adds that much. You seem to be leaving out the
> > overlap disk/IO and computation for read-ahead. This should eliminate the
> > majority of the delay effect.
> 
> 1024 TB should be around 2*10^12 512-byte blocks, divide by 10^6 (1us)
> of "assumed" overhead per block operation is 2*10^6 seconds, no I
> believe I'm pretty close there. I am considering everything being
> "available in the cache", i.e. no waiting for disk access.
---
	If everything being used is only used from the cache, then
the application probably doesn't need 64-bit block support.  

	I submit that your argument may be flawed in the assumption that
if an application needs multi-terabyte files and devices, that most
of the data will be in the in-memory cache. 
 

> The time to update the pagetables is identical to the time to update a
> 4KB page when the OS is using a 2MB pagesize. Ofcourse it will take more
> time to load the data into the page, however it should be a consecutive
> stretch of data on disk, which should give a more efficient transfer
> than small blocks scattered around the disk.
---
	Not if you were doing alot of random reads where you only
needd 1-2K of data.  The read-time of the extra 2M-1K would seem
to eat into any performance boot gained by the large pagesize.

> 
> > Granted, 512 bytes could be considered too small for some things, but
> > once you pass 32K you start adding a lot of rotational delay problems.
> > I've used file systems with 256K blocks - they are slow when compaired
> > to the throughput using 32K. I wasn't the one running the benchmarks,
> > but with a MaxStrat 400GB raid with 256K sized data transfer was much
> > slower (around 3 times slower) than 32K. (The target application was
> > a GIS server using Oracle).
> 
> But your subsystem (the disk) was probably still using 512 byte blocks,
> possibly scattered. And the OS was still using 4KB pages, it takes more
> time to reclaim and gather 64 pages per IO operation than one, that's
> why I'm saying that the pagesize needs to scale along with the blocksize.
> 
> The application might have been assuming a small block size as well, and
> the OS was told to do several read/modify/write cycles, perhaps even 512
> times as much as necessary.
> 
> I'm not saying that the current system will perform well when working
> with large blocks, but compared to increasing the size of block_t, a
> larger blocksize has more potential to give improvements in the long
> term without adding an unrecoverable performance hit.
---
	That's totally application dependent.  Database applications
might tend to skip around in the data and do short/reads/writes over
a very large file.  Large block sizes will degrade their performance.

	This was the idea of making it a *configurable* option.  If
you need it, configure it.  Same with block size -- that should
likely have a wider range for configuration as well.  But
configuration (and ideally auto-configuration where possible)
seems the ultimate win-win situation.

-l
-- 
The above thoughts are my own and do not necessarily represent those
		of my employer.
L A Walsh                        | Trust Technology, Core Linux, SGI
law@sgi.com                      | Voice: (650) 933-5338

  reply	other threads:[~2001-03-27 21:58 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-03-27 19:57 64-bit block sizes on 32-bit systems Jesse Pollard
2001-03-27 20:20 ` Jan Harkes
2001-03-27 21:55   ` LA Walsh [this message]
  -- strict thread matches above, loose matches on Subject: below --
2001-03-27 22:23 Jesse Pollard
2001-03-27 23:56 ` Steve Lord
2001-03-28  8:09   ` Brad Boyer
2001-03-28 14:53     ` Dave Kleikamp
2001-03-27 19:30 Jesse Pollard
     [not found] <Pine.LNX.4.30.0103270022500.21075-100000@age.cs.columbia.edu>
     [not found] ` <3AC0CA9C.3D804361@sgi.com>
2001-03-27 19:00   ` Jan Harkes
2001-03-27 17:22 LA Walsh
2001-03-26 21:27 Jesse Pollard
2001-03-26 22:07 ` Jonathan Morton
2001-03-27  4:14   ` Jesse Pollard
2001-03-26 19:26 Jesse Pollard
2001-03-26 18:01 Manfred Spraul
2001-03-26 18:07 ` Matthew Wilcox
2001-03-26 19:40 ` LA Walsh
2001-03-26 21:53   ` Manfred Spraul
2001-03-26 22:07     ` LA Walsh
2001-03-26 17:35 LA Walsh
2001-03-26 16:39 LA Walsh
2001-03-26 17:18 ` Matthew Wilcox
2001-03-26 17:47   ` Andreas Dilger
2001-03-26 18:09     ` Matthew Wilcox
2001-03-26 18:37       ` Eric W. Biederman
2001-03-26 19:36         ` Martin Dalecki
2001-03-26 23:03         ` AJ Lewis
2001-03-26 19:05       ` Scott Laird
2001-03-26 19:09       ` Andreas Dilger
2001-03-26 20:31         ` Dan Hollis
2001-03-26 19:20       ` Rik van Riel
2001-03-26 20:14       ` Jes Sorensen
2001-03-26 17:58 ` Eric W. Biederman
2001-03-28  8:06 ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3AC10C64.108BCFD3@sgi.com \
    --to=law@sgi.com \
    --cc=jaharkes@cs.cmu.edu \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pollard@tomcat.admin.navo.hpc.mil \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.