* External journals and NVRAM devices
@ 2002-11-01  5:38 Jeremy Howard
  2002-11-01  6:29 ` Andreas Dilger
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Jeremy Howard @ 2002-11-01  5:38 UTC (permalink / raw)
  To: ReiserFS List

Hi all,

I'm looking at buying solid state drives / NVRAM drives for our servers
to hold an external ReiserFS journal.

We are using 2.4.20pre11, and Chris Mason's data logging patches.

I'm looking for any tips on how large the journal is when using
data=journal, and whether the external log patches are stable and work OK
in data=journal mode. Is there a command to show the current journal
size? Does the size vary over time? We need to ensure we buy a card with
enough memory so this is important information for us.

Is anyone currently using NVRAM for the journal? If so, how do you find
the performance of this configuration?

TIA,
   Jeremy




* Re: External journals and NVRAM devices
  2002-11-01  5:38 Jeremy Howard
@ 2002-11-01  6:29 ` Andreas Dilger
  2002-11-01 14:30   ` Edward Shishkin
  2002-11-01 20:40 ` Hans Reiser
  2002-11-01 20:41 ` Hans Reiser
  2 siblings, 1 reply; 21+ messages in thread
From: Andreas Dilger @ 2002-11-01  6:29 UTC (permalink / raw)
  To: Jeremy Howard; +Cc: ReiserFS List

On Nov 01, 2002  16:38 +1100, Jeremy Howard wrote:
> I'm looking at buying solid state drives / NVRAM drives for our servers
> to hold an external ReiserFS journal.
> 
> We are using 2.4.20pre11, and Chris Mason's data logging patches.
> 
> I'm looking for any tips on how large the journal is when using
> data=journal, and whether the external log patches are stable and work OK
> in data=journal mode. Is there a command to show the current journal
> size? Does the size vary over time? We need to ensure we buy a card with
> enough memory so this is important information for us.
> 
> Is anyone currently using NVRAM for the journal? If so, how do you find
> the performance of this configuration?

When people were testing this with ext3 external journals, they just
used a RAMDISK for getting the performance measurements.  Obviously,
(I hope ;-) this is not something you can do in real life, but for
performance measurement it is OK.

Most people found that the ramdisk (and presumably the NVRAM device too)
didn't perform much, if any, better than having a separate fast disk for
the journal, because you are doing sequential I/O to the journal anyways.
If it is on a separate disk/controller from the filesystem you don't have
any seek or channel contention with the filesystem.  Of course, using a
regular disk for the journal is MUCH cheaper than an NVRAM card, so you
probably want to test this out before you go ahead and buy the NVRAM card.

NVRAM devices are great for disks you are doing a lot of random I/O
on (maybe database indexes or something), because there is zero seek
latency, but for sequential I/O (like the journal) it really isn't
anything special.
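
To make the seek-latency argument concrete, here is a rough cost model in
Python. It is only a sketch: the function name and every device figure
(8 ms per seek, 40 MB/s for the disk, 100 MB/s for the NVRAM card) are
assumed round numbers for illustration, not measurements from this thread.

```python
def io_time_seconds(num_requests, bytes_per_request, seek_seconds,
                    bandwidth_bytes_per_s, sequential):
    """Estimate the total time for a batch of I/O requests (toy model)."""
    # Sequential access pays for one initial seek; random access seeks per request.
    seeks = 1 if sequential else num_requests
    transfer = num_requests * bytes_per_request / bandwidth_bytes_per_s
    return seeks * seek_seconds + transfer


# Assumed, illustrative device parameters (hypothetical, not from the thread).
DISK_SEEK = 0.008   # seconds per seek on a rotating disk
DISK_BW = 40e6      # bytes per second, streaming
NVRAM_SEEK = 0.0    # no mechanical seek
NVRAM_BW = 100e6    # bytes per second

if __name__ == "__main__":
    for label, sequential in (("sequential", True), ("random", False)):
        disk = io_time_seconds(1000, 4096, DISK_SEEK, DISK_BW, sequential)
        nvram = io_time_seconds(1000, 4096, NVRAM_SEEK, NVRAM_BW, sequential)
        print(f"{label:10s} disk={disk:.4f}s nvram={nvram:.4f}s")
```

Under these assumptions the sequential case differs only by the bandwidth
ratio, while the random case on disk is dominated by seeks. The model
ignores synchronous commit latency, so it says nothing about fsync-heavy
workloads.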

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/



* Re: External journals and NVRAM devices
@ 2002-11-01  7:29 JP Howard
  2002-11-01  7:49 ` Serge Kolodeznyh
  2002-11-01 15:10 ` Edward Shishkin
  0 siblings, 2 replies; 21+ messages in thread
From: JP Howard @ 2002-11-01  7:29 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: ReiserFS List

On Thu, 31 Oct 2002 23:29:57 -0700, "Andreas Dilger"
<adilger@clusterfs.com> said:
> When people were testing this with ext3 external journals, they just
> used a RAMDISK for getting the performance measurements.  Obviously,
> (I hope ;-) this is not something you can do in real life, but for
> performance measurement it is OK.
> 
> Most people found that the ramdisk (and presumably the NVRAM device too)
> didn't perform much, if any, better than having a separate fast disk for
> the journal, because you are doing sequential I/O to the journal anyways.
<...>

Yes, I'd heard something like this. Our servers aren't going to have a
spare drive bay, I think, so a PCI NVRAM card may turn out to be a more
economical solution (although I haven't received quotes back from the
vendors yet...).

If I do find a spare drive bay, how unsafe would it be to use a single
drive, rather than RAID 1 mirroring? What does ReiserFS do if it gets an
IO error on the journal device? Could that bring down our whole system? I
assume that it would--in which case using NVRAM would actually save two
drive bays, since it should be reliable enough to not need redundancy.

So, how big is a ReiserFS journal when using data=journal anyways?...


* Re: External journals and NVRAM devices
  2002-11-01  7:29 JP Howard
@ 2002-11-01  7:49 ` Serge Kolodeznyh
  2002-11-01 15:33   ` Len Sorensen
  2002-11-01 15:10 ` Edward Shishkin
  1 sibling, 1 reply; 21+ messages in thread
From: Serge Kolodeznyh @ 2002-11-01  7:49 UTC (permalink / raw)
  To: reiserfs-list

Sirs, why is there no [reiserfs-list] tag in the subject of list messages?
I can't filter them into a separate mailbox, and the list has filled up my
main inbox. Please check the mailing list system's configuration.

Thanks.

---
Serge Kolodeznyh
Paradigma AG
Network /system administrator
SVK42-RIPN / SVK33-RIPE

----- Original Message -----
From: "JP Howard" <jh_lists@fastmail.fm>
To: "Andreas Dilger" <adilger@clusterfs.com>
Cc: "ReiserFS List" <reiserfs-list@namesys.com>
Sent: Friday, November 01, 2002 10:29 AM
Subject: Re: External journals and NVRAM devices


> On Thu, 31 Oct 2002 23:29:57 -0700, "Andreas Dilger"
> <adilger@clusterfs.com> said:




* Re: External journals and NVRAM devices
  2002-11-01  6:29 ` Andreas Dilger
@ 2002-11-01 14:30   ` Edward Shishkin
  0 siblings, 0 replies; 21+ messages in thread
From: Edward Shishkin @ 2002-11-01 14:30 UTC (permalink / raw)
  To: Jeremy Howard; +Cc: Andreas Dilger, ReiserFS List

Andreas Dilger wrote:
> 
> On Nov 01, 2002  16:38 +1100, Jeremy Howard wrote:
> > I'm looking at buying solid state drives / NVRAM drives for our servers
> > to hold an external ReiserFS journal.
> >
> > We are using 2.4.20pre11, and Chris Mason's data logging patches.
> >
> > I'm looking for any tips on how large the journal is when using
> > data=journal, and whether the external log patches are stable and work OK
> > in data=journal mode. 

Yes. And experience shows that external logging helps this journal mode
a bit more than the other ones.

> >Is there a command to show the current journal size? 

#debugreiserfs main_device

> > Does the size vary over time? We need to ensure we buy a card with
> > enough memory so this is important information for us.

The journal size stays the same unless you specify a different one with
the reiserfstune utility.

Thanks,
Edward.

> >
> > Is anyone currently using NVRAM for the journal? If so, how do you find
> > the performance of this configuration?
> 
> When people were testing this with ext3 external journals, they just
> used a RAMDISK for getting the performance measurements.  Obviously,
> (I hope ;-) this is not something you can do in real life, but for
> performance measurement it is OK.
> 
> Most people found that the ramdisk (and presumably the NVRAM device too)
> didn't perform much, if any, better than having a separate fast disk for
> the journal, because you are doing sequential I/O to the journal anyways.
> If it is on a separate disk/controller from the filesystem you don't have
> any seek or channel contention with the filesystem.  Of course, using a
> regular disk for the journal is MUCH cheaper than an NVRAM card, so you
> probably want to test this out before you go ahead and buy the NVRAM card.
> 
> NVRAM devices are great for disks you are doing a lot of random I/O
> on (maybe database indexes or something), because there is zero seek
> latency, but for sequential I/O (like the journal) it really isn't
> anything special.
> 
> Cheers, Andreas
> --
> Andreas Dilger
> http://www-mddsp.enel.ucalgary.ca/People/adilger/
> http://sourceforge.net/projects/ext2resize/


* Re: External journals and NVRAM devices
  2002-11-01  7:29 JP Howard
  2002-11-01  7:49 ` Serge Kolodeznyh
@ 2002-11-01 15:10 ` Edward Shishkin
  1 sibling, 0 replies; 21+ messages in thread
From: Edward Shishkin @ 2002-11-01 15:10 UTC (permalink / raw)
  To: JP Howard; +Cc: Andreas Dilger, ReiserFS List

JP Howard wrote:
> 
> On Thu, 31 Oct 2002 23:29:57 -0700, "Andreas Dilger"
> <adilger@clusterfs.com> said:
> > When people were testing this with ext3 external journals, they just
> > used a RAMDISK for getting the performance measurements.  Obviously,
> > (I hope ;-) this is not something you can do in real life, but for
> > performance measurement it is OK.
> >
> > Most people found that the ramdisk (and presumably the NVRAM device too)
> > didn't perform much, if any, better than having a separate fast disk for
> > the journal, because you are doing sequential I/O to the journal anyways.
> <...>
> 
> Yes, I'd heard something like this. Our servers aren't going to have a
> spare drive bay, I think, so a PCI NVRAM card may turn out to be a more
> economical solution (although I haven't received quotes back from the
> vendors yet...).
> 
> If I do find a spare drive bay, how unsafe would it be to use a single
> drive, rather than RAID 1 mirroring? What does ReiserFS do if it gets an
> IO error on the journal device? 

ReiserFS will want you to do the following:
#reiserfsck --no-journal-available main_device
or 
#reiserfsck --no-journal-available --rebuild-tree main_device
then specify a new journal device with reiserfstune.

>Could that bring down our whole system? I
> assume that it would--in which case using NVRAM would actually save two
> drive bays, since it should be reliable enough to not need redundancy.
> 
> So, how big is a ReiserFS journal when using data=journal anyways?...

8192 blocks for reiserfs with a standard journal.
For a non-standard journal, it is the size of the external journal device.

Edward.


* Re: External journals and NVRAM devices
  2002-11-01  7:49 ` Serge Kolodeznyh
@ 2002-11-01 15:33   ` Len Sorensen
  0 siblings, 0 replies; 21+ messages in thread
From: Len Sorensen @ 2002-11-01 15:33 UTC (permalink / raw)
  To: Serge Kolodeznyh; +Cc: reiserfs-list

On Fri, Nov 01, 2002 at 10:49:42AM +0300, Serge Kolodeznyh wrote:
> Sirs, why is there no [reiserfs-list] tag in the subject of list messages?
> I can't filter them into a separate mailbox, and the list has filled up my
> main inbox. Please check the mailing list system's configuration.

There is however a 'X-Mailing-List: reiserfs-list@namesys.com' header.
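
For anyone who wants to filter on that header programmatically, here is a
minimal sketch using Python's standard email module. The function name and
the sample message are made up for illustration; only the header name and
list address come from this thread.

```python
from email import message_from_string

LIST_HEADER = "X-Mailing-List"
LIST_ADDRESS = "reiserfs-list@namesys.com"


def is_list_mail(raw_message: str) -> bool:
    """Return True if the message carries the list's X-Mailing-List header."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; get() returns the fallback "" if absent.
    value = msg.get(LIST_HEADER, "")
    return LIST_ADDRESS in value.lower()


# Hypothetical sample message, for demonstration only.
sample = (
    "From: someone@example.org\n"
    "Subject: Re: External journals and NVRAM devices\n"
    "X-Mailing-List: reiserfs-list@namesys.com\n"
    "\n"
    "body text\n"
)

if __name__ == "__main__":
    print(is_list_mail(sample))
```

A mail client's own filter rules can usually match on the same header
without any code; the sketch just shows what such a rule checks.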

Len Sorensen


* Re: External journals and NVRAM devices
@ 2002-11-01 20:38 JP Howard
  2002-11-01 20:57 ` Valdis.Kletnieks
  2002-11-01 21:57 ` Edward Shishkin
  0 siblings, 2 replies; 21+ messages in thread
From: JP Howard @ 2002-11-01 20:38 UTC (permalink / raw)
  To: Edward Shishkin; +Cc: Andreas Dilger, ReiserFS List

On Fri, 01 Nov 2002 17:30:18 +0300, "Edward Shishkin"
<edward@namesys.com> said:
> Yes. And experience shows that external logging helps this journal mode
> a bit more than the other ones.
> 
> > >Is there a command to show the current journal size? 
> 
> #debugreiserfs main_device
> 
Hmmm...
----
Blocksize: 4096
<...>
Journal parameters:
<...>
	Size 8193 blocks (including 1 for journal header) (first block 18)
----

So my journal is only 32k? With Ext3 I used 192MB journal in data=journal
mode. Should I be using a size around that when using data=journal with
ReiserFS too? Why is the default so low?--is it because the data logging
patches don't automatically change it?
> 
> The journal size remains the same unless you specify another one by
> reiserfstune utility.  
> 
And non-standard journal sizes don't work under 2.4, right? Or are there
patches for this that are reasonably stable?


* Re: External journals and NVRAM devices
  2002-11-01  5:38 Jeremy Howard
  2002-11-01  6:29 ` Andreas Dilger
@ 2002-11-01 20:40 ` Hans Reiser
  2002-11-01 21:45   ` Edward Shishkin
  2002-11-01 20:41 ` Hans Reiser
  2 siblings, 1 reply; 21+ messages in thread
From: Hans Reiser @ 2002-11-01 20:40 UTC (permalink / raw)
  To: Jeremy Howard; +Cc: ReiserFS List, mose Jadon, Nikita Danilov

Jeremy Howard wrote:

> Hi all,
>
> I'm looking at buying solid state drives / NVRAM drives for our servers
> to hold an external ReiserFS journal.
>
> We are using 2.4.20pre11, and Chris Mason's data logging patches.
>
> I'm looking for any tips on how large the journal is when using
> data=journal, and whether the external log patches are stable and work OK
> in data=journal mode. Is there a command to show the current journal
> size? Does the size vary over time? We need to ensure we buy a card with
> enough memory so this is important information for us.
>
> Is anyone currently using NVRAM for the journal? If so, how do you find
> the performance of this configuration?
>
> TIA,
>   Jeremy
>
>
>
>
You might contact the umem folks.  They are working with us.

Edward, I want those benchmarks produced in a readable form.  Nikita, 
make sure that this happens, and that Edward produces a benchmark with 
an explanation of the significance of what was measured that is 
understandable to persons who aren't him.

-- 
Hans




* Re: External journals and NVRAM devices
  2002-11-01  5:38 Jeremy Howard
  2002-11-01  6:29 ` Andreas Dilger
  2002-11-01 20:40 ` Hans Reiser
@ 2002-11-01 20:41 ` Hans Reiser
  2002-11-02  3:04   ` Andrew Clausen
  2 siblings, 1 reply; 21+ messages in thread
From: Hans Reiser @ 2002-11-01 20:41 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jeremy Howard, ReiserFS List

Andreas Dilger wrote:

>On Nov 01, 2002  16:38 +1100, Jeremy Howard wrote:
>  
>
>>I'm looking at buying solid state drives / NVRAM drives for our servers
>>to hold an external ReiserFS journal.
>>
>>We are using 2.4.20pre11, and Chris Mason's data logging patches.
>>
>>I'm looking for any tips on how large the journal is when using
>>data=journal, and whether the external log patches are stable and work OK
>>in data=journal mode. Is there a command to show the current journal
>>size? Does the size vary over time? We need to ensure we buy a card with
>>enough memory so this is important information for us.
>>
>>Is anyone currently using NVRAM for the journal? If so, how do you find
>>the performance of this configuration?
>>    
>>
>
>When people were testing this with ext3 external journals, they just
>used a RAMDISK for getting the performance measurements.  Obviously,
>(I hope ;-) this is not something you can do in real life, but for
>performance measurement it is OK.
>
>Most people found that the ramdisk (and presumably the NVRAM device too)
>didn't perform much, if any, better than having a separate fast disk for
>the journal, because you are doing sequential I/O to the journal anyways.
>If it is on a separate disk/controller from the filesystem you don't have
>any seek or channel contention with the filesystem.  Of course, using a
>regular disk for the journal is MUCH cheaper than an NVRAM card, so you
>probably want to test this out before you go ahead and buy the NVRAM card.
>
>NVRAM devices are great for disks you are doing a lot of random I/O
>on (maybe database indexes or something), because there is zero seek
>latency, but for sequential I/O (like the journal) it really isn't
>anything special.
>
>Cheers, Andreas
>--
>Andreas Dilger
>http://www-mddsp.enel.ucalgary.ca/People/adilger/
>http://sourceforge.net/projects/ext2resize/
>
>
>
>  
>
NVRAM devices are for fsync intensive operations.

-- 
Hans




* Re: External journals and NVRAM devices
  2002-11-01 20:38 External journals and NVRAM devices JP Howard
@ 2002-11-01 20:57 ` Valdis.Kletnieks
  2002-11-01 21:57 ` Edward Shishkin
  1 sibling, 0 replies; 21+ messages in thread
From: Valdis.Kletnieks @ 2002-11-01 20:57 UTC (permalink / raw)
  To: JP Howard; +Cc: Edward Shishkin, Andreas Dilger, ReiserFS List


On Fri, 01 Nov 2002 20:38:05 GMT, JP Howard said:

> Blocksize: 4096
> <...>
> Journal parameters:
> <...>
> 	Size 8193 blocks (including 1 for journal header) (first block
> 	18)
> ----
> 
> So my journal is only 32k? With Ext3 I used 192MB journal in data=journal

That's 32M, not 32K.

Note that the required size of the journal file depends in part on the maximum
amount of data in the journal that isn't stored to permanent locations yet.
As a result, if the journal commit process uses a "small and often" scheme,
you'll need less journal than one that makes large commits on a rare basis.
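
The 32M figure follows directly from the debugreiserfs output quoted above:
8193 journal blocks, one of which is the journal header, at a 4096-byte
block size. A small sketch of the arithmetic (the function name is made up
for illustration):

```python
def journal_size_bytes(total_blocks, block_size, header_blocks=1):
    """Bytes used by journal data blocks, excluding the journal header block(s)."""
    return (total_blocks - header_blocks) * block_size


# Values taken from the debugreiserfs output quoted in this thread.
size = journal_size_bytes(total_blocks=8193, block_size=4096)

if __name__ == "__main__":
    print(size, "bytes =", size // (1024 * 1024), "MiB")
```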
-- 
				Valdis Kletnieks
				Computer Systems Senior Engineer
				Virginia Tech



* Re: External journals and NVRAM devices
@ 2002-11-01 21:37 JP Howard
  2002-11-04 13:16 ` Chris Mason
  2002-11-05 21:23 ` reiser
  0 siblings, 2 replies; 21+ messages in thread
From: JP Howard @ 2002-11-01 21:37 UTC (permalink / raw)
  To: Edward Shishkin, Andreas Dilger, ReiserFS List, Oleg Drokin

On Sat, 02 Nov 2002 00:58:40 +0300, "Edward Shishkin"
<edward@namesys.com> said:
> > > ----
> > > Blocksize: 4096
> > > <...>
> > > Journal parameters:
> > > <...>
> > >         Size 8193 blocks (including 1 for journal header) (first block
> > >         18)
> > > ----
> > >
> > > So my journal is only 32k?
> > 
> > Yes. Standard journal can be only 32K.
> 
> Sorry, 32M!
> 
Apologies to all for my lack of reading comprehension. Yes, of course
that shows 32M...

When the 2.4.20 journal-size patches are out I'll try and do some
benchmarking with larger journals on NVRAM devices. We use Cyrus IMAPd
which is very fsync intensive, so I expect an NVRAM journal may make a
big difference...


* Re: External journals and NVRAM devices
  2002-11-01 20:40 ` Hans Reiser
@ 2002-11-01 21:45   ` Edward Shishkin
  0 siblings, 0 replies; 21+ messages in thread
From: Edward Shishkin @ 2002-11-01 21:45 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Jeremy Howard, ReiserFS List, mose Jadon, Nikita Danilov

Hans Reiser wrote:
> 
> Jeremy Howard wrote:
> 
> > Hi all,
> >
> > I'm looking at buying solid state drives / NVRAM drives for our servers
> > to hold an external ReiserFS journal.
> >
> > We are using 2.4.20pre11, and Chris Mason's data logging patches.
> >
> > I'm looking for any tips on how large the journal is when using
> > data=journal, and whether the external log patches are stable and work OK
> > in data=journal mode. Is there a command to show the current journal
> > size? Does the size vary over time? We need to ensure we buy a card with
> > enough memory so this is important information for us.
> >
> > Is anyone currently using NVRAM for the journal? If so, how do you find
> > the performance of this configuration?
> >
> > TIA,
> >   Jeremy
> >
> >
> >
> >
> You might contact the umem folks.  They are working with us.
> 
> Edward, I want those benchmarks produced in a readable form.  Nikita,
> make sure that this happens, and that Edward produces a benchmark with
> an explanation of the significance of what was measured that is
> understandable to persons who aren't him.

OK, I'll try to put something together next week.
Edward.

> 
> --
> Hans


* Re: External journals and NVRAM devices
  2002-11-01 20:38 External journals and NVRAM devices JP Howard
  2002-11-01 20:57 ` Valdis.Kletnieks
@ 2002-11-01 21:57 ` Edward Shishkin
  2002-11-01 21:58   ` Edward Shishkin
  1 sibling, 1 reply; 21+ messages in thread
From: Edward Shishkin @ 2002-11-01 21:57 UTC (permalink / raw)
  To: JP Howard; +Cc: Andreas Dilger, ReiserFS List, green

JP Howard wrote:
> 
> On Fri, 01 Nov 2002 17:30:18 +0300, "Edward Shishkin"
> <edward@namesys.com> said:
> > Yes. And experience shows that external logging helps this journal mode
> > a bit more than the other ones.
> >
> > > >Is there a command to show the current journal size?
> >
> > #debugreiserfs main_device
> >
> Hmmm...
> ----
> Blocksize: 4096
> <...>
> Journal parameters:
> <...>
>         Size 8193 blocks (including 1 for journal header) (first block
>         18)
> ----
> 
> So my journal is only 32k?

Yes. Standard journal can be only 32K.

> With Ext3 I used 192MB journal in data=journal
> mode. Should I be using a size around that when using data=journal with
> ReiserFS too? Why is the default so low?--is it because the data logging
> patches don't automatically change it?

Reiserfs journal size doesn't depend on journal mode. 

> >
> > The journal size remains the same unless you specify another one by
> > reiserfstune utility.
> >
> And non-standard journal sizes don't work under 2.4, right? Or are there
> patches for this that are reasonably stable?

Yes, they are. I'll give the details next week, unless Oleg does it first.
Edward.


* Re: External journals and NVRAM devices
  2002-11-01 21:57 ` Edward Shishkin
@ 2002-11-01 21:58   ` Edward Shishkin
  0 siblings, 0 replies; 21+ messages in thread
From: Edward Shishkin @ 2002-11-01 21:58 UTC (permalink / raw)
  To: JP Howard, Andreas Dilger, ReiserFS List, green

Edward Shishkin wrote:
> 
> JP Howard wrote:
> >
> > On Fri, 01 Nov 2002 17:30:18 +0300, "Edward Shishkin"
> > <edward@namesys.com> said:
> > > Yes. And experience shows that external logging helps this journal mode
> > > a bit more than the other ones.
> > >
> > > > >Is there a command to show the current journal size?
> > >
> > > #debugreiserfs main_device
> > >
> > Hmmm...
> > ----
> > Blocksize: 4096
> > <...>
> > Journal parameters:
> > <...>
> >         Size 8193 blocks (including 1 for journal header) (first block
> >         18)
> > ----
> >
> > So my journal is only 32k?
> 
> Yes. Standard journal can be only 32K.

Sorry, 32M!

> 
>  With Ext3 I used 192MB journal in data=journal
> > mode. Should I be using a size around that when using data=journal with
> > ReiserFS too? Why is the default so low?--is it because the data logging
> > patches don't automatically change it?
> 
> Reiserfs journal size doesn't depend on journal mode.
> 
> > >
> > > The journal size remains the same unless you specify another one by
> > > reiserfstune utility.
> > >
> > And non-standard journal sizes don't work under 2.4, right? Or are there
> > patches for this that are reasonably stable?
> 
> Yes, they are. I'll give the details next week, unless Oleg does it first.
> Edward.


* Re: External journals and NVRAM devices
  2002-11-01 20:41 ` Hans Reiser
@ 2002-11-02  3:04   ` Andrew Clausen
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Clausen @ 2002-11-02  3:04 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Andreas Dilger, Jeremy Howard, ReiserFS List

On Fri, Nov 01, 2002 at 11:41:55PM +0300, Hans Reiser wrote:
> NVRAM devices are for fsync intensive operations.

And for low commit latency.

Cheers,
Andrew



* Re: External journals and NVRAM devices
  2002-11-01 21:37 JP Howard
@ 2002-11-04 13:16 ` Chris Mason
  2002-11-05 21:23 ` reiser
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Mason @ 2002-11-04 13:16 UTC (permalink / raw)
  To: JP Howard; +Cc: Edward Shishkin, Andreas Dilger, ReiserFS List, Oleg Drokin

On Fri, 2002-11-01 at 16:37, JP Howard wrote:
 
> Apologies to all for my lack of reading comprehension. Yes, of course
> that shows 32M...
> 
> When the 2.4.20 journal-size patches are out I'll try and do some
> benchmarking with larger journals on NVRAM devices. We use Cyrus IMAPd
> which is very fsync intensive, so I expect an NVRAM journal may make a
> big difference...
> 

The data logging patches include a ton of reiserfs writeback changes
that allow good performance with small transactions (like mail server
workloads) and small log sizes.

With ext3, a 128M or bigger log can really improve performance because
so much of the writeback is done through bdflush/kupdate.  With a bigger
log, it is much more likely things will already be written to the main
disk by the time you need to wrap around and reuse part of the log for a
new transaction.  So a big log, and tuning bdflush to trigger writeback
quickly can really help ext3 performance.

(note, this gives ext3 some memory pressure advantages)

With data logging reiserfs a mail server will rarely need more than a
32MB log.  A larger log can help when you've got a very full tree
(millions of files), since those transactions will be slightly larger,
or when you are doing fsyncs on big files.

Andrew Morton's synctest program is a pretty good tool for benchmarking
optimal transaction parameters.
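
The sketch below is not synctest. It is a much simpler stand-in that times
os.fsync() on small appends, which is roughly the pattern a mail server
produces. The function name, the iteration count and the 4096-byte payload
are arbitrary assumptions; point the directory argument at the filesystem
you actually want to measure.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, iterations=20, payload=b"x" * 4096):
    """Append small writes to a temp file, fsync after each one, and
    return the per-call fsync latencies in seconds."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(iterations):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # ask the kernel to push this file's data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # The system temp dir is only a placeholder for the filesystem under test.
    results = measure_fsync_latency(tempfile.gettempdir())
    mean_ms = sum(results) / len(results) * 1e3
    print(f"mean fsync latency: {mean_ms:.3f} ms over {len(results)} calls")
```

Comparing the mean latency with the journal on different devices gives a
first impression only; a real benchmark should also vary file sizes and
run several writers concurrently.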

-chris




* Re: External journals and NVRAM devices
  2002-11-01 21:37 JP Howard
  2002-11-04 13:16 ` Chris Mason
@ 2002-11-05 21:23 ` reiser
  2002-11-06 20:18   ` Andreas Dilger
  1 sibling, 1 reply; 21+ messages in thread
From: reiser @ 2002-11-05 21:23 UTC (permalink / raw)
  To: Chris Mason
  Cc: JP Howard, Edward Shishkin, Andreas Dilger, ReiserFS List,
	Oleg Drokin

Chris Mason wrote:

>On Fri, 2002-11-01 at 16:37, JP Howard wrote:
> 
>  
>
>>Apologies to all for my lack of reading comprehension. Yes, of course
>>that shows 32M...
>>
>>When the 2.4.20 journal-size patches are out I'll try and do some
>>benchmarking with larger journals on NVRAM devices. We use Cyrus IMAPd
>>which is very fsync intensive, so I expect an NVRAM journal may make a
>>big difference...
>>
>>    
>>
>
>The data logging patches include a ton of reiserfs writeback changes
>that allow good performance with small transactions (like mail server
>workloads) and small log sizes.
>
>With ext3, a 128M or bigger log can really improve performance because
>so much of the writeback is done through bdflush/kupdate. 
>
Please explain the because clause of the sentence above in more detail.

> With a bigger
>log, it is much more likely things will already be written to the main
>disk by the time you need to wrap around and reuse part of the log for a
>new transaction.  So a big log, and tuning bdflush to trigger writeback
>quickly can really help ext3 performance.
>
>(note, this gives ext3 some memory pressure advantages)
>
>With data logging reiserfs a mail server will rarely need more than a
>32MB log.  A larger log can help when you've got a very full tree
>(millions of files), since those transactions will be slightly larger,
>or when you are doing fsyncs on big files.
>
>Andrew Morton's synctest program is a pretty good tool for benchmarking
>optimal transaction parameters.
>
>-chris
>
>
>
>
>  
>




* Re: External journals and NVRAM devices
@ 2002-11-05 22:12 JP Howard
  0 siblings, 0 replies; 21+ messages in thread
From: JP Howard @ 2002-11-05 22:12 UTC (permalink / raw)
  To: Hans Reiser, Chris Mason
  Cc: Edward Shishkin, Andreas Dilger, ReiserFS List, Oleg Drokin

On Tue, 05 Nov 2002 13:23:30 -0800, "reiser" <reiser@namesys.com> said:
> Chris Mason wrote:
> >The data logging patches include a ton of reiserfs writeback changes
> >that allow good performance with small transactions (like mail server
> >workloads) and small log sizes.
> >
> >With ext3, a 128M or bigger log can really improve performance because
> >so much of the writeback is done through bdflush/kupdate. 
> >
> Please explain the because clause of the sentence above in more detail.
> 
I've just ordered some 1GB Umem NVRAM cards (they're battery-backed RAM
cards that look to Linux like a standard block device). I'd like to use
them to improve performance as much as possible.

Our application is fsync intensive (Cyrus IMAPd server). We have heaps of
RAM, NVRAM, and CPU to spare, but are IO bound. Is there any way to tune
ReiserFS data=journal to improve IO performance by taking advantage of
our spare RAM/NVRAM?

TIA for any info,
  Jeremy


* Re: External journals and NVRAM devices
  2002-11-05 21:23 ` reiser
@ 2002-11-06 20:18   ` Andreas Dilger
  2002-11-06 20:42     ` Chris Mason
  0 siblings, 1 reply; 21+ messages in thread
From: Andreas Dilger @ 2002-11-06 20:18 UTC (permalink / raw)
  To: reiser; +Cc: Chris Mason, JP Howard, Edward Shishkin, ReiserFS List,
	Oleg Drokin

On Nov 05, 2002  13:23 -0800, reiser wrote:
> Chris Mason wrote:
> >With ext3, a 128M or bigger log can really improve performance because
> >so much of the writeback is done through bdflush/kupdate. 
>
> Please explain the because clause of the sentence above in more detail.

Nobody has answered this yet AFAIK, so I will.

The reason that having a large log can help performance is because
having bdflush drive the dirty buffer writeout allows for more chances
of write merging by the elevator and such, and also avoids stalls in
user-space code as it waits for a full journal to commit transactions.

There is a fine line here (for ext3 at least), because if you have a
large journal but it fills up before the transactions have been flushed
to the filesystem, then user apps stall while the journal is flushed
(can be several seconds).
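
The stall condition can be illustrated with a toy simulation. This is a
sketch under stated assumptions, not a model of the real ext3 or bdflush
code: time advances in abstract steps, the log receives a fixed number of
blocks per step, and a block can be reclaimed only once it has aged
writeback_delay steps. All names and numbers are hypothetical.

```python
from collections import deque


def simulate_journal(capacity_blocks, log_rate, writeback_rate,
                     writeback_delay, steps):
    """Return the time steps at which the journal could not accept a
    full step's worth of new blocks (i.e. the writer would stall)."""
    pending = deque()  # arrival step of every block still held in the journal
    stalls = []
    for t in range(steps):
        # Background writeback: reclaim the oldest blocks that are old enough.
        reclaimed = 0
        while (pending and reclaimed < writeback_rate
               and pending[0] <= t - writeback_delay):
            pending.popleft()
            reclaimed += 1
        # New transactions: log as many blocks as fit; a shortfall is a stall.
        free = capacity_blocks - len(pending)
        if free < log_rate:
            stalls.append(t)
        pending.extend([t] * min(free, log_rate))
    return stalls


if __name__ == "__main__":
    print(len(simulate_journal(8192, 100, 200, 30, 500)), "stalls with a large log")
    print(len(simulate_journal(2000, 100, 200, 30, 500)), "stalls with a small log")
    print(len(simulate_journal(2000, 100, 200, 10, 500)), "stalls with faster writeback")
```

In this toy model a log that is too small for the writeback delay stalls,
and either a larger log or a shorter delay before writeback removes the
stalls.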

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/



* Re: External journals and NVRAM devices
  2002-11-06 20:18   ` Andreas Dilger
@ 2002-11-06 20:42     ` Chris Mason
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Mason @ 2002-11-06 20:42 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: reiser, JP Howard, Edward Shishkin, ReiserFS List, Oleg Drokin

On Wed, 2002-11-06 at 15:18, Andreas Dilger wrote:
> On Nov 05, 2002  13:23 -0800, reiser wrote:
> > Chris Mason wrote:
> > >With ext3, a 128M or bigger log can really improve performance because
> > >so much of the writeback is done through bdflush/kupdate. 
> >
> > Please explain the because clause of the sentence above in more detail.
> 
> Nobody has answered this yet AFAIK, so I will.
> 
> The reason that having a large log can help performance is because
> having bdflush drive the dirty buffer writeout allows for more chances
> of write merging by the elevator and such, and also avoids stalls in
> user-space code as it waits for a full journal to commit transactions.
> 
> There is a fine line here (for ext3 at least), because if you have a
> large journal but it fills up before the transactions have been flushed
> to the filesystem, then user apps stall while the journal is flushed
> (can be several seconds).

Sorry for the delay.  This is why tuning bdflush with a large ext3 log
to trigger writeback quickly can help.  It lowers the chance userspace
will have to wait for the log to be flushed by decreasing the time dirty
buffers are allowed to hang around.  (andreas knows this better than I
do, just trying to explain my last message ;-)

The major difference with reiserfs (patched or not) is the log is
flushed per transaction instead of trying to reclaim the whole thing.

In the stock kernels, this really hurts reiserfs with small
transactions, because it only flushes one transaction at a time.  This
means I write 3 or 4 blocks, wait, then write 3 or 4 more, wait, etc.

The data logging patches have code to send more than one transaction at
once, so I reclaim the log in chunks of about 200 blocks.  The end
result is the log wrapping around is a less expensive operation with the
patches applied, and you usually won't need as large a log to make
data=journal work well.  
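
A back-of-the-envelope sketch of why the chunking matters, assuming each
flush costs one write-then-wait round no matter how many blocks it covers
(a simplification). The function name is made up; the 8192-block log and
the 4-block and 200-block figures are the ones mentioned in this thread.

```python
import math


def flush_rounds(total_blocks, blocks_per_round):
    """Number of write-then-wait rounds needed to reclaim total_blocks of log."""
    return math.ceil(total_blocks / blocks_per_round)


LOG_BLOCKS = 8192  # standard journal size, in blocks

# One small transaction (about 4 blocks) per round, as in the stock kernels.
per_transaction = flush_rounds(LOG_BLOCKS, 4)
# Chunks of about 200 blocks per round, as with the data logging patches.
batched = flush_rounds(LOG_BLOCKS, 200)

if __name__ == "__main__":
    print(per_transaction, "rounds one transaction at a time")
    print(batched, "rounds in chunks of 200 blocks")
```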

The downside to my current code is that reiserfs can pin more ram (up to
the size of the log) than ext3, and for a longer period of time.

If you're going to disk, large logs are easy to come by.  nvram is
different though, so it matters more there.

-chris


