linux-lvm.redhat.com archive mirror
* [linux-lvm] More information on my LV with bad read performance..
@ 2001-10-26  0:02 Robert Macaulay
  2001-10-26  2:06 ` Andreas Dilger
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Macaulay @ 2001-10-26  0:02 UTC (permalink / raw)
  To: linux-lvm

I realized I didn't include a lvdisplay -v of my volume. Here it is.
The disks are spread out over 4 scsi busses. 
Thanks again.

--- Logical volume ---
LV Name                /dev/vgOracle/foo
VG Name                vgOracle
LV Write Access        read/write
LV Status              available
LV #                   52
# open                 0
LV Size                9.04 GB
Current LE             2314
Allocated LE           2314
Stripes                26
Stripe size (KByte)    64
Allocation             next free
Read ahead sectors     120
Block device           58:51

   --- Distribution of logical volume on 26 physical volumes  ---
   PV Name                  PE on PV     reads      writes
   /dev/sdh1                89           13629      173625
   /dev/sdi1                89           13616      173386
   /dev/sdj1                89           13630      173372
   /dev/sdl1                89           13619      173354
   /dev/sdm1                89           13625      173369
   /dev/sdn1                89           13619      173384
   /dev/sdo1                89           13635      173391
   /dev/sdp1                89           13632      173387
   /dev/sdq1                89           13641      173401
   /dev/sdr1                89           13633      173386
   /dev/sds1                89           13639      173398
   /dev/sdt1                89           13633      173388
   /dev/sdu1                89           13625      173367
   /dev/sdv1                89           13617      173357
   /dev/sdw1                89           13625      173367
   /dev/sdx1                89           13617      173358
   /dev/sdy1                89           13624      173366
   /dev/sdz1                89           13618      173354
   /dev/sdaa1               89           13606      173366
   /dev/sdab1               89           13600      173388
   /dev/sdac1               89           13609      173366
   /dev/sdad1               89           13603      173356
   /dev/sdae1               89           13609      173364
   /dev/sdaf1               89           13600      173361
   /dev/sdag1               89           13607      173366
   /dev/sdah1               89           13602      173354

   --- logical volume i/o statistic ---
   354113 reads  4507931 writes


--cut--

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [linux-lvm] More information on my LV with bad read performance..
  2001-10-26  0:02 [linux-lvm] More information on my LV with bad read performance Robert Macaulay
@ 2001-10-26  2:06 ` Andreas Dilger
  2001-10-26  3:13   ` Heinz J . Mauelshagen
  2001-10-26  8:26   ` Robert Macaulay
  0 siblings, 2 replies; 6+ messages in thread
From: Andreas Dilger @ 2001-10-26  2:06 UTC (permalink / raw)
  To: linux-lvm

On Oct 26, 2001  00:03 -0500, Robert Macaulay wrote:
> I realized I didn't include a lvdisplay -v of my volume. Here it is.
> The disks are spread out over 4 scsi busses. 
> 
> --- Logical volume ---
> LV Name                /dev/vgOracle/foo
> VG Name                vgOracle
> LV Write Access        read/write
> LV Status              available
> LV #                   52
> # open                 0
> LV Size                9.04 GB
> Current LE             2314
> Allocated LE           2314
> Stripes                26
> Stripe size (KByte)    64
> Allocation             next free
> Read ahead sectors     120
> Block device           58:51

Well, there was a patch in 2.4.13 to the LVM code to change the readahead
code.  First off, it makes the default readahead 1024 sectors (512 kB),
which may be the maximum SCSI request size (I don't know the details
exactly).  It also sets a global read_ahead array, which may have an
impact as well.  As you can see above, your "read ahead" is smaller than
a single stripe, so it isn't really doing you much good.
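
To put numbers on that mismatch (this is my own back-of-the-envelope
arithmetic from the lvdisplay figures above, not anything the tool prints):

```python
SECTOR_BYTES = 512          # one disk sector
read_ahead_sectors = 120    # "Read ahead sectors" from lvdisplay above
stripe_kb = 64              # "Stripe size (KByte)"
stripes = 26                # "Stripes"

# The readahead window in kB, versus one stripe chunk and one full stripe.
read_ahead_kb = read_ahead_sectors * SECTOR_BYTES // 1024   # 60 kB
full_stripe_kb = stripes * stripe_kb                        # 1664 kB

# 60 kB of readahead doesn't even cover a single 64 kB stripe chunk,
# let alone the 1664 kB a sequential read would need to touch all 26
# disks at once.
print(read_ahead_kb, full_stripe_kb)
```

So a sequential reader effectively waits on one disk at a time instead of
streaming from all 26.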

However, it is also possible that striping across 26 disks is kind of
pointless, especially for Oracle.  You are far better off to do some
intelligent allocation of the disks depending on known usage patterns
(e.g. put tables and their indexes on separate disks, put rollback
files on separate disks, put heavily used tables on their own disks,
put temporary tablespaces on their own disks).

With LVM, you can easily monitor which PVs/PEs are busiest, and even out
the I/O load by moving LVs/PEs with pvmove (although you CANNOT do this
while the database is active).

Make sure you keep backups of your LVM metadata (both vgcfgbackup, and
also save the text output of "pvdata -avP" and "lvdisplay -v").

Cheers, Andreas
--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert


* Re: [linux-lvm] More information on my LV with bad read performance..
  2001-10-26  2:06 ` Andreas Dilger
@ 2001-10-26  3:13   ` Heinz J . Mauelshagen
  2001-10-26  8:26   ` Robert Macaulay
  1 sibling, 0 replies; 6+ messages in thread
From: Heinz J . Mauelshagen @ 2001-10-26  3:13 UTC (permalink / raw)
  To: linux-lvm

On Fri, Oct 26, 2001 at 01:06:56AM -0600, Andreas Dilger wrote:
> On Oct 26, 2001  00:03 -0500, Robert Macaulay wrote:
> > I realized I didn't include a lvdisplay -v of my volume. Here it is.
> > The disks are spread out over 4 scsi busses. 
> > 
> > --- Logical volume ---
> > LV Name                /dev/vgOracle/foo
> > VG Name                vgOracle
> > LV Write Access        read/write
> > LV Status              available
> > LV #                   52
> > # open                 0
> > LV Size                9.04 GB
> > Current LE             2314
> > Allocated LE           2314
> > Stripes                26
> > Stripe size (KByte)    64
> > Allocation             next free
> > Read ahead sectors     120
> > Block device           58:51
> 
> Well, there was a patch in 2.4.13 to the LVM code to change the readahead
> code.

Andreas,
which patch are you referring to?
I still see the per-major read_ahead code in 2.4.13, which is only
partially useful in the best case.

Heinz

> First off, it makes the default readahead 1024 sectors (512 kB),
> which may be the maximum SCSI request size (I don't know the details
> exactly).  It also sets a global read_ahead array, which may have an
> impact as well.  As you can see above, your "read ahead" is smaller than
> a single stripe, so it isn't really doing you much good.
> 
> However, it is also possible that striping across 26 disks is kind of
> pointless, especially for Oracle.  You are far better off to do some
> intelligent allocation of the disks depending on known usage patterns
> (e.g. put tables and their indexes on separate disks, put rollback
> files on separate disks, put heavily used tables on their own disks,
> put temporary tablespaces on their own disks).
> 
> With LVM, you can easily monitor which PVs/PEs are busiest, and even out
> the I/O load by moving LVs/PEs with pvmove (although you CANNOT do this
> while the database is active).
> 
> Make sure you keep backups of your LVM metadata (both vgcfgbackup, and
> also save the text output of "pvdata -avP" and "lvdisplay -v").
> 
> Cheers, Andreas
> --
> Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
>                  \  would they cancel out, leaving him still hungry?"
> http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert
> 
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@sistina.com
> http://lists.sistina.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://www.sistina.com/lvm/Pages/howto.html

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@Sistina.com                           +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


* Re: [linux-lvm] More information on my LV with bad read performance..
  2001-10-26  2:06 ` Andreas Dilger
  2001-10-26  3:13   ` Heinz J . Mauelshagen
@ 2001-10-26  8:26   ` Robert Macaulay
  2001-10-26  8:38     ` Robert Macaulay
  1 sibling, 1 reply; 6+ messages in thread
From: Robert Macaulay @ 2001-10-26  8:26 UTC (permalink / raw)
  To: linux-lvm

On Fri, 26 Oct 2001, Andreas Dilger wrote:

> However, it is also possible that striping across 26 disks is kind of
> pointless, especially for Oracle.  You are far better off to do some
> intelligent allocation of the disks depending on known usage patterns
> (e.g. put tables and their indexes on separate disks, put rollback
> files on separate disks, put heavily used tables on their own disks,
> put temporary tablespaces on their own disks).

True. I have done that. This is a "let's see if it goes really fast with a
lot of disks" test. We have the disks divided up in stripe sets no bigger
than 8 typically, all separated out by function. I was just playing around
with a massive stripe, and ran into this oddity.


* Re: [linux-lvm] More information on my LV with bad read performance..
  2001-10-26  8:26   ` Robert Macaulay
@ 2001-10-26  8:38     ` Robert Macaulay
  2001-10-26 12:28       ` Andreas Dilger
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Macaulay @ 2001-10-26  8:38 UTC (permalink / raw)
  To: linux-lvm

On Fri, 26 Oct 2001, Macaulay, Robert wrote:

> 
> True. I have done that. This is a "let's see if it goes really fast with a
> lot of disks" test. We have the disks divided up in stripe sets no bigger
> than 8 typically, all separated out by function. I was just playing around
> with a massive stripe, and ran into this oddity.

I made two 12-way stripes with mdtools, then layered another raid0 on top
of that, just to see if it would make any difference. The md setup got
about the same write performance as LVM, but its reads ran about as fast
as its writes, i.e. much higher read throughput than LVM.

Where is the patch you referred to for increasing the read ahead? Part of our
Oracle testing (with volumes separated by use) involves sequential table
scans, which sound like they could benefit greatly from this patch. Thanks.

Robert


* Re: [linux-lvm] More information on my LV with bad read performance..
  2001-10-26  8:38     ` Robert Macaulay
@ 2001-10-26 12:28       ` Andreas Dilger
  0 siblings, 0 replies; 6+ messages in thread
From: Andreas Dilger @ 2001-10-26 12:28 UTC (permalink / raw)
  To: linux-lvm

On Oct 26, 2001  08:39 -0500, Robert Macaulay wrote:
> Where is the patch you referred to for increasing the read ahead? Part of our
> Oracle testing (with volumes separated by use) involves sequential table
> scans, which sound like they could benefit greatly from this patch. Thanks.

I'm not sure exactly when the changes went in, but when I updated to
2.4.13 and went to update the LVM code as well, this is what I see (note
that the patch whitespace may be broken because of cut-and-paste):

There was also a small discussion about read ahead on the kernel mailing
list, so this may be a result of that.  Something along the lines of "all
readahead is broken because ..."

Cheers, Andreas
=============================================================================
--- kernel/lvm.c	2001/10/15 09:23:27
+++ kernel/lvm.c	2001/10/26 17:21:47
@@ -270,9 +270,13 @@
 
 #include "lvm-internal.h"
 
-#define	LVM_CORRECT_READ_AHEAD( a) \
-   if      ( a < LVM_MIN_READ_AHEAD || \
-             a > LVM_MAX_READ_AHEAD) a = LVM_MAX_READ_AHEAD;
+#define	LVM_CORRECT_READ_AHEAD(a)               \
+do {                                           \
+	if ((a) < LVM_MIN_READ_AHEAD ||         \
+	    (a) > LVM_MAX_READ_AHEAD)           \
+		(a) = LVM_DEFAULT_READ_AHEAD;   \
+	read_ahead[MAJOR_NR] = (a);             \
+} while(0)
 
 #ifndef WRITEA
 #  define WRITEA WRITE
@@ -1040,6 +1045,7 @@
 		    (long) arg > LVM_MAX_READ_AHEAD)
 			return -EINVAL;
 		lv_ptr->lv_read_ahead = (long) arg;
+		read_ahead[MAJOR_NR] = lv_ptr->lv_read_ahead;
 		break;
 
--- kernel/lvm.h        2001/10/03 14:46:47     1.34
+++ kernel/lvm.h        2001/10/26 17:24:16
@@ -274,8 +274,9 @@
 #define	LVM_MAX_STRIPES		128	/* max # of stripes */
 #define	LVM_MAX_SIZE		( 1024LU * 1024 / SECTOR_SIZE * 1024 * 1024)	/* 1TB[sectors] */
 #define	LVM_MAX_MIRRORS		2	/* future use */
-#define	LVM_MIN_READ_AHEAD	2	/* minimum read ahead sectors */
-#define	LVM_MAX_READ_AHEAD	120	/* maximum read ahead sectors */
+#define	LVM_MIN_READ_AHEAD	0	/* minimum read ahead sectors */
+#define	LVM_DEFAULT_READ_AHEAD	1024	/* sectors for 512k scsi segments */
+#define	LVM_MAX_READ_AHEAD	10000	/* maximum read ahead sectors */
 #define	LVM_MAX_LV_IO_TIMEOUT	60	/* seconds I/O timeout (future use) */
 #define	LVM_PARTITION		0xfe	/* LVM partition id */
 #define	LVM_NEW_PARTITION	0x8e	/* new LVM partition id (10/09/1999) */
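
In effect, the macro change does the following (a toy re-expression in
Python, not kernel code; the per-major read_ahead[] table is stood in by
a dict, and major 58 is just the LVM block major from the lvdisplay
output above):

```python
# Old macro: any value outside [2, 120] snaps to the old MAX of 120 sectors.
def correct_read_ahead_old(a, lo=2, hi=120):
    return hi if (a < lo or a > hi) else a

# Stand-in for the kernel's global per-major read_ahead[] array.
read_ahead = {}

# Patched macro: any value outside [0, 10000] snaps to the new DEFAULT of
# 1024 sectors, and the result is also propagated to read_ahead[major].
def correct_read_ahead_new(a, major=58, lo=0, hi=10000, default=1024):
    if a < lo or a > hi:
        a = default
    read_ahead[major] = a
    return a
```

For example, an out-of-range request that the old code would have clamped
to 120 sectors now becomes 1024 sectors and is recorded for the device's
major number as well.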

 


--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert

