All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: lvm-devel@redhat.com
Subject: [RFC PATCH v2] change default alignment of pe_start to 1MB
Date: Fri, 6 Aug 2010 00:11:39 -0400	[thread overview]
Message-ID: <20100806041138.GA29995@redhat.com> (raw)
In-Reply-To: <20100805191001.GA27237@redhat.com>

The switch to a 1MB default alignment causes various tests in the LVM2
testsuite to fail -- not a big deal but the tests would need updating.

Of more concern is that the existing LVM2 set_pe_align() code doesn't
always properly respect the alignment determined from
'devices/md_chunk_alignment' or 'devices/data_alignment_detection'.

With the previous default alignment of 64k it would generally do the
right thing -- use the detected values.  But switching the default to
the larger value exposes the fact that MAX() of the MD or I/O Topology
detected values will generally always be 1MB -- when they are compared
to 1MB.

The following revised patch changes the LVM alignment detection
semantics to model what fdisk has elected to do:
- If the default value (1MB) is a multiple of the specified/detected
  alignment then just use the default.
- Otherwise, use the specified/detected value.

In practice this means we'll almost always use 1MB -- that is unless:
- the specified --dataalignment, MD's full stripe width, or the
  optimal_io_size exceeds 1MB
- the specified/detected value is not a power-of-2

NOTE: even with a default of 64k the old set_pe_align code would result
in incorrect alignment if a value < 64k were used for --dataalignment

---
 doc/example.conf.in     |    2 +-
 lib/metadata/metadata.c |   30 ++++++++++++++++++------------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/doc/example.conf.in b/doc/example.conf.in
index 850b7e2..f7dcc63 100644
--- a/doc/example.conf.in
+++ b/doc/example.conf.in
@@ -113,7 +113,7 @@ devices {
     # Alignment (in KB) of start of data area when creating a new PV.
     # If a PV is placed directly upon an md device and md_chunk_alignment or
     # data_alignment_detection is enabled this parameter is ignored.
-    # Set to 0 for the default alignment of 64KB or page size, if larger.
+    # Set to 0 for the default alignment of 1MB or page size, if larger.
     data_alignment = 0
 
     # By default, the start of the PV's aligned data area will be shifted by
diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
index d7edf54..469473a 100644
--- a/lib/metadata/metadata.c
+++ b/lib/metadata/metadata.c
@@ -64,13 +64,16 @@ const char _really_init[] =
 
 unsigned long set_pe_align(struct physical_volume *pv, unsigned long data_alignment)
 {
+	unsigned long temp_pe_align, default_pe_align = 2048;
+
 	if (pv->pe_align)
 		goto out;
 
 	if (data_alignment)
 		pv->pe_align = data_alignment;
 	else
-		pv->pe_align = MAX(65536UL, lvm_getpagesize()) >> SECTOR_SHIFT;
+		pv->pe_align = MAX((default_pe_align << SECTOR_SHIFT),
+				   lvm_getpagesize()) >> SECTOR_SHIFT;
 
 	if (!pv->dev)
 		goto out;
@@ -79,10 +82,11 @@ unsigned long set_pe_align(struct physical_volume *pv, unsigned long data_alignm
 	 * Align to stripe-width of underlying md device if present
 	 */
 	if (find_config_tree_bool(pv->fmt->cmd, "devices/md_chunk_alignment",
-				  DEFAULT_MD_CHUNK_ALIGNMENT))
-		pv->pe_align = MAX(pv->pe_align,
-				   dev_md_stripe_width(pv->fmt->cmd->sysfs_dir,
-						       pv->dev));
+				  DEFAULT_MD_CHUNK_ALIGNMENT)) {
+		temp_pe_align = dev_md_stripe_width(pv->fmt->cmd->sysfs_dir, pv->dev);
+		if (temp_pe_align && (default_pe_align % temp_pe_align))
+			pv->pe_align = temp_pe_align;
+	}
 
 	/*
 	 * Align to topology's minimum_io_size or optimal_io_size if present
@@ -94,13 +98,15 @@ unsigned long set_pe_align(struct physical_volume *pv, unsigned long data_alignm
 	if (find_config_tree_bool(pv->fmt->cmd,
 				  "devices/data_alignment_detection",
 				  DEFAULT_DATA_ALIGNMENT_DETECTION)) {
-		pv->pe_align = MAX(pv->pe_align,
-				   dev_minimum_io_size(pv->fmt->cmd->sysfs_dir,
-						       pv->dev));
-
-		pv->pe_align = MAX(pv->pe_align,
-				   dev_optimal_io_size(pv->fmt->cmd->sysfs_dir,
-						       pv->dev));
+		temp_pe_align = dev_minimum_io_size(pv->fmt->cmd->sysfs_dir,
+						    pv->dev);
+		if (temp_pe_align && (default_pe_align % temp_pe_align))
+			pv->pe_align = temp_pe_align;
+				   
+		temp_pe_align = dev_optimal_io_size(pv->fmt->cmd->sysfs_dir,
+						    pv->dev);
+		if (temp_pe_align && (default_pe_align % temp_pe_align))
+			pv->pe_align = temp_pe_align;
 	}
 
 	log_very_verbose("%s: Setting PE alignment to %lu sectors.",



  reply	other threads:[~2010-08-06  4:11 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-05 19:10 [RFC PATCH] change default alignment of pe_start to 1MB Mike Snitzer
2010-08-06  4:11 ` Mike Snitzer [this message]
2010-08-09 16:28   ` [RFC PATCH v2] " Milan Broz
2010-08-09 16:42     ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100806041138.GA29995@redhat.com \
    --to=snitzer@redhat.com \
    --cc=lvm-devel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.