From: Mike Snitzer <snitzer@redhat.com>
To: device-mapper development <dm-devel@redhat.com>
Cc: sandeen@redhat.com, Daniel Browning <db@kavod.com>,
ejt@redhat.com, Jeff Moyer <jmoyer@redhat.com>,
Zdenek Kabelac <zkabelac@redhat.com>,
"Alasdair G. Kergon" <agk@redhat.com>
Subject: [PATCH] dm thin: fix queue limits stacking when data device has compulsory merge_bvec_fn
Date: Wed, 23 Jan 2013 17:16:53 -0500 [thread overview]
Message-ID: <20130123221653.GA1545@redhat.com> (raw)
In-Reply-To: <20130122135132.GA24851@redhat.com>
When a thin-pool uses an MD device for the data device a thin device
from the thin-pool must respect MD's constraints about disallowing a bio
to span multiple chunks. Otherwise we can see problems. If the raid0
chunksize is 1152K and thin-pool chunksize is 256K I see the following
md/radi0 error (with extra debug tracing added to thin_endio) when
mkfs.xfs is executed against the thin device:
md/raid0:md99: make_request bug: can't convert block across chunks or bigger than 1152k 6688 127
device-mapper: thin: bio sector=2080 err=-5 bi_size=130560 bi_rw=17 bi_vcnt=32 bi_idx=0
This extra DM debugging shows that the failing bio is spanning across
the first and second logical 1152K chunk (sector 2080 + 255 takes the
bio beyond the first chunk's boundary of sector 2304). So the bio
splitting that DM is doing clearly isn't respecting the MD limits.
max_hw_sectors_kb is 127 for both the thin-pool and thin device
(queue_max_hw_sectors returns 255 so we'll excuse sysfs's lack of
precision). So this explains why bi_size is 130560.
But the thin device's max_hw_sectors_kb should be 4 (PAGE_SIZE) given
that it doesn't have a .merge function (for bio_add_page to consult
indirectly via dm_merge_bvec) yet the thin-pool does sit above an MD
device that has a compulsory merge_bvec_fn. This scenario is exactly
why DM must resort to sending single PAGE_SIZE bios to the underlying
layer -- some additional context for this is available in the header for
commit 8cbeb67a.
Look story short, the reason a thin device doesn't properly get
configured to have a max_hw_sectors_kb of 4 (PAGE_SIZE) is that
thin_io_hints() is blindly copying the queue limits from the thin-pool
device directly to the thin device's queue limits.
Fix this by eliminating thin_io_hints. Doing so is safe because the
block layer's queue limits stacking already enables the upper level thin
device to inherit the thin-pool device's discard and minimum_io_size and
optimal_io_size limits that get set in pool_io_hints. But avoiding the
queue limits copy allows the thin and thin-pool limits to be different
where it is important, namely max_hw_sectors_kb.
Reported-by: Daniel Browning <db@kavod.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
---
drivers/md/dm-thin.c | 11 -----------
1 files changed, 0 insertions(+), 11 deletions(-)
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 074e570..9bd59ae 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2819,16 +2819,6 @@ static int thin_iterate_devices(struct dm_target *ti,
return 0;
}
-/*
- * A thin device always inherits its queue limits from its pool.
- */
-static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
-{
- struct thin_c *tc = ti->private;
-
- *limits = bdev_get_queue(tc->pool_dev->bdev)->limits;
-}
-
static struct target_type thin_target = {
.name = "thin",
.version = {1, 6, 0},
@@ -2840,7 +2830,6 @@ static struct target_type thin_target = {
.postsuspend = thin_postsuspend,
.status = thin_status,
.iterate_devices = thin_iterate_devices,
- .io_hints = thin_io_hints,
};
/*----------------------------------------------------------------*/
prev parent reply other threads:[~2013-01-23 22:16 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-18 10:19 A thin-p over 256 GiB fails with I/O errors with non-power-of-two chunk Daniel Browning
2013-01-21 18:49 ` Mike Snitzer
2013-01-22 11:10 ` Zdenek Kabelac
2013-01-22 13:51 ` Mike Snitzer
2013-01-23 22:16 ` Mike Snitzer [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130123221653.GA1545@redhat.com \
--to=snitzer@redhat.com \
--cc=agk@redhat.com \
--cc=db@kavod.com \
--cc=dm-devel@redhat.com \
--cc=ejt@redhat.com \
--cc=jmoyer@redhat.com \
--cc=sandeen@redhat.com \
--cc=zkabelac@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.