Linux Device Mapper development
From: Mike Snitzer <snitzer@redhat.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: device-mapper development <dm-devel@redhat.com>
Subject: Re: mirrored device with thousand of mappingtableentries
Date: Mon, 7 Mar 2011 15:10:27 -0500
Message-ID: <20110307201027.GB31194@redhat.com>
In-Reply-To: <yq1y64reiu8.fsf@sermon.lab.mkp.net>

On Sun, Mar 06 2011 at  9:59pm -0500,
Martin K. Petersen <martin.petersen@oracle.com> wrote:

> >>>>> "Zdenek" == Zdenek Kabelac <zkabelac@redhat.com> writes:
> 
> Zdenek> My finding seems to show that BIP-256 slabtop segment grow by
> Zdenek> ~73KB per each device (while dm-io is about ~26KB)
> 
> Ok, I see it now that I tried with a bunch of DM devices.
> 
> DM allocates a bioset per volume. And since each bioset has an integrity
> mempool you'll end up with a bunch of memory locked down. It seems like
> a lot but it's actually the same amount as we reserve for the data path
> (bio-0 + biovec-256).
> 
> Since a bioset is not necessarily tied to a single block device we can't
> automatically decide whether to allocate the integrity pool or not. In
> the DM case, however, we just set up the integrity profile so the
> information is available.
> 
> Can you please try the following patch? This will change things so we
> only attach an integrity pool to the bioset if the logical volume is
> integrity-capable.

Hey Martin,

I just took the opportunity to review DM's blk_integrity code a bit more
closely -- with an eye towards stacking devices.  I found an issue that
I think we need to fix that has to do with a DM device's limits being
established during do_resume() and not during table_load().

Unfortunately, a DM device's blk_integrity gets preallocated during
table_load().  dm_table_prealloc_integrity()'s call to
blk_integrity_register() establishes the blk_integrity's block_size.

But a DM device's queue_limits aren't stacked until a DM device is
resumed -- via dm_calculate_queue_limits().
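
The ordering can be illustrated with a toy Python model.  This is only a
sketch of the sequence described above, not kernel code: ToyDmDevice and
its table_load()/resume() methods are hypothetical stand-ins, and the
512-byte starting block size is an assumption.

```python
class ToyDmDevice:
    """Hypothetical stand-in for a DM device; not the kernel's structures."""

    def __init__(self):
        self.logical_block_size = 512    # assumed value before limits are stacked
        self.integrity_block_size = None
        self.pending = []

    def table_load(self, underlying_block_sizes):
        # Remember the new table and preallocate the integrity profile now,
        # using whatever block size the queue reports at this moment.
        self.pending = list(underlying_block_sizes)
        self.integrity_block_size = self.logical_block_size

    def resume(self):
        # Limits are only stacked here: the largest underlying logical
        # block size wins.
        self.logical_block_size = max([self.logical_block_size] + self.pending)


dev = ToyDmDevice()
dev.table_load([512, 4096])    # one underlying device uses 4K blocks
dev.resume()
print(dev.logical_block_size, dev.integrity_block_size)    # prints "4096 512"
```

In this model the integrity profile keeps the block size seen at load time
even though the stacked limit grows at resume.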

For some background please see the patch header of this commit:
http://git.kernel.org/linus/754c5fc7ebb417

The final blk_integrity for the DM device isn't fully established until
do_resume()'s eventual call to dm_table_set_integrity(), which passes a
template to blk_integrity_register().  dm_table_set_integrity() does
validate the 'block_size' of each DM device's blk_integrity to make sure
they all match.  So the code would catch the inconsistency should it
arise.
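
As a rough illustration of that validation, the check amounts to requiring
a single block size across all profiles.  The helper below is hypothetical
and only models the 'block_size' comparison mentioned here; the real
kernel code may compare other fields as well.

```python
def block_sizes_match(block_sizes):
    """Return True when every integrity profile reports the same block size.

    Hypothetical helper: it models only the block-size comparison.
    """
    return len(set(block_sizes)) <= 1


consistent = block_sizes_match([512, 512, 512])
inconsistent = block_sizes_match([512, 4096])
print(consistent, inconsistent)    # prints "True False"
```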

All I'm saying is: it's possible for a table_load() to be unaware that a
newly added device's queue_limits will cause the DM device's final
queue_limits to be increased (say a 4K device was added to dm_device2,
and dm_device2 is now being added to dm_device1).

So it seems we need to establish bi->sector_size during the final stage
of blk_integrity_register(), e.g. when a template is passed.  Not sure
if you'd agree with that change in general but it'll work for DM because
the queue_limits are established before dm_table_set_integrity() is
called.

Maybe revalidate/change the 'block_size' during the final stage in case
it changed?
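
A minimal sketch of that idea, again as a toy Python model rather than
kernel code.  ToyIntegrity, register() and the "example-profile" name are
hypothetical; the only point is that the block size is re-read when the
template is passed.

```python
class ToyIntegrity:
    """Hypothetical stand-in for a blk_integrity profile."""

    def __init__(self):
        self.block_size = None
        self.name = None


def register(queue_block_size, profile=None, template_name=None):
    """Two-stage registration: preallocate without a template, finalize with one."""
    if profile is None:
        # First stage: preallocate using the block size known right now.
        profile = ToyIntegrity()
        profile.block_size = queue_block_size
    if template_name is not None:
        # Final stage: copy the template and re-read the block size,
        # which may have changed since the preallocation.
        profile.name = template_name
        profile.block_size = queue_block_size
    return profile


profile = register(512)                                # load: limits not stacked yet
profile = register(4096, profile, "example-profile")   # resume: limits are now 4K
print(profile.block_size)                              # prints "4096"
```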

Thanks,
Mike

