Re: staging: Add dm-writeboost

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Akira Hayakawa <ruby.wktk@gmail.com>
To: snitzer@redhat.com
Cc: devel@driverdev.osuosl.org, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, dm-devel@redhat.com,
	agk@redhat.com, joe@perches.com, akpm@linux-foundation.org,
	cesarb@cesarb.net, m.chehab@samsung.com
Subject: Re: staging: Add dm-writeboost
Date: Tue, 17 Sep 2013 21:41:18 +0900	[thread overview]
Message-ID: <52384DEE.7050003@gmail.com> (raw)
In-Reply-To: <20130916215357.GA5015@redhat.com>

Mike,

First, thank you for your commenting.
I was looking forward to your comments.

I suppose you are sensing some "smell" in my design.
You are worrying that dm-writeboost will not only confuse users
but also fall into worst situation of giving up backward-compatibility
after merging into tree.

That dm-writeboost's design is too eccentric as a DM target makes you so.

That you said
>   determines whether a device needs formatting, etc.  Otherwise I cannot
>   see how you can properly stack DM devices on writeboost devices
>   (suspend+resume become tediously different)
is a proof of smell.

Alasdair also said
> I read a statement like that as an indication of an interface or
> architectural problem.  The device-mapper approach is to 'design out'
> problems, rather than relying on users not doing bad things.
> Study the existing interfaces used by other targets to understand
> some approaches that proved successful, then decide which ones
> come closest to your needs.

and

Mikulas said
> Another idea:
> 
> Make the interface of dm-lc (the arguments to constructor, messages and 
> the status line) the same as dm-cache, so that they can be driven by the 
> same userspace code.
Though I guess this is going too far
since dm-writeboost and dm-cache are the different things
designing them similar definitely makes sense.

are also sensing of smell.

I am afraid so I am and
I am thinking of re-designing dm-writeboost
at the fundamental architectural level.
The interfaces will be similar to that of dm-cache as a result.

This will be a really a BIG change.

> Probably best for you to publish the dm-writeboost code a git repo on
> github.com or the like.  I just don't see what benefit there is to
> putting code like this in staging.  Users already need considerable
> userspace tools and infrastructure will also be changing in the
> near-term (e.g. the migration daemon).
Yes, I agree with that regarding the current implementation.
I withdraw from the proposal for staging.
I am really sorry for Greg and others caring about dm-writeboost.
But I will be back after re-designing.
staging means lot to get 3rd party users is for sure.

Since this will be a big change.
I want to agree on the design before going forward.
I will explain why the interfaces of dm-writeboost
are designed so complicated.

Essentially,
it is because dm-writeboost supports "cache-sharing".
The functionality of sharing a cache by devices is required in some cases.

If I remove the functionality the design will be much simpler
and the code will be slightly faster.

What to be removed after re-designing follows
and they are also explaining why cache-sharing makes the design bad.

(1) writeboost-mgr (maybe)
Mike said
> - really dislike the use of an intermediate "writeboost-mgr" target to
>   administer the writeboost devices.  There is no need for this.

but I don't think having a singleton intermediate writeboost-mgr
is completely weird.
dm-thin target also has a singleton thin-pool target
that create and destroy thin devices.

Below is a description from Documentation/device-mapper/thin-provisioning.txt

  Using an existing pool device
  -----------------------------

      dmsetup create pool \
          --table "0 20971520 thin-pool $metadata_dev $data_dev \
                   $data_block_size $low_water_mark"

  i) Creating a new thinly-provisioned volume.

    To create a new thinly- provisioned volume you must send a message to an
    active pool device, /dev/mapper/pool in this example.

      dmsetup message /dev/mapper/pool 0 "create_thin 0"

    Here '0' is an identifier for the volume, a 24-bit number.  It's up
    to the caller to allocate and manage these identifiers.  If the
    identifier is already in use, the message will fail with -EEXIST.

But I do agree on having writeboost-mgr is a smell of over-engineering.
A target does nothing at all but being an admin looks little bit weird
for a design of DM target.
Maybe this should be removed.

(2) device_id and cache_id
To manage which backing devices are attached to a cache
These IDs are needed like dm-thin.
But they are not needed if I give up cache-sharing and

> - the various typedefs aren't needed (e.g. device_id, cache_id,
>   cache_nr)
will be all cleared.

(3) sysfs
>   device it will establish a proper hierarchy (see: dm_get_device).  What
>   advantages are you seeing from having/managing this sysfs tree?
One of the advantages is 
that userland tools can see the relations between devices.
Some GUI application might want to draw that by refering the sysfs.

If I get rid of the cache-sharing,
the dimension of relations between devices
will not be needed and will be removed toward userland
and the alternative is to
set/get the tunable parameters are thru message and status
like dm-cache.

In addition,
there actually is a smelling code causing by this design.
The code below add/remove the sysfs interfaces
that should be done in .ctr but is separated for
actually very complicated reason
belonging to the minor behavior of dmsetup reload.

I fully agree on removing this sysfs anyway because
I don't think I will be able to maintain this sysfs forever
and that one of the reasons why I provides userland tools in Python
as an abstraction layer.

        if (!strcasecmp(cmd, "add_sysfs")) {
                struct kobject *dev_kobj;
                r = kobject_init_and_add(&wb->kobj, &device_ktype,
                                         devices_kobj, "%u", wb->id);
                if (r) {
                        WBERR();
                        return r;
                }

                dev_kobj = get_bdev_kobject(wb->device->bdev);
                r = sysfs_create_link(&wb->kobj, dev_kobj, "device");
                if (r) {
                        WBERR();
                        kobject_del(&wb->kobj);
                        kobject_put(&wb->kobj);
                        return r;
                }

                kobject_uevent(&wb->kobj, KOBJ_ADD);
                return 0;
        }

        if (!strcasecmp(cmd, "remove_sysfs")) {
                kobject_uevent(&wb->kobj, KOBJ_REMOVE);

                sysfs_remove_link(&wb->kobj, "device");
                kobject_del(&wb->kobj);
                kobject_put(&wb->kobj);

                wb_devices[wb->id] = NULL;
                return 0;
        }

Simplify the design and
make it more possible to maintain the target
for the future is what I fully agree with.
Being adhere to cache-sharing by
risking the future maintainability doesn't pay.
Re-designing the dm-writeboost resemble to dm-cache
is a leading candidate of course.

I will ask for comment for the new design in the next reply.

Akira

WARNING: multiple messages have this Message-ID (diff)

From: Akira Hayakawa <ruby.wktk@gmail.com>
To: snitzer@redhat.com
Cc: gregkh@linuxfoundation.org, devel@driverdev.osuosl.org,
	linux-kernel@vger.kernel.org, dm-devel@redhat.com,
	cesarb@cesarb.net, joe@perches.com, akpm@linux-foundation.org,
	agk@redhat.com, m.chehab@samsung.com
Subject: Re: staging: Add dm-writeboost
Date: Tue, 17 Sep 2013 21:41:18 +0900	[thread overview]
Message-ID: <52384DEE.7050003@gmail.com> (raw)
In-Reply-To: <20130916215357.GA5015@redhat.com>

Mike,

First, thank you for your commenting.
I was looking forward to your comments.

I suppose you are sensing some "smell" in my design.
You are worrying that dm-writeboost will not only confuse users
but also fall into worst situation of giving up backward-compatibility
after merging into tree.

That dm-writeboost's design is too eccentric as a DM target makes you so.

That you said
>   determines whether a device needs formatting, etc.  Otherwise I cannot
>   see how you can properly stack DM devices on writeboost devices
>   (suspend+resume become tediously different)
is a proof of smell.

Alasdair also said
> I read a statement like that as an indication of an interface or
> architectural problem.  The device-mapper approach is to 'design out'
> problems, rather than relying on users not doing bad things.
> Study the existing interfaces used by other targets to understand
> some approaches that proved successful, then decide which ones
> come closest to your needs.

and

Mikulas said
> Another idea:
> 
> Make the interface of dm-lc (the arguments to constructor, messages and 
> the status line) the same as dm-cache, so that they can be driven by the 
> same userspace code.
Though I guess this is going too far
since dm-writeboost and dm-cache are the different things
designing them similar definitely makes sense.

are also sensing of smell.

I am afraid so I am and
I am thinking of re-designing dm-writeboost
at the fundamental architectural level.
The interfaces will be similar to that of dm-cache as a result.

This will be a really a BIG change.

> Probably best for you to publish the dm-writeboost code a git repo on
> github.com or the like.  I just don't see what benefit there is to
> putting code like this in staging.  Users already need considerable
> userspace tools and infrastructure will also be changing in the
> near-term (e.g. the migration daemon).
Yes, I agree with that regarding the current implementation.
I withdraw from the proposal for staging.
I am really sorry for Greg and others caring about dm-writeboost.
But I will be back after re-designing.
staging means lot to get 3rd party users is for sure.

Since this will be a big change.
I want to agree on the design before going forward.
I will explain why the interfaces of dm-writeboost
are designed so complicated.

Essentially,
it is because dm-writeboost supports "cache-sharing".
The functionality of sharing a cache by devices is required in some cases.

If I remove the functionality the design will be much simpler
and the code will be slightly faster.

What to be removed after re-designing follows
and they are also explaining why cache-sharing makes the design bad.

(1) writeboost-mgr (maybe)
Mike said
> - really dislike the use of an intermediate "writeboost-mgr" target to
>   administer the writeboost devices.  There is no need for this.

but I don't think having a singleton intermediate writeboost-mgr
is completely weird.
dm-thin target also has a singleton thin-pool target
that create and destroy thin devices.

Below is a description from Documentation/device-mapper/thin-provisioning.txt

  Using an existing pool device
  -----------------------------

      dmsetup create pool \
          --table "0 20971520 thin-pool $metadata_dev $data_dev \
                   $data_block_size $low_water_mark"

  i) Creating a new thinly-provisioned volume.

    To create a new thinly- provisioned volume you must send a message to an
    active pool device, /dev/mapper/pool in this example.

      dmsetup message /dev/mapper/pool 0 "create_thin 0"

    Here '0' is an identifier for the volume, a 24-bit number.  It's up
    to the caller to allocate and manage these identifiers.  If the
    identifier is already in use, the message will fail with -EEXIST.

But I do agree on having writeboost-mgr is a smell of over-engineering.
A target does nothing at all but being an admin looks little bit weird
for a design of DM target.
Maybe this should be removed.

(2) device_id and cache_id
To manage which backing devices are attached to a cache
These IDs are needed like dm-thin.
But they are not needed if I give up cache-sharing and

> - the various typedefs aren't needed (e.g. device_id, cache_id,
>   cache_nr)
will be all cleared.

(3) sysfs
>   device it will establish a proper hierarchy (see: dm_get_device).  What
>   advantages are you seeing from having/managing this sysfs tree?
One of the advantages is 
that userland tools can see the relations between devices.
Some GUI application might want to draw that by refering the sysfs.

If I get rid of the cache-sharing,
the dimension of relations between devices
will not be needed and will be removed toward userland
and the alternative is to
set/get the tunable parameters are thru message and status
like dm-cache.

In addition,
there actually is a smelling code causing by this design.
The code below add/remove the sysfs interfaces
that should be done in .ctr but is separated for
actually very complicated reason
belonging to the minor behavior of dmsetup reload.

I fully agree on removing this sysfs anyway because
I don't think I will be able to maintain this sysfs forever
and that one of the reasons why I provides userland tools in Python
as an abstraction layer.

        if (!strcasecmp(cmd, "add_sysfs")) {
                struct kobject *dev_kobj;
                r = kobject_init_and_add(&wb->kobj, &device_ktype,
                                         devices_kobj, "%u", wb->id);
                if (r) {
                        WBERR();
                        return r;
                }

                dev_kobj = get_bdev_kobject(wb->device->bdev);
                r = sysfs_create_link(&wb->kobj, dev_kobj, "device");
                if (r) {
                        WBERR();
                        kobject_del(&wb->kobj);
                        kobject_put(&wb->kobj);
                        return r;
                }

                kobject_uevent(&wb->kobj, KOBJ_ADD);
                return 0;
        }

        if (!strcasecmp(cmd, "remove_sysfs")) {
                kobject_uevent(&wb->kobj, KOBJ_REMOVE);

                sysfs_remove_link(&wb->kobj, "device");
                kobject_del(&wb->kobj);
                kobject_put(&wb->kobj);

                wb_devices[wb->id] = NULL;
                return 0;
        }

Simplify the design and
make it more possible to maintain the target
for the future is what I fully agree with.
Being adhere to cache-sharing by
risking the future maintainability doesn't pay.
Re-designing the dm-writeboost resemble to dm-cache
is a leading candidate of course.

I will ask for comment for the new design in the next reply.

Akira

next prev parent reply	other threads:[~2013-09-17 12:41 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-01 11:10 [PATCH] staging: Add dm-writeboost Akira Hayakawa
2013-09-16 21:53 ` Mike Snitzer
2013-09-16 21:53   ` Mike Snitzer
2013-09-16 22:49   ` Dan Carpenter
2013-09-16 22:49     ` Dan Carpenter
2013-09-17 12:41   ` Akira Hayakawa [this message]
2013-09-17 12:41     ` Akira Hayakawa
2013-09-17 20:18     ` Mike Snitzer
2013-09-17 20:18       ` Mike Snitzer
2013-09-17 12:43   ` Akira Hayakawa
2013-09-17 12:43     ` Akira Hayakawa
2013-09-17 20:59     ` Mike Snitzer
2013-09-17 20:59       ` Mike Snitzer
2013-09-22  0:09       ` Reworking dm-writeboost [was: Re: staging: Add dm-writeboost] Akira Hayakawa
2013-09-22  0:09         ` Akira Hayakawa
2013-09-24 12:20         ` Akira Hayakawa
2013-09-24 12:20           ` Akira Hayakawa
2013-09-25 17:37           ` Mike Snitzer
2013-09-25 17:37             ` Mike Snitzer
2013-09-26  1:42             ` Akira Hayakawa
2013-09-26  1:47             ` Akira Hayakawa
2013-09-27 18:35               ` Mike Snitzer
2013-09-27 18:35                 ` Mike Snitzer
2013-09-28 11:29                 ` Akira Hayakawa
2013-09-28 11:29                   ` Akira Hayakawa
2013-09-25 23:03           ` Greg KH
2013-09-25 23:03             ` Greg KH
2013-09-26  3:43           ` Dave Chinner
2013-10-01  8:26             ` Joe Thornber
2013-10-01  8:26               ` Joe Thornber
2013-10-03  0:01               ` Mikulas Patocka
2013-10-03  0:01                 ` [dm-devel] " Mikulas Patocka
2013-10-04  2:04                 ` Dave Chinner
2013-10-04  2:04                   ` Dave Chinner
2013-10-05  7:51                   ` Akira Hayakawa
2013-10-07 23:43                     ` Dave Chinner
2013-10-08  9:41                       ` Christoph Hellwig
2013-10-08  9:41                         ` Christoph Hellwig
2013-10-08 10:37                         ` Akira Hayakawa
2013-10-08 10:37                           ` Akira Hayakawa
2013-10-08 15:29                           ` Mike Snitzer
2013-10-09  1:07                             ` Akira Hayakawa
2013-10-09  1:07                               ` Akira Hayakawa
2013-10-08 10:57                       ` [dm-devel] " Akira Hayakawa
2013-10-08 10:57                         ` Akira Hayakawa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52384DEE.7050003@gmail.com \
    --to=ruby.wktk@gmail.com \
    --cc=agk@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cesarb@cesarb.net \
    --cc=devel@driverdev.osuosl.org \
    --cc=dm-devel@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=joe@perches.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=m.chehab@samsung.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.