linux-lvm.redhat.com archive mirror
 help / color / mirror / Atom feed
* [linux-lvm] Bugs in DM_DEVICE_STATUS ioctl no_flush flag settings
@ 2016-03-21 12:06 Ming-Hung Tsai
  2016-03-31 12:16 ` Zdenek Kabelac
  0 siblings, 1 reply; 4+ messages in thread
From: Ming-Hung Tsai @ 2016-03-21 12:06 UTC (permalink / raw)
  To: LVM general discussion and development

Hi,

Here are some patches related to DM_DEVICE_STATUS ioctl handling in dm-thin.

------------------------------------------------------------------
1. Fix incorrect no_flush flag settings when querying device utilization

The parameter lv is NULL when querying thin-pool metadata percent (and also the
dlid parameter), thus I use the target_type string instead.

diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c
index 8e56b7a..4a96d1e 100644
--- a/lib/activate/dev_manager.c
+++ b/lib/activate/dev_manager.c
@@ -897,7 +897,8 @@ static int _percent_run(struct dev_manager *dm,
const char *name,
                return_0;

        /* No freeze on overfilled thin-pool, read existing slightly
outdated data */
-       if (lv && lv_is_thin_pool(lv) &&
+       if ((segtype = get_segtype_from_string(dm->cmd, target_type)) &&
+           segtype_is_thin_pool(segtype) &&
            !dm_task_no_flush(dmt))
                log_warn("Can't set no_flush flag."); /* Non fatal */

@@ -926,7 +927,7 @@ static int _percent_run(struct dev_manager *dm,
const char *name,
                if (!type || !params)
                        continue;

-               if (!(segtype = get_segtype_from_string(dm->cmd, target_type)))
+               if (!segtype)
                        continue;

                if (strcmp(type, target_type)) {

------------------------------------------------------------------
2. Set no_flush flag when querying device info to avoid freeze

Same as 872ea3b98, we should set the no_flush flag in _info_run() to avoid
freeze and performance impact. We might need to patch all other functions
running DM_DEVICE_STATUS, like device_is_usable(), lv_has_target_type(), etc.

diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c
index 4a96d1e..67c476e 100644
--- a/lib/activate/dev_manager.c
+++ b/lib/activate/dev_manager.c
@@ -212,6 +212,12 @@ static int _info_run(info_type_t type, const char
*name, const char *dlid,
                                major, minor, with_open_count)))
                return_0;

+       /* No freeze on overfilled thin-pool, read existing slightly
outdated data */
+       if (type == STATUS &&
+           segtype_is_thin_pool(seg_status->seg->segtype) &&
+           !dm_task_no_flush(dmt))
+               log_warn("Can't set no_flush flag."); /* Non fatal */
+
        if (!dm_task_run(dmt))
                goto_out;

------------------------------------------------------------------
3. Fix _metadatapercent_disp to not to handle thin volumes

"lvs vg1/thin_lvol0 -o metadata_percent" should not send ioctls.

diff --git a/lib/report/report.c b/lib/report/report.c
index 7070092..8ae565e 100644
--- a/lib/report/report.c
+++ b/lib/report/report.c
@@ -2961,8 +2961,6 @@ static int _metadatapercent_disp(struct dm_report *rh,

        if (lv_is_thin_pool(lv))
                (void) lv_thin_pool_percent(lv, 1, &percent);
-       else if (lv_is_thin_volume(lv))
-               (void) lv_thin_percent(lv, 1, &percent);
        else if (lv_is_cache(lv) || lv_is_cache_pool(lv)) {
                if (lv_cache_status(lv, &status)) {
                        percent = status->metadata_usage;

------------------------------------------------------------------
4. Remove unused parameter from lv_thin_percent

diff --git a/lib/activate/activate.c b/lib/activate/activate.c
index 2eb24d4..0b8bebf 100644
--- a/lib/activate/activate.c
+++ b/lib/activate/activate.c
@@ -304,8 +304,7 @@ int lv_thin_pool_percent(const struct
logical_volume *lv, int metadata,
 {
  return 0;
 }
-int lv_thin_percent(const struct logical_volume *lv, int mapped,
-    dm_percent_t *percent)
+int lv_thin_percent(const struct logical_volume *lv, dm_percent_t *percent)
 {
  return 0;
 }
@@ -1124,8 +1123,7 @@ int lv_thin_pool_percent(const struct
logical_volume *lv, int metadata,
 /*
  * Returns 1 if percent set, else 0 on failure.
  */
-int lv_thin_percent(const struct logical_volume *lv,
-    int mapped, dm_percent_t *percent)
+int lv_thin_percent(const struct logical_volume *lv, dm_percent_t *percent)
 {
  int r;
  struct dev_manager *dm;
@@ -1139,7 +1137,7 @@ int lv_thin_percent(const struct logical_volume *lv,
  if (!(dm = dev_manager_create(lv->vg->cmd, lv->vg->name, 1)))
  return_0;

- if (!(r = dev_manager_thin_percent(dm, lv, mapped, percent)))
+ if (!(r = dev_manager_thin_percent(dm, lv, percent)))
  stack;

  dev_manager_destroy(dm);
diff --git a/lib/activate/activate.h b/lib/activate/activate.h
index 74afb95..c01fedd 100644
--- a/lib/activate/activate.h
+++ b/lib/activate/activate.h
@@ -175,8 +175,7 @@ int lv_cache_status(const struct logical_volume *lv,
     struct lv_status_cache **status);
 int lv_thin_pool_percent(const struct logical_volume *lv, int metadata,
  dm_percent_t *percent);
-int lv_thin_percent(const struct logical_volume *lv, int mapped,
-    dm_percent_t *percent);
+int lv_thin_percent(const struct logical_volume *lv, dm_percent_t *percent);
 int lv_thin_pool_transaction_id(const struct logical_volume *lv,
  uint64_t *transaction_id);
 int lv_thin_device_id(const struct logical_volume *lv, uint32_t *device_id);
diff --git a/lib/activate/dev_manager.c b/lib/activate/dev_manager.c
index 67c476e..753f188 100644
--- a/lib/activate/dev_manager.c
+++ b/lib/activate/dev_manager.c
@@ -1428,7 +1428,7 @@ int dev_manager_thin_pool_percent(struct dev_manager *dm,

 int dev_manager_thin_percent(struct dev_manager *dm,
      const struct logical_volume *lv,
-     int mapped, dm_percent_t *percent)
+     dm_percent_t *percent)
 {
  char *name;
  const char *dlid;
@@ -1443,7 +1443,7 @@ int dev_manager_thin_percent(struct dev_manager *dm,

  log_debug_activation("Getting device status percentage for %s", name);
  if (!(_percent(dm, name, dlid, "thin", 0,
-       (mapped) ? NULL : lv, percent, NULL, 1)))
+       lv, percent, NULL, 1)))
  return_0;

  return 1;
diff --git a/lib/activate/dev_manager.h b/lib/activate/dev_manager.h
index bcfb226..3cf8b90 100644
--- a/lib/activate/dev_manager.h
+++ b/lib/activate/dev_manager.h
@@ -74,7 +74,7 @@ int dev_manager_thin_pool_percent(struct dev_manager *dm,
   int metadata, dm_percent_t *percent);
 int dev_manager_thin_percent(struct dev_manager *dm,
      const struct logical_volume *lv,
-     int mapped, dm_percent_t *percent);
+     dm_percent_t *percent);
 int dev_manager_thin_device_id(struct dev_manager *dm,
        const struct logical_volume *lv,
        uint32_t *device_id);
diff --git a/lib/display/display.c b/lib/display/display.c
index a6387c6..27d09f0 100644
--- a/lib/display/display.c
+++ b/lib/display/display.c
@@ -468,7 +468,7 @@ int lvdisplay_full(struct cmd_context *cmd,
  log_print("LV merging to          %s",
   seg->merge_lv->name);
  if (inkernel)
- thin_active = lv_thin_percent(lv, 0, &thin_percent);
+ thin_active = lv_thin_percent(lv, &thin_percent);
  if (lv_is_merging_origin(lv))
  log_print("LV merged with         %s",
   find_snapshot(lv)->lv->name);
diff --git a/lib/report/properties.c b/lib/report/properties.c
index 62c81b9..69e8552 100644
--- a/lib/report/properties.c
+++ b/lib/report/properties.c
@@ -120,7 +120,7 @@ static dm_percent_t _data_percent(const struct
logical_volume *lv)
  }

  if (lv_is_thin_volume(lv))
- return lv_thin_percent(lv, 0, &percent) ? percent : DM_PERCENT_INVALID;
+ return lv_thin_percent(lv, &percent) ? percent : DM_PERCENT_INVALID;

  return lv_thin_pool_percent(lv, 0, &percent) ? percent : DM_PERCENT_INVALID;
 }
diff --git a/lib/report/report.c b/lib/report/report.c
index 8ae565e..9970f7c 100644
--- a/lib/report/report.c
+++ b/lib/report/report.c
@@ -2939,7 +2939,7 @@ static int _datapercent_disp(struct dm_report
*rh, struct dm_pool *mem,
  else if (lv_is_thin_pool(lv))
  (void) lv_thin_pool_percent(lv, 0, &percent);
  else if (lv_is_thin_volume(lv))
- (void) lv_thin_percent(lv, 0, &percent);
+ (void) lv_thin_percent(lv, &percent);
  else if (lv_is_cache(lv) || lv_is_cache_pool(lv)) {
  if (lv_cache_status(lv, &status)) {
  percent = status->data_usage;

------------------------------------------------------------------

Beside these patches, I think that the mechanism to obtain lv data utilization
should be revised. Given a thin pool vg1/tp1, I hope that the command

# lvs vg1/tp1 -o lv_attr,data_percent,metadata_percent

should run DM_DEVICE_STATUS ioctl only once.

To achieve this, we should calculate the percents in early phase,
then cache the calculated percents in struct lv_with_info_and_seg_status,
thus the column handlers _datapercent_disp() and _metadatapercent_disp()
could obtain the cached percents.

The follow is my proposed approach, inspired by lvm-cache's implementation:

(1) Declare a new structure for cache percents:

    struct lv_status_thin_pool {
        struct dm_pool *mem;
        struct dm_status_thin_pool *status;
        dm_percent_t data_usage;
        dm_percent_t metadata_usage;
    };

(2) int lv_thin_pool_status(const struct logical_volume *pool_lv,
                            lv_status_thin_pool **status):

      Allocates lv_status_thin_pool, obtains dm_status_thin_pool by using
      dev_manager_thin_pool_status(), then calculates data/metadata percents.

(3) int dev_manager_thin_pool_status(struct dev_manager *dm,
                                     const struct logical_volume *lv,
                                     struct dm_status_thin_pool **status,
                                     int noflush):

      Allocates dm_status_thin_pool, run ioctl,
      then transform the status string to dm_status_thin_pool
      by using dm_get_status_thin_pool().

(4) Stores lv_status_<segtype> in lv_seg_status:

    struct lv_seg_status {
        struct dm_pool *mem; /* input */
        const struct lv_segment *seg; /* input */
        lv_seg_status_type_t type; /* output */
        union {
            struct lv_status_cache *cache;
            struct lv_status_raid *raid;
            struct lv_status_mirror *mirror;  // is dm-mirror required?
            struct lv_status_snapshot *snapshot;
            struct lv_status_thin *thin;
            struct lv_status_thin_pool *thin_pool;
        };
    };

(5) Change the colume type of "data_percent" and "metadata_percent"
    to LVSSTATUS.

(6) Also apply this approach to lvdisplay: Now lvdisplay_full() sends
    DM_DEVICE_STATUS twice (lv_thin_pool_percent() twice)

Finally, we could remove segtype_handler::target_percent in all the segtypes.

It need some efforts to do the above patches. How do you think about it?


Thanks,
Ming-Hung Tsai

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [linux-lvm] Bugs in DM_DEVICE_STATUS ioctl no_flush flag settings
  2016-03-21 12:06 [linux-lvm] Bugs in DM_DEVICE_STATUS ioctl no_flush flag settings Ming-Hung Tsai
@ 2016-03-31 12:16 ` Zdenek Kabelac
  2016-04-01  8:08   ` Zdenek Kabelac
  2016-04-01 16:18   ` Ming-Hung Tsai
  0 siblings, 2 replies; 4+ messages in thread
From: Zdenek Kabelac @ 2016-03-31 12:16 UTC (permalink / raw)
  To: LVM general discussion and development

Dne 21.3.2016 v 13:06 Ming-Hung Tsai napsal(a):
> Hi,

Hi

Thanks for valuable deep code scanning.


>
> Here are some patches related to DM_DEVICE_STATUS ioctl handling in dm-thin.
>
> ------------------------------------------------------------------
> 1. Fix incorrect no_flush flag settings when querying device utilization
>
> The parameter lv is NULL when querying thin-pool metadata percent (and also the
> dlid parameter), thus I use the target_type string instead.
>


Yes - forgotten state - fixed just a bit differently.

> ------------------------------------------------------------------
> 2. Set no_flush flag when querying device info to avoid freeze
>

Yes - will need to do - but this also applies on cache device.
So working on larger fix for both.


> ------------------------------------------------------------------
> 3. Fix _metadatapercent_disp to not to handle thin volumes
>


Actually I always planned to 'display' highest_mapped_sector -
so I did now upstreamed it now (as it seemed arg was unused).

So "Meta%" shows what's been the highest written sector as percent.
We do plan to add either switch or new field to show those '%' in
form of byte sizes in future...

Not yet sure how useful or confusing this info is for now...
But i.e. it's interesting to see  how much data is written
by mkfs.ext4/xfs/reiserfs  and how far writes goes on a thin device.

Time for complains...

>
> Beside these patches, I think that the mechanism to obtain lv data utilization
> should be revised. Given a thin pool vg1/tp1, I hope that the command
>
> # lvs vg1/tp1 -o lv_attr,data_percent,metadata_percent
>
> should run DM_DEVICE_STATUS ioctl only once.

Yes - planned - but there is always not enough time to do house-keeping job
and new features are pushed more strongly :(....


> To achieve this, we should calculate the percents in early phase,
> then cache the calculated percents in struct lv_with_info_and_seg_status,
> thus the column handlers _datapercent_disp() and _metadatapercent_disp()
> could obtain the cached percents.
>
> The follow is my proposed approach, inspired by lvm-cache's implementation:
>
> (1) Declare a new structure for cache percents:

Yeah - I've been create lv_cache struct with this in mind.

However the whole issue should be seen more generically in 'OO-way'.

So we do want to have 'status' object with reporting methods,
and this object should be filled by 'segment' method.

The complexity grows however as we do have some segments being related
to other segments - that's why we currently have very ugly code hack
inside to do filling of most status struct in one place - which is totally 
unextendable - but ATM was the quickest way to achieve reasonable goal.

So another TODO job once there is time for it...
A LOT of work should be really done through segment methods instead of placing 
quirks across whole code base - we need more hands....

For now I'm rather conservative about large code-base changes till Heinz's 
raid branch is fully merged.

Regards

Zdenek

PS: I'm amazed some else is able to read lvm2 code.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [linux-lvm] Bugs in DM_DEVICE_STATUS ioctl no_flush flag settings
  2016-03-31 12:16 ` Zdenek Kabelac
@ 2016-04-01  8:08   ` Zdenek Kabelac
  2016-04-01 16:18   ` Ming-Hung Tsai
  1 sibling, 0 replies; 4+ messages in thread
From: Zdenek Kabelac @ 2016-04-01  8:08 UTC (permalink / raw)
  To: LVM general discussion and development

Dne 31.3.2016 v 14:16 Zdenek Kabelac napsal(a):
> Dne 21.3.2016 v 13:06 Ming-Hung Tsai napsal(a):
>> Hi,
>
> Hi
>> ------------------------------------------------------------------
>> 3. Fix _metadatapercent_disp to not to handle thin volumes
>>
>
>
> Actually I always planned to 'display' highest_mapped_sector -
> so I did now upstreamed it now (as it seemed arg was unused).
>
> So "Meta%" shows what's been the highest written sector as percent.
> We do plan to add either switch or new field to show those '%' in
> form of byte sizes in future...
>
> Not yet sure how useful or confusing this info is for now...
> But i.e. it's interesting to see  how much data is written
> by mkfs.ext4/xfs/reiserfs  and how far writes goes on a thin device.
>
> Time for complains...
>

Following here on myself - after longer debate how this would be
understandable by a user - we've concluded we will rather introduce
a new field and leave  "Meta%" empty - so there will
be rather a new field  'highest_size/HSize' instead.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [linux-lvm] Bugs in DM_DEVICE_STATUS ioctl no_flush flag settings
  2016-03-31 12:16 ` Zdenek Kabelac
  2016-04-01  8:08   ` Zdenek Kabelac
@ 2016-04-01 16:18   ` Ming-Hung Tsai
  1 sibling, 0 replies; 4+ messages in thread
From: Ming-Hung Tsai @ 2016-04-01 16:18 UTC (permalink / raw)
  To: LVM general discussion and development

2016-03-31 20:16 GMT+08:00 Zdenek Kabelac <zkabelac@redhat.com>:
>
> Yeah - I've been create lv_cache struct with this in mind.
>
> However the whole issue should be seen more generically in 'OO-way'.
>
> So we do want to have 'status' object with reporting methods,
> and this object should be filled by 'segment' method.
>
> The complexity grows however as we do have some segments being related
> to other segments - that's why we currently have very ugly code hack
> inside to do filling of most status struct in one place - which is totally
> unextendable - but ATM was the quickest way to achieve reasonable goal.

Agree. We should utilize the segtype_handler framework as much as we can.

> For now I'm rather conservative about large code-base changes till Heinz's
> raid branch is fully merged.
> PS: I'm amazed some else is able to read lvm2 code.

Because there are lots of legacy code, hacks, and undocumented stuff? :p
It's painful when I started to read lvm2 code, but now I think I'm able
to write a maintenance guide of lvm2...

Hope that someday LVM code would become well-organized, maybe lvm3...


Ming-Hung Tsai

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-04-01 16:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-21 12:06 [linux-lvm] Bugs in DM_DEVICE_STATUS ioctl no_flush flag settings Ming-Hung Tsai
2016-03-31 12:16 ` Zdenek Kabelac
2016-04-01  8:08   ` Zdenek Kabelac
2016-04-01 16:18   ` Ming-Hung Tsai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).