From: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
To: Ryusuke Konishi
	<konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v2 9/9] nilfs2: prevent starvation of segments protected by snapshots
Date: Sun, 31 May 2015 20:13:44 +0200
Message-ID: <556B4F58.9080801@gmx.net>
In-Reply-To: <20150601.014550.269184778137708369.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>

On 2015-05-31 18:45, Ryusuke Konishi wrote:
> On Fri, 22 May 2015 20:10:05 +0200, Andreas Rohner wrote:
>> On 2015-05-20 16:43, Ryusuke Konishi wrote:
>>> On Sun,  3 May 2015 12:05:22 +0200, Andreas Rohner wrote:
>>>> It doesn't really matter if the number of reclaimable blocks for a
>>>> segment is inaccurate, as long as the overall performance is better than
>>>> the simple timestamp algorithm and starvation is prevented.
>>>>
>>>> The following steps will lead to starvation of a segment:
>>>>
>>>> 1. The segment is written
>>>> 2. A snapshot is created
>>>> 3. The files in the segment are deleted and the number of live
>>>>    blocks for the segment is decremented to a very low value
>>>> 4. The GC tries to free the segment, but there are no reclaimable
>>>>    blocks, because they are all protected by the snapshot. To prevent an
>>>>    infinite loop the GC has to adjust the number of live blocks to the
>>>>    correct value.
>>>> 5. The snapshot is converted to a checkpoint and the blocks in the
>>>>    segment are now reclaimable.
>>>> 6. The GC will never attempt to clean the segment again, because it
>>>>    looks as if it had a high number of live blocks.
>>>>
>>>> To prevent this, the already existing padding field of the SUFILE entry
>>>> is used to track the number of snapshot blocks in the segment. This
>>>> number is only set by the GC, since it collects the necessary
>>>> information anyway. So there is no need to track which block belongs to
>>>> which segment. In step 4 of the list above the GC will set the new field
>>>> su_nsnapshot_blks. In step 5 all entries in the SUFILE are checked and
>>>> entries with a big su_nsnapshot_blks field get their su_nlive_blks field
>>>> reduced.
>>>>
>>>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
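
The starvation scenario and the proposed fix can be modelled in a few
lines of C. This is only an illustrative sketch, not the code from the
patch: struct seg_usage, gc_adjust() and fix_starving() are hypothetical
names, the 50% threshold follows the max_segblks ratio discussed further
down, and both clamping the live-block count to that threshold and
resetting the snapshot count afterwards are assumptions, since the commit
message only says that su_nlive_blks gets "reduced".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of one SUFILE entry (not the on-disk layout). */
struct seg_usage {
	uint32_t nlive_blks;      /* estimated number of live blocks */
	uint32_t nsnapshot_blks;  /* blocks the GC found protected by a snapshot */
};

/*
 * Step 4: the GC could not reclaim `protected_blks` blocks, so it records
 * them and raises the live-block estimate to at least that value.
 */
static void gc_adjust(struct seg_usage *su, uint32_t protected_blks)
{
	su->nsnapshot_blks = protected_blks;
	if (su->nlive_blks < protected_blks)
		su->nlive_blks = protected_blks;
}

/*
 * Step 5: a snapshot was converted to a checkpoint. Every entry with a
 * large snapshot count gets its live-block estimate clamped (assumption)
 * so that the GC considers the segment again. Returns the number of
 * entries that exceeded the threshold.
 */
static size_t fix_starving(struct seg_usage *su, size_t nsegs,
			   uint32_t blocks_per_segment)
{
	uint32_t max_segblks = blocks_per_segment / 2;  /* 50% threshold */
	size_t i, nfixed = 0;

	for (i = 0; i < nsegs; i++) {
		if (su[i].nsnapshot_blks <= max_segblks)
			continue;
		if (su[i].nlive_blks > max_segblks)
			su[i].nlive_blks = max_segblks;
		su[i].nsnapshot_blks = 0;  /* assumption: reset once handled */
		nfixed++;
	}
	return nfixed;
}
```

Without fix_starving(), the entry adjusted in step 4 keeps its high
live-block estimate forever, which is exactly the situation in step 6.
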
>>>
>>> I still don't know whether this workaround is the way we should take
>>> or not.  This patch has several drawbacks:
>>>
>>>  1. It introduces overhead to every "chcp cp" operation
>>>     due to the traversal rewrite of sufile.
>>>     If the ratio of snapshot-protected blocks is high, then
>>>     this overhead will be large.
>>>
>>>  2. The traversal rewrite of sufile will cause many sufile blocks to be
>>>     written out.  If most blocks are protected by a snapshot,
>>>     more than 4MB of sufile blocks will be written per 1TB capacity.
>>>
>>>     Even though this rewrite may not happen for contiguous "chcp cp"
>>>     operations, it can still generate sufile block writes
>>>     if an application using nilfs manipulates snapshots frequently.
>>
>> I could also implement this functionality in nilfs_cleanerd in
>> userspace. Every time a "chcp cp" happens some kind of permanent flag
>> like "snapshot_was_recently_deleted" is set at an appropriate location.
>> The flag could be returned with GET_SUSTAT ioctl(). Then nilfs_cleanerd
>> would, at certain intervals and if the flag is set, check all segments
>> with GET_SUINFO ioctl() and set the ones that have potentially invalid
>> values with SET_SUINFO ioctl(). After that it would clear the
>> "snapshot_was_recently_deleted" flag. What do you think about this idea?
> 
> Sorry for my late reply.

No problem. I was also very busy last week.

> I think moving the functionality to cleanerd and notifying some sort
> of information to userland through ioctl for that, is a good idea
> except that I feel the ioctl should be GET_CPSTAT instead of
> GET_SUINFO because it's checkpoint/snapshot related information.

OK, good idea.

> I think the parameter that should be added is a set of statistics
> information including the number of deleted snapshots since the file
> system was mounted last (1).  The counter (1) can serve as the
> "snapshot_was_recently_deleted" flag if it monotonically increases.
> Although we could use the timestamp of when a snapshot was last
> deleted, it is less preferable than the counter (1) because the system
> clock may be set back, and it also has precision issues.

I agree, a counter is better than a simple flag.
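
A minimal sketch of how the cleaner could use such a counter follows. The
names struct cleaner_state and need_rescan() are hypothetical, and
`ndeleted_snapshots` stands for the proposed "snapshots deleted since the
last mount" statistic, which does not exist in any current ioctl. The
check compares for inequality rather than "greater than", on the
assumption that the counter starts again from zero after a remount.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cleaner-side state; not part of nilfs-utils. */
struct cleaner_state {
	uint64_t last_seen_deleted;  /* counter value at the last scan */
};

/*
 * Returns true if snapshots were deleted since the last call, i.e. the
 * cleaner should rescan the segment usage information. Remembering the
 * value replaces the explicit "clear the flag" step of the flag variant.
 */
static bool need_rescan(struct cleaner_state *st, uint64_t ndeleted_snapshots)
{
	if (ndeleted_snapshots == st->last_seen_deleted)
		return false;
	st->last_seen_deleted = ndeleted_snapshots;
	return true;
}
```

Unlike a flag, the counter needs no write from userspace to reset it, and
several observers can each keep their own last-seen value.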

> Note that we must add GET_CPSTAT_V2 (or GET_SUSTAT_V2) and the
> corresponding structure (i.e. nilfs_cpstat_v2, or so) since ioctl
> codes depend on the size of argument data and it will be changed in
> both ioctls; unfortunately, neither GET_CPSTAT nor GET_SUSTAT ioctl is
> expandable.  Some ioctls like EVIOCGKEYCODE_V2 will be a reference for
> this issue.
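
The reason a _V2 ioctl is needed can be shown with the _IOR() macro from
<linux/ioctl.h>, which encodes the argument size into the command number.
The sketch below is Linux-only and uses made-up demo structures, a made-up
ioctl type character 'x' and number 0x01; it does not reproduce the real
nilfs2 definitions, and the field ndeleted_snapshots is hypothetical.
Reusing the same number with a larger structure is the pattern that
EVIOCGKEYCODE_V2 follows.

```c
#include <linux/ioctl.h>
#include <stdint.h>

/* Demo stand-in for the existing statistics structure (fields assumed). */
struct demo_cpstat {
	uint64_t cno;
	uint64_t ncps;
	uint64_t nsss;
};

/* Demo stand-in for an extended structure with one extra counter. */
struct demo_cpstat_v2 {
	uint64_t cno;
	uint64_t ncps;
	uint64_t nsss;
	uint64_t ndeleted_snapshots;  /* hypothetical new field */
};

/* Same type and number, different argument size: two distinct commands. */
#define DEMO_IOCTL_GET_CPSTAT    _IOR('x', 0x01, struct demo_cpstat)
#define DEMO_IOCTL_GET_CPSTAT_V2 _IOR('x', 0x01, struct demo_cpstat_v2)

/* Extracts the argument size that is encoded in an ioctl command. */
static unsigned int demo_arg_size(unsigned int cmd)
{
	return _IOC_SIZE(cmd);
}
```

Because the size is part of the command value, growing the existing
structure would silently change the command number and break old
binaries, so the extended structure has to get its own command.
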
> 
>>
>> If the policy is "timestamp" the GC would of course skip this scan,
>> because it is unnecessary.
>>
>>>  3. The ratio of the threshold "max_segblks" is hard coded to 50%
>>>     of blocks_per_segment.  It is not clear if the ratio is good
>>>     (versatile).
>>
>> The interval and percentage could be set in /etc/nilfs_cleanerd.conf.
>>
>> I chose 50% kind of arbitrarily. My intent was to encourage the GC to
>> check the segment again in the future. I guess anything between 25% and
>> 75% would also work.
> 
> Sounds reasonable.
> 
> By the way, I am thinking we should move cleanerd into the kernel as
> soon as we can.  It is not only inefficient due to the large amount of
> data exchanged between kernel and userland, but it is also hindering
> changes like the ones we are attempting.  We have to care about
> compatibility unnecessarily due to the early design mistake (i.e. the
> separation of gc to userland).

I am a bit confused. Is it OK if I implement this functionality in
nilfs_cleanerd for this patch set, or would it be better to implement it
with a workqueue in the kernel, like you've suggested before?

If you intend to move nilfs_cleanerd into the kernel anyway, then the
latter would make more sense to me. Which implementation do you prefer
for this patch set?

Regards,
Andreas Rohner

> Regards,
> Ryusuke Konishi
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

