From: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
To: Vyacheslav Dubeyko <slava-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>,
    Ryusuke Konishi <konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: Clemens Eisserer <linuxhippy-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
    linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [static superblock discussion] Does nilfs2 do any in-place writes?
Date: Wed, 29 Jan 2014 13:44:23 +0100
Message-ID: <52E8F7A7.8010505@gmx.net>
In-Reply-To: <1390901114.2942.11.camel-dzAnj6fV1RxGeWtTaGDT1UEK6ufn8VP3@public.gmane.org>
On 2014-01-28 10:25, Vyacheslav Dubeyko wrote:
> Hi Ryusuke,
>
> This is my improved vision of a possible approach to replacing the
> in-place update of the superblock with a COW policy. I believe the
> current description includes everything we discussed previously, and
> we can continue to deepen this discussion.
>
> The approach requires two areas, one at the beginning and one at the
> end of a NILFS2 volume. Each area should have a capacity equal to the
> segment size. The goal of these two areas is to provide an
> FTL-friendly way of storing information about the latest log and the
> modified superblock fields by means of a COW (Copy-On-Write) policy.
>
> The primary superblock area is located at the beginning of a NILFS2
> volume. It begins with the static superblock created by the mkfs tool
> during NILFS2 volume creation. This superblock (the primary
> superblock) is located 1024 bytes from the beginning of the volume
> (where it is placed currently). The primary superblock is left
> untouched while the primary superblock area is being filled with
> modified information. The initial state of the superblock can be
> rewritten when the next iteration of filling the primary superblock
> area begins (because this area works like a circular buffer).
>
> ------------------------------------------------------------
> | Primary superblock |           Modifiable area           |
> ------------------------------------------------------------
> |<------ 4 KB ------>|
> |<-------------------- segment size ---------------------->|
>
> The secondary (or backup) superblock area is located on the opposite
> side of the volume (at the volume's end). This area begins with a
> modifiable area and ends with the secondary superblock (where it is
> located currently). The modifiable area of the secondary superblock
> area works like the modifiable area of the primary superblock area.
>
> ------------------------------------------------------------
> |          Modifiable area          | Secondary superblock |
> ------------------------------------------------------------
>                                     |<------- 4 KB ------->|
> |<-------------------- segment size ---------------------->|
>
> The primary and secondary superblock areas are meant to keep copies
> of super roots, and these areas are searched first when looking for
> the latest log. They should keep both the super root and the physical
> block number of that super root's placement. Moreover, the primary
> and secondary superblock areas are updated with different frequencies.
> The secondary superblock area is updated on every umount, or once
> every several hours (if the system uptime is significant). The primary
> superblock area is updated more frequently. The frequency of the
> primary superblock area's updates can be based on a timeout or on the
> count of constructed segments. In any case, it makes sense to take
> into account only full segments rather than partial segments. Maybe
> it makes sense to keep a more complex combination in the modifiable
> area: super root + diff to superblock state + physical block of the
> super root's placement.
>
> The modifiable area should have a special filling policy. This policy
> doesn't contradict the COW policy, but it is not implemented in a
> sequential manner. Namely, the modifiable area should be divided into
> several groups (the count of groups can be a configurable option).
> Moreover, the primary and secondary superblock areas may have
> different group counts. Thereby, every group will contain some
> number of blocks.
>
> -------------------------------------------------------------
> |  Group1   |  Group2   |  Group3   |   ****    |  GroupN   |
> -------------------------------------------------------------
> |<-------------------- Modifiable area -------------------->|
>
> Saved blocks are distributed between groups by the policy that, on
> every iteration, each next block is saved in the next group. Once all
> groups in the modifiable area hold an equal count of saved blocks,
> the next iteration begins, starting from the first group.
>
> FIRST ITERATION [A phase]:
>
> (1) first block
> -------------------------------------------------------------
> |A1|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
> -------------------------------------------------------------
> |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->|
>
> (2) second block
> -------------------------------------------------------------
> |A1|  |  |  |  |A2|  |  |  |  |  |  |  |  |  |  |  |  |  |  |
> -------------------------------------------------------------
> |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->|
>
> (N) Nth block
> -------------------------------------------------------------
> |A1|  |  |  |  |A2|  |  |  |  |A3|  |  |  |  |An|  |  |  |  |
> -------------------------------------------------------------
> |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->|
>
> SECOND ITERATION [B phase]:
>
> -------------------------------------------------------------
> |A1|B1|  |  |  |A2|B2|  |  |  |A3|B3|  |  |  |An|Bn|  |  |  |
> -------------------------------------------------------------
> |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->|
>
> Nth ITERATION [E phase]:
>
> -------------------------------------------------------------
> |A1|B1|C1|D1|E1|A2|B2|C2|D2|E2|A3|B3|C3|D3|E3|An|Bn|Cn|Dn|En|
> -------------------------------------------------------------
> |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->|
>
> Finally, when the modifiable area is completely filled, it is
> possible to discard the area's content and to begin the filling
> iterations again. We will have two modifiable areas that are filled
> with different frequencies and with some degree of replication of
> information. This provides a basis for safe and independent
> discarding of the modifiable areas.
>
> The special filling policy is meant to provide a basis for efficient
> search. Namely, the blocks in the first group differ from each other
> by some period. Blocks are saved in the sequence
> [A1,A2,A3,..,An], [B1,B2,B3,..,Bn], ..., [E1,E2,E3,..,En], but the
> first group will contain (A1,B1,C1,D1,E1). Thereby, passing
> item-by-item through the first group means jumping with some period.
> Moreover, in the case of some failure it is possible to start the
> search from any group (with decreasing search efficiency). The magic
> signature, header checksum and timestamps need to be taken into
> account when comparing items in a group. This makes it possible to
> distinguish valid blocks from empty and invalid ones, and older
> blocks from the latest ones.
>
> Searching in a dedicated area makes it possible to use a read-ahead
> technique. Moreover, if a group contains many items, it is possible
> to increase the step between the current and next items during the
> search. For example, it is possible to use a sequence of steps such
> as 0, 1, 3, 5, 7, and so on. If we have found the latest item in the
> first group, for example, then it is possible to find the latest item
> in the whole sequence by jumping by the group period (the count of
> blocks in a group).
>
> The two modifiable areas are filled with different frequencies, which
> makes it possible to use a special search algorithm. Such an algorithm
> can use, for example, the secondary superblock area for a rough,
> preliminary search (because this modifiable area changes rarely).
> The algorithm can then continue the search in the primary superblock
> area (because this modifiable area changes more frequently).
> Moreover, the segctor thread knows about all dirty files and can
> predict, theoretically, how many segments will be constructed.
> Thereby, it is possible to save such a prediction in the items of the
> modifiable area's groups as a hint that can be used during the search
> to improve the efficiency of the search algorithm.
>
> With the best regards,
> Vyacheslav Dubeyko.

I hope I understand your approach correctly. Can it be summarized as
follows: instead of overwriting the super block, you want to reserve the
first segment and write the super block into groups in a round-robin
fashion, thereby spreading the writes over a larger area? Then the
groups should probably have a typical erase block size, like 512k. If
that is true, I don't think you need any special algorithm to search for
the latest super block. You just read in the whole segment at mount time
and select the copy with the biggest s_last_cno.
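
A minimal sketch of the round-robin placement and the mount-time scan
described above, in Python for brevity. Only s_last_cno comes from the
discussion; SbCopy, slot_for_write and latest_copy are hypothetical
names, the `valid` flag merely stands in for the magic/checksum check,
and wrapping around by plain overwrite is an assumption (the proposal
discards the area before refilling it).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SbCopy:
    """Hypothetical in-memory view of one saved super block copy."""
    s_last_cno: int      # last checkpoint number recorded in this copy
    valid: bool = True   # stand-in for the magic signature / checksum check


def slot_for_write(k: int, ngroups: int, group_size: int) -> int:
    """Block index inside the modifiable area for the k-th write (0-based)."""
    k %= ngroups * group_size   # assumption: start over once the area is full
    group = k % ngroups         # each write goes to the next group
    offset = k // ngroups       # each full pass moves one slot deeper
    return group * group_size + offset


def latest_copy(area: List[Optional[SbCopy]]) -> Optional[SbCopy]:
    """Scan the whole area and return the valid copy with the largest s_last_cno."""
    candidates = [c for c in area if c is not None and c.valid]
    return max(candidates, key=lambda c: c.s_last_cno, default=None)
```

With 4 groups of 5 blocks, writes 0, 1, 2, 3 land in blocks 0, 5, 10, 15
(A1, A2, A3, An in the diagrams) and write 4 lands in block 1 (B1). The
scan is linear in the number of blocks, which should be acceptable if
the area is a single segment that is read in one go.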

What about Ryusuke's suggestion of never updating the super block and
instead using a clever segment allocation scheme that allows a binary
search for the latest segment?
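
The allocation scheme itself is not spelled out in this mail, so the
following is only a sketch under stated assumptions, not the scheme
Ryusuke proposed: segments are written strictly in circular order
starting at segment 0, every written segment carries a strictly
increasing sequence number, and an unwritten segment reads as 0. The
read_seq callback and find_latest_segment are hypothetical names.

```python
from typing import Callable, Optional


def find_latest_segment(nsegments: int,
                        read_seq: Callable[[int], int]) -> Optional[int]:
    """Return the index of the most recently written segment, or None.

    Under the assumptions above, the segments whose sequence number is
    >= the one in segment 0 form a prefix of the volume, and the latest
    segment is the last element of that prefix.
    """
    first = read_seq(0)
    if first == 0:
        return None                   # nothing has been written yet
    lo, hi = 0, nsegments - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2      # round up so the loop always makes progress
        if read_seq(mid) >= first:
            lo = mid                  # mid belongs to the current pass
        else:
            hi = mid - 1              # mid is older than segment 0, or unwritten
    return lo
```

This needs about log2(nsegments) segment header reads at mount time
instead of a dedicated super block area, but it only works as long as
the allocator never breaks the circular write order.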
br,
Andreas Rohner