From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <pabeni@redhat.com>
Cc: <davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
<kuni1840@gmail.com>, <kuniyu@amazon.com>,
<netdev@vger.kernel.org>
Subject: Re: [PATCH v3 net-next 05/14] af_unix: Detect Strongly Connected Components.
Date: Tue, 27 Feb 2024 18:49:12 -0800
Message-ID: <20240228024912.30244-1-kuniyu@amazon.com>
In-Reply-To: <8880a6a22b774b25db9c4a2bc95487521170de20.camel@redhat.com>
From: Paolo Abeni <pabeni@redhat.com>
Date: Tue, 27 Feb 2024 12:02:27 +0100
> On Fri, 2024-02-23 at 13:39 -0800, Kuniyuki Iwashima wrote:
> > In the new GC, we use a simple graph algorithm, Tarjan's Strongly
> > Connected Components (SCC) algorithm, to find cyclic references.
> >
> > The algorithm visits every vertex exactly once using depth-first
> > search (DFS). We implement it without recursion so that no one
> > can abuse it.
> >
> > There could be multiple graphs, so we iterate unix_unvisited_vertices
> > in unix_walk_scc() and do DFS in __unix_walk_scc(), where we move
> > visited vertices to another list, unix_visited_vertices, not to
> > restart DFS twice on a visited vertex later in unix_walk_scc().
> >
> > DFS starts by pushing an input vertex to a stack and assigning it
> > a unique number. Two fields, index and lowlink, are initialised
> > with the number, but lowlink could be updated later during DFS.
> >
> > If a vertex has an edge to an unvisited inflight vertex, we visit
> > it and do the same processing. So, we will have vertices in the
> > stack in the order they appear and number them consecutively in
> > the same order.
> >
> > If a vertex has a back-edge to a visited vertex in the stack,
> > we update the predecessor's lowlink with the successor's index.
> >
> > After iterating edges from the vertex, we check if its index
> > equals its lowlink.
> >
> > If the lowlink is different from the index, it shows there was a
> > back-edge. Then, we propagate the lowlink to its predecessor and
> > go back to the predecessor to resume checking from the next edge
> > of the back-edge.
> >
> > If the lowlink is the same as the index, we pop vertices before
> > and including the vertex from the stack. Then, the set of vertices
> > is SCC, possibly forming a cycle. At the same time, we move the
> > vertices to unix_visited_vertices.
> >
> > When we finish the algorithm, all vertices in each SCC will be
> > linked via unix_vertex.scc_entry.
> >
> > Let's take an example. We have a graph including five inflight
> > vertices (F is not inflight):
> >
> >   A -> B -> C -> D -> E (-> F)
> >        ^         |
> >        `---------'
> >
> > Suppose that we start DFS from C. We will visit C, D, and B first
> > and initialise their index and lowlink. Then, the stack looks like
> > this:
> >
> >   > B = (3, 3)  (index, lowlink)
> >     D = (2, 2)
> >     C = (1, 1)
> >
> > When checking B's edge to C, we update B's lowlink with C's index
> > and propagate it to D.
> >
> >     B = (3, 1)  (index, lowlink)
> >   > D = (2, 1)
> >     C = (1, 1)
> >
> > Next, we visit E, which has no edge to an inflight vertex.
> >
> >   > E = (4, 4)  (index, lowlink)
> >     B = (3, 1)
> >     D = (2, 1)
> >     C = (1, 1)
> >
> > When we leave from E, its index and lowlink are the same, so we
> > pop E from the stack as single-vertex SCC. Next, we leave from
> > D but do nothing because its lowlink is different from its index.
> >
> >     B = (3, 1)  (index, lowlink)
> >     D = (2, 1)
> >   > C = (1, 1)
> >
> > Then, we leave from C, whose index and lowlink are the same, so
> > we pop B, D and C as SCC.
> >
> > Last, we do DFS for the rest of vertices, A, which is also a
> > single-vertex SCC.
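
The walkthrough above can be reproduced in user space. The following is a minimal sketch, not the kernel code: it assumes a fixed six-vertex graph stored in plain arrays instead of unix_vertex/unix_edge lists, and the names walk_scc(), walk_all(), idx[], scc_id[] and next_index are hypothetical. It keeps the two properties the commit message relies on: no recursion (the path[] array plays the role of the edge stack) and the index/lowlink rules described above. Unlike the patch, which restarts numbering in each __unix_walk_scc() call, the sketch uses one running counter for simplicity.

```c
#include <assert.h>

#define NV 6		/* A..F; F is not inflight */
#define UNVISITED 0	/* plays the role of UNIX_VERTEX_INDEX_UNVISITED */

enum { A, B, C, D, E, F };

/* Adjacency lists of the example graph, each terminated by -1. */
static const int edges[NV][3] = {
	[A] = { B, -1 },
	[B] = { C, -1 },
	[C] = { D, -1 },
	[D] = { B, E, -1 },
	[E] = { F, -1 },
	[F] = { -1 },
};
static const int inflight[NV] = { 1, 1, 1, 1, 1, 0 };

static unsigned long idx[NV], lowlink[NV], next_index = 1;
static int on_stack[NV], scc_id[NV], nr_scc;

/* Iterative Tarjan walk starting from an unvisited vertex. */
static void walk_scc(int root)
{
	int vstack[NV], vtop = 0;		/* vertex stack */
	int path[NV], next_edge[NV], depth = 0;	/* explicit DFS path */

	idx[root] = lowlink[root] = next_index++;
	on_stack[root] = 1;
	vstack[vtop++] = root;
	path[0] = root;
	next_edge[0] = 0;

	while (depth >= 0) {
		int v = path[depth];
		int w = edges[v][next_edge[depth]];

		if (w >= 0) {
			next_edge[depth]++;
			if (!inflight[w])
				continue;	/* like a NULL successor vertex */
			if (idx[w] == UNVISITED) {
				/* Visit w; v resumes from its next edge later. */
				assert(depth + 1 < NV);
				idx[w] = lowlink[w] = next_index++;
				on_stack[w] = 1;
				vstack[vtop++] = w;
				path[++depth] = w;
				next_edge[depth] = 0;
			} else if (on_stack[w] && idx[w] < lowlink[v]) {
				lowlink[v] = idx[w];	/* back-edge */
			}
			continue;
		}

		/* All edges of v are done. */
		if (lowlink[v] == idx[v]) {
			int u;

			do {	/* pop the SCC rooted at v */
				u = vstack[--vtop];
				on_stack[u] = 0;
				scc_id[u] = nr_scc;
			} while (u != v);
			nr_scc++;
		}

		/* Go back to the predecessor and propagate lowlink. */
		depth--;
		if (depth >= 0 && lowlink[v] < lowlink[path[depth]])
			lowlink[path[depth]] = lowlink[v];
	}
}

/* Start from `first`, then walk whatever is still unvisited. */
static void walk_all(int first)
{
	walk_scc(first);
	for (int v = 0; v < NV; v++)
		if (inflight[v] && idx[v] == UNVISITED)
			walk_scc(v);
}
```

Calling walk_all(C) numbers C, D, B and E as 1, 2, 3 and 4, leaves B and D with lowlink 1, and puts B, C and D into one SCC while E and A each end up in their own.
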
> >
> > Finally, each unix_vertex.scc_entry is linked as follows:
> >
> >   A -.  B -> C -> D  E -.
> >   ^  |  ^         |  ^  |
> >   `--'  `---------'  `--'
> >
> > We use SCC later to decide whether we can garbage-collect the
> > sockets.
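
The resulting linkage can be pictured as one ring per SCC. The sketch below is only an illustration under assumptions: struct vertex, scc_next, link_scc() and scc_size() are hypothetical names, and a single next pointer stands in for the struct list_head scc_entry used by the patch.

```c
#include <assert.h>
#include <stddef.h>

struct vertex {
	char name;
	struct vertex *scc_next;	/* hypothetical stand-in for scc_entry */
};

/* Link the n members of one SCC into a ring, in array order. */
static void link_scc(struct vertex **members, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		members[i]->scc_next = members[(i + 1) % n];
}

/* Count the vertices reachable from v by following the ring. */
static size_t scc_size(const struct vertex *v)
{
	size_t n = 1;

	for (const struct vertex *u = v->scc_next; u != v; u = u->scc_next)
		n++;
	return n;
}
```

With B, C and D linked by one link_scc() call and A linked alone, scc_size() returns 3 from any of B, C or D and 1 from A, which matches the diagram above.
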
> >
> > Note that we still cannot detect SCC properly if an edge points
> > to an embryo socket. The following two patches will sort it out.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  include/net/af_unix.h |  5 +++
> >  net/unix/garbage.c    | 82 +++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 87 insertions(+)
> >
> > diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> > index f31ad1166346..67736767b616 100644
> > --- a/include/net/af_unix.h
> > +++ b/include/net/af_unix.h
> > @@ -32,13 +32,18 @@ void wait_for_unix_gc(struct scm_fp_list *fpl);
> >  struct unix_vertex {
> >  	struct list_head edges;
> >  	struct list_head entry;
> > +	struct list_head scc_entry;
> >  	unsigned long out_degree;
> > +	unsigned long index;
> > +	unsigned long lowlink;
> > +	bool on_stack;
> >  };
> >
> >  struct unix_edge {
> >  	struct unix_sock *predecessor;
> >  	struct unix_sock *successor;
> >  	struct list_head vertex_entry;
> > +	struct list_head stack_entry;
> >  };
> >
> > struct sock *unix_peer_get(struct sock *sk);
> > diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> > index e8fe08796d02..7e90663513f9 100644
> > --- a/net/unix/garbage.c
> > +++ b/net/unix/garbage.c
> > @@ -103,6 +103,11 @@ struct unix_sock *unix_get_socket(struct file *filp)
> >
> >  static LIST_HEAD(unix_unvisited_vertices);
> >
> > +enum unix_vertex_index {
> > +	UNIX_VERTEX_INDEX_UNVISITED,
> > +	UNIX_VERTEX_INDEX_START,
> > +};
> > +
> >  static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge)
> >  {
> >  	struct unix_vertex *vertex = edge->predecessor->vertex;
> > @@ -245,6 +250,81 @@ void unix_destroy_fpl(struct scm_fp_list *fpl)
> >  	unix_free_vertices(fpl);
> >  }
> >
> > +static LIST_HEAD(unix_visited_vertices);
> > +
> > +static void __unix_walk_scc(struct unix_vertex *vertex)
> > +{
> > +	unsigned long index = UNIX_VERTEX_INDEX_START;
> > +	LIST_HEAD(vertex_stack);
> > +	struct unix_edge *edge;
> > +	LIST_HEAD(edge_stack);
> > +
> > +next_vertex:
> > +	vertex->index = index;
> > +	vertex->lowlink = index;
> > +	index++;
> > +
> > +	vertex->on_stack = true;
> > +	list_add(&vertex->scc_entry, &vertex_stack);
> > +
> > +	list_for_each_entry(edge, &vertex->edges, vertex_entry) {
> > +		struct unix_vertex *next_vertex = edge->successor->vertex;
> > +
> > +		if (!next_vertex)
> > +			continue;
> > +
> > +		if (next_vertex->index == UNIX_VERTEX_INDEX_UNVISITED) {
> > +			list_add(&edge->stack_entry, &edge_stack);
> > +
> > +			vertex = next_vertex;
> > +			goto next_vertex;
> > +prev_vertex:
> > +			next_vertex = vertex;
> > +
> > +			edge = list_first_entry(&edge_stack, typeof(*edge), stack_entry);
> > +			list_del_init(&edge->stack_entry);
> > +
> > +			vertex = edge->predecessor->vertex;
> > +
> > +			vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink);
> > +		} else if (edge->successor->vertex->on_stack) {
>
> It looks like you can replace 'edge->successor->vertex' with 'next_vertex'
> and that would be more readable.

Good catch, will update in v4.

>
> IMHO more importantly: this code is fairly non-trivial, I think that a
> significant amount of comments would help the review and the long-term
> maintenance - even if everything is crystal clear and obvious to you,
> restating the obvious in a comment would help me ;)

Thanks for your review, and sorry for bothering you. Yeah, I understand
that, but it was hard to comment on how the graph algorithm works without
examples.

Actually, I drew dozens of diagrams on an iPad with many patterns to ensure
that the code works, so I tried to fill the gap with the long commit
message (and incremental changes for later optimisations).

I'll try to split this commit into a DFS part and a Tarjan part to make
review a bit easier, and add more useful comments.

Thank you!