From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Jan Engelhardt <jengelh@medozas.de>,
Netfilter Developer Mailing List
<netfilter-devel@vger.kernel.org>
Subject: Re: Xtables2 Netlink spec
Date: Mon, 29 Nov 2010 14:49:13 +0100
Message-ID: <4CF3AF59.7050706@netfilter.org>
In-Reply-To: <alpine.DEB.2.00.1011291336400.27282@blackhole.kfki.hu>

On 29/11/10 13:39, Jozsef Kadlecsik wrote:
> On Mon, 29 Nov 2010, Pablo Neira Ayuso wrote:
>
>> On 27/11/10 21:42, Jozsef Kadlecsik wrote:
>>> On Sat, 27 Nov 2010, Jan Engelhardt wrote:
>>>
>>>> On Saturday 2010-11-27 18:04, Jozsef Kadlecsik wrote:
>>>>>
>>>>> AFAIK when the kernel dumps and the skb is full, it's not returned
>>>>> directly to the userspace but first enqueued.
>>>>
>>>> I don't recognize that inside the code however.
>>>>
>>>> In netlink_dump(), there is the cb->dump call. There are no loops
>>>> inside this function. Neither are there in the two parents,
>>>> netlink_dump_start() and netlink_recvmsg().
>>>
>>> In netlink_dump() after the call to cb->dump, you can see the call to
>>> skb_queue_tail. So the message is queued.
>>>
>>> Where the looping happens, I do not know. Some socket magic?
>>
>> 1) you send a NLM_F_DUMP request.
>> 2) the kernel fills one skb and enqueues it into the socket buffer.
>> 3) the process invokes recvmsg(), gets the datagram, then goes back to step
>> 2).
>>
>> Thus, the dump only consumes 1 memory page per recv() invocation. That's the
>> magic.
>
> So Jan has got it right: if the process which initiated the dumping is
> suspended and locking is used, then the suspended process locks out all
> other processes.

We could also use an optimistic locking approach:

* We assume that there is an ID for every table.
* That ID is incremented whenever the rule-set of that table is
  modified.
* That ID has to be included as an attribute in the dump messages.
* If the ID changes in the middle of a dump, you restart the dump of
  that table from the beginning.
* Once you start receiving information from a different table, you can
  consider the previous table fully dumped. For the last table, you can
  take NLMSG_DONE as the end marker.

The user-space application has to keep the entries in a list until the
table has been fully dumped. If it notices that the ID has increased, it
releases the previous entries and collects the new ones.

This means that a netlink-based iptables-save does not write entries to
disk right away. Instead, it keeps the rules for a table in the list
until the dump of that table has finished, and only then writes them to
disk (so we can be sure there are no duplicate entries).
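
To make the user-space side concrete, here is a rough sketch in Python
rather than real netlink code. Everything in it is an assumption made
for illustration: the Entry fields, the name gen_id for the per-table
ID, and the idea that the restarted entries arrive on the same message
stream. The end of the iterable stands in for the end-of-dump message.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    table: str   # table this rule belongs to
    gen_id: int  # hypothetical per-table ID, bumped on every modification
    rule: str


def consistent_tables(stream: Iterable[Entry]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (table, rules) only once a table has been dumped completely."""
    table: Optional[str] = None
    gen_id: Optional[int] = None
    pending: List[str] = []

    for entry in stream:
        if entry.table != table:
            # A different table started, so the previous one is complete.
            if table is not None:
                yield table, pending
            table, gen_id, pending = entry.table, entry.gen_id, []
        elif entry.gen_id != gen_id:
            # Same table, new ID: the rule-set changed mid-dump, so the
            # entries buffered so far are stale and get dropped.
            gen_id, pending = entry.gen_id, []
        pending.append(entry.rule)

    # End of stream: the last table is complete as well.
    if table is not None:
        yield table, pending


stream = [
    Entry("filter", 1, "r1"),
    Entry("filter", 1, "r2"),
    Entry("filter", 2, "r1"),   # ID changed: dump of "filter" restarts
    Entry("filter", 2, "r2b"),
    Entry("filter", 2, "r3"),
    Entry("nat", 7, "n1"),
]
result = list(consistent_tables(stream))
```

A saver would write each yielded table to disk as it comes out, so
nothing from an abandoned attempt ever reaches the file.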

Optimistic approaches have one problem: if the rule-set is modified very
often during the dump, the dump may keep restarting indefinitely.
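
The restart problem can be shown with a small simulation. This is only
a sketch under assumptions: attempt() is a hypothetical callable that
performs one complete dump attempt and reports the ID seen at the start
and at the end, and the max_attempts cap is my own addition so that the
example terminates; nothing like it has been proposed in this thread.

```python
from typing import Callable, List, Optional, Tuple

# One full dump attempt: (ID seen at start, ID seen at end, rules read).
Attempt = Tuple[int, int, List[str]]


def dump_until_stable(attempt: Callable[[], Attempt],
                      max_attempts: int) -> Optional[List[str]]:
    """Retry the dump until no modification raced with it, or give up."""
    for _ in range(max_attempts):
        start_id, end_id, rules = attempt()
        if start_id == end_id:
            return rules  # the ID did not move, so the snapshot is consistent
    return None  # the rule-set kept changing on every attempt


def make_busy_table(changes: int) -> Callable[[], Attempt]:
    """Simulated table that is modified during the first `changes` attempts."""
    state = {"gen_id": 0, "left": changes}

    def attempt() -> Attempt:
        start = state["gen_id"]
        if state["left"] > 0:
            state["left"] -= 1
            state["gen_id"] += 1  # a writer slipped in mid-dump
        return start, state["gen_id"], ["rule-a", "rule-b"]

    return attempt


quiet = dump_until_stable(make_busy_table(3), max_attempts=10)
busy = dump_until_stable(make_busy_table(100), max_attempts=5)
```

With three racing changes the fourth attempt succeeds; with a writer
that never lets up, the reader runs out of attempts without ever seeing
a stable ID.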