From: Pablo Neira Ayuso <pablo@netfilter.org>
To: netfilter-devel@vger.kernel.org
Subject: [PATCH RFC 0/3] nf_tables string match support
Date: Fri, 29 Jul 2022 11:31:26 +0200
Message-ID: <20220729093129.3108-1-pablo@netfilter.org>
Hi,
The following patchset contains nf_tables string match support. This new
infrastructure is based on the Aho-Corasick pattern matching algorithm,
which allows for linear-time search of a dictionary (a "string set" composed
of patterns). The implementation is lockless: updates are performed on a
cloned copy of the Aho-Corasick "tree", and the existing 2-phase commit
protocol is used to atomically expose the new tree to the packet path.
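As a rough userspace illustration of the clone-then-commit idea described
above (not the code in this series), the sketch below uses C11 atomics and a
flat pattern list as a stand-in for the Aho-Corasick tree. All names
(struct dict, dict_prepare_add(), dict_commit(), dict_lookup()) and the
8-pattern cap are hypothetical. The sketch also assumes the caller frees the
old copy only once no reader can still see it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Aho-Corasick tree: a flat pattern list. */
struct dict {
	size_t count;
	char pat[8][128];
};

/* The copy that readers (the "packet path") currently see. */
static _Atomic(struct dict *) live_dict;

/* Prepare phase: clone the live dictionary and modify only the clone. */
static struct dict *dict_prepare_add(const char *pattern)
{
	struct dict *cur = atomic_load_explicit(&live_dict, memory_order_acquire);
	struct dict *clone = calloc(1, sizeof(*clone));

	if (!clone)
		return NULL;
	if (cur)
		*clone = *cur;
	if (clone->count == 8 || strlen(pattern) >= 128) {
		free(clone);	/* abort: readers never saw the clone */
		return NULL;
	}
	strcpy(clone->pat[clone->count++], pattern);
	return clone;
}

/*
 * Commit phase: publish the clone with a single atomic exchange.
 * Returns the previous copy, which the caller must release once no
 * reader can still be using it (assumption of this sketch).
 */
static struct dict *dict_commit(struct dict *clone)
{
	return atomic_exchange_explicit(&live_dict, clone, memory_order_acq_rel);
}

/* Reader side: takes no lock, only loads the current pointer. */
static int dict_lookup(const char *text)
{
	const struct dict *d = atomic_load_explicit(&live_dict, memory_order_acquire);

	if (!d)
		return 0;
	for (size_t i = 0; i < d->count; i++)
		if (strstr(text, d->pat[i]))
			return 1;
	return 0;
}
```

A reader that calls dict_lookup() between dict_prepare_add() and
dict_commit() still sees the old dictionary; the new pattern becomes visible
only after the commit.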
I decided to add a new netlink API for "string sets" rather than reusing the
existing set API, for simplicity. There is a new Kconfig knob, CONFIG_NFT_STRING,
which builds support into nf_tables to avoid an indirection between
nf_tables_api and nft_string, given that the Aho-Corasick API (see the ac_*()
functions) is invoked from the nf_tables netlink frontend.
The implementation of Aho-Corasick comes as a separate file. It is
relatively small (~600 LoC), and a dictionary of 370105 English words
consumes ~150 Mbytes. Maximum string size at this stage is 128 bytes.
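For readers unfamiliar with the algorithm, here is a compact userspace
sketch of a textbook Aho-Corasick automaton. It is not the ahocorasick.c
from this series: the names (struct ac_sketch, ac_sketch_*()), the dense
256-way transition table and the AC_MAX_NODES cap are assumptions chosen to
keep the example short.

```c
#include <assert.h>
#include <string.h>

#define AC_MAX_NODES 256	/* hypothetical cap for this sketch */
#define AC_ALPHABET  256

struct ac_sketch {
	int next[AC_MAX_NODES][AC_ALPHABET];	/* transitions, -1 = none yet */
	int fail[AC_MAX_NODES];			/* failure links */
	int match[AC_MAX_NODES];		/* a pattern ends in this state */
	int nodes;
};

static void ac_sketch_init(struct ac_sketch *ac)
{
	memset(ac->next, -1, sizeof(ac->next));	/* every entry becomes -1 */
	memset(ac->fail, 0, sizeof(ac->fail));
	memset(ac->match, 0, sizeof(ac->match));
	ac->nodes = 1;				/* state 0 is the root */
}

/* Insert one pattern into the trie; call before ac_sketch_build(). */
static void ac_sketch_add(struct ac_sketch *ac, const char *pat)
{
	int s = 0;

	for (; *pat; pat++) {
		unsigned char c = (unsigned char)*pat;

		if (ac->next[s][c] < 0) {
			assert(ac->nodes < AC_MAX_NODES);
			ac->next[s][c] = ac->nodes++;
		}
		s = ac->next[s][c];
	}
	ac->match[s] = 1;
}

/* Breadth-first pass: compute failure links and fill missing transitions. */
static void ac_sketch_build(struct ac_sketch *ac)
{
	int queue[AC_MAX_NODES], head = 0, tail = 0;

	for (int c = 0; c < AC_ALPHABET; c++) {
		int t = ac->next[0][c];

		if (t < 0)
			ac->next[0][c] = 0;	/* stay at the root */
		else
			queue[tail++] = t;	/* fail[t] is already 0 */
	}
	while (head < tail) {
		int s = queue[head++];

		/* Inherit matches from the longest proper suffix state. */
		ac->match[s] |= ac->match[ac->fail[s]];
		for (int c = 0; c < AC_ALPHABET; c++) {
			int t = ac->next[s][c];

			if (t < 0) {
				ac->next[s][c] = ac->next[ac->fail[s]][c];
			} else {
				ac->fail[t] = ac->next[ac->fail[s]][c];
				queue[tail++] = t;
			}
		}
	}
}

/* One table lookup per input byte; returns 1 on the first match. */
static int ac_sketch_search(const struct ac_sketch *ac,
			    const char *text, size_t len)
{
	int s = 0;

	for (size_t i = 0; i < len; i++) {
		s = ac->next[s][(unsigned char)text[i]];
		if (ac->match[s])
			return 1;
	}
	return 0;
}
```

With the patterns "he", "she", "his" and "hers" loaded, searching "ushers"
reports a match as soon as "she" has been consumed. The dense table trades
memory for speed, which is one reason a large dictionary can be expensive to
hold; the representation used in the actual patch may differ.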
The implementation has been validated from userspace via ASAN and
valgrind, with testsuites consisting of simple tests combined with
randomized tests that feed the dictionary with words and check autogenerated
text patched with a matching pattern at a random offset, to validate correct
matching. The userspace implementation (which is rather similar to the one
coming in this batch) and the testsuite are not posted in this batch.
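Since the testsuite is not posted, the following is only a guess at what
such a randomized check could look like. The helper names
(make_patched_text(), check_patched_match()) are hypothetical, and strstr()
stands in for the matcher under test.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Fill buf with len random lowercase letters, then overwrite a random
 * offset with the pattern. buf must hold len + 1 bytes. Returns the
 * offset at which the pattern was patched in.
 */
static size_t make_patched_text(char *buf, size_t len, const char *pattern)
{
	size_t plen = strlen(pattern);
	size_t off;

	assert(plen > 0 && plen <= len);
	for (size_t i = 0; i < len; i++)
		buf[i] = (char)('a' + rand() % 26);
	buf[len] = '\0';

	off = (size_t)rand() % (len - plen + 1);
	memcpy(buf + off, pattern, plen);
	return off;
}

/*
 * Stand-in check: the matcher (strstr() here) must find the pattern, and
 * the first hit cannot be later than the patched offset. The random noise
 * may contain the pattern earlier by chance, so only "<=" is asserted.
 */
static int check_patched_match(const char *text, const char *pattern,
			       size_t off)
{
	const char *hit = strstr(text, pattern);

	return hit != NULL && (size_t)(hit - text) <= off;
}
```

Running this in a loop with a fixed seed (for reproducibility) under ASAN or
valgrind would exercise both the matching result and the memory accesses of
whatever matcher replaces strstr().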
This algorithm is described in "Efficient string matching: An aid to
bibliographic search" by Alfred V. Aho and Margaret J. Corasick, published in
June 1975 in Communications of the ACM 18 (6): 333–340.
There are a few aspects I would like to revisit after this RFC, e.g. netlink
notifications are not yet supported.
Please see specific patch descriptions for implementation details.
Comments welcome.
P.S.: Patch 2 reports 200 deletions in nf_tables_api.c. For some reason
diff is removing 200 LoC and adding them again after the new netlink
string API; there are no real line removals, it is just noise.
Pablo Neira Ayuso (3):
  netfilter: add Aho-Corasick string match implementation
  netfilter: nf_tables: add string set API
  netfilter: nf_tables: add string expression

 include/net/netfilter/ahocorasick.h      |   27 +
 include/net/netfilter/nf_tables.h        |   37 +
 include/net/netfilter/nf_tables_core.h   |    1 +
 include/uapi/linux/netfilter/nf_tables.h |   65 ++
 net/netfilter/Kconfig                    |    7 +
 net/netfilter/Makefile                   |    3 +
 net/netfilter/ahocorasick.c              |  677 ++++++++++++
 net/netfilter/nf_tables_api.c            | 1287 ++++++++++++++++++----
 net/netfilter/nf_tables_core.c           |    1 +
 net/netfilter/nft_string.c               |  254 +++++
 10 files changed, 2158 insertions(+), 201 deletions(-)
 create mode 100644 include/net/netfilter/ahocorasick.h
 create mode 100644 net/netfilter/ahocorasick.c
 create mode 100644 net/netfilter/nft_string.c
--
2.30.2