From: Jeremy Sowden <jeremy@azazel.net>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Netfilter Devel <netfilter-devel@vger.kernel.org>
Subject: Re: [PATCH nf] lib/ts_bm: reset initial match offset for every block of text
Date: Mon, 19 Jun 2023 15:02:34 +0100
Message-ID: <20230619140234.GC82872@azazel.net>
In-Reply-To: <ZJBc2qInxGK7yY34@calendula>
On 2023-06-19, at 15:49:14 +0200, Pablo Neira Ayuso wrote:
> On Sun, Jun 11, 2023 at 09:17:19AM +0100, Jeremy Sowden wrote:
> > The `shift` variable which indicates the offset in the string at which
> > to start matching the pattern is initialized to `bm->patlen - 1`, but it
> > is not reset when a new block is retrieved. This means the implementation
> > may start looking at later and later positions in each successive
> > block and miss occurrences of the pattern at the beginning. E.g.,
> > consider an HTTP packet held in a non-linear skb, where the HTTP request
> > line occurs in the second block:
> >
> > [... 52 bytes of packet headers ...]
> > GET /bmtest HTTP/1.1\r\nHost: www.example.com\r\n\r\n
> >
> > and the pattern is "GET /bmtest".
> >
> > Once the first block comprising the packet headers has been examined,
> > `shift` will be pointing to somewhere near the end of the block, and so
> > when the second block is examined the request line at the beginning will
> > be missed.
> >
> > Reinitialize the variable for each new block.
> >
> > Adjust some indentation and remove some trailing white-space at the same
> > time.
> >
> > Fixes: 8082e4ed0a61 ("[LIB]: Boyer-Moore extension for textsearch infrastructure strike #2")
> > Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1390
> > Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
> > ---
> > lib/ts_bm.c | 16 +++++++++-------
> > 1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/ts_bm.c b/lib/ts_bm.c
> > index 1f2234221dd1..ef448490a2cc 100644
> > --- a/lib/ts_bm.c
> > +++ b/lib/ts_bm.c
> > @@ -60,23 +60,25 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state)
> > struct ts_bm *bm = ts_config_priv(conf);
> > unsigned int i, text_len, consumed = state->offset;
> > const u8 *text;
> > - int shift = bm->patlen - 1, bs;
> > + int bs;
> > const u8 icase = conf->flags & TS_IGNORECASE;
> >
> > for (;;) {
> > + int shift = bm->patlen - 1;
>
> This line is the fix, right?
Yup.
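
To illustrate why moving the initialization inside the loop matters, here
is a rough user-space model of the block loop. It is only a sketch, not
the kernel code: `struct block`, `find_in_blocks()` and the
`reset_per_block` switch are hypothetical names made up for this example,
and the skip tables are replaced by a plain one-byte advance. The pattern
is assumed to be non-empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one block returned by get_next_block(). */
struct block {
	const unsigned char *text;
	unsigned int len;
};

/*
 * Simplified model of the block loop in bm_find().  Returns the absolute
 * offset of the first match, or -1 if there is none.  Assumes a non-empty
 * pattern.  The real code advances `shift` using the bad-character and
 * good-suffix tables; this sketch advances by one byte, which is enough
 * to show how `shift` carries over between blocks.
 */
static long find_in_blocks(const struct block *blocks, size_t nblocks,
			   const char *pat, int reset_per_block)
{
	const unsigned int patlen = (unsigned int)strlen(pat);
	unsigned int consumed = 0;
	unsigned int shift = patlen - 1;	/* old behaviour: set once */
	size_t b;

	for (b = 0; b < nblocks; b++) {
		const unsigned char *text = blocks[b].text;
		const unsigned int text_len = blocks[b].len;
		unsigned int i;

		if (reset_per_block)
			shift = patlen - 1;	/* the fix: set per block */

		while (shift < text_len) {
			/* Compare the pattern backwards from text[shift]. */
			for (i = 0; i < patlen; i++)
				if (text[shift - i] !=
				    (unsigned char)pat[patlen - 1 - i])
					break;

			if (i == patlen)
				return (long)consumed + (shift - (patlen - 1));

			shift++;	/* simplification, see above */
		}

		consumed += text_len;
	}

	return -1;
}
```

With a 52-byte first block that does not contain the pattern, followed by
a second block holding the request line from the commit message, calling
this with reset_per_block == 0 leaves `shift` at 52 when the second block
arrives. That is already past the end of the 47-byte block, so the inner
loop is never entered and the match is missed. With reset_per_block != 0
the match is reported at offset 52.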
> > text_len = conf->get_next_block(consumed, &text, conf, state);
> >
> > if (unlikely(text_len == 0))
> > break;
> >
>
> These updates below are a clean-up, right? If so, I'd suggest splitting
> this into two patches.
Sure.
> > while (shift < text_len) {
> > - DEBUGP("Searching in position %d (%c)\n",
> > - shift, text[shift]);
> > - for (i = 0; i < bm->patlen; i++)
> > + DEBUGP("Searching in position %d (%c)\n",
> > + shift, text[shift]);
> > + for (i = 0; i < bm->patlen; i++)
> > if ((icase ? toupper(text[shift-i])
> > - : text[shift-i])
> > - != bm->pattern[bm->patlen-1-i])
> > - goto next;
> > + : text[shift-i])
> > + != bm->pattern[bm->patlen-1-i])
>
> Maybe disentangle this with a few helper functions?
>
> static char bm_get_char(const char *text, unsigned int pos, bool icase)
> {
> return icase ? toupper(text[pos]) : text[pos];
> }
Sure.
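
Something along these lines, perhaps. This is a user-space sketch only:
it uses `unsigned char` where the kernel code has `u8`, and
`bm_match_len()` is a hypothetical second helper that is not part of the
patch. It assumes the pattern has already been upper-cased when
case-insensitive matching is requested, and that shift >= patlen - 1.

```c
#include <ctype.h>
#include <stdbool.h>

/* Fetch one text byte, folded to upper case when icase is set. */
static unsigned char bm_get_char(const unsigned char *text, unsigned int pos,
				 bool icase)
{
	return icase ? (unsigned char)toupper(text[pos]) : text[pos];
}

/*
 * Hypothetical helper: compare the pattern backwards against the text
 * ending at text[shift] and return the number of bytes that matched.
 * A return value equal to patlen means a full match.  The caller must
 * ensure shift >= patlen - 1.
 */
static unsigned int bm_match_len(const unsigned char *text, unsigned int shift,
				 const unsigned char *pattern,
				 unsigned int patlen, bool icase)
{
	unsigned int i;

	for (i = 0; i < patlen; i++)
		if (bm_get_char(text, shift - i, icase) !=
		    pattern[patlen - 1 - i])
			break;

	return i;
}
```

The inner loop in bm_find() could then test the returned count against
bm->patlen and reuse it as the index into the skip tables, instead of
jumping to a label from inside the comparison.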
> Thanks
>
> > + goto next;
> >
> > /* London calling... */
> > DEBUGP("found!\n");
> > --
> > 2.39.2
> >
J.