netfilter-devel.vger.kernel.org archive mirror
* [PATCH nf] lib/ts_bm: reset initial match offset for every block of text
From: Jeremy Sowden @ 2023-06-11  8:17 UTC
  To: Netfilter Devel

The `shift` variable, which indicates the offset in the string at which
to start matching the pattern, is initialized to `bm->patlen - 1`, but it
is not reset when a new block is retrieved.  This means the implementation
may start looking at later and later positions in each successive block
and miss occurrences of the pattern at the beginning.  E.g., consider an
HTTP packet held in a non-linear skb, where the HTTP request line occurs
in the second block:

  [... 52 bytes of packet headers ...]
  GET /bmtest HTTP/1.1\r\nHost: www.example.com\r\n\r\n

and the pattern is "GET /bmtest".

Once the first block comprising the packet headers has been examined,
`shift` will be pointing at or beyond the end of that block, and so when
the second block is examined the request line at the beginning will be
missed.

Reinitialize the variable for each new block.
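
To make the failure concrete, here is a small user-space sketch (not the
kernel code).  It assumes a hypothetical `struct block` array in place of
`get_next_block()`, uses only Horspool-style bad-character shifts where
ts_bm also has a good-suffix table, and takes a hypothetical
`reset_per_block` flag to switch between the old and the patched handling
of `shift`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a non-linear skb: an array of text blocks. */
struct block {
	const unsigned char *data;
	size_t len;
};

/*
 * Simplified block-wise search.  Like ts_bm, it does not find matches
 * that span two blocks.  Returns the absolute offset of the first match,
 * or -1 if there is none.
 *
 * reset_per_block == false models the old behaviour (shift carried over
 * from block to block); true models the patched behaviour.
 */
static long search_blocks(const struct block *blocks, size_t nblocks,
			  const char *pat, bool reset_per_block)
{
	size_t patlen = strlen(pat);
	size_t bad_shift[256];
	size_t consumed = 0;
	size_t shift, b, i;

	assert(patlen > 0);	/* the sketch needs a non-empty pattern */
	shift = patlen - 1;

	/* Bad-character table: distance from the last occurrence to the end. */
	for (i = 0; i < 256; i++)
		bad_shift[i] = patlen;
	for (i = 0; i + 1 < patlen; i++)
		bad_shift[(unsigned char)pat[i]] = patlen - 1 - i;

	for (b = 0; b < nblocks; b++) {
		const unsigned char *text = blocks[b].data;
		size_t text_len = blocks[b].len;

		if (reset_per_block)
			shift = patlen - 1;

		while (shift < text_len) {
			/* Compare the window that ends at text[shift]. */
			if (memcmp(text + shift - (patlen - 1), pat, patlen) == 0)
				return (long)(consumed + shift - (patlen - 1));
			shift += bad_shift[text[shift]];
		}
		consumed += text_len;
	}
	return -1;
}
```

Called with a 52-byte zero-filled placeholder block followed by the
request line above and the pattern "GET /bmtest", the sketch returns -1
when `reset_per_block` is false and 52 when it is true.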

Adjust some indentation and remove some trailing white-space at the same
time.

Fixes: 8082e4ed0a61 ("[LIB]: Boyer-Moore extension for textsearch infrastructure strike #2")
Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1390
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
---
 lib/ts_bm.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/lib/ts_bm.c b/lib/ts_bm.c
index 1f2234221dd1..ef448490a2cc 100644
--- a/lib/ts_bm.c
+++ b/lib/ts_bm.c
@@ -60,23 +60,25 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state)
 	struct ts_bm *bm = ts_config_priv(conf);
 	unsigned int i, text_len, consumed = state->offset;
 	const u8 *text;
-	int shift = bm->patlen - 1, bs;
+	int bs;
 	const u8 icase = conf->flags & TS_IGNORECASE;
 
 	for (;;) {
+		int shift = bm->patlen - 1;
+
 		text_len = conf->get_next_block(consumed, &text, conf, state);
 
 		if (unlikely(text_len == 0))
 			break;
 
 		while (shift < text_len) {
-			DEBUGP("Searching in position %d (%c)\n", 
-				shift, text[shift]);
-			for (i = 0; i < bm->patlen; i++) 
+			DEBUGP("Searching in position %d (%c)\n",
+			       shift, text[shift]);
+			for (i = 0; i < bm->patlen; i++)
 				if ((icase ? toupper(text[shift-i])
-				    : text[shift-i])
-					!= bm->pattern[bm->patlen-1-i])
-				     goto next;
+				     : text[shift-i])
+				    != bm->pattern[bm->patlen-1-i])
+					goto next;
 
 			/* London calling... */
 			DEBUGP("found!\n");
-- 
2.39.2


* Re: [PATCH nf] lib/ts_bm: reset initial match offset for every block of text
From: Pablo Neira Ayuso @ 2023-06-19 13:49 UTC
  To: Jeremy Sowden; +Cc: Netfilter Devel

Hi Jeremy,

On Sun, Jun 11, 2023 at 09:17:19AM +0100, Jeremy Sowden wrote:
> diff --git a/lib/ts_bm.c b/lib/ts_bm.c
> index 1f2234221dd1..ef448490a2cc 100644
> --- a/lib/ts_bm.c
> +++ b/lib/ts_bm.c
> @@ -60,23 +60,25 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state)
>  	struct ts_bm *bm = ts_config_priv(conf);
>  	unsigned int i, text_len, consumed = state->offset;
>  	const u8 *text;
> -	int shift = bm->patlen - 1, bs;
> +	int bs;
>  	const u8 icase = conf->flags & TS_IGNORECASE;
>  
>  	for (;;) {
> +		int shift = bm->patlen - 1;

This line is the fix, right?

>  		text_len = conf->get_next_block(consumed, &text, conf, state);
>  
>  		if (unlikely(text_len == 0))
>  			break;
>

The updates below are a cleanup, right? If so, I'd suggest splitting this
into two patches.

>  		while (shift < text_len) {
> -			DEBUGP("Searching in position %d (%c)\n", 
> -				shift, text[shift]);
> -			for (i = 0; i < bm->patlen; i++) 
> +			DEBUGP("Searching in position %d (%c)\n",
> +			       shift, text[shift]);
> +			for (i = 0; i < bm->patlen; i++)
>  				if ((icase ? toupper(text[shift-i])
> -				    : text[shift-i])
> -					!= bm->pattern[bm->patlen-1-i])
> -				     goto next;
> +				     : text[shift-i])
> +				    != bm->pattern[bm->patlen-1-i])

Maybe disentangle this with a few helper functions?

static char bm_get_char(const char *text, unsigned int pos, bool icase)
{
        return icase ? toupper(text[pos]) : text[pos];
}
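
As a rough sketch of where such a helper could lead (user-space, not a
tested kernel change): `bm_matched_suffix()` is a hypothetical name, the
`u8` typedef stands in for the kernel type, and `bm_get_char()` is adapted
here to take `const u8 *` so that it matches `text` in bm_find().  The
pattern is assumed to be stored upper-cased when `icase` is set, as the
existing comparison implies.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t u8;	/* user-space stand-in for the kernel's u8 */

/* Fetch one text byte, upper-cased when matching case-insensitively. */
static u8 bm_get_char(const u8 *text, unsigned int pos, bool icase)
{
	return icase ? (u8)toupper(text[pos]) : text[pos];
}

/*
 * Hypothetical helper: compare the pattern right-to-left against the
 * window ending at text[shift].  Returns how many characters matched;
 * a return value equal to patlen means a full match.  The caller must
 * guarantee shift >= patlen - 1.
 */
static unsigned int bm_matched_suffix(const u8 *text, unsigned int shift,
				      const u8 *pattern, unsigned int patlen,
				      bool icase)
{
	unsigned int i;

	for (i = 0; i < patlen; i++)
		if (bm_get_char(text, shift - i, icase) !=
		    pattern[patlen - 1 - i])
			break;

	return i;
}
```

Returning the match count rather than a bool keeps the mismatch index
available to the caller, since the code after the `next:` label may still
need `i` to work out the next shift.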

Thanks

> +					goto next;
>  
>  			/* London calling... */
>  			DEBUGP("found!\n");
> -- 
> 2.39.2
> 

* Re: [PATCH nf] lib/ts_bm: reset initial match offset for every block of text
From: Jeremy Sowden @ 2023-06-19 14:02 UTC
  To: Pablo Neira Ayuso; +Cc: Netfilter Devel

On 2023-06-19, at 15:49:14 +0200, Pablo Neira Ayuso wrote:
> On Sun, Jun 11, 2023 at 09:17:19AM +0100, Jeremy Sowden wrote:
> > diff --git a/lib/ts_bm.c b/lib/ts_bm.c
> > index 1f2234221dd1..ef448490a2cc 100644
> > --- a/lib/ts_bm.c
> > +++ b/lib/ts_bm.c
> > @@ -60,23 +60,25 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state)
> >  	struct ts_bm *bm = ts_config_priv(conf);
> >  	unsigned int i, text_len, consumed = state->offset;
> >  	const u8 *text;
> > -	int shift = bm->patlen - 1, bs;
> > +	int bs;
> >  	const u8 icase = conf->flags & TS_IGNORECASE;
> >  
> >  	for (;;) {
> > +		int shift = bm->patlen - 1;
> 
> This line is the fix, right?

Yup.

> >  		text_len = conf->get_next_block(consumed, &text, conf, state);
> >  
> >  		if (unlikely(text_len == 0))
> >  			break;
> >
> 
> The updates below are a cleanup, right? If so, I'd suggest splitting this
> into two patches.

Sure.

> >  		while (shift < text_len) {
> > -			DEBUGP("Searching in position %d (%c)\n", 
> > -				shift, text[shift]);
> > -			for (i = 0; i < bm->patlen; i++) 
> > +			DEBUGP("Searching in position %d (%c)\n",
> > +			       shift, text[shift]);
> > +			for (i = 0; i < bm->patlen; i++)
> >  				if ((icase ? toupper(text[shift-i])
> > -				    : text[shift-i])
> > -					!= bm->pattern[bm->patlen-1-i])
> > -				     goto next;
> > +				     : text[shift-i])
> > +				    != bm->pattern[bm->patlen-1-i])
> 
> Maybe disentangle this with a few helper functions?
> 
> static char bm_get_char(const char *text, unsigned int pos, bool icase)
> {
>         return icase ? toupper(text[pos]) : text[pos];
> }

Sure.

> Thanks
> 
> > +					goto next;
> >  
> >  			/* London calling... */
> >  			DEBUGP("found!\n");
> > -- 
> > 2.39.2
> > 

J.
