Re: [PATCH bpf-next] bpf, docs: Define signed modulo as using truncated division

From: Eduard Zingerman <eddyz87@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>,
	Dave Thaler <dthaler1968@googlemail.com>,
	bpf@vger.kernel.org
Cc: bpf@ietf.org, Dave Thaler <dthaler@microsoft.com>
Subject: Re: [PATCH bpf-next] bpf, docs: Define signed modulo as using truncated division
Date: Thu, 19 Oct 2023 03:00:26 +0300
Message-ID: <e51603c98e2abe061b75fe8ac9854b1678a64aef.camel@gmail.com>
In-Reply-To: <e2943b75-e47a-01f2-6b3f-a3ce666008cd@iogearbox.net>

On Thu, 2023-10-19 at 01:40 +0200, Daniel Borkmann wrote:
> On 10/19/23 12:34 AM, Eduard Zingerman wrote:
> > On Tue, 2023-10-17 at 20:30 +0000, Dave Thaler wrote:
> > > From: Dave Thaler <dthaler@microsoft.com>
> > > 
> > > There's different mathematical definitions (truncated, floored,
> > > rounded, etc.) and different languages have chosen different
> > > definitions [0][1].  E.g., languages/libraries that follow Knuth
> > > use a different mathematical definition than C uses.  This
> > > patch specifies which definition BPF uses, as verified by
> > > Eduard [2] and others.
> > > 
> > > [0]: https://en.wikipedia.org/wiki/Modulo#Variants_of_the_definition
> > > [1]: https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
> > > [2]: https://lore.kernel.org/bpf/57e6fefadaf3b2995bb259fa8e711c7220ce5290.camel@gmail.com/
> > > 
> > > Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> > > ---
> > >   Documentation/bpf/standardization/instruction-set.rst | 8 ++++++++
> > >   1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
> > > index c5d53a6e8c7..245b6defc29 100644
> > > --- a/Documentation/bpf/standardization/instruction-set.rst
> > > +++ b/Documentation/bpf/standardization/instruction-set.rst
> > > @@ -283,6 +283,14 @@ For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
> > >   is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
> > >   interpreted as a 64-bit signed value.
> > >   
> > > +Note that there are varying definitions of the signed modulo operation
> > > +when the dividend or divisor are negative, where implementations often
> > > +vary by language such that Python, Ruby, etc.  differ from C, Go, Java,
> > > +etc. This specification requires that signed modulo use truncated division
> > > +(where -13 % 3 == -1) as implemented in C, Go, etc.:
> > > +
> > > +   a % n = a - n * trunc(a / n)
> > > +
> > >   The ``BPF_MOVSX`` instruction does a move operation with sign extension.
> > >   ``BPF_ALU | BPF_MOVSX`` :term:`sign extends<Sign Extend>` 8-bit and 16-bit operands into 32
> > >   bit operands, and zeroes the remaining upper 32 bits.
> > 
> > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> Eduard, do we have some test cases in BPF CI around this specifically (e.g. via test_verifier)?
> Might be worth adding if not.

We do, e.g. see tools/testing/selftests/bpf/progs/verifier_sdiv.c:

  SEC("socket")
  __description("SMOD32, non-zero imm divisor, check 1")
  __success __success_unpriv __retval(-1)
  __naked void smod32_non_zero_imm_1(void)
  {
  	asm volatile ("					\
  	w0 = -41;					\
  	w0 s%%= 2;					\
  	exit;						\
  "	::: __clobber_all);
  }

And I'm still surprised that this produces different results in C and
in Python :)

  $ python3
  Python 3.11.5 (main, Aug 31 2023, 07:57:41) [GCC] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> -41 % 2
  1
  $ clang-repl
  clang-repl> #include <stdio.h>
  clang-repl> printf("%d\n", -41 % 2);
  -1

There are several such tests with different combinations of signs,
both 32-bit and 64-bit.