All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Philippe Mathieu-Daudé" <philmd@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, Joseph Myers <joseph@codesourcery.com>
Subject: Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
Date: Mon, 15 Jun 2020 14:43:42 +0100	[thread overview]
Message-ID: <87k1081m29.fsf@linaro.org> (raw)
In-Reply-To: <bdc40c92-6518-5c9f-646e-817af94d548e@redhat.com>


Philippe Mathieu-Daudé <philmd@redhat.com> writes:

> On 6/15/20 12:18 PM, Philippe Mathieu-Daudé wrote:
>> On 6/12/20 6:07 PM, Paolo Bonzini wrote:
>>> From: Joseph Myers <joseph@codesourcery.com>
>>>
>>> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
>>> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
>>> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
>>>
>>> That commit fixed a bug that showed up in four GCC tests with one libc
>>> implementation.  The tests in question generate random inputs to the
>>> intrinsics and compare results to a C implementation, but they only
>>> test 1024 possible random inputs, and when the tests use the cases of
>>> those instructions that work with word rather than byte inputs, it's
>>> easy to have problematic cases that show up much less frequently than
>>> that.  Thus, testing with a different libc implementation, and so a
>>> different random number generator, showed up a problem with the
>>> previous patch.
>>>
>>> When investigating the previous test failures, I found the description
>>> of these instructions in the Intel manuals (starting from computing a
>>> 16x16 or 8x8 set of comparison results) confusing and hard to match up
>>> with the more optimized implementation in QEMU, and referred to AMD
>>> manuals which described the instructions in a different way.  Those
>>> AMD descriptions are very explicit that the whole of the string being
>>> searched for must be found in the other operand, not running off the
>>> end of that operand; they say "If the prototype and the SUT are equal
>>> in length, the two strings must be identical for the comparison to be
>>> TRUE.".  However, that statement is incorrect.
>>>
>>> In my previous commit message, I noted:
>>>
>>>   The operation in this case is a search for a string (argument d to
>>>   the helper) in another string (argument s to the helper); if a copy
>>>   of d at a particular position would run off the end of s, the
>>>   resulting output bit should be 0 whether or not the strings match in
>>>   the region where they overlap, but the QEMU implementation was
>>>   wrongly comparing only up to the point where s ends and counting it
>>>   as a match if an initial segment of d matched a terminal segment of
>>>   s.  Here, "run off the end of s" means that some byte of d would
>>>   overlap some byte outside of s; thus, if d has zero length, it is
>>>   considered to match everywhere, including after the end of s.
>>>
>>> The description "some byte of d would overlap some byte outside of s"
>>> is accurate only when understood to refer to overlapping some byte
>>> *within the 16-byte operand* but at or after the zero terminator; it
>>> is valid to run over the end of s if the end of s is the end of the
>>> 16-byte operand.  So the fix in the previous patch for the case of d
>>> being empty was correct, but the other part of that patch was not
>>> correct (as it never allowed partial matches even at the end of the
>>> 16-byte operand).  Nor was the code before the previous patch correct
>>> for the case of d nonempty, as it would always have allowed partial
>>> matches at the end of s.
>>>
>>> Fix with a partial revert of my previous change, combined with
>>> inserting a check for the special case of s having maximum length to
>>> determine where it is necessary to check for matches.
>>>
>>> In the added test, test 1 is for the case of empty strings, which
>>> failed before my 2017 patch, test 2 is for the bug introduced by my
>>> 2017 patch and test 3 deals with the case where a match of an initial
>>> segment at the end of the string is not valid when the string ends
>>> before the end of the 16-byte operand (that is, the case that would be
>>> broken by a simple revert of the non-empty-string part of my 2017
>>> patch).
>>>
>>> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
>>> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>  target/i386/ops_sse.h                |  4 ++--
>>>  tests/tcg/i386/Makefile.target       |  3 +++
>>>  tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
>>>  3 files changed, 38 insertions(+), 2 deletions(-)
>>>  create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
>>>
>>> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
>>> index 01d6017412..14f2b16abd 100644
>>> --- a/target/i386/ops_sse.h
>>> +++ b/target/i386/ops_sse.h
>>> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>>>              res = (2 << upper) - 1;
>>>              break;
>>>          }
>>> -        for (j = valids - validd; j >= 0; j--) {
>>> +        for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
>>>              res <<= 1;
>>>              v = 1;
>>> -            for (i = validd; i >= 0; i--) {
>>> +            for (i = MIN(valids - j, validd); i >= 0; i--) {
>>>                  v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>>>              }
>>>              res |= v;
>>> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
>>> index 43ee2e181e..53efec0668 100644
>>> --- a/tests/tcg/i386/Makefile.target
>>> +++ b/tests/tcg/i386/Makefile.target
>>> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
>>>  SKIP_I386_TESTS=test-i386-ssse3
>>>  X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>>>  
>>> +test-i386-pcmpistri: CFLAGS += -msse4.2
>>> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
>> 
>> This test fails on our CI:
>> https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246
>
> FYI Paolo's analysis from 'make V=1' output
> https://api.travis-ci.org/v3/job/698459904/log.txt:
>
> timeout 60
> /home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -cpu max
> test-i386-pcmpistri >  test-i386-pcmpistri.out
>
> timeout 60
> /home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386  -plugin
> ../../plugin/libbb.so -d plugin -D
> test-i386-pcmpistri-with-libbb.so.pout test-i386-pcmpistri >
> run-plugin-test-i386-pcmpistri-with-libbb.so.out
>
> "incorrect qemu invocation, missing -cpu max in the second".

Just testing some patches now.


-- 
Alex Bennée


  reply	other threads:[~2020-06-15 13:44 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
2020-06-15 10:18   ` Philippe Mathieu-Daudé
2020-06-15 10:58     ` Philippe Mathieu-Daudé
2020-06-15 13:43       ` Alex Bennée [this message]
2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k1081m29.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=joseph@codesourcery.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.