From: Jan Beulich <jbeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>,
	"Paul Durrant" <paul@xen.org>,
	"Tamas K Lengyel" <tamas@tklengyel.com>,
	Xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v3] x86/mm: Short circuit damage from "fishy" ref/typecount failure
Date: Mon, 1 Feb 2021 13:50:02 +0100
Message-ID: <222ce537-b4fc-cf45-1bd0-0aaa3293d8ea@suse.com>
In-Reply-To: <6fb91ae1-4b91-f76a-1d38-1c528ab43a9c@citrix.com>

On 29.01.2021 18:17, Andrew Cooper wrote:
> On 29/01/2021 16:31, Jan Beulich wrote:
>> On 29.01.2021 17:17, Andrew Cooper wrote:
>>> On 29/01/2021 11:29, Jan Beulich wrote:
>>>> On 25.01.2021 18:59, Andrew Cooper wrote:
>>>>> On 20/01/2021 08:06, Jan Beulich wrote:
>>>>>> Also, as far as "impossible" here goes - the constructs all
>>>>>> anyway exist only to deal with what we consider impossible.
>>>>>> The question therefore really is of almost exclusively
>>>>>> theoretical nature, and hence something like a counter
>>>>>> possibly overflowing imo needs to be accounted for as
>>>>>> theoretically possible, albeit impossible with today's
>>>>>> computers and realistic timing assumptions. If a counter
>>>>>> overflow occurred, it definitely wouldn't be because of a
>>>>>> bug in Xen, but because of abnormal behavior elsewhere.
>>>>>> Hence I remain unconvinced it is appropriate to deal with
>>>>>> the situation by BUG().
>>>>> I'm not sure how to be any clearer.
>>>>>
>>>>> I am literally not changing the current behaviour.  Xen *will* hit a
>>>>> BUG() if any of these domain_crash() paths are taken.
>>>>>
>>>>> If you do not believe me, then please go and actually check what happens
>>>>> when simulating a ref-acquisition failure.
>>>> So I've now also played the same game on the ioreq path (see
>>>> debugging patch below, and again with some non-"//temp"
>>>> changes actually improving overall behavior in that "impossible"
>>>> case). No BUG()s hit, no leaks (thanks to the extra changes),
>>>> no other anomalies observed.
>>>>
>>>> Hence I'm afraid it is now really up to you to point out the
>>>> specific BUG()s (and additional context as necessary) that you
>>>> either believe could be hit, or that you have observed being hit.
>>> The refcounting logic was taken verbatim from ioreq, with the only
>>> difference being an order greater than 0.  The logic is also identical
>>> to the vlapic logic.
>>>
>>> And the reason *why* it bugs is obvious - the cleanup logic
>>> unconditionally put()'s refs it never took to begin with, and hits
>>> underflow bugchecks.
>> In current staging, neither vmx_alloc_vlapic_mapping() nor
>> hvm_alloc_ioreq_mfn() put any refs they couldn't get. Hence
>> my failed attempt to repro your claimed misbehavior.
> 
> I think I've figured out what is going on.
> 
> They *look* as if they do, but the logic is deceptive.
> 
> We skip both puts in free_*() if the typeref failed, and rely on the
> fact that the frame(s) are *also* on the domheap list for
> relinquish_resources() to put the acquire ref.
> 
> Yet another bizarre refcounting rule/behaviour which isn't written down.

But that's not the case - extra pages land on their own
list, which relinquish_resources() doesn't iterate. Hence
me saying we leak these pages on the domain_crash() paths,
and hence my repro attempt patches containing adjustments
to at least try to free those pages on those paths.

Jan
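
The disagreement above can be illustrated with a toy model. The sketch below is not Xen code: every type and function name (toy_page, toy_domain, toy_relinquish_main_only, and so on) is hypothetical and invented for illustration. It assumes only what the message states, namely that extra pages sit on a list of their own which the normal teardown does not walk, and that putting a reference which was never taken trips an underflow check.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Xen's data structures. */
struct toy_page {
    unsigned int refcount;      /* 1 == allocation reference still held */
    struct toy_page *next;
};

struct toy_domain {
    struct toy_page *page_list;        /* walked by the normal teardown */
    struct toy_page *extra_page_list;  /* assumed NOT walked by it */
};

static void toy_list_add(struct toy_page **head, struct toy_page *pg)
{
    pg->next = *head;
    *head = pg;
}

/* Dropping a reference that was never taken is modelled as a bugcheck. */
static void toy_put_page(struct toy_page *pg)
{
    assert(pg->refcount > 0);   /* stands in for an underflow check */
    pg->refcount--;
}

/* Unlink every page from one list and drop its allocation reference. */
static void toy_release_list(struct toy_page **head)
{
    while ( *head )
    {
        struct toy_page *pg = *head;

        *head = pg->next;
        pg->next = NULL;
        toy_put_page(pg);
    }
}

/* Teardown that only knows about the main list: extra pages are leaked. */
static void toy_relinquish_main_only(struct toy_domain *d)
{
    toy_release_list(&d->page_list);
}

/* Teardown with the kind of adjustment described: extra pages freed too. */
static void toy_relinquish_with_extra(struct toy_domain *d)
{
    toy_release_list(&d->page_list);
    toy_release_list(&d->extra_page_list);
}

/* Number of pages still linked on a list. */
static unsigned int toy_count(const struct toy_page *head)
{
    unsigned int n = 0;

    for ( ; head; head = head->next )
        n++;
    return n;
}
```

In this model, a page placed on extra_page_list is still linked, with its reference still held, after toy_relinquish_main_only() returns; only toy_relinquish_with_extra() releases it. That is the leak being described, under the stated assumptions.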

