Re: Re: Memory corruption bug with Xen PV Dom0 and BOSS-S1 RAID card

From: "Paweł Srokosz" <pawel.srokosz@cert.pl>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	jgross@suse.com,  andrew cooper3 <andrew.cooper3@citrix.com>,
	JBeulich@suse.com
Subject: Re: Re: Memory corruption bug with Xen PV Dom0 and BOSS-S1 RAID card
Date: Wed, 19 Feb 2025 19:37:47 +0100 (CET)
Message-ID: <1001969494.1457790.1739990267113.JavaMail.zimbra@cert.pl>
In-Reply-To: <Z7RWdPpUde9ZoaZu@macbook.local>

Hello,

> So the issue doesn't happen on debug=y builds? That's unexpected.  I would
> expect the opposite, that some code in Linux assumes that pfn + 1 == mfn +
> 1, and hence breaks when the relation is reversed.

It surprised me too, but I think the key point is that debug=y reverses the
whole mapping, so each PFN lands on a completely different MFN. For example,
MFN=0x1300000 is mapped to PFN=0x20e50c in a non-debug build, but to
PFN=0x5FFFFF in a debug build. I guess that's why I can't reproduce the
problem there.

> Can you see if you can reproduce with dom0-iommu=strict in the Xen command
> line?

Unfortunately, it doesn't help. But I have a few more observations.

First, I checked the "xen-mfndump dump-m2p" output and found that the misread
blocks are mapped to suspiciously round MFNs. Each machine runs a different
version of Xen and of the Linux kernel, yet I see a common pattern.

I write a few huge files without Xen to make sure they are written correctly
(under Xen, both reads and writeback are affected). Then I boot into Xen,
memory-map the files and read each page. Whenever a block is corrupted, it is
mapped to a round MFN, e.g. pfn=0x5095d9/mfn=0x1600000, another at
pfn=0x4095d9/mfn=0x1500000, etc.
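
A minimal sketch of this kind of page-by-page check is shown here for
illustration. It is not the tool actually used in this report: the pattern
generator expected_page(), the helper names and the 4 KiB page size are all
assumptions.

```python
import mmap
import os

PAGE_SIZE = 4096  # assumed x86 page size


def expected_page(index):
    """Hypothetical pattern: the page index repeated as 8-byte LE words."""
    return index.to_bytes(8, "little") * (PAGE_SIZE // 8)


def write_pattern(path, pages):
    """Write `pages` pages of known content (done without Xen in the report)."""
    with open(path, "wb") as f:
        for i in range(pages):
            f.write(expected_page(i))
        f.flush()
        os.fsync(f.fileno())


def find_bad_pages(path):
    """Memory-map the file and return indices of pages that differ."""
    bad = []
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            for i in range(size // PAGE_SIZE):
                if m[i * PAGE_SIZE:(i + 1) * PAGE_SIZE] != expected_page(i):
                    bad.append(i)
    return bad
```

The returned indices are file offsets in pages; translating them to PFNs and
MFNs still requires a separate lookup such as the xen-mfndump output.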

On another machine with different Linux/Xen versions, these faults appear at
pfn=0x20e50c/mfn=0x1300000, pfn=0x30e50c/mfn=0x1400000, etc.
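
To make "round" concrete, here is a small sketch that checks the pairs quoted
above against a 4 GiB boundary. It assumes 4 KiB pages; the names PAIRS and
is_4gib_aligned are made up for this example. It is only arithmetic on the
reported numbers and says nothing about the cause.

```python
PAGE_SHIFT = 12                            # assumed 4 KiB pages
FRAMES_PER_4GIB = 1 << (32 - PAGE_SHIFT)   # 0x100000 frames per 4 GiB

# (pfn, mfn) pairs quoted in this message, one group per machine.
PAIRS = [
    (0x5095d9, 0x1600000),
    (0x4095d9, 0x1500000),
    (0x20e50c, 0x1300000),
    (0x30e50c, 0x1400000),
]


def is_4gib_aligned(mfn):
    """True if the machine frame starts on a 4 GiB boundary."""
    return mfn % FRAMES_PER_4GIB == 0


for pfn, mfn in PAIRS:
    machine_addr = mfn << PAGE_SHIFT
    print(f"pfn={pfn:#x} mfn={mfn:#x} addr={machine_addr:#x} "
          f"4GiB-aligned={is_4gib_aligned(mfn)}")
```

All four MFNs are multiples of 0x100000, so each affected page starts on a
4 GiB boundary in machine address space.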

I also noticed that while reading the page that is mapped to
pfn=0x20e50c/mfn=0x1300000, I get these faults from DMAR:

```
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200000000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200001000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200006000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200008000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200009000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 120000a000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 120000c000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
```

Every time I drop the cache and read this region, I get DMAR faults on a few
random addresses from the 1200000000-120000f000 range (I guess MFNs
0x1200000-0x120000f). MFNs 0x1200000-0x12000ff are not mapped to any PFN in
Dom0 (based on the xen-mfndump output).
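
The address-to-MFN guess can be checked with a short sketch that pulls the
fault addresses out of log lines like the ones above and shifts them by the
page size. The regular expression and the function name fault_mfns are
assumptions made for this example, as is the 4 KiB page size.

```python
import re

PAGE_SHIFT = 12  # assumed 4 KiB pages
FAULT_RE = re.compile(r"fault addr ([0-9a-fA-F]+)")

# Two lines copied from the log above, used as sample input.
SAMPLE = """\
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 1200000000
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:65:00.0] fault addr 120000c000
"""


def fault_mfns(log):
    """Return the sorted, de-duplicated MFNs of all fault addresses in `log`."""
    return sorted({int(m.group(1), 16) >> PAGE_SHIFT
                   for m in FAULT_RE.finditer(log)})


print([hex(mfn) for mfn in fault_mfns(SAMPLE)])
```

Applied to the full log above, every extracted MFN falls within
0x1200000-0x120000f. That range starts 0x100000 frames (4 GiB) below the
affected MFN 0x1300000.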

On the other hand, I don't get these DMAR faults while reading other regions.
Also, I can't trigger the bug with the reversed Dom0 mapping, even if I fill
the page cache with reads.

Thank you,
Paweł

