Linux IOMMU Development
From: Lu Baolu <baolu.lu@linux.intel.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	"Daniel F. Smith" <dfsmith@us.ibm.com>
Cc: iommu@lists.linux-foundation.org
Subject: Re: Bug report: VFIO map/unmep mem subject to race and DMA data goes to incorrect page (4.18.0)
Date: Mon, 28 Mar 2022 17:01:26 +0800	[thread overview]
Message-ID: <4190f3d7-5c4f-084f-f3bd-2dcf890cd6dc@linux.intel.com> (raw)
In-Reply-To: <20220325161022.00ab43ff.alex.williamson@redhat.com>

Hi Daniel,

On 2022/3/26 6:10, Alex Williamson wrote:
> Hi Daniel,
> 
> On Fri, 25 Mar 2022 13:06:40 -0700
> "Daniel F. Smith" <dfsmith@us.ibm.com> wrote:
> 
>> This email documents an insidious (incorrect data, no error or warning)
>> VFIO bug found when using the Intel IOMMU to perform DMA transfers, along
>> with the associated workaround.
>>
>> There may be security implications (unsure).
>>
>> /sys/devices/virtual/iommu/dmar0/intel-iommu/version: 1:0
>> /sys/devices/virtual/iommu/dmar0/intel-iommu/cap: d2008c40660462
>> Linux xxxxx.ibm.com 4.18.0-348.20.1.el8_5.x86_64 #1 SMP Tue Mar 8 12:56:54 EST 2022 x86_64 x86_64 x86_64 GNU/Linux
>> Red Hat Enterprise Linux release 8.5 (Ootpa)
>>
>> In our testing of VFIO DMA to an FPGA card in rootless mode, we discovered a
>> glitch where DMA data are transferred to/from the incorrect page.  It
>> appears timing based.  Under some specific conditions the test could trigger
>> the bug every loop.  Sometimes the bug would only emerge after 20+ minutes
>> of testing.
>>
>> Basics of test:
>> 	Get memory with mmap(anonymous): size can change.
>> 	VFIO_IOMMU_MAP_DMA with a block of memory, fixed IOVA.
>> 	Fill memory with pattern.
>> 	Do DMA transfer to FPGA from memory at IOVA.
>> 	Do DMA transfer from FPGA to memory at IOVA+offset.
>> 	Compare memory to ensure match.  Miscompare is bug.
>> 	VFIO_IOMMU_UNMAP_DMA
>> 	unmap()
>> 	Repeat.
>>
>> Using the fixed IOVA address* caused sporadic memory miscompares.  The
>> nature of the miscompares was that the received data was mixed with
>> pages that had been returned by mmap in a *previous* loop.
>>
>> Workaround: Randomizing the IOVA eliminated the memory miscompares.
>>
>> Hypothesis/conjecture: Possible race condition in UNMAP_DMA such that pages
>> can be released/munlocked *after* the MAP_DMA with the same IOVA has
>> occurred.
> 
> Coherency possibly.
> 
> There's a possible coherency issue at the compare step, depending on the
> IOMMU capabilities, which determine whether DMA is coherent to memory
> or requires an explicit flush.  I'm a little suspicious whether dmar0
> is really the IOMMU controlling this device, since you mention a 39-bit
> IOVA space, which is more typical of Intel client platforms.  Those can
> also have integrated graphics, which often get a dedicated IOMMU at
> dmar0 that isn't necessarily representative of the other IOMMUs in the
> system, especially with regard to snoop control.  Each dmar lists the
> devices it manages under sysfs, which you can use to verify.  Support
> for snoop control is identified in the ecap register rather than the
> cap register.  VFIO can also report coherency via the VFIO_DMA_CC_IOMMU
> extension, queried with the VFIO_CHECK_EXTENSION ioctl.
> 
> However, while a CPU coherency problem might lead to a miscompare, it
> wouldn't necessarily produce a miscompare matching the previous
> iteration.  Still, for completeness, let's make sure this isn't a gap
> in the test program making invalid assumptions about CPU/DMA coherency.
> 
> The fact that randomizing the IOVA provides a workaround, though, might
> suggest something related to IOMMU page table coherency.  But for the
> new mmap target to contain the data from the previous iteration, the
> IOMMU PTE would need to be stale on read, yet correct on write, in
> order to land back in your new mmap.  That seems peculiar.  Are we sure
> the FPGA device isn't caching the value at the IOVA, or using some sort
> of IOTLB caching such as ATS that might not be working correctly?
> 
>> Suggestion: Document the issue when using a fixed IOVA, or fix it if
>> security is a concern.
> 
> I don't know that there's enough information here to make any
> conclusions.  Here are some further questions:
> 
>   * What size mappings are being used, both for the mmap and for the
>     VFIO MAP/UNMAP operations?
> 
>   * If the above ventures into superpage support (2MB), does the
>     vfio_iommu_type1 module option disable_hugepages=1 affect the
>     results?
> 
>   * Along the same lines, does the kernel command line option
>     intel_iommu=sp_off produce different results?
> 
>   * Does this behavior also occur on upstream kernels (i.e., v5.17)?
> 
>   * Do additional CPU cache flushes in the test program produce different
>     results?
> 
>   * Is this a consumer-available FPGA device that others might be able
>     to use to reproduce this issue?  I've always wanted such a device
>     for testing, but we also can't rule out that the FPGA itself or its
>     programming is the source of the miscompare.
> 
> From the vfio perspective, UNMAP_DMA should first unmap the pages at
> the IOMMU, to prevent device access, before unpinning the pages.  We do
> make use of batched unmapping to reduce iotlb flushing, but the result
> is expected to be that the IOMMU PTE entries are invalidated before the
> UNMAP_DMA operation completes.  A stale IOVA would be neither expected
> nor correct operation.  Thanks,
> 
> Alex
> 

As another suggestion, could you please try the patch posted here?

https://lore.kernel.org/linux-iommu/20220322063555.1422042-1-stevensd@google.com/

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Thread overview: 5+ messages
2022-03-25 20:06 Bug report: VFIO map/unmep mem subject to race and DMA data goes to incorrect page (4.18.0) Daniel F. Smith
2022-03-25 22:10 ` Alex Williamson
2022-03-28  9:01   ` Lu Baolu [this message]
2022-03-28 19:14   ` Daniel F. Smith
2022-03-28 23:05     ` Alex Williamson
