From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Alex Williamson <alex.williamson@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Joerg Roedel <joro@8bytes.org>, Borislav Petkov <bp@suse.de>
Subject: [ 01/39] intel-iommu: Fix leaks in pagetable freeing
Date: Fri, 11 Oct 2013 12:34:45 -0700 [thread overview]
Message-ID: <20131011193211.060964140@linuxfoundation.org> (raw)
In-Reply-To: <20131011193210.843963685@linuxfoundation.org>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson <alex.williamson@redhat.com>
commit 3269ee0bd6686baf86630300d528500ac5b516d7 upstream.
At best the current code only seems to free the leaf pagetables and
the root. If you're unlucky enough to have a large gap (like any
QEMU guest with more than 3G of memory), only the first chunk of leaf
pagetables are freed (plus the root). This is a massive memory leak.
This patch re-writes the pagetable freeing function to use a
recursive algorithm and manages to not only free all the pagetables,
but does it without any apparent performance loss versus the current
broken version.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/pci/intel-iommu.c | 74 ++++++++++++++++++++++------------------------
1 file changed, 36 insertions(+), 38 deletions(-)
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -853,56 +853,54 @@ static int dma_pte_clear_range(struct dm
return order;
}
+static void dma_pte_free_level(struct dmar_domain *domain, int level,
+ struct dma_pte *pte, unsigned long pfn,
+ unsigned long start_pfn, unsigned long last_pfn)
+{
+ pfn = max(start_pfn, pfn);
+ pte = &pte[pfn_level_offset(pfn, level)];
+
+ do {
+ unsigned long level_pfn;
+ struct dma_pte *level_pte;
+
+ if (!dma_pte_present(pte) || dma_pte_superpage(pte))
+ goto next;
+
+ level_pfn = pfn & level_mask(level - 1);
+ level_pte = phys_to_virt(dma_pte_addr(pte));
+
+ if (level > 2)
+ dma_pte_free_level(domain, level - 1, level_pte,
+ level_pfn, start_pfn, last_pfn);
+
+ /* If range covers entire pagetable, free it */
+ if (!(start_pfn > level_pfn ||
+ last_pfn < level_pfn + level_size(level))) {
+ dma_clear_pte(pte);
+ domain_flush_cache(domain, pte, sizeof(*pte));
+ free_pgtable_page(level_pte);
+ }
+next:
+ pfn += level_size(level);
+ } while (!first_pte_in_page(++pte) && pfn <= last_pfn);
+}
+
/* free page table pages. last level pte should already be cleared */
static void dma_pte_free_pagetable(struct dmar_domain *domain,
unsigned long start_pfn,
unsigned long last_pfn)
{
int addr_width = agaw_to_width(domain->agaw) - VTD_PAGE_SHIFT;
- struct dma_pte *first_pte, *pte;
- int total = agaw_to_level(domain->agaw);
- int level;
- unsigned long tmp;
- int large_page = 2;
BUG_ON(addr_width < BITS_PER_LONG && start_pfn >> addr_width);
BUG_ON(addr_width < BITS_PER_LONG && last_pfn >> addr_width);
BUG_ON(start_pfn > last_pfn);
/* We don't need lock here; nobody else touches the iova range */
- level = 2;
- while (level <= total) {
- tmp = align_to_level(start_pfn, level);
-
- /* If we can't even clear one PTE at this level, we're done */
- if (tmp + level_size(level) - 1 > last_pfn)
- return;
-
- do {
- large_page = level;
- first_pte = pte = dma_pfn_level_pte(domain, tmp, level, &large_page);
- if (large_page > level)
- level = large_page + 1;
- if (!pte) {
- tmp = align_to_level(tmp + 1, level + 1);
- continue;
- }
- do {
- if (dma_pte_present(pte)) {
- free_pgtable_page(phys_to_virt(dma_pte_addr(pte)));
- dma_clear_pte(pte);
- }
- pte++;
- tmp += level_size(level);
- } while (!first_pte_in_page(pte) &&
- tmp + level_size(level) - 1 <= last_pfn);
-
- domain_flush_cache(domain, first_pte,
- (void *)pte - (void *)first_pte);
-
- } while (tmp && tmp + level_size(level) - 1 <= last_pfn);
- level++;
- }
+ dma_pte_free_level(domain, agaw_to_level(domain->agaw),
+ domain->pgd, 0, start_pfn, last_pfn);
+
/* free pgd */
if (start_pfn == 0 && last_pfn == DOMAIN_MAX_PFN(domain->gaw)) {
free_pgtable_page(domain->pgd);
next prev parent reply other threads:[~2013-10-11 19:34 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-11 19:34 [ 00/39] 3.0.100-stable review Greg Kroah-Hartman
2013-10-11 19:34 ` Greg Kroah-Hartman [this message]
2013-10-11 19:34 ` [ 02/39] cpqarray: fix info leak in ida_locked_ioctl() Greg Kroah-Hartman
2013-10-11 19:34 ` [ 03/39] cciss: fix info leak in cciss_ioctl32_passthru() Greg Kroah-Hartman
2013-10-11 19:34 ` [ 04/39] caif: Add missing braces to multiline if in cfctrl_linkup_request Greg Kroah-Hartman
2013-10-11 19:34 ` [ 05/39] netpoll: fix NULL pointer dereference in netpoll_cleanup Greg Kroah-Hartman
2013-10-11 19:34 ` [ 06/39] net: sctp: fix ipv6 ipsec encryption bug in sctp_v6_xmit Greg Kroah-Hartman
2013-10-11 19:34 ` [ 07/39] resubmit bridge: fix message_age_timer calculation Greg Kroah-Hartman
2013-10-11 19:34 ` [ 08/39] bridge: Clamp forward_delay when enabling STP Greg Kroah-Hartman
2013-10-11 19:34 ` [ 09/39] ip: generate unique IP identificator if local fragmentation is allowed Greg Kroah-Hartman
2013-10-11 19:34 ` [ 10/39] ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put Greg Kroah-Hartman
2013-10-11 19:34 ` [ 11/39] ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put Greg Kroah-Hartman
2013-10-11 19:34 ` [ 12/39] ipv6: udp packets following an UFO enqueued packet need also be handled by UFO Greg Kroah-Hartman
2013-10-11 19:34 ` [ 13/39] via-rhine: fix VLAN priority field (PCP, IEEE 802.1p) Greg Kroah-Hartman
2013-10-11 19:34 ` [ 14/39] dm9601: fix IFF_ALLMULTI handling Greg Kroah-Hartman
2013-10-11 19:34 ` [ 15/39] bonding: Fix broken promiscuity reference counting issue Greg Kroah-Hartman
2013-10-11 19:35 ` [ 16/39] ll_temac: Reset dma descriptors indexes on ndo_open Greg Kroah-Hartman
2013-10-11 19:35 ` [ 17/39] ASoC: max98095: a couple array underflows Greg Kroah-Hartman
2013-10-11 19:35 ` [ 18/39] ASoC: 88pm860x: array overflow in snd_soc_put_volsw_2r_st() Greg Kroah-Hartman
2013-10-11 19:35 ` [ 19/39] powerpc/iommu: Use GFP_KERNEL instead of GFP_ATOMIC in iommu_init_table() Greg Kroah-Hartman
2013-10-11 19:35 ` [ 20/39] powerpc/vio: Fix modalias_show return values Greg Kroah-Hartman
2013-10-11 19:35 ` [ 21/39] powerpc: Fix parameter clobber in csum_partial_copy_generic() Greg Kroah-Hartman
2013-10-11 19:35 ` [ 22/39] powerpc: Restore registers on error exit from csum_partial_copy_generic() Greg Kroah-Hartman
2013-10-11 19:35 ` [ 23/39] esp_scsi: Fix tag state corruption when autosensing Greg Kroah-Hartman
2013-10-11 19:35 ` [ 24/39] sparc64: Fix ITLB handler of null page Greg Kroah-Hartman
2013-10-11 19:35 ` [ 25/39] sparc64: Remove RWSEM export leftovers Greg Kroah-Hartman
2013-10-11 19:35 ` [ 26/39] sparc64: Fix off by one in trampoline TLB mapping installation loop Greg Kroah-Hartman
2013-10-11 19:35 ` [ 27/39] sparc64: Fix not SRAed %o5 in 32-bit traced syscall Greg Kroah-Hartman
2013-10-11 19:35 ` [ 28/39] sparc32: Fix exit flag passed from traced sys_sigreturn Greg Kroah-Hartman
2013-10-11 19:35 ` [ 29/39] kernel/kmod.c: check for NULL in call_usermodehelper_exec() Greg Kroah-Hartman
2013-10-11 22:36 ` Tetsuo Handa
2013-10-11 19:35 ` [ 30/39] USB: serial: option: Ignore card reader interface on Huawei E1750 Greg Kroah-Hartman
2013-10-11 19:35 ` [ 31/39] rtlwifi: Align private space in rtl_priv struct Greg Kroah-Hartman
2013-10-11 19:35 ` [ 32/39] p54usb: add USB ID for Corega WLUSB2GTST USB adapter Greg Kroah-Hartman
2013-10-11 19:35 ` [ 33/39] staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice Greg Kroah-Hartman
2013-10-11 19:35 ` [ 34/39] ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() Greg Kroah-Hartman
2013-10-11 19:35 ` [ 35/39] tile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT Greg Kroah-Hartman
2013-10-11 19:35 ` [ 36/39] Btrfs: change how we queue blocks for backref checking Greg Kroah-Hartman
2013-10-11 19:35 ` [ 37/39] ext4: avoid hang when mounting non-journal filesystems with orphan list Greg Kroah-Hartman
2013-10-11 19:35 ` [ 38/39] tg3: fix length overflow in VPD firmware parsing Greg Kroah-Hartman
2013-10-11 19:35 ` [ 39/39] Tools: hv: verify origin of netlink connector message Greg Kroah-Hartman
2013-10-11 22:14 ` [ 00/39] 3.0.100-stable review Greg Kroah-Hartman
2013-10-12 0:52 ` Guenter Roeck
2013-10-13 16:04 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131011193211.060964140@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=alex.williamson@redhat.com \
--cc=bp@suse.de \
--cc=joro@8bytes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).