From: Glauber Costa <glommer@parallels.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Christoph Lameter <cl@linux.com>,
linux-mm@kvack.org, Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: Any reason to use put_page in slub.c?
Date: Wed, 1 Aug 2012 16:42:59 +0400 [thread overview]
Message-ID: <50192453.9080706@parallels.com> (raw)
In-Reply-To: <1343746344.8473.4.camel@dabdike.int.hansenpartnership.com>
On 07/31/2012 06:52 PM, James Bottomley wrote:
> On Tue, 2012-07-31 at 09:31 -0500, Christoph Lameter wrote:
>> On Tue, 31 Jul 2012, Glauber Costa wrote:
>>
>>> On 07/31/2012 06:17 PM, Christoph Lameter wrote:
>>>> On Tue, 31 Jul 2012, Glauber Costa wrote:
>>>>
>>>>> On 07/31/2012 06:09 PM, Christoph Lameter wrote:
>>>>>> That is understood. Typically these object where page sized though and
>>>>>> various assumptions (pretty dangerous ones as you are finding out) are
>>>>>> made regarding object reuse. The fallback of SLUB for higher order allocs
>>>>>> to the page allocator avoids these problems for higher order pages.
>>>>> omg...
>>>>
>>>> I would be very thankful if you would go through the tree and check for
>>>> any remaining use cases like that. Would take care of your problem.
>>>
>>> I would be happy to do it. Do you have any example of any user that
>>> behaved like this in the past, so I can search for something similar?
>>>
>>> This can potentially take many forms, and auditing every kfree out there
>>> is not humanly possible. The best I can do is to search for known
>>> patterns here...
>>
>> The basic problem is that someone will take the address of an object that
>> is allocated via slab and then access the page struct to increase the page
>> count.
>>
>> So you would see
>>
>> page = virt_to_page(<slab_object>);
>>
>> get_page(page);
>>
>>
>> The main cuprit in the past has been the DMA code in the SCSI layer. I
>> think it was the first 512 byte control block for the device that was the
>> main issue. There was a discussion betwen Hugh Dickins and me when SLUB
>> was first released about this issue and it resulted in some changes so
>> that certain fields in the page struct were not touched by SLUB since they
>> were needed for I/O.
>
> Hey, don't try to pin this on me. We don't use get_page() at all on the
> ordinary DMA route. There are four get_page() calls in the whole of
> drivers/scsi. One is in the sg.c fault path, which looks genuine. The
> other three are in fcoe and iSCSI ... what they're trying to do is to
> ensure that the page hangs around until the device sees the data in a
> network tx path.
>
> I can't see why any of these pages would come from kmalloc() or any
> other slab object since they should all be user pages.
>
> James
>
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
On 07/31/2012 06:52 PM, James Bottomley wrote:
> On Tue, 2012-07-31 at 09:31 -0500, Christoph Lameter wrote:
>> On Tue, 31 Jul 2012, Glauber Costa wrote:
>>
>>> On 07/31/2012 06:17 PM, Christoph Lameter wrote:
>>>> On Tue, 31 Jul 2012, Glauber Costa wrote:
>>>>
>>>>> On 07/31/2012 06:09 PM, Christoph Lameter wrote:
>>>>>> That is understood. Typically these object where page sized though and
>>>>>> various assumptions (pretty dangerous ones as you are finding out) are
>>>>>> made regarding object reuse. The fallback of SLUB for higher order allocs
>>>>>> to the page allocator avoids these problems for higher order pages.
>>>>> omg...
>>>>
>>>> I would be very thankful if you would go through the tree and check for
>>>> any remaining use cases like that. Would take care of your problem.
>>>
>>> I would be happy to do it. Do you have any example of any user that
>>> behaved like this in the past, so I can search for something similar?
>>>
>>> This can potentially take many forms, and auditing every kfree out there
>>> is not humanly possible. The best I can do is to search for known
>>> patterns here...
>>
>> The basic problem is that someone will take the address of an object that
>> is allocated via slab and then access the page struct to increase the page
>> count.
>>
>> So you would see
>>
>> page = virt_to_page(<slab_object>);
>>
>> get_page(page);
>>
>>
>> The main cuprit in the past has been the DMA code in the SCSI layer. I
>> think it was the first 512 byte control block for the device that was the
>> main issue. There was a discussion betwen Hugh Dickins and me when SLUB
>> was first released about this issue and it resulted in some changes so
>> that certain fields in the page struct were not touched by SLUB since they
>> were needed for I/O.
>
> Hey, don't try to pin this on me. We don't use get_page() at all on the
> ordinary DMA route. There are four get_page() calls in the whole of
> drivers/scsi. One is in the sg.c fault path, which looks genuine. The
> other three are in fcoe and iSCSI ... what they're trying to do is to
> ensure that the page hangs around until the device sees the data in a
> network tx path.
>
> I can't see why any of these pages would come from kmalloc() or any
> other slab object since they should all be user pages.
>
I've audited all users of get_page() in the drivers/ directory for
patterns like this. In general, they kmalloc something like a table of
entries, and then get_page() the entries. The entries are either user
pages, pages allocated by the page allocator, or physical addresses
through their pfn (in 2 cases from the vga ones...)
I took a look about some other instances where virt_to_page occurs
together with kmalloc as well, and they all seem to fall in the same
category.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-08-01 12:45 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-27 12:19 Any reason to use put_page in slub.c? Glauber Costa
2012-07-27 15:55 ` Christoph Lameter
2012-07-30 7:53 ` Glauber Costa
2012-07-30 19:23 ` Christoph Lameter
2012-07-31 8:25 ` Glauber Costa
2012-07-31 14:09 ` Christoph Lameter
2012-07-31 14:09 ` Glauber Costa
2012-07-31 14:17 ` Christoph Lameter
2012-07-31 14:18 ` Glauber Costa
2012-07-31 14:31 ` Christoph Lameter
2012-07-31 14:52 ` James Bottomley
2012-08-01 12:42 ` Glauber Costa [this message]
2012-08-01 18:10 ` Christoph Lameter
2012-08-02 7:55 ` Glauber Costa
2012-08-02 8:07 ` James Bottomley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50192453.9080706@parallels.com \
--to=glommer@parallels.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=linux-mm@kvack.org \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.