Re: [PATCH RFC 1/1] x86: fix bad memory access in fb_is_primary_device()

From: Alexander Popov <alpopov@ptsecurity.com>
To: Peter Jones <pjones@redhat.com>, Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ingo Molnar <mingo@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Tomi Valkeinen <tomi.valkeinen@ti.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-efi@vger.kernel.org,
	Alexander Popov <alpopov@ptsecurity.com>
Subject: Re: [PATCH RFC 1/1] x86: fix bad memory access in fb_is_primary_device()
Date: Wed, 9 Mar 2016 15:46:07 +0300
Message-ID: <56E01B0F.3000306@ptsecurity.com>
In-Reply-To: <20160216151859.GB11373@redhat.com>

On 16.02.2016 18:18, Peter Jones wrote:
> On Tue, Feb 16, 2016 at 01:49:18PM +0000, Matt Fleming wrote:
>> [ Including Peter, the efifb maintainer. Original email is here,
>>
>>     http://marc.info/?l=linux-kernel&m=145552936131335&w=2
>>
>>   I've snipped some of the quoted text ]
>>
>> On Tue, 16 Feb, at 08:55:22AM, Ingo Molnar wrote:
>>>
>>> (I've Cc:-ed the EFI-FB and FB gents. Mail quoted below.)
>>>
>>> * Alexander Popov <alpopov@ptsecurity.com> wrote:
>>>
>>>> Currently the code in fb_is_primary_device() contains to_pci_dev() macro
>>>> which is applied to dev from struct fb_info. In some cases this causes
>>>> bad memory access when fb_is_primary_device() handles fb_info of efifb.
>>>> The reason is that fb dev of efifb is embedded into struct platform_device
>>>> but not into struct pci_dev.
>>>>
>>>> We can fix this by checking fb dev bus name in fb_is_primary_device().
>>>>
>>>> It seems that this bug reveals some bigger problem with to_pci_dev(),
>>>> to_platform_device() and others, which just do container_of() and
>>>> don't check whether struct device is a part of the appropriate structure.
>>>> Should we do something more about it?
>>>>
>>>> KASan report:
>>
>> [...]
>>
>>>>
>>>> Signed-off-by: Alexander Popov <alpopov@ptsecurity.com>
>>>> ---
>>>>  arch/x86/video/fbdev.c | 9 +++++----
>>>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arch/x86/video/fbdev.c b/arch/x86/video/fbdev.c
>>>> index d5644bb..4999f78 100644
>>>> --- a/arch/x86/video/fbdev.c
>>>> +++ b/arch/x86/video/fbdev.c
>>>> @@ -18,11 +18,12 @@ int fb_is_primary_device(struct fb_info *info)
>>>>  	struct pci_dev *default_device = vga_default_device();
>>>>  	struct resource *res = NULL;
>>>>  
>>>> -	if (device)
>>>> -		pci_dev = to_pci_dev(device);
>>>> -
>>>> -	if (!pci_dev)
>>>> +	if (!device || !device->bus ||
>>>> +		    !device->bus->name || strcmp(device->bus->name, "pci")) {
>>>>  		return 0;
>>>> +	}
>>>> +
>>>> +	pci_dev = to_pci_dev(device);
>>>>  
>>>>  	if (default_device) {
>>>>  		if (pci_dev == default_device)
>>>> -- 
>>>> 1.9.1
>>>>
>>
>> I wonder if this issue could explain some of the efifb issues we've
>> seen reported on bugzilla.kernel.org in the past where switching from
>> efifb to some other framebuffer device caused hangs during boot. I'm
>> struggling to find the relevant bugzilla entries now, though.
> 
> It's possible it could, but I don't have them handy either.  I've also
> wondered if some of them were due to bad data from the firmware - at
> plugfests we've seen some cases where the actual video mode as measured
> with a ruler is clearly not what the firmware claims it to be, so it's
> entirely possible we're occasionally told a memory region that is not
> what's actually mapped, or that's mapped but is only partially backed
> by the actual frame buffer memory.
> 
> But aside from that diversion, I think Alexander has a legitimate
> question about use of to_pci_dev().  If I ask the question: can we fix
> this in efifb by making it live on a pci_dev, I have a couple of
> fundamental problems:
> 
> 1) technically it doesn't have to be a pci_dev at all (but, practically,
>    so far it always is on PCI...)
> 2) From EFI, we can't necessarily pin it down to a single PCI device
>    even if it is PCI.  Before we do EFI's ExitBootServices() call, we
>    can try to find the PCI_IO handle our GOP instance is connected to,
>    but not all firmware GOP drivers use that, so it doesn't always work.
>    And even if it did, there can be more than one instance pointing to
>    the same memory with different PCI devices - lots of laptops have
>    this sort of thing.
> 3) Ignoring the EFI side and just focusing on PCI, if there's two
>    devices configured that could do scanout, it can be mapped to one
>    device's BAR but the other device be the actual device using it.  In
>    this case either choice is probably wrong for something, and the
>    things that have the information to resolve which one don't include
>    efifb - they're the drivers we'll likely hand off to later.
> 
> So it's most likely right for efifb to be embedded in a platform_device
> instead of a pci_dev.  Which leads back to Alexander's question - if it
> isn't in a pci_dev, that means fb_is_primary_device() needs to not
> assume it is.  So the patch appears correct, but so is the question -
> should to_pci_dev() be checking this and returning NULL here?

This discussion seems to have stalled. May I revive it?

As I see it, there are two ways to fix the bad memory access in
fb_is_primary_device().

The first is the one proposed in my patch. Comparing the bus name string
is not elegant, but I could not come up with anything better.
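
To make the first option concrete, here is a minimal userspace sketch of
the guard, not kernel code. The two structs are simplified stand-ins for
the kernel ones, and dev_bus_is_pci(), example_pci_bus and
example_platform_bus are names invented for this illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct bus_type { const char *name; };
struct device   { const struct bus_type *bus; };

/*
 * Mirrors the guard from the patch: returns 1 only when dev sits on a
 * bus whose name is exactly "pci", and 0 for a NULL device, a device
 * without a bus, a bus without a name, or any other bus name.
 */
int dev_bus_is_pci(const struct device *dev)
{
	if (!dev || !dev->bus || !dev->bus->name)
		return 0;
	return strcmp(dev->bus->name, "pci") == 0;
}

/* Hypothetical bus objects, defined only for this example. */
const struct bus_type example_pci_bus      = { .name = "pci" };
const struct bus_type example_platform_bus = { .name = "platform" };
```

A device on example_platform_bus, which is how efifb's device is modelled
here, is rejected before any container_of() arithmetic is attempted.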

The second is to add a similar check to to_pci_dev() itself, so that it
returns NULL or calls BUG() when the struct device is not embedded in a
struct pci_dev.
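
A rough userspace model of the second option, under the same caveats:
the struct layouts are simplified, container_of() is re-implemented with
offsetof(), and to_pci_dev_checked() is a hypothetical name, not an
existing kernel interface. Only the NULL-returning variant is shown; a
BUG() variant would differ only in what happens on the failure path.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct bus_type { const char *name; };
struct device   { const struct bus_type *bus; };

struct pci_dev         { unsigned int devfn; struct device dev; };
struct platform_device { int id; struct device dev; };

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical checked conversion: does the pointer arithmetic only
 * when the device reports the "pci" bus, and returns NULL otherwise.
 */
struct pci_dev *to_pci_dev_checked(struct device *d)
{
	if (!d || !d->bus || !d->bus->name ||
	    strcmp(d->bus->name, "pci") != 0)
		return NULL;
	return container_of(d, struct pci_dev, dev);
}
```

With this shape the existing "if (!pci_dev) return 0;" test in
fb_is_primary_device() would start doing real work, at the price of a
string comparison in every conversion.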

Which way is better? And do we need to do anything about the other
similar macros, such as to_platform_device()?

Thanks.
Best regards,
Alexander
