From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 4/4] drm/i915/bios: do not discard address space
Date: Fri, 8 Nov 2019 21:19:15 +0200
Message-ID: <20191108191915.GJ1208@intel.com>
In-Reply-To: <20191108181852.f2ce7uthomzgf4vx@ldmartin-desk1>

On Fri, Nov 08, 2019 at 10:18:52AM -0800, Lucas De Marchi wrote:
> On Fri, Nov 08, 2019 at 01:14:03PM +0200, Jani Nikula wrote:
> >On Thu, 07 Nov 2019, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
> >> When we are mapping the VBT through pci_map_rom() we may not be allowed
> >> to simply discard the address space and go on reading the memory. After
> >> checking on my test system that by dumping the rom via sysfs I could
> >> actually get the correct vbt, I decided to change the implementation to
> >> use the same approach, by calling memcpy_fromio().
> >>
> >> In order to avoid copying the entire oprom this implements a simple
> >> memmem() searching for "$VBT". Contrary to the previous implementation
> >> this also takes care of not issuing unaligned PCI reads that would
> >> otherwise get translated into even more reads. I also vaguely
> >> remember unaligned reads failing in the past with some devices.
> >>
> >> Also make sure we copy only the VBT and not the entire oprom that is
> >> usually much larger.
> >
> >So you have
> >
> >1. a fix to unaligned reads
>
> unaligned io reads, yes
>
> >
> >2. an optimization to avoid reading individual bytes four times
>
> It was by no means an optimization. Not reading the same byte 4 times is
> there actually to stop doing the unaligned IO reads. You can't have (2)
> without (1) unless you switch to ioread8() and add a shift (which may
> not be a bad idea).
>
> >
> >3. respecting __iomem and copying (I guess these are tied together)
> >
> >Seems to me that really should be at least three patches. Not
> >necessarily in the above order.
>
> (3) is actually the most important I think, so I will start with that.
>
> >
> >Follow-up: store pointer to the oprom vbt somewhere under i915->vbt, and
> >have debugfs i915_vbt() handle that properly.
>
> I don't think this is needed. The thing I'm doing here is the same as
> what can be accomplished by reading the rom from sysfs:
>
> find /sys/bus/pci/devices/*/ -name rom
> ... choose one
>
> echo 1 > rom # to allow reading the rom
> hexdump -C rom
>
>
> >
> >>
> >> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> >> ---
> >> drivers/gpu/drm/i915/display/intel_bios.c | 95 +++++++++++++++++++----
> >> 1 file changed, 79 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
> >> index 671bbce6ba5b..c401e90b7cf1 100644
> >> --- a/drivers/gpu/drm/i915/display/intel_bios.c
> >> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
> >> @@ -1806,31 +1806,88 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
> >> return vbt;
> >> }
> >>
> >> -static const struct vbt_header *find_vbt(void __iomem *oprom, size_t size)
> >> +void __iomem *find_vbt(void __iomem *oprom, size_t size)
> >> {
> >> - size_t i;
> >> + const u32 MAGIC = *((const u32 *)"$VBT");
> >> + size_t done = 0, cur = 0;
> >> + void __iomem *p;
> >> + u8 buf[128];
> >> + u32 val;
> >>
> >> - /* Scour memory looking for the VBT signature. */
> >> - for (i = 0; i + 4 < size; i++) {
> >> - void *vbt;
> >> + /*
> >> + * poor man's memmem() with sizeof(buf) window to avoid frequent
> >> + * wrap-arounds and using u32 for comparison. This gives us 4
> >> + * comparisons per ioread32() and avoids unaligned io reads (although it
> >> + * still does unaligned cpu access).
> >> + */
> >
> >If we're really worried about performance here, and use a local buffer
> >to optimize the wraparounds, would it actually be more efficient to use
> >memcpy_fromio() which has an arch specific implementation in asm?
>
> Not really worried about performance. I actually did 3 implementations
> that avoid the unaligned io reads.
>
> 1) this one
> 2) memcpy_fromio() to the local buffer + strnstr()
> 3) allocate an oprom buffer, memcpy_fromio() the entire rom and keep a
> pointer to it. Then free the oprom after the vbt is used
>
> (2) and (1) involved basically the same complexity, both requiring a
> wrap-around local buffer, so I went with (1)
>
> I didn't feel comfortable with (3) because it would allocate much more
> memory than really needed.
>
> >
> >In any case, it seems you should first have the patch that the
> >patch subject claims, then fix unaligned reads and add optimizations
> >next. This one does too much.
>
> Again, it was not really meant to be an optimization.
>
> Ville told me that we may not really need to deal with the unaligned
> access and could change the implementation to expect the VBT to be aligned.
> This would be the simplest way to change it, but I'm not fond of
> changing this and breaking old systems using it... anyway, we can give it
> a try and revert if it breaks.

The current code already assumes 4 byte alignment. So nothing would
change and so nothing can get broken.
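
For illustration only, here is a rough userspace sketch of what an
aligned-only scan could look like. It is not the i915 code:
rom_read32() and find_vbt_aligned() are hypothetical names, with
rom_read32() standing in for a single ioread32() on the mapped ROM.
The sketch assumes the "$VBT" signature always starts on a 4-byte
boundary, so it probes only offsets that are multiples of 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ioread32(): one 32-bit load at offset off. */
static uint32_t rom_read32(const uint8_t *rom, size_t off)
{
	uint32_t val;

	memcpy(&val, rom + off, sizeof(val));
	return val;
}

/*
 * Return the offset of "$VBT" in rom, or -1 if it is not found.
 * Only offsets that are multiples of 4 are probed, so every probe
 * is a single aligned 32-bit read.
 */
static long find_vbt_aligned(const uint8_t *rom, size_t size)
{
	uint32_t magic;
	size_t i;

	/* Build the magic the same way it sits in memory, bytewise. */
	memcpy(&magic, "$VBT", sizeof(magic));

	for (i = 0; i + sizeof(magic) <= size; i += sizeof(magic)) {
		if (rom_read32(rom, i) == magic)
			return (long)i;
	}

	return -1;
}
```

With this approach a signature placed at a non-multiple-of-4 offset is
simply not found, which is the trade-off of assuming alignment.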
--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx