All of lore.kernel.org
 help / color / mirror / Atom feed
* Question on arm64 unaligned faults during playback
@ 2014-12-08  5:40 Abhilash Kesavan
  2014-12-08  8:49 ` Clemens Ladisch
  0 siblings, 1 reply; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-08  5:40 UTC (permalink / raw)
  To: b42378, lars, tiwai, perex, alsa-devel
  Cc: broonie, lgirdwood, p.zabel, padma.v

Hi,

I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
unaligned faults while testing certain sound streams with aplay.
Playing a 44.1k sampling rate stream gives the following crash while
48k streams work fine.


[root@(none) ~]# aplay test2_16bit_44k.wav
Playing WAVE 'test2_16bit_44k.wav' : Signed 1[   24.500724] sound:
2ch, 44100Hz, 88208bytes
6 bit Little Endian, Rate 44100 Hz, Stereo
[   24.520738] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf
is ffffff8000080000, runtime->dma_area is ffffff8000080000, hwoff is
0, frames_to_bytes is 0, frames is 5513
[   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf
is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is
5513, frames_to_bytes is 22052, frames is 5513
[   24.551244] Unhandled fault: alignment fault (0x96000061) at
0xffffff8000085624
[   24.558448] Internal error: : 96000061 [#1] PREEMPT SMP
[   24.563449] Modules linked in:
[   24.566489] CPU: 0 PID: 928 Comm: aplay Not tainted 3.16.0+ #232
[   24.572473] task: ffffffc0a934c380 ti: ffffffc0038d4000 task.ti:
ffffffc0038d4000
[   24.579944] PC is at __copy_from_user+0x14/0x60
[   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
[   24.589912] pc : [<ffffffc0002d4784>] lr : [<ffffffc0004b58f0>]
pstate: 20000145
[   24.597285] sp : ffffffc0038d7c90
[   24.600583] x29: ffffffc0038d7c90 x28: ffffffc0038c1c00
[   24.605876] x27: ffffffc0a93eae00 x26: 0000000000001589
[   24.611168] x25: 0000000000000000 x24: 0000000000000000
[   24.616461] x23: 0000000000000008 x22: ffffff8000085624
[   24.621753] x21: 0000000000005624 x20: ffffffc0038c1c00
[   24.627046] x19: 0000000011d69620 x18: 0000007fd2e8b1c0
[   24.632339] x17: 0000007fa8057370 x16: ffffffc000196108
[   24.637632] x15: 232b226b00000000 x14: 617266202c333135
[   24.642925] x13: 352073692066666f x12: 7768202c30303030
[   24.648218] x11: 3830303030386666 x10: 6666666620736920
[   24.653510] x9 : 616572615f616d64 x8 : 000000000000026f
[   24.658803] x7 : ffffffc000998790 x6 : 0000000000000001
[   24.664095] x5 : 0000000000000000 x4 : 0000000011d6ec44
[   24.669388] x3 : f459f780f2b3f6d4 x2 : 0000000000005614
[   24.674681] x1 : 0000000011d69628 x0 : ffffff8000085624
[   24.679973]
[   24.681451] Process aplay (pid: 928, stack limit = 0xffffffc0038d4058)
[   24.687959] Stack: (0xffffffc0038d7c90 to 0xffffffc0038d8000)
[   24.693688] 7c80:                                     038d7cd0
ffffffc0 004b7578 ffffffc0
[   24.701846] 7ca0: 00001589 00000000 a93eaf50 ffffffc0 009ef000
ffffffc0 11d69620 00000000
[   24.710001] 7cc0: 004b580c ffffffc0 004b755c ffffffc0 038d7d70
ffffffc0 004b7758 ffffffc0
[   24.718158] 7ce0: a93eae00 ffffffc0 d2e8b438 0000007f a91eb858
ffffffc0 d2e8b438 0000007f
[   24.726313] 7d00: 40184150 00000000 d2e8b438 0000007f 00000115
00000000 0000001d 00000000
[   24.734470] 7d20: 00960000 ffffffc0 038d4000 ffffffc0 0000000a
00000000 038d7d68 ffffffc0
[   24.742626] 7d40: 00001589 00000000 00001589 00000000 00000000
00000000 009ef898 ffffffc0
[   24.750782] 7d60: a8b6d800 ffffffc0 0000409b 00000000 038d7da0
ffffffc0 004b2dd0 ffffffc0
[   24.758938] 7d80: a93eae00 ffffffc0 11d69620 00000000 11d69620
00000000 00001589 00000000
[   24.767094] 7da0: 038d7df0 ffffffc0 004b2e10 ffffffc0 a8b6d900
ffffffc0 d2e8b438 0000007f
[   24.775249] 7dc0: a91eb858 ffffffc0 00000004 00000000 00000001
00000000 00000000 00000000
[   24.783406] 7de0: 11d69620 00000000 00001589 00000000 038d7e00
ffffffc0 00195ecc ffffffc0
[   24.791563] 7e00: 038d7e90 ffffffc0 0019618c ffffffc0 00000000
00000000 a8b6d900 ffffffc0
[   24.799718] 7e20: a8b6d900 ffffffc0 00000004 00000000 40184150
00000000 00185888 ffffffc0
[   24.807875] 7e40: a8b6d800 ffffffc0 a8b6d800 ffffffc0 ffffffff
ffffffff a80f0588 0000007f
[   24.816030] 7e60: 80000000 00000000 00000015 00000000 038d7e80
ffffffc0 001a0340 ffffffc0
[   24.824187] 7e80: 038d7e90 ffffffc0 00196148 ffffffc0 d2e8b410
0000007f 0008425c ffffffc0
[   24.832343] 7ea0: 00000000 00000000 11d500e0 00000000 ffffffff
ffffffff a805737c 0000007f
[   24.840499] 7ec0: 60000000 00000000 00000015 00000000 00000004
00000000 40184150 00000000
[   24.848655] 7ee0: d2e8b438 0000007f a8223fbc 0000007f 00000003
00000000 00000000 00000000
[   24.856811] 7f00: a82d46f0 0000007f 00000000 00000000 0000001d
00000000 00000120 00000000
[   24.864967] 7f20: 01010101 01010101 00000016 00000000 00000003
00000000 11d510d0 00000000
[   24.873123] 7f40: 00000000 00000000 00000000 232b226b a8299b80
0000007f a8057370 0000007f
[   24.881279] 7f60: d2e8b1c0 0000007f 11d691f0 00000000 11d500e0
00000000 00001589 00000000
[   24.889435] 7f80: 11d69620 00000000 00000000 00000000 0041f7d8
00000000 00000003 00000000
[   24.897591] 7fa0: 00000000 00000000 00000004 00000000 00000003
00000000 d2e8b410 0000007f
[   24.905748] 7fc0: a8223ff0 0000007f d2e8b410 0000007f a805737c
0000007f 60000000 00000000
[   24.913904] 7fe0: 00000004 00000000 0000001d 00000000 d10d0579
f13ea086 786e62ff 1f2437e3
[   24.922054] Call trace:
[   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
[   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
[   24.935940] [<ffffffc0004b7754>] snd_pcm_lib_write+0x58/0x80
[   24.941580] [<ffffffc0004b2dcc>] snd_pcm_playback_ioctl1+0x2e8/0x300
[   24.947913] [<ffffffc0004b2e0c>] snd_pcm_playback_ioctl+0x28/0x44
[   24.953990] [<ffffffc000195ec8>] do_vfs_ioctl+0x370/0x5b0
[   24.959367] [<ffffffc000196188>] SyS_ioctl+0x80/0x98
[   24.964314] Code: f1002042 540000a4 f8408423 f1002042 (f8008403)
[   24.976808] ---[ end trace 35415135d4341fc9 ]---


We are using the internal sram available for sound, for DMA buffer
allocation, using the generic SRAM driver. From the above log, the
buffer address offset is not 8-byte aligned and as we are using the
SRAM driver which maps the memory as device memory we are getting an
unaligned fault. Is it incorrect to use the generic sram driver for
arm64 or am I missing something ?

Regards,
Abhilash

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-08  5:40 Question on arm64 unaligned faults during playback Abhilash Kesavan
@ 2014-12-08  8:49 ` Clemens Ladisch
  2014-12-08 10:41   ` Takashi Iwai
  2014-12-09  2:56   ` Abhilash Kesavan
  0 siblings, 2 replies; 22+ messages in thread
From: Clemens Ladisch @ 2014-12-08  8:49 UTC (permalink / raw)
  To: Abhilash Kesavan, b42378, lars, tiwai, perex, alsa-devel
  Cc: broonie, lgirdwood, p.zabel, padma.v

Abhilash Kesavan wrote:
> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> unaligned faults while testing certain sound streams with aplay.
>
> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> [   24.579944] PC is at __copy_from_user+0x14/0x60
> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> [   24.922054] Call trace:
> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>
> We are using the internal sram available for sound, for DMA buffer
> allocation, using the generic SRAM driver. From the above log, the
> buffer address offset is not 8-byte aligned and as we are using the
> SRAM driver which maps the memory as device memory we are getting an
> unaligned fault. Is it incorrect to use the generic sram driver for
> arm64 or am I missing something ?

When you give the ALSA framework a memory buffer, it is assumed that it
is general-purpose memory, i.e., can be accessed with any alignment.

If your memory does not allow such accesses, you have to do all the
accesses in your driver, i.e., do everything in the .copy/.silence
callbacks.  (You also have to add constraints so that periods and buffer
are correctly aligned.)


Regards,
Clemens

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-08  8:49 ` Clemens Ladisch
@ 2014-12-08 10:41   ` Takashi Iwai
  2014-12-08 11:15       ` [alsa-devel] " Lars-Peter Clausen
  2014-12-09  2:56   ` Abhilash Kesavan
  1 sibling, 1 reply; 22+ messages in thread
From: Takashi Iwai @ 2014-12-08 10:41 UTC (permalink / raw)
  To: Clemens Ladisch
  Cc: padma.v, Abhilash Kesavan, lars, lgirdwood, b42378, alsa-devel,
	broonie, p.zabel

At Mon, 08 Dec 2014 09:49:37 +0100,
Clemens Ladisch wrote:
> 
> Abhilash Kesavan wrote:
> > I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> > unaligned faults while testing certain sound streams with aplay.
> >
> > [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> > [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> > [   24.579944] PC is at __copy_from_user+0x14/0x60
> > [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> > [   24.922054] Call trace:
> > [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> > [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >
> > We are using the internal sram available for sound, for DMA buffer
> > allocation, using the generic SRAM driver. From the above log, the
> > buffer address offset is not 8-byte aligned and as we are using the
> > SRAM driver which maps the memory as device memory we are getting an
> > unaligned fault. Is it incorrect to use the generic sram driver for
> > arm64 or am I missing something ?
> 
> When you give the ALSA framework a memory buffer, it is assumed that it
> is general-purpose memory, i.e., can be accessed with any alignment.
> 
> If your memory does not allow such accesses, you have to do all the
> accesses in your driver, i.e., do everything in the .copy/.silence
> callbacks.  (You also have to add constraints so that periods and buffer
> are correctly aligned.)

You can set runtime->min_align in the own hw_params callback so that
the PCM core handle it well, at least, for non-mmap mode.

But this feature has been rarely used (currently no driver sets it
explicitly), so there might be still some bugs there, too :)


Takashi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-08 10:41   ` Takashi Iwai
@ 2014-12-08 11:15       ` Lars-Peter Clausen
  0 siblings, 0 replies; 22+ messages in thread
From: Lars-Peter Clausen @ 2014-12-08 11:15 UTC (permalink / raw)
  To: Takashi Iwai, Clemens Ladisch
  Cc: Abhilash Kesavan, alsa-devel, padma.v, Catalin Marinas, b42378,
	Will Deacon, lgirdwood, broonie, p.zabel,
	linux-arm-kernel@lists.infradead.org

Added ARM64 folks to Cc.

On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> At Mon, 08 Dec 2014 09:49:37 +0100,
> Clemens Ladisch wrote:
>>
>> Abhilash Kesavan wrote:
>>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>>> unaligned faults while testing certain sound streams with aplay.
>>>
>>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>>> [   24.922054] Call trace:
>>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>>>
>>> We are using the internal sram available for sound, for DMA buffer
>>> allocation, using the generic SRAM driver. From the above log, the
>>> buffer address offset is not 8-byte aligned and as we are using the
>>> SRAM driver which maps the memory as device memory we are getting an
>>> unaligned fault. Is it incorrect to use the generic sram driver for
>>> arm64 or am I missing something ?
>>
>> When you give the ALSA framework a memory buffer, it is assumed that it
>> is general-purpose memory, i.e., can be accessed with any alignment.
>>
>> If your memory does not allow such accesses, you have to do all the
>> accesses in your driver, i.e., do everything in the .copy/.silence
>> callbacks.  (You also have to add constraints so that periods and buffer
>> are correctly aligned.)
>
> You can set runtime->min_align in the own hw_params callback so that
> the PCM core handle it well, at least, for non-mmap mode.
>
> But this feature has been rarely used (currently no driver sets it
> explicitly), so there might be still some bugs there, too :)

As far as I can see min_align only aligns the hardware pointer, but not the 
application pointer. The later causes the issue here.

E.g. if a application writes a unaligned amount of bytes the next write will 
start at an unaligned buffer address.

How safe is it in general to use IO mapped memory as general purpose memory 
on ARM64? Are there maybe even more constraints than just that access has to 
be aligned? Maybe the driver should rather use copy_from_user_toio() instead 
of copy_from_user() to make sure that things work correctly.

- Lars

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-08 11:15       ` Lars-Peter Clausen
  0 siblings, 0 replies; 22+ messages in thread
From: Lars-Peter Clausen @ 2014-12-08 11:15 UTC (permalink / raw)
  To: linux-arm-kernel

Added ARM64 folks to Cc.

On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> At Mon, 08 Dec 2014 09:49:37 +0100,
> Clemens Ladisch wrote:
>>
>> Abhilash Kesavan wrote:
>>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>>> unaligned faults while testing certain sound streams with aplay.
>>>
>>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>>> [   24.922054] Call trace:
>>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>>>
>>> We are using the internal sram available for sound, for DMA buffer
>>> allocation, using the generic SRAM driver. From the above log, the
>>> buffer address offset is not 8-byte aligned and as we are using the
>>> SRAM driver which maps the memory as device memory we are getting an
>>> unaligned fault. Is it incorrect to use the generic sram driver for
>>> arm64 or am I missing something ?
>>
>> When you give the ALSA framework a memory buffer, it is assumed that it
>> is general-purpose memory, i.e., can be accessed with any alignment.
>>
>> If your memory does not allow such accesses, you have to do all the
>> accesses in your driver, i.e., do everything in the .copy/.silence
>> callbacks.  (You also have to add constraints so that periods and buffer
>> are correctly aligned.)
>
> You can set runtime->min_align in the own hw_params callback so that
> the PCM core handle it well, at least, for non-mmap mode.
>
> But this feature has been rarely used (currently no driver sets it
> explicitly), so there might be still some bugs there, too :)

As far as I can see min_align only aligns the hardware pointer, but not the 
application pointer. The later causes the issue here.

E.g. if a application writes a unaligned amount of bytes the next write will 
start at an unaligned buffer address.

How safe is it in general to use IO mapped memory as general purpose memory 
on ARM64? Are there maybe even more constraints than just that access has to 
be aligned? Maybe the driver should rather use copy_from_user_toio() instead 
of copy_from_user() to make sure that things work correctly.

- Lars

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [alsa-devel] Question on arm64 unaligned faults during playback
  2014-12-08 11:15       ` [alsa-devel] " Lars-Peter Clausen
@ 2014-12-08 11:31         ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2014-12-08 11:31 UTC (permalink / raw)
  To: Lars-Peter Clausen
  Cc: Abhilash Kesavan, padma.v@samsung.com, Takashi Iwai,
	Catalin Marinas, b42378@freescale.com, Clemens Ladisch,
	alsa-devel@alsa-project.org, lgirdwood@gmail.com,
	broonie@kernel.org, p.zabel@pengutronix.de, perex@perex.cz,
	linux-arm-kernel@lists.infradead.org

On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
> Added ARM64 folks to Cc.
> 
> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> > At Mon, 08 Dec 2014 09:49:37 +0100,
> > Clemens Ladisch wrote:
> >>
> >> Abhilash Kesavan wrote:
> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >>> unaligned faults while testing certain sound streams with aplay.
> >>>
> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >>> [   24.922054] Call trace:
> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >>>
> >>> We are using the internal sram available for sound, for DMA buffer
> >>> allocation, using the generic SRAM driver. From the above log, the
> >>> buffer address offset is not 8-byte aligned and as we are using the
> >>> SRAM driver which maps the memory as device memory we are getting an
> >>> unaligned fault. Is it incorrect to use the generic sram driver for
> >>> arm64 or am I missing something ?
> >>
> >> When you give the ALSA framework a memory buffer, it is assumed that it
> >> is general-purpose memory, i.e., can be accessed with any alignment.
> >>
> >> If your memory does not allow such accesses, you have to do all the
> >> accesses in your driver, i.e., do everything in the .copy/.silence
> >> callbacks.  (You also have to add constraints so that periods and buffer
> >> are correctly aligned.)
> >
> > You can set runtime->min_align in the own hw_params callback so that
> > the PCM core handle it well, at least, for non-mmap mode.
> >
> > But this feature has been rarely used (currently no driver sets it
> > explicitly), so there might be still some bugs there, too :)
> 
> As far as I can see min_align only aligns the hardware pointer, but not the 
> application pointer. The later causes the issue here.
> 
> E.g. if a application writes a unaligned amount of bytes the next write will 
> start at an unaligned buffer address.
> 
> How safe is it in general to use IO mapped memory as general purpose memory 
> on ARM64? Are there maybe even more constraints than just that access has to 
> be aligned? Maybe the driver should rather use copy_from_user_toio() instead 
> of copy_from_user() to make sure that things work correctly.

If you're simply mapping RAM as device memory, then you need to ensure:

  - Naturally aligned access
  - No exclusive instructions (e.g atomics, locks, semaphores ...)

Out of interest, why are you using device mappings for memory? By design,
they prohibit speculation (and, depending on the subtype, re-ordering,
gathering and early write response) and will be significantly slower than
normal memory accesses.

Could you use something like ioremap_wc, which will give you normal,
non-cacheable memory instead?

Will

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-08 11:31         ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2014-12-08 11:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
> Added ARM64 folks to Cc.
> 
> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> > At Mon, 08 Dec 2014 09:49:37 +0100,
> > Clemens Ladisch wrote:
> >>
> >> Abhilash Kesavan wrote:
> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >>> unaligned faults while testing certain sound streams with aplay.
> >>>
> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >>> [   24.922054] Call trace:
> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >>>
> >>> We are using the internal sram available for sound, for DMA buffer
> >>> allocation, using the generic SRAM driver. From the above log, the
> >>> buffer address offset is not 8-byte aligned and as we are using the
> >>> SRAM driver which maps the memory as device memory we are getting an
> >>> unaligned fault. Is it incorrect to use the generic sram driver for
> >>> arm64 or am I missing something ?
> >>
> >> When you give the ALSA framework a memory buffer, it is assumed that it
> >> is general-purpose memory, i.e., can be accessed with any alignment.
> >>
> >> If your memory does not allow such accesses, you have to do all the
> >> accesses in your driver, i.e., do everything in the .copy/.silence
> >> callbacks.  (You also have to add constraints so that periods and buffer
> >> are correctly aligned.)
> >
> > You can set runtime->min_align in the own hw_params callback so that
> > the PCM core handle it well, at least, for non-mmap mode.
> >
> > But this feature has been rarely used (currently no driver sets it
> > explicitly), so there might be still some bugs there, too :)
> 
> As far as I can see min_align only aligns the hardware pointer, but not the 
> application pointer. The later causes the issue here.
> 
> E.g. if a application writes a unaligned amount of bytes the next write will 
> start at an unaligned buffer address.
> 
> How safe is it in general to use IO mapped memory as general purpose memory 
> on ARM64? Are there maybe even more constraints than just that access has to 
> be aligned? Maybe the driver should rather use copy_from_user_toio() instead 
> of copy_from_user() to make sure that things work correctly.

If you're simply mapping RAM as device memory, then you need to ensure:

  - Naturally aligned access
  - No exclusive instructions (e.g atomics, locks, semaphores ...)

Out of interest, why are you using device mappings for memory? By design,
they prohibit speculation (and, depending on the subtype, re-ordering,
gathering and early write response) and will be significantly slower than
normal memory accesses.

Could you use something like ioremap_wc, which will give you normal,
non-cacheable memory instead?

Will

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-08 11:15       ` [alsa-devel] " Lars-Peter Clausen
@ 2014-12-08 12:04         ` Takashi Iwai
  -1 siblings, 0 replies; 22+ messages in thread
From: Takashi Iwai @ 2014-12-08 12:04 UTC (permalink / raw)
  To: Lars-Peter Clausen
  Cc: padma.v, Abhilash Kesavan, lgirdwood, Catalin Marinas, b42378,
	Clemens Ladisch, alsa-devel, Will Deacon, broonie, p.zabel,
	linux-arm-kernel@lists.infradead.org

At Mon, 08 Dec 2014 12:15:52 +0100,
Lars-Peter Clausen wrote:
> 
> Added ARM64 folks to Cc.
> 
> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> > At Mon, 08 Dec 2014 09:49:37 +0100,
> > Clemens Ladisch wrote:
> >>
> >> Abhilash Kesavan wrote:
> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >>> unaligned faults while testing certain sound streams with aplay.
> >>>
> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >>> [   24.922054] Call trace:
> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >>>
> >>> We are using the internal sram available for sound, for DMA buffer
> >>> allocation, using the generic SRAM driver. From the above log, the
> >>> buffer address offset is not 8-byte aligned and as we are using the
> >>> SRAM driver which maps the memory as device memory we are getting an
> >>> unaligned fault. Is it incorrect to use the generic sram driver for
> >>> arm64 or am I missing something ?
> >>
> >> When you give the ALSA framework a memory buffer, it is assumed that it
> >> is general-purpose memory, i.e., can be accessed with any alignment.
> >>
> >> If your memory does not allow such accesses, you have to do all the
> >> accesses in your driver, i.e., do everything in the .copy/.silence
> >> callbacks.  (You also have to add constraints so that periods and buffer
> >> are correctly aligned.)
> >
> > You can set runtime->min_align in the own hw_params callback so that
> > the PCM core handle it well, at least, for non-mmap mode.
> >
> > But this feature has been rarely used (currently no driver sets it
> > explicitly), so there might be still some bugs there, too :)
> 
> As far as I can see min_align only aligns the hardware pointer, but not the 
> application pointer. The later causes the issue here.
> 
> E.g. if a application writes a unaligned amount of bytes the next write will 
> start at an unaligned buffer address.

Oh right, and I think we should fix the alignment of applptr, too.
We can set min_align zero as default and allow the unaligned applptr
update while aligning to the size if given explicitly.


Takashi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-08 12:04         ` Takashi Iwai
  0 siblings, 0 replies; 22+ messages in thread
From: Takashi Iwai @ 2014-12-08 12:04 UTC (permalink / raw)
  To: linux-arm-kernel

At Mon, 08 Dec 2014 12:15:52 +0100,
Lars-Peter Clausen wrote:
> 
> Added ARM64 folks to Cc.
> 
> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> > At Mon, 08 Dec 2014 09:49:37 +0100,
> > Clemens Ladisch wrote:
> >>
> >> Abhilash Kesavan wrote:
> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >>> unaligned faults while testing certain sound streams with aplay.
> >>>
> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >>> [   24.922054] Call trace:
> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >>>
> >>> We are using the internal sram available for sound, for DMA buffer
> >>> allocation, using the generic SRAM driver. From the above log, the
> >>> buffer address offset is not 8-byte aligned and as we are using the
> >>> SRAM driver which maps the memory as device memory we are getting an
> >>> unaligned fault. Is it incorrect to use the generic sram driver for
> >>> arm64 or am I missing something ?
> >>
> >> When you give the ALSA framework a memory buffer, it is assumed that it
> >> is general-purpose memory, i.e., can be accessed with any alignment.
> >>
> >> If your memory does not allow such accesses, you have to do all the
> >> accesses in your driver, i.e., do everything in the .copy/.silence
> >> callbacks.  (You also have to add constraints so that periods and buffer
> >> are correctly aligned.)
> >
> > You can set runtime->min_align in the own hw_params callback so that
> > the PCM core handle it well, at least, for non-mmap mode.
> >
> > But this feature has been rarely used (currently no driver sets it
> > explicitly), so there might be still some bugs there, too :)
> 
> As far as I can see min_align only aligns the hardware pointer, but not the 
> application pointer. The later causes the issue here.
> 
> E.g. if a application writes a unaligned amount of bytes the next write will 
> start at an unaligned buffer address.

Oh right, and I think we should fix the alignment of applptr, too.
We can set min_align zero as default and allow the unaligned applptr
update while aligning to the size if given explicitly.


Takashi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [alsa-devel] Question on arm64 unaligned faults during playback
  2014-12-08 11:15       ` [alsa-devel] " Lars-Peter Clausen
@ 2014-12-08 12:25         ` Catalin Marinas
  -1 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2014-12-08 12:25 UTC (permalink / raw)
  To: Lars-Peter Clausen
  Cc: Abhilash Kesavan, padma.v@samsung.com, Takashi Iwai, Will Deacon,
	b42378@freescale.com, Clemens Ladisch,
	alsa-devel@alsa-project.org, lgirdwood@gmail.com,
	broonie@kernel.org, p.zabel@pengutronix.de, perex@perex.cz,
	linux-arm-kernel@lists.infradead.org

On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
> Added ARM64 folks to Cc.

Thanks.

> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> > At Mon, 08 Dec 2014 09:49:37 +0100,
> > Clemens Ladisch wrote:
> >>
> >> Abhilash Kesavan wrote:
> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >>> unaligned faults while testing certain sound streams with aplay.
> >>>
> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >>> [   24.922054] Call trace:
> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384

I can see copy_from_user() here, I assume the source is a user space
buffer and the destination is the ioremap'ed SRAM (which causes the
unaligned fault). Such function is not supposed copy into device memory
(its prototype does not specify __iomem for the destination pointer).

If you really want Device memory for the SRAM, I think we need some
other API. memcpy_toio() could copy into such buffer but the source must
be kernel. Maybe something like copy_from_user_toio()?

Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
you could use something like vmap() with pgprot_writecombine for the
mapping. Even though ioremap_wc() returns (currently) Normal
non-cacheable memory, such pointer (__iomem) should only be used with
specific IO accessors (though I can see this assumption broken in a few
places in the kernel).

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-08 12:25         ` Catalin Marinas
  0 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2014-12-08 12:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
> Added ARM64 folks to Cc.

Thanks.

> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> > At Mon, 08 Dec 2014 09:49:37 +0100,
> > Clemens Ladisch wrote:
> >>
> >> Abhilash Kesavan wrote:
> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >>> unaligned faults while testing certain sound streams with aplay.
> >>>
> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >>> [   24.922054] Call trace:
> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384

I can see copy_from_user() here, I assume the source is a user space
buffer and the destination is the ioremap'ed SRAM (which causes the
unaligned fault). Such function is not supposed copy into device memory
(its prototype does not specify __iomem for the destination pointer).

If you really want Device memory for the SRAM, I think we need some
other API. memcpy_toio() could copy into such buffer but the source must
be kernel. Maybe something like copy_from_user_toio()?

Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
you could use something like vmap() with pgprot_writecombine for the
mapping. Even though ioremap_wc() returns (currently) Normal
non-cacheable memory, such pointer (__iomem) should only be used with
specific IO accessors (though I can see this assumption broken in a few
places in the kernel).

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-08  8:49 ` Clemens Ladisch
  2014-12-08 10:41   ` Takashi Iwai
@ 2014-12-09  2:56   ` Abhilash Kesavan
  1 sibling, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-09  2:56 UTC (permalink / raw)
  To: Clemens Ladisch
  Cc: alsa-devel, lars, padma.v, tiwai, b42378, Liam Girdwood, broonie,
	p.zabel

Hi Clemens,

On Mon, Dec 8, 2014 at 2:19 PM, Clemens Ladisch <clemens@ladisch.de> wrote:
> Abhilash Kesavan wrote:
>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> unaligned faults while testing certain sound streams with aplay.
>>
>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> [   24.922054] Call trace:
>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>>
>> We are using the internal sram available for sound, for DMA buffer
>> allocation, using the generic SRAM driver. From the above log, the
>> buffer address offset is not 8-byte aligned and as we are using the
>> SRAM driver which maps the memory as device memory we are getting an
>> unaligned fault. Is it incorrect to use the generic sram driver for
>> arm64 or am I missing something ?
>
> When you give the ALSA framework a memory buffer, it is assumed that it
> is general-purpose memory, i.e., can be accessed with any alignment.
>
> If your memory does not allow such accesses, you have to do all the
> accesses in your driver, i.e., do everything in the .copy/.silence
> callbacks.  (You also have to add constraints so that periods and buffer
> are correctly aligned.)

Thanks for the reply. Exynos7 is using generic dmaengine and the
default dmaengine_pcm_ops does not have the copy/silence call-backs
populated. Is the expectation that I should add the  copy call-back
for my platform even though it uses the generic dmaengine framework ?

Regards,
Abhilash
>
>
> Regards,
> Clemens

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-08 11:31         ` Will Deacon
@ 2014-12-09  2:56           ` Abhilash Kesavan
  -1 siblings, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-09  2:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: alsa-devel@alsa-project.org, Lars-Peter Clausen,
	padma.v@samsung.com, Takashi Iwai, Catalin Marinas,
	b42378@freescale.com, Clemens Ladisch, lgirdwood@gmail.com,
	broonie@kernel.org, p.zabel@pengutronix.de,
	linux-arm-kernel@lists.infradead.org

Hi Will,

On Mon, Dec 8, 2014 at 5:01 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
>> Added ARM64 folks to Cc.
>>
>> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
>> > At Mon, 08 Dec 2014 09:49:37 +0100,
>> > Clemens Ladisch wrote:
>> >>
>> >> Abhilash Kesavan wrote:
>> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> >>> unaligned faults while testing certain sound streams with aplay.
>> >>>
>> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> >>> [   24.922054] Call trace:
>> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>> >>>
>> >>> We are using the internal sram available for sound, for DMA buffer
>> >>> allocation, using the generic SRAM driver. From the above log, the
>> >>> buffer address offset is not 8-byte aligned and as we are using the
>> >>> SRAM driver which maps the memory as device memory we are getting an
>> >>> unaligned fault. Is it incorrect to use the generic sram driver for
>> >>> arm64 or am I missing something ?
>> >>
>> >> When you give the ALSA framework a memory buffer, it is assumed that it
>> >> is general-purpose memory, i.e., can be accessed with any alignment.
>> >>
>> >> If your memory does not allow such accesses, you have to do all the
>> >> accesses in your driver, i.e., do everything in the .copy/.silence
>> >> callbacks.  (You also have to add constraints so that periods and buffer
>> >> are correctly aligned.)
>> >
>> > You can set runtime->min_align in the own hw_params callback so that
>> > the PCM core handle it well, at least, for non-mmap mode.
>> >
>> > But this feature has been rarely used (currently no driver sets it
>> > explicitly), so there might be still some bugs there, too :)
>>
>> As far as I can see min_align only aligns the hardware pointer, but not the
>> application pointer. The later causes the issue here.
>>
>> E.g. if a application writes a unaligned amount of bytes the next write will
>> start at an unaligned buffer address.
>>
>> How safe is it in general to use IO mapped memory as general purpose memory
>> on ARM64? Are there maybe even more constraints than just that access has to
>> be aligned? Maybe the driver should rather use copy_from_user_toio() instead
>> of copy_from_user() to make sure that things work correctly.
>
> If you're simply mapping RAM as device memory, then you need to ensure:
>
>   - Naturally aligned access
>   - No exclusive instructions (e.g atomics, locks, semaphores ...)
>
> Out of interest, why are you using device mappings for memory? By design,
> they prohibit speculation (and, depending on the subtype, re-ordering,
> gathering and early write response) and will be significantly slower than
> normal memory accesses.

Thanks for the comments.

We are using the generic SRAM allocation driver (drivers/misc/sram.c)
to use the internal audio SRAM for buffer allocation. The generic
driver ioremaps the sram and exposes the obtained pool via the
genalloc API to other drivers such as audio. There already exists
support for internal sram in the audio framework. The specific use
case is in an Exynos7 low power state where the DRAM cannot be used.

>
> Could you use something like ioremap_wc, which will give you normal,
> non-cacheable memory instead?

Modifying ioremap to ioremap_wc gets rid of the crash. Do you think
the sram driver ioremap usage should be modified to use ioremap_wc ?

Regards,
Abhilash
>
> Will

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-09  2:56           ` Abhilash Kesavan
  0 siblings, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-09  2:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On Mon, Dec 8, 2014 at 5:01 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
>> Added ARM64 folks to Cc.
>>
>> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
>> > At Mon, 08 Dec 2014 09:49:37 +0100,
>> > Clemens Ladisch wrote:
>> >>
>> >> Abhilash Kesavan wrote:
>> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> >>> unaligned faults while testing certain sound streams with aplay.
>> >>>
>> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> >>> [   24.922054] Call trace:
>> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>> >>>
>> >>> We are using the internal sram available for sound, for DMA buffer
>> >>> allocation, using the generic SRAM driver. From the above log, the
>> >>> buffer address offset is not 8-byte aligned and as we are using the
>> >>> SRAM driver which maps the memory as device memory we are getting an
>> >>> unaligned fault. Is it incorrect to use the generic sram driver for
>> >>> arm64 or am I missing something ?
>> >>
>> >> When you give the ALSA framework a memory buffer, it is assumed that it
>> >> is general-purpose memory, i.e., can be accessed with any alignment.
>> >>
>> >> If your memory does not allow such accesses, you have to do all the
>> >> accesses in your driver, i.e., do everything in the .copy/.silence
>> >> callbacks.  (You also have to add constraints so that periods and buffer
>> >> are correctly aligned.)
>> >
>> > You can set runtime->min_align in the own hw_params callback so that
>> > the PCM core handle it well, at least, for non-mmap mode.
>> >
>> > But this feature has been rarely used (currently no driver sets it
>> > explicitly), so there might be still some bugs there, too :)
>>
>> As far as I can see min_align only aligns the hardware pointer, but not the
>> application pointer. The later causes the issue here.
>>
>> E.g. if a application writes a unaligned amount of bytes the next write will
>> start at an unaligned buffer address.
>>
>> How safe is it in general to use IO mapped memory as general purpose memory
>> on ARM64? Are there maybe even more constraints than just that access has to
>> be aligned? Maybe the driver should rather use copy_from_user_toio() instead
>> of copy_from_user() to make sure that things work correctly.
>
> If you're simply mapping RAM as device memory, then you need to ensure:
>
>   - Naturally aligned access
>   - No exclusive instructions (e.g atomics, locks, semaphores ...)
>
> Out of interest, why are you using device mappings for memory? By design,
> they prohibit speculation (and, depending on the subtype, re-ordering,
> gathering and early write response) and will be significantly slower than
> normal memory accesses.

Thanks for the comments.

We are using the generic SRAM allocation driver (drivers/misc/sram.c)
to use the internal audio SRAM for buffer allocation. The generic
driver ioremaps the sram and exposes the obtained pool via the
genalloc API to other drivers such as audio. There already exists
support for internal sram in the audio framework. The specific use
case is in an Exynos7 low power state where the DRAM cannot be used.

>
> Could you use something like ioremap_wc, which will give you normal,
> non-cacheable memory instead?

Modifying ioremap to ioremap_wc gets rid of the crash. Do you think
the sram driver ioremap usage should be modified to use ioremap_wc ?

Regards,
Abhilash
>
> Will

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [alsa-devel] Question on arm64 unaligned faults during playback
  2014-12-08 12:25         ` Catalin Marinas
@ 2014-12-09  2:56           ` Abhilash Kesavan
  -1 siblings, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-09  2:56 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: alsa-devel@alsa-project.org, Lars-Peter Clausen,
	padma.v@samsung.com, Takashi Iwai, Will Deacon,
	b42378@freescale.com, Clemens Ladisch, lgirdwood@gmail.com,
	perex@perex.cz, broonie@kernel.org, p.zabel@pengutronix.de,
	linux-arm-kernel@lists.infradead.org

Hi Catalin,

On Mon, Dec 8, 2014 at 5:55 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
>> Added ARM64 folks to Cc.
>
> Thanks.
>
>> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
>> > At Mon, 08 Dec 2014 09:49:37 +0100,
>> > Clemens Ladisch wrote:
>> >>
>> >> Abhilash Kesavan wrote:
>> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> >>> unaligned faults while testing certain sound streams with aplay.
>> >>>
>> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> >>> [   24.922054] Call trace:
>> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>
> I can see copy_from_user() here, I assume the source is a user space
> buffer and the destination is the ioremap'ed SRAM (which causes the
> unaligned fault). Such function is not supposed copy into device memory
> (its prototype does not specify __iomem for the destination pointer).
>
> If you really want Device memory for the SRAM, I think we need some
> other API. memcpy_toio() could copy into such buffer but the source must
> be kernel. Maybe something like copy_from_user_toio()?

Thanks for the response.

Changing copy_from_user() to copy_from_user_toio() in
snd_pcm_lib_write_transfer() (sound/core/pcm_lib.c) works. I am not
sure about other effects of making such a change in a core sound file,
can someone comment ?

>
> Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
> you could use something like vmap() with pgprot_writecombine for the
> mapping. Even though ioremap_wc() returns (currently) Normal
> non-cacheable memory, such pointer (__iomem) should only be used with
> specific IO accessors (though I can see this assumption broken in a few
> places in the kernel).

Currently, if I continue to use the SRAM allocation driver, two
options are working:
1) ioremap to ioremap_wc in drivers/misc/sram.c.
2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c

Which would be the preferred option ?

Regards,
Abhilash
>
> --
> Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-09  2:56           ` Abhilash Kesavan
  0 siblings, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-09  2:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On Mon, Dec 8, 2014 at 5:55 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
>> Added ARM64 folks to Cc.
>
> Thanks.
>
>> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
>> > At Mon, 08 Dec 2014 09:49:37 +0100,
>> > Clemens Ladisch wrote:
>> >>
>> >> Abhilash Kesavan wrote:
>> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> >>> unaligned faults while testing certain sound streams with aplay.
>> >>>
>> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> >>> [   24.922054] Call trace:
>> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>
> I can see copy_from_user() here, I assume the source is a user space
> buffer and the destination is the ioremap'ed SRAM (which causes the
> unaligned fault). Such function is not supposed copy into device memory
> (its prototype does not specify __iomem for the destination pointer).
>
> If you really want Device memory for the SRAM, I think we need some
> other API. memcpy_toio() could copy into such buffer but the source must
> be kernel. Maybe something like copy_from_user_toio()?

Thanks for the response.

Changing copy_from_user() to copy_from_user_toio() in
snd_pcm_lib_write_transfer() (sound/core/pcm_lib.c) works. I am not
sure about other effects of making such a change in a core sound file,
can someone comment ?

>
> Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
> you could use something like vmap() with pgprot_writecombine for the
> mapping. Even though ioremap_wc() returns (currently) Normal
> non-cacheable memory, such pointer (__iomem) should only be used with
> specific IO accessors (though I can see this assumption broken in a few
> places in the kernel).

Currently, if I continue to use the SRAM allocation driver, two
options are working:
1) ioremap to ioremap_wc in drivers/misc/sram.c.
2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c

Which would be the preferred option ?

Regards,
Abhilash
>
> --
> Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-09  2:56           ` Abhilash Kesavan
@ 2014-12-09  8:19             ` Clemens Ladisch
  -1 siblings, 0 replies; 22+ messages in thread
From: Clemens Ladisch @ 2014-12-09  8:19 UTC (permalink / raw)
  To: Abhilash Kesavan, Catalin Marinas
  Cc: alsa-devel@alsa-project.org, Lars-Peter Clausen,
	padma.v@samsung.com, Takashi Iwai, Will Deacon,
	lgirdwood@gmail.com, broonie@kernel.org, p.zabel@pengutronix.de,
	linux-arm-kernel@lists.infradead.org

Abhilash Kesavan wrote:
> Currently, if I continue to use the SRAM allocation driver, two
> options are working:
> 1) ioremap to ioremap_wc in drivers/misc/sram.c.
> 2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c
>
> Which would be the preferred option ?

I don't know how to tell the SRAM allocator to hand out WC memory, but
copy_from_user_toio() is what you would put into your .copy callback.


Regards,
Clemens

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-09  8:19             ` Clemens Ladisch
  0 siblings, 0 replies; 22+ messages in thread
From: Clemens Ladisch @ 2014-12-09  8:19 UTC (permalink / raw)
  To: linux-arm-kernel

Abhilash Kesavan wrote:
> Currently, if I continue to use the SRAM allocation driver, two
> options are working:
> 1) ioremap to ioremap_wc in drivers/misc/sram.c.
> 2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c
>
> Which would be the preferred option ?

I don't know how to tell the SRAM allocator to hand out WC memory, but
copy_from_user_toio() is what you would put into your .copy callback.


Regards,
Clemens

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-09  2:56           ` Abhilash Kesavan
@ 2014-12-09 11:43             ` Catalin Marinas
  -1 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2014-12-09 11:43 UTC (permalink / raw)
  To: Abhilash Kesavan
  Cc: alsa-devel@alsa-project.org, Lars-Peter Clausen,
	padma.v@samsung.com, Takashi Iwai, Will Deacon,
	b42378@freescale.com, Clemens Ladisch, lgirdwood@gmail.com,
	broonie@kernel.org, p.zabel@pengutronix.de,
	linux-arm-kernel@lists.infradead.org

On Tue, Dec 09, 2014 at 02:56:59AM +0000, Abhilash Kesavan wrote:
> On Mon, Dec 8, 2014 at 5:55 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
> >> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> >> > At Mon, 08 Dec 2014 09:49:37 +0100,
> >> > Clemens Ladisch wrote:
> >> >>
> >> >> Abhilash Kesavan wrote:
> >> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >> >>> unaligned faults while testing certain sound streams with aplay.
> >> >>>
> >> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >> >>> [   24.922054] Call trace:
> >> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >
> > I can see copy_from_user() here, I assume the source is a user space
> > buffer and the destination is the ioremap'ed SRAM (which causes the
> > unaligned fault). Such function is not supposed copy into device memory
> > (its prototype does not specify __iomem for the destination pointer).
> >
> > If you really want Device memory for the SRAM, I think we need some
> > other API. memcpy_toio() could copy into such buffer but the source must
> > be kernel. Maybe something like copy_from_user_toio()?
> 
> Thanks for the response.
> 
> Changing copy_from_user() to copy_from_user_toio() in
> snd_pcm_lib_write_transfer() (sound/core/pcm_lib.c) works. I am not
> sure about other effects of making such a change in a core sound file,
> can someone comment ?

The default implementation uses copy_from_user() to an intermediate
buffer and another memcpy_toio() to the SRAM destination. So, it will be
slower.

In general it may be better to copy this function to some more generic
place and let the arch code override it with a more optimal
implementation.

> > Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
> > you could use something like vmap() with pgprot_writecombine for the
> > mapping. Even though ioremap_wc() returns (currently) Normal
> > non-cacheable memory, such pointer (__iomem) should only be used with
> > specific IO accessors (though I can see this assumption broken in a few
> > places in the kernel).
> 
> Currently, if I continue to use the SRAM allocation driver, two
> options are working:
> 1) ioremap to ioremap_wc in drivers/misc/sram.c.

I think the SRAM driver does the correct devm_ioremap_resource() call
but the implementation in lib/devres.c assumes that if
IORESOURCE_CACHEABLE it should use devm_ioremap(). AFAIK, ioremap()
always returns Device memory (even on x86, the default ioremap behaviour
is nocache).

> 2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c
> 
> Which would be the preferred option ?

I would prefer the sram changed. It provides an allocation pool from
SRAM which could as well be mapped as write-combine. If no-one complains
about changing the SRAM mapping to WC, you could introduce a
devm_ioremap_wc() in lib/devres.c and call that directly from
sram_probe(). But I can't tell for sure what the memory type
expectations for SRAM are.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-09 11:43             ` Catalin Marinas
  0 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2014-12-09 11:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 09, 2014 at 02:56:59AM +0000, Abhilash Kesavan wrote:
> On Mon, Dec 8, 2014 at 5:55 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
> >> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
> >> > At Mon, 08 Dec 2014 09:49:37 +0100,
> >> > Clemens Ladisch wrote:
> >> >>
> >> >> Abhilash Kesavan wrote:
> >> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
> >> >>> unaligned faults while testing certain sound streams with aplay.
> >> >>>
> >> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
> >> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
> >> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
> >> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
> >> >>> [   24.922054] Call trace:
> >> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
> >> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
> >
> > I can see copy_from_user() here, I assume the source is a user space
> > buffer and the destination is the ioremap'ed SRAM (which causes the
> > unaligned fault). Such function is not supposed copy into device memory
> > (its prototype does not specify __iomem for the destination pointer).
> >
> > If you really want Device memory for the SRAM, I think we need some
> > other API. memcpy_toio() could copy into such buffer but the source must
> > be kernel. Maybe something like copy_from_user_toio()?
> 
> Thanks for the response.
> 
> Changing copy_from_user() to copy_from_user_toio() in
> snd_pcm_lib_write_transfer() (sound/core/pcm_lib.c) works. I am not
> sure about other effects of making such a change in a core sound file,
> can someone comment ?

The default implementation uses copy_from_user() to an intermediate
buffer and another memcpy_toio() to the SRAM destination. So, it will be
slower.

In general it may be better to copy this function to some more generic
place and let the arch code override it with a more optimal
implementation.

> > Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
> > you could use something like vmap() with pgprot_writecombine for the
> > mapping. Even though ioremap_wc() returns (currently) Normal
> > non-cacheable memory, such pointer (__iomem) should only be used with
> > specific IO accessors (though I can see this assumption broken in a few
> > places in the kernel).
> 
> Currently, if I continue to use the SRAM allocation driver, two
> options are working:
> 1) ioremap to ioremap_wc in drivers/misc/sram.c.

I think the SRAM driver does the correct devm_ioremap_resource() call
but the implementation in lib/devres.c assumes that if
IORESOURCE_CACHEABLE it should use devm_ioremap(). AFAIK, ioremap()
always returns Device memory (even on x86, the default ioremap behaviour
is nocache).

> 2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c
> 
> Which would be the preferred option ?

I would prefer the sram changed. It provides an allocation pool from
SRAM which could as well be mapped as write-combine. If no-one complains
about changing the SRAM mapping to WC, you could introduce a
devm_ioremap_wc() in lib/devres.c and call that directly from
sram_probe(). But I can't tell for sure what the memory type
expectations for SRAM are.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Question on arm64 unaligned faults during playback
  2014-12-09 11:43             ` [alsa-devel] " Catalin Marinas
@ 2014-12-10  4:10               ` Abhilash Kesavan
  -1 siblings, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-10  4:10 UTC (permalink / raw)
  To: Catalin Marinas, Heiko Stübner, Li.Xiubo, shc_work,
	p.zabel@pengutronix.de, nicoleotsuka
  Cc: alsa-devel@alsa-project.org, Lars-Peter Clausen,
	padma.v@samsung.com, Takashi Iwai, Will Deacon,
	b42378@freescale.com, Clemens Ladisch, lgirdwood@gmail.com,
	broonie@kernel.org, linux-arm-kernel@lists.infradead.org

Hi Catalin,

On Tue, Dec 9, 2014 at 5:13 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Tue, Dec 09, 2014 at 02:56:59AM +0000, Abhilash Kesavan wrote:
>> On Mon, Dec 8, 2014 at 5:55 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
>> >> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
>> >> > At Mon, 08 Dec 2014 09:49:37 +0100,
>> >> > Clemens Ladisch wrote:
>> >> >>
>> >> >> Abhilash Kesavan wrote:
>> >> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> >> >>> unaligned faults while testing certain sound streams with aplay.
>> >> >>>
>> >> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> >> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> >> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> >> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> >> >>> [   24.922054] Call trace:
>> >> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> >> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>> >
>> > I can see copy_from_user() here, I assume the source is a user space
>> > buffer and the destination is the ioremap'ed SRAM (which causes the
>> > unaligned fault). Such function is not supposed copy into device memory
>> > (its prototype does not specify __iomem for the destination pointer).
>> >
>> > If you really want Device memory for the SRAM, I think we need some
>> > other API. memcpy_toio() could copy into such buffer but the source must
>> > be kernel. Maybe something like copy_from_user_toio()?
>>
>> Thanks for the response.
>>
>> Changing copy_from_user() to copy_from_user_toio() in
>> snd_pcm_lib_write_transfer() (sound/core/pcm_lib.c) works. I am not
>> sure about other effects of making such a change in a core sound file,
>> can someone comment ?
>
> The default implementation uses copy_from_user() to an intermediate
> buffer and another memcpy_toio() to the SRAM destination. So, it will be
> slower.
>
> In general it may be better to copy this function to some more generic
> place and let the arch code override it with a more optimal
> implementation.
>
>> > Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
>> > you could use something like vmap() with pgprot_writecombine for the
>> > mapping. Even though ioremap_wc() returns (currently) Normal
>> > non-cacheable memory, such pointer (__iomem) should only be used with
>> > specific IO accessors (though I can see this assumption broken in a few
>> > places in the kernel).
>>
>> Currently, if I continue to use the SRAM allocation driver, two
>> options are working:
>> 1) ioremap to ioremap_wc in drivers/misc/sram.c.
>
> I think the SRAM driver does the correct devm_ioremap_resource() call
> but the implementation in lib/devres.c assumes that if
> IORESOURCE_CACHEABLE it should use devm_ioremap(). AFAIK, ioremap()
> always returns Device memory (even on x86, the default ioremap behaviour
> is nocache).
>
>> 2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c
>>
>> Which would be the preferred option ?
>
> I would prefer the sram changed. It provides an allocation pool from
> SRAM which could as well be mapped as write-combine. If no-one complains
> about changing the SRAM mapping to WC, you could introduce a
> devm_ioremap_wc() in lib/devres.c and call that directly from
> sram_probe(). But I can't tell for sure what the memory type
> expectations for SRAM are.

Thanks for the response. I will post patches to do this.
Also, adding guys who have been working on the generic sram driver and
internal sram support for sound in case they have any comments
regarding the change to ioremap_wc.

Regards,
Abhilash

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [alsa-devel] Question on arm64 unaligned faults during playback
@ 2014-12-10  4:10               ` Abhilash Kesavan
  0 siblings, 0 replies; 22+ messages in thread
From: Abhilash Kesavan @ 2014-12-10  4:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On Tue, Dec 9, 2014 at 5:13 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Tue, Dec 09, 2014 at 02:56:59AM +0000, Abhilash Kesavan wrote:
>> On Mon, Dec 8, 2014 at 5:55 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Mon, Dec 08, 2014 at 11:15:52AM +0000, Lars-Peter Clausen wrote:
>> >> On 12/08/2014 11:41 AM, Takashi Iwai wrote:
>> >> > At Mon, 08 Dec 2014 09:49:37 +0100,
>> >> > Clemens Ladisch wrote:
>> >> >>
>> >> >> Abhilash Kesavan wrote:
>> >> >>> I am working on a 64-bit ARM SoC (Samsung's Exynos7) and have observed
>> >> >>> unaligned faults while testing certain sound streams with aplay.
>> >> >>>
>> >> >>> [   24.535661] snd_pcm_lib_write_transfer:sound/core/pcm_lib.c hwbuf is ffffff8000085624, runtime->dma_area is ffffff8000080000, hwoff is 5513, frames_to_bytes is 22052, frames is 5513
>> >> >>> [   24.551244] Unhandled fault: alignment fault (0x96000061) at 0xffffff8000085624
>> >> >>> [   24.579944] PC is at __copy_from_user+0x14/0x60
>> >> >>> [   24.584450] LR is at snd_pcm_lib_write_transfer+0xe4/0x104
>> >> >>> [   24.922054] Call trace:
>> >> >>> [   24.924488] [<ffffffc0002d4784>] __copy_from_user+0x14/0x60
>> >> >>> [   24.930040] [<ffffffc0004b7574>] snd_pcm_lib_write1+0x1fc/0x384
>> >
>> > I can see copy_from_user() here, I assume the source is a user space
>> > buffer and the destination is the ioremap'ed SRAM (which causes the
>> > unaligned fault). Such function is not supposed copy into device memory
>> > (its prototype does not specify __iomem for the destination pointer).
>> >
>> > If you really want Device memory for the SRAM, I think we need some
>> > other API. memcpy_toio() could copy into such buffer but the source must
>> > be kernel. Maybe something like copy_from_user_toio()?
>>
>> Thanks for the response.
>>
>> Changing copy_from_user() to copy_from_user_toio() in
>> snd_pcm_lib_write_transfer() (sound/core/pcm_lib.c) works. I am not
>> sure about other effects of making such a change in a core sound file,
>> can someone comment ?
>
> The default implementation uses copy_from_user() to an intermediate
> buffer and another memcpy_toio() to the SRAM destination. So, it will be
> slower.
>
> In general it may be better to copy this function to some more generic
> place and let the arch code override it with a more optimal
> implementation.
>
>> > Otherwise, if you can remap the buffer as Normal (non-cacheable) memory,
>> > you could use something like vmap() with pgprot_writecombine for the
>> > mapping. Even though ioremap_wc() returns (currently) Normal
>> > non-cacheable memory, such pointer (__iomem) should only be used with
>> > specific IO accessors (though I can see this assumption broken in a few
>> > places in the kernel).
>>
>> Currently, if I continue to use the SRAM allocation driver, two
>> options are working:
>> 1) ioremap to ioremap_wc in drivers/misc/sram.c.
>
> I think the SRAM driver does the correct devm_ioremap_resource() call
> but the implementation in lib/devres.c assumes that if
> IORESOURCE_CACHEABLE it should use devm_ioremap(). AFAIK, ioremap()
> always returns Device memory (even on x86, the default ioremap behaviour
> is nocache).
>
>> 2) copy_from_user to copy_from_user_toio in sound/core/pcm_lib.c
>>
>> Which would be the preferred option ?
>
> I would prefer the sram changed. It provides an allocation pool from
> SRAM which could as well be mapped as write-combine. If no-one complains
> about changing the SRAM mapping to WC, you could introduce a
> devm_ioremap_wc() in lib/devres.c and call that directly from
> sram_probe(). But I can't tell for sure what the memory type
> expectations for SRAM are.

Thanks for the response. I will post patches to do this.
Also, adding guys who have been working on the generic sram driver and
internal sram support for sound in case they have any comments
regarding the change to ioremap_wc.

Regards,
Abhilash

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-12-10  4:10 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-08  5:40 Question on arm64 unaligned faults during playback Abhilash Kesavan
2014-12-08  8:49 ` Clemens Ladisch
2014-12-08 10:41   ` Takashi Iwai
2014-12-08 11:15     ` Lars-Peter Clausen
2014-12-08 11:15       ` [alsa-devel] " Lars-Peter Clausen
2014-12-08 11:31       ` Will Deacon
2014-12-08 11:31         ` Will Deacon
2014-12-09  2:56         ` Abhilash Kesavan
2014-12-09  2:56           ` [alsa-devel] " Abhilash Kesavan
2014-12-08 12:04       ` Takashi Iwai
2014-12-08 12:04         ` [alsa-devel] " Takashi Iwai
2014-12-08 12:25       ` Catalin Marinas
2014-12-08 12:25         ` Catalin Marinas
2014-12-09  2:56         ` Abhilash Kesavan
2014-12-09  2:56           ` Abhilash Kesavan
2014-12-09  8:19           ` Clemens Ladisch
2014-12-09  8:19             ` [alsa-devel] " Clemens Ladisch
2014-12-09 11:43           ` Catalin Marinas
2014-12-09 11:43             ` [alsa-devel] " Catalin Marinas
2014-12-10  4:10             ` Abhilash Kesavan
2014-12-10  4:10               ` [alsa-devel] " Abhilash Kesavan
2014-12-09  2:56   ` Abhilash Kesavan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.