public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path
       [not found] <20170622171355.267192-1-arnd@arndb.de>
@ 2017-06-22 17:13 ` Arnd Bergmann
  2017-06-23 16:07   ` Greg Kroah-Hartman
  2017-06-25  2:33   ` kbuild test robot
  0 siblings, 2 replies; 5+ messages in thread
From: Arnd Bergmann @ 2017-06-22 17:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kasan-dev, Dmitry Vyukov, Alexander Potapenko, Andrey Ryabinin,
	netdev, linux-kernel, Arend van Spriel, Arnd Bergmann,
	Masahiro Yamada, Michal Marek, Kees Cook, Ingo Molnar,
	David S . Miller, linux-kbuild, Samuel Thibault,
	Greg Kroah-Hartman, Jiri Slaby, stable

kernelci.org reports a crazy stack usage for the VT code when CONFIG_KASAN
is enabled:

drivers/tty/vt/keyboard.c: In function 'kbd_keycode':
drivers/tty/vt/keyboard.c:1452:1: error: the frame size of 2240 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

The problem is that tty_insert_flip_char() gets inlined many times into
kbd_keycode(), and also into other functions, and each copy requires 128
bytes for stack redzone to check for a possible out-of-bounds access on
the 'ch' and 'flags' arguments that are passed into
tty_insert_flip_string_flags as a variable-length string.

This introduces a new __tty_insert_flip_char() function for the slow
path, which receives the two arguments by value. This completely avoids
the problem and the stack usage goes back down to around 100 bytes.

Without KASAN, this is also slightly better, as we don't have to
spill the arguments to the stack but can simply pass 'ch' and 'flag'
in registers, saving a few bytes in .text for each call site.

This should be backported to linux-4.0 or later, which first introduced
the stack sanitizer in the kernel.

Cc: stable@vger.kernel.org
Fixes: c420f167db8c ("kasan: enable stack instrumentation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
I already submitted this separately to Greg, but he hasn't replied
yet. I assume that it's fine if Andrew picks it up along with the
other patches and drops it again in case Greg applies it to linux-next.

 drivers/tty/tty_buffer.c | 24 ++++++++++++++++++++++++
 include/linux/tty_flip.h |  3 ++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 4e7a4e9dcf4d..2da05fa37aec 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -362,6 +362,30 @@ int tty_insert_flip_string_flags(struct tty_port *port,
 EXPORT_SYMBOL(tty_insert_flip_string_flags);
 
 /**
+ *	__tty_insert_flip_char   -	Add one character to the tty buffer
+ *	@port: tty port
+ *	@ch: character
+ *	@flag: flag byte
+ *
+ *	Queue a single byte to the tty buffering, with an optional flag.
+ *	This is the slow path of tty_insert_flip_char.
+ */
+int __tty_insert_flip_char(struct tty_port *port, unsigned char ch, char flag)
+{
+	struct tty_buffer *tb = port->buf.tail;
+	int flags = (flag == TTY_NORMAL) ? TTYB_NORMAL : 0;
+
+	if (!tty_buffer_request_room(port, 1))
+		return 0;
+
+	*flag_buf_ptr(tb, tb->used) = flag;
+	*char_buf_ptr(tb, tb->used++) = ch;
+
+	return 1;
+}
+EXPORT_SYMBOL(__tty_insert_flip_char);
+
+/**
  *	tty_schedule_flip	-	push characters to ldisc
  *	@port: tty port to push from
  *
diff --git a/include/linux/tty_flip.h b/include/linux/tty_flip.h
index c28dd523f96e..d43837f2ce3a 100644
--- a/include/linux/tty_flip.h
+++ b/include/linux/tty_flip.h
@@ -12,6 +12,7 @@ extern int tty_prepare_flip_string(struct tty_port *port,
 		unsigned char **chars, size_t size);
 extern void tty_flip_buffer_push(struct tty_port *port);
 void tty_schedule_flip(struct tty_port *port);
+int __tty_insert_flip_char(struct tty_port *port, unsigned char ch, char flag);
 
 static inline int tty_insert_flip_char(struct tty_port *port,
 					unsigned char ch, char flag)
@@ -26,7 +27,7 @@ static inline int tty_insert_flip_char(struct tty_port *port,
 		*char_buf_ptr(tb, tb->used++) = ch;
 		return 1;
 	}
-	return tty_insert_flip_string_flags(port, &ch, &flag, 1);
+	return __tty_insert_flip_char(port, ch, flag);
 }
 
 static inline int tty_insert_flip_string(struct tty_port *port,
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path
  2017-06-22 17:13 ` [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path Arnd Bergmann
@ 2017-06-23 16:07   ` Greg Kroah-Hartman
  2017-06-26 13:58     ` Arnd Bergmann
  2017-06-25  2:33   ` kbuild test robot
  1 sibling, 1 reply; 5+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-23 16:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Andrew Morton, kasan-dev, Dmitry Vyukov, Alexander Potapenko,
	Andrey Ryabinin, netdev, linux-kernel, Arend van Spriel,
	Masahiro Yamada, Michal Marek, Kees Cook, Ingo Molnar,
	David S . Miller, linux-kbuild, Samuel Thibault, Jiri Slaby,
	stable

On Thu, Jun 22, 2017 at 07:13:51PM +0200, Arnd Bergmann wrote:
> kernelci.org reports a crazy stack usage for the VT code when CONFIG_KASAN
> is enabled:
> 
> drivers/tty/vt/keyboard.c: In function 'kbd_keycode':
> drivers/tty/vt/keyboard.c:1452:1: error: the frame size of 2240 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
> 
> The problem is that tty_insert_flip_char() gets inlined many times into
> kbd_keycode(), and also into other functions, and each copy requires 128
> bytes for stack redzone to check for a possible out-of-bounds access on
> the 'ch' and 'flags' arguments that are passed into
> tty_insert_flip_string_flags as a variable-length string.
> 
> This introduces a new __tty_insert_flip_char() function for the slow
> path, which receives the two arguments by value. This completely avoids
> the problem and the stack usage goes back down to around 100 bytes.
> 
> Without KASAN, this is also slightly better, as we don't have to
> spill the arguments to the stack but can simply pass 'ch' and 'flag'
> in registers, saving a few bytes in .text for each call site.
> 
> This should be backported to linux-4.0 or later, which first introduced
> the stack sanitizer in the kernel.
> 
> Cc: stable@vger.kernel.org
> Fixes: c420f167db8c ("kasan: enable stack instrumentation")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> I already submitted this separately to Greg, but he hasn't replied
> yet. I assume that it's fine if Andrew picks it up along with the
> other patches and drops it again in case Greg applies it to linux-next.

I've been traveling in China this week, give me a chance to catch up
please.

And no, I don't like this patch either, I think kasan needs to be fixed
here, not work around it in odd ways in code that is completly
acceptable to "sane" compilers.  But give me a week to catch up on my
pending stuff first...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path
  2017-06-22 17:13 ` [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path Arnd Bergmann
  2017-06-23 16:07   ` Greg Kroah-Hartman
@ 2017-06-25  2:33   ` kbuild test robot
  1 sibling, 0 replies; 5+ messages in thread
From: kbuild test robot @ 2017-06-25  2:33 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: kbuild-all, Andrew Morton, kasan-dev, Dmitry Vyukov,
	Alexander Potapenko, Andrey Ryabinin, netdev, linux-kernel,
	Arend van Spriel, Arnd Bergmann, Masahiro Yamada, Michal Marek,
	Kees Cook, Ingo Molnar, David S . Miller, linux-kbuild,
	Samuel Thibault, Greg Kroah-Hartman, Jiri Slaby, stable

[-- Attachment #1: Type: text/plain, Size: 1762 bytes --]

Hi Arnd,

[auto build test WARNING on linuxtv-media/master]
[also build test WARNING on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Arnd-Bergmann/bring-back-stack-frame-warning-with-KASAN/20170625-071646
base:   git://linuxtv.org/media_tree.git master
config: x86_64-randconfig-b0-06250903 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/tty/tty_buffer.c: In function '__tty_insert_flip_char':
>> drivers/tty/tty_buffer.c:376: warning: unused variable 'flags'

vim +/flags +376 drivers/tty/tty_buffer.c

   360		return copied;
   361	}
   362	EXPORT_SYMBOL(tty_insert_flip_string_flags);
   363	
   364	/**
   365	 *	__tty_insert_flip_char   -	Add one character to the tty buffer
   366	 *	@port: tty port
   367	 *	@ch: character
   368	 *	@flag: flag byte
   369	 *
   370	 *	Queue a single byte to the tty buffering, with an optional flag.
   371	 *	This is the slow path of tty_insert_flip_char.
   372	 */
   373	int __tty_insert_flip_char(struct tty_port *port, unsigned char ch, char flag)
   374	{
   375		struct tty_buffer *tb = port->buf.tail;
 > 376		int flags = (flag == TTY_NORMAL) ? TTYB_NORMAL : 0;
   377	
   378		if (!tty_buffer_request_room(port, 1))
   379			return 0;
   380	
   381		*flag_buf_ptr(tb, tb->used) = flag;
   382		*char_buf_ptr(tb, tb->used++) = ch;
   383	
   384		return 1;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29450 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path
  2017-06-23 16:07   ` Greg Kroah-Hartman
@ 2017-06-26 13:58     ` Arnd Bergmann
  2017-06-27 20:43       ` Arnd Bergmann
  0 siblings, 1 reply; 5+ messages in thread
From: Arnd Bergmann @ 2017-06-26 13:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andrew Morton, kasan-dev, Dmitry Vyukov, Alexander Potapenko,
	Andrey Ryabinin, Networking, Linux Kernel Mailing List,
	Arend van Spriel, Masahiro Yamada, Michal Marek, Kees Cook,
	Ingo Molnar, David S . Miller, Linux Kbuild mailing list,
	Samuel Thibault, Jiri Slaby, # 3.4.x

On Fri, Jun 23, 2017 at 6:07 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Thu, Jun 22, 2017 at 07:13:51PM +0200, Arnd Bergmann wrote:
>> kernelci.org reports a crazy stack usage for the VT code when CONFIG_KASAN
>> is enabled:
>>
>> drivers/tty/vt/keyboard.c: In function 'kbd_keycode':
>> drivers/tty/vt/keyboard.c:1452:1: error: the frame size of 2240 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
>>
>> The problem is that tty_insert_flip_char() gets inlined many times into
>> kbd_keycode(), and also into other functions, and each copy requires 128
>> bytes for stack redzone to check for a possible out-of-bounds access on
>> the 'ch' and 'flags' arguments that are passed into
>> tty_insert_flip_string_flags as a variable-length string.
>>
>> This introduces a new __tty_insert_flip_char() function for the slow
>> path, which receives the two arguments by value. This completely avoids
>> the problem and the stack usage goes back down to around 100 bytes.
>>
>> Without KASAN, this is also slightly better, as we don't have to
>> spill the arguments to the stack but can simply pass 'ch' and 'flag'
>> in registers, saving a few bytes in .text for each call site.
>>
>> This should be backported to linux-4.0 or later, which first introduced
>> the stack sanitizer in the kernel.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: c420f167db8c ("kasan: enable stack instrumentation")
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> ---
>> I already submitted this separately to Greg, but he hasn't replied
>> yet. I assume that it's fine if Andrew picks it up along with the
>> other patches and drops it again in case Greg applies it to linux-next.
>
> I've been traveling in China this week, give me a chance to catch up
> please.

Sorry about the rush, I thought the new version was going to be
uncontroversial.

Having sent a broken patch (unused variable unless tty patch 2/2 is
applied, but that wasn't part of this series) certainly didn't make me
look any better :(

> And no, I don't like this patch either, I think kasan needs to be fixed
> here, not work around it in odd ways in code that is completly
> acceptable to "sane" compilers.  But give me a week to catch up on my
> pending stuff first...

I have done some more research, and in particular found out more about
what the compiler does, and why it shows up with some compilers
but not others for this particular file:

* when CONFIG_OPTIMIZE_INLINING is set, gcc-5 and higher decide
  to inline put_queue() in keyboard.c, regardless of architecture. gcc-4.9
  does not do this, and without CONFIG_OPTIMIZE_INLINING,
  put_queue() remains out of line for all versions of gcc. clang-3.9 always
  inlines put_queue(), regardless of CONFIG_OPTIMIZE_INLINING.

* with -fsanitize=kernel-address enabled (regardless of asan-stack), both
  clang and gcc give each local variable in an inline function a separate
  stack address when it gets passed by reference. Clang normally tries
  to overlap the addresses (without kasan), gcc apparently does not.

* With asan-stack=1, gcc uses at least 64 bytes per such variable
  (two times ASAN_RED_ZONE_SIZE), while clang only uses 16 bytes
  (2 * (1<<kDefaultShadowScale)). With asan-stack=0, they do not
  use any more space than with kasan completely disabled
  (no -fsanitize=kernel-address).

Can you say which behavior you find 'sane' or 'not sane' here,
specifically? Maybe we can make future gcc releases use a
smaller redzone like clang does.

If we find a way to improve gcc so it uses less stack here, we still
have a problem with existing compilers still producing dangerously
high stack usage, as well as annoying warnings for an allmodconfig
build as soon as we start warning about this again.

       Arnd

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path
  2017-06-26 13:58     ` Arnd Bergmann
@ 2017-06-27 20:43       ` Arnd Bergmann
  0 siblings, 0 replies; 5+ messages in thread
From: Arnd Bergmann @ 2017-06-27 20:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andrew Morton, kasan-dev, Dmitry Vyukov, Alexander Potapenko,
	Andrey Ryabinin, Networking, Linux Kernel Mailing List,
	Arend van Spriel, Masahiro Yamada, Michal Marek, Kees Cook,
	Ingo Molnar, David S . Miller, Linux Kbuild mailing list,
	Samuel Thibault, Jiri Slaby, # 3.4.x

On Mon, Jun 26, 2017 at 3:58 PM, Arnd Bergmann <arnd@arndb.de> wrote:

> * With asan-stack=1, gcc uses at least 64 bytes per such variable
>   (two times ASAN_RED_ZONE_SIZE), while clang only uses 16 bytes
>   (2 * (1<<kDefaultShadowScale)). With asan-stack=0, they do not
>   use any more space than with kasan completely disabled
>   (no -fsanitize=kernel-address).

I asked around the Linaro toolchain team today, and arrived at this commit
in llvm: https://github.com/llvm-mirror/llvm/commit/daa1bf3b74054

Prior to this, the llvm behavior was the same as gcc, using 64 bytes
for each small (<= 16 byte) variable instead of just 16 or 32 as it
does now. llvm now also uses a larger redzone (up to 256 bytes) for
very large stack objects, which also seems like a good idea.

While it would be hard to argue that the gcc behavior is a bug,
it should be possible to implement the same optimization in gcc,
and that would solve a lot of the stack size issues with KASAN.

> Can you say which behavior you find 'sane' or 'not sane' here,
> specifically? Maybe we can make future gcc releases use a
> smaller redzone like clang does.
>
> If we find a way to improve gcc so it uses less stack here, we still
> have a problem with existing compilers still producing dangerously
> high stack usage, as well as annoying warnings for an allmodconfig
> build as soon as we start warning about this again.

This problem obviously still stands.

          Arnd

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-06-27 20:43 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20170622171355.267192-1-arnd@arndb.de>
2017-06-22 17:13 ` [PATCH v3 07/11] tty: improve tty_insert_flip_char() fast path Arnd Bergmann
2017-06-23 16:07   ` Greg Kroah-Hartman
2017-06-26 13:58     ` Arnd Bergmann
2017-06-27 20:43       ` Arnd Bergmann
2017-06-25  2:33   ` kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox