From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752658AbbJEQ2J (ORCPT ); Mon, 5 Oct 2015 12:28:09 -0400
Received: from mail-wi0-f171.google.com ([209.85.212.171]:38046 "EHLO
	mail-wi0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751569AbbJEQ2H (ORCPT ); Mon, 5 Oct 2015 12:28:07 -0400
Date: Mon, 5 Oct 2015 18:28:02 +0200
From: Ingo Molnar
To: Linus Torvalds
Cc: Alexey Dobriyan, Chris Metcalf, Linux Kernel Mailing List,
	Peter Zijlstra, Thomas Gleixner, "H. Peter Anvin", Borislav Petkov
Subject: Re: [PATCH] string: Improve the generic strlcpy() implementation
Message-ID: <20151005162802.GA11474@gmail.com>
References: <20151005162226.GA10993@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151005162226.GA10993@gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Ingo Molnar wrote:

> We could do something like:
>
> 	c = *(unsigned long *)(src+res);
> 	*(unsigned long *)(dest+res) = c;
>
> 	if (has_zero(c, &data, &constants)) {
> 		unsigned int zero_pos;
>
> 		data = prep_zero_mask(c, data, &constants);
> 		data = create_zero_mask(data);
>
> 		zero_pos = find_zero(data);
> 		res += zero_pos;
>
> 		memset(dest+res, 0, sizeof(long)-zero_pos);
>
> 		return res;
> 	}
>
> I.e. the extra memset() clears out the partial word (if any) after the NUL.

A slightly more paranoid version would be:

	c = *(unsigned long *)(src+res);

	if (has_zero(c, &data, &constants)) {
		unsigned int zero_pos;

		data = prep_zero_mask(c, data, &constants);
		data = create_zero_mask(data);

		zero_pos = find_zero(data);

		/* Clear out undefined data within the final word after the NUL: */
		memset((void *)&c + zero_pos, 0, sizeof(long)-zero_pos);

		*(unsigned long *)(dest+res) = c;

		return res+zero_pos;
	}

	*(unsigned long *)(dest+res) = c;

This would solve any theoretical races in the _target_ buffer: if the target
buffer may be copied to user-space in a racy fashion and we don't ever want it
to have undefined data, then this variant does the tail-zeroing of the final
word in the temporary copy, not in the target buffer.

Still untested.

Thanks,

	Ingo
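
For readers who want to try the idea outside the kernel, the "paranoid"
variant above can be approximated in portable user-space C. This is only a
sketch under stated assumptions, not the kernel implementation: the helper
names (first_nul_in_word, zero_word_tail, copy_words_zero_tail) are
hypothetical, a plain byte scan stands in for has_zero()/prep_zero_mask()/
create_zero_mask()/find_zero(), memcpy() replaces the casted word accesses,
and both buffers are assumed to hold nwords full words so that reading a
whole word past the NUL is valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the has_zero()/find_zero() helpers: return the
 * byte offset (in memory order) of the first NUL in the word, or
 * sizeof(unsigned long) if the word contains no NUL.
 */
static size_t first_nul_in_word(unsigned long c)
{
	unsigned char bytes[sizeof(c)];

	memcpy(bytes, &c, sizeof(c));
	for (size_t i = 0; i < sizeof(c); i++)
		if (bytes[i] == 0)
			return i;
	return sizeof(c);
}

/* Clear every byte of the temporary word from zero_pos to its end. */
static void zero_word_tail(unsigned long *c, size_t zero_pos)
{
	assert(zero_pos < sizeof(*c));
	memset((unsigned char *)c + zero_pos, 0, sizeof(*c) - zero_pos);
}

/*
 * Copy up to nwords words from src to dest. Assumes both buffers are at
 * least nwords * sizeof(unsigned long) bytes long. The tail of the final
 * word is zeroed in the temporary copy, so dest never holds the bytes that
 * follow the NUL in src. Returns the string length, or the number of bytes
 * copied if no NUL was found.
 */
static size_t copy_words_zero_tail(char *dest, const char *src, size_t nwords)
{
	size_t res = 0;

	for (size_t i = 0; i < nwords; i++, res += sizeof(unsigned long)) {
		unsigned long c;
		size_t zero_pos;

		memcpy(&c, src + res, sizeof(c));
		zero_pos = first_nul_in_word(c);

		if (zero_pos < sizeof(c)) {
			zero_word_tail(&c, zero_pos);
			memcpy(dest + res, &c, sizeof(c));
			return res + zero_pos;
		}

		memcpy(dest + res, &c, sizeof(c));
	}

	return res;
}
```

The ordering is the point of the sketch: the word is cleaned up in the local
variable first and stored to dest only once, so a concurrent reader of dest
would see either the old contents or the zero-padded final word, never the
source bytes that follow the NUL.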