From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Raymond
Subject: More git status --porcelain lossage
Date: Fri, 9 Apr 2010 15:06:01 -0400 (EDT)
Message-ID: <20100409190601.47B37475FEF@snark.thyrsus.com>
To: git@vger.kernel.org
X-From: git-owner@vger.kernel.org Fri Apr 09 21:06:15 2010
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0JWu-0006VL-A1 for gcvg-git-2@lo.gmane.org; Fri, 09 Apr 2010 21:06:12 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754206Ab0DITGF (ORCPT ); Fri, 9 Apr 2010 15:06:05 -0400
Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:58421 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753760Ab0DITGC (ORCPT ); Fri, 9 Apr 2010 15:06:02 -0400
Received: by snark.thyrsus.com (Postfix, from userid 23) id 47B37475FEF; Fri, 9 Apr 2010 15:06:01 -0400 (EDT)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

After I posted my last, I noticed another crash landing...

A format properly designed for script parseability should use even use
whitespace as a field separator.

Why?

Because if you do that, front ends *will* do field analysis using a
naive split-on-whitespace operation. And then...someday...someone
will try to run one of these on a volume from a system where
filenames contain embedded whitespace. Like Mac OS X or Windows.

Hilarity will ensue.

Conclusion: As it is presently, git status --porcelain format is
irretrievably botched. You need a field separator that's much less
likely to land in a filename, like '|' - and to warn in the documentation
that careful front ends must check for and ignore '\|'.

-- Eric S.
Raymond The right of the citizens to keep and bear arms has justly been considered as the palladium of the liberties of a republic; since it offers a strong moral check against usurpation and arbitrary power of rulers; and will generally, even if these are successful in the first instance, enable the people to resist and triumph over them." -- Supreme Court Justice Joseph Story of the John Marshall Court From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Raymond Subject: Re: More git status --porcelain lossage Date: Fri, 9 Apr 2010 15:09:36 -0400 Organization: Eric Conspiracy Secret Labs Message-ID: <20100409190936.GA15170@thyrsus.com> References: <20100409190601.47B37475FEF@snark.thyrsus.com> Reply-To: esr@thyrsus.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: git@vger.kernel.org To: Eric Raymond X-From: git-owner@vger.kernel.org Fri Apr 09 21:09:42 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0JaI-00089m-HW for gcvg-git-2@lo.gmane.org; Fri, 09 Apr 2010 21:09:42 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754206Ab0DITJh (ORCPT ); Fri, 9 Apr 2010 15:09:37 -0400 Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:35422 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752731Ab0DITJg (ORCPT ); Fri, 9 Apr 2010 15:09:36 -0400 Received: by snark.thyrsus.com (Postfix, from userid 23) id 6904C475FEF; Fri, 9 Apr 2010 15:09:36 -0400 (EDT) Content-Disposition: inline In-Reply-To: <20100409190601.47B37475FEF@snark.thyrsus.com> X-Eric-Conspiracy: There is no conspiracy User-Agent: Mutt/1.5.20 (2009-06-14) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: Eric Raymond : > A format properly designed for script parseability should use even use > whitespace as a field separator. 
should *not* even use... -- Eric S. Raymond From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jakub Narebski Subject: Re: More git status --porcelain lossage Date: Fri, 09 Apr 2010 12:22:22 -0700 (PDT) Message-ID: References: <20100409190601.47B37475FEF@snark.thyrsus.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: git@vger.kernel.org To: Eric Raymond X-From: git-owner@vger.kernel.org Fri Apr 09 21:23:24 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0JnU-0006Ro-HZ for gcvg-git-2@lo.gmane.org; Fri, 09 Apr 2010 21:23:20 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756105Ab0DITWg (ORCPT ); Fri, 9 Apr 2010 15:22:36 -0400 Received: from mail-fx0-f223.google.com ([209.85.220.223]:50459 "EHLO mail-fx0-f223.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755508Ab0DITWZ (ORCPT ); Fri, 9 Apr 2010 15:22:25 -0400 Received: by fxm23 with SMTP id 23so3051666fxm.21 for ; Fri, 09 Apr 2010 12:22:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:received:received :x-authentication-warning:to:cc:subject:references:from:date :in-reply-to:message-id:lines:user-agent:mime-version:content-type; bh=u2QO5GMblslN0YM5c6cQSOxk+qnFP2hpBxfAkr92KHY=; b=oeNeqPgIHLEDtIOPkknuh3sJMoqDc4/As1s3dyg0X3QUOtItQ+BGxSoelkJBhinl7s 63TMP2FoVXeRa5gEJjP7rPriLeNpYcYu4SD+shNxBgHhYYrWMS01fPUUJBNCGeDjOoGO ZEV8ZdZrUQvf2fSgBGIGcVddGtau+2PGgzmn0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=x-authentication-warning:to:cc:subject:references:from:date :in-reply-to:message-id:lines:user-agent:mime-version:content-type; b=xZLeSVUxlkfXUGxZdpKJEP065Z64pGN8tX4HgLBaKqgajlWFYqajJR9afznJdgC8LB DAolZa+Y9Bb2DAj6mOg/zfLJLu2ZYtH6c4wy3HBjM0oHqSMAhAp4Tizs4dt00zd3JBBi YI7WD3f/thqEn1g03di9iHFpsVIoZLhAEGQ1w= Received: by 10.103.84.1 with SMTP 
id m1mr272634mul.26.1270840943650; Fri, 09 Apr 2010 12:22:23 -0700 (PDT)
Received: from localhost.localdomain (abvc167.neoplus.adsl.tpnet.pl [83.8.200.167]) by mx.google.com with ESMTPS id j10sm6803545muh.58.2010.04.09.12.22.22 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 09 Apr 2010 12:22:22 -0700 (PDT)
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.13.4/8.13.4) with ESMTP id o39JLvML017280; Fri, 9 Apr 2010 21:22:08 +0200
Received: (from jnareb@localhost) by localhost.localdomain (8.13.4/8.13.4/Submit) id o39JLe3F017275; Fri, 9 Apr 2010 21:21:40 +0200
X-Authentication-Warning: localhost.localdomain: jnareb set sender to jnareb@gmail.com using -f
In-Reply-To: <20100409190601.47B37475FEF@snark.thyrsus.com>
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

Eric Raymond writes:

> After I posted my last, I noticed another crash landing...
>
> A format properly designed for script parseability should use even use
> whitespace as a field separator.
>
> Why?
>
> Because if you do that, front ends *will* do field analysis using a
> naive split-on-whitespace operation. And then...someday...someone
> will try to run one of these on a volume from a system where
> filenames contain embedded whitespace. Like Mac OS X or Windows.
>
> Hilarity will ensue.
>
> Conclusion: As it is presently, git status --porcelain format is
> irretrievably botched. You need a field separator that's much less
> likely to land in a filename, like '|' - and to warn in the documentation
> that careful front ends must check for and ignore '\|'.

Or follow what other porcelain does, like git-diff-tree raw output
format, where all fields except final filename are space separated,
filename is separated by tab character (or NUL when '-z' option is
used).
If there are two names (in the case of copy or renames), they are separated by a tab (or NUL). Record ends with LF (or NUL). When '-z' option is not used, TAB, LF, " and backslash characters are represented by '\t', '\n', '\"' and \\, and the filename is enclosed in '"' doublequotes. -- Jakub Narebski Poland ShadeHawk on #git From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Raymond Subject: Re: More git status --porcelain lossage Date: Fri, 9 Apr 2010 15:50:29 -0400 Organization: Eric Conspiracy Secret Labs Message-ID: <20100409195029.GA15810@thyrsus.com> References: <20100409190601.47B37475FEF@snark.thyrsus.com> Reply-To: esr@thyrsus.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Eric Raymond , git@vger.kernel.org To: Jakub Narebski X-From: git-owner@vger.kernel.org Fri Apr 09 21:51:08 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0KEN-00051C-Oq for gcvg-git-2@lo.gmane.org; Fri, 09 Apr 2010 21:51:08 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756535Ab0DITuf (ORCPT ); Fri, 9 Apr 2010 15:50:35 -0400 Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:38280 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755966Ab0DITua (ORCPT ); Fri, 9 Apr 2010 15:50:30 -0400 Received: by snark.thyrsus.com (Postfix, from userid 23) id 03347475FEF; Fri, 9 Apr 2010 15:50:30 -0400 (EDT) Content-Disposition: inline In-Reply-To: X-Eric-Conspiracy: There is no conspiracy User-Agent: Mutt/1.5.20 (2009-06-14) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: Jakub Narebski : > > Conclusion: As it is presently, git status --porcelain format is > > irretrievably botched. 
You need a field separator that's much less
> > likely to land in a filename, like '|' - and to warn in the documentation
> > that careful front ends must check for and ignore '\|'.
>
> Or follow what other porcelain does, like git-diff-tree raw output
> format, where all fields except final filename are space separated,
> filename is separated by tab character (or NUL when '-z' option is
> used). If there are two names (in the case of copy or renames),
> they are separated by a tab (or NUL). Record ends with LF (or NUL).
>
> When '-z' option is not used, TAB, LF, " and backslash characters
> are represented by '\t', '\n', '\"' and \\, and the filename is
> enclosed in '"' doublequotes.

That would be a bit trickier to parse, but acceptable.

-- Eric S. Raymond

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff King
Subject: Re: More git status --porcelain lossage
Date: Sat, 10 Apr 2010 00:12:48 -0400
Message-ID: <20100410041247.GB11977@coredump.intra.peff.net>
References: <20100409190601.47B37475FEF@snark.thyrsus.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: git@vger.kernel.org
To: Eric Raymond
X-From: git-owner@vger.kernel.org Sat Apr 10 06:13:21 2010
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0S4L-0007yf-04 for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 06:13:17 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750838Ab0DJENM (ORCPT ); Sat, 10 Apr 2010 00:13:12 -0400
Received: from peff.net ([208.65.91.99]:46239 "EHLO peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750719Ab0DJENL (ORCPT ); Sat, 10 Apr 2010 00:13:11 -0400
Received: (qmail 13306 invoked by uid 107); 10 Apr 2010 04:13:11 -0000
Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.40) with (AES128-SHA encrypted) SMTP; Sat, 10 Apr 2010 00:13:11 -0400
Received: by
coredump.intra.peff.net (sSMTP sendmail emulation); Sat, 10 Apr 2010 00:12:48 -0400
Content-Disposition: inline
In-Reply-To: <20100409190601.47B37475FEF@snark.thyrsus.com>
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Fri, Apr 09, 2010 at 03:06:01PM -0400, Eric Raymond wrote:

> A format properly designed for script parseability should use even use
> whitespace as a field separator.
>
> Why?
>
> Because if you do that, front ends *will* do field analysis using a
> naive split-on-whitespace operation. And then...someday...someone
> will try to run one of these on a volume from a system where
> filenames contain embedded whitespace. Like Mac OS X or Windows.

Yes, that is why almost every scriptable git interface supports a "-z"
variant with NUL termination.

> Conclusion: As it is presently, git status --porcelain format is
> irretrievably botched. You need a field separator that's much less
> likely to land in a filename, like '|' - and to warn in the documentation
> that careful front ends must check for and ignore '\|'.

We already quote correctly, so it is only sloppy parsers that will be in
trouble. Yes, space is more common than "|", but sloppy is sloppy. Parse
it right, or use "-z".
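
For a front end taking the "-z" route, a minimal Python sketch follows. It assumes the NUL-terminated layout that the git-status manual page describes for "--porcelain -z": a two-character status, one space, then the path, with every path followed by NUL and, for renames and copies, the original path as an extra NUL-terminated field. The function name parse_status_z and the sample bytes are illustrative assumptions, not part of git.

```python
def parse_status_z(data: bytes):
    """Split `git status --porcelain -z` style output into records.

    Returns a list of (xy, path, orig_path) tuples. Paths stay as bytes
    because filenames are arbitrary byte strings; orig_path is None
    unless the entry is a rename or copy.
    """
    fields = data.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty piece after the final NUL

    entries = []
    i = 0
    while i < len(fields):
        record = fields[i]
        xy, path = record[:2], record[3:]  # byte 2 is the separating space
        orig_path = None
        if b"R" in xy or b"C" in xy:
            # Assumed layout: the source path follows as its own field.
            i += 1
            if i < len(fields):
                orig_path = fields[i]
        entries.append((xy.decode("ascii"), path, orig_path))
        i += 1
    return entries


# Hypothetical sample: a modified file, a rename, and an untracked file.
sample = b" M my file.txt\0R  new name.c\0old name.c\0?? a|b\0"
for entry in parse_status_z(sample):
    print(entry)
```

Nothing here splits on spaces, so names containing spaces or '|' come through intact. Real input would come from running git status --porcelain -z and reading its standard output as bytes.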
-Peff

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff King
Subject: Re: More git status --porcelain lossage
Date: Sat, 10 Apr 2010 00:14:42 -0400
Message-ID: <20100410041442.GC11977@coredump.intra.peff.net>
References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410041247.GB11977@coredump.intra.peff.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: git@vger.kernel.org
To: Eric Raymond
X-From: git-owner@vger.kernel.org Sat Apr 10 06:15:13 2010
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0S6C-0008IM-2i for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 06:15:12 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750841Ab0DJEPH (ORCPT ); Sat, 10 Apr 2010 00:15:07 -0400
Received: from peff.net ([208.65.91.99]:46242 "EHLO peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750719Ab0DJEPG (ORCPT ); Sat, 10 Apr 2010 00:15:06 -0400
Received: (qmail 13329 invoked by uid 107); 10 Apr 2010 04:15:06 -0000
Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.40) with (AES128-SHA encrypted) SMTP; Sat, 10 Apr 2010 00:15:06 -0400
Received: by coredump.intra.peff.net (sSMTP sendmail emulation); Sat, 10 Apr 2010 00:14:42 -0400
Content-Disposition: inline
In-Reply-To: <20100410041247.GB11977@coredump.intra.peff.net>
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Sat, Apr 10, 2010 at 12:12:48AM -0400, Jeff King wrote:

> > Conclusion: As it is presently, git status --porcelain format is
> > irretrievably botched. You need a field separator that's much less
> > likely to land in a filename, like '|' - and to warn in the documentation
> > that careful front ends must check for and ignore '\|'.
>
> We already quote correctly, so it is only sloppy parsers that will be in
> trouble.
Yes, space is more common than "|", but sloppy is sloppy. Parse > it right, or use "-z". BTW, this should go on your "git status --porcelain documentation failures" list. We really need to note that the output paths may be quoted. -Peff From mboxrd@z Thu Jan 1 00:00:00 1970 From: Simon Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 14:48:05 -0400 Message-ID: References: <20100409190601.47B37475FEF@snark.thyrsus.com> Reply-To: turner25@gmail.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: git@vger.kernel.org To: Eric Raymond X-From: git-owner@vger.kernel.org Sat Apr 10 20:48:17 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0fj5-0007Hy-VV for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 20:48:16 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751926Ab0DJSsK convert rfc822-to-quoted-printable (ORCPT ); Sat, 10 Apr 2010 14:48:10 -0400 Received: from mail-gw0-f46.google.com ([74.125.83.46]:35986 "EHLO mail-gw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751050Ab0DJSsI convert rfc822-to-8bit (ORCPT ); Sat, 10 Apr 2010 14:48:08 -0400 Received: by gwj19 with SMTP id 19so323108gwj.19 for ; Sat, 10 Apr 2010 11:48:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:reply-to:in-reply-to :references:date:received:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=O2TMe9tpzNKNBPBRHnhByyjUzqUba3eXpLX2svxINqA=; b=x7H/44UxXBVJ3mDbBY2At7S25cBVYvnX2Af/vjzpTfSxcu5v65o7WN+D/jKnOL2q3I YH+Y1i4kzVCtAw1YuWWby2mFngy+XJ72FW6E1nFj0Unp/unemHg+0yG3oxxlFjZImF1d 0vcQMlnar0fw0csyP+w6P0YNraLOfU41PMWGY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=mime-version:reply-to:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:content-transfer-encoding; b=B7srwfKaotxctYgUpQanrkkn0u99Vq4SkezGQJeCF0J+Y522uO8U4eNXXcaN0bMkuV XknldPTj+632FU61c22MoprLm2wfmh5j9qp6LiNhFnNM8/psAbFXw3eFhe81P7MQvn3h AOohyHogbtWsCQwC3nZX1XEanrYRYu+JaYcP4=
Received: by 10.100.142.16 with HTTP; Sat, 10 Apr 2010 11:48:05 -0700 (PDT)
In-Reply-To: <20100409190601.47B37475FEF@snark.thyrsus.com>
Received: by 10.101.135.40 with SMTP id m40mr3334785ann.1.1270925285636; Sat, 10 Apr 2010 11:48:05 -0700 (PDT)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

> A format properly designed for script parseability should use even use
> whitespace as a field separator.
>
> Why?
>
> Because if you do that, front ends *will* do field analysis using a
> naive split-on-whitespace operation. And then...someday...someone
> will try to run one of these on a volume from a system where
> filenames contain embedded whitespace. Like Mac OS X or Windows.

Why not use an XML output?
Plain text is easier to parse, but XML may give this extra durability
you are looking for?
Simon From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jakub Narebski Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 12:01:41 -0700 (PDT) Message-ID: References: <20100409190601.47B37475FEF@snark.thyrsus.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Eric Raymond , git@vger.kernel.org To: Simon X-From: git-owner@vger.kernel.org Sat Apr 10 21:07:16 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0g1R-0004zJ-0F for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 21:07:13 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752176Ab0DJTHG convert rfc822-to-quoted-printable (ORCPT ); Sat, 10 Apr 2010 15:07:06 -0400 Received: from mail-bw0-f219.google.com ([209.85.218.219]:35950 "EHLO mail-bw0-f219.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752164Ab0DJTHE convert rfc822-to-8bit (ORCPT ); Sat, 10 Apr 2010 15:07:04 -0400 X-Greylist: delayed 319 seconds by postgrey-1.27 at vger.kernel.org; Sat, 10 Apr 2010 15:07:04 EDT Received: by bwz19 with SMTP id 19so21434bwz.21 for ; Sat, 10 Apr 2010 12:07:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:received:received :x-authentication-warning:to:cc:subject:references:from:date :in-reply-to:message-id:lines:user-agent:mime-version:content-type :content-transfer-encoding; bh=JivKF7afxvzatH/WUZyGm218T1HjIBN/ZLZ2q6W90UU=; b=nphiNANPAXvayF6OZnR14aiVZQJk9TG+GzEjq6MtO7ZIABCvkFsOMxvNG7GQOTzsIi CozYywY5qNhrEZi3yM2RShYDea54k9LzUgl2prADovpcYnK2FrJJPSxZvRQr25DylrcD lkWDioMJmz/ysoV+b/dKr2dkWnc1D+eS9Z+cw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=x-authentication-warning:to:cc:subject:references:from:date :in-reply-to:message-id:lines:user-agent:mime-version:content-type 
:content-transfer-encoding; b=U05UW3WNvfs2bMq8Vf0DdmG+TfF6YhtBe4uXLRLQUoWbD3ycBijH6AkZyRBtUSOBwN 2mTp8OemKMz2Xv9z9suqfNaK0f2pVY6C6RkIDcmrAdkabEc8H9j5jP6Mef84yLL3IzZh wC8mZ4yPb73ICHZTKooXnZcFZWOR9LngiqBNk=
Received: by 10.204.6.25 with SMTP id 25mr1938451bkx.135.1270926103275; Sat, 10 Apr 2010 12:01:43 -0700 (PDT)
Received: from localhost.localdomain (abvp94.neoplus.adsl.tpnet.pl [83.8.213.94]) by mx.google.com with ESMTPS id s17sm21306071bkd.16.2010.04.10.12.01.40 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 10 Apr 2010 12:01:41 -0700 (PDT)
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.13.4/8.13.4) with ESMTP id o3AJ1CHh023729; Sat, 10 Apr 2010 21:01:18 +0200
Received: (from jnareb@localhost) by localhost.localdomain (8.13.4/8.13.4/Submit) id o3AJ0uum023723; Sat, 10 Apr 2010 21:00:56 +0200
X-Authentication-Warning: localhost.localdomain: jnareb set sender to jnareb@gmail.com using -f
In-Reply-To:
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

Simon writes:

> > A format properly designed for script parseability should use even use
> > whitespace as a field separator.
> >
> > Why?
> >
> > Because if you do that, front ends *will* do field analysis using a
> > naive split-on-whitespace operation. And then...someday...someone
> > will try to run one of these on a volume from a system where
> > filenames contain embedded whitespace. Like Mac OS X or Windows.
>
> Why not use an XML output?
> Plain text is easier to parse, but XML may give this extra durability
> you are looking for?

Are out of your f**g mind? XML, really? XML might be good choice to
*define* _document_ formats, but is really poor data exchange /
serialization format (being overly verbose, among others). Also, XML
is not language but meta-language.

I could understand providing JSON format, specified using --json
option.
I think there is some GPLv2 compatible JSON generating code
in C (MIT licensed code is GPLv2 compatible, isn't it?); we can
always borrow compact JSON generation code from GPSD project (if
license allows it) from ESR.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Raymond
Subject: Re: More git status --porcelain lossage
Date: Sat, 10 Apr 2010 15:30:39 -0400
Organization: Eric Conspiracy Secret Labs
Message-ID: <20100410193039.GA28768@thyrsus.com>
References: <20100409190601.47B37475FEF@snark.thyrsus.com>
Reply-To: esr@thyrsus.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Eric Raymond , git@vger.kernel.org
To: Simon
X-From: git-owner@vger.kernel.org Sat Apr 10 21:30:47 2010
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0gOD-0005js-Om for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 21:30:46 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751420Ab0DJTak (ORCPT ); Sat, 10 Apr 2010 15:30:40 -0400
Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:48081 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751177Ab0DJTak (ORCPT ); Sat, 10 Apr 2010 15:30:40 -0400
Received: by snark.thyrsus.com (Postfix, from userid 23) id 8438D20CBBC; Sat, 10 Apr 2010 15:30:39 -0400 (EDT)
Content-Disposition: inline
In-Reply-To:
X-Eric-Conspiracy: There is no conspiracy
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

Simon :

> Why not use an XML output?
> Plain text is easier to parse, but XML may give this extra durability
> you are looking for?

Because XML is awfully heavyweight, and XML parsers tend to be slow.

If we were going to build on a metaprotocol, JSON would be better. IMHO.

-- Eric S.
Raymond From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Raymond Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 15:41:54 -0400 Organization: Eric Conspiracy Secret Labs Message-ID: <20100410194154.GB28768@thyrsus.com> References: <20100409190601.47B37475FEF@snark.thyrsus.com> Reply-To: esr@thyrsus.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Simon , Eric Raymond , git@vger.kernel.org To: Jakub Narebski X-From: git-owner@vger.kernel.org Sat Apr 10 21:42:06 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0gZ7-0001Ha-KL for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 21:42:02 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751916Ab0DJTl4 (ORCPT ); Sat, 10 Apr 2010 15:41:56 -0400 Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:60191 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751692Ab0DJTlz (ORCPT ); Sat, 10 Apr 2010 15:41:55 -0400 Received: by snark.thyrsus.com (Postfix, from userid 23) id 10D3920CBBC; Sat, 10 Apr 2010 15:41:54 -0400 (EDT) Content-Disposition: inline In-Reply-To: X-Eric-Conspiracy: There is no conspiracy User-Agent: Mutt/1.5.20 (2009-06-14) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: Jakub Narebski : > Are out of your f**g mind? XML, really? XML might be good choice to > *define* _document_ formats, but is really poor data exchange / > serialization format (being overly verbose, among others). Also, XML > is not language but meta-language. Agreed. > I could understand providing JSON format, specified using --json > option. You know, that's actually an interesting idea. I mentioned it previously as the not-XML if we want to build on a metaprotocol; I wasn't considering it seriously then. 
But I am now, and it is
not without attractions. JSON would certainly solve all the delimiter
and empty-object edge cases, and it has excellent extensibility.

> I think there is some GPLv2 compatible JSON generating code
> in C (MIT licensed code is GPLv2 compatible, isn't it?); we can
> always borrow compact JSON generation code from GPSD project (if
> license allows it) from ESR.

My license would allow it, but there's not really a lot of win in
trying to reuse JSON generator code - writing your own printfs for
it by hand is easy and fast.

Emacs Lisp has a JSON parser, so it would meet my needs.

Alternatively, a cleaned-up --porcelain -z along the lines previously
suggested would be good.

Supplying both might not be a bad idea. The volume of code involved
would be low.

-- Eric S. Raymond

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ævar Arnfjörð Bjarmason
Subject: Re: More git status --porcelain lossage
Date: Sat, 10 Apr 2010 19:39:59 +0000
Message-ID:
References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410193039.GA28768@thyrsus.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Simon , Eric Raymond , git@vger.kernel.org
To: esr@thyrsus.com
X-From: git-owner@vger.kernel.org Sat Apr 10 21:47:20 2010
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0geF-0003UR-IE for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 21:47:19 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751703Ab0DJTq7 convert rfc822-to-quoted-printable (ORCPT ); Sat, 10 Apr 2010 15:46:59 -0400
Received: from mail-bw0-f219.google.com ([209.85.218.219]:64309 "EHLO mail-bw0-f219.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751177Ab0DJTq6 convert rfc822-to-8bit (ORCPT ); Sat, 10 Apr 2010 15:46:58 -0400
X-Greylist: delayed 416 seconds by
postgrey-1.27 at vger.kernel.org; Sat, 10 Apr 2010 15:46:58 EDT
Received: by bwz19 with SMTP id 19so36204bwz.21 for ; Sat, 10 Apr 2010 12:46:56 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:received:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=N/2fmqb2YYnws6EcilwRyK3OfwIx/tYZoDTXRmmE7W0=; b=Dm0vPlTy6QrkTo7IdLYl8ZdpWhD9peRtB+7FiIoXwryYLIl68jrOxstPQrbU7KLb6O 8ffD3AVaNugk6Bm2C2vidg/KkuR4XUrm20mTWpODbs/hUWa8Gnc3Pvd0I4ke9di3nDb2 9rI4ejwU9P6aA9X/rUVJiOjGi0KuF7OFDzXjM=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=HstAUhAxP/rNZIeLIYrOcfYmGyBVcdp98ApmRS0ZS2Moi5+za23K5X24kTTjk0uaqw zwbEmoJsTgPf0MZ6Bg0JNmL+frMiBXadHiueXwMRbiOlqnXZMSxgsx0HwzroMJl3SJIY N34pPo/bZ6CUxFfz7mfYaZPl/p0EMDLqv1pNE=
Received: by 10.204.121.195 with HTTP; Sat, 10 Apr 2010 12:39:59 -0700 (PDT)
In-Reply-To: <20100410193039.GA28768@thyrsus.com>
Received: by 10.204.6.193 with SMTP id a1mr2050809bka.104.1270928399785; Sat, 10 Apr 2010 12:39:59 -0700 (PDT)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Sat, Apr 10, 2010 at 19:30, Eric Raymond wrote:

> Simon :
>> Why not use an XML output?
>> Plain text is easier to parse, but XML may give this extra durability
>> you are looking for?
>
> Because XML is awfully heavyweight, and XML parsers tend to be slow.
>
> If we were going to build on a metaprotocol, JSON would be better. IMHO.

A lot of web services (like some Catalyst-based applications) support
all of these equally. If Git had machine readable output like this it
would be nice if every git-* program just had --format=* where * could
be xml, json, yaml, sexp, perl etc.
The program would just construct a native datastructure and then there would be an output driver to generate the textual representation. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Martin Langhoff Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 16:31:55 -0400 Message-ID: References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410194154.GB28768@thyrsus.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Jakub Narebski , Simon , Eric Raymond , git@vger.kernel.org To: esr@thyrsus.com X-From: git-owner@vger.kernel.org Sat Apr 10 22:32:33 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0hM0-0002vV-Va for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 22:32:33 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751191Ab0DJUcQ convert rfc822-to-quoted-printable (ORCPT ); Sat, 10 Apr 2010 16:32:16 -0400 Received: from mail-iw0-f197.google.com ([209.85.223.197]:39742 "EHLO mail-iw0-f197.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750829Ab0DJUcP convert rfc822-to-8bit (ORCPT ); Sat, 10 Apr 2010 16:32:15 -0400 Received: by iwn35 with SMTP id 35so419043iwn.21 for ; Sat, 10 Apr 2010 13:32:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :from:date:received:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=JCnyU7aTo+1e63Epn8jfBJK9KGxSQuv0ceDobi6vkD0=; b=ORMiw8v+GQHMO0c/1dXH4thHK0SbdongkhBubqqns+5FQ+SXOoUub4h+GID6y3s4l2 6nPjKzItrJu7U5mjOhYfcdoYRsOgaC4NZ7P2MZVibq+5FZx7B7rka7+5djHsKMgC/7eE Go5kEKQnZBdfBH3punghwuSiGIwy1ARyl9g3A= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to 
:cc:content-type:content-transfer-encoding; b=sCixn2k55xMoI4F8TC0o1zLhvk2TF8U2BqFFJGpSnw4KI7dP5BqmjKLrzgttIAmTBu pM/lK5o/89yAWzaok7JITqCP/i4rPZPSrpQVdox33XooGE+7qPe7BZvpR+38dfv7qbNi zSDoVft66qHyS5Mq+NKH2A7A6QZtsCFFi0D+E=
Received: by 10.231.207.67 with HTTP; Sat, 10 Apr 2010 13:31:55 -0700 (PDT)
In-Reply-To: <20100410194154.GB28768@thyrsus.com>
Received: by 10.231.169.145 with SMTP id z17mr847914iby.83.1270931535188; Sat, 10 Apr 2010 13:32:15 -0700 (PDT)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Sat, Apr 10, 2010 at 3:41 PM, Eric Raymond wrote:

>> I could understand providing JSON format, specified using --json
>> option.
>
> You know, that's actually an interesting idea. I mentioned it
> previously as the not-XML if we want to build on a metaprotocol;

One issue is that there are no stream-parser JSON implementations that
I'm aware of. Everything I've seen is in-memory, therefore apt only for
memory-bound operations.

Not sure if all commands with -z output options can be assumed to
produce bound-sized datasets.
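
On the streaming concern, one possible layout, sketched here purely as an assumption (git offers nothing of the sort), is to emit one JSON object per line instead of a single enclosing array. A reader can then decode records one at a time with an ordinary in-memory parser, so memory use is bounded by the largest record rather than by the whole output. The field names "status" and "path" and the function iter_records are hypothetical.

```python
import io
import json


def iter_records(stream):
    """Yield one decoded object per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical producer side: json.dumps escapes embedded newlines,
# so each record occupies exactly one line of output.
records = [
    {"status": " M", "path": "my file.txt"},
    {"status": "??", "path": "odd\nname"},
]
text = "".join(json.dumps(r) + "\n" for r in records)

# Consumer side: records are decoded lazily, one line at a time.
for record in iter_records(io.StringIO(text)):
    print(record["status"], repr(record["path"]))
```

The trade-off is that the output as a whole is no longer a single well-formed JSON document, only each line is.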
cheers, martin -- martin.langhoff@gmail.com martin@laptop.org -- School Server Architect - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jakub Narebski Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 23:21:58 +0200 Message-ID: <201004102321.59263.jnareb@gmail.com> References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410194154.GB28768@thyrsus.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Cc: Simon , Eric Raymond , git@vger.kernel.org To: esr@thyrsus.com X-From: git-owner@vger.kernel.org Sat Apr 10 23:22:33 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0i8P-0001dz-DA for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 23:22:33 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752291Ab0DJVWQ (ORCPT ); Sat, 10 Apr 2010 17:22:16 -0400 Received: from fg-out-1718.google.com ([72.14.220.158]:47727 "EHLO fg-out-1718.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752270Ab0DJVWK (ORCPT ); Sat, 10 Apr 2010 17:22:10 -0400 Received: by fg-out-1718.google.com with SMTP id 22so35815fge.1 for ; Sat, 10 Apr 2010 14:22:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:subject:date :user-agent:cc:references:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:message-id; bh=f1B6vsb7E0dh1maI5GNHt/19H+0TVTSP7AaWzF1bKs4=; b=xcvHMUqmk5Sz9KkJ+hmff3QeqvMQ7BKT6Kstvq5AMh79kREWLE1efHLX1rs+SLLw4Y i/JlHeGpVFBYf4NdC8wHQkjnWL2S5YB01vZLGU4MQg6VTfE/ZcEYKJsKN4U3s8UbCV4+ ZrqwhbEnLtq4DnwycZAoxxXv5S9MeK52t4Dck= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=from:to:subject:date:user-agent:cc:references:in-reply-to :mime-version:content-type:content-transfer-encoding :content-disposition:message-id; b=BiQ9lGOOfNH4KiaqOmvNfKS9KaX7zEv3K0RUSL0ZYOePY5eY04WyCjCSyuJelRb59d ZpBSFCDmgf/Gl8FNl3fdgWITR36xE2xZh61ppj5Tq4d4sSSSks9T6dTS7laFzXTRZOlC m34DE4v3lujdtJfV41rNNpGm1+GZ+eerQ4oIA= Received: by 10.103.7.30 with SMTP id k30mr938513mui.24.1270934528029; Sat, 10 Apr 2010 14:22:08 -0700 (PDT) Received: from [192.168.1.13] (abvp94.neoplus.adsl.tpnet.pl [83.8.213.94]) by mx.google.com with ESMTPS id w5sm10883522mue.54.2010.04.10.14.22.06 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 10 Apr 2010 14:22:07 -0700 (PDT) User-Agent: KMail/1.9.3 In-Reply-To: <20100410194154.GB28768@thyrsus.com> Content-Disposition: inline Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: On Sat, 10 Apr 2010, Eric Raymond wrote: > Jakub Narebski : > > > I could understand providing JSON format, specified using --json > > option. > > You know, that's actually an interesting idea. I mentioned it > previously as the not-XML if we want to build on a metaprotocol; > I wasn't considering it seriously then. But I am now, and it is > not without attractions. JSON would certainly solve all the delimiter > and empty-object edge cases, and it has excellent extensibility. It is a bit chatty, but is to some extent self-documenting. The question is whether it should output a well-formed array of objects, or just a list of objects not wrapped in an array... > > I think there is some GPLv2 compatible JSON generating code > > in C (MIT licensed code is GPLv2 compatible, isn't it?); we can > > always borrow compact JSON generation code from GPSD project (if > > license allows it) from ESR. > > My license would allow it, but there's not really a lot of win in > trying to reuse JSON generator code - writing your own printfs for > it by hand is easy and fast.
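Hand-written printf output still has to get string quoting right, because a filename may contain quotes, backslashes or control characters. Here is a minimal C sketch of that one piece. The name `json_quote` is hypothetical and not taken from git or GPSD, and the sketch assumes the input bytes are already valid UTF-8, so anything at or above 0x80 is passed through unchanged.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `s` as a JSON string literal into `out`
 * (capacity `cap`, including the terminating NUL).
 * Returns the number of characters written, or -1 if it does not fit. */
static int json_quote(char *out, size_t cap, const char *s)
{
    size_t n = 0;

/* Append one character, always leaving room for the final NUL. */
#define PUT(ch) do { if (n + 1 >= cap) return -1; out[n++] = (ch); } while (0)

    PUT('"');
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;

        switch (c) {
        case '"':  PUT('\\'); PUT('"');  break;
        case '\\': PUT('\\'); PUT('\\'); break;
        case '\b': PUT('\\'); PUT('b');  break;
        case '\f': PUT('\\'); PUT('f');  break;
        case '\n': PUT('\\'); PUT('n');  break;
        case '\r': PUT('\\'); PUT('r');  break;
        case '\t': PUT('\\'); PUT('t');  break;
        default:
            if (c < 0x20) {
                /* Remaining control characters need the \u00XX form:
                 * six characters plus room for the NUL. */
                if (n + 6 >= cap)
                    return -1;
                n += (size_t)snprintf(out + n, cap - n, "\\u%04x", c);
            } else {
                /* Assumption: bytes >= 0x80 belong to valid UTF-8. */
                PUT((char)c);
            }
            break;
        }
    }
    PUT('"');
    out[n] = '\0';
    return (int)n;

#undef PUT
}
```

Note that C's `\a`, `\v` and octal escapes have no JSON counterpart, which is why those bytes fall through to the `\u00XX` branch.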
What I am worrying about is correct handling of escaping, quoting, and non-ASCII characters in strings (the JSON-quoting and JSON-escapes are different than C escape codes, IIRC). JSON rules are simple, but are different than C. -- Jakub Narebski Poland From mboxrd@z Thu Jan 1 00:00:00 1970 From: Simon Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 17:24:35 -0400 Message-ID: References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410193039.GA28768@thyrsus.com> Reply-To: turner25@gmail.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Cc: esr@thyrsus.com, Eric Raymond , git@vger.kernel.org To: =?ISO-8859-1?Q?=C6var_Arnfj=F6r=F0_Bjarmason?= X-From: git-owner@vger.kernel.org Sat Apr 10 23:24:42 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0iAU-0002Nq-4I for gcvg-git-2@lo.gmane.org; Sat, 10 Apr 2010 23:24:42 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752212Ab0DJVYg (ORCPT ); Sat, 10 Apr 2010 17:24:36 -0400 Received: from mail-gw0-f46.google.com ([74.125.83.46]:60866 "EHLO mail-gw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752091Ab0DJVYg (ORCPT ); Sat, 10 Apr 2010 17:24:36 -0400 Received: by gwj19 with SMTP id 19so359920gwj.19 for ; Sat, 10 Apr 2010 14:24:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:reply-to:in-reply-to :references:date:received:message-id:subject:from:to:cc:content-type; bh=tgf/gD+1P8rS5+iGc3WiS1n3dUjKo7RBjqFdzWJg228=; b=OMBb9JQRlNr5iH/yu7UTMija7pz3pZHdQ77LJO1kvPzGvTVwvkeFUSv0neheWNMcoc RwKarrzNbWeQ6lVB7iBXuKnq22/QXUvzhiJNsoNWma3T/WNaPf+yQ9FTuJ4Zj3JN11jv nrsGRwrhG7ShOtitZP2g/v/tt8ebN9MacVHXE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:reply-to:in-reply-to:references:date:message-id 
:subject:from:to:cc:content-type; b=bBY3bQPAdyQ0lUwTHP/T7ef648eBH89uqGNPemVSgBsqFn/b1fMme0QZgLN28H36Ea ka9cv/AQUNf0/uOcH67PznRkVResr6neHX37I9XI28kLCLaaiZqeVJHtlVf+ESuDHPic w7HyaqixF4eYrFDq35ZwIpLZz+gKLUQQLYwVs= Received: by 10.100.142.16 with HTTP; Sat, 10 Apr 2010 14:24:35 -0700 (PDT) In-Reply-To: Received: by 10.101.58.8 with SMTP id l8mr3228672ank.7.1270934675296; Sat, 10 Apr 2010 14:24:35 -0700 (PDT) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: > A lot of web services (like some Catalyst-based applications) support > all of these equally. If Git had machine readable output like this it > would be nice if every git-* program just had --format=* where * could > be xml, json, yaml, sexp, perl etc. > > The program would just construct a native datastructure and then there > would be an output driver to generate the textual representation. > I had something just like this in mind when I suggested XML... I would personally avoid it for same reasons others have pointed out, but... There are lots of tools out there that can parse and display XML very well natively. Firefox is one such example. My intention is not to start a flame here, rather try to keep our options flexible. ASCII would clearly remain the default though! 
;) Simon From mboxrd@z Thu Jan 1 00:00:00 1970 From: Paolo Bonzini Subject: Re: More git status --porcelain lossage Date: Sun, 11 Apr 2010 00:28:36 +0200 Message-ID: <4BC0FB94.6050409@gnu.org> References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410194154.GB28768@thyrsus.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------080605070306070202040505" Cc: git@vger.kernel.org To: Martin Langhoff X-From: git-owner@vger.kernel.org Sun Apr 11 00:28:51 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0jAW-0003Yy-9p for gcvg-git-2@lo.gmane.org; Sun, 11 Apr 2010 00:28:49 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752289Ab0DJW2m (ORCPT ); Sat, 10 Apr 2010 18:28:42 -0400 Received: from fg-out-1718.google.com ([72.14.220.155]:18952 "EHLO fg-out-1718.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752259Ab0DJW2k (ORCPT ); Sat, 10 Apr 2010 18:28:40 -0400 Received: by fg-out-1718.google.com with SMTP id 22so61552fge.1 for ; Sat, 10 Apr 2010 15:28:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type; bh=VkUMoJG13SwwB5ea8sYhzIyPzIm64iUmR+S+wB1PnDs=; b=HgeMz8AYn1OaXfmMFHpvEvbfEh8zryEGrC5V5KQ+txL2yEwlBtXTw4He8pcRfUds2o FPdug+SiXIi5Uo33PICpwLeAlkK2qR4VZQ5IKLhv2PxBF4dOZmZiNY1kg8nF26eZz3lo gDjaTukHVd9BzstUMqPmNNl8yZvVzUaAreMyc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type; b=KbW1hH0cd+qjZBeioGOEQP5zlAyDx9qirahw67Ry7wQsMjS+Wj5zHU6kHWEmK6xixG amBijTn2xfRDutRj5wFxcYscOz9qWQ88MENo95nWzeeR+4vMem48mDhB0scWKji8kTQI n9zKg/ZGnO4y56N1HE+aFmhFhuK8OSv9qbs4c= Received: by 10.223.15.143 
with SMTP id k15mr1664090faa.57.1270938518567; Sat, 10 Apr 2010 15:28:38 -0700 (PDT) Received: from yakj.usersys.redhat.com (s209p8.home.99maxprogres.cz [85.93.118.17]) by mx.google.com with ESMTPS id z10sm5807844fka.1.2010.04.10.15.28.37 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 10 Apr 2010 15:28:37 -0700 (PDT) User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3 In-Reply-To: Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: This is a multi-part message in MIME format. --------------080605070306070202040505 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 04/10/2010 10:31 PM, Martin Langhoff wrote: > On Sat, Apr 10, 2010 at 3:41 PM, Eric Raymond wrote: >>> I could understand providing JSON format, specified using --json >>> option. >> >> You know, that's actually an interesting idea. I mentioned it >> previously as the not-XML if we want to build on a metaprotocol; > > One issue is that there's no stream-parser JSON implementations that > I'm aware of. Here is one. It's ugly as hell, you're warned. The only missing piece is making the stack state resizable. Paolo --------------080605070306070202040505 Content-Type: text/plain; name="json.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="json.c" /* * An event-based, asynchronous JSON parser. * * Copyright (C) 2009 Red Hat Inc. 
* * Authors: * Paolo Bonzini * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "json.h" #include <stdlib.h> #include <string.h> /* Common character classes. */ #define CASE_XDIGIT \ case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': \ case 'A': case 'B': case 'C': case 'D': case 'E': case 'F' #define CASE_DIGIT \ case '0': case '1': case '2': case '3': case '4': \ case '5': case '6': case '7': case '8': case '9' /* Helper function to go from \uXXXX-encoded UTF-16 to UTF-8. 
*/ static bool hex_to_utf8 (char *buf, char **dest, char *src) { int i, n; uint8_t *p; for (i = n = 0; i < 4; i++) { n <<= 4; switch (src[i]) { CASE_DIGIT: n |= src[i] - '0'; break; CASE_XDIGIT: n |= (src[i] & ~32) - 'A' + 10; break; default: return false; } } p = (uint8_t *)*dest; if (n < 128) { *p++ = n; } else if (n < 2048) { *p++ = 0xC0 | (n >> 6); *p++ = 0x80 | (n & 63); } else if (n < 0xDC00 || n > 0xDFFF) { *p++ = 0xE0 | (n >> 12); *p++ = 0x80 | ((n >> 6) & 63); *p++ = 0x80 | (n & 63); } else { /* Merge with preceding high surrogate. */ if (p - (uint8_t *)buf < 3 || p[-3] != 0xED || p[-2] < 0xA0 || p[-2] > 0xAF) /* 0xD800..0xDBFF */ return false; n += 0x10000 - 0xDC00; n += ((p[-2] & 15) << 16) | ((p[-1] & 63) << 10); /* Overwrite high surrogate. */ p[-3] = 0xF0 | (n >> 18); p[-2] = 0x80 | ((n >> 12) & 63); p[-1] = 0x80 | ((n >> 6) & 63); *p++ = 0x80 | (n & 63); } *dest = (char *)p; return true; } struct json_parser { struct json_parser_config c; size_t n, alloc; char *buf; size_t sp; uint32_t state, stack[128]; char start_buffer[128]; }; /* Managing the state stack. */ static inline void push_state (struct json_parser *p, uint32_t state) { p->stack[p->sp++] = p->state; p->state = state; } static inline void pop_state (struct json_parser *p) { p->state = p->stack[--p->sp]; } /* Managing the string/number buffer. 
*/ static inline void clear_buffer (struct json_parser *p) { p->n = 0; } static inline void push_buffer (struct json_parser *p, char c) { if (p->n == p->alloc) { size_t new_alloc = p->alloc * 2; if (p->buf == p->start_buffer) { p->buf = malloc (new_alloc); memcpy (p->buf, p->start_buffer, p->alloc); } else { p->buf = realloc (p->buf, new_alloc); } p->alloc = new_alloc; } p->buf[p->n++] = c; } /* * Parser states are organized like this: * bit 0-7: enum parser_state * bit 8-15: for IN_KEYWORD, index in keyword table * bit 16-31: additional substate (enum parser_cookies) */ enum parser_state { START_PARSE, /* at start of parsing */ IN_KEYWORD, /* parsing keyword (match exactly) */ START_KEY, /* expecting key */ END_KEY, /* expecting colon */ START_VALUE, /* expecting value */ END_VALUE, /* expecting comma or closing parenthesis */ IN_NUMBER, /* parsing number (up to whitespace) */ IN_STRING, /* parsing string */ IN_STRING_BACKSLASH, /* parsing string, copy one char verbatim */ IN_COMMENT, /* comment mini-scanner */ }; enum parser_cookies { IN_UNUSED, IN_TRUE, /* for IN_KEYWORD */ IN_FALSE, IN_NULL, IN_ARRAY, /* for {START,END}_{KEY,VALUE} */ IN_DICT, IN_KEY, /* for IN_STRING */ IN_VALUE, }; #define STATE(state, cookie) \ (((cookie) << 16) | (state)) #define STATE_KEYWORD(n, cookie) \ (((cookie) << 16) | ((n) << 8) | IN_KEYWORD) static const char keyword_table[] = "rue\0alse\0ull"; enum keyword_indices { KW_TRUE = 0, KW_FALSE = 4, KW_NULL = 9, }; /* Parser actions. These transfer to the appropriate state, * and invoke the callbacks. * * If there is a begin/end pair, begin pushes a state * and end pops it. 
*/ static inline bool array_begin (struct json_parser *p) { push_state (p, STATE (START_VALUE, IN_ARRAY)); return !p->c.array_begin || p->c.array_begin (p->c.data); } static inline bool array_end (struct json_parser *p) { int state_cookie = (p->state >> 16); if (state_cookie != IN_ARRAY) return false; pop_state (p); return !p->c.array_end || p->c.array_end (p->c.data); } static inline bool object_begin (struct json_parser *p) { push_state (p, STATE (START_KEY, IN_DICT)); return !p->c.object_begin || p->c.object_begin (p->c.data); } static inline bool object_end (struct json_parser *p) { int state_cookie = (p->state >> 16); if (state_cookie != IN_DICT) return false; pop_state (p); return !p->c.object_end || p->c.object_end (p->c.data); } static inline bool key_user (struct json_parser *p) { return p->c.value_user && p->c.key (p->c.data, NULL, 0); } static inline bool number_begin (struct json_parser *p, char ch) { push_state (p, IN_NUMBER); push_buffer (p, ch); return true; } static inline bool number_end (struct json_parser *p) { char *end; bool result; long long ll; double d; pop_state (p); push_buffer (p, 0); ll = strtoll (p->buf, &end, 0); if (!*end) result = (!p->c.value_integer || p->c.value_integer (p->c.data, ll)); else { d = strtod (p->buf, &end); result = (!*end && (!p->c.value_float || p->c.value_float (p->c.data, d))); } clear_buffer(p); return result; } static inline bool value_null (struct json_parser *p) { return !p->c.value_null || p->c.value_null (p->c.data); } static inline bool value_boolean (struct json_parser *p, int n) { return !p->c.value_boolean || p->c.value_boolean (p->c.data, n); } static inline bool string_begin (struct json_parser *p, int cookie) { push_state (p, STATE (IN_STRING, cookie)); return true; } static inline bool string_end (struct json_parser *p, int cookie) { bool result; char *buf, *src, *dest; size_t n; pop_state (p); push_buffer (p, 0); /* Unescape in place. 
*/ for (n = p->n, buf = src = dest = p->buf; n > 0; n--) { if (*src != '\\') { *dest++ = *src++; continue; } if (n < 2) return false; src++; n--; switch (*src++) { case 'b': *dest++ = '\b'; continue; case 'f': *dest++ = '\f'; continue; case 'n': *dest++ = '\n'; continue; case 'r': *dest++ = '\r'; continue; case 't': *dest++ = '\t'; continue; case 'U': case 'u': /* The [uU] has not been removed from n yet, hence subtract 5. */ if (n < 5 || !hex_to_utf8 (buf, &dest, src)) return false; src += 4; n -= 4; continue; default: *dest++ = src[-1]; continue; } } buf = p->buf; n = dest - buf; if (cookie == IN_KEY) result = !p->c.key || p->c.key (p->c.data, buf, n); else result = !p->c.value_string || p->c.value_string (p->c.data, buf, n); clear_buffer(p); return result; } static inline bool value_user (struct json_parser *p) { return p->c.value_user && p->c.value_user (p->c.data); } static inline bool comment (struct json_parser *p) { return !p->c.comment || p->c.comment (p->c.data, p->buf, p->n); } bool json_parser_char(struct json_parser *p, char ch) { for (;;) { int state = p->state & 255; int state_data = (p->state >> 8) & 255; int state_cookie = (p->state >> 16); // printf ("%d %d | %d %d\n", state, ch, state_cookie, p->sp); /* The big ugly parser. Each case will always return or * continue, and we want to check this at link time if * possible. */ #ifndef __OPTIMIZE__ #define link_error abort #endif extern void link_error (void); switch (state) { /* First, however, a helpful definition... */ #define SKIP_WHITE \ switch (ch) { \ case '/': goto do_start_comment; \ case ' ': case '\t': case '\n': case '\r': case '\f': return true; \ default: break; \ } /* Unlike START_VALUE, this only accepts compound values. */ case START_PARSE: SKIP_WHITE; p->state = STATE (END_VALUE, state_cookie); switch (ch) { case '[': return array_begin (p); case '{': return object_begin (p); default: return false; } link_error (); /* Only strings and user values are accepted here. 
*/ case START_KEY: SKIP_WHITE; p->state = STATE (END_KEY, IN_DICT); switch (ch) { case '"': return string_begin (p, IN_KEY); case '%': return key_user (p); case '}': return object_end (p); default: return false; } link_error (); /* Accept any Javascript literal. Checking p->sp ensures that * something like "[] []" is rejected (the first array is parsed * from START_PARSE. */ case START_VALUE: SKIP_WHITE; if (p->sp == 0) return false; p->state = STATE (END_VALUE, state_cookie); switch (ch) { case 't': push_state (p, STATE_KEYWORD(KW_TRUE, IN_TRUE)); return true; case 'f': push_state (p, STATE_KEYWORD(KW_FALSE, IN_FALSE)); return true; case 'n': push_state (p, STATE_KEYWORD(KW_NULL, IN_NULL)); return true; case '"': return string_begin (p, IN_VALUE); case '-': CASE_DIGIT: return number_begin (p, ch); case '[': return array_begin (p); case '{': return object_begin (p); case '%': return value_user (p); case ']': return array_end (p); default: return false; } link_error (); /* End of a key, look for a colon. */ case END_KEY: SKIP_WHITE; p->state = STATE (START_VALUE, IN_DICT); return (ch == ':'); /* End of a value, look for a comma or closing parenthesis. */ case END_VALUE: SKIP_WHITE; p->state = STATE (state_cookie == IN_DICT ? START_KEY : START_VALUE, state_cookie); switch (ch) { case ',': return true; case '}': return object_end (p); case ']': return array_end (p); default: return false; } link_error (); /* Table-driven keyword scanner. Advance until mismatch or end * of keyword. */ case IN_KEYWORD: if (ch != keyword_table[state_data]) return false; if (keyword_table[state_data + 1] != 0) { p->state = STATE_KEYWORD(state_data + 1, state_cookie); return true; } pop_state (p); switch (state_cookie) { case IN_TRUE: return value_boolean (p, 1); case IN_FALSE: return value_boolean (p, 0); case IN_NULL: return value_null (p); default: abort (); } link_error (); /* Eat until closing quote (special-casing \"). 
*/ case IN_STRING: switch (ch) { case '"': return string_end (p, state_cookie); case '\\': p->state = STATE (IN_STRING_BACKSLASH, state_cookie); default: push_buffer (p, ch); return true; } link_error (); /* Eat any character */ case IN_STRING_BACKSLASH: push_buffer (p, ch); p->state = STATE (IN_STRING, state_cookie); return true; /* Eat until a "bad" character is found, then we refine with * strtod/strtoll. The character we end on is reprocessed in * the new state! */ case IN_NUMBER: switch (ch) { case '+': case '-': case '.': case 'x': case 'X': CASE_DIGIT: CASE_XDIGIT: push_buffer (p, ch); return true; default: if (!number_end (p)) return false; continue; } link_error (); /* Parse until '*' '/', then convert the whole comment to a * single blank and rescan. */ do_start_comment: push_state(p, IN_COMMENT); if (p->c.comment) push_buffer(p, ch); return true; case IN_COMMENT: if (p->c.comment) push_buffer(p, ch); if (state_cookie == 0 && ch != '*') return false; else if (state_cookie == 0 ) state_cookie = 1; else if (state_cookie == 1 && ch == '*') state_cookie = 2; else if (state_cookie == 2 && ch == '*') state_cookie = 2; else if (state_cookie == 2 && ch == '/') state_cookie = 3; else state_cookie = 1; if (state_cookie < 3) { p->state = STATE(state, state_cookie); return true; } else { comment (p); pop_state (p); ch = ' '; continue; } link_error (); default: abort (); } link_error (); } } bool json_parser_string(struct json_parser *p, char *s, size_t n) { while (n--) if (!json_parser_char(p, *s++)) return false; return true; } struct json_parser *json_parser_new(struct json_parser_config *config) { struct json_parser *p; p = malloc (sizeof *p); memcpy (&p->c, config, sizeof *config); p->n = 0; p->alloc = sizeof p->start_buffer; p->state = START_PARSE; p->buf = p->start_buffer; p->sp = 0; return p; } bool json_parser_destroy(struct json_parser *p) { bool result = (p->state == END_VALUE) && (p->sp == 0); if (p->buf != p->start_buffer) free (p->buf); free (p); return 
result; } --------------080605070306070202040505 Content-Type: text/plain; name="main.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="main.c" /* main.c */ /* This program demonstrates a simple application of JSON_parser. It reads a JSON text from STDIN, producing an error message if the text is rejected. % JSON_parser */ #include <stdio.h> #include <stdlib.h> #include "json.h" static int level = 0; static int got_key = 0; static void print_indent() { printf ("%*s", 2 * level, ""); } static bool array_begin (void *data) { if (!got_key) print_indent(); else got_key = 0; printf ("[\n"); ++level; return true; } static bool array_end (void *data) { --level; print_indent (); printf ("]\n"); return true; } static bool object_begin (void *data) { if (!got_key) print_indent(); else got_key = 0; printf ("{\n"); ++level; return true; } static bool object_end (void *data) { --level; print_indent (); printf ("}\n"); return true; } static bool key (void *data, const char *buf, size_t n) { got_key = 1; print_indent (); if (buf) printf ("key = '%s', value = ", buf); else printf ("user key = %%%c, value = ", getchar()); return true; } static bool value_integer (void *data, long long ll) { if (!got_key) print_indent(); else got_key = 0; printf ("integer: %lld\n", ll); return true; } static bool value_float (void *data, double d) { if (!got_key) print_indent(); else got_key = 0; printf ("float: %f\n", d); return true; } static bool value_null (void *data) { if (!got_key) print_indent(); else got_key = 0; printf ("null\n"); return true; } static bool value_boolean (void *data, int val) { if (!got_key) print_indent(); else got_key = 0; printf ("%s\n", val ? 
"true" : "false"); return true; } static bool value_string (void *data, const char *buf, size_t n) { if (!got_key) print_indent(); else got_key = 0; printf ("string: '%s'\n", buf); return true; } static bool value_user (void *data) { if (!got_key) print_indent(); else got_key = 0; printf ("user: %%%c\n", getchar()); return true; } int main(int argc, char* argv[]) { static struct json_parser_config parser_config = { .array_begin = array_begin, .array_end = array_end, .object_begin = object_begin, .object_end = object_end, .key = key, .value_integer = value_integer, .value_float = value_float, .value_null = value_null, .value_boolean = value_boolean, .value_string = value_string, .value_user = value_user, }; struct json_parser *p = json_parser_new(&parser_config); int count = 0; int ch; while ((ch = getchar ()) != EOF && json_parser_char (p, ch)) count++; if (ch != EOF) { fprintf (stderr, "error at character %d\n", count); exit (1); } if (!json_parser_destroy (p)) { fprintf (stderr, "error at end of file\n"); exit (1); } exit (0); } --------------080605070306070202040505 Content-Type: text/plain; name="json.h" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="json.h" /* * An event-based, asynchronous JSON parser. * * Copyright (C) 2009 Red Hat Inc. * * Authors: * Paolo Bonzini * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef JSON_H #define JSON_H #include <stdbool.h> #include <stddef.h> #include <stdint.h> struct json_parser_config { bool (*array_begin) (void *); bool (*array_end) (void *); bool (*object_begin) (void *); bool (*object_end) (void *); bool (*key) (void *, const char *, size_t); bool (*value_integer) (void *, long long); bool (*value_float) (void *, double); bool (*value_null) (void *); bool (*value_boolean) (void *, int); bool (*value_string) (void *, const char *, size_t); bool (*value_user) (void *); bool (*comment) (void *, const char *, size_t); void *data; }; struct json_parser; struct json_parser *json_parser_new(struct json_parser_config *config); bool json_parser_destroy(struct json_parser *p); bool json_parser_char(struct json_parser *p, char ch); bool json_parser_string(struct json_parser *p, char *buf, size_t n); #endif /* JSON_H */ --------------080605070306070202040505-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Raymond Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 18:57:04 -0400 Organization: Eric Conspiracy Secret Labs Message-ID: <20100410225704.GA4623@thyrsus.com> References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410194154.GB28768@thyrsus.com> <201004102321.59263.jnareb@gmail.com> Reply-To: esr@thyrsus.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Simon , Eric Raymond , git@vger.kernel.org To: Jakub Narebski X-From: git-owner@vger.kernel.org Sun Apr 11 00:57:20 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from 
vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0jc2-0004zq-Fv for gcvg-git-2@lo.gmane.org; Sun, 11 Apr 2010 00:57:14 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752386Ab0DJW5H (ORCPT ); Sat, 10 Apr 2010 18:57:07 -0400 Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:56510 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752063Ab0DJW5G (ORCPT ); Sat, 10 Apr 2010 18:57:06 -0400 Received: by snark.thyrsus.com (Postfix, from userid 23) id 117D020CBBC; Sat, 10 Apr 2010 18:57:04 -0400 (EDT) Content-Disposition: inline In-Reply-To: <201004102321.59263.jnareb@gmail.com> X-Eric-Conspiracy: There is no conspiracy User-Agent: Mutt/1.5.20 (2009-06-14) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: Jakub Narebski : > [JSON] is a bit chatty, but is to some extent self documenting. Yes. But to my mind, the big win of JSON is that you can extend it without breaking parsers looking for older versions - they just skip the new fields and all is happy. Jakub, you seem to know this, but other list members may not: I've recently re-engineered GPSD, a service daemon for watching geolocation sensors, to report JSON objects up the socket to client apps. The benefits in clarity and extensibility of the protocol have been *huge*. Like, today I'm adding a reporting type for digital compass/gyroscope sensors. > The question is whether it should output well formed array of objects, > or just list of objects not wrapped in array... Yes, I know this dance. Answer: one big JSON object, tagged by the name of the output generator, and also *containing a version-stamp field*. The array of file status objects is another top-level member. The point is: later, if we want to enrich the reporting format, we add whatever fields we want and bump the version stamp. Self-describing goodness.
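The envelope described above can be sketched in C roughly as follows. Everything named here is an assumption made for illustration: the member names `class`, `version`, `files`, `index`, `worktree` and `path`, the `file_status` struct, and the function `status_to_json` are hypothetical and do not describe any format git actually emits. Paths are assumed to be JSON-escaped already.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record type; not a git data structure. */
struct file_status {
    char index;        /* staged state, e.g. 'A' */
    char worktree;     /* working-tree state, e.g. 'M' */
    const char *path;  /* assumed to be JSON-escaped already */
};

#define STATUS_FORMAT_VERSION 1  /* bump whenever fields are added */

/* Write one top-level object: generator tag, version stamp, file array.
 * Returns the length written, or -1 if `out` is too small. */
static int status_to_json(char *out, size_t cap,
                          const struct file_status *files, size_t nfiles)
{
    size_t n;
    int r;

    r = snprintf(out, cap,
                 "{\"class\":\"git-status\",\"version\":%d,\"files\":[",
                 STATUS_FORMAT_VERSION);
    if (r < 0 || (size_t)r >= cap)
        return -1;
    n = (size_t)r;

    for (size_t i = 0; i < nfiles; i++) {
        r = snprintf(out + n, cap - n,
                     "%s{\"index\":\"%c\",\"worktree\":\"%c\",\"path\":\"%s\"}",
                     i ? "," : "",
                     files[i].index, files[i].worktree, files[i].path);
        if (r < 0 || (size_t)r >= cap - n)
            return -1;
        n += (size_t)r;
    }

    r = snprintf(out + n, cap - n, "]}");
    if (r < 0 || (size_t)r >= cap - n)
        return -1;
    return (int)(n + (size_t)r);
}
```

A client that only knows version 1 can still read a later version, provided new information arrives as additional members and the existing ones keep their meaning.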
Python, Perl, JavaScript, and Emacs LISP clients win especially big. Slurping this into a native data structure is one function call. The more I think about this, the better I like it. > What I am worrying about is correct handling of escaping, quoting, > and non-ASCII characters in strings (the JSON-quoting and JSON-escapes > are different than C escape codes, IIRC). JSON rules are simple, > but are different than C. Yes. Perhaps there's some scope for reuse here after all. GPSD has well-tested code for uttering the JSON quote/escape conventions. The git project is welcome to it. -- Eric S. Raymond From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Raymond Subject: Re: More git status --porcelain lossage Date: Sat, 10 Apr 2010 19:06:10 -0400 Organization: Eric Conspiracy Secret Labs Message-ID: <20100410230610.GB4623@thyrsus.com> References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410194154.GB28768@thyrsus.com> <4BC0FB94.6050409@gnu.org> Reply-To: esr@thyrsus.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Martin Langhoff , git@vger.kernel.org To: Paolo Bonzini X-From: git-owner@vger.kernel.org Sun Apr 11 01:06:22 2010 Return-path: Envelope-to: gcvg-git-2@lo.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0jkn-0000gJ-Jb for gcvg-git-2@lo.gmane.org; Sun, 11 Apr 2010 01:06:17 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752451Ab0DJXGM (ORCPT ); Sat, 10 Apr 2010 19:06:12 -0400 Received: from static-71-162-243-5.phlapa.fios.verizon.net ([71.162.243.5]:43661 "EHLO snark.thyrsus.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752439Ab0DJXGK (ORCPT ); Sat, 10 Apr 2010 19:06:10 -0400 Received: by snark.thyrsus.com (Postfix, from userid 23) id 16C3220CBBC; Sat, 10 Apr 2010 19:06:10 -0400 (EDT) Content-Disposition: inline In-Reply-To: <4BC0FB94.6050409@gnu.org> X-Eric-Conspiracy: There is no conspiracy User-Agent: 
Mutt/1.5.20 (2009-06-14)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

Paolo Bonzini :
> >One issue is that there's no stream-parser JSON implementations that
> >I'm aware of.
>
> Here is one. It's ugly as hell, you're warned. The only missing
> piece is making the stack state resizable.

I wrote one in C for the GPSD project that has two interesting
properties: (1) No use of malloc(), (2) Unpacks to *fixed-extent* data
structures. It has one language restriction: Array subelements all
have to be the same type.

It's not a stream parser, so there will be compile-time limits on the
volume of data it can handle. This isn't a big deal in the GPSD
context, where the objects are relatively short (< 1K) datagrams.

It's very well tested and, I think, pretty bulletproof. I've been
thinking of spinning it out as a reusable project.
-- 
Eric S. Raymond

From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Aguilar
Subject: Re: More git status --porcelain lossage
Date: Sun, 11 Apr 2010 11:04:06 +0000 (UTC)
Message-ID: <20100413050247.GA31108@gmail.com>
References: <20100409190601.47B37475FEF@snark.thyrsus.com> <20100410194154.GB28768@thyrsus.com> <4BC0FB94.6050409@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: Martin Langhoff , git@vger.kernel.org
To: Paolo Bonzini
X-From: git-owner@vger.kernel.org Sun Apr 11 13:04:05 2010
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.180.67]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1O0uxQ-00021s-It for gcvg-git-2@lo.gmane.org; Sun, 11 Apr 2010 13:04:04 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751844Ab0DKLD7 (ORCPT ); Sun, 11 Apr 2010 07:03:59 -0400
Received: from mail-yw0-f194.google.com ([209.85.211.194]:63490 "EHLO mail-yw0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751801Ab0DKLD6 (ORCPT ); Sun, 11 Apr 2010 07:03:58 -0400
Received: by ywh32 with SMTP id 32so1089856ywh.33 for ; Sun, 11 Apr 2010 04:03:57 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=DJ96HmxIu99LzmdD9xFSIMTs/GnK4CDSEVqmvWFXOIk=; b=MvV/vPWm/pDUdJTBbLdL8TnSW3Jlul5NhI++Vxl9wZO0/eaI2BBtSiDn1xEBXK3b30 hxgXrxHxv4tYEhNUOhqJr0pdy3w1L2U9mAmn3XwhqSEJjOc6/x1DQRWWfNsCnttNjhC1 60K7C1rvvrH8CcRJOWvA17s6XsKIBdNrgAG2k=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=waX3y6HMcipp4MF09beg9lI25gKdsquJ+vL7u7r0ADoR9e0rwi0Gm3qys0ZygyGVsu 4+qyVCYEEG/GottVc7g/Jk/+lJSP2nrVLW98obMq4VvSnSji8piStxyUokFysMZRcaIh /K5svGeVl+B87mnuYSKXmtxT71cTg4O8J8rkk=
Received: by 10.101.141.9 with SMTP id t9mr4203999ann.55.1270983837037; Sun, 11 Apr 2010 04:03:57 -0700 (PDT)
Received: from gmail.com (208-106-56-2.static.dsltransport.net [208.106.56.2]) by mx.google.com with ESMTPS id 23sm2843220iwn.10.2010.04.11.04.03.55 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sun, 11 Apr 2010 04:03:56 -0700 (PDT)
Date: Mon, 12 Apr 2010 22:02:49 -0700
Content-Disposition: inline
In-Reply-To: <4BC0FB94.6050409@gnu.org>
User-Agent: Mutt/1.5.19 (2009-01-05)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Sun, Apr 11, 2010 at 12:28:36AM +0200, Paolo Bonzini wrote:
> On 04/10/2010 10:31 PM, Martin Langhoff wrote:
>> On Sat, Apr 10, 2010 at 3:41 PM, Eric Raymond wrote:
>>>> I could understand providing JSON format, specified using --json
>>>> option.
>>>
>>> You know, that's actually an interesting idea. I mentioned it
>>> previously as the not-XML if we want to build on a metaprotocol;
>>
>> One issue is that there's no stream-parser JSON implementations that
>> I'm aware of.
>
> Here is one.
It's ugly as hell, you're warned. The only missing piece
> is making the stack state resizable.
>
> Paolo

Here's a fairly popular stream parser:

http://lloyd.github.com/yajl/

Yet Another JSON Library. YAJL is a small event-driven
(SAX-style) JSON parser written in ANSI C, and a small
validating JSON generator. YAJL is released under the
BSD license.

The license is BSD-with-advertising-clause.
Perhaps the author did not know about modified BSD.

-- 
David