git.vger.kernel.org archive mirror
From: Jason Yundt <jason@jasonyundt.email>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH 2/2] gitweb: remove invalid http-equiv="content-type"
Date: Mon, 07 Mar 2022 17:49:48 -0500
Message-ID: <109813056.nniJfEyVGO@jason-desktop-linux>
In-Reply-To: <220307.861qze0wv9.gmgdl@evledraar.gmail.com>

On Monday, March 7, 2022 7:23:49 AM EST Ævar Arnfjörð Bjarmason wrote:
> I'm not sure I understand this change really. The result is always XML,
> so application/xhtml+xml is redundant, text/html, or both?

To be honest, using an http-equiv="content-type" in XHTML is confusing. When 
you do use one, your goal shouldn’t really be to specify the document’s MIME 
type. After all, the first three lines of each page say

	<?xml version="1.0" encoding="utf-8"?>
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
	<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">

Those lines are more than enough to determine that something is using XHTML 
and UTF-8. Instead, the idea behind the meta element is to help out a parser 
that is incorrectly parsing the document as HTML (instead of as XHTML). 
Historical W3C documents (that were applicable when 
http-equiv="content-type" was allowed in XHTML) [1][2][3] indicate that 
http-equiv="content-type" should be used like this:

	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

In other words, to use http-equiv="content-type" properly in XHTML, you had to 
lie about the document’s type. The fact that this is confusing is probably 
part of why WHATWG disallowed it in the HTML Standard.
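
As a rough illustration of the point about the first three lines, here is a 
minimal Python sketch (not gitweb code) that hands them to a generic XML 
parser. The head, title and body are made up for the example, and Python's 
xml.etree.ElementTree is only a stand-in for whatever XML parser a browser 
uses. The parser takes the encoding from the XML declaration and the document 
type from the namespace, with no meta element present:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The first three lines are the ones quoted above. Everything after them
# is a made-up page body used only for this illustration.
page = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">\n'
    '<head><title>Ævar Arnfjörð</title></head>\n'
    '<body/>\n'
    '</html>\n'
).encode("utf-8")

# Parse the raw bytes. No <meta> element is present, so the parser can only
# learn the encoding from the XML declaration.
root = ET.fromstring(page)
title = root.find("x:head/x:title", {"x": XHTML_NS}).text

print(root.tag)  # the namespace-qualified name of the root element
print(title)     # the non-ASCII title, decoded as UTF-8
```

This only says something about XML parsers. It does not say anything about 
what an HTML parser does with the same bytes, which is the case the meta 
element was meant to cover.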

> But aside from that: I have seen browsers get the lack of encoding=""
> "wrong" with data at rest, don't some still default to ISO-8859-1?
> 
> So won't this result in badly decoded data if you save the web page &
> view it locally?

I tested this idea in ungoogled-chromium, Firefox and Pale Moon. Other than 
Pale Moon in one specific circumstance, they all used UTF-8 as the encoding. 
Pale Moon used windows-1252, but only when the file ended with .html. When the 
file ended with .xhtml, Pale Moon used UTF-8. That being said, we don’t have to 
use an http-equiv="content-type" to fix the problem. Instead, we can use 
<meta charset="utf-8">, which is allowed by the HTML Standard [4].
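
To show what the failure looks like and why a charset declaration avoids it, 
here is a small Python sketch. It is not how any browser actually works: 
guess_encoding is a hypothetical helper, its regular expression is a heavily 
simplified stand-in for the encoding detection that HTML parsers do, and the 
windows-1252 fallback is an assumption chosen to mirror what I saw in Pale 
Moon with .html files.

```python
import re

# Hypothetical, heavily simplified stand-in for the encoding detection an
# HTML parser performs on the start of a file. Real browsers follow a more
# involved algorithm.
META_CHARSET = re.compile(
    rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)


def guess_encoding(raw: bytes, fallback: str = "windows-1252") -> str:
    """Return the declared charset, or the fallback if none is declared."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else fallback


title = "Ævar Arnfjörð"

# Two made-up pages saved as UTF-8 bytes: one without any charset
# declaration, one with <meta charset="utf-8"/>.
without_meta = (
    f"<html><head><title>{title}</title></head></html>"
).encode("utf-8")
with_meta = (
    f'<html><head><meta charset="utf-8"/><title>{title}</title></head></html>'
).encode("utf-8")

decoded_without = without_meta.decode(guess_encoding(without_meta))
decoded_with = with_meta.decode(guess_encoding(with_meta))

print(decoded_without)  # title is garbled by the windows-1252 fallback
print(decoded_with)     # title survives intact
```

With the fallback, each two-byte UTF-8 sequence is read as two separate 
windows-1252 characters, so the non-ASCII letters in the title come out 
garbled. With the declaration, the same bytes decode correctly.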

[1]: <https://www.w3.org/TR/xhtml1/#C_9>
[2]: <https://www.w3.org/TR/html-polyglot/#character-encoding>
[3]: <https://www.w3.org/Bugs/Public/show_bug.cgi?id=21818>
[4]: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-charset>




