From: Martin Ågren
Date: Tue, 14 Sep 2021 21:06:55 +0200
Subject: Re: [PATCH 7/9] ls-refs: ignore very long ref-prefix counts
To: Jeff King
Cc: Git Mailing List, Ævar Arnfjörð Bjarmason
X-Mailing-List: git@vger.kernel.org

On Tue, 14 Sept 2021 at 17:38, Jeff King wrote:
>
> One possible solution is to just drop the connection when the limit is
> reached. If we set it high enough, then only misbehaving or malicious
> clients would hit it. But "high enough" is vague, and it's unfriendly if
> we guess wrong and a legitimate client hits this.
>
> But we can do better. Since supporting the ref-prefix capability is
> optional anyway, the client has to further cull the response based on
> their own patterns. So we can simply ignore the patterns once we cross a
> certain threshold. Note that we have to ignore _all_ patterns, not just
> the ones past our limit (since otherwise we'd send too little data).

This all makes sense to me. At some point, we should be able to go "I
don't know what you're trying to do, but let me just ignore all this
craziness and instead try to give you a useful result sooner rather than
later".

I do wonder if we should document that the client can't trust us to
actually do all this culling. In general, I find that it's a matter of
hygiene for the client to do its own checks, but with this change they
actually *need* to do them. (Unless they know our limit and that they're
on the right side of it, but that kind of magic is even less hygienic.)

> +		else if (skip_prefix(arg, "ref-prefix ", &out)) {
> +			if (too_many_prefixes) {
> +				/* ignore any further ones */
> +			} else if (data.prefixes.nr >= MAX_ALLOWED_PREFIXES) {
> +				strvec_clear(&data.prefixes);
> +				too_many_prefixes = 1;
> +			} else {
> +				strvec_push(&data.prefixes, out);
> +			}
> +		}

Is it easier to reason about with something like this (whitespace-damaged)
on top?

diff --git a/ls-refs.c b/ls-refs.c
index 839fb0caa9..b3101ff361 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -147,7 +147,6 @@ static int ls_refs_config(const char *var, const char *value, void *data)
 int ls_refs(struct repository *r, struct packet_reader *request)
 {
 	struct ls_refs_data data;
-	int too_many_prefixes = 0;
 
 	memset(&data, 0, sizeof(data));
 	strvec_init(&data.prefixes);
@@ -164,14 +163,8 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 		else if (!strcmp("symrefs", arg))
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out)) {
-			if (too_many_prefixes) {
-				/* ignore any further ones */
-			} else if (data.prefixes.nr >= MAX_ALLOWED_PREFIXES) {
-				strvec_clear(&data.prefixes);
-				too_many_prefixes = 1;
-			} else {
+			if (data.prefixes.nr <= MAX_ALLOWED_PREFIXES)
 				strvec_push(&data.prefixes, out);
-			}
 		}
 		else if (!strcmp("unborn", arg))
 			data.unborn = allow_unborn;
@@ -180,6 +173,9 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
+	if (data.prefixes.nr > MAX_ALLOWED_PREFIXES)
+		strvec_clear(&data.prefixes);
+
 	send_possibly_unborn_head(&data);
 	if (!data.prefixes.nr)
 		strvec_push(&data.prefixes, "");

Maybe even name the macro TOO_MANY_PREFIXES (and bump it by one) to make
the logic instead be

	if (data.prefixes.nr < TOO_MANY_PREFIXES)
		strvec_push(&data.prefixes, out);
	...
	if (data.prefixes.nr >= TOO_MANY_PREFIXES)
		strvec_clear(&data.prefixes);

Just a thought. I'm reaching to try to find a way to improve this
series. ;-) It was a nice read.

Martin
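
For anyone who wants to convince themselves that the flag-based loop in the
quoted patch and the count-based variant suggested above keep the same number
of prefixes, here is a rough standalone sketch. It is not git code: the
strvec is replaced by a plain counter, the limit is an arbitrarily small
assumed value, and the names kept_with_flag(), kept_with_count() and
variants_agree() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit, kept tiny so the behavior is easy to check by hand. */
#define MAX_ALLOWED_PREFIXES 3

/*
 * Flag-based variant, modeled on the quoted patch. Returns how many
 * prefixes are left after n "ref-prefix" arguments have been seen.
 */
static size_t kept_with_flag(size_t n)
{
	size_t nr = 0;
	size_t i;
	int too_many_prefixes = 0;

	for (i = 0; i < n; i++) {
		if (too_many_prefixes) {
			/* ignore any further ones */
		} else if (nr >= MAX_ALLOWED_PREFIXES) {
			nr = 0; /* stands in for strvec_clear() */
			too_many_prefixes = 1;
		} else {
			nr++; /* stands in for strvec_push() */
		}
	}
	return nr;
}

/*
 * Count-based variant, modeled on the suggested diff: let the list grow
 * to one past the limit, then clear it once after the loop.
 */
static size_t kept_with_count(size_t n)
{
	size_t nr = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (nr <= MAX_ALLOWED_PREFIXES)
			nr++; /* stands in for strvec_push() */
	}
	if (nr > MAX_ALLOWED_PREFIXES)
		nr = 0; /* stands in for strvec_clear() */
	return nr;
}

/* Returns 1 if both variants agree for every count from 0 to up_to. */
static int variants_agree(size_t up_to)
{
	size_t n;

	for (n = 0; n <= up_to; n++)
		if (kept_with_flag(n) != kept_with_count(n))
			return 0;
	return 1;
}
```

Calling variants_agree() with a bound a few steps past the limit covers the
interesting cases: exactly at the limit, one over, and well over. The
count-based variant temporarily stores one prefix more than the limit, which
is the price of dropping the extra flag.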