From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Ilya Dryomov <ilya.dryomov@inktank.com>,
Josh Durgin <josh.durgin@inktank.com>
Subject: [PATCH 3.14 41/83] crush: fix off-by-one errors in total_tries refactor
Date: Sun, 11 May 2014 21:19:42 +0200 [thread overview]
Message-ID: <20140511191912.566314924@linuxfoundation.org> (raw)
In-Reply-To: <20140511191907.024339448@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov <ilya.dryomov@inktank.com>
commit 48a163dbb517eba13643bf404a0d695c1ab0a60d upstream.
Back in 27f4d1f6bc32c2ed7b2c5080cbd58b14df622607 we refactored the CRUSH
code to allow adjustment of the retry counts on a per-pool basis. That
commit had an off-by-one bug: the previous "tries" counter was a *retry*
count, not a *try* count, but the new code was passing in 1 meaning
there should be no retries.
Fix the ftotal vs tries comparison to use < instead of <= to fix the
problem. Note that the original code used <= here, which means the
global "choose_total_tries" tunable is actually counting retries.
Compensate for that by adding 1 in crush_do_rule when we pull the tunable
into the local variable.
This was noticed looking at output from a user provided osdmap.
Unfortunately the map doesn't illustrate the change in mapping behavior
and I haven't managed to construct one yet that does. Inspection of the
crush debug output now aligns with prior versions, though.
Reflects ceph.git commit 795704fd615f0b008dcc81aa088a859b2d075138.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ceph/crush/mapper.c | 46 +++++++++++++++++++++++++++-------------------
1 file changed, 27 insertions(+), 19 deletions(-)
--- a/net/ceph/crush/mapper.c
+++ b/net/ceph/crush/mapper.c
@@ -292,8 +292,8 @@ static int is_out(const struct crush_map
* @outpos: our position in that vector
* @tries: number of attempts to make
* @recurse_tries: number of attempts to have recursive chooseleaf make
- * @local_tries: localized retries
- * @local_fallback_tries: localized fallback retries
+ * @local_retries: localized retries
+ * @local_fallback_retries: localized fallback retries
* @recurse_to_leaf: true if we want one device under each item of given type (chooseleaf instead of choose)
* @out2: second output vector for leaf items (if @recurse_to_leaf)
*/
@@ -304,8 +304,8 @@ static int crush_choose_firstn(const str
int *out, int outpos,
unsigned int tries,
unsigned int recurse_tries,
- unsigned int local_tries,
- unsigned int local_fallback_tries,
+ unsigned int local_retries,
+ unsigned int local_fallback_retries,
int recurse_to_leaf,
int *out2)
{
@@ -344,9 +344,9 @@ static int crush_choose_firstn(const str
reject = 1;
goto reject;
}
- if (local_fallback_tries > 0 &&
+ if (local_fallback_retries > 0 &&
flocal >= (in->size>>1) &&
- flocal > local_fallback_tries)
+ flocal > local_fallback_retries)
item = bucket_perm_choose(in, x, r);
else
item = crush_bucket_choose(in, x, r);
@@ -393,8 +393,8 @@ static int crush_choose_firstn(const str
x, outpos+1, 0,
out2, outpos,
recurse_tries, 0,
- local_tries,
- local_fallback_tries,
+ local_retries,
+ local_fallback_retries,
0,
NULL) <= outpos)
/* didn't get leaf */
@@ -420,14 +420,14 @@ reject:
ftotal++;
flocal++;
- if (collide && flocal <= local_tries)
+ if (collide && flocal <= local_retries)
/* retry locally a few times */
retry_bucket = 1;
- else if (local_fallback_tries > 0 &&
- flocal <= in->size + local_fallback_tries)
+ else if (local_fallback_retries > 0 &&
+ flocal <= in->size + local_fallback_retries)
/* exhaustive bucket search */
retry_bucket = 1;
- else if (ftotal <= tries)
+ else if (ftotal < tries)
/* then retry descent */
retry_descent = 1;
else
@@ -640,10 +640,18 @@ int crush_do_rule(const struct crush_map
__u32 step;
int i, j;
int numrep;
- int choose_tries = map->choose_total_tries;
- int choose_local_tries = map->choose_local_tries;
- int choose_local_fallback_tries = map->choose_local_fallback_tries;
+ /*
+ * the original choose_total_tries value was off by one (it
+ * counted "retries" and not "tries"). add one.
+ */
+ int choose_tries = map->choose_total_tries + 1;
int choose_leaf_tries = 0;
+ /*
+ * the local tries values were counted as "retries", though,
+ * and need no adjustment
+ */
+ int choose_local_retries = map->choose_local_tries;
+ int choose_local_fallback_retries = map->choose_local_fallback_tries;
if ((__u32)ruleno >= map->max_rules) {
dprintk(" bad ruleno %d\n", ruleno);
@@ -677,12 +685,12 @@ int crush_do_rule(const struct crush_map
case CRUSH_RULE_SET_CHOOSE_LOCAL_TRIES:
if (curstep->arg1 > 0)
- choose_local_tries = curstep->arg1;
+ choose_local_retries = curstep->arg1;
break;
case CRUSH_RULE_SET_CHOOSE_LOCAL_FALLBACK_TRIES:
if (curstep->arg1 > 0)
- choose_local_fallback_tries = curstep->arg1;
+ choose_local_fallback_retries = curstep->arg1;
break;
case CRUSH_RULE_CHOOSELEAF_FIRSTN:
@@ -734,8 +742,8 @@ int crush_do_rule(const struct crush_map
o+osize, j,
choose_tries,
recurse_tries,
- choose_local_tries,
- choose_local_fallback_tries,
+ choose_local_retries,
+ choose_local_fallback_retries,
recurse_to_leaf,
c+osize);
} else {
next prev parent reply other threads:[~2014-05-11 19:46 UTC|newest]
Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-11 19:19 [PATCH 3.14 00/83] 3.14.4-stable review Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 01/83] drivers/tty/hvc: dont free hvc_console_setup after init Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 02/83] tty: serial: 8250_core.c Bug fix for Exar chips Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 03/83] tty: Fix lockless tty buffer race Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 04/83] n_tty: Fix n_tty_write crash when echoing in raw mode Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 05/83] floppy: ignore kernel-only members in FDRAWCMD ioctl input Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 06/83] floppy: dont write kernel-only members to FDRAWCMD ioctl output Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 07/83] KVM: ARM: vgic: Fix sgi dispatch problem Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 08/83] arm: KVM: fix possible misalignment of PGDs and bounce page Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 09/83] KVM: async_pf: mm->mm_users can not pin apf->mm Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 10/83] KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155) Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 11/83] MIPS: KVM: Pass reserved instruction exceptions to guest Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 12/83] KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 13/83] MIPS: Hibernate: Flush TLB entries in swsusp_arch_resume() Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 14/83] virtio_balloon: dont softlockup on huge balloon changes Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 15/83] tools/virtio: add a missing ) Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 16/83] [SCSI] virtio-scsi: Skip setting affinity on uninitialized vq Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 17/83] [SCSI] mpt2sas: Dont disable device twice at suspend Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 18/83] powerpc/compat: 32-bit little endian machine name is ppcle, not ppc Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 19/83] powerpc/tm: Disable IRQ in tm_recheckpoint Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 20/83] powerpc: Fix Oops in rtas_stop_self() Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 21/83] s390/chsc: fix SEI usage on old FW levels Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 22/83] s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 23/83] ASoC: dapm: Fix widget double free with auto-disable DAPM kcontrol Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 24/83] ARC: Remove ARC_HAS_COH_RTSC Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 25/83] SUNRPC: Ensure that call_connect times out correctly Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 26/83] SUNRPC: Ensure call_connect_status() deals correctly with SOFTCONN tasks Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 27/83] ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safe Greg Kroah-Hartman
2014-05-12 4:54 ` Vineet Gupta
2014-05-13 11:06 ` Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 28/83] framebuffer: fix cfb_copyarea Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 29/83] matroxfb: restore the registers M_ACCESS and M_PITCH Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 30/83] mach64: use unaligned access Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 31/83] mach64: fix cursor when character width is not a multiple of 8 pixels Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 32/83] b43: Fix machine check error due to improper access of B43_MMIO_PSM_PHY_HDR Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 33/83] Revert "net: mvneta: fix usage as a module on RGMII configurations" Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 34/83] ahci: do not request irq for dummy port Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 35/83] libata/ahci: accommodate tag ordered controllers Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 36/83] ahci: Ensure "MSI Revert to Single Message" mode is not enforced Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 37/83] ahci: Do not receive interrupts sent by dummy ports Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 38/83] libata: Update queued trim blacklist for M5x0 drives Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 39/83] iwlwifi: dvm: take mutex when sending SYNC BT config command Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 40/83] iwlwifi: mvm: disable uAPSD due to bugs in the firmware Greg Kroah-Hartman
2014-05-11 19:19 ` Greg Kroah-Hartman [this message]
2014-05-11 19:19 ` [PATCH 3.14 42/83] mac80211: fix potential use-after-free Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 43/83] mac80211: fix WPA with VLAN on AP side with ps-sta again Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 44/83] mac80211: fix suspend vs. authentication race Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 45/83] mac80211: fix software remain-on-channel implementation Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 46/83] mac80211: exclude AP_VLAN interfaces from tx power calculation Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 47/83] ath9k: fix ready time of the multicast buffer queue Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 48/83] locks: allow __break_lease to sleep even when break_time is 0 Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 49/83] rtlwifi: rtl8723ae: Fix too long disable of IRQs Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 50/83] rtlwifi: rtl8188ee: " Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 51/83] rtlwifi: rtl8192cu: " Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 52/83] rtlwifi: rtl8192se: " Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 53/83] rtlwifi: rtl8192se: Fix regression due to commit 1bf4bbb Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 54/83] rtlwifi: rtl8188ee: initialize packet_beacon Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 55/83] gpio: mxs: Allow for recursive enable_irq_wake() call Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 56/83] pinctrl: as3722: fix handling of GPIO invert bit Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 57/83] tgafb: fix mode setting with fbset Greg Kroah-Hartman
2014-05-11 19:19 ` [PATCH 3.14 58/83] tgafb: fix data copying Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 59/83] mtd: atmel_nand: Disable subpage NAND write when using Atmel PMECC Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 60/83] mtd: diskonchip: mem resource name is not optional Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 61/83] mtd: nuc900_nand: NULL dereference in nuc900_nand_enable() Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 62/83] mtd: sm_ftl: heap corruption in sm_create_sysfs_attributes() Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 63/83] Skip intel_crt_init for Dell XPS 8700 Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 64/83] dm cache: prevent corruption caused by discard_block_size > cache_block_size Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 65/83] dm transaction manager: fix corruption due to non-atomic transaction commit Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 66/83] dm: take care to copy the space map roots before locking the superblock Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 67/83] dm thin: fix dangling bio in process_deferred_bios error path Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 68/83] dm cache: fix a lock-inversion Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 69/83] dma: edma: fix incorrect SG list handling Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 70/83] aio: v4 ensure access to ctx->ring_pages is correctly serialised for migration Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 71/83] lockd: ensure we tear down any live sockets when socket creation fails during lockd_up Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 72/83] lib/percpu_counter.c: fix bad percpu counter state during suspend Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 73/83] mmc: sdhci-bcm-kona: fix build errors when built-in Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 74/83] thinkpad_acpi: Fix inconsistent mute LED after resume Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 75/83] Input: synaptics - add min/max quirk for ThinkPad T431s, L440, L540, S1 Yoga and X1 Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 76/83] Input: synaptics - add min/max quirk for ThinkPad Edge E431 Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 77/83] cpufreq: loongson2_cpufreq: dont declare local variable as static Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 78/83] cpufreq: at32ap: " Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 79/83] ACPI / processor: Fix failure of loading acpi-cpufreq driver Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 80/83] cpufreq: unicore32: fix typo issue for clk Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 81/83] drm: cirrus: add power management support Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 82/83] drm: bochs: " Greg Kroah-Hartman
2014-05-11 19:20 ` [PATCH 3.14 83/83] x86-64, build: Fix stack protector Makefile breakage with 32-bit userland Greg Kroah-Hartman
2014-05-11 22:52 ` [PATCH 3.14 00/83] 3.14.4-stable review Guenter Roeck
2014-05-12 20:30 ` Greg Kroah-Hartman
2014-05-12 21:53 ` Shuah Khan
2014-05-12 22:28 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140511191912.566314924@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=ilya.dryomov@inktank.com \
--cc=josh.durgin@inktank.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox