From: Vladislav Bolkhovitin <vvvvvst@gmail.com>
To: "杨苏立 Yang Su Li" <suli@cs.wisc.edu>
Cc: General Discussion of SQLite Database <sqlite-users@sqlite.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	drh@hwaci.com
Subject: Re: [sqlite] light weight write barriers
Date: Tue, 23 Oct 2012 15:53:11 -0400
Message-ID: <5086F5A7.9090406@vlnb.net>
In-Reply-To: <CABK4GYNKF6LCgsQ5SN+dATtRm-0Qh_QmNdqZqZcj6S98z+ofXg@mail.gmail.com>

杨苏立 Yang Su Li, on 10/11/2012 12:32 PM wrote:
> I am not quite sure whether I should ask this question here, but in terms
> of light weight barrier/fsync, could anyone tell me why the device
> driver / OS provides the barrier interface rather than some other
> abstraction? I am sorry if this sounds like a stupid question
> or if it has been discussed before....
>
> I mean, most of the time we only need some ordering of writes; not a
> complete order, but a partial, very simple topological order. And a
> barrier seems to be a heavyweight solution for achieving this:
> you have to finish all writes before the barrier, then start all
> writes issued after the barrier. That is an ordering much
> stronger than what we need, isn't it?
>
> As most of the time the order we need does not involve too many blocks
> (certainly a lot fewer than all the cached blocks in the system or in
> the disk's cache), that topological order isn't likely to be very
> complicated, and I imagine it could be implemented efficiently in a
> modern device, which already has complicated caching/garbage
> collection/whatever going on internally. In particular, it seems not
> too hard to implement on top of SCSI's ordered/simple task mode?

Yes, SCSI has full support for ordered/simple commands, designed exactly for that
task: to keep a steady flow of commands even when some of them are ordered.
It also has the facilities needed to handle command errors without unexpected
reordering of subsequent commands (ACA, etc.). Together these let you get full
storage performance by keeping the pipe full, to use a networking term. I can
easily imagine real-life configs where this brings 2+ times the performance of
queue flushing.

In fact, AFAIK, AIX requires storage to support ordered commands and ACA.

Implementation should be relatively easy as well, because all transports naturally
have the link as the point of serialization. So all you need in a multithreaded
environment is to pass some SN from the point where each ORDERED command is created
to the point where it is sent to the link, and to make sure that no SIMPLE command
can ever cross an ORDERED command. You can see how it is implemented in SCST in an
elegant and lockless manner (for SIMPLE commands).
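
To make the SN idea concrete, here is a rough sketch in Python. It is a toy,
single-threaded model and not SCST's actual code: the names LinkSerializer,
create() and ready() are made up for illustration, and it assumes the SN is a
plain counter shared by a group of SIMPLE commands, with each ORDERED command
getting an SN of its own. Commands may become ready for the link in any order,
but only those whose SN matches the expected one are sent.

```python
from dataclasses import dataclass

SIMPLE, ORDERED = "SIMPLE", "ORDERED"


@dataclass
class Command:
    name: str
    attr: str
    sn: int  # assigned at creation time, checked at the link


class LinkSerializer:
    """Hypothetical model: the link is the single point of serialization."""

    def __init__(self):
        self.cur_sn = 0       # SN given to newly created SIMPLE commands
        self.expected_sn = 0  # only commands with this SN may go on the link
        self.pending = {}     # sn -> commands created but not yet sent
        self.deferred = []    # ready commands still waiting for their SN
        self.sent = []        # command names in the order they hit the link

    def create(self, name, attr):
        if attr == ORDERED:
            self.cur_sn += 1  # close the current group of SIMPLE commands
            sn = self.cur_sn  # the ORDERED command gets an SN of its own
            self.cur_sn += 1  # later commands start a new group
        else:
            sn = self.cur_sn  # SIMPLE commands share the open group's SN
        self.pending[sn] = self.pending.get(sn, 0) + 1
        return Command(name, attr, sn)

    def ready(self, cmd):
        """Called when a command reaches the link, in any order."""
        self.deferred.append(cmd)
        self._drain()

    def _drain(self):
        progress = True
        while progress:
            progress = False
            for cmd in list(self.deferred):
                if cmd.sn == self.expected_sn:
                    self.deferred.remove(cmd)
                    self.sent.append(cmd.name)
                    self.pending[cmd.sn] -= 1
                    progress = True
            # Move on once every command of a closed group has been sent.
            while (self.expected_sn < self.cur_sn
                   and self.pending.get(self.expected_sn, 0) == 0):
                self.expected_sn += 1
                progress = True


link = LinkSerializer()
w1 = link.create("w1", SIMPLE)
w2 = link.create("w2", SIMPLE)
commit = link.create("commit", ORDERED)
w3 = link.create("w3", SIMPLE)

# The commands become ready out of creation order.
for cmd in (w3, commit, w2, w1):
    link.ready(cmd)

print(link.sent)  # ['w2', 'w1', 'commit', 'w3']
```

The two SIMPLE writes created before the ORDERED command may swap places with
each other, but nothing crosses the ORDERED command in either direction, and
the queue is never flushed. A real implementation would need atomic counters
instead of plain integers to be safe across threads.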

But historically, for some reason, Linux storage developers were stuck with the
"barriers" concept, which is obviously not the same as ORDERED commands, and hence
had a lot of trouble with its ambiguous semantics. As far as I can tell, the reason
for that was a lack of sufficiently deep SCSI understanding (how to handle errors,
the belief that ACA is something legacy from parallel SCSI times, etc.).

Hopefully, the storage developers will eventually realize the value of ordered
commands and learn the corresponding SCSI facilities for dealing with them. It's
quite easy to demonstrate this value if you know where to look and are not blindly
rejecting the possibility. I have already tried to explain it a couple of times,
but was not successful.

Until that happens, people will keep returning again and again with those simple
questions: why must the queue be flushed for any ordered operation? Isn't it
obvious overkill?

Vlad
