[DRBD-user] Etcd questions

kvaps kvapss at gmail.com
Tue Sep 17 12:58:50 CEST 2019


Hi Robert,
Very intelligible, thanks for your answers!

In my opinion LINSTOR has a really good and thoughtful design, and I'm
really glad to work with it.

> As you can see, I could probably write an entire book about it all, but
> I'll stop here for now.

Great idea, it would be very useful for novice software developers. Let us
know if you decide to do it :)

Thank you for everything!

- kvaps


On Tue, Sep 17, 2019 at 12:36 PM Robert Altnoeder <
robert.altnoeder at linbit.com> wrote:

> Hello,
>
> On 9/16/19 12:01 PM, kvaps wrote:
> > Hi guys!
> >
> > Thanks for your beautiful work implementing etcd support for the
> > LINSTOR server.
> > I'm really glad that LINSTOR keeps up with the other
> > cloud-native projects and provides an opportunity to use common
> > interfaces like etcd for storing configuration.
> >
> > I have a few questions about etcd and the future of LINSTOR with it:
> >
> > 1. Does etcd have any limitations in comparison with standard sql
> > backends? (in case of using --max-txn-ops 1024)
>
> From a technical perspective, it is virtually nothing but one
> huge pile of limitations compared to a modern SQL database:
>
> - As you mentioned, it can only do a fixed number of operations per
> transaction
>
> - It also cannot touch the same item twice in one transaction.
>
> - It is not type safe, everything is a string, so everything that is
> written must be serialized to strings, and everything that is read must
> be parsed from strings, which is more complex and also much slower than
> simple data serialization. It also requires more exception handling code
> and creates a greater potential for software bugs.
>
> - Etcd is only a key/value store, so it does not have a structure
> (tables, rows, columns). Therefore, multiple columns belonging to one
> key must be either serialized into multiple key/value pairs (increasing
> the number of operations per transaction) or the columns must be
> serialized into one string (increasing complexity due to the additional
> parsing)
>
> - It does not support constraints, such as foreign key constraints or
> checks of the values that go into it. You could, for example, store a
> TCP/IP port number of -70,000 just fine, or a duplicate one, while a
> DBMS would have prevented such incorrect entries even in the presence of
> a bug in LINSTOR code. That makes the data less robust than it would be
> when stored in a DBMS.
>
> - It cannot combine entries or their fields, e.g. like an SQL JOIN can,
> so that must be coded into our software if required
>
> - We cannot automatically transform it as we can with a DBMS. It cannot
> be instructed to "put all values from this table into this other table"
> or "change all the values where this condition matches". Transformations
> typically require parsing and loading all the affected entries, then
> writing our own logic to make any changes, and then serializing and
> storing all the entries again.
>
> - It is far less maintainable. Finding, changing or deleting one or
> multiple entries, or just fields of some entries, is quite simple if you
> can just type in some SQL manually. There is nothing like that in etcd.
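The two transaction limits from the list above (a fixed number of operations per transaction, and no key touched twice in one transaction) can be illustrated with a small Python sketch. It talks to no etcd server and uses no client library. The function name `plan_transactions`, the constant `MAX_TXN_OPS` and the key paths are assumptions made up for illustration; they are not LINSTOR code.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical limit, mirroring the --max-txn-ops 1024 value from the question.
MAX_TXN_OPS = 1024

Write = Tuple[str, str]  # (key, value); everything is a string


def plan_transactions(writes: Iterable[Write],
                      max_ops: int = MAX_TXN_OPS) -> List[List[Write]]:
    """Group writes into batches that respect both limits."""
    if max_ops < 1:
        raise ValueError("max_ops must be at least 1")

    # A key may appear only once per transaction, so repeated writes to
    # the same key are collapsed; the last value written wins.
    latest: Dict[str, str] = {}
    for key, value in writes:
        latest[key] = value

    # Split the remaining writes into batches of at most max_ops entries.
    items = list(latest.items())
    return [items[i:i + max_ops] for i in range(0, len(items), max_ops)]


writes = [
    ("/example/nodes/a/port", "7000"),
    ("/example/nodes/b/port", "7001"),
    ("/example/nodes/a/port", "7002"),  # second write to the same key
]
batches = plan_transactions(writes, max_ops=2)
```

Note that once a change set has to be split into several batches, it is no longer atomic as a whole, which is part of what makes the limit painful compared to an SQL transaction.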
>
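The points about type safety, missing structure and missing constraints can be sketched in Python as well. With a plain key/value store, one record becomes several string entries, and every check a DBMS would enforce (valid port range, no duplicates) has to be written by hand. The class `NetInterface`, the helper functions and the `/example/netif` key layout are hypothetical and are not LINSTOR's actual etcd schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class NetInterface:
    node: str
    address: str
    port: int


def to_kv(iface: NetInterface, prefix: str = "/example/netif") -> Dict[str, str]:
    """Serialize one record into multiple key/value string pairs."""
    base = f"{prefix}/{iface.node}"
    return {f"{base}/address": iface.address, f"{base}/port": str(iface.port)}


def from_kv(node: str, kv: Dict[str, str],
            prefix: str = "/example/netif") -> NetInterface:
    """Parse the strings back and apply checks the store does not do."""
    base = f"{prefix}/{node}"
    try:
        address = kv[f"{base}/address"]
        port_text = kv[f"{base}/port"]
    except KeyError as exc:
        raise ValueError(f"incomplete entry for node {node!r}: {exc}") from exc
    port = int(port_text)  # raises ValueError if the text is not a number
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} out of range for node {node!r}")
    return NetInterface(node, address, port)


def check_unique_ports(ifaces: Iterable[NetInterface]) -> None:
    """Hand-written stand-in for a UNIQUE constraint on (address, port)."""
    seen: Dict[tuple, str] = {}
    for iface in ifaces:
        key = (iface.address, iface.port)
        if key in seen:
            raise ValueError(
                f"{iface.node!r} and {seen[key]!r} both use {iface.address}:{iface.port}")
        seen[key] = iface.node


iface = NetInterface("alpha", "10.0.0.1", 7000)
stored = to_kv(iface)
restored = from_kv("alpha", stored)
```

The store itself would accept `"-70000"` as a port value without complaint; only the parsing code in `from_kv` rejects it, so a bug there goes unnoticed in a way that a database constraint would have caught.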
>
> It's just marginally better than writing to files (at least it offers
> some kind of transactions). But apart from that, put bluntly, it's a bit
> like going back to the 1960s.
>
>
> But to be fair, even supporting multiple SQL databases is not as
> carefree as it might seem. I like to jokingly call any database a
> NoSQL-database, because none of the SQL databases actually implements
> SQL, they all implement a chaotic mix of subsets, supersets and
> variations of SQL. I'm tempted to say that SQL doesn't even exist in the
> real world, except as an idea in a book that no one ever read after it
> was written.
> That being said, due to some kind of a miracle, we're still able to run
> four different databases with the same database driver in LINSTOR, with
> only a few conditional changes here and there.
>
> >
> > 2. What about the future of the SQL backends? Are you going to focus on
> > etcd as the main backend, or continue using SQL and leave etcd as an option?
> >
>
> It was meant to be an option, not a replacement. However, nothing is
> cast in stone in the real world, even changes to technically worse
> solutions are very common in the IT world (unfortunately), due to
> various reasons.
> From today's perspective, I would expect that we will continue using SQL
> databases as the main backend and leave etcd as an option.
>
> > 3. Following on from the previous questions, what's preferred for large
> > deployments? etcd or SQL?
>
> The most powerful, robust and maintainable option would be a centralized
> database cluster. For most customers, I would recommend the PostgreSQL
> database for such installations.
>
>
> I'll add some background to shed some light on the development effort
> behind LINSTOR:
>
> I have to admit that LINSTOR is a bit of an alien in its environment.
> And that was actually done on purpose. When I created the initial design
> for LINSTOR in 2016, our background at LINBIT was the reliability,
> robustness, scalability and maintainability nightmare that we had gone
> through with drbdmanage, which was LINSTOR's predecessor (when LINSTOR
> development started, the project was actually still called "drbdmanage
> next generation" internally). Drbdmanage was built around its typical
> environment, some Linux server with DRBD installed, with D-Bus as an IPC
> protocol, a filesystem with config files, a Python interpreter, simple
> JSON documents for persistence. It turned out to be way too limited to
> continue developing it.
>
> Most of the ideas behind LINSTOR were the result of ignoring all the
> conventions, traditions and half-baked solutions that existed already in
> those typical environments, and instead asking the question: What would
> be the theoretical ideal solution in a perfect world, and then, how
> close can one get to something like that within real-world limits -
> limited developer time, money, limited hard- and software environments,
> etc.
>
> That is why it was built around a full-blown SQL database, why it
> originally used its own communication protocol for IPC, why all the
> object names are different from drbdmanage's, why it doesn't have
> drbdmanage's "--force" flags, why it writes its own error report files
> in addition to using syslog, and that is also why it is so different
> from its environment. I did not originally design LINSTOR to work or to
> look like a typical Unix/Linux application, or to use whatever the most
> widespread or most convenient protocol or data format is, or to e.g.
> have a single simple numeric return code as most usual applications do.
> Instead, my intention was to make the design more robust, more
> consistent, more maintainable and also more scalable by avoiding many of
> the weaknesses found in more conventional technology.
>
> The introduction of etcd as an option, the replacement of the binary API
> with a REST webserver, the use of DRBD quorum instead of fencing in
> cloud environments, the presence of configuration files instead of
> configuration utilities, all those things were adjustments made to fit
> certain limitations, not because those solutions are technically better.
> They aren't, they are just what either the rest of the technology around
> LINSTOR or the users can deal with more easily.
>
> In the real world, it's always a compromise.
>
> As you can see, I could probably write an entire book about it all, but
> I'll stop here for now.
> Anyhow, I hope I could provide some insight into what the challenges and
> ideas behind the development of LINSTOR are.
>
> br,
> Robert
>
> _______________________________________________
> Star us on GITHUB: https://github.com/LINBIT
> drbd-user mailing list
> drbd-user at lists.linbit.com
> https://lists.linbit.com/mailman/listinfo/drbd-user
>