In modern software landscapes, multiple applications, which co-exist in potentially multiple versions, share one database as their single point of truth. Managing the multitude of dependent schema versions manually is highly error-prone and accounts for significant costs in software projects. Developers have to implement the translation of data accesses between schema versions with hand-written views, triggers, and migration scripts. With InVerDa, we will develop concepts and tools for integrated, easy, and robust versioning of databases. The key idea is to rethink the way databases evolve to new schema versions. We use the richer semantics of a descriptive database evolution language to generate all the required artifacts automatically. The developer only needs to write a compact description of the evolution to a new schema version, which is then immediately accessible without any further manual implementation.
Our research approach:
We decouple the logical external schema versions from the actual internal physical materialization. Developers describe the evolution of external schema versions using InDEL, a language of invertible evolution operations. Each InDEL operation describes the evolution of both the schema and the currently existing data. We use the richer semantics of InDEL to automatically generate views for each schema version, as well as triggers that enable writes on any schema version. All schema versions thus stay continuously accessible without a single additional line of hand-written SQL, which makes database versioning with InVerDa robust and far less error-prone.
Let's consider a task management system called TasKy, which users can install on their desktops and which is backed by a central database. TasKy allows users to create, list, update, and delete tasks. Each task has an author and a priority ranging from 1 to 3, with 1 being the most urgent. At first, we create the naive schema as shown below. As users like to have their most urgent tasks listed on their phone, we additionally incorporate a third-party phone app called Do!. To match its schema, we simply execute the InDEL evolution script as shown in the figure. The TasKy data can immediately be read and written through the newly incorporated Do! app. At this point, InVerDa has already simplified our job significantly.
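To give a feeling for the kind of delta code that would otherwise have to be written by hand, the following is a minimal sketch using Python's sqlite3 module. It is not InVerDa's generated output: the column names (author, task, prio), the view name Todo, and the trigger names are assumptions made for illustration. The sketch exposes the most urgent tasks as a second schema version and redirects writes on it to the physical table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Physical table of the initial TasKy schema (column names are assumed).
    CREATE TABLE Task (author TEXT, task TEXT, prio INTEGER);

    -- Hypothetical Do! schema version: only the most urgent tasks, no priority.
    CREATE VIEW Todo AS
        SELECT author, task FROM Task WHERE prio = 1;

    -- Inserts on the view are redirected to the physical table with prio = 1.
    CREATE TRIGGER Todo_insert INSTEAD OF INSERT ON Todo
    BEGIN
        INSERT INTO Task (author, task, prio) VALUES (NEW.author, NEW.task, 1);
    END;

    -- Deletes on the view remove the matching urgent task from the table.
    CREATE TRIGGER Todo_delete INSTEAD OF DELETE ON Todo
    BEGIN
        DELETE FROM Task
        WHERE author = OLD.author AND task = OLD.task AND prio = 1;
    END;
""")

# Writes through the TasKy schema version ...
conn.execute("INSERT INTO Task VALUES ('Ann', 'Organize party', 3)")
conn.execute("INSERT INTO Task VALUES ('Ben', 'Visit grandma', 1)")
# ... and through the Do! schema version end up in the same physical table.
conn.execute("INSERT INTO Todo (author, task) VALUES ('Ann', 'Write paper')")
conn.commit()

todo_rows = conn.execute(
    "SELECT author, task FROM Todo ORDER BY author").fetchall()
task_rows = conn.execute(
    "SELECT author, task, prio FROM Task ORDER BY author, task").fetchall()
print(todo_rows)
print(task_rows)
```

Even for this tiny example, one view and one trigger per write operation are needed; an update trigger is omitted here for brevity.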
For the next release, TasKy2, we decide to normalize the table Task. Since we plan a stepwise roll-out of TasKy2, the old schema of TasKy has to remain alive until all clients have been updated. With the InDEL evolution script shown below, InVerDa creates the new schema version and ensures that all write operations to any of the three schema versions are propagated to all other schema versions. Assume user Ann has already upgraded to TasKy2 and changes the priority of "Organize party" to 1; this task will then immediately appear in the Do! app on her phone.
At first, the data is kept physically in the initial schema version as shown above. However, because the availability of external schema versions is decoupled from the actual materializing schema, we can easily change this materializing schema. Assume that some weeks after the release of TasKy2, the majority of users have upgraded to the new version. TasKy2 comes with its own phone app, so the schemas TasKy and Do! are still accessed, but only by a minority of users. Hence, it now seems appropriate to physically migrate the data to the TasKy2 schema.
Traditionally, we would have to write a migration script that moves the data and implements new views and triggers. With InVerDa, we achieve the same effortlessly with the one-liner: MATERIALIZATION TasKy2; Upon this statement, InVerDa transparently runs the physical data migration and updates the involved views and triggers of all schema versions. All schema versions stay available; read and write operations are merely propagated to a different materializing schema.
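For comparison, a hand-written migration of this kind could look roughly like the following sqlite3 sketch. It is an illustration under assumptions, not what InVerDa executes: the normalized table names Task2 and Author and all column names are hypothetical. The data moves to the normalized tables, and the old schema version is kept readable as a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Data initially materialized in the TasKy schema (assumed columns).
    CREATE TABLE Task (author TEXT, task TEXT, prio INTEGER);
    INSERT INTO Task VALUES ('Ann', 'Organize party', 1),
                            ('Ben', 'Visit grandma', 2);
""")
before = conn.execute(
    "SELECT author, task, prio FROM Task ORDER BY task").fetchall()

conn.executescript("""
    BEGIN;
    -- Hypothetical normalized tables of the TasKy2 schema.
    CREATE TABLE Author (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE Task2 (task TEXT, prio INTEGER,
                        author INTEGER REFERENCES Author(id));

    -- Physically move the data into the normalized tables.
    INSERT INTO Author (name) SELECT DISTINCT author FROM Task;
    INSERT INTO Task2 (task, prio, author)
        SELECT t.task, t.prio, a.id
        FROM Task t JOIN Author a ON a.name = t.author;

    -- Replace the old table by a view so the TasKy schema stays readable.
    DROP TABLE Task;
    CREATE VIEW Task AS
        SELECT a.name AS author, t.task, t.prio
        FROM Task2 t JOIN Author a ON a.id = t.author;
    COMMIT;
""")
after = conn.execute(
    "SELECT author, task, prio FROM Task ORDER BY task").fetchall()
task_kind = conn.execute(
    "SELECT type FROM sqlite_master WHERE name = 'Task'").fetchone()[0]
print(before == after, task_kind)
```

A complete hand-written migration would additionally need INSTEAD OF triggers on the new view and updated views and triggers for the Do! schema, which is exactly the effort the single statement above is meant to save.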
Internally, InVerDa represents the InDEL operations as sets of Datalog rules, which gives us several advantages:
- We have formally shown the correct and complete invertibility of all InDEL operations.
- We generate the delta code (views, triggers, migration scripts) directly from Datalog rules.
- We can easily add and change InDEL operations by specifying their Datalog rules.
- There are many more opportunities regarding combination and optimization of evolution operations, which we will investigate in the future.
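The notion of invertibility can be illustrated with a small Python sketch in which relations are sets of tuples, loosely mimicking Datalog facts. The two mappings below are hypothetical and are not InDEL's actual rule sets; in particular, the auxiliary relation aux that stores the tuples not visible in the target schema is an assumption made for this example. The sketch checks that mapping data forward and backward again returns the original relation, and vice versa.

```python
def forward(task):
    """Map the source relation to the target side.

    Returns the visible target relation `todo` (urgent tasks without
    priority) and an auxiliary relation `aux` holding everything else.
    """
    todo = {(author, name) for (author, name, prio) in task if prio == 1}
    aux = {(author, name, prio) for (author, name, prio) in task if prio != 1}
    return todo, aux


def backward(todo, aux):
    """Map the target side back to the source relation."""
    return {(author, name, 1) for (author, name) in todo} | aux


task = {
    ("Ann", "Organize party", 3),
    ("Ann", "Write paper", 1),
    ("Ben", "Visit grandma", 1),
}

# Round trip source -> target -> source preserves the data.
todo, aux = forward(task)
restored = backward(todo, aux)
print(restored == task)

# Round trip target -> source -> target preserves the data as well,
# as long as aux contains no tuples with priority 1.
print(forward(backward(todo, aux)) == (todo, aux))
```

Both checks hold for this data set; a formal proof has to show the same for all possible database states.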
Thanks for your interest in InVerDa. Feel free to contact us with any questions, feedback, or ideas.
Example Evolution for Wikimedia:
We modelled the evolution of 171 versions of Wikimedia with our evolution language to show that it is practically complete and hence feasible for common software projects. Download it here. Benchmark taken from: C. Curino, L. Tanca, H. J. Moon, and C. Zaniolo, “Schema evolution in wikipedia: toward a web information system benchmark,” in ICEIS, 2008, pp. 323–332.
Kai Herrmann, Hannes Voigt, Wolfgang Lehner: Cinderella - Adaptive Online Partitioning of Irregularly Structured Data. 9th International Workshop on Self-Managing Database Systems (SMDB 2014), Chicago, IL, USA
Kai Herrmann, Jan Reimann, Hannes Voigt, Birgit Demuth, Stefan Fromm, Robert Stelzmann, Wolfgang Lehner: Database Evolution for Software Product Lines. DATA 2015, Colmar, France
Kai Herrmann, Hannes Voigt, Andreas Behrend, Wolfgang Lehner: CoDEL - A Relationally Complete Language for Database Evolution. ADBIS 2015, Futuroscope, Poitiers, France