~olly/yoyo#58: 
Add support for in-memory migrations

Created to preserve this discussion here before bitbucket deletes it.

Original proposal, from Dan Lidral-Porter:

Add support for defining migrations without files

Add the InMemoryMigration class to yoyo.migrations, which is identical to a Migration in every way, except:

  • the third argument to its constructor is the source of the migration, instead of a path to a file to load that source from;
  • its path is a hex representation of its memory address;
  • its repr() indicates it's 'at' a Memory address rather than 'from' a file.

In order for the InMemoryMigration class to share its implementation of load() with Migration, make a few changes to the Migration class:

  • move the part of load() that opens the Migration's file to read its source to a new load_source() method, and call that method in load();
  • add a path_name property to the class, which calls
  • os.path.basename() on the migration's path;
  • change the call to compile() in load() to use the new path_name property as the second arg, as the old arg was the name of the open file object where the source was being read from, and that is no longer available now that reading the source happens in a different method.

Add a post-apply version of an InMemoryMigration, creatively called InMemoryPostApplyHookMigration. Currently, the only place in the codebase where PostApplyHookMigrations are constructed is in the read_migrations function, and it doesn't make much sense to write an analogue of that function for in-memory migrations, therefore it's up to users to know when they need to use this class, and to add this kind of migration to the post_apply portion of a Migration

Add support for defining migrations without files

Add the InMemoryMigration class to yoyo.migrations, which is identical to a Migration in every way, except:

the third argument to its constructor is the source of the
migration, instead of a path to a file to load that source from;
its path is a hex representation of its memory address;
its `repr()` indicates it's 'at' a Memory address rather than 'from'
a file.

In order for the InMemoryMigration class to share its implementation of load() with Migration, make a few changes to the Migration class:

move the part of `load()` that opens the Migration's file to read
its source to a new `load_source()` method, and call that method in
`load()`;
add a path_name property to the class, which calls
`os.path.basename()` on the migration's path;
change the call to `compile()` in `load()` to use the new
`path_name` property as the second arg, as the old arg was the name
of the open file object where the source was being read from, and
that is no longer available now that reading the source happens in
a different method.

Add a post-apply version of an InMemoryMigration, creatively called InMemoryPostApplyHookMigration.

Currently, the only place in the codebase where PostApplyHookMigrations are constructed is in the read_migrations function, and it doesn't make much sense to write an analogue of that function for in-memory migrations, therefore it's up to users to know when they need to use this class, and to add this kind of migration to the post_apply portion of a MigrationList rather than among the normal items. List rather than among the normal items.

Status
REPORTED
Submitter
~olly
Assigned to
No-one
Submitted
4 years ago
Updated
4 years ago
Labels
No labels applied.

~olly 4 years ago

@mcgrawms9 later added some helpful context:

I found my way to this project for the first time today while I was researching database migration libraries for Python. Once in the repo I quickly found my way to this issue and I wanted to share how I got here, as it may be tangentially related to what Dan had in mind.

Recently I’ve been working on a prototype that would allow users to create and modify domain models at runtime, almost like how you might expect a UI based ORM to work (though this is not what the project is). Imagine someone had an existing domain object they wanted to add a new attribute to - they might make a change in the UI, which sends a PUT request to a web service to handle a “modifyEntity” action with an “entityID”, “newAttribute”, “attributeType”, and a set of “attributeParameters”. This service would wrap generate DDL statements that would eventually be run against the database - effectively creating migrations as regular strings.

My experience and research has mostly found that migration libraries require the migrations to be saved as files, and the files are loaded and executed during some sort of a CI/build process. While I can imagine making a file based system work, in my previous example I really want to commit that migration to a database table, “submitted_migrations”, and then use yoyo from Python code with the submitted_migrations table load_migrations(“select * from submitted_migrations where is_applied = false order by id asc;“). This isn’t the actual API I would expect from load_migrations, just an example of the intention.

If an InMemoryMigration class were added then someone could implement the system I sketched out by persisting the submitted migration however they chose, attempt to run the migration, and then handling success/failure as required.

This is a bit rough around the edges, but I think it demonstrates the idea well. I apologize for the any formatting issues, I was fighting the markdown a bit.

You’ll need sqlite3 of course and all the yoyo dependencies, but if you run that first as “python3.7 runtime_migrations.py” it will fail, and next “python3.7 runtime_migrations.py yoyo” it will succeed.

import subprocess
import sys
import time
import uuid

from typing import Any, Dict, Tuple, Type

from sqlalchemy import (create_engine, 
    Table, Column, Integer, String, MetaData, ForeignKey)


class RuntimeSchemaManager:
    """
    A class to manage runtime schema changes for a single backend.

    This class doesn't concern itself with the query execution details.
    """

    def __init__(self, metadata: Type[MetaData],
                       schema_tables: Dict[str, bool],
                       type_catalog: Dict[str, Any]):
        self.metadata = metadata
        self.schema_tables = schema_tables
        self.type_catalog = type_catalog

    def build_table(self, name: str) -> Type[Table]:
        if name in self.schema_tables:
            raise ValueError("{} already exists".format(name))
        self.schema_tables[name] = True
        return Table(name, self.metadata)


    def build_column(self, name: str,
                           column_type: str) -> Type[Table]:
        if column_type not in self.type_catalog.keys():
            raise ValueError("{} not a legal type".format(column_type))
        return Column(name, self.type_catalog[column_type])


# standalone function for demonstrating SQLAlchemy capabilities
def update_database(metadata: Type[MetaData], engine: Any):
    """
    This is not sufficient to complete script as once SQLAlchemy
    has binded to the database engine, mutations are not possible.
    """
    metadata.create_all(engine)


from yoyo import (get_backend)
from yoyo.migrations import (MigrationList, InMemoryMigration)

class RuntimeMigrationManager:
    """
    A class to manage runtime migrations for a single backend.
    """
    def __init__(self, backend: Type[Any]):
        self.backend = backend

    def run_migration(self, migration: str, migration_name: str = None):
        # allow a user naming convention
        migration_suffix = ""
        if migration_name is not None:
            migration_suffix = migration_name

        # aim for unique id construction
        migration_id = str(uuid.uuid1()) + migration_suffix

        # wrap this snippet as it does not effect the migration
        migration_wrapper = """from yoyo import step; step('{}')"""

        # build the InMemoryMigration
        try:

            next_migration = InMemoryMigration(migration_id, migration_wrapper.format(migration))
            migrations = MigrationList([next_migration])

            with self.backend.lock():
                # https://ollycope.com/software/yoyo/latest/#using-yoyo-from-python-code
                # one way migration that doesn't have a rollback =)
                self.backend.apply_migrations(self.backend.to_apply(migrations))

        except Exception as e:
            raise IOError("Could not run migration successfully\nTrace: {}".format(e))


if __name__ == "__main__":
    db_dsn_root = "sqlite:///"
    db_name = "runtime_migrations.db"
    db_dsn = db_dsn_root + db_name

    # remove old database if it exists
    subprocess.run(["rm", db_name])

    using_yoyo = True if len(sys.argv) > 1 and sys.argv[1] == "yoyo" else False
    if using_yoyo:
        print("INFO:Using yoyo migrations")
        backend = get_backend(db_dsn)
        migration_manager = RuntimeMigrationManager(backend=backend)

    # engine, conn, metadata required for using SQLAlchemy core
    engine = create_engine(db_dsn)
    conn = engine.connect()
    metadata = MetaData()

    # a mapping of SQLAlchemy types our schema will support
    type_catalog = {
        "int": Integer,
        "string": String
    }
    global_tables = {}

    schema_manager = RuntimeSchemaManager(metadata=metadata,
                                          schema_tables=global_tables,
                                          type_catalog=type_catalog)

    users = schema_manager.build_table(name="users")
    users.append_column(schema_manager.build_column(name="id", column_type="int"))
    users.append_column(schema_manager.build_column(name="name", column_type="string"))
    users.append_column(schema_manager.build_column(name="fullname", column_type="string"))

    if not using_yoyo:
        # this call is required to create anything or bind to existing tables
        # using the stock SQLAlchemy example
        update_database(metadata=metadata, engine=engine)
    if using_yoyo:
        # how we construct the migration is not especially interesting
        migration = "create table users (id int, name text, fullname text);"
        migration_manager.run_migration(migration=migration)

    ins = users.insert().values(id=1, name="mickey", fullname="Mickey Mouse")
    result = conn.execute(ins)

    # this would be more complicated in the actual system, but we'll make
    # a runtime mutation to our schema
    users.append_column(schema_manager.build_column(name="residence", column_type="string"))

    if not using_yoyo:
        # then we don't make it through this block
        # because this doesn't work and ... go to =>
        update_database(metadata=metadata, engine=engine)

        # => no migration is run and this fails
        ins = users.insert().values(id=2, name="mickey", fullname="Mickey Mouse", residence="Florida")
        result = conn.execute(ins)

        # to the best of my knowledge SQLAlchemy cannot handle a runtime migration
        # maybe alembic can, but I couldn't easily track it down...

    if using_yoyo:
        # then we will make it through this block!
        # and we'll have the nice yoyo migration management in the database

        # again, ignore how this is created...
        migration = "alter table users add column residence text;"
        migration_manager.run_migration(migration=migration, migration_name="add_residence_to_users")

        # => migration is good and we can insert
        ins = users.insert().values(id=2, name="goofy", fullname="Goofy Goof", residence="Florida")
        result = conn.execute(ins)
Register here or Log in to comment, or comment via email.