How to migrate Mix Release projects

Fly.io runs apps close to users, by transmuting Docker containers into micro-VMs that run on our own hardware around the world. This post is part of the Safe Ecto Migrations series guide. If you just want to ship your Phoenix app, the easiest way to learn more is to try it out; you can be up and running in just a couple minutes.

This is part 2 in a 4-part series on designing and running Safe Ecto Migrations.

Not long ago, deploying and managing Elixir projects was not as straightforward as it is today; some might say it was downright painful. Thankfully, since Elixir 1.9, Mix ships with tools to help developers assemble applications for deployment. How you ship that binary to its destination is still entirely up to you, but it's now a simpler and more common task!

Before the adoption of pre-compiled releases (made possible by Mix Release and, before that, Distillery), it was more common to install Elixir (and therefore mix) directly on the server. This meant you would copy your code to the servers and use mix to start your application there. A significant downside was the long start-up time: the application had to be compiled before it could start, and during that time the server reported "unhealthy" because the application wasn't running. Pre-compiled releases solve this problem and bring other benefits.

Since Mix is a development tool, it isn't included in a runtime release. This creates a challenge because other common Mix operations are creating and migrating databases. Locally, for development and testing, you can run mix ecto.migrate && mix phx.server and you're done! With the mix command missing on the server, developers need another way to manage the application's database in production.

On our servers, we need to easily perform the following tasks:

  1. Check the status of migrations.
  2. Migrate Repos up to X migration. Default to the latest migration.
  3. Roll back to X migration for a specific Repo.

The recommended way to encapsulate these commands is with a MyApp.Release module. This module serves as an entry point into your application for managing release-related tasks.

Let's create that module.

Release Module

Here is the Ecto SQL example:

defmodule MyApp.Release do
  @app :my_app

  def migrate do
    for repo <- repos() do
      {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end

  def rollback(repo, version) do
    {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :down, to: version))
  end

  defp repos do
    Application.load(@app)
    Application.fetch_env!(@app, :ecto_repos)
  end
end

Most of the work happens in Ecto.Migrator, which is great because it keeps our code slim and neat. But, we need to add a little bit to it:

  • There isn't a function that prints out the migrations' status. This is helpful for a sanity check. You might want to know which migration will execute next when the migrations are run.
  • In most cases you want to deploy frequently enough that only one migration runs at a time. Sometimes, though, a lot of work goes out in a single deployment and it contains multiple migrations. When deploying, we may want to execute those migrations one at a time so they can be monitored individually. As written, the function does not allow us to run only one migration.
  • You may want to run manual data migrations.

Adding options to MyApp.Release.migrate/1

Let's adjust the migrate function to accept options that we can pass into Ecto.Migrator.

+  @doc """
+  Migrate the database. Defaults to migrating to the latest, `[all: true]`
+  Also accepts `[step: 1]`, or `[to: 20200118045751]`
+  """
-  def migrate do
+  def migrate(opts \\ [all: true]) do
    for repo <- repos() do
-     {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
+     {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, opts))
    end
  end

Now we can pass in options that let us step through migrations one at a time or go up to a specific version. For example, migrate(step: 1) or migrate(to: 20210719021232).

When you're rolling back, something has already gone wrong and you want to reverse some changes. If your application has multiple Ecto repos, running mix ecto.rollback will roll back the last migration on each database. That's probably not what you want! Because of this, you want to be more explicit with this command. The engineers deploying this change should be required to give the specific repo and the version to roll back to.

See the available options in the Ecto.Migrator.run/4 documentation.
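One way to enforce that explicitness in rollback/2 is to check the arguments before calling into Ecto.Migrator. The following is only a sketch: the RollbackGuardSketch module and its validate/3 function are hypothetical names, not part of Ecto or of the release module above, and the list of configured repos is passed in as an argument rather than read from application config.

```elixir
defmodule RollbackGuardSketch do
  @moduledoc """
  Hypothetical helper: checks rollback arguments before anything touches the database.
  """

  # `configured_repos` stands in for the list returned by `repos/0` in MyApp.Release.
  def validate(repo, version, configured_repos) do
    cond do
      repo not in configured_repos ->
        {:error, "unknown repo: #{inspect(repo)}"}

      not is_integer(version) ->
        {:error, "version must be an integer, got: #{inspect(version)}"}

      true ->
        {:ok, {repo, version}}
    end
  end
end

RollbackGuardSketch.validate(MyApp.Repo, 20210709121212, [MyApp.Repo])
# returns {:ok, {MyApp.Repo, 20210709121212}}
```

A rollback function could call a check like this first and refuse to continue on {:error, reason}, so a typo in the repo name or a version passed as a string fails loudly before any migration runs.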

Adding MyApp.Release.migration_status/0

Before I run migrations, I like to make sure I know which migration the application is going to run next. Locally, you can run mix ecto.migrations to see the status of your migrations. I want something like that on the server that works with releases.

Let's adapt that task's logic for Mix Releases:

@doc """
Print the migration status for configured Repos' migrations.
"""
def migration_status do
  for repo <- repos(), do: print_migrations_for(repo)
end

defp print_migrations_for(repo) do
  paths = repo_migrations_path(repo)

  {:ok, repo_status, _} =
    Ecto.Migrator.with_repo(repo, &Ecto.Migrator.migrations(&1, paths), mode: :temporary)

  IO.puts(
    """
    Repo: #{inspect(repo)}
      Status    Migration ID    Migration Name
    --------------------------------------------------
    """ <>
      Enum.map_join(repo_status, "\n", fn {status, number, description} ->
        "  #{pad(status, 10)}#{pad(number, 16)}#{description}"
      end) <> "\n"
  )
end

defp repo_migrations_path(repo) do
  config = repo.config()
  priv = config[:priv] || "priv/#{repo |> Module.split() |> List.last() |> Macro.underscore()}"
  config |> Keyword.fetch!(:otp_app) |> Application.app_dir() |> Path.join(priv)
end

defp pad(content, pad) do
  content
  |> to_string
  |> String.pad_trailing(pad)
end

A lot of this code is borrowed from the Mix task mix ecto.migrations, but changed to not use the Mix module.

When you run bin/my_app eval "MyApp.Release.migration_status()", you should see something like the following.

Repo: MyApp.Repo
  Status    Migration ID    Migration Name
--------------------------------------------------
  up        20210718153339  add_test_table1
  down      20210718153341  add_test_table2
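To see how those rows are assembled without connecting to a database, here is a small sketch that feeds hand-written tuples through the same padding logic. StatusTableSketch is a hypothetical module name, and the sample tuples simply mirror the {status, number, description} shape that the function above pattern-matches on.

```elixir
defmodule StatusTableSketch do
  # Formats {status, version, name} tuples into the rows of the status table.
  def rows(repo_status) do
    Enum.map_join(repo_status, "\n", fn {status, number, description} ->
      "  #{pad(status, 10)}#{pad(number, 16)}#{description}"
    end)
  end

  # Converts any term to a string and right-pads it to a fixed column width.
  defp pad(content, width) do
    content |> to_string() |> String.pad_trailing(width)
  end
end

sample = [
  {:up, 20210718153339, "add_test_table1"},
  {:down, 20210718153341, "add_test_table2"}
]

IO.puts(StatusTableSketch.rows(sample))
```

Running this prints the same two rows as the sample output above: the status column is padded to 10 characters and the migration ID to 16.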

What If I Want to Migrate My Data?

Perhaps your database already has data in it and you need to change the data in place. For example, in a blog system you might have assumed earlier that every blog post would be published as soon as it was written. Later, you hire an editor who wants to review blog posts before they're made publicly available. You decide to add a new column, published_at, that holds the timestamp at which the blog post becomes publicly available. Your new editor now reviews blog posts and, when one is ready, sets its published_at timestamp. You already ran the schema migration to add the column, but now you also need to migrate the older data and fill in the publishing date on older blog posts. This is where a data migration is helpful. We'll also refer to this as "backfilling" existing data.

Because data migrations are usually one-off processes that only need to run once, they should happen separately from schema migrations, and we want to trigger them manually. Making them manual ensures that other automatic workflows don't try to run them multiple times. This is a case where a singleton in your workflow may be necessary 😉. With Ecto, we can separate these data migrations into a different folder, which makes running them more intentional. When generating a data migration with mix ecto.gen.migration, you can use the --migrations-path=MY_PATH flag to put it in a different folder, e.g.:

mix ecto.gen.migration --migrations-path=priv/repo/data_migrations backfill_foo
* creating priv/repo/data_migrations
* creating priv/repo/data_migrations/20210811035222_backfill_foo.exs
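As an illustration of what might go inside that generated file, here is a minimal sketch of a backfill for the published_at example. It rests on assumptions that are not in the original: a table named "posts", an inserted_at column to copy from, and the decision that older posts count as published when they were created. It also updates every matching row in a single statement, which is the simplest form and may not suit a large table.

```elixir
# priv/repo/data_migrations/20210811035222_backfill_foo.exs
defmodule MyApp.Repo.Migrations.BackfillFoo do
  use Ecto.Migration
  import Ecto.Query

  def up do
    # Assumption: a "posts" table with inserted_at and the new published_at column.
    # Posts that predate the column are treated as published when they were created.
    query =
      from(p in "posts",
        where: is_nil(p.published_at),
        update: [set: [published_at: p.inserted_at]]
      )

    # repo/0 comes from Ecto.Migration and returns the repo being migrated.
    repo().update_all(query, [])
  end

  # Nothing to undo here: the schema migration owns the column itself.
  def down, do: :ok
end
```

The query uses the table name as a string instead of an application schema module, so the migration keeps working even if the schema code changes later.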

To run these migrations in a Mix Release, we'll need a new function that looks in this custom folder for our data migrations.

@doc """
Migrate data in the database. Defaults to migrating to the latest, `[all: true]`
Also accepts `[step: 1]`, or `[to: 20200118045751]`
"""
def migrate_data(opts \\ [all: true]) do
  for repo <- repos() do
    path = Ecto.Migrator.migrations_path(repo, "data_migrations")
    {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, path, :up, opts))
  end
end

Now you can manually run your data migrations when using releases like this:

bin/my_app eval 'MyApp.Release.migrate_data()'

If you'd like more inspiration, read Automatic and Manual Ecto Migration by Wojtek Mach.

Start the Release

It's time to build your release and deploy it. Wonderful! But wait! Before you start your application, let's ask some questions:

  1. Does the deployed code assume the migrations were already run? If so, you need to run your migrations first and start the application after; otherwise your application will crash!
  2. Does the release contain migrations that aren't used yet? For example, you have a migration that adds a column to a table but the Ecto schema doesn't even reference it yet. In this case, you can start your application before running the migration because the code does not need the column to exist. Feel free to run the migrations at your convenience.
  3. Are you using Kubernetes? Then consider Init Containers. Init containers run to completion before the application containers in the pod, which makes them a perfect place to start your Ecto Repo and migrate the database before starting the rest of your application. Combine this with Kubernetes Jobs, and you have a way to run the migration only one time: the Init Container waits for the Job to complete before the application starts. Tip: If you set up Kubernetes to run your migrations, be mindful to exclude one-off processes such as data migrations.

Now that you've determined the order needed to safely roll out your changes, i.e., whether to run the database migrations or start the application first, let's start running the release commands!

Check migration status

We can inspect the database migration status.

bin/my_app eval 'MyApp.Release.migration_status()'
Repo: MyApp.Repo
  Status    Migration ID    Migration Name
--------------------------------------------------
  up        20210718153339  add_test_table1
  down      20210718153341  add_test_table2

Run the migrations

To migrate the database structure, run bin/my_app eval 'MyApp.Release.migrate()'.

When running bin/my_app eval ..., a separate slim instance of the Erlang VM is started. Your app is loaded but not started. Only the Repo is started, and it's only started with 2 database connections. Since this is a new instance booting, this implies that it will also need the same environment variables as your running application. If you rely on environment variables for the database settings, ensure they're present when running this command.
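Because a missing variable only shows up once the eval command tries to boot, it can help to check for it up front. The sketch below is one possible way to do that with the standard library only. EnvCheckSketch is a hypothetical module, and "DATABASE_URL" is an assumed variable name; substitute whatever names your own runtime configuration reads.

```elixir
defmodule EnvCheckSketch do
  # Returns the names from `required` that are not set in the environment.
  def missing(required) do
    Enum.filter(required, fn name -> System.get_env(name) == nil end)
  end
end

# "DATABASE_URL" is an assumed variable name; list whatever your config reads.
case EnvCheckSketch.missing(["DATABASE_URL"]) do
  [] -> IO.puts("environment looks complete")
  names -> IO.puts("missing environment variables: #{Enum.join(names, ", ")}")
end
```

A check like this could sit at the top of a release task so that a misconfigured shell produces a readable message instead of a failed database connection.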

To run data migrations, run bin/my_app eval 'MyApp.Release.migrate_data()'.

OMG ROLL IT BACK

Before you roll back, you should consider whether there's a safer way to continue forward and fix or work around any issues. I have never needed to roll back the database. I don't mean this as a weird flex (far from it!); it's just that many others and I have found a way around the issue without rolling back.

But that doesn't mean you shouldn't give yourself that escape hatch. If necessary, the app can roll back using bin/my_app eval 'MyApp.Release.rollback(MyApp.Repo, 20210709121212)'.

Where to next?

Next we look through our recipe books for tips and techniques to make our migrations smooth and tasty!