Tags | docker postgresql sqlalchemy |
Hard Prerequisites | |
IMPORTANT: Please review these prerequisites, they include important information that will help you with this content. | |
|
In this project you will practice the basics of alembic based migrations. A lot of this stuff might seem quite straight-forward now, but focus! When migrations go wrong in production then life gets way too hard, and real user data can be put at serious risk.
This exercise will also teach you a bit more about why git is important. Make sure you commit often! Seriously! Git will save your bacon!
In this project you’ll need 2 databases, I’ll refer to them as “development” and “prod”. How you set these up is up to you. You could have one docker composition that sets them both up, or (easier) you can have 2 docker compositions that are well named.
This is similar to how you would work with real databases later in life. You’ll have a database on your computer that you can experiment on. If you destroy the data there then it’s not a big deal at all. Your job is to keep your prod database safe.
personal_email_address
. Set it to be a required field, and must be unique.This should have been quite smooth
Now a lot can go wrong with databases and migrations. Ideally the db, your models and your migrations will all be in sync. If they fall out of sync life gets kind of hard sometimes.
Generally the models should be considered the truth of things. That truth should then be propagated to the migration files, and then to the database. If the database is updated independently then weird things start to happen.
In this section we’ll be updating the database independently. In real life you wouldn’t do this intentionally.
So why are we even doing this weird thing?
Basically if you are working on a team, then one of your team mates migration files might disagree with yours, and you might end up in a mess. Also, your migration files that work totally fine against your dev db might not work against prod because the last person to update the prod db wasn’t you. This can lead to all sorts of chaos - that chaos is avoidable but it’s really worth appreciating.
We’re going to go through 3 situations, they are all nice and simple on their own. In real applications the database structure generally has many interlinking tables. For example, our Umuzi recruitment portal has 26 tables, and they’re all connected to each other.
Using your dev db:
Do whatever it takes to get it to work… you might need to checkout an earlier commit. At the end, you’ll have the C26 learners in the prod database.
Rules: Don’t delete any data in your production database! But you can completely delete your dev database if you wanted to.
Using your dev db:
Do whatever it takes to get it to work… you might need to checkout an earlier commit. By the end you’ll have the C27 learners in the prod database.
Rules: Don’t delete any data in your production database! But you can completely delete your dev database if you wanted to.
Using your dev db:
Do whatever it takes to get it to work… you might need to checkout an earlier commit. By the end you’ll have the C28 learners in the prod database.
Rules: Don’t delete any data in your production database! But you can completely delete your dev database if you wanted to.
For this project we will need you to commit your migrations to your feature branches. The migrations should get your database to a point where the learner and cohort tables are both set up properly.
We’ll also need you to give us a bunch of working scripts for creating learners.
Earlier it was hinted that there are ways to prevent migration chaos. When it comes to working on real life databases then we need to be very careful about things. For example, with the Tilde code base we do the following:
Just in case you nerds are interested, the main Tilde database is a google cloud SQL instance running postgres, and we’re using the Django ORM instead of sqlalchemy. But the basic principles are the same.
Please look at the migration files and make sure that the learner renamed columns when they were meant to. Dropping a column and creating a new column is not the same as renaming a column. If you drop a column then you drop all the data inside that column.