Retrieving data from AWS to Heroku using Django

Ahmad Bilesanmi
Mar 5, 2019

Most developers feel a sense of satisfaction when they finally host an application on a live server, everything works, and they feel they are done with it. That feeling is often short-lived, because clients want things moved around. Sometimes sensitive data is involved and has to be preserved in the process.

I found myself in this situation when one of our clients asked us to move their application to Heroku because AWS was too expensive for them. It took me some time because I couldn’t figure out how to do it all at once, so I did some searching, and I want to share that journey here. It’s possible that you (the reader) know a better way of doing this. If that’s the case, please let me know in the comments.

I will not be going through all the steps in detail. I will explain the main highlights in my journey. They are:

  1. Getting the data from AWS
  2. Downloading that data to our local system
  3. Putting the data in our Heroku database

Getting the data from AWS

I got the data off the live server, an EC2 instance, by using Django’s dumpdata command.

pipenv run ./manage.py dumpdata accounts > ../backups/accounts.json

In this command, dumpdata retrieves the data for every model in the ‘accounts’ app, and the shell redirect (>) saves it as JSON in a file called ‘accounts.json’ in the backups folder. You can apply the same technique to each of your apps and get a backup for all of them.
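
Before moving the file anywhere, it can help to check what the dump actually contains. The following is only a minimal sketch, assuming dumpdata's default JSON output, where the file holds one list and every record carries "model", "pk" and "fields" keys. The function name summarize_fixture and the sample records are made up for illustration; on the server you would point the function at ../backups/accounts.json instead.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def summarize_fixture(path):
    """Count the records per model in a JSON fixture written by dumpdata."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return Counter(record["model"] for record in records)


# Tiny made-up fixture standing in for ../backups/accounts.json.
sample = [
    {"model": "accounts.account", "pk": 1, "fields": {"name": "Ada"}},
    {"model": "accounts.account", "pk": 2, "fields": {"name": "Bob"}},
    {"model": "accounts.profile", "pk": 1, "fields": {"account": 1}},
]

with tempfile.TemporaryDirectory() as tmp:
    fixture_path = Path(tmp) / "accounts.json"
    fixture_path.write_text(json.dumps(sample), encoding="utf-8")
    summary = summarize_fixture(fixture_path)

# One line per model, e.g. to compare against the row counts you expect.
for model, count in sorted(summary.items()):
    print(f"{model}: {count}")
```

If the counts look wrong here, it is much cheaper to find out now than after the data has been loaded on Heroku.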

Downloading the data to your local system

At this point, the data is in a file on the remote server. To get it onto your local system, you can open the file in an editor such as vim and copy and paste its contents. However, if your data is large, it is better to use the scp tool, which ships with OpenSSH on Ubuntu (which is what I use).

scp user@host.com:/var/www/backups/accounts.json ./

The command above uses the scp tool to copy the accounts.json file to your current folder on your local computer.
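
For a large backup you may want some reassurance that the copy arrived intact. One option, sketched here under the assumption that Python is available on both machines, is to compute a SHA-256 digest of the file on each side and compare the two values by eye. The helper name sha256_of and the file names in the demo are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large backup does not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo with two temporary files standing in for the remote and local copies.
with tempfile.TemporaryDirectory() as tmp:
    remote_copy = Path(tmp) / "remote_accounts.json"
    local_copy = Path(tmp) / "local_accounts.json"
    remote_copy.write_bytes(b'[{"model": "accounts.account", "pk": 1, "fields": {}}]')
    local_copy.write_bytes(remote_copy.read_bytes())
    digests_match = sha256_of(remote_copy) == sha256_of(local_copy)

print("copy intact:", digests_match)
```

In practice you would run sha256_of once on the server and once locally, since the two files never sit on the same machine.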

Putting the data in the Heroku database

This article assumes you have already created your Heroku account and deployed your application to it. What we intend to discuss is getting the data we have on our local computer into the Heroku database.

We could simply add the JSON files we got from our AWS server to the project and do a git push to Heroku. This would be a bad idea, because the data would stay in the git history, where it could be exposed to the public, especially for projects with public remote repositories.

If your data needs to be protected, you need another way to get it to your Heroku application.

First, we connect to our Heroku app using the Heroku CLI. The command to open a bash shell on the app is:

heroku run -r [remote name] bash

The command above gives you shell access to the files of your Django application on Heroku. The [remote name] is the name of the git remote that deploys to Heroku. If you deploy with Heroku Git, the first deployment method Heroku offers, your remote is most likely named ‘heroku’. If that’s the case, the complete command will be:

heroku run -r heroku bash

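Now you have direct access to your Django project on Heroku. You can add the data directly, without using git and without risking exposing your data to the public. Keep in mind that this shell runs on a temporary one-off dyno, so any file you create there is gone once you exit. Create the fixture and load it in the same session.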

Django Fixtures

Django has a feature called fixtures. According to the documentation, a fixture is a collection of data that Django knows how to import into a database. Since we already have our data, we can create fixtures that insert it into our Heroku database.

Django looks for fixtures in a ‘fixtures’ directory inside each app. For example, if you have an accounts model in the accounts app of your Django project, then that app can have a ‘fixtures’ directory containing an accounts.json file.

The fixtures directory can be created from the Heroku shell, and the file created, with these commands:

cd app

mkdir fixtures

cd fixtures

cat > filename.json

The last command creates the JSON file and waits for you to type in its contents. Copy the data from your local file, paste it at the prompt, then press Enter followed by Ctrl + D (the same keys work in a macOS terminal) to save the file and close the prompt.

Take note that if you have a large data set, you may not be able to paste it all at once at the prompt. I’ll explain an easy way to handle that in a bit.

Run your fixtures

Now that you have your fixture, you can load it with the command:

python manage.py loaddata <fixturename>

If you did this correctly, you should see a response from Django telling you the number of objects installed.

Large data sets

If you happen to have a large data file, it is better to save it to the cloud using a service like Dropbox or Amazon S3 and download it into your fixtures directory using:

wget "http://domain.com/directory/filename.json"

You can then load your fixture once it has downloaded. With this, you can finally transfer your data to Heroku without having to add it to your git history.
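
If the file is big enough that the upload and download themselves become slow, compressing it first may help. This is a minimal sketch rather than something from my original workflow: it gzips a fixture next to the original, and it relies on loaddata being able to read gzip-compressed fixtures when the file name keeps its format extension (accounts.json.gz). The function name gzip_fixture and the generated sample records are assumptions for illustration.

```python
import gzip
import json
import shutil
import tempfile
from pathlib import Path


def gzip_fixture(source):
    """Write a gzip copy next to the fixture and return the new path."""
    source = Path(source)
    target = source.with_name(source.name + ".gz")  # accounts.json -> accounts.json.gz
    with open(source, "rb") as plain, gzip.open(target, "wb") as packed:
        shutil.copyfileobj(plain, packed)
    return target


# Made-up records standing in for a real dump.
records = [
    {"model": "accounts.account", "pk": pk, "fields": {"name": f"user{pk}"}}
    for pk in range(1, 501)
]

with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "accounts.json"
    fixture.write_text(json.dumps(records), encoding="utf-8")
    packed_path = gzip_fixture(fixture)
    packed_name = packed_path.name
    plain_size = fixture.stat().st_size
    packed_size = packed_path.stat().st_size
    # Decompress again to confirm nothing was lost.
    with gzip.open(packed_path, "rb") as packed:
        round_trip_ok = packed.read() == fixture.read_bytes()

print(f"{packed_name}: {plain_size} bytes -> {packed_size} bytes")
```

Once the compressed file is in the fixtures directory, python manage.py loaddata accounts.json.gz should load it without unpacking it first.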

If this was helpful, please tap the clap button. If you have a better way of getting this done, I’d love to hear from you in the comments section. Thanks for reading.

