FSDL - Local Development Setup
By Charles Frye
August 9, 2022

This video walks through the instructions for setting up the compute environment for the FSDL 2022 labs. These instructions assume that you have a Linux machine with a GPU available that you can SSH into. If that's not you, then we recommend that you stick with Colab. So check out that video for your instructions.

A couple of side notes. If you have a Windows machine with a GPU and you have Windows Subsystem for Linux 2 (WSL2), you're welcome to try out these instructions, but we can't guarantee they'll work, because WSL2 with GPU support is a very fresh technology with sharp edges. You can read more about it there.

If you have a Mac with an Apple Silicon GPU, don't even try to install a deep learning stack on it. A lot of good engineers have spent a lot of hours trying to get libraries to compile and run on that hardware. Just run things on Colab.

So the instructions for local development can be found on GitHub here. In order to use them, the first step is to check out that repo. I'm doing this on a fresh Ubuntu machine.
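Checking out the repo looks something like this. The repo URL below is an assumption; use the exact link from the course page:

```shell
# Clone the labs repo (URL is an assumption -- use the link from the course page)
git clone https://github.com/full-stack-deep-learning/fsdl-text-recognizer-2022-labs.git
# Go inside and take a look around
cd fsdl-text-recognizer-2022-labs
ls
```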

And once we've cloned the repo, we can go inside and take a look at it.

The next step is to set up the Python environment using Anaconda, which is our system package and Python environment manager.

So to install it, you can follow the instructions at that link to install a Python 3.7 version of Miniconda. If you install another version, that's okay; you'll just have multiple versions of Python on your machine.

If you only have terminal access to your machine, you'll need to use something like wget to download the Miniconda installer. Otherwise, the instructions will show you how to do it via your graphical user interface.

You can run the installer from the command line. You just need to type bash and then the name of the installer file that you downloaded.
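Put together, the download-and-install steps look like this. The installer filename and version below are illustrative; copy the current Linux link from the Miniconda downloads page:

```shell
# Download the Miniconda installer (filename and version are an example)
wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh
# Run the installer and follow the prompts
bash Miniconda3-py37_4.12.0-Linux-x86_64.sh
```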

You can accept the default options.

Here, though, we want to say yes and use conda init.

At this point, you'll likely need to close and reopen your terminal.
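If you answered no to that prompt, you can still run conda init afterwards. The path below assumes the default install location, ~/miniconda3:

```shell
# Hook conda into your shell startup files (path assumes the default install location)
~/miniconda3/bin/conda init bash
# Then close and reopen your terminal so the change takes effect
```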

Notice that my prompt now includes the name of the Anaconda environment. This is the default behavior of Anaconda.

Back inside the lab repo, there's a Makefile that automates pretty much the entirety of the rest of the setup. To use it, you type the make command and then the conda-update target from that file.

That takes some time, but now we've got our system libraries like CUDA and Python, and all the things that CUDA and cuDNN rely on.

The last thing we need to do is run the command at the bottom there: conda activate fsdl-text-recognizer-2022.
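Together, those two steps look like this, run from the repo root. The target name and environment name are as shown in the video; check the repo's Makefile for the exact spelling:

```shell
# Create or update the conda environment (target name as spelled in the Makefile)
make conda-update
# Activate it -- you'll need to do this in every new shell session
conda activate fsdl-text-recognizer-2022
```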

So that sets up our system libraries, but it doesn't install everything that we need. Everything else gets installed using the Python package manager, pip.

But we provide a make target for that as well, so all you need to do is type make pip-tools.
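That is, from the repo root with the environment activated:

```shell
# Install the Python-level dependencies via the provided pip-tools target
make pip-tools
```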

All right, we've got the environment installed. The last step is to set the PYTHONPATH, which makes sure that we're able to import all the libraries in the class. The command to do it is simple: it sets the PYTHONPATH environment variable to include the current directory, ".". And now we should be able to enter one of the lab directories and import the text_recognizer code.
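Concretely, the command prepends the current directory to PYTHONPATH:

```shell
# Put the current directory on PYTHONPATH so imports like text_recognizer resolve
export PYTHONPATH=".:$PYTHONPATH"
```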

The next time you log into this same machine, the conda environment will be back to base and the PYTHONPATH won't be set. You'll have to activate the conda environment every time with that conda activate command.

But for the PYTHONPATH, you can just edit your shell configuration file. For most folks, that'll be .bashrc.
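For example, appending the export line to .bashrc makes it stick across sessions (adjust the filename if you use a different shell):

```shell
# Append the PYTHONPATH export to your bash startup file
echo 'export PYTHONPATH=".:$PYTHONPATH"' >> ~/.bashrc
```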

If you have your machine locally and it's got a graphical user interface and a browser, then you're done and ready to go: just type in "jupyter notebook" and you should get a browser window with a Jupyter server running in it.

Once you reach this screen, you're ready to try out the labs. For the first lab, with the overview of our text recognizer architecture, we can just click overview.ipynb and we're ready to go.

But if you are SSHing into a machine, maybe a cloud instance, then there's a little bit more setup that we'll go through.

So if we type in jupyter notebook here,

we'll see that the server's running. But if you've SSH'd into the machine, you can't use the localhost URL that's printed there, because it will point to whatever machine you're SSHing from. There are instructions on how to set up a Jupyter server for a single user that you can check out, but there's a slightly simpler way to do it that we'll walk through now.

So you'll need another session on the same machine. If you're familiar with something like tmux, you can use that, or just open up another terminal.

Then we're going to use a tool called ngrok to connect to our Jupyter server.

If you don't have ngrok, type the name and then follow the instructions for how to install it using your system's package manager.
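For example, on Ubuntu one route is snap; this is just one option, so see ngrok's download page for your system's package manager:

```shell
# One way to install ngrok on Ubuntu (other systems: see ngrok's download page)
sudo snap install ngrok
```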

Then in the URL for the Jupyter notebook, you'll see a port number after the name localhost.

So type that port number after ngrok http. For me, that's 8888, which is the default. Now we have a connection to that Jupyter server being run by ngrok. You can see the URL here where the forwarding is happening. Let's copy that and open it in a browser window.
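With the default Jupyter port, that's:

```shell
# Forward the Jupyter port through ngrok; 8888 is Jupyter's default
PORT=8888   # copy this from the localhost URL that Jupyter printed
ngrok http "$PORT"
```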

So when you first try this, you'll get an error saying that you haven't signed up for an ngrok account. Go ahead and do that, get ahold of an auth token, and follow the instructions for adding your authentication token here.
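Adding the token looks like this, where YOUR_AUTHTOKEN is a placeholder for the token from your ngrok dashboard; on older ngrok versions, the command is ngrok authtoken instead:

```shell
# Register your auth token with ngrok (YOUR_AUTHTOKEN is a placeholder)
ngrok config add-authtoken YOUR_AUTHTOKEN
```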

So I've added my authentication token. And now I can run that same ngrok command,

copy the URL.

And if I open that URL in a browser window, now I'm connected to my Jupyter server. The first time you log in, it should ask you for a token. That's also in the output of the Jupyter server. So let's take a look. Here's my token.

You can switch it to a password by entering one into the field down here.

If you run into trouble, just make sure that you copied the token correctly, without any spaces or any of the other stuff on the line: just everything after the equals sign.