Why am I doing this?
This is the second installment in a blog series about how to use Python for data analysis at CyVerse. If you are not familiar with CyVerse, please read my previous post.
What you will need
- An instance (virtual machine) in Atmosphere. Read the previous post in this series if you don't know how to set one up.
- Basic knowledge of Python 3.5.x or higher, including how to install Python packages and use Jupyter Notebooks.
Steps
1. Launch the terminal in your Atmosphere instance (VM)
In the previous post we created an Ubuntu virtual machine in Atmosphere. To access this machine we need to:
- Log in to https://atmo.cyverse.org
- Click on the "CyVerse Tutorial" project from the project tab
This displays the instances (VMs) associated with the project. We called our only VM "learning-cyverse", and the system generated an IP address that we can use to access it. We have two options for using this VM:
1) Use an SSH client and the IP address on the screen.
2) Use Open WebShell directly in the web browser.
For this tutorial, I will use Open WebShell. To do this, just click on "learning-cyverse".
On the screen that follows, click on the Open WebShell icon and a new browser window will open with a Unix-like command shell. It is AMAZING!
2. Install Anaconda and Jupyter in VM using the terminal
In step one we logged into our VM in the cloud; thus, all instructions from here on assume you are in the terminal of the VM created in Atmosphere. The team at CyVerse created a very nice way to install Anaconda (which includes pip) and Jupyter (Python is already installed): all you have to do is type ezj and enter your CyVerse password (ignore everything to the left of the $):
ditiano@vm142-67:~$ ezj
[sudo] password for ditiano:
This will take a couple of minutes, and then you are done installing Anaconda and Jupyter:
/usr/bin/python3
DEBUG: using python version 3
DEBUG: downloading anaconda binary, may take a few minutes
DEBUG: install Anaconda
PREFIX=/home/anaconda3
.
.
.
Python 3.6.0 :: Continuum Analytics, Inc.
creating default environment...
installation finished.
/usr/bin/python3
DEBUG: using python version 3
[I 19:05:08.070 NotebookApp] Writing notebook server cookie secret to /home/ditiano/.local/share/jupyter/runtime/notebook_cookie_secret
[I 19:05:08.103 NotebookApp] Serving notebooks from local directory: /home/ditiano
[I 19:05:08.103 NotebookApp] 0 active kernels
[I 19:05:08.104 NotebookApp] The Jupyter Notebook is running at: http://128.196.142.67:8888/?token=c8c995ce0efabfca95ba477b750da97a4302340fb716f771
[I 19:05:08.104 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 19:05:08.104 NotebookApp]
Copy/paste this URL into your browser when you connect for the first time, to login with a token:
http://128.196.142.67:8888/?token=c8c995ce0efabfca95ba477b750da97a4302340fb716f771
A beautiful thing about this system is that you now have a URL for using a Jupyter Notebook remotely. All you have to do is copy the URL listed on your screen and paste it into your web browser. Remember that this URL will be different for each user and each time you launch Jupyter.
Copy/paste this URL into your browser when you connect for the first time, to login with a token:
http://128.196.142.67:8888/?token=c8c995ce0efabfca95ba477b750da97a4302340fb716f771
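If you save the ezj output to a file or copy it into a script, the URL can also be pulled out automatically. The following is only a minimal sketch: find_notebook_url is a hypothetical helper (not part of ezj or Jupyter), it assumes the URL sits on a single unwrapped line, and the token in the sample is a made-up placeholder.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches a tokenized notebook URL such as http://host:8888/?token=abc123
URL_PATTERN = re.compile(r"https?://\S+/\?token=[0-9a-f]+")


def find_notebook_url(log_text):
    """Return (url, token) for the first tokenized URL in log_text, or None."""
    match = URL_PATTERN.search(log_text)
    if match is None:
        return None
    url = match.group(0)
    # The token is the value of the "token" query parameter.
    token = parse_qs(urlparse(url).query)["token"][0]
    return url, token


# Sample line shaped like the log output; the token is a placeholder.
sample = (
    "[I 19:05:08.104 NotebookApp] The Jupyter Notebook is running at: "
    "http://128.196.142.67:8888/?token=0123abcd"
)
print(find_notebook_url(sample))
```

Because a narrow terminal may wrap the URL across two lines, join the pieces before passing the text to this helper.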
The only current drawback is that you can't use the terminal while Jupyter is running. To quit, press Ctrl + C (Control + C on a Mac too, as the Jupyter log above indicates); press it twice to skip the confirmation.
3. Change Permissions of the anaconda3 installation folder
Unfortunately, the current setup for ezj does not give your user ownership of the anaconda3 folder (where packages are installed), so we need to do it manually before attempting to install any Python packages. In the terminal, type sudo chown -R [username] /home/anaconda3, replacing [username] with your own username:
ditiano@vm142-67:~$ sudo chown -R ditiano /home/anaconda3
[sudo] password for ditiano:
After entering your password we are ready to install Python packages.
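To confirm that the ownership change took effect, you could walk the folder from Python and list anything still owned by another user. This is a rough sketch for Unix-like systems only; paths_not_owned_by_me is a hypothetical helper name, and /home/anaconda3 is the install location shown in the ezj output above.

```python
import os


def paths_not_owned_by_me(root):
    """Return directories and files under root not owned by the current user.

    Symbolic links to directories are not followed or checked.
    """
    me = os.geteuid()  # effective user id of this process (Unix only)
    foreign = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            # lstat inspects the entry itself rather than a symlink target
            if os.lstat(path).st_uid != me:
                foreign.append(path)
    return foreign


# Assumed install location, taken from the ezj output; skipped if absent.
ANACONDA_DIR = "/home/anaconda3"
if os.path.isdir(ANACONDA_DIR):
    leftovers = paths_not_owned_by_me(ANACONDA_DIR)
    print("entries still owned by someone else:", len(leftovers))
```

An empty result suggests that pip should be able to write into the folder without sudo.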
4. Install the Python packages you will need.
Now we can use pip to install any Python package listed in the Python Package Index (PyPI). I use the following packages for most of my projects:
- numpy
- matplotlib
- scikit-learn (imported in Python as sklearn)
- scipy
- pandas
To install all these packages, you simply have to type the following (pip skips any package that is already installed and up to date; the --upgrade flag updates older versions):
ditiano@vm142-67:~$ pip install numpy matplotlib scikit-learn scipy pandas --upgrade
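After pip finishes, a quick way to check that everything is importable is to ask Python whether it can locate each module. The sketch below uses only the standard library; missing_modules is a hypothetical helper name, and the list uses import names (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names in module_names that Python cannot locate for import."""
    return [name for name in module_names if find_spec(name) is None]


# Import names of the packages listed above.
wanted = ["numpy", "matplotlib", "sklearn", "scipy", "pandas"]
print("missing:", missing_modules(wanted))
```

An empty list means every package was found; any name that is printed still needs to be installed.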
5. Start using Python.
Now that we have Python installed, you can use it programmatically or through Jupyter by simply typing ezj and copying the URL into your web browser.
ditiano@vm142-67:~$ ezj
/usr/bin/python3
DEBUG: using python version 3
DEBUG: downloading anaconda binary, may take a few minutes
DEBUG: Anaconda already installed to /home/anaconda3
/home/anaconda3/bin/python3
DEBUG: using python version 3
[I 08:24:36.970 NotebookApp] Serving notebooks from local directory: /home/ditiano
[I 08:24:36.970 NotebookApp] 0 active kernels
[I 08:24:36.970 NotebookApp] The Jupyter Notebook is running at: http://128.196.142.67:8888/?token=274aa37f816603a9faa8268f229989fed67b9a3e53f23fac
[I 08:24:36.971 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:24:36.971 NotebookApp]
Copy/paste this URL into your browser when you connect for the first time, to login with a token:
http://128.196.142.67:8888/?token=274aa37f816603a9faa8268f229989fed67b9a3e53f23fac
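Once the notebook opens, a useful first cell is one that confirms which interpreter is running, since the log above shows both /usr/bin/python3 and /home/anaconda3/bin/python3. This is only an illustrative sketch; interpreter_info and runs_from are hypothetical helper names.

```python
import sys


def interpreter_info():
    """Return the path and version of the Python interpreter running this code."""
    return {
        "executable": sys.executable or "",
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


def runs_from(prefix):
    """Return True if the running interpreter lives under the given folder."""
    executable = sys.executable or ""
    return executable.startswith(prefix.rstrip("/") + "/")


print(interpreter_info())
# /home/anaconda3 is the install location reported by ezj above.
print("using Anaconda's Python:", runs_from("/home/anaconda3"))
```

If the second line prints False, the notebook is using a different Python from the one where the packages were installed.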
I am a data scientist, passionate about helping people using mathematics, programming and chemistry.