Data Science Work Environment setup on Linux/Mac

Python, R, and Julia are the languages most commonly used in data science projects. This tutorial shows how to set up a proper working environment for them.

Conda environment setup

We can install any language directly on the system. However, it is recommended to use virtual environments to avoid conflicts between different versions of the same language. For that purpose, we will use conda environments. We can use either Anaconda or Miniconda. I prefer Miniconda because it is lightweight: it does not ship with a large set of preinstalled packages that weigh the machine down, so we install packages only when we need them. Anaconda, by contrast, installs a large collection of packages by default.

Install Miniconda

  • Download and install Miniconda from the official Miniconda download page

  • At the last step, enter no when prompted with the following message:
    Do you wish the installer to prepend the conda install location to PATH in your /root/.bashrc ? [yes|no]
    
  • Create symbolic links for the conda commands
    • Create a hidden directory in the user’s home directory
        $ cd ~
        $ mkdir .conda_links
        $ cd .conda_links
      
    • Create symbolic links (Linux)
        $ ln -s /home/$USER/miniconda3/bin/conda conda
        $ ln -s /home/$USER/miniconda3/bin/activate activate
        $ ln -s /home/$USER/miniconda3/bin/deactivate deactivate
      
    • Create symbolic links (Mac). Use your username instead of roy.
        $ ln -s /Users/roy/miniconda3/bin/conda conda
        $ ln -s /Users/roy/miniconda3/bin/activate activate
        $ ln -s /Users/roy/miniconda3/bin/deactivate deactivate
      
    • In the ~/.bashrc file, add the following (Linux)
        export PATH=/home/$USER/.conda_links:$PATH
      
    • In the ~/.bash_profile file, add the following (Mac)
        export PATH=/Users/roy/.conda_links:$PATH
      
  • Check installation of conda
    $ conda --version
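
The manual linking steps above can also be scripted. The following is a minimal sketch, assuming Miniconda was installed under ~/miniconda3 as in the commands above. The function names link_conda_tools and add_path_once are hypothetical helpers written for this illustration; they are not part of conda.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of conda) that script the manual steps.

# link_conda_tools PREFIX LINK_DIR
# Create symbolic links for conda, activate, and deactivate in LINK_DIR,
# pointing at PREFIX/bin (for example, PREFIX=$HOME/miniconda3).
link_conda_tools() {
  local prefix="$1" link_dir="$2" tool
  mkdir -p "$link_dir"
  for tool in conda activate deactivate; do
    # -f replaces an existing link so the function can be re-run safely
    ln -sf "$prefix/bin/$tool" "$link_dir/$tool"
  done
}

# add_path_once RC_FILE LINK_DIR
# Append the PATH export to RC_FILE unless the same line is already there.
add_path_once() {
  local rc_file="$1" link_dir="$2"
  local line="export PATH=$link_dir:\$PATH"
  touch "$rc_file"
  grep -qxF -- "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Example (Linux; on a Mac, pass "$HOME/.bash_profile" instead):
#   link_conda_tools "$HOME/miniconda3" "$HOME/.conda_links"
#   add_path_once "$HOME/.bashrc" "$HOME/.conda_links"
```

Using $HOME instead of a hard-coded username means the same two calls should work on both Linux and Mac. Only the shell startup file differs.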
    

Environment Setup

  • Create a Python virtual environment (my_lab is the environment name and 3.6 is the Python version)
    $ conda create --name my_lab python=3.6
    
  • List the environments to check that it was created:
    $ conda info --envs
    
  • Activate environment
    $ source activate my_lab
    
  • Deactivate environment
    $ source deactivate
    
  • Delete existing environment
    $ conda remove --name my_lab --all
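
When the environment is created from a setup script, it can help to create it only if it does not exist yet. The following is a minimal sketch, assuming that conda info --envs prints one environment per line with the name in the first column. The helper env_exists is hypothetical and not a conda command.

```shell
#!/usr/bin/env bash
# env_exists NAME
# Hypothetical helper: reads `conda info --envs` output on standard input and
# succeeds if an environment called NAME is listed in the first column.
env_exists() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example: create my_lab only when it is missing.
#   conda info --envs | env_exists my_lab \
#     || conda create --name my_lab python=3.6
```

Reading the listing from standard input keeps the helper independent of where conda is installed.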
    

IDE setup

Install Jupyter Notebook

$ pip install --upgrade pip
$ pip install --upgrade ipython jupyter
$ jupyter notebook

Install Jupyter Lab

Jupyter Lab has more IDE-like features than Jupyter Notebook. Jupyter Notebook is good enough for beginners, while advanced users may prefer Jupyter Lab.

$ pip install jupyterlab

Language Setup

Jupyter comes with a Python kernel by default. R and Julia kernels can be installed alongside it. A list of Jupyter kernels is available in the official Git repo.

Install Julia Kernel

  • Download Julia and install
  • Run Julia CLI
  • Enter the following commands:
    julia> using Pkg
    julia> Pkg.add("IJulia")
    

    Now, Julia will be available in Jupyter Notebook/Lab.

Install R kernel

  • Download R and install
  • Run R CLI
  • Enter the following commands:
    R> install.packages('IRkernel')
    R> IRkernel::installspec()
    

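    Optional: register the kernel under a custom name and display name, for example to keep several R versions side by side: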

    R> IRkernel::installspec(name = 'ir33', displayname = 'R 3.3')
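
To confirm that the Julia and R kernels were registered, the output of jupyter kernelspec list can be checked. The following is a minimal sketch, assuming that the command prints one kernel per line with the kernel name in the first column, that the R kernel uses its default name ir, and that the Julia kernel name starts with julia. The helper has_kernel is hypothetical and not part of Jupyter.

```shell
#!/usr/bin/env bash
# has_kernel PREFIX
# Hypothetical helper: reads `jupyter kernelspec list` output on standard
# input and succeeds if any kernel name starts with PREFIX.
has_kernel() {
  awk -v prefix="$1" 'index($1, prefix) == 1 { found = 1 } END { exit !found }'
}

# Example: check for the R kernel (named "ir" by default) and a Julia kernel.
#   jupyter kernelspec list | has_kernel ir    && echo "R kernel found"
#   jupyter kernelspec list | has_kernel julia && echo "Julia kernel found"
```

Matching on a prefix rather than the full name is a deliberate choice here, since the Julia kernel name usually carries a version suffix.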
    

Professional IDE
