JupyterLab + SoS Suite setup
This document provides tips for setting up your computing environment using the micromamba package manager.
Operating system requirements
The instructions on this page are tested and known to work on Linux and MacOS. With some effort they might also work on Windows, but using Windows for your everyday computational biology research is discouraged.
- For those using a Debian-based Linux desktop (e.g., Debian or Ubuntu), you can find setup recommendations here.
- MacOS users with version 11.X or above can find setup recommendations here.
- If you only have access to Windows, you might consider establishing a Linux OS within your Windows system using the Windows Subsystem for Linux (WSL). For assistance, here’s a video tutorial by one of our lab members on how to install WSL on a Windows machine.
Note for Neurology HPC Users: To configure the network proxy, add the following commands to your ~/.bashrc and then run the source command. Begin by opening ~/.bashrc in a text editor and appending the commands:
export http_proxy=http://menloproxy.cumc.columbia.edu:8080
export https_proxy=http://menloproxy.cumc.columbia.edu:8080
and type source ~/.bashrc to load the changes.
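Because ~/.bashrc tends to be edited more than once, it can help to append the proxy lines only when they are not already there. The following is a minimal sketch of that idea, not a required step. The helper name add_proxy_settings is hypothetical; the proxy URL is the one given above.

```shell
# add_proxy_settings is a hypothetical helper, not part of any existing tool.
# It appends the two proxy exports to a startup file (default: ~/.bashrc),
# skipping any line that is already present.
add_proxy_settings() {
    local rc_file="${1:-$HOME/.bashrc}"
    local proxy_url="http://menloproxy.cumc.columbia.edu:8080"
    local var
    touch "$rc_file"
    for var in http_proxy https_proxy; do
        # -F fixed string, -x match the whole line, -q quiet
        if ! grep -Fxq "export ${var}=${proxy_url}" "$rc_file"; then
            echo "export ${var}=${proxy_url}" >> "$rc_file"
        fi
    done
}

# Example: add_proxy_settings "$HOME/.bashrc" && source "$HOME/.bashrc"
```

Running the helper twice leaves the file unchanged the second time, so it is safe to keep in a setup script.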
Purge previous installations of micromamba or miniconda
This optional step is only necessary for those who have previously installed various software and would now like to start from scratch.
First, back up the previous micromamba installation:
mv ~/micromamba ~/micromamba_backup
mv ~/.conda ~/conda_backup
rm -rf ~/.mamba ~/.conda ~/.anaconda
Then make a backup of ~/.bashrc:
mv ~/.bashrc ~/.bashrc_backup
and install a new copy of it:
cp /etc/skel/.bashrc ~/.bashrc
Finally, open your ~/.bashrc_backup, review it, and move the contents that you deem relevant to the new environment that you are about to set up. For example, the http_proxy and https_proxy settings discussed in the previous section should be retained for Neurology HPC users. However, if you would like micromamba to set up R rather than use the R installed on the HPC, please do not include module load R in the new ~/.bashrc that you are configuring now.
At this point, please log out and then log back in to refresh the computing environment.
Note: the exact same tip also works for purging your miniconda3 installation.
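If a backup directory from an earlier attempt already exists, a plain mv would move the old installation inside it rather than replace it. The sketch below guards against that; backup_if_exists is a hypothetical helper name, and the paths in the example comments are the ones used in this section.

```shell
# backup_if_exists is a hypothetical helper, not part of micromamba or conda.
# It moves SRC to DEST only when SRC exists and DEST does not, so an earlier
# backup is never overwritten or nested into.
backup_if_exists() {
    local src="$1"
    local dest="$2"
    if [ ! -e "$src" ]; then
        echo "skip: $src not found"
        return 0
    fi
    if [ -e "$dest" ]; then
        echo "error: $dest already exists, not overwriting" >&2
        return 1
    fi
    mv "$src" "$dest"
}

# Example, mirroring the commands in this section:
#   backup_if_exists "$HOME/micromamba" "$HOME/micromamba_backup"
#   backup_if_exists "$HOME/.conda"     "$HOME/conda_backup"
#   backup_if_exists "$HOME/.bashrc"    "$HOME/.bashrc_backup"
```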
Install micromamba
We highly recommend using micromamba over miniconda or anaconda. Unlike miniconda, micromamba does not need a base environment and does not come with a default version of Python. micromamba supports a subset of all mamba commands and implements a command line interface from scratch in the C++ language.
To install, please follow the instructions on this page. Briefly,
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)
Press the “enter” or “return” key on your keyboard when prompted to accept the default settings.
If your computer does not have curl available, you can use wget like this:
cd ~
wget -qO- https://micromamba.snakepit.net/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
~/bin/micromamba shell init -s bash -p ~/micromamba
where you manually specify the OS, in this case linux-64.
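The platform string in the download URL has to match your machine. As a rough sketch, it can be derived from the output of uname. The helper name micromamba_platform is hypothetical, and the labels it prints (linux-64, linux-aarch64, osx-64, osx-arm64) are assumed to follow the standard conda platform naming, so please check them against the micromamba installation page before relying on them.

```shell
# micromamba_platform is a hypothetical helper. It maps the operating system
# and CPU architecture reported by uname to a conda-style platform label.
# Both values can be passed explicitly, which is handy for testing.
micromamba_platform() {
    local os="${1:-$(uname -s)}"
    local arch="${2:-$(uname -m)}"
    case "${os}-${arch}" in
        Linux-x86_64)              echo "linux-64" ;;
        Linux-aarch64|Linux-arm64) echo "linux-aarch64" ;;
        Darwin-x86_64)             echo "osx-64" ;;
        Darwin-arm64)              echo "osx-arm64" ;;
        *)
            echo "unsupported platform: ${os}-${arch}" >&2
            return 1
            ;;
    esac
}

# Example: print the label for the current machine
#   micromamba_platform
```

The printed label would then take the place of linux-64 in the wget command above.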
After the installation is done, you should load micromamba from ~/.bashrc (Linux) or ~/.zshrc (MacOS) by typing source ~/.bashrc (or source ~/.zshrc). To verify that you have installed it successfully:
micromamba -h
This should print the help message. You can then use micromamba to create environments, install packages, etc. For convenience, we strongly recommend adding the following channels for micromamba to install packages from by default:
micromamba config prepend channels nodefaults
micromamba config prepend channels bioconda
micromamba config prepend channels conda-forge
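Each prepend call puts its channel at the top of the list, so after the three commands above conda-forge has the highest priority, followed by bioconda and nodefaults. To double-check the resulting order, the sketch below prints the channels from a configuration file. It assumes the settings were written to ~/.condarc (pass another path if your configuration lives elsewhere); list_channels is a hypothetical helper name.

```shell
# list_channels is a hypothetical helper. It prints the entries of the
# "channels:" list in a conda-style YAML configuration file, one per line,
# in the order they appear (highest priority first).
list_channels() {
    local rc_file="${1:-$HOME/.condarc}"
    awk '
        /^channels:/ { in_channels = 1; next }
        in_channels && /^[[:space:]]*-[[:space:]]*/ {
            sub(/^[[:space:]]*-[[:space:]]*/, "")
            print
            next
        }
        # Any other top-level key ends the channels list.
        in_channels && /^[^[:space:]]/ { in_channels = 0 }
    ' "$rc_file"
}

# Example: list_channels "$HOME/.condarc"
```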
After you have successfully installed the latest version of micromamba, please follow the prompts below to set up a JupyterLab + SoS Suite environment for daily computing.
Set up the Script of Scripts computing environment
The currently recommended version of the SoS Suite, along with Python and R, can be installed using the configuration file pisces-rabbit.yml:
wget https://raw.githubusercontent.com/gaow/misc/master/docker/pisces-rabbit.yml
micromamba env create -y -f pisces-rabbit.yml
Note that we name each environment after the zodiac signs of the month and year. For example, pisces-rabbit pins the setup to what our lab members tested to be a stable distribution as of Feb 20 (Pisces), 2023 (Rabbit).
This wiki will be periodically updated to the latest stable version we have tested.
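Creating the environment can take a while, so it may be worth checking whether it already exists before running the create command again. A minimal sketch follows, assuming environments live under the envs directory of the micromamba root prefix (~/micromamba in this guide) and that every environment contains a conda-meta directory. The helper name env_exists is hypothetical.

```shell
# env_exists is a hypothetical helper. It succeeds when an environment with
# the given name is found under ROOT_PREFIX/envs. The root prefix defaults to
# $MAMBA_ROOT_PREFIX, or ~/micromamba when that variable is unset.
env_exists() {
    local env_name="$1"
    local root_prefix="${2:-${MAMBA_ROOT_PREFIX:-$HOME/micromamba}}"
    [ -d "${root_prefix}/envs/${env_name}/conda-meta" ]
}

# Example: create the environment only when it is missing
#   env_exists pisces-rabbit || micromamba env create -y -f pisces-rabbit.yml
```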
If you want to load this environment by default, you can open your ~/.bashrc file (or ~/.zshrc) and add this line:
micromamba activate pisces-rabbit
and type source ~/.bashrc (or source ~/.zshrc) to load the changes. Otherwise, you need to type the command above each time you open a new shell session and want to work in this environment.
Note for Neurology HPC users:
- When you submit a job to the cluster, the computing node ignores the ~/.bashrc settings, so you need to add or source these lines in your job submission template in order to activate and use this environment:
export PATH=$HOME/.local/bin:$PATH
export MAMBA_ROOT_PREFIX=$HOME/micromamba
eval "$(micromamba shell hook --shell bash)"
micromamba activate pisces-rabbit
You can put these lines in a file called ~/mamba_activate.sh and include source ~/mamba_activate.sh as the first line in your job submission script or template.
- The SoS notebook plugin, sos-r, is not included in the setup because as of today (August 2023) r-feather does not support Apple Silicon CPUs. However, it is available for Intel/AMD CPUs. HPC users are encouraged to run
micromamba install sos-r -y
to install the R plugin for SoS notebook on the cluster.
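For the job submission note above, the activation file can also be generated rather than typed by hand. The sketch below writes the four activation lines to a file; write_mamba_activate is a hypothetical helper name, and the default file name and environment name are the ones used in this guide.

```shell
# write_mamba_activate is a hypothetical helper. It writes the activation
# lines from the note above into OUT_FILE (default: ~/mamba_activate.sh) for
# the environment ENV_NAME (default: pisces-rabbit).
write_mamba_activate() {
    local out_file="${1:-$HOME/mamba_activate.sh}"
    local env_name="${2:-pisces-rabbit}"
    # \$ keeps the variables literal so they are expanded when the file is
    # sourced on the computing node, not when it is written.
    cat > "$out_file" <<EOF
export PATH=\$HOME/.local/bin:\$PATH
export MAMBA_ROOT_PREFIX=\$HOME/micromamba
eval "\$(micromamba shell hook --shell bash)"
micromamba activate ${env_name}
EOF
}

# Example: write_mamba_activate "$HOME/mamba_activate.sh" pisces-rabbit
```

A job script would then start with source ~/mamba_activate.sh, as described above.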
At this point, you can test your installation by connecting to the HPC via JupyterLab. If everything has worked well so far, you can optionally delete the old micromamba installation that you backed up earlier:
rm -rf ~/micromamba_backup
rm -rf ~/conda_backup
Install other software
Once this is set up, you can also install other software using micromamba as the software manager, as long as it is released in one of the conda channels. In our setting we have already included conda-forge and bioconda by default. For example, you can easily install plink, plink2, bcftools, tabix, etc.:
micromamba install plink plink2 bcftools tabix -y
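After installing, a quick way to confirm that the tools are reachable from the active environment is to look them up on the PATH. This is a small sketch; missing_tools is a hypothetical helper name, and it assumes each package provides an executable with the same name as the package.

```shell
# missing_tools is a hypothetical helper. It prints the name of every given
# command that cannot be found on the current PATH, one per line, and prints
# nothing when all of them are available.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Example: missing_tools plink plink2 bcftools tabix
```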
About R libraries
It is important to realize that the R software installed using micromamba is packaged and distributed by conda-forge. It is therefore highly recommended that R libraries also be installed from conda-forge whenever they are available. For example, to install the R library pacman, first verify that it is available on conda-forge; then install it using:
micromamba install r-pacman -y
For libraries not available on conda-forge, you can use the regular approaches to install them, such as from CRAN, Bioconductor, and GitHub.
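R libraries on conda-forge are generally published under the library name in lowercase with an r- prefix, as with r-pacman above. The sketch below builds that package name so it can be passed to a search or install command. conda_r_name is a hypothetical helper name, and individual packages may deviate from this convention, so treat the result as a first guess.

```shell
# conda_r_name is a hypothetical helper. It converts an R library name to the
# conventional conda-forge package name: lowercase with an "r-" prefix.
conda_r_name() {
    printf 'r-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Example: check availability before installing
#   micromamba search -c conda-forge "$(conda_r_name pacman)"
#   micromamba install "$(conda_r_name pacman)" -y
```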