NOTE: The setup below is meant to be modular: it can be reproduced locally on a laptop, on an HPC cluster, or on a VM hosted on AWS or Google Cloud.
JupyterLab + SoS Suite setup
This document provides tips for setting up your JupyterLab + SoS Suite computing environment using the pixi package manager.
Operating system requirements
The instructions on this page are tested and known to work on Linux and macOS. Although they might work on Windows with some effort, using Windows for your everyday computational biology research is discouraged.
- For those using a Debian-based Linux desktop (e.g., Debian or Ubuntu), you can find setup recommendations here.
- macOS users with version 11.X or above can find setup recommendations here.
- If you only have access to Windows, you might consider establishing a Linux OS within your Windows system using the Windows Subsystem for Linux (WSL). For assistance, here’s a video tutorial by one of our lab members on how to install WSL on a Windows machine.
A note for Columbia Neurology HPC users
To configure the network proxy, open ~/.bashrc in a text editor and append the following commands:
export http_proxy=http://menloproxy.cumc.columbia.edu:8080
export https_proxy=http://menloproxy.cumc.columbia.edu:8080
Then run source ~/.bashrc to load the changes.
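If you prefer not to edit the file by hand, the same two lines can be appended from the command line. The following is a minimal sketch, not part of the original instructions: add_line_once and add_proxy_settings are hypothetical helper names, and the proxy address is the one given above. Each line is only appended if it is not already present, so running it twice does not create duplicates.

```shell
# Append a line to a file only if that exact line is not already there.
# add_line_once is a hypothetical helper name, not part of any existing tool.
add_line_once() {
    local line="$1" file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add both proxy settings to the given file (default: ~/.bashrc).
add_proxy_settings() {
    local file="${1:-$HOME/.bashrc}"
    add_line_once 'export http_proxy=http://menloproxy.cumc.columbia.edu:8080' "$file"
    add_line_once 'export https_proxy=http://menloproxy.cumc.columbia.edu:8080' "$file"
}
```

After defining these functions, calling add_proxy_settings with no argument edits ~/.bashrc; passing a different file name lets you try it on a scratch file first.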
A note for MemVerge Cloud users
Running on MemVerge Cloud involves extra setup. It is recommended that you follow the steps on this page instead of continuing here.
A note for users from China
Depending on your network proxy settings, you might have difficulty following this setup because of GitHub connectivity issues. This repository provides some workarounds that use resources from Gitee instead. If you have network connectivity issues, it is recommended that you leave this page and follow the Gitee repository instead.
Purge previous installations of conda setups
This optional step is only necessary for those who have previously installed such software and would now like to start from scratch.
rm -rf ~/.mamba ~/.conda ~/.anaconda ~/.pixi ~/.jupyter
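Because rm -rf cannot be undone, it may be worth checking first which of these directories exist and how large they are. The sketch below is one possible way to do that; list_purge_targets is a hypothetical helper name, and the directory list is copied from the command above.

```shell
# Print the size of each purge target that exists under the given base
# directory (default: $HOME). Nothing is deleted.
# list_purge_targets is a hypothetical helper name.
list_purge_targets() {
    local base="${1:-$HOME}" name
    for name in .mamba .conda .anaconda .pixi .jupyter; do
        if [ -e "$base/$name" ]; then
            du -sh "$base/$name"
        fi
    done
}
```

Calling list_purge_targets with no argument inspects your home directory. If it prints nothing, none of the directories exist and there is nothing to purge.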
Basic software environment and manager setup
To install our customized environment based on pixi, run the following in a terminal:
curl -fsSL https://raw.githubusercontent.com/gaow/misc/master/bash/pixi/pixi-mamba.sh | bash
This provides a bioinformatics environment with the most frequently used software packages installed as a starting point. Depending on your needs, you can add extra software packages. The following sections give examples of how to install other executables as well as R and Python packages.
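As an alternative to piping the script straight into bash, you may prefer to save it locally and read it before running it. The following is a sketch of that approach under the assumption that the script behaves the same when run from a file; fetch_setup_script and SETUP_URL are hypothetical names, and the URL is the one used in the command above.

```shell
# URL of the setup script, copied from the install command above.
SETUP_URL="https://raw.githubusercontent.com/gaow/misc/master/bash/pixi/pixi-mamba.sh"

# Download the setup script to a local file without executing it.
# fetch_setup_script is a hypothetical helper name.
fetch_setup_script() {
    local dest="${1:-pixi-mamba.sh}"
    # -f: fail on HTTP errors, -sS: quiet but still show errors, -L: follow redirects
    curl -fsSL "$SETUP_URL" -o "$dest"
}

# Typical use (not run automatically here):
#   fetch_setup_script pixi-mamba.sh
#   less pixi-mamba.sh     # review what the script will do
#   bash pixi-mamba.sh     # run it once you are satisfied
```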
Next steps
For HPC users
You are now ready to set up your connection to the HPC via JupyterLab.
Install other executables
Once this is set up, you can install other executables using pixi as the software manager, as long as they are released in one of the conda channels. For example, to install RStudio Server:
pixi global install rstudio
and/or VS Code,
pixi global install vscode
and/or vim, the text editor,
pixi global install vim
or bioinformatics tools,
pixi global install STAR
You can check whether the installed packages are executable as follows (note that the executable name might not be the same as the package name used in the install command):
which star
star --version
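To check several tools at once, a small loop around the shell built-in command -v can report which ones are on your PATH. This is a minimal sketch rather than part of the original setup; check_commands is a hypothetical helper name.

```shell
# Report whether each named command is on PATH.
# Returns non-zero if at least one command is missing.
# check_commands is a hypothetical helper name.
check_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf 'found:   %s -> %s\n' "$cmd" "$(command -v "$cmd")"
        else
            printf 'missing: %s\n' "$cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_commands pixi vim star STAR reports each name separately. Since an executable name can differ from its package name, including in letter case, trying more than one spelling can help locate the right one.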
Install other R libraries
It is important to realize that the R software and libraries included in this environment are precompiled conda packages installed via pixi. It is therefore highly recommended that R libraries also be installed from conda whenever they are available.
For example, to install the R library pacman, first verify that it is available on anaconda.org; you can then install it using:
pixi global install -c conda-forge --environment r-base r-pacman
This installs it into the r-base environment that we configured previously.
For libraries not available on conda, you can use the regular approaches to install them, such as from CRAN, Bioconductor, or GitHub. However, it is strongly recommended that you install from conda whenever possible, or build your own conda packages first (instructions TBD) and then install them via pixi. This avoids potential issues with setting up your own computing environment to compile packages from source.
The same command can be used to update these packages.
If you want to update all packages in the environment, run:
pixi global update r-base
Install other Python packages
For example, to install the Python library seaborn, first verify that it is available on anaconda.org; you can then install it using:
pixi global install -c conda-forge --environment python seaborn
This installs it into the python environment that we configured previously.
The same command can be used to update these packages.
If you want to update all packages in the environment, run:
pixi global update python
Similarly, for packages not available on conda, you can use the regular approaches to install them, such as from PyPI via pip install; again, try to avoid doing that and rely on conda as much as possible.