NOTE: The setup below is meant to be modular: it can be reproduced locally on a laptop, on an HPC cluster, or on a VM hosted on AWS or Google Cloud.

JupyterLab + SoS Suite setup

This document provides tips for setting up your JupyterLab + SoS Suite computing environment using the pixi package manager.

Operating system requirement

The instructions on this page are tested and known to work on Linux and macOS. Although with some effort they might work on Windows, using Windows for your everyday computational biology research is discouraged.

A note for Columbia Neurology HPC users

To configure the network proxy, open ~/.bashrc in a text editor and append the following commands:

export http_proxy=http://menloproxy.cumc.columbia.edu:8080
export https_proxy=http://menloproxy.cumc.columbia.edu:8080

and type source ~/.bashrc to load the changes.
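If you prefer to script this step, the following is a minimal sketch that appends the two export lines only when they are not already present, so that running it twice does not create duplicates. The function name add_proxy_lines is hypothetical and chosen for illustration only.

```shell
# Append each proxy export to the given rc file only if that exact line is missing.
# add_proxy_lines is a hypothetical helper name used for illustration.
add_proxy_lines() {
    rc_file="$1"
    proxy_url="http://menloproxy.cumc.columbia.edu:8080"
    touch "$rc_file"
    for var in http_proxy https_proxy; do
        line="export ${var}=${proxy_url}"
        # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
        grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
    done
}
```

After defining the function, run add_proxy_lines ~/.bashrc and then source ~/.bashrc as described above.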

A note for MemVerge Cloud users

Running on MemVerge Cloud involves extra setup. It is recommended that you leave this page and follow the steps on the MemVerge-specific page instead.

A note for users from China

Depending on your network proxy settings, you might have difficulty following this setup because of GitHub connectivity issues. This repository provides some workarounds that use resources from Gitee instead. If you have network connectivity issues, it is recommended that you leave this page and follow the Gitee repository instead.

Purge previous installations of conda-based setups

This is an optional step, only necessary for those who have previously installed various software and now would like to start from scratch.

rm -rf ~/.mamba ~/.conda ~/.anaconda ~/.pixi ~/.jupyter ~/micromamba ~/.mambarc ~/.local/share/jupyter/
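Because rm -rf is irreversible, it may be worth checking first which of these locations actually exist on your machine. The following is a minimal read-only sketch; the function name list_purge_targets is hypothetical, and the paths are simply those from the command above.

```shell
# Print which of the purge targets exist under $HOME, without deleting anything.
# list_purge_targets is a hypothetical helper name used for illustration.
list_purge_targets() {
    for path in "$HOME/.mamba" "$HOME/.conda" "$HOME/.anaconda" "$HOME/.pixi" \
                "$HOME/.jupyter" "$HOME/micromamba" "$HOME/.mambarc" \
                "$HOME/.local/share/jupyter"; do
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

list_purge_targets
```

Anything listed will be removed by the purge command, so back up whatever you still need before running it.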

Basic software environment and manager setup

To install our customized environment based on pixi, run the following in a command terminal:

curl -fsSL https://raw.githubusercontent.com/gaow/misc/master/bash/pixi/pixi-setup.sh | bash

This will provide a bioinformatics environment with the most frequently used software packages installed as a starting point. Depending on your needs, you can add extra software packages. The following sections give examples of how to install other executables as well as R and Python packages.

Next steps

For HPC users

You are now ready to set up your connection to the HPC via JupyterLab.

Install other executables

Once this is set up, you can install other executables with pixi, as long as they are released in one of the conda channels. For example, to install RStudio Server:

pixi global install rstudio

and/or VS Code,

pixi global install vscode

and/or the text editor vim,

pixi global install vim

or bioinformatics tools,

pixi global install star

You can check that the installed packages are executable using the commands below. Pay attention to the executable names, which might not be the same as the package names in your install commands; for example, the star package provides an executable named STAR:

which STAR
STAR --version
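To check several tools at once, a small helper along these lines may be useful. This is a minimal sketch that relies only on the shell built-in command -v; the function name missing_tools is hypothetical and not part of pixi.

```shell
# Print the name of every command given as an argument that cannot be found on PATH.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: append the executable names of the other tools you installed.
missing_tools vim
```

No output means every listed executable was found. Remember to pass executable names rather than package names.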

Install other R libraries

Note that R and the R libraries in this environment are precompiled conda packages installed via pixi. It is therefore highly recommended that additional R libraries also be installed from conda whenever they are available.

For example, to install the R library pacman, first verify that it is available on anaconda.org, then install it using:

pixi global install -c conda-forge --environment r-base r-pacman

to install it to the r-base environment that we have previously configured.

For libraries not available on conda, you can use the regular approaches to install them, such as from CRAN, Bioconductor and GitHub. It is nevertheless strongly recommended to install from conda whenever possible, or to build your own conda packages first (instructions TBD) and then install them via pixi. This way you avoid potential issues with setting up your own computing environment to compile packages from source.

To update an installed package, specify the version you intend to update to:

pixi global install --environment r-base <PACKAGE>=<VERSION>

If you want to update all packages in the environment,

pixi global update r-base

Install other Python packages

For example, to install the Python library seaborn, first verify that it is available on anaconda.org, then install it using:

pixi global install -c conda-forge --environment python seaborn

to install it to the python environment that we have previously configured.

To update an installed package, specify the version you intend to update to:

pixi global install --environment python <PACKAGE>=<VERSION>

If you want to update all packages in the environment,

pixi global update python

Similarly, for packages not available on conda, you can use the regular approaches to install them, such as from PyPI via pip install. Again, try to avoid doing that and rely on conda as much as possible.