Interactive analysis on MMCloud + AWS

We have implemented a utility script mm_interactive.sh to help start interactive sessions on MMCloud.

The script currently supports three types of sessions: JupyterLab, RStudio, and a plain shell environment.

NOTE: As of 8/28, the RStudio session is not yet set up in this environment.

Initial configuration

By default, the interactive session is a plain shell environment with only minimal software installed. You will need to start a Docker image in which you can install software; whatever you install is saved to the image permanently, even after the session ends. This ensures that the next time you start a session, as long as the image is pulled correctly, you will have access to the software you previously installed.

Make sure you have cloned this repo (git clone) so that you have access to the relevant scripts. The script mm_interactive.sh resides in the src/ folder of the repo. When submitting for the first time, run the command below.

bash mm_interactive.sh

Next, enter your account details when prompted: your OpCenter username and password. The script automatically configures two default mounts: one maps the root S3 folder to /data/ on your instance, and the other maps your interactive folder to the corresponding folder in your home directory, creating the folder if it does not yet exist. This latter mount is required and cannot be changed.

If you prefer paths other than the defaults, you can override them with specific parameters. Use -s3 and -vm for the “data” pair of paths. If additional mounts are necessary, use the -am option, once per mount. In every case the end result is the same: an S3 bucket path is mounted to a directory on the instance. A customized command looks like this:

bash mm_interactive.sh -s3 's3://custom_data_s3_path/' -vm '/custom_data_vm_path/' -is3 's3://custom_interactive_s3_path/' -ivm '/custom_interactive_vm_path/' -am 's3://s3_path1:tovm_path1' -am 's3://s3_path2:tovm_path2'
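When several extra mounts are involved, it can help to assemble the command and review it before running it. The following is a minimal sketch, not part of mm_interactive.sh: the helper name build_mm_command is hypothetical, the paths are the placeholders from the example above, and the check on the -am value only assumes the 's3://s3_path:vm_path' form shown there.

```shell
# Hypothetical helper (not part of mm_interactive.sh): print the command for
# one data mount plus any number of extra mounts, so it can be reviewed first.
build_mm_command() {
    s3_path=$1
    vm_path=$2
    shift 2
    cmd="bash mm_interactive.sh -s3 '$s3_path' -vm '$vm_path'"
    for mount in "$@"; do
        # Assumed form of an extra mount: 's3://s3_path:vm_path'.
        case $mount in
            s3://*:?*) cmd="$cmd -am '$mount'" ;;
            *) echo "unexpected mount format: $mount" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$cmd"
}

# Print the command only; copy it into the terminal once it looks right.
build_mm_command 's3://custom_data_s3_path/' '/custom_data_vm_path/' \
    's3://s3_path1:tovm_path1' 's3://s3_path2:tovm_path2'
```

Printing the command instead of executing it keeps the sketch safe to try; the output can be pasted into the terminal as-is.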

After you follow the setup prompts, a connection to the interactive session will be established from your shell terminal. Approximately 3 minutes later, you should see output like the following:

To access the server, copy this URL in a browser: ...

OR

SSH session: ...

Copy the URL into your web browser. For the SSH session, copy the SSH command into your terminal.

Once logged in, refer to this documentation page to install recommended software packages using pixi and micromamba for initial configuration.

In addition, you can install other packages you need for your analysis using pixi, since the image is pixi-based. Taking STAR as an example:

pixi global install STAR

You can check whether the installed packages are executable as follows (note that the executable name may differ from the package name used in the install command):

which STAR
STAR --version
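The same check can be applied to several executables at once. The sketch below uses the POSIX command -v builtin; the function name check_executables is hypothetical, and pixi and micromamba are used only because the setup above is expected to provide them. Replace them with the tools you installed.

```shell
# Hypothetical helper: report whether each named executable is on PATH.
check_executables() {
    missing=0
    for exe in "$@"; do
        if location=$(command -v "$exe"); then
            printf 'found    %s -> %s\n' "$exe" "$location"
        else
            printf 'missing  %s\n' "$exe"
            missing=$((missing + 1))
        fi
    done
    # Exit status is non-zero if at least one executable was not found.
    [ "$missing" -eq 0 ]
}

check_executables pixi micromamba || echo "some executables are missing"
```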

Once your initial packages are installed (should take around 30 minutes), you are ready to go!

Daily Use

The initial configuration from the steps above should have installed JupyterLab, an integrated development environment (IDE) that you can use instead of working in the shell. To access JupyterLab, use:

bash mm_interactive.sh --no-interactive-mount -ide jupyter

After about 10 minutes, you’ll see the message “To access the server, copy this URL into a browser:” followed by the URL. Follow that instruction to access JupyterLab.

The initial configuration should have made most packages available. The image is preconfigured with pixi and micromamba as instructed here. If you want to add new conda packages, follow the recommendations for command executables, R, and Python. For example, to install pecotmr, an R package with many dependencies:

micromamba install -n r_libs r-pecotmr -c dnachun
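After an install like this, a quick way to confirm the package is usable is to try loading it inside the same environment. This is a minimal sketch that assumes Rscript is available in the r_libs environment; the function name r_package_loads and the MICROMAMBA override variable are hypothetical and exist only to make the check easy to dry-run.

```shell
# Hypothetical helper: succeed if the R package loads in the given environment.
# Assumes Rscript is installed in that environment (r_libs by default).
r_package_loads() {
    pkg=$1
    env_name=${2:-r_libs}
    # MICROMAMBA may be set to another executable, e.g. for a dry run.
    "${MICROMAMBA:-micromamba}" run -n "$env_name" \
        Rscript -e "library($pkg)" >/dev/null 2>&1
}

if r_package_loads pecotmr; then
    echo "pecotmr loads in r_libs"
else
    echo "pecotmr could not be loaded in r_libs"
fi
```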

Suspension

To conserve resources, suspend your interactive session when it is not in use. The command to do so is displayed on your screen after the line “Suspend your Jupyter Notebook when you do not need it by running:”.

Migration

If you require a session with different specifications, use the migration option. A blue button in the upper right corner of your Jupyter interface allows you to log in and view your instance information. From there, you can migrate to a new instance with your preferred CPU and memory settings; you can also choose a different instance family. We maintain a whitelist and a blacklist of instance families to optimize performance and avoid less efficient options.

Alternative IDE (NOT YET AVAILABLE)

For RStudio:

bash mm_interactive.sh --ide rstudio

You will be given similar instructions to access RStudio.