About the project

TooManyCellsInteractive is an interactive visualization tool allowing users to explore cell cluster trees generated by TooManyCells.

This tool and its associated website are free and open to all users, with no login requirement.

Quick start

With Docker and Docker-Compose installed:

./start-and-load.sh \
    --matrix-dir /path/to/my-matrix/dir \
    --tree-path /path/to/cluster_tree.json \
    --label-path /path/to/labels.csv \
    --port 1234

For an example use, see the tutorial.

Note that --matrix-dir is optional: it is needed only if you want to overlay features, such as gene expression, on the tree.

Running the application

To run the application, first make sure that Docker and Docker-Compose are installed on your system. Before running the software, use too-many-cells to generate (among other outputs) a cluster_tree.json file in the output directory. To display results, the application needs access to this file and to a labels.csv file, as described in the too-many-cells documentation.

Optionally, you may also include the original matrix files, which are used to populate a database of features that can be overlaid on your cluster tree visualization. Depending on the size of these files, the database may take a while to populate. The current benchmark is about 100,000 entries per second.
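As a rough guide to what that benchmark means in practice, the arithmetic can be sketched in shell. This is only an illustration: estimate_import_seconds is a hypothetical helper that is not part of the project, the text does not define exactly what counts as an entry, and actual throughput will vary with your hardware.

```shell
# Rough import-time estimate from the quoted benchmark (~100,000 entries/s).
# estimate_import_seconds is a hypothetical helper, not part of the project.
ENTRIES_PER_SECOND=100000

estimate_import_seconds() (
    entries=$1
    # Integer ceiling division, so small inputs round up to one second.
    echo $(( (entries + ENTRIES_PER_SECOND - 1) / ENTRIES_PER_SECOND ))
)

# Example: 50 million entries take about 500 seconds (a little over 8 minutes).
estimate_import_seconds 50000000   # prints 500
```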

For a one-command way to get your cluster_tree.json and labels.csv onto the app server and, optionally, upload your matrix files to the database, use the convenience wrapper start-and-load.sh. This program takes three required arguments: the path to the labels.csv file, the path to the cluster_tree.json file, and the port on which the server will listen on your localhost (the app can also be served to and accessed from another computer). You may also pass the directory where your matrix files are stored, in which case they are imported into the database before the app starts.

Example:

./start-and-load.sh \
    --matrix-dir /path/to/my-matrix/dir \
    --tree-path /path/to/cluster_tree.json \
    --label-path /path/to/labels.csv \
    --port 1234

This command will automatically pull the necessary dependencies and run the server. When you are finished with the application, the postgres container will need to be stopped manually by running docker-compose stop postgres from the project root.
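Because the wrapper depends on a tree file, a labels file, and a port, a small pre-flight check can catch a mistyped path before Docker starts pulling images. The sketch below is one possible approach, not something shipped with the project: check_inputs is a hypothetical helper, and it only tests that the paths exist and that the port is a number in the valid TCP range.

```shell
# check_inputs is a hypothetical helper, not part of TooManyCellsInteractive.
# Usage: check_inputs TREE_PATH LABEL_PATH PORT [MATRIX_DIR]
check_inputs() (
    tree_path=$1
    label_path=$2
    port=$3
    matrix_dir=${4:-}

    [ -f "$tree_path" ] || { echo "missing tree file: $tree_path" >&2; exit 1; }
    [ -f "$label_path" ] || { echo "missing labels file: $label_path" >&2; exit 1; }

    # The port must be a whole number in the valid TCP range.
    case $port in
        ''|*[!0-9]*) echo "port must be numeric: $port" >&2; exit 1 ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        exit 1
    fi

    # The matrix directory is optional; check it only when one is given.
    if [ -n "$matrix_dir" ] && [ ! -d "$matrix_dir" ]; then
        echo "missing matrix directory: $matrix_dir" >&2
        exit 1
    fi
)

# Possible use: run the wrapper only when the check passes.
# check_inputs /path/to/cluster_tree.json /path/to/labels.csv 1234 &&
#     ./start-and-load.sh \
#         --tree-path /path/to/cluster_tree.json \
#         --label-path /path/to/labels.csv \
#         --port 1234
```

The function body runs in a subshell, so a failed check returns a non-zero status without terminating the calling shell.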

WARNING: Your matrix files will be mounted into the container and imported into a database stored in a Docker volume on your host machine. Depending on the size of your files, this may result in substantial disk usage. To free up space, consider regularly purging unneeded volumes, containers, and/or images using the docker system prune command. Note that docker system prune does not remove volumes unless the --volumes flag is given.

You can run start-and-load.sh on every use, but to avoid reloading the optional matrix each time, you may simply restart the services with the command docker-compose -f docker-compose.prod.yaml up from inside the project root.

Note that the node image must be rebuilt locally any time there is a code change. This is handled by the start-and-load.sh program as well.

Importantly, TooManyCellsInteractive works with any tree in the cluster_tree.json format, so feel free to use any program to generate a tree structure and use TooManyCellsInteractive to explore it!

Generating images from the command line

TooManyCellsInteractive makes it possible to batch-generate SVG images from the command line based on configurations set in the browser tool. Once you are satisfied with the tree configuration (colors, pruning parameters, sizes, etc.) in the browser, choose “Export Image Configuration” from the “Select Export” menu to download a JSON representation of the graphic. You can pass this file to the command-line program generate-svg.sh with --config-path.

If you wish to include feature values in your plot, you will need to ensure that the PostgreSQL database has been provisioned with the appropriate data ahead of time. For scripting batch outputs, the configuration JSON may instead be passed to the program on stdin, which allows it to be modified first with a tool like jq. This JSON editing makes it easy to, for instance, loop through many feature plots (such as the expression of multiple genes on the tree).
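The JSON-editing step might look something like the sketch below, which requires jq. It is deliberately hedged: the "features" key is an assumed name rather than the confirmed schema of an exported configuration, Cd4 and Cd8a are placeholder gene names, with_feature is a hypothetical helper, and the demo runs on a stand-in object instead of a real export. Check your own exported file for the actual key names.

```shell
# with_feature is a hypothetical helper. It reads a configuration JSON on
# stdin and writes a compact copy whose (assumed) "features" array holds a
# single feature name. The key name is an assumption, not the real schema.
with_feature() {
    jq -c --arg feature "$1" '.features = [$feature]'
}

# Demo on a stand-in configuration rather than a real export.
for feature in Cd4 Cd8a; do
    printf '{"features": [], "width": 1000}\n' | with_feature "$feature"
done
# prints:
# {"features":["Cd4"],"width":1000}
# {"features":["Cd8a"],"width":1000}
```

In a real run, each modified configuration would be piped into generate-svg.sh on stdin instead of being printed; the remaining arguments to that program are not shown here.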

The generate-svg.sh program is a convenience wrapper around the fairly complex docker-compose command required to run the image generator. The sample-export-loop.sh script demonstrates how multiple images can be generated from a single configuration file, in this case by substituting the features and overriding the output path to prevent overwrites. As with similar commands, generate-svg.sh mounts the designated input files and output directory into the container for processing.