Background

A significant challenge in working with large genomic datasets is visualizing them effectively. Informative and attractive figures are among the most important ways to make your presentations and publications more impactful. If you look through genome papers, you will find that genomic data are very commonly summarized in circular images. See the figure below, which summarizes two related bacterial genomes. When done properly, this layout can be very effective even when the genomes you are working with are not circular themselves.

Objectives

The goal of this exercise is to use the program Circos to generate publication-quality images from genomic data.

Software and Dependencies

Circos is already installed on our workshop server, but it is not yet in your “PATH”, the list of directories that Bash automatically searches for executables whenever you enter a command. Circos can be found in the following directory: /home/apps/circos-0.69-6/bin. You could enter that full absolute path every time you want to call the program, or you can add that location to your “PATH” so that you only need to type circos to run it. Add that location to your PATH for the current shell session with the following command.

PATH=$PATH:/home/apps/circos-0.69-6/bin
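If you run the command above more than once, the same directory is appended to PATH each time. The sketch below shows one way to avoid that and to confirm that Bash can now find the program. The function name add_to_path is a hypothetical helper invented for this illustration, not something provided by Circos or the server.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, so leave PATH unchanged
        *) PATH="$PATH:$1" ;;       # otherwise append it to the end
    esac
}

add_to_path /home/apps/circos-0.69-6/bin

# Print the location of the executable if Bash can find it, or a warning if not
command -v circos || echo "circos not found in PATH"
```

Like the plain PATH assignment, this change lasts only for the current shell session.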

Protocol

1. Generate the Circos figure from the example dataset distributed with the software

Before we get into the details of Circos, let’s see what it can do and confirm that the software is properly installed by generating the “example” figure based on the data distributed with the software.

First, open a Terminal session and ssh into our workshop server. Then enter the following command to move into the directory with the relevant files for this exercise.

cd ~/TodosSantos/circos/example

Then run Circos by simply entering the name of the program. All the configuration files for this dataset are already in place in this directory, so no additional information is necessary as long as Circos is installed and in your PATH. We will go through how to set up these configuration files later.

circos
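Circos prints its status updates to the screen only. If you would also like a copy to read afterwards, something like the following sketch could be used in place of the bare command. The helper run_logged and the log file name circos_run.log are assumptions made for this example and are not part of Circos.

```shell
# Hypothetical helper: run a command and save its messages to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        # Show the messages on screen and keep a copy in the log file
        "$@" 2>&1 | tee "$log"
    else
        echo "$1 is not in your PATH" >&2
        return 1
    fi
}

# Run Circos in the current directory, keeping its status updates in circos_run.log
run_logged circos_run.log circos || echo "Circos did not run; check your PATH setting"
```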

The program should take about a minute to run and report a number of status updates along the way. It will return to the command prompt when finished. To see the figure that was generated, enter the following command.

[Note that the open command used below and throughout this exercise assumes that you are working locally on a Mac (macOS, formerly Mac OS X). If you are using a local Linux machine, you can use xdg-open instead. If you are using a remote server, it will be easiest to transfer the image file to your local machine before viewing it.]

open circos.png
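Because the right viewing command depends on your operating system, the following sketch shows one way to check that the figure was written and to pick between open and xdg-open. The helper names viewer_for and figure_ready are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: print the viewer command for an OS name as reported by `uname -s`.
viewer_for() {
    case "$1" in
        Darwin) echo "open" ;;       # macOS
        Linux)  echo "xdg-open" ;;
        *)      echo "" ;;           # unknown system: no suggestion
    esac
}

# Hypothetical helper: succeed only if the file exists and is not empty.
figure_ready() {
    [ -s "$1" ]
}

viewer=$(viewer_for "$(uname -s)")
if ! figure_ready circos.png; then
    echo "circos.png is missing or empty; check the Circos status messages"
elif [ -z "$viewer" ]; then
    echo "No known viewer for this system; open circos.png manually"
else
    echo "View the figure with: $viewer circos.png"
fi
```

The sketch only suggests a command rather than running it, so you still open the image yourself.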

You should see something like this: