GenomeSpace Tools and Data Sources
Project Website: http://www.genepattern.org
GenePattern is a powerful genomic analysis platform that provides access to hundreds of tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.
The categories of analysis modules in GenePattern include gene expression analysis, proteomics, SNP analysis, flow cytometry, and RNA-seq analysis.
There are also a number of preprocessing and utility modules for data handling, and visualization modules that display analysis results graphically and allow you to manipulate that view interactively.
GenePattern provides a simple application interface that gives users access to computational analysis methods and tools, regardless of their computational experience. GenePattern also provides a programmatic interface that makes those analysis modules available to computational biologists and developers from Java, MATLAB, and R.
Modules developed and tested by the GenePattern development team are available on the Broad Institute public repository. Modules developed by GenePattern users are available on GParc, the GenePattern Archive.
Suites group modules and pipelines into convenient packages. They can provide an easy list of frequently accessed modules and a convenient way of collecting a set of modules and pipelines to be shared with other GenePattern users.
GenePattern pipelines combine analysis modules, visualization modules, and other pipelines into a single, reusable workflow. A pipeline can be defined to analyze a particular dataset (for example, to reproduce published analysis results), or it can be parameterized so that the person running it provides the datasets and other analysis variables. Often a pipeline runs a progressive series of analyses, where the output from one analysis is used as input for the next.
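The chaining described above can be illustrated with a minimal Python sketch. It does not use the GenePattern API: `Step`, `Pipeline`, and the two toy steps are hypothetical names invented for this illustration. Each step's output becomes the next step's input, and stored parameters can be overridden at run time, which mimics a parameterized pipeline.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Step:
    """One analysis step: a name, a function, and default parameters (hypothetical)."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Runs steps in order, feeding each step's output to the next step."""
    steps: List[Step]

    def run(self, data: Any,
            overrides: Optional[Dict[str, Dict[str, Any]]] = None) -> Any:
        overrides = overrides or {}
        for step in self.steps:
            # Run-time overrides take precedence over the step's stored defaults.
            params = {**step.params, **overrides.get(step.name, {})}
            data = step.func(data, **params)
        return data


def filter_low(values: List[float], threshold: float) -> List[float]:
    """Toy preprocessing step: drop values below a threshold."""
    return [v for v in values if v >= threshold]


def scale(values: List[float], factor: float) -> List[float]:
    """Toy analysis step: multiply every value by a constant."""
    return [v * factor for v in values]


pipeline = Pipeline([
    Step("filter", filter_low, {"threshold": 2.0}),
    Step("scale", scale, {"factor": 10.0}),
])

# Fixed run: uses the parameters stored in the pipeline.
default_result = pipeline.run([1.0, 2.0, 3.0])
# Parameterized run: the caller supplies a different filter threshold.
custom_result = pipeline.run([1.0, 2.0, 3.0],
                             overrides={"filter": {"threshold": 3.0}})
```

Keeping the parameters with each step is what makes a run repeatable: the same pipeline object applied to the same data gives the same result, while the overrides leave room for variable inputs.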
Pipelines allow you to capture, automate, and share the complex series of steps required to analyze genomic data. By providing a way to create and distribute an entire computational analysis methodology in a single executable script, pipelines enable a form of reproducible in silico research.
Pipelines capture computational analysis methods (modules, other pipelines, or both) and their parameters. They allow you to "chain" methods together by using the output of one as the input of another. You can use a simple form-based interface to build a pipeline from scratch, or have GenePattern work backward from an analysis results file to create a pipeline that contains the analysis methods used to generate that file. You can create a pipeline to reproduce an exact set of events, or parameterize the pipeline to run an analysis methodology against variable data.
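Working backward from a results file can be pictured as walking a provenance record. The sketch below is only an illustration under assumed data: the `PROVENANCE` table, the file names, and the module names are hypothetical, and it says nothing about how GenePattern actually stores or reads this information. Given a record of which module produced each file from which input, it recovers the original input and the ordered list of modules.

```python
from typing import Dict, List, Tuple

# Hypothetical provenance log: result file -> (module that produced it, its input file).
# A file with no entry of its own is treated as the original dataset.
PROVENANCE: Dict[str, Tuple[str, str]] = {
    "filtered.txt": ("FilterStep", "raw.txt"),
    "clustered.txt": ("ClusterStep", "filtered.txt"),
    "report.txt": ("ReportStep", "clustered.txt"),
}


def pipeline_from_result(result_file: str,
                         log: Dict[str, Tuple[str, str]]) -> Tuple[str, List[str]]:
    """Walk backward from a result file; return (original input, modules in run order)."""
    modules: List[str] = []
    current = result_file
    seen = set()
    while current in log:
        if current in seen:
            raise ValueError(f"cycle in provenance at {current!r}")
        seen.add(current)
        module, source = log[current]
        modules.append(module)
        current = source
    # Modules were collected last-to-first, so reverse them into run order.
    modules.reverse()
    return current, modules


source, steps = pipeline_from_result("report.txt", PROVENANCE)
```

The recovered module list could then be replayed on the same input to reproduce the result, or applied to a different dataset.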