Loading Data into GenomeSpace

Introduction

This guide gives an overview of how to load data into the GenomeSpace user interface from the tools and resources that connect to GenomeSpace. It describes methods for obtaining data and is not meant to be comprehensive documentation for the tools themselves. For more information on each tool, see the list of GenomeSpace tools and the help links provided on each tool's page.

If you have problems launching desktop tools that depend on the Java Network Launch Protocol (JNLP), see FAQ2.2 for solutions. These solutions were updated in February 2015 and should take about five minutes to apply. They involve adjusting your browser's pop-up blocker and your system's Java security exception site list.

For tool-specific documentation, check out the main Tool Guide:

Tool Guide


Upload Data to GenomeSpace

In GenomeSpace:

  1. Select one or more files in your desktop file manager (e.g. Windows Explorer, Mac OS X Finder).

  2. Drag the files over the directory in the GenomeSpace user interface where you want to upload them. The icon changes to a green '+' sign, and the destination folder is highlighted with a yellowish-green background.

  3. Release the mouse button to drop the file(s). If you drop the file(s) onto a subdirectory, the GenomeSpace user interface opens that directory and uploads the file(s) there. If you drop the file(s) anywhere else in the currently displayed directory, either on a file name or on white space in the display, the file(s) are uploaded to the current directory.

  4. To see recent uploads, click the View > Recent uploads menu item. This opens a dialog that tracks your upload queue. To dismiss the dialog, click Close.

For more information about drag-and-drop uploads to GenomeSpace, see the GenomeSpace blog post.
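
The drop rule described in the steps above can be summarized as a small decision function. The following is only an illustrative sketch, not GenomeSpace code: the `DropTarget` class, the `resolve_upload_directory` function, and the example paths are hypothetical names introduced here to model the behavior described in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DropTarget:
    """Hypothetical model of what is under the pointer when the files are dropped."""
    name: Optional[str]        # entry name under the pointer, or None for white space
    is_directory: bool = False


def resolve_upload_directory(current_dir: str, target: DropTarget) -> str:
    """Return the directory that receives the dropped files."""
    # Dropping onto a subdirectory uploads into that subdirectory.
    if target.name is not None and target.is_directory:
        return f"{current_dir.rstrip('/')}/{target.name}"
    # Dropping onto a file name or onto white space uploads into the current directory.
    return current_dir


if __name__ == "__main__":
    here = "/Home/alice"  # hypothetical current directory
    print(resolve_upload_directory(here, DropTarget("experiments", is_directory=True)))
    print(resolve_upload_directory(here, DropTarget("notes.txt")))
    print(resolve_upload_directory(here, DropTarget(None)))
```

Only a drop onto a subdirectory changes the destination; every other drop position falls back to the directory that is currently displayed.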

 


Load Data from ArrayExpress to GenomeSpace

1. Launch ArrayExpress from GenomeSpace.

Click the ArrayExpress icon in the toolbar to launch ArrayExpress.

2. Locate data in ArrayExpress and send to GenomeSpace.

  1. Search for ArrayExpress experiment data.

  2. Results can be filtered using the "Filter Search Results" tool. In this example, we have searched for "muscular dystrophy" and limited the results to the organism "Canis lupus". Open your selected experiment by clicking the experiment accession number (highlighted).

  3. Click the Send [experiment accession] data to GenomeSpace link.

  4. In the upload window, all of this experiment's files are selected by default; clear the checkboxes for any files you do not want to upload to GenomeSpace. When the files you want to upload are correctly selected, click Upload.

ArrayExpress shows you the upload status of each file.

  5. When all your files are uploaded to GenomeSpace, click the "follow this link to open GenomeSpace UI" link to go to your GenomeSpace.

3. Locate your ArrayExpress data in GenomeSpace.

If this is your first time loading ArrayExpress data into GenomeSpace, ArrayExpress creates an ArrayExpress folder in the top level of your GenomeSpace home directory.  Subsequently, ArrayExpress creates a separate subdirectory in that folder for each experiment you upload.

One way to use the ArrayExpress experiment files you have loaded (which are in MAGE-TAB format) is to send them to GenePattern's MAGETABImportViewer module to create GCT and CLS files for analysis. See the MAGETABImportViewer module documentation for more information on the specific MAGE-TAB format the module accepts.


Load Data from the cBioPortal to GenomeSpace

1. Launch the cBioPortal from GenomeSpace.

  1. Click the cBioPortal icon in the toolbar to launch the cBioPortal.

2. Download data from cBioPortal and send it to GenomeSpace.

  1. Locate downloadable data by clicking the "Download Data" tab in cBioPortal. To download data, set the following parameters:
    1. Select Cancer Study: select the study of your choice from the drop-down menu, e.g. glioblastoma tumor samples.
    2. Select Genomic Profiles: select the datasets that you are interested in, e.g. gene expression data (mRNA expression).
    3. Select Patient/Case Set: select the set of samples you are interested in, e.g. only samples with mRNA expression data.
    4. Enter Gene Set: enter a gene set of interest in one of the following ways:
      1. Choose a pre-compiled gene list of interest from the drop-down menu.
      2. Enter a custom gene list of interest in the text box.
  2. Click the Send to GenomeSpace icon.

3. Save the data to GenomeSpace.

A pop-up window will appear, allowing you to save the compiled dataset to GenomeSpace.


  1. Optional: Choose a specific directory to save the data to, by navigating through the directory tree.
  2. Choose a name for the file, or use the filename automatically generated by the cBioPortal.
  3. Click the Submit button to upload the data to GenomeSpace.

 


Load GEO Data from InSilico DB to GenomeSpace

1. Launch InSilico DB from GenomeSpace.

Click the InSilico icon in the Data Sources toolbar to launch InSilico DB.

2. Locate data in InSilico DB and send to GenomeSpace.

  1. In InSilico DB, click Sign in/sign up in the top right corner.

  2. In the OpenID section, click the GenomeSpace icon.

If you have not logged into InSilico DB before with your GenomeSpace ID, you will need to:

InSilico DB will then send you a confirmation email. Click the link in the email to confirm your registration.

  3. If you are returned to the main InSilico window, click the green Browse button to get to the search window.

  4. Enter GEO in the search field and click the magnifying glass to search.

  5. Locate the dataset you would like to export in the search results.
  6. Click the down arrow next to the Export button and select the GenomeSpace radio button.

  7. Click the Export button (which should now have the GenomeSpace icon on it).

  8. You may see a message stating that "the dataset you requested is being prepared" and that you will be notified via email when your dataset is complete. If this happens, click the link in the email to start sending the dataset to GenomeSpace. Alternatively, you may see a message confirming that the dataset has been exported to GenomeSpace. Click Ok.


Load Data from UCSC Genome Browser to GenomeSpace

1. Launch UCSC Table Browser from GenomeSpace.

Click the UCSC Table Browser icon in the Data Sources toolbar to launch it.

2. Locate data in UCSC Table Browser and send to GenomeSpace.

  1. Enter your selections to retrieve the data track you want.
  2. In the output format line, select the check box next to GenomeSpace.

  3. In the output file field, specify the name of your output file.

There are two options for this:

NOTE: You must add the correct extension to your file name. For instance, if you select BED in the output format drop-down, you need to add .bed to your output file name. The UCSC Table Browser does not do this for you, and some of the GenomeSpace tools depend on the file extension to determine the file type.

  4. Click get output. This saves your track file to the specified GenomeSpace directory.
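
Because the extension has to be added by hand, it can help to check the output file name before typing it into the output file field. The following is a minimal sketch of such a check, assuming a hypothetical `FORMAT_EXTENSIONS` mapping and `with_extension` helper; only the BED to .bed pairing comes from the note above, and any other formats would need to be added by you.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from the selected output format to the extension
# GenomeSpace tools expect. Only BED -> .bed is taken from this guide.
FORMAT_EXTENSIONS = {"BED": ".bed"}


def with_extension(file_name: str, output_format: str) -> str:
    """Return file_name with the extension for output_format appended if it is missing."""
    extension = FORMAT_EXTENSIONS[output_format]
    # Compare case-insensitively so "track.BED" is not extended a second time.
    if PurePosixPath(file_name).suffix.lower() == extension:
        return file_name
    return file_name + extension


if __name__ == "__main__":
    print(with_extension("my_track", "BED"))
    print(with_extension("my_track.bed", "BED"))
```

An unknown format raises a `KeyError` in this sketch, which is deliberate: guessing an extension would defeat the purpose of the check.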

Load Data from Synapse to GenomeSpace

1. Launch Synapse from GenomeSpace.

  1. Click the Synapse icon in the toolbar to launch Synapse.
  2. Click Login or Register for a Synapse Account and enter the required credentials.

2. Locate data in Synapse and send to GenomeSpace.

  1. Using the search dialog at the bottom of the page, search for data in Synapse.  For better results, set the dropdown menu to the left of the search box to "All Types".
  2. Select the desired project from the search results.
  3. Select the desired file by navigating the folders of available datatypes (if applicable) on the project overview page.

3. Upload the selected dataset to GenomeSpace.

  1. On the right-hand side of the page, click the Tools icon and select Upload to GenomeSpace. Note: You may need to disable pop-up blocking in your browser.
  2. Once the pop-up dialog opens, select the desired target directory from the tree browser, rename the file (if desired) and click Submit.  Your file will be uploaded to the specified directory under your username.

Load a Gene List from Reactome Pathway into GenomeSpace

Summary

This recipe outlines how to use the Reactome pathway browser to identify a list of genes or proteins in a pathway, then save the list as a file in your GenomeSpace data store.

Input:

For more information about the Reactome pathway browser, see http://wiki.reactome.org/index.php/Usersguide#The_Pathway_Browser

 

Recipe Details

1. Click on the Reactome icon in the GenomeSpace toolbar.

You will be sent to the Reactome website.

2. Click on Browse Pathways.

This will take you to the Reactome pathway browser.

3. You can navigate the pathway hierarchy by clicking on the symbol on the left side of the pathway labels. Open Apoptosis -> Regulation of Apoptosis.

4. From the tabs below the pathway diagram, select Molecules, then click the Download button.

This takes you to the Download form.  By default, all fields (buttons) are selected, as indicated by darker shading.

5. De-select all fields except Uniprot ID and Gene Name by clicking on each button that you do not want selected.

6. Click on View to preview the gene list.

This shows a preview of the text that will be saved to a file.  Make adjustments to your selection as required.

7. To save to your GenomeSpace data store, click the GenomeSpace button.

A window will pop up with a GenomeSpace dialog.

8. Select a file name and location to save the file to in your GenomeSpace data store.  Click Submit to save the file, then click Close.

9. Return to the GenomeSpace website. After you refresh the page, your gene list should appear there.

Pathway Enrichment Analysis with Reactome

Summary

This recipe outlines how to perform pathway enrichment analysis on the Reactome website using a list of gene or protein IDs in GenomeSpace. Given a list of gene identifiers, the goal is to query the Reactome database for pathways whose components include the proteins in the supplied list. Enriched pathways are then displayed with the specified proteins highlighted, and the diagrams can be saved back to GenomeSpace.

Input:

Input Formats:

A list of NCBI/Entrez gene ids:

2
21
10257
8038
...

A list of UniProt ids:

O00139
O00186
O00187
O00204
O00217
...

Tab-delimited expression data:

#Probeset 10h_control 10h 14h 18h 24h
200000_s_at 9.381569 9.710802 9.874874 9.934639 9.495911
200002_at 12.555275 12.511045 12.564419 12.538642 12.439174
200003_s_at 12.401259 12.054083 12.275169 12.206342 12.015476
...

For more information about input data, see http://wiki.reactome.org/index.php/Usersguide#Gene_list_Dataset
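
Before launching Reactome on a file, it can be useful to confirm that the file looks like one of the three input formats shown above. The sketch below is an assumption-laden helper, not part of GenomeSpace or Reactome: `classify_input` and its return labels are hypothetical, Entrez gene IDs are assumed to be plain integers, the UniProt pattern follows the accession format published by UniProt, and expression rows are assumed to be an identifier followed by tab-separated numeric values.

```python
import re

# UniProt accession pattern (6 or 10 characters), as published by UniProt.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
ENTREZ_GENE_ID = re.compile(r"[0-9]+")  # assumption: plain integer identifiers


def _is_number(value: str) -> bool:
    try:
        float(value)
    except ValueError:
        return False
    return True


def _is_expression_row(row: str) -> bool:
    # An identifier followed by at least one numeric, tab-separated value.
    fields = row.split("\t")
    return len(fields) >= 2 and all(_is_number(field) for field in fields[1:])


def classify_input(text: str) -> str:
    """Guess which of the three input formats the text matches (hypothetical labels)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    rows = [line for line in lines if not line.startswith("#")]  # skip header lines
    if not rows:
        return "empty"
    if all(ENTREZ_GENE_ID.fullmatch(row) for row in rows):
        return "entrez_gene_ids"
    if all(UNIPROT_ACCESSION.fullmatch(row) for row in rows):
        return "uniprot_ids"
    if all(_is_expression_row(row) for row in rows):
        return "expression_data"
    return "unknown"


if __name__ == "__main__":
    print(classify_input("2\n21\n10257\n8038"))
    print(classify_input("O00139\nO00186\nO00187"))
    print(classify_input("#Probeset\t10h_control\t10h\n200000_s_at\t9.381569\t9.710802"))
```

A result of "unknown" usually points to mixed identifier types or a delimiter other than a tab, both of which are worth fixing before the file is sent to Reactome.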

 

Recipe Details

1. Drag your gene list file onto the Reactome icon, or click the icon and select Launch on File, then select your gene list file.

You will be sent to the Reactome website.

2. When the analysis is complete, click on one of the top-level pathways on the left-hand side of the screen.

This will take you to the Reactome pathway browser.

3. Navigate the pathway hierarchy by clicking on the symbol on the left side of the pathway labels.

The pathway diagrams are colorized according to the number of enriched proteins.

4. Once you have selected your pathway, click the icon and select Save Diagram to GenomeSpace.

A window will pop up with a GenomeSpace dialog.

5. Select a location to save the file to in your GenomeSpace data store. Click Submit to save the file, then click Close.

6. Return to the GenomeSpace website. After you refresh the page, your pathway diagram file should appear there.