
Documentation

The objective of this server is to automate data analysis for the processing of metabolomics experiments.

Using the web server

Here is how it works:

  • You need a file that contains metabolite concentrations or area counts from MS/MS metabolomics experiments in semicolon-separated values format. The server expects the samples in the rows and the metabolites in the columns of the data file. Export files from the Biocrates MetIQ software containing metabolite concentration and validation data (csv export file) can be uploaded to the server directly. Export files in Excel xls format have to be converted into the semicolon-separated values table format before being accepted. Please make sure that you use points "." as the decimal separator in your table files. Please note that commas and quotes are deleted by our server. Sample and metabolite names must not contain semicolons. To analyze group or phenotype data you also need a semicolon-separated values table file containing the unique sample IDs in the first column and the group/phenotype data in subsequent columns.
  • To submit a new job, go to the "Start a new run" page.
  • Upload a data file (or paste the data into the window).
  • Select a data format (currently the formats 'AbsoluteIDQ', 'Quant. data', and 'Quant. data with KEGG IDs' are supported).
  • Choose whether to use the validation data and set further options for the data analysis. More information on the different options can be found below, or follow the help links on the submission page.
  • Optionally enter a job identifier as a title for all output and to identify your job on the job status page.
  • Leave your job status set to "private" (default) if the submitted dataset is confidential, or untick the checkbox to make your data freely visible on the internet.
  • If you wish to be notified by e-mail, enter your e-mail address (recommended if your job is kept private). Your e-mail address will never be shown on the internet and is used ONLY to notify you about your job status. If you enter a pseudonym (any name without an @ character), this will be used to identify your jobs on the job status page.
  • Optionally upload a table file in semicolon-separated values format with group/phenotype data (or paste the data into the window). The file has to contain the Sample bar code information in the first column and your group definitions in subsequent columns (see below).
  • Enter additional job parameters in the field "Auxiliary data" or upload a file containing the options (see below).
  • Submit your job by pressing the "submit" button.
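
Before uploading, it can help to check a table file against the format rules listed above. The following is a minimal sketch in Python, not part of the server; the function name check_table is hypothetical, and it only tests for a consistent number of semicolon-separated fields and for commas or quotes, which the server would delete.

```python
def check_table(text):
    """Return a list of warnings for a semicolon-separated table (hypothetical helper)."""
    rows = [line.split(";") for line in text.splitlines() if line.strip()]
    if not rows:
        return ["table is empty"]
    warnings = []
    width = len(rows[0])  # the header row defines the expected number of fields
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            warnings.append(f"row {number}: expected {width} fields, found {len(row)}")
        for field in row:
            # Commas and quotes are deleted by the server, so "26,3" would become "263".
            if "," in field or '"' in field:
                warnings.append(f"row {number}: field {field!r} contains a comma or quote")
    return warnings
```

An empty list means that none of these simple checks failed; it does not guarantee that the server will accept the file.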

This is what the server will do next:

  • Your job will be entered into the queue. You can check its status using the "Job status" page. Its status should be one of "spooled", "running", or "finished". The number of jobs run in parallel is controlled by a load balancing system.
  • When the job is started, the server will read in all concentration, phenotype, and auxiliary data and run the chosen R-script on them. The webpage will be updated every 60 seconds with a notification about the current status of your job.
  • After the job has finished, a summary page will be displayed. From there you have access to the graphical (pdf) and tabular output of the statistical analysis. Log files of the processed steps are also generated.

How to read the results from the server:

  • In order to describe the results produced by the server, we provide two walk-through examples.

This server is bound to evolve in the near future to acquire additional functionality. All comments are welcome!

PS: As this is a freely accessible not-for-profit web service, it comes with absolutely no warranty (see imprint).

Help on job options

This is a brief summary of the different job options and input file formats.
AbsoluteIDQ format:
The Biocrates AbsoluteIDQ Kit comes with its own data handling and analysis software, MetIQ. Metabolite concentrations can be exported with several options and file formats, including csv, txt, and xls. The metaP web server accepts files in csv format exported from MetIQ with "." as the decimal separator and without the options "Transpose values" and "Export KEGG Identifiers" (for a list of database IDs of metabolites including KEGG IDs, see the metabolite info page). These files begin with three header lines followed by an empty line and the data for individual samples. The first columns hold general information about the sample and the measurement, followed by a structure of two columns per metabolite, where the first column contains the concentration in μM and the second the validation info. This structure is recognized automatically by the R-scripts.

Concentration [μM]
Plate bar code;Sample bar code;Sample Type;Sample Identification;...;Arg-PTC;;Gln-PTC;;...
;;;;;;;;;;;;;;;Class;aminoacids;;aminoacids;;...
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;...
1000123456;1000789123;Sample;plasma_xyz;...;26.3;Out of Quant.;33.3;Valid;...
1000123456;1000789124;Sample;plasma_abc;...;121;Valid;159;Valid;...
1000123456;1000789125;Sample;plasma_def;...;0.1;< LOD;0.3;< LOD;...

MetIQ exports in the Microsoft Excel (xls) format lack the validation columns. If you convert these files into csv format, they are also accepted by this web server. In that case the R-scripts assume that all values are valid and that the concentration data block begins with the metabolite "Arg.PTC". If this is not the case in your file, you have to set the parameter first_metab (see below).
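
To illustrate the layout described above (three header lines, an empty line, then one row per sample with two columns per metabolite), here is a minimal Python sketch. It is not the server's R code: the function name parse_absoluteidq, the shortened example table, and the rule used to find the first metabolite column (the first named column followed by an unnamed one) are assumptions made for illustration only.

```python
# Shortened, hypothetical example in the layout described above.
EXAMPLE = "\n".join([
    "Concentration [μM]",
    "Plate bar code;Sample bar code;Sample Type;Sample Identification;Arg-PTC;;Gln-PTC;",
    ";;;Class;aminoacids;;aminoacids;",
    ";;;;;;;",
    "1000123456;1000789123;Sample;plasma_xyz;26.3;Out of Quant.;33.3;Valid",
    "1000123456;1000789124;Sample;plasma_abc;121;Valid;159;Valid",
])

def parse_absoluteidq(text):
    """Map each sample bar code to {metabolite: (concentration, validation)}."""
    lines = text.splitlines()
    names = lines[1].split(";")  # second header line holds the column names
    # Assumption: the metabolite block starts at the first named column
    # that is followed by an unnamed (validation) column.
    start = next(i for i in range(len(names) - 1) if names[i] and not names[i + 1])
    barcode_col = names.index("Sample bar code")
    samples = {}
    for line in lines[4:]:  # skip the three header lines and the empty line
        fields = line.split(";")
        if len(fields) < len(names) or not fields[barcode_col]:
            continue  # ignore empty or incomplete rows
        values = {}
        for i in range(start, len(names) - 1, 2):
            values[names[i]] = (float(fields[i]), fields[i + 1])
        samples[fields[barcode_col]] = values
    return samples
```

Calling parse_absoluteidq(EXAMPLE) yields one entry per sample bar code, each holding a concentration and a validation label per metabolite.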
Data format:
Currently the formats 'AbsoluteIDQ', 'Quant. data', and 'Quant. data with KEGG IDs' are supported.
Validation data:
Please select here whether to drop non-valid data or to analyse the whole dataset.
Outliers:
Please select here whether to drop outliers in the data or to analyse the whole dataset.
Noisy metabolites:
Please select here whether to drop metabolites that show a coefficient of variation (CV) above 0.25 for reference samples.
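
As an illustration of this filter, the following Python sketch flags metabolites whose coefficient of variation across the reference samples exceeds 0.25. It is an assumption-laden example rather than the server's R code: the function name noisy_metabolites is hypothetical, the CV is taken as sample standard deviation divided by mean, and the values are made up.

```python
from statistics import mean, stdev

def noisy_metabolites(reference_values, threshold=0.25):
    """Return names of metabolites whose CV over the reference samples exceeds threshold."""
    noisy = []
    for name, values in reference_values.items():
        average = mean(values)
        if average == 0:
            continue  # CV is undefined for a zero mean; skipped in this sketch
        cv = stdev(values) / average  # assumption: sample standard deviation
        if cv > threshold:
            noisy.append(name)
    return noisy

# Made-up reference measurements for two metabolites.
references = {"Arg.PTC": [100, 102, 98], "Gln.PTC": [10, 20, 30]}
print(noisy_metabolites(references))  # prints ['Gln.PTC']
```

Here Arg.PTC has a CV of 0.02 and is kept, while Gln.PTC has a CV of 0.5 and would be dropped.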
References:
Please select here whether to drop the references/replicates after quality checks.
Ratios:
Please select here whether to calculate ratios. Please note that the calculation of ratios is rather time-consuming. Jobs can take more than one hour.
Phenotype data:
Phenotype data has to be submitted in csv file format or, alternatively, can be pasted into the specified field. After a sample ID in the first column (Sample bar code), the phenotype/group definitions are added in separate columns. The first row contains the header with the group names, while the second row holds information on the type of data. This information is used to decide which statistical test to apply. Currently only the classifications "ordered" and "unordered" are supported (numerical or alphabetic order is implied). The following rows contain the information on individual samples.

Sample bar code;age;sex;genotype;dosage;...
;ordered;unordered;unordered;ordered;...
1000789123;25;female;wt;low;...
1000789124;28;male;mutant;middle;...
1000789125;21;female;mutant;strong;...

The easiest way to generate the phenotype data file is to copy the "Sample bar code" column, and optionally also the "Plate bar code" and "Sample Identification" columns, from the MetIQ export into a new spreadsheet table, where you can fill in the phenotype columns.
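
The phenotype layout described above (header row, type row, then one row per sample) can be checked with a short script before upload. This is a minimal Python sketch, not the server's own parser; the function name parse_phenotypes is hypothetical, and the only rule it enforces is that each type is "ordered" or "unordered".

```python
def parse_phenotypes(text):
    """Return (column types, per-sample values) from a phenotype table."""
    rows = [line.split(";") for line in text.splitlines() if line.strip()]
    header, types, data = rows[0], rows[1], rows[2:]
    # The first column holds the sample bar code, so types start at column 2.
    column_types = dict(zip(header[1:], types[1:]))
    for name, kind in column_types.items():
        if kind not in ("ordered", "unordered"):
            raise ValueError(f"column {name!r}: unsupported type {kind!r}")
    samples = {row[0]: dict(zip(header[1:], row[1:])) for row in data}
    return column_types, samples

table = "\n".join([
    "Sample bar code;age;sex;genotype;dosage",
    ";ordered;unordered;unordered;ordered",
    "1000789123;25;female;wt;low",
    "1000789124;28;male;mutant;middle",
])
column_types, samples = parse_phenotypes(table)
```

A misspelled type such as "unorded" raises a ValueError in this sketch, which makes such typos visible before the job is submitted.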
Auxiliary data:
Here you may enter additional job parameters or upload a parameter file containing the options. Note that the options have to be separated by semicolons (and optionally by new lines). Currently supported options (default values are given) are:

barcode_equal_y=true;
first_metab=Arg.PTC

  • barcode_equal_y=true/false
    use equal y dimensions on barplots for different groups (default) or scale each barplot individually according to its maximal value.
  • first_metab=Arg.PTC
    tells the server with which metabolite the concentration data block begins in the file. This option is useful when you submit csv data files that were generated from original MetIQ exports in Excel format. Because these files lack the validation columns, the R-scripts are generally unable to recognize the beginning of the data structure. Arg.PTC is used as the default value, as it is typically the first metabolite in whole dataset exports. The name of the metabolite has to be given exactly as it appears in the output files of this web server. Note that all special characters like ",_-():" are transformed to "." by the R-scripts.
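
The option syntax and the name transformation described above can be sketched in Python as follows. This is an illustration under assumptions, not the server's implementation: the function names parse_options and normalize_name are hypothetical, and the character set is limited to the six characters listed above, although the R-scripts may transform others as well.

```python
import re

# Default values as given in the option list above.
DEFAULTS = {"barcode_equal_y": "true", "first_metab": "Arg.PTC"}

def parse_options(text):
    """Parse 'key=value' options separated by semicolons and/or new lines."""
    options = dict(DEFAULTS)
    for item in re.split(r"[;\n]", text):
        item = item.strip()
        if not item:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return options

def normalize_name(name):
    """Replace the listed special characters with '.' (assumed character set)."""
    return re.sub(r"[,_\-():]", ".", name)

print(normalize_name("Arg-PTC"))  # prints Arg.PTC
```

With this sketch, a metabolite shown as "Arg-PTC" in the MetIQ export would be passed as first_metab=Arg.PTC.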
R-script selection:
Currently only one R-script for standard analysis is publicly available. Further scripts for special analysis and metaP internal use will be implemented here.

Frequently asked questions

Change log

Here we log the major modifications to the web server.
September 2009
Major revision and extension of functionality. Documentation added.
24 January 2009
Migration of the metaP server to a new server; previous jobs are still available at this link.
07 October 2008
Extended graphics to full metabolite set.
06 October 2008
Setup of the server based on MassTRIX scripts.





KEGG Data is provided by the Kanehisa Laboratories for academic use. Any commercial use of KEGG data requires a license agreement from Pathway Solutions Inc.
The Helmholtz Zentrum München imprint applies.

This page is maintained by Gabi Kastenmüller and Werner Römisch-Margl.
Last modification: 28 December 2009
