differentialabundance: Parameters

Define where the pipeline should find input data and save output data.

A string to identify results in the output directory

required

type: string

default: study

A string identifying the technology used to produce the data

required

type: string

Path to comma-separated file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.(csv|tsv|txt)$
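
To see which paths the pattern above accepts, a minimal Python sketch follows. The helper name and the example paths are hypothetical and chosen only for illustration; the pattern itself is copied from the description.

```python
import re

# Pattern copied from the parameter description above.
SHEET_PATTERN = re.compile(r"^\S+\.(csv|tsv|txt)$")

def is_valid_sheet_path(path: str) -> bool:
    """Return True when the path contains no spaces and ends in .csv, .tsv or .txt."""
    return SHEET_PATTERN.search(path) is not None

# Hypothetical paths for illustration only.
for candidate in ["samplesheet.csv", "data/samples.tsv", "my samples.csv", "samples.xlsx"]:
    print(candidate, is_valid_sheet_path(candidate))
```

Paths containing spaces or ending in another extension are rejected, so a spreadsheet export would need to be saved as CSV or TSV first.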

A CSV file describing sample contrasts

required

type: string

pattern: ^\S+\.(csv|tsv|txt)$

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

required

type: string

Type of abundance measure used, platform-dependent

required

type: string

default: counts

Ways of providing your abundance values

TSV-format abundance matrix

type: string

pattern: ^\S+\.(tsv|csv|txt)$

Alternative to matrix: a compressed CEL files archive such as often found in GEO

type: string

default: None

Use SOFT files from GEO by providing the GSE study identifier

type: string

default: None

Column in the samples sheet to be used as the primary sample identifier

required

type: string

default: sample

Type of observation

required

type: string

default: sample

Column in the sample sheet to be used as the display identifier for observations

type: string

default: sample

Options related to features

Feature ID attribute in the GTF file (e.g. the gene_id field)

required

type: string

default: gene_id

Feature name attribute in the GTF file (e.g. the gene symbol field)

required

type: string

default: gene_name

Type of feature we have, often ‘gene’

required

type: string

default: gene

When set, use the control features in scaling/normalisation

type: boolean

A text file listing technical features (e.g. spikes)

type: string

Comma-separated string, specifies feature metadata columns to be used for exploratory analysis, platform-specific

type: string

default: gene_id,gene_name,gene_biotype

This parameter allows you to supply your own feature annotations. These can often be automatically derived from the GTF used upstream for RNA-seq, or from the Bioconductor annotation package (for affy arrays).

type: string

pattern: ^\S+\.(csv|tsv|txt)$

Where a GTF file is supplied, which feature type to use

type: string

default: transcript

Where a GTF file is supplied, which field should go first in the converted output table

type: string

default: gene_id

Assays for which to compute the log2. Not necessary for MaxQuant data, as this is controlled by the pipeline.

type: string

Options for processing of affy arrays with justRMA()

Column of the sample sheet containing the Affymetrix CEL file name

type: string

default: file

logical value. If TRUE, then background correct using RMA background correction.

type: boolean

default: true

integer value indicating which RMA background to use

type: integer

default: 2

logical value. If TRUE, then works on the PM matrix in place as much as possible, good for large datasets.

type: boolean

Used to specify the name of an alternative cdf package. If set to NULL, then the usual cdf package based on Affymetrix’ mappings will be used.

type: string

default: None

logical value. If TRUE, a matrix of probe annotations will be derived.

type: boolean

default: true

Should the spots marked as ‘MASKS’ be set to NA?

type: boolean

Should the spots marked as ‘OUTLIERS’ be set to NA?

type: boolean

If TRUE, then overrides what is in rm.mask and rm.outliers.

type: boolean

Options for processing of proteomics MaxQuant tables with the Proteus R package

Prefix of the column names of the MaxQuant proteingroups table in which the intensity values are saved; the prefix has to be followed by the sample names that are also found in the samplesheet. Default: ‘LFQ intensity ’; take care to also consider trailing whitespace between prefix and samplenames.

type: string

default: LFQ intensity
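
The note about trailing whitespace can be illustrated with a short Python sketch. It assumes that intensity column names are formed by directly concatenating the prefix and the sample name, as the description suggests; the column names, sample names and helper function are hypothetical.

```python
from typing import List

# Hypothetical proteinGroups column names and sample names, for illustration only.
columns = ["Protein IDs", "LFQ intensity sampleA", "LFQ intensity sampleB"]
samples = ["sampleA", "sampleB"]

def matched_samples(prefix: str, columns: List[str], samples: List[str]) -> List[str]:
    """Return the samples whose prefixed name appears exactly among the columns."""
    return [s for s in samples if prefix + s in columns]

with_space = matched_samples("LFQ intensity ", columns, samples)    # prefix ends in a space
without_space = matched_samples("LFQ intensity", columns, samples)  # trailing space dropped
print(with_space)
print(without_space)
```

Under this assumption, only the prefix that keeps its trailing space lines up with the sample names.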

Normalization function to use on the MaxQuant intensities.

type: string

Which method to use for plotting sample distributions of the MaxQuant intensities; one of ‘violin’, ‘dist’, ‘box’.

type: string

Should a loess line be added to the plot of mean-variance relationship of the conditions? Default: true.

type: boolean

default: true

Valid R palette name

type: string

default: Set1

Number of decimals to round the MaxQuant intensities to; default: -1 (will not round).

type: number

default: -1

Options related to filtering upstream of differential analysis

Minimum abundance value

required

type: number

default: 1

Minimum number of observations that must pass the threshold to retain the row/feature (e.g. gene).

type: number

default: 1

A minimum proportion of observations, given as a number between 0 and 1, that must pass the threshold. Overrides minimum_samples

type: number

An optional grouping variable to be used to calculate a min_samples value

type: string
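
One possible reading of how the filtering options above interact is sketched below in Python. This is not the pipeline's implementation: the function and argument names are hypothetical, an observation is assumed to "pass" when its value is at least the minimum abundance, the proportion is assumed to be converted to a count by rounding up, and the optional grouping variable is left out.

```python
import math

def keep_feature(values, min_abundance=1, min_samples=1, min_proportion=None):
    """Decide whether one feature row is retained.

    Assumption: an observation passes when its value is >= min_abundance.
    When min_proportion is given it takes precedence over min_samples.
    """
    if min_proportion is not None:
        # Assumption: the proportion becomes a required count by rounding up.
        required = math.ceil(min_proportion * len(values))
    else:
        required = min_samples
    passing = sum(1 for v in values if v >= min_abundance)
    return passing >= required

# Hypothetical abundance matrix: feature name -> values across four observations.
matrix = {"geneA": [0, 0, 0, 5], "geneB": [0, 0, 0, 0], "geneC": [3, 4, 0, 9]}

kept_default = [name for name, row in matrix.items() if keep_feature(row)]
kept_half = [name for name, row in matrix.items() if keep_feature(row, min_proportion=0.5)]
print(kept_default)
print(kept_half)
```

With the defaults, a single observation at or above the threshold is enough to keep a feature; requiring a proportion of 0.5 is stricter and drops features detected in only one of four observations.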

Options related to data exploration

Clustering method used in dendrogram creation

required

type: string

default: ward.D2

Correlation method used in dendrogram creation

required

type: string

default: spearman

Number of features selected before certain exploratory analyses

required

type: integer

default: 500

Length of the whiskers in boxplots as multiple of IQR. Defaults to 1.5.

type: number

default: 1.5
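
As a rough illustration of what the whisker multiple means, the following Python sketch computes whisker limits from the quartiles. It uses `statistics.quantiles` with the inclusive method; plotting libraries may estimate quartiles differently, and the data and function name are hypothetical.

```python
import statistics

def whisker_limits(values, whisker_multiple=1.5):
    """Return (lower, upper) limits at whisker_multiple * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whisker_multiple * iqr, q3 + whisker_multiple * iqr

# Hypothetical abundance values with one extreme observation.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
lower, upper = whisker_limits(data)
beyond = [v for v in data if v < lower or v > upper]
print(lower, upper, beyond)
```

Values outside these limits would typically be drawn as individual points rather than covered by the whiskers.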

Threshold on MAD score for outlier identification

type: integer

default: -5

How should the main grouping variable be selected? ‘auto_pca’, ‘contrasts’, or a valid column name from the observations table.

required

type: string

default: auto_pca

Specifies assay names to be used for matrices, platform-specific

hidden

type: string

default: raw,normalised,variance_stabilised

Specifies final assay to be used for exploratory analysis, platform-specific

hidden

type: string

default: variance_stabilised

Valid R palette name

required

type: string

default: Set1

Options related to differential operations

The suffix associated with tabular differential results tables

required

type: string

default: .deseq2.results.tsv

The feature identifier column in differential results tables

required

type: string

default: gene_id

The fold change column in differential results tables

required

type: string

default: log2FoldChange

The p value column in differential results tables

type: string

default: pvalue

The q value column in differential results tables.

required

type: string

default: padj

Minimum fold change used to calculate differential feature numbers

required

type: number

default: 2

Maximum p value used to calculate differential feature numbers

required

type: number

default: 1

Maximum q value used to calculate differential feature numbers

required

type: number

default: 0.05
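
The three thresholds above could be combined as in the following Python sketch. It is an illustration rather than the pipeline's code: it assumes fold changes are stored on the log2 scale, that the minimum fold change is given on the linear scale, and that all comparisons are inclusive. The rows are made up; the column names are the defaults listed above.

```python
import math

def is_differential(row, min_fold_change=2, max_p=1, max_q=0.05):
    """Apply the three thresholds to one results row (a dict).

    Assumptions: log2-scale fold changes, a linear-scale minimum fold
    change, and inclusive comparisons.
    """
    return (
        abs(row["log2FoldChange"]) >= math.log2(min_fold_change)
        and row["pvalue"] <= max_p
        and row["padj"] <= max_q
    )

# Hypothetical results rows, for illustration only.
results = [
    {"gene_id": "g1", "log2FoldChange": 2.3, "pvalue": 0.0001, "padj": 0.002},
    {"gene_id": "g2", "log2FoldChange": -0.4, "pvalue": 0.01, "padj": 0.04},
    {"gene_id": "g3", "log2FoldChange": -1.8, "pvalue": 0.03, "padj": 0.2},
]
differential = [r["gene_id"] for r in results if is_differential(r)]
print(differential)
```

With a maximum p value of 1, the p value threshold is effectively disabled and the q value does the filtering.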

Where a features file (GTF) has been provided, which attribute to use to name features

type: string

default: gene_name

Indicate whether or not fold changes are on the log scale (default is to assume they are)

type: boolean

default: true

Valid R palette name

required

type: string

default: Set1

In differential analysis (DESeq2 or limma), subset to the contrast samples before modelling variance?

type: boolean

test parameter passed to DESeq()

type: string

fitType parameter passed to DESeq()

type: string

sfType parameter passed to DESeq()

type: string

‘minReplicatesForReplace’ parameter passed to DESeq()

type: integer

default: 7

useT parameter passed to DESeq()

type: boolean

independentFiltering parameter passed to results()

type: boolean

default: true

lfcThreshold parameter passed to results()

type: integer

altHypothesis parameter passed to results()

type: string

default: greaterAbs

pAdjustMethod parameter passed to results()

type: string

default: BH

alpha parameter passed to results()

type: number

default: 0.1

minmu parameter passed to results()

type: number

default: 0.5

variance stabilisation method to use when making a variance stabilised matrix

type: string

Shrink fold changes in results?

type: boolean

default: true

Number of cores

type: integer

default: 1

blind parameter for rlog() and/or vst()

type: boolean

default: true

nsub parameter passed to vst()

type: integer

default: 1000

passed to lmFit(), positive integer giving the number of times each distinct probe is printed on each array.

type: number

passed to lmFit(), positive integer giving the spacing between duplicate occurrences of the same probe, spacing=1 for consecutive rows.

type: string

default: None

Sample sheet column to be used to derive a vector or factor specifying a blocking variable on the arrays

type: string

default: None

passed to lmFit(), the inter-duplicate or inter-technical replicate correlation

type: string

default: None

passed to lmFit(), the fitting method

type: string

passed to eBayes(), a numeric value between 0 and 1, assumed proportion of genes which are differentially expressed

type: number

default: 0.01

passed to eBayes(), logical, should an intensity-dependent trend be allowed for the prior variance?

type: boolean

passed to eBayes(), logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?

type: boolean

passed to eBayes(), comma-separated string of two values, assumed lower and upper limits for the standard deviation of log2-fold-changes for differentially expressed genes

type: string

default: 0.1,4

passed to eBayes(), comma-separated string of length 1 or 2, giving left and right tail proportions of x to Winsorize. Used only when robust=TRUE.

type: string

default: 0.05,0.1

passed to topTable(), minimum absolute log2-fold-change required

type: integer

passed to topTable(), logical, should 95% confidence intervals be output for logFC? Alternatively, can take a numeric value between zero and one specifying the confidence level required.

type: boolean

passed to topTable(), method used to adjust the p-values for multiple testing.

type: string

cutoff value for adjusted p-values. Only genes with lower p-values are listed.

type: number

default: 1

Set to run GSEA to infer differential gene sets in contrasts

type: boolean

Permutation type

type: string

Number of permutations

type: integer

default: 1000

Enrichment statistic

type: string

Metric for ranking genes

type: string

Gene list sorting mode

type: string

Gene list ordering mode

type: string

Max size: exclude larger sets

type: integer

default: 500

Min size: exclude smaller sets

type: integer

default: 15

Normalisation mode

type: string

Randomization mode

type: string

Make detailed geneset report?

type: boolean

default: true

Use median for class metrics

type: boolean

Number of markers

type: integer

default: 100

Plot graphs for the top sets of each phenotype

type: integer

default: 20

Seed for permutation

type: string

default: timestamp

Save random ranked lists

type: boolean

Make a zipped file with all reports

type: boolean

Gene sets in GMT or GMX-format (multiple comma-separated input files are possible)

type: string

default: None

Should a Shiny app be built?

type: boolean

default: true

Should the app be deployed to shinyapps.io?

type: boolean

Your shinyapps.io account name

type: string

default: None

The name of the app to push to in your shinyapps.io account

type: string

default: None

Should we guess the log status of matrices and unlog for the app?

type: boolean

default: true

Rmd report template from which to create the pipeline report

required

type: string

pattern: ^\S+\.Rmd$

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

A logo to display in the report instead of the generic pipeline logo

required

type: string

default: docs/images/nf-core-differentialabundance_logo_light.png

CSS to use to style the output, in lieu of the default nf-core styling

required

type: string

default: assets/nf-core_style.css

A markdown file containing citations to include in the final report

type: string

default: CITATIONS.md

A title for reporting outputs

type: string

default: None

An author for reporting outputs

type: string

default: None

A description for reporting outputs

type: string

default: None

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Genome annotation file in GTF format

type: string

pattern: ^\S+\.gtf(\.gz)?

Do not load the iGenomes reference config.

hidden

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|d|day)\s*)+$
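
The memory and time patterns above can be tried against candidate values with a short Python sketch. The patterns are copied from the descriptions; the helper name and the test strings are hypothetical.

```python
import re

# Patterns copied from the memory and time parameter descriptions above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")

def matches(pattern, value):
    """Return True when the value satisfies the given compiled pattern."""
    return pattern.search(value) is not None

# Hypothetical values for illustration only.
for value in ["128.GB", "8 GB", "1.5TB", "128G"]:
    print("memory", value, matches(MEMORY_PATTERN, value))
for value in ["240.h", "2d 12h", "90min"]:
    print("time", value, matches(TIME_PATTERN, value))
```

A memory value must end in `B`, and a time value may chain several number and unit pairs, but only the listed unit spellings are accepted.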

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

type: string

Email address for completion summary, only when pipeline fails.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Boolean whether to validate parameters against the schema at runtime

type: boolean

default: true

Show all params when using --help

hidden

type: boolean

Validation of parameters fails when an unrecognised parameter is found.

hidden

type: boolean

Validation of parameters in lenient mode.

hidden

type: boolean
