differentialabundance: Parameters

Define where the pipeline should find input data and save output data.

A string identifier used to name result files in the output directory

required

type: string

default: study

A string identifying the technology used to produce the data

required

type: string

Path to CSV/TSV file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.(csv|tsv)$

A CSV/TSV file describing sample contrasts to compare groups.

type: string

pattern: ^\S+\.(csv|tsv)$

A YAML file describing sample contrasts to compare groups.

type: string

pattern: ^\S+\.(yaml|yml)$

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Type of abundance measure used, platform-dependent.

required

type: string

To how many digits should numeric output in different modules be rounded? If -1 or null, will not round.

type: integer

default: 4
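
As a rough illustration of the rounding rule described above, the Python sketch below treats -1 and a missing value (null) as "leave the numbers untouched". The function name `round_output` is hypothetical and not part of the pipeline; the pipeline's own modules may handle edge cases differently.

```python
from typing import List, Optional, Sequence


def round_output(values: Sequence[float], digits: Optional[int] = 4) -> List[float]:
    """Round values to `digits` decimal places (hypothetical helper).

    A value of -1 or None disables rounding, mirroring the description above.
    """
    if digits is None or digits == -1:
        return list(values)
    return [round(v, digits) for v in values]


if __name__ == "__main__":
    print(round_output([1.23456789, 0.5]))          # rounded to 4 digits (the default)
    print(round_output([1.23456789], digits=-1))    # returned unchanged
    print(round_output([1.23456789], digits=None))  # returned unchanged
```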

Ways of providing your abundance values

TSV/CSV-format abundance matrix

type: string

pattern: ^\S+\.(tsv|csv)$|\S*proteinGroups\.txt$
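
To see which file names the pattern above accepts, it can be tried out locally. The sketch below copies the pattern verbatim and applies it with Python's `re.search`, on the assumption that the validator uses unanchored search semantics as JSON Schema does; the helper name `looks_like_matrix_path` and the example paths are made up for illustration.

```python
import re

# Pattern copied verbatim from the parameter description above.
MATRIX_PATTERN = re.compile(r"^\S+\.(tsv|csv)$|\S*proteinGroups\.txt$")


def looks_like_matrix_path(path: str) -> bool:
    """Return True if `path` satisfies the pattern under search semantics (assumption)."""
    return MATRIX_PATTERN.search(path) is not None


if __name__ == "__main__":
    # Hypothetical example paths, not taken from the pipeline documentation.
    for candidate in [
        "counts.tsv",
        "results/proteinGroups.txt",
        "my counts.tsv",   # contains whitespace
        "counts.txt",      # wrong extension
        "counts.tsv.gz",   # compressed matrix
    ]:
        print(candidate, looks_like_matrix_path(candidate))
```

Only the first alternative is anchored at the start of the string, so under search semantics whitespace earlier in the path is rejected for .tsv/.csv files but not for a name ending in proteinGroups.txt.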

(RNA-seq only): optional transcript/gene length matrix with samples and transcript_ids/gene_ids as in the abundance matrix.

type: string

Alternative to matrix: a compressed CEL files archive such as often found in GEO

type: string

Use SOFT files from GEO by providing the GSE study identifier

type: string

Column in the sample sheet to be used as the primary sample identifier

required

type: string

default: sample

Type of observation

required

type: string

default: sample

Column in the sample sheet to be used as the identifier for observations. If unset, the value of --observations_id_col is used.

type: string

Options related to features

Feature ID attribute in the abundance table as well as in the GTF file (e.g. the gene_id field)

required

type: string

default: gene_id

Feature name attribute in the abundance table as well as in the GTF file (e.g. the gene symbol field)

required

type: string

default: gene_name

Type of feature. Often ‘gene’

required

type: string

default: gene

When set, use the control features in scaling/normalization.

type: boolean

A text file listing technical features (e.g. spikes)

type: string

Comma-separated string, specifies feature metadata columns to be used for exploratory analysis, platform-specific

type: string

default: gene_id,gene_name,gene_biotype

Supply your own feature annotations. Can be derived from the GTF (rnaseq) or from the Bioconductor annotation package (affy arrays).

type: string

pattern: ^\S+\.(csv|tsv)$

Analysis options related to the use of paramsheet to run multiple combinations of analyses (see usage docs for details).

Name of the paramset (as specified in assets/paramsheet) to run specific analyses

type: string

Path to a paramsheet file

type: string

pattern: ^\S+\.yaml$

Options for processing of affy arrays with justRMA()

Column of the sample sheet containing the Affymetrix CEL file name

type: string

default: file

logical value. If set to true, apply background correction using RMA.

type: boolean

default: true

integer value indicating which RMA background to use

type: integer

default: 2

logical value. If TRUE, then works on the PM matrix in place as much as possible, good for large datasets.

type: boolean

Used to specify the name of an alternative cdf package. If set to NULL, then the usual cdf package based on Affymetrix’ mappings will be used.

type: string

logical value. If TRUE, a matrix of probe annotations will be derived.

type: boolean

default: true

Should the spots marked as ‘MASKS’ be set to NA?

type: boolean

Should the spots marked as ‘OUTLIERS’ be set to NA?

type: boolean

If TRUE, overrides the values of rm.mask and rm.outliers.

type: boolean

Genome annotation file in GTF format

type: string

pattern: ^\S+\.gtf(\.gz)?

If a GTF file is supplied, which feature type to use

type: string

default: transcript

If a GTF file is supplied, which field should go first in the converted output table

type: string

default: gene_id

Options for processing of proteomics MaxQuant tables with the Proteus R package

Prefix of the column names of the MaxQuant proteingroups table in which the intensity values are saved; the prefix has to be followed by the sample names that are also found in the samplesheet. Default: ‘LFQ intensity’; will search for both the prefix as entered and the prefix followed by one whitespace.

type: string

default: LFQ intensity

Normalization function to use on the MaxQuant intensities.

type: string

Which method to use for plotting sample distributions of the MaxQuant intensities; one of ‘violin’, ‘dist’, ‘box’.

type: string

Should a loess line be added to the plot of mean-variance relationship of the conditions? Default: true.

type: boolean

default: true

Valid R palette name

type: string

default: Set1

Options related to filtering upstream of differential analysis

Minimum abundance value

required

type: number

default: 1

Minimum observations that must pass the threshold to retain the row/feature (e.g. gene).

type: number

default: 1

A minimum proportion of observations, given as a number between 0 and 1, that must pass the threshold. Overrides minimum_samples

type: number

An optional grouping variable to be used to calculate a min_samples value

type: string

A minimum proportion of observations, given as a number between 0 and 1, that must have a value (not NA) to retain the row/feature (e.g. gene).

type: number

default: 0.5

Minimum observations that must have a value (not NA) to retain the row/feature (e.g. gene). Overrides filtering_min_proportion_not_na.

type: number
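
The filtering options above combine an abundance threshold with a minimum number (or proportion) of observations, plus a separate check on missing values. The Python sketch below shows one plausible reading of how they interact for a single feature. The function and argument names are hypothetical, the inclusive comparisons are an assumption, and the grouping variable and the count-based NA option are left out for brevity.

```python
from typing import Optional, Sequence


def keep_feature(
    values: Sequence[Optional[float]],
    min_abundance: float = 1,
    min_samples: float = 1,
    min_proportion: Optional[float] = None,
    min_proportion_not_na: float = 0.5,
) -> bool:
    """Decide whether one row/feature survives filtering (illustrative only).

    `values` holds one abundance per observation; None stands for NA.
    """
    n_obs = len(values)
    observed = [v for v in values if v is not None]

    # NA filter: enough observations must carry a value at all.
    if len(observed) < min_proportion_not_na * n_obs:
        return False

    # A proportion, when given, overrides the absolute minimum count.
    required = min_samples
    if min_proportion is not None:
        required = min_proportion * n_obs

    # Assumption: "passing the threshold" means value >= min_abundance.
    passing = sum(1 for v in observed if v >= min_abundance)
    return passing >= required


if __name__ == "__main__":
    print(keep_feature([0, 0, 5, 7]))                      # two observations pass
    print(keep_feature([5, 0, 0, 0], min_proportion=0.5))  # only one of the two needed
    print(keep_feature([5, None, None, None]))             # too many missing values
```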

Set to run IMMUNEDECONV

type: boolean

Set method to run with IMMUNEDECONV. Available options can be found in ‘https://omnideconv.org/immunedeconv/articles/immunedeconv.html’

type: string

default: quantiseq

Set function to run with IMMUNEDECONV. Available options can be found in ‘https://omnideconv.org/immunedeconv/articles/immunedeconv.html’

type: string

default: deconvolute

Options related to data exploration

Clustering method used in dendrogram creation

required

type: string

default: ward.D2

Correlation method used in dendrogram creation

required

type: string

default: spearman

Number of features selected before certain exploratory analyses. If -1, will use all features.

required

type: integer

default: 500

Length of the whiskers in boxplots as multiple of IQR. Defaults to 1.5.

type: number

default: 1.5

Threshold on MAD score for outlier identification

type: integer

default: -5

How should the main grouping variable be selected? ‘auto_pca’, ‘contrasts’, or a valid column name from the observations table.

required

type: string

default: auto_pca

Specifies assay names to be used for matrices, platform-specific.

hidden

type: string

default: raw,normalised,variance_stabilised

Specifies final assay to be used for exploratory analysis, platform-specific

hidden

type: string

default: variance_stabilised

Assays for which to compute log2 values during exploratory analysis. Not necessary for maxquant data as this is controlled by the pipeline.

type: string

default: raw,normalised

Valid R palette name

required

type: string

default: Set1

Options related to differential operations

Differential analysis method

type: string

Advanced option: the suffix associated with tabular differential results tables. Will by default use the appropriate suffix according to the study_type.

type: string

The feature identifier column in differential results tables

required

type: string

default: gene_id

The fold change column in differential results tables

required

type: string

default: log2FoldChange

The p value column in differential results tables

type: string

default: pvalue

The q value column in differential results tables (adjusted p values/q values).

required

type: string

default: padj

Minimum fold change used to calculate differential feature numbers

required

type: number

default: 2

Maximum p value used to calculate differential feature numbers

required

type: number

default: 1

Maximum q value used to calculate differential feature numbers

required

type: number

default: 0.05

Where a features file (GTF) has been provided, what attribute to use to name features

type: string

default: gene_name

Indicate whether or not fold changes are on the log scale (default is to assume they are)

type: boolean

default: true
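
The three thresholds above (minimum fold change, maximum p value, maximum q value) together decide which features are counted as differential. The sketch below illustrates one way the defaults could be applied to a single result row. It assumes the minimum fold change is given on the linear scale and converted to log2 when fold changes are logged, that up- and down-regulation are treated symmetrically, and that comparisons are inclusive; none of this is confirmed by the text above, and `is_differential` is a hypothetical name.

```python
import math


def is_differential(
    fold_change: float,
    p_value: float,
    q_value: float,
    min_fold_change: float = 2,
    max_p: float = 1,
    max_q: float = 0.05,
    fold_changes_logged: bool = True,
) -> bool:
    """Return True if a result row passes all three thresholds (illustrative only)."""
    # Bring the observed fold change onto the log2 scale if it is not there already.
    # An unlogged fold change must be positive for this conversion.
    log2_fc = fold_change if fold_changes_logged else math.log2(fold_change)

    # Assumption: the threshold is a linear fold change, so 2 corresponds to |log2FC| >= 1.
    log2_threshold = math.log2(min_fold_change)

    return (
        abs(log2_fc) >= log2_threshold
        and p_value <= max_p
        and q_value <= max_q
    )


if __name__ == "__main__":
    print(is_differential(1.5, 0.001, 0.01))    # passes all thresholds
    print(is_differential(0.5, 0.001, 0.01))    # fold change too small
    print(is_differential(-1.5, 0.001, 0.20))   # q value too large
```

With the default maximum p value of 1, the p value never excludes a feature, so the q value and fold change do the filtering.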

Valid R palette name

required

type: string

default: Set1

In differential analysis (DESeq2 or limma), subset to the contrast samples before modelling variance?

type: boolean

test parameter passed to DESeq()

type: string

fitType parameter passed to DESeq()

type: string

sfType parameter passed to DESeq()

type: string

‘minReplicatesForReplace’ parameter passed to DESeq()

type: integer

default: 7

useT parameter passed to DESeq()

type: boolean

independentFiltering parameter passed to results()

type: boolean

default: true

lfcThreshold parameter passed to results()

type: integer

altHypothesis parameter passed to results()

type: string

default: greaterAbs

pAdjustMethod parameter passed to results()

type: string

default: BH

alpha parameter passed to results()

type: number

default: 0.1

minmu parameter passed to results()

type: number

default: 0.5

variance stabilisation method to use when making a variance stabilised matrix

type: string

Shrink fold changes in results?

type: boolean

default: true

Number of cores

type: integer

default: 1

type: integer

blind parameter for rlog() and/or vst()

type: boolean

default: true

nsub parameter passed to vst()

type: integer

default: 1000

passed to lmFit(), positive integer giving the number of times each distinct probe is printed on each array.

type: number

passed to lmFit(), positive integer giving the spacing between duplicate occurrences of the same probe, spacing=1 for consecutive rows.

type: string

Sample sheet column to be used to derive a vector or factor specifying a blocking variable on the arrays

type: string

passed to lmFit(), the inter-duplicate or inter-technical replicate correlation

type: string

passed to lmFit(), the fitting method

type: string

passed to eBayes(), a numeric value between 0 and 1, assumed proportion of genes which are differentially expressed

type: number

default: 0.01

passed to eBayes(), logical, should an intensity-dependent trend be allowed for the prior variance?

type: boolean

passed to eBayes(), logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?

type: boolean

passed to eBayes, comma separated string of two values, assumed lower and upper limits for the standard deviation of log2-fold-changes for differentially expressed genes

type: string

default: 0.1,4

passed to eBayes, comma separated string of length 1 or 2, giving left and right tail proportions of x to Winsorize. Used only when robust=TRUE.

type: string

default: 0.05,0.1
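
The two eBayes() settings above are supplied as comma-separated strings rather than numeric arrays. A minimal sketch of how such a value could be split into numbers is shown below; `parse_numeric_list` is a hypothetical helper rather than a pipeline function, and the pipeline's own parsing may differ.

```python
from typing import List


def parse_numeric_list(value: str, max_length: int = 2) -> List[float]:
    """Split a comma-separated string such as '0.1,4' into floats (illustrative only)."""
    parts = [p.strip() for p in value.split(",") if p.strip()]
    if not 1 <= len(parts) <= max_length:
        raise ValueError(
            f"expected between 1 and {max_length} values, got {len(parts)}: {value!r}"
        )
    return [float(p) for p in parts]


if __name__ == "__main__":
    print(parse_numeric_list("0.1,4"))     # lower and upper limits
    print(parse_numeric_list("0.05,0.1"))  # left and right tail proportions
    print(parse_numeric_list("0.05"))      # a single value is also accepted
```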

passed to topTable(), minimum absolute log2-fold-change required

type: integer

passed to topTable(), logical, should 95% confidence intervals be output for logFC? Alternatively, can take a numeric value between zero and one specifying the confidence level required.

type: boolean

passed to topTable(), method used to adjust the p-values for multiple testing.

type: string

cutoff value for adjusted p-values. Only genes with lower p-values are listed.

type: number

default: 1

Turns on and off usage of voom normalization in the Limma module.

type: boolean

type: integer

default: 1

type: integer

type: boolean

type: number

default: 0.01

type: string

default: 0.1,4

type: boolean

type: string

default: 0.05,0.1

type: string

default: adaptive

type: boolean

type: string

default: BH

Functional analysis method

type: string

Gene sets in GMT or GMX format. For GSEA, multiple comma-separated input files in either format are possible. For gprofiler2, a single file in GMT format is possible; this has lowest priority and will be overridden by --gprofiler2_token and --gprofiler2_organism.

type: string

Permutation type

type: string

Number of permutations

type: integer

default: 1000

Enrichment statistic

type: string

Metric for ranking genes

type: string

Gene list sorting mode

type: string

Gene list ordering mode

type: string

Max size: exclude larger sets

type: integer

default: 500

Min size: exclude smaller sets

type: integer

default: 15

Normalisation mode

type: string

Randomization mode

type: string

Make detailed geneset report?

type: boolean

default: true

Use median for class metrics

type: boolean

Number of markers

type: integer

default: 100

Plot graphs for the top sets of each phenotype

type: integer

default: 20

Seed for permutation

type: string

default: timestamp

Save random ranked lists

type: boolean

Make a zipped file with all reports

type: boolean

Short name of the organism that is analyzed, e.g. hsapiens for Homo sapiens.

type: string

Should only significant enrichment results be considered?

type: boolean

default: true

Should underrepresentation be measured instead of overrepresentation?

type: boolean

The method that should be used for multiple testing correction.

type: string

On which source databases to run the gprofiler query

type: string

Whether to include evcodes in the results.

type: boolean

Maximum q value used for significance testing.

type: number

default: 0.05

Token that should be used as a query.

type: string

Path to CSV/TSV/TXT file that should be used as a background list of genes for the query; alternatively, ‘auto’ (default) or ‘false’.

type: string

default: auto

pattern: ^\S+\.(csv|tsv|txt)$|auto|false

Which column to use as gene IDs in the background matrix.

type: string

How to calculate the statistical domain size.

type: string

How many genes must be differentially expressed in a pathway for it to be considered enriched? Default 1.

type: integer

default: 1

Valid R palette name

type: string

default: Blues

Path to TSV file containing network file for decoupler

type: string

pattern: ^\S+\.(tsv)$

Removes sources of a net with less than min_n targets

type: integer

default: 5

Comma-separated list of methods to use (e.g., ‘ora,ulm’)

type: string

default: ulm

Should a Shiny app be built?

type: boolean

default: true

Should the app be deployed to shinyapps.io?

type: boolean

Your shinyapps.io account name

type: string

The name of the app to push to in your shinyapps.io account

type: string

Rmd report template from which to create the pipeline report

required

type: string

default: ${projectDir}/assets/differentialabundance_report.Rmd

pattern: ^\S+\.(Rmd|qmd|ipynb)$

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

A logo to display in the report instead of the generic pipeline logo.

required

type: string

default: ${projectDir}/docs/images/nf-core-differentialabundance_logo_light.png

CSS to use to style the output, in lieu of the default nf-core styling

required

type: string

default: ${projectDir}/assets/nf-core_style.css

A markdown file containing citations to include in the final report

type: string

default: ${projectDir}/CITATIONS.md

A title for reporting outputs

type: string

An author for reporting outputs

type: string

Semicolon-separated string of contributor info that should be listed in the report.

type: string

A description for reporting outputs

type: string

Whether to generate a scree plot in the report

type: boolean

default: true

Skip generation of reports

type: boolean

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Do not load the iGenomes reference config.

hidden

type: boolean

The base path to the igenomes reference files

hidden

type: string

default: s3://ngi-igenomes/igenomes/

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

type: string

Email address for completion summary, only when pipeline fails.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Boolean whether to validate parameters against the schema at runtime

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string
