From c7ded3426f2492d61b60f3a15c064d02d0302b7d Mon Sep 17 00:00:00 2001
From: Laura Mason <95650976+lauramason326@users.noreply.github.com>
Date: Thu, 30 Apr 2026 12:28:05 -0600
Subject: [PATCH 1/3] Update README.md
readme_edits_LMM
---
README.md | 69 +++++++++++++++++++++++++++++++------------------------
1 file changed, 39 insertions(+), 30 deletions(-)
diff --git a/README.md b/README.md
index a0d25478..2e3fc2a5 100644
--- a/README.md
+++ b/README.md
@@ -1,37 +1,42 @@
-# DRAM v2
+# DRAM2
+
+## Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)!
+Here you will find all you need to know to setup, install and run DRAM2. This page will give you basic instructions, but if you want more detail on how DRAM2 works or what all DRAM2 options mean then be sure to check out the other pages in the wiki
-
+
## ⚠️ DRAM v2 is currently under active development and usage is at your own risk. ⚠️
-DRAM v2 (Distilled and Refined Annotation of Metabolism Version 2) is a tool for annotating metagenomic and genomic assembled data (e.g. scaffolds or contigs) or called genes (e.g. nuclotide or amino acid format). DRAM annotates MAGs using [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN](http://bcb.unl.edu/dbCAN2/), [RefSeq viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/) and the [MEROPS](https://www.ebi.ac.uk/merops/) peptidase database as well as custom user databases.
+## DRAM2 Overview
+DRAM2 (Distilling and Refining Annotations of Metabolism, version 2) is a tool for annotating genomic and metagenomic assemblies (e.g., scaffolds or contigs) as well as predicted genes (nucleotide or amino acid sequences). It organizes genome annotations into metabolic functions across three levels of increasing interpretation: (1) **ANNOTATE**, (2) **SUMMARIZE**, and (3) **VISUALIZE**. This workflow enables the analysis of large numbers of microbial genomes or metagenomes, highlighting functional guilds and supporting inference of organismal metabolism across datasets.
-DRAM is run in four stages:
+During the **ANNOTATE** stage, DRAM2 identifies genes in input sequences and annotates them using multiple databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka.
-1. Gene Calling Prodogal - genes are called on user provided scaffolds or contigs
-2. Gene Annotation - genes are annotated with a set of user defined databases
-3. Distillation - annotations are curated into functional categories
-4. Product Generation - interactive visualizations of DRAM output are generated
+The **ANNOTATE** output contains all database hits for every gene in each genome, generating a comprehensive output of most annotation pipelines. DRAM2 extends beyond this by organizing (**SUMMARIZE**) and visualizing (**VISUALIZE**) annotations into ecosystem-relevant functional categories, enabling more interpretable comparisons across genomes and ecosystems.
-For more detail on DRAM and how DRAM v2 works please see our DRAM products:
+## DRAM2 Overview & Example Usage
-- [DRAM version 1 publication](https://academic.oup.com/nar/article/48/16/8883/5884738)
-- [DRAM in KBase publication](https://pubmed.ncbi.nlm.nih.gov/36857575/)
-- [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s)
+### DRAM2 for Genomes
+After gene calling in Prodigal, DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates the ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated using Quast, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). DRAM2 also generates the SUMMARIZE folder, which contains three key files: (1) A genome statistics table which includes all statistics required by MIMAG, (2) a metabolism summary sheet which gives gene counts of functional genes across a wide variety of metabolisms, and (3) a summarized genomes table which gives pathway presence information per MAGs. Finally, DRAM2 generates the VISUALIZE folder. This contains a visualization of your data as an interactive heatmap showing coverage of modules, the coverage of electron transport chain components and the presence of selected metabolic functions. Here is a standard full workflow to run DRAM2 for genomes. Here, we are annotating a directory of genomes, renaming them for downstream use, calling genes and annotating them using all available databases, performing quality control, summarizing and visualizing with particular ecosystems in mind. The command is submitted on the command line and will run in the background
-## Quick Links
+``` bash
+nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all --qc --summarize --sum_ecos 'eng_sys,ag' --visualize -profile singularity -resume --slurm -bg
+```
+please note that '--input_fasta [INPUT_FASTA]' should be a directory of genomes or MAGs in .fa or .fna format
-- [Docs](https://dramit.readthedocs.io/en/latest)
-- [Installation Guide](https://dramit.readthedocs.io/en/latest/installation.html)
-- [Usage Examples](https://dramit.readthedocs.io/en/latest/usage.html)
-- [Parameter API](<[#command-line-options](https://dramit.readthedocs.io/en/latest/params_doc.html)>)
-- [Rules API](<[#nextflow-tips-and-tricks](https://dramit.readthedocs.io/en/latest/rules_parser.html)>)
+### DRAM2 for Assemblies
+DRAM2 can also be used to annotate genes from metagenome assemblies. Similar to DRAM2 for MAGs, genes first are called in Prodigal, and then annotated using user-defined databases. The outputs for this ANNOTATE step are similar in function to the ANNOTATE output for DRAM2 for MAGs. A key difference between these two pipelines, though, is that the SUMMARIZE and VISUALIZE functions are meant to show genome-scale functions and take into account synteny and gene order. As such, the SUMMARIZE and VISUALIZE outputs should be interpreted with caution when running DRAM2 on genes. Here is a standard full workflow to run DRAM2 for assemblies:
+
+``` bash
+nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all
+-profile singularity -resume --slurm -bg
+```
-## Example Usage
-DRAM apps Call, Annotate and Distill can all be run at once or alternatively, each app can be run individually. Here are some common usage examples:
+## Other DRAM2 functions
+DRAM2 is a flexible tool, allowing users to call, annotate, vizualize, and summarize genomes and genes as separate steps, rename fasta headers, merge annotations, and define annotation databases and ecosystem outputs. For more example commands and all available parameters, please see [Docs](https://dramit.readthedocs.io/en/latest)
1. **Rename fasta headers based on input sample file names:**
@@ -45,7 +50,7 @@ nextflow run WrightonLabCSU/DRAM --rename --input_fasta
```
-3. **Annotate called genes using input called genes and the KOFAM database:**
+3. **Annotate called genes using input called genes and just one database:**
```bash
nextflow run WrightonLabCSU/DRAM --annotate --input_genes --use_kofam
@@ -57,7 +62,7 @@ nextflow run WrightonLabCSU/DRAM --annotate --input_genes --use_kofam
```
-5. **Merge various existing annotations files together (Must be generated using DRAM):**
+5. **Merge existing DRAM or DRAM2 annotations files:**
```bash
nextflow run WrightonLabCSU/DRAM --merge_annotations
@@ -69,15 +74,19 @@ nextflow run WrightonLabCSU/DRAM --merge_annotations --annotations
```
-7. **Complete workflow example:**
-```bash
-nextflow run -bg WrightonLabCSU/DRAM \
- --input_fasta [DIRECTORY of fasta files] \
- --outdir [OUTPUT] \
- --rename --sum_ecos 'eng_sys,ag' \
- -profile singularity,full_mode
-```
+## For more detailed information on DRAM and DRAM2 please see our DRAM other products:
+- [DRAM version 1 publication](https://academic.oup.com/nar/article/48/16/8883/5884738)
+- [DRAM in KBase publication](https://pubmed.ncbi.nlm.nih.gov/36857575/)
+- [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s)
+
+## Quick Links
+- [Docs](https://dramit.readthedocs.io/en/latest)
+- [Installation Guide](https://dramit.readthedocs.io/en/latest/installation.html)
+- [Usage Examples](https://dramit.readthedocs.io/en/latest/usage.html)
+- [Parameter API](<[#command-line-options](https://dramit.readthedocs.io/en/latest/params_doc.html)>)
+- [Rules API](<[#nextflow-tips-and-tricks](https://dramit.readthedocs.io/en/latest/rules_parser.html)>)
+
## Nextflow Tips and Tricks
From baeaba69aef1b97665fe309eb04ba27f16b8fd5e Mon Sep 17 00:00:00 2001
From: Madeline Scyphers
Date: Thu, 30 Apr 2026 13:13:44 -0600
Subject: [PATCH 2/3] Remove markdown from prettier and trim whitespace on
precommit
---
.pre-commit-config.yaml | 2 ++
1 file changed, 2 insertions(+)
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index fda9bbce..b98c0928 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -10,6 +10,7 @@ repos:
additional_dependencies:
- prettier@2.1.2
- "@prettier/plugin-xml@0.12.0"
+ exclude: \.md$
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
@@ -27,6 +28,7 @@ repos:
- id: detect-private-key
- id: end-of-file-fixer
- id: trailing-whitespace
+ exclude: \.md$
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.4
hooks:
From fb1b7b5078e66be6e7675bea2c7f4ace6644b0a7 Mon Sep 17 00:00:00 2001
From: Laura Mason <95650976+lauramason326@users.noreply.github.com>
Date: Wed, 6 May 2026 12:06:41 -0600
Subject: [PATCH 3/3] Update README.md
---
README.md | 106 +++++++-----------------------------------------------
1 file changed, 12 insertions(+), 94 deletions(-)
diff --git a/README.md b/README.md
index 2e3fc2a5..f9e779cd 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
# DRAM2
## Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)!
-Here you will find all you need to know to setup, install and run DRAM2. This page will give you basic instructions, but if you want more detail on how DRAM2 works or what all DRAM2 options mean then be sure to check out the other pages in the wiki
+Here you will find basic instructions for running DRAM2. For full documentation, please see the official DRAM2 webpage: [Read-the-docs](https://dramit.readthedocs.io/en/latest)
-## ⚠️ DRAM v2 is currently under active development and usage is at your own risk. ⚠️
+## ⚠️ DRAM2 is currently under active development and usage is at your own risk. ⚠️
## DRAM2 Overview
DRAM2 (Distilling and Refining Annotations of Metabolism, version 2) is a tool for annotating genomic and metagenomic assemblies (e.g., scaffolds or contigs) as well as predicted genes (nucleotide or amino acid sequences). It organizes genome annotations into metabolic functions across three levels of increasing interpretation: (1) **ANNOTATE**, (2) **SUMMARIZE**, and (3) **VISUALIZE**. This workflow enables the analysis of large numbers of microbial genomes or metagenomes, highlighting functional guilds and supporting inference of organismal metabolism across datasets.
@@ -16,112 +16,30 @@ During the **ANNOTATE** stage, DRAM2 identifies genes in input sequences and ann
The **ANNOTATE** output contains all database hits for every gene in each genome, generating a comprehensive output of most annotation pipelines. DRAM2 extends beyond this by organizing (**SUMMARIZE**) and visualizing (**VISUALIZE**) annotations into ecosystem-relevant functional categories, enabling more interpretable comparisons across genomes and ecosystems.
-## DRAM2 Overview & Example Usage
+## Basic usage
+Below is an example of basic DRAM2 usage. This command takes a directory of genomes, renames them for downstream use, calls genes, annotates them using all available databases, performs quality control, summarizes and visualizes the results with particular ecosystems in mind, and assigns genome-level traits to the organisms. The command is submitted on the command line and runs in the background.
-### DRAM2 for Genomes
-After gene calling in Prodigal, DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates the ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated using Quast, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). DRAM2 also generates the SUMMARIZE folder, which contains three key files: (1) A genome statistics table which includes all statistics required by MIMAG, (2) a metabolism summary sheet which gives gene counts of functional genes across a wide variety of metabolisms, and (3) a summarized genomes table which gives pathway presence information per MAGs. Finally, DRAM2 generates the VISUALIZE folder. This contains a visualization of your data as an interactive heatmap showing coverage of modules, the coverage of electron transport chain components and the presence of selected metabolic functions. Here is a standard full workflow to run DRAM2 for genomes. Here, we are annotating a directory of genomes, renaming them for downstream use, calling genes and annotating them using all available databases, performing quality control, summarizing and visualizing with particular ecosystems in mind. The command is submitted on the command line and will run in the background
``` bash
-nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all --qc --summarize --sum_ecos 'eng_sys,ag' --visualize -profile singularity -resume --slurm -bg
+nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all --qc --summarize --sum_ecos 'eng_sys,ag' --visualize --traits -profile singularity -resume --slurm -bg
```
-please note that '--input_fasta [INPUT_FASTA]' should be a directory of genomes or MAGs in .fa or .fna format
+Please note that '--input_fasta [INPUT_FASTA]' should be a directory of genomes or MAGs in .fa or .fna format. It is also worth noting that all Nextflow options are specified with a single dash `-`, while all DRAM2-specific options are specified with a double dash `--`. All available Nextflow options can be seen by running:
-### DRAM2 for Assemblies
-DRAM2 can also be used to annotate genes from metagenome assemblies. Similar to DRAM2 for MAGs, genes first are called in Prodigal, and then annotated using user-defined databases. The outputs for this ANNOTATE step are similar in function to the ANNOTATE output for DRAM2 for MAGs. A key difference between these two pipelines, though, is that the SUMMARIZE and VISUALIZE functions are meant to show genome-scale functions and take into account synteny and gene order. As such, the SUMMARIZE and VISUALIZE outputs should be interpreted with caution when running DRAM2 on genes. Here is a standard full workflow to run DRAM2 for assemblies:
-
-``` bash
-nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all
--profile singularity -resume --slurm -bg
-```
-
-
-## Other DRAM2 functions
-DRAM2 is a flexible tool, allowing users to call, annotate, vizualize, and summarize genomes and genes as separate steps, rename fasta headers, merge annotations, and define annotation databases and ecosystem outputs. For more example commands and all available parameters, please see [Docs](https://dramit.readthedocs.io/en/latest)
-
-1. **Rename fasta headers based on input sample file names:**
-
-```bash
-nextflow run WrightonLabCSU/DRAM --rename --input_fasta
-```
-
-2. **Call genes using input fastas (use --rename to rename FASTA headers):**
-
-```bash
-nextflow run WrightonLabCSU/DRAM --call --rename --input_fasta
-```
-
-3. **Annotate called genes using input called genes and just one database:**
-
-```bash
-nextflow run WrightonLabCSU/DRAM --annotate --input_genes --use_kofam
-```
-
-4. **Annotate called genes using input fasta files and the KOFAM database:**
-
-```bash
-nextflow run WrightonLabCSU/DRAM --annotate --input_fasta --use_kofam
-```
-
-5. **Merge existing DRAM or DRAM2 annotations files:**
-
-```bash
-nextflow run WrightonLabCSU/DRAM --merge_annotations
-```
-
-6. **Distill using input annotations:**
-
-```bash
-nextflow run WrightonLabCSU/DRAM --distill_ --annotations
-```
-
-
-## For more detailed information on DRAM and DRAM2 please see our DRAM other products:
-- [DRAM version 1 publication](https://academic.oup.com/nar/article/48/16/8883/5884738)
-- [DRAM in KBase publication](https://pubmed.ncbi.nlm.nih.gov/36857575/)
-- [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s)
+`nextflow run -help`
## Quick Links
- [Docs](https://dramit.readthedocs.io/en/latest)
- [Installation Guide](https://dramit.readthedocs.io/en/latest/installation.html)
- [Usage Examples](https://dramit.readthedocs.io/en/latest/usage.html)
-- [Parameter API](<[#command-line-options](https://dramit.readthedocs.io/en/latest/params_doc.html)>)
-- [Rules API](<[#nextflow-tips-and-tricks](https://dramit.readthedocs.io/en/latest/rules_parser.html)>)
-
-
-## Nextflow Tips and Tricks
-
-The `-resume` option in Nextflow DSL2 allows you to efficiently manage and modify your workflow runs:
-
-- **Adding databases to an existing run:**
- - Using `-resume` with your existing work directory lets you reuse called genes and existing annotations
- - Example: If you initially used `--use_kofam --use_dbcan`, you can add `--use_kegg --use_uniref` and only the new annotations will be computed
+- [Parameter API](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps)
+- [Rules API](https://dramit.readthedocs.io/en/latest/rules_parser.html)
-## Resource Management
-DRAM leverages Nextflow's horizontal scaling capabilities to distribute computational tasks across multiple computing resources. You can customize resource allocation through the `nextflow.config` file:
-
-- Modify "maxForks" parameters to control parallel execution
-- Configure CPU and memory requirements per process
-- Coming soon: "lite", "medium" and "heavy" modes for different computing environments
-
-## Configuration
-
-Every CLI option can be set in the `nextflow.config` file. For example:
-
-```nextflow
-params {
- use_uniref = true
- annotate = true
-}
-```
-
-You can also use a custom config file:
+## Other DRAM products from our research group:
+- [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s)
+- [DRAM in KBase publication (2023)](https://pubmed.ncbi.nlm.nih.gov/36857575/)
-```bash
-nextflow run DRAM -c /path/to/custom_config.config
-```
## Citing DRAM
-
If DRAM helps you in your research, please cite:
[DRAM publication in Nucleic Acids Research (2020)](https://academic.oup.com/nar/article/48/16/8883/5884738)