YAME - Yet Another Methylation Encoder
A fast and lightweight toolkit for sequence-level DNA methylation analysis
Overview
YAME (Yet Another Methylation Encoder) is designed for efficient sequence-level DNA methylation data management and handles both bulk and single-cell DNA methylome workflows. It introduces a family of compact binary formats (CX formats) that represent methylation values, MU counts, categorical states, fraction data, masks, and genomic coordinates in a uniform compressed structure.
Use cases include:
- Single-cell methylome analysis - Efficiently store and analyze sparse single-cell data
- Bulk methylome processing - Fast operations on large cohorts
- Enrichment testing - Test methylation enrichment across genomic features
- Feature aggregation - Summarize methylation over bins, chromatin states, or custom regions
- Pseudobulk generation - Merge single cells into cluster-level pseudobulks
- Differential methylation - Identify and test differentially methylated sites
Tutorials & Workflows
- Storage & Format - Working with CX formats
- Summarize & Encode - Calculate statistics and aggregations
- Test Enrichment - Test methylation enrichment across genomic features
- Subset Rows - Extract samples and regions
- Aggregate Row-wise - Merge pseudobulks and perform calculations
- Combine, Split & Index - Handle multi-sample datasets
- Mask Data - Test methods at different sparsity levels
Installation
# Option 1: Install via Conda (Recommended)
conda install yame -c bioconda
# Option 2: Build from Source
git clone https://github.com/zhou-lab/YAME.git
cd YAME
make
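After either install route, a quick sanity check can confirm that the shell can find the executable. The snippet below is a minimal sketch; `check_yame` is a hypothetical helper name, not part of YAME, and it only tests whether a command called `yame` is reachable on `PATH`. After a source build, the binary may need to be added to `PATH` first.

```shell
# Sketch: report whether a `yame` command is reachable from this shell.
# check_yame is a hypothetical helper, not something YAME ships.
check_yame() {
    if command -v yame >/dev/null 2>&1; then
        # command -v prints how the shell resolves the name
        echo "yame found at: $(command -v yame)"
    else
        echo "yame not found on PATH" >&2
        return 1
    fi
}
```

Calling `check_yame` prints the resolved location on success and returns a non-zero status otherwise, so it can guard later steps in a script.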
Quick Start
# Pack binary methylation data
yame pack -fb yourfile.bed > yourfile.cg
# Pack MU count data (M and U columns)
yame pack -f3 methylation_counts.txt > yourfile.cg
# Summarize methylation data
yame summary yourfile.cg
# Test enrichment over genomic features
yame summary -m ChromHMM_states.cm yourfile.cg > enrichment_results.txt
# Subset samples from multi-sample data
yame subset -l sample_list.txt yourfile.cg > subset.cg
# Merge single cells into pseudobulks
yame subset -l cluster1_cells.txt single_cell.cg | yame rowop - -o binasum > pseudobulk.cg
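When there are several clusters, the pseudobulk step above can be wrapped in a loop. The following is a minimal sketch that reuses only the `yame subset` and `yame rowop` invocations shown in this Quick Start. The function name `make_pseudobulks`, the `<cluster>_cells.txt` naming scheme, and the directory layout are assumptions made for illustration.

```shell
# Sketch: build one pseudobulk per cluster from a multi-sample single-cell file.
# Assumptions (hypothetical, for illustration only):
#   - each cluster has a cell list named <cluster>_cells.txt in $list_dir
#   - make_pseudobulks is a helper defined here, not a YAME command
make_pseudobulks() {
    local sc_file="$1"    # multi-sample single-cell .cg file
    local list_dir="$2"   # directory holding the <cluster>_cells.txt lists
    local out_dir="$3"    # directory that receives the pseudobulk .cg files

    mkdir -p "$out_dir" || return 1

    local list cluster
    for list in "$list_dir"/*_cells.txt; do
        [ -e "$list" ] || continue   # skip when the glob matched nothing
        cluster="$(basename "$list" _cells.txt)"
        # Same two-step pipeline as the single-cluster command above;
        # stop if the final stage of the pipeline reports a failure.
        yame subset -l "$list" "$sc_file" \
            | yame rowop - -o binasum > "$out_dir/${cluster}.cg" || return 1
    done
}
```

For example, `make_pseudobulks single_cell.cg cell_lists pseudobulks` would write `pseudobulks/cluster1.cg` for a list named `cell_lists/cluster1_cells.txt`.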
Reference
Goldberg*, Fu*, Atkins, Moyer, Lee, Deng, Zhou† (2025)
KnowYourCG: Facilitating Base-level Sparse Methylome Interpretation.
Science Advances
DOI: 10.1126/sciadv.adw3027
Acknowledgements
This work is supported by NIH/NIGMS 5R35GM146978.
YAME integrates with the KYCG knowledge base for comprehensive methylation feature analysis.
Getting Help
- Command Help: Run yame or yame <command> for usage information
- Issues: Report bugs on GitHub Issues
- KYCG Resources: Download reference coordinates and feature files from KYCG hg38 or KYCG mm10