algbio/SRFAligner

seed-chain-extend on iEFGs

SRFAligner and SRFChainer are long-read aligners based on indexable Elastic Founder Graphs (iEFGs), which can be obtained from multiple sequence alignments using founderblockgraph or from VCF files with the pipeline implemented in experiments/vcf-to-hapl-to-efg. The graphs used in the experiments can be found at doi.org/10.5281/zenodo.14012881.

Workflow to build iEFGs from a VCF file and to perform seed-chain-extend alignment

getting started

SRFAligner and SRFChainer are Bash programs based on efg-locate, chainx-block-graph (from this repository, tested on GCC >= 15), and GraphAligner (>= 1.0.19). Clone this repository and compile efg-locate and chainx-block-graph with

git clone https://github.com/algbio/SRFAligner && cd SRFAligner
git submodule update --init tools/{sdsl-lite-v3,concurrentqueue}
make

GraphAligner's executable is expected to be in tools/GraphAligner/bin, so you can run git submodule update --init --recursive tools/GraphAligner and follow its compilation instructions. Alternatively, if GraphAligner is already installed on your system, you can modify the relevant line (line 7) of each program with

sed --in-place '7s/.*/graphaligner=GraphAligner/' SRFAligner SRFChainer efg-memsAligner efg-ahocorasickAligner
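The sed command above overwrites line 7 of each script unconditionally. As a more cautious variant, the sketch below applies the same substitution only when a GraphAligner executable is actually found on PATH, and prints the old line 7 first so you can see what is being replaced. The function name use_system_graphaligner is hypothetical and not part of this repository. Like the command above, it assumes GNU sed for --in-place.

```shell
# Hypothetical helper (not part of SRFAligner): point the given scripts at a
# system-wide GraphAligner, but only if one is found on PATH.
use_system_graphaligner() {
    if ! command -v GraphAligner >/dev/null 2>&1; then
        echo "GraphAligner not found on PATH; scripts left unchanged" >&2
        return 1
    fi
    for script in "$@"; do
        # Print the current line 7 so the replaced value is visible.
        sed -n '7p' "$script"
        # Same substitution as in the README: overwrite line 7.
        sed --in-place '7s/.*/graphaligner=GraphAligner/' "$script"
    done
}
```

It could then be called as use_system_graphaligner SRFAligner SRFChainer efg-memsAligner efg-ahocorasickAligner from the repository root.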

Finally, test your setup with

./SRFAligner -g test/graph1.gfa -f test/read1.fastq -a test/aln1.gaf
./SRFChainer -g test/graph2.gfa -f test/read2.fastq -a test/aln2.gaf
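To confirm that the two test runs produced alignments, you can count the records in each output file. The sketch below assumes the output follows the usual GAF layout of at least 12 tab-separated mandatory fields per alignment line. The helper name count_gaf_records is hypothetical.

```shell
# Hypothetical helper: count lines that have at least the 12 mandatory
# tab-separated GAF fields; shorter or empty lines are ignored.
count_gaf_records() {
    awk -F'\t' 'NF >= 12 { n++ } END { print n + 0 }' "$1"
}
```

For example, count_gaf_records test/aln1.gaf prints the number of alignment records written by the first test command. A count of 0 suggests that the run did not align any reads.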

prototype aligners

As part of our experiments, we also developed two other interesting aligners: efg-ahocorasickAligner and efg-memsAligner.

Full node seeds via the Aho-Corasick automaton

efg-ahocorasickAligner uses full node seeds computed by the Aho-Corasick automaton of the iEFG node labels (based on daachorse, requires Rust >= 1.61). It depends on efg-ahocorasick and efg-gaf-splitter (from this repository), and expects seqtk to be in tools/seqtk. Compile all three with

git submodule update --init --recursive {tools/daachorse,tools/seqtk}
make -C tools/seqtk
make -C tools/efg-ahocorasick

Maximal exact match seeds

efg-memsAligner uses MEM seeds computed by efg-mems. It expects the efg-mems executable to be in tools/efg-mems/efg-mems and seqtk to be in tools/seqtk. Fetch and build both with

git submodule update --init --recursive {tools/efg-mems,tools/seqtk}
make -C tools/seqtk
cd tools/efg-mems/sdsl-lite
./install.sh .
cd ..
cmake .
make
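Since the prototype aligners look for their dependencies at fixed paths, it can help to check that the builds left executables where they are expected before running anything. The following is a minimal sketch. The helper name check_executables is hypothetical, and the paths you pass to it are up to you.

```shell
# Hypothetical helper: report which of the given paths are executable
# regular files; returns non-zero if any of them is missing.
check_executables() {
    missing=0
    for path in "$@"; do
        if [ -f "$path" ] && [ -x "$path" ]; then
            echo "ok: $path"
        else
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

From the repository root, a call such as check_executables tools/efg-mems/efg-mems tools/seqtk/seqtk would cover the MEM-based aligner. Here tools/seqtk/seqtk is an assumption about where make places the seqtk binary inside tools/seqtk.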

publication

Nicola Rizzo, Manuel Cáceres, Veli Mäkinen. Exploiting uniqueness: seed-chain-extend alignment on elastic founder graphs. Bioinformatics, 2025.
