GitHub - ASAFind/ASAFind-2: ASAFind latest version, with options for graphical output and ppc protein prediction

Download ASAFind to install locally

For local installation, a command-line version of ASAFind can be downloaded from our GitHub repository: https://github.com/ASAFind/ASAFind-2

Installation steps

On GitHub, you can either download the files as a zip archive or clone the repository using the provided URL. ASAFind requires a Unix-based operating system such as Linux (e.g. Ubuntu) or macOS. After downloading the latest version, follow these installation steps in a command-line shell:

Change into the directory where you want to install ASAFind, e.g.:
- cd /home/marta/asafind
Clone the GitHub repository (if git is installed: "git clone https://github.com/ASAFind/ASAFind-2.git"), or place the content of the downloaded zip archive in the folder.
Run the following commands:
- python3 -m venv asafind_command_line
- . asafind_command_line/bin/activate
- pip install --upgrade pip
- pip install -r requirements.txt
You are now in a virtual environment named asafind_command_line. Here you can run the scripts or ask for help, e.g.:
- python3 S0_ASAFind.py --help

The installation procedure creates the environment asafind_command_line, activates it, installs all required packages, and creates the subdirectories temp and output.
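Before running the scripts, it can be useful to confirm that the shell is actually inside the virtual environment and that the interpreter is recent enough (ASAFind requires Python >= 3.10). The following is a minimal sketch and not part of ASAFind; the function names are hypothetical, and the venv test relies on the standard behaviour that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `python3 -m venv`.

```python
import sys

MIN_PYTHON = (3, 10)  # ASAFind states that Python >= 3.10 is required


def python_is_recent_enough(version_info=None):
    """Return True if the (major, minor) version meets the stated minimum."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the interpreter runs inside a venv-style environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def environment_problems(version_info=None, prefix=None, base_prefix=None):
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if not python_is_recent_enough(version_info):
        problems.append("Python >= 3.10 is required")
    if not in_virtualenv(prefix, base_prefix):
        problems.append("no virtual environment is active")
    return problems


# Report problems for the interpreter that runs this snippet.
for problem in environment_problems():
    print("Warning:", problem)
```

The optional arguments exist only so that the checks can be exercised with made-up values; called without arguments, the functions inspect the running interpreter.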

ASAFind 2.0

The script S1_ASAFind.py in the environment root directory performs the actual prediction. It is called from the script S0_ASAFind.py, which handles the options and generates the optional graphical output.

Input data are a FASTA file and a companion TargetP 2.0 short-format tabular output file with the complete TargetP header (two lines starting with '#').

Some versions of SignalP/TargetP truncate the sequence names: SignalP 3.0 to 20 characters, and SignalP 4.0 and 4.1 to 58 characters. Therefore, ASAFind only considers the corresponding first characters of the FASTA name (the first 90 in the case of TargetP 2.0), which must be unique within the file. Parts of the FASTA name beyond that length are ignored. Additionally, the FASTA name may not contain a '-' or '|', because SignalP/TargetP converts special characters in sequence names (e.g. '-' is changed to '_'). ASAFind requires at least 7 aa upstream and 22 aa downstream of the cleavage site suggested by SignalP/TargetP.

The basic output of ASAFind is a tab-delimited table containing the results for each sequence in the FASTA input file. The results table, the log files and, if requested, the graphical output are zipped and can be found in the folder 'output'. In the current version, graphical output can only be generated for predictions based on TargetP 2.0 output. Please save the results in a different location; each new run of ASAFind will overwrite the content of the folder 'output'.

Python >= 3.10 is required.
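The naming and length requirements above can be checked before a run. The sketch below is an illustration only and does not reproduce ASAFind's own validation: the function names and the predictor keys are hypothetical, the name is assumed to be the header text up to the first whitespace, and `cleavage_position` is assumed to be the number of residues upstream of the cleavage site.

```python
# Number of leading name characters considered per predictor, as stated above.
NAME_LENGTH = {"signalp3": 20, "signalp4": 58, "targetp2": 90}
FORBIDDEN_CHARS = ("-", "|")   # characters that may not appear in a FASTA name
MIN_UPSTREAM = 7               # aa required upstream of the cleavage site
MIN_DOWNSTREAM = 22            # aa required downstream of the cleavage site


def read_fasta_names(lines):
    """Return sequence names from FASTA header lines.

    Assumption: the name is the header text after '>' up to the first whitespace.
    """
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names


def check_names(names, predictor="targetp2"):
    """Report forbidden characters and names that clash after truncation."""
    limit = NAME_LENGTH[predictor]
    problems = []
    seen = {}
    for name in names:
        bad = [char for char in FORBIDDEN_CHARS if char in name]
        if bad:
            problems.append(f"{name}: contains forbidden character(s) {' '.join(bad)}")
        key = name[:limit]
        if key in seen:
            problems.append(f"{name}: first {limit} characters clash with {seen[key]}")
        else:
            seen[key] = name
    return problems


def has_enough_context(sequence_length, cleavage_position):
    """Check the 7 aa upstream / 22 aa downstream requirement."""
    upstream = cleavage_position
    downstream = sequence_length - cleavage_position
    return upstream >= MIN_UPSTREAM and downstream >= MIN_DOWNSTREAM


example = [">seq_1 some description", "MKFSA", ">seq-2", "MKLLA"]
for problem in check_names(read_fasta_names(example)):
    print(problem)
```

Renaming offending sequences before running SignalP/TargetP keeps the names in the FASTA file and in the prediction file identical.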

python S0_ASAFind.py --help


usage: S0_ASAFind.py	[-h] -f FASTA_FILE -p SIGNALP_FILE [-s SIMPLE_SCORE_CUTOFF] [-t FASTA_FILE_WITH_MOTIFS] [-w] [-v1] [-ppc] [-s_ppc SCORE_CUTOFF_PPC] [-t_ppc FASTA_FILE_WITH_MOTIFS_PPC] [-l] [-my_org MY_ORGANISM] [-v]


-h, --help	Show this help message and exit
-f FASTA_FILE, --fasta_file FASTA_FILE	Specify the input fasta FILE.
-p SIGNALP_FILE, --signalp_file SIGNALP_FILE	Specify the input TargetP/SignalP FILE.
-s SIMPLE_SCORE_CUTOFF, --simple_score_cutoff SIMPLE_SCORE_CUTOFF	Optionally, specify an explicit score cutoff, rather than using ASAFind's default algorithm, not compatible with option -v1. The score given here will not be normalized and therefore should be obtained from a distribution of normalized scores.
-t FASTA_FILE_WITH_MOTIFS, --fasta_file_with_motifs FASTA_FILE_WITH_MOTIFS	Optionally, specify a custom scoring table. The scoring table will be normalized with the maximum score, which allows for processing of non-normalized as well as normalized scoring tables.
-w, --web_output	Format output for web display. This is mostly useful when called by a web app.
-v1, --reproduce_ASAFind_1	Reproduce ASAFind 1.x scores and results (non-normalized scores, if no custom scoring table is specified, the original default scoring table generated without small sample size correction will be used, not compatible with option -s).
-ppc, --include_ppc_prediction	Include prediction of proteins that might be targeted to the periplastidic compartment.
-t SCORE_TABLE_FILE, --score_table_file SCORE_TABLE_FILE	Optionally, specify a custom scoring table. The scoring table will be normalized with the maximum score, which allows for processing of non-normalized as well as normalized scoring tables.
-o OUT_FILE, --out_file OUT_FILE	Specify the path and name of the output file you wish to create. Default will be the same as the fasta_file, but with a ".tab" suffix.
-s_ppc SCORE_CUTOFF_PPC, --score_cutoff_ppc SCORE_CUTOFF_PPC	Optionally, specify an explicit score cutoff for the ppc protein prediction, if given, ppc protein prediction will be included. The score given here will not be normalized and therefore should be obtained from a distribution of normalized scores.
-t_ppc SCORE_TABLE_FILE_PPC, --score_table_file_ppc SCORE_TABLE_FILE_PPC	Optionally, specify a custom scoring table for the ppc protein prediction, if given, ppc protein prediction will be included. The scoring table will be normalized with the maximum score, which allows for processing of non-normalized as well as normalized scoring tables.
-l, --logomaker	If chosen, the program will generate graphical output using logomaker, in .png and .svg formats. They will be included into the output compressed package.
-my_org MY_ORGANISM, --my_organism MY_ORGANISM	Specify the name of organism.
-v, --version	Show program's version number and exit.
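When ASAFind is called from another Python script, the option rules described above can be enforced before the program starts. The helper below is a hypothetical sketch, not part of ASAFind: it only uses flags listed in the help text, rejects the combination of -s and -v1 (which the help text describes as incompatible), and returns an argument list suitable for `subprocess.run`. It assumes S0_ASAFind.py is in the current working directory.

```python
import sys


def build_asafind_command(fasta_file, signalp_file, simple_score_cutoff=None,
                          reproduce_v1=False, include_ppc=False,
                          score_cutoff_ppc=None, logomaker=False):
    """Build an argument list for S0_ASAFind.py from the documented flags."""
    if simple_score_cutoff is not None and reproduce_v1:
        # The help text states that -s and -v1 are not compatible.
        raise ValueError("-s and -v1 cannot be combined")

    command = [sys.executable, "S0_ASAFind.py",
               "-f", str(fasta_file), "-p", str(signalp_file)]
    if simple_score_cutoff is not None:
        command += ["-s", str(simple_score_cutoff)]
    if reproduce_v1:
        command.append("-v1")
    if include_ppc:
        command.append("-ppc")
    if score_cutoff_ppc is not None:
        # According to the help text, giving -s_ppc already enables ppc prediction.
        command += ["-s_ppc", str(score_cutoff_ppc)]
    if logomaker:
        command.append("-l")
    return command


command = build_asafind_command("example.fasta", "example_summary.targetp2",
                                logomaker=True)
print(" ".join(command))
# To actually start the run (inside the activated environment):
#     import subprocess
#     subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.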

Usage

A test run can be started with the supplied example files using the following command:

python S0_ASAFind.py -f example.fasta -p example_summary.targetp2 -l

The results will be zipped and can be found in the folder 'output'. Example SignalP output files are also provided for SignalP 4.1 and SignalP 3; these can be used analogously.
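Because each new run overwrites the content of the folder 'output', the results should be copied elsewhere after every run. The following sketch shows one possible way to do this; the function name, the default archive folder `asafind_results`, and the timestamp label are assumptions made for illustration and are not part of ASAFind.

```python
import shutil
import time
from pathlib import Path


def archive_output(output_dir="output", archive_root="asafind_results", label=None):
    """Copy the ASAFind output folder to archive_root/label and return the new path.

    If no label is given, a timestamp is used so that successive runs
    end up in different folders.
    """
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        raise FileNotFoundError(f"output folder not found: {output_dir}")
    if label is None:
        label = time.strftime("%Y%m%d_%H%M%S")
    destination = Path(archive_root) / label
    # copytree creates the destination (and missing parents) and
    # refuses to overwrite an existing folder.
    shutil.copytree(output_dir, destination)
    return destination


# Typical use after a run, from the directory that contains 'output':
#     saved_to = archive_output(label="example_run")
#     print("Results copied to", saved_to)
```

Since `shutil.copytree` raises an error if the destination already exists, an earlier archive cannot be overwritten by accident.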

If you use ASAFind in your research please cite our publication (Gruber et al., 2025, https://doi.org/10.1111/tpj.70138) as well as the appropriate publications for SignalP or TargetP:

SignalP 5: Almagro Armenteros et al., 2019, https://doi.org/10.1038/s41587-019-0036-z
SignalP 4: Petersen et al., 2011, https://doi.org/10.1038/nmeth.1701
SignalP 3.0: Bendtsen et al., 2004, https://doi.org/10.1016/j.jmb.2004.05.028
TargetP 2.0: Almagro Armenteros et al., 2019, https://doi.org/10.26508/lsa.201900429

Further information on the biological background and on strategies for pre-sequence identification can be found in the following publications:

Gruber and Kroth, 2024, https://doi.org/10.1007/978-3-031-57446-7_15
Gruber et al., 2020, https://doi.org/10.48550/arXiv.2303.02509 (Comparative evaluation of statistical performance)
Gruber and Kroth, 2017, https://doi.org/10.1098/rstb.2016.0402
Gruber et al., 2015, https://doi.org/10.1111/tpj.12734 (Original publication of ASAFind)
Gruber and Kroth, 2014, https://doi.org/10.1007/978-1-62703-661-0_12

