#gff3 #fasta #bioinformatics

app stats_on_gff3

Calculate statistics such as CDS GC3 ratio, intron GC ratio, flanking gene region GC ratio, first intron length, number of introns, CpG ratio, etc.

Examples:

stats_on_gff3 Homo_sapiens.GRCh38.109.chromosome.1.gff3 Homo_sapiens.GRCh38.dna.chromosome.1.fa

zcat Ciona_savignyi.CSAV2.0.dna.toplevel.fa.gz | stats_on_gff3 Ciona_savignyi.CSAV2.0.109.gff3 stdin

See https://gitlab.in2p3.fr/penel/stats_on_gff3
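To make two of the statistics named above concrete, here is a minimal Rust sketch of a CDS GC3 ratio and a CpG ratio. It is not taken from the stats_on_gff3 source. The function names `gc3_ratio` and `cpg_obs_exp` are hypothetical, and the definitions are assumptions: GC3 is taken as the fraction of G or C at third codon positions over all complete codons, and the CpG ratio as the commonly used observed/expected value (CG count times sequence length, divided by C count times G count). The tool itself may define or normalise these differently.

```rust
/// Fraction of G or C at third codon positions of a coding sequence.
/// Assumption: `cds` is uppercase and in frame (starts at codon position 1).
/// A trailing partial codon is ignored. Returns None if there is no complete codon.
fn gc3_ratio(cds: &str) -> Option<f64> {
    let bytes = cds.as_bytes();
    let codons = bytes.len() / 3;
    if codons == 0 {
        return None;
    }
    // Count complete codons whose third base is G or C.
    let gc = bytes
        .chunks_exact(3)
        .filter(|codon| matches!(codon[2], b'G' | b'C'))
        .count();
    Some(gc as f64 / codons as f64)
}

/// Observed/expected CpG ratio: (#CG * length) / (#C * #G).
/// Assumption: `seq` is uppercase. Returns None when the sequence has no C or no G,
/// because the expected count would be zero.
fn cpg_obs_exp(seq: &str) -> Option<f64> {
    let bytes = seq.as_bytes();
    let c = bytes.iter().filter(|&&b| b == b'C').count();
    let g = bytes.iter().filter(|&&b| b == b'G').count();
    if c == 0 || g == 0 {
        return None;
    }
    // Count every position where C is immediately followed by G.
    let cg = bytes
        .windows(2)
        .filter(|w| w[0] == b'C' && w[1] == b'G')
        .count();
    Some((cg * bytes.len()) as f64 / (c * g) as f64)
}
```

For the sequence ATGGCCAAA, the third codon positions are G, C and A, so `gc3_ratio` gives 2/3. Both functions only recognise uppercase bases, which is consistent with the uppercase requirement stated under "Input data".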

26 releases

0.1.26 Feb 22, 2024
0.1.25 Feb 22, 2024
0.1.18 Aug 1, 2023
0.1.17 Jul 28, 2023
0.1.4 Apr 28, 2023

#21 in Biology

Download history

220/week @ 2024-02-10
263/week @ 2024-02-17
73/week @ 2024-02-24
8/week @ 2024-03-02
14/week @ 2024-03-09
160/week @ 2024-03-23
8/week @ 2024-03-30

182 downloads per month

CECILL-2.1

61KB
1K SLoC

Stats of gff3 files

Install:

cargo install stats_on_gff3

Crates: https://crates.io/crates/stats_on_gff3

Examples:

stats_on_gff3 --precision 1000 Homo_sapiens.GRCh38.109.chromosome.1.gff3 Homo_sapiens.GRCh38.dna.chromosome.1.fa 2> err

stats_on_gff3 --all Homo_sapiens.GRCh38.109.chromosome.1.gff3 Homo_sapiens.GRCh38.dna.chromosome.1.fa 2> err

zcat Ciona_savignyi.CSAV2.0.dna.toplevel.fa.gz | stats_on_gff3 --precision 100 Ciona_savignyi.CSAV2.0.109.gff3 stdin 2> err

Input data:

A GFF3 file and its associated FASTA file, both from Ensembl. FASTA sequences should be in uppercase. For NCBI data, see stats_on_gff3_ncbi: https://crates.io/crates/stats_on_gff3_ncbi
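If a FASTA file contains lowercase bases, it can be converted before being passed to the tool. The following Rust sketch shows one way to do that; it is an illustration, not part of stats_on_gff3. The function name `uppercase_fasta` is hypothetical, and the sketch assumes a plain-text FASTA in which header lines start with `>` and all other lines are sequence.

```rust
use std::io::{self, BufRead, Write};

/// Copy a FASTA stream from `input` to `output`, uppercasing sequence lines.
/// Header lines (starting with '>') are written unchanged.
fn uppercase_fasta<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        if line.starts_with('>') {
            // Keep identifiers and descriptions exactly as they are.
            writeln!(output, "{}", line)?;
        } else {
            // Only ASCII letters are affected; other characters pass through.
            writeln!(output, "{}", line.to_ascii_uppercase())?;
        }
    }
    Ok(())
}
```

Called with a locked `std::io::stdin()` as input and `std::io::stdout()` as output, this could sit in a pipeline between `zcat` and `stats_on_gff3 ... stdin`. Headers are left untouched so that sequence names still match the identifiers used in the GFF3 file.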

Dependencies

~17MB
~310K SLoC