ScAlaBle Estimator Regressor for heritability estimation

3 releases

0.1.2 Sep 5, 2019
0.1.1 Sep 5, 2019
0.1.0 Aug 12, 2019

Apache-2.0

Saber

ScAlaBle Estimator Regressor

Build

Installing Rust

curl https://sh.rustup.rs -sSf | sh

rustup toolchain install nightly

rustup default nightly

For more details, see https://www.rust-lang.org/tools/install and https://github.com/rust-lang/rustup.rs#working-with-nightly-rust

Build Saber

RUSTFLAGS='-L /path/to/OpenBLAS -lopenblas -C target-cpu=native' cargo build --release

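Replace /path/to/OpenBLAS with the directory that contains the OpenBLAS libraries. The -C target-cpu=native flag compiles for the CPU of the build machine, so the resulting binaries may not run on older processors.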

Run

After the build, the executables are located in ./target/release under the saber top-level directory.

Some executables of interest:

./target/release/partition_by_chrom -h
partition_by_chrom 0.1
Aaron Zhou

USAGE:
    partition_by_chrom --bim <BIM> --out <out_path>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -b, --bim <BIM>         required; the PLINK bim file
    -o, --out <out_path>    output path; each line will have two fields: variant_id chrom_partition_assignment
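
To make the output format above concrete, here is a minimal Python sketch that writes one "variant_id chromosome" line per variant of a PLINK .bim file. It is not the implementation of partition_by_chrom: it assumes the standard six-column .bim layout (chromosome, variant ID, genetic distance, base-pair position, two alleles) and assumes the partition label is simply the chromosome code. The function name, file names, and example rows are hypothetical.

```python
import tempfile
from pathlib import Path


def partition_by_chromosome(bim_path, out_path):
    """Write one 'variant_id chromosome' line per variant of a PLINK .bim file.

    Returns the number of variants written.
    """
    written = 0
    with open(bim_path) as bim, open(out_path, "w") as out:
        for line in bim:
            fields = line.split()
            if not fields:
                continue  # tolerate blank lines
            if len(fields) != 6:
                raise ValueError(
                    f"expected 6 fields per .bim line, got {len(fields)}"
                )
            # Column 1 is the chromosome code, column 2 the variant ID.
            chrom, variant_id = fields[0], fields[1]
            out.write(f"{variant_id} {chrom}\n")
            written += 1
    return written


# Illustrative rows only: positions and alleles are placeholders, not real data.
example_bim = "1 rs3115860 0 1000 A G\n2 rs_hypothetical 0 2000 C T\n"

with tempfile.TemporaryDirectory() as tmp:
    bim_file = Path(tmp) / "x.bim"
    out_file = Path(tmp) / "x.partition"
    bim_file.write_text(example_bim)
    n_variants = partition_by_chromosome(bim_file, out_file)
    partition_text = out_file.read_text()

print(partition_text, end="")
```

A file in this two-column shape matches the SNP_ID PARTITION format that the --partition option of the estimators expects.
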

./target/release/estimate_heritability -h
estimate_heritability 0.1
Aaron Zhou

USAGE:
    estimate_heritability [OPTIONS] --nrv <num_random_vecs> --pheno <pheno_path> --bfile <plink_filename_prefix>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --lowest-maf <lowest_allowed_maf>
            Lowest allowed minor allele frequency (MAF)
            Any SNPs with a MAF less than <lowest_allowed_maf> will be ignored
    -k, --num-jackknifes <num_jackknife_partitions>
            The number of jackknife partitions
            SNPs will be divided into <num_jackknife_partitions> partitions
            where each partition will be treated as a single point of observation [default: 20]
        --nrv <num_random_vecs>
            The number of random vectors used to estimate traces
            Recommends at least 100 for small datasets, and 10 for huge datasets
        --partition <partition_file>
            A file to partition the SNPs into multiple components.
            Each line consists of two values of the form:
            SNP_ID PARTITION
            For example,
            rs3115860 1
            will assign SNP with ID rs3115860 in the BIM file to a partition named 1
    -p, --pheno <pheno_path>
            The header line should be
            FID IID PHENOTYPE_NAME
            where PHENOTYPE_NAME can be any string without white spaces.
            The rest of the lines are of the form:
            1000011 1000011 -12.11363
    -b, --bfile <plink_filename_prefix>
            If we have files named 
            PATH/TO/x.bed PATH/TO/x.bim PATH/TO/x.fam 
            then the <plink_filename_prefix> should be path/to/x
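
The phenotype file format described under --pheno can be checked before a long run with a small parser. The following Python sketch is an illustration, not part of saber. It assumes the first two header fields are literally FID and IID, that fields are whitespace-separated, and that every phenotype value parses as a float; how saber itself treats missing values is not stated in the help text, so this sketch simply rejects them. The function name and the phenotype name HEIGHT are hypothetical.

```python
def read_phenotypes(lines):
    """Parse phenotype lines into (phenotype_name, {(fid, iid): value})."""
    it = iter(lines)
    header = next(it, "").split()
    if len(header) != 3 or header[:2] != ["FID", "IID"]:
        raise ValueError("header should be: FID IID PHENOTYPE_NAME")
    phenotype_name = header[2]

    values = {}
    for line_number, line in enumerate(it, start=2):
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields")
        fid, iid, raw_value = fields
        # float() raises ValueError on non-numeric entries such as "NA".
        values[(fid, iid)] = float(raw_value)
    return phenotype_name, values


example_lines = [
    "FID IID HEIGHT\n",
    "1000011 1000011 -12.11363\n",
]
name, phenotypes = read_phenotypes(example_lines)
print(name, phenotypes)
```

Passing an open file object instead of a list works the same way, since the function only iterates over lines.
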

./target/release/estimate_g_gxg_heritability -h
estimate_multi_gxg_heritability 0.1

USAGE:
    estimate_g_gxg_heritability [OPTIONS] --le <le_snps_filename_prefix> --nrv-gxg <num_rand_vecs_gxg> --nrv <num_random_vecs> --pheno <pheno_path>... --bfile <plink_filename_prefix>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --gxg-partition <gxg_partition_file>
            Form GxG for each of the partitions instead of
            over the entire range of LE SNPs.
            Taking the same file format as the --partition option
        --le <le_snps_filename_prefix>
            The SNPs that are in linkage equilibrium.
            To be used to construct the GxG matrix.
            If we have files named 
            PATH/TO/x.bed PATH/TO/x.bim PATH/TO/x.fam 
            then the <le_snps_filename_prefix> should be path/to/x
    -k, --num-jackknifes <num_jackknife_partitions>    The number of jackknife partitions [default: 20]
        --nrv-gxg <num_rand_vecs_gxg>
            The number of random vectors used to estimate traces related to the GxG matrix

        --nrv <num_random_vecs>
            The number of random vectors used to estimate traces
            Recommends at least 100 for small datasets, and 10 for huge datasets
        --partition <partition_file>
            A file to partition the G SNPs into multiple components.
            Each line consists of two values of the form:
            SNP_ID PARTITION
            For example,
            rs3115860 1
            will assign SNP with ID rs3115860 in the BIM file to a partition named 1
    -p, --pheno <pheno_path>...
            The header line should be
            FID IID PHENOTYPE_NAME
            where PHENOTYPE_NAME can be any string without white spaces.
            The rest of the lines are of the form:
            1000011 1000011 -12.11363
    -b, --bfile <plink_filename_prefix>
            If we have files named 
            PATH/TO/x.bed PATH/TO/x.bim PATH/TO/x.fam 
            then the <plink_filename_prefix> should be path/to/x
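
Because estimate_g_gxg_heritability takes several required options, it can help to assemble the command line programmatically. The Python sketch below only builds the argument list from the flags shown in the help output above; it does not run anything. The function name, all file paths, and the chosen numbers of random vectors are hypothetical. The usage line shows --pheno <pheno_path>..., which suggests more than one phenotype file may be accepted, but this sketch passes a single path.

```python
import shlex


def build_gxg_command(
    bfile,
    le,
    pheno,
    nrv,
    nrv_gxg,
    num_jackknifes=None,
    partition=None,
    gxg_partition=None,
    exe="./target/release/estimate_g_gxg_heritability",
):
    """Return the argv list for one estimate_g_gxg_heritability run."""
    argv = [
        exe,
        "--bfile", bfile,          # prefix of x.bed / x.bim / x.fam
        "--le", le,                # prefix of the LE SNP files
        "--nrv", str(nrv),
        "--nrv-gxg", str(nrv_gxg),
    ]
    # Optional flags are added only when a value is supplied.
    if num_jackknifes is not None:
        argv += ["--num-jackknifes", str(num_jackknifes)]
    if partition is not None:
        argv += ["--partition", partition]
    if gxg_partition is not None:
        argv += ["--gxg-partition", gxg_partition]
    # --pheno goes last because it may take more than one value.
    argv += ["--pheno", pheno]
    return argv


# Hypothetical paths and settings, for illustration only.
argv = build_gxg_command(
    bfile="data/all_snps",
    le="data/le_snps",
    pheno="data/height.pheno",
    nrv=100,
    nrv_gxg=100,
    partition="data/all_snps.partition",
)
print(shlex.join(argv))
```

The resulting list can be passed directly to subprocess.run(argv, check=True) once the binaries have been built.
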
