# 🎼🧬 `lightmotif-tfmpvalue`

*A Rust port of the TFMPvalue algorithm for the `lightmotif` crate.*

## 🗺️ Overview

**TFMPvalue** is an algorithm proposed by Touzet & Varré[1] for
computing a *p-value* from a score
obtained with a position weight matrix.
It uses discretization to compute an approximation of the score distribution
for the position weight matrix, iterating with growing levels of accuracy
until convergence is reached. This approach outperforms
dynamic-programming-based methods such as **LazyDistrib** by Beckstette *et al.*[2].

`lightmotif-tfmpvalue` provides an implementation of the **TFMPvalue** algorithm
to use with position weight matrices from the `lightmotif` crate.
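The discretization idea described above can be illustrated with a small standalone sketch. This is not the code of this crate nor the exact procedure from the paper: it assumes a uniform background over four nucleotides, builds the full score distribution by brute force, and halves the granularity at each step. The names `score_distribution`, `pvalue_bounds` and `pvalue` are hypothetical helpers introduced only for this illustration.

```rust
use std::collections::BTreeMap;

/// Score distribution of `matrix` discretized at `granularity`, assuming a
/// uniform background over the four nucleotides (an assumption of this sketch).
fn score_distribution(matrix: &[[f64; 4]], granularity: f64) -> BTreeMap<i64, f64> {
    let mut dist = BTreeMap::from([(0i64, 1.0f64)]);
    for column in matrix {
        let mut next = BTreeMap::new();
        for (&score, &prob) in &dist {
            for &weight in column {
                // Round each weight down to a multiple of the granularity.
                let rounded = (weight / granularity).floor() as i64;
                *next.entry(score + rounded).or_insert(0.0) += prob * 0.25;
            }
        }
        dist = next;
    }
    dist
}

/// Lower and upper bounds on P(score >= threshold) at a given granularity.
fn pvalue_bounds(matrix: &[[f64; 4]], threshold: f64, granularity: f64) -> (f64, f64) {
    let dist = score_distribution(matrix, granularity);
    let t = threshold / granularity;
    // Flooring loses less than one unit per column, so a rounded score `k`
    // corresponds to a true score in [k, k + len) granularity units.
    let len = matrix.len() as f64;
    let mut lower = 0.0;
    let mut upper = 0.0;
    for (&k, &p) in &dist {
        let k = k as f64;
        if k >= t {
            lower += p; // the true score is certainly above the threshold
        }
        if k > t - len {
            upper += p; // the true score may be above the threshold
        }
    }
    (lower, upper)
}

/// Refine the granularity until both bounds agree, giving up after `max_steps`.
fn pvalue(matrix: &[[f64; 4]], threshold: f64, max_steps: u32) -> Option<f64> {
    let mut granularity = 1.0;
    for _ in 0..max_steps {
        let (lower, upper) = pvalue_bounds(matrix, threshold, granularity);
        if lower == upper {
            return Some(lower);
        }
        granularity /= 2.0;
    }
    None
}
```

For a two-column matrix where one nucleotide scores `1.0` and the others `0.0` in each column, `pvalue(&matrix, 2.0, 16)` returns `Some(0.0625)`: the bounds disagree at granularity `1.0` and meet at `0.5`. The `max_steps` cap is there because the distribution grows as the granularity shrinks, so convergence can be costly for some matrices.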

## 💡 Example

Use `lightmotif` to create a position specific scoring matrix, and then use
the TFMPvalue algorithm to compute the exact P-value for a given score, or
a score threshold for a given P-value:

```rust
extern crate lightmotif;
extern crate lightmotif_tfmpvalue;

use lightmotif::pwm::CountMatrix;
use lightmotif::abc::Dna;
use lightmotif::seq::EncodedSequence;
use lightmotif_tfmpvalue::TfmPvalue;

// Use a ScoringMatrix from `lightmotif`
let pssm = CountMatrix::<Dna>::from_sequences(&[
    EncodedSequence::encode("GTTGACCTTATCAAC").unwrap(),
    EncodedSequence::encode("GTTGATCCAGTCAAC").unwrap(),
])
    .unwrap()
    .to_freq(0.25)
    .to_scoring(None);

// Initialize the TFMPvalue algorithm for the given PSSM
// (the `pssm` reference must outlive `tfmp`).
let mut tfmp = TfmPvalue::new(&pssm);

// Compute the exact p-value for a given score
let pvalue = tfmp.pvalue(19.3);
assert_eq!(pvalue, 1.4901161193847656e-08);

// Compute the exact score for a given p-value
let score = tfmp.score(pvalue);
assert_eq!(score, 19.3);
```

*Note that in the example above, the computation is not bounded, so for certain
matrices the algorithm may require a large amount of memory to
converge. Use the `TfmPvalue::approximate_pvalue` and
`TfmPvalue::approximate_score` methods to obtain an iterator over the
algorithm iterations, allowing you to stop at any given time based on
an external criterion such as total memory usage.*

## 💭 Feedback

### ⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

## 📋 Changelog

This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.

## ⚖️ License

This library is provided under the open-source GNU General Public License v3.0. The original TFMPvalue implementation was written by the BONSAI team of CRIStAL, Université de Lille and is available under the terms of the GNU General Public License v2.0.

*This project is in no way affiliated with, sponsored by, or otherwise endorsed
by the original TFMPvalue authors. It was
developed by Martin Larralde during his PhD
project at the European Molecular Biology Laboratory
in the Zeller team.*

## 📚 References

1. Touzet, Hélène and Jean-Stéphane Varré. ‘Efficient and accurate P-value computation for Position Weight Matrices’. Algorithms for Molecular Biology 2, 1–12 (2007). doi:10.1186/1748-7188-2-15.
2. Beckstette, Michael, Robert Homann, and Robert Giegerich. ‘Fast index based algorithms and software for matching position specific scoring matrices’. BMC Bioinformatics 7, 389 (2006). doi:10.1186/1471-2105-7-389.