37 releases
0.9.0 | Dec 18, 2023 |
---|---|
0.8.0 | May 31, 2019 |
0.7.4 | Aug 22, 2018 |
0.6.12 | Jun 18, 2018 |
0.3.3 | Mar 29, 2018 |
#315 in Algorithms
127 downloads per month
355KB
7K
SLoC
program-induction
A library for program induction and learning representations.
Implements Bayesian program learning and genetic programming. See the docs for more information.
Installation
Install rust and ensure you're up to date (rustup update
).
In a new or existing project, add the following to your Cargo.toml
:
[dependencies]
programinduction = "0.9"
# many examples also depend on polytype for the tp! and ptp! macros:
polytype = "7.0"
The documentation requires a custom HTML header to include KaTeX for math
support. This isn't supported by cargo doc
, so to build the documentation
you may use:
cargo rustdoc -- --html-in-header rustdoc-include-katex-header.html
Usage
Specify a probabilistic context-free grammar (PCFG; see pcfg::Grammar
) and
induce a sentence that matches an example:
use polytype::tp;
use programinduction::pcfg::{task_by_evaluation, Grammar, Rule};
use programinduction::{ECParams, EC};
fn evaluate(name: &str, inps: &[i32]) -> Result<i32, ()> {
match name {
"0" => Ok(0),
"1" => Ok(1),
"plus" => Ok(inps[0] + inps[1]),
_ => unreachable!(),
}
}
fn main() {
let g = Grammar::new(
tp!(EXPR),
vec![
Rule::new("0", tp!(EXPR), 1.0),
Rule::new("1", tp!(EXPR), 1.0),
Rule::new("plus", tp!(@arrow[tp!(EXPR), tp!(EXPR), tp!(EXPR)]), 1.0),
],
);
let ec_params = ECParams {
frontier_limit: 1,
search_limit_timeout: None,
search_limit_description_length: Some(8.0),
};
// task: the number 4
let task = task_by_evaluation(&evaluate, &4, tp!(EXPR));
let frontiers = g.explore(&ec_params, &[task]);
let sol = &frontiers[0].best_solution().unwrap().0;
println!("{}", g.display(sol));
}
The Exploration-Compression (EC) algorithm iteratively learns a better
representation by finding common structure in induced programs. We can run
the EC algorithm with a polymorphically-typed lambda calculus representation
lambda::Language
in a Boolean circuit domain:
use polytype::{ptp, tp};
use programinduction::{domains, lambda, ECParams, EC};
fn main() {
// circuit DSL
let dsl = lambda::Language::uniform(vec![
// NAND takes two bools and returns a bool
("nand", ptp!(@arrow[tp!(bool), tp!(bool), tp!(bool)])),
]);
// parameters
let lambda_params = lambda::CompressionParams::default();
let ec_params = ECParams {
frontier_limit: 1,
search_limit_timeout: Some(std::time::Duration::new(1, 0)),
search_limit_description_length: None,
};
// randomly sample 250 circuit tasks
let rng = &mut rand::thread_rng();
let tasks = domains::circuits::make_tasks(rng, 250);
// one iteration of EC:
let (new_dsl, _solutions) = dsl.ec(&ec_params, &lambda_params, &tasks);
// print the new concepts it invented, based on common structure:
for (expr, _, _) in &new_dsl.invented {
println!("invented {}", new_dsl.display(expr))
// one of the inventions was "(λ (nand $0 $0))",
// which is the common and useful NOT operation!
}
}
You may have noted the above use of domains::circuits
. Some domains are
already implemented for you. Currently, this only consists of circuits and
strings.
TODO
(you could be the one who does one of these!)
- First-class function evaluation within Rust (and remove lisp interpreters).
- Add task generation function in
domains::strings
- Fallible evaluation (e.g. see how
domains::strings
handlesslice
). - Lazy evaluation.
-
impl GP for pcfg::Grammar
is not yet complete. - Eta-long sidestepping (so
f
gets enumerated instead of(λ (f $0))
) - Consolidate lazy/non-lazy evaluation (for ergonomics).
- Permit non-
&'static str
-namedType
/TypeScheme
. - Ability to include recursive primitives in
lambda
representation. - Faster lambda calculus evaluation (less cloning; bubble up whether beta reduction happened rather than ultimate equality comparison).
- PCFG compression is currently only estimating parameters, not actually learning pieces of programs. An adaptor grammar approach seems like a good direction to go, perhaps minus the Bayesian non-parametrics.
- Add more learning traits (like
EC
orGP
) - Add more representations
- Add more domains
Dependencies
~5–11MB
~126K SLoC