
Caffe2op-Percentile

Short Description

A Rust crate for defining the PercentileOp, a mathematical operator used in DSP and machine learning computations.

Note: This crate is currently being translated from C++ to Rust, and some function bodies may still be in the process of translation.

Description

The PercentileOp is a mathematical operator that computes the percentile of a given input tensor along a specified axis. It is commonly used in data preprocessing and feature engineering tasks in machine learning workflows.

To use this operator within a Caffe2 framework, the crate provides declare_export_caffe2_op_to_c10 and export_caffe2_op_to_c10_cpu for registering the operator with the Caffe2 registry. The operator defines no gradient, so it cannot participate in backpropagation-based optimization.

Internally, the PercentileOp is implemented using the run_on_device method and registered as a CPU operator using the register_cpu_operator function.

The operator can be expressed mathematically using the following equation:

Percentile(x, p) = q

where x is the input tensor, p is a percentile value between 0 and 100, and q is the value at or below which p percent of the elements of x fall, computed along the specified axis.
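
As an illustration of the math only, and not the crate's actual kernel, the following minimal Rust sketch computes a linear-interpolation percentile over a flat slice; the helper name `percentile` and the interpolation scheme are assumptions made for the example.

```rust
/// Linear-interpolation percentile over a flat slice (illustrative sketch).
/// Assumes `p` is in [0, 100] and `data` is non-empty.
fn percentile(data: &[f32], p: f32) -> f32 {
    let mut sorted = data.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // Fractional rank into the sorted values, e.g. p = 50 lands mid-slice.
    let rank = (p / 100.0) * (sorted.len() as f32 - 1.0);
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f32;
    // Interpolate between the two nearest order statistics.
    sorted[lo] + (sorted[hi] - sorted[lo]) * frac
}

fn main() {
    let x = [3.0_f32, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0];
    println!("median   = {}", percentile(&x, 50.0)); // 3.5
    println!("90th pct = {}", percentile(&x, 90.0)); // 6.9
}
```

A tensor-aware version would apply the same reduction independently along the chosen axis rather than over a single flat slice.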

Overall, the Caffe2op-Percentile crate provides a useful tool for performing percentile calculations on tensors in a Caffe2 framework.

29> could you teach me more about why we use percentiles in the context of a deep neural network?

Certainly! Percentiles are often used in deep neural networks (DNNs) as a way of measuring the spread or distribution of the values in a dataset. This is useful in many different contexts in machine learning, such as understanding the distribution of feature values in a dataset, detecting outliers or anomalies in data, and analyzing the uncertainty or variability in model predictions.

One common use of percentiles in DNNs is to measure the spread of activation values in a neural network layer. Activation values are the output values of each neuron in a layer, which are computed from the input data and the learned weights of the network. By measuring the distribution of these activation values across a batch of input data, we can gain insights into how the network is processing the data and how it might be improved.

For example, if the activation values in a layer are highly concentrated around the median value (the 50th percentile), this might suggest that the layer is not providing much discrimination or variability in the output. On the other hand, if the activation values are spread out across a wider range of percentiles (e.g., 10th to 90th percentile), this might suggest that the layer is providing more useful information and capturing more of the variability in the input data.
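
As a rough sketch of that diagnostic, with made-up activation values and a simple nearest-rank quantile standing in for a real percentile kernel, one might compare the interdecile range of a layer's activations to its median:

```rust
fn main() {
    // Hypothetical activations of one layer over a small batch.
    let mut acts = [0.01_f32, 0.02, 0.03, 0.05, 0.05, 0.06, 0.90, 0.95];
    acts.sort_by(|a, b| a.partial_cmp(b).unwrap());

    // Nearest-rank quantile: coarser than the interpolated sketch above,
    // but adequate for a quick spread diagnostic.
    let q = |p: f32| acts[((p / 100.0) * (acts.len() - 1) as f32).round() as usize];

    let (p10, p50, p90) = (q(10.0), q(50.0), q(90.0));
    // A wide 10th-90th band relative to the median suggests the layer's
    // outputs vary across inputs; a narrow band suggests near-constant output.
    println!("p10 = {p10}, median = {p50}, p90 = {p90}");
    println!("interdecile range = {}", p90 - p10);
}
```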

Percentiles can also be used to measure the spread of model prediction values. For example, if a model is trained to predict a continuous variable (like house prices), we can use percentiles to measure the spread of predicted values across a test dataset. This can be useful for detecting outliers or anomalies in the predictions, or for understanding the variability or uncertainty in the model's predictions.
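
Along the same lines, a percentile-based fence (a Tukey-style rule here, with invented price values) can flag unusually large predictions; this is only a sketch of the idea, not something the crate itself provides:

```rust
fn main() {
    // Hypothetical predicted house prices (in thousands) on a test set.
    let preds = [210.0_f32, 225.0, 198.0, 240.0, 215.0, 1200.0, 205.0, 230.0];

    let mut sorted = preds.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // Nearest-rank quartiles; values far above the bulk of the distribution
    // (beyond q3 + 1.5 * IQR) are flagged as candidate outliers.
    let q = |p: f32| sorted[((p / 100.0) * (sorted.len() - 1) as f32).round() as usize];
    let (q1, q3) = (q(25.0), q(75.0));
    let upper_fence = q3 + 1.5 * (q3 - q1);

    let outliers: Vec<f32> = preds.iter().copied().filter(|&v| v > upper_fence).collect();
    println!("upper fence = {upper_fence}, candidate outliers = {:?}", outliers);
}
```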

In summary, percentiles are a useful tool for measuring the distribution of values in a dataset or model, and can provide insights into how the data or model is behaving.
