2 releases

0.1.5-alpha.0 Mar 25, 2023
0.1.4-alpha.0 Mar 3, 2023

#51 in #indices

BSD-3-Clause

705KB
5K SLoC

caffe2op-find

Crate for finding the FindOp operator, a mathematical operation used in DSP and machine learning computations.

The FindOp operator can be used to efficiently search for a set of Needles within a larger dataset. It operates by taking a Placeholder input representing the larger dataset and an IndexGet input representing the Needles, and outputting a set of indices corresponding to the locations of the Needles within the dataset.

This operation is particularly useful in applications where a large dataset needs to be searched for a small set of values, such as in image recognition or speech processing. The FindOp operator can significantly reduce the computational cost of this task compared to brute-force methods.

The caffe2op-find crate also provides functionality for modifying and inserting Needles into the dataset. The inserts and modifies functions can be used to efficiently update the dataset with new or updated Needles.

Overall, the caffe2op-find crate provides a powerful set of tools for efficiently searching, modifying, and outputting data in machine learning and DSP applications.

2> !file caffe2op-finddup.query

caffe2op-finddup

Crate for finding duplicate elements in a dataset using the FindDuplicateElementsOp operator, a mathematical operation used in DSP and machine learning computations.

The FindDuplicateElementsOp takes a dataset as input and outputs the indices of the duplicate elements within that dataset. This operator can be useful in various applications where duplicate data can cause issues, such as in data preprocessing for machine learning algorithms or in data deduplication.

The caffe2op-finddup crate provides a convenient implementation of the FindDuplicateElementsOp using the FeedBlob and FetchBlob functions to input and output data. The ResetWorkspace function can be used to reset the state of the workspace between calls to the operator.

The astype function is provided to cast the input data to a specific data type, while the dict parameter can be used to specify additional options for the operation. The dupIndices and dupSize variables are outputted by the operator to give the indices and size of the duplicate elements found. The retVal variable can be used to retrieve the output indices.

Overall, the caffe2op-finddup crate provides a powerful set of tools for efficiently finding and outputting duplicate elements in datasets, with applications in machine learning, data preprocessing, and data deduplication.

Dependencies

~37MB
~410K SLoC