2 releases
0.1.5-alpha.0 | Mar 25, 2023 |
---|---|
0.1.4-alpha.0 | Mar 3, 2023 |
#45 in #indices
705KB
5K
SLoC
caffe2op-find
Crate for finding the FindOp
operator,
a mathematical operation used in DSP and machine
learning computations.
The FindOp
operator can be used to efficiently
search for a set of Needles
within a larger
dataset. It operates by taking a Placeholder
input representing the larger dataset and an
IndexGet
input representing the Needles
, and
outputting a set of indices corresponding to the
locations of the Needles
within the dataset.
This operation is particularly useful in
applications where a large dataset needs to be
searched for a small set of values, such as in
image recognition or speech processing. The
FindOp
operator can significantly reduce the
computational cost of this task compared to
brute-force methods.
The caffe2op-find
crate also provides
functionality for modifying and inserting
Needles
into the dataset. The inserts
and
modifies
functions can be used to efficiently
update the dataset with new or updated Needles
.
Overall, the caffe2op-find
crate provides
a powerful set of tools for efficiently searching,
modifying, and outputting data in machine learning
and DSP applications.
2> !file caffe2op-finddup.query
caffe2op-finddup
Crate for finding duplicate elements in a dataset
using the FindDuplicateElementsOp
operator,
a mathematical operation used in DSP and machine
learning computations.
The FindDuplicateElementsOp
takes a dataset as
input and outputs the indices of the duplicate
elements within that dataset. This operator can be
useful in various applications where duplicate
data can cause issues, such as in data
preprocessing for machine learning algorithms or
in data deduplication.
The caffe2op-finddup
crate provides a convenient
implementation of the FindDuplicateElementsOp
using the FeedBlob
and FetchBlob
functions to
input and output data. The ResetWorkspace
function can be used to reset the state of the
workspace between calls to the operator.
The astype
function is provided to cast the
input data to a specific data type, while the
dict
parameter can be used to specify additional
options for the operation. The dupIndices
and
dupSize
variables are outputted by the operator
to give the indices and size of the duplicate
elements found. The retVal
variable can be used
to retrieve the output indices.
Overall, the caffe2op-finddup
crate provides
a powerful set of tools for efficiently finding
and outputting duplicate elements in datasets,
with applications in machine learning, data
preprocessing, and data deduplication.
Dependencies
~36MB
~404K SLoC