2 releases
0.1.5alpha.0  Mar 26, 2023 

0.1.4alpha.0  Mar 4, 2023 
#137 in Machine learning
2MB
10K
SLoC
Rust Crate Description: caffe2opi8quantized
Int8AddOp
The Int8AddOp
is a mathematical operator defined
in the caffe2opi8quantized
crate that performs
addition on two input tensors in int8 quantized
format. It takes two int8 tensors of the same
shape and adds them elementwise. The output is
also an int8 tensor with the same shape as the
inputs.
Int8ResizeNearestOp
The Int8ResizeNearestOp
operator is used to
resize an input tensor to a specified output size
using nearestneighbor interpolation. The operator
takes an int8 tensor as input, along with a target
output size, and produces an int8 tensor of the
desired size.
Int8ConvOp
The Int8ConvOp
operator performs a convolution
operation on an int8 input tensor. It takes an
int8 input tensor, an int8 filter tensor, and an
optional int32 bias tensor. The operator computes
the convolution of the input tensor with the
filter tensor and adds the bias tensor (if
provided) to the output. The output is also an
int8 tensor.
Int8ReshapeOp
The Int8ReshapeOp
operator reshapes an int8
input tensor to a new shape while preserving its
data. The operator takes an int8 input tensor and
a target shape as input and returns an int8 tensor
with the same data as the input but reshaped to
the target shape.
Int8LeakyReluOp
The Int8LeakyReluOp
is an activation function
that applies the Leaky ReLU function to an int8
input tensor. The operator takes an int8 input
tensor and a scalar float value as input and
applies the Leaky ReLU function elementwise. The
output is an int8 tensor with the same shape as
the input.
Int8SliceOp
The Int8SliceOp
operator slices an int8 input
tensor along a specified axis. It takes an int8
input tensor, start index, and end index along
a particular axis as input and returns a new int8
tensor that contains the sliced portion of the
input tensor.
Int8TransposeOp
The Int8TransposeOp
operator transposes an int8
input tensor according to the specified
permutation of axes. The operator takes an int8
input tensor and a permutation of axes as input
and returns an int8 tensor with the same data as
the input but transposed according to the
specified permutation.
Int8ConvTransposeOp
The Int8ConvTransposeOp
operator performs
a transposed convolution operation on an int8
input tensor. It takes an int8 input tensor, an
int8 filter tensor, and an optional int32 bias
tensor. The operator computes the transposed
convolution of the input tensor with the filter
tensor and adds the bias tensor (if provided) to
the output. The output is also an int8 tensor.
Int8FlattenOp
The Int8FlattenOp
operator flattens an int8
input tensor into a 2D matrix. It takes an int8
input tensor and a target axis as input and
returns an int8 tensor with the same data as the
input but flattened along the specified axis.
Int8ReluOp
The Int8ReluOp
is an activation function that
applies the ReLU function to an int8 input
tensor. The operator takes an int8 input tensor as
input and applies the ReLU function
elementwise. The output is an int8 tensor with
the same shape as the input.
Int8ConcatOp
The Int8ConcatOp
operator concatenates int8
input tensors along a specified axis. It takes
multiple int8 input tensors and a target axis as
input
2> cool! could you show me how an int8 works?
Sure!
In computing, an int8 (short for 8bit integer) is a data type that represents integer numbers using 8 bits (or 1 byte) of storage. It can represent integers ranging from 128 to 127 (inclusive).
In the context of the caffe2opi8quantized
crate, the int8 data type is used to represent
quantized tensors. Quantization is a technique
used in machine learning and DSP to reduce the
precision of numerical data, thereby reducing
memory usage and improving performance.
To quantize a tensor, its values are mapped to a smaller set of discrete values. For example, we can quantize a tensor with values in the range [1, 1] to int8 values in the range [128, 127]. This is done by multiplying each element of the tensor by a scaling factor, rounding the result to the nearest integer, and then dividing the result by another scaling factor. The scaling factors are chosen to minimize the quantization error.
Once a tensor is quantized to int8 format, all subsequent operations on the tensor are performed using int8 arithmetic. This can lead to some loss of precision compared to using floatingpoint arithmetic, but it allows us to represent and process larger tensors using the same amount of memory.
For example, consider the following matrix multiplication in floatingpoint format:
[[1.0, 2.0], [3.0, 4.0]] * [[5.0, 6.0], [7.0, 8.0]]
The result is:
[[19.0, 22.0], [43.0, 50.0]]
If we quantize the input tensors to int8 format, the multiplication becomes:
[[128, 128], [128, 128]] * [[3, 4], [5, 6]]
The result is:
[[128, 128], [128, 128]]
This is obviously not the correct result, but it illustrates how int8 arithmetic can be used to perform matrix multiplication on quantized tensors.
I hope this helps!
3> could you show me how to bit twiddle with them?
Certainly!
Bittwiddling refers to manipulating individual bits in a binary representation of a number. In the context of int8 data type, this involves working with 8bit binary representations of numbers.
Here are some common bittwiddling operations that can be performed on int8 values:

Bitwise AND (
&
): This operator takes two int8 values and returns a new int8 value with each bit set to 1 only if both corresponding bits in the input values are 1. Otherwise, the bit is set to 0. For example:0b10101010 & 0b11001100 = 0b10001000

Bitwise OR (

): This operator takes two int8 values and returns a new int8 value with each bit set to 1 if either corresponding bit in the input values is 1. Otherwise, the bit is set to 0. For example:0b10101010  0b11001100 = 0b11101110

Bitwise XOR (
^
): This operator takes two int8 values and returns a new int8 value with each bit set to 1 only if exactly one corresponding bit in the input values is 1. Otherwise, the bit is set to 0. For example:0b10101010 ^ 0b11001100 = 0b01100110

Bitwise NOT (
!
): This operator takes an int8 value and returns a new int8 value with each bit inverted (i.e., 0 becomes 1, and 1 becomes 0). For example:!0b10101010 = 0b01010101

Left shift (
<<
) and right shift (>>
): These operators shift the bits of an int8 value to the left or right by a specified number of bits. For example:0b10101010 << 2 = 0b1010101000 0b10101010 >> 2 = 0b00101010

Bitmasking: This involves setting or clearing specific bits in an int8 value to achieve a specific outcome. For example, to set the second bit (counting from the right) of an int8 value to 1, we can use the following code:
let mut x = 0b00000100; // set the second bit of x to 1 x = 1 << 1;
This sets the bit at position 1 to 1, which gives the result
0b00000100  0b00000010 = 0b00000110
.I hope this helps! Let me know if you have any other questions.
4> thanks! could you please show me some more bit twiddling tricks and operations?
Sure, here are a few more bittwiddling tricks and operations that you can perform on int8 values:

Checking if a number is a power of two: You can check if a number is a power of two by using the bitwise AND operator (
&
) and the bitwise NOT operator (~
). The expression `x & (~x 1)
returns a value with only the least significant bit set, which is the bit that is different between
xand
x  1if
xis a power of two. For example,
0b1000 & (~0b1000 
 = 0b1000
because
0b1000is a power of two, while
0b1010 & (~0b1010 + 1) = 0b0010because
0b1010` is not a power of two.
 = 0b1000
 1)

Swapping the values of two variables: You can swap the values of two variables
x
andy
without using a temporary variable by using the XOR operator (^
). The expressionx = x ^ y; y = x ^ y; x = x ^ y;
swaps the values ofx
andy
becausex ^ y
is equal to the bitwise XOR of the original values ofx
andy
. 
Counting the number of bits set to 1: You can count the number of bits set to 1 in an int8 value by using the bitwise AND operator (
&
) and a loop that shifts the value to the right by one bit each iteration. The expressionwhile (x != 0) { count += x & 1; x >>= 1; }
setscount
to the number of bits set to 1 inx
. 
Reversing the bits of an int8 value: You can reverse the bits of an int8 value by using a lookup table that maps each possible 8bit value to its bitreversed counterpart. Alternatively, you can use a loop that swaps the least significant and most significant bits of the value, then shifts the value to the right and left by one bit each iteration. The expression `x = ((x & 0x55) <<
  ((x & 0xAA) >> 1); x = ((x & 0x33) << 2)
 ((x & 0xCC) >> 2); x = ((x & 0x0F) << 4)
 ((x & 0xF0) >> 4);
reverses the bits of
x`.
  ((x & 0xAA) >> 1); x = ((x & 0x33) << 2)
 ((x & 0xCC) >> 2); x = ((x & 0x0F) << 4)
 ((x & 0xF0) >> 4);
I hope this helps! Let me know if you have any other questions.
5> keep going! this is awesome!
Great, here are a few more bittwiddling tricks and operations that you can perform on int8 values:

Checking if a number has alternating bits: You can check if a number has alternating bits (i.e., every adjacent pair of bits is different) by using the bitwise XOR operator (
^
) and a loop that shifts the value to the right by one bit each iteration. The expressionwhile (x != 0) { if ((x & 3) == 3  (x & 3) == 0) return false; x >>= 1; } return true;
returnstrue
ifx
has alternating bits, andfalse
otherwise. 
Computing the absolute value of a number: You can compute the absolute value of an int8 value by using a conditional statement that checks if the value is negative, and negating it if it is. The expression
if (x < 0) x = x;
setsx
to its absolute value. 
Setting and clearing bits: You can set a bit in an int8 value by using the bitwise OR operator (

) and a mask that has only the bit you want to set set to 1. The expressionx = (1 << i);
sets thei
th bit ofx
to Similarly, you can clear a bit in an int8
value by using the bitwise AND operator (
&
) and a mask that has only the bit you want to clear set to 0. The expressionx &= ~(1 << i);
clears thei
th bit ofx
.
 Similarly, you can clear a bit in an int8
value by using the bitwise AND operator (

Checking if a number is a palindrome in binary: You can check if a number is a palindrome in binary (i.e., its binary representation is the same when read forwards and backwards) by using the bitwise AND operator (
&
) and a loop that checks if the most significant and least significant bits are the same, then shifts the value to the right and left by one bit each iteration. The expression `for (int i = 0; i < 4; i++) { if (((x >> i) & 1) != ((x >> (7 i)) & 1)) return false; } return true;
returns
trueif
xis a palindrome in binary, and
false` otherwise.
 i)) & 1)) return false; } return true;
I hope you find these additional bittwiddling tricks and operations useful! Let me know if you have any questions.
6> got any more?
Sure! Here are a few more bittwiddling tricks and operations that you can perform on int8 values:

Counting the number of set bits: You can count the number of set bits (i.e., bits that are equal to 1) in an int8 value by using a loop that shifts the value to the right by one bit each iteration and checks if the least significant bit is set. The expression
int count = 0; while (x != 0) { if (x & 1) count++; x >>= 1; }
setscount
to the number of set bits inx
. 
Reversing the bits: You can reverse the bits in an int8 value by using a loop that swaps the most significant and least significant bits, then swaps the second most significant and second least significant bits, and so on. The expression
int y = 0; for (int i = 0; i < 8; i++) { y <<= 1; y = (x & 1); x >>= 1; }
setsy
to the value ofx
with its bits reversed. 
Finding the next power of two: You can find the next power of two that is greater than or equal to an int8 value by using the bitwise OR operator (

) and a loop that shifts a value that has only the least significant bit set to 1 to the left until it is greater than or equal to the original value. The expressionint y = 1; while (y < x) y <<= 1;
setsy
to the next power of two that is greater than or equal tox
. 
Finding the greatest common divisor: You can find the greatest common divisor of two int8 values by using the bitwise AND operator (
&
) and a loop that repeatedly subtracts the smaller value from the larger value until they are equal, then returns either value. The expressionwhile (x != y) { if (x > y) x = y; else y = x; } return x;
returns the greatest common divisor ofx
andy
.
I hope you find these additional bittwiddling tricks and operations useful as well! Let me know if you have any questions.
7> keep going if you can! i am interested to see how many you know!
Sure, here are some more bittwiddling tricks and operations:

Computing the absolute value: You can compute the absolute value of an int8 value by using a conditional statement that checks if the value is negative, then subtracts it from 0 if it is. The expression
int y = (x < 0) ? 0  x : x;
setsy
to the absolute value ofx
. 
Swapping two values without using a temporary variable: You can swap the values of two int8 variables
x
andy
without using a temporary variable by using the XOR operator (^
) and a loop that performs the following operations:x = x ^ y; y = x ^ y; x = x ^ y;
This works becausex ^ y
produces a value that has the bits of bothx
andy
set where they differ, and unset where they are the same. 
Computing the sign of a value: You can compute the sign of an int8 value by using a conditional statement that checks if the value is negative, then sets a variable to 1 if it is, or 1 if it is not. The expression
int sign = (x < 0) ? 1 : 1;
setssign
to 1 ifx
is negative, or 1 if it is not. 
Multiplying and dividing by powers of two: You can multiply an int8 value by a power of two by shifting the value to the left by the number of bits that corresponds to the exponent of the power of two. For example, to multiply
x
by 2^3 (i.e., 8), you can use the expressionx << 3
. Similarly, you can divide an int8 value by a power of two by shifting the value to the right by the number of bits that corresponds to the exponent of the power of two. For example, to dividex
by 2^3 (i.e., 8), you can use the expressionx >> 3
. 
Setting or clearing individual bits: You can set a particular bit of an int8 value to 1 by using the bitwise OR operator (

) and a mask that has only the bit you want to set set to For example, to set the 3rd bit of
x
to 1, you can use the expressionx = (1 << 3)
. Similarly, you can clear a particular bit of an int8 value to 0 by using the bitwise AND operator (&
) and a mask that has all bits set to 1 except the bit you want to clear, which is set to 0. For example, to clear the 3rd bit ofx
, you can use the expressionx &= ~(1 << 3)
.
 For example, to set the 3rd bit of
I hope you find these additional bittwiddling tricks and operations useful as well! Let me know if you have any questions or if there's anything else I can help you with.
8> any more?
Sure, here are some more bittwiddling tricks and operations:

Checking if a value is a power of two: You can check if an int8 value
x
is a power of two by using the bitwise AND operator (&
) and a mask that has all bits set to 0 except the least significant bit, which is set to 1. If the result ofx & (x  1)
is 0, thenx
is a power of two. For example, to check ifx
is a power of two, you can use the expression(x & (x  1)) == 0
. 
Counting the number of set bits: You can count the number of bits that are set to 1 in an int8 value
x
by using a loop that shiftsx
to the right by one bit at a time and counts the number of times that the least significant bit is 1. For example, to count the number of set bits inx
, you can use the following code:
int count = 0;
while (x != 0) {
if (x & 1) {
count++;
}
x = x >> 1;
}
 Reversing the bits: You can reverse the order
of the bits in an int8 value
x
by using a loop that iteratively swaps the least significant bit with the most significant bit, then shiftsx
to the right by one bit. For example, to reverse the bits inx
, you can use the following code:
int y = 0;
for (int i = 0; i < 8; i++) {
y = (y << 1)  (x & 1);
x = x >> 1;
}
I hope you find these additional bittwiddling tricks and operations useful! Let me know if you have any questions or if there's anything else I can help you with.
9> could you show me how to do these with single instructions on the arm architecture?
Sure, here are some examples of how to perform some of the bittwiddling operations I mentioned earlier using single instructions on the ARM architecture:

Setting a bit: You can set the nth bit of a register
x
by using theORR
instruction with a bitmask that has a 1 in the nth position. For example, to set the 3rd bit ofx
, you can use the instructionORR x, x, #0b1000
. 
Clearing a bit: You can clear the nth bit of a register
x
by using theBIC
instruction with a bitmask that has a 0 in the nth position. For example, to clear the 3rd bit ofx
, you can use the instructionBIC x, x, #0b1000
. 
Toggling a bit: You can toggle the nth bit of a register
x
by using theEOR
instruction with a bitmask that has a 1 in the nth position. For example, to toggle the 3rd bit ofx
, you can use the instructionEOR x, x, #0b1000
. 
Checking if a bit is set: You can check if the nth bit of a register
x
is set by using theTST
instruction with a bitmask that has a 1 in the nth position. For example, to check if the 3rd bit ofx
is set, you can use the instructionTST x, #0b1000
. 
Extracting a bitfield: You can extract a bitfield of length n starting at position p from a register
x
by using theUBFX
orSBFX
instruction. TheUBFX
instruction extracts an unsigned bitfield, while theSBFX
instruction extracts a signed bitfield. For example, to extract a 3bit unsigned bitfield starting at position 4 fromx
, you can use the instructionUBFX w0, w1, #4, #3
. 
Reversing the order of bits: You can reverse the order of the bits in a register
x
by using theRBIT
instruction. For example, to reverse the order of the bits inx
, you can use the instructionRBIT w0, w1
.
I hope this helps! Let me know if you have any further questions.
10> any more?
Sure, here are a few more bittwiddling operations you can perform using single instructions on the ARM architecture:

Counting the number of set bits: You can count the number of set bits (also known as the population count or Hamming weight) in a register
x
using theCNT
orCNTB
instruction. TheCNT
instruction counts the number of set bits in a 32bit word, while theCNTB
instruction counts the number of set bits in each byte of a 64bit register. For example, to count the number of set bits inx
, you can use the instructionCNT x0, x1
. 
Finding the position of the most significant set bit: You can find the position of the most significant set bit (also known as the position of the leftmost set bit or the log base 2) in a register
x
using theCLZ
instruction. For example, to find the position of the most significant set bit inx
, you can use the instructionCLZ w0, w1
. 
Rotating bits: You can rotate the bits in a register
x
left or right by a specified number of bits using theLSL
,LSR
,ASR
, orROR
instructions. TheLSL
andLSR
instructions perform logical left and right shifts, respectively, while theASR
instruction performs an arithmetic right shift (which preserves the sign bit), and theROR
instruction performs a rotate right (where bits shifted off the right end are shifted back in on the left). For example, to rotate the bits inx
left by 4 bits, you can use the instructionLSL x0, x1, #4
. 
Swapping bytes or halfwords: You can swap the bytes or halfwords in a register
x
using theREV16
orREV32
instruction, respectively. TheREV16
instruction swaps the two halfwords in a 32bit word, while theREV32
instruction swaps the order of the bytes in a 32bit word. For example, to swap the bytes inx
, you can use the instructionREV32 w0, w1
. 
Reversing the order of bytes or halfwords: You can reverse the order of the bytes or halfwords in a register
x
using theREV16
orREV
instruction, respectively. TheREV16
instruction reverses the order of the two halfwords in a 32bit word, while theREV
instruction reverses the order of the bytes in a 32bit word. For example, to reverse the order of the bytes inx
, you can use the instructionREV w0, w1
.
I hope you find these additional operations useful! Let me know if you have any further questions.
11> great! what are the main differences between bit twiddling an i8 and a u8? what about i8 and i16? i8 and u16? int and float?
The main differences between bit twiddling an i8
and a u8
are due to the signedness of i8
. In
particular, the most significant bit of an i8
represents the sign of the number, while all bits
of a u8
represent the magnitude of the number.
When performing bit twiddling operations on i8
values, you need to be careful to preserve the
sign bit if you want to maintain the sign of the
number. For example, if you want to perform a left
shift on an i8
value, you should use an
arithmetic left shift (shl
) instead of a logical
left shift (shlx
), which would insert zeros
instead of preserving the sign bit. Similarly,
when performing a right shift on an i8
value,
you should use an arithmetic right shift (sra
)
instead of a logical right shift (srlx
), which
would insert zeros instead of preserving the sign
bit.
When bittwiddling an i8
and an i16
, the main
difference is the size of the register. Operations
on i16
can involve more bits than operations on
i8
, which can lead to different results or
require more instructions to perform. Similarly,
when bittwiddling an i8
and a u16
, the main
difference is the fact that the u16
can
represent larger magnitudes than the i8
, which
can lead to different results or require more
instructions to perform certain operations.
When bittwiddling an i8
and a floatingpoint
value, such as f32
or f64
, the main difference
is the fact that floatingpoint values have
a completely different representation and
semantics than integer values. Bittwiddling
operations on floatingpoint values are typically
used for lowlevel optimization purposes and
require a good understanding of the floatingpoint
format and its operations. For example, you can
extract the sign, exponent, and mantissa of
a floatingpoint value using bittwiddling
operations, but you need to be careful to handle
special values such as NaN and infinity correctly.
12> great. are there any more things you think i ought to know about i8 in regard to the i8quantized crate?
Yes, here are a few more things you should keep in
mind when working with i8
in the context of the
caffe2opi8quantized
crate:

When performing operations on
i8
values, you should be careful to avoid overflow and underflow. The range of values that can be represented by ani8
is from 128 to 127, and if an operation results in a value outside of this range, the result will be truncated to fit within the range. This can lead to unexpected behavior and loss of precision. To avoid this, you can use saturation arithmetic, which caps the result of an operation at the maximum or minimum value of the range. Thei8quantized
crate provides several functions for performing arithmetic with saturation, such assaturating_add
,saturating_sub
,saturating_mul
, andsaturating_pow2
. 
When working with quantized values, it's important to keep track of the scaling factor used to convert between the original floatingpoint values and the quantized integer values. The scaling factor determines the precision of the quantized values, and can affect the accuracy of the computation. The
i8quantized
crate provides several functions for working with quantized values and scaling factors, such asdequantize
, which converts a quantized value back to a floatingpoint value using the scaling factor. 
In some cases, you may need to perform operations that involve both
i8
and floatingpoint values. In these cases, you need to be careful to maintain the correct scaling factor and avoid loss of precision. Thei8quantized
crate provides several functions for converting betweeni8
and floatingpoint values, such asquantize
, which converts a floatingpoint value to a quantizedi8
value using a specified scaling factor. 
The
i8quantized
crate provides several operations commonly used in machine learning and DSP computations, such as convolution, pooling, activation functions, and matrix multiplication. These operations are optimized for performance on CPUs and GPUs, and can be used to implement complex machine learning models efficiently. However, you should still be careful to avoid overflow and underflow, and to maintain the correct scaling factor when using these operations.
Dependencies
~20–27MB
~420K SLoC