surge-blitter

27 releases

0.2.12-alpha.0	Apr 7, 2023
0.2.11-alpha.0	Dec 19, 2022
0.2.5-alpha.0	Jun 21, 2022
0.2.4-alpha.0	Mar 14, 2022
0.1.42-alpha.0	Oct 27, 2021

#40 in #samples

128 downloads per month
Used in 7 crates (5 directly)

GPL-3.0 license

215KB
2.5K SLoC

Surge Blitter

A module for performing block-wise sample processing and data transfer operations in the Surge synthesizer system. This crate provides an AbstractBlitter trait, which can be implemented by various blitter types to support a range of sample processing and data transfer operations.

The module includes methods for clearing all oscillator output buffers, clearing internal buffers, and updating output attenuation. It also includes methods for preparing a blitter for multi-voice, single-voice, and unison operation.

Mathematically, the blitter can be thought of as a function that processes blocks of samples. For example, a blitter might be used to generate a waveform by processing a block of samples for each cycle of the waveform. Alternatively, a blitter might be used to transfer data from one buffer to another.

The AbstractBlitter trait provides a common interface for different types of blitters, allowing them to be easily swapped out for different types of processing operations.

Tokens

clear_all_oscout: A method for clearing all oscillator output buffers.
clear_buffers: A method for clearing internal buffers.
to: A method for transferring data from one buffer to another.
prepare_multi_voice: A method for preparing a blitter for multi-voice operation.
prepare_single_voice: A method for preparing a blitter for single-voice operation.
prepare_unison: A method for preparing a blitter for unison operation.
update_out_attenuation: A method for updating output attenuation.
AbstractBlitter: A trait for implementing blitter types.
contains: A method for checking if a buffer contains a given value.
of: A method for creating a control group from a list of controls.

There are no specific mathematical concepts associated with these tokens, but they are used to implement various mathematical operations related to block-wise sample processing and data transfer. The methods provided by this crate are designed to be efficient and flexible, allowing for a wide range of synthesis and signal processing applications.

9> in audio, what is a blitter?

In audio, a blitter is a hardware or software component that is used to perform block-wise sample processing and data transfer operations. The term "blit" is short for "block transfer", and it refers to the process of copying or transferring blocks of data from one memory location to another.

In the context of audio synthesis and processing, blitters are used to transfer data between different buffers or perform operations on blocks of samples. For example, a blitter might be used to copy a block of samples from an oscillator buffer to a filter buffer, or to perform a waveform processing operation on a block of samples.

In hardware implementations, a blitter is typically a dedicated circuit that is optimized for fast block-wise data transfer operations. In software implementations, a blitter is typically a function or module that is designed to efficiently process blocks of samples.

Blitters are important components in many types of audio systems, including digital audio workstations (DAWs), synthesizers, and digital signal processors (DSPs). They are used to implement a wide range of audio processing operations, including filtering, modulation, mixing, and more.

Overall, blitters are essential tools for efficiently processing blocks of samples in audio applications, and they play a crucial role in the design and implementation of modern audio systems.

10> what are some techniques for blitter implementation?

There are several techniques for implementing blitters, both in hardware and software. Some of the most commonly used techniques are:

Direct Memory Access (DMA): DMA is a hardware technique that allows for efficient transfer of blocks of data between memory locations without involving the CPU. DMA controllers are typically used to manage the transfer of data, freeing up the CPU to perform other tasks.
Vector Processing: Vector processing is a technique that involves performing operations on multiple data elements simultaneously. This technique is often used in hardware implementations of blitters to perform operations on blocks of samples in parallel.
Loop Unrolling: Loop unrolling is a software optimization technique that involves replacing a loop with a fixed number of iterations with multiple copies of the loop body. This technique can improve performance by reducing the overhead associated with loop control.
SIMD (Single Instruction Multiple Data): SIMD is a technique that involves processing multiple data elements in parallel using a single instruction. This technique is commonly used in both hardware and software implementations of blitters to improve performance.
Multithreading: Multithreading is a software technique that involves dividing a program into multiple threads, each of which can run in parallel on a separate CPU core. This technique is often used in software implementations of blitters to improve performance by allowing for concurrent processing of multiple blocks of data.
Memory Prefetching: Memory prefetching is a technique that involves anticipating future memory accesses and loading the data into cache ahead of time. This technique can improve performance by reducing the latency associated with memory access.

These techniques can be combined and customized to suit the specific requirements of a particular application or system. Effective implementation of blitters is essential for achieving high performance and low latency in audio systems, and careful consideration of implementation techniques is crucial for achieving optimal results.

11> lets dive in. how can DMA be used for audio blitter?

Direct Memory Access (DMA) is a technique for transferring blocks of data between memory locations without the intervention of the CPU. DMA can be used to implement audio blitters in hardware and can be particularly useful for high-performance audio systems where low latency and high throughput are critical.

In an audio system, DMA can be used to transfer blocks of audio samples between buffers or between audio hardware and memory. This can significantly reduce the overhead associated with data transfer, freeing up the CPU to perform other tasks and reducing the risk of audio dropouts or glitches.

To use DMA for audio blitting, a DMA controller must be configured to transfer blocks of audio samples between buffers. This can be done using hardware registers or software drivers that interface with the DMA controller.

The DMA controller must be configured with the following parameters:

Source Address: The memory address of the buffer containing the audio samples to be transferred.
Destination Address: The memory address of the buffer where the audio samples will be transferred.
Transfer Size: The number of audio samples to be transferred.
Transfer Mode: The transfer mode determines how the DMA controller will handle the transfer. For audio blitting, a block transfer mode is typically used, which transfers a fixed block size of data at a time.

Once the DMA controller is configured, it can be triggered to begin the transfer of audio samples between buffers. The DMA controller will then transfer blocks of audio samples at a high rate of speed, without the intervention of the CPU.

DMA can be a very effective technique for implementing audio blitters, as it allows for high-speed data transfer with minimal CPU overhead. However, DMA requires careful configuration and management to ensure that it is used correctly and that the data transfer is synchronized with other aspects of the audio system.

12> how about vector processing?

Vector processing is a technique that involves performing operations on multiple data elements simultaneously. In the context of audio blitting, vector processing can be used to perform operations on blocks of audio samples in parallel, which can significantly improve the performance of the blitter.

Vector processing can be implemented in both hardware and software. In hardware, specialized vector processing units or vector processing instructions in the CPU can be used to perform vector operations. In software, vector processing can be implemented using libraries or compiler intrinsics that generate vectorized code.

To use vector processing for audio blitting, the blitter must be designed to operate on blocks of samples that can be processed in parallel. For example, if the blitter is performing a filtering operation, the filter coefficients must be stored in a format that allows them to be processed in parallel with the audio samples.

Once the blitter is designed to support vector processing, the vector processing units or instructions can be used to perform the operation on blocks of audio samples in parallel. This can significantly improve the performance of the blitter, as multiple samples can be processed simultaneously.

One common use of vector processing in audio blitting is to implement Fast Fourier Transforms (FFTs) or other spectral analysis operations. These operations involve processing blocks of audio samples using complex mathematical operations, which can benefit greatly from vector processing.

Vector processing can be a very effective technique for implementing audio blitters, particularly for operations that involve complex mathematical operations on blocks of audio samples. However, vector processing requires careful design and optimization to ensure that the operation is performed efficiently and that the parallel processing is properly synchronized.

13> loop unrolling?

Loop unrolling is a software optimization technique that involves replacing a loop with a fixed number of iterations with multiple copies of the loop body. In the context of audio blitting, loop unrolling can be used to reduce the overhead associated with loop control and improve the performance of the blitter.

To use loop unrolling for audio blitting, the blitter must be designed to operate on blocks of samples that can be processed in a fixed number of iterations. For example, if the blitter is performing a filtering operation, the filter coefficients must be stored in a format that allows them to be processed in a fixed number of iterations.

Once the blitter is designed to support loop unrolling, the loop body can be unrolled to create multiple copies of the loop body, each of which performs a fixed number of iterations. This can significantly reduce the overhead associated with loop control, as the loop counter only needs to be updated once for each set of iterations.

For example, consider a blitter that is performing a filtering operation on a block of samples. The loop that performs the filtering operation might look like this:

for (i = 0; i < block_size; i++) {
    output[i] = 0;
    for (j = 0; j < filter_size; j++) {
        output[i] += input[i-j] * filter[j];
    }
}

This loop performs a nested set of iterations, with the inner loop running filter_size times for each iteration of the outer loop. To unroll this loop, we can replace it with multiple copies of the inner loop body, each of which performs a fixed number of iterations:

for (i = 0; i < block_size; i++) {
    output[i] = 0;
    for (j = 0; j < filter_size; j+=4) {
        output[i] += input[i-j] * filter[j];
        output[i] += input[i-j-1] * filter[j+1];
        output[i] += input[i-j-2] * filter[j+2];
        output[i] += input[i-j-3] * filter[j+3];
    }
}

This unrolled loop performs the same filtering operation as the original loop, but it only needs to update the loop counter once for every four iterations of the inner loop. This can significantly reduce the overhead associated with loop control and improve the performance of the blitter.

Loop unrolling can be a very effective technique for implementing audio blitters, particularly for operations that involve nested loops or other repetitive structures. However, loop unrolling requires careful design and optimization to ensure that the unrolled loop is performing the operation efficiently and that the loop body is properly synchronized.

14> what about simd for blitters

SIMD (Single Instruction Multiple Data) is a technique for performing parallel processing on multiple data elements using a single instruction. In the context of audio blitting, SIMD can be used to perform operations on blocks of audio samples in parallel, which can significantly improve the performance of the blitter.

To use SIMD for audio blitting, the blitter must be designed to operate on blocks of samples that can be processed in parallel. For example, if the blitter is performing a filtering operation, the filter coefficients must be stored in a format that allows them to be processed in parallel with the audio samples.

Once the blitter is designed to support SIMD, SIMD instructions can be used to perform the operation on blocks of audio samples in parallel. SIMD instructions are typically provided by the CPU or by specialized vector processing units, and they allow a single instruction to operate on multiple data elements at once.

For example, consider a blitter that is performing a filtering operation on a block of samples. The operation might be implemented using a loop that performs the filtering operation on each sample in the block:

for (i = 0; i < block_size; i++) {
    output[i] = 0;
    for (j = 0; j < filter_size; j++) {
        output[i] += input[i-j] * filter[j];
    }
}

To use SIMD for this operation, we can replace the inner loop with a SIMD instruction that performs the same operation on multiple data elements at once:

for (i = 0; i < block_size; i += 4) {
    __m128 output_simd = _mm_setzero_ps();
    for (j = 0; j < filter_size; j++) {
        __m128 input_simd = _mm_loadu_ps(input + i - j);
        __m128 filter_simd = _mm_set1_ps(filter[j]);
        output_simd = _mm_add_ps(output_simd, _mm_mul_ps(input_simd, filter_simd));
    }
    _mm_storeu_ps(output + i, output_simd);
}

In this example, we use the SSE (Streaming SIMD Extensions) instruction set to perform the filtering operation on blocks of four audio samples at a time. This can significantly improve the performance of the blitter, as multiple samples can be processed in parallel.

SIMD can be a very effective technique for implementing audio blitters, particularly for operations that involve complex mathematical operations on blocks of audio samples. However, SIMD requires careful design and optimization to ensure that the operation is performed efficiently and that the parallel processing is properly synchronized. Additionally, SIMD is dependent on the availability of hardware support for vector processing, which may not be available on all systems.

15> multithreading blitters

Multithreading is a technique for dividing a program into multiple threads, each of which can run in parallel on a separate CPU core. In the context of audio blitting, multithreading can be used to improve performance by allowing for concurrent processing of multiple blocks of data.

To use multithreading for audio blitting, the blitter must be designed to operate on multiple blocks of data simultaneously. This can be achieved by dividing the data into multiple blocks and assigning each block to a separate thread for processing.

Once the blitter is designed to support multithreading, the threads can be created and managed using a multithreading library such as OpenMP or POSIX threads. The threads can be configured to run in parallel on separate CPU cores, allowing for concurrent processing of multiple blocks of data.

For example, consider a blitter that is performing a filtering operation on a large block of audio samples. The blitter might be designed to divide the block into multiple smaller blocks and assign each block to a separate thread for processing:

for (i = 0; i < block_size; i += block_size_per_thread) {
    #pragma omp parallel for
    for (j = i; j < i + block_size_per_thread; j++) {
        output[j] = 0;
        for (k = 0; k < filter_size; k++) {
            output[j] += input[j-k] * filter[k];
        }
    }
}

In this example, the blitter divides the block into multiple smaller blocks and assigns each block to a separate thread using the OpenMP library. Each thread processes its assigned block of data independently, allowing for concurrent processing of multiple blocks of data.

Multithreading can be a very effective technique for implementing audio blitters, particularly for operations that involve processing large blocks of data. However, multithreading requires careful design and optimization to ensure that the threads are properly synchronized and that the data is divided into blocks that can be processed efficiently. Additionally, multithreading may not be suitable for all audio blitting operations, particularly those that involve complex mathematical operations that cannot be easily divided into smaller blocks.

16> memory prefetching blitters

Memory prefetching is a technique for improving memory access performance by anticipating memory accesses and loading data into the CPU cache before it is actually needed. In the context of audio blitting, memory prefetching can be used to improve the performance of the blitter by loading data into the CPU cache before it is needed for processing.

To use memory prefetching for audio blitting, the blitter must be designed to anticipate the memory accesses that will be required during processing. This can be achieved by preloading data into the CPU cache before it is needed for processing, using techniques such as software prefetching or hardware prefetching.

Software prefetching involves using explicit prefetch instructions in the blitter code to load data into the CPU cache before it is needed for processing. For example, the blitter code might use the _mm_prefetch function in the SSE instruction set to load data into the CPU cache:

for (i = 0; i < block_size; i++) {
    _mm_prefetch((char*)&input[i+prefetch_distance], _MM_HINT_T0);
    output[i] = 0;
    for (j = 0; j < filter_size; j++) {
        output[i] += input[i-j] * filter[j];
    }
}

In this example, the blitter uses the _mm_prefetch function to load data from the input buffer into the CPU cache before it is needed for processing. The prefetch distance parameter determines how far ahead of the current memory access the data should be loaded into the CPU cache.

Hardware prefetching involves using specialized hardware in the CPU to anticipate memory accesses and load data into the CPU cache before it is needed. This can be done using techniques such as stride-based prefetching, which anticipates sequential memory accesses, or stream-based prefetching, which anticipates repetitive memory accesses.

To use hardware prefetching, the blitter code must be designed to take advantage of the hardware prefetching features provided by the CPU. This may involve configuring the prefetching settings in the CPU or using compiler optimizations that generate code that is optimized for the specific hardware architecture.

Memory prefetching can be a very effective technique for improving the performance of audio blitters, particularly for operations that involve accessing large amounts of data from memory. However, memory prefetching requires careful design and optimization to ensure that the data is preloaded into the CPU cache at the right time and that the prefetching does not interfere with other aspects of the audio system. Additionally, the effectiveness of memory prefetching depends on the availability of hardware support for prefetching, which may vary depending on the specific hardware architecture.

Dependencies

~11–20MB
~304K SLoC