### 2 unstable releases

| Version | Date |
|---|---|
| 0.2.0 | Mar 31, 2021 |
| 0.0.10 | Feb 19, 2020 |

**#691** in Images

**MIT/Apache**

520KB · 650 lines

# fractal-analysis

Various Rust and Futhark functions that help perform Z-Box Merging fractal analysis

### `lib.rs`:

Welcome to the `fractal-analysis` crate! Here you will find the necessary tools
for the analysis of fractal sets, based on the “Z-box merging” algorithm.

# How to use

I have tried to create ergonomic functions for the following data-sets:

- Images, colour or gray-scale
- Sounds (to be added later)
- Electro-encephalograms (to be added later)
- MRI images (to be added later)

For each of those, you will find ready-made functions that can give you
useful results; just say, e.g., `example_code_here` and you will get the result.

# Caveat

Currently (March 2021) the documentation and testing are still unfinished. They are both provided on a best-effort basis; the user's understanding is appreciated.

# How to extend to other uses

Oh, how I *wish* there was a way for me to design an API that said “Oh, just
do `zbox_merge(input)` for whatever input and it'll work!”.

Sadly, it is nowhere near as simple as that. The basic library will get you a long way along, but there is serious work you'll need to do beforehand. The simplest way would be to just send the crate maintainer an e-mail and ask directly, but in case s/he is unreachable, here is what you'll need to do:

### Choose your dependent and independent dimensions

First, decide what counts as a “separate dimension” for your use-case. Is each channel of your EEG a separate dimension that depends on the time-stamp, or are they a sampling of a two-dimensional scalp? What about the colour channels of your image? That will give you a first idea of what to do, and denote the limits of the result you will end up with.

Of particular note: You will want the number of dimensions to be independent of the measurement method. For instance: For the colour channels, we chose each to be a separate dimension in part because all colour images have those exact same 3 channels. But for the EEGs we chose to consider them as a 2-D grid, or else the result would depend on the number of electrodes and measurements between different setups would have no immediate way to be compared. As a result, about half the mental effort was spent in deciding exactly how to subdivide the electrode placement into quadrants!

### Count the number of bits you have available

For each coordinate, count the number of values it can take. Take the smallest of those counts, find its base-2 logarithm, and discard the fractional part. The result is the number of bits to which each of your coordinates will need to be normalised.
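As a concrete sketch (the value counts below are hypothetical: the x and y ranges of a 640×480 image plus an 8-bit colour channel), the rule above amounts to:

```rust
// Number of bits to which every coordinate will be normalised:
// the floor of log2 of the smallest per-coordinate value count.
fn bits_available(value_counts: &[u64]) -> u32 {
    let smallest = value_counts.iter().copied().min().expect("no coordinates");
    smallest.ilog2() // base-2 logarithm with the fractional part discarded
}

fn main() {
    // Hypothetical data-set: x in [0, 640), y in [0, 480), one 8-bit channel.
    assert_eq!(bits_available(&[640, 480, 256]), 8); // floor(log2(256)) = 8
}
```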

### Choose your data-types

This is only important if you *really* care about efficiency. The default
choices ought to be optimal for most cases, but under special circumstances
they *might* not be.

**Example 1:** If you have 5 coordinates of 12 bits each, each `Coor` will be a
`u16`, and as a result the default `Key` will be a `u128`. But they can
actually fit within a `u64`; the compiler just doesn't know that. In that
case, you might prefer to explicitly cast all keys to `u64`s.

**Example 2:** If you need 48 bits for your key, and you're running the
code on a 16-bit architecture CPU, you might prefer to implement
custom-made 48-bit data-types (3×16) instead of letting the code use `u64`s
by default. The `lindel` crate permits that.
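The saving in Example 1 can be illustrated without any fractal machinery: five 12-bit coordinates only need 60 bits in total, so a packed key (concatenated here rather than Morton-interleaved, purely to show the bit budget) fits comfortably in a `u64`. The coordinate values are hypothetical:

```rust
fn main() {
    // Five hypothetical 12-bit coordinates (each value < 4096), stored as u16s.
    let coords: [u16; 5] = [4095, 17, 2048, 300, 9];
    // Concatenating them needs 5 * 12 = 60 bits, even though the storage
    // type suggests 5 * 16 = 80 bits (which would force a u128 key).
    let key: u64 = coords.iter().fold(0u64, |acc, &c| (acc << 12) | c as u64);
    assert!(key < 1u64 << 60); // the whole key fits in 60 bits
}
```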

### Create a “key-extraction” function

The key-extraction consists of two parts: normalisation and linearisation.

The normalisation is already provided for you, and only requires you to provide, for each co-ordinate, the min/max values and the amount of bits to which they will be normalised. However, please bear in mind that you will need to extract the independent coordinates together with the dependent ones!
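A hand-rolled stand-in for that normalisation step (hypothetical, not the crate's actual implementation) might look like:

```rust
// Map a coordinate from [min, max) onto an n-bit unsigned range.
fn normalise(value: u32, min: u32, max: u32, bits: u32) -> u32 {
    assert!(min <= value && value < max);
    let levels = 1u64 << bits; // 2^bits distinct output values
    (((value - min) as u64 * levels) / (max - min) as u64) as u32
}

fn main() {
    // Hypothetical: an x coordinate in [0, 640) normalised to 8 bits.
    assert_eq!(normalise(0, 0, 640, 8), 0);
    assert_eq!(normalise(320, 0, 640, 8), 128);
    assert_eq!(normalise(639, 0, 640, 8), 255);
}
```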

As for the linearisation, two separate methods are already provided, thanks
to the `lindel` crate. Those are Morton encoding (“z-index”) and Hilbert
encoding (“Hilbert index”). If you care about speed and/or have
at least one independent coordinate, the z-index will serve you better.
If you have no independent coordinates, and you can afford the small
performance hit, a Hilbert index will easily allow you to examine almost
every possible subdivision of your data-set independently. Still, please
note the following:

- You will need to do that manually, as the program can't perform this operation automatically.
- You could theoretically use a Hilbert index everywhere, but in subdividing it you might end up omitting parts of the independent coordinates whose values are too large.
- While a z-index is much quicker to compute than a Hilbert index, for
large data-sets the most expensive operation is the sorting, so perhaps the
difference won't be important for the `O(N)` part of the program.
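The property that makes the z-index useful here, namely that samples sharing a box also share leading key bits, can be sketched with a hand-rolled two-coordinate encoder (a toy stand-in for the `lindel`/`morton_encoding` implementations; the coordinate values are hypothetical):

```rust
// Toy 2-D Morton (z-index) encoding for u8 coordinates.
fn z_index(x: u8, y: u8) -> u16 {
    let mut out = 0u16;
    for bit in 0..8 {
        out |= (((x >> bit) & 1) as u16) << (2 * bit + 1); // x on odd bits
        out |= (((y >> bit) & 1) as u16) << (2 * bit); // y on even bits
    }
    out
}

fn main() {
    // (12, 13) and (15, 14) lie in the same 4x4 box, so their z-indices
    // agree on everything above the lowest 2*2 = 4 bits.
    assert!((z_index(12, 13) ^ z_index(15, 14)).leading_zeros() >= 12);
    // (12, 13) and (14, 9) lie in different 4x4 boxes: fewer shared bits.
    assert!((z_index(12, 13) ^ z_index(14, 9)).leading_zeros() < 12);
}
```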

### Choose a sorting algorithm

If you don't have a great concern for speed, you could choose the default
methods defined in `std` or `rayon`. If you need to squeeze every last bit
of performance, on the other hand, you might prefer to implement a radix
sort or something to that effect.
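For the default route, the standard library's unstable sort is enough (stability buys nothing here, since equal keys are interchangeable), and with the `rayon` crate, `par_sort_unstable` would be the drop-in parallel equivalent. The keys below are hypothetical:

```rust
fn main() {
    // Hypothetical extracted keys.
    let mut keys: Vec<u64> = vec![0b1010, 0b0001, 0b1000, 0b0111];
    // std's default; with rayon this would be `keys.par_sort_unstable()`.
    keys.sort_unstable();
    assert_eq!(keys, vec![0b0001, 0b0111, 0b1000, 0b1010]);
}
```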

### And you're finished!

…finished with the preliminary work, that is. Now all that's left
is to write the actual code: extract the keys, sort, extract the
amounts of common leading bits between each consecutive pair,
get the results from them, drain the useless bits, and use a
regression function to get the result. Here is some example code,
copied from the `img_funs` module:

```rust
let width = input.width() as usize;
let height = input.height() as usize;
let get_key_from_sample = |(x, y, rgb): (u32, u32, &image::Rgb<u8>)| -> u64 {
    let norm_x = create_normaliser::<u32, u8>(0, input.width()).unwrap();
    // The provided normaliser simplifies things slightly.
    let norm_y = create_normaliser::<u32, u8>(0, input.height()).unwrap();
    let arr = [norm_x(x), norm_y(y), rgb[0], rgb[1], rgb[2]];
    // Extract both independent and dependent coordinates
    morton_encoding::morton_encode(arr)
    // Linearise using `morton_encoding`
};
let clzs = get_clzs(input.enumerate_pixels(), get_key_from_sample);
let (tmp, lacun) = get_results_from_clzs(clzs);
let tmp = drain_useless_bits::<_, 64, 40>(tmp);
let lacun = drain_useless_bits::<_, 64, 40>(lacun);
finalise_results(tmp, lacun, width * height, 8)
```

Unless of course you need more than 256 bits for each key, in which case
you'll need to change the rest of the functions so that they operate on
`u16`s instead.

# Interpreting the results

## Fractal dimension

*The first* and most important *thing to understand* with regard to the
fractal dimension is the limits inside which its
log-log plot will fall. Those create a quadrilateral, whose four edges
are defined as follows:

- Edge 1 is vertical, for `x` equal to the amount of bits you have available. You can't subdivide your data-set more than the resolution that each coordinate has.
- Edge 2 is horizontal, for `y` equal to the amount of samples you have available. Subdivide all you might, you will never get more populated boxes than the amount of samples (e.g. pixels) that exist in the data-set.
- Edge 3 is diagonal, and has a slope equal to the amount of independent coordinates, if any.
- Edge 4 is diagonal as well; its slope is equal to the total amount
of coordinates available, independent as well as dependent. A data-set
constrained within `N` dimensions can never have a fractal dimension above `N`.

The most immediate result of these four constraints is that, especially if
the log-log plot rises abruptly at first, it will encounter a plateau as
soon as it reaches the total amount of samples, past which it will be
horizontal. As such, the horizontal part of the plot *must* be excised
before calculating the fractal dimension, else it will be artificially low.
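A minimal sketch of that excision, under the assumption that the plot is given as (box exponent, log2 of box count) pairs and that `log2_samples` is the base-2 logarithm of the sample count where the plateau sits (all values hypothetical):

```rust
// Keep only the rising part of the log-log plot, dropping the plateau
// at y = log2(number of samples).
fn excise_plateau(points: &[(f64, f64)], log2_samples: f64) -> Vec<(f64, f64)> {
    points
        .iter()
        .copied()
        .take_while(|&(_, y)| y < log2_samples)
        .collect()
}

fn main() {
    // Hypothetical plot: rises steeply, then plateaus at log2(256) = 8.
    let pts = [(1.0, 3.0), (2.0, 6.0), (3.0, 8.0), (4.0, 8.0), (5.0, 8.0)];
    assert_eq!(excise_plateau(&pts, 8.0), vec![(1.0, 3.0), (2.0, 6.0)]);
}
```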

*The second thing to understand* is that, although an ideally fractal
shape's log-log plot will be linear, in practice data-sets will be
scale-dependent, leading to non-linearities in the log-log plot. In every
such instance we've found, the log-log plot is concave downward. Therefore,
the user has two choices:

- To take the simple linear regression of the log-log plot, and interpret its slope as the fractal dimension. The scale-dependence, if any, may be found from the mean square error between the line and the actual plot.
- To take a parabolic regression of the log-log plot, interpret the linear parameter as the fractal dimension, and the second-order parameter as the scale-dependence.

Neither has been tried with any real rigour; the user is cordially invited to share the results, if any.
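The first option can be sketched as an ordinary least-squares fit over hypothetical log-log points, with the slope read off as the fractal-dimension estimate:

```rust
// Ordinary least-squares line through (x, y) points; returns (slope, intercept).
fn linear_fit(points: &[(f64, f64)]) -> (f64, f64) {
    let n = points.len() as f64;
    let (mut sx, mut sy, mut sxx, mut sxy) = (0.0, 0.0, 0.0, 0.0);
    for &(x, y) in points {
        sx += x;
        sy += y;
        sxx += x * x;
        sxy += x * y;
    }
    let slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    let intercept = (sy - slope * sx) / n;
    (slope, intercept)
}

fn main() {
    // A perfectly scale-free plot is a straight line; here its slope (and
    // hence the estimated fractal dimension) is 1.585 by construction.
    let pts: Vec<(f64, f64)> = (0..8).map(|i| (i as f64, 1.585 * i as f64 + 0.2)).collect();
    let (slope, _) = linear_fit(&pts);
    assert!((slope - 1.585).abs() < 1e-9);
}
```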

## Lacunarity

As defined in the available literature, lacunarity appears to be a measure that is different at each scale. It is measured here for the user's convenience, but the interpretation of the results must necessarily be left up to the user.

#### Dependencies

~2.5MB

~53K SLoC