Fluctuation complexity, restrict possibilities to formally defined self-informations #413
base: main
Changes from all commits: 95f8b94, 58b80e6, f5d5b87, 805b60f, 785cb6c, 0d8abd7, bb2712f, 1d4f326, e3c4d39, 780176b, ba6b73b, bb1aa37, 0e7a2d1, 0132c05, 6c26097, 30a82f7, ddea888
@@ -0,0 +1,88 @@
using Combinatorics
using StaticArrays: SVector # assumption: `SVector` may instead be imported at the package level
export SequentialCategoricalEncoding

"""
    SequentialCategoricalEncoding <: Encoding
    SequentialCategoricalEncoding(; symbols, m = 2)

An encoding scheme that [`encode`](@ref)s length-`m` categorical vectors onto integers.

## Description

Given a vector of possible `symbols`, `SequentialCategoricalEncoding` constructs all possible
length-`m` sequential symbol transitions.

The input vector `χ` is always treated as categorical, and can have any element type
(but encoding/decoding is faster if `χ` is sortable).

## Example

```julia
julia> encoding = SequentialCategoricalEncoding(symbols = ["hello", "there", "skipper"], m = 2)
SequentialCategoricalEncoding, with 3 fields:
 symbols = ["hello", "there", "skipper"]
 encode_dict = Dict(["there", "skipper"] => 4, ["skipper", "hello"] => 5, ["there", "hello"] => 3, ["hello", "skipper"] => 2, ["skipper", "there"] => 6, ["hello", "there"] => 1)
 decode_dict = Dict(5 => ["skipper", "hello"], 4 => ["there", "skipper"], 6 => ["skipper", "there"], 2 => ["hello", "skipper"], 3 => ["there", "hello"], 1 => ["hello", "there"])
```

We can now use `encoding` to encode and decode transitions:

```julia
julia> decode(encoding, 1)
2-element Vector{String}:
 "hello"
 "there"

julia> encode(encoding, ["hello", "there"])
1

julia> encode(encoding, ["there", "skipper"])
4

julia> decode(encoding, 4)
2-element Vector{String}:
 "there"
 "skipper"
```
"""
struct SequentialCategoricalEncoding{M, V, ED, DD} <: Encoding
    symbols::V
    encode_dict::ED
    decode_dict::DD

    function SequentialCategoricalEncoding(; symbols, m = 2)
        s = unique(symbols) # we don't sort, because that would disallow mixing types
        pgen = permutations(s, m)
        T = eltype(s)
        perms = [SVector{m, T}(p) for p in pgen]

        encode_dict = Dict{eltype(perms), Int}()
        decode_dict = Dict{Int, eltype(perms)}()
        for (i, pᵢ) in enumerate(perms)
            encode_dict[pᵢ] = i
            decode_dict[i] = pᵢ
        end
        S, TED, TDD = typeof(s), typeof(encode_dict), typeof(decode_dict)
        return new{m, S, TED, TDD}(s, encode_dict, decode_dict)
    end
end

# Note: internally, we represent the transitions with `StaticVector`s. However,
# `χ` will in general not be a static vector if the user uses `encode` directly.
# Therefore, we convert to `StaticVector`. This doesn't allocate, so there is no
# need to worry about performance.
function encode(encoding::SequentialCategoricalEncoding{m}, χ::AbstractVector) where {m}
    if m != length(χ)
        throw(ArgumentError("Transition length `m` and length of input must match! Got `m = $m` and `length(χ) = $(length(χ))`"))
    end
    χstatic = SVector{m, eltype(χ)}(χ)
    return encoding.encode_dict[χstatic]
end

function decode(encoding::SequentialCategoricalEncoding{m}, i) where {m}
    return encoding.decode_dict[i]
end
@@ -1,58 +1,84 @@
-export FluctuationComplexity
+export InformationFluctuation

 """
-    FluctuationComplexity <: InformationMeasure
-    FluctuationComplexity(; definition = Shannon(; base = 2), base = 2)
+    InformationFluctuation <: InformationMeasure
+    InformationFluctuation(; definition = Shannon())

-The "fluctuation complexity" quantifies the standard deviation of the information content of the states
+The information fluctuation quantifies the standard deviation of the information content of the states
 ``\\omega_i`` around some summary statistic ([`InformationMeasure`](@ref)) of a PMF. Specifically, given some
 outcome space ``\\Omega`` with outcomes ``\\omega_i \\in \\Omega``
 and a probability mass function ``p(\\Omega) = \\{ p(\\omega_i) \\}_{i=1}^N``, it is defined as

 ```math
-\\sigma_I(p) := \\sqrt{\\sum_{i=1}^N p_i(I_i - H_*)^2}
+\\sigma_{I_Q}(p) := \\sqrt{\\sum_{i=1}^N p_i(I_Q(p_i) - F_Q)^2}
 ```

-where ``I_i = -\\log_{base}(p_i)`` is the information content of the i-th outcome. The type of information measure
-``*`` is controlled by `definition`.
+where ``I_Q(p_i)`` is the [`information_content`](@ref) of the i-th outcome with respect to the information
+measure ``F_Q`` (controlled by `definition`).

-The `base` controls the base of the logarithm that goes into the information content terms. Make sure that
-you pick a `base` that is consistent with the base chosen for the `definition` (relevant for e.g. [`Shannon`](@ref)).
+## Compatible with
+
+- [`Shannon`](@ref)
+- [`Tsallis`](@ref)
+- [`Curado`](@ref)
+- [`StretchedExponential`](@ref)
+- [`ShannonExtropy`](@ref)
+
+If `definition` is the [`Shannon`](@ref) entropy, then we recover the
+[Shannon-type "information fluctuation complexity"](https://en.wikipedia.org/wiki/Information_fluctuation_complexity)
+from [Bates1993](@cite).

 ## Properties

-If `definition` is the [`Shannon`](@ref) entropy, then we recover
-the [Shannon-type information fluctuation complexity](https://en.wikipedia.org/wiki/Information_fluctuation_complexity)
-from [Bates1993](@cite). Then the fluctuation complexity is zero for PMFs with only a single non-zero element, or
+The information fluctuation is zero for PMFs with only a single non-zero element, or
 for the uniform distribution.

-If `definition` is not Shannon entropy, then the properties of the measure varies, and does not necessarily share the
-properties [Bates1993](@cite).
+## Examples
+
+```julia
+using ComplexityMeasures
+using Random; rng = Xoshiro(55543)
+
+# Information fluctuation for a time series encoded by ordinal patterns
+x = rand(rng, 10000)
+def = Tsallis(q = 2) # information measure definition
+pest = RelativeAmount() # probabilities estimator
+o = OrdinalPatterns(m = 3) # outcome space / discretization method
+information(InformationFluctuation(definition = def), pest, o, x)
+```

 !!! note "Potential for new research"
-    As far as we know, using other information measures besides Shannon entropy for the
-    fluctuation complexity hasn't been explored in the literature yet. Our implementation, however, allows for it.
-    Please inform us if you try some new combinations!
+    We're currently writing a paper outlining the generalizations to other measures. For now, we verify
+    the correctness of the measure through numerical examples in our test-suite.
 """
-struct FluctuationComplexity{M <: InformationMeasure, I <: Integer} <: InformationMeasure
+struct InformationFluctuation{M <: InformationMeasure, I <: Integer} <: InformationMeasure
     definition::M
     base::I

-    function FluctuationComplexity(; definition::D = Shannon(base = 2), base::I = 2) where {D, I}
-        if D isa FluctuationComplexity
-            throw(ArgumentError("Cannot use `FluctuationComplexity` as the summary statistic for `FluctuationComplexity`. Please select some other information measures, like `Shannon`."))
+    function InformationFluctuation(; definition::D = Shannon(base = 2), base::I = 2) where {D, I}
+        if D <: InformationFluctuation
+            throw(ArgumentError("Cannot use `InformationFluctuation` as the summary statistic for `InformationFluctuation`. Please select some other information measure, like `Shannon`."))
         end
         return new{D, I}(definition, base)
     end
 end

-# Fluctuation complexity is zero when p_i = 1/N or when p = (1, 0, 0, ...).
-function information(e::FluctuationComplexity, probs::Probabilities)
+function information(e::InformationFluctuation, probs::Probabilities)
     def = e.definition
     non0_probs = Iterators.filter(!iszero, vec(probs))
     h = information(def, probs)
-    logf = log_with_base(e.base)
-    return sqrt(sum(pᵢ * (-logf(pᵢ) - h) ^ 2 for pᵢ in non0_probs))
+    return sqrt(sum(pᵢ * (self_information(def, pᵢ, length(probs)) - h)^2 for pᵢ in non0_probs))
+end
+
+function information_normalized(e::InformationFluctuation, probs::Probabilities)
+    def = e.definition
+    non0_probs = Iterators.filter(!iszero, vec(probs))
+    h = information(def, probs)
+    info_fluct = sqrt(sum(pᵢ * (self_information(def, pᵢ, length(probs)) - h)^2 for pᵢ in non0_probs))
+    return info_fluct / h
+end

 # The maximum is not generally known.
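As a quick plausibility check of the properties stated in the docstring, here is a hedged numerical sketch. It assumes the PR's `self_information` methods for `Shannon` are defined, and uses the `information(::InformationFluctuation, ::Probabilities)` method added above:

```julia
using ComplexityMeasures

measure = InformationFluctuation(definition = Shannon())

# Uniform PMF: every outcome has the same self-information, which then
# equals the Shannon entropy, so the fluctuation around it is zero.
information(measure, Probabilities([0.25, 0.25, 0.25, 0.25]))  # ≈ 0.0

# Single non-zero element: the only contributing term has p = 1, whose
# self-information 0 equals the entropy, so the fluctuation is again zero.
information(measure, Probabilities([1.0, 0.0, 0.0, 0.0]))      # ≈ 0.0
```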
@@ -57,3 +57,8 @@ function information_maximum(e::TsallisExtropy, L::Int)
     return ((L - 1) * L^(q - 1) - (L - 1)^q) / ((q - 1) * L^(q - 1))
 end
+
+function self_information(e::TsallisExtropy, pᵢ, N) # must have N
+    k, q = e.k, e.q
+    return (N - 1)/(q - 1) - (1 - pᵢ)^q / (q - 1)
+end

Review comments on `self_information(e::TsallisExtropy, pᵢ, N)`:

What is `N`? If …

I'll see if I can define a functional that doesn't depend on `N`.
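To make the `N`-dependence debated in the review thread below concrete, here is a standalone stand-in for the formula above (a local function, not the package method), evaluated for `q = 2`:

```julia
# Local stand-in for `self_information(::TsallisExtropy, pᵢ, N)` above.
tsallis_extropy_self_info(pᵢ, N; q = 2.0) = (N - 1)/(q - 1) - (1 - pᵢ)^q/(q - 1)

tsallis_extropy_self_info(0.5, 2)   # 0.75
tsallis_extropy_self_info(0.5, 10)  # 8.75: same pᵢ, larger N ⇒ larger surprisal
```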
Unfortunately I don't agree with the latest change of requiring `N`. It seems simpler, and more reasonable, to simply not allow Curado to be part of this interface. The opposite, defining the information unit as depending on `N`, doesn't make much sense, at least not with how Shannon introduced it.

@Datseris, `Curado` is not the only information measure whose surprisal/self-information depends explicitly on `N` when following the definition of an information measure as a probability-weighted average of the surprisal (as I do in the paper).

In the context of the Shannon information unit alone, I agree. But the point of this interface is to generalize the Shannon information unit. This inevitably introduces `N` as a parameter.

Can we discuss the final API when I'm done with writing up the paper? I'm not too far from finishing it; I just need to generate a few example applications. Since I am using this PR for the paper analyses, it would be nice to not change anything in this draft PR until the paper is ready.
The alternative is to have `information_content`/`information_unit`, which dispatches to a subset of `InformationMeasure`s, and then `generalized_information_content`/`generalized_information_unit`, which dispatches to those `InformationMeasure`s whose generalization of the information unit depends on `N`. But that kind of defeats the purpose of having an interface to begin with, since we're back at defining multiple functions with different names for things that are fundamentally identical (modulo the parameter `N`).
PS: I sent you a link to the paper draft, @Datseris
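For concreteness, here is a hypothetical sketch of the two-function alternative described in that comment. Every name is a stand-in, and nothing below is part of the package API:

```julia
# Standalone sketch: split the interface into an N-free surprisal and an
# N-dependent generalization, dispatching on the measure type.
abstract type InformationMeasure end
struct Shannon <: InformationMeasure; base::Int; end
struct TsallisExtropy <: InformationMeasure; q::Float64; end

# Measures whose surprisal needs only pᵢ:
information_content(e::Shannon, pᵢ) = -log(e.base, pᵢ)

# Measures whose generalized surprisal also needs the alphabet size N:
function generalized_information_content(e::TsallisExtropy, pᵢ, N)
    q = e.q
    return (N - 1)/(q - 1) - (1 - pᵢ)^q/(q - 1)
end

information_content(Shannon(2), 0.25)                          # 2.0
generalized_information_content(TsallisExtropy(2.0), 0.25, 4)  # 2.4375
```

This keeps both dispatch targets explicit, at the cost of exactly the duplicated naming the comment objects to.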