Fluctuation complexity, restrict possibilities to formally defined self-informations #413
Draft
kahaaga wants to merge 17 commits into main from fluctuation_complexity_formal
Commits (17)

- 95f8b94 Generalized `self_information` function (kahaaga)
- 58b80e6 Some generalized self-informations (kahaaga)
- f5d5b87 Self-information for Shannon extropy (kahaaga)
- 805b60f Update `FluctuationComplexity` docstring (kahaaga)
- 785cb6c Merge branch 'main' into fluctuation_complexity_formal (kahaaga)
- 0d8abd7 use self_info (kahaaga)
- bb2712f Add Anteneodo-Plastino self-info and update syntax (kahaaga)
- 1d4f326 Docs (kahaaga)
- e3c4d39 Merge branch 'main' into fluctuation_complexity_formal (kahaaga)
- 780176b Update src/information_measure_definitions/fluctuation_complexity.jl (kahaaga)
- ba6b73b wip... (kahaaga)
- bb1aa37 Merge branch 'fluctuation_complexity_formal' into sequential_categori… (kahaaga)
- 0e7a2d1 wip.. (kahaaga)
- 0132c05 Merge branch 'sequential_categorical_encoding' into fluctuation_compl… (kahaaga)
- 6c26097 Merge branch 'main' into fluctuation_complexity_formal (kahaaga)
- 30a82f7 Update docs (kahaaga)
- ddea888 self_information for identification entropy (kahaaga)
src/information_measure_definitions/fluctuation_complexity.jl (60 changes: 41 additions & 19 deletions)

@@ -1,62 +1,84 @@
-export FluctuationComplexity
+export InformationFluctuation
 
 """
-    FluctuationComplexity <: InformationMeasure
-    FluctuationComplexity(; definition = Shannon())
+    InformationFluctuation <: InformationMeasure
+    InformationFluctuation(; definition = Shannon())
 
-The "fluctuation complexity" quantifies the standard deviation of the information content of the states
+The information fluctuation quantifies the standard deviation of the information content of the states
 ``\\omega_i`` around some summary statistic ([`InformationMeasure`](@ref)) of a PMF. Specifically, given some
 outcome space ``\\Omega`` with outcomes ``\\omega_i \\in \\Omega``
 and a probability mass function ``p(\\Omega) = \\{ p(\\omega_i) \\}_{i=1}^N``, it is defined as
 
 ```math
-\\sigma_{I_Q}(p) := \\sqrt{\\sum_{i=1}^N p_i(I_Q(p_i) - H_*)^2}
+\\sigma_{I_Q}(p) := \\sqrt{\\sum_{i=1}^N p_i(I_Q(p_i) - F_Q)^2}
 ```
 
-where ``I_Q(p_i)`` is the [`self_information`](@ref) of the i-th outcome with respect to the information
-measure of type ``Q`` (controlled by `definition`).
+where ``I_Q(p_i)`` is the [`information_content`](@ref) of the i-th outcome with respect to the information
+measure ``F_Q`` (controlled by `definition`).
 
+## Compatible with
+
+- [`Shannon`](@ref)
+- [`Tsallis`](@ref)
+- [`Curado`](@ref)
+- [`StretchedExponential`](@ref)
+- [`ShannonExtropy`](@ref)
+
+If `definition` is the [`Shannon`](@ref) entropy, then we recover the
+[Shannon-type "information fluctuation complexity"](https://en.wikipedia.org/wiki/Information_fluctuation_complexity)
+from [Bates1993](@cite).
+
 ## Properties
 
-If `definition` is the [`Shannon`](@ref) entropy, then we recover the
-[Shannon-type information fluctuation complexity](https://en.wikipedia.org/wiki/Information_fluctuation_complexity)
-from [Bates1993](@cite). Then the fluctuation complexity is zero for PMFs with only a single non-zero element, or
+The information fluctuation is zero for PMFs with only a single non-zero element, or
 for the uniform distribution.
 
-If `definition` is not Shannon entropy, then the properties of the measure varies, and does not necessarily share the
-properties [Bates1993](@cite).
+## Examples
+
+```julia
+using ComplexityMeasures
+using Random; rng = Xoshiro(55543)
+
+# Information fluctuation for a time series encoded by ordinal patterns
+x = rand(rng, 10000)
+def = Tsallis(q = 2) # information measure definition
+pest = RelativeAmount() # probabilities estimator
+o = OrdinalPatterns(m = 3) # outcome space / discretization method
+information(InformationFluctuation(definition = def), pest, o, x)
+```
+
+!!! note "Potential for new research"
+    As far as we know, using other information measures besides Shannon entropy for the
+    fluctuation complexity hasn't been explored in the literature yet. Our implementation, however, allows for it.
+    We're currently writing a paper outlining the generalizations to other measures. For now, we verify
+    correctness of the measure through numerical examples in our test-suite.
 """
-struct FluctuationComplexity{M <: InformationMeasure, I <: Integer} <: InformationMeasure
+struct InformationFluctuation{M <: InformationMeasure, I <: Integer} <: InformationMeasure
     definition::M
     base::I
 
-    function FluctuationComplexity(; definition::D = Shannon(base = 2), base::I = 2) where {D, I}
-        if D isa FluctuationComplexity
-            throw(ArgumentError("Cannot use `FluctuationComplexity` as the summary statistic for `FluctuationComplexity`. Please select some other information measures, like `Shannon`."))
+    function InformationFluctuation(; definition::D = Shannon(base = 2), base::I = 2) where {D, I}
+        if D isa InformationFluctuation
+            throw(ArgumentError("Cannot use `InformationFluctuation` as the summary statistic for `InformationFluctuation`. Please select some other information measures, like `Shannon`."))
         end
         return new{D, I}(definition, base)
     end
 end
 
 # Fluctuation complexity is zero when p_i = 1/N or when p = (1, 0, 0, ...).
-function information(e::FluctuationComplexity, probs::Probabilities)
+function information(e::InformationFluctuation, probs::Probabilities)
     def = e.definition
     non0_probs = Iterators.filter(!iszero, vec(probs))
     h = information(def, probs)
-    s = self_information(def, probs)
-    return sqrt(sum(pᵢ * (s - h)^2 for pᵢ in non0_probs))
+    return sqrt(sum(pᵢ * (self_information(def, pᵢ, length(probs)) - h)^2 for pᵢ in non0_probs))
 end
+
+function information_normalized(e::InformationFluctuation, probs::Probabilities)
+    def = e.definition
+    non0_probs = Iterators.filter(!iszero, vec(probs))
+    h = information(def, probs)
+    info_fluct = sqrt(sum(pᵢ * (self_information(def, pᵢ, length(probs)) - h)^2 for pᵢ in non0_probs))
+    return info_fluct / h
+end
 
 # The maximum is not generally known.
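The defining formula in the docstring above can be sanity-checked by hand against the Wikipedia example used in the package's test-suite. A minimal standalone sketch, assuming only the Shannon surprisal `I(pᵢ) = -log2(pᵢ)` and no package API:

```julia
# Standalone check of the Shannon-type information fluctuation
# σ = sqrt(Σᵢ pᵢ (I(pᵢ) - H)²) for the PMF from the Wikipedia example.
p = [2//17, 2//17, 1//34, 5//34, 2//17, 2//17, 2//17, 4//17]
selfinfo(pᵢ) = -log2(pᵢ)                 # Shannon surprisal, in bits
nz = filter(>(0), p)                     # drop zero-probability outcomes
H = sum(pᵢ * selfinfo(pᵢ) for pᵢ in nz)  # Shannon entropy of the PMF
σ = sqrt(sum(pᵢ * (selfinfo(pᵢ) - H)^2 for pᵢ in nz))
round(σ, digits = 2)  # 0.56, matching the value asserted in the test-suite
```

Because the deviations `I(pᵢ) - H` are weighted by the probabilities themselves, σ is a genuine standard deviation of the information content, which is why it vanishes exactly when all surprisals coincide with H.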
test/infomeasures/infomeasure_types/fluctuation_complexity.jl (16 changes: 9 additions & 7 deletions)

@@ -1,9 +1,11 @@
 # Examples from https://en.wikipedia.org/wiki/Information_fluctuation_complexity
 # for the Shannon fluctuation complexity.
-p = Probabilities([2//17, 2//17, 1//34, 5//34, 2//17, 2//17, 2//17, 4//17])
-def = Shannon(base = 2)
-c = FluctuationComplexity(definition = def, base = 2)
-@test round(information(c, p), digits = 2) ≈ 0.56
-# Zero both for uniform and single-element PMFs.
-@test information(c, Probabilities([0.2, 0.2, 0.2, 0.2, 0.2])) == 0.0
-@test information(c, Probabilities([1.0, 0.0])) == 0.0
+@testset "fluctuation_complexity" begin
+    probs = Probabilities([2//17, 2//17, 1//34, 5//34, 2//17, 2//17, 2//17, 4//17])
+    def = Shannon(base = 2)
+    c = FluctuationComplexity(definition = def, base = 2)
+    @test round(information(c, probs), digits = 2) ≈ 0.56
+    # Zero both for uniform and single-element PMFs.
+    @test information(c, Probabilities([0.2, 0.2, 0.2, 0.2, 0.2])) == 0.0
+    @test information(c, Probabilities([1.0, 0.0])) == 0.0
+end
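The two zero-value test cases follow directly from the definition: for the uniform distribution every surprisal equals the entropy H, and for a one-point PMF the single non-zero outcome has surprisal 0 = H. A standalone sketch using only the Shannon surprisal (no package API) makes this concrete:

```julia
# Shannon-type information fluctuation computed directly from its definition.
selfinfo(pᵢ) = -log2(pᵢ)
function shannon_fluctuation(p)
    nz = filter(>(0), p)                     # ignore zero-probability outcomes
    H = sum(pᵢ * selfinfo(pᵢ) for pᵢ in nz)  # Shannon entropy
    return sqrt(sum(pᵢ * (selfinfo(pᵢ) - H)^2 for pᵢ in nz))
end

shannon_fluctuation(fill(0.2, 5))  # uniform: every I(pᵢ) = log2(5) = H, so ≈ 0
shannon_fluctuation([1.0, 0.0])    # degenerate: I(1) = 0 = H, so 0.0
```

Note that for the uniform case the result is only zero up to floating-point round-off, which is why comparisons in practice should use a tolerance rather than exact equality.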
Datseris commented:

Unfortunately I don't agree with the latest change of requiring `N`. It seems simpler, and more reasonable, to simply not allow Curado to be part of this interface. The opposite, defining the information unit as depending on `N`, doesn't make much sense, at least not with how Shannon introduced it.
kahaaga replied:

@Datseris `Curado` is not the only information measure whose surprisal/self-information depends explicitly on `N` when following the definition of an information measure as a probability-weighted average of the surprisal (as I do in the paper).

In the context of the Shannon information unit alone, I agree. But the point of this interface is to generalize the Shannon information unit. This inevitably introduces `N` as a parameter.

Can we discuss the final API when I'm done with writing up the paper? I'm not too far from finishing it; I just need to generate a few example applications. Since I am using this PR for the paper analyses, it would be nice not to change anything in this draft PR until the paper is ready.
kahaaga: The alternative is to have `information_content`/`information_unit`, which dispatches to a subset of `InformationMeasure`s, and then `generalized_information_content`/`generalized_information_unit`, which dispatches to those `InformationMeasure`s whose generalization of the information unit depends on `N`. But that kind of defeats the purpose of having an interface to begin with, since we're back at defining multiple functions with different names for things that are fundamentally identical (modulo the parameter `N`).
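The single-function alternative being argued for can be sketched with multiple dispatch. All type names and the N-dependent formula below are hypothetical placeholders, not the package's actual API: the point is only that one generic `self_information(measure, pᵢ, N)` covers both kinds of measures, with N-independent ones simply ignoring the extra argument.

```julia
# Hypothetical sketch of a single generic surprisal interface.
abstract type Measure end
struct ShannonLike <: Measure end       # surprisal independent of N
struct NDependentLike <: Measure end    # surprisal depends on N (placeholder)

self_information(::ShannonLike, pᵢ, N) = -log2(pᵢ)               # ignores N
self_information(::NDependentLike, pᵢ, N) = (1 - pᵢ) / (N - 1)   # illustrative formula only

# One probability-weighted-average definition then covers both cases,
# with no need for a parallel `generalized_*` family of functions.
avg_info(m::Measure, p) = sum(pᵢ * self_information(m, pᵢ, length(p)) for pᵢ in p if pᵢ > 0)

avg_info(ShannonLike(), [0.5, 0.25, 0.25])  # Shannon entropy in bits: 1.5
```

The cost, as the thread notes, is that `N` becomes part of the signature even for measures that never use it; the benefit is a single dispatch point instead of two near-identical function families.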
kahaaga: PS: I sent you a link to the paper draft, @Datseris