adknudson/RunningStatistics

RunningStatistics

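Online (single-pass) algorithms for statistical measures, based on the Julia package OnlineStats.jl by Josh Day. Each statistic is updated one observation at a time, which makes the package useful for streaming data and for data sets too large to hold in memory.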

This package is available on NuGet.

Supports .NET Standard 2.0.

List of Statistics

| Statistic | Description |
| --- | --- |
| `Mean` | The univariate mean |
| `Sum` | The overall sum of `double` observations |
| `Sum<T>` | The overall sum of any type that implements `IAdditionOperators` and `IAdditiveIdentity` |
| `Variance` | The univariate variance |
| `Extrema` | The min and max of `double` observations |
| `Extrema<T>` | The min and max of any type that implements `IComparable<T>` |
| `Moments` | The mean, variance, skewness, and (excess) kurtosis |
| `EmpiricalCdf` | Approximate order statistics (quantiles) |
| `CountMap<T>` | Counts for each unique value |
| `Histogram` | A histogram with specified bin edges |
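
As a rough illustration of what "single pass" means for statistics such as the mean and variance, the sketch below keeps only a count, a running mean, and a running sum of squared deviations, and updates them per observation (Welford's method). It also shows how two partial results can be merged without revisiting the data. The `RunningMoments` class is hypothetical and written only for this example; it is not the package's `Mean` or `Variance` implementation, and the choice of the sample (n - 1) variance is an assumption of the sketch.

```csharp
using System;

// Usage: fit two sketches on different parts of a stream, then merge them.
var a = new RunningMoments();
var b = new RunningMoments();
foreach (var x in new[] { 1.0, 2.0, 3.0 }) a.Fit(x);
foreach (var x in new[] { 4.0, 5.0 }) b.Fit(x);
a.Merge(b);
Console.WriteLine($"n={a.Nobs}, mean={a.Mean}, variance={a.Variance}");

// Hypothetical class for illustration only; not part of RunningStatistics.
public sealed class RunningMoments
{
    private double _m2; // running sum of squared deviations from the mean

    public long Nobs { get; private set; }
    public double Mean { get; private set; }

    // Sample variance (n - 1 denominator); NaN until two observations are seen.
    public double Variance => Nobs > 1 ? _m2 / (Nobs - 1) : double.NaN;

    // Welford's update: constant work and memory per observation.
    public void Fit(double value)
    {
        Nobs++;
        var delta = value - Mean;
        Mean += delta / Nobs;
        _m2 += delta * (value - Mean);
    }

    // Pairwise combination of two partial results (Chan et al.).
    public void Merge(RunningMoments other)
    {
        if (other.Nobs == 0) return;
        var n = Nobs + other.Nobs;
        var delta = other.Mean - Mean;
        _m2 += other._m2 + delta * delta * Nobs * other.Nobs / n;
        Mean += delta * other.Nobs / n;
        Nobs = n;
    }
}
```

Because the state is a handful of numbers, memory use does not grow with the number of observations, and the merge step is what makes it possible to process chunks of data independently.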

List of Distributions

| Distribution | Description |
| --- | --- |
| `Normal` | The univariate mean and variance |
| `Beta` | The number of successes and failures |

Common Interface

All running statistics implement the following interfaces:

public interface IRunningStatistic<TObs>
{
    public long Nobs { get; }
    
    public void Fit(TObs value);
    
    public void Fit(TObs value, long count);

    public void Fit(IEnumerable<TObs> values);
    
    public void Fit(IEnumerable<KeyValuePair<TObs, long>> keyValuePairs);

    public void Reset();
    
    public IRunningStatistic<TObs> CloneEmpty();
    
    public IRunningStatistic<TObs> Clone();
    
    public void UnsafeMerge(IRunningStatistic<TObs> other);
}

public interface IRunningStatistic<TObs, TSelf> : IRunningStatistic<TObs> 
    where TSelf : IRunningStatistic<TObs, TSelf>
{
    // 'new' marks these as intentionally hiding the base interface members.
    public new TSelf CloneEmpty();

    public new TSelf Clone();
    
    public void Merge(TSelf other);
}

The interface is split into a base interface IRunningStatistic<TObs> and a derived interface IRunningStatistic<TObs, TSelf>. The base interface declares the members that depend only on the observation type, while the derived interface adds strongly typed CloneEmpty, Clone, and Merge members that use the concrete type TSelf. Because the base interface does not mention TSelf, statistics of different kinds can be stored in a single collection and fit to the same observations.
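
The following self-contained sketch illustrates why the split is useful. It does not use the package itself: `IStat<TObs>` and `IStat<TObs, TSelf>` are trimmed-down stand-ins for the two interfaces above, and `SumStat` and `MaxStat` are hypothetical statistics invented for the example. The runtime type check inside `UnsafeMerge` is an assumption about what such a method might do, not a description of the library's behavior.

```csharp
using System;
using System.Collections.Generic;

// One list can hold different statistics because both share the base interface.
var stats = new List<IStat<double>> { new SumStat(), new MaxStat() };
foreach (var x in new[] { 1.5, 2.5, 4.0 })
{
    foreach (var s in stats)
    {
        s.Fit(x);
    }
}

foreach (var s in stats)
{
    Console.WriteLine($"{s.GetType().Name}: Nobs = {s.Nobs}");
}

// Stand-in for the base interface: depends only on the observation type.
public interface IStat<TObs>
{
    long Nobs { get; }
    void Fit(TObs value);
    void UnsafeMerge(IStat<TObs> other);
}

// Stand-in for the derived interface: adds a merge typed on the concrete class.
public interface IStat<TObs, TSelf> : IStat<TObs>
    where TSelf : IStat<TObs, TSelf>
{
    void Merge(TSelf other);
}

// Hypothetical statistic: running sum of doubles.
public sealed class SumStat : IStat<double, SumStat>
{
    public long Nobs { get; private set; }
    public double Value { get; private set; }

    public void Fit(double value) { Nobs++; Value += value; }

    public void Merge(SumStat other) { Nobs += other.Nobs; Value += other.Value; }

    // Assumed behavior: check the type at runtime, then defer to the typed merge.
    public void UnsafeMerge(IStat<double> other)
    {
        if (other is SumStat sum) Merge(sum);
        else throw new ArgumentException("Cannot merge a different statistic.", nameof(other));
    }
}

// Hypothetical statistic: running maximum of doubles.
public sealed class MaxStat : IStat<double, MaxStat>
{
    public long Nobs { get; private set; }
    public double Value { get; private set; } = double.NegativeInfinity;

    public void Fit(double value) { Nobs++; Value = Math.Max(Value, value); }

    public void Merge(MaxStat other) { Nobs += other.Nobs; Value = Math.Max(Value, other.Value); }

    public void UnsafeMerge(IStat<double> other)
    {
        if (other is MaxStat max) Merge(max);
        else throw new ArgumentException("Cannot merge a different statistic.", nameof(other));
    }
}
```

In this sketch the typed `Merge` is checked at compile time, while code that only holds the base interface has to fall back to the runtime-checked `UnsafeMerge`.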

Abstract Base Class

The abstract class RunningStatisticBase implements the IRunningStatistic<TObs, TSelf> interface and provides default implementations when possible.

Examples

See the RunningStatistics.Examples project for more examples.

Basic Usage

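using System;
using RunningStatistics;

// Two independent running means and one empirical CDF.
var mean1 = new Mean();
var mean2 = new Mean();
var ecdf = new EmpiricalCdf();

// Feed 1000 uniform random observations, one at a time.
var rng = new Random();
for (var i = 0; i < 1000; i++)
{
    var x = rng.NextDouble();
    
    mean1.Fit(x);
    mean2.Fit(2*x);
    
    ecdf.Fit(x);
}

// Merge mean2 into mean1, then query the approximate first quartile.
mean1.Merge(mean2);
var q1 = ecdf.Quantile(0.25);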
