Flatline anomalies in regular period scalar data are not identified #3824

Open
rhyolight opened this issue Apr 4, 2018 · 3 comments

rhyolight commented Apr 4, 2018

In many cases, HTM algorithms fail to detect common flatline anomalies.

The first report of this is from HTM Forum. See:

flatline chart

Later, it was also reported against HTM.Java in a private message from Parag_Goyal. I have copied all the data from that message into this report.

To reproduce this issue, download the flatline-anomaly.zip file. It contains the data and a NuPIC program that replicates the problem. See also the attached output from this program (output MT.xlsx), which includes Excel charts of the anomaly scores and anomaly likelihoods.

screen shot 2018-03-29 at 10 39 00 am

screen shot 2018-03-29 at 10 39 07 am

This phenomenon has been reported in both NuPIC and HTM.Java, so we may assume it is an algorithmic issue. It could be related to how the anomaly scores are calculated, or to how the anomaly likelihoods are calculated.

This is an open issue. We are aware of it but are not prioritizing it for work at this time. We are reporting it so that it is documented, and in case someone is able to figure out what's wrong.

One last note: eventually we would like to add the dataset in flatline-anomaly.zip to NAB. It is a good dataset that represents a very common anomaly in streaming scalar data. We plan to do this by adding a staging area for new datasets in NAB so we can publish them with our next versioned release.

rhyolight commented

Parag0892 commented Apr 5, 2018

This is not just a flatline issue. In this data set (non-flatline-data.csv.zip), the line in the middle of the data is not flat: the metric varies between 1 and 3, yet the anomaly is still not detected.

Anomaly region in data:

screen shot 2018-04-05 at 1 11 37 pm

The overall analyzer output:

screen shot 2018-04-05 at 1 08 39 pm

rhyolight commented

Correct, this is not just a flatline issue. You can see why in this line of code I quoted above:

if metricDistribution["variance"] < 1.5e-5: 
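
To make that threshold concrete, here is a minimal sketch of what a check like this would catch. It assumes `metricDistribution["variance"]` holds the (population) variance of a window of recent metric values, which the quoted line alone does not confirm. The helper name `is_below_variance_floor` and the sample series are hypothetical, not taken from NuPIC.

```python
import statistics

# Threshold copied from the quoted line; everything else is illustrative.
VARIANCE_FLOOR = 1.5e-5


def is_below_variance_floor(values, floor=VARIANCE_FLOOR):
    """Return True when the population variance of `values` is under `floor`.

    Assumption: the real code computes variance over a window of recent
    metric values; the exact estimator it uses is not shown in this thread.
    """
    return statistics.pvariance(values) < floor


# A true flatline: every sample is identical, so the variance is zero.
flat = [42.0] * 100

# A near-flatline: samples alternate between 42.000 and 42.001.
jitter = [42.0 + 0.001 * (i % 2) for i in range(100)]

# A regular periodic signal: a sawtooth cycling through 0..9.
periodic = [float(i % 10) for i in range(100)]

for name, series in [("flat", flat), ("jitter", jitter), ("periodic", periodic)]:
    print(name, is_below_variance_floor(series))
```

Under these assumptions, both the flat and the slightly jittered series fall under the threshold, while the periodic series does not. Whatever branch the real code takes when the condition holds would therefore apply to near-constant windows as well as perfectly constant ones.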
