hisparc-update and data view crashes #182

Closed
153957 opened this issue Apr 18, 2017 · 5 comments

@153957
Member

153957 commented Apr 18, 2017

See HiSPARC/datastore#23 for a bit more info.

Two stations have uploaded a lot of configurations to the datastore.
These cause issues for the publicdb.
Once the raw data is cleaned up, the database needs to be cleaned as well.
Both Summaries and Configurations require cleaning.

@tomkooij
Member

tomkooij commented Apr 21, 2017

Delete all configs for a given station/date:

from __future__ import print_function
import os
import sys
from datetime import date

PUBLICDB_PATH = '/srv/publicdb/www'
sys.path.append(PUBLICDB_PATH)

os.environ['DJANGO_SETTINGS_MODULE'] = 'publicdb.settings'

import django
django.setup()

from publicdb.histograms.models import Summary, Configuration

summary = Summary.objects.get(station__number=4, date=date(2017, 4, 15))
print(summary)
configs = Configuration.objects.filter(source=summary)
print('n = ', configs.count())
for config in configs.iterator():
    config.delete()

n_configs = Configuration.objects.filter(source=summary).count()
print('after delete: n = ', n_configs)
summary.num_config = n_configs
summary.save()

This script will take some time! (>>10 min in my VM)
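
The loop above deletes one configuration per call, which may account for part of the run time. A minimal sketch of batching the deletes is given below. The helper `delete_in_batches` and its `delete_batch` callback are hypothetical names, not part of publicdb; whether batching is actually faster for this table has not been measured here.

```python
def delete_in_batches(primary_keys, delete_batch, batch_size=1000):
    """Pass successive slices of primary_keys to delete_batch.

    Hypothetical helper, not part of publicdb. Returns the number of
    keys handed to delete_batch.
    """
    total = 0
    for start in range(0, len(primary_keys), batch_size):
        batch = primary_keys[start:start + batch_size]
        delete_batch(batch)
        total += len(batch)
    return total


# Possible wiring to the script above (assumption, untested against publicdb):
#   pks = list(configs.values_list('pk', flat=True))
#   delete_in_batches(
#       pks, lambda batch: Configuration.objects.filter(pk__in=batch).delete())
```

Passing the delete step in as a callback keeps the helper independent of Django, so it can be tried on a plain list first.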

Delete a summary (and thus all attached histograms, configs etc):

from __future__ import print_function
import os
import sys
from datetime import date

PUBLICDB_PATH = '/srv/publicdb/www'
sys.path.append(PUBLICDB_PATH)

os.environ['DJANGO_SETTINGS_MODULE'] = 'publicdb.settings'

import django
django.setup()

from publicdb.histograms.models import Summary

summary = Summary.objects.get(station__number=4, date=date(2017, 4, 15))
print(summary)

summary.delete()

@tomkooij
Member

tomkooij commented Apr 22, 2017

The (incorrect) configs were removed using the final script in HiSPARC/datastore@6fb022c

The following patch was applied to the "live" pique:

diff --git a/publicdb/histograms/jobs.py b/publicdb/histograms/jobs.py
index e37f399..cbd142b 100644
--- a/publicdb/histograms/jobs.py
+++ b/publicdb/histograms/jobs.py
@@ -723,6 +723,10 @@ def shrink_dataset(dataset, interval):

 def update_config(summary):
     cluster, station_number = get_station_cluster_number(summary.station)
+    if station_number in [4, 91, 508]:
+        logger.info('Skipping configs for station %d' % station_number)
+        return summary.num_config
+
     file, configs, blobs = datastore.get_config_messages(cluster,
                                                          station_number,
                                                          summary.date)

This patch skips new configs (in the raw datastore) for stations 4, 91 and 508.
After fixing the issue and migrating the raw data (removing the bogus configs) we can just git reset --hard back to the current master.

@153957
Member Author

153957 commented Jun 22, 2017

Is this patch still active since #186 was fixed and probably deployed?
I assume those stations are no longer creating those large numbers of configs?

@tomkooij
Member

I don't think it is. And #183 (a more stable fix) is not merged yet...

There's no rush. At least not until we deploy a new version of the DAQ ;-)

@tomkooij
Member

It happened again today 👎

The patch/hotfix was lost in an update while #183 was not yet live. We then received data for Apr 15-17, so the well over 500,000 configs were imported again.

I deleted the configs again and updated pique.

Config updates are now skipped when a summary has too many configs:

2017-06-30 13:45:14,468 - INFO - [20925] - Updating configuration messages for Summary: 4 - 15 Apr 2017
2017-06-30 13:45:14,518 - ERROR - [20925] - Summary: 4 - 15 Apr 2017: Too many configs: 248013. Skipping.
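
The log lines above suggest a guard on the number of configs per summary. A minimal sketch of such a check follows. The function name `should_import_configs` and the limit `MAX_CONFIGS_PER_DAY = 100` are assumptions made for illustration; the actual code and threshold deployed on pique are not shown in this thread.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical limit; the real threshold is not stated in this thread.
MAX_CONFIGS_PER_DAY = 100


def should_import_configs(summary_label, n_configs, limit=MAX_CONFIGS_PER_DAY):
    """Return True if n_configs is within the limit, otherwise log and return False."""
    if n_configs > limit:
        logger.error('%s: Too many configs: %d. Skipping.',
                     summary_label, n_configs)
        return False
    return True
```

Called with the values from the log, `should_import_configs('Summary: 4 - 15 Apr 2017', 248013)` returns `False` and logs an error in the same wording as the message shown.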

@kaspervd will remove the configs from the raw data.
