Provide an STMO-accessible version database #260
See bug 1405614 for an example :)
Do you know what kind of APIs we need to provide for that to happen?
I'm not familiar with this project so I'm not quite sure but, AFAICT, an intermediate CSV can be generated. If that's correct, we could simply schedule a recurring job that grabs the CSV, loads it into a temp view in Spark, and saves it to a Parquet file. I'm not an expert, so I'll happily flag @mreid-moz to make sure I didn't say anything too stupid :-D
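For concreteness, a minimal PySpark sketch of such a recurring job might look like the following. The CSV location, output bucket, and schema handling are all assumptions here, not confirmed details of buildhub's export:

```python
# A minimal sketch of the proposed recurring job, assuming the intermediate
# CSV is already available at an S3 path readable by the Spark cluster.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("buildhub-to-parquet").getOrCreate()

# Hypothetical locations; the real CSV path and output bucket would come
# from the buildhub and Telemetry teams.
CSV_PATH = "s3://buildhub-exports/builds.csv"
OUTPUT_PATH = "s3://telemetry-parquet/buildhub/v1"

# Load the intermediate CSV and register it as a temp view.
builds = spark.read.csv(CSV_PATH, header=True, inferSchema=True)
builds.createOrReplaceTempView("builds")

# Any light cleanup or column selection could happen here via SQL.
cleaned = spark.sql("SELECT * FROM builds")

# Save the result as Parquet so STMO-facing jobs can pick it up.
cleaned.write.mode("overwrite").parquet(OUTPUT_PATH)
```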
If we could call a webhook each time we add a new build, that would be better and near real-time.
If you can incorporate an HTTP POST when a new build arrives, that could work very well and integrate with the generic ingestion service.
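A hedged sketch of what that POST might look like from buildhub's side, assuming an ingestion endpoint that accepts JSON documents; the endpoint URL and payload shape are hypothetical:

```python
# Sketch of a webhook call fired whenever a new build lands, assuming a
# generic ingestion endpoint that accepts JSON over HTTP POST.
import requests

# Hypothetical endpoint; the real URL would be defined by the ingestion service.
INGESTION_ENDPOINT = "https://ingestion.example.mozilla.org/submit/buildhub/build/1"

def notify_new_build(build_record: dict) -> None:
    """POST a single build record to the ingestion service as JSON."""
    response = requests.post(INGESTION_ENDPOINT, json=build_record, timeout=10)
    response.raise_for_status()

# Example usage when a new build arrives (fields are illustrative):
# notify_new_build({"build_id": "20171013100928", "channel": "release"})
```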
That would be perfect.
Pretty sure @peterbe did something along these lines for Socorro.
What we did for Socorro is that we upload (a subset of) every single crash we process into an S3 bucket. The uploads are put into an S3 "directory" which is a date, e.g. "/20171013/". Additionally, we uploaded the JSON Schema that describes this subset into the root of that S3 bucket. That way, when @mreid-moz's cron job runs, it fetches the JSON Schema to generate the Scala code (right?) that packages up the JSON blobs into Parquet files.

In terms of configuration, we just let the Socorro Ops people talk to the Telemetry Ops people so they can set up IAM policies for reading. The S3 bucket belongs to the AWS org that Socorro uses (if that matters).

Also, Mark and I wrote a Python script that converts the JSON Schema into Scala code, but I'm not sure if that's still used. We also have a policy about the versioning of the JSON Schema (basically a key called
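The upload pattern described above could be sketched roughly like this; the bucket name and key layout are assumptions based on the description, not Socorro's actual code:

```python
# Sketch of the Socorro-style layout: each record goes under a date-named
# "directory" in an S3 bucket, and the JSON Schema sits at the bucket root.
import json
import datetime

import boto3

BUCKET = "buildhub-telemetry-export"  # hypothetical bucket name
s3 = boto3.client("s3")

def upload_schema(schema: dict) -> None:
    """Store the JSON Schema at the bucket root so downstream jobs can fetch it."""
    s3.put_object(
        Bucket=BUCKET,
        Key="schema.json",
        Body=json.dumps(schema).encode("utf-8"),
        ContentType="application/json",
    )

def upload_record(record_id: str, record: dict) -> None:
    """Upload one record under a /YYYYMMDD/ date prefix, as described above."""
    date_prefix = datetime.date.today().strftime("%Y%m%d")
    s3.put_object(
        Bucket=BUCKET,
        Key=f"{date_prefix}/{record_id}.json",
        Body=json.dumps(record).encode("utf-8"),
        ContentType="application/json",
    )
```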
I would like to wait for the HTTP service to be ready, then :)
cc @jasonthomas re: ingestion stuff
It would be very useful to have the information provided by buildhub available as a dataset accessible from STMO. This would allow us, among other things, to join it with other datasets (e.g. the update dataset, as sketched below) and make precise build information available to other consumers.
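To illustrate the kind of join this would enable, here is a sketch assuming a hypothetical `buildhub` table alongside an update-related dataset, both already registered as tables in the metastore; the table and column names are illustrative, not real schemas:

```python
# Sketch of joining a hypothetical buildhub table against an update dataset
# to attach precise version/channel information to each build id.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("buildhub-join-example").getOrCreate()

# Assumes `update_dataset` and `buildhub` are registered tables; names are
# placeholders, not the actual dataset names.
query = """
SELECT u.build_id,
       b.version,
       b.channel,
       COUNT(*) AS update_pings
FROM update_dataset u
JOIN buildhub b
  ON u.build_id = b.build_id
GROUP BY u.build_id, b.version, b.channel
"""
result = spark.sql(query)
result.show()
```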