-
Notifications
You must be signed in to change notification settings - Fork 7
A basic loader of HDF5 files into SciDB
License
wangd/SciDB-HDF5
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
SciDB-HDF5 ---------- The SciDB-HDF5 project was designed to allow scientists to load their array data from HDF5 (now FITS as well) files into SciDB. While full functionality is not promised, we have used the code to load large 1D, 2D, and 3D arrays and manipulate them using SciDB. Quick notes ----------- Prerequisites: A checkout of SciDB -- scidb.org/download CMake -- www.cmake.org HDF5 libraries and headers (e.g., yum install hdf5 hdf5-devel) CFITSIO and CCfits (for FITS functionality) RHEL5 RPMS for HDF5 are available at: http://www.slac.stanford.edu/~danielw/hdf/ You need the -static and -devel packages at minimum, and probably the plain hdf5 rpm too. How to build: ------------ Standard cmake steps should work. Suppose you've checked out the source into /home/daniel/scidbhdf 1. Make a build directory, say /home/daniel/scidbhdf-bld 2. cd into your build directory ("cd /home/daniel/scidbhdf-bld") 3. "cmake <your-source-directory>" ("cmake /home/daniel/scidbhdf") 4. Patch locations for dependencies as needed by "ccmake <build-directory>" ("ccmake /home/daniel/scidbhdf-bld", or just "ccmake ." if $PWD=="/home/daniel/scidb-hdf-bld") 5. Within your build directory, "make" The "loader" binary will be created within the "src" subdirectory. It is hardcoded to use a particular HDF5 file and to read a particular tree within. The "libloadhdf.so" is a shared library designed to be a scidb plugin. You can install it by copying it from the src/ directory to your SciDB installation's plugins directory (ROOT/lib/scidb/plugins) The "libloadfits.so" is a shared library designed to be a scidb plugin. It can be installed similarly to the other plugin. Status ------ SciDB-HDF5 is beta-level code. It functions under some conditions but may need development to handle more general usage. It does not contain known memory leaks or scalability problems, but has not undergone extensive testing. FITS *tables* are not currently supported, since they do not have natural counterparts in SciDB. FITS arrays seem to work fine. AFL function signatures: ------------------------ libloadhdf.so: void loadhdf(<full path of HDF5 file>, <path to array within file>) This function produces a temporary array as its output. To store this array permanently, wrap the result inside of SciDB's "store()" operator (see examples below). libloadfits.so: void loadfits(<target array name>, <full path of FITS file>, <HDU number>) This function overwrites the target specified array, which is deleted if necessary and constructed with a shape and form according to the source array's original shape. Please note that loadfits() writes directly to an array, and thus does not really work with distributed arrays. It has not yet been updated to return a temporary array. Patches welcome. Operating notes --------------- The code automatically maps a 1D array of 2D images into a single 3D array, as this was the best behavior for the original use case. Trying out the loader: ---------------------- ## Load the plugin iquery -aq "load_library('loadhdf');" ## Load an array iquery -naq "store(loadhdf('/u24/ldrd/sxrcom10-r0232.h5', '/Configure:0000/Run:0000/CalibCycle:0000/Camera::FrameV1/SxrBeamline.0:Opal1000.1/image'), image);" ## Some SciDB snippets to inspect the loaded array iquery -aq "show(image);" ## Count it iquery -aq "count(image);" ## pixel standard deviations for the first 3 images in the series iquery -aq "stdev(subarray(image,0,0,0,1023,1023,2),int1,unnamed2);" ## Load a compound: iquery -aq "store(loadhdf('/u24/ldrd/sxrcom10-r0232.h5', '/Configure:0000/Run:0000/CalibCycle:0000/Camera::FrameV1/SxrBeamline.0:Opal1000.1/time'), imagetime); ## See its shape: iquery -aq "show(imagetime);" [("imagetime<seconds:uint32 NOT NULL,nanoseconds:uint32 NOT NULL> [unnamed0=0:9223372036854775807,42247,0]")] ### FITS loader examples: ## Load the plugin and the fits file iquery -aq "load_library('loadfits');" iquery -aq "loadfits('s20fits', '/u24/danielw/testdata/S20.fits',1);" ## See what we loaded iquery -aq "show(s20fits);" [("s20fits<unknown:float NOT NULL> [dim0=0:4072,4072,0,dim1=0:4000,4000,0]")] ## Compute an average iquery -aq "avg(s20fits);" [(5.21866)] Notes: ------ 1. You can reconfigure the build type via: For a debug build cmake -DCMAKE_BUILD_TYPE=Debug <build dir> For a release build cmake -DCMAKE_BUILD_TYPE=Release <build dir> e.g. if your build directory is /home/daniel/scidbhdf-bld and you want to configure for debugging, "cmake -DCMAKE_BUILD_TYPE=Debug /home/daniel/scidbhdf-bld" 2. You might find the "make VERBOSE=1" command useful when troubleshooting build problems. All files are distributed under the Apache License 2.0 (described below) unless otherwise noted. Copyright Notice: ----------------- Copyright 2012 Jacek Becla, Daniel Liwei Wang Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
About
A basic loader of HDF5 files into SciDB
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published