
Chapter 2: Installation and First Use

Pablo de Oliveira Castro edited this page Jul 24, 2017 · 8 revisions


Introduction

This chapter explains how to download and configure ASK on the user’s workstation.

Download

Download ASK through the Git version control system:

$ git clone https://github.com/benchmark-subsetting/adaptive-sampling-kit.git

The previous command retrieves the current stable version of ASK. To switch to the development version, type:

$ cd adaptive-sampling-kit
$ git checkout devel

Configuration

Before using ASK, make sure all its dependencies are satisfied:

  • Python: at least version 2.6

  • R

  • Libraries: python-numpy, python-scipy and python-argparse

  • Optionally, nosetests, which is only needed to run the regression test suite.

In a Debian or Ubuntu system, use:

$ sudo apt-get install python2.6 r-base python-numpy python-scipy \
       python-argparse python-nose
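To check the Python libraries by hand, a minimal sketch such as the following can help. It only tests whether each module can be imported; the helper name `missing_modules` is hypothetical and not part of ASK, and the configure script described later performs its own checks.

```python
def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing

if __name__ == "__main__":
    # Module names corresponding to the libraries listed above.
    print(missing_modules(["numpy", "scipy", "argparse"]))
```

An empty list means all three libraries are importable from the Python interpreter that ran the check.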

Ensure that R_LIBS contains a writable directory. For example, add the following line to your .bashrc (or equivalent) file:

export R_LIBS=$HOME/.R-libs:$R_LIBS

and create the directory with:

$ mkdir $HOME/.R-libs/
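The R_LIBS requirement can be verified with a short Python sketch. It assumes R_LIBS is a list of directories separated by the platform's path separator (a colon on Linux); the helper name `first_writable_dir` is hypothetical and not part of ASK.

```python
import os

def first_writable_dir(r_libs):
    """Return the first existing, writable directory listed in an
    R_LIBS-style string, or None if there is none."""
    for entry in r_libs.split(os.pathsep):
        if entry and os.path.isdir(entry) and os.access(entry, os.W_OK):
            return entry
    return None

if __name__ == "__main__":
    # Prints None when R_LIBS is unset or has no writable directory.
    print(first_writable_dir(os.environ.get("R_LIBS", "")))
```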

Once the dependencies are installed and R_LIBS configured, enter ASK’s directory and execute the configure script.

$ ./configure
Checking if R is installed
Checking if python is installed
All the dependencies are satisfied.
Building uniform sampling dynamic library (only needed for amart sampler module)

The configure script checks that all the dependencies listed above are properly installed. It also retrieves a set of R packages from CRAN, so make sure the computer is connected to the internet before running it.

The ASK installation is self-contained: it does not install system-wide files. The ask binary can be run directly, or its directory can be added to the PATH environment variable, e.g. in bash:

$ cd adaptive-sampling-kit/
$ export PATH=$PWD:$PATH

The ASK directory contains the following subdirectories:


Go into ASK’s directory and type ask -h to get a brief summary of the command-line options. Chapter 3 explains ASK’s invocation in detail.

Now, consider the simple experiment from the examples directory.

$ cd adaptive-sampling-kit/; export PATH=$PWD:$PATH
$ cd examples/simple
$ ls 
gauss2D.data simple.conf

Observe there are two files:

  • gauss2D.data contains some measurements:
-200 -200 -0.000670925255805024
-199 -200 -0.000694745225748637
-198 -200 -0.000719248849372865
-197 -200 -0.00074444881633483
-196 -200 -0.00077035776068282
-195 -200 -0.0007969882457312
-194 -200 -0.000824352748414127
-193 -200 -0.00085246364311948
-192 -200 -0.000881333185005384
-191 -200 -0.000910973492802606
...

The last column represents the response; the first two columns are factors.
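As an illustration of this layout, the following sketch parses rows of that form into two integer factors and one floating-point response. The function name `parse_row` is hypothetical, and the sketch assumes whitespace-separated columns as in the excerpt above; it is not ASK's own loader.

```python
def parse_row(line):
    """Split one whitespace-separated row into (x1, x2, response)."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError("expected 3 columns, got %d" % len(fields))
    return int(fields[0]), int(fields[1]), float(fields[2])

# First two rows of the excerpt shown above.
sample = """\
-200 -200 -0.000670925255805024
-199 -200 -0.000694745225748637
"""

rows = [parse_row(line) for line in sample.splitlines() if line.strip()]
for x1, x2, response in rows:
    print(x1, x2, response)
```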

The above file contains exhaustive measurements of a design space inspired by an example from the article *tgp: An R package for Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian process models*, R. B. Gramacy, Journal of Statistical Software, 2007:

$$f(x_1,x_2) = \frac{x_1}{100} \cdot e^{-\left(\frac{x_1}{100}\right)^2-\left(\frac{x_2}{100}\right)^2} \textrm{ on } [-200, 600] \times [-200, 600]$$

The design space has two factors, named x1 and x2 in the formula, and the response f(x1,x2).
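The formula can be evaluated directly. The sketch below is a plain Python transcription of it (the function name `f` follows the formula; nothing here is part of ASK), and evaluating it at the first point of gauss2D.data reproduces the response shown in the excerpt up to floating-point rounding.

```python
import math

def f(x1, x2):
    """Response function from the formula above."""
    u, v = x1 / 100.0, x2 / 100.0
    return u * math.exp(-u ** 2 - v ** 2)

# Compare with the first row of the data excerpt: -200 -200 -0.000670925255805024
value = f(-200, -200)
print(abs(value - (-0.000670925255805024)) < 1e-12)
```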

  • simple.conf contains the configuration of the experiment.

The configuration file’s first section, named factors, describes the factors of the experiment:

"factors": [
  {"name": "x",
   "type": "integer",
   "range": {"min": -200, "max": 600}
  },
  {"name": "y",
   "type": "integer",
   "range": {"min": -200, "max": 600}
  }
]

In the experiment, there are two factors called x and y, both of type integer and varying between -200 and 600, bounds included.
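To make the meaning of the inclusive bounds concrete, the sketch below loads the factors section and checks points against it. The fragment is wrapped in an enclosing `{...}` so that it parses as standalone JSON, which is an assumption about the surrounding file; the helpers `is_valid_point` and `design_space_size` are hypothetical and not part of ASK.

```python
import json

# The factors section from above, wrapped in braces to form valid JSON.
config_text = """
{"factors": [
  {"name": "x", "type": "integer", "range": {"min": -200, "max": 600}},
  {"name": "y", "type": "integer", "range": {"min": -200, "max": 600}}
]}
"""

factors = json.loads(config_text)["factors"]

def is_valid_point(point, factors):
    """True if point holds an in-range integer for every factor."""
    for factor in factors:
        value = point.get(factor["name"])
        bounds = factor["range"]
        if not isinstance(value, int):
            return False
        if not (bounds["min"] <= value <= bounds["max"]):
            return False
    return True

def design_space_size(factors):
    """Number of distinct points, counting both bounds of each range."""
    size = 1
    for factor in factors:
        size *= factor["range"]["max"] - factor["range"]["min"] + 1
    return size

print(is_valid_point({"x": -200, "y": 600}, factors))
print(design_space_size(factors))
```

With bounds included, each factor takes 801 values, so the design space contains 801 × 801 = 641601 points.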

The second section, Modules, configures the ASK modules involved in this experiment. The bootstrap module samples five hundred random points, a generalized boosted regression model (gbm) is built from them, and the 2D reporter plots the result. For a full discussion of module parameters, please refer to Chapter 3.
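To give an idea of what sampling five hundred random points means on this design space, here is a minimal sketch. It assumes uniform sampling of distinct integer points, which may differ from what ASK's bootstrap module actually does; the function name `random_points` and its parameters are hypothetical.

```python
import random

def random_points(n, lo=-200, hi=600, seed=None):
    """Draw n distinct integer points uniformly from [lo, hi] x [lo, hi]."""
    rng = random.Random(seed)
    points = set()
    while len(points) < n:
        points.add((rng.randint(lo, hi), rng.randint(lo, hi)))
    return sorted(points)

# A fixed seed makes the draw repeatable between runs.
points = random_points(500, seed=0)
print(len(points))
```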

To run the experiment, type:

$ ask simple.conf
Logging to default.log
Experiments finished normally

The ask driver runs for approximately one minute and reports that the experiment finished without errors. While it is running, the default.log file, created by default in the directory where ASK is invoked, tracks the driver’s progress. ASK saves all the results in the default output directory, output/:

$ ls output/
labelled00000.data  labelled.data  model00000.data  plot00000.png  
prediction00000.data
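When post-processing the output directory, the file names can be grouped by their numeric suffix. The sketch below assumes that the five-digit suffix identifies an iteration of the experiment, which is an interpretation of the listing above rather than something this chapter states; the helper `group_by_iteration` is hypothetical.

```python
import re

# Lowercase name, five-digit suffix, then a .data or .png extension.
PATTERN = re.compile(r"^([a-z]+)(\d{5})\.(data|png)$")

def group_by_iteration(filenames):
    """Map each numeric suffix to the kinds of files carrying it."""
    groups = {}
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            kind, iteration = match.group(1), int(match.group(2))
            groups.setdefault(iteration, []).append(kind)
    return groups

# File names copied from the listing above.
listing = [
    "labelled00000.data", "labelled.data", "model00000.data",
    "plot00000.png", "prediction00000.data",
]
print(group_by_iteration(listing))
```

Files without a numeric suffix, such as labelled.data, are skipped by this sketch.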

Open the plot00000.png file in an image viewer. It shows two level plots:

  • The top one shows the absolute error between the response model and the true response: white is better

  • The bottom one shows the response model built using five hundred samples

  • The samples themselves are marked by tiny circles
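The absolute error shown in the top plot is the point-wise distance between the predicted and the true response. A minimal sketch of that computation follows; the function name `absolute_errors` is hypothetical and the numbers are purely illustrative, not ASK output.

```python
def absolute_errors(predicted, actual):
    """Point-wise absolute error between two equally long sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    return [abs(p - a) for p, a in zip(predicted, actual)]

# Illustrative values only.
errors = absolute_errors([1.0, -0.5, 0.25], [0.5, -0.5, 0.75])
print(max(errors))
```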
