TODO

VarImpact TODOs

Alan priorities:
- HIGHEST: Add CV-TMLE (also Mark priority!)
  * IN PROGRESS, largely working.
  * Needs more testing with non-binary outcomes. E.g. transform theta?
- Unadjusted difference in means estimate (t-test) to compare to TMLE estimate
- Cell sizes in the groups that it chooses (A = 0 and A = 1)
- Option of ratios (delta method), rather than differences
   - Alan will email Minh, maybe Jeremy too. (may be embedded in TMLE)
- In ByV table, use the labels of the quantiles to show the actual range.
- Specify names of variables to use for variable importance

Chris items:
* Arguments
  - Also support "X" optional argument, ala glm
  - Parameter for the number of covariates after which we use dimensionality reduction (HOPACH).
  - Argument for p-value cutoff for consistent results.
* Coding style
  - Create subfunctions.
* Output
  - Sort consistent results by p-value then estimate size?
  - Support summary()
* Quality
  - Warn/error if there are character columns in the data, as a bonus check.
  - Figure out how to reduce the number of potential failures in various steps.
  - Count and report how many variables are skipped due to minCell constraint.
  - Automatically handle rows that are all NAs?
* Testing
  - Make examples quicker to execute, esp. during devtools::check()
  - Add other tests.
* Performance
  - Try Rprofvis
  - Use Jeremy's method of not wasting SL folds during CV-TMLE.
* Documentation
  - Improve varImpact() roxygen description.
  - Document the return object
  - Vignette
* Examples
  - Use doParallel instead of doMC so that it works on Windows.
* Benchmarking
  - Benchmark to random forest, lasso, OLS and SL-perturbation algorithms.
* BUGS/ERRORS
  - HOPACH can fail twice in a row. Need to figure out why and fix (see factor test 1).
  - HOPACH (Minh example): Error in rowMeans(dmat[labels == ordlabels[j], labels == labels[medoid2]]) :
     * seem to occur in hopach::mssnextlevel()
     * possibly due to call to msssplitcluster()
     * may need a ", drop = F" when subsetting a matrix somewhere.
     * TODO: fork HOPACH and add a dimension check / debug stuff near there.
  'x' must be an array of at least two dimensions
  - Error in predmat[which, seq(nlami)] = preds : replacement has length zero
     - Occurs when estimating TMLE on training samples.
  - Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs,  :
  one multinomial or binomial class has 1 or 0 observations; not allowed
     -- Can happen with BreastCancer dataset, seems to depend on seed.
     -- Happens in parkinson's competition dataset.
     -- Seems to occur when a variable has value with a small number of obs.
      Then when estimating the treatment propensity score for a given fold
      there may be only a single value for all observations.
  - Error in y %*% rep(1, nc) : non-conformable arguments
     -- When estimating TMLE
  - "Missing value(s) in medoidsdata in collap()"
* WARNINGS
  - Warning in collap(data, level, d, dmat, newmed) (conducting HOPACH, at least on factors):
  Not enough medoids to use newmed='medsil' in collap() -
 using newmed='nn' instead
   (CK: suppressing warnings currently to avoid this one.)

From existing documentation:
- Allow reporting of results that randomly do not have estimates for some of validation samples.

Wishlist
- Support clustering or variable reduction algorithms other than HOPACH.
- Support more than 2 external (CV-TMLE) folds.