\name{buildBag}
\alias{buildBag}

\title{
  Build a bagged model
}

\description{
  The model is built in two steps: a variable selection followed by an
  ordinary least squares fit of the regression hyperplane. This
  procedure is repeated on non-parametric bootstrap samples (bagging).
}

\usage{
buildBag(v = NULL, p = NULL, d = NULL, t1 = NULL, 
         na.rm = NULL, B = NULL, shuffle = NULL)
}

\arguments{
  \item{v}{list as returned by \code{\link{setVariables}}}
  \item{p}{integer vector containing the years to use for model building}
  \item{d}{numeric matrix as returned by \code{\link{getSeries}}}
  \item{t1}{string, date of prediction within the calendar year
    according to the pattern "MM-DD". \code{t1} is the first day of the
    forecast window.}
  \item{na.rm}{boolean, remove missing values whilst aggregating the
    variables and estimating moments for variable scaling? Passed to
    \code{\link{getPredictand}} and \code{\link{getPredictors}} and thus
    overrides any \code{na.rm} arguments in \code{v$args}.}
  \item{B}{integer, number of bootstrap replicates}
  \item{shuffle}{boolean, shuffle the predictand vector? Shuffling takes
    place in each bootstrap replicate and can be used to obtain a
    reference model whose apparent skill is known to be artificial.}
}

\details{
  All predictors as well as the predictand are scaled to standard
  deviation one. The predictand is additionally centred to mean zero. The
  variable selection procedure simply uses Pearson's correlation
  coefficient to estimate the appropriate aggregation period for each
  predictor. The regression coefficients are the ordinary least squares
  estimates as returned by \code{\link{.lm.fit}}.
}

\value{
  List containing three arrays with the bootstrap replicates on the
  first dimension and with:
  \item{mo}{"b*" (intercept and regression coefficients), and "ag*"
    (selected aggregation periods) on the second dimension}
  \item{sc}{the variables on the second and fields "mn" (mean value)
    and "sd" (standard deviation) on the third dimension}
  \item{va}{the years on the second, and fields "test" (used for model
    testing?), "o" (observations), and "p" (predictions) on the third
    dimension}
}

\note{
  An error occurs if \code{p} contains a year for which the aggregated
  predictand or any of the aggregated predictors is missing. The
  \code{na.rm} argument only allows a few missing values to be ignored
  when aggregating the variables; it does not remove years for which
  observations are not available at all.
}
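
\examples{
## A minimal, self-contained sketch of the procedure described in the
## Details section. It is an illustration in base R under stated
## assumptions, NOT the implementation used by buildBag(): the data are
## synthetic, and the names cand, y, mo as well as the three candidate
## aggregation periods are hypothetical.
set.seed(1)
n <- 30   # hypothetical number of years
B <- 20   # number of bootstrap replicates

## One predictor aggregated over three candidate periods (columns)
cand <- matrix(rnorm(n * 3), nrow = n,
               dimnames = list(NULL, c("ag10", "ag20", "ag30")))
## Synthetic predictand that depends on the second candidate
y <- 2 * cand[, "ag20"] + rnorm(n, sd = 0.5)

## Replicates on the first dimension, coefficients and selection on the second
mo <- matrix(NA_real_, nrow = B, ncol = 3,
             dimnames = list(NULL, c("b0", "b1", "ag")))

for (i in seq_len(B)) {
  idx <- sample.int(n, replace = TRUE)   # non-parametric bootstrap sample
  yb <- y[idx]
  xb <- cand[idx, , drop = FALSE]

  ## Variable selection: candidate with the largest absolute
  ## Pearson correlation with the predictand
  k <- which.max(abs(cor(xb, yb)))

  ## Scaling: predictor to standard deviation one, predictand to
  ## mean zero and standard deviation one
  xs <- xb[, k] / sd(xb[, k])
  ys <- (yb - mean(yb)) / sd(yb)

  ## Ordinary least squares estimates (intercept and slope)
  fit <- .lm.fit(cbind(1, xs), ys)
  mo[i, ] <- c(fit$coefficients, k)
}

## How often was each candidate aggregation period selected?
table(mo[, "ag"])
}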
