Construct MART(tm) model:

mart (x, y=x[,ncol(x)], lx=rep(1,ncol(x)), martmode="regress", niter=200,
      tree.size=6, learn.rate=max(0.01,0.1*min(1,(nrow(x)-abs(ntest))/10000)),
      loss.cri=4, samp.fract=min(1,max(0.5,(4-(nrow(x)-abs(ntest))/200)/3)),
      wt=rep(1,nrow(x)), wthr=0.1, winsor=0.9, start=T, term.count=10,
      ntest=nrow(x)/5, xmiss=9.0e30, cost.mtx=0, data.check=T,
      tree.store=1000000, cat.store=100000, maxiter=10000, quiet=F)

Required argument:

  x = input (predictor) variable data matrix. (Must be numeric and of
      type matrix or vector.)

Optional arguments:

  y = vector of output (response) variable values, or class labels.
      (See martmode below.)
  lx = vector of input variable flags:
               | 0 => ignore x[,j],
       lx[j] = | 1 => x[,j] = real (orderable) variable,
               | 2 => x[,j] = categorical (unorderable) variable.
  martmode = regression/classification flag:
      martmode="regress" => regression: output (response) values stored
          in y are numeric [for logistic regression (loss.cri=3, see
          below), y in {-1,1}].
      martmode="class" => classification: output (response) values
          stored in y are class labels; they can be of any type (numeric
          or character) as long as all observations of the same class
          are represented by the same label.
  niter = number of iterations (trees).
  tree.size = number of terminal nodes in each tree.
  learn.rate = regularization (shrinkage) factor [Friedman 1999a,
      Section 5].
  loss.cri = optimization loss criterion (martmode="regress" only):
                 | 1 => least-absolute-deviation [Friedman 1999a,
                 |      Section 4.2],
      loss.cri = | 2 => least-squares [Friedman 1999a, Section 4.1],
                 | 3 => logistic likelihood (y in {-1,1} only)
                 |      [Friedman 1999a, Section 4.4],
                 | 4 => Huber-M [Friedman 1999a, Section 4.3].
  samp.fract = fraction of training observations randomly sampled at
      each iteration [Friedman 1999b].
  wt = vector of observation weights.
  wthr = influence trimming speed-up (martmode="class" or loss.cri=3
      only). At each iteration, ignore the observations of smallest
      influence whose influences sum to less than wthr times the total
      influence [Friedman 1999a, Section 4.4.1].
  winsor = 1 - breakdown parameter for M-regression (martmode="regress"
      and loss.cri=4 only) [Friedman 1999a, Section 4.3].
  start = initialization flag:
      start = | T => new problem; begin with the first iteration.
              | F => continue iterating from the last iteration of the
              |      previous run (see procedure moremart).
  term.count = minimum number of (non-zero weight) training observations
      in each terminal node of each tree.
  ntest: abs(ntest) = number of observations held out as the test
      sample.
      ntest > 0 => test observations are randomly selected from the
                   input data.
      ntest < 0 => the test set is the last abs(ntest) observations.
  xmiss = missing value flag. Must be larger than any data value of any
      input variable.
  cost.mtx = misclassification cost matrix (martmode="class" only):
      cost.mtx[khat,k] = cost of classifying class(k) as class(khat).
      Here k and khat are the internal mart codes for the respective
      class labels; they can be obtained by calling procedure
      classlabels(y). If cost.mtx is not specified, all
      misclassification costs are taken to have the same value, 1.
  data.check = data checking flag:
      data.check = | T => check input data (x, y, wt) for
                   |      inconsistencies.
                   | F => do not check input data (saves time).
  tree.store = maximum storage allocated for storing trees.
  cat.store = maximum storage allocated for storing categorical splits.
  maxiter = maximum total number of iterations over all runs on this
      problem.
      Note: tree.store, cat.store, and maxiter need only be changed in
      response to an error message from mart (see martstat).
  quiet = T/F => suppress/list the (martstat) summary at the command
      line upon termination.

Examples:

  mart(xdata)
  mart(xdata,ydata)
  mart(products,profits,c(1,1,2,1,2,0,0,1),niter=1000,tree.size=2,
       learn.rate=0.05,start=F)
  mart(features,labels,martmode="class")
  mart(attributes,concept,vartype,"class")
  mart(symptoms,diagnosis,martmode="class",cost.mtx=costs)

Remarks:

  This is the procedure that builds the MART model. It must be called
  (with start=T) before any other mart procedure is called. It can be
  called repeatedly at any point, with start=F, to perform more
  iterations on the same data (see procedure moremart). The two most
  sensitive parameters are tree.size, which controls the maximum
  interaction level of the model, and learn.rate, which provides
  resistance to overfitting. Several values of tree.size should be
  tried. Values of learn.rate should be in the range [0.01,0.1], with
  smaller samples requiring smaller values. While mart (or moremart) is
  running, model building can be canceled by selecting the window
  running MART and typing Control-c. Execution then stops and no MART
  model is constructed (start=T) or expanded (start=F or moremart).

Related procedures: martstat, moremart, classlabels

References:

  Friedman, J. H. (1999a). Greedy function approximation: a gradient
  boosting machine. http://www-stat.stanford.edu/~jhf/ftp/trebst.ps

  Friedman, J. H. (1999b). Stochastic gradient boosting.
  http://www-stat.stanford.edu/~jhf/ftp/stobst.ps


Make predictions from a MART(tm) model:

yp <- martpred(xp, nit=100000, probs=F, data.check=T)

Required argument:

  xp = vector or matrix of data point(s) to be predicted.

Optional arguments:

  nit = model size (number of trees) to use. If nit > the estimated
      optimal number, the latter is used.
  probs = probability output flag [classification (martmode="class")
      only].
  data.check = data checking flag:
      data.check = | T => check input data (xp) for inconsistencies.
                   | F => do not check input data (saves time).
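To make the role of nit and learn.rate concrete, here is a conceptual sketch (in Python, with hypothetical helper names; this is not MART's internal code) of how a shrunken tree ensemble forms a prediction from its first nit trees, mirroring the way martpred caps nit at the estimated optimal number:

```python
# Conceptual sketch only: each "tree" is stubbed as a plain function
# returning its correction for a data point; MART stores real trees.

def boosted_predict(x, f0, trees, learn_rate, nit):
    """Staged prediction: f0 plus learn_rate times the outputs of the
    first min(nit, len(trees)) trees, echoing how nit caps model size."""
    n_use = min(nit, len(trees))       # nit larger than the model => use all trees
    pred = f0
    for tree in trees[:n_use]:
        pred += learn_rate * tree(x)   # shrinkage: each tree's vote is damped
    return pred

# Toy ensemble of three constant "trees".
trees = [lambda x: 2.0, lambda x: -1.0, lambda x: 0.5]
full = boosted_predict(0.0, 10.0, trees, 0.1, nit=100000)  # all 3 trees: 10.15
part = boosted_predict(0.0, 10.0, trees, 0.1, nit=2)       # first 2 only: 10.1
```

The small learn.rate means each tree contributes only a fraction of its fitted correction, which is the source of the resistance to overfitting noted in the Remarks above.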
Output:

  Regression (martmode="regress"):
      yp = vector of predicted output (response) values (numeric).
  Classification (martmode="class"):
      probs=F => yp = vector of predicted class labels (numeric or
          character).
      probs=T => yp = vector or matrix of predicted class probabilities.
          If xp is a vector (single prediction), yp is a numeric vector
          containing the respective estimated class probabilities at xp.
          If xp is a matrix (multiple predictions), yp is an nrow(xp) by
          nclass matrix, where nclass is the number of distinct class
          labels. In either case, the classes are listed in the order of
          their internal mart codes, which can be obtained by calling
          procedure classlabels().

Examples:

  martpred(c(0.72,4,1.6,73.5))
  yp=martpred(xp,probs=T)

Remarks:

  This procedure can be called after mart (or moremart) to make model
  predictions.

Related procedures: classlabels, marterror


Check status of mart:

martstat()

Arguments: none

Remarks:

  This procedure can be called after mart (or moremart) to inquire about
  its status. Calling martstat() returns the total number of iterations
  so far, the estimated optimal number, and the corresponding estimated
  optimal average-absolute-error, misclassification rate (loss.cri=3),
  or misclassification risk (martmode="class"). While mart is running,
  model building can be canceled by selecting the mart window and typing
  Control-c. Execution then stops and no MART model is constructed or
  expanded (start=F or moremart); in this case martstat() returns
  "MART interrupted". martstat will be reinitialized correctly after the
  next mart() or moremart() command.

Related procedures: mart, moremart
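The misclassification risk reported for martmode="class" combines class probabilities with the cost.mtx argument of mart. As an illustrative sketch (in Python, not MART source code), this is how cost.mtx[khat,k], the cost of predicting class khat when the truth is class k, turns estimated probabilities (such as martpred output with probs=T) into a minimum-risk class prediction:

```python
# Illustrative sketch: minimum-expected-cost classification.
# cost[khat][k] = cost of predicting class code khat when the true
# class code is k, matching the cost.mtx[khat,k] convention.

def min_risk_class(probs, cost):
    """Pick the class code khat minimizing sum_k cost[khat][k]*probs[k]."""
    risks = [sum(c * p for c, p in zip(row, probs)) for row in cost]
    return risks.index(min(risks))

# Two classes; missing class 2 (code index 1) is five times as costly.
probs = [0.7, 0.3]        # estimated P(class | x), in internal-code order
cost = [[0.0, 5.0],       # predict class 1: costly if truth is class 2
        [1.0, 0.0]]       # predict class 2: mild cost if truth is class 1
```

With unit costs the rule reduces to picking the most probable class; with the skewed costs above it predicts class 2 even though class 1 is more probable, which is exactly why specifying cost.mtx can change the labels martpred returns.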