Comments (5)
stephen.oconnell said
at 4:39 pm on Jun 26, 2010
Mike, thanks for putting the files out there. I got distracted yesterday and then spent all day raking pine needles, my hands are killing... I have converted the files to CSV to make it more straightforward for the MATLAB and Python folks.
stephen.oconnell said
at 5:26 pm on Jun 26, 2010
I would suggest using the CSV files and not the Rdget files. Three of the samples had an 'x' for a classification, which is invalid in this context. I have assigned them to the right class in the CSV files. If you want to use the Rdget files, run the following code, which loads labels_y.Rdget and corrects those three labels:
# LOAD FILE INTO lbl
lbl <- dget("labels_y.Rdget")
# FIX THE 'x' SAMPLES
lbl$Y[lbl$SAMPLE == 'SAMPLE_49'] <- '5'
lbl$Y[lbl$SAMPLE == 'SAMPLE_786'] <- '5'
lbl$Y[lbl$SAMPLE == 'SAMPLE_1250'] <- '3'
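After applying a fix like this, it may be worth confirming that no invalid labels are left. The snippet below is only a sketch: it uses a small made-up stand-in for lbl (the SAMPLE and Y column names follow the fix above, the rows themselves are hypothetical) and assumes the valid classes are 1-6.

```r
# Hypothetical stand-in for the labels data frame; with the real data,
# use lbl <- dget("labels_y.Rdget") instead.
lbl <- data.frame(SAMPLE = c("SAMPLE_48", "SAMPLE_49", "SAMPLE_50"),
                  Y = c("2", "x", "6"),
                  stringsAsFactors = FALSE)

# Same style of fix as above, applied to the stand-in
lbl$Y[lbl$SAMPLE == 'SAMPLE_49'] <- '5'

# Any sample whose label is not one of the classes 1-6 is reported here
valid_classes <- as.character(1:6)
bad <- lbl$SAMPLE[!(lbl$Y %in% valid_classes)]
if (length(bad) > 0) {
  print(bad)
} else {
  print(table(lbl$Y))
}
```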
stephen.oconnell said
at 10:12 pm on Jun 26, 2010
I have uploaded a version of the problem presentation. I had to grey a few things out since it is being posted to the web...
mike@mbowles.com said
at 6:03 am on Jun 27, 2010
I've uploaded my copies of Stephen's CPU data. There are three files.
1. labels_y.Rdget - a vector of class labels, 1-6, in accord with Stephen's presentation in class.
2. avg_matrix.Rdget.tar.gz - a matrix where each row contains 130 measurements of the average daily load for a single server.
3. p95_matrix.Rdget.tar.gz - a matrix where each row contains 130 measurements of the 95th percentile load for a single server.
The ith row of the vector of class labels and the ith rows of the two matrices of daily load measurements all correspond to the same server, so we can use the labels in conjunction with either or both of the matrices.
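Here is a rough sketch of how the row alignment could be used to build a feature matrix from either or both load matrices. The helper name combine_cpu_data is made up for illustration, the Y column name is taken from Stephen's fix above, and the demo data is a tiny synthetic stand-in (the real matrices have 130 columns). With the real files you would first extract the two .tar.gz archives and read each object with dget(); the extracted file names are an assumption on my part.

```r
# Hypothetical helper: check that labels and both matrices have one row
# per server, then return the labels with each feature set.
combine_cpu_data <- function(lbl, avg, p95) {
  avg <- as.matrix(avg)
  p95 <- as.matrix(p95)
  # Row i of every object must describe the same server
  stopifnot(nrow(lbl) == nrow(avg), nrow(avg) == nrow(p95))
  colnames(avg) <- paste0("avg_", seq_len(ncol(avg)))
  colnames(p95) <- paste0("p95_", seq_len(ncol(p95)))
  list(y = factor(lbl$Y),
       X_avg = avg,
       X_p95 = p95,
       X_both = cbind(avg, p95))
}

# Synthetic stand-in: 3 servers, 4 days of measurements each.
# With the real data (file names assumed):
#   lbl <- dget("labels_y.Rdget"); avg <- dget("avg_matrix.Rdget"); ...
lbl_demo <- data.frame(SAMPLE = paste0("SAMPLE_", 1:3),
                       Y = c("1", "5", "3"),
                       stringsAsFactors = FALSE)
avg_demo <- matrix(runif(12), nrow = 3)
p95_demo <- matrix(runif(12), nrow = 3)

d <- combine_cpu_data(lbl_demo, avg_demo, p95_demo)
print(dim(d$X_both))
```

The stopifnot() call is there because everything downstream relies on the rows matching up; if one of the files was subset or reordered, it is better to fail right away than to train on mismatched labels.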
mike@mbowles.com said
at 6:39 am on Jun 27, 2010
I just uploaded the .R file that will run Friedman's boosted tree program - MART. The two commands at the end of that file
yp<-martpred(X)
table(yp,y)
have to be run separately to generate the prediction and the table. I'm not sure why.
You cannot download MART using the normal "Load Package". To download MART, go to http://www.stanford.edu/class/stats315b/ This is Prof. Friedman's course page for 315b. Click on "Homework 2". That page gives download URLs and instructions. I've also uploaded the help files for MART and some of its related programs.