

TUNING A RANDOM FOREST CLASSIFICATION MODEL IN R, PART I

Spencer Antonio Marlen-Starr

Published in GoPenAI · 18 min read · Oct 2, 2023

This article is a follow-up to my previous article, Forecasting with
Classification Models in R, but here I focus exclusively on the
second-to-last model included in that article. The R script which all of the
code snippets in this article come from is called “Predicting stock
performance using just Random Forest script”, and it can be found in the
‘Random Forest only scripts’ folder in the GitHub repo for this project.

Also, all datasets used in this project and in this article came from Kaggle.

Let’s pick up where the previous article left off with random forests; the
random forest was the second-to-last model I ran there.

Note: Every random forest in this article is run using the caret package in R.
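The ctrl object passed to trControl in the snippets below is defined in the previous article’s script rather than shown here. A minimal sketch of how such an object is typically constructed for a two-class, ROC-based search (the resampling settings are assumptions, not the author’s exact values):

```r
library(caret)

# Hypothetical reconstruction of the `ctrl` object used below:
# cross-validation with class probabilities enabled, so that the
# "ROC" summary metric can be computed for a two-class outcome.
ctrl <- trainControl(method = "cv", number = 10,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)
```

Whatever the exact resampling scheme, classProbs = TRUE and summaryFunction = twoClassSummary are required for caret to report ROC instead of accuracy.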


INITIAL RANDOM FOREST MODEL

I initially used the original fitting method for random forests in R, which
is ‘rf’. It does not really allow for much manual tuning: only mtry can go
in the tuning grid, so options such as the number of trees to grow cannot
be tuned directly.
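To illustrate that limitation: with method = "rf", the tuning grid may only contain mtry, while arguments such as ntree can be fixed (but not tuned) by passing them through train() to the underlying randomForest() call. A hedged sketch reusing the object names from the snippets below:

```r
# mtry is the only tunable parameter for method = "rf"; ntree can
# only be *fixed*, by forwarding it to randomForest() through
# train()'s ... argument.
ftRF_fixed <- train(x = data2014, y = class2014,
                    method = "rf", metric = "ROC", trControl = ctrl,
                    tuneGrid = expand.grid(.mtry = 4),  # a single, fixed mtry
                    ntree = 500)  # randomForest()'s default
```

ftRF_fixed is a hypothetical name; the point is that ntree lives outside the tuning grid, so comparing tree counts means fitting separate models by hand.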


TRAINING THE CLASSIFICATION MODEL

Training/fitting a random forest in R using the caret package via the ‘rf’
method automatically grows 500 trees. The .mtry parameter is what is
typically referred to as either m or mtry: the size of the subset of
candidate predictors, where m < P and P is the total number of candidate
predictors/columns in your dataset, that your random forest samples from at
each split while growing each decision tree in the forest.

A standard rule of thumb for m is to use the square root of the number of
candidate independent variables, so that is what I used initially as the
maximum mtry option.

## Random Forest version 1
set.seed(100)  # use the same seed for every model

# Define the tuning grid: every integer mtry from 1 up to sqrt(p),
# since the square root of the total number of variables is a common choice
rfGrid <- expand.grid(.mtry = 1:floor(sqrt(ncol(data2014))))

# Train the random forest model using the caret package
# (the "ROC" metric requires a trControl with classProbs = TRUE
#  and summaryFunction = twoClassSummary)
system.time( ftRF <- train(x = data2014, y = class2014,
                           method = "rf", tuneGrid = rfGrid,
                           metric = "ROC", trControl = ctrl) )

By using .mtry = c(1:sqrt(ncol(data2014))), you instruct R to try random
forests with different-sized subsets of candidate predictors to consider at
every split when growing each tree, from 1 through the square root of the
number of columns in your dataframe, in steps of 1. So for this…
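Once train() finishes, caret keeps the resampled ROC for each candidate mtry; a short sketch of how those results are usually inspected (with ftRF as fit above):

```r
# Cross-validated ROC, sensitivity, and specificity per mtry value
print(ftRF$results)

# The mtry value that maximized the resampled ROC
print(ftRF$bestTune)

# Resampled ROC plotted against mtry
plot(ftRF)
```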





