{"id":194391,"date":"2023-08-10T14:34:00","date_gmt":"2023-08-10T18:34:00","guid":{"rendered":"https:\/\/ibkrcampus.com\/?p=194391"},"modified":"2023-08-10T14:34:36","modified_gmt":"2023-08-10T18:34:36","slug":"how-to-build-a-logistic-regression-model-from-scratch-in-r","status":"publish","type":"post","link":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/","title":{"rendered":"How to Build a Logistic Regression Model from Scratch in R"},"content":{"rendered":"\n<p><em>Originally posted on <a href=\"https:\/\/theautomatic.net\/2018\/10\/02\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/\">TheAutomatic.net<\/a>.<\/em><\/p>\n\n\n\n<p><em>Excerpt<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-background\"><strong>Background<\/strong><\/h2>\n\n\n\n<p>In a&nbsp;<a href=\"https:\/\/theautomatic.net\/2017\/12\/11\/vectorize-fuzzy-matching\/\">previous post<\/a>, we showed how using vectorization in R can vastly speed up fuzzy matching. Here, we will show you how to use vectorization to efficiently build a&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Logistic_regression\">logistic regression<\/a>&nbsp;model from scratch in R. 
Now we could just use the&nbsp;<a href=\"https:\/\/topepo.github.io\/caret\/index.html\">caret<\/a>&nbsp;or&nbsp;<a href=\"https:\/\/www.rdocumentation.org\/packages\/stats\/versions\/3.5.1\">stats<\/a>&nbsp;packages to create a model, but building algorithms from scratch is a great way to develop a better understanding of how they work under the hood.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-definitions-amp-assumptions\"><strong>Definitions &amp; Assumptions<\/strong><\/h2>\n\n\n\n<p>In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions:<\/p>\n\n\n\n<p><strong>x<\/strong>&nbsp;= A dxn matrix of&nbsp;<em>d<\/em>&nbsp;predictor variables, where each column x<sub>i<\/sub>&nbsp;represents the vector of predictors corresponding to one data point (with&nbsp;<em>n<\/em>&nbsp;such columns i.e.&nbsp;<em>n<\/em>&nbsp;data points)<\/p>\n\n\n\n<p>d = The number of predictor variables (i.e. the number of dimensions)<\/p>\n\n\n\n<p>n = The number of data points<\/p>\n\n\n\n<p><strong>y<\/strong>&nbsp;= A vector of labels i.e. y<sub>i<\/sub>&nbsp;equals the label associated with x<sub>i<\/sub>; in our case we\u2019ll assume y = 1 or -1<\/p>\n\n\n\n<p><strong>\u0398<\/strong>&nbsp;= The vector of coefficients, \u0398<sub>1<\/sub>, \u0398<sub>2<\/sub>\u2026\u0398<sub>d<\/sub>&nbsp;trained via gradient descent. These correspond to x<sub>1<\/sub>, x<sub>2<\/sub>\u2026x<sub>d<\/sub><\/p>\n\n\n\n<p>\u03b1 = Step size, controls the rate of gradient descent<\/p>\n\n\n\n<p>Logistic regression, being a binary classification algorithm, outputs a probability between 0 and 1 of a given data point being associated with a positive label. 
This probability is given by the equation below:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1100\" height=\"161\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-1-1100x161.png\" alt=\"\" class=\"wp-image-194394 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-1-1100x161.png 1100w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-1-700x102.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-1-300x44.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-1-768x112.png 768w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-1.png 1150w\" data-sizes=\"(max-width: 1100px) 100vw, 1100px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1100px; aspect-ratio: 1100\/161;\" \/><\/figure>\n\n\n\n<p>Recall that &lt;<strong>\u0398<\/strong>, x&gt; refers to the&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Dot_product\">dot product<\/a>&nbsp;of&nbsp;<strong>\u0398<\/strong>&nbsp;and&nbsp;<strong>x<\/strong>.<\/p>\n\n\n\n<p>In order to calculate the above formula, we need to know the value of&nbsp;<strong>\u0398<\/strong>. Logistic regression uses a method called&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Gradient_descent\">gradient descent<\/a>&nbsp;to learn the value of&nbsp;<strong>\u0398<\/strong>. 
There are a few variations of gradient descent, but the variation we will use here is based upon the following update formula for \u0398<sub>j<\/sub>:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1100\" height=\"111\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-2-1100x111.png\" alt=\"\" class=\"wp-image-194396 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-2-1100x111.png 1100w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-2-700x71.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-2-300x30.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-2-768x78.png 768w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-2.png 1148w\" data-sizes=\"(max-width: 1100px) 100vw, 1100px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1100px; aspect-ratio: 1100\/111;\" \/><\/figure>\n\n\n\n<p>This formula updates the j<sup>th<\/sup>&nbsp;element of the&nbsp;<strong>\u0398<\/strong>&nbsp;vector. Logistic regression models run this gradient descent update of&nbsp;<strong>\u0398<\/strong>&nbsp;until either 1) a maximum number of iterations has been reached or 2) the difference between the current update of&nbsp;<strong>\u0398<\/strong>&nbsp;and the previous value is below a set threshold. To run this update of theta, we\u2019re going to write the following function, which we\u2019ll break down further along in the post. 
This function will update the entire&nbsp;<strong>\u0398<\/strong>&nbsp;vector in one function call i.e.&nbsp;<em>all j elements<\/em>&nbsp;of&nbsp;<strong>\u0398<\/strong>.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># function to update theta via gradient descent\nupdate_theta &lt;- function(theta_arg,n,x,y,d,alpha)\n{\n   \n  # calculate numerator of the derivative\n  numerator &lt;- t(replicate(d , y)) * x\n   \n  # perform element-wise multiplication between theta and x,\n  # prior to getting their dot product\n  theta_x_prod &lt;- t(replicate(n,theta_arg)) * t(x)\n  dotprod &lt;- rowSums(theta_x_prod)\n   \n  denominator &lt;- 1 + exp(-y * dotprod)\n   \n  # cast the denominator as a matrix\n  denominator &lt;- t(replicate(d,denominator)) \n   \n  # final step, get new theta result based off update formula\n  theta_arg &lt;- theta_arg - alpha * rowSums(numerator \/ denominator) \n   \n   \n  return(theta_arg)\n   \n}<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Simplifying the update formula<\/strong><\/h2>\n\n\n\n<p>To simplify the update formula for \u0398<sub>j<\/sub>, we need to calculate the&nbsp;<a href=\"https:\/\/mathworld.wolfram.com\/Derivative.html\">derivative<\/a>&nbsp;in the formula above.<\/p>\n\n\n\n<p>Let\u2019s suppose z = (y<sub>i<\/sub>)(&lt;<strong>\u0398<\/strong>, x<sub>i<\/sub>&gt;). 
We\u2019ll abbreviate the summation of 1 to n by simply using \u03a3.<\/p>\n\n\n\n<p>Then, calculating the derivative gives us the following:<\/p>\n\n\n\n<p>\u03a3 (1 \/ (1 + exp(z))) * exp(z) * x<sub>i<\/sub>y<sub>i<\/sub><\/p>\n\n\n\n<p>= \u03a3 (exp(z) \/ (1 + exp(z))) * x<sub>i<\/sub>y<sub>i<\/sub><\/p>\n\n\n\n<p>Since exp(z) \/ (1 + exp(z)) is identical to 1 \/ (1 + exp(-z)), we can substitute above to get:<\/p>\n\n\n\n<p>\u03a3 (1 \/ (1 + exp(-z))) * x<sub>i<\/sub>y<sub>i<\/sub><\/p>\n\n\n\n<p>= \u03a3 x<sub>i<\/sub>y<sub>i<\/sub>&nbsp;\/ (1 + exp(-z))<\/p>\n\n\n\n<p>Now, substituting (y<sub>i<\/sub>)(&lt;<strong>\u0398<\/strong>, x<sub>i<\/sub>&gt;) back for z:<\/p>\n\n\n\n<p>= \u03a3 x<sub>i<\/sub>y<sub>i<\/sub>&nbsp;\/ (1 + exp(-(y<sub>i<\/sub>)(&lt;<strong>\u0398<\/strong>, x<sub>i<\/sub>&gt;)))<\/p>\n\n\n\n<p>Plugging this derivative result into the rest of the update formula, the expression below tells us how to update \u0398<sub>j<\/sub>:<\/p>\n\n\n\n<p>\u0398<sub>j<\/sub>&nbsp;\u2190 \u0398<sub>j<\/sub>&nbsp;\u2013 \u03b1\u03a3 x<sub>i<\/sub>y<sub>i<\/sub>&nbsp;\/ (1 + exp(-(y<sub>i<\/sub>)(&lt;<strong>\u0398<\/strong>, x<sub>i<\/sub>&gt;)))<\/p>\n\n\n\n<p>To convert this math into R code, we\u2019ll split up the formula above into three main steps:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Calculate the numerator: x<sub>i<\/sub>y<sub>i<\/sub><\/li>\n\n\n\n<li>Calculate the denominator: (1 + exp(-(y<sub>i<\/sub>)(&lt;<strong>\u0398<\/strong>, x<sub>i<\/sub>&gt;)))<\/li>\n\n\n\n<li>Plug the results from the first two steps back into the formula above<\/li>\n<\/ul>\n\n\n\n<p>The idea to keep in mind throughout this post is that we\u2019re not going to use a for loop to update each j<sup>th<\/sup>&nbsp;element of&nbsp;<strong>\u0398<\/strong>. Instead, we\u2019re going to use vectorization to update the entire&nbsp;<strong>\u0398<\/strong>&nbsp;vector at once via element-wise matrix multiplication. 
This will vastly speed up the gradient descent implementation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step 1) Calculating the numerator<\/strong><\/h2>\n\n\n\n<p>In the numerator of the derivative, we have the summation of 1 to n of y<sub>i<\/sub>&nbsp;times x<sub>i<\/sub>. Effectively, this means we have to calculate the following:<\/p>\n\n\n\n<p><strong>y<sub>1<\/sub><\/strong><strong>x<sub>1<\/sub><\/strong>&nbsp;+&nbsp;<strong>y<sub>2<\/sub><\/strong><strong>x<sub>2<\/sub><\/strong>\u2026+<strong>y<sub>n<\/sub><\/strong><strong>x<sub>n<\/sub><\/strong><\/p>\n\n\n\n<p>This calculation needs to be done for all j elements in&nbsp;<strong>\u0398<\/strong>&nbsp;to fully update the vector. So, we actually need to run the following calculations:<\/p>\n\n\n\n<p><strong>y<sub>1<\/sub><\/strong><strong>x<sub>1,1<\/sub><\/strong>&nbsp;+&nbsp;<strong>y<sub>2<\/sub><\/strong><strong>x<sub>1,2<\/sub><\/strong>\u2026+<strong>y<sub>n<\/sub><\/strong><strong>x<sub>1,n<\/sub><\/strong><\/p>\n\n\n\n<p><strong>y<sub>1<\/sub><\/strong><strong>x<sub>2,1<\/sub><\/strong>&nbsp;+&nbsp;<strong>y<sub>2<\/sub><\/strong><strong>x<sub>2,2<\/sub><\/strong>\u2026+<strong>y<sub>n<\/sub><\/strong><strong>x<sub>2,n<\/sub><\/strong><\/p>\n\n\n\n<p>\u2026<\/p>\n\n\n\n<p><strong>y<sub>1<\/sub><\/strong><strong>x<sub>d,1<\/sub><\/strong>&nbsp;+&nbsp;<strong>y<sub>2<\/sub><\/strong><strong>x<sub>d,2<\/sub><\/strong>\u2026+<strong>y<sub>n<\/sub><\/strong><strong>x<sub>d,n<\/sub><\/strong><\/p>\n\n\n\n<p>Since&nbsp;<strong>y<\/strong>&nbsp;is a vector, or put another way, is an nx1 matrix, and&nbsp;<strong>x<\/strong>&nbsp;is a dxn matrix, where d is the number of predictor variables i.e.&nbsp;<strong>x<sub>1<\/sub>, x<sub>2<\/sub>, x<sub>3<\/sub>\u2026x<sub>d<\/sub><\/strong>, and n is the number of labels (how many predicted values we have), then we can compute the above calculations by creating a dxn matrix where each row is a duplicate of&nbsp;<strong>y<\/strong>. 
This way we have d duplicates of&nbsp;<strong>y<\/strong>. Each ith element of&nbsp;<strong>x<\/strong>&nbsp;(i.e. x<sub>i<\/sub>) corresponds to the i<sup>th<\/sup>&nbsp;<em>column<\/em>&nbsp;of&nbsp;<strong>x<\/strong>. So x<sub>j,i<\/sub>&nbsp;refers to the element in the j<sup>th<\/sup>&nbsp;row and i<sup>th<\/sup>&nbsp;column of&nbsp;<strong>x<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1100\" height=\"179\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-3-1100x179.png\" alt=\"\" class=\"wp-image-194399 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-3-1100x179.png 1100w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-3-700x114.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-3-300x49.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-3-768x125.png 768w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-3.png 1134w\" data-sizes=\"(max-width: 1100px) 100vw, 1100px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1100px; aspect-ratio: 1100\/179;\" \/><\/figure>\n\n\n\n<p>In the above expression, instead of doing traditional matrix multiplication, we are going to do element-wise multiplication i.e. the j<sup>th<\/sup>, i<sup>th<\/sup>&nbsp;element of the resultant matrix will equal the j<sup>th<\/sup>, i<sup>th<\/sup>&nbsp;elements of each matrix multiplied by each other. 
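<\/p>\n\n\n\n<p>As a quick illustration of the difference, here is a minimal sketch on a made-up toy example. The values of <strong>x<\/strong> and <strong>y<\/strong> below, and the names y_mat and elementwise, are assumptions chosen for illustration and do not come from the original post. The final line checks that the row sums of the element-wise product agree with the ordinary matrix product of <strong>x<\/strong> and <strong>y<\/strong>:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># toy data (assumed for illustration): d = 2 predictors, n = 3 data points\nd &lt;- 2\nn &lt;- 3\nx &lt;- matrix(c(1, 2, 3, 4, 5, 6), nrow = d, ncol = n)\ny &lt;- c(1, -1, 1)\n \n# d x n matrix in which every row is a copy of y\ny_mat &lt;- t(replicate(d, y))\n \n# element-wise product: entry (j, i) equals y[i] * x[j, i]\nelementwise &lt;- y_mat * x\n \n# compare the row sums with the ordinary matrix product of x and y\nall.equal(rowSums(elementwise), as.vector(x %*% y))<\/pre>\n\n\n\n<p>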
For the numerator, this element-wise multiplication can be done in one line of R code, like below:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># calculate numerator of the derivative of the loss function\nnumerator &lt;- t(replicate(d , y)) * x<\/pre>\n\n\n\n<p>First, <strong>replicate(d, y)<\/strong>&nbsp;creates an nxd matrix in which each of the d columns is equal to y<sub>1<\/sub>, y<sub>2<\/sub>,\u2026y<sub>n<\/sub>. We then transpose this into a dxn matrix so that we can perform the element-wise multiplication described above. This element-wise multiplication gets us the following dxn matrix result:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"770\" height=\"111\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-4.png\" alt=\"\" class=\"wp-image-194424 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-4.png 770w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-4-700x101.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-4-300x43.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/theautomaticnet-r-built-logistic-regression-4-768x111.png 768w\" data-sizes=\"(max-width: 770px) 100vw, 770px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 770px; aspect-ratio: 770\/111;\" \/><\/figure>\n\n\n\n<p>Notice how the sum of each j<sup>th<\/sup>&nbsp;row corresponds to each calculation above, i.e. 
the sum of row 1 is \u03a3x<sub>i<\/sub>y<sub>i<\/sub>&nbsp;for j = 1, the sum of row 2 is \u03a3x<sub>i<\/sub>y<sub>i<\/sub>&nbsp;for j = 2\u2026the sum of row d is \u03a3x<sub>i<\/sub>y<sub>i<\/sub>&nbsp;for j = d. Rather than using slower for loops, this allows us to calculate the numerator of the derivative given above for each element (\u0398<sub>j<\/sub>) in&nbsp;<strong>\u0398<\/strong>&nbsp;using vectorization. We\u2019ll postpone doing the summation until after we\u2019ve calculated the denominator piece of the derivative.<\/p>\n\n\n\n<p><em>Visit <a href=\"https:\/\/theautomatic.net\/2018\/10\/02\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/\">TheAutomatic.net<\/a> to read how to calculate the denominator<\/em>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.<\/p>\n","protected":false},"author":388,"featured_media":194444,"comment_status":"open","ping_status":"closed","sticky":true,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[339,343,338,341,342],"tags":[806,487,6591,15689],"contributors-categories":[13695],"class_list":{"0":"post-194391","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science","8":"category-programing-languages","9":"category-ibkr-quant-news","10":"category-quant-development","11":"category-r-development","12":"tag-data-science","13":"tag-r","14":"tag-rstats","15":"tag-vectorization-in-r","16":"contributors-categories-theautomatic-net"},"pp_statuses_selecting_workflow":false,"pp_workflow_action":"current","pp_status_selection":"publish","acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.9 (Yoast SEO v27.5) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>How to Build a Logistic Regression Model from Scratch in 
R<\/title>\n<meta name=\"description\" content=\"In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/194391\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Build a Logistic Regression Model from Scratch in R | IBKR Campus US\" \/>\n<meta property=\"og:description\" content=\"In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/\" \/>\n<meta property=\"og:site_name\" content=\"IBKR Campus US\" \/>\n<meta property=\"article:published_time\" content=\"2023-08-10T18:34:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-08-10T18:34:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"563\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andrew Treadway\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrew Treadway\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\n\t    \"@context\": \"https:\\\/\\\/schema.org\",\n\t    \"@graph\": [\n\t        {\n\t            \"@type\": \"NewsArticle\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/#article\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/\"\n\t            },\n\t            \"author\": {\n\t                \"name\": \"Andrew Treadway\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/d4018570a16fb867f1c08412fc9c64bc\"\n\t            },\n\t            \"headline\": \"How to Build a Logistic Regression Model from Scratch in R\",\n\t            \"datePublished\": \"2023-08-10T18:34:00+00:00\",\n\t            \"dateModified\": \"2023-08-10T18:34:36+00:00\",\n\t            \"mainEntityOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/\"\n\t            },\n\t            \"wordCount\": 1072,\n\t            \"commentCount\": 0,\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/08\\\/R-programming-user-pressing-button.jpg\",\n\t            \"keywords\": [\n\t    
            \"Data Science\",\n\t                \"R\",\n\t                \"rstats\",\n\t                \"vectorization in R\"\n\t            ],\n\t            \"articleSection\": [\n\t                \"Data Science\",\n\t                \"Programming Languages\",\n\t                \"Quant\",\n\t                \"Quant Development\",\n\t                \"R Development\"\n\t            ],\n\t            \"inLanguage\": \"en-US\",\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"CommentAction\",\n\t                    \"name\": \"Comment\",\n\t                    \"target\": [\n\t                        \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/#respond\"\n\t                    ]\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"WebPage\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/\",\n\t            \"name\": \"How to Build a Logistic Regression Model from Scratch in R | IBKR Campus US\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\"\n\t            },\n\t            \"primaryImageOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/#primaryimage\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/#primaryimage\"\n\t            },\n\t            
\"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/08\\\/R-programming-user-pressing-button.jpg\",\n\t            \"datePublished\": \"2023-08-10T18:34:00+00:00\",\n\t            \"dateModified\": \"2023-08-10T18:34:36+00:00\",\n\t            \"description\": \"In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.\",\n\t            \"inLanguage\": \"en-US\",\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"ReadAction\",\n\t                    \"target\": [\n\t                        \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/\"\n\t                    ]\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"ImageObject\",\n\t            \"inLanguage\": \"en-US\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/how-to-build-a-logistic-regression-model-from-scratch-in-r\\\/#primaryimage\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/08\\\/R-programming-user-pressing-button.jpg\",\n\t            \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/08\\\/R-programming-user-pressing-button.jpg\",\n\t            \"width\": 1000,\n\t            \"height\": 563,\n\t            \"caption\": \"Quant\"\n\t        },\n\t        {\n\t            \"@type\": \"WebSite\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"name\": \"IBKR Campus US\",\n\t            \"description\": \"Financial Education from Interactive Brokers\",\n\t            \"publisher\": 
{\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"SearchAction\",\n\t                    \"target\": {\n\t                        \"@type\": \"EntryPoint\",\n\t                        \"urlTemplate\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/?s={search_term_string}\"\n\t                    },\n\t                    \"query-input\": {\n\t                        \"@type\": \"PropertyValueSpecification\",\n\t                        \"valueRequired\": true,\n\t                        \"valueName\": \"search_term_string\"\n\t                    }\n\t                }\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"Organization\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\",\n\t            \"name\": \"Interactive Brokers\",\n\t            \"alternateName\": \"IBKR\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"logo\": {\n\t                \"@type\": \"ImageObject\",\n\t                \"inLanguage\": \"en-US\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\",\n\t                \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"width\": 669,\n\t                \"height\": 669,\n\t                \"caption\": \"Interactive Brokers\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\"\n\t            },\n\t            
\"publishingPrinciples\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/about-ibkr-campus\\\/\",\n\t            \"ethicsPolicy\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/cyber-security-notice\\\/\"\n\t        },\n\t        {\n\t            \"@type\": \"Person\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/d4018570a16fb867f1c08412fc9c64bc\",\n\t            \"name\": \"Andrew Treadway\",\n\t            \"description\": \"Andrew Treadway currently works as a Senior Data Scientist, and has experience doing analytics, software automation, and ETL. He completed a master\u2019s degree in computer science \\\/ machine learning, and an undergraduate degree in pure mathematics. Connect with him on LinkedIn: https:\\\/\\\/www.linkedin.com\\\/in\\\/andrew-treadway-a3b19b103\\\/In addition to TheAutomatic.net blog, he also teaches in-person courses on Python and R through my NYC meetup: more details.\",\n\t            \"sameAs\": [\n\t                \"https:\\\/\\\/theautomatic.net\\\/about-me\\\/\"\n\t            ],\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/author\\\/andrewtreadway\\\/\"\n\t        }\n\t    ]\n\t}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"How to Build a Logistic Regression Model from Scratch in R","description":"In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/194391\/","og_locale":"en_US","og_type":"article","og_title":"How to Build a Logistic Regression Model from Scratch in R | IBKR Campus US","og_description":"In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.","og_url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/","og_site_name":"IBKR Campus US","article_published_time":"2023-08-10T18:34:00+00:00","article_modified_time":"2023-08-10T18:34:36+00:00","og_image":[{"width":1000,"height":563,"url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg","type":"image\/jpeg"}],"author":"Andrew Treadway","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Andrew Treadway","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/#article","isPartOf":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/"},"author":{"name":"Andrew Treadway","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/d4018570a16fb867f1c08412fc9c64bc"},"headline":"How to Build a Logistic Regression Model from Scratch in R","datePublished":"2023-08-10T18:34:00+00:00","dateModified":"2023-08-10T18:34:36+00:00","mainEntityOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/"},"wordCount":1072,"commentCount":0,"publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg","keywords":["Data Science","R","rstats","vectorization in R"],"articleSection":["Data Science","Programming Languages","Quant","Quant Development","R Development"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/","url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/","name":"How to Build a Logistic Regression Model from Scratch in R | IBKR Campus 
US","isPartOf":{"@id":"https:\/\/ibkrcampus.com\/campus\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/#primaryimage"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg","datePublished":"2023-08-10T18:34:00+00:00","dateModified":"2023-08-10T18:34:36+00:00","description":"In developing our code for the logistic regression algorithm, we will consider the following definitions and assumptions.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/how-to-build-a-logistic-regression-model-from-scratch-in-r\/#primaryimage","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg","width":1000,"height":563,"caption":"Quant"},{"@type":"WebSite","@id":"https:\/\/ibkrcampus.com\/campus\/#website","url":"https:\/\/ibkrcampus.com\/campus\/","name":"IBKR Campus US","description":"Financial Education from Interactive 
Brokers","publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ibkrcampus.com\/campus\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/ibkrcampus.com\/campus\/#organization","name":"Interactive Brokers","alternateName":"IBKR","url":"https:\/\/ibkrcampus.com\/campus\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","width":669,"height":669,"caption":"Interactive Brokers"},"image":{"@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/"},"publishingPrinciples":"https:\/\/www.interactivebrokers.com\/campus\/about-ibkr-campus\/","ethicsPolicy":"https:\/\/www.interactivebrokers.com\/campus\/cyber-security-notice\/"},{"@type":"Person","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/d4018570a16fb867f1c08412fc9c64bc","name":"Andrew Treadway","description":"Andrew Treadway currently works as a Senior Data Scientist, and has experience doing analytics, software automation, and ETL. He completed a master\u2019s degree in computer science \/ machine learning, and an undergraduate degree in pure mathematics. 
Connect with him on LinkedIn: https:\/\/www.linkedin.com\/in\/andrew-treadway-a3b19b103\/ In addition to his TheAutomatic.net blog, he also teaches in-person courses on Python and R through his NYC meetup.","sameAs":["https:\/\/theautomatic.net\/about-me\/"],"url":"https:\/\/www.interactivebrokers.com\/campus\/author\/andrewtreadway\/"}]}},"jetpack_featured_media_url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/08\/R-programming-user-pressing-button.jpg","_links":{"self":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/194391","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/users\/388"}],"replies":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/comments?post=194391"}],"version-history":[{"count":0,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/194391\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media\/194444"}],"wp:attachment":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media?parent=194391"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/categories?post=194391"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/tags?post=194391"},{"taxonomy":"contributors-categories","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/contributors-categories?post=194391"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}