{"id":187835,"date":"2023-04-03T12:55:35","date_gmt":"2023-04-03T16:55:35","guid":{"rendered":"https:\/\/ibkrcampus.com\/?p=187835"},"modified":"2023-04-03T14:51:14","modified_gmt":"2023-04-03T18:51:14","slug":"web-browsing-and-parsing-with-robobrowser-and-requests_html","status":"publish","type":"post","link":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/","title":{"rendered":"Web Browsing and Parsing with RoboBrowser and requests_html"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" id=\"h-background\"><strong>Background<\/strong><\/h2>\n\n\n\n<p>So you\u2019ve learned all about&nbsp;<strong>BeautifulSoup<\/strong>. What\u2019s next? Python is a great language for automating web operations. In a&nbsp;<a href=\"https:\/\/theautomatic.net\/2017\/08\/24\/scraping-articles-about-stocks\/\">previous article<\/a>&nbsp;we went through how to use&nbsp;<a href=\"https:\/\/www.crummy.com\/software\/BeautifulSoup\/bs4\/doc\/\">BeautifulSoup<\/a>&nbsp;and requests to scrape stock-related articles from Nasdaq\u2019s website. This post talks about a couple of alternatives to using&nbsp;<strong>BeautifulSoup<\/strong>&nbsp;directly.<\/p>\n\n\n\n<p>One way of scraping and crawling the web is to use Python\u2019s&nbsp;<strong>RoboBrowser<\/strong>&nbsp;package, which is built on top of&nbsp;<a href=\"https:\/\/docs.python-requests.org\/en\/master\/\">requests<\/a>&nbsp;and&nbsp;<a href=\"https:\/\/www.crummy.com\/software\/BeautifulSoup\/\">BeautifulSoup<\/a>. 
Because it\u2019s built using each of these packages, writing code to scrape the web is a bit simplified as we\u2019ll see below.&nbsp;<strong>RoboBrowser<\/strong>&nbsp;works similarly to the older Python 2.x package&nbsp;<strong>mechanize<\/strong>&nbsp;in that it allows you to simulate a web browser.<\/p>\n\n\n\n<p>A second option is using&nbsp;<strong>requests_html<\/strong>, which was also discussed&nbsp;<a href=\"https:\/\/theautomatic.net\/2019\/01\/19\/scraping-data-from-javascript-webpage-python\/\">here<\/a>, and which we\u2019ll also talk more about below.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Let\u2019s get started with RoboBrowser!<\/strong><\/h2>\n\n\n\n<p>To install RoboBrowser, you can use pip:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">pip install robobrowser<\/pre>\n\n\n\n<p>Once installed, we\u2019ll load the package like so:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from robobrowser import RoboBrowser<\/pre>\n\n\n\n<p>Next, let\u2019s create a&nbsp;<strong>RoboBrowser<\/strong>&nbsp;object, which will serve as an invisible browser. 
We can use this browser-like object to navigate to websites, like Google.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># create a RoboBrowser object\nbrowser = RoboBrowser(history = True)\n \n# navigate to Google\nbrowser.open(\"https:\/\/www.google.com\")<\/pre>\n\n\n\n<p>We can verify we\u2019re currently at Google\u2019s homepage by checking the&nbsp;<em>url<\/em>&nbsp;attribute of&nbsp;<strong>browser<\/strong>:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">browser.url # \"https:\/\/www.google.com\/\"<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Submit a form with RoboBrowser<\/strong><\/h2>\n\n\n\n<p>Now that we\u2019ve navigated to Google\u2019s search homepage, let\u2019s simulate searching for a company\u2019s name. We can find the search query form on Google by using the&nbsp;<strong>get_form<\/strong>&nbsp;method.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">search = browser.get_form()<\/pre>\n\n\n\n<p>For webpages with multiple forms, you can use the&nbsp;<strong>get_forms<\/strong>&nbsp;method (note:&nbsp;<em>forms<\/em>&nbsp;instead of&nbsp;<em>form<\/em>). In that case a list of all the forms on the webpage will be returned. Calling&nbsp;<strong>get_form<\/strong>&nbsp;will return only the first form found on the webpage. 
We can see the difference in the snapshots below.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"65\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/robobrowser-get_form-the-automatic-net.jpg\" alt=\"\" class=\"wp-image-187839 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/robobrowser-get_form-the-automatic-net.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/robobrowser-get_form-the-automatic-net-300x30.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/65;\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"62\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/robobrowser-get_forms.jpg\" alt=\"\" class=\"wp-image-187840 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/robobrowser-get_forms.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/robobrowser-get_forms-300x29.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/62;\" \/><\/figure>\n\n\n\n<p>To submit our query for a regular Google search, we write the following code:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">search[\"q\"] = \"aapl\"\n \nbrowser.submit_form(search, submit = 
search.submit_fields[\"btnG\"])<\/pre>\n\n\n\n<p>The first line above sets the value of the form\u2019s&nbsp;<em>q<\/em>&nbsp;field to \u201caapl\u201d. This is equivalent to typing \u201caapl\u201d into Google\u2019s search box.<\/p>\n\n\n\n<p>The&nbsp;<em>search.submit_fields[\u201cbtnG\u201d]<\/em>&nbsp;argument tells our browser object that we want to search using the standard Google search button. If we want to search using the \u201cI\u2019m feeling lucky\u201d option instead, we just tweak our code like this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">browser.submit_form(search, submit = search.submit_fields[\"btnI\"])<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>BeautifulSoup vs. RoboBrowser<\/strong><\/h2>\n\n\n\n<p>We can scrape all of the links off the webpage using the&nbsp;<strong>get_links<\/strong>&nbsp;method.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">links = browser.get_links()\n \nurls = [link.get(\"href\") for link in links]<\/pre>\n\n\n\n<p>In comparison,&nbsp;<strong>BeautifulSoup<\/strong>&nbsp;code for what we did above would look like this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import requests\nfrom bs4 import BeautifulSoup\n \nresp = requests.get(\"https:\/\/www.google.com\/search?q=aapl\")\n \nsoup = BeautifulSoup(resp.content, \"html.parser\")\n \n# get links\nlinks = soup.find_all(\"a\")\n \n# get urls\nurls = [link.get(\"href\") 
for link in links]\n<\/pre>\n\n\n\n<p>The main advantage of&nbsp;<strong>RoboBrowser<\/strong>&nbsp;versus BeautifulSoup \/ requests is that it behaves similarly to an actual browser, so it can fill in forms, like the search query above, or follow links with the&nbsp;<strong>follow_link<\/strong>&nbsp;method, as shown below, all in one package.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">browser.follow_link(links[0])<\/pre>\n\n\n\n<p><strong>RoboBrowser<\/strong>&nbsp;also allows you to effectively use functionality from&nbsp;<strong>BeautifulSoup<\/strong>. Take for example the&nbsp;<strong>find_all<\/strong>&nbsp;method:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># get all links - \"a\" tags\nbrowser.find_all(\"a\")\n \n# find all div tags\nbrowser.find_all(\"div\")<\/pre>\n\n\n\n<p>This works just like in BeautifulSoup, where you pass whatever tag you want to search for on the webpage. 
You also parse out information from various tags the same way.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># get text from link\nlinks[0].text<\/pre>\n\n\n\n<p><strong>RoboBrowser<\/strong>, like a typical web browser, also allows you to \u201cgo back\u201d or \u201cforward\u201d using the&nbsp;<strong>back<\/strong>&nbsp;or&nbsp;<strong>forward<\/strong>&nbsp;methods, respectively.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># go back to previous page\nbrowser.back()\n \n# check URL\nprint(browser.url)\n \n# go forward\nbrowser.forward()\n \n# check URL again\nprint(browser.url)<\/pre>\n\n\n\n<p>In summary,&nbsp;<strong>RoboBrowser<\/strong>&nbsp;gives you the same HTML parsing abilities as&nbsp;<strong>BeautifulSoup<\/strong>, but also allows you to fill out forms, and perform browser-like functions. One drawback of&nbsp;<strong>RoboBrowser<\/strong>&nbsp;is that it is not able to scrape JavaScript-rendered pages, unlike&nbsp;<strong>requests_html<\/strong>.<\/p>\n\n\n\n<p>You can check out the&nbsp;<strong>RoboBrowser<\/strong>&nbsp;documentation&nbsp;<a href=\"https:\/\/robobrowser.readthedocs.io\/en\/latest\/readme.html\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Working with requests_html<\/strong><\/h2>\n\n\n\n<p><strong>requests_html<\/strong>&nbsp;can be installed via pip. 
This package requires Python 3.6+.&nbsp;<strong>requests_html<\/strong>, in effect, provides&nbsp;<strong>requests<\/strong>&nbsp;functionality, while also adding on web parsing abilities.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">pip install requests-html<\/pre>\n\n\n\n<p>To get started with&nbsp;<strong>requests_html<\/strong>, we establish a session object that we will use to connect to webpages and scrape information.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from requests_html import HTMLSession\n \n# establish a session\nsession = HTMLSession()\n \n# connect to needed webpage\nresp = session.get(\"https:\/\/www.google.com\/search?q=aapl\")<\/pre>\n\n\n\n<p>Getting the URLs on the webpage is pretty intuitive:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp.html.links<\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"219\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/requests_html-relative-url-paths-1-the-automatic-net.jpg\" alt=\"\" class=\"wp-image-187846 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/requests_html-relative-url-paths-1-the-automatic-net.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/requests_html-relative-url-paths-1-the-automatic-net-300x103.jpg 300w\" 
data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/219;\" \/><\/figure>\n\n\n\n<p>Note:&nbsp;<strong>resp.html.links<\/strong>&nbsp;doesn\u2019t return link objects; that is, it does not return objects representing anchor (\u201ca\u201d) tags. Rather, it just returns the URLs found on the webpage.<\/p>\n\n\n\n<p>Specifically,&nbsp;<strong>resp.html.links<\/strong>, similar to how we scraped links with&nbsp;<strong>BeautifulSoup<\/strong>&nbsp;or&nbsp;<strong>RoboBrowser<\/strong>, returns the links as they are written in the page, many of which are&nbsp;<em>relative<\/em>. To get the absolute links in any of these cases, you&nbsp;<em>could<\/em>&nbsp;use&nbsp;<strong>urljoin<\/strong>&nbsp;from Python\u2019s standard library, like this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from urllib.parse import urljoin\n \n# taking urls list from above\nfull_urls = [urljoin(\"https:\/\/www.google.com\", url) for url in urls]<\/pre>\n\n\n\n<p><strong>requests_html<\/strong>, on the other hand, can do this internally with just a simple tweak to our line of code above:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp.html.absolute_links<\/pre>\n\n\n\n<p>Here, we use&nbsp;<em>absolute_links<\/em>&nbsp;rather than&nbsp;<em>links<\/em>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"240\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/requests_html-absolute-url-paths-1-the-automatic-net.jpg\" alt=\"\" 
class=\"wp-image-187849 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/requests_html-absolute-url-paths-1-the-automatic-net.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/requests_html-absolute-url-paths-1-the-automatic-net-300x113.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/240;\" \/><\/figure>\n\n\n\n<p>If you want to get the link&nbsp;<em>objects<\/em>, or scrape any other tags, you can use the&nbsp;<strong>find<\/strong>&nbsp;method.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># get all link objects (\"a\" tags)\nresp.html.find(\"a\")\n \n# scrape all div tags\nresp.html.find(\"div\")<\/pre>\n\n\n\n<p>Then we can scrape the text of all the links like this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">links = resp.html.find(\"a\")\n \n# get text of each link\n[link.text for link in links]<\/pre>\n\n\n\n<p>If you want to filter your results to only links whose text contains the word \u201csearch\u201d, you can also do that in one line like this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp.html.find(\"a\", containing = \"search\")<\/pre>\n\n\n\n<p>Or, you can replace 
\u201csearch\u201d with whatever other word you\u2019re looking for.<\/p>\n\n\n\n<p>You can get the attributes of a single tag using its&nbsp;<strong>attrs<\/strong>&nbsp;attribute.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">links[0].attrs<\/pre>\n\n\n\n<p><strong>requests_html<\/strong>&nbsp;also allows you to search for text on a page. In the example below, the&nbsp;<strong>search<\/strong>&nbsp;method matches the template \u201ctrade {} investing\u201d against the page text and captures the word \u201cand\u201d, because that word appears between \u201ctrade\u201d and \u201cinvesting\u201d in the search results.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp.html.search(\"trade {} investing\") # and<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Pagination<\/strong><\/h2>\n\n\n\n<p><strong>requests_html<\/strong>&nbsp;is also able, to an extent, to identify sequential webpages (like the snapshot below).<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp = session.get(\"https:\/\/www.nasdaq.com\/symbol\/nflx\/news-headlines\")<\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"589\" height=\"68\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-sequential-web-pages-the-automatic-net.jpg\" alt=\"\" class=\"wp-image-187852 lazyload\" 
data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-sequential-web-pages-the-automatic-net.jpg 589w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-sequential-web-pages-the-automatic-net-300x35.jpg 300w\" data-sizes=\"(max-width: 589px) 100vw, 589px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 589px; aspect-ratio: 589\/68;\" \/><\/figure>\n\n\n\n<p>Once we connect to this webpage (<a href=\"https:\/\/www.nasdaq.com\/symbol\/nflx\/news-headlines\">https:\/\/www.nasdaq.com\/symbol\/nflx\/news-headlines<\/a>), we can get the URL of the next page in the sequence like this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp.html.next() # \"https:\/\/www.nasdaq.com\/symbol\/nflx\/news-headlines?page=2\"<\/pre>\n\n\n\n<p>To request the next page, we can just do this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp2 = session.get(resp.html.next())<\/pre>\n\n\n\n<p>To request the first ten pages and store each response object in a list, we could do this:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">resp = session.get(\"https:\/\/www.nasdaq.com\/symbol\/nflx\/news-headlines\")\n \nresp_objects = [resp]\n \npage_num = 1\nwhile page_num &lt;= 9:\n    resp = session.get(resp.html.next())\n    
resp_objects.append(resp)    \n    print(resp.url)\n     \n    page_num += 1<\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"218\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-requests_html-pagination-1-the-automatic-net.jpg\" alt=\"\" class=\"wp-image-187854 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-requests_html-pagination-1-the-automatic-net.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-requests_html-pagination-1-the-automatic-net-300x102.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/218;\" \/><\/figure>\n\n\n\n<p>Then we could collect the absolute links found on each of the ten pages:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">urls = [resp.html.absolute_links for resp in resp_objects]\n \nall_urls = []\nfor links_set in urls:\n    all_urls.extend(links_set)<\/pre>\n\n\n\n<p>Check out the documentation for&nbsp;requests_html&nbsp;<a href=\"https:\/\/html.python-requests.org\/\">here<\/a>.<\/p>\n\n\n\n<p>To learn about scraping JavaScript webpages with&nbsp;<strong>requests_html<\/strong>,&nbsp;<a href=\"https:\/\/theautomatic.net\/2019\/01\/19\/scraping-data-from-javascript-webpage-python\/\">click here<\/a>.<\/p>\n\n\n\n<p><em>Originally posted on the <a href=\"https:\/\/theautomatic.net\/2019\/06\/22\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/\">TheAutomatic.net<\/a> blog.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Python is a great language for automating 
web operations.<\/p>\n","protected":false},"author":388,"featured_media":184296,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[339,343,349,338,341,342],"tags":[14986,806,595,14988,15052],"contributors-categories":[13695],"class_list":{"0":"post-187835","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science","8":"category-programing-languages","9":"category-python-development","10":"category-ibkr-quant-news","11":"category-quant-development","12":"category-r-development","13":"tag-beautifulsoup","14":"tag-data-science","15":"tag-python","16":"tag-robobrowser","17":"tag-web-browsing","18":"contributors-categories-theautomatic-net"},"pp_statuses_selecting_workflow":false,"pp_workflow_action":"current","pp_status_selection":"publish","acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.9 (Yoast SEO v27.4) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Web Browsing and Parsing with RoboBrowser and requests_html<\/title>\n<meta name=\"description\" content=\"Python is a great language for automating web operations.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/187835\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Web Browsing and Parsing with RoboBrowser and requests_html | IBKR Campus US\" \/>\n<meta property=\"og:description\" content=\"Python is a great language for automating web operations.\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/\" \/>\n<meta property=\"og:site_name\" content=\"IBKR Campus US\" \/>\n<meta property=\"article:published_time\" content=\"2023-04-03T16:55:35+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-04-03T18:51:14+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"563\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andrew Treadway\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrew Treadway\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\n\t    \"@context\": \"https:\\\/\\\/schema.org\",\n\t    \"@graph\": [\n\t        {\n\t            \"@type\": \"NewsArticle\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/#article\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/\"\n\t            },\n\t            \"author\": {\n\t                \"name\": \"Andrew Treadway\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/d4018570a16fb867f1c08412fc9c64bc\"\n\t            },\n\t            \"headline\": \"Web Browsing and Parsing with RoboBrowser and requests_html\",\n\t           
 \"datePublished\": \"2023-04-03T16:55:35+00:00\",\n\t            \"dateModified\": \"2023-04-03T18:51:14+00:00\",\n\t            \"mainEntityOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/\"\n\t            },\n\t            \"wordCount\": 1059,\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/02\\\/python-notebook.jpg\",\n\t            \"keywords\": [\n\t                \"BeautifulSoup\",\n\t                \"Data Science\",\n\t                \"Python\",\n\t                \"RoboBrowser\",\n\t                \"Web Browsing\"\n\t            ],\n\t            \"articleSection\": [\n\t                \"Data Science\",\n\t                \"Programming Languages\",\n\t                \"Python Development\",\n\t                \"Quant\",\n\t                \"Quant Development\",\n\t                \"R Development\"\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"WebPage\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/\",\n\t            \"name\": \"Web Browsing and Parsing with RoboBrowser and requests_html | IBKR Campus US\",\n\t            \"isPartOf\": 
{\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\"\n\t            },\n\t            \"primaryImageOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/#primaryimage\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/02\\\/python-notebook.jpg\",\n\t            \"datePublished\": \"2023-04-03T16:55:35+00:00\",\n\t            \"dateModified\": \"2023-04-03T18:51:14+00:00\",\n\t            \"description\": \"Python is a great language for automating web operations.\",\n\t            \"inLanguage\": \"en-US\",\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"ReadAction\",\n\t                    \"target\": [\n\t                        \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/\"\n\t                    ]\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"ImageObject\",\n\t            \"inLanguage\": \"en-US\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/web-browsing-and-parsing-with-robobrowser-and-requests_html\\\/#primaryimage\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/02\\\/python-notebook.jpg\",\n\t            \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/02\\\/python-notebook.jpg\",\n\t       
     \"width\": 1000,\n\t            \"height\": 563,\n\t            \"caption\": \"Python: Importing an ipynb File (Jupyter Notebook) from Another ipynb File\"\n\t        },\n\t        {\n\t            \"@type\": \"WebSite\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"name\": \"IBKR Campus US\",\n\t            \"description\": \"Financial Education from Interactive Brokers\",\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"SearchAction\",\n\t                    \"target\": {\n\t                        \"@type\": \"EntryPoint\",\n\t                        \"urlTemplate\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/?s={search_term_string}\"\n\t                    },\n\t                    \"query-input\": {\n\t                        \"@type\": \"PropertyValueSpecification\",\n\t                        \"valueRequired\": true,\n\t                        \"valueName\": \"search_term_string\"\n\t                    }\n\t                }\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"Organization\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\",\n\t            \"name\": \"Interactive Brokers\",\n\t            \"alternateName\": \"IBKR\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"logo\": {\n\t                \"@type\": \"ImageObject\",\n\t                \"inLanguage\": \"en-US\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\",\n\t                \"url\": 
\"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"width\": 669,\n\t                \"height\": 669,\n\t                \"caption\": \"Interactive Brokers\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\"\n\t            },\n\t            \"publishingPrinciples\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/about-ibkr-campus\\\/\",\n\t            \"ethicsPolicy\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/cyber-security-notice\\\/\"\n\t        },\n\t        {\n\t            \"@type\": \"Person\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/d4018570a16fb867f1c08412fc9c64bc\",\n\t            \"name\": \"Andrew Treadway\",\n\t            \"description\": \"Andrew Treadway currently works as a Senior Data Scientist, and has experience doing analytics, software automation, and ETL. He completed a master\u2019s degree in computer science \\\/ machine learning, and an undergraduate degree in pure mathematics. Connect with him on LinkedIn: https:\\\/\\\/www.linkedin.com\\\/in\\\/andrew-treadway-a3b19b103\\\/ In addition to the TheAutomatic.net blog, he also teaches in-person courses on Python and R through his NYC meetup.\",\n\t            \"sameAs\": [\n\t                \"https:\\\/\\\/theautomatic.net\\\/about-me\\\/\"\n\t            ],\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/author\\\/andrewtreadway\\\/\"\n\t        }\n\t    ]\n\t}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Web Browsing and Parsing with RoboBrowser and requests_html","description":"Python is a great language for automating web operations.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/187835\/","og_locale":"en_US","og_type":"article","og_title":"Web Browsing and Parsing with RoboBrowser and requests_html | IBKR Campus US","og_description":"Python is a great language for automating web operations.","og_url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/","og_site_name":"IBKR Campus US","article_published_time":"2023-04-03T16:55:35+00:00","article_modified_time":"2023-04-03T18:51:14+00:00","og_image":[{"width":1000,"height":563,"url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg","type":"image\/jpeg"}],"author":"Andrew Treadway","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Andrew Treadway","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/#article","isPartOf":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/"},"author":{"name":"Andrew Treadway","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/d4018570a16fb867f1c08412fc9c64bc"},"headline":"Web Browsing and Parsing with RoboBrowser and requests_html","datePublished":"2023-04-03T16:55:35+00:00","dateModified":"2023-04-03T18:51:14+00:00","mainEntityOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/"},"wordCount":1059,"publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg","keywords":["BeautifulSoup","Data Science","Python","RoboBrowser","Web Browsing"],"articleSection":["Data Science","Programming Languages","Python Development","Quant","Quant Development","R Development"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/","url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/","name":"Web Browsing and Parsing with RoboBrowser and requests_html | IBKR Campus 
US","isPartOf":{"@id":"https:\/\/ibkrcampus.com\/campus\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/#primaryimage"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg","datePublished":"2023-04-03T16:55:35+00:00","dateModified":"2023-04-03T18:51:14+00:00","description":"Python is a great language for automating web operations.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/web-browsing-and-parsing-with-robobrowser-and-requests_html\/#primaryimage","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg","width":1000,"height":563,"caption":"Python: Importing an ipynb File (Jupyter Notebook) from Another ipynb File"},{"@type":"WebSite","@id":"https:\/\/ibkrcampus.com\/campus\/#website","url":"https:\/\/ibkrcampus.com\/campus\/","name":"IBKR Campus US","description":"Financial Education from Interactive 
Brokers","publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ibkrcampus.com\/campus\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/ibkrcampus.com\/campus\/#organization","name":"Interactive Brokers","alternateName":"IBKR","url":"https:\/\/ibkrcampus.com\/campus\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","width":669,"height":669,"caption":"Interactive Brokers"},"image":{"@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/"},"publishingPrinciples":"https:\/\/www.interactivebrokers.com\/campus\/about-ibkr-campus\/","ethicsPolicy":"https:\/\/www.interactivebrokers.com\/campus\/cyber-security-notice\/"},{"@type":"Person","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/d4018570a16fb867f1c08412fc9c64bc","name":"Andrew Treadway","description":"Andrew Treadway currently works as a Senior Data Scientist, and has experience doing analytics, software automation, and ETL. He completed a master\u2019s degree in computer science \/ machine learning, and an undergraduate degree in pure mathematics. 
Connect with him on LinkedIn: https:\/\/www.linkedin.com\/in\/andrew-treadway-a3b19b103\/ In addition to the TheAutomatic.net blog, he also teaches in-person courses on Python and R through his NYC meetup.","sameAs":["https:\/\/theautomatic.net\/about-me\/"],"url":"https:\/\/www.interactivebrokers.com\/campus\/author\/andrewtreadway\/"}]}},"jetpack_featured_media_url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/02\/python-notebook.jpg","_links":{"self":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/187835","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/users\/388"}],"replies":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/comments?post=187835"}],"version-history":[{"count":0,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/187835\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media\/184296"}],"wp:attachment":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media?parent=187835"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/categories?post=187835"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/tags?post=187835"},{"taxonomy":"contributors-categories","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/contributors-categories?post=187835"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}