{"id":185984,"date":"2023-03-03T07:54:37","date_gmt":"2023-03-03T12:54:37","guid":{"rendered":"https:\/\/ibkrcampus.com\/?p=185984"},"modified":"2023-03-06T09:24:30","modified_gmt":"2023-03-06T14:24:30","slug":"python-basket-analysis-and-pymining","status":"publish","type":"post","link":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/","title":{"rendered":"Python, Basket Analysis, and Pymining"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" id=\"h-background\"><strong>Background<\/strong><\/h2>\n\n\n\n<p><a href=\"https:\/\/theautomatic.net\/category\/python\/\">Python\u2019s<\/a>&nbsp;<strong>pymining<\/strong>&nbsp;package provides a collection of useful algorithms for item set mining, association mining, and more. We\u2019ll explore some of its functionality in this post by using it to apply basket analysis to tennis. When basket analysis is discussed, it\u2019s often in the context of retail \u2013 analyzing what combinations of products are typically bought together (or in the same \u201cbasket\u201d). For example, in grocery shopping, milk and butter may be frequently purchased together. We can take ideas from basket analysis and apply them in many other scenarios.<\/p>\n\n\n\n<p>As an example \u2013 let\u2019s say we\u2019re looking at events like tennis tournaments where each tournament has successive rounds (quarterfinals, semifinals, finals, etc.). How would you figure out what combinations of players typically show up in the same rounds, e.g. what&nbsp;<em>combinations<\/em>&nbsp;of players typically appear in the semifinal round of tournaments? This is effectively the same type of question as asking what&nbsp;<em>combinations<\/em>&nbsp;of groceries are typically bought together. 
In both scenarios we can use Python to help us out!<\/p>\n\n\n\n<p>The data I\u2019ll be using in this post can be found by&nbsp;<a href=\"https:\/\/github.com\/JeffSackmann\/tennis_atp\">clicking here<\/a>&nbsp;(all credit for the data goes to&nbsp;<a href=\"https:\/\/www.tennisabstract.com\/\">Jeff Sackmann \/ Tennis Abstract<\/a>). It runs from 1968 through part of 2019.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-prepping-the-data\"><strong>Prepping the data<\/strong><\/h2>\n\n\n\n<p>First, let\u2019s import the packages we\u2019ll need. From the&nbsp;<strong>pymining<\/strong>&nbsp;package, we\u2019ll import a module called&nbsp;<strong>itemmining<\/strong>.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from pymining import itemmining\nimport pandas as pd\nimport itertools\nimport os<\/pre>\n\n\n\n<p>Next, we can read in our data, which sits in a collection of CSV files that I\u2019ve downloaded into a directory from the link above. 
Our analysis will exclude data from futures and challenger tournaments, so we need to filter out those files as seen in the&nbsp;<a href=\"https:\/\/theautomatic.net\/tutorial-on-python-list-comprehensions\/\">list comprehension<\/a>&nbsp;below.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># set directory to folder with tennis files\nos.chdir(\"C:\/path\/to\/tennis\/files\")\n \n# get list of files we need\nfiles = [file for file in os.listdir() if \"atp_matches\" in file]\nfiles = [file for file in files if \"futures\" not in file and \"chall\" not in file]\n \n# read in each CSV file\ndfs = []\nfor file in files:\n    dfs.append(pd.read_csv(file))\n \n# combine all the datasets into a single data frame\nall_data = pd.concat(dfs)\n \n# derive the year column used later in this post; this assumes the raw files\n# have no year column and that tourney_date is formatted as YYYYMMDD\nall_data[\"year\"] = all_data[\"tourney_date\"].astype(str).str[:4]<\/pre>\n\n\n\n<p>The next line defines a&nbsp;<a href=\"https:\/\/theautomatic.net\/tutorial-for-python-lists\/\">list<\/a>&nbsp;of the four grand slam, or major, tournaments. We\u2019ll use this in our analysis. Then, we get the subset of our data containing only results related to the four major tournaments.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">major_tourneys = [\"Wimbledon\", \"Roland Garros\", \"US Open\",\n                  \"Australian Open\"]\n \n \n# get data for only grand slam tournaments\ngrand_slams = all_data[all_data.tourney_name.isin(major_tourneys)]<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Combinations of semifinals players<\/strong><\/h2>\n\n\n\n<p>Alright \u2013 let\u2019s ask our first question. What combinations of players have appeared the most in the semifinals of a grand slam? 
By \u201ccombination\u201d, we mean any group of players: we can count how many times the same two players appeared in the last four, the same three players appeared, or the same four individuals made up the entire final four.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># get subset with just semifinals data\nsf = grand_slams[grand_slams[\"round\"] == \"SF\"]\n \n# split dataset into yearly subsets\nyear_sfs = {year : sf[sf.year == year] for year in sf.year.unique()}<\/pre>\n\n\n\n<p>Next, we\u2019ll write a function to extract which players made it to the semifinals in a given yearly subset. We do this by splitting the yearly subset by tournament. Then, we extract the winner_name and loser_name field values for each tournament within that year. Since each tournament has two semifinal matches, combining the winners and losers gives its four semifinalists.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def get_info_from_year(info, tourn_list):\n     \n    tournaments = [info[info.tourney_name == name] for name in tourn_list]\n     \n    players = [df.winner_name.tolist() + df.loser_name.tolist() for df in tournaments]\n     \n    return players<\/pre>\n\n\n\n<p>We can now get a list of all the semifinalists in each tournament throughout the full time period:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">player_sfs = {year : get_info_from_year(info, major_tourneys) for\n                     year,info in year_sfs.items()}\nplayer_sfs = [elt 
for elt in itertools.chain.from_iterable(player_sfs.values())]<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Using the pymining package<\/strong><\/h3>\n\n\n\n<p>Now it\u2019s time to use the&nbsp;<strong>pymining<\/strong>&nbsp;package.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">sf_input = itemmining.get_relim_input(player_sfs)\nsf_results = itemmining.relim(sf_input, min_support = 2)<\/pre>\n\n\n\n<p>The first function call above \u2013 using&nbsp;<em>itemmining.get_relim_input<\/em>&nbsp;\u2013 creates a data structure that we can then input into the&nbsp;<em>itemmining.relim<\/em>&nbsp;method. This method runs the&nbsp;<a href=\"https:\/\/www.philippe-fournier-viger.com\/spmf\/Relim.php\">Relim algorithm<\/a>&nbsp;to identify combinations of items (players in this case) in the semifinals data.<\/p>\n\n\n\n<p>We can check to see that&nbsp;<em>sf_results<\/em>, or the result of running&nbsp;<em>itemmining.relim<\/em>, is a dictionary.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">type(sf_results) # dict<\/pre>\n\n\n\n<p>By default,&nbsp;<em>itemmining.relim<\/em>&nbsp;returns a dictionary where each key is a frozenset of a particular combination (in this case, a combination of players), and each value is the count of how many times that combination appears. This is analogous to the traditional idea of basket analysis where you might look at what grocery items tend to be purchased together, for instance (e.g. 
milk and butter are purchased together X number of times).<\/p>\n\n\n\n<p>The&nbsp;<em>min_support = 2<\/em>&nbsp;parameter specifies that we only want combinations that appear at least twice to be returned. If you\u2019re dealing with a large number of distinct combinations, you can raise this threshold as needed.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">{key : val for key,val in sf_results.items() if len(key) == 4}<\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"392\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/pymining-itemmining-TheAutomatic.net_.jpg\" alt=\"\" class=\"wp-image-185990 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/pymining-itemmining-TheAutomatic.net_.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/pymining-itemmining-TheAutomatic.net_-300x184.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/392;\" \/><\/figure>\n\n\n\n<p>From this we can see that the \u201cBig 4\u201d of Federer, Nadal, Djokovic, and Murray took all four semifinal spots at a grand slam on four separate occasions, which is more than any other quartet. 
We can also see that there are two separate groups of four players that each achieved this feat three times.<\/p>\n\n\n\n<p>Let\u2019s examine how this looks for groups of three in the semifinals.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">three_sf = {key : val for key,val in sf_results.items() if len(key) == 3}\n \n# sort by number of occurrences\nsorted(three_sf.items(), key = lambda sf: sf[1])<\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"141\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/pymining-examples-TheAutomatic.net_.jpg\" alt=\"\" class=\"wp-image-185991 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/pymining-examples-TheAutomatic.net_.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/pymining-examples-TheAutomatic.net_-300x66.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/141;\" \/><\/figure>\n\n\n\n<p>Above, we can see that the three most frequent 3-player combinations are each subsets of the \u201cBig 4\u201d \u2013 with Federer, Nadal, and Djokovic making up the top spot. 
We can do the same analysis with 2-player combinations to see that the most common duo appearing in the semifinals is Federer \/ Djokovic, as there have been 20 occasions where both were present in the semifinals of the same major event.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"268\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-pymining-TheAutomatic.net_.jpg\" alt=\"\" class=\"wp-image-185993 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-pymining-TheAutomatic.net_.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-pymining-TheAutomatic.net_-300x126.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/268;\" \/><\/figure>\n\n\n\n<p>Outside of Federer \/ Nadal \/ Djokovic \/ Murray, the duo that appeared together in the semifinals most often is Jimmy Connors \/ John McEnroe, at 14 times.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Analyzing semifinals across all tournaments<\/strong><\/h2>\n\n\n\n<p>What if we adjust our analysis to look across all tournaments? 
Keeping our focus on the semifinal stage only, we just need to make a minor tweak to our code:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># switch to all_data rather than grand_slams\nsf = all_data[all_data[\"round\"] == \"SF\"]\n \nyear_sfs = {year : sf[sf.year == year] for year in sf.year.unique()}\n \nplayer_sfs = {year : get_info_from_year(info, all_data.tourney_name.unique()) for\n                     year,info in year_sfs.items()}\nplayer_sfs = [elt for elt in itertools.chain.from_iterable(player_sfs.values())]\n \n \nsf_input = itemmining.get_relim_input(player_sfs)\nsf_results = itemmining.relim(sf_input, min_support = 2)\n \n# get 2-player combinations only\n{key : val for key,val in sf_results.items() if len(key) == 2}<\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"640\" height=\"256\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-tennis-analysis-TheAutomatic.net_.jpg\" alt=\"\" class=\"wp-image-185995 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-tennis-analysis-TheAutomatic.net_.jpg 640w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-tennis-analysis-TheAutomatic.net_-300x120.jpg 300w\" data-sizes=\"(max-width: 640px) 100vw, 640px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 640px; aspect-ratio: 640\/256;\" \/><\/figure>\n\n\n\n<p>Here, we can see that Federer \/ Djokovic and Djokovic \/ Nadal have each appeared together in the semifinals of a tournament over 60 times. 
If we wanted to, we could also tweak our code to look at quarterfinal stages, earlier rounds, etc.<\/p>\n\n\n\n<p>Lastly, to learn more about&nbsp;<strong>pymining<\/strong>,&nbsp;<a href=\"https:\/\/github.com\/bartdag\/pymining\">see its GitHub page here<\/a>.<\/p>\n\n\n\n<p><em>Originally posted on <a href=\"https:\/\/theautomatic.net\/2019\/08\/21\/python-basket-analysis-and-pymining\/\">TheAutomatic.net<\/a> Blog.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more. <\/p>\n","protected":false},"author":388,"featured_media":185996,"comment_status":"closed","ping_status":"open","sticky":true,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[339,343,349,338,341,352,344],"tags":[14851,14853,1224,14852,595,14854],"contributors-categories":[13695],"class_list":{"0":"post-185984","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science","8":"category-programing-languages","9":"category-python-development","10":"category-ibkr-quant-news","11":"category-quant-development","12":"category-quant-north-america","13":"category-quant-regions","14":"tag-basket-analysis","15":"tag-itertools","16":"tag-pandas","17":"tag-pymining-package","18":"tag-python","19":"tag-the-relim-algorithm","20":"contributors-categories-theautomatic-net"},"pp_statuses_selecting_workflow":false,"pp_workflow_action":"current","pp_status_selection":"publish","acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.9 (Yoast SEO v27.7) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Python, Basket Analysis, and Pymining | IBKR Quant<\/title>\n<meta name=\"description\" content=\"Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more.\" 
\/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/185984\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Python, Basket Analysis, and Pymining | IBKR Campus US\" \/>\n<meta property=\"og:description\" content=\"Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/\" \/>\n<meta property=\"og:site_name\" content=\"IBKR Campus US\" \/>\n<meta property=\"article:published_time\" content=\"2023-03-03T12:54:37+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-03-06T14:24:30+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"563\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andrew Treadway\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrew Treadway\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\n\t    \"@context\": \"https:\\\/\\\/schema.org\",\n\t    \"@graph\": [\n\t        {\n\t            \"@type\": \"NewsArticle\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/#article\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/\"\n\t            },\n\t            \"author\": {\n\t                \"name\": \"Andrew Treadway\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/d4018570a16fb867f1c08412fc9c64bc\"\n\t            },\n\t            \"headline\": \"Python, Basket Analysis, and Pymining\",\n\t            \"datePublished\": \"2023-03-03T12:54:37+00:00\",\n\t            \"dateModified\": \"2023-03-06T14:24:30+00:00\",\n\t            \"mainEntityOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/\"\n\t            },\n\t            \"wordCount\": 931,\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/03\\\/python-word-cloud-programming-languages.jpg\",\n\t            \"keywords\": [\n\t                \"Basket Analysis\",\n\t                \"itertools\",\n\t                \"Pandas\",\n\t                \"pymining package\",\n\t   
             \"Python\",\n\t                \"the Relim Algorithm\"\n\t            ],\n\t            \"articleSection\": [\n\t                \"Data Science\",\n\t                \"Programming Languages\",\n\t                \"Python Development\",\n\t                \"Quant\",\n\t                \"Quant Development\",\n\t                \"Quant North America\",\n\t                \"Quant Regions\"\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"WebPage\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/\",\n\t            \"name\": \"Python, Basket Analysis, and Pymining | IBKR Campus US\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\"\n\t            },\n\t            \"primaryImageOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/#primaryimage\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/03\\\/python-word-cloud-programming-languages.jpg\",\n\t            \"datePublished\": \"2023-03-03T12:54:37+00:00\",\n\t            \"dateModified\": \"2023-03-06T14:24:30+00:00\",\n\t            \"description\": \"Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more.\",\n\t            \"inLanguage\": \"en-US\",\n\t            
\"potentialAction\": [\n\t                {\n\t                    \"@type\": \"ReadAction\",\n\t                    \"target\": [\n\t                        \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/\"\n\t                    ]\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"ImageObject\",\n\t            \"inLanguage\": \"en-US\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/python-basket-analysis-and-pymining\\\/#primaryimage\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/03\\\/python-word-cloud-programming-languages.jpg\",\n\t            \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2023\\\/03\\\/python-word-cloud-programming-languages.jpg\",\n\t            \"width\": 1000,\n\t            \"height\": 563,\n\t            \"caption\": \"Python\"\n\t        },\n\t        {\n\t            \"@type\": \"WebSite\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"name\": \"IBKR Campus US\",\n\t            \"description\": \"Financial Education from Interactive Brokers\",\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"SearchAction\",\n\t                    \"target\": {\n\t                        \"@type\": \"EntryPoint\",\n\t                        \"urlTemplate\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/?s={search_term_string}\"\n\t                    },\n\t                    \"query-input\": {\n\t                        \"@type\": 
\"PropertyValueSpecification\",\n\t                        \"valueRequired\": true,\n\t                        \"valueName\": \"search_term_string\"\n\t                    }\n\t                }\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"Organization\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\",\n\t            \"name\": \"Interactive Brokers\",\n\t            \"alternateName\": \"IBKR\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"logo\": {\n\t                \"@type\": \"ImageObject\",\n\t                \"inLanguage\": \"en-US\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\",\n\t                \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"width\": 669,\n\t                \"height\": 669,\n\t                \"caption\": \"Interactive Brokers\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\"\n\t            },\n\t            \"publishingPrinciples\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/about-ibkr-campus\\\/\",\n\t            \"ethicsPolicy\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/cyber-security-notice\\\/\"\n\t        },\n\t        {\n\t            \"@type\": \"Person\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/d4018570a16fb867f1c08412fc9c64bc\",\n\t            \"name\": \"Andrew Treadway\",\n\t            \"description\": \"Andrew Treadway currently works as a Senior Data 
Scientist, and has experience doing analytics, software automation, and ETL. He completed a master\u2019s degree in computer science \\\/ machine learning, and an undergraduate degree in pure mathematics. Connect with him on LinkedIn: https:\\\/\\\/www.linkedin.com\\\/in\\\/andrew-treadway-a3b19b103\\\/In addition to TheAutomatic.net blog, he also teaches in-person courses on Python and R through my NYC meetup: more details.\",\n\t            \"sameAs\": [\n\t                \"https:\\\/\\\/theautomatic.net\\\/about-me\\\/\"\n\t            ],\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/author\\\/andrewtreadway\\\/\"\n\t        }\n\t    ]\n\t}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Python, Basket Analysis, and Pymining | IBKR Quant","description":"Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/185984\/","og_locale":"en_US","og_type":"article","og_title":"Python, Basket Analysis, and Pymining | IBKR Campus US","og_description":"Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more.","og_url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/","og_site_name":"IBKR Campus US","article_published_time":"2023-03-03T12:54:37+00:00","article_modified_time":"2023-03-06T14:24:30+00:00","og_image":[{"width":1000,"height":563,"url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg","type":"image\/jpeg"}],"author":"Andrew 
Treadway","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Andrew Treadway","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/#article","isPartOf":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/"},"author":{"name":"Andrew Treadway","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/d4018570a16fb867f1c08412fc9c64bc"},"headline":"Python, Basket Analysis, and Pymining","datePublished":"2023-03-03T12:54:37+00:00","dateModified":"2023-03-06T14:24:30+00:00","mainEntityOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/"},"wordCount":931,"publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg","keywords":["Basket Analysis","itertools","Pandas","pymining package","Python","the Relim Algorithm"],"articleSection":["Data Science","Programming Languages","Python Development","Quant","Quant Development","Quant North America","Quant Regions"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/","url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/","name":"Python, Basket Analysis, and Pymining | IBKR Campus 
US","isPartOf":{"@id":"https:\/\/ibkrcampus.com\/campus\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/#primaryimage"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg","datePublished":"2023-03-03T12:54:37+00:00","dateModified":"2023-03-06T14:24:30+00:00","description":"Python\u2019s pymining package provides a collection of useful algorithms for item set mining, association mining, and more.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/python-basket-analysis-and-pymining\/#primaryimage","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg","width":1000,"height":563,"caption":"Python"},{"@type":"WebSite","@id":"https:\/\/ibkrcampus.com\/campus\/#website","url":"https:\/\/ibkrcampus.com\/campus\/","name":"IBKR Campus US","description":"Financial Education from Interactive 
Brokers","publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ibkrcampus.com\/campus\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/ibkrcampus.com\/campus\/#organization","name":"Interactive Brokers","alternateName":"IBKR","url":"https:\/\/ibkrcampus.com\/campus\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","width":669,"height":669,"caption":"Interactive Brokers"},"image":{"@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/"},"publishingPrinciples":"https:\/\/www.interactivebrokers.com\/campus\/about-ibkr-campus\/","ethicsPolicy":"https:\/\/www.interactivebrokers.com\/campus\/cyber-security-notice\/"},{"@type":"Person","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/d4018570a16fb867f1c08412fc9c64bc","name":"Andrew Treadway","description":"Andrew Treadway currently works as a Senior Data Scientist, and has experience doing analytics, software automation, and ETL. He completed a master\u2019s degree in computer science \/ machine learning, and an undergraduate degree in pure mathematics. 
Connect with him on LinkedIn: https:\/\/www.linkedin.com\/in\/andrew-treadway-a3b19b103\/In addition to TheAutomatic.net blog, he also teaches in-person courses on Python and R through my NYC meetup: more details.","sameAs":["https:\/\/theautomatic.net\/about-me\/"],"url":"https:\/\/www.interactivebrokers.com\/campus\/author\/andrewtreadway\/"}]}},"jetpack_featured_media_url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2023\/03\/python-word-cloud-programming-languages.jpg","_links":{"self":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/185984","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/users\/388"}],"replies":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/comments?post=185984"}],"version-history":[{"count":0,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/185984\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media\/185996"}],"wp:attachment":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media?parent=185984"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/categories?post=185984"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/tags?post=185984"},{"taxonomy":"contributors-categories","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/contributors-categories?post=185984"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}