In a previous post, we explored the dividend history of stocks included in the S&P 500. Today we’ll extend that analysis to cover the Nasdaq because, well, because in the previous post I said I would do that. We’ll also explore a different source for dividend data, do some string cleaning and check out ways to customize a tooltip in plotly. Bonus feature: we’ll get into some animation too. We have a lot to cover, so let’s get to it.
We need to load up our packages for the day.
library(tidyverse)
library(tidyquant)
library(janitor)
library(plotly)
First we need all the companies listed on the Nasdaq. Not so long ago, it wasn’t easy to import that information into R. Now we can use tq_exchange("NASDAQ") from the tidyquant package.
nasdaq <-
  tq_exchange("NASDAQ")

nasdaq %>%
  head()
# A tibble: 6 x 7
symbol company last.sale.price market.cap ipo.year sector industry
<chr> <chr> <dbl> <chr> <dbl> <chr> <chr>
1 YI 111, Inc. 2.69 $219.66M 2018 Health… Medical/N…
2 PIH 1347 Prope… 4.85 $29.16M 2014 Finance Property-…
3 PIHPP 1347 Prope… 25.7 $17.96M NA Finance Property-…
4 TURN 180 Degree… 2.09 $65.04M NA Finance Finance/I…
5 FLWS 1-800 FLOW… 15.3 $983.42M 1999 Consum… Other Spe…
6 BCOW 1895 Banco… 9.33 $45.5M 2019 Finance Banks
Notice how the market.cap column is of type character? Let’s coerce it to a dbl with as.numeric and, while we’re at it, let’s remove the periods in all the column names with clean_names from the janitor package.
nasdaq %>%
  clean_names() %>%
  mutate(market_cap = as.numeric(market_cap)) %>%
  select(symbol, market_cap) %>%
  head()
# A tibble: 6 x 2
symbol market_cap
<chr> <dbl>
1 YI NA
2 PIH NA
3 PIHPP NA
4 TURN NA
5 FLWS NA
6 BCOW NA
Not exactly what we had in mind. The presence of those M, B and $ characters is causing as.numeric() to coerce the column to NAs. If we want to do any sorting by market cap, we’ll need to clean that up, and it’s a great chance to explore some stringr. Let’s start with str_remove_all and remove those non-numeric characters. The call is market_cap %>% str_remove_all("\\$|M|B"), followed by an arrange(desc(market_cap)) so that the largest cap company is first.
nasdaq %>%
  clean_names() %>%
  mutate(market_cap = market_cap %>% str_remove_all("\\$|M|B") %>% as.numeric()) %>%
  arrange(desc(market_cap)) %>%
  head()
# A tibble: 6 x 7
symbol company last_sale_price market_cap ipo_year sector industry
<chr> <chr> <dbl> <dbl> <dbl> <chr> <chr>
1 CIVBP Civista B… 66 667920 NA Finance Major Banks
2 ASRVP AmeriServ… 29.6 621600 NA Finance Major Banks
3 ESGRP Enstar Gr… 26.6 425920 NA Finance Property-C…
4 AGNCN AGNC Inve… 26.0 337870 NA Consum… Real Estat…
5 SBFGP SB Financ… 15.8 237221. NA Finance Major Banks
6 ESGRO Enstar Gr… 26.4 115984 NA Finance Property-C…
Well, that wasn’t too bad!
Wait, that looks weird. Where are AMZN and MSFT? Shouldn’t they be at the top of the market cap ranking? Look closely at market_cap and notice it’s been coerced to a numeric value as we intended, but we didn’t account for the fact that those M and B letters were abbreviating values and standing in for a whole bunch of zeroes. The first symbol above, CIVBP, didn’t have an M or B because its market cap is low, so it didn’t have any zeroes lopped off of it.
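To see why the sort went wrong, here is a small sketch on made-up strings (not values from the Nasdaq data). It shows that as.numeric() cannot parse the raw strings at all, and that once the symbols are stripped, a value quoted in millions and one quoted in billions become indistinguishable.

```r
library(stringr)

# Hypothetical market cap strings, made up for illustration
caps <- c("$1.5M", "$1.5B", "$667920")

# The raw strings are not parseable as numbers, so every element becomes NA
raw_parse <- suppressWarnings(as.numeric(caps))

# Stripping $, M and B parses fine but throws away the scale
stripped <- as.numeric(str_remove_all(caps, "\\$|M|B"))

raw_parse
stripped
```

The first two stripped values come out equal even though the originals differ by a factor of 1000, while the unsuffixed value towers over both.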
We need a way to remove the M and the B and account for those zeroes that got removed. Here’s how I chose to tackle this.
- Find all the cells that do not have an M or a B, remove the $ sign, convert to numeric and divide by 1000. We do that with if_else(str_detect(market_cap, "M|B", negate = TRUE), str_remove_all(market_cap, "\\$") %>% as.numeric() %>% `/`(1000).
- Find all the cells that have a B, remove the B and the $ sign, convert to numeric and multiply by 1000. We do that with if_else(str_detect(market_cap, "B"), str_remove_all(market_cap, "\\$|B") %>% as.numeric() %>% `*`(1000).
- Find all the cells that have an M, remove the M and the $ sign, convert to numeric and don’t multiply or divide. We do that with str_remove_all(market_cap, "\\$|M") %>% as.numeric())).
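A variation on the steps above, sketched on a toy tibble with made-up values: instead of nesting if_else calls and scaling to millions, it uses dplyr's case_when to pick a multiplier from the suffix and converts everything to plain dollars. The column names multiplier and market_cap_usd are my own illustrative choices and are not taken from the original script.

```r
library(dplyr)
library(stringr)

# Toy data with made-up values, standing in for the cleaned nasdaq tibble
toy <- tibble(
  symbol     = c("AAA", "BBB", "CCC"),
  market_cap = c("$667920", "$219.66M", "$1.5B")
)

toy_clean <- toy %>%
  mutate(
    # Choose the scale implied by the suffix; no suffix means plain dollars
    multiplier = case_when(
      str_detect(market_cap, "B") ~ 1e9,
      str_detect(market_cap, "M") ~ 1e6,
      TRUE                        ~ 1
    ),
    # Strip the symbols, parse the number, then restore the zeroes
    market_cap_usd = as.numeric(str_remove_all(market_cap, "\\$|M|B")) * multiplier
  ) %>%
  arrange(desc(market_cap_usd))

toy_clean
```

With every value on the same scale, arrange(desc(market_cap_usd)) puts the largest company first, which is the ordering the first cleaning attempt was after.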
Stay tuned for the next installment, in which Jonathan will continue coding this tibble in R. To download the full R script, visit his blog http://www.reproduciblefinance.com/2019/08/25/tech-dividends/ .
Any stock, options or futures symbols displayed are for illustrative purposes only and are not intended to portray recommendations.
Disclosure: Interactive Brokers
Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.
This material is from Reproducible Finance and is being posted with its permission. The views expressed in this material are solely those of the author and/or Reproducible Finance and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.