Introduction
Over my time on LinkedIn, I have been exposed to people from many different lines of work. I have also carved out a space for myself there where I can post about Data Science topics and share my blogs along the way. There have always been posts and polls comparing R and Python, as well as the subsequent debates among users of the languages over which one is superior for doing Data Science. While these sorts of arguments will never end, and I am far from innocent of engaging in them, I set out to understand why Data Science practitioners prefer one language over another by “controlling” for exposure to the other language.
In this blog I am going to share the results of my LinkedIn polls comparing respondents' preferences. The polls asked for respondents' preferences for:
- Using dplyr in R vs pandas in Python for data wrangling,
- Using ggplot2 in R vs matplotlib and seaborn in Python for data visualization, and
- Using Jupyter notebooks vs RMarkdown for writing reports.
Disclaimer
This is by no means a formal study; it's more of just me sharing my findings in blog form. Social media platforms come and go, but having a blog where I can share my findings (albeit a less popular one) gives me a place to post my curated content. Likely due to LinkedIn’s algorithms, my first and second questions got more traction, with over 132,000 views combined and over 1600 and 1300 votes respectively, while my last question only got a little more than 4000 views and over 106 votes at the time of writing.
To quote a comment on one of my polls:
With this in mind, let's share the results of these polls.
(Visuals were made with ggplot2 and the ggtech package for the theme)
1. dplyr vs pandas
As expected, most users who were pro-pandas never used dplyr before. However, when controlling for prior experience, it was pretty much a 50-50 split among respondents between using pandas in Python and dplyr in R. There were some comments recommending that I check out the data.table and dtplyr packages in R; while I don’t have much exposure to using those packages presently, I hope to check them out in the future.
For my closest experience to dplyr in Python, check out my review on the siuba module.
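For readers who have only used one of the two libraries, here is a minimal sketch of the kind of wrangling task the poll had in mind, written with pandas method chaining. The data frame, its column names, and the filter threshold are all hypothetical and invented for illustration; the dplyr pipeline in the comment is only a rough analogue, not an exact equivalent.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "mass": [10.0, 14.0, 30.0, 20.0, 5.0],
})

# Roughly analogous dplyr pipeline, shown here only for comparison:
#   df |> filter(mass > 8) |> mutate(mass_kg = mass / 1000) |>
#     group_by(species) |> summarise(mean_kg = mean(mass_kg))
summary = (
    df.query("mass > 8")                           # like filter()
      .assign(mass_kg=lambda d: d["mass"] / 1000)  # like mutate()
      .groupby("species", as_index=False)          # like group_by()
      .agg(mean_kg=("mass_kg", "mean"))            # like summarise()
)

print(summary)
```

Both styles read top to bottom as a sequence of verbs, which may be part of why the split was so even among respondents who had tried both.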
2. ggplot2 vs matplotlib and seaborn
In the case of comparing ggplot2 to matplotlib and seaborn among users who had experience with both packages, ggplot2 is preferred by 56% of users. Most users of matplotlib and seaborn don’t have experience with ggplot2 and vice-versa.
I was told to check out the plotly library, which is available for both R and Python, and it really looks like a great library to have for building interactive dashboards and applications. While I don’t have much experience with it now, I do hope to check it out when time allows.
3. Using Jupyter notebooks vs RMarkdown for writing reports.
The results from this poll are questionable, as I only got 106 replies. With this in mind, these are the results:
Of users with experience using both RMarkdown and Jupyter notebooks for writing their reports, 63% prefer RMarkdown over Jupyter notebooks. However, more users have experience with Jupyter notebooks than with RMarkdown.
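One way to read the "users with experience with both" figures in this post is as a share computed only over respondents who reported having used both tools. The sketch below shows that calculation under the assumption of a four-option poll; the option labels and vote counts are hypothetical and are not the actual poll data.

```python
# Hypothetical vote counts for a four-option poll. These numbers are
# invented for illustration and are NOT the actual poll results.
votes = {
    "prefer A, have used both": 30,
    "prefer B, have used both": 20,
    "prefer A, never used B": 40,
    "prefer B, never used A": 10,
}


def share_among_experienced(votes, a_key, b_key):
    """Share preferring A among respondents who have used both tools."""
    both = votes[a_key] + votes[b_key]
    if both == 0:
        raise ValueError("no respondents with experience in both tools")
    return votes[a_key] / both


# Raw share: everyone who prefers A, regardless of experience with B.
raw_share = (
    votes["prefer A, have used both"] + votes["prefer A, never used B"]
) / sum(votes.values())

# Conditional share: only respondents who have used both tools.
conditional_share = share_among_experienced(
    votes, "prefer A, have used both", "prefer B, have used both"
)

print(f"raw: {raw_share:.0%}, among experienced: {conditional_share:.0%}")
```

With these made-up counts the raw share for A is higher than the share among respondents who have used both, which is the kind of gap that restricting to experienced users is meant to expose. The smaller the experienced subgroup, the less weight the conditional figure can bear.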
Conclusion
With all that being said, using dplyr in R or pandas in Python for data wrangling seems like a toss-up among users with experience with both languages. For data visualization, ggplot2 seems to be preferred over matplotlib or seaborn, and if you trust the sample size, RMarkdown is preferred over Jupyter notebooks among users with experience with both.
In general, it is apparent that R is still the underdog as a language used for Data Science and programming, but by no means do I intend to stop using it any time soon.
When I get the time, I look forward to giving data.table and plotly a spin!
Thank you for reading!
Visit Bensstats Blog for additional insight on this topic and subscribe to his newsletter: https://bensstats.wordpress.com/category/rvspython/.
This material is from bensstats and is being posted with its permission.