The Best Articles in Data Science
The most useful articles in Data Science from around the web, from beginner to advanced, curated by thought leaders and our community. We focus on timeless pieces and update the list whenever we discover new, must-read articles or videos, so make sure to bookmark and revisit this page.
Top 5 Data Science Articles
At a glance: these are the articles that have been most read, shared, and saved in Data Science by Refind users in 2023 so far.
What is ...?
New to Data Science? These articles make an excellent introduction.
What is Reverse ETL? The Definitive Guide
Learn everything there is to know about Reverse ETL, how it fits into the modern data stack, and why it's different from ETL.
Learn Julia For Beginners – The Future Programming Language of Data Science and Machine Learning Explained
Julia is a high-level, dynamic programming language, designed to give users the speed of C/C++ while remaining as easy to use as Python. This means that developers can solve problems faster and more…
«So now the function is defined to take in only a string. Let us test this out to make sure we can only call the function with a string value»
An introduction to data science and machine learning with Microsoft Excel
Microsoft Excel is a powerful tool for learning the basics of data science and machine learning.
What is Hierarchical Clustering?
By Nagesh Singh Chauhan, Big Data Developer at CirrusLabs. What is clustering? Clustering is a technique that groups similar objects such that the objects in the same group are more similar to each…
Trending
These links are currently making the rounds in Data Science on Refind.
Dashboards Are Dead: 3 Years Later
What’s the purpose of dashboards in 2023?
The Myth of Objective Data
When we view objectivity and subjectivity as opposites rather than complements, we distort the empirical realities of data collection.
«This despair helps my students recognize an apparently banal assignment as a real design situation. It teaches them that data is created, not found; and that creating it well demands humanity, rather than objectivity.»
Short Articles
Short on time? Check out these useful short articles in Data Science—all under 10 minutes.
Same data, different stories: How to manipulate the graphs to support your narrative
How to shape the narrative with graph manipulation without sacrificing your credibility. The art of tweaking graphs to better support your story.
How Do They Know This?
An informative and apolitical new book reminds us that statistics are not always what they seem.
«But for the most part, official statistics are imperfect but good enough.»
Why data remains the greatest challenge for machine learning projects
Appen’s latest State of AI Report reveals advances in helping enterprises overcome barriers to sourcing and preparing their data.
The Difficult Life of the Data Lead
Why balancing team management, demanding stakeholders, and hands-on work is no easy task
Long Articles
These are some of the most-read long-form articles in Data Science.
Are You Ready to Hire a Data Scientist? Advice for Founders
Are you ready to hire a data scientist? Mengying Li, Growth Data Science Lead at Notion, shares her framework for testing whether you should invest in this key hire and how to find the right data…
Megastudy scepticism
In December last year Katherine Milkman and friends published a “megastudy” testing 54 interventions to increase the gym visits of 61,000 experimental participants.
Meet Julia: The Future of Data Science
The next big thing, or just massively overhyped?
10 Simple Things You Can Do to Improve Your Data Science Skills in 2023
Here’s to making this next year productive, satisfying, and all-around beneficial.
Data50: The World’s Top Data Startups
The Data50 are the bellwether data startups across the most exciting categories in data, such as AI/ML, ELT and orchestration, and data observability.
Thought Leaders
We monitor hundreds of thought leaders, influencers, and newsletters in Data Science, including:
Andrew Ng
Co-Founder of Coursera; Stanford CS adjunct faculty. Former head of Baidu AI Group/Google Brain. #ai #machinelearning, #deeplearning #MOOCs

Hilary Mason
Co-Founder of @HiddenDoorCo. Formerly Founder of @FastForwardLabs (acquired by @Cloudera). I ♥ data and cheeseburgers. She/her.
Werner Vogels
CTO @ Amazon

Data Science Renee
Sr Director of DS at @HelioCampus || Author of SQL for Data Scientists (Wiley) || @DataSciGuide @NewDataSciJobs || @paix120 || Not views of employer || she/her
Nathan Yau
making charts
Publications
We monitor hundreds of publications, blogs, newsletters, and news sources in Data Science, including:
KDnuggets
KD stands for Knowledge Discovery. Covering #DataScience #MachineLearning #AI #Analytics. Edited by @mattmayo13. Founded by Gregory Piatetsky-Shapiro.
Towards Data Science
A Medium publication sharing concepts, ideas, and codes. Share your insights and projects with like-minded readers: http://bit.ly/write-for-tds.
Data Science Central
Part of the DSC Community and TechTarget, our focus is on data science, ML, AI, deep learning, dataviz, Hadoop, IoT and BI.
freeCodeCamp.org
We're a community of millions of people who are building new skills and getting new jobs together. A 501(c)(3) public charity. Tweets by @abbeyrenn.
ZDNET
Tomorrow belongs to those who embrace it today.
What is Refind?
Every day Refind picks the most relevant links from around the web for you. Picking only a handful of links means focusing on what’s relevant and useful. We favor timeless pieces—links with long shelf-lives, articles that are still relevant one month, one year, or even ten years from now. These lists of the best resources on any topic are the result of years of careful curation.
How does Refind curate?
It’s a mix of human and algorithmic curation, following a number of steps:
- We monitor 10k+ sources and 1k+ thought leaders on hundreds of topics—publications, blogs, news sites, newsletters, Substack, Medium, Twitter, etc.
- In addition, our users save links from around the web using our Save buttons and our extensions.
- Our algorithm processes 100k+ new links every day and uses external signals to find the most relevant ones, focusing on timeless pieces.
- Our community of active users gets the most relevant links every day, tailored to their interests. They provide feedback via implicit and explicit signals: open, read, listen, share, mark as read, read later, «More/less like this», etc.
- Our algorithm uses these internal signals to refine the selection.
- In addition, we have expert curators who manually curate niche topics.
The result: lists of the best and most useful articles on hundreds of topics.
How does Refind detect «timeless» pieces?
We focus on pieces with long shelf-lives—not news. We determine «timelessness» via a number of metrics, for example, the consumption pattern of links over time.
How many sources does Refind monitor?
We monitor 10k+ content sources on hundreds of topics—publications, blogs, news sites, newsletters, Substack, Medium, Twitter, etc.
Which sources does Refind monitor in Data Science?
We monitor hundreds of sources in Data Science, including KDnuggets, Towards Data Science, Data Science Central, freeCodeCamp.org, ZDNET, and many more.
Who are the thought leaders in Data Science?
We follow dozens of thought leaders in Data Science, including Andrew Ng, Hilary Mason, Werner Vogels, Data Science Renee, and Nathan Yau.
Can I submit a link?
Indirectly, by using Refind and saving links from outside (e.g., via our extensions).
How can I report a problem?
When you’re logged in, you can flag any link via the «More» (...) menu. You can also report problems via email to hello@refind.com.
Who uses Refind?
200k+ smart people start their day with Refind. To learn something new. To get inspired. To move forward. Our apps have a 4.9/5 rating.
Is Refind free?
Yes, it’s free!
How can I sign up?
Head over to our homepage and sign up by email or with your Twitter or Google account.