- How Do They Know This?
- What is Reverse ETL? The Definitive Guide
- The Myth of Objective Data
- Same data, different stories: How to manipulate the graphs to support your narrative
- Explaining base rate neglect
Learn everything there is to know about Reverse ETL, how it fits into the modern data stack, and why it's different from ETL.
Julia is a high-level, dynamic programming language, designed to give users the speed of C/C++ while remaining as easy to use as Python. This means that developers can solve problems faster and more…
«So now the function is defined to take in only a string. Let us test this out to make sure we can only call the function with a string value»
Microsoft Excel is a powerful tool for learning the basics of data science and machine learning.
By Nagesh Singh Chauhan, Big Data Developer at CirrusLabs. What is clustering? Clustering is a technique that groups similar objects such that the objects in the same group are more similar to each…
AI trained on ever-more data has yielded ChatGPT and DALL-E 3. But research shows online data stocks are growing more slowly than datasets used to train AI.
«Another option is to use AI to create synthetic data to train systems. In other words, developers can simply generate the data they need, curated to suit their particular AI model.»
A new tool has been created to poison image-output models like Midjourney and DALL-E 2. Are text-output models like ChatGPT and Copilot next?
«Nightshade might trip up AI models now, he says, but future filtering techniques and generative model architectures will probably be able to swallow the poison with no ill effects. The same would presumably apply to facial recognition and deepfake algorithms, necessitating a new subset of arms race between the hacker and the hacked.»
Emerging AI trends such as natural language processing and reinforcement learning are set to bring in the next frontier of Data Science.
«This year, in particular, is expected to see the rapidly accelerated adoption of RL as businesses realize and harness its untapped potential.»
When we view objectivity and subjectivity as opposites rather than complements, we distort the empirical realities of data collection.
«This despair helps my students recognize an apparently banal assignment as a real design situation. It teaches them that data is created, not found; and that creating it well demands humanity, rather than objectivity.»
The world is huge; to get a clear idea of what our world is like, we have to rely on carefully collected, well documented statistics.
How to shape the narrative with graph manipulation without sacrificing your credibility. The art of tweaking graphs to better support your story.
An informative and apolitical new book reminds us that statistics are not always what they seem.
«But for the most part, official statistics are imperfect but good enough.»
In a seminar for a team from an investment manager I described how base rates are often neglected when people are grappling with conditional probabilities.
What’s the purpose of dashboards in 2023?
How Computers Think: Introduction
Are you ready to hire a data scientist? Mengying Li, Growth Data Science Lead at Notion, shares her framework for testing whether you should invest in this key hire and how to find the right data…
The next big thing, or just massively overhyped?
In December last year Katherine Milkman and friends published a “megastudy” testing 54 interventions to increase the gym visits of 61,000 experimental participants.
Co-Founder of Coursera; Stanford CS adjunct faculty. Former head of Baidu AI Group/Google Brain. #ai #machinelearning, #deeplearning #MOOCs
Co-Founder of @HiddenDoorCo. Formerly Founder of @FastForwardLabs (acquired by @Cloudera). I ♥ data and cheeseburgers. She/her.
CTO @ Amazon
Sr Director of DS at @HelioCampus || Author of SQL for Data Scientists (Wiley) || @DataSciGuide @NewDataSciJobs || @paix120 || Not views of employer || she/her
How does Refind curate?
It’s a mix of human and algorithmic curation, following a number of steps:
- We monitor 10k+ sources and 1k+ thought leaders on hundreds of topics—publications, blogs, news sites, newsletters, Substack, Medium, Twitter, etc.
- In addition, our users save links from around the web using our Save buttons and our extensions.
- Our algorithm processes 100k+ new links every day and uses external signals to find the most relevant ones, focusing on timeless pieces.
- Our community of active users gets the most relevant links every day, tailored to their interests. They provide feedback via implicit and explicit signals: open, read, listen, share, mark as read, read later, «More/less like this», etc.
- Our algorithm uses these internal signals to refine the selection.
- In addition, we have expert curators who manually curate niche topics.
The result: lists of the best and most useful articles on hundreds of topics.
How does Refind detect «timeless» pieces?
We focus on pieces with long shelf-lives—not news. We determine «timelessness» via a number of metrics, for example, the consumption pattern of links over time.
How many sources does Refind monitor?
We monitor 10k+ content sources on hundreds of topics—publications, blogs, news sites, newsletters, Substack, Medium, Twitter, etc.
Who are the thought leaders in Data Science?
We follow dozens of thought leaders in Data Science, including Andrew Ng, Hilary Mason, Werner Vogels, Nathan Yau, and Data Science Renee.
Can I submit a link?
Indirectly, by using Refind and saving links from outside (e.g., via our extensions).
How can I report a problem?
When you’re logged in, you can flag any link via the «More» (...) menu. You can also report problems via email to firstname.lastname@example.org
Who uses Refind?
400k+ smart people start their day with Refind. To learn something new. To get inspired. To move forward. Our apps have a 4.9/5 rating.
Is Refind free?
Yes, it’s free!
How can I sign up?
Head over to our homepage and sign up by email or with your Twitter or Google account.