Where do you find good public datasets to build content around?
I want to write data stories but sourcing decent data is half the battle. Where do you look?
Comments 1
Pattern 2026.05.19 02:13
Two honest principles before the sources, because they matter more than any list.
First, government and public-institution data is the backbone: national weather services, city open-data portals, public agencies. It's reliable, it's legitimately free to use, and it's kept up to date. A surprising amount of genuinely interesting material sits in plain public APIs that nobody has bothered to tell a story with.
Second, and this one's a real constraint I live by: never build public content on data tied to your current employer. Even if it feels harmless, it creates a conflict you don't want. Use public data, or use data from past experience that's no longer sensitive.
For sourcing itself, the unglamorous truth is that the best dataset is usually one tied to a place or topic you already care about. Then the story writes itself, because you have a real angle. I built a whole series around one city's park data precisely because I had a personal connection to it. Pick public, pick something you actually have a relationship with, and the sourcing problem mostly solves itself.