Dataset Sources for Data Science Lovers
Hi friends! Data is the new oil, flowing freely and continuously. We need to build a dam to store it and then use it for modelling with ML algorithms. Here are some sources for datasets:-
Kaggle Dataset:-
https://www.kaggle.com/
https://www.kaggle.com/mylesoneill/game-of-thrones
UCI Machine Learning Repository:
https://archive.ics.uci.edu/ml/index.php
UNICEF:-
https://data.unicef.org/
UNICEF’s open datasets published on the IATI Registry (http://www.iatiregistry.org/publisher/unicef) have been extracted directly from UNICEF’s operating system (VISION) and other data systems, and they reflect inputs made by individual UNICEF offices.
FiveThirtyEight:-
https://data.fivethirtyeight.com/
WHO (World Health Organization) — Open data repository:-
https://www.who.int/gho/database/en/
Amazon dataset:-
https://aws.amazon.com/s3/
Google’s Dataset Search Engine:-
https://datasetsearch.research.google.com/
NZ Dataset:-
https://catalogue.data.govt.nz/dataset
Indian Government Dataset:-
https://data.gov.in/
US Govt. Dataset:-
https://www.data.gov/
Europe Dataset:-
https://data.europa.eu/euodp/data/dataset
UK Dataset (Northern Ireland):-
https://www.opendatani.gov.uk/
Awesome Public Datasets from GitHub:-
https://github.com/awesomedata/awesome-public-datasets
Makeover Monday:-
https://www.makeovermonday.co.uk/data/
Reddit/r/datasets/:-
https://www.reddit.com/r/datasets/
Data is Plural:-
https://tinyletter.com/data-is-plural
Papers with Code (a large list of ML datasets):-
https://paperswithcode.com/datasets
Some more links:-
1. Google Dataset Search - https://lnkd.in/eGR9BAey
2. IBM Data Asset eXchange - https://lnkd.in/eKaWvF_K
3. Nasdaq Data Link - https://lnkd.in/eaXmdhvi
4. Data.gov (US) - https://data.gov/
5. Earth Data (NASA) - https://lnkd.in/em3KyCRw
6. AWS Open Data - https://lnkd.in/eDy45QFD
7. FBI Crime Data Explorer - https://lnkd.in/egnvsxwb
8. Data.gov.uk (UK) - https://www.data.gov.uk/
9. CERN Open Data Portal - https://opendata.cern.ch/
10. Antarctic Datasets - https://lnkd.in/eDbVjQBv
11. BFI film industry statistics - https://lnkd.in/e3NTRd2S
12. NYC Taxi Trip Data - https://lnkd.in/ez9-VKhB
13. The Official Portal for European Data - https://lnkd.in/ei2AxvyJ
14. Health Data - https://healthdata.gov/
15. Centers for Disease Control and Prevention - https://lnkd.in/eUSFqhkq
16. FiveThirtyEight - https://lnkd.in/ePptSfu8
17. Datahub.io - https://lnkd.in/efeRzvp4
18. Global Health Observatory Data Repository - https://lnkd.in/e_BNrthm
19. Latin American Data Bank - https://lnkd.in/eJXcXSP2
20. IMDb Non-Commercial Datasets - https://lnkd.in/epS8jgUi
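Play around with these datasets. To get started with ML concepts, you can use the Titanic dataset for EDA and a first model. Since its target (survived or not) is binary, logistic regression is a better fit than linear regression. All the best!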
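If you want a feel for what a first EDA pass looks like, here is a minimal sketch using only the Python standard library. It assumes a CSV with the column names used by Kaggle's Titanic train.csv (Survived, Pclass, Sex, Age). The rows below are toy values made up for illustration, not real passenger records, and the helper names (load_rows, survival_rate_by, missing_count) are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Toy rows made up for illustration; NOT real Titanic records.
# Column names follow Kaggle's Titanic train.csv (Survived, Pclass, Sex, Age).
SAMPLE_CSV = """Survived,Pclass,Sex,Age
0,3,male,22
1,1,female,38
1,3,female,26
0,1,male,
0,3,male,35
1,2,female,14
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per passenger."""
    return list(csv.DictReader(io.StringIO(text)))


def survival_rate_by(rows, column):
    """Return {group value: share of rows in that group with Survived == 1}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [survivors, total]
    for row in rows:
        group = row[column]
        counts[group][0] += int(row["Survived"])
        counts[group][1] += 1
    return {group: s / n for group, (s, n) in counts.items()}


def missing_count(rows, column):
    """Count rows where the given column is empty."""
    return sum(1 for row in rows if row[column].strip() == "")


rows = load_rows(SAMPLE_CSV)
print(len(rows), "rows")
print("missing Age:", missing_count(rows, "Age"))
print("survival by Sex:", survival_rate_by(rows, "Sex"))
print("survival by Pclass:", survival_rate_by(rows, "Pclass"))
```

To try it on the real data, download train.csv from Kaggle and replace SAMPLE_CSV with the file's contents, for example via open("train.csv").read(). Group-wise survival rates and missing-value counts like these are usually the first things to look at before fitting a classifier.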