Google Takes Dataset Search Out Of Beta

Google has brought its dataset search tool out of the beta-testing phase, while adding new features.

Google Dataset Search was originally released in September 2018 to try to make datasets more accessible to researchers.

According to the search company, large amounts of such data are published online by organisations including universities, governments and labs, but can be difficult to find via standard searches.

Along with the search tool, Google also released a set of open metadata tags, urging publishers to add them to pages containing datasets to make the information easier for search engines to index.
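As a rough illustration of what such tagging can look like, the sketch below builds a dataset description as JSON-LD and wraps it in a script tag for embedding in a web page. It assumes the schema.org Dataset vocabulary, which the article itself does not name; the helper functions, the dataset name and the example.org URL are hypothetical and exist only for this example.

```python
import json


def dataset_jsonld(name, description, url, license_url=None, keywords=()):
    """Build a dataset description as a JSON-LD dict.

    Assumes the schema.org Dataset type; the field selection here is
    illustrative, not a complete or required set.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    if license_url:
        record["license"] = license_url
    if keywords:
        record["keywords"] = list(keywords)
    return record


def as_script_tag(record):
    """Wrap the JSON-LD in a script tag suitable for a page's HTML."""
    body = json.dumps(record, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset, used only to show the shape of the markup.
example = dataset_jsonld(
    name="Example penguin population counts",
    description="Hypothetical dataset used only to illustrate the markup.",
    url="https://example.org/datasets/penguins",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["penguins", "population"],
)

print(as_script_tag(example))
```

A publisher would place the printed snippet in the HTML of the page that hosts the dataset, so that a crawler can read the description alongside the page content.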

A Google data centre in Oklahoma. Image credit: Google

Metadata framework

Google’s tool has now indexed some 25 million datasets, in areas ranging from penguin populations to volcanic eruptions to medical data.

The information can be used for purposes such as testing hypotheses or training AI algorithms.

Casual users can also use Google’s dataset search to find information related to their interests, such as a list of the fastest skiers.

Google said hundreds of thousands of users have tried Dataset Search since its launch, and that the reaction from the scientific community has been positive overall.

The journal Nature, for instance, has begun requiring that data be shared with the proper metadata, said Natasha Noy, research scientist at Google Research.

New search features include the ability to filter data by type, such as tables, images or text, as well as whether the data is free to use and the geographic area covered.

Data discovery

The search engine is now available to use on mobile devices and has expanded dataset descriptions.

The biggest areas currently indexed include geosciences, biology and agriculture, with the most common queries being “education”, “weather”, “cancer”, “crime”, “soccer”… and “dogs”.

The US is the leader in open government dataset publishing, making more than 2 million available online.

Noy said Google is planning to continue releasing further updates to the search engine now that the beta-testing period has ended.

The company said its ultimate goal is to “help foster an ecosystem” for publishing, discovering and using datasets.

Matthew Broersma

Matt Broersma is a long-standing tech freelancer who has worked for Ziff-Davis, ZDNet and other leading publications.
