Google Takes Dataset Search Out Of Beta

Google has brought its dataset search tool out of the beta-testing phase, while adding new features.

Google Dataset Search was originally released in September 2018 to try to make datasets more accessible to researchers.

According to the search company, large numbers of datasets are published online by organisations including universities, governments and labs, but they can be difficult to find via standard searches.

Along with the search tool, Google also released a set of open metadata tags, urging publishers to add them to pages containing datasets to make the information easier for search engines to index.
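As a rough illustration of what such tagging can look like, the sketch below builds a dataset description and wraps it in a script tag that a publisher could embed in a page. The article does not name the vocabulary, so this assumes the schema.org Dataset type expressed as JSON-LD; the function names, dataset name, URLs and keywords are all hypothetical placeholders.

```python
import json


def build_dataset_jsonld(name, description, url, license_url, keywords):
    """Return a JSON-LD string describing a dataset.

    Assumption: the schema.org "Dataset" type is used; the article
    itself does not specify which metadata vocabulary is involved.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "keywords": keywords,
    }
    return json.dumps(record, indent=2)


def wrap_in_script_tag(jsonld):
    """Wrap a JSON-LD string in the script element used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example values, for illustration only.
example = build_dataset_jsonld(
    name="Example penguin population counts",
    description="Placeholder description of a penguin population survey.",
    url="https://example.org/datasets/penguins",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["penguins", "population", "ecology"],
)

print(wrap_in_script_tag(example))
```

The printed snippet would sit in the HTML of the page that hosts the dataset, where a crawler can read the name, description and licence without having to interpret the surrounding text.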

A Google data centre in Oklahoma. Image credit: Google

Metadata framework

Google’s tool has now indexed some 25 million datasets, in areas ranging from penguin populations to volcanic eruptions to medical data.

The information can be used for purposes such as testing hypotheses or training AI algorithms.

Casual users can also use Google’s dataset search to find information related to their interests, such as a list of the fastest skiers.

Google said hundreds of thousands of users have tried Dataset Search since its launch, and that the reaction from the scientific community was positive overall.

The journal Nature, for instance, has begun requiring that data sharing take place with the proper metadata, said Natasha Noy, research scientist at Google Research.

New search features include the ability to filter data by type, such as tables, images or text, as well as whether the data is free to use and the geographic area covered.

Data discovery

The search engine is now available to use on mobile devices and has expanded dataset descriptions.

The biggest areas currently indexed include geosciences, biology and agriculture, with the most common queries being “education”, “weather”, “cancer”, “crime”, “soccer”… and “dogs”.

The US is the leader in open government dataset publishing, making more than 2 million available online.

Noy said Google plans to keep releasing updates to the search engine now that the beta-testing period has ended.

The company said its ultimate goal is to “help foster an ecosystem” for publishing, discovering and using datasets.

Matthew Broersma

Matt Broersma is a long-standing tech freelancer who has worked for Ziff-Davis, ZDNet and other leading publications.
