News Datasets
We collect news data from over 90,000 sources worldwide, then clean, structure, and enrich it to provide our customers with custom news solutions. Check out the most popular datasets and contact us to start receiving them regularly, or build a custom solution for your unique needs.
Fundraising Events
All news on fundraising events worldwide, organized by company. Key details include event description, timing, location, involved entities, impact, source, sentiment, category, and relevance score.
[
{
"id": "Event id",
"event_type": "Currently, it's set to 'fundraising'; other types of event are accessible upon request",
"fundraising": {
"amount": "Amount of money raised",
"company_description": "Short description of the company that raised funds",
"funding_type": "Type of the funding event",
"currency": "Currency"
},
"global_event_type": "Type of the parent event, in this case it's set to 'Finance'",
"associated_article_ids": "IDs of the articles associated with this event",
"extraction_date": "Date when the event was extracted",
"event_date": "Date of the event",
"company_name": "Name of the company",
"articles": {
"paid_content": "True if the content is marked as paid or sponsored",
"link": "Full URL where the article was originally published",
"description": "Short summary of the article provided by the publisher",
"language": "The language of the article",
"media": "A link to a thumbnail image of the article",
"all_domain_links": "All domain URLs embedded in the article's content HTML",
"title": "The title of the article",
"journalists": "Clean list of journalists. No news publication names, only people",
"content": "The full content of the article",
"word_count": "Number of words in the article's content",
"domain_url": "The domain URL of the article's source",
"all_links": "All URL links embedded in the article's content HTML",
"rights": "Copyright",
"rank": "The page rank of the source website (which is given in the clean_url)",
"twitter_account": "The Twitter account of the publisher",
"id": "Newscatcher API's unique identifier for each news article",
"name_source": "The common name of the News Source",
"full_domain_url": "The full domain URL with a subcategory of the article's source",
"author": "The author of the article",
"is_headline": "True when an article has been seen on the main page of the news source",
"nlp": {
"summary": "AI-generated short summary of the article",
"ner_PER": {
"entity_name": "Name of the person",
"count": "Number of mentions in the article"
},
"ner_LOC": {
"entity_name": "Entity recognized - location",
"count": "Number of mentions in the article"
},
"ner_ORG": {
"entity_name": "Entity recognized - organization",
"count": "Number of mentions in the article"
},
"ner_MISC": {
"entity_name": "Entity recognized - others",
"count": "Number of mentions in the article"
},
"sentiment": {
"title": "Sentiment of the title",
"content": "Sentiment of content"
},
"theme": "Theme recognized"
},
"published_date_precision": "Accuracy of the published_date field. There are 3 types of date precision we define: 'full' — day and time of an article is correctly identified with the appropriate timezone, 'timezone unknown' — day and time of an article is correctly identified without timezone, 'date' — only the day is identified without an exact time",
"is_opinion": "True if the article is an 'Opinion' article"
}
}
]
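As a rough illustration of how a record shaped like the schema above might be consumed, here is a minimal Python sketch. The field names come from the schema, but the value types, the sample payload, and the `FundraisingEvent` and `parse_event` names are assumptions made for this example. The schema shows `articles` as a single object, so the sketch accepts either one object or a list.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FundraisingEvent:
    """Hypothetical container for a few fields of one event record."""
    event_id: str
    company_name: str
    amount: Optional[str]
    currency: Optional[str]
    funding_type: Optional[str]
    article_titles: List[str]


def parse_event(record: Dict[str, Any]) -> FundraisingEvent:
    """Pick a handful of schema fields out of a raw event dictionary."""
    funding = record.get("fundraising") or {}
    articles = record.get("articles") or []
    # Assumption: "articles" may arrive as one object or as a list of objects.
    if isinstance(articles, dict):
        articles = [articles]
    return FundraisingEvent(
        event_id=str(record.get("id", "")),
        company_name=record.get("company_name", ""),
        amount=funding.get("amount"),
        currency=funding.get("currency"),
        funding_type=funding.get("funding_type"),
        article_titles=[a.get("title", "") for a in articles],
    )


# Made-up sample payload, for illustration only.
sample = json.loads("""
[{"id": "evt-1", "event_type": "fundraising",
  "fundraising": {"amount": "5000000", "currency": "USD", "funding_type": "Series A"},
  "company_name": "Example Corp",
  "articles": {"title": "Example Corp raises funds"}}]
""")
events = [parse_event(r) for r in sample]
print(events[0].company_name, events[0].amount, events[0].currency)
```

Using `.get` with defaults keeps the parser tolerant of missing keys, which is a cautious choice when the exact delivery format has not been confirmed.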
Corporate
Headquarter Changes
Track near real-time corporate headquarters shifts with details on event description, timing, new and old locations, affected employees, source, category, and relevance score. Additional insights, like sentiment analysis, are available on request.
[
{
"id": "Event id",
"event_type": "Currently, it's set to 'fundraising'; other types of event are accessible upon request",
"fundraising": {
"amount": "Amount of money raised",
"company_description": "Short description of the company that raised funds",
"funding_type": "Type of the funding event",
"currency": "Currency"
},
"global_event_type": "Type of the parent event, in this case it's set to 'Finance'",
"associated_article_ids": "IDs of the articles associated with this event",
"extraction_date": "Date when the event was extracted",
"event_date": "Date of the event",
"company_name": "Name of the company",
"articles": {
"paid_content": "True if the content is marked as paid or sponsored",
"link": "Full URL where the article was originally published",
"description": "Short summary of the article provided by the publisher",
"language": "The language of the article",
"media": "A link to a thumbnail image of the article",
"all_domain_links": "All domain URLs embedded in the article's content HTML",
"title": "The title of the article",
"journalists": "Clean list of journalists. No news publication names, only people",
"content": "The full content of the article",
"word_count": "Number of words in the article's content",
"domain_url": "The domain URL of the article's source",
"all_links": "All URL links embedded in the article's content HTML",
"rights": "Copyright",
"rank": "The page rank of the source website (which is given in the clean_url)",
"twitter_account": "The Twitter account of the publisher",
"id": "Newscatcher API's unique identifier for each news article",
"name_source": "The common name of the News Source",
"full_domain_url": "The full domain URL with a subcategory of the article's source",
"author": "The author of the article",
"is_headline": "True when an article has been seen on the main page of the news source",
"nlp": {
"summary": "AI-generated short summary of the article",
"ner_PER": {
"entity_name": "Name of the person",
"count": "Number of mentions in the article"
},
"ner_LOC": {
"entity_name": "Entity recognized - location",
"count": "Number of mentions in the article"
},
"ner_ORG": {
"entity_name": "Entity recognized - organization",
"count": "Number of mentions in the article"
},
"ner_MISC": {
"entity_name": "Entity recognized - others",
"count": "Number of mentions in the article"
},
"sentiment": {
"title": "Sentiment of the title",
"content": "Sentiment of content"
},
"theme": "Theme recognized"
},
"published_date_precision": "Accuracy of the published_date field. There are 3 types of date precision we define: 'full' — day and time of an article is correctly identified with the appropriate timezone, 'timezone unknown' — day and time of an article is correctly identified without timezone, 'date' — only the day is identified without an exact time",
"is_opinion": "True if the article is an 'Opinion' article"
}
}
]
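The `published_date_precision` field in the schema above takes one of three values: 'full', 'timezone unknown', or 'date'. The sketch below shows one possible way to turn those values into simple checks before trusting an article's time of day or timezone. The `DatePrecision` enum and both helper functions are hypothetical names introduced here, not part of any published client.

```python
from enum import Enum


class DatePrecision(Enum):
    """The three precision levels described in the schema."""
    FULL = "full"                          # day, time, and timezone identified
    TIMEZONE_UNKNOWN = "timezone unknown"  # day and time identified, no timezone
    DATE = "date"                          # only the day identified


def has_reliable_time(precision: str) -> bool:
    """True when the time of day was identified (with or without timezone)."""
    return DatePrecision(precision) in (
        DatePrecision.FULL,
        DatePrecision.TIMEZONE_UNKNOWN,
    )


def has_reliable_timezone(precision: str) -> bool:
    """True only when the timezone was identified as well."""
    return DatePrecision(precision) is DatePrecision.FULL


for value in ("full", "timezone unknown", "date"):
    print(value, has_reliable_time(value), has_reliable_timezone(value))
```

Passing a value outside the three documented ones raises `ValueError`, which makes unexpected data visible early rather than silently treating it as precise.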
Data Breaches
Stay ahead with near-real-time updates on the latest data breaches. This dataset includes comprehensive details such as the breach date, affected parties, types of data leaked, and involved companies or individuals.
[
{
"id": "Event id",
"event_type": "Currently, it's set to 'fundraising'; other types of event are accessible upon request",
"fundraising": {
"amount": "Amount of money raised",
"company_description": "Short description of the company that raised funds",
"funding_type": "Type of the funding event",
"currency": "Currency"
},
"global_event_type": "Type of the parent event, in this case it's set to 'Finance'",
"associated_article_ids": "IDs of the articles associated with this event",
"extraction_date": "Date when the event was extracted",
"event_date": "Date of the event",
"company_name": "Name of the company",
"articles": {
"paid_content": "True if the content is marked as paid or sponsored",
"link": "Full URL where the article was originally published",
"description": "Short summary of the article provided by the publisher",
"language": "The language of the article",
"media": "A link to a thumbnail image of the article",
"all_domain_links": "All domain URLs embedded in the article's content HTML",
"title": "The title of the article",
"journalists": "Clean list of journalists. No news publication names, only people",
"content": "The full content of the article",
"word_count": "Number of words in the article's content",
"domain_url": "The domain URL of the article's source",
"all_links": "All URL links embedded in the article's content HTML",
"rights": "Copyright",
"rank": "The page rank of the source website (which is given in the clean_url)",
"twitter_account": "The Twitter account of the publisher",
"id": "Newscatcher API's unique identifier for each news article",
"name_source": "The common name of the News Source",
"full_domain_url": "The full domain URL with a subcategory of the article's source",
"author": "The author of the article",
"is_headline": "True when an article has been seen on the main page of the news source",
"nlp": {
"summary": "AI-generated short summary of the article",
"ner_PER": {
"entity_name": "Name of the person",
"count": "Number of mentions in the article"
},
"ner_LOC": {
"entity_name": "Entity recognized - location",
"count": "Number of mentions in the article"
},
"ner_ORG": {
"entity_name": "Entity recognized - organization",
"count": "Number of mentions in the article"
},
"ner_MISC": {
"entity_name": "Entity recognized - others",
"count": "Number of mentions in the article"
},
"sentiment": {
"title": "Sentiment of the title",
"content": "Sentiment of content"
},
"theme": "Theme recognized"
},
"published_date_precision": "Accuracy of the published_date field. There are 3 types of date precision we define: 'full' — day and time of an article is correctly identified with the appropriate timezone, 'timezone unknown' — day and time of an article is correctly identified without timezone, 'date' — only the day is identified without an exact time",
"is_opinion": "True if the article is an 'Opinion' article"
}
}
]
Layoffs
Stay informed on layoffs reported from around the world, organized by company. Included information covers company name, number of employees affected, reasons for the layoffs, locations, and the full text of related news articles.
[
{
"id": "Event id",
"event_type": "Currently, it's set to 'fundraising'; other types of event are accessible upon request",
"fundraising": {
"amount": "Amount of money raised",
"company_description": "Short description of the company that raised funds",
"funding_type": "Type of the funding event",
"currency": "Currency"
},
"global_event_type": "Type of the parent event, in this case it's set to 'Finance'",
"associated_article_ids": "IDs of the articles associated with this event",
"extraction_date": "Date when the event was extracted",
"event_date": "Date of the event",
"company_name": "Name of the company",
"articles": {
"paid_content": "True if the content is marked as paid or sponsored",
"link": "Full URL where the article was originally published",
"description": "Short summary of the article provided by the publisher",
"language": "The language of the article",
"media": "A link to a thumbnail image of the article",
"all_domain_links": "All domain URLs embedded in the article's content HTML",
"title": "The title of the article",
"journalists": "Clean list of journalists. No news publication names, only people",
"content": "The full content of the article",
"word_count": "Number of words in the article's content",
"domain_url": "The domain URL of the article's source",
"all_links": "All URL links embedded in the article's content HTML",
"rights": "Copyright",
"rank": "The page rank of the source website (which is given in the clean_url)",
"twitter_account": "The Twitter account of the publisher",
"id": "Newscatcher API's unique identifier for each news article",
"name_source": "The common name of the News Source",
"full_domain_url": "The full domain URL with a subcategory of the article's source",
"author": "The author of the article",
"is_headline": "True when an article has been seen on the main page of the news source",
"nlp": {
"summary": "AI-generated short summary of the article",
"ner_PER": {
"entity_name": "Name of the person",
"count": "Number of mentions in the article"
},
"ner_LOC": {
"entity_name": "Entity recognized - location",
"count": "Number of mentions in the article"
},
"ner_ORG": {
"entity_name": "Entity recognized - organization",
"count": "Number of mentions in the article"
},
"ner_MISC": {
"entity_name": "Entity recognized - others",
"count": "Number of mentions in the article"
},
"sentiment": {
"title": "Sentiment of the title",
"content": "Sentiment of content"
},
"theme": "Theme recognized"
},
"published_date_precision": "Accuracy of the published_date field. There are 3 types of date precision we define: 'full' — day and time of an article is correctly identified with the appropriate timezone, 'timezone unknown' — day and time of an article is correctly identified without timezone, 'date' — only the day is identified without an exact time",
"is_opinion": "True if the article is an 'Opinion' article"
}
}
]
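The `nlp` block in the schema above pairs each recognized entity with a mention count. A minimal sketch of how those counts could be totaled across several articles follows. It assumes each `ner_*` field holds either one `{entity_name, count}` object or a list of them; the `top_entities` helper and the sample articles are made up for illustration.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def _as_list(value: Any) -> List[Any]:
    """Normalize a missing value, a single object, or a list into a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def top_entities(
    articles: Iterable[Dict[str, Any]], ner_key: str = "ner_ORG", n: int = 3
) -> List[Tuple[str, int]]:
    """Sum mention counts per entity across articles and return the top n."""
    totals: Counter = Counter()
    for article in articles:
        nlp = article.get("nlp") or {}
        for entity in _as_list(nlp.get(ner_key)):
            totals[entity["entity_name"]] += int(entity["count"])
    return totals.most_common(n)


# Made-up sample articles, for illustration only.
sample_articles = [
    {"nlp": {"ner_ORG": [{"entity_name": "Example Corp", "count": 3},
                         {"entity_name": "Acme Bank", "count": 1}]}},
    {"nlp": {"ner_ORG": {"entity_name": "Example Corp", "count": 2}}},
]
print(top_entities(sample_articles, n=2))
```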
Remote Work Transitions
Get access to updates on remote work transitions. The dataset includes location, a brief summary, company name, type of change, number of people affected, and the full text of related articles.
[
{
"id": "Event id",
"event_type": "Currently, it's set to 'fundraising'; other types of event are accessible upon request",
"fundraising": {
"amount": "Amount of money raised",
"company_description": "Short description of the company that raised funds",
"funding_type": "Type of the funding event",
"currency": "Currency"
},
"global_event_type": "Type of the parent event, in this case it's set to 'Finance'",
"associated_article_ids": "IDs of the articles associated with this event",
"extraction_date": "Date when the event was extracted",
"event_date": "Date of the event",
"company_name": "Name of the company",
"articles": {
"paid_content": "True if the content is marked as paid or sponsored",
"link": "Full URL where the article was originally published",
"description": "Short summary of the article provided by the publisher",
"language": "The language of the article",
"media": "A link to a thumbnail image of the article",
"all_domain_links": "All domain URLs embedded in the article's content HTML",
"title": "The title of the article",
"journalists": "Clean list of journalists. No news publication names, only people",
"content": "The full content of the article",
"word_count": "Number of words in the article's content",
"domain_url": "The domain URL of the article's source",
"all_links": "All URL links embedded in the article's content HTML",
"rights": "Copyright",
"rank": "The page rank of the source website (which is given in the clean_url)",
"twitter_account": "The Twitter account of the publisher",
"id": "Newscatcher API's unique identifier for each news article",
"name_source": "The common name of the News Source",
"full_domain_url": "The full domain URL with a subcategory of the article's source",
"author": "The author of the article",
"is_headline": "True when an article has been seen on the main page of the news source",
"nlp": {
"summary": "AI-generated short summary of the article",
"ner_PER": {
"entity_name": "Name of the person",
"count": "Number of mentions in the article"
},
"ner_LOC": {
"entity_name": "Entity recognized - location",
"count": "Number of mentions in the article"
},
"ner_ORG": {
"entity_name": "Entity recognized - organization",
"count": "Number of mentions in the article"
},
"ner_MISC": {
"entity_name": "Entity recognized - others",
"count": "Number of mentions in the article"
},
"sentiment": {
"title": "Sentiment of the title",
"content": "Sentiment of content"
},
"theme": "Theme recognized"
},
"published_date_precision": "Accuracy of the published_date field. There are 3 types of date precision we define: 'full' — day and time of an article is correctly identified with the appropriate timezone, 'timezone unknown' — day and time of an article is correctly identified without timezone, 'date' — only the day is identified without an exact time",
"is_opinion": "True if the article is an 'Opinion' article"
}
}
]
Customized News Datasets for Your Unique Needs
Bespoke Integration
100% compatibility with your data stack.
95% Coverage Per Source
Miss nothing critical
Output Formats
JSON, CSV, XML, other
Delivery Methods
API, Data Dumps, Streaming
Only Relevant Articles
Fewer than 2% false positives
Fast Insights
Under 10 min latency
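JSON and CSV are both listed among the output formats above. For readers who receive JSON and want a flat table, the sketch below shows one possible conversion of event records into CSV text with Python's standard library. The column selection and the `events_to_csv` name are assumptions made for this example and do not describe the format of an actual CSV delivery.

```python
import csv
import io
from typing import Any, Dict, Iterable

# Hypothetical column choice: a few top-level and fundraising fields.
CSV_FIELDS = ["id", "company_name", "event_date", "amount", "currency", "funding_type"]


def events_to_csv(events: Iterable[Dict[str, Any]]) -> str:
    """Flatten event dictionaries into CSV text, one row per event."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for event in events:
        funding = event.get("fundraising") or {}
        writer.writerow({
            "id": event.get("id", ""),
            "company_name": event.get("company_name", ""),
            "event_date": event.get("event_date", ""),
            "amount": funding.get("amount", ""),
            "currency": funding.get("currency", ""),
            "funding_type": funding.get("funding_type", ""),
        })
    return buffer.getvalue()


# Made-up sample event, for illustration only.
sample_events = [{
    "id": "evt-1",
    "company_name": "Example Corp",
    "event_date": "2024-01-15",
    "fundraising": {"amount": "5000000", "currency": "USD", "funding_type": "Series A"},
}]
csv_text = events_to_csv(sample_events)
print(csv_text)
```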