News
Datasets

We collect news data from over 90,000 sources worldwide, then clean, structure, and enrich it to provide our customers with custom news solutions. Check out the most popular datasets and contact us to start receiving them regularly, or build a custom solution for your unique needs.

Fundraising
Events

All news on fundraising events worldwide, organized by company. Key details include event description, timing, location, involved entities, impact, source, sentiment, category, and relevance score.

Asset Valuation
Audience Segmentation
Demand Forecasting
[
    {
        "id": "Event id",
        "event_type": "Currently, it's set to 'fundraising'; other types of event are accessible upon request",
        "fundraising": {
            "amount": "Amount of money raised",
            "company_description": "Short description of the company that raised funds",
            "funding_type": "Type of the funding event",
            "currency": "Currency"
        },
        "global_event_type": "Type of the parent event, in this case it's set to 'Finance'",
        "associated_article_ids": "IDs of the articles associated with this event",
        "extraction_date": "Date when the event was extracted",
        "event_date": "Date of the event",
        "company_name": "Name of the company",
        "articles": {
            "paid_content": "True if the content is marked as paid or sponsored",
            "link": "Full URL where the article was originally published",
            "description": "Short summary of the article provided by the publisher",
            "language": "The language of the article",
            "media": "A link to a thumbnail image of the article",
            "all_domain_links": "All domain URLs embedded in the article's content HTML",
            "title": "The title of the article",
            "journalists": "Clean list of journalists. No news publication names, only people",
            "content": "The full content of the article",
            "word_count": "Number of words in the article's content",
            "domain_url": "The domain URL of the article's source",
            "all_links": "All URL links embedded in the article's content HTML",
            "rights": "Copyright",
            "rank": "The page rank of the source website (which is given in the clean_url)",
            "twitter_account": "The Twitter account of the publisher",
            "id": "Newscatcher API's unique identifier for each news article",
            "name_source": "The common name of the News Source",
            "full_domain_url": "The full domain URL with a subcategory of the article's source",
            "author": "The author of the article",
            "is_headline": "True when an article has been seen on the main page of the news source",
            "nlp": {
                "summary": "AI-generated short summary of the article",
                "ner_PER": {
                    "entity_name": "Name of the person",
                    "count": "Number of mentions in the article"
                },
                "ner_LOC": {
                    "entity_name": "Entity recognized - location",
                    "count": "Number of mentions in the article"
                },
                "ner_ORG": {
                    "entity_name": "Entity recognized - organization",
                    "count": "Number of mentions in the article"
                },
                "ner_MISC": {
                    "entity_name": "Entity recognized - others",
                    "count": "Number of mentions in the article"
                },
                "sentiment": {
                    "title": "Sentiment of the title",
                    "content": "Sentiment of content"
                },
                "theme": "Theme recognized"
            },
            "published_date_precision": "Accuracy of the published_date field. There are 3 types of date precision we define: 'full' — day and time of an article is correctly identified with the appropriate timezone, 'timezone unknown' — day and time of an article is correctly identified without timezone, 'date' — only the day is identified without an exact time",
            "is_opinion": "True if the article is an 'Opinion' article"
        }
    }
]
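The three published_date_precision values described in the schema above suggest three different ways to treat a timestamp. The following is a minimal sketch, not the provider's implementation: it assumes a published_date value arrives as an ISO 8601 string, which the schema does not confirm, and normalize_published_date is a hypothetical helper name.

```python
from datetime import date, datetime, timezone


def normalize_published_date(raw: str, precision: str):
    """Interpret a timestamp according to its documented precision level.

    Assumption: `raw` is an ISO 8601 string (not confirmed by the schema).
    """
    parsed = datetime.fromisoformat(raw)
    if precision == "full":
        # Day, time, and timezone are reliable: convert to UTC when an offset is present.
        if parsed.tzinfo is not None:
            return parsed.astimezone(timezone.utc)
        return parsed
    if precision == "timezone unknown":
        # Day and time are reliable, the timezone is not: keep a naive datetime.
        return parsed.replace(tzinfo=None)
    if precision == "date":
        # Only the calendar day is reliable: drop the time component.
        return parsed.date()
    raise ValueError(f"Unexpected precision value: {precision!r}")
```

Returning a different type per precision level makes it harder to compare a day-only value with a full timestamp by accident.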
Object | Description
id | Event ID
event_type | Currently set to "fundraising"; other event types are available on request
fundraising | Details about the event, a list
global_event_type | Type of the parent event, in this case it's set to "Finance"
associated_article_ids | IDs of the articles associated with this event
extraction_date | Date when the event was extracted
event_date | Date of the event
company_name | Name of the company
articles | News articles associated with the event, an array
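As an illustration of how records with these fields might be consumed, here is a minimal sketch that totals the amount raised per currency. It assumes amount is numeric and that fundraising is a single nested object, as in the JSON schema above; the sample records and the total_raised_by_currency helper are hypothetical and not real data.

```python
from collections import defaultdict

# Hypothetical sample records shaped like the schema above (not real data).
sample_events = [
    {"id": "evt-1", "event_type": "fundraising", "company_name": "Acme Robotics",
     "fundraising": {"amount": 5000000, "currency": "USD", "funding_type": "Series A"}},
    {"id": "evt-2", "event_type": "fundraising", "company_name": "Globex",
     "fundraising": {"amount": 2500000, "currency": "USD", "funding_type": "Seed"}},
    {"id": "evt-3", "event_type": "fundraising", "company_name": "Initech",
     "fundraising": {"amount": 1200000, "currency": "EUR", "funding_type": "Seed"}},
    # A record with a missing amount, to show that incomplete entries are skipped.
    {"id": "evt-4", "event_type": "fundraising", "company_name": "Umbrella",
     "fundraising": {"amount": None, "currency": "USD", "funding_type": "Grant"}},
]


def total_raised_by_currency(events):
    """Sum fundraising amounts per currency, skipping incomplete records."""
    totals = defaultdict(float)
    for event in events:
        if event.get("event_type") != "fundraising":
            continue
        details = event.get("fundraising") or {}
        amount = details.get("amount")
        currency = details.get("currency")
        if amount is None or currency is None:
            continue
        totals[currency] += float(amount)
    return dict(totals)


totals = total_raised_by_currency(sample_events)
```

Amounts are kept separate per currency because the schema does not describe any exchange-rate conversion.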

Corporate
Headquarter Changes

Track near real-time corporate headquarters shifts with details on event description, timing, new and old locations, affected employees, source, category, and relevance score. Additional insights, like sentiment analysis, are available on request.

Risk Analysis
Market Analysis
Object | Description
id | Event ID
associated_article_ids | IDs of the articles associated with this event
extraction_date | Date when the event was extracted
event_date | Date of the event
company_name | Name of the company
articles | News articles associated with the event, an array
how_much_related | Identifies how much the article is related to the event
change_type | Type of the HQ change
area_size_sqft | Area size in square feet
raw_area_size | Raw area size
employees_affected | Names of the employees involved in changes
summary | Short summary of the event
location | Location of the event
states | All the states mentioned in the news articles, a list
cities | All the cities mentioned in the news articles, a list
first_published_date | Date when the article about the event was published first
recent_published_date | Date when the most recent article about the event was published
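A common use of these fields is narrowing headquarters changes to one state and a minimum footprint. The sketch below assumes area_size_sqft is numeric (or missing) and that states is a list of strings. The sample records, the change_type values, and the filter_hq_changes helper are hypothetical.

```python
# Hypothetical sample records using the field names listed above (not real data).
sample_hq_changes = [
    {"id": "evt-1", "company_name": "Acme Robotics", "change_type": "relocation",
     "area_size_sqft": 120000, "states": ["Texas"], "cities": ["Austin"]},
    {"id": "evt-2", "company_name": "Globex", "change_type": "relocation",
     "area_size_sqft": 30000, "states": ["Texas", "California"], "cities": ["Dallas"]},
    {"id": "evt-3", "company_name": "Initech", "change_type": "expansion",
     "area_size_sqft": None, "states": ["New York"], "cities": ["New York"]},
]


def filter_hq_changes(events, state, min_area_sqft=0):
    """Return events that mention `state` and meet the minimum area, if one is set."""
    matches = []
    for event in events:
        if state not in (event.get("states") or []):
            continue
        area = event.get("area_size_sqft")
        # Events without a known area are dropped only when a minimum is requested.
        if min_area_sqft > 0 and (area is None or area < min_area_sqft):
            continue
        matches.append(event)
    return matches


large_texas_moves = filter_hq_changes(sample_hq_changes, "Texas", min_area_sqft=50000)
```

Because states lists every state mentioned in the articles, a match means the state appears in coverage of the event, not necessarily that it is the new headquarters location.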

Data Breaches

Stay ahead with near-real-time updates on the latest data breaches. This dataset includes comprehensive details such as the breach date, affected parties, types of data leaked, and involved companies or individuals.

Risk Analysis
Insurance
Financial Analysis
Compliance and Regulatory Oversight
Investment Management
Object | Description
id | Unique identifier for the dataset entry.
event_type | Type of the event (e.g., data breach).
global_event_type | Generalized category of the event (e.g., DataMonitoring).
associated_article_ids | List of IDs for articles associated with this event.
extraction_date | Timestamp of when the data was extracted.
event_date | Date of the event
company_name | Name of the company involved in the data breach.
data_breach | Data breach associated with the event, an array
articles | News articles associated with the event, an array

Layoffs

Stay informed on layoffs reported from around the world, organized by company. Included information covers company name, number of employees affected, reasons for the layoffs, locations, and the full text of related news articles.

Risk Analysis
Investment Management
Strategic Planning
Talent Acquisition
Industry Trends Detection
Object | Description
id | Unique identifier for the dataset entry.
layoff | Layoff associated with the event, an array
event_type | Type of the event (e.g., layoff).
global_event_type | Generalized category of the event (e.g., Layoff).
associated_article_ids | List of IDs for articles associated with this event.
extraction_date | Timestamp of when the data was extracted.
event_date | Date of the event
company_name | Name of the company involved in the layoff.
articles | News articles associated with the event, an array
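For trend detection, layoff events can be bucketed by month using event_date. This is a minimal sketch that assumes event_date is an ISO 8601 string beginning with YYYY-MM-DD, which the field list does not confirm; the sample records and the layoffs_per_month helper are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records using the field names listed above (not real data).
sample_layoffs = [
    {"id": "evt-1", "event_type": "layoff", "company_name": "Acme Robotics", "event_date": "2024-01-15"},
    {"id": "evt-2", "event_type": "layoff", "company_name": "Globex", "event_date": "2024-01-28"},
    {"id": "evt-3", "event_type": "layoff", "company_name": "Initech", "event_date": "2024-02-03"},
    # A record without a date, to show that undated events are skipped.
    {"id": "evt-4", "event_type": "layoff", "company_name": "Umbrella", "event_date": None},
]


def layoffs_per_month(events):
    """Count events per calendar month, keyed as 'YYYY-MM'."""
    counts = Counter()
    for event in events:
        raw = event.get("event_date")
        if not raw:
            continue
        # Keep only the date part so a trailing time component does not break parsing.
        day = date.fromisoformat(raw[:10])
        counts[day.strftime("%Y-%m")] += 1
    return counts


monthly_counts = layoffs_per_month(sample_layoffs)
```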

Remote Work Transitions

Get access to updates on remote work transitions. The dataset includes location, a brief summary, company name, type of change, number of people affected, and the full text of related articles.

Market Research
Risk Management
Real Estate Management
Talent Acquisition
Object | Description
id | Unique identifier for the dataset entry.
remote_work_transition | Remote work transition associated with the event, an array
event_type | Specific type of event (e.g., remote_work_transition).
global_event_type | Generalized category of the event.
associated_article_ids | List of IDs for articles associated with this event.
extraction_date | Timestamp of when the data was extracted.
event_date | Date of the event
company_name | Name of the company involved in the remote work transition.
articles | News articles associated with the event, an array

Customized News Datasets for Your Unique Needs

Bespoke Integration

100% compatibility with your data stack.

95% Coverage Per Source

Miss nothing critical

Output Formats

JSON, CSV, XML, other

Delivery Methods

API, Data Dumps, Streaming

Only Relevant Articles

Less than 2% of false positives

Fast Insights

Under 10 min latency

Trusted by
the Top Leaders

We use NewsCatcher to capture the real-time impact of news stories on corporate credit spreads. We love the rapidly evolving reach and expanding API functionality.

Rajiv Bhat

We compute the sum of scores of all retrieved articles for each API option. As a result, NewsCatcher and Google News achieve the highest scores of 35 and 39, respectively. The other three APIs, Newsdata.io, Aylien, and NewsAPI.org, score 16.5, 30.5, and 23.5.

University of California, Berkeley

It’s almost like we were a farm-to-table restaurant growing our own vegetables. Then the NewsCatcher guys came in and said, ‘You don’t have to worry about that. Just focus on the kitchen.’

Mishaal

Jumping on a call with NewsCatcher proved to be incredibly valuable. The personalized service and tailored solutions made a significant difference, proving that the initial effort to engage with them was well worth it.

Carlos Toruno

After analyzing ratios such as false positives, we found that NewsCatcher had, by far, the best results in terms of availability, quality, and regional focus, making it the clear winner based on our defined KPIs.

Michael

We found NewsCatcher to be a really nice global solution serving our purpose. The integration of NewsCatcher was seamless. It took us less than four days for both the integration and the testing part of it.

Vedant Lohia