Search articles

POST /api/search

curl --request POST \
  --url https://v3-api.newscatcherapi.com/api/search \
  --header 'Content-Type: application/json' \
  --header 'x-api-token: <api-key>' \
  --data '{
  "q": "renewable energy",
  "predefined_sources": [
    "top 50 US"
  ],
  "lang": [
    "en"
  ],
  "from_": "2024/01/01",
  "to_": "2024/06/30",
  "additional_domain_info": true,
  "is_news_domain": true
}'

{
  "status": "<string>",
  "total_hits": 123,
  "page": 123,
  "total_pages": 123,
  "page_size": 123,
  "articles": [],
  "user_input": {}
}

Authorizations

x-api-token

string

header

required

API Key to authenticate requests.

To access the API, include your API key in the x-api-token header. To obtain your API key, complete the form or contact us directly.
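As a rough illustration of the header requirement above, the following Python sketch builds the same kind of POST request as the curl example, using only the standard library. The helper name `build_search_request` is hypothetical and not part of any official client; the URL and header names are taken from the curl example, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example above.
API_URL = "https://v3-api.newscatcherapi.com/api/search"


def build_search_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the x-api-token header.

    Hypothetical helper for illustration only.
    """
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={"Content-Type": "application/json", "x-api-token": api_key},
        method="POST",
    )


request = build_search_request("<api-key>", {"q": "renewable energy", "lang": ["en"]})
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any network call is made.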

Body

application/json

Request body for searching articles based on specified criteria such as keyword, language, country, source, and more.

q

string

required

The keyword(s) to search for in articles. Query syntax supports logical operators (AND, OR, NOT) and wildcards:

For an exact match, use double quotes. For example, "technology news".
Use * to search for any keyword.
Use + to include and - to exclude specific words or phrases. For example, +Apple, -Google.
Use AND, OR, and NOT to refine search results. For example, technology AND (Apple OR Microsoft) NOT Google.

For more details, see Advanced querying.
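The operators described above can be combined programmatically. The sketch below assembles a query string from plain Python values; the helper names (`phrase`, `any_of`, `build_query`) are made up for this illustration, and the result is simply the string you would pass as q.

```python
from typing import Sequence


def phrase(text: str) -> str:
    """Wrap text in double quotes to request an exact match."""
    return f'"{text}"'


def any_of(terms: Sequence[str]) -> str:
    """Join terms with OR and group them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(must: str, any_terms: Sequence[str], exclude: str) -> str:
    """Combine one required term, a group of alternatives, and one excluded term."""
    return f"{must} AND {any_of(any_terms)} NOT {exclude}"


query = build_query("technology", ["Apple", "Microsoft"], "Google")
exact = phrase("technology news")
```

Here `query` reproduces the documented example string, and `exact` is the quoted form used for exact matching.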

Example:

"technology AND (Apple OR Microsoft) NOT Google"

search_in

string

default:title,content

The article fields to search in. To search in multiple fields, use a comma-separated string.

Example: "title, summary"

Note: The summary option is available if NLP is enabled in your plan.

Available options: title, summary, content.

Example:

"title,content"

predefined_sources

Predefined top news sources per country.

Format: start with the word top, followed by the number of desired sources, and then the two-letter ISO 3166-1 alpha-2 country code. To specify top sources for multiple countries, use a comma-separated string or an array of strings.

Examples:

"top 100 US"
"top 33 AT"
"top 50 US, top 20 GB"
["top 50 US", "top 20 GB"]

Example:

["top 50 US", "top 20 GB"]
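A small sketch of how the "top N CC" format could be generated and checked before sending a request. The function `top_sources` is a hypothetical helper; the validation it performs (a positive count and a two-letter code) is an assumption about what is sensible on the client side, not a statement of what the API enforces.

```python
import re


def top_sources(count: int, country: str) -> str:
    """Format one predefined_sources entry, e.g. 'top 50 US'.

    Hypothetical helper; checks only the shape of the inputs.
    """
    if count < 1:
        raise ValueError("count must be a positive integer")
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError("country must be a two-letter code")
    return f"top {count} {country.upper()}"


predefined_sources = [top_sources(50, "us"), top_sources(20, "GB")]
```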

source_name

Specifies terms to search within the source names. To specify multiple terms, use a comma-separated string or an array of strings.

Examples:

"sport, tech"
["sport", "tech"]

Note: The search does not require an exact match and returns all sources that include the specified terms anywhere in their names. You can use any word, phrase, or outlet name, such as "sport" or "new york times". For example, using "sport" as a term returns sources like "Motorsport", "Dot Esport", and "Tuttosport".

Example:

["sport", "tech"]

sources

One or more news sources to narrow down the search. The format must be a domain URL. Subdomains, such as finance.yahoo.com, are also acceptable. To specify multiple sources, use a comma-separated string or an array of strings.

Examples:

"nytimes.com, theguardian.com"
["nytimes.com", "theguardian.com"]

Example:

["nytimes.com", "theguardian.com"]

not_sources

The news sources to exclude from the search. To exclude multiple sources, use a comma-separated string or an array of strings.

Examples:

"cnn.com, wsj.com"
["cnn.com", "wsj.com"]

Example:

["cnn.com", "wsj.com"]

lang

The language(s) of the search. The only accepted format is the two-letter ISO 639-1 code. To select multiple languages, use a comma-separated string or an array of strings.

Examples:

"en,es"
["en", "es"]

To learn more, see Enumerated parameters > Language.

Example:

["en", "es"]

not_lang

The language(s) to exclude from the search. The accepted format is the two-letter ISO 639-1 code. To exclude multiple languages, use a comma-separated string or an array of strings.

Examples:

"fr,de"
["fr", "de"]

To learn more, see Enumerated parameters > Language.

Example:

["fr", "de"]

countries

The countries where the news publisher is located. The accepted format is the two-letter ISO 3166-1 alpha-2 code. To select multiple countries, use a comma-separated string or an array of strings.

Examples:

"US,CA"
["US", "CA"]

To learn more, see Enumerated parameters > Country.

Example:

["US", "CA"]

not_countries

The publisher location countries to exclude from the search. The accepted format is the two-letter ISO 3166-1 alpha-2 code. To exclude multiple countries, use a comma-separated string or an array of strings.

Examples:

"GB,FR"
["GB", "FR"]

To learn more, see Enumerated parameters > Country.

Example:

["GB", "FR"]

not_author_name

The list of author names to exclude from your search. To exclude articles by specific authors, use a comma-separated string or an array of strings.

Examples:

"John Doe, Jane Doe"
["John Doe", "Jane Doe"]

Example:

["John Doe", "Jane Doe"]

from_

The starting point in time to search from. Accepts date-time strings in ISO 8601 format and plain text strings. The default time zone is UTC.

Formats with examples:

YYYY-mm-ddTHH:MM:SS: 2024-07-01T00:00:00
YYYY-mm-dd: 2024-07-01
YYYY/mm/dd HH:MM:SS: 2024/07/01 00:00:00
YYYY/mm/dd: 2024/07/01
English phrases: 1 day ago, today

Note: By default, applied to the publication date of the article. To use the article's parse date instead, set the by_parse_date parameter to true.

Example:

"2021/01/01"

to_

The ending point in time to search up to. Accepts date-time strings in ISO 8601 format and plain text strings. The default time zone is UTC.

Formats with examples:

YYYY-mm-ddTHH:MM:SS: 2024-07-01T00:00:00
YYYY-mm-dd: 2024-07-01
YYYY/mm/dd HH:MM:SS: 2024/07/01 00:00:00
YYYY/mm/dd: 2024/07/01
English phrases: 1 day ago, today

Note: By default, applied to the publication date of the article. To use the article's parse date instead, set the by_parse_date parameter to true.
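To show how from_ and to_ might be produced from Python datetime values, here is a minimal sketch that formats a window in the YYYY/mm/dd HH:MM:SS form listed above. The helper `date_window` is hypothetical; it converts to UTC first because UTC is the documented default time zone.

```python
from datetime import datetime, timedelta, timezone

# strftime pattern matching the documented "YYYY/mm/dd HH:MM:SS" format.
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"


def date_window(end: datetime, days: int) -> dict:
    """Return from_/to_ values covering the `days` days that end at `end`.

    Hypothetical helper; `end` is expected to be timezone-aware.
    """
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(days=days)
    return {
        "from_": start_utc.strftime(DATE_FORMAT),
        "to_": end_utc.strftime(DATE_FORMAT),
    }


window = date_window(datetime(2024, 7, 1, tzinfo=timezone.utc), days=7)
```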

Example:

"2021/12/31"

published_date_precision

enum<string>

The precision of the published date. There are three types:

full: The day and time of an article are correctly identified with the appropriate timezone.
timezone unknown: The day and time of an article are correctly identified without timezone.
date: Only the day is identified without an exact time.

Available options:

full,

timezone unknown,

date

Example:

"full"

by_parse_date

boolean

default:false

If true, the from_ and to_ parameters use article parse dates instead of published dates. Additionally, the parse_date variable is added to the output for each article object.

Example:

true

sort_by

enum<string>

default:relevancy

The sorting order of the results. Possible values are:

relevancy: The most relevant results first.
date: The most recently published results first.
rank: The results from the highest-ranked sources first.

Available options:

relevancy,

date,

rank

Example:

"date"

ranked_only

boolean

default:true

If true, limits the search to sources ranked in the top 1 million online websites. If false, also includes unranked sources, which are assigned a rank of 999999.

Example:

true

from_rank

integer

default:1

The lower bound of the news website rank to filter by. A lower rank indicates a more popular source.

Required range: 1 <= x <= 999999

Example:

100

to_rank

integer

default:999999

The upper bound of the news website rank to filter by. A lower rank indicates a more popular source.

Required range: 1 <= x <= 999999

Example:

100
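The two rank bounds are typically used together. The following sketch checks both values against the documented range of 1 to 999999 before building the filter; `rank_filter` is a hypothetical helper, and the extra check that from_rank does not exceed to_rank is an assumption rather than a documented rule.

```python
MIN_RANK = 1
MAX_RANK = 999999


def rank_filter(from_rank: int = MIN_RANK, to_rank: int = MAX_RANK) -> dict:
    """Return from_rank/to_rank parameters after validating the documented range."""
    for name, value in (("from_rank", from_rank), ("to_rank", to_rank)):
        if not MIN_RANK <= value <= MAX_RANK:
            raise ValueError(f"{name} must be between {MIN_RANK} and {MAX_RANK}")
    # Assumption: an inverted range is treated as a client-side mistake.
    if from_rank > to_rank:
        raise ValueError("from_rank must not exceed to_rank")
    return {"from_rank": from_rank, "to_rank": to_rank}


top_thousand = rank_filter(to_rank=1000)
```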

is_headline

boolean

If true, only returns articles that were posted on the home page of a given news domain.

Example:

true

is_opinion

boolean

If true, returns only opinion pieces. If false, excludes opinion-based articles and returns news only.

Example:

true

is_paid_content

boolean

If false, returns only articles that have publicly available complete content. Some publishers partially block content, so this setting ensures that only full articles are retrieved.

Example:

false

parent_url

The categorical URL(s) to filter your search. To filter your search by multiple categorical URLs, use a comma-separated string or an array of strings.

Examples:

"wsj.com/politics,wsj.com/tech"
["wsj.com/politics", "wsj.com/tech"]

Example:

["wsj.com/politics", "wsj.com/tech"]

all_links

The complete URL(s) mentioned in the article. For multiple URLs, use a comma-separated string or an array of strings.

Examples:

"https://aiindex.stanford.edu/report/, https://www.stateof.ai/"
["https://aiindex.stanford.edu/report/", "https://www.stateof.ai/"]

For more details, see Search by URL.

Example:

{
  "string-input": {
    "summary": "Comma-separated string",
    "value": "https://aiindex.stanford.edu/report/, https://www.stateof.ai/"
  },
  "array-input": {
    "summary": "Array of strings",
    "value": [
      "https://aiindex.stanford.edu/report/",
      "https://www.stateof.ai/"
    ]
  }
}

all_domain_links

The domain(s) mentioned in the article. For multiple domains, use a comma-separated string or an array of strings.

Examples:

"who.int, nih.gov"
["who.int", "nih.gov"]

For more details, see Search by URL.

Example:

{
  "string-input": {
    "summary": "Comma-separated string",
    "value": "who.int, nih.gov"
  },
  "array-input": {
    "summary": "Array of strings",
    "value": ["who.int", "nih.gov"]
  }
}

additional_domain_info

boolean

If true, includes additional domain information in the response for each article:

is_news_domain: Boolean indicating if the source is a news domain.
news_domain_type: Type of news domain (e.g., "Original Content").
news_type: Category of news (e.g., "News and Blogs").

Example:

true

is_news_domain

boolean

If true, filters results to include only news domains.

Example:

true

news_domain_type

enum<string>

Filters results based on the news domain type. Possible values are:

Original Content: Sources that produce their own content.
Aggregator: Sources that collect content from various other sources.
Press Releases: Sources primarily publishing press releases.
Republisher: Sources that republish content from other sources.
Other: Sources that don't fit into main categories.

Available options:

Original Content,

Aggregator,

Press Releases,

Republisher,

Other

Example:

"Original Content"

news_type

Filters results based on the news type. Multiple types can be specified using a comma-separated string or an array of strings.

Examples:

"General News Outlets,Tech News and Updates"
["General News Outlets", "Tech News and Updates"]

For a complete list of available news types, see Enumerated parameters > News type.

Example:

[
  "General News Outlets",
  "Tech News and Updates"
]

word_count_min

integer

The minimum number of words an article must contain. Use it to filter out very short articles.

Required range: x >= 0

Example:

300

word_count_max

integer

The maximum number of words an article can contain. Use it to filter out very long articles.

Required range: x >= 0

Example:

1000

page

integer

default:1

The page number of the results to return. Use it to paginate through results, since a single API response returns at most 1,000 articles.

For details, see How to paginate large datasets.

Required range: x >= 1

Example:

2

page_size

integer

default:100

The number of articles to return per page.

Required range: 1 <= x <= 1000

Example:

50
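One way to use page and page_size together is to keep requesting pages until the total_pages value reported in the response is reached. The sketch below assumes a caller-supplied `fetch_page(page, page_size)` function that returns a parsed response; both `iter_articles` and the in-memory `fake_fetch` stand-in are hypothetical and exist only to make the loop runnable without network access.

```python
import math
from typing import Callable, Iterator


def iter_articles(fetch_page: Callable[[int, int], dict], page_size: int = 100) -> Iterator[dict]:
    """Yield articles page by page until the reported total_pages is reached."""
    page = 1
    while True:
        response = fetch_page(page, page_size)
        yield from response.get("articles", [])
        if page >= response["total_pages"]:
            break
        page += 1


def fake_fetch(page: int, page_size: int) -> dict:
    """Stand-in for a real API call, serving five made-up articles."""
    items = [{"id": str(i)} for i in range(5)]
    start = (page - 1) * page_size
    return {
        "page": page,
        "page_size": page_size,
        "total_hits": len(items),
        "total_pages": math.ceil(len(items) / page_size),
        "articles": items[start:start + page_size],
    }


ids = [article["id"] for article in iter_articles(fake_fetch, page_size=2)]
```

In real use, `fetch_page` would send the search request with the given page and page_size and return the decoded JSON body.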

clustering_enabled

boolean

default:false

Determines whether to group similar articles into clusters. If true, the API returns clustered results.

To learn more, see Clustering news articles.

Example:

true

clustering_variable

enum<string>

default:content

Specifies which part of the article to use for determining similarity when clustering. Possible values are:

content: Uses the full article content (default).
title: Uses only the article title.
summary: Uses the article summary.

To learn more, see Clustering news articles.

Available options:

content,

title,

summary

Example:

"content"

clustering_threshold

number

default:0.6

Sets the similarity threshold for grouping articles into clusters. A lower value creates more inclusive clusters, while a higher value requires greater similarity between articles.

For example:

0.3: Results in larger, more diverse clusters.
0.6: Balances cluster size and article similarity (default).
0.9: Creates smaller, tightly related clusters.

To learn more, see Clustering news articles.

Required range: 0 < x <= 1

Example:

0.6
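The three clustering parameters are usually set as a group. This sketch collects them into one dictionary and checks the documented constraints (the allowed clustering_variable values and the range 0 < x <= 1 for the threshold); `clustering_params` is a hypothetical helper name.

```python
CLUSTERING_VARIABLES = {"content", "title", "summary"}


def clustering_params(variable: str = "content", threshold: float = 0.6) -> dict:
    """Return request parameters that enable clustering with validated settings."""
    if variable not in CLUSTERING_VARIABLES:
        raise ValueError(f"clustering_variable must be one of {sorted(CLUSTERING_VARIABLES)}")
    if not 0 < threshold <= 1:
        raise ValueError("clustering_threshold must satisfy 0 < x <= 1")
    return {
        "clustering_enabled": True,
        "clustering_variable": variable,
        "clustering_threshold": threshold,
    }


tight_clusters = clustering_params(variable="title", threshold=0.9)
```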

include_nlp_data

boolean

default:false

If true, includes an NLP object for each article in the response. This object provides results of NLP analysis, including article theme, summary, sentiment, tags, and named entity recognition if available.

Note: NLP coverage and analysis completeness may vary by language, with full data available for articles in English and Arabic. The include_nlp_data parameter is available only in NLP subscription plans.

To learn more, see NLP features.

Example:

true

has_nlp

boolean

default:false

If true, filters results to include only articles that have NLP data.

Note: NLP coverage and analysis completeness may vary by language, with full data available for articles in English and Arabic. The has_nlp parameter is available only in NLP subscription plans.

To learn more, see NLP features.

Example:

true

theme

Filters articles based on their general topic, as determined by NLP analysis. To select multiple themes, use a comma-separated string or an array of strings.

Examples:

"Finance, Tech"
["Finance", "Tech"]

Note: The theme parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Available options: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Financial Crime, Lifestyle, Automotive, Travel, Weather, General.

Example:

["Business", "Finance"]

not_theme

Inverse of the theme parameter. Excludes articles based on their general topic, as determined by NLP analysis. To exclude multiple themes, use a comma-separated string or an array of strings.

Examples:

"Crime, Tech"
["Crime", "Tech"]

Note: The not_theme parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Example:

["Crime"]

ORG_entity_name

Filters articles that mention specific organization names, as identified by NLP analysis. To specify multiple organizations, use a comma-separated string or an array of strings.

Examples:

"Apple, Microsoft"
["Apple", "Microsoft"]

Note: The ORG_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

Example:

["Apple", "Microsoft"]

PER_entity_name

Filters articles that mention specific person names, as identified by NLP analysis. To specify multiple names, use a comma-separated string or an array of strings.

Examples:

"Elon Musk, Jeff Bezos"
["Elon Musk", "Jeff Bezos"]

Note: The PER_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

Example:

["Elon Musk", "Jeff Bezos"]

LOC_entity_name

Filters articles that mention specific location names, as identified by NLP analysis. To specify multiple locations, use a comma-separated string or an array of strings.

Examples:

"California, New York"
["California", "New York"]

Note: The LOC_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

Example:

["California", "New York"]

MISC_entity_name

Filters articles that mention other named entities not falling under person, organization, or location categories. Includes events, nationalities, products, works of art, and more. To specify multiple entities, use a comma-separated string or an array of strings.

Examples:

"Bitcoin, Blockchain"
["Bitcoin", "Blockchain"]

Note: The MISC_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

Example:

["Bitcoin", "Blockchain"]

title_sentiment_min

number

Filters articles based on the minimum sentiment score of their titles.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The title_sentiment_min parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

Example:

-0.5

title_sentiment_max

number

Filters articles based on the maximum sentiment score of their titles.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The title_sentiment_max parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

Example:

0.5

content_sentiment_min

number

Filters articles based on the minimum sentiment score of their content.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The content_sentiment_min parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

Example:

-0.5

content_sentiment_max

number

Filters articles based on the maximum sentiment score of their content.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The content_sentiment_max parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

Example:

0.5
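The four sentiment parameters form two min/max pairs, one for titles and one for content. A possible way to build a pair with range checking is sketched below; `sentiment_filter` is a hypothetical helper, the parameter names follow the notes above, and requiring the minimum not to exceed the maximum is an assumption.

```python
def sentiment_filter(field: str, minimum: float, maximum: float) -> dict:
    """Return the <field>_sentiment_min/max pair for 'title' or 'content'."""
    if field not in ("title", "content"):
        raise ValueError("field must be 'title' or 'content'")
    # Documented range is -1 <= x <= 1; min <= max is an assumed sanity check.
    if not -1 <= minimum <= maximum <= 1:
        raise ValueError("expected -1 <= minimum <= maximum <= 1")
    return {
        f"{field}_sentiment_min": minimum,
        f"{field}_sentiment_max": maximum,
    }


neutral_titles = sentiment_filter("title", -0.1, 0.1)
positive_content = sentiment_filter("content", 0.5, 1.0)
```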

iptc_tags

Filters articles based on International Press Telecommunications Council (IPTC) media topic tags. To specify multiple IPTC tags, use a comma-separated string or an array of strings.

Examples:

"20000199, 20000209"
["20000199", "20000209"]

Note: The iptc_tags parameter is only available in the v3_nlp_iptc_tags subscription plan.

To learn more, see IPTC Media Topic NewsCodes.

Example:

["20000199", "20000209"]

not_iptc_tags

Inverse of the iptc_tags parameter. Excludes articles based on International Press Telecommunications Council (IPTC) media topic tags. To specify multiple IPTC tags to exclude, use a comma-separated string or an array of strings.

Examples:

"20000205, 20000209"
["20000205", "20000209"]

Note: The not_iptc_tags parameter is only available in the v3_nlp_iptc_tags subscription plan.

To learn more, see IPTC Media Topic NewsCodes.

Example:

["20000205", "20000209"]

iab_tags

Filters articles based on Interactive Advertising Bureau (IAB) content categories. These tags provide a standardized taxonomy for digital advertising content categorization. To specify multiple IAB categories, use a comma-separated string or an array of strings.

Examples:

"Business, Events"
["Business", "Events"]

Note: The iab_tags parameter is only available in the v3_nlp_iptc_tags subscription plan.

To learn more, see the IAB Content taxonomy.

Example:

["Business", "Events"]

not_iab_tags

Inverse of the iab_tags parameter. Excludes articles based on Interactive Advertising Bureau (IAB) content categories. These tags provide a standardized taxonomy for digital advertising content categorization. To specify multiple IAB categories to exclude, use a comma-separated string or an array of strings.

Examples:

"Agriculture, Metals"
["Agriculture", "Metals"]

Note: The not_iab_tags parameter is only available in the v3_nlp_iptc_tags subscription plan.

To learn more, see the IAB Content taxonomy.

Example:

["Agriculture", "Metals"]

custom_tags

Filters articles based on a custom taxonomy that is tailored to your specific needs and is accessible only with your API key. To specify tags, use the following pattern:

custom_tags.taxonomy=Tag1,Tag2,Tag3, where taxonomy is the taxonomy name and Tag1,Tag2,Tag3 are comma-separated tags. For POST requests, you can also specify tags as an array of strings.

Examples:

custom_tags.industry="Manufacturing, Supply Chain, Logistics"
"custom_tags.industry": ["Manufacturing", "Supply Chain", "Logistics"]

To learn more, see Custom tags.

Example:

["Tag1", "Tag2", "Tag3"]
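Because the taxonomy name is part of the parameter key, the key has to be assembled at request time. The sketch below shows one way to do that for a POST body, using the array form described above; `custom_tags_param` is a hypothetical helper, and "industry" is the taxonomy name from the examples.

```python
from typing import Sequence


def custom_tags_param(taxonomy: str, tags: Sequence[str]) -> dict:
    """Return a single-entry dict keyed as custom_tags.<taxonomy>."""
    if not taxonomy:
        raise ValueError("taxonomy name must not be empty")
    return {f"custom_tags.{taxonomy}": list(tags)}


body = {"q": "*"}
body.update(custom_tags_param("industry", ["Manufacturing", "Supply Chain", "Logistics"]))
```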

exclude_duplicates

boolean

If true, excludes duplicate and highly similar articles from the search results. If false, returns all relevant articles, including duplicates.

To learn more, see Articles deduplication.

Example:

true

Response

200

application/json

A successful response containing articles that match the specified search criteria. The response may include clustering information if enabled.

The response model for the search requests applies to the Search, Latest Headlines, Search by link, and Authors endpoints. Response field behavior:

Required fields are guaranteed to be present and non-null.
Optional fields may be null or undefined if the data point is not present or couldn't be extracted during processing.
To access article properties in the articles response array, use array index notation. For example, articles[n].title, where n is the zero-based index of the article object (0, 1, 2, etc.).
The nlp property within the article object articles[n].nlp is only available with NLP-enabled subscription plans.
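The field behavior above suggests reading required fields directly and guarding optional ones. A minimal sketch, assuming the response has already been decoded into Python dictionaries; `summarize` is a hypothetical helper and the sample articles are made-up data.

```python
def summarize(article: dict) -> dict:
    """Extract a few fields, tolerating missing or null optional values."""
    # nlp is optional and may be absent on plans without NLP.
    nlp = article.get("nlp") or {}
    return {
        "title": article["title"],  # required: guaranteed present and non-null
        "author": article.get("author") or "unknown",  # optional: may be null
        "summary": nlp.get("summary"),
    }


articles = [
    {"title": "First", "author": None},
    {"title": "Second", "author": "Jane Doe", "nlp": {"summary": "Short recap."}},
]
rows = [summarize(articles[n]) for n in range(len(articles))]
```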

status

string

required

The status of the response.

total_hits

integer

required

The total number of articles matching the search criteria.

page

integer

required

The current page number of the results.

total_pages

integer

required

The total number of pages available for the given search criteria.

page_size

integer

required

The number of articles per page.

articles

object[]

A list of articles matching the search criteria.

The data model representing a single article in the search results.

articles.title

string

required

The title of the article.

articles.link

string

required

The URL link to the article.

articles.domain_url

string

required

The domain URL of the article.

articles.full_domain_url

string

required

The full domain URL of the article.

articles.parent_url

string

required

The categorical URL of the article.

articles.rank

integer

required

The rank of the article's source.

articles.content

string

required

The content of the article.

articles.id

string

required

The unique identifier for the article.

articles.score

number

required

The relevance score of the article.

articles.author

string

The primary author of the article.

articles.authors

A list of authors of the article.

articles.journalists

A list of journalists associated with the article.

articles.published_date

string

The date the article was published.

articles.published_date_precision

string

The precision of the published date.

articles.updated_date

string

The date the article was last updated.

articles.updated_date_precision

string

The precision of the updated date.

articles.parse_date

string

The date the article was parsed.

articles.name_source

string

The name of the source where the article was published.

articles.is_headline

boolean

Indicates if the article is a headline.

articles.paid_content

boolean

Indicates if the article is paid content.

articles.country

string

The country where the article was published.

articles.rights

string

The rights information for the article.

articles.media

string

The media associated with the article.

articles.language

string

The language in which the article is written.

articles.description

string

A brief description of the article.

articles.word_count

integer

default:0

The word count of the article.

articles.is_opinion

boolean

Indicates if the article is an opinion piece.

articles.twitter_account

string

The Twitter account associated with the article.

articles.all_links

A list of all URLs mentioned in the article.

articles.all_domain_links

A list of all domain URLs mentioned in the article.

articles.nlp

object

Natural Language Processing data for the article.

articles.nlp.theme

string

The themes or categories identified in the article.

articles.nlp.summary

string

A brief AI-generated summary of the article content.

articles.nlp.sentiment

object

Sentiment scores for the article's title and content.

articles.nlp.new_embedding

number[]

A dense 1024-dimensional vector representation of the article content, generated using the multilingual-e5-large model.

Note: The new_embedding field is only available in the v3_local_news_nlp_embeddings subscription plan.

articles.nlp.ner_PER

object[]

Named Entity Recognition for person entities (individuals' names).

articles.nlp.ner_ORG

object[]

Named Entity Recognition for organization entities (company names, institutions).

articles.nlp.ner_MISC

object[]

Named Entity Recognition for miscellaneous entities (events, nationalities, products).

articles.nlp.ner_LOC

object[]

Named Entity Recognition for location entities (cities, countries, geographic features).

articles.nlp.iptc_tags_name

string[]

IPTC media topic taxonomy paths identified in the article content. Each path represents a hierarchical category following the IPTC standard.

Note: The iptc_tags_name field is only available in the v3_nlp_iptc_tags subscription plan.

articles.nlp.iptc_tags_id

string[]

IPTC media topic numeric codes identified in the article content. These codes correspond to the standardized IPTC media topic taxonomy.

Note: The iptc_tags_id field is only available in the v3_nlp_iptc_tags subscription plan.

articles.nlp.iab_tags_name

string[]

IAB content taxonomy paths identified in the article content. Each path represents a hierarchical category following the IAB content standard.

Note: The iab_tags_name field is only available in the v3_nlp_iptc_tags subscription plan.

articles.custom_tags

object

An object that contains custom tags associated with an article, where each key is a taxonomy name, and the value is an array of tags.

articles.additional_domain_info

object

Additional information about the domain of the article.

Example:

{
  "is_news_domain": true,
  "news_type": "News and Blogs",
  "news_domain_type": "Original Content"
}

user_input

object

The user input parameters for the request.

The response model when clustering is enabled, grouping similar articles into clusters. Applies to the Search and Latest headlines requests. Response field behavior:

Required fields are guaranteed to be present and non-null.
Optional fields may be null or undefined if the data point is not present or couldn't be extracted during processing.
To access article properties in the articles response array, use array index notation. For example, articles[n].title, where n is the zero-based index of the article object (0, 1, 2, etc.).
The nlp property within the article object articles[n].nlp is only available with NLP-enabled subscription plans.
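
The rules above suggest reading required fields directly and guarding every optional one. The following is a minimal sketch of that pattern on an already-parsed response. The helper name `article_summary` and the sample payload are hypothetical.

```python
def article_summary(response, n):
    """Pick a few fields from articles[n], tolerating missing optional data."""
    article = response["articles"][n]  # zero-based index
    # nlp is only present on NLP-enabled plans, so it may be missing or null.
    nlp = article.get("nlp") or {}
    return {
        "title": article["title"],          # required: safe to index directly
        "author": article.get("author"),    # optional: may be None
        "summary": nlp.get("summary"),      # optional and plan-dependent
    }


# Hypothetical payload: the second article has no author and no nlp object.
response = {
    "articles": [
        {"title": "First", "author": "Jane Doe", "nlp": {"summary": "Short."}},
        {"title": "Second", "author": None},
    ]
}

first = article_summary(response, 0)
second = article_summary(response, 1)
print(first, second)
```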

status

string

required

The status of the response.

total_hits

integer

required

The total number of articles matching the search criteria.

page

integer

required

The current page number of the results.

total_pages

integer

required

The total number of pages available for the given search criteria.

page_size

integer

required

The number of articles per page.
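
The page and total_pages fields are enough to decide whether another request is needed. This sketch assumes pages are numbered from 1, which this section does not state. The helper name `next_page` and the sample values (including the status string) are hypothetical.

```python
def next_page(response):
    """Return the next page number to request, or None on the last page."""
    page = response["page"]
    total_pages = response["total_pages"]
    # Assumption: page numbering starts at 1, so the last page equals total_pages.
    return page + 1 if page < total_pages else None


# Hypothetical response metadata for illustration.
first = {"status": "ok", "total_hits": 250, "page": 1, "total_pages": 3, "page_size": 100}
last = {"status": "ok", "total_hits": 250, "page": 3, "total_pages": 3, "page_size": 100}

print(next_page(first), next_page(last))
```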

clusters_count

integer

required

The number of clusters in the search results.

clusters

object[]

required

A list of clusters found in the search results.

The data model representing a single cluster of articles.

clusters.cluster_id

string

required

The unique identifier for the cluster.

clusters.cluster_size

integer

required

The number of articles in the cluster.

clusters.articles

object[]

required

A list of articles in the cluster.
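
In a clustered response, articles sit one level deeper, under each cluster. The sketch below walks a parsed response and collects article titles per cluster. The helper name `titles_by_cluster` and the sample payload are hypothetical, and only fields marked required are read.

```python
def titles_by_cluster(response):
    """Map each cluster_id to the titles of the articles it contains."""
    return {
        cluster["cluster_id"]: [article["title"] for article in cluster["articles"]]
        for cluster in response["clusters"]
    }


# Hypothetical clustered response trimmed to the fields used here.
response = {
    "clusters_count": 2,
    "clusters": [
        {
            "cluster_id": "c1",
            "cluster_size": 2,
            "articles": [{"title": "Solar output rises"}, {"title": "Solar hits record"}],
        },
        {
            "cluster_id": "c2",
            "cluster_size": 1,
            "articles": [{"title": "Wind farm approved"}],
        },
    ],
}

grouped = titles_by_cluster(response)
print(grouped)
```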

The data model representing a single article in the search results.

clusters.articles.title

string

required

The title of the article.

clusters.articles.link

string

required

The URL link to the article.

clusters.articles.domain_url

string

required

The domain URL of the article.

clusters.articles.full_domain_url

string

required

The full domain URL of the article.

clusters.articles.parent_url

string

required

The categorical URL of the article.

clusters.articles.rank

integer

required

The rank of the article's source.

clusters.articles.content

string

required

The content of the article.

clusters.articles.id

string

required

The unique identifier for the article.

clusters.articles.score

number

required

The relevance score of the article.

clusters.articles.author

string

The primary author of the article.

clusters.articles.authors

A list of authors of the article.

clusters.articles.journalists

A list of journalists associated with the article.

clusters.articles.published_date

string

The date the article was published.

clusters.articles.published_date_precision

string

The precision of the published date.

clusters.articles.updated_date

string

The date the article was last updated.

clusters.articles.updated_date_precision

string

The precision of the updated date.

clusters.articles.parse_date

string

The date the article was parsed.

clusters.articles.name_source

string

The name of the source where the article was published.

clusters.articles.is_headline

boolean

Indicates if the article is a headline.

clusters.articles.paid_content

boolean

Indicates if the article is paid content.

clusters.articles.country

string

The country where the article was published.

clusters.articles.rights

string

The rights information for the article.

clusters.articles.media

string

The media associated with the article.

clusters.articles.language

string

The language in which the article is written.

clusters.articles.description

string

A brief description of the article.

clusters.articles.word_count

integer

default:0

The word count of the article.

clusters.articles.is_opinion

boolean

Indicates if the article is an opinion piece.

clusters.articles.twitter_account

string

The Twitter account associated with the article.

clusters.articles.all_links

A list of all URLs mentioned in the article.

clusters.articles.all_domain_links

A list of all domain URLs mentioned in the article.

clusters.articles.nlp

object

Natural Language Processing data for the article.

clusters.articles.nlp.theme

string

The themes or categories identified in the article.

clusters.articles.nlp.summary

string

A brief AI-generated summary of the article content.

clusters.articles.nlp.sentiment

object

Sentiment scores for the article's title and content.

clusters.articles.nlp.new_embedding

number[]

A dense 1024-dimensional vector representation of the article content, generated using the multilingual-e5-large model.

Note: The new_embedding field is only available in the v3_local_news_nlp_embeddings subscription plan.

clusters.articles.nlp.ner_PER

object[]

Named Entity Recognition for person entities (individuals' names).

clusters.articles.nlp.ner_ORG

object[]

Named Entity Recognition for organization entities (company names, institutions).

clusters.articles.nlp.ner_MISC

object[]

Named Entity Recognition for miscellaneous entities (events, nationalities, products).

clusters.articles.nlp.ner_LOC

object[]

Named Entity Recognition for location entities (cities, countries, geographic features).

clusters.articles.nlp.iptc_tags_name

string[]

IPTC media topic taxonomy paths identified in the article content. Each path represents a hierarchical category following the IPTC standard.

Note: The iptc_tags_name field is only available in the v3_nlp_iptc_tags subscription plan.

clusters.articles.nlp.iptc_tags_id

string[]

IPTC media topic numeric codes identified in the article content. These codes correspond to the standardized IPTC media topic taxonomy.

Note: The iptc_tags_id field is only available in the v3_nlp_iptc_tags subscription plan.

clusters.articles.nlp.iab_tags_name

string[]

IAB content taxonomy paths identified in the article content. Each path represents a hierarchical category following the IAB content standard.

Note: The iab_tags_name field is only available in the v3_nlp_iptc_tags subscription plan.

clusters.articles.custom_tags

object

An object that contains custom tags associated with an article, where each key is a taxonomy name, and the value is an array of tags.

clusters.articles.additional_domain_info

object

Additional information about the domain of the article.

Example:

{
  "is_news_domain": true,
  "news_type": "News and Blogs",
  "news_domain_type": "Original Content"
}

user_input

object

The user input parameters for the request.
