Search articles
Searches for articles based on specified criteria such as keyword, language, country, source, and more.
Authorizations
API Key to authenticate requests.
To access the API, include your API key in the x-api-token header.
To obtain your API key, complete the form or contact us directly.
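As a minimal sketch of the authentication step, the snippet below builds (but does not send) a GET request that carries the key in the x-api-token header. The endpoint URL and the `q` parameter name are placeholders assumed for illustration; they are not taken from this page.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only for illustration; substitute the real one.
SEARCH_URL = "https://api.example.com/search"


def build_search_request(api_key: str, params: dict[str, str]) -> Request:
    """Build (but do not send) a GET request carrying the API key header."""
    url = f"{SEARCH_URL}?{urlencode(params)}"
    return Request(url, headers={"x-api-token": api_key}, method="GET")


# `q` is an assumed name for the keyword parameter.
request = build_search_request("YOUR_API_KEY", {"q": "technology"})
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.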
Query Parameters
The keyword(s) to search for in articles.
Query syntax supports logical operators (AND, OR, NOT) and wildcards:
- For an exact match, use double quotes. For example, "technology news".
- Use * to search for any keyword.
- Use + to include and - to exclude specific words or phrases. For example, +Apple, -Google.
- Use AND, OR, and NOT to refine search results. For example, technology AND (Apple OR Microsoft) NOT Google.
For more details, see Advanced querying.
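The operators above can be composed programmatically. The helpers below are hypothetical names introduced for illustration; they only assemble strings in the syntax described here and do not validate them against the API.

```python
def exact(phrase: str) -> str:
    """Wrap a phrase in double quotes for an exact match."""
    return f'"{phrase}"'


def any_of(*terms: str) -> str:
    """Join terms with OR and group them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(required: str, alternatives: list[str], excluded: str) -> str:
    """Compose: required AND (alt1 OR alt2 ...) NOT excluded."""
    return f"{required} AND {any_of(*alternatives)} NOT {excluded}"


print(build_query("technology", ["Apple", "Microsoft"], "Google"))
# technology AND (Apple OR Microsoft) NOT Google
print(exact("technology news"))
# "technology news"
```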
The article fields to search in. To search in multiple fields, use a comma-separated string.
Example: "title, summary"
Note: The summary option is available if NLP is enabled in your plan.
Available options: title, summary, content.
Predefined top sources per country.
Format: start with the word top, followed by the number of desired sources, and then the two-letter ISO 3166-1 alpha-2 country code.
Examples:
"top 100 US"
"top 33 AT"
"top 5 GB"
Multiple countries can be specified with custom numbers as a comma-separated string.
Examples:
"top 50 US, top 20 GB"
"top 33 AT, top 50 IT"
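A small formatter can keep these strings consistent. The sketch below uses hypothetical helper names and checks only the shape of the country code (two letters), not whether it is an assigned ISO 3166-1 code.

```python
import re


def top_sources(count: int, country: str) -> str:
    """Format one entry such as 'top 100 US'."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError("country must be a two-letter ISO 3166-1 alpha-2 code")
    return f"top {count} {country.upper()}"


def predefined_sources(entries: list[tuple[int, str]]) -> str:
    """Join several (count, country) pairs into one comma-separated string."""
    return ", ".join(top_sources(count, country) for count, country in entries)


print(predefined_sources([(50, "US"), (20, "GB")]))
# top 50 US, top 20 GB
```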
One or more news sources to narrow down the search. The format must be a domain URL.
Subdomains, such as finance.yahoo.com, are also acceptable.
To specify multiple sources, use a comma-separated string.
Examples:
"nytimes.com"
"theguardian.com, finance.yahoo.com"
The news sources to exclude from the search. To exclude multiple sources, use a comma-separated string.
Example: "cnn.com, wsj.com"
The language(s) of the search. The only accepted format is the two-letter ISO 639-1 code. To select multiple languages, use a comma-separated string.
Example: "en, es"
To learn more, see Enumerated parameters > Language.
The language(s) to exclude from the search. The accepted format is the two-letter ISO 639-1 code. To exclude multiple languages, use a comma-separated string.
Example: "fr, de"
To learn more, see Enumerated parameters > Language.
The countries where the news publisher is located. The accepted format is the two-letter ISO 3166-1 alpha-2 code. To select multiple countries, use a comma-separated string.
Example: "US, CA"
To learn more, see Enumerated parameters > Country.
The publisher location countries to exclude from the search. The accepted format is the two-letter ISO 3166-1 alpha-2 code. To exclude multiple countries, use a comma-separated string.
Example: "US, CA"
To learn more, see Enumerated parameters > Country.
The list of author names to exclude from your search. To exclude multiple authors, use a comma-separated string.
Example: "John Doe, Jane Doe"
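Several of the filters above take a comma-separated string. The following sketch shows one way to build such strings from Python lists; the helper names are hypothetical, and the code checks only that language and country codes are two letters long, not that they are assigned ISO codes.

```python
import re

_TWO_LETTERS = re.compile(r"[A-Za-z]{2}")


def comma_separated(values: list[str]) -> str:
    """Join non-empty values into one 'a, b, c' string."""
    return ", ".join(v.strip() for v in values if v.strip())


def _check_codes(codes: list[str]) -> None:
    """Raise if any code is not exactly two letters."""
    for code in codes:
        if not _TWO_LETTERS.fullmatch(code.strip()):
            raise ValueError(f"not a two-letter code: {code!r}")


def language_codes(codes: list[str]) -> str:
    """Validate two-letter language codes and join them in lower case."""
    _check_codes(codes)
    return comma_separated([c.lower() for c in codes])


def country_codes(codes: list[str]) -> str:
    """Validate two-letter country codes and join them in upper case."""
    _check_codes(codes)
    return comma_separated([c.upper() for c in codes])


print(language_codes(["en", "ES"]))              # en, es
print(country_codes(["us", "CA"]))               # US, CA
print(comma_separated(["John Doe", "Jane Doe"]))  # John Doe, Jane Doe
```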
The starting point in time to search from. Accepts date-time strings in ISO 8601 format and plain text strings. The default time zone is UTC.
Formats with examples:
- YYYY-mm-ddTHH:MM:SS: 2024-07-01T00:00:00
- YYYY-mm-dd: 2024-07-01
- YYYY/mm/dd HH:MM:SS: 2024/07/01 00:00:00
- YYYY/mm/dd: 2024/07/01
- English phrases: 1 day ago, today
Note: By default, this filter is applied to the publication date of the article. To use the article's parse date instead, set the by_parse_date parameter to true.
The ending point in time to search up to. Accepts date-time strings in ISO 8601 format and plain text strings. The default time zone is UTC.
Formats with examples:
- YYYY-mm-ddTHH:MM:SS: 2024-07-01T00:00:00
- YYYY-mm-dd: 2024-07-01
- YYYY/mm/dd HH:MM:SS: 2024/07/01 00:00:00
- YYYY/mm/dd: 2024/07/01
- English phrases: 1 day ago, today
Note: By default, this filter is applied to the publication date of the article. To use the article's parse date instead, set the by_parse_date parameter to true.
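Because the default time zone is UTC, it can help to normalize timestamps before sending them. The sketch below renders a datetime in the YYYY-mm-ddTHH:MM:SS form and builds a from_/to_ pair; the helper names are hypothetical, and treating naive datetimes as UTC is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone


def to_api_timestamp(moment: datetime) -> str:
    """Render a datetime as YYYY-mm-ddTHH:MM:SS in UTC."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def date_window(end: datetime, days: int) -> dict[str, str]:
    """Build from_/to_ values covering the `days` days before `end`."""
    return {
        "from_": to_api_timestamp(end - timedelta(days=days)),
        "to_": to_api_timestamp(end),
    }


end = datetime(2024, 7, 8, tzinfo=timezone.utc)
print(date_window(end, 7))
# {'from_': '2024-07-01T00:00:00', 'to_': '2024-07-08T00:00:00'}
```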
The precision of the published date. There are three types:
- full: The day and time of an article are correctly identified with the appropriate timezone.
- timezone unknown: The day and time of an article are correctly identified without timezone.
- date: Only the day is identified without an exact time.
Available options: full, timezone unknown, date.
If true, the from_ and to_ parameters use article parse dates instead of published dates.
Additionally, the parse_date variable is added to the output list for each article.
The sorting order of the results. Possible values are:
- relevancy: The most relevant results first.
- date: The most recently published results first.
- rank: The results from the highest-ranked sources first.
Available options: relevancy, date, rank.
If true, limits the search to sources ranked in the top 1 million online websites. If false, includes unranked sources which are assigned a rank of 999999.
The lowest boundary of the rank of a news website to filter by.
Range: 1 to 999999, where a lower rank indicates a more popular source.
If you set this to 100, the API includes sources ranked 100 or higher.
The highest boundary of the rank of a news website to filter by.
Range: 1 to 999999, where a lower rank indicates a more popular source.
If you set this to 100, the API includes sources ranked 100 or lower.
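The two rank boundaries together define a window. A hedged sketch of client-side validation follows; the parameter names `from_rank` and `to_rank` are assumed (this page does not name them), and the bounds are treated as inclusive based on the stated range of 1 to 999999.

```python
MIN_RANK = 1
MAX_RANK = 999_999


def rank_window(lowest: int, highest: int) -> dict[str, int]:
    """Validate a rank window and return it as request parameters.

    The keys `from_rank` and `to_rank` are assumed names for illustration.
    """
    for value in (lowest, highest):
        if not MIN_RANK <= value <= MAX_RANK:
            raise ValueError(f"rank {value} is outside {MIN_RANK}..{MAX_RANK}")
    if lowest > highest:
        raise ValueError("lowest boundary must not exceed highest boundary")
    return {"from_rank": lowest, "to_rank": highest}


print(rank_window(1, 100))
# {'from_rank': 1, 'to_rank': 100}
```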
If true, only returns articles that were posted on the home page of a given news domain.
If true, returns only opinion pieces. If false, excludes opinion-based articles and returns news only.
If false, returns only articles that have publicly available complete content. Some publishers partially block content, so this setting ensures that only full articles are retrieved.
The categorical URL(s) to filter your search. To filter your search by multiple categorical URLs, use a comma-separated string.
Example: "wsj.com/politics, wsj.com/tech"
The complete URL(s) mentioned in the article. For multiple URLs, use a comma-separated string.
Example: "https://aiindex.stanford.edu/report, https://www.stateof.ai"
For more details, see Search by URL.
The domain(s) mentioned in the article. For multiple domains, use a comma-separated string.
Example: "who.int, nih.gov"
For more details, see Search by URL.
If true, includes additional domain information in the response for each article:
- is_news_domain: Boolean indicating if the source is a news domain.
- news_domain_type: Type of news domain (e.g., "Original Content").
- news_type: Category of news (e.g., "News and Blogs").
If true, filters results to include only news domains.
Filters results based on the news domain type. Possible values are:
- Original Content: Sources that produce their own content.
- Aggregator: Sources that collect content from various other sources.
- Press Releases: Sources primarily publishing press releases.
- Republisher: Sources that republish content from other sources.
- Other: Sources that don't fit into main categories.
Available options: Original Content, Aggregator, Press Releases, Republisher, Other.
Filters results based on the news type. Multiple types can be specified using a comma-separated string.
Example: "General News Outlets,Tech News and Updates"
For a complete list of available news types, see Enumerated parameters > News type.
The minimum number of words an article must contain. Use it to exclude articles with very little content.
x > 0
The maximum number of words an article can contain. Use it to exclude very long articles.
x > 0
The page number to retrieve. Use this parameter to paginate through the results, because a single API response cannot return more than 1000 articles.
x > 1
The number of articles to return per page.
Range: 1 to 1000.
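Pagination can be sketched as a loop that requests successive pages until the last one. In the example below, the response keys `articles` and `total_pages` are assumed names, and `fake_fetch` is a stand-in for a real API call so that the sketch runs offline.

```python
from typing import Callable, Iterator

# A fetcher takes (page, page_size) and returns one decoded response page.
Fetcher = Callable[[int, int], dict]


def iter_articles(fetch: Fetcher, page_size: int = 100) -> Iterator[dict]:
    """Yield every article by requesting pages until the last one."""
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")
    page = 1
    while True:
        response = fetch(page, page_size)
        yield from response.get("articles", [])
        # `total_pages` is an assumed field name for the total page count.
        if page >= response.get("total_pages", 1):
            break
        page += 1


def fake_fetch(page: int, page_size: int) -> dict:
    """Stand-in for a real API call: serves 5 articles in pages."""
    data = [{"id": i} for i in range(5)]
    start = (page - 1) * page_size
    total_pages = -(-len(data) // page_size)  # ceiling division
    return {"articles": data[start:start + page_size], "total_pages": total_pages}


print([a["id"] for a in iter_articles(fake_fetch, page_size=2)])
# [0, 1, 2, 3, 4]
```

Injecting the fetcher keeps the loop independent of any particular HTTP client.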
Determines whether to group similar articles into clusters. If true, the API returns clustered results.
To learn more, see Clustering news articles.
Specifies which part of the article to use for determining similarity when clustering.
Possible values are:
- content: Uses the full article content (default).
- title: Uses only the article title.
- summary: Uses the article summary.
To learn more, see Clustering news articles.
Available options: content, title, summary.
Sets the similarity threshold for grouping articles into clusters.
Range: Greater than 0 to 1.0.
A lower value creates more inclusive clusters, while a higher value requires greater similarity between articles.
Examples:
- 0.3: Results in larger, more diverse clusters.
- 0.6: Balances cluster size and article similarity (default).
- 0.9: Creates smaller, tightly related clusters.
To learn more, see Clustering news articles.
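The three clustering settings can be validated together before a request is made. This is a sketch under stated assumptions: the keys `clustering_enabled`, `clustering_variable`, and `clustering_threshold` are assumed names that this page does not confirm, and the upper bound of 1.0 is treated as inclusive.

```python
CLUSTER_FIELDS = ("content", "title", "summary")


def clustering_params(field: str = "content", threshold: float = 0.6) -> dict:
    """Return clustering settings after basic validation.

    All three keys in the result are assumed parameter names.
    """
    if field not in CLUSTER_FIELDS:
        raise ValueError(f"field must be one of {CLUSTER_FIELDS}")
    if not 0 < threshold <= 1.0:
        raise ValueError("threshold must be greater than 0 and at most 1.0")
    return {
        "clustering_enabled": True,
        "clustering_variable": field,
        "clustering_threshold": threshold,
    }


print(clustering_params("title", 0.9))
```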
If true, includes an NLP layer with each article in the response. This layer provides enhanced information such as theme classification, article summary, sentiment analysis, tags, and named entity recognition.
The NLP layer includes:
- Theme: General topic of the article.
- Summary: A concise overview of the article content.
- Sentiment: Separate scores for title and content (range: -1 to 1).
- Named entities: Identified persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).
- IPTC tags: Standardized news category tags.
- IAB tags: Content categories for digital advertising.
Note: The include_nlp_data parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
If true, filters the results to include only articles with an NLP layer. This allows you to focus on articles that have been processed with advanced NLP techniques.
Note: The has_nlp parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
Filters articles based on their general topic, as determined by NLP analysis. To select multiple themes, use a comma-separated string.
Example: "Finance, Tech"
Note: The theme parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
Available options: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Financial Crime, Lifestyle, Automotive, Travel, Weather, General.
Inverse of the theme parameter. Excludes articles based on their general topic, as determined by NLP analysis.
To exclude multiple themes, use a comma-separated string.
Example: "Crime, Tech"
Note: The not_theme parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
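Since theme and not_theme accept only the listed options, a client can check values before sending them. The helper below is a hypothetical sketch; the set of themes is copied from the available options listed for the theme parameter.

```python
THEMES = frozenset({
    "Business", "Economics", "Entertainment", "Finance", "Health", "Politics",
    "Science", "Sports", "Tech", "Crime", "Financial Crime", "Lifestyle",
    "Automotive", "Travel", "Weather", "General",
})


def theme_filter(include: list[str] = (), exclude: list[str] = ()) -> dict[str, str]:
    """Build theme / not_theme values, rejecting unknown theme names."""
    unknown = (set(include) | set(exclude)) - THEMES
    if unknown:
        raise ValueError(f"unknown themes: {sorted(unknown)}")
    params = {}
    if include:
        params["theme"] = ", ".join(include)
    if exclude:
        params["not_theme"] = ", ".join(exclude)
    return params


print(theme_filter(["Finance", "Tech"], ["Crime"]))
# {'theme': 'Finance, Tech', 'not_theme': 'Crime'}
```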
Filters articles that mention specific organization names, as identified by NLP analysis. To specify multiple organizations, use a comma-separated string.
Example: "Apple, Microsoft"
Note: The ORG_entity_name parameter is only available if NLP is included in your subscription plan.
To learn more, see Search by entity.
Filters articles that mention specific person names, as identified by NLP analysis. To specify multiple names, use a comma-separated string.
Example: "Elon Musk, Jeff Bezos"
Note: The PER_entity_name parameter is only available if NLP is included in your subscription plan.
To learn more, see Search by entity.
Filters articles that mention specific location names, as identified by NLP analysis. To specify multiple locations, use a comma-separated string.
Example: "California, New York"
Note: The LOC_entity_name parameter is only available if NLP is included in your subscription plan.
To learn more, see Search by entity.
Filters articles that mention other named entities not falling under person, organization, or location categories. Includes events, nationalities, products, works of art, and more. To specify multiple entities, use a comma-separated string.
Example: "Bitcoin, Blockchain"
Note: The MISC_entity_name parameter is only available if NLP is included in your subscription plan.
To learn more, see Search by entity.
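The four entity filters follow the same pattern, so they can be built from one mapping. In this sketch the short keys (ORG, PER, LOC, MISC) and the helper name are illustrative choices; the parameter names are the ones mentioned in the notes above.

```python
ENTITY_PARAMS = {
    "ORG": "ORG_entity_name",
    "PER": "PER_entity_name",
    "LOC": "LOC_entity_name",
    "MISC": "MISC_entity_name",
}


def entity_filters(entities: dict[str, list[str]]) -> dict[str, str]:
    """Map entity kinds to their parameters with comma-separated values."""
    params = {}
    for kind, names in entities.items():
        if kind not in ENTITY_PARAMS:
            raise ValueError(f"unknown entity kind: {kind!r}")
        if names:  # skip kinds with no names
            params[ENTITY_PARAMS[kind]] = ", ".join(names)
    return params


print(entity_filters({"ORG": ["Apple", "Microsoft"], "PER": ["Elon Musk"]}))
# {'ORG_entity_name': 'Apple, Microsoft', 'PER_entity_name': 'Elon Musk'}
```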
Filters articles based on the minimum sentiment score of their titles.
Range is -1.0 to 1.0, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.
Note: The title_sentiment_min parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
Filters articles based on the maximum sentiment score of their titles.
Range is -1.0 to 1.0, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.
Note: The title_sentiment_max parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
Filters articles based on the minimum sentiment score of their content.
Range is -1.0 to 1.0, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.
Note: The content_sentiment_min parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
Filters articles based on the maximum sentiment score of their content.
Range is -1.0 to 1.0, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.
Note: The content_sentiment_max parameter is only available if NLP is included in your subscription plan.
To learn more, see NLP features.
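The minimum and maximum sentiment parameters are naturally used as a pair. The hypothetical helper below builds such a pair for either the title or the content, assuming the -1.0 to 1.0 range is inclusive at both ends.

```python
def sentiment_window(target: str, low: float, high: float) -> dict[str, float]:
    """Build <target>_sentiment_min / <target>_sentiment_max values."""
    if target not in ("title", "content"):
        raise ValueError("target must be 'title' or 'content'")
    for value in (low, high):
        # Assumption: both ends of the range are allowed.
        if not -1.0 <= value <= 1.0:
            raise ValueError(f"sentiment {value} is outside -1.0..1.0")
    if low > high:
        raise ValueError("minimum must not exceed maximum")
    return {f"{target}_sentiment_min": low, f"{target}_sentiment_max": high}


print(sentiment_window("title", 0.2, 1.0))
# {'title_sentiment_min': 0.2, 'title_sentiment_max': 1.0}
```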
Filters articles based on IPTC (International Press Telecommunications Council) media topic tags. To specify multiple IPTC tags, use a comma-separated string of tag IDs.
Example: "20000199, 20000209"
Note: The iptc_tags parameter is only available if tags are included in your subscription plan.
To learn more, see IPTC Media Topic NewsCodes.
Inverse of the iptc_tags parameter. Excludes articles based on IPTC (International Press Telecommunications Council) media topic tags.
To specify multiple IPTC tags to exclude, use a comma-separated string of tag IDs.
Example: "20000205, 20000209"
Note: The not_iptc_tags parameter is only available if tags are included in your subscription plan.
To learn more, see IPTC Media Topic NewsCodes.
Specifies terms to search within the source names. To specify multiple terms, use a comma-separated string.
Example: "sport, tech"
Note: The search does not require an exact match and returns all sources that include the specified terms anywhere in their names. You can use any word, phrase, or outlet name, such as "sport" or "new york times". For example, using "sport" as a term returns sources like "Motorsport", "Dot Esport", and "Tuttosport".
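The matching behavior described in the note resembles a substring test. The sketch below approximates it on the client side; it is an illustration only, and the case-insensitive comparison is an assumption rather than documented API behavior.

```python
def matches_source_name(name: str, terms: list[str]) -> bool:
    """Return True if any term occurs anywhere in the source name."""
    lowered = name.lower()
    # Assumption: matching ignores letter case.
    return any(term.lower() in lowered for term in terms)


sources = ["Motorsport", "Dot Esport", "Tuttosport", "The Guardian"]
print([s for s in sources if matches_source_name(s, ["sport"])])
# ['Motorsport', 'Dot Esport', 'Tuttosport']
```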
Filters articles based on IAB (Interactive Advertising Bureau) content categories. These tags provide a standardized taxonomy for digital advertising content categorization. To specify multiple IAB categories, use a comma-separated string.
Example: "Business, Events"
Note: The iab_tags parameter is only available if tags are included in your subscription plan.
To learn more, see the IAB Content taxonomy.
Inverse of the iab_tags parameter. Excludes articles based on IAB (Interactive Advertising Bureau) content categories.
These tags provide a standardized taxonomy for digital advertising content categorization.
To specify multiple IAB categories to exclude, use a comma-separated string.
Example: "Agriculture, Metals"
Note: The not_iab_tags parameter is only available if tags are included in your subscription plan.
To learn more, see the IAB Content taxonomy.
If true, excludes duplicate and highly similar articles from the search results. If false, returns all relevant articles, including duplicates.
To learn more, see Articles deduplication.
Response
The response model for a search request.
The total number of articles matching the search criteria.
The current page number of the results.
The total number of pages available for the given search criteria.
The number of articles per page.
A list of articles matching the search criteria.
The user input parameters for the search.
The status of the response.
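The response fields listed above could be modeled with a small data class. Every field name in this sketch (`status`, `total_hits`, `page`, `total_pages`, `page_size`, `articles`, `user_input`) is an assumption, since this page describes the fields without naming them.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResponse:
    """Client-side view of a search response; all field names are assumed."""
    status: str
    total_hits: int
    page: int
    total_pages: int
    page_size: int
    articles: list[dict] = field(default_factory=list)
    user_input: dict = field(default_factory=dict)

    @classmethod
    def from_json(cls, payload: dict) -> "SearchResponse":
        """Build the model from a decoded JSON object."""
        return cls(
            status=payload["status"],
            total_hits=payload["total_hits"],
            page=payload["page"],
            total_pages=payload["total_pages"],
            page_size=payload["page_size"],
            articles=payload.get("articles", []),
            user_input=payload.get("user_input", {}),
        )

    @property
    def has_next_page(self) -> bool:
        """True while more pages remain after the current one."""
        return self.page < self.total_pages


sample = {
    "status": "ok", "total_hits": 3, "page": 1, "total_pages": 2,
    "page_size": 2, "articles": [{"title": "a"}, {"title": "b"}],
    "user_input": {"q": "x"},
}
response = SearchResponse.from_json(sample)
print(response.has_next_page, len(response.articles))
```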