Explore key changes and prepare for migration.
Feature | v2 | v3 |
---|---|---|
Base URL | api.newscatcherapi.com/v2 | v3-api.newscatcherapi.com/api |
Authentication header | x-api-key | x-api-token |
Maximum articles/request | 100 | 1,000 |
Historical data | Since 2019 | Since 2019* |
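As a rough illustration of the base URL and authentication header change, the sketch below builds (but does not send) the URL and headers for either version. The helper name `build_request` and the `https://` scheme are assumptions made for this example; the table above lists only the host, path, and header names.

```python
from urllib.parse import urlencode

# Hosts, paths, and header names come from the comparison table above.
# The https:// scheme is an assumption; the table does not state it.
API_VERSIONS = {
    "v2": {"base_url": "https://api.newscatcherapi.com/v2", "auth_header": "x-api-key"},
    "v3": {"base_url": "https://v3-api.newscatcherapi.com/api", "auth_header": "x-api-token"},
}


def build_request(version: str, endpoint: str, api_key: str, params: dict) -> dict:
    """Return the URL and headers for a GET request without sending it."""
    config = API_VERSIONS[version]
    url = f"{config['base_url']}/{endpoint.lstrip('/')}"
    query = urlencode(params)
    if query:
        url = f"{url}?{query}"
    return {"url": url, "headers": {config["auth_header"]: api_key}}


request = build_request("v3", "/search", "YOUR_API_KEY", {"q": "bitcoin"})
print(request["url"])
print(request["headers"])
```

Keeping both configurations in one table makes it possible to switch a client between versions during migration by changing a single argument.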
v2 | v3 | Change |
---|---|---|
/search | /search | Enhanced with additional filtering capabilities, NLP features, clustering, and deduplication |
/latest_headlines | /latest_headlines | Enhanced with additional filtering capabilities, NLP features, clustering, and deduplication |
/sources | /sources | Enhanced with additional filtering capabilities |
Not available | /authors | New: search by author name |
Not available | /search_by_link | New: search by URL or article ID |
Not available | /search_similar | New: find similar articles |
Not available | /aggregation_count | New: get aggregation count by interval |
Not available | /subscription | New: view subscription info |
- `GET` and `POST` methods for all endpoints.
- Multi-value parameter format (in v2, `search_in` uses underscore-separated strings) regardless of the method:
  - `GET`: Supports a comma-separated string.
  - `POST`: Supports both a comma-separated string and an array of strings.
- Renamed parameters:
  - `from` → `from_`
  - `to` → `to_`
  - `topic` → `theme`
- The `topic` parameter is removed from the `/sources` endpoint. Instead, you have new filtering capabilities. To learn more, see Retrieve sources.

Query parameter (`q`)

Aspect | v2 | v3 | Change |
---|---|---|---|
Required | Yes | Yes | No change |
Query operators | ✓ Exact match with quotes "keyword" ✓ Boolean: AND, OR, NOT ✓ Wildcards: * and ? ✓ Must/Must not: + and - ✓ Grouping with () | Same as v2 plus: ✓ NEAR operator ✓ COUNT operator | Enhanced operators |
Default behavior | Space-separated tokens treated as AND | Same as v2 | No change |
Search-in parameter (`search_in`)

Aspect | v2 | v3 | Change |
---|---|---|---|
Field name for article title | title | title | No change |
Field name for article content | summary | content | Renamed to reflect actual content |
Default value | "title_summary" | "title,content" | Functionally equivalent |
LLM-generated summary | Not available | summary (requires NLP) | New feature |
Multiple values | Underscore-separated string, e.g. "title_summary" | GET: Comma-separated string, e.g. "title,content"; POST: Comma-separated string or array | Format standardization |
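The `search_in` change combines a field rename (`summary` to `content`) with a separator change (underscore to comma). A minimal sketch of that conversion follows; the function name `convert_search_in` is hypothetical, and the mapping covers only the two v2 field names shown in the table.

```python
# Field-name mapping from the table above: v2 "summary" corresponds to v3 "content".
V2_TO_V3_SEARCH_FIELDS = {"title": "title", "summary": "content"}


def convert_search_in(v2_value: str) -> str:
    """Convert a v2 underscore-separated search_in value to the v3 comma-separated form."""
    fields = v2_value.split("_")
    unknown = [field for field in fields if field not in V2_TO_V3_SEARCH_FIELDS]
    if unknown:
        raise ValueError(f"Unrecognized v2 search_in field(s): {unknown}")
    return ",".join(V2_TO_V3_SEARCH_FIELDS[field] for field in fields)


print(convert_search_in("title_summary"))
```

For a `POST` body, the same result can be sent as an array by splitting the returned string on commas.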
Topic parameter (`topic` → `theme`)

Aspect | v2 | v3 | Change |
---|---|---|---|
Parameter name | topic | theme | Renamed |
Case format | lowercase, e.g. "tech" | Capitalized, e.g. "Tech" | Updated format |
Available categories | 15 lowercase categories | 17 capitalized categories | Expanded |
New categories in v3 | - | "Health", "Crime", "Financial Crime", "Lifestyle", "Automotive", "Weather", "General" | Added |
Removed v2 categories | "beauty", "music", "food", "gaming" | Consolidated into new categories | Category restructuring |
Multiple values | Comma-separated string | GET: Comma-separated string; POST: Comma-separated string or array | Enhanced POST format |
Exclusion option | Not available | not_theme parameter | New feature |
NLP dependency | No | Yes | New requirement |
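The renames described above (`from` to `from_`, `to` to `to_`, `topic` to `theme`) can be applied mechanically to an existing v2 parameter dictionary. The sketch below is one way to do that, under stated assumptions: the helper name `migrate_params` is hypothetical, and title-casing a v2 topic is assumed to produce the matching v3 theme, which should be checked against the v3 theme list for each value.

```python
# Renames listed in this guide.
RENAMED_PARAMS = {"from": "from_", "to": "to_", "topic": "theme"}

# v2 topics listed above as removed; they have no one-to-one v3 theme.
REMOVED_TOPICS = {"beauty", "music", "food", "gaming"}


def migrate_params(v2_params: dict) -> dict:
    """Return a copy of v2 query parameters using v3 names and theme casing."""
    v3_params = {}
    for name, value in v2_params.items():
        if name == "topic":
            topics = [topic.strip() for topic in str(value).split(",")]
            removed = sorted(REMOVED_TOPICS.intersection(topics))
            if removed:
                raise ValueError(f"No direct v3 theme for removed topic(s): {removed}")
            # Assumption: capitalizing each word is the only change needed.
            value = ",".join(topic.title() for topic in topics)
        v3_params[RENAMED_PARAMS.get(name, name)] = value
    return v3_params


print(migrate_params({"q": "ai", "from": "2024-01-01", "to": "2024-02-01", "topic": "tech"}))
```

Because `theme` depends on NLP in v3, a migrated request that includes it also needs a plan with NLP access.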
Available starting from the `v3_basic` plan.
Parameter | Type | Description |
---|---|---|
is_headline | boolean | Filters for articles that were posted on the home page of a given news domain |
is_opinion | boolean | Filters for opinion pieces when true, or excludes opinion-based articles when false |
is_paid_content | boolean | Filters out articles with paywalled content when false |
word_count_min | integer | Filters articles based on minimum word count |
word_count_max | integer | Filters articles based on maximum word count |
Parameter | Type | Description |
---|---|---|
parent_url | string | Filters articles by categorical URLs (e.g., “wsj.com/politics”) |
all_links | string | Filters articles by mentioned URLs within their content |
all_domain_links | string | Filters articles by mentioned domain names within their content |
Parameter | Type | Description |
---|---|---|
not_author_name | string | Excludes articles by specified authors |
Parameter | Type | Description |
---|---|---|
by_parse_date | boolean | Uses parse dates instead of published dates for date filtering |
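To show how the new filters in the tables above might be combined in a `POST` body, here is a small sketch that checks names and types before building the payload. The helper `build_search_body` is hypothetical, and the type checks simply mirror the Type column; the API's own validation may differ.

```python
import json

# Parameter names and types copied from the tables above.
ALLOWED_FILTERS = {
    "is_headline": bool,
    "is_opinion": bool,
    "is_paid_content": bool,
    "word_count_min": int,
    "word_count_max": int,
    "parent_url": str,
    "all_links": str,
    "all_domain_links": str,
    "not_author_name": str,
    "by_parse_date": bool,
}


def build_search_body(query: str, **filters) -> dict:
    """Build a JSON-serializable search body from the new v3 filters."""
    body = {"q": query}
    for name, value in filters.items():
        if name not in ALLOWED_FILTERS:
            raise KeyError(f"Not a filter from the tables above: {name}")
        expected = ALLOWED_FILTERS[name]
        # bool is a subclass of int in Python, so reject True/False for integer fields.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{name} expects {expected.__name__}, got bool")
        if not isinstance(value, expected):
            raise TypeError(f"{name} expects {expected.__name__}")
        body[name] = value
    low, high = body.get("word_count_min"), body.get("word_count_max")
    if low is not None and high is not None and low > high:
        raise ValueError("word_count_min must not exceed word_count_max")
    return body


print(json.dumps(build_search_body("election", is_opinion=False, word_count_min=300)))
```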
Parameter | Type | Description |
---|---|---|
predefined_sources | string | Filters by predefined top sources per country (e.g., “top 100 US”) |
additional_domain_info | boolean | Includes extra metadata about the source domain |
is_news_domain | boolean | Filters for news domain sources only |
news_domain_type | string | Filters by domain type (Original Content, Aggregator, etc.) |
news_type | string | Filters by news type categories |
Requires the `v3_nlp` plan or higher.
Parameter | Type | Description |
---|---|---|
include_nlp_data | boolean | Includes an NLP object for each article in the response |
has_nlp | boolean | Filters for articles that have NLP analysis available |
theme | string | Replaces topic parameter with expanded categories and NLP integration |
not_theme | string | Excludes articles with specified themes |
ORG_entity_name | string | Filters articles mentioning specific organization names |
PER_entity_name | string | Filters articles mentioning specific person names |
LOC_entity_name | string | Filters articles mentioning specific location names |
MISC_entity_name | string | Filters articles mentioning other named entities |
title_sentiment_min | float | Filters articles by minimum title sentiment score (-1 to 1) |
title_sentiment_max | float | Filters articles by maximum title sentiment score (-1 to 1) |
content_sentiment_min | float | Filters articles by minimum content sentiment score (-1 to 1) |
content_sentiment_max | float | Filters articles by maximum content sentiment score (-1 to 1) |
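The sentiment filters take a minimum and a maximum in the range -1 to 1. The following sketch builds a matching pair of parameters and rejects out-of-range values before a request is made; `sentiment_range` is a hypothetical helper name, not part of the API.

```python
def sentiment_range(field: str, minimum: float, maximum: float) -> dict:
    """Return the min/max sentiment parameters for "title" or "content"."""
    if field not in ("title", "content"):
        raise ValueError("field must be 'title' or 'content'")
    for bound in (minimum, maximum):
        if not -1.0 <= bound <= 1.0:
            raise ValueError("sentiment bounds must be between -1 and 1")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return {f"{field}_sentiment_min": minimum, f"{field}_sentiment_max": maximum}


# Example: articles whose titles score clearly negative, with NLP data included.
body = {"q": "layoffs", "include_nlp_data": True, **sentiment_range("title", -1.0, -0.3)}
print(body)
```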
Requires the `v3_nlp` plan or higher.
Parameter | Type | Description |
---|---|---|
clustering_enabled | boolean | Enables grouping of similar articles into clusters |
clustering_variable | string | Specifies which part of the article to use for clustering (“content”, “title”, or “summary”) |
clustering_threshold | float | Sets similarity threshold for clustering (range: 0-1) |
exclude_duplicates | boolean | Removes duplicate and highly similar articles from results |
Requires the `v3_nlp_iptc_tags` subscription plan.
Parameter | Type | Description |
---|---|---|
iptc_tags | string | Filters articles by IPTC media topic tags |
not_iptc_tags | string | Excludes articles with specific IPTC media topic tags |
iab_tags | string | Filters articles by IAB content categories |
not_iab_tags | string | Excludes articles with specific IAB content categories |
v2 | v3 | Type | Description |
---|---|---|---|
clean_url | domain_url | string | Base domain of the source |
excerpt | description | string | Brief article description |
summary | content | string | Full article content |
_score | score | number | Relevancy score |
_id | id | string | Unique article identifier |
topic | theme | string | Available in v3 with NLP enabled |
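Code written against v2 responses often reads fields such as `clean_url` or `_id` directly. One possible bridge during migration is a small adapter that renames v3 article fields back to their v2 names, sketched below. The function `to_v2_shape` is hypothetical, and `topic` is deliberately left out because the v3 `theme` value is delivered with the NLP data rather than as a plain top-level string.

```python
# Top-level renames from the table above (v2 name -> v3 name).
# "topic" -> "theme" is omitted: in v3 the theme comes with the NLP data.
V2_TO_V3_FIELDS = {
    "clean_url": "domain_url",
    "excerpt": "description",
    "summary": "content",
    "_score": "score",
    "_id": "id",
}
V3_TO_V2_FIELDS = {v3: v2 for v2, v3 in V2_TO_V3_FIELDS.items()}


def to_v2_shape(v3_article: dict) -> dict:
    """Rename v3 article fields to their v2 names; other fields pass through unchanged."""
    return {V3_TO_V2_FIELDS.get(name, name): value for name, value in v3_article.items()}


print(to_v2_shape({"id": "abc", "score": 1.5, "domain_url": "example.com", "title": "T"}))
```

An adapter like this is a stopgap; updating the consuming code to the v3 names avoids carrying the mapping forward.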
Field | Type | Description |
---|---|---|
full_domain_url | string | Complete domain with subdomain |
name_source | string | Publisher name |
is_headline | boolean | Homepage article indicator |
paid_content | boolean | Paywall indicator |
parent_url | string | Category/section URL |
journalists | array | Array of journalist names |
word_count | integer | Article length |
updated_date | string | Last update timestamp |
updated_date_precision | string | Update time precision |
all_links | array | URLs mentioned in article |
all_domain_links | array | Domains mentioned in article |
NLP fields (`v3_nlp` plan or higher) are returned when `include_nlp_data=true`.
Field | Type | Description |
---|---|---|
nlp | object | Natural Language Processing analysis results for the article content. |
Field | Type | Description |
---|---|---|
nlp.summary | string | AI-generated concise summary of article content |
nlp.theme | array[string] | High-level thematic categories from fixed set: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Financial Crime, Lifestyle, Automotive, Travel, Weather, General |
Field | Type | Description |
---|---|---|
nlp.sentiment.title | number | Sentiment score for article title (range: -1 to 1, negative values indicate negative sentiment) |
nlp.sentiment.content | number | Sentiment score for article content (range: -1 to 1, negative values indicate negative sentiment) |
Field | Type | Description |
---|---|---|
nlp.ner_PER | array[object] | Named entities recognized as persons |
nlp.ner_ORG | array[object] | Named entities recognized as organizations |
nlp.ner_LOC | array[object] | Named entities recognized as locations |
nlp.ner_MISC | array[object] | Named entities recognized as other types (events, products, etc.) |
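The NLP fields above are only present when NLP data is requested, so client code benefits from reading them defensively. The sketch below assumes the dotted names map to nested JSON objects (for example, `nlp.sentiment.title` as `article["nlp"]["sentiment"]["title"]`); that nesting and the helper name `summarize_nlp` are assumptions made for illustration.

```python
def summarize_nlp(article: dict) -> dict:
    """Collect the main NLP values from an article, tolerating missing fields."""
    nlp = article.get("nlp") or {}
    sentiment = nlp.get("sentiment") or {}
    return {
        "summary": nlp.get("summary"),
        "themes": nlp.get("theme") or [],
        "title_sentiment": sentiment.get("title"),
        "content_sentiment": sentiment.get("content"),
        # Only the number of entities is used; the shape of each entity object is not assumed.
        "entity_counts": {
            kind: len(nlp.get(f"ner_{kind}") or []) for kind in ("PER", "ORG", "LOC", "MISC")
        },
    }


sample = {
    "title": "Example",
    "nlp": {"summary": "Short.", "theme": ["Tech"], "sentiment": {"title": 0.4, "content": -0.1}, "ner_ORG": [{}, {}]},
}
print(summarize_nlp(sample))
```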
Requires the `v3_nlp_iptc_tags` subscription plan.
Field | Type | Description |
---|---|---|
nlp.iab_tags_name | array[string] | Interactive Advertising Bureau content categorization |
nlp.iptc_tags_name | array[string] | International Press Telecommunications Council subject names |
nlp.iptc_tags_id | array[string] | International Press Telecommunications Council subject IDs |
Requires the `v3_nlp_embeddings` plan.
Field | Type | Description |
---|---|---|
nlp.new_embedding | array[number] | 1024-dimensional vector embedding for semantic similarity comparison (v3_nlp_embeddings plan only) |
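Since the embedding is described as a vector for semantic similarity comparison, a common way to compare two articles is cosine similarity. The sketch below uses only the standard library. The choice of cosine similarity is an assumption (the guide does not prescribe a metric), and the short vectors in the example stand in for the 1024-dimensional values.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def article_similarity(article_a: dict, article_b: dict) -> float:
    """Compare two articles using their nlp.new_embedding vectors."""
    return cosine_similarity(article_a["nlp"]["new_embedding"], article_b["nlp"]["new_embedding"])


first = {"nlp": {"new_embedding": [1.0, 0.0, 0.0]}}
second = {"nlp": {"new_embedding": [0.0, 1.0, 0.0]}}
print(article_similarity(first, first), article_similarity(first, second))
```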
When `clustering_enabled=true`:
Field | Type | Description |
---|---|---|
clusters_count | integer | Total number of clusters in the response |
clusters | array | Array of cluster objects |
cluster_id | string | Unique identifier for each cluster |
cluster_size | integer | Number of articles in the cluster |
articles | array | Array of article objects in the cluster |
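With clustering enabled, articles arrive grouped rather than as a flat list. The sketch below ranks clusters by size and collects their article titles. It assumes that `cluster_id`, `cluster_size`, and `articles` are fields of each object inside `clusters`, which the flat table above suggests but does not state; `summarize_clusters` is a hypothetical helper.

```python
def summarize_clusters(response: dict, top_n: int = 3) -> list:
    """Return the largest clusters with their IDs, sizes, and article titles."""
    clusters = sorted(
        response.get("clusters", []),
        key=lambda cluster: cluster.get("cluster_size", 0),
        reverse=True,
    )
    return [
        {
            "cluster_id": cluster.get("cluster_id"),
            "cluster_size": cluster.get("cluster_size", 0),
            "titles": [article.get("title") for article in cluster.get("articles", [])],
        }
        for cluster in clusters[:top_n]
    ]


sample_response = {
    "clusters_count": 2,
    "clusters": [
        {"cluster_id": "a", "cluster_size": 1, "articles": [{"title": "Solo story"}]},
        {"cluster_id": "b", "cluster_size": 2, "articles": [{"title": "Story 1"}, {"title": "Story 2"}]},
    ],
}
print(summarize_clusters(sample_response, top_n=1))
```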
When `exclude_duplicates=true`:
Field | Type | Description |
---|---|---|
duplicate_count | integer | Number of duplicate articles found |
duplicate_articles_group_id | string | Unique identifier for the duplicate group |
Field | Type | Description |
---|---|---|
name_source | string | Publisher name |
domain_url | string | Base domain URL |
logo | string | Source logo URL |
additional_info | object | Extended source data |
Field | Type | Description |
---|---|---|
nb_articles_for_7d | integer | Articles published in last week |
country | string | Source country code |
rank | integer | SEO rank |
is_news_domain | boolean | Indicates if domain is a news source |
news_domain_type | string | Type of news domain |
news_type | string | Category of news content |
- `topic`: Replaced by `theme` in NLP features for the `/search` and `/latest_headlines` endpoints. The field is unavailable for the `/sources` endpoint as the corresponding parameter has been removed.
- `is_republisher`: Replaced by more detailed domain classification.

Code | v2 Description | v3 Description |
---|---|---|
400 | API not in headers | Bad request - Invalid JSON |
401 | API Key not found | Unauthorized |
403 | Not present | Plan limits exceeded |
406 | Wrong parameter | Not present |
408 | Request Timeout | Request Timeout |
422 | Not present | Validation Error |
429 | Concurrency violated | Rate limit exceeded |
500 | Not present | Internal server error |
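Error handling written for v2 may need updating, because some codes change meaning and 406 no longer appears in v3. The sketch below maps v3 status codes to the descriptions in the table and flags which ones a client might retry. The retry set (408, 429, 500) is an assumption based on common HTTP client practice, not a documented recommendation, and `classify_v3_error` is a hypothetical name.

```python
# v3 descriptions copied from the table above. 406 is absent because v3 does not use it.
V3_ERRORS = {
    400: "Bad request - Invalid JSON",
    401: "Unauthorized",
    403: "Plan limits exceeded",
    408: "Request Timeout",
    422: "Validation Error",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

# Assumption: timeouts, rate limits, and server errors are worth retrying with a delay.
RETRYABLE = {408, 429, 500}


def classify_v3_error(status_code: int) -> tuple[str, bool]:
    """Return (description, should_retry) for a v3 error status code."""
    description = V3_ERRORS.get(status_code, "Unknown error")
    return description, status_code in RETRYABLE


for code in (401, 422, 429):
    print(code, classify_v3_error(code))
```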
- `v3_basic` plan for existing v2 customers.
- `v3_basic` plan.
- `v3_nlp` plan.
- `v3_nlp_iptc_tags` plan.
- `v3_nlp_embeddings` plan.