API changes v2 vs v3 - newscatcher

This guide outlines the key differences between NewsCatcher News API v2 and v3. It provides a technical comparison to help you understand the changes and prepare for migration.

This guide only covers the changes for v2/v3 shared endpoints. To learn about other v3 endpoints, their parameters, and response fields, see the API Reference.

Base API changes

Infrastructure updates

Feature	v2	v3
Base URL	api.newscatcherapi.com/v2	v3-api.newscatcherapi.com/api
Authentication header	x-api-key	x-api-token
Maximum articles/request	100	1,000
Historical data	Since 2019	Since 2019*

All historical data from News API v2 is available in v3. For data collected before July 2023, only the core functionalities are included. Advanced features such as NLP analysis, clustering, and deduplication for this specific time frame can be added upon request.

Available endpoints

v2	v3	Change
`/search`	`/search`	Enhanced with additional filtering capabilities, NLP features, clustering, and deduplication
`/latest_headlines`	`/latest_headlines`	Enhanced with additional filtering capabilities, NLP features, clustering, and deduplication
`/sources`	`/sources`	Enhanced with additional filtering capabilities
	`/authors`	Search by author name
	`/search_by_link`	Search by URL or article ID
	`/search_similar`	Find similar articles
	`/aggregation_count`	Get aggregation count by interval
	`/subscription`	View subscription info

Method support

Both v2 and v3 support GET and POST methods for all endpoints
Multiple value parameter formats:
- v2: Most parameters use comma-separated strings, with some exceptions (e.g., search_in uses underscore-separated strings) regardless of the method
- v3: More consistent formatting:
  - GET: Supports a comma-separated string
  - POST: Supports both a comma-separated string and an array of strings
Single-value parameters maintain their respective formats in both versions

Parameter changes

Renaming

from → from_
to → to_
topic → theme

The topic parameter is removed from the /sources endpoint. Instead, you have new filtering capabilities. To learn more, see Retrieve sources.

Query parameter (`q`)

Aspect	v2	v3	Change
Required	Yes	Yes	No change
Query operators	✓ Exact match with quotes `"keyword"` ✓ Boolean: `AND`, `OR`, `NOT` ✓ Wildcards: `*` and `?` ✓ Must/Must not: `+` and `-` ✓ Grouping with `()`	Same as v2 plus: ✓ `NEAR` operator ✓ `COUNT` operator	Enhanced operators
Default behavior	Space-separated tokens treated as `AND`	Same as v2	No change

Search fields (`search_in`)

Aspect	v2	v3	Change
Field name for article title	`title`	`title`	No change
Field name for article content	`summary`	`content`	Renamed to reflect actual content
Default value	`"title_summary"`	`"title,content"`	Functionally equivalent
LLM-generated summary	Not available	`summary` (requires NLP)	New feature
Multiple values	Underscore-separated string, e.g. `"title_summary"`	`GET`: Comma-separated string, e.g. `"title,content"` `POST`: Comma-separated string or array	Format standardization

Content classification (`topic` -> `theme`)

Aspect	v2	v3	Change
Parameter name	`topic`	`theme`	Renamed
Case format	lowercase, e.g. `"tech"`	Capitalized, e.g. `"Tech"`	Updated format
Available categories	15 lowercase categories	17 capitalized categories	Expanded
New categories in v3	-	`"Health"`, `"Crime"`, `"Financial Crime"`, `"Lifestyle"`, `"Automotive"`, `"Weather"`, `"General"`	Added
Removed v2 categories	`"beauty"`, `"music"`, `"food"`, `"gaming"`	Consolidated into new categories	Category restructuring
Multiple values	Comma-separated string	`GET`: Comma-separated string `POST`: Comma-separated string or array	Enhanced `POST` format
Exclusion option	Not available	`not_theme` parameter	New feature
NLP dependency	No	Yes	New requirement

New v3 parameters

Parameters are grouped by their availability in different subscription plans. For detailed plan information, see Subscription plans.

Core features

Available in all v3 plans, including v3_basic.

Content classification

Parameter	Type	Description
`is_headline`	`boolean`	Filters for articles that were posted on the home page of a given news domain
`is_opinion`	`boolean`	Filters for opinion pieces when true, or excludes opinion-based articles when false
`is_paid_content`	`boolean`	Filters out articles with paywalled content when false
`word_count_min`	`integer`	Filters articles based on minimum word count
`word_count_max`	`integer`	Filters articles based on maximum word count

URLs

Parameter	Type	Description
`parent_url`	`string`	Filters articles by categorical URLs (e.g., “wsj.com/politics”)
`all_links`	`string`	Filters articles by mentioned URLs within their content
`all_domain_links`	`string`	Filters articles by mentioned domain names within their content

Author

Parameter	Type	Description
`not_author_name`	`string`	Excludes articles by specified authors

Parameter	Type	Description
`by_parse_date`	`boolean`	Uses parse dates instead of published dates for date filtering

Source

Parameter	Type	Description
`predefined_sources`	`string`	Filters by predefined top sources per country (e.g., “top 100 US”)
`additional_domain_info`	`boolean`	Includes extra metadata about the source domain
`is_news_domain`	`boolean`	Filters for news domain sources only
`news_domain_type`	`string`	Filters by domain type (Original Content, Aggregator, etc.)
`news_type`	`string`	Filters by news type categories

Advanced features

Available in specific subscription plans.

Natural language processing

Requires v3_nlp plan or higher.

Parameter	Type	Description
`include_nlp_data`	`boolean`	Includes an NLP object for each article in the response
`has_nlp`	`boolean`	Filters for articles that have NLP analysis available
`theme`	`string`	Replaces `topic` parameter with expanded categories and NLP integration
`not_theme`	`string`	Excludes articles with specified themes
`ORG_entity_name`	`string`	Filters articles mentioning specific organization names
`PER_entity_name`	`string`	Filters articles mentioning specific person names
`LOC_entity_name`	`string`	Filters articles mentioning specific location names
`MISC_entity_name`	`string`	Filters articles mentioning other named entities
`title_sentiment_min`	`float`	Filters articles by minimum title sentiment score (-1 to 1)
`title_sentiment_max`	`float`	Filters articles by maximum title sentiment score (-1 to 1)
`content_sentiment_min`	`float`	Filters articles by minimum content sentiment score (-1 to 1)
`content_sentiment_max`	`float`	Filters articles by maximum content sentiment score (-1 to 1)

Clustering and deduplication

Requires v3_nlp plan or higher.

Parameter	Type	Description
`clustering_enabled`	`boolean`	Enables grouping of similar articles into clusters
`clustering_variable`	`string`	Specifies which part of the article to use for clustering (“content”, “title”, or “summary”)
`clustering_threshold`	`float`	Sets similarity threshold for clustering (range: 0-1)
`exclude_duplicates`	`boolean`	Removes duplicate and highly similar articles from results

Tagging

Requires the v3_nlp_iptc_tags subscription plan.

Parameter	Type	Description
`iptc_tags`	`string`	Filters articles by IPTC media topic tags
`not_iptc_tags`	`string`	Excludes articles with specific IPTC media topic tags
`iab_tags`	`string`	Filters articles by IAB content categories
`not_iab_tags`	`string`	Excludes articles with specific IAB content categories

Custom tags

Custom tags are available in all the NLP plans as a custom solution that provides tailored content classification using your organization’s taxonomy. For implementation details and examples, see Custom tags.

Response changes v2 vs v3

Field renaming

The following fields have been renamed in v3 for better clarity and consistency:

v2	v3	Type	Description
`clean_url`	`domain_url`	`string`	Base domain of the source
`excerpt`	`description`	`string`	Brief article description
`summary`	`content`	`string`	Full article content
`_score`	`score`	`number`	Relevancy score
`_id`	`id`	`string`	Unique article identifier
`topic`	`theme`	`string`	Available in v3 with NLP enabled

New fields in v3

Article object

The following fields are available in all v3 plans:

Field	Type	Description
`full_domain_url`	`string`	Complete domain with subdomain
`name_source`	`string`	Publisher name
`is_headline`	`boolean`	Homepage article indicator
`paid_content`	`boolean`	Paywall indicator
`parent_url`	`string`	Category/section URL
`journalists`	`array`	Array of journalist names
`word_count`	`integer`	Article length
`updated_date`	`string`	Last update timestamp
`updated_date_precision`	`string`	Update time precision
`all_links`	`array`	URLs mentioned in article
`all_domain_links`	`array`	Domains mentioned in article

Natural language processing (NLP) object

NLP object is a part of article object and vailable for all NLP plans (v3_nlp plan or higher) when include_nlp_data=true.

Field	Type	Description
`nlp`	`object`	Natural Language Processing analysis results for the article content.

Article understanding

Field	Type	Description
`nlp.summary`	`string`	AI-generated concise summary of article content
`nlp.theme`	`array[string]`	High-level thematic categories from fixed set: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Financial Crime, Lifestyle, Automotive, Travel, Weather, General

Sentiment analysis

Field	Type	Description
`nlp.sentiment.title`	`number`	Sentiment score for article title (range: -1 to 1, negative values indicate negative sentiment)
`nlp.sentiment.content`	`number`	Sentiment score for article content (range: -1 to 1, negative values indicate negative sentiment)

Named entity recognition (NER)

Field	Type	Description
`nlp.ner_PER`	`array[object]`	Named entities recognized as persons
`nlp.ner_ORG`	`array[object]`	Named entities recognized as organizations
`nlp.ner_LOC`	`array[object]`	Named entities recognized as locations
`nlp.ner_MISC`	`array[object]`	Named entities recognized as other types (events, products, etc.)

Each NER object contains:

{
  "entity_name": "string", // Recognized entity name
  "count": "integer" // Number of mentions in the article
}

Tags

Available for the v3_nlp_iptc_tags subscription plan.

Field	Type	Description
`nlp.iab_tags_name`	`array[string]`	Interactive Advertising Bureau content categorization
`nlp.iptc_tags_name`	`array[string]`	International Press Telecommunications Council subject names
`nlp.iptc_tags_id`	`array[string]`	International Press Telecommunications Council subject IDs

Vector representation

Available for the v3_nlp_embeddings plan.

Field	Type	Description
`nlp.new_embedding`	`array[number]`	1024-dimensional vector embedding for semantic similarity comparison (v3_nlp_embeddings plan only)

Clustering data

Available for all NLP plans when clustering_enabled=true:

Field	Type	Description
`clusters_count`	`integer`	Total number of clusters in the response
`clusters`	`array`	Array of cluster objects
`cluster_id`	`string`	Unique identifier for each cluster
`cluster_size`	`integer`	Number of articles in the cluster
`articles`	`array`	Array of article objects in the cluster

Deduplication data

Available for all NLP plans when exclude_duplicates=true:

Field	Type	Description
`duplicate_count`	`integer`	Number of duplicate articles found
`duplicate_articles_group_id`	`string`	Unique identifier for the duplicate group

Source object

Enhanced source information available in all v3 plans:

Field	Type	Description
`name_source`	`string`	Publisher name
`domain_url`	`string`	Base domain URL
`logo`	`string`	Source logo URL
`additional_info`	`object`	Extended source data

Additional info object fields:

Field	Type	Description
`nb_articles_for_7d`	`integer`	Articles published in last week
`country`	`string`	Source country code
`rank`	`integer`	SEO rank
`is_news_domain`	`boolean`	Indicates if domain is a news source
`news_domain_type`	`string`	Type of news domain
`news_type`	`string`	Category of news content

Removed response fields in v3

topic: Replaced by theme in NLP features for the /search and /latest_headlines endpoints. The field is unavailable for the /sources endpoint as the corresponding parameter has been removed.
is_republisher: Replaced by more detailed domain classification.

Error response changes

Format

{
  "status": "error",
  "error_code": "<string>",
  "message": "<string>"
}

Status codes

Code	v2 Description	v3 Description
400	API not in headers	Bad request - Invalid JSON
401	API Key not found	Unauthorized
403	Not present	Plan limits exceeded
406	Wrong parameter	Not present
408	Request Timeout	Request Timeout
422	Not present	Validation Error
429	Concurrency violated	Rate limit exceeded
500	Not present	Internal server error

SDKs

v2: Python SDK only
v3: SDKs for:
- Python
- TypeScript
- Go
- Java
- C#

All v3 SDKs provide complete support for both core and advanced features. For implementation details, see the Libraries documentation.

Timeline and support

Migration timeline

v2 supported until Q1 2025.
Historical data:
- v3 data available since July 2023.
- v2 historical data migration to v3 planned for Q1 2025.
- Initial historical data will include core features only.
- Advanced features (NLP, clustering, etc.) available for historical data upon request.

Support during migration

Both versions are available for parallel testing.
Automatic migration to v3_basic plan for existing v2 customers.
All new v3 endpoints accessible in v3_basic plan.
Advanced features require specific plans:
- NLP features: v3_nlp plan.
- IPTC and IAB tags: v3_nlp_iptc_tags plan.
- Embeddings: v3_nlp_embeddings plan.
- Custom tags: Available as a custom solution in all v3 NLP plans.

Next steps

Review the Migration guide for implementation details.
Explore plan features and requirements in Subscription plans.
Check the version-specific API Reference for detailed parameter and response field documentation:
- v3 API Reference
- v2 API Reference
Test v3 endpoints alongside your v2 implementation.
For advanced features:
- Learn about NLP features
- Explore Custom tags
- Understand Clustering capabilities

For implementation support or custom solutions, contact our support team.

Get started

Guides and concepts

Migration

How to

Troubleshooting

​Base API changes

​Infrastructure updates

​Available endpoints

​Method support

​Parameter changes

​Renaming

​Query parameter (q)

​Search fields (search_in)

​Content classification (topic -> theme)

​New v3 parameters

​Core features

​Content classification

​URLs

​Author

​Time-related

​Source

​Advanced features

​Natural language processing

​Clustering and deduplication

​Tagging

​Custom tags

​Response changes v2 vs v3

​Field renaming

​New fields in v3

​Article object

​Natural language processing (NLP) object

​Clustering data

​Deduplication data

​Source object

​Removed response fields in v3

​Error response changes

​Format

​Status codes

​SDKs

​Timeline and support

​Migration timeline

​Support during migration

​Next steps

Base API changes

Infrastructure updates

Available endpoints

Method support

Parameter changes

Renaming

Query parameter (`q`)

Search fields (`search_in`)

Content classification (`topic` -> `theme`)

New v3 parameters

Core features

Content classification

URLs

Author

Time-related

Source

Advanced features

Natural language processing

Clustering and deduplication

Tagging

Custom tags

Response changes v2 vs v3

Field renaming

New fields in v3

Article object

Natural language processing (NLP) object

Clustering data

Deduplication data

Source object

Removed response fields in v3

Error response changes

Format

Status codes

SDKs

Timeline and support

Migration timeline

Support during migration

Next steps