Get aggregation count by interval

GET /api/aggregation_count
curl --request GET \
  --url https://v3-api.newscatcherapi.com/api/aggregation_count \
  --header 'x-api-token: <api-key>'

{
  "status": "<string>",
  "total_hits": 123,
  "page": 123,
  "total_pages": 123,
  "page_size": 123,
  "aggregations": {
    "aggregation_count": [
      {
        "time_frame": "2024-12-31 00:00:00",
        "article_count": 86
      }
    ]
  },
  "user_input": {}
}

Authorizations

x-api-token

string

header

required

API Key to authenticate requests.

To access the API, include your API key in the x-api-token header. To obtain your API key, complete the form or contact us directly.
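
As a minimal sketch of the header requirement above, the following Python snippet builds (but does not send) a request object that carries the key in the x-api-token header. The helper name build_request and the placeholder key are illustrative assumptions, not part of the API.

```python
import urllib.request

BASE_URL = "https://v3-api.newscatcherapi.com/api/aggregation_count"


def build_request(api_key: str) -> urllib.request.Request:
    """Create a GET request that carries the API key in the x-api-token header.

    The request is only constructed here; nothing is sent over the network.
    """
    return urllib.request.Request(
        BASE_URL,
        headers={"x-api-token": api_key},
        method="GET",
    )


# "YOUR_API_KEY" is a placeholder, not a real key.
req = build_request("YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

A real call would also need the required search keyword parameter and any filters appended to the URL as a query string.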

Query Parameters

string

required

The keyword(s) to search for in articles. Query syntax supports logical operators (AND, OR, NOT) and wildcards:

For an exact match, use double quotes. For example, "technology news".
Use * to search for any keyword.
Use + to include and - to exclude specific words or phrases. For example, +Apple, -Google.
Use AND, OR, and NOT to refine search results. For example, technology AND (Apple OR Microsoft) NOT Google.

For more details, see Advanced querying.
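
The query syntax above can be composed programmatically. The sketch below is one possible approach: the helpers exact and any_of are hypothetical names introduced for illustration, and the result is URL-encoded with the standard library so that quotes, spaces, and parentheses survive transport.

```python
from urllib.parse import quote_plus


def exact(phrase: str) -> str:
    """Wrap a phrase in double quotes to request an exact match."""
    return f'"{phrase}"'


def any_of(*terms: str) -> str:
    """Group alternative terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


# Mirrors the documented example: technology AND (Apple OR Microsoft) NOT Google
query = f"technology AND {any_of('Apple', 'Microsoft')} NOT Google"

# URL-encode the keyword string before placing it in the query string.
encoded = quote_plus(query)

print(query)
print(encoded)
print(exact("technology news"))
```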

search_in

string

default:title,content

The article fields to search in. To search in multiple fields, use a comma-separated string.

Example: "title, summary"

Note: The summary option is available if NLP is enabled in your plan.

Available options: title, summary, content.

predefined_sources

string

Predefined top news sources per country.

Format: start with the word top, followed by the number of desired sources, and then the two-letter country code ISO 3166-1 alpha-2. Multiple countries with the number of top sources can be specified as a comma-separated string.

Examples:

"top 100 US"
"top 33 AT"
"top 50 US, top 20 GB"
"top 33 AT, top 50 IT"

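The format described above for predefined_sources can be generated from a mapping of country codes to source counts. This is only a sketch: the function name and the validation rules (two letters, count of at least 1) are assumptions made for illustration, and the API may accept or reject values differently.

```python
def predefined_sources(top_by_country: dict[str, int]) -> str:
    """Build a value such as "top 50 US, top 20 GB".

    Keys are two-letter ISO 3166-1 alpha-2 codes; values are source counts.
    """
    parts = []
    for country, count in top_by_country.items():
        code = country.upper()
        # Local sanity checks only; the API's own validation is not documented here.
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"not a two-letter country code: {country!r}")
        if count < 1:
            raise ValueError(f"source count must be positive: {count}")
        parts.append(f"top {count} {code}")
    return ", ".join(parts)


value = predefined_sources({"US": 50, "GB": 20})
print(value)
```
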
sources

string

One or more news sources to narrow down the search. The format must be a domain URL. Subdomains, such as finance.yahoo.com, are also acceptable. To specify multiple sources, use a comma-separated string.

Examples:

"nytimes.com"
"theguardian.com, finance.yahoo.com"

not_sources

string

The news sources to exclude from the search. To exclude multiple sources, use a comma-separated string.

Example: "cnn.com, wsj.com"

lang

string

The language(s) of the search. The only accepted format is the two-letter ISO 639-1 code. To select multiple languages, use a comma-separated string.

Example: "en, es"

To learn more, see Enumerated parameters > Language.

not_lang

string

The language(s) to exclude from the search. The accepted format is the two-letter ISO 639-1 code. To exclude multiple languages, use a comma-separated string.

Example: "fr, de"

To learn more, see Enumerated parameters > Language.

countries

string

The countries where the news publisher is located. The accepted format is the two-letter ISO 3166-1 alpha-2 code. To select multiple countries, use a comma-separated string.

Example: "US, CA"

To learn more, see Enumerated parameters > Country.

not_countries

string

The publisher location countries to exclude from the search. The accepted format is the two-letter ISO 3166-1 alpha-2 code. To exclude multiple countries, use a comma-separated string.

Example: "US, CA"

To learn more, see Enumerated parameters > Country.

not_author_name

string

The list of author names to exclude from your search. To exclude articles by specific authors, use a comma-separated string.

Example: "John Doe, Jane Doe"
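
Several parameters above (sources, not_sources, lang, not_lang, countries, not_countries, not_author_name) take a comma-separated string. A small helper can build these values from Python lists; csv_param is a hypothetical name, and the ", " separator simply copies the style used in the examples.

```python
def csv_param(values: list[str]) -> str:
    """Join values into the comma-separated form shown in the examples."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [v.strip() for v in values if v.strip()]
    return ", ".join(cleaned)


params = {
    "lang": csv_param(["en", "es"]),
    "not_sources": csv_param(["cnn.com", " wsj.com "]),
}
print(params)
```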

from_

The starting point in time to search from. Accepts date-time strings in ISO 8601 format and plain text. The default time zone is UTC.

Formats with examples:

YYYY-mm-ddTHH:MM:SS: 2024-07-01T00:00:00
YYYY-mm-dd: 2024-07-01
YYYY/mm/dd HH:MM:SS: 2024/07/01 00:00:00
YYYY/mm/dd: 2024/07/01
English phrases: 1 day ago, today

Note: By default, applied to the publication date of the article. To use the article's parse date instead, set the by_parse_date parameter to true.

Example:

"2024-07-01T00:00:00.000Z"

to_

The ending point in time to search up to. Accepts date-time strings in ISO 8601 format and plain text. The default time zone is UTC.

Formats with examples:

YYYY-mm-ddTHH:MM:SS: 2024-07-01T00:00:00
YYYY-mm-dd: 2024-07-01
YYYY/mm/dd HH:MM:SS: 2024/07/01 00:00:00
YYYY/mm/dd: 2024/07/01
English phrases: 1 day ago, today

Note: By default, applied to the publication date of the article. To use the article's parse date instead, set the by_parse_date parameter to true.

Example:

"2024-07-01T00:00:00.000Z"

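To illustrate the from_ and to_ formats listed above, the sketch below formats a seven-day window in the YYYY-mm-ddTHH:MM:SS form. The helper name and the chosen dates are illustrative only; values are converted to UTC because that is the documented default time zone.

```python
from datetime import datetime, timedelta, timezone


def iso_seconds(moment: datetime) -> str:
    """Format an aware datetime as YYYY-mm-ddTHH:MM:SS in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


# An example window: the seven days ending on 2024-07-08 00:00 UTC.
end = datetime(2024, 7, 8, tzinfo=timezone.utc)
start = end - timedelta(days=7)

params = {"from_": iso_seconds(start), "to_": iso_seconds(end)}
print(params)
```
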
published_date_precision

enum<string>

The precision of the published date. There are three types:

full: The day and time of an article are correctly identified, including the time zone.
timezone unknown: The day and time of an article are correctly identified, but the time zone is not.
date: Only the day is identified, without an exact time.

Available options: full, timezone unknown, date.

by_parse_date

boolean

default:false

If true, the from_ and to_ parameters use article parse dates instead of published dates. Additionally, the parse_date variable is added to the output for each article object.

sort_by

enum<string>

default:relevancy

The sorting order of the results. Possible values are:

relevancy: The most relevant results first.
date: The most recently published results first.
rank: The results from the highest-ranked sources first.

Available options: relevancy, date, rank.

ranked_only

boolean

default:true

If true, limits the search to sources ranked in the top 1 million online websites. If false, includes unranked sources which are assigned a rank of 999999.

from_rank

integer

default:1

The lowest boundary of the rank of a news website to filter by. A lower rank indicates a more popular source.

Required range: 1 <= x <= 999999

to_rank

integer

default:999999

The highest boundary of the rank of a news website to filter by. A lower rank indicates a more popular source.

Required range: 1 <= x <= 999999
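
The rank bounds above can be checked on the client before a request is sent. This is a sketch with a hypothetical helper name; the extra check that from_rank does not exceed to_rank is an assumption, not something this page states.

```python
MIN_RANK, MAX_RANK = 1, 999999


def rank_window(from_rank: int = MIN_RANK, to_rank: int = MAX_RANK) -> dict[str, int]:
    """Validate the documented range and return the two query parameters."""
    for name, value in (("from_rank", from_rank), ("to_rank", to_rank)):
        if not MIN_RANK <= value <= MAX_RANK:
            raise ValueError(f"{name} must be between {MIN_RANK} and {MAX_RANK}")
    # Assumption: an inverted window is treated as a caller error.
    if from_rank > to_rank:
        raise ValueError("from_rank must not exceed to_rank")
    return {"from_rank": from_rank, "to_rank": to_rank}


# Restrict the search to the 10,000 most popular sources.
print(rank_window(to_rank=10000))
```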

is_headline

boolean

If true, only returns articles that were posted on the home page of a given news domain.

is_opinion

boolean

If true, returns only opinion pieces. If false, excludes opinion-based articles and returns news only.

is_paid_content

boolean

If false, returns only articles that have publicly available complete content. Some publishers partially block content, so this setting ensures that only full articles are retrieved.

parent_url

string

The categorical URL(s) to filter your search. To filter your search by multiple categorical URLs, use a comma-separated string.

Example: "wsj.com/politics, wsj.com/tech"

all_links

string

The complete URL(s) mentioned in the article. For multiple URLs, use a comma-separated string.

Example: "https://aiindex.stanford.edu/report, https://www.stateof.ai"

For more details, see Search by URL.

all_domain_links

string

The domain(s) mentioned in the article. For multiple domains, use a comma-separated string.

Example: "who.int, nih.gov"

For more details, see Search by URL.

word_count_min

integer

The minimum number of words an article must contain. Use this parameter to filter out very short articles.

Required range: x >= 0

word_count_max

integer

The maximum number of words an article can contain. Use this parameter to filter out very long articles.

Required range: x >= 0

page

integer

default:1

The page number to scroll through the results. Use for pagination, as a single API response can return up to 1,000 articles.

For details, see How to paginate large datasets.

Required range: x >= 1

page_size

integer

default:100

The number of articles to return per page.

Required range: 1 <= x <= 1000
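
A possible way to walk through paginated results is to read total_pages from the first response and then generate the page and page_size parameters for each request. The generator below is a sketch with a hypothetical name; it only produces parameter dictionaries and performs no network calls.

```python
from collections.abc import Iterator


def page_params(total_pages: int, page_size: int = 100) -> Iterator[dict[str, int]]:
    """Yield the page/page_size parameters for pages 1..total_pages."""
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")
    for page in range(1, total_pages + 1):
        yield {"page": page, "page_size": page_size}


# Suppose the first response reported total_pages = 3.
pages = list(page_params(total_pages=3, page_size=100))
print(pages)
```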

include_nlp_data

boolean

default:false

If true, includes an NLP object for each article in the response. This object provides results of NLP analysis, including article theme, summary, sentiment, tags, and named entity recognition if available.

Note: NLP coverage and analysis completeness may vary by language, with full data available for articles in English and Arabic. The include_nlp_data parameter is available only in NLP subscription plans.

To learn more, see NLP features.

Example:

true

has_nlp

boolean

default:false

If true, filters results to include only articles that have NLP data.

Note: NLP coverage and analysis completeness may vary by language, with full data available for articles in English and Arabic. The has_nlp parameter is available only in NLP subscription plans.

To learn more, see NLP features.

Example:

true

theme

string

Filters articles based on their general topic, as determined by NLP analysis. To select multiple themes, use a comma-separated string.

Example: "Finance, Tech"

Note: The theme parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Available options: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Financial Crime, Lifestyle, Automotive, Travel, Weather, General.

not_theme

string

Inverse of the theme parameter. Excludes articles based on their general topic, as determined by NLP analysis. To exclude multiple themes, use a comma-separated string.

Example: "Crime, Tech"

Note: The not_theme parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

ORG_entity_name

string

Filters articles that mention specific organization names, as identified by NLP analysis. To specify multiple organizations, use a comma-separated string.

Example: "Apple, Microsoft"

Note: The ORG_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

PER_entity_name

string

Filters articles that mention specific person names, as identified by NLP analysis. To specify multiple names, use a comma-separated string.

Example: "Elon Musk, Jeff Bezos"

Note: The PER_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

LOC_entity_name

string

Filters articles that mention specific location names, as identified by NLP analysis. To specify multiple locations, use a comma-separated string.

Example: "California, New York"

Note: The LOC_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

MISC_entity_name

string

Filters articles that mention other named entities not falling under person, organization, or location categories. Includes events, nationalities, products, works of art, and more. To specify multiple entities, use a comma-separated string.

Example: "Bitcoin, Blockchain"

Note: The MISC_entity_name parameter is only available if NLP is included in your subscription plan.

To learn more, see Search by entity.

title_sentiment_min

number

Filters articles based on the minimum sentiment score of their titles.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The title_sentiment_min parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

title_sentiment_max

number

Filters articles based on the maximum sentiment score of their titles.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The title_sentiment_max parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

content_sentiment_min

number

Filters articles based on the minimum sentiment score of their content.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The content_sentiment_min parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1

content_sentiment_max

number

Filters articles based on the maximum sentiment score of their content.

Range is -1.0 to 1.0, where:

Negative values indicate negative sentiment.
Positive values indicate positive sentiment.
Values close to 0 indicate neutral sentiment.

Note: The content_sentiment_max parameter is only available if NLP is included in your subscription plan.

To learn more, see NLP features.

Required range: -1 <= x <= 1
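
The four sentiment parameters above share the same range, so one helper can validate them. This is a sketch under stated assumptions: the function name is hypothetical, and rejecting a minimum greater than the maximum is a client-side choice that this page does not mandate.

```python
from typing import Optional


def sentiment_window(
    field: str,
    minimum: Optional[float] = None,
    maximum: Optional[float] = None,
) -> dict[str, float]:
    """Build <field>_sentiment_min / <field>_sentiment_max parameters."""
    if field not in ("title", "content"):
        raise ValueError("field must be 'title' or 'content'")
    params: dict[str, float] = {}
    for suffix, value in (("min", minimum), ("max", maximum)):
        if value is None:
            continue
        if not -1.0 <= value <= 1.0:
            raise ValueError(f"{field}_sentiment_{suffix} must be within [-1, 1]")
        params[f"{field}_sentiment_{suffix}"] = value
    # Assumption: an inverted window is treated as a caller error.
    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return params


# Articles whose titles lean positive.
print(sentiment_window("title", minimum=0.2))
```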

iptc_tags

string

Filters articles based on International Press Telecommunications Council (IPTC) media topic tags. To specify multiple IPTC tags, use a comma-separated string of tag IDs.

Example: "20000199, 20000209"

Note: The iptc_tags parameter is only available in the v3_nlp_iptc_tags subscription plan.

To learn more, see IPTC Media Topic NewsCodes.

not_iptc_tags

string

Inverse of the iptc_tags parameter. Excludes articles based on International Press Telecommunications Council (IPTC) media topic tags. To specify multiple IPTC tags to exclude, use a comma-separated string of tag IDs.

Example: "20000205, 20000209"

Note: The not_iptc_tags parameter is only available in the v3_nlp_iptc_tags subscription plan.

To learn more, see IPTC Media Topic NewsCodes.

aggregation_by

enum<string>

The aggregation interval for the results. Possible values are:

day: Aggregates results by day.
hour: Aggregates results by hour.

Available options: day, hour.
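
Putting the parameters together, the sketch below assembles a request URL with an aggregation interval and a few filters. The function name is hypothetical, the filters are passed through unchanged, and the required search keyword parameter still has to be supplied by the caller.

```python
from urllib.parse import urlencode

BASE_URL = "https://v3-api.newscatcherapi.com/api/aggregation_count"


def aggregation_url(aggregation_by: str, **filters: str) -> str:
    """Return the endpoint URL with filters and the aggregation interval."""
    if aggregation_by not in ("day", "hour"):
        raise ValueError("aggregation_by must be 'day' or 'hour'")
    query = urlencode({**filters, "aggregation_by": aggregation_by})
    return f"{BASE_URL}?{query}"


url = aggregation_url("hour", lang="en", from_="1 day ago")
print(url)
```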

Response

200

application/json

A successful response containing aggregation count results that match the search criteria. If nothing matches, the API returns a failed aggregation response that follows the defined schema.

The response model for a successful Aggregation count request. Response field behavior:

Required fields are guaranteed to be present and non-null.
Optional fields may be null or undefined if the data point is not present or couldn't be extracted during processing.

status

string

required

The status of the response.

total_hits

integer

required

The total number of articles matching the search criteria.

page

integer

required

The current page number of the results.

total_pages

integer

required

The total number of pages available for the given search criteria.

page_size

integer

required

The number of articles per page.

aggregations

The aggregation results. Can be either a dictionary or a list of dictionaries.

user_input

object

The user input parameters for the request.
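
Because aggregations can be either a dictionary or a list of dictionaries, client code may want to normalize both shapes. The sketch below assumes that each dictionary carries an aggregation_count list like the one in the sample response, and that time_frame always uses the "YYYY-mm-dd HH:MM:SS" form shown there; neither is guaranteed by this page.

```python
from datetime import datetime


def iter_buckets(aggregations):
    """Yield (time_frame, article_count) pairs from either documented shape."""
    # Normalize a single dictionary to a one-element list.
    groups = aggregations if isinstance(aggregations, list) else [aggregations]
    for group in groups:
        for bucket in group.get("aggregation_count", []):
            when = datetime.strptime(bucket["time_frame"], "%Y-%m-%d %H:%M:%S")
            yield when, bucket["article_count"]


# The aggregations object from the sample response above.
sample = {
    "aggregation_count": [
        {"time_frame": "2024-12-31 00:00:00", "article_count": 86}
    ]
}
buckets = list(iter_buckets(sample))
print(buckets)
```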
