Introducing NewsCatcher Local News API

NewsCatcher’s Hyperlocal News API delivers precise, city-focused news by scanning thousands of articles daily across 31,000 U.S. locations—perfect for planners, businesses, and developers seeking local insights.

A hyperlocal news feed consists of all the news about happenings in a city or any other area of similar scale. Topics might cover transit, extreme weather alerts, crime, local events, infrastructure projects, and so on. Besides the obvious use case of creating engaging consumer applications, city news can be very useful to city planners, analysts, social workers, law enforcement agencies, real estate developers and investors, local businesses, and corporations looking to expand into hyperlocal markets.

The seemingly simple task of getting a news feed with all the latest happenings in a given city is unexpectedly daunting. Consider getting all the news from New York City into a feed on a daily basis. The likes of the New York Times cover more than just New York, and a big chunk of New York's news is covered by national and international media outlets. Monitoring a single source or a group of sources will not be sufficient to have a comprehensive 'New York News Feed'. Conversely, monitoring a large number of news sources for news about just one city can be very resource-intensive and may not yield a good number of relevant results. Then, there is also the problem of location keywords clashing. Using a location name as a search keyword does not yield satisfactory results, as location names could sometimes be common words or multiple locations could have the same name.

With this in mind, NewsCatcher has made getting an ultra-granular, location-focused news feed very simple: just call our Local News API! We do the heavy lifting of scanning and processing thousands of news articles each day and tagging them with locations. Our method identifies locations down to town-level precision with 84% accuracy. Around 31,000 locations (US only) are covered. We also use advanced NLP techniques and AI to detect location names and resolve keyword clashes when a location name is also a common word. In this blog, let's look at how NewsCatcher's Hyperlocal News API works and what you can do with it.

A Quick Demonstration

NewsCatcher provides 15 RSS feeds, each covering local news from a US city, as a demonstration. You can see the list and links to each feed at: https://www.newscatcherapi.com/local-news/rss/. A snippet from the New York feed is shown below:

<rss version="2.0">
  <channel>
    <title>NewsCatcher Local News RSS Feed - New York City, New York</title>
    <link>/rss/new-york-city-ny.rss</link>
    <description>This is a feed for New York City, New York</description>
    <lastBuildDate>Mon, 28 Oct 2024 05:41:28 +0000</lastBuildDate>
    <!-- news items here -->
    <!-- sample items shown below -->
    <item>
      <newscatcher_article_id>ac7a417f4626c2a1632faa99d31eef54</newscatcher_article_id>
      <title>Anne Hathaway Channels Scary Statue of Liberty for ‘Boo York’ Halloween Costume: See the Look!</title>
      <theme>Entertainment</theme>
      <link>https://www.aol.com/anne-hathaway-channels-scary-statue-013540114.html</link>
      <media>https://s.yimg.com/ny/api/res/1.2/hXoHv3GyYsmcvEHI7tgT.w--/YXBwaWQ9aGlnaGxhbmRlcjt3PTEyMDA7aD04MDA-/https://media.zenfs.com/en/aol_people_articles_471/6e980c6fc3e21bc291461002f2f2fca1</media>
      <pubDate>2024-10-26 01:35:40</pubDate>
      <author>Ingrid Vasquez</author>
      <associated_town>New York City, New York</associated_town>
      <story_articles_number>3</story_articles_number>
    </item>
    <item>
      <newscatcher_article_id>27f1a6526f0e47fc7c13e16fb25b1991</newscatcher_article_id>
      <title>City Paves Over Bed-Stuy's Hydrant ‘Aquarium’ and Puts Up a Sidewalk</title>
      <theme>General</theme>
      <link>https://www.nytimes.com/2024/10/25/nyregion/bed-stuy-aquarium-sidewalk-brooklyn.html</link>
      <media>https://static01.nyt.com/images/2024/10/25/multimedia/25xp-aquarium-wires-lwhp/25xp-aquarium-wires-lwhp-facebookJumbo.jpg</media>
      <pubDate>2024-10-25 23:01:30</pubDate>
      <author>Remy Tumin</author>
      <associated_town>New York City, New York</associated_town>
      <story_articles_number>3</story_articles_number>
    </item>
    <!-- more news items -->
  </channel>
</rss>

In the above feed, you can see articles talking about events in New York City. The item tags have links to the source articles, along with the title, cover image, publication date, and author. We also classify the article into a theme, such as ‘Sports’ or ‘Entertainment’, and provide the associated_town. There can be multiple associated towns for a given article and we will return all the identified towns (or cities).
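If you want to consume one of the demo feeds programmatically, the standard library is enough. The following is a minimal sketch, not an official client: it parses a trimmed copy of the feed shown above rather than downloading it, and it assumes that an article with several towns carries one associated_town element per town, which the sample does not confirm. The function name parse_feed_items is ours.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the sample feed; a real feed would be downloaded first.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>NewsCatcher Local News RSS Feed - New York City, New York</title>
    <item>
      <title>City Paves Over Bed-Stuy's Hydrant Aquarium and Puts Up a Sidewalk</title>
      <theme>General</theme>
      <link>https://www.nytimes.com/2024/10/25/nyregion/bed-stuy-aquarium-sidewalk-brooklyn.html</link>
      <pubDate>2024-10-25 23:01:30</pubDate>
      <author>Remy Tumin</author>
      <associated_town>New York City, New York</associated_town>
      <story_articles_number>3</story_articles_number>
    </item>
  </channel>
</rss>"""


def parse_feed_items(xml_text):
    """Return one dict per <item>, keeping the fields shown in the sample feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "theme": item.findtext("theme"),
            "link": item.findtext("link"),
            "published": item.findtext("pubDate"),
            # Assumption: multiple towns appear as repeated elements.
            "towns": [t.text for t in item.findall("associated_town")],
        })
    return items


for entry in parse_feed_items(SAMPLE_FEED):
    print(entry["published"], entry["theme"], entry["title"])
```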

Using the REST API

The RSS feed was only a demonstration. NewsCatcher offers a full-featured REST API that can return the required local news data in JSON format. The API can also return the news for over 31,000 locations in the US, not just the 15 locations in the RSS feeds.

Let's see how you can interact with this REST API. We'll be using Python (3.6+) in the code snippets below for illustrative purposes, but the API can be used with any language over HTTP. To follow along, you'll need the NewsCatcher API Endpoint URL, an API key, and the requests library installed. Let's put these things in the code below:

import requests
import json

NC_API_KEY = '<newscatcher-api-key-goes-here>'
NC_ENDPOINT = 'https://local-news.newscatcherapi.com'

Getting the Latest Local News Headlines

The simplest use-case of the Local News API is to get a list of the latest headlines for the location you're interested in. NewsCatcher offers a convenient endpoint to get just this. All you have to do is send a POST request to /api/latest_headlines/:

r = requests.post(
    f'{NC_ENDPOINT}/api/latest_headlines',
    headers={'x-api-token': NC_API_KEY},
    json={
        'associated_towns': [{'name': 'New York'}],
        'page_size': 10,
        'when': '1d',
    }
)

print(json.dumps(r.json(), indent=2))

The above code calls the API and prints the indented JSON response. Let's see what’s in the result:

{
  "status": "ok",
  "total_hits": 3,
  "page": 1,
  "total_pages": 1,
  "page_size": 10,
  "articles": [
    {
      "id": "5e05185a3499db5817f265fc354f1d52",
      "associated_town": [
        {
          "ai_validated": true,
          "name": "Rochester, New York",
          "description": [
            "HYPERLOCAL_SOURCES_EXCLUDE_QUERY",
            "HYPERLOCAL_SOURCES_INCLUDE_QUERY"
          ]
        },
        {
          "ai_validated": true,
          "name": "New York",
          "description": [
            "LOCAL_SOURCES_EXCLUDE_QUERY"
          ]
        }
      ],
      "ai_associated_town": null,
      "score": null,
      "title": "Vote: Section V's Girls Sports Athlete of the Week for Oct. 20-26 presented by Faber Builders",
      "author": "Marquel Slaughter",
      "link": "https://www.democratandchronicle.com/story/sports/high-school/2024/10/28/who-is-section-v-girls-sports-athlete-of-the-week-for-oct-20-26-vote-now/75836114007",
      "description": "Your vote will determine who will be the Faber Builders Girls Sports Athlete of the Week for October 20-26.",
      "media": "https://www.democratandchronicle.com/gcdn/authoring/authoring-images/2024/09/06/PROC/75108463007-aotw-article-page-hdr-1200-x-628.jpg?crop=1115,627,x58,y0&width=1115&height=627&format=pjpg&auto=webp",
      "content": "It's time to take a...(full content truncated)",
      "authors": [
        "Justin Ritzel",
        "James Johnson",
        "Marquel Slaughter"
      ],
      "published_date_precision": "full",
      "published_date": "2024-10-28 11:03:48",
      "updated_date": "2024-10-28 11:03:48",
      "updated_date_precision": "full",
      "is_opinion": false,
      "twitter_account": "@DandC",
      "domain_url": "democratandchronicle.com",
      "parent_url": "https://www.democratandchronicle.com/sports",
      "word_count": 357,
      "rank": 5339,
      "country": "US",
      "rights": "democratandchronicle.com",
      "language": "en",
      "nlp": {
        "theme": [
          "Sports"
        ],
        "summary": "Your vote will determine who will...(ai summary truncated)",
        "sentiment": {
          "title": 0.0,
          "content": 0.0
        },
        "ner_PER": [
          {
            "entity_name": "Governor",
            "count": 1
          },
          ...list truncated for illustration...
        ],
        "ner_ORG": [
          {
            "entity_name": "Section V",
            "count": 2
          },
          ...list truncated for illustration...
        ],
        "ner_MISC": [
          {
            "entity_name": "Girls Sports Athlete of the Week",
            "count": 1
          },
          ...list truncated for illustration...
        ],
        "ner_LOC": [
          {
            "entity_name": "Silver Hill Tech Park",
            "count": 1
          },
          ...list truncated for illustration...
        ]
      },
      "paid_content": false
    },
    ...list truncated for illustration...
  ],
  "user_input": "...object showing the input..."
}

In the above output, you can see that the query returned 3 hits. In the article data, there is data from the source, consisting of the article title, link, authors, publication date, rights attribution, and most importantly the full content. NewsCatcher also adds useful enrichments by analyzing the data. The first is of course the list of detected towns, which we use to filter the results and return just the news relevant to your input location. Apart from this, you can also see the theme of the article, sentiment analysis scores, and lists of recognized entities. The recognized entities are classified as persons, organizations, locations, and miscellaneous. You can readily use the enriched data for further analysis.
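To give a feel for working with the enriched data, here is a small sketch that reduces each article to a handful of fields. It operates on an already parsed response dict and uses only field names visible in the output above; the helper name summarize_articles and the abbreviated sample_response are ours, for illustration only.

```python
def summarize_articles(response):
    """Reduce each article in a parsed response to a few fields of interest."""
    rows = []
    for article in response.get("articles", []):
        # 'nlp' and 'sentiment' may be missing or null, so fall back to {}.
        nlp = article.get("nlp") or {}
        sentiment = nlp.get("sentiment") or {}
        rows.append({
            "title": article.get("title"),
            "towns": [t["name"] for t in article.get("associated_town") or []],
            "themes": nlp.get("theme") or [],
            "title_sentiment": sentiment.get("title"),
        })
    return rows


# Abbreviated copy of the response shown above.
sample_response = {
    "status": "ok",
    "articles": [
        {
            "title": "Vote: Section V's Girls Sports Athlete of the Week "
                     "for Oct. 20-26 presented by Faber Builders",
            "associated_town": [
                {"ai_validated": True, "name": "Rochester, New York"},
                {"ai_validated": True, "name": "New York"},
            ],
            "nlp": {
                "theme": ["Sports"],
                "sentiment": {"title": 0.0, "content": 0.0},
            },
        }
    ],
}

for row in summarize_articles(sample_response):
    print(row["themes"], row["towns"], row["title"])
```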

The Search Endpoint

While the latest_headlines endpoint is easy to get started with, it is not all that the Local News API offers. You can also use the search endpoint, which offers additional capabilities: keyword queries, sorting, and access to older data with custom time ranges.

Basic Usage

Let's quickly see an example of how to use the search endpoint, with some filters:

r = requests.post(
    f'{NC_ENDPOINT}/api/search',
    headers={'x-api-token': NC_API_KEY},
    json={
        'associated_towns': [{'name': 'New York'}],
        'page_size': 5,
        'q': 'strike',
        'search_in': 'title',
        'from_': '30 days ago'
    },
)

print(json.dumps(r.json(), indent=2))

The above code sends a POST request to the /api/search/ endpoint to return up to 5 articles satisfying the query criteria. The keyword query is 'strike', and the time filter from_ restricts results to the last 30 days. Setting the search_in parameter to title ensures that the results contain the word 'strike' in the article title. You can also include the content in the search target to return articles that have the word 'strike' in the article body.

Let's see the result:

{
  "status": "ok",
  "total_hits": 3,
  "page": 1,
  "total_pages": 1,
  "page_size": 5,
  "articles": [
    {
      "id": "7f2ddd19bfc79a2aaa3a2d4173130365",
      "associated_town": [
        {
          "ai_validated": false,
          "name": "Batavia, New York",
          "description": [
            "HYPERLOCAL_SOURCES_EXCLUDE_QUERY"
          ]
        },
        {
          "ai_validated": true,
          "name": "New York",
          "description": [
            "LOCAL_SOURCES_EXCLUDE_QUERY"
          ]
        }
      ],
      "ai_associated_town": null,
      "score": 13.088006,
      "title": "US dockworkers agree to suspend strike until Jan. 15",
...response similar to previous section...

Similar to the results from the latest_headlines endpoint, the response consists of a list of articles. You can see that the returned article has the word 'strike' in the title.
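The responses include page and total_pages fields, so larger result sets will span several pages. Below is a hedged sketch of walking through them. It assumes the request body accepts a page field, which this post does not show, so check the documentation before relying on it. The network call is injected as a callable (fetch_page, a name of our choosing) so the sketch runs without an API key.

```python
def iter_articles(fetch_page, payload, max_pages=10):
    """Yield articles page by page until total_pages or max_pages is reached.

    fetch_page is any callable that takes a request body (dict) and returns
    the parsed JSON response (dict), e.g. a thin wrapper around requests.post.
    """
    page = 1
    while page <= max_pages:
        # Assumption: the API accepts a 'page' field in the request body.
        response = fetch_page({**payload, "page": page})
        yield from response.get("articles", [])
        if page >= response.get("total_pages", 1):
            break
        page += 1


# Stand-in for the real HTTP call, returning two pages of placeholder items.
def fake_fetch(body):
    pages = {1: ["a", "b"], 2: ["c"]}
    return {"page": body["page"], "total_pages": 2, "articles": pages[body["page"]]}


print(list(iter_articles(fake_fetch, {"q": "strike", "page_size": 2})))
```

The max_pages cap is a safety net so that a broad query does not consume more API calls than intended.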

Getting Older Data

With the search endpoint, you can get older data by specifying from_ and to_ dates. For example, to get articles published between two and three weeks ago, use the following JSON body in the request:

{
  "associated_towns": [{"name": "Chicago"}],
  "from_": "21 days ago",
  "to_": "14 days ago"
}

The query above will return the articles from Chicago that are more than 2 weeks but less than 3 weeks old. NewsCatcher also offers the convenience of specifying the dates in natural language format (x days ago), instead of exact date strings.
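When building such windows in code, it is easy to swap the two bounds by accident. The helper below is a small sketch (the name date_window_payload is ours) that builds the request body from two day offsets and rejects an inverted window. It only produces the "x days ago" strings used in this post.

```python
def date_window_payload(town, older_days, newer_days):
    """Build a search body for articles between newer_days and older_days old."""
    if older_days <= newer_days:
        raise ValueError("older_days must be greater than newer_days")
    return {
        "associated_towns": [{"name": town}],
        "from_": f"{older_days} days ago",  # start of the window (further back)
        "to_": f"{newer_days} days ago",    # end of the window (more recent)
    }


# The Chicago example from above, plus the week before it.
for week in (2, 3):
    print(date_window_payload("Chicago", 7 * (week + 1), 7 * week))
```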

Querying Multiple Locations

In case you have to query multiple locations, simply pass the list of locations using the associated_towns parameter. You need not make separate HTTP requests for each location. The JSON body in this case will look like this:

{
  "associated_towns": [
    {"name": "California"},
    {"name": "Texas"}
  ],
  "q": "layoff",
  "from_": "30 days ago"
}

The above query returns articles from the last 30 days that mention 'layoff' and are associated with either ‘California’ or ‘Texas’. Multiple locations are combined with the OR operator.
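Because the locations are combined with OR, a single response mixes articles for all of them. If you need them separated again, you can group on the associated_town field of each article. This is a sketch under one assumption: that the town names in the response match the names you sent exactly. The article titles in the sample are made up for illustration.

```python
from collections import defaultdict


def group_by_town(articles, requested_towns):
    """Map each requested town to the titles of the articles tagged with it."""
    wanted = set(requested_towns)
    grouped = defaultdict(list)
    for article in articles:
        for town in article.get("associated_town") or []:
            # Assumption: response names match the requested names exactly.
            if town.get("name") in wanted:
                grouped[town["name"]].append(article.get("title"))
    return dict(grouped)


# Made-up articles, shaped like the entries in the 'articles' list.
sample_articles = [
    {"title": "Example layoff story A", "associated_town": [{"name": "California"}]},
    {"title": "Example layoff story B", "associated_town": [{"name": "Texas"}]},
    {"title": "Example layoff story C",
     "associated_town": [{"name": "California"}, {"name": "Texas"}]},
]

print(group_by_town(sample_articles, ["California", "Texas"]))
```

An article tagged with both locations appears under both keys, which mirrors the OR semantics of the query.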

Clustering Articles

With local news, an event is often covered by multiple media outlets. NewsCatcher clusters together similar articles, and provides an option to retrieve article lists as clusters. Let's see how you can use this option:

{
  "associated_towns": [{"name": "New York"}],
  "from_": "2 days ago",
  "clustering": true
}

In the above query, the clustering parameter is set to true. Let's see what this returns:

{
  "status": "ok",
  "total_hits": 10,
  "page": 1,
  "total_pages": 1,
  "page_size": 100,
  "clusters_count": 9,
  "agg_clusters": [],
  "clusters": {
    "cluster_id_1": {
      "articles": [...articles in this cluster...],
      "agg_cluster": false,
      "original_cluster_size": 2,
      "cluster_size": 1
    },
    "cluster_id_2": {....}
    ...more clusters...
  },
...more data...

From the result, you can see that there are 10 hits, grouped into 9 clusters. The articles are organized inside their clusters, and each article has data fields similar to those shown in earlier outputs.
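A common next step is to pick one headline per cluster and rank clusters by how many articles they contain. The sketch below does this on a parsed response, using only the clusters and articles keys visible in the output above. The function name cluster_headlines and the sample data are ours, and treating the first article in a cluster as its representative is a simplification.

```python
def cluster_headlines(response):
    """Return (cluster_id, article_count, first_title) for each cluster."""
    rows = []
    for cluster_id, cluster in response.get("clusters", {}).items():
        articles = cluster.get("articles", [])
        first_title = articles[0].get("title") if articles else None
        rows.append((cluster_id, len(articles), first_title))
    # Largest clusters first: stories covered by more outlets come out on top.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Made-up data following the structure of the clustered response.
sample_clustered = {
    "clusters_count": 2,
    "clusters": {
        "cluster_id_1": {"articles": [{"title": "Story one"}]},
        "cluster_id_2": {"articles": [{"title": "Story two"},
                                      {"title": "Story two, another outlet"}]},
    },
}

for cluster_id, size, title in cluster_headlines(sample_clustered):
    print(cluster_id, size, title)
```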

Filtering by Sources

With both search and latest_headlines endpoints, you can filter the articles by sources. First, to see a list of available sources, use the sources endpoint:

r = requests.post(
    f'{NC_ENDPOINT}/api/sources',
    headers={'x-api-token': NC_API_KEY},
    json={'lang': 'en'},
)

print(json.dumps(r.json(), indent=2))

The above code returns a list of sources matching the filter lang: en in the following format:

{
  "message": "Maximum sources displayed according to your plan is set to 2000",
  "sources": [
    "yahoo.com",
    "wn.com",
    "headtopics.com",
    ...more sources...
  ],
  "user_input": "the filters passed as input"
}

In the output, there is a list of sources and a message about the maximum number of sources that can be retrieved using this endpoint. You can also see the input parameters that were sent in the request. In addition to the lang filter, you can use the countries and theme filters to get sources by country and article theme respectively. These sources can be used as filters in the queries sent to the search or latest_headlines endpoints:

{
  "associated_towns": [{"name": "New York"}],
  "sources": "yahoo.com",
  "not_sources": "iheart.com",
  "clustering": true
}

The above query specifies a source 'yahoo.com' using the sources parameter. The parameter can also take an array, allowing you to specify multiple sources. You can also specify sources to be excluded using the not_sources parameter. This too can accept a list instead of a string.
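Since both parameters accept either a single string or a list, a small helper can normalize the input and catch a source that is accidentally both included and excluded. This is an illustrative sketch; with_source_filters is our own name, and it always sends lists, relying on the statement above that arrays are accepted.

```python
def _as_list(value):
    """Turn None, a single string, or an iterable of strings into a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def with_source_filters(payload, include=None, exclude=None):
    """Return a copy of payload with sources / not_sources set when given."""
    include, exclude = _as_list(include), _as_list(exclude)
    overlap = set(include) & set(exclude)
    if overlap:
        raise ValueError(f"sources both included and excluded: {sorted(overlap)}")
    body = dict(payload)  # shallow copy so the caller's dict is not modified
    if include:
        body["sources"] = include
    if exclude:
        body["not_sources"] = exclude
    return body


base = {"associated_towns": [{"name": "New York"}], "clustering": True}
print(with_source_filters(base, include="yahoo.com", exclude=["iheart.com"]))
```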

Conclusion

In this blog, we looked at the freshly launched NewsCatcher Local News API and how to use it. This greatly reduces the effort needed to get localized news for building consumer apps or ingesting into data analysis pipelines. You need not take up the hassle of building and maintaining complex scrapers to gather local news data. NewsCatcher also extracts location names from article text and verifies this using AI, sparing you from using NLP or other text analysis techniques.

We discussed some of the highlight features of this API in this blog. This only scratches the surface of what the API is capable of. Detailed specifications of the API, including comprehensive lists of the filters that can be used, are available in the documentation.
