Scroll through pages
Learn how to iterate through found pages and extract articles
Whenever you make an API call to the /v2/search or /v2/latest_headlines endpoint, the response includes key information about your search. Here is an example of a search API call:
The output:
total_hits tells you how many articles were found.
total_pages indicates how many API calls you will have to make to get these articles.
One API call can return a maximum of 100 articles.
You can increase the page_size parameter for your search to make fewer calls and get more data per call.
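As a quick sanity check on how these fields relate, the sketch below assumes that total_pages is total_hits divided by page_size, rounded up. That formula is inferred from the description above rather than stated by the API, and expected_total_pages is a hypothetical helper name.

```python
import math


def expected_total_pages(total_hits: int, page_size: int) -> int:
    """Estimate the number of calls needed.

    Assumption: total_pages == ceil(total_hits / page_size).
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_hits / page_size)


print(expected_total_pages(1300, 100))  # 13
print(expected_total_pages(1300, 50))   # 26
```

Under this assumption, halving page_size doubles the number of calls, which is why a larger page_size means fewer requests.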
Based on the information above, we can say that with the given page_size and on the 1st page, we are seeing articles 1 to 100 out of the 1300 found.
By incrementing the page parameter to 2, you will get articles 101 to 200 out of 1300, and so on.
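The mapping from a page number to the articles it covers can be written down directly. This is a minimal sketch of that arithmetic; article_range is a hypothetical helper, not part of the API or SDK.

```python
from typing import Tuple


def article_range(page: int, page_size: int, total_hits: int) -> Tuple[int, int]:
    """Return the 1-based (first, last) article positions covered by a page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    first = (page - 1) * page_size + 1
    # The last page may be only partially filled.
    last = min(page * page_size, total_hits)
    return first, last


print(article_range(1, 100, 1300))  # (1, 100)
print(article_range(2, 100, 1300))  # (101, 200)
```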
To summarize, your goal is to iterate through all found pages and extract articles.
The whole process should be divided into 2 parts:
- Make 1 call and identify the total number of pages in total_pages
- Increment the page input until it reaches the total_pages value to get all articles.
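The two steps above can be sketched independently of any HTTP client. In this sketch, collect_all_articles and fake_fetch are hypothetical names, and the "articles" response key is an assumption; only total_pages and page come from the text above.

```python
from typing import Any, Callable, Dict, List


def collect_all_articles(
    fetch_page: Callable[[int], Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Fetch page 1, read total_pages, then fetch pages 2..total_pages."""
    first = fetch_page(1)
    articles = list(first.get("articles", []))  # "articles" key is assumed
    total_pages = int(first.get("total_pages", 1))
    for page in range(2, total_pages + 1):
        articles.extend(fetch_page(page).get("articles", []))
    return articles


# Stand-in for the real API: 250 fake articles served 100 per page.
def fake_fetch(page: int) -> Dict[str, Any]:
    data = [{"id": i} for i in range(250)]
    start = (page - 1) * 100
    return {"total_pages": 3, "articles": data[start:start + 100]}


print(len(collect_all_articles(fake_fetch)))  # 250
```

Passing the fetch function in as an argument keeps the loop easy to test and lets the same logic serve both /v2/search and /v2/latest_headlines.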
Python (SDK)
The Python library can be installed using pip install launched from the terminal. You can find all the details either on the PyPI website or in our GitHub repository.
Once installed, the package can be imported directly in your Python application.
We prepared two separate functions, get_search_all_pages and get_latest_headlines_all_pages, to simplify the process of extracting all articles.
Python (requests)
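Without the SDK, the same iteration can be done with the requests library. The following is a minimal sketch, not an official implementation: the base URL, the x-api-key header, the q parameter, the "articles" response key, and the one-second pause between calls are all assumptions to check against your own API reference, and search_all_pages is a hypothetical function name.

```python
import time
from typing import Any, Dict, List

# Assumed endpoint URL; only the /v2/search path comes from this page.
SEARCH_URL = "https://api.newscatcherapi.com/v2/search"


def search_all_pages(
    api_key: str,
    query: str,
    page_size: int = 100,
    pause_seconds: float = 1.0,
    session: Any = None,
) -> List[Dict[str, Any]]:
    """Request page 1, then keep incrementing page until total_pages is reached."""
    if session is None:
        import requests  # imported lazily so a custom session can be injected

        session = requests.Session()

    headers = {"x-api-key": api_key}  # assumed authentication header
    articles: List[Dict[str, Any]] = []
    page = 1
    while True:
        params = {"q": query, "page_size": page_size, "page": page}
        response = session.get(SEARCH_URL, headers=headers, params=params, timeout=30)
        response.raise_for_status()
        payload = response.json()
        articles.extend(payload.get("articles") or [])  # assumed response key
        if page >= int(payload.get("total_pages", 1)):
            break
        page += 1
        time.sleep(pause_seconds)  # assumed courtesy delay between calls
    return articles
```

A call would then look like search_all_pages(api_key="YOUR_API_KEY", query="your search terms"). The optional session argument exists so the loop can be exercised with a stub instead of live requests.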