PRODUCT UPDATES

Welcome to the Instapage Product Updates page. Here you can find the latest news about changes to the Instapage platform.

August 02, 2024

  • Expanded Sources List: 
    • We’ve broadened our source coverage by increasing the number of sources from 80,000 to 91,000. 
    • This expansion was driven by feedback from our clients, ensuring a more comprehensive and diverse news feed.
  • Enhanced Proxy Logic: 
    • We’ve optimized our proxy mechanisms, reducing by 80% the number of instances in which our data extractors are blocked.
    • This improvement ensures more consistent and reliable data extraction across various sources.

July 19, 2024

  • Preventing Historic Data Breakdowns:
    • To safeguard against data loss and service disruptions, we’ve implemented regular snapshots of our data, stored on AWS Glacier. This allows for quick recovery during downtimes.
    • Additionally, each year of historical data is now duplicated across two servers, ensuring data remains accessible and secure even in the event of a server failure.
  • Improved System Performance: 
    • We’ve added four new servers to our v3 historic clusters, enhancing data management and overall system performance.

June 14, 2024

  • New Clustering Algorithm on v3 API: 
    • We’ve introduced a more efficient clustering method for our v3 API and benchmarked it against our existing approach. 
    • The new method is approximately 1.75x faster, offering significant performance improvements without requiring any changes to your existing code or API calls.

June 07, 2024

  • is_opinion Flag Now Available:
    • The is_opinion attribute, already present in v3, is now available as a filter parameter in the API. This allows for more precise filtering of opinion articles in your data queries.
  • Improved Source Country Identification:
    • We’ve enhanced our logic for determining the country of origin for news sources. 
    • This update has reduced the number of sources marked as ‘unknown’ by over 5,500, improving the accuracy of geographical data.
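
As a rough illustration of the is_opinion filter described above, the sketch below builds a query string that excludes opinion pieces and applies the same filter client-side to records that carry the attribute. Only the is_opinion name comes from this update; the q parameter, the lowercase "true"/"false" encoding, and the sample records are assumptions for illustration, so check the API documentation for the actual request format.

```python
from urllib.parse import urlencode


def build_query(keyword: str, is_opinion: bool) -> str:
    """Build a URL query string that carries an is_opinion filter.

    Assumption: the API takes `q` for the keyword and a lowercase
    "true"/"false" value for `is_opinion`; neither is confirmed here.
    """
    params = {"q": keyword, "is_opinion": str(is_opinion).lower()}
    return urlencode(params)


def drop_opinion(articles: list) -> list:
    """Client-side equivalent: keep articles not flagged as opinion."""
    return [a for a in articles if not a.get("is_opinion", False)]


# Hypothetical sample records carrying the is_opinion attribute.
sample = [
    {"title": "Rates held steady", "is_opinion": False},
    {"title": "Why rates should fall", "is_opinion": True},
]

print(build_query("interest rates", is_opinion=False))
print([a["title"] for a in drop_opinion(sample)])
```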

May 31, 2024

  • Translated Articles on v3 API:
    • The v3 API now includes English translations for non-English articles. 
    • We’ve achieved a 90% translation rate for non-English content, providing broader access to global news in English.

May 24, 2024

  • New English Sentiment Model:
    • We’ve fine-tuned our sentiment analysis model using a synthetic dataset of over a million articles labeled with ChatGPT. 
    • The new model operates 10x faster and delivers improved accuracy, with F1 scores of 0.89 for non-finance articles and 0.87 for finance-related content.

May 17, 2024

  • Improved Language Detection:
    • We’ve fixed a bug that caused incorrect language identification due to certain text transformations. This fix enhances the accuracy of our language detection across articles.
  • Article Deduplication:
    • We’ve implemented a deduplication feature to identify and filter out republished or syndicated articles, ensuring that your data stream focuses on original content.
    • Comprehensive documentation is available for this feature.
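
To make the idea of deduplication concrete, here is a minimal sketch that treats two articles as duplicates when their text is identical after normalization. This is not the platform's actual method, which is not described here; the fingerprint and deduplicate helpers and the content field name are hypothetical.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing punctuation and whitespace."""
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(articles: list) -> list:
    """Keep the first article seen for each content fingerprint."""
    seen = set()
    unique = []
    for article in articles:
        key = fingerprint(article["content"])  # "content" is an assumed field name
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


# Hypothetical records: the second is a republished copy of the first.
articles = [
    {"source": "wire", "content": "Markets rallied on Friday."},
    {"source": "syndicated", "content": "Markets  rallied on friday"},
    {"source": "local", "content": "The council approved the budget."},
]

print([a["source"] for a in deduplicate(articles)])
```

Exact matching after normalization only catches near-verbatim republication; lightly edited syndicated copies would need a fuzzy similarity measure instead.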

May 10, 2024

  • New English Sentiment Model:
    • We’ve fine-tuned our sentiment analysis model using a synthetic dataset of over a million articles labeled with ChatGPT. 
    • The new model operates 10x faster and delivers improved accuracy, with F1 scores of 0.89 for non-finance articles and 0.87 for finance-related content.

May 03, 2024

  • Article Update Monitoring:
    • We’ve introduced a feature that checks whether an article has been updated after its initial publication. 
    • If changes are detected, we ensure the extracted version reflects the latest content, keeping your data current.
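
One simple way to picture the update check described above is to compare a hash of the stored text with a hash of a freshly fetched copy. The sketch below assumes a stored record with content and content_hash fields; these names and the hashing approach are illustrative only and are not taken from the platform.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable hash of the article text, ignoring outer whitespace."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


def refresh_if_updated(stored: dict, refetched_text: str) -> bool:
    """Replace the stored content when the re-fetched text differs.

    Returns True if the record was updated, False if nothing changed.
    """
    new_hash = content_hash(refetched_text)
    if new_hash == stored["content_hash"]:
        return False
    stored["content"] = refetched_text
    stored["content_hash"] = new_hash
    return True


# Hypothetical stored record and two later fetches of the same article.
original = "The vote is scheduled for Monday."
record = {"content": original, "content_hash": content_hash(original)}

print(refresh_if_updated(record, "The vote is scheduled for Monday."))
print(refresh_if_updated(record, "The vote has been moved to Tuesday."))
```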

April 19, 2024

  • Enhanced Parent URL Logic:
    • We’ve refined the logic for the parent_url attribute, which previously defaulted to the homepage of the news source where the article was first found. 
    • The new logic prioritizes section-specific URLs over homepage links, improving the contextual relevance of the parent URL data.
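
The prioritization described above can be sketched as follows, assuming the candidate URLs where an article was found are available as a list. The choose_parent_url helper and the rule that a homepage is any URL with an empty or root path are assumptions for illustration, not the platform's actual logic.

```python
from typing import Optional
from urllib.parse import urlparse


def is_homepage(url: str) -> bool:
    """Treat a URL with an empty or root path as a homepage."""
    return urlparse(url).path in ("", "/")


def choose_parent_url(candidates: list) -> Optional[str]:
    """Prefer the first section-specific URL; fall back to a homepage."""
    for url in candidates:
        if not is_homepage(url):
            return url
    return candidates[0] if candidates else None


# Hypothetical pages on which the same article link was found.
found_on = ["https://example.com/", "https://example.com/business"]
print(choose_parent_url(found_on))
```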

April 12, 2024

  • Text Formatting Preservation:
    • We’ve improved our text extraction process to preserve formatting, ensuring that more than 90% of articles maintain clear paragraph splits. 
    • This enhancement provides cleaner and more readable data.
  • V3 API SDKs Launched:
    • We’ve launched SDKs for the v3 API in multiple programming languages, including Python, C#, Java, Go, and TypeScript, making it easier to integrate our API into various development environments.

March 29, 2024

  • Additional Historical Data: 
    • Our v3 API now includes NLP-enriched articles dating back to the beginning of July 2023. 
  • Improved Latency:
    • We’ve deployed a dedicated processing pipeline for priority sources, ensuring that these articles are indexed in under 5 minutes, down from the usual 15-60 minute delay.

March 25, 2024

  • Author Extraction Enhancement:
    • We’ve improved our extraction methods to better identify author names within the article content, including in-text endings like “…written by John Smith.” 
    • This ensures more accurate attribution of articles to their authors.
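
As a simplified illustration of catching an in-text ending such as the one quoted above, the sketch below uses a regular expression to pull a capitalized name that follows "written by" at the end of the article text. The pattern and the extract_author helper are hypothetical; the actual extraction methods are not described here and are likely more robust.

```python
import re
from typing import Optional

# Matches "written by <Capitalized Name>" at the very end of the text,
# with an optional trailing period.
BYLINE_ENDING = re.compile(
    r"[Ww]ritten by\s+([A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*)\s*\.?\s*$"
)


def extract_author(text: str) -> Optional[str]:
    """Return the author named in a trailing byline, or None if absent."""
    match = BYLINE_ENDING.search(text)
    return match.group(1) if match else None


article = "The vote is expected next week. This article was written by John Smith."
print(extract_author(article))
```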