
Can WPSOLR index PDFs & Other Documents not located in the WordPress Media Library?

November 14, 2016, updated on July 13, 2019, by admin

Yes, WPSOLR can index PDFs & Other Documents not located in the WordPress Media Library

The standard WordPress search, like most WordPress search plugins, cannot search the encoded content of documents at all. The few plugins that can often lack the ability to fetch documents located outside the Media Library.

Steps to index a file, in the media library or not, in WPSOLR

WPSOLR indexes files from the media library or from an external URL

– WPSOLR loads the file from the media library, or downloads it from an external URL
– WPSOLR sends the encoded file content to Apache Solr, whose Tika integration decodes it to plain text (extraction)
– WPSOLR adds the decoded plain text to the post content
– WPSOLR indexes the post
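
The four steps above can be sketched as a small pipeline. This is only an illustration of the flow, not WPSOLR's actual code: the `Post` class, the function `index_post_with_file`, and the three callables `load_file`, `extract_text` and `index` are hypothetical names introduced here. In a real setup, `extract_text` would hand the bytes to the search server's extraction endpoint, and `index` would send the document to the search index.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Post:
    """Hypothetical stand-in for a WordPress post."""
    title: str
    content: str


def index_post_with_file(
    post: Post,
    file_location: str,
    load_file: Callable[[str], bytes],
    extract_text: Callable[[bytes], str],
    index: Callable[[Post], None],
) -> Post:
    """Run the four steps for one post and one attached file."""
    # Step 1: load the file (media library path or external URL).
    raw_bytes = load_file(file_location)
    # Step 2: decode the encoded content (PDF, Word, ...) to plain text.
    plain_text = extract_text(raw_bytes)
    # Step 3: add the decoded text to the post content.
    # Assumption of this sketch: a copy is enriched, the original post is untouched.
    enriched = Post(post.title, post.content + "\n" + plain_text)
    # Step 4: index the enriched post.
    index(enriched)
    return enriched
```

Passing the loader, extractor and indexer as arguments keeps the sketch independent of any particular search server, and makes each step easy to replace with a fake when testing.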

Which URLs are indexed?

WPSOLR is not a web crawler, so it does not follow the hyperlinks contained in a post body.

Instead, the user has to do one of the following:

– Add an ACF file field (of type “url”) to the post
– Insert a shortcode of the plugin “Embed Any Document” in the post content
– Insert a shortcode of the plugin “PDF Embedder” in the post content
– Insert a shortcode of the plugin “Google Doc Embedder” in the post content
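
To illustrate why a shortcode is picked up while a plain hyperlink is not, here is a minimal sketch of how document URLs could be collected from a post body. It is not WPSOLR's implementation. The shortcode tags and attribute names (`embeddoc`/`url`, `pdf-embedder`/`url`, `gview`/`file`) are assumptions based on the usual syntax of those three plugins, and `find_document_urls` is a hypothetical helper.

```python
import re

# Assumed shortcode tag -> attribute that carries the document URL.
SHORTCODE_URL_ATTRS = {
    "embeddoc": "url",       # Embed Any Document (assumed syntax)
    "pdf-embedder": "url",   # PDF Embedder (assumed syntax)
    "gview": "file",         # Google Doc Embedder (assumed syntax)
}

# Matches [tag attr="value" ...] and captures the tag and the raw attributes.
SHORTCODE_RE = re.compile(r"\[([\w-]+)((?:\s+[^\]]*)?)\]")
# Matches name="value" or name='value' inside the raw attributes.
ATTR_RE = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def find_document_urls(post_content: str) -> list[str]:
    """Return the URLs declared in known document shortcodes, in order."""
    urls = []
    for match in SHORTCODE_RE.finditer(post_content):
        tag, raw_attrs = match.group(1), match.group(2)
        wanted = SHORTCODE_URL_ATTRS.get(tag)
        if wanted is None:
            continue  # not one of the known document shortcodes
        for name, double_quoted, single_quoted in ATTR_RE.findall(raw_attrs):
            if name == wanted:
                urls.append(double_quoted or single_quoted)
    return urls
```

A plain `<a href="...">` link in the post body is ignored by this helper, which mirrors the "not a web crawler" behavior described above: only explicitly declared files are considered.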

 

 

Tags: attached file, media library, not reviewed, pdf
