Inquiry: Solr Search indexing speed for large pdfs ~10-60mb and in bulk quantity
- smeettParticipant1 year ago #31502
Hi, I have been exploring Solr search to be used on a custom wordpress setup hosted on AWS. Our website project has ~4000 pdfs which are of size anywhere between 10-60mb. If we wanted to OCR index all of them, how long does it take with this plugin? I am happy to update php memory limits or timeout settings to expedite the process but want to ensure there is support for the aforementioned size/quantity. Any specific search API to be utilised/recommended if not Solr search for this use case: https://www.wpsolr.com/pricing/1 year ago #31507
Our AI extraction add-on can send your documents to Google Vision before indexing the extracted content.
This API can send documents in batch (as opposed to the Amazon Rekognition API).smeettParticipant1 year ago #31513
Nope, text-heavy pdfs. These are primarily handbooks, handouts or other legal documents. There are images within the pdfs but they’re not required to be indexed or simply need to be ignored. Is there a short trial where I can experiment/install the plugin to identify the time required/taken?
You must be logged in to reply to this topic.