Data ingestion is crucial to a good Apache Solr search

One option is web scraping, which brings many data quality issues. A better option is to extract and clean the data directly from the database.

This is exactly what WPSOLR does: the plugin understands how your WordPress or WooCommerce data is stored, and can therefore index it into Solr in the most effective way.

What is a post type, or a product attribute? How should they be sent to Solr so they can be used as facets and filters?

This is where WPSOLR shines.
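To make the idea concrete, here is a minimal sketch of how a product record could be flattened into a Solr document so that its post type and attributes become facetable fields. It is not WPSOLR's actual code or schema: the to_solr_doc function, the sample record, and the field names are hypothetical, and the _s, _ss and _t suffixes assume the dynamic fields of Solr's default configset.

```python
import json


def to_solr_doc(post):
    """Flatten a WordPress-style record into a Solr document.

    Field names are hypothetical and rely on dynamic field suffixes:
    _s (single string), _ss (multi-valued strings), _t (tokenized text).
    """
    doc = {
        "id": str(post["id"]),
        "type_s": post["post_type"],  # exact-match string, usable as a facet
        "title_t": post["title"],     # tokenized text for full-text search
    }
    # One multi-valued string field per product attribute, e.g. attr_color_ss.
    for name, values in post.get("attributes", {}).items():
        doc[f"attr_{name}_ss"] = list(values)
    return doc


# Hypothetical WooCommerce product as it might be read from the database.
product = {
    "id": 42,
    "post_type": "product",
    "title": "Blue cotton t-shirt",
    "attributes": {"color": ["blue"], "size": ["M", "L"]},
}

doc = to_solr_doc(product)
print(json.dumps(doc, indent=2))
```

Once documents shaped like this are indexed, a query could request facets with Solr's facet.field parameter, for example on type_s or attr_color_ss. Keeping facet values in untokenized string fields is the design choice that matters here: a scraped page would only give Solr a blob of text, not separate values to count and filter on.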

It can also index files from the media library, such as PDFs or .docx files.

See it for yourself with WooCommerce + Apache Solr +

