A world-class multilingual search
WPSOLR is perfectly integrated with the most popular WordPress multilingual plugins: WPML and Polylang. Your search should be no less multilingual than your posts, products, or pages, should it?
And true multilingual search is complex, because it has to cope with many layers, from search content to static and dynamic UI elements.
We will explain all of them in detail in the next chapters.
Multilingual search content
Your search content is made of the bits and pieces of your post types: title, text, excerpt, taxonomies, terms, and custom fields. WooCommerce products also have attributes and variations.
When a post is saved, its different pieces are sent to a Solr index, which transforms them according to the instructions in the index’s schema.xml file.
For instance, text is split into individual words. Punctuation and meaningless words (“stopwords”) are removed. Words are reduced to a common root (“stemming”). Synonyms are matched, and so on.
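The transformations described above can be pictured with a toy Python sketch. It is not Solr's implementation: the stopword set, the synonym map, and the `stem` helper are simplified assumptions made up for illustration, whereas a real index takes them from the analyzers and filters declared in schema.xml.

```python
import re

# Toy data, assumed for this example only.
STOPWORDS = {"the", "a", "an", "and", "of", "is", "are"}
SYNONYMS = {"laptop": "notebook"}


def stem(word: str) -> str:
    """Naive suffix stripping; real stemmers are far more careful."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text: str) -> list[str]:
    # 1. Lowercase and split into words, dropping punctuation.
    tokens = re.findall(r"\w+", text.lower())
    # 2. Remove stopwords.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # 3. Reduce each word to a common root.
    tokens = [stem(t) for t in tokens]
    # 4. Map synonyms onto a single term.
    return [SYNONYMS.get(t, t) for t in tokens]


print(analyze("The laptops and the boxes are arriving!"))
# ['notebook', 'box', 'arriv']
```

A search for “notebook” would then match a post that only mentions “laptops”, because both end up as the same indexed term.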
But wait a minute … This is true in English, where sentences are made of words separated by white space and punctuation. But is it true everywhere in the world, in languages with other writing systems (Chinese, Japanese, Korean, Russian …)? Chinese and Japanese, for example, do not put spaces between words.
It seems each language needs specific text transformations. Fortunately, Apache Solr can manage them all, thanks to its numerous language-specific filters that can be inserted in an index’s schema.xml.
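To see why one set of rules cannot fit every language, here is a small Python sketch comparing whitespace splitting with overlapping character bigrams, which is similar in spirit to the bigram approach Solr offers for CJK text. The function names are assumptions for illustration, and the sketch is far simpler than a real language filter.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split on white space, which works for English-like languages."""
    return text.split()


def char_bigrams(text: str) -> list[str]:
    """Overlapping two-character tokens, ignoring white space."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


english = "multilingual search engine"
chinese = "多语言搜索引擎"  # the same phrase in Chinese, written without spaces

print(whitespace_tokens(english))  # three separate words
print(whitespace_tokens(chinese))  # one single, unsearchable token
print(char_bigrams(chinese))       # includes "搜索" (search) and "引擎" (engine)
```

With whitespace splitting alone, a Chinese visitor searching for “搜索” would find nothing, because the whole sentence was indexed as one token.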
One or several indexes?
But we are dealing with several languages. So, should we store content in one index or in several? Well, this is a question WPSOLR has solved for you.
One index per language is incredibly flexible
When you’re using WPSOLR with WPML or Polylang, each piece of content is stored in the index of its language. So, the French content is stored in an index with a French-tailored schema.xml that deals with accents, while the Chinese content goes to its own index with a Chinese-tailored schema.xml.
Why this choice? Because it is the most flexible. With one Solr index per language, you can tweak each language as much as you like, not only in schema.xml but also in solrconfig.xml.
Workflow to index a post type
Workflow to search results for a language
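Both workflows come down to choosing the index from the language. The Python sketch below illustrates that routing only; the index URLs, the `language` key, and the function names are hypothetical, and this is not WPSOLR's actual code. It assumes Solr's standard `/update` and `/select` handlers.

```python
from urllib.parse import urlencode

# Hypothetical mapping: one Solr index per language.
INDEX_BY_LANGUAGE = {
    "en": "http://localhost:8983/solr/wp_en",
    "fr": "http://localhost:8983/solr/wp_fr",
    "zh": "http://localhost:8983/solr/wp_zh",
}


def index_url_for(language: str) -> str:
    """Return the index configured for a language, or fail loudly."""
    try:
        return INDEX_BY_LANGUAGE[language]
    except KeyError:
        raise ValueError(f"No index configured for language {language!r}")


def update_url(post: dict) -> str:
    """Indexing: a saved post goes to the index of its own language."""
    return index_url_for(post["language"]) + "/update"


def search_url(keywords: str, language: str) -> str:
    """Searching: a query goes to the index of the visitor's language."""
    return index_url_for(language) + "/select?" + urlencode({"q": keywords})


print(update_url({"id": 42, "language": "fr"}))
print(search_url("chaussures rouges", "fr"))
```

Because each index has its own schema.xml, the same field name can be used everywhere: the language-specific behavior lives in the index, not in the client.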
One shared index is less flexible and more complex
With a shared index, you can certainly define specific fields with language-specific filters, but at the expense of much more complexity on the client side, which has to call field_en, field_fr, or field_cn, depending on the language context.
And you cannot define a default search field per language, or customize the search handler with a default parameter or a Java plugin per language, among many other things.
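The client-side complexity of a shared index can be sketched in Python as follows. The field names follow the field_en / field_fr / field_cn pattern mentioned above; the helper names and the query shape are assumptions for illustration.

```python
# Language suffixes, following the field_en / field_fr / field_cn pattern.
SUPPORTED_LANGUAGES = ("en", "fr", "cn")


def localized_field(base: str, language: str) -> str:
    """Every client call has to pick the right field for the language."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language {language!r}")
    return f"{base}_{language}"


def build_query(keywords: str, language: str) -> dict:
    """Build query parameters that target the language-specific field."""
    field = localized_field("field", language)
    return {"q": f"{field}:({keywords})"}


print(build_query("chaussures rouges", "fr"))
# {'q': 'field_fr:(chaussures rouges)'}
```

Every query, facet, and sort then has to go through this kind of lookup, which is exactly the bookkeeping that one index per language avoids.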
Multilingual static content
Static content is the text contained in the PHP files that build the search results page displayed to users.
It can be the “Search” label on the search form submit button. Or the “Sort by” label of the sort drop-down list. Or the “Next page” label in the search navigation bar.
WPSOLR is delivered with POT and PO files that you can modify at will. You can also create your own language translation if necessary.
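As a rough picture of how such translation files behave, here is a Python sketch of a label lookup that falls back to the original string when no translation exists, which is how gettext-style catalogs work. The catalog contents and the `translate` helper are assumptions for illustration, standing in for real PO files.

```python
# Hypothetical catalogs, standing in for translated PO files.
CATALOGS = {
    "fr": {"Search": "Rechercher", "Sort by": "Trier par"},
}


def translate(label: str, language: str) -> str:
    """Return the translated label, or the original label if none exists."""
    return CATALOGS.get(language, {}).get(label, label)


print(translate("Search", "fr"))     # Rechercher
print(translate("Next page", "fr"))  # Next page (not translated yet)
print(translate("Search", "de"))     # Search (no German catalog)
```

An untranslated label is therefore never lost: it simply shows up in the original language until someone adds it to the PO file.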
Multilingual dynamic content
Dynamic content is the text stored in the database, related to your search content, that is used to build the search results page displayed to users.
For instance, the list of sort items (“Less expensive first”, “Newest first”, …), or the list of post types (“Post”, “Page”, “Product”, “Knowledge base”, …) in facets.
WPSOLR calls WPML/Polylang actions and filters to retrieve a translation for the dynamic content from their respective:
Which languages can be managed by WPSOLR?
This is a list of languages officially supported by Apache Solr:
- Brazilian Portuguese
- Simplified Chinese
- Hebrew, Lao, Myanmar, Khmer