Diacritic character replacement search does not work properly
- leomoon (Participant), 5 years, 1 month ago, #13927
When searching with diacritical characters (ć, č, š, đ, ž), the search should also match their plain forms (c, s, d, z) as secondary results. This works in the quicksearch results, but not in the full results.
On my site at https://test.antikvarijatknjiga.hr you can try with Andric (also returns Andrić, as it should), Dorde (Đorđe), Sibenik (Šibenik), Zivotic (Životić), Covjek (Čovjek).
- wpsolr (Keymaster), 5 years, 1 month ago, #13930
This is related to the search engine configuration, more precisely to the analyzers.
Special characters of a language can be handled by language-specific filters.
For instance:
– https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-ASCIIFoldingFilter or https://lucene.apache.org/solr/guide/6_6/charfilterfactories.html#CharFilterFactories-solr.MappingCharFilterFactory, to replace characters with their Latin form: “á” => “a”
– Hungarian stemmers like https://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/hu/HungarianLightStemmer.html
….
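To illustrate the idea behind such folding filters, here is a minimal Python sketch. It is not Solr's implementation, only an approximation of the "á" => "a" replacement using Unicode decomposition. The names `fold`, `matches` and `EXPLICIT` are hypothetical, and the explicit table is an assumption that covers only "đ"/"Đ", which have no Unicode decomposition and so need their own mapping.

```python
import unicodedata

# Hypothetical mapping for letters that Unicode does not decompose.
# Only the stroke-d from this thread is covered here.
EXPLICIT = str.maketrans({"đ": "d", "Đ": "D"})


def fold(text: str) -> str:
    """Return text with diacritics replaced by their plain Latin letters."""
    text = text.translate(EXPLICIT)
    # NFD splits "ć" into "c" plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, indexed: str) -> bool:
    """Compare two terms after folding both sides the same way."""
    return fold(query).casefold() == fold(indexed).casefold()


if __name__ == "__main__":
    for query, indexed in [("Andric", "Andrić"), ("Dorde", "Đorđe"),
                           ("Sibenik", "Šibenik"), ("Zivotic", "Životić"),
                           ("Covjek", "Čovjek")]:
        print(query, indexed, matches(query, indexed))
```

The point of the sketch is that the same folding is applied to both the query and the indexed term. In a search engine, that would correspond to using the same folding step in both the index-time and query-time analysis of the field being searched.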