Forum Replies Created
- AndreasJ (Participant), 1 year, 11 months ago, in reply to: Indexing error : Invalid UTF-8 middle byte 0x3c #33168
Yes, this works for me.
This resolved the problems with around 15 documents out of 1500. Would you consider including this change in an upcoming release?
I found similar code in
class-wpsolr-model-abstract.php:132
that might need this change (and perhaps many more places?).
I might add that I am using:
Apache Solr, Opensolr SW-SOLR-8-0
and PHP 8.0, WordPress 6.2
- AndreasJ (Participant), 1 year, 11 months ago, in reply to: Indexing error : Invalid UTF-8 middle byte 0x3c #33164
I had a similar problem. I think the cause is a multibyte character that gets truncated in the field snippet_s. See the question mark in the snippet:
"snippet_s": ". {\n \"S.Arsene.pdf\":\" Y-a-t-il un int\u00e9r?"
This originates from the content field:
"content": ". {\n \"S.Arsene.pdf\":\" Y-a-t-il un int\u00e9r\u00eat \u00e0 r\u00e9aliser
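The truncation can be reproduced outside the plugin. This is only a minimal sketch: the sample string is copied from the content above, and the cut position is chosen by hand so that it lands inside the two-byte "ê" (the plugin's real limit comes from its highlighting setting, which is not reproduced here).

```php
<?php
// Sample text taken from the "content" field above; "\u{..}" escapes keep
// this file independent of its own encoding.
$content = "Y-a-t-il un int\u{e9}r\u{ea}t \u{e0} r\u{e9}aliser";

// Byte offset of "ê" (U+00EA, encoded in UTF-8 as the two bytes C3 AA).
$pos = strpos( $content, "\u{ea}" );

// substr() counts bytes, so this keeps only the first byte of "ê".
$snippet = substr( $content, 0, $pos + 1 );

// The full text is valid UTF-8, the byte-truncated snippet is not.
var_dump( mb_check_encoding( $content, 'UTF-8' ) ); // bool(true)
var_dump( mb_check_encoding( $snippet, 'UTF-8' ) ); // bool(false)

// The snippet now ends with a lone lead byte.
echo bin2hex( substr( $snippet, -1 ) ), "\n"; // c3
```

A lone lead byte like this makes whatever byte follows it in the request look like an invalid continuation byte, which would be consistent with the "Invalid UTF-8 middle byte" error, although I have not traced the exact request.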
Fix for Version 23.0:
I suggest fixing the code at wpsolr-pro/wpsolr/core/classes/models/post/class-wpsolr-model-post.php:257 by replacing substr with mb_substr.
Patch:
diff --git a/wp-content/plugins/wpsolr-pro/wpsolr/core/classes/models/post/class-wpsolr-model-post.php b/wp-content/plugins/wpsolr-pro/wpsolr/core/classes/models/post/class-wpsolr-model-post.php
index a13bde04..2b10a92c 100644
--- a/wp-content/plugins/wpsolr-pro/wpsolr/core/classes/models/post/class-wpsolr-model-post.php:257
+++ b/wp-content/plugins/wpsolr-pro/wpsolr/core/classes/models/post/class-wpsolr-model-post.php
@@ -253,8 +253,8 @@ class WPSOLR_Model_Post extends WPSOLR_Model_Abstract {
             static::$highlight_fragsize = WPSOLR_Service_Container::getOption()->get_search_max_length_highlighting();
         }
         $snippet = strip_tags( $pexcerpt );
-        $this->solarium_document_for_update[ WpSolrSchema::_FIELD_NAME_SNIPPET_S ] =
-            ( ! empty( $snippet ) ) ? $snippet : substr( $this->solarium_document_for_update[ WpSolrSchema::_FIELD_NAME_CONTENT ], 0, static::$highlight_fragsize );
+        $this->solarium_document_for_update[ WpSolrSchema::_FIELD_NAME_SNIPPET_S ] =
+            ( ! empty( $snippet ) ) ? $snippet : mb_substr( $this->solarium_document_for_update[ WpSolrSchema::_FIELD_NAME_CONTENT ], 0, static::$highlight_fragsize );
     }
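One side effect worth noting: mb_substr counts characters, not bytes, so with the patch the snippet can be a few bytes longer than before. If the limit is meant as a byte budget, mb_strcut might be an alternative, since it cuts by bytes but backs off to a character boundary. A small sketch comparing the two; the sample text and the limit of 19 are assumptions for illustration, and the encoding is passed explicitly rather than relying on the mbstring default.

```php
<?php
// Sample text from the "content" field above.
$content = "Y-a-t-il un int\u{e9}r\u{ea}t \u{e0} r\u{e9}aliser";

// Hypothetical limit; as a byte offset it falls in the middle of "ê".
$limit = 19;

// Character-based cut: up to $limit characters, possibly more bytes.
$by_chars = mb_substr( $content, 0, $limit, 'UTF-8' );

// Byte-based cut that never splits a character: at most $limit bytes.
$by_bytes = mb_strcut( $content, 0, $limit, 'UTF-8' );

// Both results stay valid UTF-8, unlike a plain substr() at this offset.
var_dump( mb_check_encoding( $by_chars, 'UTF-8' ) );
var_dump( mb_check_encoding( $by_bytes, 'UTF-8' ) );

// Compare character counts and byte counts of the two results.
printf( "mb_substr: %d chars, %d bytes\n", mb_strlen( $by_chars, 'UTF-8' ), strlen( $by_chars ) );
printf( "mb_strcut: %d chars, %d bytes\n", mb_strlen( $by_bytes, 'UTF-8' ), strlen( $by_bytes ) );
```

Either function avoids the broken character; which one fits depends on whether the highlighting length setting is meant in characters or in bytes, which I do not know for this plugin.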
- This reply was modified 1 year, 11 months ago by AndreasJ. Reason: format