Search keyword not working properly
wpsolr (Keymaster), 2 years, 2 months ago, #30117
This is a file format issue, often caused by XML (generated from a PDF, for instance) containing the forbidden "<" character. There is no obvious solution apart from removing the documents or fixing their content: https://www.wpsolr.com/forums/topic/indexing-error-invalid-utf-8-middle-byte-0x3c/
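For context on the error named in the linked topic: in UTF-8, every byte that follows the lead byte of a multi-byte character must fall in the range 0x80 to 0xBF, and 0x3c is the ASCII code for "<". A parser that meets "<" in that position therefore reports an invalid middle byte. Below is a minimal Python sketch for locating such positions, assuming you can export the raw bytes of a suspect document. `find_invalid_utf8` is a hypothetical helper name, not part of WPSOLR or Solr, and the sample bytes are made up for illustration.

```python
def find_invalid_utf8(data: bytes) -> list[tuple[int, str]]:
    """Return (byte offset, reason) for every position that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice being decoded
            problems.append((pos + exc.start, exc.reason))
            pos += max(exc.end, exc.start + 1)
    return problems


# Illustrative input: 0xC3 announces a two-byte character,
# but the byte that follows is "<" (0x3C) instead of a continuation byte.
sample = b"caf\xc3<b>bold</b>"
for offset, reason in find_invalid_utf8(sample):
    print(f"byte #{offset}: {reason}")
```

Running the helper over the exported content of each suspect document should narrow down which ones need to be fixed or removed.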
hsurekar (Participant), 2 years, 2 months ago, #30232
OK, now I am getting the error below, but it does not show any post ID that I could check. Could you please let me know how to fix it?
Batch process : 1
Debug mode : On
An error or timeout occured.
Error code: parsererror
Error message: SyntaxError: Unexpected token P in JSON at position 0
Posts excluded from the index:<br><b></b><br><br>******** DEBUG ACTIVATED – Beginning of new loop (batch size) *******<br><br>******** DEBUG ACTIVATED – Query documents from last post date *******<br><br>Query:<br><b>SELECT ID, post_modified, post_parent, post_type FROM wp_1_posts AS A WHERE ((post_modified = ‘2010-12-10 17:34:08’ AND ID > 97920) OR (post_modified > ‘2010-12-10 17:34:08’)) AND ( post_status IN (‘publish’) AND ( post_type = ‘post’ ) ) ORDER BY post_modified ASC, ID ASC LIMIT 1</b><br><br>Last post date:<br><b>2010-12-10 17:34:08</b><br><br>Last post ID:<br><b>97920</b><br><br>Post to be sent:<br><b>{ “id”: “97923”, “PID”: “97923”, “type”: “post”, “meta_type_s”: “post_type”, “displaymodified”: “2010-12-10T18:05:00Z”, “title”: “CJAM relocates to Adams court “, “title_s”: “CJAM relocates to Adams court “, “permalink”: “https:\/\/masslawyersweekly.com\/2010\/12\/10\/cjam-relocates-to-adams-court\/”, “post_status_s”: “publish”, “content”: “Chief Justice for Administration and Management Robert A. Mulligan has moved his office from Boston\u2019s Center Plaza to the first-floor mezzanine of the John Adams Courthouse.\r\n\r\nThe move further reduces the space leased at Center Plaza for the Administrative Office of the Trial Court. Since late 2008, consolidations and relocations have reduced the leased space there by 32 percent, and the landlord has renegotiated the lease.\r\n\r\nDepartments remaining at Center Plaza include human resources, legal, fiscal, support services and the Judicial Institute. The Administrative Office of the Juvenile Court and the Sentencing Commission also remain at Center Plaza.\r\n\r\nTelephone numbers for the CJAM, Chief of Staff Bob Panneton and Executive Director Frank Carney remain the same. The new address is: John Adams Courthouse, One Pemberton Square, Boston, MA, 02108. Mail for the AOTC departments should continue to be sent to Center Plaza.”, “snippet_s”: “Chief Justice for Administration and Management Robert A. 
Mulligan has moved his office from Boston?”, “post_author_s”: “542”, “author”: “Mass. Lawyers Weekly Staff”, “menu_order_i”: 0, “PID_i”: “97923”, “author_s”: “https:\/\/masslawyersweekly.com\/author\/mass-lawyersweeklystaff\/”, “displaydate”: “2010-12-10T18:05:00Z”, “displaydate_dt”: “2010-12-10T18:05:00Z”, “date”: “2010-12-10T23:05:00Z”, “displaymodified_dt”: “2010-12-10T18:05:00Z”, “modified”: “2010-12-10T23:05:00Z”, “displaymodified_dt_i”: “1292004300000”, “displaymodified_dt_y_i”: “2010”, “displaymodified_dt_ym_i”: “12”, “displaymodified_dt_yw_i”: “49”, “displaymodified_dt_yd_i”: “344”, “displaymodified_dt_md_i”: “10”, “displaymodified_dt_wd_i”: “6”, “displaymodified_dt_dh_i”: “18”, “displaymodified_dt_dm_i”: “5”, “displaymodified_dt_ds_i”: 0, “displaydate_dt_i”: “1292004300000”, “displaydate_dt_y_i”: “2010”, “displaydate_dt_ym_i”: “12”, “displaydate_dt_yw_i”: “49”, “displaydate_dt_yd_i”: “344”, “displaydate_dt_md_i”: “10”, “displaydate_dt_wd_i”: “6”, “displaydate_dt_dh_i”: “18”, “displaydate_dt_dm_i”: “5”, “displaydate_dt_ds_i”: 0, “date_i”: “1292022300000”, “date_y_i”: “2010”, “date_ym_i”: “12”, “date_yw_i”: “49”, “date_yd_i”: “344”, “date_md_i”: “10”, “date_wd_i”: “6”, “date_dh_i”: “23”, “date_dm_i”: “5”, “date_ds_i”: 0, “displaydate_i”: “1292004300000”, “displaydate_y_i”: “2010”, “displaydate_ym_i”: “12”, “displaydate_yw_i”: “49”, “displaydate_yd_i”: “344”, “displaydate_md_i”: “10”, “displaydate_wd_i”: “6”, “displaydate_dh_i”: “18”, “displaydate_dm_i”: “5”, “displaydate_ds_i”: 0, “modified_i”: “1292022300000”, “modified_y_i”: “2010”, “modified_ym_i”: “12”, “modified_yw_i”: “49”, “modified_yd_i”: “344”, “modified_md_i”: “10”, “modified_wd_i”: “6”, “modified_dh_i”: “23”, “modified_dm_i”: “5”, “modified_ds_i”: 0, “comments”: [], “numcomments”: 0, “categories_str”: [ “News Briefs” ], “categories”: [ “News Briefs”, “MALW”, “Subscriber Only”, “Yes” ], “flat_hierarchy_categories_str”: [ “News Briefs” ], “non_flat_hierarchy_categories_str”: [ “News Briefs” ], 
“tags”: [ “CJAM”, “Dec. 13 2010 issue.” ], “dmcss_pub_code_str”: [ “MALW” ], “dmcss_security_policy_str”: [ “Subscriber Only” ], “we_own_it_str”: [ “Yes” ] }</b><br><br>{“nb_results”:0,”status”:400,”message”:”Solr HTTP error: Bad Request (400)\n{\n "responseHeader":{\n "status":400,\n "QTime":0},\n "error":{\n "metadata":[\n "error-class","org.apache.solr.common.SolrException",\n "root-error-class","java.io.CharConversionException"],\n "msg":"Invalid UTF-8 middle byte 0x3c (at char #1563, byte #127)",\n "code":400}}\n”,”indexing_complete”:false}
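The `parsererror` reported above is consistent with the debug text itself: with debug mode on, the response body begins with "Posts excluded from the index:" rather than with JSON, so a JSON parser stops at the very first character, "P". The JSON status object only appears at the end of the body. Here is a rough Python sketch of pulling that trailing object out of a mixed response, assuming the body has been saved as a string. `extract_trailing_json` is a hypothetical helper, not a WPSOLR function, and the sample body is shortened for illustration.

```python
import json


def extract_trailing_json(body: str):
    """Return the JSON value that ends the response body, or None."""
    text = body.rstrip()
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, end = decoder.raw_decode(text, start)
            if end == len(text):  # accept only an object that closes the body
                return value
        except json.JSONDecodeError:
            pass  # this "{" does not start valid JSON; try the next one
        start = text.find("{", start + 1)
    return None


# Shortened, illustrative body: debug HTML first, JSON status last
body = (
    'Posts excluded from the index:<br><b></b><br>'
    'Post to be sent:<br><b>{ "id": "97923" }</b><br><br>'
    '{"nb_results":0,"status":400,"indexing_complete":false}'
)

try:
    json.loads(body)
except json.JSONDecodeError as exc:
    print("Not JSON as a whole:", exc)

status = extract_trailing_json(body)
print(status["status"], status["indexing_complete"])
```

In this thread the trailing object is where the actual cause sits: the Solr 400 response with "Invalid UTF-8 middle byte 0x3c".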
hsurekar (Participant), 2 years, 2 months ago, #30235
I found the issue. Please see the post content below.
If I add the content below, I get the error:
“It’s no longer just the customers and internal clients and regulators of our business, but additional pressures externally from the SEC, analysts and shareholders.””
If I add the content below, it works properly:
“It’s no longer just the customers and internal clients and regulators of our business, but additional pressures externally from the SEC, analysts and shareholders.”
So could you please check and provide a solution for this quote-related issue?
Thanks
Hemant
wpsolr (Keymaster), 2 years, 2 months ago, #30237
If I add the content below, I get the error:
“It’s no longer just the customers and internal clients and regulators of our business, but additional pressures externally from the SEC, analysts and shareholders.””
Is this in your PDF content, or in your post’s description content?
hsurekar (Participant), 2 years, 2 months ago, #30250
Please see below the content we are adding. The editor on your site converts the quotes to normal quotes, but the WordPress editor does not.
One thing about him that might surprise people: “I’m pretty involved in charities involving kids, like Boys Town New England, where I’m on the board of directors.”
hsurekar (Participant), 2 years, 2 months ago, #30251
I have attached a TXT file with that content; please check it.
TXT file : https://drive.google.com/file/d/1CAqqjVwZIix_ayMikrDOq7Om1Uj3NoUT/view?usp=sharing
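One possible explanation for the quote-related failure reported in this thread, offered as an assumption rather than a confirmed diagnosis: a curly quote such as ’ (U+2019) occupies three bytes in UTF-8, so any step that cuts a string by bytes rather than by characters can leave a partial sequence behind. The `snippet_s` value in the debug output ends in "Boston?" where the content has "Boston’s", which would fit that pattern. A short Python sketch of the effect follows; `truncate_chars` is a hypothetical helper name and the appended `<br>` is only an illustrative stand-in for whatever markup follows the cut.

```python
def truncate_chars(text: str, limit: int) -> str:
    """Cut by characters, so a multi-byte character is never split."""
    return text[:limit]


content = "moved his office from Boston\u2019s Center Plaza"
raw = content.encode("utf-8")

# Cut by bytes right after the lead byte of U+2019 (0xE2 0x80 0x99),
# then append markup: the lead byte is now followed by "<" (0x3C).
cut = raw.index(b"\xe2") + 1
broken = raw[:cut] + b"<br>"
try:
    broken.decode("utf-8")
except UnicodeDecodeError as exc:
    print("byte cut:", exc.reason)

# Cutting by characters keeps the result valid UTF-8
safe = truncate_chars(content, content.index("\u2019") + 1)
print("char cut:", safe.encode("utf-8").decode("utf-8"))
```

In PHP the same distinction exists between `substr`, which counts bytes, and `mb_substr`, which counts characters. Whether the plugin's snippet code is actually affected is not established by this thread.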