Patch #30037

Allow single Chinese character as a search keyword

Added by Go MAEDA almost 6 years ago. Updated almost 6 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Search engine
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Currently, Redmine requires search keywords to be at least 2 characters long. This is not a problem for languages such as English and French.

But for languages such as Japanese and Chinese, the limitation is inconvenient for users because some words consist of only a single character. Some examples:

  • 金 = money in Japanese and Chinese
  • 水 = water in Japanese and Chinese
  • 猪 = a wild boar in Japanese, a pig in Chinese
  • 陈 = a common family name for Chinese people

To allow such single-character search keywords, I suggest the following patch. It accepts keywords that consist of a single multibyte character while keeping the current 2-character minimum for ASCII keywords.

diff --git a/lib/redmine/search.rb b/lib/redmine/search.rb
index 674022151..d4c0b9b20 100644
--- a/lib/redmine/search.rb
+++ b/lib/redmine/search.rb
@@ -59,8 +59,8 @@ module Redmine
         # extract tokens from the question
         # eg. hello "bye bye" => ["hello", "bye bye"]
         @tokens = @question.scan(%r{((\s|^)"[^"]+"(\s|$)|\S+)}).collect {|m| m.first.gsub(%r{(^\s*"\s*|\s*"\s*$)}, '')}
-        # tokens must be at least 2 characters long
-        @tokens = @tokens.uniq.select {|w| w.length > 1 }
+        # tokens must be at least 2 characters long in ASCII
+        @tokens = @tokens.uniq.select {|w| w.bytesize > 1 }
         # no more than 5 tokens to search for
         @tokens.slice! 5..-1
       end
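
For illustration, the effect of the changed filter line can be sketched outside Redmine as follows. The method name `filter_tokens_by_bytesize` is hypothetical and exists only for this example; the sketch assumes UTF-8 strings, where an ASCII character occupies 1 byte and a Chinese character occupies 3 bytes.

```ruby
# Hypothetical standalone helper that mirrors the patched filter line.
# It is not part of Redmine's API.
def filter_tokens_by_bytesize(tokens)
  # Assumes UTF-8: a single ASCII character is 1 byte and is rejected,
  # while a single Chinese character is 3 bytes and is kept.
  tokens.uniq.select { |w| w.bytesize > 1 }
end

tokens = ["a", "金", "ab", "水", "a"]
p filter_tokens_by_bytesize(tokens)  # => ["金", "ab", "水"]
```

Because the check counts bytes rather than characters, any single non-ASCII character passes it, not only Chinese characters.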

Actions #2

Updated by Tomohisa Kusukawa almost 6 years ago

+1

Actions #3

Updated by Anonymous almost 6 years ago

+0⁰

Actions #4

Updated by Go MAEDA almost 6 years ago

  • Target version set to 4.1.0

I am sure that this is a necessary improvement for people who speak Japanese and Chinese because, as I wrote in the description field, some words are written with only a single character. They cannot search for such words with the current versions of Redmine.

Setting the target version to 4.1.0.

Actions #5

Updated by Go MAEDA almost 6 years ago

Updated the patch. The new patch accepts a single-character token only if the character is a Chinese character (汉字/漢字).

The previous patch accepts all multibyte characters, but that implementation has two problems. First, accepting only Chinese characters is enough. Even in CJK languages, there is no need to search for single characters other than Chinese characters, such as Japanese Kana (かな) and Korean Hangul (한글), because they are syllabic writing systems and a single character of those is not a meaningful word in most cases. Second, the previous patch also accepts single non-ASCII letters such as "á", "б", and "ß". I think that is a regression: current versions of Redmine do not accept those single characters.

To solve these issues, the new patch allows single-character tokens only for Chinese characters and keeps the current behavior for all other characters.
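
The updated patch itself is not quoted in this comment. As a rough sketch of the behavior it describes, one possible filter uses Ruby's `\p{Han}` Unicode script property. The constant `SINGLE_HAN` and the method `filter_tokens_han` are hypothetical names for this example and are not taken from the committed code.

```ruby
# Hypothetical sketch of the described behavior, not the committed patch.
# Matches a string that is exactly one character of the Han script.
SINGLE_HAN = /\A\p{Han}\z/

def filter_tokens_han(tokens)
  # Keep tokens of 2 or more characters (the existing rule), plus
  # single-character tokens when that character is a Chinese character.
  tokens.uniq.select { |w| w.length > 1 || w.match?(SINGLE_HAN) }
end

p filter_tokens_han(["a", "á", "б", "ß", "か", "한", "金", "陈", "ab"])
# => ["金", "陈", "ab"]
```

In this sketch, single Kana, Hangul, and accented or Cyrillic letters are rejected as before, so only the Chinese-character case changes.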

Actions #6

Updated by Go MAEDA almost 6 years ago

  • Subject changed from Allow single character search keywords for Chinese characters to Allow single Chinese character as a search keyword
  • Status changed from New to Closed
  • Assignee set to Go MAEDA
  • Target version changed from 4.1.0 to 4.0.0

Committed.
