Defect #20730
closed
Fix tokenization of phrases with non-ascii chars
Added by Jens Krämer over 9 years ago.
Updated about 9 years ago.
Description
\w only matches ASCII characters, so quoted phrases containing non-ASCII characters are not recognized as phrases. We should either use [:alnum:] instead or simply match all non-" characters for the phrase. A test case is included.
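The difference between the three options can be checked directly in Ruby. This is a minimal sketch: the three pattern names are made up for illustration, and the patterns only approximate the phrase part of the search regex rather than quoting Redmine's code.

```ruby
# Illustrative patterns only; the constant names are hypothetical.
ASCII_PHRASE     = /"[\s\w]+"/          # \w is ASCII-only in Ruby
ALNUM_PHRASE     = /"[\s[:alnum:]]+"/   # POSIX bracket, Unicode-aware
NON_QUOTE_PHRASE = /"[^"]+"/            # anything up to the closing quote

query = '"日本語 テスト"'

ascii_match     = ASCII_PHRASE.match(query)      # nil: 日 is not matched by \w
alnum_match     = ALNUM_PHRASE.match(query)      # matches the whole quoted phrase
non_quote_match = NON_QUOTE_PHRASE.match(query)  # matches the whole quoted phrase
```

Note that the [:alnum:] variant still rejects phrases containing punctuation, while the non-" variant accepts any character other than the quote itself.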
- Tracker changed from Patch to Defect
- Target version set to 3.1.2
+1
In the current trunk, the search keyword '"日本語 テスト"' (a quoted Japanese phrase) matches both "日本語 テスト" and "日本語テスト", but it should not match the latter.
expected:
Redmine::Search::Fetcher.new('"日本語 テスト"', ...).tokens => ['日本語 テスト']
actual:
Redmine::Search::Fetcher.new('"日本語 テスト"', ...).tokens => ['日本語', 'テスト']
This behavior can be fixed by this patch.
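The expected and actual results above can be reproduced outside Redmine with a small sketch. The `tokenize` helper and both pattern constants are hypothetical stand-ins written for this illustration; they are assumed to resemble the tokenizer's behavior and are not copied from Redmine::Search::Fetcher.

```ruby
# Hypothetical stand-ins for the search tokenizer, for illustration only.
ASCII_ONLY_PATTERN = /((\s|^)"[\s\w]+"(\s|$)|\S+)/   # phrase limited to \w and whitespace
NON_QUOTE_PATTERN  = /((\s|^)"[^"]+"(\s|$)|\S+)/     # phrase is anything except "

# Split a query into tokens, keeping a quoted phrase as a single token
# and stripping the surrounding quotes.
def tokenize(question, pattern)
  question.scan(pattern).map do |groups|
    groups.first.gsub(/(^\s*"\s*|\s*"\s*$)/, '')
  end
end

query = '"日本語 テスト"'

before = tokenize(query, ASCII_ONLY_PATTERN)  # => ["日本語", "テスト"]
after  = tokenize(query, NON_QUOTE_PATTERN)   # => ["日本語 テスト"]
```

With the ASCII-only pattern the quoted phrase falls through to the `\S+` branch and is split on whitespace, which is why the query also matches "日本語テスト".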
- Status changed from New to Closed
- Assignee set to Jean-Philippe Lang
- Target version changed from 3.1.2 to 3.0.6
- Resolution set to Fixed