Project

General

Profile

Defect #41464 ยป 0001-Fix-CSV-file-encoding-auto-detection-failure-with-mu.patch

Go MAEDA, 2024-10-19 09:03

View differences:

app/models/import.rb
69 69
    encoding = lu(user, :general_csv_encoding)
70 70
    if file_exists?
71 71
      begin
72
        content = File.read(filepath, 256)
72
        # Reading a specified number of bytes may corrupt the trailing
73
        # multi-byte character, so we read the first few lines instead.
74
        content = read_head_lines
73 75

  
74 76
        separator = [',', ';'].max_by {|sep| content.count(sep)}
75 77
        wrapper = ['"', "'"].max_by {|quote_char| content.count(quote_char)}
......
123 125
    filepath.present? && File.exist?(filepath)
124 126
  end
125 127

  
128
  # Reads lines from the beginning of the file, up to the specified byte limit
129
  def read_head_lines(byte_limit = 4096)
130
    return nil unless file_exists?
131

  
132
    chunk = File.read(filepath, byte_limit)
133
    last_lf_index = chunk.rindex("\n")
134
    last_lf_index ? chunk[..last_lf_index] : chunk
135
  end
136

  
126 137
  # Returns the headers as an array that
127 138
  # can be used for select options
128 139
  def columns_options(default=nil)
test/fixtures/files/mbcs-multiline-text.txt
1
An emoticon is represented by 4 bytes in UTF-8 encoding.
2

  
3
If you simply read the first 4096 bytes of this file, the trailing characters of a multi-byte sequence might be cut off, resulting in an invalid UTF-8 string.
4

  
5
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
6
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
7
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
8
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
9
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
10
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
11
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
12
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
13
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
14
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
15
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
16
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
17
๐Ÿ˜€๐Ÿ˜๐Ÿ˜‚๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…๐Ÿ˜†๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ˜‰๐Ÿ˜Š๐Ÿ˜‹๐Ÿ˜Œ๐Ÿ˜๐Ÿ˜Ž๐Ÿ˜๐Ÿ˜๐Ÿ˜‘๐Ÿ˜’๐Ÿ˜“๐Ÿ˜”๐Ÿ˜•๐Ÿ˜–๐Ÿ˜—๐Ÿ˜˜๐Ÿ˜™๐Ÿ˜š๐Ÿ˜›๐Ÿ˜œ๐Ÿ˜๐Ÿ˜ž๐Ÿ˜Ÿ๐Ÿ˜ ๐Ÿ˜ก๐Ÿ˜ข๐Ÿ˜ฃ๐Ÿ˜ค๐Ÿ˜ฅ๐Ÿ˜ฆ๐Ÿ˜ง๐Ÿ˜จ๐Ÿ˜ฉ๐Ÿ˜ช๐Ÿ˜ซ๐Ÿ˜ฌ๐Ÿ˜ญ๐Ÿ˜ฎ๐Ÿ˜ฏ๐Ÿ˜ฐ๐Ÿ˜ฑ๐Ÿ˜ฒ๐Ÿ˜ณ๐Ÿ˜ด๐Ÿ˜ต๐Ÿ˜ถ๐Ÿ˜ท๐Ÿ˜ธ๐Ÿ˜น๐Ÿ˜บ๐Ÿ˜ป๐Ÿ˜ผ๐Ÿ˜ฝ๐Ÿ˜พ๐Ÿ˜ฟ๐Ÿ™€๐Ÿ™๐Ÿ™‚๐Ÿ™ƒ๐Ÿ™„๐Ÿ™…๐Ÿ™†๐Ÿ™‡๐Ÿ™ˆ๐Ÿ™‰๐Ÿ™Š๐Ÿ™‹๐Ÿ™Œ๐Ÿ™๐Ÿ™Ž๐Ÿ™
test/unit/issue_import_test.rb
464 464
    end
465 465
  end
466 466

  
467
  def test_encoding_guessing_respects_multibyte_boundaries
468
    # Reading a specified number of bytes from the beginning of this file may
469
    # stop in the middle of a multi-byte character, which can lead to an
470
    # invalid UTF-8 string.
471
    test_file = 'mbcs-multiline-text.txt'
472
    chunk = File.read(Rails.root.join('test', 'fixtures', 'files', test_file), 4096)
473
    chunk.force_encoding('UTF-8') # => "An emoticon is ...๐Ÿ˜ƒ๐Ÿ˜„๐Ÿ˜…\xF0\x9F"
474
    assert_not chunk.valid_encoding?
475

  
476
    import = generate_import(test_file)
477
    with_settings :repositories_encodings => 'UTF-8,ISO-8859-1' do
478
      import.set_default_settings
479
      guessed_encoding = import.settings['encoding']
480
      assert_equal 'UTF-8', guessed_encoding
481
    end
482
  end
483

  
467 484
  def test_set_default_settings_should_detect_field_wrapper
468 485
    to_test = {
469 486
      'import_issues.csv' => '"',
(2-2/2)