Feature #5864

closed

Regex Text on Receiver Email

Added by Sam Bo almost 14 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Category: Email receiving
Target version:
Start date: 2010-07-10
Due date:
% Done: 0%
Estimated time:
Resolution: Fixed

Description

Modify the "Truncate Email after these lines" section, or add a new section, so that a regex can be entered. This would make truncating emails much easier and would let anyone cover their own requirements, such as truncating after the reply line that the email system inserts ("On Wed June 24th 2010 Johnny wrote:").

This is related to #2852

I modified the controller for the mail handler to look for both the text truncation delimiters and a hard-coded version of the reply line that my email system uses. It could easily be adapted to take a similar approach and check multiple regular expressions to handle different scenarios.

My Hard Coded Example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(On (.*)wrote:[\r\n].*)", Regexp::MULTILINE)

A proposed generic pseudo-code controller example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(#{ delimitersregex.join('|') })", Regexp::MULTILINE)
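
A minimal runnable sketch of the generic idea, assuming two separate lists: plain-text delimiters (escaped) and regex delimiters (used as-is). The helper name and both parameter names are hypothetical and are not Redmine settings. It uses a regex literal because in a double-quoted Ruby string "\s" is read as a plain space rather than the whitespace class.

```ruby
# Hypothetical helper: `literal_delimiters` and `regex_delimiters` are
# illustrative names, not existing Redmine settings.
def build_truncation_regex(literal_delimiters, regex_delimiters)
  # Escape plain-text delimiters so characters such as "*" match themselves.
  literals = literal_delimiters.map { |s| Regexp.escape(s) }
  alternatives = literals + regex_delimiters
  # With the /m flag "." also matches newlines, so the trailing ".*"
  # consumes the delimiter line and everything after it.
  /^(?:#{alternatives.join('|')}).*/m
end

body = "Thanks, fixed.\n\nOn Wed June 24th 2010 Johnny wrote:\n> old text\n"
regex = build_truncation_regex(["--- Reply above ---"], ["On .*wrote:"])
cleaned = body.sub(regex, "").strip
puts cleaned
```

Keeping the two lists apart means existing plain-text delimiters keep their current meaning while regex delimiters are opt-in.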


Files

redmine_regex.png (23.6 KB), Ben Blanco, 2016-01-27 16:09
redmine_regex_incoming_emails.png (20.7 KB), Ben Blanco, 2016-02-15 18:52
allow_regex_delimiters.patch (3.02 KB), Marius BĂLTEANU, 2016-10-19 23:50
regex_delimiter_setting.png (57.8 KB), Marius BĂLTEANU, 2016-11-23 21:49
allow_regex_delimiters_v2.patch (5.21 KB), Marius BĂLTEANU, 2016-11-23 21:50
allow_regex_delimiters_v3.patch (9.29 KB), Marius BĂLTEANU, 2016-12-11 18:23
make_text_clickable.patch (1.41 KB), Marius BĂLTEANU, 2016-12-15 22:34

Related issues

Related to Redmine - Patch #11684: Truncate incoming emails (Closed)
Has duplicate Redmine - Patch #10069: delimiter improvments (Closed)

Actions #1

Updated by Bart Stuyckens over 13 years ago

+1

Actions #2

Updated by Terence Mill over 13 years ago

+1

Actions #3

Updated by Akshat Pradhan over 13 years ago

Sam, where did you put this? I'm looking in mail_handler_controller.rb and that doesn't look like the correct place to put this. I also want to be able to regex out a certain portion of all incoming emails. I use google apps/smtp for all received emails.

Sam Bo wrote:

My Hard Coded Example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(On (.*)wrote:[\r\n].*)", Regexp::MULTILINE)

A proposed generic pseudo-code controller example:
regex = Regexp.new("(^(#{ delimiters.join('|') })\s*[\r\n].*)|(#{ delimitersregex.join('|') })", Regexp::MULTILINE)

Actions #5

Updated by Sam Bo over 12 years ago

I'm going to 1 (not the G kind mind you) this since it has been over a year. I'd love to see this so we don't have to manually update the model on each upgrade! If I knew anything about Ruby I'd submit a patch.

Actions #6

Updated by Sam Bo over 12 years ago

Ouch, the plus 1 didn't come through correctly. Sorry for the double posting here.

Actions #7

Updated by Nick Caballero almost 12 years ago

diff -r 146377aeb5a8 app/models/mail_handler.rb
--- a/app/models/mail_handler.rb    Sun May 13 19:09:35 2012 +0000
+++ b/app/models/mail_handler.rb    Fri May 25 23:00:22 2012 +0000
@@ -415,9 +415,9 @@

   # Removes the email body of text after the truncation configurations.
   def cleanup_body(body)
-    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?).map {|s| Regexp.escape(s)}
+    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?)
     unless delimiters.empty?
-      regex = Regexp.new("^[> ]*(#{ delimiters.join('|') })\s*[\r\n].*", Regexp::MULTILINE)
+      regex = Regexp.new("^(#{ delimiters.join('|') })\s*[\r\n].*", Regexp::MULTILINE)
       body = body.gsub(regex, '')
     end
     body.strip

Actions #8

Updated by Jens Schneider about 11 years ago

+1

This would be extremely helpful for filtering multilingual email replies.

Depending on the user's email program, the sender could be displayed as:

Von: [mailto:]
or
From: [mailto:]

With a regular expression, this could be filtered out with a single line in the redmine configuration.
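
As a rough illustration, assuming the client writes the sender line as "Von:" or "From:" followed by a bracketed mailto address, one pattern can cover both languages. The constant and method names below are made up for this sketch.

```ruby
# Hypothetical pattern: matches a "Von:" or "From:" line that contains a
# bracketed mailto address, as some mail clients add above quoted text.
SENDER_HEADER = /^(?:Von|From):.*\[mailto:[^\]]*\]/

# Cuts the body at the first sender line; returns the body unchanged
# (apart from stripping) when no such line exists.
def truncate_at_sender_header(body)
  match = SENDER_HEADER.match(body)
  match ? body[0...match.begin(0)].strip : body.strip
end

german = "Danke!\n\nVon: Max [mailto:max@example.com]\nGesendet: Montag\n> alt\n"
english = "Thanks!\n\nFrom: Max [mailto:max@example.com]\nSent: Monday\n> old\n"
puts truncate_at_sender_header(german)
puts truncate_at_sender_header(english)
```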

Actions #9

Updated by Chris Birchall almost 11 years ago

The use of multiline regex can make it quite tricky to write delimiters correctly. e.g. if you start your delimiter with .* then it can match the whole message, resulting in the whole message being deleted.

I went for a slightly safer fix: use a normal regex, and if you find a line matching any of the delimiters, delete that line and anything after it.

diff --git a/app/models/mail_handler.rb b/app/models/mail_handler.rb
index c84672b..82fd5fe 100644
--- a/app/models/mail_handler.rb
+++ b/app/models/mail_handler.rb
@@ -483,10 +483,16 @@ class MailHandler < ActionMailer::Base

   # Removes the email body of text after the truncation configurations.
   def cleanup_body(body)
-    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?).map {|s| Regexp.escape(s)}
+    delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?)
     unless delimiters.empty?
-      regex = Regexp.new("^[> ]*(#{ delimiters.join('|') })\s*[\r\n].*", Regexp::MULTILINE)
-      body = body.gsub(regex, '')
+      # Combine all delimiters into one regex
+      regex = Regexp.new("^(#{ delimiters.join('|') })")
+
+      # If the regex matches a line
+      regex.match(body) { |m|
+        # Delete the matched line and everything after it
+        body = body[0 ... m.begin(0)]
+      }
     end
     body.strip
   end
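
A small comparison of the two approaches, as a sketch only: the sample body and the ".*wrote:" delimiter are invented for illustration and the code is not taken from either patch.

```ruby
body = "My reply.\n\nOn Mon, Feb 1 someone wrote:\n> quoted\n"
delimiter = ".*wrote:"

# Multiline approach: "." also matches newlines, so the leading ".*" can
# start on the first line and the match swallows the whole message.
multiline = Regexp.new("^(#{delimiter})\\s*[\\r\\n].*", Regexp::MULTILINE)
after_gsub = body.gsub(multiline, "").strip

# Line-based approach: "." stops at line ends, so the match can only begin
# on the line that actually contains the delimiter.
line_based = Regexp.new("^(#{delimiter})")
match = line_based.match(body)
after_cut = (match ? body[0...match.begin(0)] : body).strip

puts after_gsub.inspect
puts after_cut.inspect
```

With the multiline regex the reply itself is lost, while the line-based cut keeps the text above the delimiter line.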
Actions #10

Updated by Anonymous over 10 years ago

Chris Birchall wrote:

The use of multiline regex can make it quite tricky to write delimiters correctly. e.g. if you start your delimiter with .* then it can match the whole message, resulting in the whole message being deleted.

I went for a slightly safer fix: use a normal regex, and if you find a line matching any of the delimiters, delete that line and anything after it.

[...]

This worked perfectly. Thank you, Chris!

Actions #11

Updated by Toshi MARUYAMA about 10 years ago

Actions #12

Updated by Antonio García-Domínguez about 10 years ago

The patch by Chris worked beautifully in Redmine 2.3.2. Thanks!

+1

Actions #13

Updated by Andrew Hills about 10 years ago

I did not want to maintain a patch for my installation of Redmine, so I've created a plugin (tested only on 2.4.3 thus far) changing the behavior of the truncation field from text snippets per line to regular expressions per line.

http://www.redmine.org/plugins/redmine_mail_handler_clean_body_regexp

Actions #14

Updated by Massimo Rossello almost 10 years ago

+1
Thank you for this life saving patch, the plugin works like a charm!

Actions #15

Updated by Kevin Palm almost 10 years ago

+1

Actions #16

Updated by Anonymous about 9 years ago

+1

Actions #17

Updated by Michael Schaefer over 8 years ago

+1 I'd so much like to see this integrated in redmine!

Actions #18

Updated by Alexander Ryabinovskiy over 8 years ago

+1, nice feature.

Actions #19

Updated by Ismael Barros² over 8 years ago

+1, please do

Actions #20

Updated by Sebastian Paluch over 8 years ago

+10!

Actions #21

Updated by Ben Blanco about 8 years ago

Thanks for this!

I nearly have email truncation working the way I'd like. With stock Redmine 3.2.0 it wasn't working at all for me.

However, with the above tweak of app/models/mail_handler.rb it finally performs some truncation.

Now, the last thing I'd like to do is get rid of the first line that email clients add to replies (as mentioned by Jens Schneider), i.e.

On Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:
> Feature #5864: Regex Text on Receiver Email
> blabla
> blabblabla

So I'm looking for how to write a regex that would find & select the first line where it finds the mention of the redmine server's email address - so as to delete it, as well as any line following it.

For now I have come up with: (redmine@foo\.com)

Which sort of works ok, when I test it with a sample text - here: http://rubular.com/r/OowzIArxPf

But it's not working when I declare it in redmine..

Is it because the regex is not good (enough)?

Or is it because I'm not writing/putting it in correctly in redmine's Admin interface? Should something be prefixed|appended to the regex for it to be taken into account?

Actions #23

Updated by Ben Blanco about 8 years ago

For anyone interested, I finally have configured my redmine 3.2.0 to cleanly truncate incoming emails.

Amend mail_handler.rb

The default mail_handler.rb was not performing any truncation for me (not sure why; never got an answer/much help to try and troubleshoot it, cf. #21746).

So, reading the thread of comments on #5864, I finally found that the following works:

def cleanup_body(body)
  delimiters = Setting.mail_handler_body_delimiters.to_s.split(/[\r\n]+/).reject(&:blank?)
  unless delimiters.empty?
    # Combine all delimiters into one regex
    regex = Regexp.new("^[> ]*(#{ delimiters.join('|') })")
    # If the regex matches a line
    regex.match(body) { |m|
      # Delete the matched line and everything after it
      body = body[0 ... m.begin(0)]
    }
  end
  body.strip
end

Administration / Settings / Incoming Emails

In redmine's incoming emails settings, I then put the following values:

.+redmine@foo\.com.+
^-{2}.\n
.+image:\scid:.+

The first one, .+redmine@foo\.com.+, makes it possible to strip the top line that email clients add to replies, such as:

On Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:
> Feature #5864: Regex Text on Receiver Email
> blabla
> blabblabla

So that's very cool.

The other two settings are:

  • ^-{2}.\n removes Gmail's appended signatures, which apparently are always preceded by a line containing -- followed by a single invisible character. Gmail's online raw email view shows that character as =20, but it isn't; it's some single character appended to the double dash, hence the "." used in the regex.
  • .+image:\scid:.+ removes the logo image that our company uses in staff email signatures.

For that signature image, I also declared the full file name in the "Exclude attachments by name" setting.

I might have to add file exclusions if/when we allow non-staff, ie. clients, to create/comment issues via email, but for now this setup works flawlessly!
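
For anyone who wants to try these three values outside Redmine, here is a standalone sketch of the same logic. It assumes the setting text is passed in as a plain string instead of being read from Setting.mail_handler_body_delimiters, and the sample emails are invented.

```ruby
# Standalone variant of the cleanup logic; `setting_text` stands in for the
# value of the delimiters field (an assumption made for this sketch).
def cleanup_body(body, setting_text)
  delimiters = setting_text.split(/[\r\n]+/).reject { |s| s.strip.empty? }
  return body.strip if delimiters.empty?

  # One regex per line of the setting, combined into a single alternation.
  regex = Regexp.new("^[> ]*(#{delimiters.join('|')})")
  match = regex.match(body)
  # Keep only the text before the first matching line.
  (match ? body[0...match.begin(0)] : body).strip
end

# Single-quoted heredoc so the backslashes reach the regex untouched.
settings = <<~'SETTINGS'
  .+redmine@foo\.com.+
  ^-{2}.\n
  .+image:\scid:.+
SETTINGS

reply = "Looks good to me.\n\nOn Mon, Feb 1, 2016 at 12:35, redmine@foo.com wrote:\n> blabla\n"
signed = "Fine by me.\n-- \nBen\n"
puts cleanup_body(reply, settings)
puts cleanup_body(signed, settings)
```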

Caveats

I've noticed that the regex truncation and/or attachment exclusion rules I specify in Redmine's Administration are not taken into account upon Save. I have to restart the Redmine application, and then they're properly taken into account.

I'm not sure if that's normal, but I'm mentioning it in case someone else reads this, as it'll save you a lot of time and pain.

If someone knows that this is a normal behaviour, then we should probably add a notice in Administration saying "Please restart redmine application for these settings to be taken into account".

If it's not normal, please let me know, and I can provide info on server setup (nginx+passenger+rbenv4rubies basically).

Final note

I tried using the redmine_mail_handler_clean_body_regexp plugin, but it didn't work for me. I'm not sure if that is because I'm running Redmine 3.2.0.

Actions #24

Updated by Marius BĂLTEANU over 7 years ago

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate correctly the emails received. I think that this feature is really needed in core.

Actions #25

Updated by Toshi MARUYAMA over 7 years ago

Marius BALTEANU wrote:

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate correctly the emails received. I think that this feature is really needed in core.

Does this patch break if existing setting has regexp special characters?
I think it is better to switch regexp on or off.

Actions #26

Updated by Marius BĂLTEANU over 7 years ago

Toshi MARUYAMA wrote:

Marius BALTEANU wrote:

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate correctly the emails received. I think that this feature is really needed in core.

Does this patch break if existing setting has regexp special characters?

From my tests, no, it doesn't break, but if you have a specific scenario in your mind, please tell me and I'll test it.

I think it is better to switch regexp on or off.

Yes, agree with you. I've updated the patch to add a new setting which enable/disable this feature.

Actions #27

Updated by Peter Petrik over 7 years ago

Excellent patch, thanks for contributing it!

Actions #28

Updated by Toshi MARUYAMA over 7 years ago

Marius BALTEANU wrote:

Toshi MARUYAMA wrote:

Marius BALTEANU wrote:

We made a patch with tests that implements this feature. Without regex delimiters, we weren't able to truncate correctly the emails received. I think that this feature is really needed in core.

Does this patch break if existing setting has regexp special characters?

From my tests, no, it doesn't break, but if you have a specific scenario in your mind, please tell me and I'll test it.

For example, "***cut below lines***".

Actions #29

Updated by Toshi MARUYAMA over 7 years ago

  • Target version set to 3.4.0
Actions #30

Updated by Marius BĂLTEANU over 7 years ago

Toshi MARUYAMA wrote:

For example, "***cut below lines***".

You're right, without the new setting to enable/disable this feature, the existing delimiters with special regex characters will behave differently.
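
To make the difference concrete, here is a hedged sketch of how an on/off switch could treat the same delimiter. The helper name and its `regex_enabled` flag are hypothetical and are not the code from the attached patch.

```ruby
# Hypothetical helper: builds the line-anchored pattern for one delimiter.
# When `regex_enabled` is false the delimiter is escaped and matched
# literally; when true it is compiled as a regex. Returns nil if the
# delimiter is not a valid regex.
def delimiter_regex(delimiter, regex_enabled)
  source = regex_enabled ? delimiter : Regexp.escape(delimiter)
  Regexp.new("^(#{source})")
rescue RegexpError
  nil
end

marker = "***cut below lines***"
literal = delimiter_regex(marker, false)       # matches the marker text itself
as_regex = delimiter_regex(marker, true)       # "*" has nothing to repeat
paren_regex = delimiter_regex("(beta)", true)  # parentheses become a group

puts literal.match(marker) ? "literal: match" : "literal: no match"
puts as_regex.nil? ? "regex: invalid" : "regex: valid"
puts paren_regex.match("(beta)") ? "paren: match" : "paren: no match"
```

So an existing delimiter can either stop compiling or silently stop matching once it is interpreted as a regex, which is why the switch matters for upgrades.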

Actions #31

Updated by Jean-Philippe Lang over 7 years ago

I think that this patch would raise an error when receiving an email if "Enable regexp delimiters" is checked and the entered delimiter is an invalid regexp.

Actions #32

Updated by Marius BĂLTEANU over 7 years ago

Jean-Philippe Lang wrote:

I think that this patch would raise an error when receiving an email if "Enable regexp delimiters" is checked and the entered delimiter is an invalid regexp.

Thanks for your feedback. I'll modify the patch to validate each regex on save when the "Enable regexp delimiters" is checked.

Actions #33

Updated by Marius BĂLTEANU over 7 years ago

I've updated the patch to validate each regex delimiter on save. The user won't be able to save new settings with invalid regex delimiters and will receive an error message for each invalid entry.
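
The general idea of validating on save can be sketched as follows. This is only an illustration with a hypothetical method name, not the validation code from the patch.

```ruby
# Hypothetical validator: returns one error message per delimiter line
# that Ruby cannot compile as a regular expression.
def invalid_regex_delimiters(setting_text)
  lines = setting_text.to_s.split(/[\r\n]+/).reject { |s| s.strip.empty? }
  lines.each_with_object([]) do |delimiter, errors|
    begin
      Regexp.new(delimiter)
    rescue RegexpError => e
      errors << "#{delimiter}: #{e.message}"
    end
  end
end

errors = invalid_regex_delimiters("^On .* wrote:\n[unclosed\n--")
errors.each { |message| puts message }
```

An empty result means every line compiled, so the setting could be saved; otherwise each message can be shown next to the field.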

Please let me know if more changes are required to have this committed.

Actions #34

Updated by Marius BĂLTEANU over 7 years ago

Thanks for implementing this feature. Attached is a small patch that makes the text "Enable regular expressions" clickable (as a label for the checkbox).

Actions #35

Updated by Jean-Philippe Lang over 7 years ago

  • Status changed from New to Closed
  • Assignee set to Jean-Philippe Lang
  • Resolution set to Fixed

I've fixed it using an existing class, thanks for pointing this out.

Actions #36

Updated by Go MAEDA over 7 years ago

Actions #37

Updated by Mischa The Evil almost 7 years ago

  • Subject changed from Regex Text on Receiver Email to Regex Text on Receiver Email