Defect #35036

closed

Markdown text sections broken by thematic breaks (horizontal rules)

Added by Martin Cizek almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Text formatting
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Resolution: Fixed
Affected version:

Description

A thematic break composed of hyphens (e.g. "---") breaks the division of Markdown text into individually editable sections.

Steps to reproduce

Configure Markdown text formatting, create a Wiki page in a web browser, and enter the following content:

# Title
## Heading 2
Preceding CRLF is the default for web-submitted data.

---

End of thematic breaks.

## Heading 2
Nulla nunc nisi, egestas in ornare vel, posuere ac libero.

More in the unit tests in the enclosed patch.

Cause

The thematic break is mistaken for the underline of a setext heading. The current regexp in extract_sections does try to restrict setext headings so that the underline must follow a non-empty line, but it does not account for a whitespace-only line, or even a line that holds nothing but a plain CRLF. Since text submitted from a web browser always contains CRLF, the problem is quite common, even in carefully formatted text.
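
To illustrate the failure mode, here is a minimal Ruby sketch. The two patterns and the helper setext_heading? are hypothetical simplifications written for this example; they are not the regexps used in Redmine's extract_sections or in the attached patch. The point is only that in Ruby "." matches "\r", so a "non-empty line" check based on ".+" accepts a line that holds nothing but the CR of a CRLF line ending.

```ruby
# Hypothetical, simplified patterns for illustration only;
# these are NOT the regexps used in Redmine's extract_sections.

# Naive: any line with at least one character, followed by a line of = or -.
# In Ruby, "." matches "\r", so a line holding only a CR counts as "non-empty".
NAIVE_SETEXT = /^.+\r?\n(?:=+|-+)[ \t]*\r?$/

# Stricter: the line before the underline must contain a non-whitespace character.
STRICT_SETEXT = /^[ \t]*\S.*\r?\n(?:=+|-+)[ \t]*\r?$/

# Returns true if the text contains something the pattern treats as a setext heading.
def setext_heading?(text, pattern)
  text.match?(pattern)
end

crlf_break  = "Some paragraph.\r\n\r\n---\r\n\r\nMore text.\r\n"  # browser-style CRLF
blank_break = "Some paragraph.\n   \n---\n\nMore text.\n"          # whitespace-only line
real_setext = "Heading 2\r\n---------\r\n\r\nBody.\r\n"            # genuine setext heading

puts setext_heading?(crlf_break,  NAIVE_SETEXT)   # => true  (thematic break misread)
puts setext_heading?(blank_break, NAIVE_SETEXT)   # => true  (thematic break misread)
puts setext_heading?(crlf_break,  STRICT_SETEXT)  # => false
puts setext_heading?(blank_break, STRICT_SETEXT)  # => false
puts setext_heading?(real_setext, STRICT_SETEXT)  # => true  (real heading still found)
```

Requiring a non-whitespace character on the preceding line is one way to tell the two cases apart; the actual fix in the patch may be expressed differently.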

Fix

Attaching a patch with a fix and corresponding unit tests.

Broader context

The current approach to section extraction is inherently fragile, as shown by the other (skipped) unit test enclosed in the patch. I'd suggest keeping the skipped test there to mark it as a known issue. I will create a dedicated issue for this.


Files

0001-markdown_formatter-extract_sections-fix.patch (2.76 KB): extract_sections fix for dash thematic breaks + unit tests. Martin Cizek, 2021-04-05 16:05

Related issues

Related to Redmine - Feature #35037: Make wiki text section extraction less fragile (New)

Actions #2

Updated by Martin Cizek almost 3 years ago

Just to make it clear: the skipped test addresses an already existing error, which is probably not worth fixing with the current approach in the extract_sections implementations. See #35037 for more details.

But the specific, common error in the handling of thematic breaks is solved by the patch as is; it can simply be applied. :)

Actions #3

Updated by Go MAEDA almost 3 years ago

  • Status changed from New to Confirmed
  • Target version set to 4.1.4

Setting the target version to 4.1.4.

Actions #4

Updated by Go MAEDA almost 3 years ago

  • Related to Feature #35037: Make wiki text section extraction less fragile added

Actions #5

Updated by Go MAEDA almost 3 years ago

  • Status changed from Confirmed to Resolved
  • Assignee set to Go MAEDA
  • Resolution set to Fixed

Committed the fix. Thank you for your contribution.

Actions #6

Updated by Go MAEDA almost 3 years ago

  • Status changed from Resolved to Closed
  • Target version changed from 4.1.4 to 4.2.2