Defect #35036

closed

Markdown text sections broken by thematic breaks (horizontal rules)

Added by Martin Cizek over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Text formatting
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Resolution: Fixed
Affected version:

Description

A thematic break composed of hyphens (e.g. "---") breaks the division of Markdown text into individually editable sections.

Steps to reproduce

Set the text formatting to Markdown, create a wiki page in a web browser, and enter the following content:

# Title
## Heading 2
Preceding CRLF is the default for web-submitted data.

---

End of thematic breaks.

## Heading 2
Nulla nunc nisi, egestas in ornare vel, posuere ac libero.

More cases are covered by the unit tests in the enclosed patch.

Cause

The thematic break is mistaken for the underline of a setext heading. The current regexp in extract_sections tries to restrict setext headings so that the underline must follow a non-empty line, but it does not account for a whitespace-only line, or even a bare CRLF. Text submitted from a web browser always uses CRLF line endings, so the problem is common even in carefully formatted text.
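To illustrate the cause, here is a minimal Python sketch of the same kind of mistake. It is not the regexp Redmine uses: the patterns, the `heading_lines` function, and the `strict` flag are hypothetical names invented for this illustration. It only shows how a "previous line is non-empty" check lets a line holding a bare carriage return pass as heading text.

```python
import re

# Hypothetical, simplified patterns; NOT the regexp from Redmine's extract_sections.
ATX = re.compile(r"^#{1,6}\s+\S")
SETEXT_UNDERLINE = re.compile(r"^(=+|-+)\s*$")

def heading_lines(text, strict):
    """Return the indices of lines treated as the start of a section."""
    lines = text.split("\n")  # CRLF input leaves a trailing "\r" on each line
    starts = []
    for i, line in enumerate(lines):
        if ATX.match(line):
            starts.append(i)
        elif i > 0 and SETEXT_UNDERLINE.match(line):
            prev = lines[i - 1]
            if strict:
                # Previous line must contain a non-whitespace character.
                has_text = bool(prev.strip())
            else:
                # Naive check: "\r" alone is a non-empty string, so it passes.
                has_text = prev != ""
            if has_text:
                starts.append(i - 1)
    return starts

# The content from the steps to reproduce, with browser-style CRLF line endings.
SAMPLE = "\r\n".join([
    "# Title",
    "## Heading 2",
    "Preceding CRLF is the default for web-submitted data.",
    "",
    "---",
    "",
    "End of thematic breaks.",
    "",
    "## Heading 2",
    "Nulla nunc nisi, egestas in ornare vel, posuere ac libero.",
    "",
])

print(heading_lines(SAMPLE, strict=False))
print(heading_lines(SAMPLE, strict=True))
```

With the naive check the sketch reports four section starts, because the "blank" line before `---` is really `"\r"` and is counted as heading text. With the stricter check it reports only the three real headings.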

Fix

Attaching a patch with a fix and corresponding unit tests.

Broader context

The current approach to section extraction is inherently fragile, as shown by the other (skipped) unit test enclosed in the patch. I suggest keeping the skipped test in place to mark it as a known issue. I will create a dedicated issue for this.


Files

0001-markdown_formatter-extract_sections-fix.patch (2.76 KB): extract_sections fix for dash thematic breaks + unit tests. Martin Cizek, 2021-04-05 16:05

Related issues

Related to Redmine - Feature #35037: Make wiki text section extraction less fragile (New)
