Defect #35036
Closed
Markdown text sections broken by thematic breaks (horizontal rules)
Description
A thematic break composed of hyphens (e.g. "---") breaks the division of Markdown text into individually editable sections.
Steps to reproduce
Configure Markdown text formatting, create a Wiki page in a web browser, and enter the following content:
# Title
## Heading 2
Preceding CRLF is the default for web-submitted data.
---
End of thematic breaks.
## Heading 2
Nulla nunc nisi, egestas in ornare vel, posuere ac libero.
More in the unit tests in the enclosed patch.
Cause
The thematic break is mistaken for the underline of a setext heading. The current regexp in extract_sections does try to restrict setext headings by requiring that the underline follow a non-empty line, but it does not account for a whitespace-only line or even a bare CRLF. Because text submitted from a web browser always uses CRLF line endings, the "empty" line before the break still contains a carriage return and so counts as non-empty. The problem is therefore common even in carefully formatted text.
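The failure mode can be illustrated with a minimal Python sketch. The two patterns below are hypothetical simplifications written for this illustration; they are not the regexp used in extract_sections, nor the one in the attached patch. The sketch assumes only that the original check accepts any line with at least one character as heading text.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# Naive: the underline may follow any line that has at least one character.
# A lone "\r" (the leftover of a CRLF-terminated blank line) satisfies ".+".
NAIVE_SETEXT = re.compile(r"^(.+)\r?\n(=+|-+)[ \t]*\r?$", re.MULTILINE)

# Stricter: the preceding line must contain a non-whitespace character.
STRICT_SETEXT = re.compile(
    r"^([^\r\n]*\S[^\r\n]*)\r?\n(=+|-+)[ \t]*\r?$", re.MULTILINE
)

# The reproduction text as a browser would submit it (CRLF line endings).
SAMPLE = (
    "# Title\r\n"
    "\r\n"
    "## Heading 2\r\n"
    "\r\n"
    "Preceding CRLF is the default for web-submitted data.\r\n"
    "\r\n"
    "---\r\n"
    "\r\n"
    "End of thematic breaks.\r\n"
)


def find_setext_headings(pattern, text):
    """Return (heading_text, underline) pairs the pattern treats as setext headings."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]


# The naive pattern reports a heading whose "text" is just a carriage return.
print(find_setext_headings(NAIVE_SETEXT, SAMPLE))   # [('\r', '---')]
# The stricter pattern leaves the thematic break alone.
print(find_setext_headings(STRICT_SETEXT, SAMPLE))  # []
# A genuine setext heading is still recognised.
print(find_setext_headings(STRICT_SETEXT, "Title\r\n---\r\n"))  # [('Title', '---')]
```

Requiring a non-whitespace character in the preceding line covers both the whitespace-only case and the bare CRLF case, since a carriage return is itself whitespace.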
Fix
Attaching a patch with a fix and corresponding unit tests.
Broader context
The current approach to section extraction is inherently fragile, as shown by the other (skipped) unit test enclosed in the patch. I suggest keeping the skipped test in place to mark it as a known issue. I will create a dedicated issue for this.