Feature request: reader mode, only parse primary content

Hi there! Thanks for creating this.

I want to build a reader-view type of app that presents only the main content and ignores everything else. For example, Mozilla's Readability.js library turns complex HTML pages into very simple text-only pages. Maybe html2md could be used for something similar?

Maybe add an option to find the main content node and only parse from there. That could be the article or main HTML element, or a container whose id or class is content. Or maybe there's something smarter, like taking the first h1 element and then using its parent container?
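To make the idea concrete, here is a rough sketch of the lookup order I have in mind. It is written in Python with only the standard library and is not tied to html2md's actual API. Every name in it (`Node`, `TreeBuilder`, `find_main_content`, `extract_main_text`) is hypothetical, and the fallback order (article, then main, then an id or class of content, then the parent of the first h1) is just my assumption about what a sensible default might look like.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """Minimal DOM-like node (hypothetical helper, for illustration only)."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # mix of Node objects and plain strings

    def iter(self):
        """Yield this node and all descendant nodes in document order."""
        yield self
        for child in self.children:
            if isinstance(child, Node):
                yield from child.iter()

    def text(self):
        """Concatenate all text found below this node."""
        return "".join(c if isinstance(c, str) else c.text() for c in self.children)


class TreeBuilder(HTMLParser):
    """Builds a small tree with parent pointers from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: add the node, but do not descend.
        self.current.children.append(Node(tag, attrs, self.current))

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def find_main_content(root):
    """Return the best guess for the main content node, or None."""
    nodes = list(root.iter())
    # 1. Semantic containers, in order of preference.
    for tag in ("article", "main"):
        for node in nodes:
            if node.tag == tag:
                return node
    # 2. A container whose id or class is "content".
    for node in nodes:
        classes = (node.attrs.get("class") or "").split()
        if node.attrs.get("id") == "content" or "content" in classes:
            return node
    # 3. The parent container of the first h1.
    for node in nodes:
        if node.tag == "h1" and node.parent is not None:
            return node.parent
    return None


def extract_main_text(html):
    """Return whitespace-normalized text of the main content, or ''."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    node = find_main_content(builder.root)
    return " ".join(node.text().split()) if node else ""
```

In a real implementation the selected node would be handed to the converter instead of being flattened to text. I left that step out because I don't want to guess at how html2md walks the tree internally.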

Maybe this is entirely out of scope, or maybe you're willing to accept a PR that does something like this. Let me know!