HTML Whitespace Is Broken

12 points, posted 12 hours ago
by tobr

3 Comments

bubblesnort

7 hours ago

Author clearly didn't research this thoroughly.

HTML was originally based on SGML, which no one ever implemented correctly. The standard then hopped onto the XML bandwagon and more parsing hell ensued.

Remember when in old browsers you had to eliminate all whitespace between elements just so you could style a list correctly? Or so you could traverse the DOM in JavaScript without hitting nasty whitespace-only text nodes? The HTML for those browsers would be difficult to write AND difficult to read.
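To make the text-node complaint concrete, here is a minimal sketch using Python's standard `html.parser`. It is only a tokenizer, not a browser DOM, so treat it as an analogy: each whitespace-only text run it reports is roughly where a browser would place a whitespace text node. The names `TextRunCollector`, `whitespace_only_runs`, `PRETTY` and `COMPACT` are made up for this illustration.

```python
from html.parser import HTMLParser


class TextRunCollector(HTMLParser):
    """Records every text run the tokenizer reports, whitespace included."""

    def __init__(self):
        super().__init__()
        self.text_runs = []

    def handle_data(self, data):
        self.text_runs.append(data)


def whitespace_only_runs(markup):
    """Return the text runs that consist of nothing but whitespace."""
    collector = TextRunCollector()
    collector.feed(markup)
    collector.close()
    return [run for run in collector.text_runs if run.strip() == ""]


# Readable source: indentation and newlines sit between the elements.
PRETTY = "<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>"
# The old workaround: no whitespace between elements at all.
COMPACT = "<ul><li>one</li><li>two</li></ul>"

print(len(whitespace_only_runs(PRETTY)))   # 3
print(len(whitespace_only_runs(COMPACT)))  # 0
```

The readable version carries three extra whitespace runs that a sibling-walking script has to skip; the compact version has none, at the cost of being unpleasant to edit.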

Author goes on to ask what the rendering of an HTML fragment looks like and then pulls a CSS rabbit out of the hat. The real answer is "depends on the user agent".

The CSS rules for whitespace handling were added to make predictable whitespace handling easier to achieve across the entire document. Everything can be a footgun in the wrong hands.

HTML is absolutely not in the same realm as programming languages. Quoting strings brings a whole slew of new problems with it: not least another round of parsing hell in the browsers, but also trouble everywhere HTML is concatenated to create a page. Since HTML had to be forgiving, closing tags can be implied. That alone accounts for some of the differences in whitespace handling between the beginning and end of a given text. Browsers tend to do mostly the right thing in those cases.
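A rough sketch of how implied closing tags make leading and trailing whitespace asymmetric, again on top of Python's `html.parser`. The `CLOSES` table, the `VOID` set, `ExplicitCloser` and `make_explicit` are hypothetical names, and the rules are a heavily simplified assumption; the real HTML parsing algorithm has many more cases, and this sketch drops attributes entirely.

```python
from html.parser import HTMLParser

# Assumed, simplified rule: opening the key tag closes an open element
# whose name is in the value set. Real browsers follow far richer rules.
CLOSES = {
    "li": {"li"},
    "p": {"p"},
    "dt": {"dt", "dd"},
    "dd": {"dt", "dd"},
}
# Elements that never take a closing tag (partial list for the sketch).
VOID = {"br", "wbr", "hr", "img", "input", "meta", "link"}


class ExplicitCloser(HTMLParser):
    """Re-emits markup with the implied end tags written out."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.out = []

    def handle_starttag(self, tag, attrs):
        # A new <li> (etc.) implicitly ends the one that is still open.
        if self.open_tags and self.open_tags[-1] in CLOSES.get(tag, ()):
            self.out.append(f"</{self.open_tags.pop()}>")
        self.out.append(f"<{tag}>")  # attributes are dropped in this sketch
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: ignore it
        # Close everything up to and including the named element.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.out.append(f"</{closed}>")
            if closed == tag:
                break

    def handle_data(self, data):
        self.out.append(data)


def make_explicit(markup):
    parser = ExplicitCloser()
    parser.feed(markup)
    parser.close()
    # Anything still open at end of input is closed implicitly.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


print(repr(make_explicit("<li>one\n<li>two")))
# '<li>one\n</li><li>two</li>'
```

The newline after "one" ends up inside the first item because nothing closed it before the next `<li>` arrived, while neither item has any leading whitespace. That is the begin/end asymmetry in miniature.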

And if the author tried to be exhaustive about whitespace behaviour in HTML, I wonder why there's no mention of the WBR element or its corresponding Unicode character either.
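For anyone who hasn't met it: the character meant here is presumably U+200B ZERO WIDTH SPACE, which marks an invisible line-break opportunity much like `<wbr>` does. A small sketch, assuming that reading; `add_break_opportunities` is a hypothetical helper, not something from the article.

```python
import unicodedata

ZWSP = "\u200b"  # assumed to be the character the comment refers to


def add_break_opportunities(text, after="/"):
    """Insert a zero-width space after each occurrence of `after`,
    so a long unbroken string (a URL, say) may wrap at those points."""
    return text.replace(after, after + ZWSP)


print(unicodedata.name(ZWSP))                     # ZERO WIDTH SPACE
print(repr(add_break_opportunities("a/b/c")))     # 'a/\u200bb/\u200bc'
```

One practical difference worth keeping in mind: the character becomes part of the text itself, so it travels along when someone copies the string, whereas `<wbr>` is markup and does not.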

lofaszvanitt

7 hours ago

Who/what decides which articles get displayed on the main page? I mean, sometimes there are zero articles for days, and sometimes there are myriads of totally uninteresting takes all around.

BoingBoomTschak

8 hours ago

YES! I recently did a deep dive into modern HTML/CSS to build my SSG, and the two worst surprises were this (more specifically via https://github.com/ruricolist/spinneret/issues/37) and CSS's bad defaults: the ones around inline/block elements (and the needed peppering of auto margins) and box-sizing not defaulting to border-box.