Here's an easy, if not always precise, way to remember: * Hyphens connect things...

lxgr · 2025-03-28T01:12:59 1743124379

> EM dashes break things, such as sentences or thoughts

Some style guides recommend "space, en dash, space" for this, and I prefer that myself – mainly because some software doesn't treat em dashes correctly as word separators for double click selection purposes.

For example, I'm pretty sure that at least some Kindle models would highlight both the word before and after the em dash when selecting one of them, which makes using the dictionary very annoying.

krick · 2025-03-28T11:10:09 1743160209

It's actually only your post that made me realize people don't normally put spaces around em dash. In French, Russian and a bunch of other languages proper typesetting is to use em dash as a standard dash character, and you always put spaces around them. So I did it in English as well, for many years now.

(I also now looked up and found out that in Spanish, apparently, you are supposed to put space only on one side of the dash, when used as a direct speech separator.)

rmunn · 2025-03-28T13:16:21 1743167781

I also put spaces around em dashes. It looks wrong—subtly wrong—to me to have the words glued together around the dash. It looks right — completely right — to me to have the dash standing on its own, as if it was a word in its own right.

tines · 2025-03-28T14:02:09 1743170529

The reason not to do this is observable in your post on my phone. The spaces cause the word wrapping algorithm to leave a dangling dash at the end of the line which looks ugly. Omitting spaces prevents the word break.

rmunn · 2025-03-28T14:29:32 1743172172

I mentioned that as an advantage in one of my other comments. An advantage both ways, because it depends on preference. I have the same preference as hansvm: I would rather see the dangling dash at the end of the line, so I prefer putting spaces around the dashes. Having the entire word-dash-word structure move to the next line feels ugly to me. As with most things, de gustibus non est disputandum. (And also, quidquid Latine dictum sit altum videtur).

chipotle_coyote · 2025-03-28T15:59:29 1743177569

It's the dangling dash at the beginning of the line that gets me. I see a lot of word break algorithms, including the one WebKit (and I suspect Blink) uses, which are happy to break "foo—bar" on either side of the em dash.

hansvm · 2025-03-28T14:14:27 1743171267

Funny, I'd rather have the break at the start or end of the emdash-implied break than just before or after it, not having to mentally handle some single dangling word divorced from its compatriots.

mmooss · 2025-03-28T18:24:42 1743186282

> The reason not to do this is observable in your post on my phone. The spaces cause the word wrapping algorithm to leave a dangling dash at the end of the line which looks ugly. Omitting spaces prevents the word break.

That's an interesting practicality but I don't think it's the cause of the rule: The rule probably long predates automated line breaking. Also, I think automatic line breaking will break compound words at the hyphen; it doesn't require spaces (which is also obvious from a software development point of view: the logic is relatively simple either way):

  Lorem ipsum dolor sit amet, consectetur adipiscing double-
  decker lorem ipsum dolor sit amet, consectetur ...
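As one concrete data point, here is a small sketch using Python's standard textwrap module. It is only one line-breaking implementation among many and says nothing about what browsers or e-readers do, but it shows a hyphenated compound being split right after the hyphen with no spaces involved. The sample text and width are made up for illustration.

```python
import textwrap

# textwrap's default break_on_hyphens=True allows a line to end right
# after the hyphen of a compound word; no surrounding spaces are needed.
lines = textwrap.wrap("a double-decker bus", width=10)
print(lines)  # ['a double-', 'decker bus']
```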

da_chicken · 2025-03-28T14:30:32 1743172232

Ironically, on my phone the only line that ends with an em dash has no spaces in it.

If you want to not have a line break, you shouldn't rely on arbitrary behavior. You should use non-breaking characters like non-breaking spaces and word joiners.
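A minimal sketch of what that could look like, assuming the renderer honors U+2060 WORD JOINER and U+00A0 NO-BREAK SPACE (not every layout engine does). The helper names glue_closed_dash and glue_spaced_dash are hypothetical, chosen only for this example.

```python
import re

WORD_JOINER = "\u2060"  # zero-width character that forbids a break at its position
NBSP = "\u00a0"         # no-break space
EM_DASH = "\u2014"

# An em dash with non-space characters directly on both sides ("closed" style).
_CLOSED_DASH = re.compile(f"(?<=\\S){EM_DASH}(?=\\S)")


def glue_closed_dash(text: str) -> str:
    """Wrap each closed em dash in word joiners so word-dash-word stays together."""
    return _CLOSED_DASH.sub(WORD_JOINER + EM_DASH + WORD_JOINER, text)


def glue_spaced_dash(text: str) -> str:
    """Replace the space before a spaced em dash with a no-break space,
    so the dash can end a line but never start one."""
    return text.replace(" " + EM_DASH + " ", NBSP + EM_DASH + " ")
```

Which of the two you want depends on the preference argued about above: the first keeps the whole unit on one line, the second only prevents a dash at the start of a line.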

lxgr · 2025-03-28T20:16:46 1743193006

Preventing the word break doesn't seem very desirable, especially if it causes a large gap.

lashloch · 2025-03-28T13:43:51 1743169431

Funny—I'm the exact opposite. The extra spaces distract my eyes. To each their own! :)

rmunn · 2025-03-28T13:49:27 1743169767

To each their own: fully agreed, even though our tastes differ. I will mention one advantage of the spaces-around-dashes method: word wrap with default settings will break on the spaces around the dashes so that the entire word one, dash, word two combo doesn't end up pulled onto the next line as a whole unit. Whereas the advantage of the no-spaces method that you prefer is that word wrap will pull the entire word one, dash, word two combo onto the next line as a whole unit.

Why yes, I did list the opposite behavior as an advantage of each. Because that, too, is up to individual preference. :-)

lxgr · 2025-03-28T15:45:13 1743176713

That depends on the layout engine, I believe. Just tried it in Firefox (on macOS; not sure if it uses Core Text or something custom there), and it does sometimes break around the em dash in "foo—bar" style, not just "foo – bar" style.

I've definitely noticed the behavior you describe on some layout engines, too, and it's another reason why I personally prefer "foo – bar" style.

mmooss · 2025-03-28T20:00:29 1743192029

It's not your own. You write mostly for others to read.

rmunn · 2025-03-28T13:51:02 1743169862

P.S. I also prefer smileys with noses, :-), as opposed to the noseless smileys, :), that most people these days seem to prefer. :-)

hilbert42 · 2025-03-29T06:24:24 1743229464

I've wondered about this for similar reasons. I usually omit the spaces but as I said in an earlier post I'll sometimes include them when I think the typography calls for it or when I want to add extra emphasis.

I've come to the conclusion it boils down to which style manual one follows. I've taken a careful look at numbers of high-end books which no doubt have been carefully typeset and I've found EM dashes with and without spaces.

It seems there is no definitive rule but I might be wrong.

laptopdev · 2025-03-28T19:33:04 1743190384

Grammar nazi, but isn't it "It looks right — completely right, to me — to have the dash standing on its own"...

snozolli · 2025-03-28T14:40:17 1743172817

> people don't normally put spaces around em dash

For what it's worth, I was in the last class in my high school to learn typing on IBM Selectric typewriters. We were taught to type two spaces, two hyphens, then two spaces. Incidentally, we were taught two spaces after periods and colons. To this day, I find it hard to read text that doesn't have proper spacing after periods. (HTML and WYSIWYG word processors handle formatting, but e.g. fixed-font text editors don't)

dragonwriter · 2025-03-28T18:22:38 1743186158

It's funny that people think that conventions for typewritten text built around the limitations of typewriters define what is “proper” in environments where typewriters and their limitations are not involved.

ovalanche · 2025-03-28T18:42:45 1743187365

Yes, this always grinds my gears too. There is already a slightly larger space after periods in contemporary typefaces.

The old typewriter typefaces were monospaced, i.e., every character was the same width, but this is no longer the case. Virtually all typefaces today are proportionally spaced, not monospaced. So it’s redundant to leave extra room after periods.

snozolli · 2025-03-28T23:26:34 1743204394

What does this have to do with what I wrote? I said nothing of the sort. In fact, I explicitly pointed out that HTML and WYSIWYG word processors address it automatically.

kevin_thibedeau · 2025-03-28T18:37:38 1743187058

I was taught that and abandoned it as a pointless anachronism. How often are you reading long form text in a monospace font?

snozolli · 2025-03-28T23:27:05 1743204425

Often enough, thanks.

mmooss · 2025-03-28T18:19:07 1743185947

What is a "standard dash character"? There is no such thing in English; only hyphen, EN dash, EM dash (and some odds and ends).

rahimnathwani · 2025-03-28T01:58:33 1743127113

I grew up in the UK, and have always used space, minus, space.

The first keyboard I used was my dad's typewriter, and I don't recall it having any 'dash' other than the minus sign.

Propelloni · 2025-03-28T09:40:35 1743154835

I was under the impression that you do "-" for hyphen, "--" for En dash, and "---" for Em dash. IIRC, LaTeX (or maybe the editor, it has been some time) even helpfully changes that for you to the correct dash.

JadeNB · 2025-03-28T14:15:37 1743171337

> I was under the impression that you do "-" for hyphen, "--" for En dash, and "---" for Em dash. IIRC, LaTeX (or maybe the editor, it has been some time) even helpfully changes that for you to the correct dash.

The conversion of '--' to an en dash and '---' to an em dash is done by the TeX compiler, and appears in the rendered file, but I think that most TeX editors don't change the TeX code itself. (This is distinct from XeTeX-based compilers, which can handle non-ASCII Unicode characters like the em dash '—' directly in the source.)

(I think that the article's point is that, in some fonts, -- (two hyphens) is literally the (approximate) size of an em dash, not that it is always understood as meaning an em dash. At least in my font, --- (three hyphens) is far too long to literally look like an em dash:

---

--

—

–

(in order, three hyphens, two hyphens, em dash, en dash).)
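For anyone who wants the same substitution outside of TeX, here is a plain-text imitation. It is only a sketch: the function name tex_dashes is made up, and real TeX produces these dashes through font ligatures rather than string replacement, so this mirrors the visible result and not the mechanism.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def tex_dashes(source: str) -> str:
    """Imitate TeX's dash conventions on plain text: '---' and '--'."""
    # Order matters: replacing '--' first would consume part of every '---'.
    return source.replace("---", EM_DASH).replace("--", EN_DASH)


print(tex_dashes("pages 10--20"))
print(tex_dashes("wait---what"))
print(tex_dashes("well-known"))  # single hyphens are left alone
```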

rahimnathwani · 2025-03-28T14:01:13 1743170473

Google Docs also does these replacements.

Finnucane · 2025-03-28T12:53:10 1743166390

British typesetting style is a little different from US style in the way dashes are presented. In the UK, you might see a thin-space--en-dash---thin-space where a US typesetter would use an em-dash. Typewriter style generally follows book style. Since typesetters no longer use an extra space after punctuation, it's vestigial in typing.

robin_reala · 2025-03-28T09:26:22 1743153982

en-US style is a single em-dash. en-GB style is a single en-dash with spaces on either side.

KPGv2 · 2025-03-28T04:16:07 1743135367

space, minus, space is on the same level as manually typing two spaces after a period

lxgr · 2025-03-28T04:29:22 1743136162

How so? One is the only way to approximate an en or em dash on a typewriter or in a charset that doesn’t have one, the other seems like a workaround of a typesetting bug at best.

Propelloni · 2025-03-28T09:43:28 1743155008

-, --, --- is, IIRC, how it is done in LaTeX and would be exceedingly simple to do on a typewriter. That being said, to break up sentences I use " -- " because I think it looks nicer than "---". I'll go now ;)

lxgr · 2025-03-28T12:14:25 1743164065

LaTeX is a markup language though, not ASCII art. I can get behind two dashes as a substitute if no en dash is available, but three seems too much and looks like halfway to a horizontal line to me ;)

rahimnathwani · 2025-03-28T04:35:00 1743136500

Until ~10 years ago, I used to type two spaces after a period.

Daneel_ · 2025-03-28T05:24:07 1743139447

I still do, and I maintain that it’s easier to read text with double spaces after periods.

_emacsomancer_ · 2025-03-28T08:08:01 1743149281

TeX puts more space after periods/fullstops (which is why you're supposed to do special markup or other measures to mark '.' in the middle of sentences which aren't sentence-enders (e.g. like e.g.)). But it's generally smaller than the equivalent of two manual spaces.

(A nice thing in (La)TeX is that one could follow the "two spaces after a full-stop" rule, which then has the advantage of being an explicit marking for sentence boundaries (which your editor might be able to navigate; Emacs has a convention of assuming two spaces after a sentence-ending '.'), but then the TeX typesetting will take care of making it look right. I lost the habit of actually doing this, for better or worse, except when flycheck/checkdoc/package-linter.el makes me do it for docstrings.)

globnomulous · 2025-03-28T12:10:13 1743163813

I used to feel similarly. Now I find the double space a visual distraction that doesn't in any way improve readability.

The effect of the double space is, I suspect, a product of the reader's expectations: if you expect it, its absence creates mental work, detracting from readability; if you don't expect it, its presence is what creates mental work.

asveikau · 2025-03-28T16:55:17 1743180917

I'm still doing it when I am typing at a physical keyboard. Hard habit to break. I learned it so long ago too.

You can tell when I've edited something on both a phone and a physical keyboard, based on the inconsistent use of spaces.

rahimnathwani · 2025-03-28T17:00:09 1743181209

> Hard habit to break. I learned it so long ago too.

Haha I learned to type organically, and it was only in my mid-40s that I retrained myself to type the correct way. It took something like 40 hours of practice on keybr.com before I could get close enough to my regular typing speed, such that I could switch over to the 'correct' method without it impacting my work.

Retraining myself to stop doing double-spaces took maybe a week.

kevin_thibedeau · 2025-03-28T18:40:17 1743187217

Most word processors can be configured to flag double spaces. That gives feedback to break the habit.

opello · 2025-03-28T05:12:54 1743138774

> Some style guides recommend "space, en dash, space" for this

The last paragraph of the article also addressed the subjective nature of spacing around the em dash:

> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

As far as the selection detail, did you mean that you replace an em dash used like a comma or parenthesis with spaces and an en dash for specific highlight performance issues? Surely the spaces and an em dash would alleviate the selection highlight behavior and not muddy the waters of when to use an em vs. an en dash?

JadeNB · 2025-03-28T14:12:19 1743171139

> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

It's funny that they omit to mention the possibility of setting it off with a thin space ' ' or hair space ' ' (those are the thin-space and hair-space Unicode characters, though they show up full width for me), which I thought was preferred typographic practice.

(On Googling, maybe the reason that they don't mention it is that I was imagining it; I can't find any evidence for my belief.)

opello · 2025-03-28T15:43:29 1743176609

> those are the thin-space and hair-space Unicode characters, though they show up full width for me

Interestingly, both my browser and curl (fetching the direct link to the comment) show the bytes as 0x20 for both. Perhaps the comment submission handler, or even the browser, collapsed your more specific U+2009 (thin) and U+200A (hair) spaces into the regular U+0020 space?
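A quick way to check which space characters actually survived is to print the code point and Unicode name of each character. This is a small sketch with Python's standard unicodedata module; the describe helper is a made-up name and the sample string is constructed here, not taken from the comment in question.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# Thin space, hair space, then an ordinary space.
for line in describe("\u2009\u200a "):
    print(line)
```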

JadeNB · 2025-03-28T16:56:51 1743181011

> Interestingly, at least in my browser and grabbing the direct link to the comment with curl, show the bytes as 0x20 for both. Perhaps the comment submission handler, or even the browser, collated your more specific U+2009 (thin) and U+200A (hair) spaces into the regular U+0020 space?

Probably! I think HN strips out emoji; maybe it just takes the safest approach and strips out all non-white-listed Unicode.

mmooss · 2025-03-28T04:18:28 1743135508

The AP Style Manual, a/the leading source for US journalism at least, says

  <word> <space> <dash> <space> <word>

Outside of journalism, usually there is no padding, only,

  <word> <dash> <word>

I'm with you: For searches, the spaces make the words easier to parse. Those rules predate computers, I would guess.

lxgr · 2025-03-28T04:25:42 1743135942

> <word> <dash> <word>

That one I’d usually parse as a hyphen, as in e.g. well-known. “Word space dash space word” is much clearer, in my view.

> The AP Style Manual, a/the leading source for US journalism

One of the things I can easily get away with by not being a US journalist :)

stouset · 2025-03-28T06:37:04 1743143824

It’s quite hard to mistake an em dash for a hyphen in a proportional font.

self-fulfilling

self—fulfilling

One of these looks very, very wrong.

johnisgood · 2025-03-28T06:42:44 1743144164

I agree, although I still prefer spaces between —.

mattl · 2025-03-28T04:32:17 1743136337

Chicago Manual of Style has no spaces, so there’s some variation at least.

mmooss · 2025-03-28T05:07:53 1743138473

CMOS is not journalism, so it's not variation from the GP?

mattl · 2025-03-28T05:45:05 1743140705

A wide range of people use either of them. Every place I’ve worked used CMOS, which I now use with others.

ghaff · 2025-03-28T06:56:11 1743144971

Company I used to work for used AP for things like press releases and, I think, official blog posts and Chicago plus a couple different tech style guides for everything else.

Basically, we didn’t like some things in AP but we wanted to make it easy for journalists to copy/paste.

cyrillite · 2025-03-28T11:11:23 1743160283

I have been doing this for purely aesthetic reasons my whole life. Style guides be damned, I hate connected em dashes.

lxgr · 2025-03-28T12:19:27 1743164367

The good thing about style guides is that they’re guides, not laws :)

That’s one thing I really like about English: There’s no central authority decreeing what’s right and what’s wrong top down, and it feels like there is some room for individual preferences and experimentation.

Very refreshing, compared to e.g. German, which has more than one semi-official authority gate keeping “correctness” in speech and writing.

mmooss · 2025-03-28T19:08:58 1743188938

In fairness, especially in the Anglo-Saxon dominated world post-WWII, English was under no threat to be swamped by German or French words.

KPGv2 · 2025-03-28T04:15:30 1743135330

> Some style guides recommend "space, en dash, space" for this

Which one does that? I threw up a little in my mouth and wish to avoid such style guides in the future!

lxgr · 2025-03-28T04:26:51 1743136011

Better avoid British journalism then, and many other languages on top of that.

It’s very common outside of America, even in English.

mmooss · 2025-03-28T04:24:01 1743135841

https://news.ycombinator.com/item?id=43501482

BoumTAC · 2025-03-28T09:15:10 1743153310

I'm not a native English speaker, but don't you use the ";" in English?

To me, it feels like it is the same purpose as the EM dashes.

And I discovered the EM with ChatGPT, I've never seen it before.

layer8 · 2025-03-28T14:10:06 1743171006

A semicolon connects, whereas an em-dash creates more of a pause and therefore separates. In addition, em-dashes can be used in pairs to create a parenthesis, which semicolons can’t. I think with time you will appreciate the difference.

https://thenarrativearc.org/blog/2020/2/4/epic-grammar-battl...

OJFord · 2025-03-28T09:30:45 1743154245

Dashes surround a sub-clause - something like this - which is like a parenthetical addition to a sentence that could stand alone without it; semi-colons (';') connect a further sentence or part of one where perhaps a full-stop and additional word could have been. They also sometimes separate list items following a colon, especially if the things listed are longer sentences perhaps themselves containing commas that'd otherwise be ambiguous.

grey413 · 2025-03-28T09:57:22 1743155842

Em dashes are very similar to semicolons. You use em dashes if your related sentence is in the middle of another sentence, and semicolons if it's at the end.

They're frequently used in skilled and professional grade writing.

mmooss · 2025-03-28T18:51:31 1743187891

So as not to mislead anyone, the parent is mostly incorrect:

Here's an example sentence: Semicolons must have independent clauses—phrases that could form a full sentence on their own—on both sides of them; they are essentially alternatives for periods. Em dashes don't require independent clauses on either side.

In the italicized sentence,

* phrases that could form a full sentence on their own is not an independent clause but is valid between em dashes. on both sides of them, after the em dashes, is also not an independent clause. (The em dashes function like commas or parentheses here.)

* The parts before and after the semicolon are independent clauses. You could replace the semicolon with a period and you'd have perfectly valid grammar. I just chose to connect the two sentences a bit more.

I don't know if you can use em dashes as the parent comment describes, connecting three independent clauses:

* My favorite fruit is peaches—they are very sweet—I eat them all summer.

I think the above is wrong; it should be one of the following:

* My favorite fruit is peaches—they are very sweet—and I eat them all summer.: The last section is a dependent clause made by "and", not an independent clause.

* My favorite fruit is peaches—they are very sweet; I eat them all summer.: On both sides of the semicolon are independent clauses; I could replace the semicolon with a period.

Maybe there are examples I'm not thinking of? I infer that the rule might be that the punctuation following the em-dashed clauses should be the punctuation that would have been used without the em-dashed clause, but that's based on very limited evidence.

mmooss · 2025-03-28T18:58:50 1743188330

Many people don't use semicolons (;) in English but many do, and they are certainly part of correct grammar.

Semicolons are generally alternatives to periods, when you want more connection between the two sentences. Like periods, semicolons must have two full sentences—that is, what could be full sentences—on either side of them; the potential 'full sentences' are properly called independent clauses. (A dependent clause needs the rest of the sentence to form valid grammar; it can't function on its own. For example, in this paragraph's first sentence, when you want more connection between the two sentences is a dependent clause. Often they follow commas.)

Another use of semicolons is for lists in a paragraph where one of the list items has a comma in it (similar to the parsing problem for CSVs where some records contain commas): I only like wine; beer, but only ales; and orange juice.

dspillett · 2025-03-28T12:30:06 1743165006

> Unicode has the original ASCII hyphen-minus (U+002d), as well as a dedicated hyphen (U+2010), other functional hyphens…

Which can be fun when parsing CSV files from various sources. I've hit numbers with U+2010 or others where you would expect a hyphen-minus to be. Presumably someone² has copied a negative number from a document where one of the alternate symbols was used, and pasted it into everyone's favourite data-mangler¹ which interpreted it as a string, and so on down the chain.

--------

[1] Excel. Sometimes a joy, sometimes the bane of my existence.

[2] It is surprising, horrifying even, how much manual manipulation of data goes on in banking, where you might naturally assume everything is more automated these days. Sometimes a laborious manual process done regularly is seen as cheaper than paying for it to be automated…
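One defensive approach is to normalize the look-alikes before converting the field to a number. The sketch below assumes that every listed dash in a numeric column was meant as a minus sign, which may not hold for all data; parse_amount and the sample CSV are invented for illustration.

```python
import csv
import io

# Assumption: in a numeric field, each of these was meant as a minus sign.
DASH_TO_MINUS = str.maketrans({
    "\u2010": "-",  # HYPHEN
    "\u2011": "-",  # NON-BREAKING HYPHEN
    "\u2012": "-",  # FIGURE DASH
    "\u2013": "-",  # EN DASH
    "\u2212": "-",  # MINUS SIGN
})


def parse_amount(field: str) -> float:
    """Convert a CSV field to float after mapping dash look-alikes to '-'."""
    return float(field.strip().translate(DASH_TO_MINUS))


# Made-up sample: row 1 uses U+2010, row 2 uses U+2212, row 3 is positive.
sample = "id,amount\n1,\u201012.50\n2,\u22123\n3,7\n"
rows = list(csv.reader(io.StringIO(sample)))
amounts = [parse_amount(row[1]) for row in rows[1:]]
print(amounts)
```

Without the translation step, Python's float() rejects the U+2010 and U+2212 variants with a ValueError, which is at least a loud failure; the quiet failure is a tool that silently keeps the field as a string.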

divbzero · 2025-03-28T00:51:06 1743123066

I prefer the dedicated minus (U+2212) over the hyphen-minus (U+002d) for mathematical use because they look different in most font faces.

Are there cases where the dedicated hyphen (U+2010) is preferred over the hyphen-minus?

LegionMammal978 · 2025-03-28T01:22:57 1743124977

G. Brandon Robinson swears by U+2010 for hyphens in groff's Unicode output [0], but I see it as a hypercorrection. The most common convention by far (among authors who use Unicode and care about dashes) is to use U+002D for hyphens and U+2212 for minus signs. Not even the Unicode Consortium uses U+2010 for hyphens in its documents, and I'm not aware of any major organization that does.

As far as appearance goes, almost all fonts I've looked at make U+2010 identical to U+002D (i.e., they don't put any 'minus' into the 'hyphen-minus'), but a few make U+2010 a smidgeon shorter.

[0] https://news.ycombinator.com/item?id=38121765

mmooss · 2025-03-28T23:34:21 1743204861

Edit: G. Branden Robinson (note spelling) is the maintainer of groff.

https://www.gnu.org/software/groff/

wruza · 2025-03-28T01:25:51 1743125151

Intl.NumberFormat also prefers it, but then you can't paste negative numbers into most financial software, calculators, spreadsheets. Even back into inputs on the same webpage, if it does custom number parsing. Even though <input type=number> accepts U+2212 as a minus, it turns it into a regular minus when you spin it down to -2.

It looks much better though and more visible: −1 vs -1. I wish hyphen was a separate symbol from the ascii start, or that monospace fonts didn't tend to shorten "-" cause it makes little sense in monospace anyway.

layer8 · 2025-03-28T14:17:23 1743171443

It has two potential benefits:

— In the context of automatic text processing, it unambiguously indicates the function of a hyphen, as opposed to a minus

— Fonts can choose to make the hyphen-minus a bit wider than a regular hyphen, to accommodate the usage as a minus sign. In that case, U+2010 would be typographically more appropriate for a hyphen, similar to how U+2212 usually is typographically more appropriate for a minus sign.

zajio1am · 2025-03-28T11:58:28 1743163108

The visual style of the hyphen-minus depends on the font. Some fonts display it more like a minus, others like a hyphen. So if you care about distinguishing hyphen and minus, it makes sense to use the dedicated hyphen and minus, and not use hyphen-minus at all.

mproud · 2025-03-28T03:04:27 1743131067

A regular hyphen arguably looks better when used as a hyphen and not a minus.

docmars · 2025-03-28T16:08:34 1743178114

EN dashes are also great for date ranges: 1/1/2025–3/28/2025

energy123 · 2025-03-28T01:28:25 1743125305

The em dash is now a GPT-ism and is not advisable unless you want people to think your writing is the output of a LLM.

sho_hn · 2025-03-28T01:56:59 1743127019

My advice is to take pleasure and have confidence in good writing, over misspent energy worrying about things like this.

If you practice your skills, you will reap the rewards.

alt187 · 2025-03-28T01:48:46 1743126526

The letter 'm' is now a GPT-ism and is not advisable unless you want people to think your writing is the output of a LLM.

xanderlewis · 2025-03-28T01:31:20 1743125480

No, thanks—I’ll keep using them as I always have.

mmooss · 2025-03-28T04:25:09 1743135909

Someone else said the same. How can that be when most word processors, and at least some phone keyboards, automatically insert em dashes?

grey413 · 2025-03-28T09:22:08 1743153728

It's infuriating that people are drawing this conclusion. LLMs pick up on em dash usage because professional and skilled writers use em dashes. They're a consistently useful, if niche, part of the literary toolkit.

But, no, now it's a problem because the majority of people's experience with writing is graded essays. And because LLMs emulate professionals, it's now a red flag if students write too much like professionals. What a joke.

phlakaton · 2025-03-28T04:37:11 1743136631

Emily Dickinson wept—

mmooss · 2025-03-28T05:16:05 1743138965

Ha, good point, and an interesting question: What kinds of dashes did Dickinson intend?

It's a hard one to answer: We could look at published Emily Dickinson books from the time, but did Dickinson really pay that close attention to or have that much control over the type?

We could look at Dickinson's actual personal documents, but if they were handwritten, distinguishing dashes could be difficult even if there was intention there.

armedgorilla · 2025-03-28T11:59:01 1743163141

Fortunately we have troves of her handwritten documents; all of her poems were first printed posthumously. To me, she's using the punctuation as pacing or tonal markers as opposed to ligatures ("I'll clutch— and clutch— " vs "I'll clutch-and clutch-"). Many publishers style these marks as longer than normal m-dashes for that reason, which makes sense seeing as they are rarely used as asides.

I interpret her marks—

as breathless pauses—

that— having no unicode—

should be given to m—

and space—

https://www.edickinson.org/editions/2/image_sets/12170035

phlakaton · 2025-03-28T12:26:10 1743164770

Em-dashes have been the norm in every Dickinson poem I read, and I think it might have derived from the preferences of Victorian publishers, who I understand loved those long dashes.

mmooss · 2025-03-28T19:01:28 1743188488

Great comment. Thank you!

grey413 · 2025-03-28T09:26:53 1743154013

I imagine it would have been up to the typesetter to make the call. The conventions for dash usage are fairly straightforward. You use em-dashes for asides, en dashes for ranges, and hyphens for most other cases. It's easy to figure out the right character from context (apart from en ranges vs hyphen ranges).

lostlogin · 2025-03-28T21:29:22 1743197362

I had a quick search, attempting to find a great author who hated em dashes and preferred the vastly superior en dash. I found nothing.

This list of authors' punctuation quirks is interesting though.

https://lithub.com/the-punctuation-marks-loved-and-hated-by-...

phlakaton · 2025-03-29T16:27:22 1743265642

You want Robert Bringhurst, poet and typographic nerd. He gives them special withering attention in his Elements of Typographic Style. I think he referred to them as Victorian excrescences?

nkotov · 2025-03-28T14:12:07 1743171127

Recently ran into this. Didn't realize it was that obvious.

windward · 2025-03-28T15:25:52 1743175552

And you'd better not 'delve' into anything

raverbashing · 2025-03-28T07:23:25 1743146605

You are right of course

However this is the kind of rule that "existed" for a while and most likely will go away as most people can't be bothered with the difference and it all looks similar anyway

Or maybe who knows, it will keep going on because chatgpt knows it

econ · 2025-03-28T02:49:13 1743130153

I've always wanted an array or object with range keys like: arr[0–2] = 123; if(arr[1.5555]>122){}

yesbabyyes · 2025-03-28T07:59:23 1743148763

That doesn't seem to be an array at all, if the idea is to check whether a number is within a range. Seems like an interesting data type though, a combination of a range data type and a map/associative array.

econ · 2025-03-29T03:51:24 1743220284

I was thinking of a sparse array but any name will do. obj[~42] ?

One may have a bunch of key ranges each associated with a value, or one may have a key that should be "rounded" to the nearest key, or retrieve the one below or above it.

It feels like something basic enough to have in a language and I found it oddly complicated to write myself. Comparing it with all values doesn't seem like a very good solution.
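For what it's worth, a sorted list plus binary search avoids comparing against every key. The following is a minimal sketch in Python, not a standard library type: the RangeMap class and its method names are invented here, it assumes half-open ranges [lo, hi) that never overlap, and it does no validation.

```python
import bisect


class RangeMap:
    """Map half-open numeric ranges [lo, hi) to values.

    Assumes the stored ranges never overlap; nothing checks this.
    """

    def __init__(self):
        self._starts = []  # sorted range starts, used for binary search
        self._items = []   # parallel list of (lo, hi, value)

    def set(self, lo, hi, value):
        i = bisect.bisect_left(self._starts, lo)
        self._starts.insert(i, lo)
        self._items.insert(i, (lo, hi, value))

    def get(self, key, default=None):
        # Find the last range whose start is <= key, then check its end.
        i = bisect.bisect_right(self._starts, key) - 1
        if i >= 0:
            lo, hi, value = self._items[i]
            if lo <= key < hi:
                return value
        return default


m = RangeMap()
m.set(0, 2, 123)
m.set(5, 10, "x")
print(m.get(1.5555))  # 123
print(m.get(3))       # None, since 3 falls between the two ranges
```

The "nearest key" variant would use the same bisect call and compare the neighbors on either side of the insertion point instead of checking a range end.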

Not that I know many languages.

paulddraper · 2025-03-28T05:08:47 1743138527

In Python it’s a colon.

econ · 2025-03-29T03:52:29 1743220349

Nice, covers at least some of the abstraction "problem".

mproud · 2025-03-28T03:03:22 1743131002

A Figure Dash is perfect for phone numbers (especially when working with tabular numbers).

hilbert42 · 2025-03-29T05:09:39 1743224979

"There's also the figure dash…"

Re last paragraph: dashes, etc. are confusing for perhaps most of us who aren't, say, typesetters, myself included. I use EM dashes a lot usually without a space between words and sometimes with spaces when I think the typography calls for it—or for extra emphasis.

Essentially, most of us guess the rules and often this doesn't matter much but it can in certain circumstances.

For example, in say machine conversion/transliteration. The ASCII dash is often used as a substitute for Unicode minus sign because it's easy to select [it's my usual practice], and anyway many don't know there is an actual difference. Whilst a human will usually know the difference by its use or context a machine may take the literal interpretation which could lead to say a numerical calculation error.

This problem has annoyed me for a long while. Why is it that word processors and editors do not highlight these characters and query whether the usage is correct? Surely this ought not to be that difficult.
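The detection half, at least, is not difficult. Here is a rough sketch using Python's standard unicodedata module: it reports every character in Unicode's dash punctuation category (Pd) plus the minus sign, which sits in a different category. The function name dash_report is hypothetical, and deciding whether a given usage is correct is the hard part this does not attempt.

```python
import unicodedata

MINUS_SIGN = "\u2212"  # category Sm (math symbol), so it needs a separate check


def dash_report(text: str):
    """Return (index, code point, Unicode name) for each dash-like character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Pd" or ch == MINUS_SIGN
    ]


# Hyphen-minus, en dash and minus sign in one made-up string.
for hit in dash_report("a-b \u2013 \u22121"):
    print(hit)
```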

Another example is Roman numerals. The average person will enter say an uppercase 'I' for the Roman numeral one. Here's a typical example which is incorrect:

WWII

Here I entered the normal ASCII 'I' because it was too involved to find the correct Unicode character for Roman numeral one.
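For reference, the dedicated characters do exist (U+2161 is ROMAN NUMERAL TWO), but Unicode treats them as compatibility characters: NFKC normalization folds them back to ordinary Latin letters. A short sketch with Python's standard unicodedata module shows this, which suggests software can treat the two spellings as equivalent when it chooses to normalize.

```python
import unicodedata

roman_two = "\u2161"
print(unicodedata.name(roman_two))

# NFKC replaces the single numeral character with two capital letter I's.
folded = unicodedata.normalize("NFKC", "WW" + roman_two)
print(folded, len(folded))
```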

I'd like to know what others who are in typography, machine learning etc. think about this, and why WP programs and editors don't have simple ergonomics that allow for easy selection of the correct character.

† On a related matter, you'll note I've used single quotes whereas mmooss uses double quotes. This tells me that mmooss is likely in the US whereas I'm not. Again, this is not really a major problem for humans but it can be in transliteration, etc. Also, it's unclear (at least to me) what the default is for quoting quotes, i.e.: "" versus "' (right, I've refrained from using triple quotes).

Again, this seems country specific with I believe the US favoring double followed by single. Even when these rules are defined do people strictly adhere to them?

st_goliath · 2025-03-28T00:12:46 1743120766

Also, not to be confused with "一", which is a different thing entirely……

mortos · 2025-03-28T02:20:03 1743128403

This one is U+4E00, CJK Unified Ideograph-4E00. So it's a common character between Chinese, Japanese, and Korean. This should be "one" in all three. And it does technically look a little different than a dash: https://unicodeplus.com/U+4E00

KPGv2 · 2025-03-28T04:18:42 1743135522

And this is different from Japanese's chuuonpu (U+30FC) which is a vowel elongation mark, and it's rendered horizontally or vertically depending on whether the text direction is horizontal or vertical, respectively.

ー