The case of double characters when printing man pages
By Thom Holwerda on 2017-08-09 21:50:10

In what's never going to be a regular occurrence, I'm linking to a Twitter thread. Chris Espinosa tweets:

Just as I was wrapping up an email and getting ready to leave work, a co-worker rolled his chair over to show me an "interesting" thing.

Go ahead, read it.

UNIX, man. Not even once.

RE[3]: UNIX showing its age again...
By Doc Pain on 2017-08-11 01:06:58
> I wouldn't consider most of your examples to be "convenient". I recall piping the output of man to nroff, but that's a really old memory.

Personally, I find that

man ls > ls.txt

is all I need to do under my current bash environment.

When I do this in my (home) UNIX environment (not Linux, not bash), I get a text file with control characters (the ^H as backspace for "overprinting" as well as the double characters and the _ underlining). Example line:

-^H-D^HD _^Hf_^Ho_^Hr_^Hm_^Ha_^Ht

This is the text found in the text file for

-D format

Reading the file with "less ls.txt" again restores the look of the manpage including highlighted and underlined characters, in this example, "-D" printed bold, and "format" underlined. Outputting it with "cat ls.txt" directly to the terminal does not display the control characters - normal text is written to the terminal. But they are still there - "cat ls.txt | less" leads to the "manpage style" again.
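To make the encoding above concrete, here is a minimal Python sketch of how a pager could interpret such a line. It is not how less is actually implemented; the function name classify_overstrike and the style labels are made up for illustration. It assumes the two conventions described above: "X, backspace, X" means bold, and "_, backspace, X" means underlined.

```python
def classify_overstrike(line):
    """Turn an overstruck line into a list of (character, style) pairs.

    Assumed conventions (as described in the comment above):
      X \b X  -> bold X
      _ \b X  -> underlined X
    Anything else is treated as plain text.
    """
    cells = []
    i = 0
    while i < len(line):
        # A triple "first, backspace, second" is one overstruck cell.
        if i + 2 < len(line) and line[i + 1] == "\b":
            first, second = line[i], line[i + 2]
            if first == second:
                cells.append((second, "bold"))
            elif first == "_":
                cells.append((second, "underline"))
            else:
                # Unknown combination: keep the last character written.
                cells.append((second, "plain"))
            i += 3
        else:
            cells.append((line[i], "plain"))
            i += 1
    return cells


sample = "-\b-D\bD _\bf_\bo_\br_\bm_\ba_\bt"
cells = classify_overstrike(sample)
print("".join(ch for ch, _ in cells))
print([style for _, style in cells])
```

For the sample line, the two characters of "-D" come out as bold and the six characters of "format" as underlined, which matches what less shows.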

It might be possible that in your environment the control characters are stripped automatically. Maybe your man's roff implementation (nroff, groff, troff, etc) behaves differently, maybe the shell changes $PAGER for the process to something other than the traditional "less" upon redirection...
RE: Twitter? Not even once...
By Alfman on 2017-08-11 05:47:43

> This is the first time in many, many years that I have gone to Twitter to read anything.
Ok, so perhaps I am an old fart, but WTF?!?
It is.
Kind of.
I couldn't make it.

twitter is the geocities of millennials.
RE[2]: Twitter? Not even once...
By The123king on 2017-08-11 08:30:58
I think Geocities would be offended at the comparison
By Megol on 2017-08-11 09:00:13
Not being a twit
By Megol on 2017-08-11 09:00:55
reading that
By Megol on 2017-08-11 09:01:35
is a bloody pain!
RE: Comment by Drumhellar
By Carewolf on 2017-08-11 11:29:55
Which also explains why his shell is so bad it didn't handle the conversion in the streaming operator like a modern Linux shell.
RE[4]: UNIX showing its age again...
By Soulbender on 2017-08-11 12:07:51
> man ls > ls.txt

Gives a correctly formatted text file on Ubuntu 17.04.
RE: Comment by Licaon_Kter
By Kochise on 2017-08-11 12:39:24
Chris Espinosa (@cdespinosa), Aug 8

Just as I was wrapping up an email and getting ready to leave work, a co-worker rolled his chair over to show me an “interesting” thing. He was in Terminal, looking for documentation on a new feature, and had been told that it only existed as a Unix ‘man’ page. He wanted to email it to somebody so he’d used the Terminal feature to capture output to a file, and had opened it in a text editor, but in the text file the title of the command and certain words in the text had ddoouubblleedd cchhaarraacctteerrss.

And forty years of my life dissolved away as into a mist, like I had nibbled on Proust’s madeleine. And I laughed and cried at the same time (Reply to this post if you know where this is going). Of course, I told him, Terminal is a tty, and man thinks that it’s a Decwriter from 1975. It is printing char-backspace-char to make it bold, and Terminal simulates the overstrike. But when it’s copied to a file it’s just 0x08 and nonprinting, and you see the doubled characters.
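The doubling can be reproduced in a few lines of Python. This is only a sketch of the effect described in the tweet, not of what man or any editor actually does internally; the helper names overstrike_bold and drop_backspaces are hypothetical, and the "editor" here is assumed to simply hide or discard the 0x08 bytes.

```python
BACKSPACE = "\b"  # 0x08, the nonprinting character mentioned above


def overstrike_bold(word):
    """Encode a word the teletype way: each char, a backspace, the char again."""
    return "".join(ch + BACKSPACE + ch for ch in word)


def drop_backspaces(text):
    """Mimic a viewer that ignores 0x08 instead of moving the cursor back."""
    return text.replace(BACKSPACE, "")


encoded = overstrike_bold("doubled")
print(repr(encoded))
print(drop_backspaces(encoded))
```

Because the backspace is dropped rather than acted on, both copies of every character survive, which is where the doubled characters come from.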

But I remembered, from 1997 when NeXT came in, or 1987 when I managed A/UX, or 1977 when I troffed the Apple II Reference Manual, that the man page for man had the instructions for how to stop this. A flag? Redirecting standard out to a non-tty pipe? ‘man man’ it is… And though the man page has changed a lot in 40 years, it was still there: “| col -b > filename.txt” to strip the doubled characters, along with the magic words “reverse line feed.” Ah! Pinfeed perforated paper on a letter-quality printer!
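For readers without col at hand, the backspace part of what "col -b" does can be approximated in Python. This is a rough sketch under stated assumptions, not a reimplementation: it only handles backspaces by keeping the last character written to each column, and it ignores reverse line feeds, tabs and other control characters that the real col deals with. The name strip_overstrike is made up for this example.

```python
def strip_overstrike(text):
    """Keep only the last character written to each column of each line."""
    cleaned_lines = []
    for line in text.split("\n"):
        cells = []   # one entry per column
        column = 0   # current cursor position
        for ch in line:
            if ch == "\b":
                # Backspace moves the cursor left but erases nothing.
                column = max(column - 1, 0)
            else:
                if column < len(cells):
                    cells[column] = ch   # overstrike: last write wins
                else:
                    cells.append(ch)
                column += 1
        cleaned_lines.append("".join(cells))
    return "\n".join(cleaned_lines)


print(strip_overstrike("-\b-D\bD _\bf_\bo_\br_\bm_\ba_\bt"))
```

Applied to the overstruck "-D format" line quoted earlier in the thread, this yields the plain text without bold or underline markup.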

See, there were these rollers with pins on them that fed the paper. But to tear off the last page, the perf needed to be above the pin feed. That left 2” unused, so if your top margin was less than that, it needed to inject a Form Feed, then six lines of Reverse Line Feed to print the top of the page. But if you were piping to a text file, the reverse line feeds would just overprint the last lines on the previous page. col filtered them out.

The memory of years of fussing with pinfeed paper and RIBBONS for God’s sake and top alignment and that in 2017 all that code is still there but the main point is that it being a living document we have no idea of what it originally said without the edit history.
RE[2]: Capabilities
By uridium on 2017-08-11 12:59:39
Yep. It's like classic cars, plenty of us aa^Hdd^Hdd^Hii^Hcc^Htt^Hss^ H enthusiasts out there that still run them. Granted you can emulate faster than real hardware, but part of the charm is restoring, running and operating these old girls. I have a few. Once you've moved in the circles, pdp-11 owners come out of the woodwork. Plenty around :)

There are even modern extensions nowadays, such as Bilquist's BQTCP, Telnet, FTP, and web server (BQHTTP) for them. The old girls continue to run and be enjoyed. If you just want to emulate one, try simh as a starting point and see if you catch the bug. If you do, a cheap DECServer550 with the resistor and ROM hack is a cheap and easy entry-point.
