Fonts and Editors (was: Re: US school district spied on students through webcams, court [telecom])

> ***** Moderator's Note *****

>> PLEASE do not paste text from a microsoft editor into email
>> submissions you are sending to the digest! It leaves proprietary
>> artifacts in the text which I must edit out by hand.
>
> More correctly, they can use an MS editor provided they save in
> plain text mode.
>
> ***** Moderator's Note *****
>
> Please don't. Microsoft's plan for world domination includes using
> "Windows" fonts in every application, and AFAIK that includes their
> "plain text" editors. When an email arrives with a header that says
> "charset=us-ascii" (United States - American Standard Code for
> Information Interchange), I'm entitled to assume that there are no
> "High ASCII" characters in it, i.e., no bytes with a value above 127
> decimal.
> [...]

Frustration understood. :-)

Where I've seen the problem occur frequently is copy'n'pasting from a web page into an editor or even Thunderbird's entry window, with the offending characters mostly being left and right single and double quotes and an em-dash ("---" as a single horizontal line).
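For anyone who would rather automate that cleanup than fix it by hand, here is a minimal Python sketch. It assumes the pasted text has already been decoded to a Unicode string; the mapping table and the function names are hypothetical and cover only the characters mentioned above plus the en dash.

```python
# Hypothetical mapping of common "smart" punctuation to ASCII stand-ins.
SMART_TO_ASCII = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
}


def to_plain_ascii(text: str) -> str:
    """Replace the punctuation listed above with ASCII equivalents."""
    return text.translate(str.maketrans(SMART_TO_ASCII))


def non_ascii_chars(text: str) -> list:
    """Return the distinct characters with a code point above 127."""
    return sorted({ch for ch in text if ord(ch) > 127})


pasted = "\u201cdon\u2019t be a no show\u201d \u2014 please"
cleaned = to_plain_ascii(pasted)
print(cleaned)
print(non_ascii_chars(cleaned))
```

Anything still reported by non_ascii_chars() after the translation step needs a decision by hand, which is roughly what the Emacs pass does visually.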

What I (mostly) do now is copy'n'paste from a web page to an Emacs window, and then I can see and correct the bad characters, then copy'n'paste from the Emacs window to Thunderbird's entry windows and everything's fine. Use of Emacs is also great to keep line length around 75 characters or so via ESC-Q after setting the fill column (^U nnn ^X F).

Another choice of editor for Windows users is Textpad, free for personal use and available here:

It's probably *THE* most featureful text editor for a Windows system and yet is extremely easy to use. Among other things it can correct characters and read/save files in Windows and UNIX/Linux formats.

Emacs for Windows is available here (can download with a browser):

and if you want to make the "Caps Lock" a proper [CTRL] key, the best and most reliable approach for Win2K, WinXP, Vista and Win7 is here (writeup first):

and, yes, I use it on my Win2K, WinXP, Vista and Win7 systems.

***** Moderator's Note *****

Since this post is a big change of subject, I have "de-threaded" it.

I use emacs to edit the Digest, but it's not for the faint-hearted, and I'll warn potential users that there's a big learning curve to climb (Some say the name stands for extend-meta-alt-control-spacebar, due to the way emacs makes extensive use of the Escape, Alt, and control keys). Although emacs offers features - such as "rectangle editing" - which aren't available in most editors, it's a big change from the Windows world.

Other editors may be more user-friendly, but the plain truth is that those who are used to a what-you-see-is-all-you-get environment will find it easiest to stick with what they know already. The "Official" character set of the Digest is "ISO-8859-1", which is a compromise between ASCII and Unicode Transformation Format (UTF): at some point, we'll probably have to convert to UTF-8, but for now I'll just ask readers to not cut and paste unless they _know_ that the result doesn't contain proprietary, non-ISO-8859-1 characters.
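One way a reader could check this before submitting is sketched below in Python. The function name is hypothetical, and the check assumes the text is already a Unicode string: ISO-8859-1 covers exactly code points 0 through 255, so anything above that cannot be represented.

```python
def chars_outside_latin1(text: str) -> list:
    """Return (offset, character) pairs that ISO-8859-1 cannot encode."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFF]


sample = "caf\u00e9 \u201cx\u201d"
for offset, ch in chars_outside_latin1(sample):
    # Show where each problem character sits and its code point.
    print(offset, hex(ord(ch)))
```

Note that this treats the 0x80 to 0x9F range as acceptable because those code points do exist in ISO-8859-1 as control characters; a stricter check might flag them too.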

Bill Horne Moderator

P.S. There's a wealth of information on this issue, since it's as old as the Internet. Some links I got with a quick search follow:

formatting link
formatting link
formatting link
formatting link

Thad Floryan


Bill Gates "[Deity] Complex" notwithstanding, his three programs, Notepad, Wordpad, and even his full-blown word processor program indeed can, and will, work at the basic ASCII level. Problem is, too many users do not get it.

BTW, Apple was the pioneer at pushing fonts. ;-)

Sam Spade

Notepad won't correct slanty quotes. Here's a snippet from a Word document I received; I copied it into Notepad, then copied the Notepad version into this message. If Bill doesn't "correct" it for me, you'll see what slanty quotes do:

[quote] Please ?don?t be a no show.? [unquote]

Neal McLain

***** Moderator's Note *****

I've tried to leave the quote intact, but I had to edit the message since someone forgot to put "[telecom]" in the subject line.

In case the editor and/or the reader's news/mail client doesn't reproduce the quote accurately, or renders the "incorrect" characters "correctly", the offending characters have octal values of 223 and 224, which are, respectively, 147 and 148 decimal.

Those are, of course, the "open quote" and "close quote" characters from the Windows-1252 character set. ISO-8859-1 doesn't have an equivalent that I could find: it does, however, offer «chevron style quotes». C'est la guerre.
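The arithmetic and the two interpretations of those bytes can be checked with a short Python sketch. The function name is hypothetical; "cp1252" and "latin-1" are Python's standard codec names for Windows-1252 and ISO-8859-1.

```python
import unicodedata


def describe(octal: str) -> tuple:
    """Return (decimal value, Windows-1252 character, ISO-8859-1 character)."""
    value = int(octal, 8)
    raw = bytes([value])
    return value, raw.decode("cp1252"), raw.decode("latin-1")


for octal in ("223", "224"):
    value, win, iso = describe(octal)
    # unicodedata.category() reports "Cc" for control characters.
    print(octal, value, hex(value), repr(win), repr(iso), unicodedata.category(iso))
```

Decoded as Windows-1252, the bytes are the curly double quotes; decoded as ISO-8859-1, the same bytes land in the C1 control range, which is why they tend to show up as "?" or as nothing at all.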

Bill Horne Moderator

Neal McLain

Are we now counting angels on the head of a telephonic pin? ;-)

***** Moderator's Note *****

Let's think of it as the Usenet Mesa, instead of the telephonic pin. ;-)

I asked the contributors not to cut-and-paste from sources with proprietary font codes, since I must edit such things by hand. Someone else said that Microsoft's low-end text editors could work with ASCII. Neal pointed out that Microsoft's products work with the _Microsoft_ character set, and that they _will_ include non-standard characters in saved files or cut/copied text.

It's an old problem, and we're not going to solve it here. Suffice to say that ISO-8859-1 is the "standard" character set for the Telecom Digest.

This is relevant to telecom because we have to agree on a character set in order to discuss telecom issues. You might not care _which_ hotel a trade conference is held at, but you'll obviously care that all the attendees go to the same one.

Bill Horne Moderator

Sam Spade

Yikes, I'm tempted to stick my head even deeper into the sand for this one. Over the decades I've struggled to understand character sets and coding schemes, and the more I research it, the more confused I get. Big-Endian vs. Little-Endian 32-bit numbers are enough to make my brain blow an o-ring...

I'm typing this on EditPad 3.5.1, the hottest new version of EditPad (in 1999). I'm running on the dreaded NT5 (laugh all you want), and after I type this I will likely take the lazy way out and copy/paste it directly into the google groups reply page. Will that result in bogus characters? Please advise, and also advise as to what really is the best way to make a submission to the digest now. (e-mail? google web interface? Patrick's old page? I know you've covered this question, I just can't seem to find it...)

The "convert" pull-down menu in EditPad has two commands, "OEM to ANSI" and "ANSI to OEM." I've never really understood what they did, because this version only _saves_ files in one format - the "native" (perverse?) ASCII format of windows. I know the basic purpose is the whole "CR+LF" vs "CR" only deal, but what happens to the file the next time it is opened and saved on the windows platform? Like I said, I just get more confused... It will not even read unicode files, even if said unicode files contain only characters below ASCII 255 with the second eight-bit "half-word" padded out with zeroes.
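For what it's worth, on Windows "OEM" and "ANSI" conventionally name two code pages (the DOS one and the Windows one), so a command like that most likely remaps characters above 127 and leaves line endings alone. That is an assumption about EditPad, not something verified against it. The Python sketch below assumes the US pairing of code page 437 and Windows-1252; the function name is hypothetical.

```python
def oem_to_ansi(data: bytes, oem: str = "cp437", ansi: str = "cp1252") -> bytes:
    """Re-encode bytes from an assumed OEM code page to an assumed ANSI one."""
    # Characters with no ANSI equivalent (box-drawing, for example) become "?".
    return data.decode(oem).encode(ansi, errors="replace")


# e-acute is byte 0x82 in code page 437 but 0xE9 in Windows-1252.
print(oem_to_ansi(b"caf\x82"))

# CR+LF is plain ASCII in both code pages, so it passes through unchanged.
print(oem_to_ansi(b"one\r\ntwo"))

# The "padded with zeroes" Unicode files described above look like UTF-16:
# every ASCII character is followed by a zero byte in little-endian order.
print("Hi".encode("utf-16-le"))
```

An editor that expects one byte per character will see those zero bytes as embedded NULs, which may be why such files fail to open cleanly.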

Another strange thing: over the years I've discovered about a half-dozen Digest Archive files on the MIT ftp server that appeared to just end in mid-sentence when read with this program (EditPad). This would usually happen right when Patrick was just getting up a good head of steam on one of his editorials, and it was a bummer. I discovered (accidentally) that the files read just fine in UltraEdit32 (another "read anything/write anything" program that still leaves you guessing half the time as to what format you are _really_ in), and once re-saved as ASCII from there (again on an NT platform), they were readable in EditPad. I looked at them in UltraEdit32 in hex mode at the time, but I can't remember now if there was an errant EOF marker in them, or if it was something else. I can probably figure out the exact files by looking at the "last modified" date on the copies in my archive here - I think they were all from the same year.
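If an errant EOF marker was the culprit, it would most likely be a Ctrl-Z byte (0x1A), which DOS-era programs treat as end-of-file in text mode. That is only one possible cause, and the post does not confirm it. A minimal Python sketch for checking a file's bytes follows; the function name is hypothetical.

```python
def find_ctrl_z(data: bytes) -> list:
    """Return the offsets of every Ctrl-Z (0x1A) byte in the data."""
    return [i for i, b in enumerate(data) if b == 0x1A]


# Example with an embedded Ctrl-Z in the middle of the text.
sample = b"first half\x1asecond half"
print(find_ctrl_z(sample))
```

For a real archive file, the bytes would come from open(path, "rb").read(); opening in binary mode matters, since text mode may apply its own decoding and newline handling.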

Jim

******************************************************** Speaking from a secure undisclosed location.
JimB

In your case, as you are posting through Google Groups, it really doesn't matter what you are using. Google Groups is notorious for adding "nonprinting" characters (that can print on one's display terminal emulation anyway) to quotes, and the usual reformatting that breaks lines without allowing the user to control any of it. On top of it, Google Groups sends messages to Usenet with the character set mismarked.

This didn't happen in your message, unless our editor neatened things up. However, the article was marked ISO-8859-1 in lieu of ASCII, even though I didn't spot any non-ASCII characters. That's not wrong (as ASCII is a subset) but it's not especially helpful either. ASCII is universally displayable as intended.
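The labeling point can be illustrated with a small Python sketch that picks the narrowest label a message body actually needs. The function name is hypothetical, and this is not a description of how any particular news client chooses its charset header.

```python
def minimal_charset(text: str) -> str:
    """Return the narrowest of us-ascii, iso-8859-1, utf-8 that can hold text."""
    for name in ("us-ascii", "iso-8859-1"):
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            pass  # Too narrow; try the next wider character set.
    return "utf-8"


print(minimal_charset("plain old text"))
print(minimal_charset("caf\u00e9"))
print(minimal_charset("\u201cslanty quotes\u201d"))
```

A body that comes back as "us-ascii" is valid under the wider labels as well, which is the "subset" point made above; the narrow label simply promises more to the reader.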

So another way to be helpful in this regard is not to post via Google Groups.

***** Moderator's Note *****

Well, I appreciate the sentiment, but let's not get too far afield: I don't discriminate between posts from Google versus posts created with newsreaders, or email clients, etc. I take 'em any way I can get 'em, and those who post via Google are as welcome as any other. While I would _like_ to have an easier job, I _know_ that there are many ways to read and contribute to the Digest, and I'd rather have posts that need a little work than none at all.

Google Groups has an advantage that some other ways of submitting do not: I ask the readers to make reasonable efforts to preserve "Threading" information on posts that they reply to, and Google Groups does that, so I'd rather have a reply filed via Google Groups than one that's sent as a "new" email, because I have to do research to retrieve the fields of the previous messages in the thread and add them to such "new" emails by hand.

Concerns about the character set come *after* the threading: it may be a PITA to have to read a reply with unusual characters in it, but at least you Û_can_ Üread Üit as part of the thread it's intended for, rather than having to jump around and mentally re-adjust to every subject as you go back-and-forth between threads. (Extraneous characters added for emphasis).

Bill Horne Moderator

Adam H. Kerman
