What is Usenet

July 2010

Introduction.

Usenet is one of the oldest computer network communications systems still in widespread use. It was established in 1980, following experiments from the previous year, over a decade before the World Wide Web was introduced and the general public got access to the Internet. It was originally conceived as a "poor man's ARPANET," employing UUCP to offer mail and file transfers, as well as announcements through the newly developed news software. This system, developed at the University of North Carolina at Chapel Hill and Duke University, was called USENET to emphasize its creators' hope that the USENIX organization would take an active role in its operation (Daniel et al., 1980).

The articles that users post to Usenet are organized into topical categories called newsgroups, which are themselves logically organized into hierarchies of subjects. For instance, sci.math and sci.physics are within the sci hierarchy, for science. When a user subscribes to a newsgroup, the news client software keeps track of which articles that user has read.
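
The dotted naming scheme can be illustrated with a short sketch. The function names below are made up for this example and are not part of any news software; the sketch only shows how a name such as sci.math implies its place in the sci hierarchy.

```python
def hierarchy_path(newsgroup):
    """Return every ancestor prefix of a dotted newsgroup name, top level first."""
    parts = newsgroup.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def group_by_top_level(newsgroups):
    """Map each top-level hierarchy to the newsgroups that belong to it."""
    tree = {}
    for name in newsgroups:
        tree.setdefault(name.split(".")[0], []).append(name)
    return tree


print(hierarchy_path("sci.math"))
print(group_by_top_level(["sci.math", "sci.physics", "rec.music"]))
```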

In most newsgroups, the majority of the articles are responses to some other article. The set of articles which can be traced to one single non-reply article is called a thread. Most modern newsreaders display the articles arranged into threads and subthreads, making it easy to follow a single discussion in a high-volume newsgroup.
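
As a rough sketch of how a newsreader might group articles into threads, the following assumes each article is known only by an id and the id of the article it replies to (None for a non-reply article). That data model is an assumption made for illustration; real newsreaders work from article headers and use more elaborate heuristics.

```python
from collections import defaultdict


def build_threads(articles):
    """Group articles by the non-reply article each one traces back to.

    `articles` maps an article id to the id it replies to (None for a
    non-reply article). If a parent is unknown locally, the oldest known
    ancestor id is used as the thread root.
    """
    def root_of(article_id):
        seen = set()  # guards against accidental reference loops
        while articles.get(article_id) is not None and article_id not in seen:
            seen.add(article_id)
            article_id = articles[article_id]
        return article_id

    threads = defaultdict(list)
    for article_id in articles:
        threads[root_of(article_id)].append(article_id)
    return dict(threads)


sample = {"a": None, "b": "a", "c": "b", "d": None}
print(build_threads(sample))
```

Here articles b and c both trace back to a, so they form one thread, while d starts a thread of its own.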

When a user posts an article, it is initially only available on that user's news server. Each news server, however, talks to one or more other servers (its "newsfeeds") and exchanges articles with them. In this fashion, the article is copied from server to server and (if all goes well) eventually reaches every server in the network. Later peer-to-peer networks operate on a similar principle, but on Usenet it is normally the sender, rather than the receiver, who initiates transfers. Some have noted that this seems a monstrously inefficient protocol in an era of abundant high-speed network access; however, Usenet was designed for a time when networks were much slower and not always available. Many sites on the original Usenet network would connect only once or twice a day to batch-transfer messages in and out.

Usenet has significant cultural importance in the networked world, having given rise to, or popularized, many widely recognized concepts and terms such as "FAQ" and "spam."

Today, almost all Usenet traffic is carried over the Internet. The current format and transmission of Usenet articles are very similar to those of Internet email messages. However, Usenet articles are posted for general consumption: any Usenet user has access to all newsgroups, whereas email requires a list of known recipients.

Today, Usenet has diminished in importance relative to mailing lists, web forums and weblogs. It differs from them in that it requires no personal registration with the group concerned, that information need not be stored on a remote server, that archives are always available, and that reading the messages requires not a mail or web client but a news client (included in many modern e-mail clients).

ISPs, news servers, and newsfeeds

Many Internet service providers, and many other Internet sites, operate news servers for their users to access. ISPs that do not operate their own servers directly will often offer their users an account from another provider that specifically operates newsfeeds. Most commonly, these accounts are through Supernews, Giganews and Usenet.com. Usually the ISP will get a kickback for referring the customer to the Usenet provider. In early news implementations, the server and newsreader were a single program suite, running on the same system. Today, one uses separate newsreader client software, a program that resembles an email client but accesses Usenet servers instead.

Not all ISPs run news servers. A news server is one of the most difficult Internet services to administer well because of the large amount of data involved, small customer base (compared to mainstream Internet services such as email and web access), and a disproportionately high volume of customer support incidents (frequently complaining of missing news articles that are not the ISP's fault). Some ISPs outsource news operation to specialist sites, which will usually appear to a user as though the ISP ran the server itself. Many sites carry a restricted newsfeed, with a limited number of newsgroups. Commonly omitted from such a newsfeed are foreign-language newsgroups and the alt.binaries hierarchy which largely carries software, music, videos and images, and accounts for over 99 percent of article data.

For those who have access to the Internet, but do not have access to a news server, Google Groups allows reading and posting of text newsgroups via the World Wide Web. Though this and other "news-to-Web gateways" are not always as easy to use as specialized newsreader software (especially when threads get long), they are often much easier to search. Users who lack access to an ISP news server can use Google Groups to access the alt.free.newsservers newsgroup, which has information about open news servers.

There are also Usenet providers that specialize in offering service to users whose ISPs do not carry news, or carry only a restricted feed. Lists of such providers include UsenetProviders' list of Usenet providers (Germany) and Jeremy Nixon's list of (paid) Usenet providers.

See also news server operation for an overview of how news systems are implemented.

Newsreader clients

Newsgroups are typically accessed with special client software that connects to a news server. With the rise of the World Wide Web, web front-ends have sometimes been used to access newsgroups via the aforementioned news-to-Web gateways. However, these gateways often provide limited features, and for that reason using a local client is still regarded as the best way to access newsgroups.

Newsreader clients are available for all major operating systems and come in all shapes and sizes. Mail clients or "communication suites" also now commonly have an integrated newsreader. Often, however, these integrated clients are of low quality, e.g. incorrectly implementing Usenet protocols, standards and conventions. Many of these integrated clients, for example the one in Microsoft's Outlook Express, are disliked by purists because of their misbehavior.

Technical details

Usenet is a set of protocols for generating, storing and retrieving news "articles" (which resemble Internet mail messages) and for exchanging them among a readership which is potentially widely distributed. These protocols most commonly use a flooding algorithm which propagates copies throughout a network of participating servers. Whenever a message reaches a server, that server forwards the message to all its network neighbors that haven't yet seen the article. Only one copy of a message is stored per server, and each server makes it available on demand to the (typically local) readers able to access that server. Usenet was thus one of the first peer-to-peer applications, although in this case the "peers" are themselves servers that the users then access, rather than the users themselves being peers on the network.
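
The flooding behaviour described above can be simulated in a few lines. This is a simplified sketch rather than how any real news server is implemented: the `peers` mapping and server names are invented for the example, and the hand-shaking by which real servers offer and accept articles is reduced to a simple "already stored?" check.

```python
from collections import deque


def flood(peers, origin):
    """Simulate flooding one article from `origin` across the network.

    `peers` maps each server name to the list of its neighbors.
    Returns (set of servers holding the article, number of transfers made).
    """
    stored = {origin}        # each server keeps exactly one copy
    queue = deque([origin])
    transfers = 0
    while queue:
        server = queue.popleft()
        for neighbor in peers[server]:
            if neighbor in stored:   # neighbor has already seen the article
                continue
            stored.add(neighbor)
            transfers += 1
            queue.append(neighbor)
    return stored, transfers


network = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
    "e": [],                 # an isolated server never receives the article
}
print(flood(network, "a"))
```

Because a server is skipped once it holds the article, every reachable server receives it exactly once, however many neighbors could have supplied it.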

RFC 850 was the first formal specification of the messages exchanged by Usenet servers. It was superseded by RFC 1036.
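
Because articles resemble Internet mail messages, a general-purpose mail parser can read their headers. The sketch below uses Python's standard `email` module on a made-up article; the addresses, message ids and host names are placeholders, and only a handful of the header fields defined by RFC 1036 are shown.

```python
from email import message_from_string

# A made-up article for illustration; all names and ids are placeholders.
SAMPLE_ARTICLE = """\
From: alice@example.com (Alice)
Newsgroups: sci.math,sci.physics
Subject: Re: A question about primes
Message-ID: <reply1@example.com>
References: <root1@example.com>
Date: Thu, 01 Jul 2010 12:00:00 GMT
Path: news.example.com!not-for-mail

Body text goes here.
"""


def parse_article(raw):
    """Extract a few header fields and the body from a raw article."""
    msg = message_from_string(raw)
    return {
        "newsgroups": [g.strip() for g in msg["Newsgroups"].split(",")],
        "message_id": msg["Message-ID"],
        # References is optional; it lists the articles this one follows up.
        "references": (msg["References"] or "").split(),
        "body": msg.get_payload(),
    }


print(parse_article(SAMPLE_ARTICLE))
```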

One difference between Usenet and newer peer-to-peer applications is that one can request the automated removal of a posting from the whole network by creating a cancel message, although due to a lack of authentication and resultant abuse, this capability is frequently disabled. Copyright holders may still request the manual deletion of infringing material using the provisions of World Intellectual Property Organization treaty implementations, such as the U.S. Online Copyright Infringement Liability Limitation Act.

On the Internet, Usenet traffic is typically carried by the Network News Transfer Protocol (NNTP) on TCP port 119.
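
A minimal sketch of talking to a news server on that port follows. The protocol spoken there is NNTP, in which the server opens with a one-line greeting that starts with a three-digit status code. The function names are invented for this example, and any host name passed in is the reader's own choice; no particular server is assumed to exist.

```python
import socket

NNTP_PORT = 119  # the TCP port named in the text


def parse_status_line(line):
    """Split a server reply such as '200 server ready' into (code, text)."""
    code, _, text = line.strip().partition(" ")
    return int(code), text


def read_greeting(host, port=NNTP_PORT, timeout=10.0):
    """Connect, read the server's one-line greeting, then send QUIT."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        reader = conn.makefile("rb")
        greeting = reader.readline().decode("utf-8", errors="replace")
        conn.sendall(b"QUIT\r\n")
        reader.close()
    return parse_status_line(greeting)


print(parse_status_line("200 news.example.com ready\r\n"))
```

Calling `read_greeting("news.example.com")` with a real server name in place of the placeholder would return its greeting; a code of 200 conventionally means posting is allowed and 201 means read-only access.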

Organization

The major set of worldwide newsgroups is contained within nine hierarchies, eight of which are operated under consensual guidelines that govern their administration and naming. The current "Big Eight" are:

  • comp.*: Computer-related discussions (comp.software, comp.sys.amiga)

  • humanities.*: Fine arts, literature, and philosophy (humanities.classics, humanities.design.misc)

  • misc.*: Miscellaneous topics (misc.education, misc.forsale, misc.kids)

  • news.*: Discussions and announcements about news (meaning Usenet, not current events) (news.groups, news.admin)

  • rec.*: Recreation and entertainment (rec.music, rec.arts.movies)

  • sci.*: Science-related discussions (sci.psychology, sci.research)

  • soc.*: Social discussions (soc.college.org, soc.culture.african)

  • talk.*: Talk about various controversial topics (talk.religion, talk.politics, talk.origins)

The alt.* hierarchy is not subject to the procedures controlling groups in the Big Eight, and it is as a result less organized. However, groups in the alt.* hierarchy tend to be more specialized or specific—for example, there might be a newsgroup under the Big Eight which contains discussions about children's books, but a group in the alt hierarchy may be dedicated to one specific author of children's books. Binaries are posted in alt.binaries.*, making it the largest of all the hierarchies.

Many other hierarchies of newsgroups are distributed alongside these. Regional and language-specific hierarchies such as japan.*, malta.* and ne.* serve specific regions such as Japan, Malta and New England. Companies such as Microsoft administer their own hierarchies to discuss their products and offer community technical support. Some users prefer to use the term "Usenet" to refer only to the Big Eight hierarchies; others include alt as well. The more general term "netnews" incorporates the entire medium, including private organizational news systems.

Binary content

Usenet was originally created to distribute text content encoded in the 7-bit ASCII character set. With the help of programs that encode 8-bit values into ASCII, it became practical to distribute binary content. Binary posts, due to their size and dubious copyright status, were in time restricted to specific newsgroups, making it easier for administrators to allow or disallow the traffic.

The oldest widely used encoding method is uuencode, from the Unix uucp package. In the late 1980s Usenet articles were often limited to 60,000 characters, and larger hard limits exist today. Files are therefore commonly split into sections that require reassembly by the reader.
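
The encode, split and reassemble cycle can be sketched with Python's `binascii` module, which provides the line-level uuencode routines. The helper names and the 300-character limit in the demonstration are arbitrary choices for this example, and the `begin`/`end` framing lines that a real uuencoded post carries are left out.

```python
import binascii


def uuencode_lines(data):
    """Encode bytes as uuencoded text lines (at most 45 input bytes per line)."""
    return [binascii.b2a_uu(data[i:i + 45]).decode("ascii")
            for i in range(0, len(data), 45)]


def split_into_parts(lines, max_chars):
    """Pack whole lines into parts of at most `max_chars` characters each."""
    parts, current, size = [], [], 0
    for line in lines:
        if current and size + len(line) > max_chars:
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        parts.append("".join(current))
    return parts


def reassemble(parts):
    """Join the parts in order and decode them back into the original bytes."""
    text = "".join(parts)
    return b"".join(binascii.a2b_uu(line.encode("ascii"))
                    for line in text.splitlines(keepends=True))


payload = bytes(range(256)) * 4
parts = split_into_parts(uuencode_lines(payload), max_chars=300)
print(len(parts), reassemble(parts) == payload)
```

Splitting on line boundaries keeps every part independently decodable, which is why the reader only has to put the parts back in order.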

The MIME header extensions and the Base64 and Quoted-Printable encodings introduced a new generation of binary transport. In practice, MIME has seen increased adoption in text messages, but it is avoided for most binary attachments. Some operating systems with metadata attached to files use specialized encoding formats. For Mac OS, both BinHex and special MIME types are used.
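
To make the two MIME encodings concrete, the sketch below runs the same bytes through Python's standard `base64` and `quopri` modules. The sample data is arbitrary; the point is only that both encodings produce 7-bit-safe output and that Base64 grows the data by about a third.

```python
import base64
import quopri


def encode_both(data):
    """Return the Base64 and Quoted-Printable encodings of `data`."""
    return {
        "base64": base64.b64encode(data),
        "quoted_printable": quopri.encodestring(data),
    }


def is_seven_bit(encoded):
    """True if every byte fits in 7-bit ASCII."""
    return all(byte < 128 for byte in encoded)


text = "café".encode("latin-1")      # one 8-bit byte among plain ASCII
binary = bytes(range(256))           # arbitrary binary data

print(encode_both(text))
print(len(binary), len(encode_both(binary)["base64"]))
```

Quoted-Printable leaves ordinary ASCII readable and escapes only the odd byte, which suits mostly-text messages; Base64 treats everything uniformly, which suits binary data.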

The standard method of uploading binary content to Usenet is to first archive the files into RAR archives (for large files usually in 20 MB or 50 MB parts), then create Parchive files. Parity files are used to recreate missing data. This is often needed, as not every part of the files reaches every server. These are all then encoded into yEnc and uploaded to the selected binary groups.
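
The idea behind parity files can be shown with a deliberately simplified stand-in. Parchive itself uses Reed-Solomon codes and can repair several missing parts; the sketch below uses plain XOR parity instead, which can rebuild only one missing part, and its function names are invented for this example.

```python
def xor_parity(parts):
    """Return one parity block: the byte-wise XOR of equally sized parts."""
    parity = bytearray(len(parts[0]))
    for part in parts:
        for i, byte in enumerate(part):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(parts, parity):
    """Rebuild the single missing part (marked None) from the rest plus parity."""
    missing = [i for i, part in enumerate(parts) if part is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing part")
    present = [part for part in parts if part is not None]
    # XOR-ing the parity with every surviving part leaves the missing one.
    rebuilt = xor_parity(present + [parity])
    return parts[:missing[0]] + [rebuilt] + parts[missing[0] + 1:]


original = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(original)
damaged = [original[0], None, original[2]]   # the second part never arrived
print(recover_missing(damaged, parity) == original)
```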

(source: Wikipedia) http://en.wikipedia.org/wiki/Usenet