r1 - 23 Mar 2004 - 07:35:29 - BrianPark

AF8 File Not Found Errors

While doing a Google search, I came across the FAQ page (http://www.boutell.com/wusage/faq.html#wusage.af8.error) which refers to an "AF8 404 Not Found" error.

After many months of error-free operation, I too started encountering these "AF8" document-not-found errors. I was confused by them because none of my documents contain the sequence "AF8". After some research and experimentation, I believe I have found the answer. As background, I run an Apache 1.3.20 server on SuSE Linux 7.3 (2.4.10 kernel), and my browser is Internet Explorer 5.5 on Windows 2000 Professional.

Some of my HTML documents do NOT contain a character encoding meta-tag, such as

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

since they were auto-generated by a program. These documents simply contained the bare <html></html> tags:

<html>
<head>[...]</head>
<body>[...]</body>
</html>

Unfortunately, if the http-equiv meta-tag does NOT exist, then IE 5.5 will attempt to deduce the document encoding using its "Auto-Select" feature (appearing in the drop-down menu under "View/Encodings").

On a subset of these "bare" HTML documents, IE will decide that the document is in UTF-7 encoding (see http://www.landfield.com/rfcs/rfc2152.html for the definition of UTF-7). Figuring out why this happened on only a small subset of the "bare" HTML documents required some detective work, but the common pattern appears to be that these documents contain a sequence of characters of the form:

    +[...]-
The "+" symbol is an escape character in the UTF-7 encoding scheme: it marks the start of a run of base64-encoded Unicode characters. The "-" symbol marks the end of that run. The presence of these characters confuses the browser into thinking that the document is encoded in UTF-7.
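
As a rough way to spot documents at risk, the sketch below searches text for runs that look like "+[...]-". I do not know the exact rule IE's Auto-Select applies, so the pattern is an assumption: it treats a run as suspicious when everything between the "+" and the "-" comes from the base64 alphabet that UTF-7 uses. The name find_utf7_like_runs is made up for this example.

```python
import re

# Assumption: a "+...-" run is suspicious when every character between
# "+" and "-" belongs to the base64 alphabet used by UTF-7.
# This is NOT IE's actual detection rule, which is not documented here.
UTF7_LIKE = re.compile(r"\+[A-Za-z0-9+/]+-")

def find_utf7_like_runs(text):
    """Return every substring that looks like a UTF-7 shifted sequence."""
    return UTF7_LIKE.findall(text)

sample = "<body>offset +AF8- and a+b-c but not 1 + 2 - 3</body>"
print(find_utf7_like_runs(sample))  # ['+AF8-', '+b-']
```

Ordinary text such as "a+b-c" also matches, which fits the observation that the trigger can appear by accident in an auto-generated document.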

For document rendering purposes, the problem does not make itself known, because the spurious "+[...]-" sequence usually doesn't translate into a valid UTF-7 character, so the browser displays these characters literally, as is.

The problem appears in hyperlinks that originate from this document. IE 5.5 assumes that the document is in UTF-7, so it translates all hyperlinks in the document to UTF-7 format before sending them to the HTTP server. Some ASCII characters in the <a> tag are translated into their UTF-7 escaped form. For example, if the document contains the HTML fragment

    <a href="http://server/some_link_with_underscore">Some Link With Underscore</a>
then IE 5.5 converts the HREF into the UTF-7 sequence:
    http://server/some+AF8-link+AF8-with+AF8-underscore

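because "+AF8-" is the UTF-7 escaped form of the underscore character (U+005F). The server has no document by that name, so it returns a 404 Not Found error for a URL containing "AF8".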
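
To check where "AF8" comes from, here is a small sketch that builds the escaped form by hand, following my reading of RFC 2152: take the character's UTF-16 big-endian bytes, base64-encode them, drop the "=" padding, and wrap the result in "+" and "-". The function name utf7_shift_sequence is made up for this example. The second half uses Python's built-in "utf-7" codec to turn the mangled URL back into the original.

```python
import base64

def utf7_shift_sequence(ch):
    """Encode one character as a UTF-7 shifted sequence, by hand."""
    utf16 = ch.encode("utf-16-be")                  # "_" -> b"\x00\x5f"
    b64 = base64.b64encode(utf16).decode("ascii")   # b"\x00\x5f" -> "AF8="
    return "+" + b64.rstrip("=") + "-"              # UTF-7 drops the padding

print(utf7_shift_sequence("_"))   # +AF8-

# Decoding the mangled URL as UTF-7 recovers the link from the document.
mangled = b"http://server/some+AF8-link+AF8-with+AF8-underscore"
print(mangled.decode("utf-7"))    # http://server/some_link_with_underscore
```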

The solution is to include the character set meta-tag in every document. For most English documents, this should be

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
or
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
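
Since my problem documents were generated by a program, a quick check of the generated output can catch the ones that lack the tag. The sketch below is only a loose textual test, not an HTML parser: it reports whether any <meta> tag in the text mentions "charset". The name declares_charset and the sample strings are made up for this example.

```python
import re

# Loose textual check: any <meta ...> tag that mentions "charset" counts.
# It does not validate that the declaration is well-formed.
META_CHARSET = re.compile(r"<meta\b[^>]*charset", re.IGNORECASE)

def declares_charset(html):
    """Return True if the document text contains a charset meta-tag."""
    return META_CHARSET.search(html) is not None

bare = "<html><head><title>t</title></head><body>a_b</body></html>"
tagged = ('<html><head><meta http-equiv="Content-Type" '
          'content="text/html; charset=iso-8859-1"></head>'
          '<body>a_b</body></html>')

print(declares_charset(bare), declares_charset(tagged))  # False True
```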

"iso-8859-1" is the "Latin-1" character set, also known as "Western European". "UTF-8" is another encoding of the Unicode character set, and a sibling of UTF-7 (see http://www.w3.org/International/O-charset.html for more details, and http://www.unicode.org/ for a description of Unicode). Both encoding schemes are backwards compatible with ASCII: every ASCII character is encoded the same way in iso-8859-1 and in UTF-8.


-- BrianPark - 23 Mar 2004
