"Patton, Paul B (MN10)" <Paul.B.Patton@HBC.honeywell.com> writes: >Re Paul Schinder's reply to Doug Roberts: > >>}Secondly, you may notice that this print routine is missing the >>}"Content-type: text/html\n\n" line. It seems that if the file open >> >>Alarm bells just went off. HTTP requires a certain line end for >>Content-type:. I don't remember exactly what it is, but it's either >>\012\012 or \015\012\015\012. Note that you cannot *portably* write >>these any other way in Perl, no combination of \r and \n will be >>correct on all platforms. > >The book "HTML & CGI Unleashed" emphasizes that all CGI scripts >should ensure that the two bytes following the last printable byte of >a "Content-type..." line are \012\012 to ensure compatibility. I would use a stronger word than compatibility: Internet standards* specify that the byte stream appearing on the network MUST be ASCII CR LF. As far as I'm aware that applies generally for all text based protocols such as telnet, ftp, smtp, pop3, imap, etc. As I'm sure Paul is only too aware from unnecessary effort in porting CPAN modules, many of those modules make no distinction between the host and network character sets. Unfortunately this sloppy coding practice escapes attention because most perl people are working on systems which map the C-implementation-dependent \r\n escape sequences to ASCII \015\012. As Paul mentioned the only way to portably write these characters in C or Perl is with octal/hex encoding which is what all perl code dealing with network-ASCII should use. I guess it goes back to teletypes and computers not having a standard way to mark end-of-line. I had a brief look at converting one CPAN module I needed, and got lost which uses of '\n' were for strings going to the network vs strings going to host files/display. I got it sort of working without too much trouble, but I only used a few parts of the code and certainly wasn't confident with my fixes. 
Other people are probably much better than me at intuiting what code
does by inspection, but I still think fixing up '\n's is tricky.

IMPORTANT QUESTION

Is there some way we can encourage the CPAN maintainers to prefer
\015\012 over \r\n in network modules? As mentioned above, it's not
just a matter of changing all \n's to \012, and it's better if the
author/maintainer does this.

Do any other perl implementations use unusual encodings for \r\n?

This issue is a bit like the way a lot of network code developed on
Sun/etc. systems initially misses an occasional htons()/htonl() call,
simply because on those machines host byte order == network byte
order. It would be a nice addition to lint tools to keep track of
which byte order a variable holds.

Danny Thomas

* HTTP/1.1 (RFC 2068) is flexible wrt line breaks in textual content,
  but not in the protocol messages:

    This flexibility regarding line breaks applies only to text media
    in the entity-body; a bare CR or LF MUST NOT be substituted for
    CRLF within any of the HTTP control structures (such as header
    fields and multipart boundaries).
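
The rule quoted above can be checked mechanically. The following is
only a sketch: only_crlf is a hypothetical helper written for this
example, not part of any existing module. It reports whether a string
contains a bare CR or a bare LF, again using octal escapes so the check
itself does not depend on the platform's meaning of \r and \n.

```perl
use strict;
use warnings;

# Hypothetical helper: true if every CR and every LF in $text belongs
# to a CR LF pair, i.e. there is no bare CR and no bare LF.
sub only_crlf {
    my ($text) = @_;
    # Strip all well-formed CR LF pairs; any CR or LF that is left
    # over must have been a bare one.
    (my $rest = $text) =~ s/\015\012//g;
    return $rest !~ /[\015\012]/;
}

my $good = "Content-type: text/html\015\012\015\012";
my $bad  = "Content-type: text/html\012\012";

print only_crlf($good) ? "ok\n" : "bare CR or LF\n";
print only_crlf($bad)  ? "ok\n" : "bare CR or LF\n";
```

Here the first header block passes and the second, which uses
\012\012, is flagged.
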