--boAH8PqvUi1v1f55
Content-Type: multipart/mixed; boundary="RwGu8mu1E+uYXPWP"
Content-Disposition: inline


--RwGu8mu1E+uYXPWP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

* James Britt (ruby / jamesbritt.com) wrote:
[snipped]
> If I understand the XML prolog production rule correctly, the (optional) 
> XML decl must come before the doctype decl

That leaves us with one of three options, none of which I'm particularly
happy with:

1.  Leave it as is.  If it's not valid XML, then this is no good.
2.  Swap the order of the XML declaration and DOCTYPE declaration. 
    This works, but it throws IE into quirks mode, which kind of 
    defeats the whole point of using XHTML in the first place.
3.  Remove the XML declaration altogether.  Makes the XML valid and
    keeps IE in standards mode, but since it's being served as
    'text/html', that means the user agent can't make any assumptions
    about the character encoding (and, since it's XML, the parser won't
    understand the meta http-equiv hackery in use today).  
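
To make the constraint behind these options concrete, here is a minimal
sketch of the ordering rule.  The helper name xml_decl_placement_ok? is
hypothetical (it is not part of cgi.rb), and it only checks the one rule
from the prolog production that matters here: if an XML declaration is
present, it has to be the very first thing in the document.

```ruby
# Hypothetical helper, not part of cgi.rb: true when the document has
# no XML declaration at all, or has one at offset 0.
def xml_decl_placement_ok?(doc)
  pos = doc.index('<?xml ')
  pos.nil? || pos.zero?
end

DOCTYPE = %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">|
XMLDECL = %|<?xml version='1.0' encoding='UTF-8'?>|

puts xml_decl_placement_ok?(DOCTYPE + XMLDECL)  # option 1: declaration after the DOCTYPE
puts xml_decl_placement_ok?(XMLDECL + DOCTYPE)  # option 2: declaration first
puts xml_decl_placement_ok?(DOCTYPE)            # option 3: no declaration
```

Only the first layout fails the check, which is the layout the previous
patch produced.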

Option 3 wouldn't be a problem if we could use "application/xhtml+xml",
since that implies a default character encoding of UTF-8.  Unfortunately,
IE6 doesn't support that content type (and, according to what I've read,
IE7 won't either).
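
One way around the IE limitation would be to pick the content type per
request.  The sketch below is an assumption on my part, not something
the patch does: xhtml_content_type is a hypothetical helper, and it is
deliberately naive (it ignores q-values and wildcards in the Accept
header).

```ruby
# Hypothetical helper: choose a Content-Type from the request's Accept
# header, falling back to text/html for agents that don't list XHTML.
def xhtml_content_type(accept)
  if accept.to_s.include?('application/xhtml+xml')
    'application/xhtml+xml'
  else
    'text/html'
  end
end

puts xhtml_content_type('text/html,application/xhtml+xml,*/*;q=0.8')
puts xhtml_content_type('image/gif, image/jpeg, */*')
puts xhtml_content_type(nil)
```

In a CGI script the argument would presumably come from
ENV['HTTP_ACCEPT'], and the result could be handed to CGI#header, which
takes the content type as its string argument in 1.8.4.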

I think the least painful option is a combination of 2 and 3.  Move the
XML declaration ahead of the DOCTYPE, but set it to '' by default.
That'll keep IE in standards mode, still be valid XML, and give people
the option of enabling an XML declaration by passing one in.
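
As a rough illustration of that behavior, here is a standalone sketch of
the prolog logic.  xhtml_prolog and XHTML10_STRICT_DOCTYPE are
hypothetical names used only for this example; in the patch the
equivalent logic lives inside the html method and reads the "XMLDECL"
pseudo-attribute.

```ruby
# Hypothetical stand-in for the prolog handling in the patched html method.
XHTML10_STRICT_DOCTYPE = %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">|

def xhtml_prolog(attributes = {})
  buf = ''
  # emit an XML declaration only when the caller passes one in, and
  # remove it so it is not rendered as an attribute later on
  buf += attributes.delete('XMLDECL').to_s if attributes.key?('XMLDECL')
  buf + XHTML10_STRICT_DOCTYPE
end

# default: no XML declaration, DOCTYPE comes first
puts xhtml_prolog

# opt-in: the declaration precedes the DOCTYPE, as the XML prolog requires
puts xhtml_prolog({ 'XMLDECL' => %|<?xml version='1.0' encoding='UTF-8'?>| })
```

Either way the declaration never ends up after the DOCTYPE, so the
output stays valid XML.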

I've attached two patches: the first applies on top of my previous
patch, and the second combines both (for lazy people, like me).  Both
patches are also available online at the following URLs:

  * The follow-up patch (make the XML declaration optional)
    http://diff.pablotron.org/ruby-1.8.4-xhtml_cgi-fix_xmldecl.diff
  * The Big Enchilada Patch (aka the combined patch)
    http://diff.pablotron.org/ruby-1.8.4-xhtml_cgi-2.diff

Both xmllint and REXML are happy with the output produced by the
combined patch.

References:

  * "The <?xml> prolog, strict mode, and XHTML in IE"
    http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx
  * "Activating the Right Layout Mode Using the Doctype Declaration"
    http://hsivonen.iki.fi/doctype/
  * "Quirks mode and strict mode"
    http://www.quirksmode.org/css/quirksmode.html

-- 
Paul Duncan <pabs / pablotron.org>        OpenPGP Key ID: 0x82C29562
http://www.pablotron.org/               http://www.paulduncan.org/

--RwGu8mu1E+uYXPWP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="ruby-1.8.4-xhtml_cgi-fix_xmldecl.diff"
Content-Transfer-Encoding: quoted-printable

diff -ur ruby-1.8.4-xhtml_cgi/lib/cgi.rb ruby-1.8.4-xhtml_cgi-2/lib/cgi.rb
--- ruby-1.8.4-xhtml_cgi/lib/cgi.rb	2006-01-21 12:57:43.000000000 -0500
+++ ruby-1.8.4-xhtml_cgi-2/lib/cgi.rb	2006-01-21 12:50:41.000000000 -0500
@@ -1664,13 +1664,14 @@
     # "DOCTYPE", if given, is used as the leading DOCTYPE SGML tag; it
     # should include the entire text of this tag, including angle brackets.
     #
-    # For XHTML 1.0 output, two addition pseudo-attributes, "XMLDECL"
-    # and "XML_ENCODING", are available.  "XMLDECL" is, as the name
-    # implies, the full XML declaration for the output document, and, 
-    # like the "DOCTYPE" pseudo-element, should include the entire text
-    # of the tag, including angle brackets. "XML_ENCODING" is the top-
-    # level XML encoding for the document, and defaults to "UTF-8" if
-    # unspecified.
+    # One additional pseudo-attribute, "XMLDECL", is available for XHTML
+    # 1.0 output.  "XMLDECL" is, as the name implies, the full XML
+    # declaration for the output document, and, like the "DOCTYPE"
+    # pseudo-element, should include the entire text of the tag,
+    # including the angle brackets.  The XML declaration is omitted by
+    # default: the XML specification requires it to come first in the
+    # document, but IE6, Opera 7, and Konqueror 3.2 fall into quirks
+    # mode unless the DOCTYPE declaration comes first.
     #
     # The body of the html element is supplied as a block.
     # 
@@ -1715,6 +1716,16 @@
       pretty = "  " if true == pretty
       buf = ""
 
+      # If the pseudo-attribute XMLDECL exists and is a string, then
+      # print it out in full as the XML declaration.  
+      #
+      # Note: IE6, Opera 7, and Konqueror 3.2 all get thrown into quirks
+      # mode if the first element in the document isn't a DOCTYPE
+      # declaration, so we deliberately omit the XML declaration by
+      # default here. 
+      buf += attributes.delete('XMLDECL').to_s if attributes.key?('XMLDECL')
+      
+      # add the document's DOCTYPE
       if attributes.has_key?("DOCTYPE")
         if attributes["DOCTYPE"]
           buf += attributes.delete("DOCTYPE")
@@ -1725,33 +1736,6 @@
         buf += doctype
       end
 
-      # if the method xmldecl exists, then print out an XML
-      # declaration.  IE doesn't render in strict mode if the first
-      # element in the document isn't a DOCTYPE declaration, so we need
-      # to put the XML declaration after the DOCTYPE declaration, even
-      # though it really makes more sense the other way around.
-      if respond_to?(:xmldecl)
-        buf += if attributes.key?('XMLDECL')
-          # if the pseudo-attribute XMLDECL is specified, then delete it
-          # from the attribute list and use that instead of the
-          # pre-defined XML declaration
-          attributes.delete('XMLDECL')
-        else
-          # if the pseudo-attribute XML_ENCODING is specified, then
-          # delete it from the attribute list and use it instead of
-          # UTF-8
-          encoding = if attributes.key?('XMLENCODING') 
-            attributes.delete('XML_ENCODING')
-          else
-            'UTF-8'
-          end
-
-          # render the XML declaration with the specified encoding
-          xmldecl(encoding)
-        end
-      end
-      
-
       # add the xml namespace unless the xmlns method isn't defined
       # _and_ we don't have the xmlns attribute set
       unless attributes.key?('xmlns') || !respond_to?(:xmlns)
@@ -2354,10 +2338,6 @@
       %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">|
     end
 
-    def xmldecl(enc = 'UTF-8')
-      %|<?xml version='1.0' encoding='#{enc}'?>|
-    end
-
     def xmlns
       'http://www.w3.org/1999/xhtml'
     end
@@ -2405,10 +2385,6 @@
       %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">|
     end
 
-    def xmldecl(enc = 'UTF-8')
-      %|<?xml version='1.0' encoding='#{enc}'?>|
-    end
-
     def xmlns
       'http://www.w3.org/1999/xhtml'
     end
@@ -2457,11 +2433,6 @@
       %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">|
     end
 
-    # the XML declaration for this XHTML document
-    def xmldecl(enc = 'UTF-8')
-      %|<?xml version='1.0' encoding='#{enc}'?>|
-    end
-
     # the XML namespace attribute for this XHTML document
     def xmlns
       'http://www.w3.org/1999/xhtml'

--RwGu8mu1E+uYXPWP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="ruby-1.8.4-xhtml_cgi-2.diff"
Content-Transfer-Encoding: quoted-printable

diff -ur ruby-1.8.4/lib/cgi.rb ruby-1.8.4-xhtml_cgi/lib/cgi.rb
--- ruby-1.8.4/lib/cgi.rb	2005-10-06 21:01:22.000000000 -0400
+++ ruby-1.8.4-xhtml_cgi/lib/cgi.rb	2006-01-21 12:50:41.000000000 -0500
@@ -266,10 +266,13 @@
 #   end
 # 
 #   # add HTML generation methods
-#   CGI.new("html3")    # html3.2
-#   CGI.new("html4")    # html4.01 (Strict)
-#   CGI.new("html4Tr")  # html4.01 Transitional
-#   CGI.new("html4Fr")  # html4.01 Frameset
+#   CGI.new("html3")      # html3.2
+#   CGI.new("html4")      # html4.01 (Strict)
+#   CGI.new("html4Tr")    # html4.01 Transitional
+#   CGI.new("html4Fr")    # html4.01 Frameset
+#   CGI.new("xhtml10")    # XHTML 1.0 (Strict)
+#   CGI.new("xhtml10Tr")  # XHTML 1.0 Transitional
+#   CGI.new("xhtml10Fr")  # XHTML 1.0 Frameset
 #
 class CGI
 
@@ -536,6 +539,24 @@
   # 
   # This method does not perform charset conversion. 
   #
+  # Content-Type and XHTML 1.0 Output:
+  #
+  # In accordance with the oft-heralded Principle of Least Surprise and
+  # both backwards and browser compatibility, the Content-Type for
+  # generated XHTML defaults to 'text/html'.  However, if you're
+  # generating XHTML 1.0 content (i.e., you've created a CGI with the
+  # xhtml10, xhtml10Tr, or xhtml10Fr HTML output types), for user agents
+  # which are XHTML-aware, you might consider using a more
+  # XHTML-friendly content type, such as 'application/xhtml+xml',
+  # 'application/xml', or 'text/xml'.  In particular, the use of
+  # 'application/xhtml+xml' changes the default character encoding and
+  # the rendering and validation behavior that user agents apply to your
+  # document.  The nuances of these content types and the effect they
+  # have on conforming user agents are covered in gory detail in the W3C
+  # note "XHTML Media Types" at the following URL:
+  #
+  #   http://www.w3.org/TR/xhtml-media-types/
+  #
   def header(options = "text/html")
 
     buf = ""
@@ -1224,35 +1245,83 @@
   # Provides methods for code generation for tags following
   # the various DTD element types.
   module TagMaker # :nodoc:
+    TagStyle = Struct.new(
+      # upper-case element and attribute names on output 
+      # (true for HTML, false for XHTML)
+      :upcase,  
+
+      # allow bare (minimized) attributes (attributes without values)
+      # (true for HTML, false for XHTML)
+      :bare_attrs, 
+
+      # always close elements, even if empty
+      # (false for HTML, true for XHTML)
+      :always_close, 
+
+      # add implicit IDs to elements with name attributes but not IDs
+      # (false for HTML, true for XHTML10)
+      :implicit_ids
+    )
+
+    # HTML3/4 tag style
+    # this is declared here because it's the default tag style if
+    # unspecified in the methods below (to preserve
+    # backwards-compatibility for other extensions depending on these
+    # methods)
+    HTML_TAG_STYLE = TagMaker::TagStyle.new(true, true, false, false)
+    XHTML10_TAG_STYLE = TagMaker::TagStyle.new(false, false, true, true)
+
 
     # Generate code for an element with required start and end tags.
     #
     #   - -
-    def nn_element_def(element)
-      nOE_element_def(element, <<-END)
+    def nn_element_def(element, style = HTML_TAG_STYLE)
+      elem_name = style.upcase ? element.upcase : element
+      nOE_element_def(element, <<-END, style)
           if block_given?
             yield.to_s
           else
             ""
           end +
-          "</#{element.upcase}>"
+          "</#{elem_name}>"
       END
     end
 
     # Generate code for an empty element.
     #
     #   - O EMPTY
-    def nOE_element_def(element, append = nil)
+    def nOE_element_def(element, append = nil, style = HTML_TAG_STYLE, close_elem = false)
+      elem_name = style.upcase ? element.upcase : element
+      elem_end = (close_elem && style.always_close) ? ' />' : '>'
+      attr_name = style.upcase ? 'name' : 'name.downcase'
+
       s = <<-END
-          "<#{element.upcase}" + attributes.collect{|name, value|
+          has_id = #{style.implicit_ids}
+          has_id &&= attributes.keys.map { |v| v.downcase}.include?('id')
+
+          "<#{elem_name}" + attributes.collect{|name, value|
             next unless value
-            " " + CGI::escapeHTML(name) +
+            " " + CGI::escapeHTML(#{attr_name}) +
             if true == value
-              ""
+              #{style.bare_attrs} ? "" : '="' + CGI::escapeHTML(#{attr_name}) + '"'
             else
-              '="' + CGI::escapeHTML(value) + '"'
+              val = '="' + CGI::escapeHTML(value) + '"'
+
+              # what we're doing here is cloning the name attribute to
+              # an ID attribute if the implicit_ids style flag is set
+              # and an ID attribute wasn't explicitly specified.  this
+              # is necessary to maintain backwards compatibility with
+              # the existing CGI modules and forwards compatibility with
+              # XHTML 1.0 and (eventually) XHTML 1.1.  This approach has
+              # the added bonus of being guaranteed to work in older
+              # user agents.
+              val += ' id' + val if #{style.implicit_ids} && 
+                                    !has_id && 'name' == name.downcase
+
+              # return attribute value assignment
+              val
             end
-          }.to_s + ">"
+          }.to_s + "#{elem_end}"
       END
       s.sub!(/\Z/, " +") << append if append
       s
@@ -1262,10 +1331,11 @@
     # start) tag is optional.
     #
     #   O O or - O
-    def nO_element_def(element)
-      nOE_element_def(element, <<-END)
+    def nO_element_def(element, style = HTML_TAG_STYLE)
+      elem_name = style.upcase ? element.upcase : element
+      nOE_element_def(element, <<-END, style)
           if block_given?
-            yield.to_s + "</#{element.upcase}>"
+            yield.to_s + "</#{elem_name}>"
           else
             ""
           end
@@ -1554,7 +1624,7 @@
       end
       if @output_hidden
         body += @output_hidden.collect{|k,v|
-          "<INPUT TYPE=\"HIDDEN\" NAME=\"#{k}\" VALUE=\"#{v}\">"
+          hidden(k, v)
         }.to_s
       end
       super(attributes){body}
@@ -1594,6 +1664,15 @@
     # "DOCTYPE", if given, is used as the leading DOCTYPE SGML tag; it
     # should include the entire text of this tag, including angle brackets.
     #
+    # One additional pseudo-attribute, "XMLDECL", is available for XHTML
+    # 1.0 output.  "XMLDECL" is, as the name implies, the full XML
+    # declaration for the output document, and, like the "DOCTYPE"
+    # pseudo-element, should include the entire text of the tag,
+    # including the angle brackets.  The XML declaration is omitted by
+    # default: the XML specification requires it to come first in the
+    # document, but IE6, Opera 7, and Konqueror 3.2 fall into quirks
+    # mode unless the DOCTYPE declaration comes first.
+    #
     # The body of the html element is supplied as a block.
     # 
     #   html{ "string" }
@@ -1637,6 +1716,16 @@
       pretty = "  " if true == pretty
       buf = ""
 
+      # If the pseudo-attribute XMLDECL exists and is a string, then
+      # print it out in full as the XML declaration.  
+      #
+      # Note: IE6, Opera 7, and Konqueror 3.2 all get thrown into quirks
+      # mode if the first element in the document isn't a DOCTYPE
+      # declaration, so we deliberately omit the XML declaration by
+      # default here. 
+      buf += attributes.delete('XMLDECL').to_s if attributes.key?('XMLDECL')
+      
+      # add the document's DOCTYPE
       if attributes.has_key?("DOCTYPE")
         if attributes["DOCTYPE"]
           buf += attributes.delete("DOCTYPE")
@@ -1647,6 +1736,12 @@
         buf += doctype
       end
 
+      # add the xml namespace unless the xmlns method isn't defined
+      # _and_ we don't have the xmlns attribute set
+      unless attributes.key?('xmlns') || !respond_to?(:xmlns)
+        attributes['xmlns'] = xmlns
+      end
+
       if block_given?
         buf += super(attributes){ yield }
       else
@@ -2055,7 +2150,6 @@
 
   # Mixin module for HTML version 3 generation methods.
   module Html3 # :nodoc:
-
     # The DOCTYPE declaration for this version of HTML
     def doctype
       %|<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">|
@@ -2236,6 +2330,141 @@
 
   end # Html4Fr
 
+  # Mixin module for generating XHTML version 1.0
+  module Xhtml10 # :nodoc:
+
+    # The DOCTYPE declaration for this version of HTML
+    def doctype
+      %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">|
+    end
+
+    def xmlns
+      'http://www.w3.org/1999/xhtml'
+    end
+
+    # Initialise the HTML generation methods for this version.
+    def element_init
+      style = TagMaker::XHTML10_TAG_STYLE
+
+      extend TagMaker
+      methods = ""
+      # - -
+      for element in %w[tt i b big small em strong dfn code samp kbd
+        var cite abbr acronym sub sup span bdo address div map object
+        h1 h2 h3 h4 h5 h6 pre q ins del dl ol ul label select optgroup
+        fieldset legend button table title style script noscript
+        textarea form a blockquote caption 
+        html body p dt dd li option thead tfoot tbody colgroup tr th td head ]
+        methods += <<-BEGIN + nn_element_def(element, style) + <<-END
+          def #{element}(attributes = {})
+        BEGIN
+          end
+        END
+      end
+
+      # - O EMPTY
+      for element in %w[img base br area link param hr input col meta ]
+        methods += <<-BEGIN + nOE_element_def(element, nil, style, true) + <<-END
+          def #{element}(attributes = {})
+        BEGIN
+          end
+        END
+      end
+
+      eval(methods)
+    end
+
+  end # Xhtml10
+
+
+  # Mixin module for generating XHTML version 1.0 transitional.
+  module Xhtml10Tr # :nodoc:
+
+    # The DOCTYPE declaration for this version of HTML
+    def doctype
+      %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">|
+    end
+
+    def xmlns
+      'http://www.w3.org/1999/xhtml'
+    end
+
+    # Initialise the HTML generation methods for this version.
+    def element_init
+      style = TagMaker::XHTML10_TAG_STYLE
+
+      extend TagMaker
+      methods = ""
+      # - -
+      for element in %w[ tt i b u s strike big small em strong dfn
+          code samp kbd var cite abbr acronym font sub sup span bdo
+          address div center map object applet h1 h2 h3 h4 h5 h6 pre q
+          ins del dl ol ul dir menu label select optgroup fieldset
+          legend button table iframe noframes title style script
+          noscript textarea form a blockquote caption html body p dt dd
+          li option thead tfoot tbody colgroup tr th td head]
+        methods += <<-BEGIN + nn_element_def(element, style) + <<-END
+          def #{element}(attributes = {})
+        BEGIN
+          end
+        END
+      end
+
+      # - O EMPTY
+      for element in %w[ img base basefont br area link param hr input
+          col isindex meta ]
+        methods += <<-BEGIN + nOE_element_def(element, nil, style, true) + <<-END
+          def #{element}(attributes = {})
+        BEGIN
+          end
+        END
+      end
+
+      eval(methods)
+    end
+
+  end # Xhtml10Tr
+  
+  # Mixin module for generating XHTML version 1.0 with framesets.
+  module Xhtml10Fr # :nodoc:
+
+    # The DOCTYPE declaration for this version of HTML
+    def doctype
+      %|<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">|
+    end
+
+    # the XML namespace attribute for this XHTML document
+    def xmlns
+      'http://www.w3.org/1999/xhtml'
+    end
+
+    # Initialise the HTML generation methods for this version.
+    def element_init
+      style = TagMaker::XHTML10_TAG_STYLE
+
+      methods = ""
+      # - -
+      for element in %w[ frameset ]
+        methods += <<-BEGIN + nn_element_def(element, style) + <<-END
+          def #{element}(attributes = {})
+        BEGIN
+          end
+        END
+      end
+
+      # - O EMPTY
+      for element in %w[ frame ]
+        methods += <<-BEGIN + nOE_element_def(element, nil, style, true) + <<-END
+          def #{element}(attributes = {})
+        BEGIN
+          end
+        END
+      end
+      eval(methods)
+    end
+
+  end # Xhtml10Fr
+
 
   # Creates a new CGI instance.
   #
@@ -2246,6 +2475,9 @@
   # html4:: HTML 4.0
   # html4Tr:: HTML 4.0 Transitional
   # html4Fr:: HTML 4.0 with Framesets
+  # xhtml10:: XHTML 1.0 (Strict)
+  # xhtml10Tr:: XHTML 1.0 Transitional
+  # xhtml10Fr:: XHTML 1.0 with Framesets
   #
   # If not specified, no HTML generation methods will be loaded.
   #
@@ -2291,6 +2523,20 @@
       extend Html4Fr
       element_init()
       extend HtmlExtension
+    when 'xhtml10'
+      extend Xhtml10
+      element_init()
+      extend HtmlExtension
+    when 'xhtml10Tr'
+      extend Xhtml10Tr
+      element_init()
+      extend HtmlExtension
+    when 'xhtml10Fr'
+      extend Xhtml10Tr
+      element_init()
+      extend Xhtml10Fr
+      element_init()
+      extend HtmlExtension
     end
   end
 

--RwGu8mu1E+uYXPWP--

--boAH8PqvUi1v1f55
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFD0oEGzdlT34LClWIRAsRXAKDdX/gdx4FX/30FYZ3rcH1IKk9/rwCeJPNa
NcfescT+eqIClXI+Kq90yYA
O9
-----END PGP SIGNATURE-----

--boAH8PqvUi1v1f55--