NAME
XPC - XML Procedure Call

SYNOPSIS
    use XPC;

and then

    my $xpc = XPC->new(<<END_XPC);
    ...
    END_XPC

or

    my $xpc = XPC->new();
    $xpc->add_call('localtime');

or

    my $xpc = XPC->new_call('localtime');

and then later

    print XML_FILE $xpc->as_string();

DESCRIPTION
This class represents an XPC request or response. It uses XML::Parser to parse XML passed to its constructor.

MOTIVATION
A Commentary on the XML-RPC Specification and Definition of XPC Version 0.2

Introduction
The following commentary is based upon the specification from the UserLand web site. The version referenced for this commentary has a notation on it that it was "Updated 10/16/99 DW" (see http://www.xmlrpc.com/spec).

These comments are stylistic in nature, and the author recognizes that style in program and protocol design is very personal. This commentary will, however, point out the rationale for each proposed change to the specification's design.

Procedure Call Structural Simplifications
The example in the "Request example" section looks like this:

    examples.getStateName 41

We note by looking at the remainder of the specification that there are only two top-level elements allowed in XML-RPC: "methodCall" and "methodResponse". Since methods are *the* subject of RPC, and since all top-level elements in the design are about methods, there is no need to have the redundant qualifier "method" in the names of these elements. Thus, the example would be modified to look like this:

    examples.getStateName 41

Now, the content of the "methodName" element is constrained to be very simple text (from the "Payload format" section, which says "... identifier characters, upper and lower-case A-Z, the numeric characters, 0-9, underscore, dot, colon and slash"). It is also mandatory. This is precisely the reason XML includes the ability to add attributes to elements (it is technically redundant, but very convenient).
So, we really should turn this example into:

    41

Once the "methodName" element has been removed from the design, the "params" element becomes superfluous, since its only purpose was to group the parameters and separate them from the method name. Now, the "call" element *is* the element that groups the parameters, leaving us with:

    41

Header Nomenclature
One final comment on terminology: RPC stands for Remote *Procedure* Call, so we should probably not use the term "method" when we mean "procedure" or something else. Since the "procedures" can return values, which corresponds in some languages to the term "function", we have a rivalry for the term to use. "Procedure" matches the acronym nicely, but for some folks "Function" would have a better connotation. Fans of Eiffel might even prefer "Feature", or "Query" for calls returning a value and "Routine" or "Command" for those not. Given the variety of possibilities, here we stay with the simple policy of matching the acronym:

    41

Scalar Values
Typically, an interface definition determines the number, names and types of parameters to a procedure call. It is incumbent upon the caller to conform to that specification. Therefore, the declaration for any procedure to be called as part of an interface *should* indicate the expected types of the parameters, which means that the caller should not have to indicate the type of value it is passing (and, the value *itself* isn't passed in general, but rather a *textual representation* of the value is passed).

XML-RPC should not be blind to typing issues. These issues should not appear in the calling standard, but rather in an interface definition standard (about which more later).
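As a rough illustration of the point above, the following sketch shows how a receiver might use a declared interface to interpret untyped parameter text. It is written in Python for brevity. The `INTERFACES` table and the `decode_args` helper are hypothetical names invented for this sketch; they are not part of XPC or XML-RPC, and a real interface definition would come from somewhere other than a hard-coded table.

```python
# Hypothetical interface table: procedure name -> one converter per parameter.
# This stands in for a real interface definition, which the text defers.
INTERFACES = {
    "examples.getStateName": [int],
}


def decode_args(procedure, texts):
    """Convert textual parameter values using the declared parameter types."""
    converters = INTERFACES[procedure]
    if len(texts) != len(converters):
        raise ValueError(
            f"{procedure} expects {len(converters)} parameter(s), "
            f"got {len(texts)}"
        )
    # Each value arrives as text; the interface, not the call, supplies the type.
    return [convert(text) for convert, text in zip(converters, texts)]


print(decode_args("examples.getStateName", ["41"]))
```

The call carries only the text "41"; the decision to treat it as an integer rests entirely with the interface entry for the procedure.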
Removing the type information from the example results in:

    41

Since the "value" element really now just means "scalar" (see the specification section "Scalar <value>s"), let's call it that:

    41

If for some reason not contemplated here type information is necessary for scalars, then having a simple "type" attribute of the "scalar" element would suffice, especially since the set of allowable values is fixed, small, and consists of only short string values ("i4", "int", "boolean", "string", "double", "dateTime.iso8601", and "base64").

If we only ever expected simple, short scalar values, we could make one more change, to:

but, it is presumed that it would be possible to have a very long scalar string value, for which the former representation would be better.

Named Parameters
Some procedures may be implemented in a language that makes it very easy to implement named parameters. Supporting this would be easy:

    41

Scalar Types
Whether types apply to calls and interfaces or just to interfaces, they are an important part of the specification.

The specification defines "i4" and "int" to be synonyms for a 'four-byte signed integer'. Since the value will be represented in the call as text, this description really isn't an appropriate specification, since it is written in terms of a binary representation. We suggest here a single term for this data type, "integer", and that it be defined in terms of a range of acceptable values: -2,147,483,648 to +2,147,483,647 (just the range of values that can be stored in a two's complement 32-bit binary representation).

The "boolean" data type is distinct from the "integer" data type, yet its domain {"0", "1"} is a subset of the "integer" domain instead of the more consistent {"false", "true"}. If "boolean" is going to be treated as its own type, it should have its own domain.

The specification defines "double" to be 'double-precision signed floating point number'.
Note that in the 1999-01-21 questions-and-answers section near the end of the document, it is revealed that the full generality of the data type commonly meant by such a description is not available. Neither infinities nor "NaN" (the Not-a-Number value) are permitted. Not even exponential notation is allowed. Very simple strings matching the Perl regular expression:

    /^([+-])(\d*)(\.)(\d*)$/

are the only ones permitted according to the answer given, although one suspects that what was meant was something closer to this:

    /^([+-])?(\d*)((\.)(\d*))?$/

because the first expression requires both the sign and the decimal point to be present, and permits "+." and "-." as valid strings (although to what values they would map is a mystery).

Note: The second expression makes the leading sign and trailing decimal point and digits optional, but still isn't perfect, since it allows the empty string as a value.

This type should be called "rational" instead of "double" to get away from the physical description. "decimal" is another potentially reasonable name for this type. Also, the FAQ answer says the range of allowable values is implementation-dependent, but the specification refers to "double-precision floating-point", which does have an expected set of behaviors for most people.

The specification mentions "ASCII" in the type definition for string, but XML permits all of Unicode. Shouldn't one expect to be able to pass around string values with all the characters thus permitted? Shouldn't servers and clients be written to handle this broader character set, and convert as necessary internally? Otherwise, we are taking a big step back from the promise of XML and the web.

The "dateTime.iso8601" data type name is awkward. They didn't refer to the IEEE 754 floating point standard in the name of the "double" type (which would have been "double.ieee754" if they had). Unless the specification is going to allow multiple "dateTime" variants, the qualifier is just an annoyance.
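Returning briefly to the two regular expressions quoted for the "double" type, the sketch below checks the claims made about them. It is written in Python rather than Perl; for patterns this simple, Python's `re` module should behave the same way as Perl's engine, but that equivalence is an assumption of the sketch. The names `STRICT`, `RELAXED` and `accepts` are ours.

```python
import re

# The two patterns transcribed from the text, minus the Perl delimiters.
STRICT = re.compile(r"^([+-])(\d*)(\.)(\d*)$")
RELAXED = re.compile(r"^([+-])?(\d*)((\.)(\d*))?$")


def accepts(pattern, text):
    """Return True if the whole string matches the given pattern."""
    return pattern.match(text) is not None


# Print one row per sample: the text, then the verdict of each pattern.
for sample in ["+.", "-.", "", "3.14", "+3.14", "1e5"]:
    print(repr(sample), accepts(STRICT, sample), accepts(RELAXED, sample))
```

Run this way, the first pattern rejects an unsigned value such as "3.14" while accepting "+.", and the second accepts the empty string; both reject exponential notation such as "1e5".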
In addition, most people call a date-and-time type "timestamp", even if their computer languages sometimes just call it "DATE" (as in many SQL implementations). So, here we propose that this type just be called "timestamp" and that the type description refer to the ISO 8601 standard.

Finally, the "base64" type (added 1999-01-21) really should be "binary" with the encoding standard (Base-64) referenced in the type description.

Structures
Structures continue the same idiom used elsewhere in the specification: the avoidance of element attributes. Here is the example used in the specification (modified to accommodate the recommendations already made here):

    lowerBound 18 upperBound 139

The "name" element here should be converted into an attribute of the "member" element, leaving:

    18 139

Arrays
The "array" element is defined with a superfluous "data" child element. This element serves no function, so it should be removed. Here is the example from the specification (again, modified based on previous recommendations):

    12 Egypt false -31

Removing the unneeded "data" element leaves us with:

    12 Egypt false -31

We have recommended getting rid of "value" and using "scalar", but the specification allows a "value" to contain a scalar value *or* a "struct" *or* an "array". We can still do without the "value" element, though:

    12 Egypt false -31

Responses
The example in the document is:

    faultCode 4 faultString Too many parameters.

This has much unnecessary nesting. It is *much* simpler to store the fault code as an attribute of the "fault" element and to have the fault description be the body of the "fault" element:

    Too many parameters.

Adding a Consistent Top-Level Element
It would be nice if one could always be sure that XML data involved in the XML-RPC protocol had a particular root element. Another benefit of doing this is that a given request *could* include multiple calls, which for certain types of interactions could be of great performance benefit.
If you need to make many related calls, the network latency would be a real drag on performance, but batching up the calls into one big bundle amortizes the transport time, increasing performance. A top-level element of "xpc" is used here to stand for "XML Procedure Call".

    ... ... ...

As soon as we decide to put multiple calls in a transmission, it raises the issue of tying responses to calls. We could use order for this, but we could also provide an attribute to "call" and "response" called "id" that is optionally provided by the caller, and if present, is copied into the response element for that call.

HTTP POST REQUEST CONTENT:

    ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...

    <bar>Here's some text in an element.</bar>

However, if we add to the "scalar", "array" and "struct" types a new type "xml", then we can do the natural thing:

    Here's some text in an element.

We could even use XML Namespaces to resolve element name collisions if they arise (namespaces are commonly used for this reason in XSLT transforms).

Technically speaking, allowing parameters and results to contain XML makes the other XML-RPC types redundant, but providing shortcuts for these common cases does make sense.

Interface Specifications
In order to provide true discoverability, there needs to be a way for a client to ask the server what operations it supports, and to get back interface information for the supported procedures.

Sending an empty "query" element should cause the server to return an array of procedure names:

HTTP POST REQUEST CONTENT:

HTTP RESPONSE CONTENT:

    foo bar

Sending a "query" element with a procedure name filled in should return a response containing a prototype:

HTTP POST REQUEST CONTENT:

HTTP RESPONSE CONTENT:

    The 'foo' procedure! Given an integer, returns an array with that many elements, with each element containing the integer number of its position within the array.
Requesting information on an unknown procedure results in a "fault" return:

HTTP POST REQUEST CONTENT:

HTTP RESPONSE CONTENT:

    Unknown procedure name 'quux'!

Conclusion
The "Strategies/Goals" section of the specification lists these items (paraphrased):

* Leverage the ability of CGI to pass many firewalls to build an RPC mechanism that can cross many platforms and many network boundaries.
* Cleanliness.
* Extensibility.
* Easy implementation.

The first of these seems to be met without difficulty by leveraging the HTTP protocol.

Cleanliness is of course a subjective measure, and this document has pointed out many points on which we think cleanliness can be improved.

The original specification doesn't seem to address extensibility other than to list it as a goal. This document's addition of the XML type provides much extensibility.

Ease of implementation should not be radically decreased by the modified version of XML-RPC proposed here, except in the handling of Unicode text. This is likely the main reason ASCII was specified in the original protocol definition.

ADDITIONAL INFORMATION
The following sections provide details behind the proposed XPC.

Document Type Definition for Proposed XPC
This appendix shows the complete simple DTD for XPC. It is no more complicated than the XML-RPC DTD (see http://www.ipso-facto.demon.co.uk/xml-rpc-inline.html or http://www.ontosys.com/xml-rpc/xml-rpc.dtd).

XML Schema for Proposed XPC

An XML-RPC <---> XPC Gateway
The following XSLT transform will convert XML-RPC requests into XPC requests:

The following XSLT transform will convert XPC responses into XML-RPC responses (where it is possible):

The following XSLT transform will convert XPC requests into XML-RPC requests (where it is possible):

The following XSLT transform will convert XML-RPC responses into XPC responses:

AUTHOR
Gregor N. Purdy

COPYRIGHT
Copyright (C) 2001 Gregor N. Purdy. All rights reserved.
This is free software; you can redistribute it and/or modify it under the same terms as Perl itself.