The GSHTMLParser class is a simple subclass of
GSXMLParser which should parse reasonably
well-formed HTML documents. If you wish to parse XHTML
documents, you should use GSXMLParser; the
GSHTMLParser class is for older 'legacy'
documents.
You may create a subclass of this class to handle
incremental parsing of HTML documents. This is
provided for handling legacy documents, as modern
HTML documents should use XHTML, and for those you
should simply subclass
GSSAXHandler.
GSSAXHandler is a callback-based interface to
the
GSXMLParser
which operates in a similar (though not identical)
manner to SAX.
Each GSSAXHandler object is associated with a
GSXMLParser object. As parsing progresses,
the methods of the GSSAXHandler are invoked by the
parser, so the handler is able to deal with the
elements and entities being parsed.
The callback methods in the GSSAXHandler class do
nothing - it is intended that you subclass
GSSAXHandler and override them.
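As a minimal sketch of such a subclass, the hypothetical ElementCounter below overrides one callback to count elements. It assumes the GSXML header is installed as GNUstepBase/GSXML.h, that the element callback is named -startElement:attributes:, and that the parser can be created with +parserWithSAXHandler:withData:. None of these names appear in the text above, so check them against your installed headers.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Hypothetical handler which counts elements as they are parsed. */
@interface ElementCounter : GSSAXHandler
{
  NSUInteger elementCount;
}
- (NSUInteger) elementCount;
@end

@implementation ElementCounter
/* Assumed callback name: invoked by the parser for each opening tag. */
- (void) startElement: (NSString*)elementName
           attributes: (NSMutableDictionary*)elementAttributes
{
  elementCount++;
}
- (NSUInteger) elementCount
{
  return elementCount;
}
@end

/* Parse a complete in-memory document and return the number of
 * elements seen, or NSNotFound if the parse failed. */
static NSUInteger
countElements(NSData *xml)
{
  ElementCounter *handler = AUTORELEASE([ElementCounter new]);
  GSXMLParser    *parser;

  parser = [GSXMLParser parserWithSAXHandler: handler withData: xml];
  return [parser parse] ? [handler elementCount] : NSNotFound;
}
```

No tree is built in this mode, so the handler is the only place where parsed information is kept.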
If you create a GSXMLParser passing nil
as the GSSAXHandler, the parser will parse data to
create a
GSXMLDocument
instance which you can then examine as a whole.
This is generally the preferred mechanism for
parsing, as it permits the parser to validate
the parsed document against a DTD, and your
software can then examine the document secure
in the knowledge that it contains the expected
structure. Use of a GSSAXHandler is
preferred for very large documents with
simple structure, in which case incremental
parsing is more efficient.
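The tree-building mode might be used roughly as follows. This is a sketch only: the helper name rootElementName is hypothetical, the header path GNUstepBase/GSXML.h is an assumption, and the +parserWithSAXHandler:withData:, -root and -name methods are assumed to be available as written.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Parse in-memory XML into a tree and return the name of the root
 * element, or nil if the data could not be parsed. */
static NSString *
rootElementName(NSData *xml)
{
  /* A nil handler asks the parser to build a GSXMLDocument. */
  GSXMLParser *parser;

  parser = [GSXMLParser parserWithSAXHandler: nil withData: xml];
  if ([parser parse] == NO)
    {
      return nil;
    }
  return [[[parser document] root] name];
}
```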
The default handler for parsing documents. It will
build a GSXMLDocument for you. This handler may not
currently be subclassed, though that capability may
be added at a later date.
A class wrapping attributes of an XML element node.
Generally when examining a GSXMLDocument, you need
not concern yourself with GSXMLAttribute objects as you
will probably use the
[GSXMLNode -objectForKey:]
method to return the string value of any attribute you
are interested in.
Sets the root of the document. NB. The
node must have been created as part of the
receiving document (eg. using the
-makeNodeWithNamespace:name:content:
method).
Warning: the underscore at the start of the
name of this instance variable indicates that, even
though it is not technically private, it is
intended for internal use within the package, and
you should not use the variable in other code.
Return the numeric constant value for the namespace
type named. This method is inefficient, so the
returned value should be saved for re-use later.
The possible values are -
Converts a node type string to a numeric
constant which can be compared with the result
of the -type
method to determine what sort of node an instance
is. Because this method is quite inefficient, you
should cache the numeric type returned and re-use
the cached value.
Return node content as a string. This should return
meaningful information for text nodes and for
entity nodes containing only text nodes. If
entity substitution was not enabled during parsing,
an element containing text may actually contain both
text nodes and entity reference nodes. In this case
you should not use this method to get the content of
the element, but should examine the child nodes of the
element individually and resolve any entity
references you need explicitly. NB.
There are five standard entities which are
automatically substituted into the content
text rather than appearing as entity nodes in their
own right. These are '<', '>', ''', '"'
and '&'. If you wish to receive content in which
these characters are represented by the original
entity escapes, you need to use the
-escapedContent
method.
This performs the same function as the
-content
method, but retains escaped character information
(the standard five entities &lt;, &gt;,
&apos;, &quot;, and &amp;) which
are normally replaced with their standard equivalents
(<, >, ', ", and &).
Return the first child element of this node. If you
wish to step through all children of the node
(including non-element nodes) you should use the
-firstChild
method instead.
Creates a new child element and adds it at the
end of the parent's list of children. The ns and
content parameters are optional (may be
nil). If content is non-nil,
a child list containing the
text and entity reference nodes will be created.
Returns the previous node at this level.
Return the next node at this level. This method can
return any type of node, and it may be more
convenient to use the
-nextElement
method if you are parsing a document where you wish to
ignore non-element nodes such as whitespace text
separating elements.
Returns the next element node, skipping past any
other node types (such as text nodes). If there is no
element node to be returned, this method returns
nil. NB. This method is not
available in Java, as the method name conflicts
with that of Java's Enumerator class.
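A typical traversal that ignores non-element nodes could look like the sketch below. The helper attributeValues is hypothetical, the header path is an assumption, and the method returning the first child element is assumed to be named -firstChildElement.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Collect the value of the attribute 'key' from each child element of
 * 'parent', skipping text and other non-element nodes. */
static NSArray *
attributeValues(GSXMLNode *parent, NSString *key)
{
  NSMutableArray *values = [NSMutableArray array];
  GSXMLNode      *child;

  for (child = [parent firstChildElement];
       child != nil;
       child = [child nextElement])
    {
      NSString *value = [child objectForKey: key];

      if (value != nil)   /* element may not carry the attribute */
        {
          [values addObject: value];
        }
    }
  return values;
}
```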
Return the previous element node at this level.
NB. This method is not available in Java, as
the method name conflicts with that of Java's
Enumerator class.
Return attributes and values as a dictionary, but
applies the specified selector to each key before
adding the key and value to the dictionary. The
selector must be a method of NSString taking no
arguments and returning an object suitable for
use as a dictionary key.
This method exists for the use of GSWeb; it is
probably not of much use elsewhere.
Sets the namespace of the receiver to the value
specified. Supplying a nil
namespace removes any namespace previously set or
any namespace that the node inherited from a parent
when it was created.
Returns the node type. The most efficient way of testing
node types is to use this method and compare the
return value with a value you previously obtained
using the
+typeFromDescription:
method.
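The caching advice might be applied as in this sketch. The helper isElementNode is hypothetical, the header path is an assumption, and the description string "XML_ELEMENT_NODE" is assumed to be one of the names the class accepts. The static cache is not protected against concurrent first use from several threads.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Return YES if the node is an element node. The slow string lookup
 * is performed once and its numeric result is re-used afterwards. */
static BOOL
isElementNode(GSXMLNode *node)
{
  static NSInteger elementType = 0;
  static BOOL      cached = NO;

  if (cached == NO)
    {
      /* Assumed type name; verify against your GSXML version. */
      elementType = [GSXMLNode typeFromDescription: @"XML_ELEMENT_NODE"];
      cached = YES;
    }
  return ([node type] == elementType) ? YES : NO;
}
```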
The XML parser object is the pivotal part of parsing an
XML document - it will either build a tree representing
the document (if initialized without a GSSAXHandler),
or will cooperate with a GSSAXHandler object to provide
parsing without the overhead of building a tree.
The parser may be initialized with an input source (in
which case it will expect to be asked to parse the
entire input in a single operation), or without. If
it is initialised without an input source, incremental
parsing can be done by feeding successive parts of
the XML document into the parser as NSData objects.
This method controls the loading of external
entities into the system. If it returns an empty
string, the entity is not loaded. If it returns a
filename, the entity is loaded from that file.
If it returns nil, the default entity
loading mechanism is used.
The default entity loading mechanism is to construct
a file name from the locationURL, by replacing all path
separators with underscores, then attempt to
locate that file in the DTDs resource directory of
the main bundle, and all the standard system
locations.
As a special case, the default loader examines the
publicID and if it is a GNUstep DTD, the loader
constructs a special name from the ID (by
replacing dots with underscores and spaces with
hyphens) and looks for a file with that name and
a '.dtd' extension in the GNUstep bundles.
NB. This method will only be called if there is no
SAX handler in use, or if the corresponding method in
the SAX handler returns nil.
If the handler object supplied is
nil, the parser will build a tree
representing the parsed file rather than
attempting to get the handler to
deal with the parsed elements and entities.
Sets a directory in which to look for DTDs when
resolving external references. Can be used when
DTDs have not been installed in the normal locations.
If the handler object supplied is
nil, the parser will use an instance
of
GSTreeSAXHandler
to build a tree representing the parsed file. This
tree will then be available (via the
-document
method) as a
GSXMLDocument
on completion of parsing.
The source for the parsing process is not
specified - so parsing must be done
incrementally by feeding data to the
parser.
Initialisation of a new Parser with SAX
handler (if not nil) by
calling
-initWithSAXHandler:
Sets the input source for the parser to be the
specified data object (which must
contain an XML document), so parsing of the
entire document will be performed rather than
incremental parsing.
Initialisation of a new Parser with SAX
handler (if not nil) by
calling
-initWithSAXHandler:
Sets the input source for the parser to be the
specified input stream, so parsing
of the entire document will be performed rather than
incremental parsing.
Sets blank text node support and returns the
previous value. ignorableWhitespace nodes are only
generated when running the parser in validating
mode and when the current element doesn't allow CDATA
or mixed content.
Parse source. Return YES if parsed as
valid, otherwise NO. If validation
against a DTD is not enabled, the return value
simply indicates whether the xml was well formed.
This method should be called once to parse
the entire document.
Pass data to the parser for incremental
parsing. This method should be called many
times, with each call passing another block of
data from the same document. After the
whole of the document has been parsed, the method
should be called with an empty or nil data object to indicate end of parsing.
On this final call, the return value indicates whether
the document was valid or not. If validation to a DTD
is not enabled, the return value simply indicates
whether the xml was well formed.
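// No input source is given, so data is fed in incrementally.
GSXMLParser *p = [GSXMLParser parserWithSAXHandler: nil];
NSData *data;
// getMoreData() stands for your own function returning the next
// block of the document, or nil when there is no more data.
while ((data = getMoreData()) != nil)
  {
    if ([p parse: data] == NO)
      {
        NSLog(@"parse error");
      }
  }
[p parse: nil]; // Completed parsing of document.
// Do something with document parsed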
Sets up (or removes) a mutable string to which error
and warning messages are saved. Using an argument of
NO will cause these messages to be
written to stderr (the default). NB. A SAX
handler which overrides the error and warning
logging messages may stop this mechanism operating.
Sets entity substitution support and returns the
previous value. Initially the parser always keeps
entity references instead of substituting entity
values in the output.
The GSXMLRPC class provides methods for constructing
and parsing XMLRPC method call and response documents,
so that calls may be constructed of standard
objects.
The correspondence between XMLRPC values and
Objective-C objects is as follows -
i4 (or
int) is an
NSNumber
other than a real/float or boolean.
If you attempt to use any other type of
object in the construction of an
XMLRPC document, the
[NSObject -description]
method of that object will be used to create a
string, and the resulting object will be encoded
as an XMLRPC string element.
In particular, the names of members in a
struct must be strings, so if
you provide an
NSDictionary
object to represent a struct the keys of the dictionary will be converted to strings if necessary.
The class also provides a method for
making a synchronous XMLRPC method
call (with timeout), or an
asynchronous call in which
the call completion is handled by a
delegate.
You may also use the class to
implement an XMLRPC server, by
calling the
-parseMethod:params:
method to parse the data POSTed to your server, and -buildResponseWithParams:
(or -buildResponseWithFaultCode:andString:) to produce the data to be sent back to the client.
In order to simply make a synchronous
XMLRPC call to a server, all you
need to do is write code like:
GSXMLRPC *server = [[GSXMLRPC alloc] initWithURL: @"http://server/path"];
id result = [server makeMethodCall: name params: p timeout: 30];
This says that you want to call the specified method
('name') on the server, passing the parameters
('p'), with a 30 second timeout. If there
is a network or HTTP-level error or a timeout, the
result will be an error string; otherwise it will be
an array (on success) or a dictionary containing the
fault details.
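Since the result can be one of three kinds of object, the caller has to test its class. A minimal sketch follows, in which the helper reportResult is hypothetical and the dictionary keys faultCode and faultString are assumed to be the member names that the XML-RPC specification gives for a fault struct.

```objc
#import <Foundation/Foundation.h>

/* Log the outcome of a synchronous XMLRPC call according to the class
 * of the returned object. */
static void
reportResult(id result)
{
  if ([result isKindOfClass: [NSArray class]])
    {
      /* Success: the array holds the response parameters. */
      NSLog(@"success, %lu parameter(s)",
        (unsigned long)[(NSArray*)result count]);
    }
  else if ([result isKindOfClass: [NSDictionary class]])
    {
      /* Fault: key names assumed from the XML-RPC fault struct. */
      NSLog(@"fault %@: %@",
        [(NSDictionary*)result objectForKey: @"faultCode"],
        [(NSDictionary*)result objectForKey: @"faultString"]);
    }
  else
    {
      /* Network error, HTTP-level error or timeout: a string. */
      NSLog(@"call failed: %@", result);
    }
}
```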
Given a method name and an array of
parameters, this method constructs
the XML document for the corresponding XMLRPC call and
returns the document as an NSData object containing
UTF-8 text. The params array may be
empty or nil if there are no parameters
to be passed. The method returns
nil if passed an invalid
method name (a method name may
contain any of the ASCII alphanumeric characters
and underscore, full stop, colon, or slash). This
method is used internally when sending an
XMLRPC method call to a remote system,
but you can also call it yourself.
Given a method name and an array of
parameters, this method constructs
the XML document for the corresponding XMLRPC call and
returns the document as a string. The
params array may be empty or
nil if there are no parameters to be
passed. The method returns
nil if passed an invalid
method name (a method name may
contain any of the ASCII alphanumeric characters
and underscore, full stop, colon, or slash).
Constructs an XML document for an XMLRPC fault
response with the specified code and
string. The resulting document is returned as a
string. This method is intended for use by
applications acting as XMLRPC servers.
Builds an XMLRPC response with the specified array
of parameters and returns the document as a string.
The params array may be empty or
nil if there are no parameters to be
returned (an empty params element will
be created). This method is intended for use by
applications acting as XMLRPC servers.
Returns the delegate previously set by the
-setDelegate:
method. The delegate handles completion of
asynchronous method calls to the URL specified
when the receiver was initialised (if any).
Initialise the receiver to make XMLRPC calls to
the specified url and (optionally) with the
specified SSL parameters. The
url argument may be nil, in
which case the receiver will be unable to make XMLRPC
calls, but can be used to parse incoming requests
and build responses. If the SSL credentials are
non-nil, connections to the remote server will be
authenticated using the supplied certificate
so that the remote system knows who is contacting it.
Calls
-sendMethodCall:params:timeout:
and waits for the response. Returns the response
parameters (an array), the response fault (a
dictionary), or a failure reason (a string).
Parses XML data containing an XMLRPC method call.
Returns the name of the method call.
Empties, and then places the method parameters (if
any) in the params argument. NB. Any
containers (arrays or dictionaries) in the
parsed parameters will be mutable, so you can modify
this data structure as you like. Raises an
exception if parsing fails. This method is
intended for the use of XMLRPC server
applications.
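A server might combine the parsing and response-building methods along these lines. This is a sketch under assumptions: the helper handleRequest, the "echo" method and the fault codes 1 and 2 are all hypothetical, the header path is assumed, and the result is typed as id because the exact return type of the response builders may differ between versions.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Turn the data POSTed to a server into the response document to be
 * sent back to the client. */
static id
handleRequest(NSData *posted)
{
  /* A nil URL gives an object usable only for parsing and building. */
  GSXMLRPC       *rpc = AUTORELEASE([[GSXMLRPC alloc] initWithURL: nil]);
  NSMutableArray *params = [NSMutableArray array];
  NSString       *method = nil;

  @try
    {
      method = [rpc parseMethod: posted params: params];
    }
  @catch (NSException *e)
    {
      /* Parsing raised an exception; hypothetical fault code 1. */
      return [rpc buildResponseWithFaultCode: 1 andString: [e reason]];
    }
  if ([method isEqualToString: @"echo"])
    {
      /* Hypothetical method which returns its own parameters. */
      return [rpc buildResponseWithParams: params];
    }
  return [rpc buildResponseWithFaultCode: 2 andString: @"unknown method"];
}
```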
Parses XML data containing an XMLRPC method
response. Returns nil for
success, the fault dictionary on failure.
Places the response parameters (if any) in the
params argument. NB. Any containers
(arrays or dictionaries) in the parsed parameters
will be mutable, so you can modify this data structure
as you like. Raises an exception if parsing
fails. Used internally when making a method
call to a remote server.
Returns the result of the last method call, or
nil if there has been no method call or
one is in progress. The result may be one of:
A mutable array containing the parameters of a
success response.
A dictionary containing a fault response.
A string describing a low-level failure (eg.
timeout).
NB. Any containers (arrays or dictionaries) in the
parsed parameters of a success response will be
mutable, so you can modify this data structure as
you like.
Send an asynchronous XMLRPC method call
with the specified timeout. A delegate should
have been set to handle the result of this call, but
if one was not set the state of the asynchronous call
may be polled by calling the
-result
method, which will return nil
as long as the call has not completed.
The call may be cancelled by calling the
-timeout:
method. This method
returns YES if the call was started,
NO if it could not be started (eg.
because another call is in progress or because of
bad arguments). NB. For the asynchronous
operation to proceed, the current
NSRunLoop
must be run.
Specify whether to generate compact XML (omit
indentation and other white space and omit
<string> element markup).
Compact representation saves some space (can be
important when sent over slow/low bandwidth
connections), but sacrifices readability.
Sets the delegate object which will receive callbacks
when an XMLRPC call completes. NB. this
delegate is not retained, and should be
removed before it is deallocated (call
-setDelegate:
again with a nil argument to remove the
delegate).
Sets the time zone for use when sending/receiving
date/time values. The XMLRPC specification
says that timezone is server dependent so you will
need to set it according to the server you are
connecting to. If this is not set, UTC is
assumed.
Handles timeouts, passing information to the delegate.
You don't need to call this method, but you
may call it in order to cancel an
asynchronous request as if it had timed out.
Use of the GSXPathContext class is simple: once you
have learned the syntax of XPath
expressions, you can apply those
expressions to a context to retrieve data from
a document.
For XPath queries returning a node set. An XPath
node set is an ordered set of nodes returned as a result
of an expression. The order of the nodes in the set is the
same as the order in the XML document from which they
were extracted.
XPath queries return a GSXPathObject. GSXPathObject
itself is an abstract class; there are four
completely different GSXPathObject types, listed
below. You need to check the returned type
of each XPath query to make sure it is what you
expected.
You don't create GSXPathObject instances; instead the
XPath system creates them and returns them as the
result of the
[GSXPathContext -evaluateExpression:]
method.
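Checking the returned type could be done as in the following sketch. The helper namesMatching is hypothetical, the header path is an assumption, and the node set class is assumed to be called GSXPathNodeSet with -count and -nodeAtIndex: accessors, while the context is assumed to be created with -initWithDocument:.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Evaluate an XPath expression against a document and return the names
 * of the matching nodes. The array is empty if the expression did not
 * produce a node set. */
static NSArray *
namesMatching(GSXMLDocument *doc, NSString *expression)
{
  NSMutableArray *names = [NSMutableArray array];
  GSXPathContext *ctx;
  GSXPathObject  *result;

  ctx = AUTORELEASE([[GSXPathContext alloc] initWithDocument: doc]);
  result = [ctx evaluateExpression: expression];
  if ([result isKindOfClass: [GSXPathNodeSet class]])
    {
      GSXPathNodeSet *set = (GSXPathNodeSet*)result;
      NSUInteger     i;

      for (i = 0; i < [set count]; i++)
        {
          NSString *name = [[set nodeAtIndex: i] name];

          if (name != nil)
            {
              [names addObject: name];
            }
        }
    }
  return names;
}
```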
Performs an XSLT transformation on the specified
file using the stylesheet provided. Returns an
autoreleased GSXMLDocument containing the
transformed XML, or nil on
failure.
Performs an XSLT transformation on the specified
file using the stylesheet and parameters provided. See
the libxslt documentation for details of the supported
parameters. Returns an autoreleased
GSXMLDocument containing the transformed XML,
or nil on failure.
Performs an XSLT transformation on the specified
file using the stylesheet provided. Returns an
autoreleased GSXMLDocument containing the
transformed XML, or nil on
failure.
Performs an XSLT transformation on the specified
file using the stylesheet and parameters provided. See
the libxslt documentation for details of the supported
parameters. Returns an autoreleased
GSXMLDocument containing the transformed XML,
or nil on failure.
Performs an XSLT transformation on the current
document using the supplied stylesheet.
Returns an autoreleased GSXMLDocument containing
the transformed XML, or nil on failure.
Performs an XSLT transformation on the current
document using the supplied stylesheet and
parameters (params may be
nil). See the libxslt documentation for
details of the supported parameters. Returns
an autoreleased GSXMLDocument containing the transformed
XML, or nil on failure.
Delegates should implement this method in order to
be informed of the success or failure of an XMLRPC method
call which was initiated by the
-sendMethodCall:params:timeout:
method.
An empty method provided for subclasses to override.
Called by the sender when an XMLRPC
method call completes (either success or failure).
The delegate may then call the
-result
method to retrieve the result of the method call
from the sender.
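An asynchronous call with a delegate might be arranged roughly as shown below. The CallWatcher class and callAsync function are hypothetical, the header path is an assumption, and the delegate callback is assumed to be named -completedXMLRPC: (the name is not given in the text above).

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Hypothetical delegate which records completion of a call. */
@interface CallWatcher : NSObject
{
  BOOL done;
}
- (BOOL) done;
@end

@implementation CallWatcher
/* Assumed callback name; invoked on success, failure or timeout. */
- (void) completedXMLRPC: (GSXMLRPC*)sender
{
  NSLog(@"result: %@", [sender result]);
  done = YES;
}
- (BOOL) done
{
  return done;
}
@end

/* Start a call and run the current run loop until it completes. */
static void
callAsync(GSXMLRPC *server, NSString *name, NSArray *params)
{
  CallWatcher *watcher = AUTORELEASE([CallWatcher new]);

  [server setDelegate: watcher];
  if ([server sendMethodCall: name params: params timeout: 30])
    {
      while ([watcher done] == NO)
        {
          [[NSRunLoop currentRunLoop]
               runMode: NSDefaultRunLoopMode
            beforeDate: [NSDate dateWithTimeIntervalSinceNow: 1.0]];
        }
    }
  [server setDelegate: nil];   /* the delegate is not retained */
}
```

In a real application the run loop is normally already running, so the explicit loop here only serves to keep the sketch self-contained.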
Deals with standard XML internal entities.
Converts the five XML special characters in the
receiver ('>', '<', '&', ''' and
'"') to their escaped equivalents, and
returns the escaped string. Also converts
non-ascii characters to the corresponding numeric
entity escape sequences. You should perform
any non-standard entity substitution you require
after you have called this method.
Deals with standard XML internal entities.
Converts the five XML escape sequences
('&gt;', '&lt;', '&amp;',
'&apos;' and '&quot;') to the unicode
characters they represent, and returns the
unescaped string. Also converts numeric
entity escape sequences to the corresponding unicode
characters. You should perform any
non-standard entity substitution you require
before you call this method.
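The two conversions are intended to be inverses of each other for ordinary text. The sketch below checks that; the helper roundTrips is hypothetical, the header path is an assumption, and the category methods are assumed to be named -stringByEscapingXML and -stringByUnescapingXML (the names are not given in the text above).

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSXML.h>   /* assumed header location */

/* Return YES if escaping and then unescaping 'text' gives back the
 * original string. */
static BOOL
roundTrips(NSString *text)
{
  /* Assumed method names from the NSString additions in GSXML. */
  NSString *escaped = [text stringByEscapingXML];

  return [[escaped stringByUnescapingXML] isEqualToString: text];
}
```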