Programs Acting as a TCP and/or UDP Server

All HadronZoo TCP/UDP server programs operate epoll in edge-triggered mode. This is facilitated by the hzIpServer class, working in conjunction with the hzIpListen class (listening socket) and the hzIpConnex class (client connection). In outline, the method is as follows: call hzIpServer::GetInstance() to create the singleton hzIpServer instance; then call the 'AddPort' functions (see below) to add one or more listening sockets; then call hzIpServer::Serve() to make the service live. Serve() manages the epoll loop and does not return until shutdown conditions apply and the shutdown procedure is complete.

The hzIpServer regime provides an operational framework, but not the service functionality. During normal operation, hzIpConnex instances are automatically created and destroyed as clients connect and disconnect. Action on client connection and disconnection is optional, depending on the service protocol and other requirements, while action on data receipt is compulsory. These actions are handled by the user-defined OnConnect(), OnDisconn() and OnIngress() functions respectively, which are supplied as pointers during the initialization of the applicable listening socket. hzIpServer has three member functions to add listening sockets as follows:-

1) AddPortTCP   Used for any TCP client EXCEPT HTTP
2) AddPortHTTP  Used for HTTP services ONLY
3) AddPortUDP   Adds a UDP socket

The 'AddPort' functions set the inactivity timeout, the port number to listen on, the maximum number of simultaneous connections and whether SSL will be used. More importantly, the AddPort functions assign one or more application-specific callback functions, which will do all the work. In more detail we have:-

hzEcode AddPortTCP (
    hzTcpCode(*OnRequest)(hzChain&,hzTcpConnex*),
    hzTcpCode(*OnConnect)(hzTcpConnex*) = 0,
    hzTcpCode(*OnDisconn)(hzTcpConnex*) = 0,
    uint32_t nTimeout,
    uint32_t nPort,
    uint32_t nMaxClients,
    bool bSecure = false ) ;

Note that AddPortTCP can be used for any TCP client EXCEPT HTTP. The first argument, OnRequest, supplies the compulsory OnIngress function. OnConnect is optional and is reserved for protocols with a server hello; OnDisconn is optional and is usually used to add more reporting.

AddPortHTTP() is a special case of AddPortTCP(), and has no function pointer arguments. There are no special steps required on connection or disconnection within the HTTP protocol, and OnIngress() is always the library function HandleHttpMsg().

AddPortUDP() has the same OnIngress, OnConnect and OnDisconn function pointer arguments that AddPortTCP() has, in spite of UDP being a connectionless protocol. As with AddPortTCP, OnConnect and OnDisconn are optional, and their presence depends on whether OnIngress() requires state maintenance of any form.

Server Operation: The OnIngress function

Although OnConnect() is critical in protocols with a server hello (e.g. SMTP and POP3), and OnDisconn() can be very useful for post-session reporting, the bulk of the work falls to the compulsory OnIngress() function. With epoll in edge-triggered mode, the client sockets are non-blocking. When data arrives on a client socket it is always read in full. Although an incoming burst of data could amount to a single, complete request message, this cannot be assumed. The data could span one or more messages, of which the first and/or the last could be incomplete.
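The need to read in full follows from how edge-triggered epoll behaves: a descriptor is reported once when new data arrives, not for as long as data remains unread, so a reader that stops early may not be woken for the remainder. The following is a minimal sketch of such a drain loop using plain POSIX calls. It is not taken from the HadronZoo sources; the function names SetNonBlocking and DrainSocket, the use of std::string as the input buffer and the 4096-byte scratch buffer are all assumptions made for illustration.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper (not a HadronZoo function): put a descriptor into
// non-blocking mode, as edge-triggered operation requires.
bool SetNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical helper (not a HadronZoo function): read everything currently
// available on a non-blocking descriptor and append it to 'input'.
// Returns true if the peer is still connected, false on EOF or a hard error.
bool DrainSocket(int fd, std::string& input)
{
    char buf[4096];

    for (;;)
    {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0)
        {
            input.append(buf, static_cast<size_t>(n));
            continue;               // more data may still be queued
        }
        if (n == 0)
            return false;           // orderly shutdown by the peer
        if (errno == EINTR)
            continue;               // interrupted by a signal, retry
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return true;            // nothing left to read, wait for the next event
        return false;               // any other error: treat the connection as dead
    }
}
```

In this sketch the caller would invoke DrainSocket() once per readiness event and then hand the accumulated input to the protocol handler, which decides how much of it forms complete requests.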

The first task of OnIngress() is to determine if the input so far received amounts to a whole request message. If it does, OnIngress() must process the request and remove it from the input chain. If it does not, the request cannot be processed until further data arrives.
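As an illustration of the whole-message test, consider a hypothetical protocol in which every request is a single line terminated by CRLF. The sketch below uses std::string as a stand-in for the input chain (hzChain itself is not used), and the function name TakeRequests is an assumption for illustration only. It removes each complete request from the front of the input and leaves any incomplete trailing request in place.

```cpp
#include <string>
#include <vector>

// Hypothetical framing helper for a CRLF-terminated line protocol.
// Removes every complete request from the front of 'input' and returns them;
// an incomplete trailing request stays in 'input' for the next call.
std::vector<std::string> TakeRequests(std::string& input)
{
    std::vector<std::string> requests;
    std::string::size_type   start = 0;

    for (;;)
    {
        std::string::size_type end = input.find("\r\n", start);
        if (end == std::string::npos)
            break;                              // remainder is not yet a whole message
        requests.push_back(input.substr(start, end - start));
        start = end + 2;                        // step over the terminator
    }

    input.erase(0, start);                      // drop what has been consumed
    return requests;
}
```

Given the input "HELO a\r\nMAIL b\r\nRC", this returns the two requests "HELO a" and "MAIL b" and leaves "RC" in the buffer until the rest of that line arrives.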

Outgoing Client Connections

Server programs often connect as clients to other server programs. The DWE, for example, commonly uses repository microservices and can proxy HTTP requests for particular domains to another DWE instance. Server-to-server connections can be set up as individual independent channels, but can also be accommodated within the server epoll event loop.

Acting as TCP/UDP Client

HadronZoo-based programs can become either TCP or UDP clients by virtue of the hzTcpClient and hzUdpClient classes respectively. These two classes provide the means to send packets to, and receive packets from, the host but don't do anything beyond this. They are intended to be used either in conjunction with another class that will implement a standard protocol, or where the objective is to implement a proprietary protocol.

Thus far, HadronZoo has provided classes for the following standard protocols:-

1) FTP To act as an FTP client you have to know the host you are trying to connect to, and the username and password. You then declare an instance of hzFtpClient and call the Init method as Init(hostname, username, password). The StartSession() method opens the connection while QuitSession() terminates it. The methods in-between are largely self-explanatory, although you will need to use hzDirent (defined in hzDirectory.h), which describes directory entries. FTP sessions regularly fall over and to counter this, the hzFtpClient methods that transport data all reconnect automatically. Automatic reconnection is not always possible, of course, and in any case it is usually necessary to keep track of which files did or did not get sent or received. To this end there is the hzFtpHost class, which performs a bulk sync to a given host.
2) POP3 To act as a POP3 client and collect emails (e.g. as an auto-responder).
3) HTTP The hzHttpClient class manages the TCP connection to a remote website and provides the GetPage() and PostForm() functions. But at this level, nothing is known about the content of the pages or forms. The hzWebhost class equates to a third party website and can be configured to accommodate the nuances of that website. This would be for the purposes of automating browsing (web-scraping).

Using the hzWebhost class to interact with a website is a matter of emulating what a human operating a browser would do to achieve the same. This is a parameterized and generally recursive process in which, starting from one or more 'root' or 'entry point' pages such as the home page, pages are downloaded and links to other pages are garnered. Where these links point to other pages on the website (or other websites listed as related to it), these pages are also read in. The process terminates when all discovered pages meeting any supplied limiting criteria are downloaded.
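The traversal described above can be sketched as a walk over a link graph. The example below does not use hzWebhost and performs no network I/O: the website is modelled as an in-memory map from each URL to the links found on that page, links to URLs outside the map stand for off-site pages, and the only limiting criterion shown is a maximum page count. It is written iteratively with a queue rather than recursively. All names (SiteMap, Crawl, maxPages) are assumptions for illustration.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Stand-in for a website: each URL maps to the links found on that page.
using SiteMap = std::map<std::string, std::vector<std::string>>;

// Hypothetical crawl routine. Starts from the root pages, follows links to
// pages present in 'site', and stops when no undiscovered pages remain or
// 'maxPages' pages have been visited. Returns the pages in visiting order.
std::vector<std::string> Crawl(const SiteMap& site,
                               const std::vector<std::string>& roots,
                               std::size_t maxPages)
{
    std::vector<std::string> visited;
    std::set<std::string>    seen;
    std::queue<std::string>  pending;

    for (const std::string& url : roots)
        if (seen.insert(url).second)
            pending.push(url);

    while (!pending.empty() && visited.size() < maxPages)
    {
        std::string url = pending.front();
        pending.pop();

        SiteMap::const_iterator page = site.find(url);
        if (page == site.end())
            continue;                       // off-site or missing page: skipped

        visited.push_back(url);             // stands in for downloading the page

        for (const std::string& link : page->second)
            if (seen.insert(link).second)   // queue each discovered link once only
                pending.push(link);
    }
    return visited;
}
```

The 'seen' set is what guarantees termination on sites whose pages link back to each other, since no URL is ever queued twice.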

Where authentication is required, the authentication sequence is normally by form submission. The form is downloaded from a particular host URL, filled in and sent back to the host URL indicated in the form (same host but not necessarily the same URL). The authentication sequence is parameterized as a series of one or more pages that must first be fetched (i.e. to garner session cookies), followed by the form submission that will effect the authentication. Commonly, both the home page and the login form will have to be downloaded before the host has sent all the cookies it expects to be returned with the login form.
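The cookie-gathering part of this sequence can be sketched as a small cookie jar that is filled as each preliminary page is fetched and then consulted when the form is submitted. The sketch below is not the hzWebhost implementation; the names CookieJar, StoreCookie and CookieHeader are assumptions for illustration, and cookie attributes such as Path or Expires are deliberately ignored.

```cpp
#include <map>
#include <string>

// Hypothetical cookie jar: name -> value, later values replace earlier ones.
using CookieJar = std::map<std::string, std::string>;

// Records the leading "name=value" pair of a Set-Cookie header value.
// Anything after the first ';' (attributes) is ignored in this sketch.
void StoreCookie(CookieJar& jar, const std::string& setCookie)
{
    std::string pair = setCookie.substr(0, setCookie.find(';'));
    std::string::size_type eq = pair.find('=');
    if (eq == std::string::npos || eq == 0)
        return;                                 // malformed, ignore
    jar[pair.substr(0, eq)] = pair.substr(eq + 1);
}

// Builds the value of the Cookie request header from everything collected.
std::string CookieHeader(const CookieJar& jar)
{
    std::string out;
    for (const auto& c : jar)
    {
        if (!out.empty())
            out += "; ";
        out += c.first + "=" + c.second;
    }
    return out;
}
```

In use, StoreCookie() would be called for each Set-Cookie value received while fetching the home page and the login form, and CookieHeader() would supply the Cookie header sent with the form submission.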

Note that in some cases, the login data is sent by a GET and a URL extension. In such cases there will be no <postForm> tag and the username and password are embedded in the <loadPage> tag's url attribute.

Each step in the HTTP session is implemented by a hzWebCMD instance.

Currently there are no standard UDP service protocols covered by HadronZoo functions.