Demarcation of System Administrator and Webmaster
Linux reserves port numbers below 1024 for programs with superuser privileges, so for a service to listen on the standard HTTP and HTTPS ports (80 and 443), the server program must be run by the system administrator. Listening sockets are exclusive, so once a program is listening on a given port, no other program on the same machine may do so. Multiple instances of the same program can share a listening port, but only when forked from the original. However, forking cannot achieve anything beyond what can be done with multithreading. For multiple HTTP/S services to be available on the standard ports of the same physical server, the server program must be able to handle requests on behalf of them all. To do this it must distinguish between requests on the basis of domain, and either incorporate the functionality of each service internally, or route requests to the applicable HTTP/S service.
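The domain-based dispatch described above can be sketched as a lookup on the Host header of each request. This is a minimal illustration, not the web engine's own code: the ROUTES table, the route function, and the domain names and port numbers in it are all hypothetical.

```python
# Hypothetical routing table: the domains, modes and ports are illustrative only.
ROUTES = {
    "mysite1.com": ("internal", None),
    "www.mysite1.com": ("internal", None),
    "mysite2.com": ("proxy", 19020),
    "www.mysite2.com": ("proxy", 19020),
}


def route(host_header):
    """Return (mode, port) for a Host header value, or None if the domain is not served."""
    # Hostnames are case-insensitive and the header may carry a ":port" suffix.
    # (IPv6 literals are not handled in this sketch.)
    host = host_header.strip().lower().split(":", 1)[0]
    return ROUTES.get(host)
```

A request for an unlisted domain yields None, which a real server would have to answer with an error of its choosing.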
Since only the system administrator can run the HTTP/S server program on the standard ports, it is for the system administrator to decide which domains are served, and who will act as webmaster for each domain. The webmasters, who are Linux users presumed not to have superuser privileges, then administer the service available at their respective domains. This demarcation makes sense even where the system administrator and webmaster are the same person. Where multiple services are administered by multiple webmasters, it is crucial. It is imperative that action by one webmaster has no impact on other webmasters or the system administrator. It is also important to minimize the impact of system administrator action upon the webmasters.
Apache is actually a more natural fit for this demarcation than the web engine is. In Apache websites, all resources are files, be they images or webpages. In any HTTP request, the path part of the supplied URL is taken as a path relative to the directory named as the document root. If this path finds a file, it will be served. If not, the visitor gets a 404 error. The resources are disparate and so can be added, deleted and modified at will. If webmasters write pages with broken links or PHP scripts that don't work, the impact will be limited to the website in hand. The disparate nature also means the system administrator can shut down and restart the HTTP service at will. With the LAMP method, website data state is vested in an external MySQL database. Thus, provided the shutdown is orderly and allows all requests currently in progress to complete, the data state is preserved. Service will simply resume once Apache is back online - which, with little in the way of initialization, is usually achieved within a couple of seconds.
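The file-per-resource model can be illustrated with a short sketch of path resolution against a document root. It is a deliberate simplification of what Apache does (no directory indexes, aliases or rewrite rules), and the resolve function is a hypothetical name used only here.

```python
import os


def resolve(document_root, url_path):
    """Map a URL path to a file under document_root; return (status, file_path)."""
    root = os.path.realpath(document_root)
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # Refuse paths that escape the document root, for example via "..".
    if os.path.commonpath([root, candidate]) != root:
        return 404, None
    # A path that finds a regular file is served; anything else is a 404.
    if os.path.isfile(candidate):
        return 200, candidate
    return 404, None
```

Because each lookup depends only on what is on disk at that moment, files can be added or removed without restarting anything.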
Web Engine Operation
The web engine is a different proposition. Webapps hosted by a given web engine instance are separately configured as either internal or by proxy. In the latter case, requests are passed to a secondary web engine instance, which will host the webapp internally. Where webapps are internal, their resources and other related entities form a complete and coherent set within the host web engine program space. Webapps are thus whole applications rather than a set of disparate files, which is one reason why they are called webapps rather than websites.
The HDB is the web engine native database, so unless webapps are data inert, HDB repositories will be among the resources in the web engine program space. As previously stated, HDB repositories can be available as external microservices, but by default are internal. External repository microservices consume very little web engine memory. Internal repositories have greater operational efficiency, but potentially consume significant volumes of web engine memory. The web engine, and particularly the HDB, are designed for performance rather than fast startup. The key to performance is to hold a lot of data in memory. The key to fast startup is to not hold a lot of data in memory!
In typical multi-domain installations, the system administrator runs a main web engine instance, which listens on the standard ports and hosts every webapp by proxy. The webmasters run secondary web engine instances, which listen on non-standard ports and host webapps internally. The downsides of this approach are minor. While proxy connections inevitably add latency, the throughput reduction is not severe. The upsides are major. The approach is simple, flexible and above all, safe. The demarcation is preserved, so webmaster action, such as adding C++ modules, will only affect the webmaster concerned. It is also worth noting that the proxy mechanism is entirely standard. Thus, on servers where Apache is listening on the standard ports, developers can try out a Dissemino webapp by means of the Apache ProxyPass facility.
The web engine has an in-built change control regime, which enables changes to webapps, live or otherwise. Restarts are only needed for data model changes, although webmasters will want to stop and start webapps for other reasons. One way of stopping a webapp is to kill the web engine instance that hosts it, but webapps can be suspended and shut down without this. Suspension simply blocks new requests and completes those in progress. A shutdown is a suspension plus an unloading of the webapp resources and entity set.
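The difference between suspension and shutdown can be modelled as a small state machine. The sketch below is a toy model under stated assumptions: the Webapp class, its method names and the in-flight request counter are hypothetical and do not reflect the web engine's actual interfaces.

```python
class Webapp:
    """Toy model of the running, suspended and shut down states of a webapp."""

    def __init__(self, name):
        self.name = name
        self.state = "running"
        self.in_progress = 0               # requests currently being served
        self.resources = {"loaded": True}  # stands in for resources and entity set

    def begin_request(self):
        # New requests are only accepted while the webapp is running.
        if self.state != "running":
            return False
        self.in_progress += 1
        return True

    def end_request(self):
        self.in_progress -= 1

    def suspend(self):
        # Block new requests; those already in progress are left to complete.
        self.state = "suspended"

    def shutdown(self):
        # A shutdown is a suspension plus unloading, once in-flight requests have drained.
        self.suspend()
        if self.in_progress > 0:
            return False
        self.resources = None
        self.state = "shutdown"
        return True
```

In this model a shutdown requested while requests are still in flight leaves the webapp suspended, and the caller retries once they have completed.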
Dissemino Spheres and the Sphere File
If you look at the web engine source code, you won't need programming skills to see that there is very little to it. This is because almost all the functionality is provided by the HadronZoo library. Everything needed to implement the HDB and the Dissemino method, including the config read, is in the library - so that any HadronZoo program can incorporate the HDB and be given a web front end. Most HadronZoo programs with a web front end underutilize its capabilities, as they only need a simple control panel for stats reporting and other admin purposes. All could host a sophisticated webapp should this be justified, and doing so would only be a matter of changing the configs, not the program code. The web engine is something of an exception. It incorporates both the HDB and an HTTP/S interface, but it isn't the front end of anything. It is a vanilla HadronZoo program where the front end is the objective. It basically does nothing except read and act upon the webapp configs. The web engine is also an exception in another respect. It is the only official HadronZoo program that can host multiple webapps, on behalf of multiple domains, and do so using the same listening port.
The web engine is invoked with an XML file which must either be a webapp config, with a <webappCfg> root tag, or a Dissemino Sphere config, with a <sphereCfg> root tag. Where the web engine hosts a single webapp, it will do so internally and be invoked with the applicable webapp config. In all other cases the web engine is invoked with a Dissemino Sphere config.
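The choice between the two config types comes down to the root tag of the supplied file. A minimal sketch of that check, using Python's standard XML parser, follows. The config_kind function is a hypothetical name, and the real web engine reads its configs through the HadronZoo library rather than anything shown here.

```python
import xml.etree.ElementTree as ET


def config_kind(xml_text):
    """Classify a config by its root tag: 'webapp' or 'sphere'."""
    root = ET.fromstring(xml_text)
    if root.tag == "webappCfg":
        return "webapp"
    if root.tag == "sphereCfg":
        return "sphere"
    # Any other root tag is not a config the web engine can be invoked with.
    raise ValueError(f"unexpected root tag <{root.tag}>")
```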
A Dissemino Sphere, or 'sphere of influence', is an authoritative record of the set of webapps that a given web engine will host. Although there will only be one sphere in the typical installation described above, there can be several spheres, and collectively they cover the entire set of webapps across the machine. The currently accepted spheres are held as XML files in a standard location such as /etc/hzDissemino/. These can only be written by the system administrator, but are readable by anyone. Although the web engine will only act on the config file it is invoked with, it will read in the entire set of spheres so as to check that there are no contradictions.
In the example below, the web engine will listen on the standard HTTP and HTTPS ports, and host two webapps, named mysite1 and mysite2. Of these, mysite1 is hosted internally while mysite2 is hosted by proxy.
<sphereCfg portSTD="80" portSSL="443">
    <webapp name="mysite1" webmaster="johndoe" basedir="/usr/www/mysite1.com" rootfile="mysite1Root.xml">
        <domain name="mysite1.com"/>
        <subdomain name="www.mysite1.com"/>
    </webapp>
    <webapp name="mysite2" webmaster="janedoe" basedir="/usr/www/mysite2.com" rootfile="mysite2Root.xml" portSTD="19020" portSSL="19120">
        <domain name="mysite2.com"/>
        <subdomain name="www.mysite2.com"/>
    </webapp>
</sphereCfg>
On startup the web engine checks each webapp in the supplied config against the entire set of spheres. Webapp names, the domains and subdomains the webapps serve, and the ports if used, must all be unique. Should anything clash, the web engine terminates. The next step, assuming no clashes, is to load the internal webapps (if any). Should any internal webapp fail to load, the web engine terminates. Assuming none do, the web engine proceeds to serve. No checks are performed on the configs of webapps hosted by proxy. These checks only happen when the web engine that will host them internally is started.
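The uniqueness rule can be sketched as a scan over every sphere that records each claimed name, host and port, and reports anything claimed twice. This is an illustration only. The claims and find_clashes functions are hypothetical, the sketch checks webapp-level ports only, and it assumes the spheres passed in are distinct files.

```python
import xml.etree.ElementTree as ET


def claims(sphere_xml):
    """Yield (kind, value) pairs claimed by a sphere: webapp names, hosts and ports."""
    root = ET.fromstring(sphere_xml)
    for app in root.iter("webapp"):
        yield ("name", app.get("name"))
        # Domains and subdomains are of equal standing, so both count as hosts.
        for tag in ("domain", "subdomain"):
            for node in app.iter(tag):
                yield ("host", node.get("name").lower())
        # Ports are only present where the webapp is hosted by proxy.
        for attr in ("portSTD", "portSSL"):
            if app.get(attr) is not None:
                yield ("port", app.get(attr))


def find_clashes(sphere_xmls):
    """Return the sorted list of claims made more than once across all spheres."""
    seen, clashes = set(), set()
    for xml_text in sphere_xmls:
        for claim in claims(xml_text):
            (clashes if claim in seen else seen).add(claim)
    return sorted(clashes)
```

An empty result corresponds to the case where the web engine would go on to load its internal webapps. Any entry in the list corresponds to a clash that would make it terminate.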
The Gospel According to OpenSSL
It may seem strange that the <domain> and <subdomain> subtags of the <webapp> tag are of equal standing. This is because it is assumed that SSL, i.e. HTTPS, will be required. With SSL, an insecure connection is established first, by means of a TCP handshake. Then, before any data transfer can take place, a secure connection is established by means of an SSL handshake. The conversation then ensues, exactly as it would with a non-SSL connection. During the SSL handshake, the client states which domain or subdomain it is trying to connect to, and the server sends the applicable certificate. The scopes of the various SSL certificates that are available are as follows:
Multi-domain: Covers any number of domains. Note that mydomain.org and mydomain.edu would be two different domains.
Wildcard: For a given domain, these cover any number of subdomains.
Single-name: This covers a specific domain or subdomain. In some cases it will cover the domain plus the www subdomain - so mydomain.com and www.mydomain.com.
Certificates are supplied as a set of three attributes: sslPvtKey (the private key), sslCert (the certificate file) and sslCertCA (the certificate authority file). Where the server has a multi-domain certificate, this will be sent in the SSL handshake in all cases, so it is set in the <sphereCfg> tag, in the Sphere file for the standard ports. If domain wildcard certificates are used, these are set in the <domain> tags. The certificate can also be set in the <domain> tag if it is single-name but covers both the domain and the www subdomain, and the webapp only serves the domain and the www subdomain. Where a single-name certificate only covers the named domain or a subdomain, the certificate must be set in the applicable <domain> or <subdomain> tag.
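The placement rules above imply a lookup order when the client names a host in the SSL handshake. The sketch below is one plausible reading of those rules, not the web engine's actual logic. The pick_cert function and its arguments are hypothetical, and certificates are represented by plain labels rather than key and certificate files.

```python
def pick_cert(server_name, sphere_cert, domain_certs, subdomain_certs):
    """Choose the certificate to send for the host named in the handshake.

    sphere_cert     -- certificate set on <sphereCfg>, or None
    domain_certs    -- mapping of domain name to certificate set on <domain>
    subdomain_certs -- mapping of subdomain name to certificate set on <subdomain>
    """
    name = server_name.lower()
    # A multi-domain certificate, where present, is sent in all cases.
    if sphere_cert is not None:
        return sphere_cert
    # A single-name certificate set on the exact <subdomain> or <domain> tag.
    if name in subdomain_certs:
        return subdomain_certs[name]
    if name in domain_certs:
        return domain_certs[name]
    # A certificate on the parent <domain> tag: wildcard, or domain plus www.
    parent = name.split(".", 1)[1] if "." in name else ""
    return domain_certs.get(parent)
```

Returning None in this model means no certificate has been configured for the requested host.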