HDB |
HadronZoo Database. |
Data Class: |
A data class is purely a data structure definition. It sets out the structure or form that all objects of the class must take. The data structure is defined
as a set of members, each of which has a data type and may hold a value, or an array of values, of that data type. Note that data classes are themselves
data types, so a member of one data class can have another data class as its data type. By this means data classes support hierarchy. |
Data Class Member: |
A data class member holds one or more values of a predefined data type. |
Data Class Object: |
An instance of a data class. Note that in this document, data class objects are usually referred to as "data objects". |
Data Object Member: |
A member of an individual data object. |
Atomic Data: |
Atomic data values (values of an atomic data type) are indivisible. Either they are fundamentally indivisible or they are treated as such for the purpose of
internal data processing and storage. |
Composite Data: |
Data classes are composite data types because they consist of members. The term 'composite data' simply means one or more data objects. For consistency,
the terminology applies even where a data class only has one member. |
Host/Guest Class: |
As data classes are data types and class members can be of any predefined data type, members of an object of a (host) data class can contain objects of
another (guest) data class. Indeed this is the means by which data classes are hierarchical. Note that because members can only be of a predefined data
type it is not possible to define a data class with a member whose data type is itself. |
Simple/Complex: |
A simple data class is one in which all members are atomic, and so has no guest classes. The term 'simple' is preferred over 'atomic' because an "atomic
data class" would be very confusing! A complex data class is, obviously, one with one or more guest class members. |
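The host/guest and simple/complex distinctions can be sketched as follows. This is a minimal illustration only: the names `Member`, `DataClass` and `TypeRegistry`, and the set of atomic types, are assumptions made for the example and are not part of the HDB class library. Because a class is registered only after all of its member types are found in the registry, a class cannot name itself as a member type.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// All names here are hypothetical; this is not the HDB class library.
struct Member {
    std::string name;      // member name
    std::string typeName;  // name of an atomic type or of another data class
    bool isArray;          // true if the member may hold an array of values
};

struct DataClass {
    std::string name;
    std::vector<Member> members;
};

class TypeRegistry {
public:
    // Assumed set of atomic types, for illustration only.
    TypeRegistry() : atomic_{"int32", "double", "string"} {}

    bool isKnown(const std::string& typeName) const {
        return atomic_.count(typeName) != 0 || classes_.count(typeName) != 0;
    }

    // Registers a data class. Every member type must already be registered,
    // which also rules out a class naming itself as a member type.
    bool define(const DataClass& dc) {
        if (isKnown(dc.name)) return false;
        for (const Member& m : dc.members)
            if (!isKnown(m.typeName)) return false;
        classes_[dc.name] = dc;
        return true;
    }

    // A simple class has only atomic members; otherwise it is complex.
    // Returns false for a class name that has not been registered.
    bool isSimple(const std::string& className) const {
        auto it = classes_.find(className);
        if (it == classes_.end()) return false;
        for (const Member& m : it->second.members)
            if (atomic_.count(m.typeName) == 0) return false;
        return true;
    }

private:
    std::set<std::string> atomic_;
    std::map<std::string, DataClass> classes_;
};
```

In this sketch a hypothetical `Person` class with a member of type `Address` is the host and `Address` is the guest, so `Person` is complex while `Address` is simple.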
Repository: |
An unordered persistent data store which issues an id on receipt of a datum, and retrieves the datum on receipt of its id. |
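The issue-an-id, fetch-by-id contract can be sketched as below. This is an in-memory illustration only: persistence is deliberately omitted, and the class name `RepositorySketch`, the use of strings as datums and the choice of ids starting at 1 are all assumptions made for the example. Entries are held in order of arrival, which is what 'unordered' means here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; not the HDB repository API. Persistence is omitted.
class RepositorySketch {
public:
    // Stores the datum and returns the id issued for it (ids start at 1).
    std::uint32_t insert(std::string datum) {
        store_.push_back(std::move(datum));
        return static_cast<std::uint32_t>(store_.size());
    }

    // Copies the datum with the given id into 'out'.
    // Returns false if no datum has been issued that id.
    bool fetch(std::uint32_t id, std::string& out) const {
        if (id == 0 || id > store_.size()) return false;
        out = store_[id - 1];
        return true;
    }

private:
    std::vector<std::string> store_;  // entries in order of arrival
};
```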
Data Object Repository: |
An unordered persistent data store, serving as a mass container of data objects of a given predefined class. |
Binary Repository: |
An unordered persistent data store, serving as a mass container of binary data. |
Index: |
An ordered data store, persistent or otherwise. |
Ordered/Unordered: |
Within the context of data stores, ordered means that entries are ordered by value. Unordered, however, does not mean that entries are in random order, but
that they are held in order of arrival (i.e. in chronological order). |
RAM Primacy (Method): |
High performance data storage method: The data is memory resident but backed to persistent media. During operation, new and changed data are written as
deltas to a delta file. On startup, the last known data state is reconstructed from the stored deltas. The memory resident data is (usually) arranged in
a form suitable for direct operation within the program, so the RAM is considered to be the primary data store. The persistent media (delta file) is
thus secondary. It holds the exact same data, but as deltas in order of occurrence, a form unlikely to be suitable for direct operation. Because of high
memory consumption, RAM Primacy is usually reserved for high value data that is expected to be accessed with high frequency. |
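The method can be sketched as a map held in RAM, with every change appended to a delta file that is replayed at startup. This is a minimal illustration under stated assumptions: the class name `RamPrimaryStore` and the one-line-per-delta text format (`S<TAB>key<TAB>value` to set, `D<TAB>key` to delete) are invented for the example and are not HadronZoo Delta Notation.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch of the RAM Primacy method; the delta format is assumed.
// Keys must not contain tabs, and neither keys nor values may contain newlines.
class RamPrimaryStore {
public:
    // On startup, rebuild the last known state by replaying stored deltas,
    // then reopen the delta file for appending.
    explicit RamPrimaryStore(const std::string& deltaPath) : path_(deltaPath) {
        std::ifstream in(path_);
        std::string line;
        while (std::getline(in, line)) apply(line);
        out_.open(path_, std::ios::app);
    }

    // Each change is appended to the delta file and applied to RAM.
    void set(const std::string& key, const std::string& value) {
        out_ << "S\t" << key << '\t' << value << '\n';
        out_.flush();
        data_[key] = value;
    }

    void erase(const std::string& key) {
        out_ << "D\t" << key << '\n';
        out_.flush();
        data_.erase(key);
    }

    // Reads are served from RAM only. Returns nullptr if the key is absent.
    const std::string* get(const std::string& key) const {
        auto it = data_.find(key);
        return it == data_.end() ? nullptr : &it->second;
    }

private:
    // Applies one stored delta line to the in-memory state.
    void apply(const std::string& line) {
        std::istringstream ss(line);
        std::string op, key, value;
        std::getline(ss, op, '\t');
        std::getline(ss, key, '\t');
        std::getline(ss, value);
        if (op == "S") data_[key] = value;
        else if (op == "D") data_.erase(key);
    }

    std::string path_;
    std::map<std::string, std::string> data_;  // primary store (RAM)
    std::ofstream out_;                        // secondary store (delta file)
};
```

Note that the delta file only ever grows in this sketch; a real device would need some means of compacting it.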
RAM Primacy (Device): |
Any repository, index, or other program entity that exploits the RAM Primacy method is a RAM Primacy device. |
Delta Notation (HDN): |
HadronZoo Delta Notation is formal shorthand for writing out data state deltas to delta files. Initially HDN described new data objects, changes in data
objects, and data object deletions, but nothing beyond data objects. It was realized that deltas could be sent to other physical servers, for additional
backup and to enable server redundancy strategies, but only if all data resources in applications were covered. Accordingly, delta notation was extended
to cover all forms of data resources found in applications. |
Binary Data/Datum: |
The term 'binary data' is generally understood as non-text data, usually encoded in some way, e.g. Microsoft Word documents, images and video clips. The
term often implies that binary data are opaque from the perspective of the program in hand, i.e. they are simply atomic values. In the HadronZoo realm,
owing to the preponderance of RAM Primacy, the term has an additional meaning. Particular data sources (e.g. fields in a webapp form) are deemed binary
in order to store the data in a binary repository. This is usually because the data source does not warrant RAM Primacy. |
Data Resource: |
Generic term for any data storage entity (repository, index or file), with a specified purpose within the applicable data object model. |
AANI: |
Acronym for the HadronZoo design directive "Always append, never insert", meaning all writes to persistent media must be to the file end. The directive
is not universal, applying only to data containers during 'normal operation'. More specifically, it applies to the INSERT and UPDATE operations of data
containers (both unordered and ordered). Any disorder arising in the data files of such containers is rectified by periodic rationalization. |
Datacron: |
In AANI compliant data containers, data state changes appear in the data files in order of occurrence. This, in conjunction with the very common practice
of recording the date and time alongside such events, means that the data files form a chronological record of all data state changes since creation,
or at least since the last rationalization. This observation gave rise to the term 'datacron' to describe AANI compliant data containers. Strictly speaking
the firmly established term is a misnomer, as there is no requirement within AANI to record the time and date. The single most important datacron in the
HadronZoo repertoire is the delta file. |
Serial Datacron: |
The Serial Datacron is the single most important RAM Primacy device in the HadronZoo repertoire. The RAM component can be an idset, or a collection held
by any of the collection class templates. The persistent media component is a delta file. |
ISAM: |
ISAM (Indexed Sequential Access Method) is a method of indexation characterized as a tree with two distinct node types: index and data. The top level of
the tree may only contain data nodes, which contain key/element pairs. The lower levels may only contain index nodes, whose entries point to the nodes
in the level above. ISAM can be implemented in RAM, or in persistent media as a composite datacron index. |
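The two node types can be illustrated with a small in-RAM sketch. It is a simplification under stated assumptions: there is a single index level (one index entry per data node, holding that node's lowest key), keys are integers, elements are strings, and the structure is built once from input already sorted by key. The names `IsamSketch`, `DataNode` and `Entry` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-level ISAM sketch; not the HDB implementation.
using Entry = std::pair<int, std::string>;  // key/element pair

struct DataNode {
    std::vector<Entry> entries;  // whole entries only, sorted by key
};

class IsamSketch {
public:
    // 'sorted' must already be ordered by key, with unique keys.
    IsamSketch(const std::vector<Entry>& sorted, std::size_t nodeCapacity) {
        if (nodeCapacity == 0) nodeCapacity = 1;
        for (std::size_t i = 0; i < sorted.size(); i += nodeCapacity) {
            std::size_t end = std::min(i + nodeCapacity, sorted.size());
            DataNode node;
            node.entries.assign(sorted.begin() + static_cast<std::ptrdiff_t>(i),
                                sorted.begin() + static_cast<std::ptrdiff_t>(end));
            index_.push_back(node.entries.front().first);  // lowest key in node
            data_.push_back(std::move(node));
        }
    }

    // Index level selects the data node; data level yields the element.
    // Returns nullptr if the key is not present.
    const std::string* find(int key) const {
        // Find the last data node whose lowest key is <= key.
        auto it = std::upper_bound(index_.begin(), index_.end(), key);
        if (it == index_.begin()) return nullptr;
        const DataNode& node = data_[static_cast<std::size_t>(it - index_.begin()) - 1];
        auto e = std::lower_bound(node.entries.begin(), node.entries.end(), key,
                                  [](const Entry& a, int k) { return a.first < k; });
        if (e == node.entries.end() || e->first != key) return nullptr;
        return &e->second;
    }

private:
    std::vector<int> index_;      // index level: lowest key of each data node
    std::vector<DataNode> data_;  // data level
};
```

A larger index would add further index levels above this one in the same manner, each pointing to nodes of the level it indexes.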
Indexed Chain: |
Indexed chain is an ISAM variant, adapted for variable length elements. The index level nodes are similar to those of standard ISAM, but the data level
is structured as an ordered concatenation of elements. This concatenation is held in a chain of virtual blocks, which elements can span. |
Node/Block: |
Within the context of ISAM and indexed chain but also more widely, logical or physical data store units are described as nodes if they must contain an
integer number of whole entries, and blocks if they are not subject to this constraint. |
Idset: |
Set of unique integers, encoded for compactness. As a result of the encoding, the integers always appear in ascending order. Idsets are used in indexes
to store data object ids, hence the name. |
Native Data Class: |
Certain data model and application specific entities (e.g. Repositories, Dissemino form definitions and form handlers), either can or must be tied to a
data class. Where this is the case, the data class is said to be the "native data class" of the entity. |
Alien Data Class: |
From the perspective of an entity that has a native data class, an alien data class is any other than the native, or any guest class of the native. |
hdb/hds mnemonic: |
The HDB classes form a distinct group within the HadronZoo C++ class library, so are assigned names prefixed with 'hdb' for "HadronZoo Database", rather
than the default HadronZoo mnemonic of 'hz'. The other distinct group is the Dissemino classes, which are prefixed with 'hds' for "HadronZoo Dissemino". |
Program: |
HadronZoo defines a program as a "distinct and whole executable entity, with a singular space confined by the machine boundary". |
Service: |
Services accept client connections, and process and respond to client requests. In strict terms, a service is the functionality available at a given URL
and port. However, HTTP/S websites can be regarded as a single service if, as is usually the case, the functionality available at both ports is the same.
Note that a service or set of services can be provided by a single distinct and whole server program, OR the service(s) may be the result of a number of
programs working together, possibly across a multitude of machines. |
Microservice (SRM): |
Microservices, as considered by HadronZoo, are completely self-contained server programs which perform a simple primary function and do little or nothing
else. Although the function could be anything (e.g. generate a unique id), most HadronZoo microservices make available a single data object repository,
hence the term 'single repository microservice' or SRM. |
Application: |
Applications are usually described as "a program with a user interface" or similar. In HadronZoo parlance however the term 'application' means a program
adhering to a particular configuration. |
Webapp/Website: |
A website is an HTTP and/or HTTPS service, and so is a webapp (web application). The difference (in HadronZoo terms) is that a website consists of one or
more webapps. Thus, a website can be composite whereas a webapp is always non-composite. This is true even where a webapp is implemented across multiple
machines, as it is the result of the Dissemino method applied to a single config. |
LAMP Method: |
Acronym for Linux, Apache, MySQL and PHP/Python/Perl. |
Sister (entity): |
Identical entity manifest on another machine. Applies to applications and data resources, within the context of the delta server. |
Control Panel: |
The control panel method is where HTTP server programs build their HTML responses by aggregating hard coded print commands to a chain buffer. The method
was used extensively prior to the development of the Dissemino method, and is still used to produce stats reports. The method is so-called because it is
prone to producing pages with "all the grace of home router control panels". Pages produced by the method are known as control panels. |
Serialized Integer: |
Serialized Integers are extensively used by HadronZoo as a space saving device. Four regimes are offered in the HadronZoo library: 32 and 64 bit, signed
and unsigned. In serial form 32-bit integers consume between 1 and 5 bytes while 64-bit integers consume between 1 and 9 bytes. The obvious anticipation
is that most values will consume fewer bytes than the 4 or 8 bytes that would otherwise be required. In all regimes, if the top bit of the first byte is
0, the series is a single byte with 7 data bits (range 0-127). In the 32-bit unsigned regime, if the top bit of the first byte is 1, the next two bits act
as controls, so control codes are either 00, 01, 10 or 11. Codes 00, 01 and 10 correspond to a 2, 3 or 4 byte series, respectively having 13, 21 and 29
data bits. 13 bits has a range of 0-8,191 but as a 2-byte series is never used to express values of less than 128, values are interpreted as 128-8,319.
Likewise a 3-byte series has a range of 8,320 to 2,105,471 and a 4-byte series has a range of 2,105,472 to 538,976,383. Code 11 indicates a full 32-bit
value is provided in the next 4 bytes. In the 64-bit regimes there are three control bits for range, and in signed regimes an extra control bit is used
to indicate negative numbers. |
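The 32-bit unsigned regime can be sketched as below. The control codes and offset ranges follow the description above, with the 4-byte range ending at 2,105,471 + 2^29 = 538,976,383. Several details are assumptions made for the example, as the text does not specify them: the five data bits of the first byte are taken to be the most significant, following bytes are taken in big-endian order, and for code 11 the unused low bits of the first byte are written as zero. The function names `encodeU32` and `decodeU32` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the 32-bit unsigned regime; bit and byte order are assumed.
const std::uint32_t kBase2 = 128;        // first value needing 2 bytes
const std::uint32_t kBase3 = 8320;       // 128 + 2^13
const std::uint32_t kBase4 = 2105472;    // 8320 + 2^21
const std::uint32_t kBase5 = 538976384;  // 2105472 + 2^29

// Appends the serialized form of v to out (1 to 5 bytes).
void encodeU32(std::uint32_t v, std::vector<std::uint8_t>& out) {
    if (v < kBase2) {  // top bit 0: single byte, 7 data bits
        out.push_back(static_cast<std::uint8_t>(v));
        return;
    }
    std::uint32_t ctrl, payload;
    int extra;  // number of bytes following the first
    if (v < kBase3)      { ctrl = 0; payload = v - kBase2; extra = 1; }
    else if (v < kBase4) { ctrl = 1; payload = v - kBase3; extra = 2; }
    else if (v < kBase5) { ctrl = 2; payload = v - kBase4; extra = 3; }
    else {  // code 11: full 32-bit value in the next 4 bytes
        out.push_back(0xE0);
        for (int shift = 24; shift >= 0; shift -= 8)
            out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
        return;
    }
    // First byte layout: 1 cc ddddd (flag, control code, top 5 data bits).
    out.push_back(static_cast<std::uint8_t>(
        0x80 | (ctrl << 5) | ((payload >> (8 * extra)) & 0x1F)));
    for (int i = extra - 1; i >= 0; --i)
        out.push_back(static_cast<std::uint8_t>((payload >> (8 * i)) & 0xFF));
}

// Decodes one value from p. Returns bytes consumed, or 0 if len is too short.
std::size_t decodeU32(const std::uint8_t* p, std::size_t len, std::uint32_t& v) {
    if (len == 0) return 0;
    const std::uint8_t b0 = p[0];
    if ((b0 & 0x80) == 0) { v = b0; return 1; }
    const unsigned ctrl = (b0 >> 5) & 0x03;
    if (ctrl == 3) {
        if (len < 5) return 0;
        v = (static_cast<std::uint32_t>(p[1]) << 24) |
            (static_cast<std::uint32_t>(p[2]) << 16) |
            (static_cast<std::uint32_t>(p[3]) << 8) |
            static_cast<std::uint32_t>(p[4]);
        return 5;
    }
    const std::size_t extra = ctrl + 1;
    if (len < extra + 1) return 0;
    std::uint32_t payload = b0 & 0x1F;
    for (std::size_t i = 1; i <= extra; ++i)
        payload = (payload << 8) | p[i];
    static const std::uint32_t base[3] = {kBase2, kBase3, kBase4};
    v = payload + base[ctrl];  // add back the offset of the range
    return extra + 1;
}
```

Under these assumptions the value 128 is written as the two bytes 0x80 0x00, and 8,319 as 0x9F 0xFF.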
Length Indicator: |
A length indicator is a serialized integer, used to indicate the length (size) of an entity. By convention, length indicators directly precede the entity
in question. Also by convention, the stated length does not include that of the length indicator itself. As entity length should not be unduly large and
cannot be negative, length indicators use the 32-bit unsigned serialized integer regime. |
JSON: |
JavaScript Object Notation. Widely used to represent hierarchical data objects. Discussed in article 5.4 "Data Encoding". |
EDO: |
Encoded Data Object: Discussed in article 5.4 "Data Encoding". |
Cstr: |
Null terminated character string. |
XML-esce: |
The HadronZoo term for "near or similar to" XML (Extensible Markup Language). HadronZoo uses XML-esce, rather than formal XML, for configuration. The tag
rules are exactly the same but there are no DTD (Document Type Definition) files involved. All the tags used are expected by the reader program. |
VTTO: |
Acronym for Volume/Traffic Tradeoff, pronounced as 'VETO'. VTTO metrics are important in RAM Primacy cost-benefit analysis, and in the related matter of
system server topology. VTTO evaluation considers the relationship between a body of data and the user action (traffic) that accesses and operates upon
it. The tradeoff is between the costs and benefits of RAM Primacy, i.e. memory consumption vs performance. |
GDPR: |
General Data Protection Regulation