OSDV Technical Architecture
From OSDVwiki
OSDV's technical architecture consists of:
- the top-level architecture for all of OSDV's technical development work;
- the system architecture for each of the 2 main software systems under development, OARS and OASES;
- component-level architecture for the major components of each system, particularly OASES component-level architecture, which consists mainly of SHARP, the common platform for the standalone devices that comprise OASES.
All three levels of architecture are covered here, but without much context. For more information, see the Master Development Plan.
Top-Level Architecture
The top-level technical architecture for OSDV consists of the 4 basic technical deliverables that will be the end result of OSDV's current development efforts, as shown in the Top-Level Architecture of OSDV Technical Deliverables diagram.
The top-level architecture features two distinct systems, standards-based data interchange between them, and standards-based data externalization for Web publication. The two systems are: OARS (Open Auditable voter Registration System) and OASES (Open Auditable Structured Election System). EML (Election Markup Language) is an existing standard XML schema for standard representation of election-related data. EAML (Election Audit Markup Language) is used for the externalization function mentioned above.
The externalization function is a critical aspect of OARS and OASES, which enable a substantial degree of public transparency and accountability by:
- logging every transaction performed in each system, and
- externalizing the log data for Web publication,
so that members of the public have access to searchable, importable, complete data documenting how a particular election was conducted.
As a result, the definition of EAML and the demonstration of its use will both be critical parts of the development plan.
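As a rough illustration of the two steps above (log every transaction, then externalize the log), here is a minimal Python sketch. Since EAML is still to be defined, the element and attribute names used here (`AuditLog`, `Transaction`, `actor`, `action`) and the `LogRecord` fields are placeholders invented for this example, not part of any standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One logged transaction (hypothetical fields, for illustration only)."""
    timestamp: str   # ISO 8601 text
    system: str      # "OARS" or "OASES"
    actor: str       # who performed the transaction
    action: str      # what was done


class TransactionLog:
    """Append-only, in-memory log; a real system would persist it."""

    def __init__(self):
        self._records = []

    def log(self, record):
        self._records.append(record)

    def externalize(self):
        """Serialize every record to an XML string suitable for publication."""
        root = ET.Element("AuditLog")  # placeholder name, not real EAML
        for r in self._records:
            ET.SubElement(
                root, "Transaction",
                timestamp=r.timestamp, system=r.system,
                actor=r.actor, action=r.action,
            )
        return ET.tostring(root, encoding="unicode")


log = TransactionLog()
log.log(LogRecord("2008-11-04T08:00:00", "OARS", "county-clerk-1", "update-address"))
log.log(LogRecord("2008-11-04T20:15:00", "OASES", "official-2", "consolidate-tallies"))
xml_text = log.externalize()
```

Because the output is plain XML, a publisher can parse it, index it, or import it into other tools, which is what "searchable, importable" requires of the real format.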
For data interchange, existing EML is used to some extent, but current EML is not sufficiently broad for the needs of OARS and OASES. As a result, the top-level architecture includes EML+, a set of draft standard data definitions that extend current EML. EML+ is used for data interchange between OARS and OASES. EML+ is also used for data interchange among the distinct components of each system, particularly OASES, in which the complexity of the typical current voting system has been reduced by modular architecture and component design.
The diagram shows how these 4 top-level components interact. OASES and OARS interoperate by means of EML+, and both externalize information in EAML, which can be published to the Web for public access.
OARS System Architecture
OARS architecture is that of a typical database-backed Web application for transaction processing. Clients use a Web browser to access the Web application, which implements a set of transactions on the VR DB itself, as well as support functions such as authentication, authorization, user management, logging, auditing, and reporting.
The main components of OARS are the elements of a standard open-source software stack for DB-backed web applications, e.g., Apache, Rails, MySQL. Embedded within this application framework are:
- OARS application logic for transactions on the VR DB;
- OARS logic for support functions;
- OARS logic for logging and auditing;
- OARS database itself, based on the OARS schema for VR records, transaction-related records, support function records, and log records.
The diagram shows this software stack and embedded elements, along with some of the principal data flows:
- County officials using a browser to use Web application interfaces to perform transactions on the VR DB;
- State officials using a browser to use Web application interfaces to manage the DVRS, and data interchange with other databases (DMV, etc.);
- Implementation of external data interchange operations.
The diagram also shows two further dataflows, both standards-based data interchange with external systems. In one case, the external system is a county voting system, such as OASES, that uses standard data formats such as EML to import election-related information such as an entire county’s set of VR records. In the other case, the external system consumes OARS’ log records on the operation of the DVRS, and publishes the data for public inspection, data mining, etc.
One additional architectural point to note in the diagram is the use of a browser to access the DVRS’s Web application. Certainly this could be done with any common Web browser software on county officials’ existing workstations. However, another element of OSDV’s technical architecture could be used instead: the Limited Open Browser Appliance (LOBE), described below.
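The pairing of application logic with logging logic described in this section can be sketched as a single database transaction that both changes a VR record and writes a log record. This is only an illustration: it uses Python's built-in `sqlite3` module in place of the Rails/MySQL stack named above, and the table names, column names, and the `change_address` function are all hypothetical.

```python
import sqlite3

# In-memory database standing in for the VR DB (schema is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vr_records (voter_id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("CREATE TABLE log_records (id INTEGER PRIMARY KEY, actor TEXT, action TEXT, voter_id INTEGER)")
conn.execute("INSERT INTO vr_records VALUES (1, 'Rivera', '1 Elm St')")
conn.commit()


def change_address(conn, actor, voter_id, new_address):
    """Update a VR record and log the change in one database transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE vr_records SET address = ? WHERE voter_id = ?",
            (new_address, voter_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"no voter with id {voter_id}")
        conn.execute(
            "INSERT INTO log_records (actor, action, voter_id) VALUES (?, ?, ?)",
            (actor, "change_address", voter_id),
        )


change_address(conn, "county-clerk-1", 1, "9 Oak Ave")
address = conn.execute("SELECT address FROM vr_records WHERE voter_id = 1").fetchone()[0]
log_count = conn.execute("SELECT COUNT(*) FROM log_records").fetchone()[0]
```

Writing the log record inside the same transaction means a change cannot be committed without its audit entry, and a failed change leaves no stray entry behind.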
OARS Component Architecture
OARS system architecture is entirely conventional, so there is little to say about the component-level architecture, other than to note that distinct OARS functions (e.g., the primary transactions, logging functions, external data interchange functions) are, as the diagram shows, implemented as separate modules.
On the client side, an ordinary browser might be used, but LOBE is preferred because a LOBE instance can be completely pre-configured for the communication security and application security requirements of OARS. LOBE's architecture is a variant of the toaster architecture of the OASES components. The main (and large) difference is that LOBE's platform may be a conventional open source OS, rather than the minimized embedded-application platform of OASES.
OASES System Architecture
OASES’ architecture is non-traditional, both as compared with existing voting systems and as compared with typical PC-based systems. The contrast with PC systems is in the Component Architecture, described below. The contrast with typical voting systems is that OASES is both smaller and more modular, with the election management functions decomposed into distinct fixed-function systems, and the number of devices reduced to a minimum. The comparison with existing systems is stark (see Typical Election System Architecture). Both of these contrasts are critical to OASES achieving its goals for a system that is simple, minimal, feasible to assess, modular, feasible to re-assess, trustworthy, and high assurance.
High Level Architecture: OASES Components
The OASES Architecture diagram shows the high-level architecture of OASES. One of the two main parts of OASES is the group of components that comprise the election management system (EMS) functions. (The typical concept of operation for an EMS is described as part of the Typical Election System Architecture.) Each critical EMS function is broken out into a separate component, so as to minimize the complexity of the code involved in such critical but simple functions as tabulation.
Besides the EMS components, the device components are the minimum number required: a polling-place counting optical scanner (PCOS) device that can scan hand-marked or electronically marked ballots; an electronic ballot marking (EBM) device that provides enhanced access functions for voters who need them for independent voting; and a central office ballot scanner that incorporates a number of post-election back-office functions for ballot scanning.
OASES Component Dataflow
The functions and dataflow of OASES’ nine components are best described initially in the context of the workflow in a typical election process (see Election Process Overview). Based on such a simplified election workflow, the workflow of OASES’ components can be readily seen in the next several diagrams.
The OASES Pre-Election diagram starts with the external input data to some of the OASES components for election management. These inputs are: voter registration records, precinct definitions, district definitions, and information about the contests and measures that the districts submit for inclusion on the ballot. Using these inputs, the OASES components perform the following basic functions:
- The OASES Precinct and District Manager (PDM) consumes a version of the precinct and district definitions that it previously computed.
- This previously defined database (of precinct and district definitions) basically maps addresses to precincts, aligning precinct boundaries with district boundaries.
- If human input is provided to modify them, then the PDM produces a new version.
- The PDM produces datasets used by the next two components.
- The OASES Voter Eligibility Manager component consumes voter registration records (from an external DVRS), as well as a precinct definition dataset computed by the PDM. These data are used to create pollbooks and datasets that represent their content in digital form – essentially mapping voter registration records to precincts.
- The OASES Contest and Ballot Manager (CBM) consumes a dataset of precinct and district relations (from the PDM) and human-supplied data about the contests and measures from each district.
- The CBM produces a dataset that defines the ballot style for each precinct.
- The CBM can optionally consume a ballot style dataset that it produced previously, in order for election officials to do updates.
- The OASES Ballot Studio consumes the ballot style definitions, creates initial ballot designs, and provides a user interface for the modification of ballot default designs, e.g., order of contests, order of candidates.
- After the ballot design work has been performed, there are two distinct representations of the resulting ballots: paper ballots, and ballot-definition datasets that become part of the programming of voting devices.
- The OASES System Builder component consumes those ballot definition/design datasets, and uses the data to produce the election-specific programming for each of the 3 OASES voting devices: the Electronic Ballot Marking device, the Precinct Optical Scanner Device, and the Central Optical Ballot Scanner.
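The first steps of the pre-election dataflow listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: real datasets would be exchanged in EML+ rather than as Python dictionaries, and the function names, data shapes, and sample values are all hypothetical.

```python
from collections import defaultdict


def build_pollbooks(voters, address_to_precinct):
    """Voter Eligibility Manager step: map voter records to precincts."""
    pollbooks = defaultdict(list)
    for name, address in voters:
        pollbooks[address_to_precinct[address]].append(name)
    return {precinct: sorted(names) for precinct, names in pollbooks.items()}


def build_ballot_styles(precinct_districts, district_contests):
    """Contest and Ballot Manager step: list the contests for each precinct."""
    styles = {}
    for precinct, districts in precinct_districts.items():
        contests = []
        for district in districts:
            contests.extend(district_contests.get(district, []))
        styles[precinct] = contests
    return styles


# Hypothetical outputs of the Precinct and District Manager (PDM).
address_to_precinct = {"1 Elm St": "P1", "9 Oak Ave": "P2", "3 Elm St": "P1"}
precinct_districts = {"P1": ["County", "School-A"], "P2": ["County"]}

# Hypothetical external inputs: VR records and per-district contests.
voters = [("Rivera", "1 Elm St"), ("Chen", "9 Oak Ave"), ("Adams", "3 Elm St")]
district_contests = {"County": ["Sheriff"], "School-A": ["School Board", "Measure A"]}

pollbooks = build_pollbooks(voters, address_to_precinct)
ballot_styles = build_ballot_styles(precinct_districts, district_contests)
```

Each function depends only on its input datasets, which mirrors the way the OASES components hand datasets to one another rather than sharing state.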
Further, the OASES Pre-Election Audit diagram shows that in addition to pollbooks, ballots, and programming, OASES operation also includes audit log data for all the activities of the OASES components used in this phase.
The OASES Polling diagram shows the role of OASES components in the election workflow in the polling place. The inputs to the process are pollbooks, paper ballots, and the ballot definition data that is part of the polling devices' programming. The outputs are marked ballots, tallies from the scanner devices, electronic log data, and other log data from the polling place, such as marked-up pollbooks.
The OASES Post-Polling diagram (OASES System Dataflow in Post-Polling Processes) shows the role of OASES components in the post-election process.
- The OASES Central Optical Ballot Scanner device is used by election officials to process vote-by-mail and provisional ballots.
- In addition to scanning and counting ballots, the central scanner also provides a user interface for officials to manage imperfect ballots that require interpretation.
- The output of a central scanner is a set of per-precinct vote tallies similar to those produced by PCOS devices, with the exception that PCOS devices produce tallies for only one precinct.
- The OASES Ballot Tabulation Manager (BTM) consumes these tally datasets, along with those produced by the PCOS devices. The BTM consolidates the tallies and produces election result reports, e.g., results of contests in districts entirely in the county; the county’s total vote contribution to contests in districts not entirely within the county; breakdown of results by precinct, polling-place, VBM, provisional, etc.
Lastly, both the central scanner device and the ballot tabulator produce audit log records of their operation by county officials. The log records, together with those of the other OASES components, can be consolidated into a single EAML dataset and published.
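The consolidation step performed by the tabulation component can be illustrated with a short Python sketch. The nested-dictionary tally format, the `consolidate` function, and the sample numbers are assumptions made for this example; the actual tally datasets would be in EML+.

```python
from collections import Counter


def consolidate(tally_datasets):
    """Merge per-precinct tallies from several devices.

    Each dataset maps precinct -> contest -> candidate -> votes.
    Returns (per-precinct totals, county-wide totals per contest).
    """
    by_precinct = {}
    by_contest = {}
    for dataset in tally_datasets:
        for precinct, contests in dataset.items():
            for contest, votes in contests.items():
                precinct_totals = by_precinct.setdefault(precinct, {})
                precinct_totals.setdefault(contest, Counter()).update(votes)
                by_contest.setdefault(contest, Counter()).update(votes)
    return by_precinct, by_contest


# Each PCOS device reports one precinct; the central scanner reports several.
pcos_p1 = {"P1": {"Sheriff": {"Lee": 120, "Ortiz": 95}}}
pcos_p2 = {"P2": {"Sheriff": {"Lee": 80, "Ortiz": 110}}}
central = {
    "P1": {"Sheriff": {"Lee": 30, "Ortiz": 25}},
    "P2": {"Sheriff": {"Lee": 10, "Ortiz": 5}},
}

by_precinct, by_contest = consolidate([pcos_p1, pcos_p2, central])
```

Keeping the function this small reflects the design goal stated earlier: tabulation is a critical but simple function, so the code that implements it should be easy to read and evaluate.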
Summary of OASES Data Flows
The OASES Inter-Component Dataflows diagram summarizes these dataflows among all the OASES components. The dataflows within a polling place are in the shaded box; all other operations occur in a county central elections office. The path of paper ballots is indicated with dotted lines, while the solid lines are the digital information flows described above.
OASES Component Architecture
A key part of OASES’ system architecture is the “toaster” architecture for OASES components. Each of these nine components is a “toaster,” that is, a fixed-function device that is similar to a computing “appliance,” but with the additional property that the toaster can’t be reprogrammed. Each OASES component consists of a single embedded application, running on a minimal embedded operating system. The OS is “minimized” because it provides all and only what is required to support the platform for the embedded application. Each embedded application uses the same application platform, which uses the same minimized OS. Further, the application platform is minimized to the limited needs of the suite of embedded applications.
The minimized OS, called SHARP (Sustainable High Assurance Re-usable Platform), is designed to run on hardware that meets simple specifications that enable the re-use of existing commodity PC hardware, printers, and scanners. An exception to this goal may be the EBM (electronic ballot marking) device, which may require custom industrial design for handicapped access, but would use the same basic commodity computing hardware.
The basic concept of the component architecture is that each of OASES’ components is a “toaster” consisting of commodity hardware, an instance of the same platform, and a distinct body of embedded application software. Each system boots from read-only removable media; the hardware itself is stateless, with the OASES component not relying on any data or code stored on the hardware (with the exception of embedded firmware in I/O device hardware). Each system requires one or more input datasets that are supplied by removable media. Likewise, output datasets are written on write-once removable media. Dataflow between OASES components is via these datasets: the output of one may be the input of another.
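The input-dataset and write-once-output pattern described above can be approximated in software for illustration. In this sketch, a temporary directory stands in for removable media, JSON stands in for the real dataset format, and the `run_component` function is hypothetical. Python's exclusive-creation file mode (`"x"`) is used to mimic write-once behavior; real write-once media would enforce this in hardware.

```python
import json
import os
import tempfile


def run_component(input_path, output_path, transform):
    """Read one input dataset, apply a fixed function, write the output once."""
    with open(input_path, "r", encoding="utf-8") as f:
        dataset = json.load(f)
    result = transform(dataset)
    # Mode "x" raises FileExistsError if the file already exists,
    # mimicking write-once media.
    with open(output_path, "x", encoding="utf-8") as f:
        json.dump(result, f)
    return result


media = tempfile.mkdtemp()  # stands in for removable media
in_path = os.path.join(media, "precincts.json")
out_path = os.path.join(media, "precinct_count.json")
with open(in_path, "w", encoding="utf-8") as f:
    json.dump({"P1": ["County"], "P2": ["County"]}, f)

first = run_component(in_path, out_path, lambda d: {"precincts": len(d)})

# A second attempt to write the same output dataset is refused.
try:
    run_component(in_path, out_path, lambda d: {"precincts": 0})
    overwrite_blocked = False
except FileExistsError:
    overwrite_blocked = True
```

The component keeps no state between runs: everything it needs arrives in the input dataset, and everything it produces leaves in the output dataset, which can then serve as the input of another component.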
The three voting device components of OASES differ from the others in that they require election-specific input data, which is provided as part of the boot image (so that only removable media is required to operate the system). The OASES System Builder component creates these boot media with the election-specific data. (In other systems, the term election “programming” is used, but this would be a misnomer for OASES because election-specific processing is entirely data driven.)
The OASES Component Data-Transfer diagram shows the inter-component dataflows, including the key concept of the data being conveyed by removable media, either media with datasets, or media with boot images for the three voting device components.
Component Architecture Elements
Figure N shows the key features of the OSDV toaster architecture. Running on commodity hardware, SHARP is OSDV’s minimal device operating system, built from standard components of an open-source OS kernel, together with a minimal number (perhaps close to zero) of non-kernel software packages other than the application platform. The application platform is an interpreter for the Python programming language, together with a minimal set of Python packages needed to support the embedded application suite. A particular version of Python, thin python, is used because of its minimized functionality and lower complexity and size. Each OASES component differs primarily in the Python software that implements the embedded application particular to that component.
It’s worth noting that the choice of Python is something we are still working on validating. Other interpreted languages are possible alternatives, as are embedded Java environments. Some of the reasons for the current use of Python relate to ease of evaluation. That is, a key goal for OASES is that the software be engineered with evaluation in mind. The choice of an interpreted language helps independent evaluators review source more easily than in a compiled language, because thin python is a smaller and simpler language, with simpler language semantics.
The choice of SHARP as the basis of a single application platform is also a choice made with independent evaluation in mind. Each OASES component differs only in the embedded application software. As a result, software evaluation consists of evaluating the platform software, plus evaluating each separate body of embedded application code. The platform provides the basic OS functionality for kernel separation, process isolation, code integrity, memory protection, etc. – all of which contribute to system assurance, and isolate the embedded application software from the rest of the system. This latter point is critical for re-evaluation and re-certification. When there is a modification of the embedded application software of a previously evaluated and certified system, the required re-certification can be based on a re-evaluation not of the entire OASES system, but just of the software of the embedded applications that changed. Separate, unchanged embedded application software, and all the platform software, need not be subject to a complete re-evaluation of source code, because the platform isolates the rest of the system from the changed software.
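The re-evaluation argument above amounts to a simple rule: only the bodies of code that changed since certification need a fresh source review. A minimal Python sketch of that rule follows, comparing SHA-256 digests of each code body against the digests recorded at certification. The function names, the component names, and the idea of a digest manifest are assumptions for illustration; the source does not specify how changes would be detected.

```python
import hashlib


def digest(source):
    """SHA-256 hex digest of a body of source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def needs_reevaluation(certified, candidate):
    """Return names of code bodies whose digest differs from the certified one."""
    return sorted(
        name for name, source in candidate.items()
        if certified.get(name) != digest(source)
    )


# Hypothetical code bodies: the shared platform plus two embedded applications.
release_1 = {
    "platform": "kernel+interpreter v1",
    "tabulator": "def tally(): ...",
    "scanner": "def scan(): ...",
}
certified = {name: digest(src) for name, src in release_1.items()}

# A later release changes only the tabulator application.
release_2 = dict(release_1, tabulator="def tally(): ...  # bug fix")
changed = needs_reevaluation(certified, release_2)
```

Here only the tabulator is flagged; the platform and the scanner application are byte-for-byte unchanged, so under the isolation argument above they would not need a complete re-evaluation of source code.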
Figure N shows all nine of OASES’ components as distinct instances of hardware plus platform software, differing from one another only in the embedded application software. Figure N shows the consequence that the evaluated software base is primarily the platform (perhaps a few hundred thousand lines of compiled OS and interpreter code), together with the embedded application code (each some thousands of lines of thin interpreted language code).

