The approach taken to converting the PHP articles into Drupal records reflects the biases of a programmer seeking to avoid manually copying portions of those articles into the content management system (CMS). The challenge was whether the conversion could be carried out almost entirely by program, notably so as to extract further information and build it into other records for import into the CMS. The intention was therefore to build many of the CMS records prior to importing them, rather than to generate additional content types within the CMS itself. The approach was framed in this way because of extensive expertise in manipulating text with a DOS-based application -- and little expertise in the PHP-related programming required for the CMS.
The approach was first successfully implemented using a suite of programs developed with the Advanced Revelation (AREV) application, run in a DOS box under Windows XP. The resulting pre-formatted flat files (CSVs) were then imported into Drupal 6 and later into Drupal 7. Some ingenuity was required to circumvent AREV's 64k constraint on record length for documents of greater size. Cessation of support for Windows XP encouraged a shift to OpenInsight, a later variant of AREV operating on Windows 10. The upgrade to the Kairos site in December 2018 was enabled using OpenInsight 9.4, subsequently upgraded to OpenInsight 10.04.
Article formatting: Advantage has been taken of the fact that the original PHP articles were in a format which had remained standard and relatively stable over decades, both for the earliest articles (from the 1960s) and for those written since the articles were first placed on a website (in HTML format) in the early 1990s. Articles from earlier periods were adapted to that format as they were digitized. The key factor enabling conversion was the presence in those files of HTML title delimiters defining the sub-titles of what could then be split out as separate Drupal records. The conversion challenge was defined so as to avoid any additional mark-up that would otherwise be required to facilitate the process, relying instead on programming "tricks" to circumvent anomalies. This could well be described as a less than efficient process (if not stupid!), but it did offer some nice programming challenges for someone anxious to avoid manual manipulation (at all costs!).
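The splitting of an article on its title delimiters can be illustrated with a minimal Python sketch. It is not the AREV/OpenInsight code actually used. It assumes that the sub-titles are marked with `<h2>` or `<h3>` tags (the original files may use other heading levels), and the name `split_on_headings` is hypothetical.

```python
import re

# Assumption: sub-titles are marked with <h2> or <h3> tags.
HEADING_RE = re.compile(r"<h([23])[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)


def split_on_headings(html):
    """Return (subtitle, body) pairs, one per heading-delimited segment."""
    segments = []
    matches = list(HEADING_RE.finditer(html))

    # Text before the first heading is kept as an untitled introduction.
    intro_end = matches[0].start() if matches else len(html)
    intro = html[:intro_end].strip()
    if intro:
        segments.append(("", intro))

    for i, match in enumerate(matches):
        # A segment runs from the end of its heading to the next heading.
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        title = re.sub(r"<[^>]+>", "", match.group(2)).strip()  # drop inline tags
        body = html[match.end():body_end].strip()
        segments.append((title, body))
    return segments
```

Each pair could then be written out as one row of the CSV prepared for import. A regular expression is adequate here only because the article format is described as stable; irregular mark-up would call for a proper HTML parser.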
Retaining relationship to original version: The conversion challenge was also seen as a means of preserving a degree of complementarity between the PHP articles in the Laetus facility and the variant on the Kairos facility. The intention was not to switch to writing articles within the Drupal CMS, or updating them there, since it has been far more convenient to continue the process of writing/editing of the PHP variants within the Laetus facility using Dreamweaver. This was one reason for using record (node) identifiers within the CMS based on the original PHP file/folder name -- rather than switch to a numeric node identifier as is most commonly the case for a CMS.
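As an illustration only, an identifier of this kind might be derived from a file path as in the Python sketch below. The exact naming scheme used on the Kairos site is not stated here, so the rule shown (folder name plus file stem, lower-cased, with other characters replaced by underscores) is an assumption, and `node_id_from_path` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath


def node_id_from_path(php_path):
    """Derive an alphanumeric node identifier from a PHP file path.

    Assumed rule (not the site's documented scheme): immediate folder
    name + file stem, lower-cased, non-alphanumerics collapsed to "_".
    """
    path = PurePosixPath(php_path)
    parts = [path.parent.name, path.stem] if path.parent.name else [path.stem]
    raw = "_".join(parts).lower()
    return re.sub(r"[^a-z0-9]+", "_", raw).strip("_")
```

Because the identifier is computed from the file location alone, regenerating the records from the Laetus master copy yields the same identifiers on every run.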
Note that it is the Laetus version which is considered to be the document master copy. Only minor editing is done exceptionally on documents in the Drupal context -- most notably in the event of conversion issues which have not been resolved by program. Errors emerging in the Drupal version may well be used as a means of detecting and correcting errors in the Laetus version or in the conversion scripts.
Constructing records for import: Building the various CMS record types prior to import, rather than depending on (absent) Drupal skills to manipulate the basic imported documents, has meant that new record types can be created and populated as required in order to enhance the CMS facility. Of particular interest are those relating to the pattern of links.
Drupal node import: Of interest in the strategy adopted is the constraint imposed by the state of development of the Drupal "node import facility". Basically the options available for updating any node of a particular content type by the import process are to delete such nodes individually, or in a batch process (VBO) -- and then to import the corrected set of nodes into the IDs thereby made "free". The provisions for "overwriting" a node, without prior deletion, have been progressively developed within the Drupal community -- but primarily for numeric nodes. In practice this means that it is easy to batch delete all the bibliographic reference records, or the associated author records, and then to re-import a set generated from PHP articles after corrections (in the light of errors that became apparent from Drupal sorts). The advantage of the alphanumeric node naming system is that the links to the other documents are not affected by this process since the pointers from those documents remain valid.
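The reason the pointers remain valid can be shown with a small Python simulation. It does not use any Drupal API: the dictionary `store` merely stands in for the CMS, and the record fields (`id`, `type`, `links`) and function names are hypothetical.

```python
def reimport(store, content_type, corrected_records):
    """Delete every node of one content type, then import the corrected set."""
    for node_id in [k for k, v in store.items() if v["type"] == content_type]:
        del store[node_id]
    for record in corrected_records:
        # Identifiers come from the source files, so the corrected records
        # land on the same keys that were just freed.
        store[record["id"]] = record
    return store


def broken_links(store):
    """List (source, target) pairs whose target node no longer exists."""
    return [
        (source, target)
        for source, node in store.items()
        for target in node.get("links", [])
        if target not in store
    ]
```

With auto-assigned numeric identifiers, the re-imported records would typically receive new numbers, and `broken_links` would report every pointer into the deleted set.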
Upgrading and adaptation:
Responsive requirements of tablets: There has been considerable pressure to adapt documents to the requirements of responsive web design for a range of browsing devices and platforms. Various steps have been taken over the period 2018-2019 to achieve this for both the original Laetus documents and the Kairos variant. Particular attention has been required to the formatting of images. A specific issue has been encountered in the presentation of responsive image maps in a number of documents. The provisional solution -- unfortunately -- has been to disable that facility.
Animations: Documents may include a variety of animations:
Character encoding issues: Working with articles published as early as the 1960s, making extensive use of accented characters, has required a degree of flexibility in adapting the conversion to handle characters which pre-date the currently favoured UTF-8 standard. Some of the articles are in French and other languages. Many cite authors and articles requiring such characters. Although the conversion process enabled some of these anomalies to be "corrected", these issues have not been completely resolved and are reflected in the contrasting ways in which Drupal itself handles such encoding in different contexts.
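One common way of handling such mixed encodings is to try UTF-8 first and fall back to a legacy code page. The Python sketch below is an illustration, not the method used in the conversion. It assumes that the older files are in Windows-1252, which is a guess; files produced under DOS might instead use a code page such as cp850. The name `decode_legacy` is hypothetical.

```python
def decode_legacy(raw, fallback="cp1252"):
    """Decode bytes as UTF-8, falling back to an assumed legacy code page."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the fallback
        # encoding. Undefined bytes become U+FFFD rather than raising.
        return raw.decode(fallback, errors="replace")
```

The order matters: legacy accented text is rarely valid UTF-8 by accident, whereas almost any byte sequence decodes "successfully" (and wrongly) as a single-byte code page.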
Augmenting access possibilities: A major motivation for exploring a CMS variant was to segment the longer PHP files (some over 150k) into more "readable" forms as CMS records. This was seen as particularly valuable in that the sub-titles attributed to the HTML title-delimited segments were interesting to extract in order to benefit from the Drupal Views facility, in addition to enabling more specific access via search engines.
Benefitting from extensive hyperlinking: A significant characteristic of the PHP articles is the degree of hyperlinking between them. The conversion was designed to derive further information from this pattern of links, notably by generating "checklists" of citations "from" and "to" the CMS records. Unfortunately, as noted above, the links have been enabled to the "main" document introducing a set, and not to the individual documents of the set. This could be improved in the future.
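The derivation of "from" and "to" checklists amounts to inverting the pattern of links. A minimal Python sketch follows, assuming that each article is keyed by its file stem and that internal links take the form `href="name.php"`. Both are simplifying assumptions, and `build_checklists` is a hypothetical name.

```python
import re
from collections import defaultdict

# Assumption: internal links look like href="name.php", optionally
# followed by an anchor or query string.
HREF_RE = re.compile(r'href="([^"#?]+)\.php', re.IGNORECASE)


def build_checklists(articles):
    """Return (cites_to, cited_from) mappings for {node_id: html}."""
    cites_to = {}
    cited_from = defaultdict(set)
    for node_id, html in articles.items():
        # Keep only links that point at another known article.
        targets = {
            t for t in HREF_RE.findall(html) if t in articles and t != node_id
        }
        cites_to[node_id] = sorted(targets)
        for target in targets:
            cited_from[target].add(node_id)
    return cites_to, {k: sorted(v) for k, v in cited_from.items()}
```

Each mapping could then be exported as its own record type, giving every document a list of the documents it cites and a list of those citing it.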