Philosophical Research:Data model
What are Items, Lexemes, and Ontology pages? Perhaps you have already found technical descriptions of how each of these things behaves, but would still like to know what their intended purpose is.
First of all, Lithographica is an ontology. It is not a "page", "book", or "text" composed of linguistic utterances; instead, it is primarily a mathematical graph made up of points connected by arrows. Studying natural processes, or meaning within particular texts, is carried out by assigning each separable object or image a point and drawing particular kinds of arrows between them, much as an astronomer might keep track of particular groups of stars by drawing a constellation. By drawing many arrows, we become able to use simple building blocks to replicate outwardly complex structures and processes, such as all the bone names in a typical bird skeleton, all the stars and planets in a particular star system, or all the separate earldoms in 1000s England.
In terms of the digital tools used to encode this "arrow method", RDF-style data structures encode each relationship between things. Given some concept A and some concept B, we can term the arrow between them the predicate and begin sorting different kinds of relationships into different kinds of RDF properties. Some arrows describe literal information such as names or measurements, while other arrows describe specific kinds of structural relationships such as a book belonging to a series. RDF-style data frameworks can then mark particular kinds of nodes as generally containing certain kinds of arrows, like a book series generally containing books. This leads to the concept of Resources or Items which are said to conform to particular schemas or data structure classes. This is the basis of the Wikibase Item structure, the Wikibase Property structure which is used to model arbitrary new RDF-style properties, and the Wikibase Lexeme structure which applies the concepts of the Wikibase Item class to model elements of particular human languages, typically in their written form.
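As a rough illustration of the "arrow method", each arrow can be written as a (subject, predicate, object) triple. The sketch below is not how Wikibase stores data internally; the node names (book_1, series_1) and predicate names (title, part_of_series, instance_of) are made up for the example.

```python
from collections import defaultdict

# Each arrow is one (subject, predicate, object) triple.
# All node and predicate names here are hypothetical illustrations.
triples = [
    ("book_1", "title", "An Example Volume"),        # arrow to a literal value
    ("book_1", "part_of_series", "series_1"),        # structural arrow
    ("book_2", "title", "Another Example Volume"),
    ("book_2", "part_of_series", "series_1"),
    ("series_1", "instance_of", "book_series"),
]


def outgoing(node):
    """Group the arrows leaving one node by predicate."""
    arrows = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == node:
            arrows[predicate].append(obj)
    return dict(arrows)


def members_of(series):
    """Follow 'part_of_series' arrows backwards to list the books in a series."""
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == "part_of_series" and obj == series
    )
```

Calling members_of("series_1") walks the arrows in reverse, which is the sense in which a book series "generally contains" books even though each arrow is stored on the book.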
Lithographica is a bit like building Wikipedia backwards. Instead of starting from broad concepts and working down into fine-grained sections, the goal is to work up from the most elementary and easily observable concepts and build progressively larger concepts or statements, which in some cases receive their own wiki pages acting as human-readable summaries of the mathematical Item relationships. Wiki pages appear eventually, as the core ontological models (designed mostly independently of language) snap together, solidify, and thus become commonly understood and easy to describe in natural language.
In theory, Ontology pages may become localized at some point as the project grows, such that each node deemed interesting enough for a summary has a summary in any number of natural languages. Early on, many areas of the ontology including Lexemes have focused on studying texts in English, Japanese, or German, but there is no particular reason for this other than the desire to centralize parallel models of the same concept in the same place, which in the case of Lexemes is the language-separated term. Here we encounter a minor conflict between the Lithographica use of Lexemes for concept disambiguation and the ontolex definition of Lexemes as being separated by language.
- [1] - SQL tables versus RDF. I just like this page, I think it's neat
Elementary Items
these Entities often have the purpose of linking to descriptions of elementary observable concepts in other databases such as Wikidata, Wikipedia, and Fandom wikis.
at times, elementary Items can form their own definitions through set-theory Properties modeling an object's structure: nucleon - consists of components - quark - at order of magnitude - on average 3 (quantum physics).
Sign Entities - these have been under consideration as an improvement on Wikibase Items. currently it appears that they will not be implemented as a new data structure, but may return later in the form of an extension to name particular Wikibase predicate-statements and tag them with their own RDF Resource classes.
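The nucleon example above can be read as one statement plus one qualifier. The sketch below assumes that reading (subject, Property, value, then a qualifier Property and its value); the Statement class and its field names are hypothetical and only mimic the general shape of a Wikibase statement.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Hypothetical stand-in for a statement with optional qualifiers."""
    subject: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


# Assumed reading of: nucleon - consists of components - quark
#                     - at order of magnitude - on average 3
nucleon_structure = Statement(
    subject="nucleon",
    prop="consists of components",
    value="quark",
    qualifiers={"at order of magnitude": "on average 3"},
)


def describe(statement):
    """Render a statement and its qualifiers as one readable line."""
    base = f"{statement.subject} -> {statement.prop} -> {statement.value}"
    quals = "; ".join(f"{k}: {v}" for k, v in statement.qualifiers.items())
    return f"{base} [{quals}]" if quals else base
```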
Z Items
Z stands for Zettel (card) or Zahl (number), both in reference to Z Items being the most generic kind of concept a "number", "card", or "entry" can be assigned to.
S Items
S stands for signifier, statement (in the case of S2 Statements), or structure (in the case of S0 data structures).
Statement Items
- Statement Items: Z2, S2, F2
- these Items express concrete, hypothetical, or purely-counterfactual relationships between elementary Items.
- this is a somewhat different approach from the one Wikidata takes. it means that at an internal level, the whole notion of named Claims could possibly be replaced with regular Items with shorter ID strings.
- relying on Statement Items has the advantage of making the SeaTurtle approach more viable.
- it also has the advantage of making it easier to achieve SPoV from the beginning. Statement Items inherently promote the emergence of plural ontologies suitable for modeling a real world of competing plural philosophies and models.
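One way to picture the points above is to give every statement its own Item ID, so that competing statements can sit side by side and can themselves be the subject of further statements. This is only a sketch under that assumption: the StatementItem class, the IDs (Q101 and so on), and the example planets and predicates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatementItem:
    """A statement stored as an ordinary Item with its own short ID."""
    item_id: str    # hypothetical ID of the statement itself
    kind: str       # "Z2", "S2", or "F2"
    subject: str
    predicate: str
    obj: str


statements = [
    StatementItem("Q101", "S2", "planet_x", "orbits", "star_y"),
    StatementItem("Q102", "F2", "planet_x", "orbits", "star_z"),
    # Because Q102 is a regular Item, another statement can point at it.
    StatementItem("Q103", "S2", "Q102", "contradicted by", "Q101"),
]


def claims_about(subject, kinds=("Z2", "S2", "F2")):
    """List the statement Items about one subject, optionally by kind."""
    return [s for s in statements if s.subject == subject and s.kind in kinds]
```

Here claims_about("planet_x") returns both the S2 and the F2 statement, which is the plural-ontology behaviour described above; filtering with kinds=("S2",) narrows the view to one kind of statement.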
S2 Statements
F2 Statements
F stands for false or fringe science.
Z2 Statements
Ontological-category Items
S0 Concepts
logical or metaphysical categories which group Z or S Items
Z0 Physical Patterns
Z0 Patterns represent generic versions of Z Items which one can expect to be instantiated as some kind of spacetime-unique object or identifiable instance. For example, one Z0 Item could conceivably be "cat of species Felis catus", and its corresponding Z/Z1 Item could be a highly notable individual cat. At the same time, Items in the S dimension mirror this Z0-to-Z1 relationship with the S0 Item "fictional cat" and the S1 Item "Graystripe". Equally, a Z Item could be used to mark a real-world phenomenon which seemingly cannot be generalized despite recurring often, such as a Z Item for "quantum mechanics (study)". Despite there being many papers about quantum mechanics, the study of some different kind of physics is not necessarily a drop-in replacement for it. Even so, there could still conceivably be a Z0 Pattern for "academic study of physics" which instantiates itself as "quantum mechanics (study)" and "thermodynamics (study)".
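The pattern-to-instance relationship can be sketched as a small lookup table. The labels come from the examples above, except "a notable individual cat", which is a placeholder; the dictionary layout and the "pattern" key are assumptions made for illustration, not the project's actual Properties.

```python
# Each entry records an Item's category and, for instances, the "0" Item
# it instantiates. The "pattern" key is a hypothetical stand-in for
# whatever Property actually links an instance to its pattern.
items = {
    "cat of species Felis catus": {"category": "Z0", "pattern": None},
    "a notable individual cat": {"category": "Z1", "pattern": "cat of species Felis catus"},
    "fictional cat": {"category": "S0", "pattern": None},
    "Graystripe": {"category": "S1", "pattern": "fictional cat"},
    "academic study of physics": {"category": "Z0", "pattern": None},
    "quantum mechanics (study)": {"category": "Z1", "pattern": "academic study of physics"},
    "thermodynamics (study)": {"category": "Z1", "pattern": "academic study of physics"},
}


def instances_of(pattern):
    """List the Items that instantiate a given '0' Item."""
    return sorted(name for name, item in items.items() if item["pattern"] == pattern)


def category_of_pattern(name):
    """Return the category of the pattern an Item instantiates, if any."""
    pattern = items[name]["pattern"]
    return items[pattern]["category"] if pattern else None
```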
Other kinds of "zero" categories
It is unlikely, though possible, that more "0" categories will be introduced. "M0" is a potential candidate, for some sort of generic repeatable category of questions that keeps instantiating itself into new versions of a particular kind of question. One possible example of an M0 Item is "National questions"; in concept, every group of people that eventually becomes a nation-state has a national question.
There are no F0 Concepts because the inherently arbitrary nature of some S0 classifications for S1 or S2 Items makes it impossible to identify "objectively wrong" categories. At one point an "H0" category was considered for historically contingent unique objects that have no particular rationale for why they are the way they are today, but with time this was simply flattened back into "Z".
Lexemes
Lexemes are Item-like Entities provided by the Wikibase Lexeme extension. Similar to a dictionary entry, their basic purpose is to divide specific recorded languages into words or phrases of particular grammatical categories (for example, English noun or German verb), and map the connections between a set of related written forms and a set of distinct but related meanings. As far as Lithographica is concerned, Lexemes are to be used like disambiguation pages between ambiguous written words and word-independent concepts (Items). Terms are usually sorted by language, but for the purposes of this project their precise grammatical categories are broader to allow for notions like abstract nouns that express themselves as verbs and adjectives, etc. (This will be described in more detail elsewhere, later.)
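The disambiguation role can be sketched as a lookup from a written word to the candidate Items behind its Senses. Everything in the example is hypothetical: the Lexeme and Item IDs, the words, and the glosses were chosen only to show the shape of the lookup.

```python
# A Lexeme groups written Forms and Senses; each Sense points to an Item.
# All IDs, words, and glosses below are hypothetical illustrations.
lexemes = [
    {
        "id": "L1",
        "language": "English",
        "forms": ["bank", "banks"],
        "senses": [
            {"gloss": "financial institution", "item": "Q1"},
            {"gloss": "edge of a river", "item": "Q2"},
        ],
    },
    {
        "id": "L2",
        "language": "German",
        "forms": ["Bank"],
        "senses": [{"gloss": "bench", "item": "Q3"}],
    },
]


def disambiguate(written_form, language):
    """Return (gloss, item) candidates for an ambiguous written word."""
    return [
        (sense["gloss"], sense["item"])
        for lexeme in lexemes
        if lexeme["language"] == language and written_form in lexeme["forms"]
        for sense in lexeme["senses"]
    ]
```

The match is case-sensitive and language-specific, so "bank" in English and "Bank" in German resolve through different Lexemes even though the written forms look alike.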
The Lexeme structure is also (mis)used for a few more specialized roles where Lexemes are more strictly interpreted as written signifiers, as explained below.
Citation Lexemes
A citation Lexeme is meant to represent the concept of a particular referenced work rendered into speech. Citation Lexemes do not hold the contrasting connotations of works ("Dragon Ball means wild power escalation"), but instead simply associate the titles or aliases of works with particular sub-series or editions ("Dragon Ball and DBZ are two names for the same continuous series"). The Senses of a citation Lexeme should be suitable for linking to particular Z Items or S0 Concepts which constitute actual works or collections of works somebody would reference, while the Forms of the citation Lexeme can be any names by which the Senses are ever referred to, no matter how ambiguous. Each Sense should be connected to the collection of Forms which represent it so that it becomes clear which names refer to which parts of the series or work. Specific numbered parts of a larger work can also be added as Senses if they come to be referenced so often that they overshadow other parts. Note that it is generally not necessary to add every numbered part of a work as a Sense; if individual numbered parts are being referenced this often, it may be best to simply reference them as Items or give them their own separate citation Lexemes.
Citation Lexemes may be especially useful for thesis portals referencing works by recurring bop citation codes. A recurring citation code should be recorded on a particular citation Lexeme, and the Lexeme Sense linked by Property on its corresponding Item, such that searching the citation code can bring up both the Lexeme and the Item. At that point the citation code can safely link itself directly to the Item for the most specific numbered part of the work referenced.
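A minimal sketch of that lookup follows, assuming the citation code is stored on the Lexeme and each Item points back to one Sense. The IDs follow the usual Wikibase pattern (L10, L10-F1, L10-S1), but the specific numbers, the code "DB", and the key names citation_code and citation_sense are hypothetical.

```python
# Hypothetical citation Lexeme with two Forms attached to one Sense.
citation_lexemes = {
    "L10": {
        "citation_code": "DB",  # hypothetical recurring citation code
        "forms": {"L10-F1": "Dragon Ball", "L10-F2": "DBZ"},
        "senses": {
            "L10-S1": {
                "gloss": "the continuous series as a whole",
                "forms": ["L10-F1", "L10-F2"],
            },
        },
    },
}

# Items link back to a Sense through a (hypothetical) Property.
items = {
    "Q20": {"label": "Dragon Ball (books)", "citation_sense": "L10-S1"},
}


def names_for_sense(sense_id):
    """List every name (Form) attached to one Sense."""
    lexeme = citation_lexemes[sense_id.split("-")[0]]
    return [lexeme["forms"][form_id] for form_id in lexeme["senses"][sense_id]["forms"]]


def search_code(code):
    """Return (lexeme_ids, item_ids) reachable from one citation code."""
    lexeme_ids = [
        lexeme_id
        for lexeme_id, lexeme in citation_lexemes.items()
        if lexeme["citation_code"] == code
    ]
    item_ids = [
        item_id
        for item_id, item in items.items()
        if item["citation_sense"].split("-")[0] in lexeme_ids
    ]
    return lexeme_ids, item_ids
```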
Works and editions
In general, this project follows a simplified and incomplete version of the FRBR standards. Works and editions should not be separated; instead, the characteristics of particular editions should be regarded as varying characteristics of the work they originate from. This simplification is meant to make data entry slightly easier and, for those who are willing to take the effort to separate out editions anyway, to allow separate edition identifiers to all be managed by Wikidata. The logic goes that if Wikidata is already a repository detailing almost all "official" published works, there is no real need to duplicate the effort, especially if it would result in a single book having two Items on Wikidata and two Items on Lithographica which the same user might have to create all at once.
- A graphic novel which neatly follows the story of a particular prose volume with no deviation should be considered an edition of the same work. ex.: Silver Eyes trilogy (FNaF), Wings of Fire graphix adaptation
- A film adaptation which neatly maps to a particular prose volume and does not intentionally deviate from it should be considered an edition of the same work. ex.: Harry Potter and the Philosopher's Stone
- A film adaptation which "adapts" a larger series but does not map to a particular prose or comic volume should be considered a different work. ex.: Dragon Ball Evolution
- A dramatized adaptation which neatly maps to a particular comic volume but has its own set of numbered parts should be formally considered an edition of the same work, but is allowed to have a separate Item primarily for the purpose of grouping differing sets of numbered parts. ex.: Dragon Ball (books), Dragon Ball / Dragon Ball Z (shows)
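The four rules above can be condensed into a small decision function. This is only one possible reading: the function name, the three yes/no inputs, and the returned strings are all hypothetical, and the case of an adaptation that maps to a volume but intentionally deviates from it is left open because the rules do not cover it.

```python
def classify_adaptation(maps_to_volume, deviates, own_numbered_parts=False):
    """Return (relation, separate_item_allowed) for an adaptation.

    maps_to_volume: the adaptation neatly maps to one particular volume.
    deviates: it intentionally deviates from that volume.
    own_numbered_parts: it has its own set of numbered parts.
    """
    if not maps_to_volume:
        # Adapts a larger series without matching any one volume.
        return ("different work", True)
    if deviates:
        # The rules as written do not say what happens here.
        return ("not covered by the rules above", None)
    # Formally an edition; a separate Item is allowed only for the
    # purpose of grouping a differing set of numbered parts.
    return ("edition of the same work", own_numbered_parts)
```

Under this reading, a faithful graphic novel or film gives classify_adaptation(True, False), a film like Dragon Ball Evolution gives classify_adaptation(False, False), and the Dragon Ball shows give classify_adaptation(True, False, own_numbered_parts=True).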
Part of the reason this system was devised was that it was too confusing and unintuitive for new Wikidata editors to immediately identify an edition of a comic. Are volumes of a serialized comic considered works? (No. An entire collection of volumes is considered an edition, despite the misleading Wikidata data constraint that Items should not have more than one ISBN.) If several non-serialized comics are collected together, what is this? (An edition of the individual comics.) If a graphix adaptation has hardcover and softcover bindings, does it count as a work, or does the line of adapted books count as a separate series? (It shouldn't.) If fans create an abridged series, does this count as an edition? (It should. Journey to the West was also abridged.)