TITLE
RP9 and TOSEC File Names

TOPIC
This document describes the
default format used for RP9 file names, and the
mapping to and from TOSEC file names.

DISCUSSION
RP9 was designed to
simplify the distribution, use and
organization of retrogames and other classic
content, while recognizing the individual
strengths of different media image formats
and naming conventions.
Because they are used in players that
also have access to a local or online
database, RP9 files may be freely renamed without
compromising functionality, in a way more
similar to MP3 files than to TOSEC files.
Even RP9 Toolbox, a component of
RetroPlatform Player, includes a
customizable feature to name and rename
files according to user preferences. At the
same time, however, a default format is used
for consistency. In its simplest form, it looks like this:
- Asteroid Invader II (Acme Games, 1986, Amiga).rp9
This format respects the fact that the
application title (rather than, for example,
the publisher entity) has emerged as the
preferred initial part of commonly used
naming methodologies, and adds a minimum of
information to visually identify or search
for items based on the elements in the file
name. The following fields are always
indicated:
- Item title
- Entity name (commonly used name, not
corporate registration)
- Release year
- System family
Two additional fields may be added:
- Extended data (e.g. all additional
TOSEC attributes), enclosed in one
"master" set of [square
brackets]
- Final enumeration suffix (number
between round brackets), in case of
multiple items with the same name in the
same directory, e.g. (2), (3), etc.
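The field layout above can be sketched in Python as follows. This is only an illustration of the default format, not code from RP9 Toolbox: the function name build_rp9_name and its parameters are hypothetical.

```python
def build_rp9_name(title, entity, year, system, extended="", enumeration=None):
    """Assemble a file name in the default RP9 layout (hypothetical helper)."""
    # The four fields that are always present: title, entity, year, system family.
    name = f"{title} ({entity}, {year}, {system})"
    # Optional extended data goes inside one "master" pair of square brackets.
    if extended:
        name += f"[{extended}]"
    # Optional enumeration suffix, e.g. (2), (3), for items with the same name.
    if enumeration is not None:
        name += f"({enumeration})"
    return name + ".rp9"


print(build_rp9_name("Asteroid Invader II", "Acme Games", 1986, "Amiga"))
```

With the sample values used in this article, the call prints "Asteroid Invader II (Acme Games, 1986, Amiga).rp9".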
The extended format is the default, but
it may not be visible in well-cataloged
games and demos, because in an ideal
situation there is only one optimal entry
for each title, and no need for extra
information. This is what is most desirable
from a usability point of view.
Nevertheless, the extended format aims to
preserve the full set of additional TOSEC
fields, with no loss of information, while
improving parsing and reducing undesirable
naming differences.
The main differences between RP9 and
TOSEC names:
- RP9 does not move a leading "The"
to the end of a title or subtitle (if a title begins
with "The" or any other article in any
language, the file name also begins in
the same way)
- Like TOSEC, RP9 also maps ":"
(illegal in most file systems) to "-",
but it enforces typesetting accuracy by
adding proper spacing if necessary (the
TOSEC 1.0 specification indicates the
opposite, but in practice most TOSEC
files that were based on that spec had
the space added, resulting in two
possible versions of the same file)
- The "vx.xx" version attribute is
isolated from the title and placed in
its own pair of round parentheses (the
"v" is preserved)
- The original Roman and Arabic
numerals are preserved if available,
rather than being normalized (this type
of normalization can always be done
internally in the search layer, if
desired)
This results in RP9 file names like the
following:
- Asteroid Invader II (Acme Games, 1986, Amiga)[(v2.1)(demo)(US)[a]].rp9
- Asteroid Invader II (Acme Games, 1986, Amiga)[(v2.1)(demo)(US)[a]](2).rp9
- Asteroid Invader II (Acme Games, 1986, Amiga)[(v2.1)(demo)(US)[a]](3).rp9
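As a rough illustration of how such names can be split back into their fields, the following Python sketch uses a single regular expression. It is an assumption about one possible parser, not the player's actual implementation; the names RP9_NAME and parse_rp9_name are hypothetical, and the year is deliberately kept as a string.

```python
import re

# Hypothetical pattern for the default RP9 layout:
#   Title (Entity, Year, System)[extended](n).rp9
RP9_NAME = re.compile(
    r"^(?P<title>.+?) "
    r"\((?P<entity>[^()]*), (?P<year>[^(),]*), (?P<system>[^(),]*)\)"
    r"(?:\[(?P<extended>.*)\])?"        # optional "master" square brackets
    r"(?:\((?P<enumeration>\d+)\))?"    # optional enumeration suffix
    r"\.rp9$",
    re.IGNORECASE,
)


def parse_rp9_name(file_name):
    """Return a dict of fields, or None if the name does not fit the layout."""
    match = RP9_NAME.match(file_name)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["enumeration"] is not None:
        fields["enumeration"] = int(fields["enumeration"])
    return fields


info = parse_rp9_name(
    "Asteroid Invader II (Acme Games, 1986, Amiga)[(v2.1)(demo)(US)[a]](2).rp9"
)
print(info["title"], "|", info["extended"], "|", info["enumeration"])
```

Because the extended field is delimited by the outermost pair of square brackets, nested brackets such as [a] stay inside it, and the enumeration suffix is only recognized after the closing "master" bracket.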
In general, RP9 naming follows a goal of
rigorous simplicity and elegance, also
taking into account undesirable variations
observed in the reality of tens of thousands
of TOSEC 1.0 files. This is why there are
some small differences between TOSEC and
RP9: for RP9 the aim was to reduce
the presence of inconsistent exceptions and
to make parsing easier.
In particular, the choice to not
rearrange titles beginning with an article
(as in "Das Boot" changing to "Boot, Das")
was based on the following considerations:
- This transformation originates from
traditional library cataloging rules,
but is less useful in a context where
automated search (usually "live"
as-you-type search) is pervasive
- Any rearrangement is a modification
of the original title, introducing more
work for humans, an inevitable duality
(at the beginning there was one title,
then there are two) and bringing with it
the possibility of further unintended
consequences (errors, borderline cases,
difficulty in reconstructing the
original, etc.)
- The presence of such a rule opens
the doors to a "because you can"
approach to editing, often with
inconsistent results (which articles of
which languages should be rearranged?
should these be rearranged even within
the context of another language? if the
title has a subtitle separated by a
dash, does the rule apply to both parts,
or not?)
- Examples of "difficult" cases: "The
Halley Mission - A Shuttle Simulation"
vs. "Halley Mission, The - A Shuttle
Simulation" vs. "Halley Mission, The -
Shuttle Simulation, A" (four possible
combinations to search for); "Live, Die
- The German Rocket" vs. "Die Live - The
German Rocket, The" (not only four
possible combinations to choose from,
but also not clear whether "Die" is an
article or not); "The Elphs, the Devils
and the Blue Angel" vs. "Elphs, the
Devils and the Blue Angel, The" vs. "the
The Elphs Devils and the Blue Angel"
(incorrect automatic reconstruction
based on comma followed by article).
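The reconstruction problem can be demonstrated with a short Python sketch. The rule below is deliberately naive and purely hypothetical (it is not taken from any TOSEC or RP9 tool): it moves the first ", article" it finds back to the front of the title. The ARTICLES tuple is an illustrative assumption and far from complete.

```python
import re

# Illustrative assumption: a tiny list of articles from two languages.
ARTICLES = ("The", "A", "Die", "Das")


def naive_restore(title):
    """Move the first ', <article>' back to the front (deliberately naive)."""
    pattern = r", (" + "|".join(ARTICLES) + r")\b"
    match = re.search(pattern, title, re.IGNORECASE)
    if match is None:
        return title
    # Cut the ", <article>" out of the title and put the article in front.
    rest = title[:match.start()] + title[match.end():]
    return f"{match.group(1)} {rest}"


print(naive_restore("Boot, Das"))
print(naive_restore("Elphs, the Devils and the Blue Angel, The"))
print(naive_restore("Live, Die - The German Rocket"))
```

The first call gives "Das Boot" as intended. The second gives "the Elphs Devils and the Blue Angel, The", because the comma that belongs to the title is mistaken for a rearranged article. The third gives "Die Live - The German Rocket", although nothing in the input indicates whether "Die" was ever an article.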
The system family property as used in the
RP9 file name aims to indicate the widest
set of compatible systems, not just one
sample configuration. For example, for Amiga systems the
supported names are "Amiga", "CDTV" and
"CD32", because these reflect
three important device branches both from a
technical and a recognition perspective. Additional configuration details
(e.g. preference for A-500 vs. A-1200) are
embedded in the RP9 manifest. For CBM
(8-bit) systems the platform names are
model-specific (C64, VIC 20, etc.), because
the differences (and software
incompatibilities) between the various models
were more distinct.
Other details:
- By default, space characters are
used (not underscore characters)
- Illegal characters (such as "?") are
converted to underscore characters
(except ":" which becomes "-", with an
initial space added if necessary)
- Multiple space characters are
condensed into one
- Leading and trailing space
characters are stripped
- The extended information field is
always included in a pair of square
brackets, which may include any other
combination and nesting of paired round
and/or square brackets
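The character rules listed above can be sketched in Python as follows. The function name sanitize_rp9_title is hypothetical, and the set of illegal characters is an assumption (the characters reserved by Windows file systems); this article itself only names "?" and ":".

```python
import re

# Assumption: characters treated as illegal, besides ":" which is handled separately.
ILLEGAL = r'[\\/*?"<>|]'


def sanitize_rp9_title(text):
    """Apply the default character rules to a title (hypothetical helper)."""
    # ":" becomes "-", preceded by exactly one space.
    text = re.sub(r" *:", " -", text)
    # Any other illegal character becomes an underscore.
    text = re.sub(ILLEGAL, "_", text)
    # Condense runs of spaces into one, then strip leading and trailing spaces.
    text = re.sub(r" {2,}", " ", text)
    return text.strip(" ")


print(sanitize_rp9_title("Asteroid Invader: The  Return?"))
```

For the sample input the result is "Asteroid Invader - The Return_". The colon is replaced before the other characters so that it never falls through to the underscore rule.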
In the player implementation, if there is
no database match for an item the file name
information is used as follows:
- Underscore characters are converted
to spaces
- If the extended information includes
a version field, that is extracted as
such
- If the extended information includes
a demo status field, that is extracted
as such
- If the extended information includes
additional fields, these are extracted
for further processing (without the
"master" square brackets, and without
the already-extracted version field)
- If the player has dedicated
columns or fields for the version or
demo status, this information is
displayed there. Otherwise, the
information is, by default, displayed
together with the title.
- Any additional extended information,
if present, is either shown in any
dedicated columns or fields the player
may have, or, only if necessary for
disambiguation purposes (e.g. if there
are two otherwise identical entries), is
displayed after the title information.
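The fallback steps above might look roughly like the following Python sketch. It is not the player's code. The function name fallback_fields and both patterns are assumptions: a version field is taken to be "v" followed by a digit, and a demo field to be any round-bracket group starting with "demo". The input is the extended information with the "master" square brackets already removed.

```python
import re

# Assumed shapes of the two fields that are extracted individually.
VERSION = re.compile(r"\((v\d[^()]*)\)")
DEMO = re.compile(r"\((demo[^()]*)\)")


def fallback_fields(extended):
    """Return (version, demo, remainder) from the extended information."""
    # Underscore characters are converted to spaces.
    text = extended.replace("_", " ")
    version = demo = None
    match = VERSION.search(text)
    if match:
        version = match.group(1)
        # The remainder omits the already-extracted version field.
        text = text[:match.start()] + text[match.end():]
    match = DEMO.search(text)
    if match:
        # The article only mentions removing the version field from the
        # remainder, so the demo field is left in place here.
        demo = match.group(1)
    return version, demo, text


print(fallback_fields("(v2.1)(demo)(US)[a]"))
```

For the sample input this returns the version "v2.1", the demo status "demo" and the remainder "(demo)(US)[a]". A player with dedicated columns could show the first two values there and keep the remainder for disambiguation.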

Article Information
Article ID: 19-103
Platform: All
Products: RetroPlatform Player
Additional Keywords: normalization, mapping, transformations
Last Update: 2012-01-28

Your feedback is always appreciated. It is safe to link to this page.