Wikipedia:WikiProject Tabular Data

Wikipedia thrives on many well-written articles that explain how things are connected and are a pleasure to read.

Good articles also depend on facts, many of which are used in several articles or are subject to constant change. Harmonizing these facts across several articles and keeping them up to date is time- and labour-intensive. This project tries to solve the problem with software support in MediaWiki, replacing the bot runs that would otherwise be required again and again.

Organization

The structuring, coordination, and maintenance of the metadata templates take place at Wikipedia:WikiProject metadata/data organization.

Data such as the population number of a municipality or the gross domestic product of a country are often used in several articles, in running text as well as in infoboxes and tables. Such data are characterized by a number of attributes:

  • Which (key): which object is meant (e.g. Berlin)
  • What (relation): what kind of data is it (e.g. population number)
  • How much (value): what value does the datum have (e.g. 3420786)
  • When (date): when was this value determined (e.g. March 31, 2008)
  • From where (source): where does the datum come from (e.g. Amt für Statistik Berlin-Brandenburg)

The first three attributes describe the data itself. In the terms of the Resource Description Framework (RDF), they are also called subject, predicate, and object. The last two attributes are metadata in the strict sense, that is, data about data.

Templates make it easy to return a value based on an input parameter. In this way, the template “population number” (predicate) could return the value “3420786” (object) when given the parameter “Berlin” (subject). Template programming offers a neat solution for this in the form of the #switch parser function.
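As a minimal sketch of this approach, a template could wrap a single #switch call. The template name used here is hypothetical, the key and value are the Berlin example from above, and the empty default for unknown keys is an assumption rather than a project convention:

```wikitext
<!-- Hypothetical content of Template:Metadata population number -->
{{#switch: {{{1|}}}
 | Berlin   = 3420786
 | #default = <!-- assumption: unknown keys return nothing -->
}}
```

With such a template in place, a call like {{Metadata population number|Berlin}} would expand to 3420786, and every article using the call would pick up a changed value from one central place.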

Data types

The relevant data fall into different data types. For practical reasons they should be stored in a machine-processable form and converted to a display form only at output. This machine-processable form (e.g. numbers without thousands separators and with a point as the decimal separator) makes it possible, for example, to calculate the population density as the quotient “population number/area”.
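The quotient mentioned above could be computed with the #expr parser function, roughly as sketched here. Both template names are hypothetical, and “Metadata area” is assumed to return the area as a plain number; the sketch only illustrates why raw numbers are stored and formatting is deferred:

```wikitext
<!-- Assumption: both templates return plain numbers, e.g. 3420786, not 3,420,786 -->
{{#expr: {{Metadata population number|Berlin}} / {{Metadata area|Berlin}} round 0}}

<!-- Thousands separators are added only at output -->
{{formatnum: {{Metadata population number|Berlin}} }}
```

If the stored value already contained thousands separators, #expr could not evaluate it, which is the practical reason for keeping the stored form free of formatting.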

Nomenclature of the templates

Templates of the type “metadata” are named according to the scheme:

Template:Metadata basis apportionment

For example: Template:Metadata population number DE-NI

The template is called as follows:

{{Metadata Yyz|Key|Accessory}}

For example: {{Metadata population number DE-NI|12345}}

Data with the same basis are described by an umbrella term that is as general as possible. Examples:

  • population number denotes the data on the population number of a political subdivision.
  • head denotes the data on the head of a political subdivision, who may correspond to the mayor of a municipality.
  • GDP denotes the data on the gross domestic product of a country.

Separation of the dataset

The optional separation of the dataset follows these criteria:

  • Form in which the data source supplies the data,
  • Size of the data set (for example, a single switch list should contain no more than 2000 entries),
  • Logical coherence of the data group.

Examples:

  • The data templates for the population numbers of German municipalities, municipal associations, counties, and administrative regions may be divided by federal state (Bundesland).
  • The data templates for the human development index may be divided by country.

The separations are also named according to criteria that are as general as possible. For example, data about administrative units are separated according to ISO 3166.

Data are assigned to keys that are general, unambiguous, and independent of Wikipedia's internal conventions. For example, the Community Identification Number should be used for data about municipalities, and the ISO 3166 code for data about states. This prevents the data link from breaking when an article title (lemma) changes and allows the data sets to be used regardless of the differing naming conventions of the various language editions of Wikipedia.
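Combining the naming scheme with such keys, a separated data template might look like the sketch below. The template name follows the scheme described above with the ISO 3166-2 code DE-BE for Berlin; using 11000000 as Berlin's Community Identification Number is an assumption of this sketch, and the value is the population number quoted earlier:

```wikitext
<!-- Hypothetical content of Template:Metadata population number DE-BE -->
<!-- Key: Community Identification Number (11000000 for Berlin is assumed here) -->
{{#switch: {{{1|}}}
 | 11000000 = 3420786
 | #default =
}}
```

A call such as {{Metadata population number DE-BE|11000000}} does not depend on the article title, so it would keep working if the article were renamed or if the template were reused in another language edition.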

Interested editors
