Difference between revisions of "Digital object identifier"

From WikiChem
(change slightly)
(reimported from http://en.wikipedia.org/w/index.php?title=Digital_object_identifier&oldid=305119816 for copyright reasons)
 
Line 1: Line 1:
''This page is based on the [http://en.wikipedia.org/w/index.php?title=Digital_object_identifier&oldid=118039525 Wikipedia article on Digital object identifier], downloaded March 31, 2007''
+
The '''Digital Object Identifier''' ('''DOI''') '''System''' is a managed system for persistent identification of content-related entities on digital networks {{Citation needed|date=July 2009}}. These entities may be content items (digital files, physical objects, abstract works), or any related entities in a content transaction (e.g. licenses, parties, etc.).  "DOI" is sometimes used to mean the identifiers within this system; hence the use of the term alone is deprecated unless the meaning is sufficiently clear from an earlier mention or the specific context: instead it should always be used in conjunction with a specific noun. The DOI name is the identifier string that specifies a unique object (the referent) within the [http://www.doi.org DOI System]; the DOI syntax is the form and sequence of characters comprising any DOI name, specifically the prefix element, separator, and suffix element; and the DOI System is the functional deployment of DOI names as identifiers in computer sensible form through assignment, resolution, referent description, administration, etc.
  
{{selfref|See [[Template:DOI]] for the usage of "{{PAGENAME}}" in wikichem.}}
+
The DOI System can be used to identify physical, digital, or abstract entities; these names resolve to data specified by the registrant, and use an extensible metadata model to associate descriptive and other elements of data with the DOI Name.  The DOI System is an implementation of the [[Handle System]] and of the [[indecs Content Model]] and so inherits the design principles and features of each.  
  
A '''digital object identifier''' (or '''DOI''') is a [[standardization|standard]] for persistently identifying a piece of [[intellectual property]] on a [[Computer networking|digital network]] and associating it with related data, the [[metadata]], in a structured extensible way. This standardization is similar to [[PURL]]. A typical use of a DOI is to give a scientific paper or article a unique identifying number that can be used by anyone to locate details of the paper, and possibly an electronic copy. Unlike the [[URL]] system used on the Internet for web pages, the DOI does not change over time, even if the article is relocated (provided the DOI resolution system is updated when the change of location is made).  
+
The DOI System is implemented through a federation of DOI Registration Agencies, under policies and common infrastructure provided by the International DOI Foundation,<ref>[http://www.doi.org/index.html Welcome to the DOI System]</ref> which developed and controls the system. The DOI System has been developed and implemented in a range of publishing applications since 2000; by early 2009 approximately 40 million DOI names had been assigned {{Citation needed|date=July 2009}}.
  
DOIs are an application of the [[CNRI]] [[Handle System]], a generic system for assigning names to objects; DOIs are handles having the prefix "10.", whereas other namespaces in the Handle System have other prefixes. DOIs can be resolved through the DOI resolver at http://dx.doi.org; but, being handles, they can also be resolved through the global handle resolver at http://hdl.handle.net.
+
==International DOI Foundation (IDF)==
  
== Comparison with other standards ==
+
The International DOI Foundation (IDF), a non-profit organisation created in 1998, is the governance body of the DOI System {{Citation needed|date=July 2009}}. It safeguards all [[intellectual property|intellectual property rights]] relating to the DOI System, manages common operational features, and supports the development and promotion of the DOI System. The IDF ensures that any improvements made to the DOI System (including creation, maintenance, registration, resolution and policymaking of DOI names) are available to any DOI registrant, and that no third party licenses might reasonably be required to practice the DOI standard.
DOIs have been called "the bar code for intellectual property": like the physical [[barcode]], they are enabling tools for use throughout the [[supply chain]] to add [[Value (economics)|value]] and save [[cost]]. A DOI differs from commonly used internet pointers to material such as the [[Uniform Resource Locator|URL]] because it identifies an object as a first-class entity, not simply the place where the object is located; an address is merely an attribute of a thing, whereas the thing itself is a first-class object. A DOI also differs from identifiers of intellectual property such as [[International Standard Book Number|ISBN]]s, [[International Standard Recording Code|ISRC]]s, etc., because it can be associated with defined services and is immediately actionable on a network.
 
  
A DOI can apply to any form of intellectual property expressed in any digital environment. Intellectual property includes both physical and [[digital media|digital]] manifestations, [[performance]]s and abstract works: DOIs can be used to identify [[e-text]]s, [[image]]s, [[Sound recording and reproduction|audio]] or [[video]] items, [[Computer software|software]], etc. An entity can be identified at any arbitrary level of [[granularity]]. This means that, for instance, DOIs can identify a [[academic journal|journal]], an individual issue of a journal, an individual [[article (publishing)|article]] in the journal, or a single [[table (information)|table]] in that article.
+
IDF is controlled by a Board elected by the members of the Foundation, with an appointed Managing Agent who is responsible for co-ordinating and planning its activities. Membership is open to all organizations with an interest in electronic publishing and related enabling technologies. The IDF holds annual open meetings on the topics of DOI and related issues: the 2009 meeting will be held in San Francisco in October.<ref>[http://www.doi.org/doi_presentations/members_meeting_2009/index.html 2009 IDF Open Meeting: "Ensuring Persistence"]</ref>
  
== Structure ==
+
==Applications==
The DOI consists of a unique [[alphanumeric]] character string divided into two parts: a [[Prefix (linguistics)|prefix]] and a [[suffix]].
 
  
An example of a complete DOI is:
+
A DOI name can be assigned to any object that is a form of intellectual property. The term object is used with a specific sense within the DOI system: in the ontology sense of any entity, like the common meaning of the word "thing" (rather than in any computer science sense e.g. Object-oriented programming).  So "DOI" is parsed as "digital identifier of an object", rather than "identifier of a digital object". As well as identifying digital media manifestations of intellectual property, DOI names can also identify physical manifestations, [[performance]]s and abstract works. For example, they can be used to identify: e-texts; images; audio or video items and software, etc.  DOI names can also be assigned to related entities in a content transaction (e.g. licenses, parties, etc.).
  
:'''10.1000/182'''
+
An entity can be identified at any arbitrary level of [[granularity]]. This means that, for instance, DOI names can identify a journal, an individual issue of a journal, an individual article in the journal or a single table in that article. The choice of granularity is left to the assigner, but in the DOI System it must be declared as part of the accompanying metadata; where an application is highly reliant on knowledge of granularity and relationships, the accompanying metadata specified as a requirement by the DOI Registration Agency will normally describe this, using a data dictionary based on the indecs Content Model.
 +
 
 +
Applications of the DOI System are provided by DOI Registration Agencies (RAs), appointed by the IDF, whose primary role is to provide services to DOI registrants: allocating DOI prefixes, registering DOI names and providing the necessary infrastructure to allow registrants to declare and maintain metadata and state data. RAs are also expected to actively promote the widespread adoption of the DOI System, to cooperate with the IDF in the development of the DOI System as a whole and to provide services on behalf of their specific user community. A list of current RAs is maintained by the International DOI Foundation.
 +
 
 +
Currently, most applications use a single redirection to a managed URL.  It is expected that more applications will begin to make use of additional features in the DOI System, such as multiple resolution (the return as output of several pieces of current information related to a DOI-identified entity — specifically at least one URL plus defined data structures allowing management) and the provision of structured metadata in machine-readable form.
 +
 
 +
Major applications currently include:
 +
 
 +
* persistent citation in scholarly materials (journal articles, books, etc.) through [http://www.crossref.org CrossRef];
 +
* scientific data sets, through a consortium of leading research libraries and technical information providers, building on work by the [http://www.doi.org/news/DOINewsMar09.html#1 German National Library of Science and Technology (TIB)];
 +
* [[European Union]] official publications, through the [[Publications Office (European Union)|EU publications office]].
 +
 
 +
An illustration of an application making good use of DOI System functionality is [[Organisation for Economic Co-operation and Development|OECD's]] publication service [[SourceOECD]]: each table or graph in an OECD publication carries a DOI name that leads to an Excel file of the underlying data. Further development of such services is planned.<ref>http://dx.doi.org/10.1787/603233448430 OECD Publishing White Paper</ref>
 +
 
 +
A multilingual European DOI RA activity, [http://www.mEDRA.org ''m''EDRA] and a Chinese RA, [http://www.wanfangdata.com/ Wanfang Data], are active in non-English language markets.  Expansion to other sectors is planned by the International DOI Foundation.
 +
 
 +
The DOI System is currently being standardised through the [[ISO|International Organization for Standardization]], in its technical committee on identification and description [http://www.iso.org/iso/iso_technical_committee.html?commid=48836 TC46/SC9]. In April 2008 the Committee Draft prepared by an international Working Group was approved, after voting by ISO's national bodies, for distribution as a Draft International Standard (DIS). In February 2009, ISO provided further editorial comments on that draft. A further revised draft was submitted to ISO on 2 April 2009. Depending upon the result of ISO activities and further voting, the final standard may be published in 2009 or 2010.<ref>[http://www.doi.org/about_the_doi.html DOI Standards and Specifications]</ref>
 +
 
 +
DOI names may be used with other appropriate technology to provide added services, e.g., the [[OpenURL]] for context sensitive linking: the DOI directory is OpenURL-enabled so can recognize a user with access to an OpenURL link resolver. Hence on resolving, metadata can be pulled from the DOI agency [http://www.crossref.org/ CrossRef] to create an OpenURL targeting the current local link resolver.  Such an OpenURL link that contains a DOI name is persistent; publishers who use the CrossRef DOI System to identify their content make their products OpenURL-aware.
 +
 
 +
==Features and benefits==
 +
 +
DOI names were developed with the key intended benefits of:
 +
 
 +
* Persistent identification: each DOI name unequivocally and permanently identifies the object to which it is associated
 +
* Network actionability: each DOI name resolves to one or more web pages or other data assigned by the publisher
 +
* Semantic interoperability: metadata can be provided which allows unambiguous communication to any user, from any place, at any point of a distribution chain, with relevant pieces of information about the identified objects and their relationships
 +
 
 +
The DOI System uses two underlying technologies plus a social infrastructure to achieve this.  The technical infrastructure inherits the features and capabilities of the two underlying technologies: the Handle System and the indecs content model.
 +
 
 +
The Handle System ensures that the DOI name:
 +
 
 +
* is not based on any changeable attributes of the entity (location, ownership, or any other attribute that may change without changing the referent's identity);
 +
* is opaque (preferably a "dumb number": a well known pattern invites assumptions that may be misleading, and meaningful semantics may not translate across languages and may cause trademark conflicts);
 +
* is unique within the system (to avoid collisions and referential uncertainty);
 +
* has optional, but nice to have, features that should be supported (human-readable, cut-and-paste-able, embeddable; fits common systems, e.g., URI specification).
 +
 
 +
And that the DOI name's resolution mechanism:
 +
 
 +
* is reliable (using redundancy, no single points of failure, and fast enough to not appear broken);
 +
* is scalable (higher loads simply managed with more computers);
 +
* is flexible (can adapt to changing computing environments; useful to new applications);
 +
* is trusted (both resolution and administration have technical trust methods; an operating organization is committed to the long term);
 +
* builds on open architecture (leveraging the efforts of a community in building applications on the infrastructure);
 +
* is transparent (users need not know the infrastructure details).
 +
 
 +
The [[Handle System|Handle System's]] ability to provide administrative granularity, multiple resolution, and data typing was key to its selection for the DOI System. The Handle System is part of a [http://www.cnri.reston.va.us/k-w.html Digital Object Architecture] which relates to digital objects in a computer science sense, as an identifiable item of structured information in digital form within a network-based computer environment.  Any object in the more general sense (the ontology sense, the word "thing") may be [http://www.dlib.org/dlib/may01/kahn/05kahn.html represented as a digital object], so there is no inconsistency in this use in the DOI System.
 +
 
 +
The [[indecs Content Model]] is the basis of the DOI System's approach to assigning metadata to define a referent and its relationships. This approach places importance on:
 +
 
 +
* unique identification;
 +
* functional granularity;
 +
* appropriate access;
 +
* designated authority; and
 +
* independence of specific business model or legal framework.
 +
 
 +
The International DOI Foundation (IDF) oversees the integration of these technologies and operation of the system through a technical and social infrastructure. The social infrastructure of a federation of independent registration agencies offering DOI services was modelled on existing successful federated deployments of identifiers such as [[GS1]] and [[ISBN]]. 
 +
 
 +
==Comparison with other identifier schemes==
 +
 
 +
A DOI name differs from commonly used Internet pointers to material such as the URL, because it identifies an object as a first-class entity, not simply the place where the object is located. A DOI name also differs from identifiers such as the [[International Standard Book Number|ISBN]], [[International Standard Recording Code|ISRC]], etc. because it can be associated with defined services and is immediately actionable on a network.
 +
 
 +
The comparison of persistent identifier approaches is difficult because they are not all doing the same thing. Imprecisely referring to a set of schemes as "identifiers" doesn't mean that they can be compared easily. Similarly, when any two technologies (e.g., two web browsers) are compared, the criteria used for comparison must be defined.
 +
 
 +
The DOI System offers persistent, semantically interoperable resolution to related current data, and is best suited to material that will be used in services outside the direct control of the issuing assigner (e.g., public citation, or managing content of value). It uses a managed registry (providing social and technical infrastructure). It does not assume any specific business model for the provision of identifiers or services, and enables other existing services to link to it in defined ways.
 +
 
 +
Other "identifier systems" may be enabling technologies with low barriers to entry, providing an easy-to-use labelling mechanism where anyone can set up a new instance (examples include [[Persistent Uniform Resource Locator|PURL]], [[Uniform Resource Locator|URLs]], [[Globally Unique Identifier|GUIDs]], etc.) but which may lack some of the functionality of a registry-controlled scheme and usually lack accompanying metadata in a controlled scheme.  The DOI System does not take this approach and should not be compared directly to such identifier schemes.  Various applications using such enabling technologies with added features have been devised which meet some of the features offered by the DOI System (e.g. [http://www.cdlib.org/inside/diglib/ark/ ARK]) for specific sectors.
 +
 
 +
A DOI name is not dependent on the object's location and, in this way, is similar to a [[Uniform Resource Name]] (URN) or [[Persistent Uniform Resource Locator]] (PURL) but differs from an ordinary [[Uniform Resource Locator]] (URL). URLs are often used as substitute identifiers for documents on the Internet (better characterised as [[Uniform Resource Identifier|URIs]]) although the same document at two different locations has two URLs. Persistent identifiers such as DOI names identify objects as first class entities: two instances of the same object would have the same DOI name.
 +
 
 +
==Structure of DOI name (identifier string)==
 +
 
 +
A DOI name consists of a unique, case-insensitive character string divided into two parts: a prefix and a suffix. Any legal graphic character of Unicode may be used, although in practice certain characters, such as pointed brackets "<>", are avoided.
 +
 
 +
An example of a complete DOI name is:
 +
 
 +
:<code>10.1000/182</code>
  
 
where:
 
where:
  
:'''10.1000''' is the prefix, or ''publisher ID'', composed of a part identifying the string as a DOI (10) and a part identifying the [[licensure|registrant]] (1000);
+
:<code>10.1000</code> is the prefix:
 +
:<code>10</code> is the directory code.  All DOI names start with "10.". This distinguishes a DOI name from any other implementation of the [[Handle System]].
  
:'''182''' is the suffix, or ''item ID'', identifying the single object. (Typical suffixes are longer than this example.)
+
:<code>1000</code> is the registrant's code (colloquially the publisher ID, although it may represent a publisher's imprint, one journal, or a whole organization) identifying the registrant. In this DOI name, the number "1000" identifies the International DOI Foundation.
  
The prefix is assigned by a DOI Registration Agency to a specific registrant. The suffix is assigned by the registrant and must be unique within a prefix. It can integrate existing standard identifiers such as an ISBN or [[International Standard Serial Number|ISSN]], or [[SICI]]. The DOI is case insensitive and is considered an "[[opacity|opaque]] string": nothing can be inferred from the number with respect to its use in the DOI System.
+
:<code>182</code> is the suffix, or item ID, identifying the single object. For this DOI name, the object corresponding to <tt>doi:10.1000/182</tt> is the latest version of the DOI Handbook. (Typical suffixes are longer than this example, e.g., hdy.2009.9 or j.1365-313X.2008.03660.x).
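The breakdown above can be sketched in code. The following is a minimal illustration, not an official parser: the names <code>DoiName</code>, <code>parse_doi_name</code> and <code>same_doi_name</code> are hypothetical, splitting at the first "/" is an assumption of this sketch, and case-insensitive comparison is approximated with simple lower-casing.

```python
from typing import NamedTuple


class DoiName(NamedTuple):
    """Hypothetical container for the parts of a DOI name."""
    directory_code: str   # "10" for every DOI name
    registrant_code: str  # e.g. "1000"
    suffix: str           # item ID chosen by the registrant


def parse_doi_name(doi_name: str) -> DoiName:
    """Split a DOI name into directory code, registrant code and suffix."""
    text = doi_name.strip()
    # Accept the printed citation form "doi:10.1000/182" as well.
    if text.lower().startswith("doi:"):
        text = text[4:]
    # Assumption: the first "/" separates the prefix from the suffix.
    prefix, slash, suffix = text.partition("/")
    if not slash or not suffix:
        raise ValueError(f"no prefix/suffix separator in {doi_name!r}")
    directory_code, dot, registrant_code = prefix.partition(".")
    if directory_code != "10" or not dot or not registrant_code:
        raise ValueError(f"prefix does not start with '10.' in {doi_name!r}")
    return DoiName(directory_code, registrant_code, suffix)


def same_doi_name(a: str, b: str) -> bool:
    """Compare two DOI names without regard to letter case."""
    lower_a = tuple(part.lower() for part in parse_doi_name(a))
    lower_b = tuple(part.lower() for part in parse_doi_name(b))
    return lower_a == lower_b
```

For example, <code>parse_doi_name("10.1000/182")</code> yields the directory code "10", the registrant code "1000" and the suffix "182".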
  
== Resolution ==
+
The prefix is assigned by a DOI Registration Agency to a specific registrant. The suffix is assigned by the registrant and must be unique within a prefix. It can integrate existing standard identifiers such as an [[International Standard Book Number|ISBN]] or [[International Standard Serial Number|ISSN]], or [[Serial Item and Contribution Identifier|SICI]]. An example of an application integrating the ISBN with DOI was launched in 2009 <ref>[http://www.doi.org/news/DOINewsMar09.html#2 DOI News, March 2009, "Launch of Actionable ISBN using DOI System"]</ref>.  
DOI resolution redirects the user from a DOI to one or more pieces of typed data: URLs representing instances of the object, services such as e-mail, or one or more items of metadata.
 
  
"What the DOI identifies" and "what the DOI resolves to" are two different concepts: it is possible that a DOI does not resolve to the identified entity, but just to some related information chosen by the publisher.
+
The DOI is considered an "[[Magic cookie|opaque]] string": nothing can be inferred from the number with respect to its use in the DOI System.
  
DOI resolution is provided through the [[Handle System]] technology, developed by the [[Corporation for National Research Initiatives]], and is freely available to any user encountering a DOI.
+
Citations using DOI names should be printed as <tt>doi:10.1000/182</tt>. When the citation is a hypertext link, it is recommended to embed the link as an HTTP proxy expression by prepending <nowiki>http://dx.doi.org/</nowiki> to the DOI name beginning with "10." (e.g. the text <tt>doi:10.1000/182</tt> is linked as <nowiki>http://dx.doi.org/10.1000/182</nowiki>).
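The citation convention can be illustrated with a short sketch. The function names <code>citation_text</code> and <code>proxy_url</code> are hypothetical, and percent-encoding of unusual characters in the link is an added assumption of this sketch rather than something stated in the text above.

```python
from urllib.parse import quote

# Proxy address named in the text above.
DOI_PROXY = "http://dx.doi.org/"


def citation_text(doi_name: str) -> str:
    """Return the printed citation form, e.g. 'doi:10.1000/182'."""
    return "doi:" + doi_name.strip()


def proxy_url(doi_name: str) -> str:
    """Return the hypertext link for a DOI name or 'doi:' citation."""
    name = doi_name.strip()
    if name.lower().startswith("doi:"):
        name = name[4:]
    if not name.startswith("10."):
        raise ValueError(f"not a DOI name: {doi_name!r}")
    # Assumption: characters that are unsafe in a URL are percent-encoded,
    # while "/" is kept so that prefix and suffix stay readable.
    return DOI_PROXY + quote(name, safe="/")
```

A page generator could print <code>citation_text(name)</code> as the visible text and use <code>proxy_url(name)</code> as the link target.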
  
To resolve a DOI, just type in the address bar of any browser the string "<nowiki>http://dx.doi.org/</nowiki>" followed by the DOI. For example, to resolve the DOI 10.1000/182, enter into your browser the address: <nowiki>http://dx.doi.org/10.1000/182</nowiki>. Of course, web pages or other hypertext documents can include hypertext links in this form, as in this sentence which links to the [http://dx.doi.org/10.1000/182 DOI Handbook].
+
==Resolution==
  
The DOI organization has applied for a "doi:" [[URI scheme]] to allow a DOI to be expressed as a [[Uniform Resource Identifier]] (URI) without requiring reference to a specific HTTP server as in the previous paragraph. As of April 2006, this had not been approved. [http://www.doi.org/factsheets/DOIIdentifierSpecs.html]
+
DOI name resolution is provided through the [[Handle System]], developed by [[Corporation for National Research Initiatives]], and is freely available to any user encountering a DOI name. Resolution redirects the user from a DOI name to one or more pieces of typed data: URLs representing instances of the object, services such as e-mail, or one or more items of metadata. To the Handle System, a DOI name is a handle, and so has a set of values assigned to it and may be thought of as a record that consists of a group of fields. Each handle value must have a data type specified in its "<type>" field, which defines the syntax and semantics of its data.
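The idea of a handle as a record of typed values can be sketched as a simple data structure. This is only an illustration: the class <code>HandleValue</code>, its field names, the helper <code>values_of_type</code>, the type labels and all example data are assumptions of this sketch and are not taken from the Handle System software.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HandleValue:
    """One value in a handle record (field names are hypothetical)."""
    index: int  # position of the value within the record
    type: str   # data type that says how `data` is to be interpreted
    data: str   # the typed data itself


def values_of_type(record: List[HandleValue], wanted: str) -> List[str]:
    """Return the data of every value in the record with the given type."""
    return [value.data for value in record if value.type == wanted]


# Made-up record for one DOI name: two locations and a contact address.
EXAMPLE_RECORD = [
    HandleValue(1, "URL", "http://example.org/handbook.html"),
    HandleValue(2, "URL", "http://mirror.example.org/handbook.html"),
    HandleValue(3, "EMAIL", "registrant@example.org"),
]
```

A single redirection would use only the first URL value, whereas multiple resolution would hand back several values of the record at once.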
  
== Metadata ==
+
To resolve a DOI name, it may be input to a DOI resolver (e.g., at [http://www.doi.org/ www.doi.org]) or may be represented as an HTTP string by preceding the DOI name with the string
Each DOI is associated with a set of metadata: bibliographical and commercial information concerning the content (title, author, publication date, copyright, price, etc.) and its position within the registrant's whole publishing offer (a title's membership of a series, an article's membership of a serial, the availability of one publication in several formats and/or through different media, etc.). With its metadata, the DOI is not simply an identifying string but an unambiguous tool for data storage and exchange.
 
  
Metadata, as well as the DOI they are associated with, are persistently connected to the object they describe, so they can be easily communicated to other parties across the production and distribution chain, enhancing a content producer's ability to trade electronically. Furthermore, metadata are the key to the development of DOI-based services, such as transnational databases and search engines for different kinds of content. Asserting that metadata are persistent does not mean they are unmodifiable: registrants may update metadata about their content at any time (when some publication data change, when the primary URL the DOI resolves to is modified, etc.).
+
:<code><nowiki>http://dx.doi.org/</nowiki></code>
  
== Advantages ==
+
For example, to resolve the DOI name <tt>10.1000/182</tt>, enter the address: "<nowiki>http://dx.doi.org/10.1000/182</nowiki>". Web pages or other hypertext documents can include hypertext links in this form.  Some browsers allow the direct resolution of a DOI (or other handles) with an add-on, e.g., [https://addons.mozilla.org/en-US/firefox/addon/718 Mozilla Handle/DOI Protocol Handler].
There are three main values granted by DOI adoption:
 
* Persistent Identification: each DOI unequivocally and permanently identifies the object to which it is associated
 
* Network Actionability: through Handle System technology, each DOI resolves to one or more web pages assigned by the publisher
 
* Semantic Interoperability: metadata make it possible to communicate unambiguously - to any user, from any place, at any point of the productive/distributive chain - all the pieces of information about the related objects and their hierarchical relationships
 
  
== International DOI Foundation (IDF) ==
+
==Metadata==
The [[International DOI Foundation]] (IDF), a non-profit organisation created in 1998, is the governance body of the DOI System, which safeguards all intellectual property rights relating to the DOI System.
 
IDF supports the development and promotion of the Digital Object Identifier system as a common infrastructure for content management, and works to ensure that any improvements made to the DOI system (including creation, maintenance, registration, resolution and policymaking of DOIs) are available to any DOI registrant, and that no third party licenses might reasonably be required to practice the DOI standard.
 
  
IDF is controlled by a Board elected by the members of the Foundation, with an appointed full-time Director who is responsible for co-ordinating and planning its activities. Through the elected Board, the activities of the Foundation are ultimately controlled by its members. Membership is open to all organizations with an interest in electronic publishing and related enabling technologies.  
+
Each DOI name is associated with a series of [[metadata]]. The extent of this metadata may be defined by an application profile; a small kernel of common data for all DOI names can be optionally extended with other relevant data, which may be public or restricted. The metadata can be existing data from another scheme, which can be mapped to a DOI Application Profile using a data dictionary based on the indecs Content Model.
  
== Registration agencies ==
+
Registrants may update metadata about their contents any time they wish (when some publication data changes, when the primary URL the DOI name resolves to is modified, etc.).
A DOI Registration Agency (RA) is an authority recognized by the IDF, whose primary role is to provide services to DOI registrants: allocating DOI prefixes, registering DOIs and providing the necessary infrastructure to allow registrants to declare and maintain metadata and state data. RAs are also expected actively to promote the widespread adoption of the DOI, to cooperate with the IDF in the development of the DOI System as a whole and to provide services on behalf of their specific user community.
 
  
Currently, eight major RAs are active worldwide, as  [http://www.doi.org/registration_agencies.html listed] at www.doi.org:
+
==DOI assignment fees==
#[[CrossRef]] ([[USA]]) - [http://www.crossref.org/ website]
 
#[[R.R. Bowker]] (USA) - [http://www.bowker.com/ website]
 
#[[Copyright Agency Limited|CAL]] ([[Australia]]) - [http://www.copyright.com.au/ website]
 
#[[mEDRA]] ([[Europe]]) - [http://www.medra.org/it/index.htm website]
 
#[[Nielsen BookData]] ([[UK]]) - [http://www.nielsenbookdata.co.uk/ website]
 
#[[TIB]] ([[Germany]]) - [http://www.tib-hannover.de/ website]
 
#[[OPOCE]] ([[EU]]) - [http://www.publications.eu.int/ website]
 
#[[Wanfang Data]] ([[China]]) - [http://www.WanfangData.com/ website]
 
  
== See also ==
+
Unlike non-standardized URL indexing services, which are generally free, there is usually a charge to assign a new DOI name, to cover the costs of providing and operating services. These fees are set independently by each individual Registration Agency. Internally, an administrative fee is paid by the RA to the IDF, to support the costs of developing and maintaining the system.  The DOI system overall, through the IDF, operates on a not-for-profit cost-recovery basis.
* [[Uniform Resource Name]]
 
* [[Persistent Uniform Resource Locator]]
 
* [[LSID|Life Science Identifiers]]
 
  
== References ==
+
==See also==
* [http://www.doi.org/factsheets/DOIIdentifierSpecs.html Factsheet: DOI System and Internet Identifier Specifications]
+
*[[Handle System]]
 +
*[[Indecs Content Model]]
 +
*[[Uniform Resource Identifier|Uniform Resource Identifier (URI)]]
 +
*[[PURL|Persistent Uniform Resource Locator (PURL)]]
 +
*[[LSID|Life Science Identifiers]]
 +
*[[OAI]]
 +
*[[Object identifier]]
 +
*[[PubMed]]
 +
*[[Bibcode]]
 +
*[[Extensible Resource Identifier|Extensible Resource Identifier (XRI)]]
 +
*[[Universally Unique Identifier]] (UUID)
  
== External links ==
+
==Notes and references==
* [http://doi.org/ The Digital Object Identifier System] from the International DOI Foundation
+
{{reflist}}
* [http://www.handle.net/ Handle System]
+
 
 +
==External links==
 +
*[http://doi.org/ The DOI System]
 +
*[http://www.doi.org/factsheets/DOIIdentifierSpecs.html Factsheet: DOI System and Internet Identifier Specifications]
 +
*[http://www.handle.net/ The Handle System]
  
 
[[Category:Chemical publishing]]
 
[[Category:Chemical publishing]]
 
[[Category:Chemical information]]
 
[[Category:Chemical information]]
[[Category:Wikipedia content]]
+
 
 +
{{Imported from Wikipedia|name=Digital object identifier|id=305119816}}

Latest revision as of 11:08, 31 July 2009

The Digital Object Identifier (DOI) System is a managed system for persistent identification of content-related entities on digital networks[ref. needed]. These entities may be content items (digital files, physical objects, abstract works), or any related entities in a content transaction (e.g. licenses, parties, etc.). "DOI" is sometimes used to mean the identifiers within this system; hence the use of the term alone is deprecated unless the meaning is sufficiently clear from an earlier mention or the specific context: instead it should always be used in conjunction with a specific noun. The DOI name is the identifier string that specifies a unique object (the referent) within the DOI System; the DOI syntax is the form and sequence of characters comprising any DOI name, specifically the prefix element, separator, and suffix element; and the DOI System is the functional deployment of DOI names as identifiers in computer sensible form through assignment, resolution, referent description, administration, etc.

The DOI System can be used to identify physical, digital, or abstract entities; these names resolve to data specified by the registrant, and use an extensible metadata model to associate descriptive and other elements of data with the DOI Name. The DOI System is an implementation of the Handle System and of the indecs Content Model and so inherits the design principles and features of each.

The DOI System is implemented through a federation of DOI Registration Agencies, under policies and common infrastructure provided by the International DOI Foundation,[1] which developed and controls the system. The DOI System has been developed and implemented in a range of publishing applications since 2000; by early 2009 approximately 40 million DOI names had been assigned[ref. needed].

International DOI Foundation (IDF)

The International DOI Foundation (IDF), a non-profit organisation created in 1998, is the governance body of the DOI System[ref. needed]. It safeguards all intellectual property rights relating to the DOI System, manages common operational features, and supports the development and promotion of the DOI System. The IDF ensures that any improvements made to the DOI System (including creation, maintenance, registration, resolution and policymaking of DOI names) are available to any DOI registrant, and that no third party licenses might reasonably be required to practice the DOI standard.

The IDF is controlled by a Board elected by the members of the Foundation, with an appointed Managing Agent who is responsible for co-ordinating and planning its activities. Membership is open to all organizations with an interest in electronic publishing and related enabling technologies. The IDF holds annual open meetings on DOI and related issues; the 2009 meeting will be held in San Francisco in October.[2]

Applications

A DOI name can be assigned to any object that is a form of intellectual property. The term "object" has a specific sense within the DOI System: the ontology sense of any entity, like the common meaning of the word "thing", rather than any computer science sense (e.g. object-oriented programming). So "DOI" is parsed as "digital identifier of an object", rather than "identifier of a digital object". As well as identifying digital media manifestations of intellectual property, DOI names can also identify physical manifestations, performances and abstract works. For example, they can be used to identify e-texts, images, audio or video items, and software. DOI names can also be assigned to related entities in a content transaction (e.g. licenses or parties).

An entity can be identified at any arbitrary level of granularity. This means that, for instance, DOI names can identify a journal, an individual issue of a journal, an individual article in the journal or a single table in that article. The choice of granularity is left to the assigner, but in the DOI System it must be declared as part of the accompanying metadata; where an application is highly reliant on knowledge of granularity and relationships, the accompanying metadata specified as a requirement by the DOI Registration Agency will normally describe this, using a data dictionary based on the indecs Content Model.

Applications of the DOI System are provided by DOI Registration Agencies (RAs), appointed by the IDF, whose primary role is to provide services to DOI registrants: allocating DOI prefixes, registering DOI names and providing the necessary infrastructure to allow registrants to declare and maintain metadata and state data. RAs are also expected to actively promote the widespread adoption of the DOI System, to cooperate with the IDF in the development of the DOI System as a whole and to provide services on behalf of their specific user community. A list of current RAs is maintained by the International DOI Foundation.

Currently, most applications use a single redirection to a managed URL. It is expected that more applications will begin to make use of additional features in the DOI System, such as multiple resolution (the return as output of several pieces of current information related to a DOI-identified entity — specifically at least one URL plus defined data structures allowing management) and the provision of structured metadata in machine-readable form.

Major applications currently include:

An illustration of an application making good use of DOI System functionality is OECD's publication service SourceOECD: each table or graph in an OECD publication carries a DOI name that leads to an Excel file of the data underlying that table or graph. Further development of such services is planned.[3]

A multilingual European DOI RA activity, mEDRA and a Chinese RA, Wanfang Data, are active in non-English language markets. Expansion to other sectors is planned by the International DOI Foundation.

The DOI System is currently being standardised through the International Organization for Standardization, in its technical committee on identification and description, TC46/SC9. In April 2008 the Committee Draft prepared by an international Working Group was approved, after voting by ISO's national bodies, for distribution as a Draft International Standard (DIS). In February 2009, ISO provided further editorial comments on that draft. A further revised draft was submitted to ISO on 2 April 2009. Depending upon the result of ISO activities and further voting, the final standard may be published in 2009 or 2010.[4]

DOI names may be used with other appropriate technology to provide added services, e.g., the OpenURL for context sensitive linking: the DOI directory is OpenURL-enabled so can recognize a user with access to an OpenURL link resolver. Hence on resolving, metadata can be pulled from the DOI agency CrossRef to create an OpenURL targeting the current local link resolver. Such an OpenURL link that contains a DOI name is persistent; publishers who use the CrossRef DOI System to identify their content make their products OpenURL-aware.

Features and benefits

DOI names were developed with the key intended benefits of:

  • Persistent identification: each DOI name unequivocally and permanently identifies the object to which it is associated
  • Network actionability: each DOI name resolves to one or more web pages or other data assigned by the publisher
  • Semantic interoperability: metadata can be provided which allows unambiguous communication to any user, from any place, at any point of a distribution chain, with relevant pieces of information about the identified objects and their relationships

The DOI System uses two underlying technologies plus a social infrastructure to achieve this. The technical infrastructure inherits the features and capabilities of the two underlying technologies: the Handle System and the indecs content model.

The Handle System ensures that the DOI name:

  • is not based on any changeable attributes of the entity (location, ownership, or any other attribute that may change without changing the referent's identity);
  • is opaque (preferably a "dumb number": a well known pattern invites assumptions that may be misleading, and meaningful semantics may not translate across languages and may cause trademark conflicts);
  • is unique within the system (to avoid collisions and referential uncertainty);
  • supports optional but desirable features (human-readable, cut-and-paste-able, embeddable; fits common systems, e.g., the URI specification).

And that the DOI name's resolution mechanism:

  • is reliable (using redundancy, no single points of failure, and fast enough to not appear broken);
  • is scalable (higher loads simply managed with more computers);
  • is flexible (can adapt to changing computing environments; useful to new applications);
  • is trusted (both resolution and administration have technical trust methods; an operating organization is committed to the long term);
  • builds on an open architecture (encouraging a community to leverage its efforts by building applications on the infrastructure);
  • is transparent (users need not know the infrastructure details).

The Handle System's ability to provide administrative granularity, multiple resolution, and data typing were key to its selection for the DOI System. The Handle System is part of a Digital Object Architecture which relates to digital objects in a computer science sense, as an identifiable item of structured information in digital form within a network-based computer environment. Any object in the more general sense (the ontology sense, the word "thing") may be represented as a digital object, so there is no inconsistency in this use in the DOI System.

The indecs Content Model is the basis of the DOI System's approach to assigning metadata to define a referent and its relationships. This approach places importance on:

  • unique identification;
  • functional granularity;
  • appropriate access;
  • designated authority; and
  • independence of specific business model or legal framework.

The International DOI Foundation (IDF) oversees the integration of these technologies and operation of the system through a technical and social infrastructure. The social infrastructure of a federation of independent registration agencies offering DOI services was modelled on existing successful federated deployments of identifiers such as GS1 and ISBN.

Comparison with other identifier schemes

A DOI name differs from commonly used Internet pointers to material such as the URL, because it identifies an object as a first-class entity, not simply the place where the object is located. A DOI name also differs from identifiers such as the ISBN, ISRC, etc. because it can be associated with defined services and is immediately actionable on a network.

The comparison of persistent identifier approaches is difficult because they are not all doing the same thing. Imprecisely referring to a set of schemes as "identifiers" doesn't mean that they can be compared easily. Similarly, when any two technologies (e.g., two web browsers) are compared, the criteria used for comparison must be defined.

The DOI System offers persistent, semantically interoperable resolution to related current data, and is best suited to material that will be used in services outside the direct control of the issuing assigner (e.g., public citation, or managing content of value). It uses a managed registry (providing social and technical infrastructure). It does not assume any specific business model for the provision of identifiers or services, and enables other existing services to link to it in defined ways.

Other "identifier systems" may be enabling technologies with low barriers to entry, providing an easy-to-use labelling mechanism where anyone can set up a new instance (examples include PURLs, URLs, GUIDs, etc.), but they may lack some of the functionality of a registry-controlled scheme and usually lack accompanying metadata in a controlled scheme. The DOI System does not take this approach and should not be compared directly to such identifier schemes. Various applications that add features to such enabling technologies have been devised for specific sectors, and these provide some of the features offered by the DOI System (e.g. ARK).

A DOI name is not dependent on the object's location and, in this way, is similar to a Uniform Resource Name (URN) or Persistent Uniform Resource Locator (PURL) but differs from an ordinary Uniform Resource Locator (URL). URLs are often used as substitute identifiers for documents on the Internet (better characterised as URIs) although the same document at two different locations has two URLs. Persistent identifiers such as DOI names identify objects as first class entities: two instances of the same object would have the same DOI name.

Structure of DOI name (identifier string)

A DOI name consists of a unique character string divided into two parts: a prefix and a suffix. The string is case-insensitive and may contain any legal graphic characters of Unicode, although in practice certain characters, such as the pointed brackets "<" and ">", are not used.
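Case-insensitivity means that two DOI names differing only in letter case should be treated as the same identifier. A minimal sketch of such a comparison follows; the function names are hypothetical, the suffixes are made up for illustration, and folding only ASCII letters is an assumption rather than something stated here.

```python
import string

# Assumption: only ASCII letters are folded; every other character is
# compared exactly as written.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def normalize_doi(name: str) -> str:
    """Return a comparison key for a DOI name (hypothetical helper)."""
    return name.strip().translate(_ASCII_UPPER)


def same_doi(a: str, b: str) -> bool:
    """True if two DOI names are equal when ASCII letter case is ignored."""
    return normalize_doi(a) == normalize_doi(b)


if __name__ == "__main__":
    print(same_doi("10.1000/abc", "10.1000/ABC"))
    print(same_doi("10.1000/182", "10.1000/183"))
```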

An example of a complete DOI name is:

10.1000/182

where:

  • 10.1000 is the prefix:
      • 10 is the directory code. All DOI names start with "10.". This distinguishes a DOI name from any other implementation of the Handle System.
      • 1000 is the registrant's code (colloquially the publisher ID, although it may represent a publisher's imprint, one journal, or a whole organization), identifying the registrant. In this DOI name, the number "1000" identifies the International DOI Foundation.
  • 182 is the suffix, or item ID, identifying the single object. For this DOI name, the object corresponding to doi:10.1000/182 is the latest version of the DOI Handbook. (Typical suffixes are longer than this example, e.g., hdy.2009.9 or j.1365-313X.2008.03660.x.)

The prefix is assigned by a DOI Registration Agency to a specific registrant. The suffix is assigned by the registrant and must be unique within a prefix. It can integrate existing standard identifiers such as an ISBN, ISSN, or SICI. An example of an application integrating the ISBN with DOI was launched in 2009.[5]
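The prefix/suffix structure described above can be illustrated with a small parser. This is a sketch only: the names DoiParts and split_doi are hypothetical, and the choice to split at the first "/" and the first "." is an assumption drawn from the example 10.1000/182, not a full statement of the DOI syntax.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    directory: str   # always "10" for DOI names
    registrant: str  # registrant's code, e.g. "1000"
    suffix: str      # item ID chosen by the registrant


def split_doi(name: str) -> DoiParts:
    """Split a DOI name into directory code, registrant code and suffix."""
    # Assumption: the first "/" separates prefix from suffix, so the
    # suffix itself may contain further "/" characters.
    prefix, slash, suffix = name.partition("/")
    if not slash or not suffix:
        raise ValueError(f"no suffix found in {name!r}")
    directory, dot, registrant = prefix.partition(".")
    if directory != "10" or not dot or not registrant:
        raise ValueError(f"prefix must look like '10.<registrant>': {prefix!r}")
    return DoiParts(directory, registrant, suffix)


if __name__ == "__main__":
    print(split_doi("10.1000/182"))
```

The parser only separates the parts; in keeping with the opaque nature of the identifier, it does not try to read any meaning into the suffix.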

The DOI name is considered an "opaque string": nothing can be inferred from the string itself with respect to its use in the DOI System.

Citations using DOI names should be printed as doi:10.1000/182. When the citation is a hypertext link, it is recommended to embed the link as an HTTP proxy expression by prepending http://dx.doi.org/ to the DOI name, starting from the "10." directory code (e.g. the text doi:10.1000/182 is linked as http://dx.doi.org/10.1000/182).
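The two forms can be produced mechanically from a DOI name. The following is a sketch under stated assumptions: the function names are hypothetical, and percent-encoding the name with urllib.parse.quote is an added precaution for suffixes containing characters that are not URL-safe, not a rule taken from this article.

```python
from urllib.parse import quote

PROXY = "http://dx.doi.org/"  # proxy address given in the text


def citation_form(name: str) -> str:
    """Printed citation form, e.g. doi:10.1000/182."""
    return "doi:" + name


def proxy_url(name: str) -> str:
    """Hypertext link form: the proxy address followed by the DOI name."""
    # Assumption: "/" is kept literal and other unsafe characters are
    # percent-encoded so that the result is a well-formed URL.
    return PROXY + quote(name, safe="/")


if __name__ == "__main__":
    print(citation_form("10.1000/182"))
    print(proxy_url("10.1000/182"))
```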

Resolution

DOI name resolution is provided through the Handle System, developed by the Corporation for National Research Initiatives, and is freely available to any user encountering a DOI name. Resolution redirects the user from a DOI name to one or more pieces of typed data: URLs representing instances of the object, services such as e-mail, or one or more items of metadata. To the Handle System, a DOI name is a handle, and so has a set of values assigned to it and may be thought of as a record that consists of a group of fields. Each handle value must have a data type specified in its "<type>" field, which defines the syntax and semantics of its data.
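The idea of a handle as a record of typed values can be sketched as a small data structure. Everything here is illustrative: the class and function names, the index field, the type labels and the example.org data are assumptions made up for the example, not values actually registered for any DOI name.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HandleValue:
    index: int  # position of the value within the record (assumed field)
    type: str   # data type that defines how `data` is interpreted
    data: str


def values_of_type(record: List[HandleValue], wanted: str) -> List[str]:
    """Return the data of every value in the record with the given type."""
    return [value.data for value in record if value.type == wanted]


# Hypothetical record with several typed values (multiple resolution).
record = [
    HandleValue(1, "URL", "http://example.org/handbook"),
    HandleValue(2, "URL", "http://mirror.example.org/handbook"),
    HandleValue(3, "EMAIL", "contact@example.org"),
]

if __name__ == "__main__":
    print(values_of_type(record, "URL"))
```

A simple redirect would use only the first URL value, whereas an application using multiple resolution could inspect all values and choose among them by type.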

To resolve a DOI name, enter it into a DOI resolver (e.g., at www.doi.org) or represent it as an HTTP URL by prefixing the DOI name with the string

http://dx.doi.org/

For example, to resolve the DOI name 10.1000/182, enter the address "http://dx.doi.org/10.1000/182". Web pages or other hypertext documents can include hypertext links in this form. Some browsers allow the direct resolution of a DOI name (or other handle) with an add-on, e.g., the Mozilla Handle/DOI Protocol Handler.

Metadata

Each DOI name is associated with a series of metadata. The extent of this metadata may be defined by an application profile; a small kernel of common data for all DOI names can be optionally extended with other relevant data, which may be public or restricted. The metadata can be existing data from another scheme, which can be mapped to a DOI Application Profile using a data dictionary based on the indecs Content Model.

Registrants may update metadata about their contents any time they wish (when some publication data changes, when the primary URL the DOI name resolves to is modified, etc.).

DOI assignment fees

Unlike non-standardized URL indexing services, which are generally free, there is usually a charge to assign a new DOI name, to cover the costs of providing and operating services. These fees are set independently by each individual Registration Agency. Internally, an administrative fee is paid by the RA to the IDF, to support the costs of developing and maintaining the system. The DOI system overall, through the IDF, operates on a not-for-profit cost-recovery basis.


This page was originally imported from Wikipedia, specifically this version of the article "Digital object identifier". Please see the history page on Wikipedia for the original authors. This WikiChem article may have been modified since it was imported. It is licensed under the Creative Commons Attribution–Share Alike 3.0 Unported license.