University of Texas at Austin

University of Texas Libraries

The Quality of Data



While people often refer to scientific data points as "facts," one scientist has stated that "there are no facts - just measurements embedded within assumptions." (1)

The accuracy of data published in the primary literature (e.g., in peer-reviewed journals) should not be assumed. Reviewers rarely examine such data closely. Experimental and measurement errors can occur. Authors can be sloppy in their use of units and symbols, and errors and typos creep in during the editing process. The pressure to keep articles short and omit "unnecessary" tables and graphs means that some useful data do not appear in published articles at all. If that's not bad enough, errors that make it into the literature can be propagated elsewhere almost indefinitely, creating confusion and uncertainty about even basic properties.

Most, but not all, data "handbooks" are secondary sources, meaning they are compilations of data previously reported in the primary literature. The reliability of a secondary source depends on both the quality of the original data and the care taken in compiling and evaluating them. Most compilations provide literature references for the data. Those that don't include such references should be used with caution. The age of the data is also relevant. The enthalpy of a compound is the same today as it was in 1905; what has changed is the precision of measurement and estimation methods. Older data may be perfectly valid, but they should be compared to more recent values if they can be found.

The same caveats apply to data you might find on the Internet. A value found on a college lab course web page, in an MSDS, or in Wikipedia (2) cannot be treated the same as a value contained in a NIST database. The bottom line is that all sources of data should be viewed with a critical eye. Ask these questions: Is the source cited? When was this work done, and by whom? Were the data determined experimentally or derived by calculation (estimated)? What methods, experimental parameters, or special conditions applied? If you can't answer these questions, the data probably should not be trusted.
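For readers who keep collected values in a script or spreadsheet export, the questions above can be recorded alongside each value. The following is only a minimal sketch: the `DataPoint` class, its field names, and the `unanswered_questions` helper are hypothetical names invented for this illustration, and the value shown is a placeholder, not a real measurement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataPoint:
    """A property value plus the provenance details the questions ask about.

    All field names are hypothetical, chosen for this illustration only.
    """
    value: float
    units: str
    citation: Optional[str] = None    # Is the source cited?
    year: Optional[int] = None        # When was the work done...
    authors: Optional[str] = None     # ...and by whom?
    origin: Optional[str] = None      # "experimental" or "estimated"
    conditions: Optional[str] = None  # methods, parameters, special conditions


def unanswered_questions(point: DataPoint) -> List[str]:
    """Return the checklist questions that cannot be answered for this value."""
    checklist = [
        ("Is the source cited?", [point.citation]),
        ("When was this work done, and by whom?", [point.year, point.authors]),
        ("Were the data determined experimentally or estimated?", [point.origin]),
        ("What methods, parameters, or special conditions applied?",
         [point.conditions]),
    ]
    # A question counts as unanswered if any field it relies on is missing.
    return [
        question
        for question, fields in checklist
        if any(field is None or field == "" for field in fields)
    ]


if __name__ == "__main__":
    # Placeholder value with no provenance at all: every question is open.
    bare = DataPoint(value=1.0, units="kJ/mol")
    for question in unanswered_questions(bare):
        print("Unanswered:", question)
```

A value that still has open questions is not necessarily wrong; the list simply shows what remains to be checked before relying on it.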

The term "critically evaluated," while occasionally overused, is a useful one to look for in secondary sources. It usually implies that someone has evaluated the data and procedures for internal consistency, and, in cases where conflicting values have been reported, established a set of recommended values. It does not mean that experts have repeated and verified the measurements themselves. Touloukian provides a useful overview of critical evaluation, stating that "while 'critical analysis' always sets a 'level of confidence' for the recommended values, there is no implication whatsoever of high accuracy or precision in these values." (3) Most primary literature and secondary compilations are uncritical, however.
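When several sources report conflicting values for the same property, a rough first screen can show which ones disagree with the rest. The sketch below is not the procedure used by CINDAS or any other evaluation center; it is a simple median comparison assumed here for illustration, and the function name `flag_discordant`, the 5% tolerance, the source labels, and the numbers are all made up.

```python
from statistics import median
from typing import Dict, List


def flag_discordant(values: Dict[str, float], rel_tol: float = 0.05) -> List[str]:
    """Return labels of sources whose value differs from the median of all
    reported values by more than rel_tol (a fraction, e.g. 0.05 for 5%)."""
    center = median(values.values())
    if center == 0:
        raise ValueError("relative deviation is undefined when the median is zero")
    return [
        source
        for source, value in values.items()
        if abs(value - center) / abs(center) > rel_tol
    ]


if __name__ == "__main__":
    # Made-up numbers for three hypothetical sources of the same property.
    reported = {"source A": 100.0, "source B": 101.0, "source C": 120.0}
    print(flag_discordant(reported))  # ['source C']
```

A flagged value only signals disagreement. Deciding which value to recommend still requires examining the methods and conditions behind each measurement, which is what a critical evaluation does.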

Sources of Critically Evaluated Data

CINDAS
Founded by the physicist Y.S. Touloukian (1920-1981), the Center for Information and Numerical Data Analysis and Synthesis (formerly at Purdue University, now private) has carried out a systematic research program on the properties and behavior of materials since 1960.

DECHEMA
The Society for Chemical Engineering and Biotechnology is based in Frankfurt, Germany.

DIPPR
The Design Institute for Physical Properties is affiliated with AIChE. Its Project 801 is now located at Brigham Young University.

IUPAC
Data published under the auspices of the International Union of Pure and Applied Chemistry are particularly valuable in the areas of solubility, electrochemistry, and thermodynamics.

National Institute of Standards and Technology (formerly National Bureau of Standards)
NIST's National Standard Reference Data System is a coordinating body for many different data collection centers. Data collected and evaluated by NIST can be considered very reliable.

Thermodynamics Research Center
Founded in 1942 by Frederick D. Rossini, the TRC was originally established to undertake the American Petroleum Institute Research Project 44, whose goal was to establish a comprehensive archive of critically evaluated data for the petroleum refining industry. It is now part of NIST.


Notes

  1. Bradley, J.C. Blog post 6/22/2011.
  2. Walker, M.A. "Wikipedia as a resource for chemistry." ACS Symp. Ser. 1060, 2011, 79-92.
    [Wikipedia chembox: Wikipedia pages on common chemical compounds usually contain an infobox sidebar called a chembox, which provides data values for properties that are not likely to change. Data marked with a check mark have been "verified" by WikiProject Chemicals. However, any data lacking a specific literature citation, and any value lacking the verification check mark, should be verified independently before being trusted.]
  3. Touloukian, Y.S. "Twenty-five years of pioneering accomplishments by CINDAS--a retrospective review." Int. J. Thermophysics 2, 1981, 205-222.