Data processing

Data processing is, broadly, "the collection and manipulation of items of data to produce meaningful information."[1] In this sense it can be considered a subset of information processing, "the change (processing) of information in any manner detectable by an observer." [note 1]

The term is often used more specifically in the context of a business or other organization to refer to the class of commercial data processing applications.[2]

Data processing functions

Data processing may involve various processes, including:

  • Validation – ensuring that supplied data is "clean, correct and useful."
  • Sorting – "arranging items in some sequence and/or in different sets."
  • Summarization – reducing detail data to its main points.
  • Aggregation – combining multiple pieces of data.
  • Analysis – the "collection, organization, analysis, interpretation and presentation of data."
  • Reporting – listing detail or summary data, or computed information.
  • Classification – separating data into various categories.
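
As a rough illustration, several of the functions listed above can be sketched in a few lines of Python. The records, the field names (`region`, `amount`) and the threshold of 100 are hypothetical, chosen only for this example; real applications define their own validation rules and categories.

```python
from collections import defaultdict

# Hypothetical input records; the field names are assumptions for this sketch.
records = [
    {"region": "east", "amount": 120},
    {"region": "west", "amount": 80},
    {"region": "east", "amount": -5},   # invalid: negative amount
    {"region": "west", "amount": 200},
    {"region": None, "amount": 50},     # invalid: missing region
]


def is_valid(rec):
    """Validation: accept only records with a region and a non-negative amount."""
    return rec["region"] is not None and rec["amount"] >= 0


valid = [r for r in records if is_valid(r)]

# Sorting: arrange the valid records by amount, smallest first.
ordered = sorted(valid, key=lambda r: r["amount"])

# Classification: separate records into categories (the threshold is arbitrary).
classes = defaultdict(list)
for r in ordered:
    classes["large" if r["amount"] >= 100 else "small"].append(r)

# Aggregation: combine the amounts of all records that share a region.
totals = defaultdict(int)
for r in valid:
    totals[r["region"]] += r["amount"]

# Summarization: reduce the detail data to a few main figures.
summary = {"count": len(valid), "total": sum(r["amount"] for r in valid)}

# Reporting: list the summary and the computed per-region totals.
report_lines = [f"{region}: {total}" for region, total in sorted(totals.items())]
print(summary)
print("\n".join(report_lines))
```

Each step consumes the output of an earlier one, which is typical of how these functions are chained in practice: data is validated first so that later sorting, aggregation and reporting operate only on clean records.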

History

The United States Census Bureau illustrates the evolution of data processing from manual through electronic procedures.

Manual data processing

Although widespread use of the term data processing dates only from the nineteen-fifties,[3] data processing functions have been performed manually for millennia. For example, bookkeeping involves functions such as posting transactions and producing reports like the balance sheet and the cash flow statement. Completely manual methods were augmented by the application of mechanical or electronic calculators. A person whose job was to perform calculations manually or using a calculator was called a "computer."

The 1850 United States Census schedule was the first to gather data by individual rather than household. A number of questions could be answered by making a check in the appropriate box on the form. From 1850 through 1880 the Census Bureau employed "a system of tallying, which, by reason of the increasing number of combinations of classifications required, became increasingly complex. Only a limited number of combinations could be recorded in one tally, so it was necessary to handle the schedules 5 or 6 times, for as many independent tallies."[4] "It took over 7 years to publish the results of the 1880 census"[5] using manual processing methods.

Automatic data processing

The term automatic data processing was applied to operations performed by means of unit record equipment, such as Herman Hollerith's application of punched card equipment for the 1890 United States Census. "Using Hollerith's punchcard equipment, the Census Office was able to complete tabulating most of the 1890 census data in 2 to 3 years, compared with 7 to 8 years for the 1880 census. ... It is also estimated that using Herman Hollerith's system saved some $5 million in processing costs"[5] (in 1890 dollars) even with twice as many questions as in 1880.

Electronic data processing

Computerized data processing, or electronic data processing, represents the further evolution, with the computer taking the place of several independent pieces of equipment. The Census Bureau first made limited use of electronic computers for the 1950 United States Census, using a UNIVAC I system,[4] delivered in 1952.

Further evolution

"Data processing (DP)" has also previously been used to refer to the department within an organization responsible for the operation of data processing applications.[6] The term data processing has mostly been subsumed under the newer and somewhat more general term information technology (IT).[citation needed] "Data processing" has acquired a negative connotation, suggesting use of older technologies. As an example, in 1996 the Data Processing Management Association (DPMA) changed its name to the Association of Information Technology Professionals. Nevertheless, the terms are roughly synonymous.

Applications

Commercial data processing

Commercial data processing involves a large volume of input data, relatively few computational operations, and a large volume of output. For example, an insurance company needs to keep records on tens or hundreds of thousands of policies, print and mail bills, and receive and post payments.
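
The insurance example can be sketched as a small batch job, shown below under stated assumptions: the policy numbers, balances and payment amounts are invented, and the helper names `post_payments` and `bill_lines` are hypothetical. The point of the sketch is the shape of the work, in which many records are read and written while the only arithmetic is a single subtraction per payment.

```python
from decimal import Decimal

# Hypothetical policy master file: policy number -> outstanding balance.
balances = {
    "P-001": Decimal("250.00"),
    "P-002": Decimal("100.00"),
    "P-003": Decimal("0.00"),
}

# Hypothetical batch of received payments: (policy number, amount paid).
payments = [
    ("P-001", Decimal("250.00")),
    ("P-002", Decimal("40.00")),
]


def post_payments(balances, payments):
    """Post each payment against its policy and return the updated balances."""
    updated = dict(balances)  # copy, so the input master file is left unchanged
    for policy, amount in payments:
        updated[policy] -= amount  # the only computation performed per record
    return updated


def bill_lines(balances):
    """Produce one bill line for every policy that still has an amount due."""
    return [
        f"{policy}: amount due {due}"
        for policy, due in sorted(balances.items())
        if due > 0
    ]


updated = post_payments(balances, payments)
for line in bill_lines(updated):
    print(line)
```

A payment for an unknown policy number raises a `KeyError` in this sketch; a production system would normally route such records to an exception report instead.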

Data analysis

In a science or engineering field, the terms data processing and information systems are considered too broad, and the more specialized term data analysis is typically used. Data analysis makes use of specialized and highly accurate algorithms and statistical calculations that are less often observed in the typical general business environment.

One divergence of culture between data processing and data analysis is shown by the numerical representations generally used: in data processing, measurements are typically stored as integers, fixed-point or binary-coded decimal representations of numbers, whereas the majority of measurements in data analysis are stored as floating-point representations of rational numbers.
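
The practical difference between these representations can be illustrated with a minimal Python sketch. The amounts are arbitrary, and Python's `decimal` module is used here only as a stand-in for the decimal and fixed-point formats mentioned above, not as a model of any particular commercial system.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly, so their sum
# is not exactly equal to the floating-point value nearest to 0.30.
float_total = 0.10 + 0.20

# A decimal representation stores the digits exactly, as currency requires.
decimal_total = Decimal("0.10") + Decimal("0.20")

# Fixed-point alternative: hold amounts as a whole number of cents.
cents_total = 10 + 20

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
print(cents_total == 30)                 # True
```

For accounting totals an exact decimal result is required, which favours integer, fixed-point or decimal storage. For measured quantities, a relative error of this size is usually far below the measurement uncertainty, so floating point is an acceptable and efficient choice.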

For data analysis, packages such as SPSS or SAS, or free counterparts such as DAP, gretl or PSPP, are often used.

Notes

  1. Data processing is distinct from word processing, which manipulates text rather than data.

Further reading

  • Bourque, Linda B.; Clark, Virginia A. (1992) Processing Data: The Survey Example. (Quantitative Applications in the Social Sciences, no. 07-085). Sage Publications. ISBN 0-8039-4741-0
  • Levy, Joseph (1967) Punched Card Data Processing. McGraw-Hill Book Company.