Dirty data

Dirty data, also known as rogue data,[https://web.archive.org/web/20170831172432/https://spotlessdata.com/blog/spotless-version-12-out-now Spotless version 12 out now] are inaccurate, incomplete or inconsistent data, especially in a computer system or database.{{Cite book |last=Chu |first=Margaret Y. |title=Blissful data: wisdom and strategies for providing meaningful, useful, and accessible data for all employees |date=2004 |publisher=AMACOM |isbn=978-0-8144-0780-6 |location=New York |page=71}}

Dirty data can contain mistakes such as spelling or punctuation errors, incorrect data associated with a field, incomplete or outdated data, or records that are duplicated in the database. Such data can be cleaned through a process known as data cleansing.{{Citation | year = 2013 |last1=Wu |first1 = S. |title= A review on coarse warranty data and analysis | journal = Reliability Engineering and System Safety |volume = 114 |pages=1–11 |url =https://kar.kent.ac.uk/32972/1/LatestVersionV01.pdf |doi=10.1016/j.ress.2012.12.021}}
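
As a rough illustration, the following Python sketch flags three of the problems listed above (incomplete fields, a value that does not fit its field, and duplicated records) in a small in-memory data set. The sample records, the field names and the helper functions `normalize` and `find_issues` are hypothetical and chosen only for this example; real data cleansing tools apply many more rules.

```python
# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "age": "36"},
    {"name": "ada lovelace ", "email": "ADA@example.com", "age": "36"},    # duplicate once normalized
    {"name": "Alan Turing", "email": "", "age": "41"},                     # incomplete: no email
    {"name": "Grace Hopper", "email": "grace@example.com", "age": "n/a"},  # age is not numeric
]


def normalize(record):
    """Trim whitespace and lowercase every value so near-duplicates compare equal."""
    return {key: value.strip().lower() for key, value in record.items()}


def find_issues(records):
    """Return (index, problem) pairs for each defect found, in record order."""
    issues = []
    seen = {}  # normalized record -> index of its first occurrence
    for index, record in enumerate(records):
        clean = normalize(record)

        # Incomplete data: any field left empty.
        for key, value in clean.items():
            if not value:
                issues.append((index, f"missing {key}"))

        # Incorrect data for a field: a non-empty age must consist of digits.
        if clean["age"] and not clean["age"].isdigit():
            issues.append((index, "age is not a number"))

        # Duplicated data: same normalized content as an earlier record.
        fingerprint = tuple(sorted(clean.items()))
        if fingerprint in seen:
            issues.append((index, f"duplicate of record {seen[fingerprint]}"))
        else:
            seen[fingerprint] = index
    return issues


for index, problem in find_issues(RECORDS):
    print(f"record {index}: {problem}")
```

The sketch only reports problems and leaves the decision about how to repair or discard each record to a later step, which is one common way to structure a cleansing process.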

Dirty data (social science)

In sociology, dirty data are secret data whose discovery discredits those who kept them secret. In the definition of Gary T. Marx, Professor Emeritus at MIT, dirty data are one of four types of data, distinguished by whether the data are secretive and whether they are discrediting:{{Cite web|url=http://web.mit.edu/gtmarx/www/dirty.html|title=Notes on the discovery, collection, and assessment of hidden and|website=web.mit.edu|access-date=2017-02-17}}{{Cite journal |last=Walby |first=Kevin |last2=Larsen |first2=Mike |date=2012-01-01 |title=Access to Information and Freedom of Information Requests: Neglected Means of Data Production in the Social Sciences |url=http://journals.sagepub.com/doi/10.1177/1077800411427844 |journal=Qualitative Inquiry |language=en |volume=18 |issue=1 |pages=31–42 |doi=10.1177/1077800411427844 |issn=1077-8004}}{{Cite web |last=Roe |first=David |date=April 27, 2021 |title=What are the Most Common Types of Dirty Data? |url=https://domycoding.com/ |website=DMCoding}}

  • Nonsecretive and nondiscrediting data:
      • Routinely available information.
  • Secretive and nondiscrediting data:
      • Strategic and fraternal secrets, privacy.
  • Nonsecretive and discrediting data:
      • sanction immunity,
      • normative dissensus,
      • selective dissensus,
      • making good on a threat for credibility,
      • discovered dirty data.
  • Secretive and discrediting data: Hidden and dirty data.

References

{{reflist}}

Category:Data quality

{{database-stub}}