Catpac

{{short description|Text analysis software}}

Catpac is a computer program that analyzes text samples to identify the key concepts they contain. It was conceived chiefly by Richard Holmes, a Michigan State computer programmer, and Joseph Woelfel, a University at Albany and University at Buffalo sociologist, for the analysis of attitude formation and change in a sociological context. From 1981 to 1984, working on the Univac 1100 mainframe, Rob Zimmelman, an undergraduate and then graduate student at the University at Albany, integrated Catpac into the Galileo*Telegal system, added text labeling, and ported Catpac output to the Galileo system of paired-comparison conceptual visualization. Other students at the university also contributed to the software. Catpac and the Galileo system remain in commercial use and continue to grow, with recent contributions in data capture and visualization. Catpac takes text files as input and produces output such as word frequencies (ranked and alphabetical) and various types of cluster analysis.{{cite web |url=http://academic.csuohio.edu/kneuendorf/content/cpuca/qtap.htm |title=Quantitative Text Analysis Programs |accessdate=2010-11-26 |url-status=dead |archiveurl=https://archive.today/20120701153803/http://academic.csuohio.edu/kneuendorf/content/cpuca/qtap.htm |archivedate=2012-07-01 }}

Design

Catpac is a self-organizing, i.e. unsupervised, interactive activation and competition (IAC) artificial neural network used for text analysis.

{{cite web

|url=http://www.galileoco.com/Manuals/CATPAC3.pdf

|title=Catpac II User's Guide

|first=Joseph|last=Woelfel

|authorlink=Joseph Woelfel

|edition=Version 2.0

|publisher=The Galileo Company

}}{{Cite web |title=ANN MULTILINGUAL TEXT PATTERN RECOGNITION |url=http://www.galileoco.com/literature/Wolfpak10a.pdf |website=www.galileoco.com}} The program generates a multidimensional scalar output organizing words throughout the text by creating a weighted word-by-word matrix that establishes the eigenvector centralities of concepts.

{{cite conference

|last1=Egnoto|first1=M.

|last2=Nam|first2=Y.

|last3=Vishwanath|first3=A

|date=November 2010

|title=A longitudinal analysis of the newspaper coverage of cell phones

|conference=National Communication Association Conference

|location=San Francisco, CA.

}} The word-by-word matrix represents the relationship between one word and the occurrence of another.

{{cite journal

|last1=Doerfel|first1=M. L.

|last2=Barnett|first2=G. A.

|year=1999

|title=A semantic network analysis of the International Communication Association

|journal=Human Communication Research

|volume=25|number=4|pages=589–603 |doi=10.1111/j.1468-2958.1999.tb00463.x

|citeseerx=10.1.1.531.2227

}} Catpac identifies important words and patterns based on the organization of the text. This process mimics the connections between neurons in a human brain, strengthening connections through conditioning to generate a pattern of similarities among all words within a body of text.
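
The general idea of a weighted word-by-word matrix and eigenvector centrality can be illustrated with a minimal Python sketch. This is not Catpac's actual implementation: the sliding-window co-occurrence counting, the window size, and the function names `tokenize`, `cooccurrence_matrix`, and `eigenvector_centrality` are all assumptions chosen for illustration, and the neural-network conditioning step is not modeled.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cooccurrence_matrix(tokens, vocab, window=3):
    """Build a symmetric word-by-word matrix of co-occurrence counts.

    A window of `window` tokens slides over the text one token at a
    time; every pair of distinct vocabulary words that appears inside
    the same window has its cell incremented by one.
    (Hypothetical counting scheme, not Catpac's documented behavior.)
    """
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    matrix = [[0.0] * n for _ in range(n)]
    for start in range(len(tokens) - window + 1):
        present = sorted({index[t] for t in tokens[start:start + window]
                          if t in index})
        for a in range(len(present)):
            for b in range(a + 1, len(present)):
                i, j = present[a], present[b]
                matrix[i][j] += 1.0
                matrix[j][i] += 1.0
    return matrix


def eigenvector_centrality(matrix, iterations=200, tol=1e-10):
    """Approximate the dominant eigenvector by power iteration.

    The iteration uses (matrix + identity), which has the same
    eigenvectors as `matrix` but does not oscillate when the
    co-occurrence graph happens to be bipartite.
    """
    n = len(matrix)
    if n == 0:
        return []
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) + v[i]
             for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w
        v = w
    return v


if __name__ == "__main__":
    sample = "Cats chase mice. Dogs chase cats. Mice fear cats and dogs."
    tokens = tokenize(sample)
    # Keep the five most frequent words as the vocabulary (arbitrary cutoff).
    vocab = [word for word, _ in Counter(tokens).most_common(5)]
    scores = eigenvector_centrality(cooccurrence_matrix(tokens, vocab))
    for word, score in sorted(zip(vocab, scores), key=lambda p: -p[1]):
        print(f"{word}: {score:.3f}")
```

Words that co-occur with many other well-connected words receive higher scores, which is the sense in which such a matrix can establish the relative centrality of concepts in a text.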

Use

Catpac has been used in commercial studies and in academic scholarship to investigate massive textual data sets,

{{cite journal

|last1=Chen|first1=H.

|last2=Evans|first2=C.

|last3=Battleson|first3=B.

|last4=Zubrow|first4=E.

|last5=Woelfel|first5=J.

|title=Procedures for the precise analysis of massive textual datasets

|journal=Communication & Science Journal

|date=10 October 2011

}}

{{cite journal

|last1=Doerfel|first1=M. L.

|last2=Barnett|first2=G. A.

|year=1996

|title=The use of CATPAC for textual analysis

|journal=Field Methods

|volume=8|number=2|pages=4–7 |doi=10.1177/1525822x960080020501

|s2cid=144484166

}} as a strong semantic network analysis tool,

{{cite conference

|last1=Ortega|first1=C.R.

|last2=Egnoto|first2=M.J.

|year=2011

|title=Longitudinal analysis of press coverage of violent video games: Assessing agenda-setting via semantic and LIWC analyses

|conference=NYSCA conference

}} for longitudinal analyses,

{{cite journal

|last1=Kim|first1=J.H.

|last2=Su|first2=T-Y.

|last3=Hong|first3=J.

|year=2007

|title=The influence of geopolitics and foreign policy on the U.S. and Canadian media: An analysis of newspaper coverage of Sudan's Darfur conflict

|journal=The Harvard International Journal of Press/Politics

|volume=12|number=3|pages=87–95 |doi=10.1177/1081180x07302972

|s2cid=220748200

}}

{{cite journal

|last1=Murphy|first1=P.

|last2=Maynard|first2=M.

|year=2000

|title=Framing the genetic testing issue: Discourse and cultural clashes among policy communities

|journal=Science Communication

|volume=22|number=2|pages=133–153 |doi=10.1177/1075547000022002002

|s2cid=143663868

}}

{{cite journal

|last1=Rosen|first1=D.

|last2=Woelfel|first2=J.

|last3=Krikorian|first3=D.

|last4=Barnett|first4=G.A.

|year=2003

|title=Procedures for analyses of online communities

|journal=Journal of Computer-Mediated Communication

|volume=8|number=4

}} for multilingual analyses,

{{cite conference

|last1=Evans|first1=C.

|last2=Chen|first2=H.

|last3=Battleson|first3=B.

|last4=Wölfel|first4=J.K.

|last5=Woelfel|first5=J.

|year=2008

|title=Neural networks for pattern recognition in multilingual text

|conference=International Network for Social Network Analysis (INSNA) Sunbelt conference

|location=St. Pete Beach, FL

}}

{{cite book

|last1=Evans|first1=C.

|last2=Chen|first2=H.

|last3=Battleson|first3=B.

|last4=Wölfel|first4=J.K.

|last5=Woelfel|first5=J.

|year=2010

|title=Unsupervised artificial neural networks for pattern recognition in multilingual text

|location=Amherst, NY

|publisher=RAH Press

}} as a predictor of media usage

{{cite book

|last1=Cheong|first1=P.

|last2=Hwang|first2=J.

|last3=Elbirt|first3=B.

|last4=Chen|first4=H.

|last5=Evans|first5=C.

|last6=Woelfel|first6=J

|year=2010

|chapter=Media use as a function of identity: The role of the self concept in media usage

|editor-first=M.|editor-last=Hinner

|title=Freiberger beiträge zur interkulturellen und wirtschaftskommunikation

|trans-title=A forum for general and intercultural business communication

|volume=6

|series=The interrelationship of business and communication

|pages=365–381

|location=Berlin|publisher=Peter Lang

}} and as a powerful content analysis tool.

{{cite book

|last=Krippendorff|first=K.

|year=2004

|title=Content analysis: An introduction to its methodology

|edition=2nd

|location=Thousand Oaks, CA

|publisher=SAGE Publications

}}

{{cite web

|last=Neuendorf

|first=K.

|title=Quantitative text analysis programs

|work=The Content Analysis Guidebook Online

|accessdate=26 November 2010

|url=http://academic.csuohio.edu/kneuendorf/content/cpuca/qtap.htm

|url-status=dead

|archiveurl=https://archive.today/20120701153803/http://academic.csuohio.edu/kneuendorf/content/cpuca/qtap.htm

|archivedate=1 July 2012

}}

Availability

Catpac, conceived more than 30 years ago as an improvement on simple word-count software, is currently available as a 32-bit Windows program.

References

{{reflist}}