concept class

{{Multiple issues|{{context|date=December 2021}}

{{rewrite|the article should cover more aspects of concept classes in a more balanced way, so as to avoid violating WP:UNDUE|date=December 2021}}}}In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory.

Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning. [https://arxiv.org/abs/1801.06566 Chase, H., & Freitag, J. (2018). Model Theory and Machine Learning. arXiv preprint arXiv:1801.06566]. In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map $c: X\to Y$ , i.e. from examples to classifier labels (where $Y = \{0, 1\}$ and where c is a subset of X), c is then said to be a concept. A concept class $C$ is then a collection of such concepts.

Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s. Not every subclass is reachable.{{cite journal|last=Angluin|first=D.|year=2004|title=Queries revisited|url=https://www.sciencedirect.com/science/article/pii/S030439750300608X/pdf?md5=c06f6d6f5b10d73c3e05a1321128a67e&pid=1-s2.0-S030439750300608X-main.pdf|journal=Theoretical Computer Science|volume=313|issue=2|pages=188–191|doi=10.1016/j.tcs.2003.11.004 }}{{Why|date=December 2021}}

Background

A sample $s$ is a partial function from $X$ {{What|date=December 2021|reason=What is "X"?}} to $\{0, 1\}$ . Identifying a concept with its characteristic function mapping $X$ to $\{0, 1\}$ , it is a special case of a sample.

Two samples are consistent if they agree on the intersection of their domains. A sample $s'$ extends another sample $s$ if the two are consistent and the domain of $s$ is contained in the domain of $s'$ .

Examples

Suppose that $C = S^+(X)$ . Then:

the subclass $\{\{x\}\}$ is reachable with the sample $s = \{(x, 1)\}$ ;{{Why|date=December 2021}}
the subclass $S^+(Y)$ for $Y\subseteq X$ are reachable with a sample that maps the elements of $X - Y$ to zero;{{Why|date=December 2021}}
the subclass $S(X)$ , which consists of the singleton sets, is not reachable.{{Why|date=December 2021}}

Applications

Let $C$ be some concept class. For any concept $c\in C$ , we call this concept $1/d$ -good for a positive integer $d$ if, for all $x\in X$ , at least $1/d$ of the concepts in $C$ agree with $c$ on the classification of $x$ . The fingerprint dimension $FD(C)$ of the entire concept class $C$ is the least positive integer $d$ such that every reachable subclass $C'\subseteq C$ contains a concept that is $1/d$ -good for it. This quantity can be used to bound the minimum number of equivalence queries{{What|date=December 2021|reason=What is an "equivalence query"?}} needed to learn a class of concepts according to the following inequality: $FD(C) - 1\leq \#EQ(C)\leq \lceil FD(C)\ln(|C|)\rceil$ .

References

Category:Computational learning theory