# Language (11/30/2016)

## Language

• Uniquely human
• Can generate infinitely many new sentences from relatively few examples: the “paucity of examples” (often called the “poverty of the stimulus”)
• Can we really generate infinitely many distinct sentences?
• Argument 1: natural numbers are infinite
• I have one chocolate
• I have two chocolates
• I have three chocolates…
• Argument 2: sentences can be embedded recursively
• “I had a good time”
• “I think you are funny”, “He thinks I think he is funny”, “I think he thinks I think he is funny”, …
• Other species generally only repeat simple patterns that they have encountered explicitly (e.g., songbirds)

Leading theory of the last 50 years: Nativist theory

Nativist theory argues that humans are born with a universal grammar: the ability to develop language skills is innate. In recent years, this theory has been distilled into the minimalist program, which posits a single operation called “MERGE” as the evolutionary jump to language (Berwick and Chomsky).

In contrast, empiricist theory argues that children receive enough linguistic input to develop language skills, and that there is no need to assume an innate universal grammar.

From an evolutionary perspective, learning language does not seem as incremental as learning to walk. Walking shows a gradual learning curve, but with language we seem to go straight from knowing a few words to generating infinitely many sentences!

Why is there such a huge “gap”? Is there an evolutionary advantage to this superior language-acquisition ability?

Studies suggest that language skills may support other cognitive abilities:

• Babies placed in a room with an asymmetric shape or asymmetric coloring were asked to find an object hidden on one side of the room. Pre-verbal babies could not locate the object as reliably as babies who had already learned to speak.
• Language facilitates internal thought and advance planning.

## How can a computer learn language?

Grammars:

1. Regular expressions (e.g., a*bab*) = Deterministic Finite Automata (DFA)
2. Context-Free Languages = Pushdown Automata

Example grammar (a right-linear CFG, so this particular language is in fact regular):

```
S -> 0A | 1B | ε
A -> 0S | 1C
B -> 0C | 1S
C -> 0B | 1A
```
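Read as a right-linear grammar, the productions above correspond directly to a four-state DFA: each nonterminal is a state, each production a transition, and S is accepting because of S → ε. A minimal sketch of that DFA:

```python
# DFA equivalent to the right-linear grammar above.
# States: S (start, accepting via S -> ε), A, B, C.
DELTA = {
    ("S", "0"): "A", ("S", "1"): "B",
    ("A", "0"): "S", ("A", "1"): "C",
    ("B", "0"): "C", ("B", "1"): "S",
    ("C", "0"): "B", ("C", "1"): "A",
}

def accepts(word):
    """Run the DFA on a binary string; accept iff we end in state S."""
    state = "S"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "S"
```

Tracing the transitions shows what the grammar generates: S tracks “even number of 0s and even number of 1s”, A and B flip one parity each, and C flips both, so `accepts("0101")` holds while `accepts("01")` does not.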

Learning model — the learner may ask a TEACHER two kinds of queries:

1. Is $x \in L$? (membership query). TEACHER responds with YES/NO.
2. Is $\hat{L} = L$? (equivalence query), where $\hat{L}$ is the learner’s current hypothesis. TEACHER responds with YES, or, if they are not equal, with a counterexample $x \in L \oplus \hat{L}$ (the symmetric difference).

Can a computer learn a regular language from a small number of queries, i.e., polynomial in the number of states of the smallest DFA that accepts the language?
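The two query types can be sketched as a simple TEACHER object. This is an illustrative sketch, not part of the lecture: the target language, the `Teacher` class, and its method names are all hypothetical, and the equivalence query is approximated by brute-force comparison up to a length bound (a real teacher would decide equivalence exactly).

```python
from itertools import product

class Teacher:
    """Answers the two query types for a fixed target language L."""

    def __init__(self, in_language, alphabet, max_len=8):
        self.in_language = in_language   # characteristic function of L
        self.alphabet = alphabet
        self.max_len = max_len           # simplification: bounded search

    def member(self, word):
        """Membership query: is word in L?"""
        return self.in_language(word)

    def equivalent(self, run_hypothesis):
        """Equivalence query: compare the hypothesis to L on all strings
        up to max_len; return a counterexample, or None if none found."""
        for n in range(self.max_len + 1):
            for letters in product(self.alphabet, repeat=n):
                w = "".join(letters)
                if run_hypothesis(w) != self.in_language(w):
                    return w             # w lies in the symmetric difference
        return None

# Hypothetical target: binary strings with an even number of 0s and of 1s.
teacher = Teacher(lambda w: w.count("0") % 2 == 0 and w.count("1") % 2 == 0, "01")
```

For example, `teacher.equivalent(lambda w: False)` returns the empty string, since the empty string is in the target language but the all-rejecting hypothesis misses it.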

## Angluin’s Algorithm

Angluin’s L* algorithm shows the answer is yes: it learns the minimal DFA for the target language using a polynomial number of membership and equivalence queries. How can we make this algorithm neurally plausible, i.e., distributed, with little global control or synchrony, and allowing both generation and recognition? It appears that primitives such as JOIN and PJOIN could be very helpful.