Mathematically speaking, grammatical inference is the problem of inferring the “formal language” that can describe a given set of expressions. In the slightly looser sense used by linguists studying natural human language, it is discovering the syntactic rules of a given language, which is kinda the same thing but with every term sloppier, and the subject matter itself messier.
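In the string setting, the flavour of the problem fits in a few lines. Below is a toy induction procedure in the spirit of Re-Pair/Sequitur-style compression (cf. Nevill-Manning and Witten, in the readings below): repeatedly replace the most frequent adjacent pair of symbols with a fresh nonterminal, producing a grammar that generates exactly the observed string. The names and structure are an illustrative sketch, not any particular library’s API.

```python
from collections import Counter

def induce_grammar(seq):
    """Re-Pair-flavoured sketch: replace the most frequent digram with a
    fresh nonterminal until no digram repeats."""
    seq = list(seq)
    rules = {}          # nonterminal -> (symbol, symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:   # no repeated digram left to compress
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair
        # replace non-overlapping occurrences of the digram, left to right
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(symbol, rules):
    """Invert the grammar: rewrite nonterminals until only terminals remain."""
    if symbol not in rules:
        return [symbol]
    a, b = rules[symbol]
    return expand(a, rules) + expand(b, rules)

start, rules = induce_grammar("abcabcabc")
assert "".join(t for sym in start for t in expand(sym, rules)) == "abcabcabc"
```

The induced grammar generates exactly one string, which is the degenerate end of the spectrum; the interesting versions generalise, i.e. they accept strings they never saw.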

This is already a crazy complex area, and, being naturally perverse, I am interested in an especially esoteric corner of it, to wit, grammars of things that *aren’t* speech. Inferring design grammars, say, could allow you to produce more things off the same “basic plan” from some examples of the thing: look at enough trees and you know how to build the rest of the forest, that kind of thing. I’m especially interested in things expressed not as a sequence of symbols from a finite alphabet (i.e. not over the free monoid, or over the free monoid but with the symbol expression hidden). This is a boutique interest. I also care about probabilistic grammars, i.e. assigning measures to these things. That part is not at all boutique.

Normally, work on design grammars deals with simple languages, such as, say, “regular” languages. I’m interested in things a rung or two up the Chomsky hierarchy: context-free grammars, maybe even context-sensitive ones.
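For concreteness, here is what a measure over a context-free language looks like in code: a toy probabilistic context-free grammar with an ancestral sampler that expands nonterminals top-down. The grammar, its probabilities, and the function names are all invented for illustration.

```python
import random

# A toy probabilistic context-free grammar (PCFG): each nonterminal maps to
# productions as (probability, right-hand-side) pairs; the probabilities for
# each nonterminal sum to 1. The grammar itself is made up for illustration.
PCFG = {
    "S":  [(0.7, ["NP", "VP"]), (0.3, ["VP"])],
    "NP": [(0.5, ["the", "N"]), (0.5, ["N"])],
    "VP": [(1.0, ["V", "NP"])],
    "N":  [(0.6, ["tide"]), (0.4, ["moon"])],
    "V":  [(1.0, ["moves"])],
}

def sample(symbol, rng):
    """Ancestral sampling: expand nonterminals top-down until only terminals remain."""
    if symbol not in PCFG:           # terminal symbol
        return [symbol]
    r, acc = rng.random(), 0.0
    for p, rhs in PCFG[symbol]:      # choose a production by its probability
        acc += p
        if r <= acc:
            return [tok for s in rhs for tok in sample(s, rng)]
    return [tok for s in PCFG[symbol][-1][1] for tok in sample(s, rng)]

rng = random.Random(0)
print(" ".join(sample("S", rng)))
```

The measure a PCFG assigns to a full derivation is just the product of the production probabilities used along the way; inference runs this in reverse, estimating the probabilities (and, harder, the rules themselves) from samples.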

See also design grammars, iterated function systems and my research proposal in this area.
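The trees-and-forests intuition is the classic Lindenmayer-system story (cf. Smith, 1984, in the readings below). A deterministic L-system rewrites every symbol of a string in parallel at each step; under a turtle-graphics interpretation, bracketed rules yield branching, plant-like structures. A minimal sketch, with the particular rules being a textbook-style example rather than anything canonical:

```python
# A deterministic (D0L) Lindenmayer system: every symbol is rewritten in
# parallel at each step. Read "F" as draw-forward, "+"/"-" as turns, and
# "[" / "]" as push/pop of turtle state; "X" is a placeholder nonterminal.
RULES = {"X": "F[+X][-X]FX", "F": "FF"}

def rewrite(axiom, rules, steps):
    """Apply all rewrite rules simultaneously, `steps` times."""
    s = axiom
    for _ in range(steps):
        s = "".join(rules.get(c, c) for c in s)
    return s

print(rewrite("X", RULES, 1))   # F[+X][-X]FX
```

Making the rule choice stochastic (several right-hand sides per symbol, with probabilities) is exactly the “assigning measures” move, and inferring such rules from observed structures is the inverse problem these notes care about.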

## Software

## Things to read

Peter Norvig on Chomsky and statistical versus explanatory models of natural language syntax. Full of sick burns.

> In January of 2011, television personality Bill O’Reilly weighed in on more than one culture war with his statement “tide goes in, tide goes out. Never a miscommunication. You can’t explain that,” which he proposed as an argument for the existence of God. […] O’Reilly realizes that it doesn’t matter what his detractors think of his astronomical ignorance, because his supporters think he has gotten exactly to the key issue:
>
> *why*? He doesn’t care *how* the tides work, tell him *why* they work. *Why* is the moon at the right distance to provide a gentle tide, and exert a stabilizing effect on earth’s axis of rotation, thus protecting life here? *Why* does gravity work the way it does? Why does anything at all exist rather than not exist? O’Reilly is correct that these questions can only be addressed by mythmaking, religion or philosophy, not by science. Chomsky has a philosophy based on the idea that we should focus on the deep whys and that mere explanations of reality don’t matter. In this, Chomsky is in complete agreement with O’Reilly. (I recognize that the previous sentence would have an extremely low probability in a probabilistic model trained on a newspaper or TV corpus.)

Cosma Shalizi’s inevitable mention in 3, 2, 1… go

- Arau00
- Araujo, L. (2000) Evolutionary parsing for a probabilistic context free grammar. In Proc. of the Int. Conf. on Rough Sets and Current Trends in Computing (RSCTC-2000), Lecture Notes in Computer Science 2005 (p. 590).
- AKHC06
- Ates, K., Kukluk, J., Holder, L., Cook, D., & Zhang, K. (2006) Graph grammar induction on structural data for visual programming. In 18th IEEE International Conference on Tools with Artificial Intelligence, 2006. ICTAI ’06 (pp. 232–242). DOI.
- BaPo99
- Badii, R., & Politi, A. (1999) Complexity: Hierarchical Structures and Scaling in Physics (Cambridge Nonlinear Science Series). Cambridge University Press
- BDGH10
- Bailly, R., Denis, F., Gilbert, É., & Habrard, A. (2010) Using PCA for Probabilistic Grammatical Inference on Trees.
- BaHu93
- Barnsley, M., & Hurd, L. (1993) Fractal Image Compression. A K Peters/CRC Press
- Bart83
- Bartsch-Spörl, B. (1983) Grammatical inference of graph grammars for syntactic pattern recognition. In H. Ehrig, M. Nagl, & G. Rozenberg (Eds.), Graph-Grammars and Their Application to Computer Science (Vol. 153, pp. 1–7). Springer Berlin / Heidelberg
- BTCB14
- Bolhuis, J. J., Tattersall, I., Chomsky, N., & Berwick, R. C.(2014) How Could Language Have Evolved?.
*PLoS Biol*, 12(8), e1001934. DOI.
- BoBV12
- Boulanger-Lewandowski, N., Bengio, Y., & Vincent, P. (2012) Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription. In 29th International Conference on Machine Learning.
- CaHi00
- Casacuberta, F., & de la Higuera, C. (2000) Computational Complexity of Problems on Probabilistic Grammars and Transducers. In A. Oliveira (Ed.), Grammatical Inference: Algorithms and Applications (Vol. 1891, pp. 211–212). Springer Berlin / Heidelberg
- Char96a
- Charniak, E. (1996a) Statistical Language Learning. (Reprint.). A Bradford Book
- Char96b
- Charniak, E. (1996b) Tree-bank Grammars. In In Proceedings of the Thirteenth National Conference on Artificial Intelligence (pp. 1031–1036).
- ChMa06
- Chater, N., & Manning, C. D.(2006) Probabilistic models of language processing and acquisition.
*Trends Cogn Sci*, 10(7), 335–344. DOI.
- Chom56
- Chomsky, N. (1956) Three models for the description of language.
*IRE Transactions on Information Theory*, 2(3), 113–124. DOI.
- Chom69
- Chomsky, N. (1969) Aspects of the Theory of Syntax. The MIT Press
- Chom02
- Chomsky, N. (2002) Syntactic Structures. (2nd ed.). Walter de Gruyter
- CDEC02
- Christiansen, M. H., Dale, R. A. C., Ellefson, M. R., & Conway, C. M.(2002) The role of sequential learning in language evolution: computational and experimental studies. In A. Cangelosi & D. Parisi (Eds.), Simulating the Evolution of Language. Springer-Verlag New York, Inc.
- CrVi08
- Cruz-Alcázar, P. P., & Vidal, E. (2008) Two grammatical inference applications in music processing.
*Applied Artificial Intelligence*, 22(1-2), 53–76. DOI.
- CrVi98
- Cruz-Alcázar, P., & Vidal-Ruiz, E. (1998) Learning regular grammars to model musical style: Comparing different coding schemes. In V. Honavar & G. Slutzki (Eds.), Grammatical Inference (Vol. 1433, pp. 211–222). Springer Berlin / Heidelberg
- Dede11
- DeDeo, S. (2011) Effective Theories for Circuits and Automata.
*Chaos*, 21. DOI.
- Higu00
- de la Higuera, C. (2000) Current Trends in Grammatical Inference. In F. Ferri, J. Iñesta, A. Amin, & P. Pudil (Eds.), Advances in Pattern Recognition (Vol. 1876, pp. 28–31). Springer Berlin / Heidelberg
- Higu05
- de la Higuera, C. (2005) A bibliographical study of grammatical inference.
*Pattern Recognition*, 38(9), 1332–1348. DOI.
- Higu10
- de la Higuera, C. (2010) Grammatical inference: learning automata and grammars. Cambridge; New York: Cambridge University Press
- HiPT06
- de la Higuera, C., Piat, F., & Tantini, F. (2006) Learning Stochastic Finite Automata for Musical Style Recognition. In J. Farré, I. Litovsky, & S. Schmitz (Eds.), Implementation and Application of Automata (Vol. 3845, pp. 345–346). Springer Berlin / Heidelberg
- DeLT04
- Denis, F., Lemay, A., & Terlutte, A. (2004) Learning regular languages using RFSAs.
*Algorithmic Learning Theory*, 313, 267–294. DOI.
- DLGT13
- Duvenaud, D., Lloyd, J. R., Grosse, R., Tenenbaum, J. B., & Ghahramani, Z. (2013) Structure Discovery in Nonparametric Regression through Compositional Kernel Search.
*arXiv:1302.4922 [cs, Stat]*.
- DTBK99
- Džeroski, S., Todorovski, L., Bratko, I., Kompare, B., & Križman, V. (1999) Equation discovery with ecological applications. In A. H. Fielding (Ed.), Machine Learning Methods for Ecological Applications (pp. 185–207). Springer US
- Elle14
- Ellerman, D. (2014) Four Ways from Universal to Particular: How Chomsky’s Language-Acquisition Faculty is Not Selectionist.
*arXiv:1410.4501 [math]*.
- Elli71
- Ellis, C. A.(1971) Probabilistic tree automata.
*Information and Control*, 19(5), 401–416. DOI.
- Elma90
- Elman, J. L.(1990) Finding structure in time.
*Cognitive Science*, 14, 179–211. DOI.
- Elma91
- Elman, J. L.(1991) Distributed representations, simple recurrent networks, and grammatical structure.
*Machine Learning*, 7, 195–225. DOI.
- Elma93
- Elman, J. L.(1993) Learning and development in neural networks: the importance of starting small.
*Cognition*, 48, 71–99. DOI.
- Elma03
- Elman, J. L.(2003) Generalization from Sparse Input. In Proceedings of the 38th Annual Meeting of the Chicago Linguistic Society. Citeseer
- FeKH04
- Fee, M. S., Kozhevnikov, A. A., & Hahnloser, R. H.(2004) Neural mechanisms of vocal sequence generation in the songbird.
*Annals of the New York Academy of Sciences*, 1016, 153–170. DOI.
- Flas07
- Flasiński, M. (2007) Inference of parsable graph grammars for syntactic pattern recognition.
*Fundamenta Informaticae*, 80(4), 379–413.
- GaBa14
- Gaillard, P., & Baudin, P. (2014) A consistent deterministic regression tree for non-parametric prediction of time series.
*arXiv:1405.1533 [cs, Math, Stat]*.
- GaLe14
- Gallo, S., & Leonardi, F. G.(2014) Nonparametric statistical inference for the context tree of a stationary ergodic process.
*arXiv:1411.7650 [math, Stat]*.
- GiSB11
- Giesa, T., Spivak, D., & Buehler, M. (2011) Reoccurring Patterns in Hierarchical Protein Materials and Music: The Power of Analogies.
*BioNanoScience*, 1(4), 153–161. DOI.
- Gold67
- Gold, E. M.(1967) Language identification in the limit.
*Information and Control*, 10(5), 447–474. DOI.
- GSFT12
- Grosse, R., Salakhutdinov, R. R., Freeman, W. T., & Tenenbaum, J. B.(2012) Exploiting compositionality to explore a large space of model structures.
*arXiv:1210.4856 [cs, Stat]*.
- Grün96
- Grünwald, P. (1996) A minimum description length approach to grammar inference. In Connectionist, Statistical, and Symbolic Approaches to Learning for Natural Language Processing (Vol. 1040, pp. 203–216). London, UK, UK: Springer-Verlag
- HeLM04
- Heckel, R., Lajios, G., & Menge, S. (2004) Stochastic Graph Transformation Systems. In H. Ehrig, G. Engels, F. Parisi-Presicce, & G. Rozenberg (Eds.), Graph Transformations (Vol. 3256, pp. 243–246). Springer Berlin / Heidelberg
- Horn69
- Horning, J. J.(1969) A study of grammatical inference.
- Jack02
- Jackendoff, R. (2002) Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford University Press, USA
- JiKo11
- Jin, D. Z., & Kozhevnikov, A. A.(2011) A compact statistical model of the song syntax in Bengalese finch.
*PLoS Comput Biol*, 7(3), e1001108. DOI.
- John04
- Johnson, K. (2004) Gold’s Theorem and Cognitive Science.
*Philosophy of Science*, 71(4), 571–592. DOI.
- JoHC04
- Jonyer, I., Holder, L. B., & Cook, D. J.(2004) MDL-based context-free graph grammar induction and applications.
*International Journal on Artificial Intelligence Tools*, 13(01), 65–79. DOI.
- KeLu97
- Keller, B., & Lutz, R. (1997) Evolving stochastic context-free grammars from examples using a minimum description length principle. In 1997 Workshop on Automata Induction Grammatical Inference and Language Acquisition. Citeseer
- KeLu05
- Keller, B., & Lutz, R. (2005) Evolutionary induction of stochastic context free grammars.
*Pattern Recognition*, 38(9), 1393–1406. DOI.
- KiBe92
- Kippen, J., & Bel, B. (1992) Modelling music with grammars: formal language representation in the Bol Processor.
*Computer Representations and Models in Music*, 207–232.
- Koza93
- Koza, J. R.(1993) Discovery of rewrite rules in Lindenmayer systems and state transition rules in cellular automata via genetic programming.
- KuHC06
- Kukluk, J. P., Holder, L. B., & Cook, D. J.(2006) Inference of Node Replacement Recursive Graph Grammars. In Proceedings of the 2006 SIAM International Conference on Data Mining.
- LWPX09
- Lin, L., Wu, T., Porway, J., & Xu, Z. (2009) A stochastic graph grammar for compositional object representation and recognition.
*Pattern Recognition*, 42(7), 1297–1307. DOI.
- LóSG04
- López, D., Sempere, J. M., & García, P. (2004) Inference of reversible tree languages.
*IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics*, 34(4), 1658–1665. DOI.
- Mann02
- Manning, C. D.(2002) Probabilistic syntax. In Probabilistic linguistics (pp. 289–341). Cambridge, MA: MIT Press
- MaSc99
- Manning, C. D., & Schütze, H. (1999) Foundations of statistical natural language processing. Cambridge, Mass.: MIT Press
- MZWV07
- Müller, P., Zeng, G., Wonka, P., & Van Gool, L. (2007) Image-based procedural modeling of facades. In ACM SIGGRAPH 2007 papers. New York, NY, USA: ACM DOI.
- NeWi97
- Nevill-Manning, C. G., & Witten, I. H.(1997) Identifying Hierarchical Structure in Sequences: A linear-time algorithm.
- NoKN01
- Nowak, M. A., Komarova, N. L., & Niyogi, P. (2001) Evolution of universal grammar.
*Science*, 291, 114–118. DOI. - Pere00
- Pereira, F. (2000) Formal grammar and information theory: together again?
*Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences*, 358(1769), 1239–1253. DOI.
- PiBl90
- Pinker, S., & Bloom, P. (1990) Natural language and natural selection.
*Behavioral and Brain Sciences*, 13, 707.
- ShCr00
- Shalizi, C. R., & Crutchfield, J. P.(2000) Computational Mechanics: Pattern and Prediction, Structure and Simplicity.
- Smit84
- Smith, A. R.(1984) Plants, fractals, and formal languages. In SIGGRAPH Comput. Graph. (Vol. 18, pp. 1–10). ACM DOI.
- SmWi95
- Smith, T. C., & Witten, I. H.(1995) A Genetic Algorithm for the Induction of Natural Language Grammars. (Vol. 17, p. 17). Presented at the Proc IJCAI-95 Workshop on New Approaches to Learning for Natural Language Processing
- StOm94
- Stolcke, A., & Omohundro, S. (1994) Inducing probabilistic grammars by Bayesian model merging. In R. Carrasco & J. Oncina (Eds.), Grammatical Inference and Applications (Vol. 862, pp. 106–118). Springer Berlin / Heidelberg
- ToDK98
- Todorovski, L., Džeroski, S., & Kompare, B. (1998) Modelling and prediction of phytoplankton growth with equation discovery.
*Ecological Modelling*, 113(1–3), 71–81. DOI.
- THRB10
- Torrini, P., Heckel, R., Rath, I., & Bergmann, G. (2010) Stochastic Graph Transformation with Regions.
*Electronic Communications of the EASST*, 29.
- Vali84
- Valiant, L. G.(1984) A theory of the learnable.
*Commun. ACM*, 27(11), 1134–1142. DOI.
- Vali09
- Valiant, L. G.(2009) Evolvability.
*J. ACM*, 56(1), 3:1–3:21. DOI.
- VTHC05a
- Vidal, E., Thollard, F., de la Higuera, C., Casacuberta, F., & Carrasco, R. C.(2005a) Probabilistic finite-state machines - part I.
*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 27(7), 1013–1025. DOI.
- VTHC05b
- Vidal, E., Thollard, F., de la Higuera, C., Casacuberta, F., & Carrasco, R. C.(2005b) Probabilistic finite-state machines - part II.
*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 27(7), 1026–1039. DOI.
- VKKP15
- Vinyals, O., Kaiser, Ł., Koo, T., Petrov, S., Sutskever, I., & Hinton, G. (2015) Grammar as a Foreign Language. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, R. Garnett, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 28 (pp. 2755–2763). Curran Associates, Inc.
- ZhZh11
- Zhao, Y., & Zhu, S.-C. (2011) Image Parsing via Stochastic Scene Grammar. Presented at the Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS 2011), Granada, Spain