作者:Rick Smetsers email@example.com, Frits Vaandrager firstname.lastname@example.org
In Angluin’s L algorithm a learner constructs a sequence of hypotheses in order to learn a regular language. Each hypothesis is consistent with a larger set of observations and is described by a bigger model. From a behavioral perspective, however, a hypothesis is not always better than the previous one, in the sense that the minimal length of a counterex ample that distinguishes a hypothesis from the target language may decrease. We present a simple modification of the L algorithm that ensures that for subsequent hypotheses the minimal length of a counterexample never decreases, which implies that the distance to the target language never increases in a corresponding ultrametric. Preliminary experi mental evidence suggests that our algorithm speeds up learning in practical applications by reducing the number of equivalence queries.
作者:Edward Stabler email@example.com
Recent computational, mathematical work on learnability extends to classes of languages that plausibly include the human languages, but there is nevertheless a gulf between this work and linguistic theory. The languages of the two fields seem almost completely disjoint and incommensurable. This paper shows that this has happened, at least in part, because the recent advances in learnability have been misdescribed in two important respects. First, they have been described as resting on ‘empiricist’ conceptions of language, when actually, in fundamental respects that are made precise here, they are equally compatible with the ‘rationalist’, ‘nativist’ traditions in linguistic theory. Second, the recent mathematical proposals have sometimes been presented as if they not only advance but complete the account of human language acquisition, taking the rather dramatic difference between what current mathematical models can achieve and what current linguistic theories tell us as an indication that current linguistic theories are quite generally mistaken. This paper compares the two perspectives and takes some first steps toward a unified theory, aiming to identify some common ground where ‘rationalist’ linguistic hypotheses could directly address weaknesses in the current mathematical proposals.
作者:Franc¸ois Coste firstname.lastname@example.org, Jacques Nicolas email@example.com
Based on Harris’s substitutability criterion, the recent definitions of classes of substitutable languages have led to interesting polynomial learnability results for expressive formal lan guages. These classes are also promising for practical applications: in natural language analysis, because definitions have strong linguisitic support, but also in biology for mod eling protein families, as suggested in our previous study introducing the class of local substitutable languages. But turning recent theoretical advances into practice badly needs truly practical algorithms. We present here an efficient learning algorithm, motivated by intelligibility and parsing efficiency of the result, which directly reduces the positive sam ple into a small nonredundant canonical grammar of the target substitutable language. Thanks to this new algorithm, we have been able to extend our experimentation to a com plete protein dataset confirming that it is possible to learn grammars on proteins with high specificity and good sensitivity by a generalization based on local substitutability.
作者:Mattias Gybels firstname.lastname@example.org;Aix Marseille Universite´;CNRS;等
Spectral methods propose new and elegant solutions in probabilistic grammatical inference. We propose two ways to improve them. We show how a linear representation, or equivalently a weighted automata, output by the spectral learning algorithm can be taken as an initial point for the Baum Welch algorithm, in order to increase the likelihood of the observation data. Secondly, we show how the inference problem can naturally be expressed in the framework of Structured LowRank Approximation. Both ideas are tested on a benchmark extracted from the PAutomaC challenge.
5 Supplementary Material [会议论文]
作者:We first list two of our key assumptions
作者:Shinichi Nakajima Masashi Sugiyama
Variational Bayesian (VB) learning is known to be a promising approximation to Bayesian learning with computational eciency. How ever, in some applications, e.g., largescale collaborative filtering and tensor factoriza tion, VB is still computationally too costly. In such cases, looser approximations such as MAP estimation and partially Bayesian (PB) learning, where a part of the parameters are pointestimated, seem attractive. In this pa per, we theoretically investigate the behavior of the MAP and the PB solutions of matrix factorization. A notable finding is that the global solutions of MAP and PB in the em pirical Bayesian scenario, where the hyperpa rameters are also estimated from observation, are trivial and useless, while their local solu tions behave similarly to the global solution of VB. This suggests that empirical MAP and empirical PB with local search can be alterna tives to empirical VB equipped with the use ful automatic relevance determination prop