• 已选条件:
 全选  【符合条件的数据共:58437条】

作者:Xiao Qin xqin@cs.wpi.edu, Xika Lin xika@wpi.edu, Matthew O. Ward matt@wpi.edu

预览  |  原文链接  |  全文  [ 浏览:1 下载:1  ]     分享

Association rule mining is known to be computationally intensive, yet realtime decision making applications are increasingly intolerant to delays. The stateoftheart PARAS 1 solution, a parameter space framework for online association mining, enables efficient rule mining by compactly indexing the final ruleset and providing efficient querytime redun dancy resolution. Unfortunately, as many association mining models, PARAS was designed for static data. Modern transaction databases undergo regular data updates that quickly invalidating existing rules or introducing new rules for the PARAS index. While reloading the PARAS index from scratch is impractical, as even upon minor data changes, a complete rule inference and redundancy resolution steps would have to be performed. We now pro pose to tackle this open problem by designing an incremental parameter space construction approach, called iPARAS, that utilizes the previous mining result to minimally adjust the ruleset and associated redundancy relationships. iPARAS features two innovative tech niques. First, iPARAS provides an endtoend solution, composed of three algorithms, to efficiently update the final ruleset in the parameter space. Second, iPARAS designs a com pact data structure to maintain the complex redundancy relationships. Overall, iPARAS achieves several times speedup on parameter space construction for transaction databases comparing to the stateoftheart online association rule mining system PARAS.

作者:Joa˜o Duarte joao.m.duarte@inesctec.pt, LIAAD-INESC TEC, University of Porto

预览  |  原文链接  |  全文  [ 浏览:3 下载:3  ]     分享

The volume and velocity of data is increasing at astonishing rates. In order to extract knowledge from this huge amount of information there is a need for efficient online learn ing algorithms. Rulebased algorithms produce models that are easy to understand and can be used almost offhand. Ensemble methods combine several predicting models to improve the quality of prediction. In this paper, a new online ensemble method that combines a set of rulebased models is proposed to solve regression problems from data streams. Experimental results using synthetic and real timeevolving data streams show the pro posed method significantly improves the performance of the single rulebased learner, and outperforms two stateoftheart regression algorithms for data streams.

3 Data Fusion at Scale [会议论文]

作者:Markus De Shon, Ph.D.

预览  |  原文链接  |  全文  [ 浏览:2 下载:2  ]     分享

作者:Shaheen Fatima, Michael Wooldridge

预览  |  原文链接  |  全文  [ 浏览:1 下载:1  ]     分享

We are concerned with the problem of how a collection of agents can decide to share a resource, represented as a unit sized pie. We investigate a simple and natural noncooperative bargaining pro tocol for this problem, in which players take it in turns to make proposals on how the resource should be allocated, and the other players vote on whether or not to accept the allocation. Voting is modelled as a weighted voting game: each player is assigned a weight, and a proposal for allocation is implemented if the weight of players in favour of that proposal meets or exceeds a given quota. The agenda, (i.e., the order in which the players are called to make offers), is defined exogenously. Thus, the outcome is an offer that has majority support. We consider two variants of this protocol, provide subgame perfect equilibrium strategies for both, and inves tigate their properties. Finally, we give those conditions for which the noncooperative equilibria for the two games is in the core of

作者:Stefan Magureanu MAGUR@KTH.SE;Supelec;Plateau de Moulon;等

预览  |  原文链接  |  全文  [ 浏览:1 下载:4  ]     分享

We consider stochastic multiarmed bandit problems where the expected reward is a Lipschitz function of the arm, and where the set of arms is either discrete or continuous. For discrete Lip schitz bandits, we derive asymptotic problem specific lower bounds for the regret satisfied by any algorithm, and propose OSLB and CKLUCB, two algorithms that efficiently exploit the Lipschitz structure of the problem. In fact, we prove that OSLB is asymptotically optimal, as its asymptotic regret matches the lower bound. The regret analysis of our algorithms relies on a new concentration inequality for weighted sums of KL divergences between the empirical distributions of rewards and their true distributions. For continuous Lipschitz bandits, we propose to first discretize the action space, and then apply OSLB or CKLUCB, algorithms that provably exploit the structure efficiently. This approach is shown, through numerical experiments, to significantly outperform existing algo rithms that directly deal with the continuous set of arms. Finally the results and algorithms are

作者:Chihiro Shibata shibatachh@stf.teu.ac.jp

预览  |  原文链接  |  全文  [ 浏览:0 下载:1  ]     分享

Motivated by the idea of applying nonparametric Bayesian models to dual approaches for distributional learning, we define (k, l)contextsensitive probabilistic contextfree gram mars (PCFGs) using hierarchical PitmanYor processes (PYPs). The data sparseness problem that occurs when inferring contextsensitive probabilities for rules is handled by the smoothing effect of hierarchical PYPs. Many possible definitions or constructions of PYP hierarchies can be used to represent the context sensitivity of derivations of CFGs in Chomsky normal form. In this study, we use a definition that is considered to be the most natural as an extension of infinite PCFGs defined in previous studies. A Markov Chain Monte Carlo method called blocked MetropolisHastings (MH) sampling is known to be effective for inferring PCFGs from unsupervised sentences. Blocked MH sampling is ap plicable to (k, l)contextsensitive PCFGs by modifying their socalled inside probabilities. We show that the computational cost of blocked MH sampling for (k, l)contextsensitive PCFGs is O(|V |l+3|s|3) for each sentence s, where V is a set of nonterminals. This cost is too high to iterate sufficient sampling times, especially when l 6= 0, thus we propose an al ternative sampling method that separates the sampling procedure into pointwise sampling for nonterminals and blocked sampling for rules. The computational cost of this sampling method is O(min{|s|l, |V |l}(|V ||s|2 + |s|3)).