
- Module Based Reinforcement Learning for a Zsolt Kalmár, Csaba Szepesvári and András Lőrincz
- REGO: Rank-based Estimation of Renyi Information using Euclidean Graph Optimization
- Empirical Bernstein Stopping Volodymyr Mnih mnih@cs.ualberta.ca
- Margin Maximizing Discriminant Analysis Andras Kocsor
- Log-optimal currency portfolios and control Lyapunov exponents L. Gerencser, M. Rasonyi, Cs. Szepesvari, Zs. Vago
- Pages 1-36. © Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
- Multicriteria Reinforcement Learning Zoltan Gabor, Zsolt Kalmar and Csaba Szepesvari
- Sequential Importance Sampling for Visual Tracking Reconsidered Peter Torma
- APPROXIMATE INVERSE-DYNAMICS BASED ROBUST CONTROL USING STATIC AND DYNAMIC FEEDBACK
- Proceedings of IEEE WCCI ICNN'94 Vol. I. pp. 61-65, IEEE Inc., Orlando, Florida, 1994. URL ftp://iserv.iki.kfki.hu/pub/papers/icnn94/olah.everyday.ps.Z
- Model-based and Model-free Reinforcement Learning for Visual Servoing
- Universal Parameter Optimisation in Games Based on Levente Kocsis, Csaba Szepesvari
- Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation
- Toward Off-Policy Learning Control with Function Approximation Hamid Reza Maei,
- Some basic facts concerning minimax sequential decision processes
- Static and Dynamic Aspects of Optimal Sequential Decision Making
- Combining Local Search, Neural Networks and Particle Filters to Achieve Fast and Reliable
- A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping
- LS-N-IPS: an Improvement of Particle Filters by Means of Local Search
- Reduced-Variance Payoff Estimation in Adversarial Bandit Problems
- Learning near-optimal policies with Bellman-residual minimization based fitted
- Toward a Classification of Finite Partial-Monitoring Games
- Finite-Time Bounds for Fitted Value Iteration. Journal of Machine Learning Research 1 (2008) 815-857. Submitted 6/05; Revised 2/07; Published 5/08
- The Asymptotic Convergence-Rate of Cs. Szepesvári
- An Evaluation Criterion for Macro Learning and Some Results
- Efficient Object Tracking in Video Sequences by means of LS-N-IPS Peter Torma
- Non-Markovian Policies in Sequential Decision Problems
- Learning and Exploitation do not Conflict under Minimax Optimality
- Prediction of Protein Domain-Types by Backpropagation
- Reinforcement Learning: Theory and Practice Csaba Szepesvári
- Csaba Szepesvári: Publications and Citations
- Maximum Margin Discriminant Analysis based Face Recognition
- Manifold-Adaptive Dimension Estimation Amir massoud Farahmand amir@cs.ualberta.ca
- Machine Learning, 39, 287-308, 2000. © 2000 Kluwer Academic Publishers. Printed in The Netherlands.
- Topology learning solved by extended objects: a neural network model
- Machine Learning manuscript No. (will be inserted by the editor)
- Efficient Approximate Planning in Continuous Space Markovian Decision Problems
- Towards Facial Pose Tracking Peter Torma
- Comparing Value-Function Estimation Algorithms in Undiscounted Problems
- Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstractions
- Regularized Policy Iteration Amir massoud Farahmand
- Model-based reinforcement learning with nearly tight exploration complexity bounds
- Module Based Reinforcement Learning for a Zsolt Kalmár, Csaba Szepesvári, and András Lőrincz
- Neural Networks in press URL ftp://iserv.iki.kfki.hu/pub/papers/new/szepes.cc.ps.Z
- An integrated architecture for motion-control and path-planning Csaba Szepesvári and András Lőrincz
- Active Learning in Multi-Armed Bandits Andras Antos
- Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning
- Exploration-exploitation trade-off using variance estimates in multi-armed bandits
- Models of active learning in group-structured state spaces
- Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
- Learning When to Stop Thinking and Do Something! Barnabas Poczos poczos@cs.ualbeta.ca
- Regularized Fitted Q-iteration: Application to Amir massoud Farahmand
- Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
- Fitted Q-iteration in continuous action-space MDPs Andras Antos
- Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods
- Bandit based Monte-Carlo Planning Levente Kocsis and Csaba Szepesvari
- Finite Time Bounds for Sampling Based Fitted Value Iteration Csaba Szepesvari szcsaba@sztaki.hu
- RSPSA: Enhanced Parameter Optimisation in Levente Kocsis, Csaba Szepesvari, Mark H.M. Winands
- Interpolation-based Q-learning Csaba Szepesvari szcsaba@sztaki.hu
- Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results
- Budgeted Distribution Learning of Belief Net Parameters Liuyang Li liuyang@ualberta.ca
- Learning to Segment from a Few Well-Selected Training Images Alireza Farhangfar FARHANG@CS.UALBERTA.CA
- Kernel Machine Based Feature Extraction Algorithms for Regression Problems
- Local Importance Sampling: A Novel Technique to Enhance Particle Filtering
- On using Likelihood-adjusted Proposals in Particle Filtering: Local Importance Peter Torma
- Prediction of Protein Functional Domains from Sequences Using Artificial Neural Networks
- Active Learning in Heteroscedastic Noise Andras Antos
- Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
- Online Optimization in X-Armed Bandits Sebastien Bubeck
- Active Learning of Group-Structured Environments
- Enhancing Particle Filters using Local Likelihood Sampling
- Regularized Fitted Q-iteration for Planning in Continuous-Space Markovian Decision Problems
- A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning