Yasin Abbasi-Yadkori
DeepMind
Gmail: "yasin dot abbasi"
I am a researcher at DeepMind. Prior to that, I was a researcher at VinAI and Adobe. I was a postdoctoral fellow at Queensland University of Technology with Peter Bartlett. I completed my PhD at the University of Alberta under the supervision of Csaba Szepesvari.

Research interests: artificial intelligence, machine learning, sequential decision problems

Y. Abbasi-Yadkori, N. Lazic, C. Szepesvari, Model-Free Linear Quadratic Control via Reduction to Expert Prediction, International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
Y. Abbasi-Yadkori, P. Bartlett, and A. Malek, Linear Programming for Large-Scale Markov Decision Problems, International Conference on Machine Learning (ICML), 2014.
Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Improved Algorithms for Linear Stochastic Bandits, Neural Information Processing Systems (NIPS), 2011.
Y. Abbasi-Yadkori and Cs. Szepesvari, Regret Bounds for the Adaptive Control of Linear Quadratic Systems, Conference on Learning Theory (COLT), 2011.

B. Hao, Y. Abbasi-Yadkori, Z. Wen, G. Cheng, Bootstrapping Upper Confidence Bound, NIPS, 2019.
M. Phan, Y. Abbasi-Yadkori, J. Domke, Thompson Sampling and Approximate Inference, NIPS, 2019.
Y. Abbasi-Yadkori, P. L. Bartlett, K. Bhatia, N. Lazic, C. Szepesvari, G. Weisz, POLITEX: Regret Bounds for Policy Iteration using Expert Prediction, ICML, 2019.
T. Mai, A. Rao, M. Kapilevich, R. Rossi, Y. Abbasi-Yadkori, R. Sinha, On Densification for Minwise Hashing, UAI, 2019.
Y. Abbasi-Yadkori, N. Lazic, C. Szepesvari, Model-Free Linear Quadratic Control via Reduction to Expert Prediction, AISTATS, 2019.
Preliminary Version: Y. Abbasi-Yadkori, N. Lazic, C. Szepesvari, Model-Free Linear Quadratic Control via Reduction to Expert Prediction, arXiv:1804.06021 [cs.LG], 2018.
E. Banijamali, Y. Abbasi-Yadkori, M. Ghavamzadeh, N. Vlassis, Optimizing over a Restricted Policy Class in Markov Decision Processes, AISTATS, 2019.
Preliminary Version: E. Banijamali, Y. Abbasi-Yadkori, M. Ghavamzadeh, N. Vlassis, Optimizing over a Restricted Policy Class in Markov Decision Processes, arXiv:1802.09646 [cs.LG], 2018.
T. Nguyen, A. Shameli, Y. Abbasi-Yadkori, A. Rao, B. Kveton, Sample Efficient Graph-Based Optimization with Noisy Observations, AISTATS, 2019.
Y. Abbasi-Yadkori, P. L. Bartlett, V. Gabillon, A. Malek, M. Valko, Best of both worlds: Stochastic & adversarial best-arm identification, Conference on Learning Theory (COLT), 2018.
X. Cheng, N. S. Chatterji, Y. Abbasi-Yadkori, P. L. Bartlett, M. I. Jordan, Sharp Convergence Rates for Langevin Dynamics in the Nonconvex Setting, arXiv:1805.01648 [stat.ML], 2018.
S. Li, Y. Abbasi-Yadkori, B. Kveton, S. Muthukrishnan, V. Vinay, Z. Wen, Offline Evaluation of Ranking Policies with Click Models, International Conference on Knowledge Discovery and Data Mining (KDD), 2018.
Preliminary Version: S. Li, Y. Abbasi-Yadkori, B. Kveton, S. Muthukrishnan, V. Vinay, Z. Wen, Offline Evaluation of Ranking Policies with Click Models, arXiv:1804.10488 [cs.LG], 2018.
G. Theocharous, Z. Wen, Y. Abbasi-Yadkori, and N. Vlassis, Posterior Sampling for Large Scale Reinforcement Learning, Neural Information Processing Systems (NIPS), 2018.
Preliminary Version: G. Theocharous, Z. Wen, Y. Abbasi-Yadkori, and N. Vlassis, Posterior Sampling for Large Scale Reinforcement Learning, arXiv:1711.07979 [cs.LG], 2017.
A. Shameli and Y. Abbasi-Yadkori, A Continuation Method for Discrete Optimization and its Application to Nearest Neighbor Classification, arXiv:1802.03482 [cs.LG], 2018.
B. Kveton, Cs. Szepesvari, A. Rao, Z. Wen, Y. Abbasi-Yadkori, and S. Muthukrishnan, Stochastic Low-Rank Bandits, arXiv:1712.04644 [cs.LG], 2017.
Y. Abbasi-Yadkori, P. Bartlett, and V. Gabillon, Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem, Neural Information Processing Systems (NIPS), 2017.
A. Kazerouni, M. Ghavamzadeh, Y. Abbasi-Yadkori, and B. Van Roy, Conservative Contextual Linear Bandits, Neural Information Processing Systems (NIPS), 2017.
Y. Abbasi-Yadkori, Fast Mixing Random Walks and Regularity of Incompressible Vector Fields, arXiv:1611.09252 [stat.CO], 2016.
Y. Abbasi-Yadkori, P. L. Bartlett, V. Gabillon, and A. Malek, Hit-and-Run for Sampling and Planning in Non-Convex Spaces, Artificial Intelligence and Statistics (AISTATS), 2017.
Preliminary Version: Y. Abbasi-Yadkori, P. L. Bartlett, V. Gabillon, and A. Malek, Hit-and-Run for Sampling and Planning in Non-Convex Spaces, arXiv:1610.08865 [stat.CO], 2016.
Y. Abbasi-Yadkori, P. L. Bartlett, and S. Wright, A Fast and Reliable Policy Improvement Algorithm, Artificial Intelligence and Statistics (AISTATS), 2016.
W. M. Koolen, A. Malek, P. L. Bartlett, and Y. Abbasi-Yadkori, Minimax Time Series Prediction, Neural Information Processing Systems (NIPS), 2015.
Y. Abbasi-Yadkori, P. Bartlett, X. Chen, and A. Malek, Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing, International Conference on Machine Learning (ICML), 2015.
Y. Abbasi-Yadkori and Cs. Szepesvari, Bayesian Optimal Control of Smoothly Parameterized Systems, Conference on Uncertainty in Artificial Intelligence (UAI), 2015.
Preliminary Version: Y. Abbasi-Yadkori and Cs. Szepesvari, Bayesian Optimal Control of Smoothly Parameterized Systems: The Lazy Posterior Sampling Algorithm, arXiv:1406.3926 [cs.LG], 2014.
Y. Abbasi-Yadkori, P. Bartlett, and A. Malek, Linear Programming for Large-Scale Markov Decision Problems, International Conference on Machine Learning (ICML), 2014.
Preliminary Version: Y. Abbasi-Yadkori, P. Bartlett, and A. Malek, Linear Programming for Large-Scale Markov Decision Problems, arXiv:1402.6763 [math.OC], 2014.
Y. Abbasi-Yadkori, P. Bartlett, and V. Kanade, Tracking Adversarial Targets, International Conference on Machine Learning (ICML), 2014.
Y. Seldin, P. Bartlett, K. Crammer, and Y. Abbasi-Yadkori, Prediction with Limited Advice and Multiarmed Bandits with Paid Observations, International Conference on Machine Learning (ICML), 2014.
Y. Abbasi-Yadkori and G. Neu, Online learning in MDPs with side information, arXiv:1406.6812 [cs.LG], 2014.
Y. Abbasi-Yadkori, P. Bartlett, V. Kanade, Y. Seldin, and Cs. Szepesvari, Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions, Neural Information Processing Systems (NIPS), 2013.
Preliminary Version: Y. Abbasi-Yadkori, P. Bartlett, and Cs. Szepesvari, Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions, arXiv:1303.3055 [cs.LG], 2013.
Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits, Artificial Intelligence and Statistics (AISTATS), 2012.
Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Improved Algorithms for Linear Stochastic Bandits, Neural Information Processing Systems (NIPS), 2011.
Preliminary Version: Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems, arXiv:1102.2670 [cs.AI], 2011.
Y. Abbasi-Yadkori and Cs. Szepesvari, Regret Bounds for the Adaptive Control of Linear Quadratic Systems, Conference on Learning Theory (COLT), 2011.
K. Hajebi, Y. Abbasi-Yadkori, H. Shahbazi, and H. Zhang, Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph, International Joint Conference on Artificial Intelligence (IJCAI), 2011.
Y. Abbasi-Yadkori, J. Modayil, and Cs. Szepesvari, Extending Rapidly-Exploring Random Trees for Asymptotically Optimal Anytime Motion Planning, International Conference on Intelligent Robots and Systems (IROS), 2010.
P. Hooper, Y. Abbasi-Yadkori, R. Greiner, and B. Hoehn, Improved Mean and Variance Approximations for Belief Net Responses via Network Doubling, Conference on Uncertainty in Artificial Intelligence (UAI), 2009.
B. Poczos, Y. Abbasi-Yadkori, Cs. Szepesvari, R. Greiner, and N. Sturtevant, Learning when to stop thinking and do something!, International Conference on Machine Learning (ICML), 2009.
M. Ravanbakhsh, Y. Abbasi-Yadkori, M. Abbaspour, H. Sarbazi-Azad, A heuristic routing mechanism using a new addressing scheme, International conference on Bio-inspired models of network, information and computing systems, 2006.

Workshop Papers
Y. Seldin, Cs. Szepesvari, P. Auer, Y. Abbasi-Yadkori, Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments, European Workshop on Reinforcement Learning (EWRL), 2013.
Y. Abbasi-Yadkori, A. Antos, and Cs. Szepesvari, Forced-Exploration Based Algorithms for Playing in Stochastic Linear Bandits, COLT Workshop on Online Learning with Limited Feedback, 2009.

Theses
Ph.D. thesis: Online Learning for Linearly Parametrized Control Problems. Department of Computing Science, University of Alberta, September 2012.
M.Sc. thesis: Forced-Exploration Based Algorithms for Playing in Bandits with Large Action Sets. Department of Computing Science, University of Alberta, Spring 2009.