EVOLVE International Conference

EVOLVE 2014, July 1-4
Beijing, China

EVOLVE 2014 Tutorials

 

Evolutionary reinforcement learning or reinforcement evolutionary algorithms?

 

Instructor Madalina M. Drugan

 

A recent trend in evolutionary algorithms (EAs) is to transfer expertise to and from other areas of machine learning. An interesting novel symbiosis combines: i) reinforcement learning, which learns difficult, dynamic, elaborate tasks that require substantial computational resources, and ii) evolutionary algorithms, whose main strengths are their elegance and computational efficiency. These two techniques address the same problem: maximization of the agent's reward in a potentially unknown, difficult environment that can include partial observations and/or abstract credit assignment. The two machine learning methods exchange techniques in order to improve their theoretical and empirical efficiency, such as computational speed for on-line learning and robust behavior for off-line optimization algorithms.

 

Reinforcement learning (RL) is considered the most general on-line/off-line learning technique, incorporating a trade-off between long-term and short-term reward. RL has been successfully applied in disciplines such as game theory, robot control, control theory, operations research, etc. On-line learning involves finding a balance between exploration of uncharted territory and exploitation of current knowledge. For example, a robot learns to act in an unknown and changing environment by receiving feedback from that environment.
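As an illustration, the exploration/exploitation balance can be sketched with an epsilon-greedy agent on a toy multi-armed bandit (the arm rewards and parameter values below are made up for the example):

```python
import random

def epsilon_greedy_bandit(rewards, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit.

    rewards[a] is the (hidden) reward of arm a; pulls are noiseless
    here to keep the sketch short.
    """
    rng = random.Random(seed)
    n_arms = len(rewards)
    counts = [0] * n_arms
    values = [0.0] * n_arms          # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:   # explore: try a random arm
            a = rng.randrange(n_arms)
        else:                        # exploit: current best estimate
            a = max(range(n_arms), key=lambda i: values[i])
        r = rewards[a]
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]
    return values, counts

values, counts = epsilon_greedy_bandit([0.2, 0.5, 0.9])
# The best arm (reward 0.9) ends up receiving most of the pulls.
```

With epsilon = 0 the agent can lock onto the first arm it happens to value; with epsilon = 1 it never exploits what it has learned. The interesting behavior lies in between, which is exactly the trade-off discussed above.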

 

The following are a few examples of using evolutionary algorithms within reinforcement learning, and vice versa.

 

Multi-objective reinforcement learning is a variant of reinforcement learning that uses tuples of rewards instead of a single reward. Multi-objective RL differs from standard RL in important ways, since several actions can simultaneously be considered best according to their reward tuples. Techniques from multi-objective EAs can be used in the multi-objective RL framework to improve the exploration/exploitation trade-off in complex and large multi-objective environments.
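A minimal sketch of why several actions can be "best" at once: with reward tuples, the natural notion of best is Pareto non-dominance. The action names and Q-value tuples below are hypothetical:

```python
def dominates(u, v):
    """True if reward tuple u Pareto-dominates v (>= everywhere, > somewhere)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def pareto_best_actions(q_values):
    """Return the actions whose reward tuples no other action dominates."""
    return [a for a, u in q_values.items()
            if not any(dominates(v, u) for b, v in q_values.items() if b != a)]

# Hypothetical Q-value tuples (speed, safety) for three actions in one state:
q = {"left": (0.9, 0.1), "right": (0.2, 0.8), "wait": (0.1, 0.1)}
best = pareto_best_actions(q)
# "left" and "right" are incomparable, so both are best; "wait" is dominated.
```

A standard RL action-selection rule picks a single argmax; here the agent must instead explore among a whole set of incomparable actions, which is where multi-objective EA machinery becomes useful.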

 

The performance of EAs often depends on the optimal use of genetic operators to explore/exploit promising parts of the search space. Selecting the best genetic operator is similar to the problem an agent faces when choosing between alternatives in pursuit of its goal of maximizing cumulative expected reward. Practical approaches search among the various multi-armed bandit (or reinforcement learning) algorithms for the ones that best solve the operator selection problem.
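A minimal sketch of adaptive operator selection with the UCB1 bandit rule, assuming the caller rewards an operator with the (scaled) fitness improvement of the offspring it produced; the operator names and reward model are hypothetical:

```python
import math
import random

class UCB1OperatorSelector:
    """Pick the next genetic operator with the UCB1 bandit rule."""

    def __init__(self, operators):
        self.operators = list(operators)
        self.counts = {op: 0 for op in self.operators}
        self.means = {op: 0.0 for op in self.operators}
        self.total = 0

    def select(self):
        # Play each operator once before applying the UCB1 formula.
        for op in self.operators:
            if self.counts[op] == 0:
                return op
        return max(self.operators,
                   key=lambda op: self.means[op]
                   + math.sqrt(2 * math.log(self.total) / self.counts[op]))

    def update(self, op, reward):
        self.counts[op] += 1
        self.total += 1
        self.means[op] += (reward - self.means[op]) / self.counts[op]

# Toy run: pretend 'mutation' yields larger fitness improvements on average.
rng = random.Random(1)
sel = UCB1OperatorSelector(["crossover", "mutation"])
for _ in range(500):
    op = sel.select()
    sel.update(op, rng.gauss(0.7 if op == "mutation" else 0.3, 0.1))
# The selector concentrates its choices on the more rewarding operator
# while still occasionally re-testing the other one.
```

The confidence term in `select` shrinks as an operator is tried more often, so under-tested operators keep being revisited: the same exploration/exploitation trade-off, now inside the EA itself.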

 

The scope of this tutorial is to discuss the resemblances and differences between learning with reinforcement learning and with evolutionary algorithms.

 

First, we introduce the use of RL to improve the performance of EAs, and second, the use of EAs to improve the performance of RL. Although both paradigms optimize some quantity of interest, their methodology, terminology and basic assumptions about the environment are quite different. During this tutorial we will compare the exploration/exploitation trade-off, which is important in both EAs and RL but carries a different meaning in each.

 

Madalina M. Drugan (biography)

 

Madalina M. Drugan is a senior researcher at the Artificial Intelligence Lab, Vrije Universiteit Brussel, Belgium. She received a PhD (2006) from the Computer Science Department, University of Utrecht, The Netherlands. Her PhD thesis, "Conditional log-likelihood MDL and Evolutionary MCMC", researches (designing, analyzing, experimenting with) various machine learning and optimization algorithms in fields such as Bayesian network classifiers, feature selection, evolutionary computation and Markov chain Monte Carlo. She has carried out research in evolutionary-computation-related algorithmic design for bioinformatics, multi-objective optimization, meta-heuristics, operational research and evolutionary computation for more than 10 years.

 

Recently, she has been involved in developing a theoretical and algorithmic framework for the new branch of reinforcement learning that uses multi-objective rewards. She has experience with research grants and reviewing services, a strong publication record in international peer-reviewed journals and conferences, and various academic prizes.

 

 

Optimisation Under Uncertainty: a Natural Bridge Between Probabilities and Evolutionary Computation

Instructor Prof. Massimiliano Vasile

 

Nowadays, the use of computer models has become an essential part of any decision process and design methodology. It forms the backbone of Model-Based System Engineering and of the concept of Virtual Prototyping (VP). Models are however incomplete representations of reality and are affected by uncertainties of different nature.
Failing to integrate a measure of these uncertainties into the evaluation of the design budgets (or performance indicators) during the optimisation of systems and components leads to unreliable decisions on product quality and reliability. A common approach to accounting for uncertainties in system design is to add safety margins to already optimised solutions. These margins are generally defined through experience and historical data rather than through the propagation of uncertainty through a system model. However, it can be shown that the use of predefined design margins applied to pre-optimised solutions can lead to an overestimation of the system budgets or of their reliability. In fact, optimised solutions can be very sensitive and poorly resilient to uncertainty, and the a posteriori introduction of margins can result in suboptimal solutions.
A general tendency, from the design and control of manufacturing processes to air traffic management, and from decision making on multi-phase programmes to the control of the ascent trajectory of a rocket, is to introduce uncertainty quantification (UQ) directly into the optimisation process. This combination, however, significantly increases the complexity of the optimisation problem, and dedicated techniques are required.

This tutorial will present different ways to quantify (or model) uncertainty, using both Probability and Imprecise Probability theories, and to propagate uncertainty through system models.
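As a taste of the simplest propagation technique, Monte Carlo sampling, here is a minimal sketch on a made-up two-input mass budget (the model and all numbers are purely illustrative):

```python
import random
import statistics

def propagate(model, samplers, n=10000, seed=0):
    """Monte Carlo propagation of input uncertainty through a system model.

    Each sampler draws one realisation of an uncertain input; model maps
    the inputs to a performance indicator (here a toy mass budget).
    """
    rng = random.Random(seed)
    outputs = [model(*(s(rng) for s in samplers)) for _ in range(n)]
    return statistics.mean(outputs), statistics.stdev(outputs)

# Toy model: total mass = structure + 1.1 * payload, both inputs uncertain.
mass_model = lambda structure, payload: structure + 1.1 * payload
samplers = [lambda r: r.gauss(100.0, 5.0),    # structural mass [kg]
            lambda r: r.uniform(40.0, 60.0)]  # payload mass [kg]
mean, std = propagate(mass_model, samplers)
# mean sits near 100 + 1.1 * 50 = 155 kg; std reflects both input spreads.
```

Imprecise-probability approaches, covered in the tutorial, replace the single distribution per input with a family of distributions, yielding bounds on such statistics rather than point values.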
The tutorial will then introduce some formulations of the Optimisation Under Uncertainty (OUU) problem along with some examples of solution.
Particular attention will be dedicated to worst-case scenario optimisation problems and their solution with Evolutionary Computation.
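A minimal sketch of worst-case scenario (min-max) optimisation by nested random search; a practical solver would use an evolutionary algorithm for one or both loops, and the toy objective below is purely illustrative:

```python
import random

def worst_case_search(objective, design_bounds, unc_bounds,
                      n_designs=200, n_unc=200, seed=0):
    """Random-search sketch of min-max optimisation: min_d max_u f(d, u).

    For each candidate design d, the worst case over the uncertain
    parameter u is approximated by inner sampling; the design with the
    smallest worst case is kept.
    """
    rng = random.Random(seed)
    best_d, best_worst = None, float("inf")
    for _ in range(n_designs):
        d = rng.uniform(*design_bounds)
        worst = max(objective(d, rng.uniform(*unc_bounds))
                    for _ in range(n_unc))
        if worst < best_worst:
            best_d, best_worst = d, worst
    return best_d, best_worst

# Toy problem: f(d, u) = (d - u)^2 with u in [-1, 1].
# The robust design is d = 0, with worst-case value 1.
d_star, f_star = worst_case_search(lambda d, u: (d - u) ** 2,
                                   design_bounds=(-2.0, 2.0),
                                   unc_bounds=(-1.0, 1.0))
```

The nesting is what makes these problems expensive: every outer evaluation hides an inner optimisation, which is why dedicated evolutionary techniques are needed rather than this brute-force sampling.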

 

Massimiliano Vasile (biography)

 

Massimiliano Vasile is currently Professor of Space Systems Engineering in the Department of Mechanical & Aerospace Engineering at the University of Strathclyde. Previous to this, he was a Senior Lecturer in the Department of Aerospace Engineering and Head of Research for the Space Advanced Research Team at the University of Glasgow. Before starting his academic career in 2004, he was the first member of the ESA Advanced Concepts Team and initiator of the ACT research stream on global trajectory optimisation, mission analysis and biomimicry. His research interests include Computational Optimization, Robust Design and Optimization Under Uncertainty exploring the limits of computer science at solving highly complex problems in science and engineering.

He developed Direct Transcription by Finite Elements on Spectral Basis for optimal control, implemented in the ESA software DITAN for low-thrust trajectory design. He has worked on the global optimisation of space trajectories developing innovative single and multi-objective optimisation algorithms, and on the combination of optimisation and imprecise probabilities to mitigate the effect of uncertainty in decision making and autonomous planning. More recently he has undertaken extensive research on the development of effective techniques for asteroid deflection and manipulation. His research has been funded by the European Space Agency, the EPSRC, the Planetary Society and the European Commission. Prof Vasile is currently leading Stardust, an EU-funded international research and training network on active debris removal and asteroid manipulation.

 

Tutorials

The following tutorials will be offered (free of charge) during the conference (see above for detailed descriptions and biographies):

Massimiliano Vasile   Optimisation Under Uncertainty: a Natural Bridge Between Probabilities and Evolutionary Computation
Madalina Drugan   Evolutionary reinforcement learning or reinforcement evolutionary algorithms?
     

 


 

Massimiliano Vasile 

University of Strathclyde, UK


Madalina Drugan 

Vrije Universiteit Brussel, Belgium


 
