COW :: Seminars ::
METU Ceng On the Web
id : 817
type : MSc_Thesis
dateandtime : 2018-09-13 10:00:00
duration : 90 min.
place : A105
departmental : yes
title : TDGammon Revisited: Integrating Invalid Actions and Dice Factor In Continuous Action and Observation Space
author : ENGIN DENIZ USTA
supervisors : PROF.DR. FERDA N. ALPASLAN
company : Computer Engineering Dept. Middle East Technical Univ.
country : Turkey
abstract : After TDGammon's success in 1991, interest in game-playing agents rose significantly. With advances in Deep Learning and the creation of emulators for older games, human-level control of Atari games has been achieved, and Deep Reinforcement Learning (DRL) has proven itself a success. However, the ancestor of DRL, TDGammon, and its game, Backgammon, have fallen out of sight. There are several reasons: Backgammon's actions are much more complex than those of other games (most Atari games have 2 or 4 different actions), its huge action space contains many invalid actions, and the dice introduce stochasticity. Last but not least, professional-level play in Backgammon was achieved a long time ago. In this thesis, the latest DRL methods will be tested on its ancestor game, Backgammon, while trying to teach the agent to select valid moves and to take the dice factor into account.
notificationSent : final
COW by: Ahmet Sacan