REINFORCEMENT LEARNING WITH A STOCHASTIC ACTION SET

Patent type: 
Regional airport cluster patent navigation
Publication (announcement) number: 
US20210089868A1
Filing date: 
2019-09-23
Filing office: 
US
Abstract: 
Systems and methods are described for a decision-making process that includes actions characterized by stochastic availability. The systems and methods provide a Markov decision process (MDP) model that includes a stochastic action set based on the decision-making process, compute a policy function for the MDP model using a policy gradient based at least in part on a function representing the stochasticity of the stochastic action set, identify a probability distribution for one or more actions available at a time period using the policy function, and select an action for the time period based on the probability distribution.
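The selection step in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes a tabular softmax policy over per-action preferences, an independent Bernoulli model of availability, and a generic REINFORCE-style update restricted to the available actions. All function names and parameters here (`sample_available_actions`, `masked_softmax_policy`, `select_action`, `reinforce_update`, `availability_probs`, `preferences`) are hypothetical and chosen for illustration only.

```python
import math
import random


def sample_available_actions(availability_probs, rng):
    """Draw the actions available in one time period.

    Assumption: each action is available independently with the given
    probability. The source does not specify the availability model.
    """
    return [a for a, p in enumerate(availability_probs) if rng.random() < p]


def masked_softmax_policy(preferences, available):
    """Return a probability distribution over the available actions only."""
    if not available:
        raise ValueError("no actions available in this time period")
    # Subtract the maximum preference for numerical stability.
    max_pref = max(preferences[a] for a in available)
    weights = {a: math.exp(preferences[a] - max_pref) for a in available}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def select_action(preferences, available, rng):
    """Sample one action from the distribution over available actions."""
    dist = masked_softmax_policy(preferences, available)
    actions = list(dist)
    probs = [dist[a] for a in actions]
    return rng.choices(actions, weights=probs, k=1)[0]


def reinforce_update(preferences, available, action, ret, step_size):
    """Generic REINFORCE-style step on the masked softmax (an assumption,
    not the gradient estimator described in the patent).

    For a softmax restricted to the available set, the gradient of
    log pi(action) with respect to preference b is 1[b == action] - pi(b)
    for available b, and zero for unavailable b.
    """
    dist = masked_softmax_policy(preferences, available)
    updated = list(preferences)
    for b in available:
        grad_log = (1.0 if b == action else 0.0) - dist[b]
        updated[b] += step_size * ret * grad_log
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    prefs = [0.0, 0.0, 0.0]
    available = sample_available_actions([0.9, 0.5, 0.7], rng)
    if available:
        chosen = select_action(prefs, available, rng)
        prefs = reinforce_update(prefs, available, chosen, ret=1.0, step_size=0.1)
        print(available, chosen, prefs)
```

Restricting the softmax to the available set means unavailable actions receive zero probability and their preferences are left unchanged by the update, which is one simple way to let the policy depend on which actions happen to be available.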
Original patentee: 
ADOBE INC.
Assignee: 
ADOBE INC.
Current patentee: 
Adobe Inc