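Abstract: | The multi-armed bandit problem concerns how a decision maker should choose when repeatedly facing multiple uncertain options. Practical issues in finance, commodities, employment, financial management, and many other areas can all be examined within the multi-armed bandit framework. Researchers have therefore devoted considerable effort to finding optimal strategies for various bandit problems. However, research in experimental economics shows that decision makers' behavior deviates from the optimal strategy. Accordingly, over the past decade or more, researchers have used human-subject experiments to explore how people make decisions in bandit problems. Although experimental studies have identified behavioral characteristics of humans in different types of bandit problems, and have even selected the models that best describe subjects' behavior, a careful survey of this literature suggests that subjects' behavior depends on the experimental design, and that even within the same experimental design decision makers appear to exhibit different strategy types. This study therefore starts from the perspective of bounded rationality and works at two levels, the decision maker's internal conditions (the concept formation process and cognitive capacity limits) and the external environmental conditions (different information sets and probability designs), to identify the key factors that shape decision makers' strategy types and to clarify the extent of their influence. The study will combine human-subject experiments, psychometrics, and other methods of psychological analysis to identify the key conditions that determine which strategy humans adopt in continuously repeated choice problems under uncertainty. The results can serve as an important basis for modeling decision makers' strategy-switching behavior in structural asset pricing models. The multi-armed bandit problem is an abstract model of a scenario in which agents must repeatedly choose among several alternatives whose payoff probabilities are unknown. Because they lack knowledge about the alternatives, agents have to learn the probabilities through experience. However, because of time or budget constraints, decisions in multi-armed bandit problems always incur opportunity costs. As a result, agents have to devise a strategy that balances exploration and exploitation so as to maximize their total payoffs. Given the importance of multi-armed bandit problems, economists, mathematicians, statisticians, and computer scientists have developed many rules and algorithms to solve them optimally. However, these normative models are often either too complicated or too cognitively demanding for people to apply. In view of this, some experimental studies have been conducted to understand how human agents make decisions in various kinds of multi-armed bandit problems. What we have learned from these studies is that humans do not perform optimally in bandit problems. In addition, laboratory results offer conflicting evidence about how real humans approach these problems. 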
Although some studies have selected the “best” models to describe human behavior, there is evidence that people may adopt different strategies, which may be due to heterogeneity in their individual characteristics. Moreover, the fact that different “best” models have emerged from different experimental studies points to this variability: there probably exists a set of models, each of which best describes human behavior in particular circumstances. In this project, we will investigate these conjectures by studying human behavior in two-armed bandit problems. In the first stage, we will study how human decision makers perceive stochastic events. This will give us a first clue as to how agents deal with stochastic alternatives in bandit problems. In the second stage, we will conduct experiments under different information feedback schemes and probability arrangements to verify whether different environmental conditions lead human decision makers to adopt different strategies. Furthermore, we will examine the conjecture that agents with different cognitive capacities employ different strategies or procedures to tackle the problems. To do this, we will measure the working memory capacity of our subjects using a series of psychometric tests, and then examine whether there is any relationship between cognitive capacity and the types of strategies or procedures the subjects use. From our research results, we expect to identify a set of boundary conditions that allow us to predict the types of strategies employed by human agents. This result will be valuable in empirical applications of multi-armed bandit problems. More importantly, it can shed light on the switching behavior in structural asset pricing models. |