During the summer, I worked with Amy on two closely related projects intended to improve our trading agent, RoxyBot. The first was a theoretical and experimental examination of bidding strategies for agents in on-line auctions, and the second was a suite of scripts for collecting and parsing data for those strategies to utilize.
If you are familiar with eBay, you will probably also be familiar with its proxy bidding agent, which increments your initial bid for a particular item when another party outbids you, until it reaches the ceiling value you provided. In one sense, that defines the general bidding philosophy of agents in on-line auctions -- an agent identifies which items are desired, prefers purchasing them for as low a price as possilbe, but typically identifies a higher amount that it would be willing to spend if necessary. There is, however, a key distinction between the eBay scenario and TAC: for TAC, it is advisable for agents to consider the worth of each good in combination with other goods. As a result, the price an agent is willing to pay for an item is its marginal value within a set of goods.
We tested various bidding strategies, each of which computed marginal values in a slightly different way. In the 2000 Trading Agent Competition, RoxyBot used point estimates to predict the closing price of each auction. Given this type of estimation, we compared the performance of the RoxyBot 2000 bidding strategy to another strategy referred to as the "MU calculator". In the Monte Carlo auction simulations I designed and ran, RoxyBot performed optimally under the assumption of price certainty; the MU calculator was also very effective, but not optimal.
Obviously, certainty about prices is one thing auction settings universally lack, and in 2001 the bidding strategy of RoxyBot was changed to use a distribution rather than a point estimate to predict auction closing prices. We developed new strategies and tested their performance through simulation as well. The MU calculator and RoxyBot 2000 strategies arrived at bidding policies by using the mean price of the distribution as point estimates. A strategy called "average MU" sampled from the distribution, computed the marginal utility of each good in each sample, using the sampled prices as point estimates, and averaged the marginal utilities of each good across samples.
Currently our approach is a myopic one, meaning that we attempt to determine the optimal policy at each time step without projecting the impact of our actions onto the future economy. Incorporating a forward-looking aspect into our approach would undoubtedly improve its success, and is a worthwhile direction for continued research.
The MU calculator is a strategy that assumes price certainty. The marginal utility of a good in this approach is the utility attainable when that good is owned, minus the utility attainable when the good cannot be purchased. RoxyBot 2000 differs from the MU calculator in that before answering the question, "What am I willing to pay for the goods I can buy?", it answers the question, "Which goods do I want to buy?" We refer to this problem as "completion". RoxyBot 2000 then calculates marginal utilities in the same manner as the MU calculator, but only with respect to this optimal set of goods. Furthermore, RoxyBot places bids only on those goods in the optimal set. If there exist multiple optimal solutions to the completion problem, one solution is chosen randomly. We contend that RoxyBot is an optimal strategy for generating bidding policies for any given set of prices and valuations. We validated this intuition as well as several other observations with short proofs.
Designing an optimal strategy assuming price certainty is one thing, but really only useful insofar as it can be extended to deal with uncertainty. The first strategy we developed to clear this hurdle I'll call "average MU". This strategy uses a distribution of prices. It samples from the distribution and calculates marginal utilities for each sample in the manner of the MU calculator. The resulting bidding policy is the average marginal utility of each good across samples.
A strategy that uses marginal utilities in a more intuitive way is "MU Search". MU Search generates a bidding policy by sampling from the price distribution and calculating marginal utilities as well. However, instead of averaging the marginal utilities, each policy is evaluated on a set of sample prices. The scores earned by each policy are averaged, and the policy with the highest average score is the policy retained by MU Search. Since that policy was the most robust during testing, we anticipate that it will also be the most robust in the game. A very similar strategy is RoxyBot Search, which differs only in that marginal utilities are calculated in the manner of RoxyBot 2000 rather than the MU calculator.
We tested the performance of all of these strategies using Monte Carlo simulations as well as practice TAC games. In the simulations, experiments were run using several different price distributions, and variations on the bidding strategies were also frequently incorporated. Unfortunately, modifications to the price distribution dramatically affected the performance of all the strategies, making it difficult to draw clear conclusions from the experiments with price uncertainty. However, when the price distribution most closely modeled that of the TAC game, RoxyBot Search and MU Search performed significantly better than the average MU strategy.
The final implementation of RoxyBot can play the TAC game using any of the strategies described above. Nicole and I modified RoxyBot's bid placement algorithm so that policies are generated with this simple scheme: generate a policy, using the specified strategy, sample repeatedly from the price distribution, and test the performance of the policy on those samples. This procedure is repeated as many times as possible, and when bids are due, the best policy is returned. The price distribution information used to generate and test policies came from the seeding rounds of TAC 2002. A suite of Perl scripts I wrote downloaded, parsed, and manipulated the seeding round data so that it could be used for this purpose. In the 2002 Trading Agent Competition, RoxyBot proceeded to the semifinals. Two directions in which our team would like to move in the future are seeking sequential optimality and estimating joint price distributions.