<TeXmacs|1.99.7>

<style|<tuple|article|std-latex>>

<\body>
  <\hide-preamble>
    <new-theorem|observation|Observation>

    <new-theorem|theorem|Theorem>

    <new-theorem|fact|Fact>

    <new-theorem|lemma|Lemma>

    <new-theorem|dfn|Definition>

    <new-theorem|claim|Claim>

    <new-theorem|definition|Definition>

    <new-theorem|corollary|Corollary>

    <new-theorem|question|Question>

    <new-theorem|res|Result>

    <assign|ignore|<macro|1|>>

    <assign|proofsketch|<macro|body|<\surround|<no-indent><with|font-shape|italic|Proof
    (sketch):>|<math|\<blacksquare\>><vspace|tex-below-display-skip>>
      <arg|body>
    </surround>>>

    <assign|prevproof|<macro|1|2|body|<\surround|<no-indent><with|font-shape|italic|Proof
    of <arg|1><nbsp><reference|2>:>|<math|\<blacksquare\>><vspace|tex-below-display-skip>>
      <arg|body>
    </surround>>>

    <new-theorem|prop|Proposition>

    <new-theorem|proposition|Proposition>

    <assign|adam|<macro|1|<with|color|red|font-series|bold|[Adam: <arg|1>]>>>

    <assign|nsi|<macro|1|<with|color|blue|font-series|bold|[Nicole:
    <arg|1>]>>>

    <assign|veps|<macro|\<varepsilon\>>>

    <assign|EX|<macro|<math-up|SA>>>

    <assign|E|<macro|<math-up|E>>>

    <assign|P|<macro|<math-up|P>>>

    <assign|bS|<macro|\<bbb-S\>>>

    <assign|cA|<macro|\<cal-A\>>>

    <assign|cB|<macro|\<cal-B\>>>

    <assign|cC|<macro|\<cal-C\>>>

    <assign|cD|<macro|\<cal-D\>>>

    <assign|cF|<macro|\<cal-F\>>>

    <assign|cG|<macro|\<cal-G\>>>

    <assign|cN|<macro|\<cal-N\>>>

    <assign|cO|<macro|\<cal-O\>>>

    <assign|cP|<macro|\<cal-P\>>>

    <assign|cS|<macro|\<cal-S\>>>

    <assign|cT|<macro|\<cal-T\>>>

    <assign|cX|<macro|\<cal-X\>>>

    <assign|KL|<macro|<math-up|KL>>>

    <assign|cY|<macro|\<cal-Y\>>>

    <assign|cZ|<macro|\<cal-Z\>>>

    <assign|bbZ|<macro|\<bbb-Z\>>>

    <assign|bbN|<macro|\<bbb-N\>>>

    <assign|0|<macro|{0,1}>>

    <assign|r|<macro|[0,1]>>

    <assign|reals|<macro|<with|font-series|bold|R>>>

    <assign|rn|<macro|<reals><rsup|k>>>

    <assign|eps|<macro|\<epsilon\>>>

    <assign|opt|<macro|<math-up|opt>>>

    <assign|cov|<macro|<math-up|cov>>>

    <assign|cor|<macro|<math-up|cor>>>

    <assign|ecov|<macro|<wide|<math-up|cov>|^>>>

    <assign|poly|<macro|<math-up|poly>>>

    <assign|POS|<macro|POS>>

    <assign|POA|<macro|POA>>

    <assign|EE|<macro|<math-up|E>>>

    <assign|FFF|<macro|\<cal-F\>>>

    <assign|F|<macro|\<bbb-F\>>>

    <assign|R|<macro|\<bbb-R\>>>

    <assign|Z|<macro|\<bbb-Z\>>>

    <assign|N|<macro|\<bbb-N\>>>

    <assign|Q|<macro|\<bbb-Q\>>>

    <assign|HH|<macro|<math-bf|H>>>

    <assign|LL|<macro|\<ell\>>>

    <assign|var|<macro|<math-up|var>>>

    <assign|mean|<macro|\<mu\>>>

    <assign|shrink|<macro|<math-up|shrink>>>

    <assign|evar|<macro|<wide|<math-up|cov>|^>>>

    <assign|sgn|<macro|<math-up|sgn>>>

    <new-theorem|Alg|Algorithm>

    <assign||<macro|>>

    <assign|I|<macro|<math-up|I>>>

    <assign|myalg|<\macro|1|2|3>
      <vspace|1fn>

      <\frame>
        <\mini-paragraph|5.5in>
          <Alg|<label|1><with|font-shape|small-caps|<arg|2>><next-line><with|font-family|tt|<arg|3>>>
        </mini-paragraph>
      </frame>

      \ <vspace|1fn>
    </macro>>
  </hide-preamble>

  <doc-data|<doc-title|Dueling algorithms>|<doc-author|<author-data|<author-misc|Department
  of Electrical Engineering and Computer Science, Northwestern
  University>|<author-misc|Part of this work was performed while the author
  was at Microsoft Research>|<author-name|Nicole
  Immorlica>>>|<doc-author|<author-data|<author-misc|Microsoft Research New
  England>|<author-name|Adam Tauman Kalai>>>|<doc-author|<author-data|<author-misc|Department
  of Computer Science, University of Toronto>|<author-name|Brendan Lucier
  <rsup|<math|\<dag\>>>>>>|<doc-author|<author-data|<author-misc|Department
  of Electrical Engineering and Computer Science, Massachusetts Institute of
  Technology. Supported in part by a Fannie and John Hertz Foundation
  Fellowship.>|<author-name|Ankur Moitra <math|<space|0.17em><rsup|\<dag\>>>>>>|<doc-author|<author-data|<author-misc|Department
  of Economics, University of Pennsylvania>|<author-name|Andrew Postlewaite
  <rsup|<math|\<dag\>>>>>>|<doc-author|<author-data|<author-misc|Microsoft
  R&D Israel and the Technion, Israel>|<author-name|Moshe
  Tennenholtz>>>|<doc-date|<date|>>>

  <reset-counter|page>

  <\abstract>
    We revisit classic algorithmic search and optimization problems from the
    perspective of competition. Rather than a single optimizer minimizing
    expected cost, we consider a zero-sum game in which an optimization
    problem is presented to two players, whose only goal is to
    <with|font-shape|italic|outperform the opponent>. Such games are
    typically exponentially large zero-sum games, but they often have a rich
    combinatorial structure. We provide general techniques by which such
    structure can be leveraged to find minmax-optimal and approximate
    minmax-optimal strategies. We give examples of ranking, hiring,
    compression, and binary search duels, among others. We give bounds on how
    often one can beat the classic optimization algorithms in such duels.
  </abstract>

  <new-page>

  <section|Introduction>

  Many natural optimization problems have two-player competitive analogs. For
  example, consider the ranking problem of selecting an order on <math|n>
  items, where the cost of searching for a single item is its rank in the
  list. Given a fixed probability distribution over desired items, the
  trivial greedy algorithm, which orders items by decreasing probability, is
  optimal.

  Next consider the following natural two-player version of the problem,
  which models a user choosing between two search engines. The user thinks of
  a desired web page and a query and executes the query on both search
  engines. The engine that ranks the desired page higher is chosen by the
  user as the \Pwinner.\Q If the greedy algorithm has the ranking of pages
  <math|\<omega\><rsub|1>,\<omega\><rsub|2>,\<ldots\>,\<omega\><rsub|n>>,
  then the ranking <math|\<omega\><rsub|2>,\<omega\><rsub|3>,\<ldots\>,\<omega\><rsub|n>,\<omega\><rsub|1>>
  beats the greedy ranking on every item except <math|\<omega\><rsub|1>>. We
  say the greedy algorithm is <math|1-1/n> <with|font-shape|italic|beatable>
  because there is a probability distribution over pages for which the greedy
  algorithm loses <math|1-1/n> of the time. Thus, in a competitive setting,
  an \Poptimal\Q search engine can perform poorly against a clever opponent.

  This <with|font-shape|italic|ranking duel> can be modeled as a symmetric
  constant-sum game, with <math|n!> strategies, in which the player with the
  higher ranking of the target page receives a payoff of 1 and the other
  receives a payoff of 0 (in the case of a tie, say they both receive a
  payoff of 1/2). As in all symmetric one-sum games, there must be (mixed)
  strategies that guarantee expected payoff of at least 1/2 against any
  opponent. Put another way, there must be a (randomized) algorithm that
  takes as input the probability distribution and outputs a ranking, which is
  guaranteed to achieve expected payoff of at least <math|1/2> against any
  opposing algorithm.

  This conversion can be applied to any optimization problem with an element
  of uncertainty. Such problems are of the form
  <math|min<rsub|x\<in\>X><E><rsub|\<omega\>\<sim\>p><around|[|c<around|(|x,\<omega\>|)>|]>>,
  where <math|p> is a probability distribution over the
  <with|font-shape|italic|state of nature> <math|\<omega\>\<in\>\<Omega\>>,
  <math|X> is a feasible set, and <math|c:X\<times\>\<Omega\>\<rightarrow\><reals>>
  is an objective function. The dueling analog has two players simultaneously
  choose <math|x,x<rprime|'>>; player 1 receives payoff 1 if
  <math|c<around|(|x,\<omega\>|)>\<less\>c<around|(|x<rprime|'>,\<omega\>|)>>,
  payoff <math|0> if <math|c<around|(|x,\<omega\>|)>\<gtr\>c<around|(|x<rprime|'>,\<omega\>|)>>,
  payoff <math|1/2> otherwise, and similarly for player 2.<footnote|Our
  techniques will also apply to asymmetric payoff functions; see Appendix
  <reference|app:asymmetric>.>

  There are many natural examples of this setting beyond the ranking duel
  mentioned above. For example, for shortest-path routing under a
  distribution over edge delays, the corresponding
  <with|font-shape|italic|racing duel> is simply a race, and the state of
  nature encodes uncertain edge delays.<footnote|We also refer to this as the
  <with|font-shape|italic|primal duel> because any other duel can be
  represented as a race with an appropriate graph and probability
  distribution <math|p>, though there may be an exponential blowup in
  representation size.> For the classic secretary problem, in the
  corresponding <with|font-shape|italic|hiring duel> two employers must each
  select a candidate from a pool of <math|n> candidates (though, as standard,
  they must decide whether or not to choose a candidate before interviewing
  the next one), and the winner is the one that hires the better candidate.
  This could model, for example, two competing companies attempting to hire
  CEOs or two opposing political parties selecting politicians to run in an
  election; the absolute quality of the candidate may be less important than
  being better than the other's selection. In a
  <with|font-shape|italic|compression duel>, a user with a (randomly chosen)
  sample string <math|\<omega\>> chooses between two compression schemes
  based on which one compresses that string better. This setting can also
  model a user searching for a file in two competing, hierarchical storage
  systems and choosing the system that finds the file first. In a
  <with|font-shape|italic|binary search duel>, a user searches for a random
  element in a list using two different search trees, and chooses whichever
  tree finds the element faster.

  <paragraph|Our contribution.> For each of these problems, we consider a
  number of questions related to how vulnerable a classic algorithm is to
  competition, what algorithms will be selected at equilibrium, and how well
  these strategies at equilibrium solve the original optimization problem.

  <\question>
    Will players use the classic optimization solution in the dueling
    setting?
  </question>

  Intuitively, the answer to this question should depend on how much an
  opponent can <with|font-shape|italic|game> the classic optimization
  solution. For example, in the <with|font-shape|italic|ranking duel> an
  opponent can beat the greedy algorithm on almost all pages \U and even the
  most oblivious player would quickly realize the need to change strategies.
  In contrast, we demonstrate that many classic optimization solutions \U
  such as the secretary algorithm for hiring, Huffman coding for compression,
  and standard binary search \U are substantially less vulnerable. We say an
  algorithm is <math|\<beta\>>-beatable (over distribution <math|p>) if there
  exists a response which achieves payoff <math|\<beta\>> against that
  algorithm (over distribution <math|p>). We summarize our results on the
  beatability of the standard optimization algorithm in each of our example
  optimization problems in the table below:

  <\center>
    <tabular*|<tformat|<cwith|1|-1|1|1|cell-lborder|1ln>|<cwith|1|-1|1|1|cell-halign|l>|<cwith|1|-1|1|1|cell-rborder|1ln>|<cwith|1|-1|2|2|cell-halign|c>|<cwith|1|-1|2|2|cell-rborder|1ln>|<cwith|1|-1|3|3|cell-halign|c>|<cwith|1|-1|3|3|cell-rborder|1ln>|<cwith|1|-1|1|-1|cell-valign|c>|<cwith|1|1|1|-1|cell-tborder|1ln>|<cwith|1|1|1|-1|cell-bborder|1ln>|<cwith|6|6|1|-1|cell-bborder|1ln>|<table|<row|<cell|Optimization
    Problem>|<cell|Upper Bound>|<cell|Lower
    Bound>>|<row|<cell|Ranking>|<cell|<math|1-1/n>>|<cell|<math|1-1/n>>>|<row|<cell|Racing>|<cell|<math|1>>|<cell|<math|1>>>|<row|<cell|Hiring>|<cell|<math|0.82>>|<cell|<math|0.51>>>|<row|<cell|Compression>|<cell|<math|3/4>>|<cell|<math|2/3>>>|<row|<cell|Search>|<cell|<math|5/8>>|<cell|<math|5/8>>>>>>
  </center>

  <\question>
    What strategies do players play at equilibrium?
  </question>

  We say an algorithm efficiently <with|font-shape|italic|solves> the duel if
  it takes as input a representation of the game and probability distribution
  <math|p>, and outputs an action <math|x\<in\>X> distributed according to
  some minmax optimal (i.e., Nash equilibrium) strategy. As our main result,
  we give a general method for solving duels that can be represented in a
  certain bilinear form. We also show how to convert an approximate
  best-response oracle for a dueling game into an approximate minmax optimal
  algorithm, using techniques from low-regret learning. We demonstrate the
  generality of these methods by showing how to apply them to the numerous
  examples described above. For many problems we consider, the problem of
  computing minmax optimal strategies reduces to finding a simple description
  of the space of feasible mixed strategies (i.e., expressing this set as the
  projection of a polytope with polynomially many variables and constraints).
  See <cite|Yann> for a thorough treatment of such problems.

  <\question>
    Are these equilibrium strategies still good at solving the optimization
    problem?
  </question>

  As an example, consider the ranking duel. How much more time does a web
  surfer need to spend browsing to find the page he is interested in, because
  more than one search engine is competing for his attention? In fact, the
  surfer may be <em|better> off due to competition, depending on the model of
  comparison. For example, the cost to the web surfer may be the minimum of
  the ranks assigned by each search engine. And we leave open the tantalizing
  possibility that this quantity could in general be smaller at equilibrium
  for two competing search engines than for just one search engine playing
  the greedy algorithm.

  <vspace|-2mm><paragraph|Related work.> The work most relevant to ours is
  the study of ranking games<nbsp><cite|BFHS09>, and more generally the study
  of social context games<nbsp><cite|AKT08>. In these settings, players'
  payoffs are translated into utilities based on social contexts, defined by
  a graph and an aggregation function. For example, a player's utility can be
  the sum/max/min of his neighbors' payoffs. This work studies the effect of
  social contexts on the existence and computation of game-theoretic solution
  concepts, but does not re-visit optimization algorithms in competitive
  settings.

  For the hiring problem, several competitive variants and their algorithmic
  implications have been considered (see, e.g.,<nbsp><cite|IKM06> and the
  references therein). A typical competitive setting is a (general sum) game
  where a player achieves payoff of 1 if she hires the very best applicant
  and zero otherwise. But, to the best of our knowledge, no one has
  considered the natural model of a duel where the objective is simply to
  hire a better candidate than the opponent. Also related to our algorithmic
  results are succinct zero-sum games, where a game has exponentially many
  strategies but the payoff function can be computed by a succinct circuit.
  This general class has been shown to be EXP-hard to
  solve<nbsp><cite|FKS95>, and also difficult to
  approximate<nbsp><cite|FIKU05>.

  Finally, we note the line of research on competition among mechanisms, such
  as the study of competing auctions (see, e.g.,
  <cite|BS99|Mcafee93|MT04|PS97>) or schedulers <cite|ATZ10>. In such
  settings, each player selects a mechanism and then bidders select the
  auction to participate in and how much to bid there, where both designers
  and bidders are strategic. This work is largely concerned with the
  existence of sub-game perfect equilibrium.

  <vspace|-2mm><paragraph|Outline.> In Section <reference|sec:defn> we define
  our model formally and provide a general framework for solving dueling
  problems as well as the warmup example of the ranking duel. We then use
  these tools to analyze the more intricate settings of the hiring duel
  (Section<nbsp><reference|sec:hiring>), the compression duel
  (Section<nbsp><reference|sec:compression>), and the search duel
  (Section<nbsp><reference|sec:bst>). We describe avenues of future research
  in Section<nbsp><reference|sec:conc>.

  <section|Preliminaries><label|sec:defn>

  A problem of optimization under uncertainty,
  <math|<around|(|X,\<Omega\>,c,p|)>>, is specified by a feasible set
  <math|X>, a commonly-known distribution <math|p> over the state of nature,
  <math|\<omega\>>, chosen from set <math|\<Omega\>>, and an objective
  function <math|c:X\<times\>\<Omega\>\<rightarrow\><reals>>. For simplicity
  we assume all these sets are finite. When <math|p> is clear from context,
  we write the expected cost of <math|x\<in\>X> as
  <math|c<around|(|x|)>=<E><rsub|\<omega\>\<sim\>p><around|[|c<around|(|x,\<omega\>|)>|]>>.
  The one-player optimum is <math|<opt>=min<rsub|x\<in\>X> c<around|(|x|)>>.
  An algorithm <math|A> takes as input <math|p> and randomness
  <math|r\<in\><around|[|0,1|]>>, and outputs <math|x\<in\>X>. We define
  <math|c<around|(|A|)>=<E><rsub|r><around|[|c<around|(|A<around|(|p,r|)>|)>|]>>,
  and say that an algorithm <math|A> is <with|font-shape|italic|one-player
  optimal> if <math|c<around|(|A|)>=<opt>>.
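
  As a concrete illustration of these definitions, the following Python
  sketch computes the expected cost <math|c<around|(|x|)>> and the one-player
  optimum by brute force for a three-item ranking problem. It is only a
  sketch under stated assumptions: the function names
  <verbatim|rank_cost> and <verbatim|expected_cost> and the distribution
  <verbatim|p> are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations

def rank_cost(ranking, item):
    """Ranking cost c(x, w): the 1-based position of item w in ranking x."""
    return ranking.index(item) + 1

def expected_cost(cost, x, p):
    """c(x) = E_{w ~ p}[c(x, w)] for a finite distribution p given as a dict."""
    return sum(prob * cost(x, w) for w, prob in p.items())

# Illustrative distribution over three items (an assumption, not from the paper).
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

# One-player optimum by brute force over the feasible set X of all rankings.
opt = min(expected_cost(rank_cost, x, p) for x in permutations(p))

# The greedy algorithm orders items by decreasing probability.
greedy = tuple(sorted(p, key=p.get, reverse=True))
greedy_cost = expected_cost(rank_cost, greedy, p)
print(greedy, greedy_cost, opt)
```

  On this instance the greedy ranking attains the brute-force optimum, as
  expected for a one-player optimal algorithm.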

  In the two-person constant-sum duel game
  <math|D<around|(|X,\<Omega\>,c,p|)>>, players simultaneously choose
  <math|x,x<rprime|'>\<in\>X>, and player 1's payoff is:

  <\equation*>
    v<around|(|x,x<rprime|'>,p|)>=Pr<rsub|\<omega\>\<sim\>p><around|[|c<around|(|x,\<omega\>|)>\<less\>c<around|(|x<rprime|'>,\<omega\>|)>|]>+<frac|1|2>*Pr<rsub|\<omega\>\<sim\>p><around|[|c<around|(|x,\<omega\>|)>=c<around|(|x<rprime|'>,\<omega\>|)>|]>.
  </equation*>

  When <math|p> is understood from context we write
  <math|v<around|(|x,x<rprime|'>|)>>. Player 2's payoff is
  <math|v<around|(|x<rprime|'>,x|)>=1-v<around|(|x,x<rprime|'>|)>>. This
  models a tie, <math|c<around|(|x,\<omega\>|)>=c<around|(|x<rprime|'>,\<omega\>|)>>,
  as a half point for each. We define the value of a strategy,
  <math|v<around|(|x,p|)>>, to be how much that strategy guarantees,
  <math|v<around|(|x,p|)>=min<rsub|x<rprime|'>\<in\>X>
  v<around|(|x,x<rprime|'>,p|)>>. Again, when <math|p> is understood from
  context we write simply <math|v<around|(|x|)>>.
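
  The payoff <math|v<around|(|x,x<rprime|'>,p|)>> and the value
  <math|v<around|(|x,p|)>> can be evaluated directly on small instances. The
  Python sketch below does so for a four-item ranking duel with uniform
  <math|p>; the helper names <verbatim|duel_payoff> and
  <verbatim|strategy_value> are assumptions made for illustration, and the
  value is found by enumerating all pure responses, which is feasible only
  for tiny <math|X>.

```python
import math
from fractions import Fraction
from itertools import permutations

def sign(z):
    """Return -1, 0, or 1 according to the sign of the number z."""
    return int(math.copysign(1, z)) if z else 0

def duel_payoff(cost, x, y, p):
    """v(x, y, p): player 1 scores 1 on a strictly smaller cost, 1/2 on a tie."""
    return sum(p[w] * Fraction(1 + sign(cost(y, w) - cost(x, w)), 2) for w in p)

def strategy_value(cost, x, feasible, p):
    """v(x, p): the payoff that x guarantees against every pure response."""
    return min(duel_payoff(cost, x, y, p) for y in feasible)

def rank_cost(ranking, item):
    """Ranking-duel cost: the 1-based position of the item in the ranking."""
    return ranking.index(item) + 1

n = 4
p = {w: Fraction(1, n) for w in range(1, n + 1)}   # uniform distribution
greedy = (1, 2, 3, 4)
rotated = (2, 3, 4, 1)

v_greedy_vs_rotated = duel_payoff(rank_cost, greedy, rotated, p)
v_greedy = strategy_value(rank_cost, greedy, list(permutations(greedy)), p)
print(v_greedy_vs_rotated, v_greedy)
```

  Here the rotated ranking holds the greedy ranking to payoff <math|1/n>, and
  enumeration confirms that no pure response does better against it.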

  The set of probability distributions over set <math|S> is denoted
  <math|\<Delta\><around|(|S|)>>. A <with|font-shape|italic|mixed strategy>
  is <math|\<sigma\>\<in\>\<Delta\><around|(|X|)>>. As is standard, we extend
  the domain of <math|v> to mixed strategies bilinearly by expectation. A
  <with|font-shape|italic|best response> to mixed strategy <math|\<sigma\>>
  is a strategy which yields maximal payoff against <math|\<sigma\>>, i.e.,
  <math|\<sigma\><rprime|'>> is a best response to <math|\<sigma\>> if it
  maximizes <math|v<around|(|\<sigma\><rprime|'>,\<sigma\>|)>>. A
  <with|font-shape|italic|minmax> strategy is a (possibly mixed) strategy
  that guarantees the safety value, in this case 1/2, against any opponent
  play. A best response to such a strategy yields a payoff of 1/2. The set
  of minmax strategies is denoted <math|M*M<around|(|D<around|(|X,\<Omega\>,c,p|)>|)>=<around|{|\<sigma\>\<in\>\<Delta\><around|(|X|)><nbsp>\|<nbsp>v<around|(|\<sigma\>|)>=1/2|}>>.
  A basic fact about constant-sum games is that the set of Nash equilibria is
  the cross product of the minmax strategies for player 1 and those of player
  2.

  <subsection|Bilinear duels><label|sec:bilinear>

  In a bilinear duel, the feasible sets of strategies are sets of points in
  Euclidean space, i.e.,
  <math|X\<subseteq\><reals><rsup|n>>, <math|X<rprime|'>\<subseteq\><reals><rsup|n<rprime|'>>>,
  and the payoff to player 1 is <math|v<around|(|x,x<rprime|'>|)>=x<rsup|t>*M*x<rprime|'>>
  for some matrix <math|M\<in\><reals><rsup|n\<times\>n<rprime|'>>>. In
  <math|n\<times\>n> bimatrix games, <math|X> and <math|X<rprime|'>> are just
  simplices <math|<around|{|x\<in\><reals><rsub|\<geq\>0><rsup|n><nbsp>\|<nbsp><big|sum>x<rsub|i>=1|}>>.
  Let <math|K> be the convex hull of <math|X>. Any point in <math|K> is
  achievable (in expectation) as a mixed strategy. Similarly define
  <math|K<rprime|'>>. As we show in this section, solving such duels
  reduces to linear programming with a number of constraints proportional to
  the number of constraints necessary to define the feasible sets, <math|K>
  and <math|K<rprime|'>>. (In typical applications, <math|K> and
  <math|K<rprime|'>> have a polynomial number of facets but an exponential
  number of vertices.)

  Let <math|K> be a polytope defined by the intersection of <math|m>
  halfspaces, <math|K=<around|{|x\<in\><reals><rsup|n><nbsp>\|<nbsp>w<rsub|i>\<cdot\>x\<geq\>b<rsub|i>*<text|for
  >i=1,2,\<ldots\>,m|}>>. Similarly, let <math|K<rprime|'>> be the
  intersection of <math|m<rprime|'>> halfspaces
  <math|w<rsub|i><rprime|'>\<cdot\>x<rprime|'>\<geq\>b<rsub|i><rprime|'>>. The typical
  way to reduce to an LP for constant-sum games is:

  <\equation*>
    max<rsub|v\<in\><reals>,x\<in\><reals><rsup|n>> v*<text|such that
    >x\<in\>K*<text|and >x<rsup|t>*M*x<rprime|'>\<geq\>v*<text|for all
    >x<rprime|'>\<in\>X<rprime|'>.
  </equation*>

  The above program has <math|m+<around|\||X<rprime|'>|\|>> constraints
  (<math|m> of them guaranteeing that <math|x\<in\>K>), and
  <math|<around|\||X<rprime|'>|\|>> is typically exponential. Instead, the
  following linear program has
  <math|O*<around|(|n<rprime|'>+m+m<rprime|'>|)>> constraints, and hence can
  be solved in time polynomial in <math|n<rprime|'>,m,m<rprime|'>> and the
  bit-size representation of <math|M> and the constraints in <math|K> and
  <math|K<rprime|'>>.

  <\equation>
    <label|eq:LP>max<rsub|x\<in\><reals><rsup|n>,\<lambda\>\<in\><reals><rsub|\<geq\>0><rsup|m<rprime|'>>>
    <big|sum><rsub|1><rsup|m<rprime|'>>\<lambda\><rsub|i>*b<rsub|i><rprime|'>*<text|such
    that >x\<in\>K*<text|and >x<rsup|t>*M=<big|sum><rsub|1><rsup|m<rprime|'>>\<lambda\><rsub|i>*w<rsub|i><rprime|'>.
  </equation>

  <\lemma>
    <label|lem:lp>For any constant-sum game with strategies
    <math|x\<in\>K,x<rprime|'>\<in\>K<rprime|'>> and payoffs
    <math|x<rsup|t>*M*x<rprime|'>>, the maximum of the above linear program
    is the value of the game to player 1, and any maximizing <math|x> is a
    minmax optimal strategy.
  </lemma>

  <\proof>
    First we argue that the value of the above LP is at most the
    value of the game to player 1. Let <math|x,\<lambda\>> maximize the above
    LP and let the maximum be <math|\<alpha\>>. For any
    <math|x<rprime|'>\<in\>K<rprime|'>>,

    <\equation*>
      x<rsup|t>*M*x<rprime|'>=<big|sum><rsub|1><rsup|m<rprime|'>>\<lambda\><rsub|i>*w<rsub|i><rprime|'>\<cdot\>x<rprime|'>\<geq\><big|sum><rsub|1><rsup|m<rprime|'>>\<lambda\><rsub|i>*b<rsub|i><rprime|'>=\<alpha\>.
    </equation*>

    Hence strategy <math|x> guarantees player 1 at
    least <math|\<alpha\>> against any opponent response
    <math|x<rprime|'>\<in\>K<rprime|'>>, so <math|\<alpha\>\<leq\>v>, where
    <math|v> is the value of the constant-sum game to player 1, with equality
    iff <math|x> is minmax optimal. Next, let <math|x> be any minmax optimal
    strategy. This
    means that <math|x<rsup|t>*M*x<rprime|'>\<geq\>v> for all
    <math|x<rprime|'>\<in\>K<rprime|'>>, with equality for some point. In
    particular, the minmax theorem (equivalently, duality) means that the LP
    <math|min<rsub|x<rprime|'>\<in\>K<rprime|'>> x<rsup|t>*M*x<rprime|'>> has
    a minimum value of <math|v> and that there is a vector
    <math|\<lambda\>\<geq\>0> such that <math|<big|sum><rsub|1><rsup|m<rprime|'>>\<lambda\><rsub|i>*w<rsub|i><rprime|'>=x<rsup|t>*M>
    and <math|<big|sum><rsub|1><rsup|m<rprime|'>>\<lambda\><rsub|i>*b<rsub|i><rprime|'>=v>.
    The pair <math|<around|(|x,\<lambda\>|)>> is therefore feasible for the
    above LP with objective value <math|v>. Hence <math|\<alpha\>\<geq\>v>.
  </proof>
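
  To see the certificate argument of Lemma<nbsp><reference|lem:lp> at work,
  the following Python sketch checks one candidate solution of the linear
  program (<reference|eq:LP>) for rock-paper-scissors with payoffs in
  {0, 1/2, 1}. The game, the halfspace description of the simplex, and all
  variable names are assumptions made for illustration; the sketch verifies a
  hand-picked pair <math|<around|(|x,\<lambda\>|)>> with nonnegative
  multipliers rather than solving the LP.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Rock-paper-scissors; rows are player 1's pure strategies.
M = [[F(1, 2), F(0), F(1)],
     [F(1), F(1, 2), F(0)],
     [F(0), F(1), F(1, 2)]]

# K' is the simplex, written as halfspaces (w_i, b_i) meaning "w_i . x' is at
# least b_i": three nonnegativity constraints, then sum at least 1 and
# negated sum at least -1.
W = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [-1, -1, -1]]
b = [0, 0, 0, 1, -1]

x = [F(1, 3)] * 3                          # candidate strategy for player 1
lam = [F(0), F(0), F(0), F(1, 2), F(0)]    # candidate multipliers

xM = [dot(x, col) for col in zip(*M)]      # the row vector x^t M
combo = [dot(lam, col) for col in zip(*W)] # sum over i of lam_i * w_i

# Feasibility: the equality constraint holds and every multiplier is nonnegative.
feasible = xM == combo and all(l == abs(l) for l in lam)
objective = dot(lam, b)                    # sum over i of lam_i * b_i

# Payoff x guarantees: the minimum of x^t M x' over the vertices of K'.
guarantee = min(xM)
print(feasible, objective, guarantee)
```

  The objective value and the guaranteed payoff coincide at 1/2, the value of
  this symmetric game, so the uniform strategy is minmax optimal here.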

  <subsection|Reduction to bilinear duels><label|sec:blexact>

  The sets <math|X> in a duel are typically objects such as paths, trees,
  rankings, etc., which are not themselves points in Euclidean space. In
  order to use the above approach to reduce a given duel
  <math|D<around|(|X,\<Omega\>,c,p|)>> to a bilinear duel in a
  <with|font-shape|italic|computationally efficient manner>, one needs the
  following:

  <\enumerate>
    <item>An efficiently computable function <math|\<phi\>:X\<rightarrow\>K>
    which maps any <math|x\<in\>X> to a feasible point in
    <math|K\<subseteq\><reals><rsup|n>>.

    <item>A payoff matrix <math|M> such that
    <math|v<around|(|x,x<rprime|'>|)>=\<phi\><around|(|x|)><rsup|t>*M*\<phi\><around|(|x<rprime|'>|)>>,
    demonstrating that the problem is indeed bilinear.

    <item>A set of polynomially many feasible constraints which defines
    <math|K>.

    <item>A \Prandomized rounding algorithm\Q which takes as input a point in
    <math|K> and outputs an object in <math|X>.
  </enumerate>

  In many cases, parts (1) and (2) are straightforward. Parts (3) and (4) may
  be more challenging. For example, for the binary trees used in the
  compression duel, it is easy to map a tree to a vector of node depths.
  However, we do not know how to efficiently determine whether a given vector
  of node depths is indeed a mixture over trees (except for certain types of
  trees which are in sorted order, like the binary search trees in the binary
  search duel). In the next subsection, we show how computing approximate
  best responses suffices.

  <subsection|Approximating best responses and approximating
  minmax><label|sec:appx>

  In some cases, the polytope <math|K> may have exponentially or infinitely
  many facets, in which case the above linear program is not very useful. In
  this section, we show that if one can compute
  <with|font-shape|italic|approximate> best responses for a bilinear duel,
  then one can <with|font-shape|italic|approximate> minmax strategies.

  For any <math|<eps>\<gtr\>0>, an <math|<eps>>-best response to a player 2
  strategy <math|x<rprime|'>\<in\>K<rprime|'>> is any <math|x\<in\>K> such
  that <math|x<rsup|t>*M*x<rprime|'>\<geq\>max<rsub|y\<in\>K>
  y<rsup|t>*M*x<rprime|'>-<eps>>. An <math|<eps>>-best response for player 2
  is defined analogously. An
  <math|<eps>>-minmax strategy <math|x\<in\>K> for player 1 is one that
  guarantees player 1 an expected payoff no worse than the value minus
  <math|<eps>>, i.e.,

  <\equation*>
    min<rsub|x<rprime|'>\<in\>K<rprime|'>> v<around|(|x,x<rprime|'>|)>\<geq\>max<rsub|y\<in\>K>
    min<rsub|x<rprime|'>\<in\>K<rprime|'>> v<around|(|y,x<rprime|'>|)>-<eps>.
  </equation*>

  Best response oracles are functions from <math|K> to <math|K<rprime|'>> and
  vice versa. However, for many applications (and in particular the ones in
  this paper) where all feasible points are nonnegative, one can define a
  best response oracle for all points in the nonnegative orthant.
  (With additional effort, one can remove this assumption using Awerbuch and
  Kleinberg's elegant notion of a barycentric spanner <cite|AK04>.) For
  scaling purposes, we assume that for some <math|B\<gtr\>0>, the convex sets
  are <math|K\<subseteq\><around|[|0,B|]><rsup|n>> and
  <math|K<rprime|'>\<subseteq\><around|[|0,B|]><rsup|n<rprime|'>>> and the
  matrix <math|M\<in\>[-B,B]<rsup|n\<times\>n<rprime|'>>> is bounded as well.

  Fix any <math|<eps>\<gtr\>0>. We suppose that we are given an
  <math|<eps>>-approximate best response oracle in the following sense. For
  player 1, this is an oracle <math|<cO>:<around|[|0,B|]><rsup|n<rprime|'>>\<rightarrow\>K>
  which has the property that <math|<cO><around|(|x<rprime|'>|)><rsup|t>*M*x<rprime|'>\<geq\>max<rsub|x\<in\>K>
  x<rsup|t>*M*x<rprime|'>-<eps>> for any <math|x<rprime|'>\<in\><around|[|0,B|]><rsup|n<rprime|'>>>.
  Similarly for <math|<cO><rprime|'>> for player 2. Hence, the oracle may be
  asked to respond to points that are not feasible strategies of the
  opponent. As can be seen in a number of applications, this does not impose
  a significant additional burden.
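
  For intuition, a best response oracle can be realized by plain enumeration
  when the vertices of <math|K> are few, since a linear objective over a
  polytope is maximized at a vertex. The Python sketch below is such a toy
  oracle for rock-paper-scissors; the names <verbatim|best_response> and
  <verbatim|response_gap> are hypothetical, and enumeration is an assumption
  that does not scale to the exponentially large duels considered in this
  paper.

```python
from fractions import Fraction as F

def payoff(x, M, y):
    """Bilinear payoff x^t M y for vectors given as lists."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def best_response(vertices, M, y):
    """Exact oracle by enumeration over the vertices of K.
    The input y may be any vector, not only a feasible strategy."""
    return max(vertices, key=lambda x: payoff(x, M, y))

def response_gap(x, vertices, M, y):
    """Shortfall of x relative to a best response to y; x is an eps-best
    response exactly when this gap is at most eps."""
    return payoff(best_response(vertices, M, y), M, y) - payoff(x, M, y)

# Rock-paper-scissors with payoffs 0, 1/2, 1 (rows: player 1's pure strategies).
M = [[F(1, 2), F(0), F(1)],
     [F(1), F(1, 2), F(0)],
     [F(0), F(1), F(1, 2)]]
vertices = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

y = [F(1, 2), F(1, 2), F(0)]          # opponent mixes rock and paper equally
br = best_response(vertices, M, y)    # paper
gap_scissors = response_gap([0, 0, 1], vertices, M, y)
print(br, gap_scissors)
```

  In this instance scissors falls 1/4 short of the best response, so it is an
  <math|<eps>>-best response only for <math|<eps>\<geq\>1/4>.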

  <\lemma>
    <label|lem:appx>For any <math|<eps>\<gtr\>0>,
    <math|n,n<rprime|'>\<geq\>1>, <math|B\<gtr\>0>, and any bilinear duel
    with convex <math|K\<subseteq\><around|[|0,B|]><rsup|n>> and
    <math|K<rprime|'>\<subseteq\><around|[|0,B|]><rsup|n<rprime|'>>> and
    <math|M\<in\>[-B,B]<rsup|n\<times\>n<rprime|'>>>, and any
    <math|<eps>>-best response oracles, there is an algorithm for finding
    <math|<around*|(|24<around|(|<eps>max
    <around|(|m,m<rprime|'>|)>|)><rsup|1/3>*B<rsup|2>*<around|(|n*n<rprime|'>|)><rsup|2/3>|)>>-minmax
    strategies <math|x\<in\>K,x<rprime|'>\<in\>K<rprime|'>>. The algorithm
    uses <math|<poly><around|(|\<beta\>,m,m<rprime|'>,1/<eps>|)>> runtime and
    makes <math|<poly><around|(|\<beta\>,m,m<rprime|'>,1/<eps>|)>> oracle
    calls.
  </lemma>

  The reduction and proof are deferred to Appendix<nbsp><reference|app:defn>.
  The proof uses a Hannan-type algorithm, namely \PFollow the expected leader\Q
  <cite|KV05>.

  We reduce the compression duel, where the base objects are trees, to a
  bilinear duel and use the approximate best response oracle. To perform such
  a reduction, one needs the following.

  <\enumerate>
    <item>An efficiently computable function <math|\<phi\>:X\<rightarrow\>K>
    which maps any <math|x\<in\>X> to a feasible point in
    <math|K\<subseteq\><reals><rsup|n>>.

    <item>A bounded payoff matrix <math|M> such that
    <math|v<around|(|x,x<rprime|'>|)>=\<phi\><around|(|x|)><rsup|t>*M*\<phi\><around|(|x<rprime|'>|)>>,
    demonstrating that the problem is indeed bilinear.

    <item><math|<eps>>-best response oracles for players 1 and 2. Here, the
    input to an <math|<eps>>-best response oracle for player 1 is
    <math|x<rprime|'>\<in\><around|[|0,B|]><rsup|n<rprime|'>>>.
  </enumerate>

  <subsection|Beatability>

  One interesting quantity to examine is how well a one-player optimization
  algorithm performs in the two-player game. In other words, if a single
  player were a monopolist solving the one-player optimization problem, how
  badly could they be beaten if a second player suddenly entered? For a
  particular one-player-optimal algorithm <math|A>, we define its
  <with|font-shape|italic|beatability over distribution <math|p>> to be
  <math|1-<E><rsub|r><around|[|v<around|(|A<around|(|p,r|)>,p|)>|]>>, the
  expected payoff of a best-responding opponent, and we
  define its <with|font-shape|italic|beatability> to be
  <math|sup<rsub|p><around*|(|1-<E><rsub|r><around|[|v<around|(|A<around|(|p,r|)>,p|)>|]>|)>>.

  <subsection|A warmup: the ranking duel><label|sec:rank>

  In the ranking duel, <math|\<Omega\>=<around|[|n|]>=<around|{|1,2,\<ldots\>,n|}>>,
  <math|X> is the set of permutations over <math|n> items, and
  <math|c<around|(|\<pi\>,\<omega\>|)>\<in\><around|[|n|]>> is the position
  of <math|\<omega\>> in <math|\<pi\>> (rank 1 is the \Pbest\Q rank). The
  greedy algorithm, which outputs permutation
  <math|<around|(|\<omega\><rsub|1>,\<omega\><rsub|2>,\<ldots\>,\<omega\><rsub|n>|)>>
  such that <math|p<around|(|\<omega\><rsub|1>|)>\<geq\>p<around|(|\<omega\><rsub|2>|)>\<geq\>\<cdots\>\<geq\>p<around|(|\<omega\><rsub|n>|)>>,
  is optimal in the one-player version of the problem.<footnote|In some
  cases, such as a model of competing search engines, one could have the
  agents rank only <math|k> items, but the algorithmic results would be
  similar.>

  This game can be represented as a bilinear duel as follows. Let <math|K>
  and <math|K<rprime|'>> be the set of doubly stochastic matrices,
  <math|K=K<rprime|'>=<around|{|x\<in\><reals><rsub|\<geq\>0><rsup|n<rsup|2>><nbsp>\|<nbsp>\<forall\>j*<big|sum><rsub|i>x<rsub|i*j>=1,\<forall\>i*<big|sum><rsub|j>x<rsub|i*j>=1|}>>.
  Here <math|x<rsub|i*j>> indicates the <with|font-shape|italic|probability>
  that item <math|i> is placed in position <math|j>, in some distribution
  over rankings. The Birkhoff-von Neumann Theorem states that the set
  <math|K> is precisely the set of probability distributions over rankings
  (where each ranking is represented as a permutation matrix
  <math|x\<in\><around|{|0,1|}><rsup|n<rsup|2>>>), and moreover any such
  <math|x\<in\>K> can be implemented efficiently via a form of randomized
  rounding. See, for example, Corollary 1.4.15 of <cite|LP86>. Note <math|K>
  is a polytope in <math|n<rsup|2>> dimensions with <math|O<around|(|n|)>>
  facets. In this representation, the expected payoff of <math|x> versus
  <math|x<rprime|'>> is

  <\equation*>
    <big|sum><rsub|i>p<around|(|i|)>*<around*|(|<frac|1|2>*Pr
    <around|[|<text|Equally rank >i|]>+Pr <around|[|<text|P1 ranks
    >i<text| higher>|]>|)>=<big|sum><rsub|i>p<around|(|i|)>*<big|sum><rsub|j>x<rsub|i*j>*<around*|(|<frac|1|2>*x<rprime|'><rsub|i*j>+<big|sum><rsub|k\<gtr\>j>x<rprime|'><rsub|i*k>|)>.
  </equation*>

  The above is clearly bilinear in <math|x> and <math|x<rprime|'>> and can be
  written as <math|x<rsup|t>*M*x<rprime|'>> for some matrix <math|M> with
  bounded coefficients. Hence, we can solve the bilinear duel by the linear
  program (<reference|eq:LP>) and round it to a (randomized) minmax optimal
  algorithm for ranking.

  We next examine the beatability of the greedy algorithm. Note that for the
  uniform probability distribution <math|p<around|(|1|)>=p<around|(|2|)>=\<ldots\>=p<around|(|n|)>=1/n>,
  the greedy algorithm outputting, say, <math|<around|(|1,2,\<ldots\>,n|)>>
  can be beaten with probability <math|1-1/n> by the strategy
  <math|<around|(|2,3,\<ldots\>,n,1|)>>. One can make greedy's selection
  unique by setting <math|p<around|(|i|)>=1/n+<around|(|i-n/2|)>*\<epsilon\>>,
  and for sufficiently small <math|\<epsilon\>> greedy can be beaten a
  fraction of the time arbitrarily close to <math|1-1/n>.
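
  The uniform example above can be checked numerically. The following is a
  minimal sketch, not part of the original analysis: the helper
  <verbatim|ranking_payoff> is a hypothetical name, and it scores a win as
  <math|1>, a tie as <math|1/2> and a loss as <math|0>, matching the payoff
  convention of the ranking duel.

```python
from fractions import Fraction

def ranking_payoff(pi1, pi2, p):
    """Expected payoff to player 1: win = 1, tie = 1/2, loss = 0.

    pi1, pi2 are rankings (lists of items, best position first);
    p maps each item to its probability.
    """
    pos1 = {item: k for k, item in enumerate(pi1)}
    pos2 = {item: k for k, item in enumerate(pi2)}
    total = Fraction(0)
    for item, prob in p.items():
        if pos1[item] < pos2[item]:
            total += prob          # player 1 ranks the item strictly higher
        elif pos1[item] == pos2[item]:
            total += prob / 2      # equal rank splits the payoff
    return total

n = 5
uniform = {i: Fraction(1, n) for i in range(1, n + 1)}
greedy = list(range(1, n + 1))        # (1, 2, ..., n)
shifted = greedy[1:] + greedy[:1]     # (2, 3, ..., n, 1)
print(ranking_payoff(shifted, greedy, uniform))  # prints 4/5
```

  With <math|n=5> the shifted ranking wins on every item except item
  <math|1>, giving payoff <math|1-1/n=4/5>.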

  <section|Hiring Duel><label|sec:hiring>

  In a hiring duel, there are two employers <math|A> and <math|B> and two
  corresponding sets of workers <math|U<rsub|A>=<around|{|a<rsub|1>,\<ldots\>,a<rsub|n>|}>>
  and <math|U<rsub|B>=<around|{|b<rsub|1>,\<ldots\>,b<rsub|n>|}>> with
  <math|n> workers each. The <math|i>'th worker of each set has a common
  value <math|v<around|(|i|)>> where <math|v<around|(|i|)>\<gtr\>v<around|(|j|)>>
  for all <math|i> and <math|j\<gtr\>i>. Thus there is a total ranking of
  workers <math|a<rsub|i>\<in\>U<rsub|A>> (similarly
  <math|b<rsub|i>\<in\>U<rsub|B>>) where a rank of <math|1> indicates the
  best worker, and workers are labeled according to rank. The goal of the
  employers is to hire a worker whose value (equivalently rank) beats that of
  his competitor's worker. Workers are interviewed by employers one-by-one in
  a random order. The relative ranks of workers are revealed to employers
  only at the time of the interview. That is, at time <math|i>, each employer
  has seen a prefix of the interview order consisting of <math|i> workers
  and knows only the projection of the total ranking on this
  prefix.<footnote|In some cases, an employer also knows when and whom his
  opponent hired, and may condition his strategy on this information as well.
  Only one of the settings described below needs this knowledge set; hence we
  defer our discussion of this point for now and explicitly mention the
  necessary assumptions where appropriate.> Hiring decisions must be made at
  the time of the interview, and only one worker may be hired. Thus the
  employers' pure strategies are mappings from any prefix and permutation of
  workers' ranks in that prefix to a binary hiring decision. We note that the
  permutation of ranks in a prefix does not affect the distribution of the
  rank of the just-interviewed worker, and hence without loss of generality
  we may assume the strategies are mappings from the round number and current
  rank to a hiring decision.

  In dueling notation, our game is <math|<around|(|X,\<Omega\>,c,p|)>> where
  the elements of <math|X> are functions <math|h:<around|{|1,\<ldots\>,n|}><rsup|2>\<rightarrow\><around|{|0,1|}>>
  indicating for any round <math|i> and projected rank of current interviewee
  <math|j\<leq\>i> the hiring decision <math|h<around|(|i,j|)>>;
  <math|\<Omega\>> is the set <math|<around|(|\<sigma\><rsub|A>,\<sigma\><rsub|B>|)>>
  of all pairs of permutations of <math|U<rsub|A>> and <math|U<rsub|B>>;
  <math|c<around|(|h,\<sigma\>|)>> is the value
  <math|v<around|(|\<sigma\><rsup|-1><around|(|i<rsup|\<ast\>>|)>|)>> of the
  first candidate <math|i<rsup|\<ast\>>=<math-up|argmin><rsub|i><around|{|i:h<around|(|i,<around|[|\<sigma\><rsup|-1><around|(|i|)>|]><rsub|i>|)>=1|}>>
  (where <math|<around|[|\<sigma\><rsup|-1><around|(|i|)>|]><rsub|j>>
  indicates the projected rank of the <math|i>'th candidate among the first
  <math|j> candidates according to <math|\<sigma\>>) that received an offer;
  and <math|p> (as is typical in the secretary problem) is the uniform
  distribution over <math|\<Omega\>>. The mixed strategies
  <math|\<pi\>\<in\>\<Delta\><around|(|X|)>> are simply mappings
  <math|\<pi\>:<around|{|1,\<ldots\>,n|}><rsup|2>\<rightarrow\><around|[|0,1|]>>
  from rounds and projected ranks to a probability
  <math|\<pi\><around|(|i,j|)>> of a hiring decision.

  The values <math|v<around|(|\<cdummy\>|)>> may be chosen adversarially, and
  hence in the one-player setting the optimal algorithm against a worst-case
  <math|v<around|(|\<cdummy\>|)>> is the one that maximizes the probability
  of hiring the best worker (the worst-case values set
  <math|v<around|(|1|)>=1> and <math|v<around|(|i|)>\<ll\>1> for
  <math|i\<gtr\>1>). In the literature on secretary problems, the following
  <with|font-shape|italic|classical algorithm> is known to hire the best
  worker with probability approaching <math|<frac|1|e>>: interview
  <math|n/e> workers without hiring, then hire the next one who beats all
  previous candidates. Furthermore, there
  is no other algorithm that hires the best worker with higher probability.
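
  As a sanity check on the <math|1/e> figure, the success probability of a
  cutoff rule can be computed exactly. The sketch below is illustrative only;
  the function name <verbatim|classical_success_probability> and the choice
  <math|n=1000> are assumptions made for the example.

```python
import math

def classical_success_probability(n, k):
    """Probability of hiring the best of n workers when the first k are
    interviewed without hiring and the next worker beating all of them is hired."""
    if k == 0:
        return 1.0 / n  # hires the first worker, who is best with probability 1/n
    # The best worker sits at position i (probability 1/n) and is hired iff the
    # best of the first i-1 workers lies among the first k (probability k/(i-1)).
    return sum(k / (i - 1) for i in range(k + 1, n + 1)) / n

n = 1000
k = round(n / math.e)
print(k, classical_success_probability(n, k), 1 / math.e)
```

  For moderately large <math|n> the computed value is already close to
  <math|1/e>.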

  <subsection|Common pools of workers><label|subsec:commonhiring>

  In this section, we study the <with|font-shape|italic|common hiring duel>
  in which employers see the <with|font-shape|italic|same> candidates in the
  <with|font-shape|italic|same> order so that
  <math|\<sigma\><rsub|A>=\<sigma\><rsub|B>> and each employer observes when
  the other hires. In this case, the following strategy <math|\<pi\>> is a
  symmetric equilibrium: If the opponent has already hired, then hire anyone
  who beats his employee; otherwise hire as soon as the current candidate has
  at least a <math|50%> chance of being the best of the remaining candidates.

  <\lemma>
    <label|lem:commonequil>Strategy <math|\<pi\>> is efficiently computable
    and constitutes a symmetric equilibrium of the common hiring duel.
  </lemma>

  The computability follows from a derivation of probabilities in terms of
  binomials, and the equilibrium claim follows by observing that there can be
  no profitable deviation. This strategy also beats the classical algorithm,
  enabling us to provide non-trivial lower and upper bounds for its
  beatability.

  <\proof>
    For a round <math|i>, we compute a threshold <math|t<rsub|i>> such that
    <math|\<pi\>> hires if and only if the projected rank of the current
    candidate <math|j> is at most <math|t<rsub|i>>. Note that if <math|i>
    candidates are observed, the probability that the <math|t<rsub|i>>'th
    best among them is better than all remaining candidates is precisely
    <math|<choose|i|t<rsub|i>>/<choose|n|t<rsub|i>>>. The numerator is the
    number of ways to place the <math|1> through <math|t<rsub|i>>'th best
    candidates overall among the first <math|i> and the denominator is the
    number of ways to place the <math|1> through <math|t<rsub|i>>'th best
    among the whole order. Hence to efficiently compute <math|\<pi\>> we just
    need to compute <math|t<rsub|i>> or, equivalently, estimate these ratios
    of binomials and hire on round <math|i>, upon observing the
    <math|j>'th best so far, whenever <math|<choose|i|j>/<choose|n|j>\<geq\>1/2>.

    We further note <math|\<pi\>> is a symmetric equilibrium since if an
    employer deviates and hires early then by definition the opponent has a
    better than <math|50%> chance of getting a better candidate. Similarly,
    if an employer deviates and hires late then by definition his candidate
    has at most a <math|50%> chance of being a better candidate than that of
    his opponent.
  </proof>
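
  The threshold computation in the proof can be sketched as follows. This is
  a minimal illustration assuming exact integer binomials are acceptable; the
  name <verbatim|hiring_thresholds> is hypothetical, and the sketch covers
  only the case in which the opponent has not yet hired.

```python
from math import comb

def hiring_thresholds(n):
    """t[i] = largest projected rank j (1 <= j <= i) with C(i,j)/C(n,j) >= 1/2,
    or 0 if no rank qualifies on round i."""
    t = [0] * (n + 1)  # index 0 is unused so that t[i] matches round i
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            # The ratio C(i,j)/C(n,j) is nonincreasing in j, so stop at the
            # first rank that fails the test.
            if 2 * comb(i, j) >= comb(n, j):
                t[i] = j
            else:
                break
    return t

print(hiring_thresholds(4))  # prints [0, 0, 1, 2, 4]
```

  Comparing <math|2*<choose|i|j>> with <math|<choose|n|j>> in integer
  arithmetic avoids rounding issues near the <math|1/2> boundary.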

  <\lemma>
    <label|lem:hiringbeatability>The beatability of the classical algorithm
    is at least <math|0.51> and at most <math|0.82>.
  </lemma>

  The lower bound follows from the fact that <math|\<pi\>> beats the
  classical algorithm with probability bounded above <math|1/2> when the
  classical algorithm hires early (i.e., before round <math|n/2>), and the
  upper bound follows from the fact that the classical algorithm guarantees a
  probability of <math|1/e> of hiring the best candidate, in which case no
  algorithm can beat it.

  <\proof>
    For the lower bound, note that in any event, <math|\<pi\>> guarantees a
    payoff of at least <math|1/2> against the classical algorithm. We next
    argue that for a constant fraction of the probability space,
    <math|\<pi\>> guarantees a payoff of strictly better than <math|1/2>. In
    particular, for some <math|q,1/e\<less\>q\<less\>1/2>, consider the event
    that the classical algorithm hires in the interval
    <math|<around|[|n/e,q*n|]>>. This event happens whenever the best among
    the first <math|q*n> candidates is not among the first <math|n/e>
    candidates, and hence has a probability of <math|<around|(|1-1/q*e|)>>.
    Conditioned on this event, <math|\<pi\>> beats the classical algorithm
    whenever the best candidate overall is in the last
    <math|n*<around|(|1-q|)>> candidates,<footnote|This is a loose lower
    bound; there are many other instances where <math|\<pi\>> also wins,
    e.g., if the second-best candidate is in the last
    <math|n*<around|(|1-q|)>> candidates and the best occurs after the third
    best in the first <math|q*n> candidates.> which happens with probability
    <math|<around|(|1-q|)>> (the conditioning does not change this
    probability since it is only a property of the permutation projected onto
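    the first <math|q*n> elements). Hence the overall payoff of <math|\<pi\>>
    against the classical algorithm is at least <math|<around|(|1-q|)>*<around|(|1-1/q*e|)>+<around|(|1/2|)>*<around|(|1/q*e|)>>.
    This expression is maximized at <math|q=1/<sqrt|2*e>\<approx\>0.43>,
    where it equals <math|1+1/e-<sqrt|2/e>\<approx\>0.5101>, which yields the
    result.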

    For the upper bound, note as mentioned above that the classical algorithm
    has a probability approaching <math|1/e> of hiring the
    <with|font-shape|italic|best> candidate. From here, we see
    <math|<around|(|<around|(|1/2*e|)>+<around|(|1-1/e|)>|)>=1-1/2*e\<less\>0.82>
    is an upper bound on the beatability of the classical algorithm since the
    best an opponent can do is always hire the best worker when the classical
    algorithm hires the best worker and always hire a better worker when the
    classical algorithm does not hire the best worker.
  </proof>
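
  The two constants in the lemma can be reproduced numerically. The sketch
  below is illustrative: it evaluates the lower-bound expression from the
  proof on a grid of <math|q> values (the grid resolution is an arbitrary
  choice) and evaluates the upper bound <math|1-1/2*e> directly.

```python
import math

def lower_bound(q):
    """Payoff bound (1-q)(1-1/(qe)) + (1/2)(1/(qe)) from the proof, for 1/e < q < 1/2."""
    hit = 1 - 1 / (q * math.e)  # classical algorithm hires between rounds n/e and qn
    return (1 - q) * hit + 0.5 * (1 - hit)

# Grid search over q in (1/e, 1/2).
lo, hi = 1 / math.e, 0.5
grid = [lo + k * (hi - lo) / 10000 for k in range(1, 10000)]
best_q = max(grid, key=lower_bound)
upper = 1 - 1 / (2 * math.e)

print(best_q, lower_bound(best_q))  # q near 0.429, bound near 0.5101
print(upper)                        # about 0.816
```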

  <subsection|Independent pools of workers><label|subsec:separatehiring>

  In this section, we study the <with|font-shape|italic|independent hiring
  duel> in which the employers see <with|font-shape|italic|different>
  candidates. Thus <math|\<sigma\><rsub|A>\<neq\>\<sigma\><rsub|B>> and the
  employers do not see when the opponent hires. We use the bilinear duel
  framework introduced in Section<nbsp><reference|sec:bilinear> to compute an
  equilibrium for this setting, yielding the following theorem.

  <\theorem>
    <label|thm:separatehiring>The equilibrium strategies of the independent
    hiring duel are efficiently computable.
  </theorem>

  The main idea is to represent strategies <math|\<pi\>> by vectors
  <math|<around|{|p<rsub|i*j>|}>> where <math|p<rsub|i*j>> is the (total)
  probability of hiring the <math|j>'th best candidate seen so far on round
  <math|i>. Let <math|q<rsub|i>> be the probability of reaching round
  <math|i>, and note it can be computed from the
  <math|<around|{|p<rsub|i*j>|}>>. Recall <math|\<pi\><around|(|i,j|)>> is
  the probability of hiring the <math|j>'th best so far at round <math|i>
  conditional on seeing the <math|j>'th best so far at round <math|i>. Thus
  using Bayes' Rule we can derive an efficiently-computable bijective mapping
  (with an efficiently computable inverse) <math|\<phi\><around|(|\<pi\>|)>>
  between <math|\<pi\>> and <math|<around|{|p<rsub|i*j>|}>> which simply sets
  <math|\<pi\><around|(|i,j|)>=p<rsub|i*j>/<around|(|q<rsub|i>/i|)>>. It only
  remains to show that one can find a matrix <math|M> such that the payoff of
  a strategy <math|\<pi\>> versus a strategy <math|\<pi\><rprime|'>> is
  <math|\<phi\><around|(|\<pi\>|)><rsup|t>*M*\<phi\><around|(|\<pi\><rprime|'>|)>>.
  This is done by calculating the appropriate binomials.

  We show how to apply the bilinear duel framework to compute the equilibrium
  of the independent hiring duel. This requires the following steps: define a
  subset <math|K> of Euclidean space to represent strategies, define a
  bijective mapping between <math|K> and feasible (mixed) strategies
  <math|\<Delta\><around|(|X|)>>, and show how to represent the payoff matrix
  of strategies in the bilinear duel space. We discuss each step in order.

  <with|font-series|bold|Defining <math|K>.> For each
  <math|1\<leq\>i\<leq\>n> and <math|j\<leq\>i> we define <math|p<rsub|i*j>>
  to be the (total) probability of seeing and hiring the <math|j>'th best
  candidate seen so far at round <math|i>. Our subspace
  <math|K=<around|[|0,1|]><rsup|n*<around|(|n+1|)>/2>> consists of the
  collection of probabilities <math|<around|{|p<rsub|i*j>|}>>. To derive
  constraints on this space, we introduce a new variable <math|q<rsub|i>>
  representing the probability of reaching round <math|i>. We note that the
  probability of reaching round <math|<around|(|i+1|)>> must equal the
  probability of reaching round <math|i> and <with|font-shape|italic|not>
  hiring, so that <math|q<rsub|i+1>=q<rsub|i>-<big|sum><rsub|j=1><rsup|i>p<rsub|i*j>>.
  Furthermore, the probability <math|p<rsub|i*j>> cannot exceed the
  probability of reaching round <math|i> and interviewing the <math|j>'th
  best candidate seen so far. The probability of reaching round <math|i> is
  <math|q<rsub|i>> by definition, and the probability that the projected rank
  of the <math|i>'th candidate is <math|j> is <math|1/i> by our choice of a
  uniformly random permutation. Thus <math|p<rsub|i*j>\<leq\>q<rsub|i>/i>.
  Together with the initial condition that <math|q<rsub|1>=1>, these
  constraints completely characterize <math|K>.

  <with|font-series|bold|Mapping.> Recall a strategy <math|\<pi\>> indicates
  for each <math|i> and <math|j\<leq\>i> the
  <with|font-shape|italic|conditional> probability of making an offer given
  that the employer is interviewing the <math|i>'th candidate and his
  projected rank is <math|j> whereas <math|p<rsub|i*j>> is the
  <with|font-shape|italic|total> probability of interviewing the <math|i>'th
  candidate with a projected rank of <math|j> and making an offer. Thus
  <math|\<pi\><around|(|i,j|)>=p<rsub|i*j>/<around|(|q<rsub|i>/i|)>> and so
  <math|p<rsub|i*j>=q<rsub|i>*\<pi\><around|(|i,j|)>/i>. Together with the
  equalities derived above that <math|q<rsub|1>=1> and
  <math|q<rsub|i+1>=q<rsub|i>-<big|sum><rsub|j=1><rsup|i>p<rsub|i*j>>, we can
  recursively map any strategy <math|\<pi\>> to <math|K> efficiently. To map
  back we just take the inverse of this bijection: given a point
  <math|<around|{|p<rsub|i*j>|}>> in <math|K>, we compute the (unique)
  <math|q<rsub|i>> satisfying the constraints <math|q<rsub|1>=1> and
  <math|q<rsub|i+1>=q<rsub|i>-<big|sum><rsub|j=1><rsup|i>p<rsub|i*j>>, and
  define <math|\<pi\><around|(|i,j|)>=p<rsub|i*j>/<around|(|q<rsub|i>/i|)>>.
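
  The recursive mapping and its inverse can be sketched as follows. This is a
  minimal illustration, not the authors' implementation: strategies are stored
  as dictionaries keyed by <math|<around|(|i,j|)>> with <math|j\<leq\>i>, the
  function names are hypothetical, and rounds reached with probability zero
  are assigned <math|\<pi\><around|(|i,j|)>=0> by convention.

```python
from fractions import Fraction

def strategy_to_point(pi, n):
    """Map conditional hiring probabilities pi[(i, j)] to total probabilities p[(i, j)]."""
    p, q = {}, Fraction(1)  # q is the probability of reaching round i; q_1 = 1
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            p[(i, j)] = q * pi[(i, j)] / i
        q -= sum(p[(i, j)] for j in range(1, i + 1))
    return p

def point_to_strategy(p, n):
    """Inverse map: recover pi[(i, j)] = p[(i, j)] / (q_i / i)."""
    pi, q = {}, Fraction(1)
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            pi[(i, j)] = p[(i, j)] / (q / i) if q > 0 else Fraction(0)
        q -= sum(p[(i, j)] for j in range(1, i + 1))
    return pi

# Example with n = 3: pass on round 1, then hire the first candidate who is
# the best seen so far.
n = 3
pi = {(i, j): Fraction(int(i >= 2 and j == 1))
      for i in range(1, n + 1) for j in range(1, i + 1)}
p = strategy_to_point(pi, n)
print(p[(2, 1)], p[(3, 1)])  # prints 1/2 1/6
```

  Exact rational arithmetic keeps the round trip between <math|\<pi\>> and
  <math|<around|{|p<rsub|i*j>|}>> free of rounding error.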

  <with|font-series|bold|Payoff Matrix.> By the above definitions, for any
  strategy <math|\<pi\>> and corresponding mapping
  <math|<around|{|p<rsub|i*j>|}>>, the probability that the strategy hires
  the <math|j>'th best so far on round <math|i> is <math|p<rsub|i*j>>. Given
  that employer <math|A> hires the <math|j>'th best so far on round <math|i>
  and employer <math|B> hires the <math|j<rprime|'>>'th best so far on round
  <math|i<rprime|'>>, we define <math|M<rsub|i*j*i<rprime|'>*j<rprime|'>>> to
  be the probability that the overall rank of employer <math|A>'s hire beats
  that of employer <math|B>'s hire plus one-half times the probability that
  their ranks are equal. We can derive the entries of this matrix as
  follows: Let <math|E<rsup|X><rsub|r>> be the event that with respect to
  permutation <math|\<sigma\><rsub|X>> the overall rank of a fixed candidate
  is <math|r>, and <math|F<rsup|X><rsub|i*j>> be the event that the projected
  rank of the last candidate in a random prefix of size <math|i> is <math|j>.
  Then

  <\equation*>
    M<rsub|i*j*i<rprime|'>*j<rprime|'>>=<big|sum><rsub|r,r<rprime|'>:1\<leq\>r\<less\>r<rprime|'>\<leq\>n>Pr
    <around|[|E<rsup|A><rsub|r>\|F<rsup|A><rsub|i*j>|]>*Pr
    <around|[|E<rsup|B><rsub|r<rprime|'>>\|F<rsup|B><rsub|i<rprime|'>*j<rprime|'>>|]>+<frac|1|2>*<big|sum><rsub|1\<leq\>r\<leq\>n>Pr
    <around|[|E<rsup|A><rsub|r>\|F<rsup|A><rsub|i*j>|]>*Pr
    <around|[|E<rsup|B><rsub|r>\|F<rsup|B><rsub|i<rprime|'>*j<rprime|'>>|]>.
  </equation*>

  Furthermore, by Bayes' rule, <math|Pr <around|[|E<rsup|X><rsub|r>\|F<rsup|X><rsub|i*j>|]>=Pr
  <around|[|F<rsup|X><rsub|i*j>\|E<rsup|X><rsub|r>|]>*Pr
  <around|[|E<rsup|X><rsub|r>|]>/Pr <around|[|F<rsup|X><rsub|i*j>|]>> where
  <math|Pr <around|[|E<rsup|X><rsub|r>|]>=1/n> and <math|Pr
  <around|[|F<rsup|X><rsub|i*j>|]>=1/i>. To compute <math|Pr
  <around|[|F<rsup|X><rsub|i*j>\|E<rsup|X><rsub|r>|]>>, we select the ranks
  of the other candidates in the prefix of size <math|i>. There are
  <math|<choose|r-1|j-1>> ways to pick the ranks of the better candidates and
  <math|<choose|n-r|i-j>> ways to pick the ranks of the worse candidates,
  since exactly <math|n-r> candidates rank below <math|r>.
  As there are <math|<choose|n-1|i-1>> ways overall to pick the ranks of the
  other candidates, we see:

  <\equation*>
    Pr <around|[|F<rsup|X><rsub|i*j>\|E<rsup|X><rsub|r>|]>=<frac|<choose|r-1|j-1><choose|n-r|i-j>|<choose|n-1|i-1>>.
  </equation*>

  Letting <math|<around|{|p<rsub|i*j>|}>> be the mapping
  <math|\<phi\><around|(|\<pi\>|)>> of employer <math|A>'s strategy
  <math|\<pi\>> and <math|<around|{|p<rprime|'><rsub|i*j>|}>> be the mapping
  <math|\<phi\><around|(|\<pi\><rprime|'>|)>> of employer <math|B>'s strategy
  <math|\<pi\><rprime|'>>, we see that <math|c<around|(|\<pi\>,\<pi\><rprime|'>|)>=\<phi\><around|(|\<pi\>|)><rsup|t>*M*\<phi\><around|(|\<pi\><rprime|'>|)>>,
  as required.
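
  The entries of <math|M> can be computed directly from these binomials. The
  following is a sketch under stated assumptions: the function names are
  hypothetical, and the worse candidates in the prefix are drawn from the
  <math|n-r> workers ranked below <math|r>, which is the count under which the
  conditional probabilities sum to one.

```python
from fractions import Fraction
from math import comb

def prob_projected_given_overall(n, i, j, r):
    """Pr[F_ij | E_r]: the i-th interviewee has projected rank j given overall rank r."""
    # j-1 of the r-1 better workers and i-j of the n-r worse workers fill the prefix.
    return Fraction(comb(r - 1, j - 1) * comb(n - r, i - j), comb(n - 1, i - 1))

def prob_overall_given_projected(n, i, j, r):
    """Pr[E_r | F_ij] by Bayes' rule with Pr[E_r] = 1/n and Pr[F_ij] = 1/i."""
    return prob_projected_given_overall(n, i, j, r) * Fraction(i, n)

def payoff_entry(n, i, j, i2, j2):
    """Entry of M: Pr[A's hire outranks B's] + (1/2) Pr[equal ranks]."""
    a = [prob_overall_given_projected(n, i, j, r) for r in range(1, n + 1)]
    b = [prob_overall_given_projected(n, i2, j2, r) for r in range(1, n + 1)]
    win = sum(a[r] * b[s] for r in range(n) for s in range(r + 1, n))
    tie = sum(a[r] * b[r] for r in range(n))
    return win + tie / 2

print(payoff_entry(5, 5, 1, 5, 5))  # prints 1
```

  In the printed example both employers hire on the last round, so projected
  ranks coincide with overall ranks and the best worker beats the worst with
  certainty.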

  By the above arguments, and the machinery from
  Section<nbsp><reference|sec:bilinear>, we have proven
  Theorem<nbsp><reference|thm:separatehiring>, which states that the
  equilibrium strategies of the independent hiring duel are efficiently
  computable.

  <section|Compression Duel><label|sec:compression>

  In a compression duel, two competitors each choose a binary tree with leaf
  set <math|\<Omega\>>. An element <math|\<omega\>\<in\>\<Omega\>> is then
  chosen according to distribution <math|p>, and whichever player's tree has
  <math|\<omega\>> closest to the root is the winner. This game can be
  thought of as a competition between prefix-free compression schemes for a
  base set of words. The Huffman algorithm, which repeatedly pairs nodes with
  lowest probability, is known to be optimal for single-player compression.
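
  For concreteness, a standard heap-based construction of Huffman depths is
  sketched below. It is a generic textbook version rather than anything
  specific to this paper; the name <verbatim|huffman_depths> and the example
  distribution are assumptions, and ties are broken by insertion order.

```python
import heapq
from itertools import count

def huffman_depths(p):
    """Return {symbol: depth} for a Huffman tree over the distribution p."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(prob, next(tiebreak), [sym]) for sym, prob in p.items()]
    heapq.heapify(heap)
    depth = {sym: 0 for sym in p}
    while len(heap) > 1:
        pa, _, leaves_a = heapq.heappop(heap)  # lowest probability
        pb, _, leaves_b = heapq.heappop(heap)  # second lowest
        for sym in leaves_a + leaves_b:
            depth[sym] += 1  # every leaf under the merged node moves one level down
        heapq.heappush(heap, (pa + pb, next(tiebreak), leaves_a + leaves_b))
    return depth

print(huffman_depths({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
# prints {'a': 1, 'b': 2, 'c': 3, 'd': 3}
```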

  The compression duel is <math|D<around|(|X,\<Omega\>,c,p|)>>, where
  <math|\<Omega\>=<around|[|n|]>> and <math|X> is the set of binary trees
  with leaf set <math|\<Omega\>>. For <math|T\<in\>X> and
  <math|\<omega\>\<in\>\<Omega\>>, <math|c<around|(|T,\<omega\>|)>> is the
  depth of <math|\<omega\>> in <math|T>. In Section
  <reference|sec.compress.fail> we consider a variant in which not every
  element of <math|\<Omega\>> must appear in the tree.

  <subsection|Computing an equilibrium>

  The compression duel can be represented as a bilinear game. In this case,
  <math|K> and <math|K<rprime|'>> will be sets of stochastic matrices, where
  a matrix entry <math|x<rsub|i*j>> indicates the probability
  that item <math|\<omega\><rsub|i>> is placed at depth <math|j>. The set
  <math|K> is precisely the set of probability distributions over node depths
  that are consistent with probability distributions over binary trees. We
  would like to compute minmax optimal algorithms as in Section
  <reference|sec:blexact>, but we do not have a randomized rounding scheme
  that maps elements of <math|K> to binary trees. Instead, following Section
  <reference|sec:appx>, we will find approximate minmax strategies by
  constructing an <math|<eps>>-best response oracle.

  The mapping <math|\<phi\>:X\<to\>K> is straightforward: it maps a binary
  tree to its depth profile. Also, the expected payoff of <math|x\<in\>K>
  versus <math|x<rprime|'>\<in\>K<rprime|'>> is
  <math|<big|sum><rsub|i>p<around|(|i|)>*<big|sum><rsub|j>x<rsub|i*j>*<around*|(|<frac|1|2>*x<rprime|'><rsub|i*j>+<big|sum><rsub|k\<gtr\>j>x<rprime|'><rsub|i*k>|)>>
  which can be written as <math|x<rsup|t>*M*x<rprime|'>> where matrix
  <math|M> has bounded entries. To apply Lemma <reference|lem:appx>, we must
  now provide an <math|<eps>> best response oracle, which we implement by
  reducing to a knapsack problem.

  Fix <math|p> and <math|x<rprime|'>\<in\>K<rprime|'>>. We will reduce the
  problem of finding a best response for <math|x<rprime|'>> to the
  multiple-choice knapsack problem (MCKP), for which there is an FPTAS
  <cite|Lawler-79>. In the MCKP, there are <math|n> lists of items, say
  <math|<around|{|<around|(|\<alpha\><rsub|i*1>,\<ldots\>,\<alpha\><rsub|i*k<rsub|i>>|)>\|1\<leq\>i\<leq\>n|}>>,
  with each item <math|\<alpha\><rsub|i*j>> having a value
  <math|v<rsub|i*j>\<geq\>0> and weight <math|w<rsub|i*j>\<geq\>0>. The
  problem is to choose exactly one item from each list with total weight at
  most <math|1>, with the goal of maximizing total value. Our reduction is as
  follows. For each <math|\<omega\><rsub|i>\<in\>\<Omega\>> and
  <math|0\<leq\>j\<leq\>n>, define <math|w<rsub|i*j>=2<rsup|-j>> and
  <math|v<rsub|i*j>=p<around|(|\<omega\><rsub|i>|)>*<around*|(|<frac|1|2>*x<rprime|'><rsub|i*j>+<big|sum><rsub|d\<gtr\>j>x<rprime|'><rsub|i*d>|)>>.
  This defines a MCKP input instance. For any given <math|t\<in\>X>,
  <math|v<around|(|\<phi\><around|(|t|)>,x<rprime|'>|)>=<big|sum><rsub|\<omega\><rsub|i>\<in\>\<Omega\>>v<rsub|i*d<rsub|t><around|(|i|)>>>
  and <math|<big|sum><rsub|\<omega\><rsub|i>\<in\>\<Omega\>>w<rsub|i,d<rsub|t><around|(|i|)>>\<leq\>1>
  by the Kraft inequality. Thus, any strategy for the compression duel can be
  mapped to a solution to the MCKP. Likewise, a solution to the MCKP can be
  mapped in a value-preserving way to a binary tree <math|t> with leaf set
  <math|\<Omega\>>, again by the Kraft inequality. This completes the
  reduction.
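
  The reduction can be illustrated on tiny instances by solving the resulting
  MCKP exhaustively. The sketch below is a brute-force stand-in for the FPTAS,
  which is not reproduced here; the function name, the depth cap
  <verbatim|max_depth> and the example instance are assumptions.

```python
from fractions import Fraction
from itertools import product

def best_response_depths(p, x_opp, max_depth):
    """Brute-force MCKP: choose a depth d_i per item with sum 2^-d_i <= 1,
    maximizing sum_i p_i * (x'_{i,d_i}/2 + sum_{d > d_i} x'_{i,d}).

    x_opp[i][d] is the probability that the opponent places item i at depth d.
    """
    n = len(p)
    depths = range(max_depth + 1)
    value = [[p[i] * (Fraction(x_opp[i][j], 2)
                      + sum(x_opp[i][d] for d in depths if d > j))
              for j in depths] for i in range(n)]
    best, best_choice = None, None
    for choice in product(depths, repeat=n):
        if sum(Fraction(1, 2 ** j) for j in choice) > 1:
            continue  # violates the Kraft inequality, so no tree has these depths
        total = sum(value[i][choice[i]] for i in range(n))
        if best is None or total > best:
            best, best_choice = total, choice
    return best, best_choice

# The opponent deterministically plays a tree with depths (1, 2, 2).
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
opp_depths = [1, 2, 2]
x_opp = [[1 if d == od else 0 for d in range(3)] for od in opp_depths]
print(best_response_depths(p, x_opp, 2))  # prints (Fraction(1, 2), (1, 2, 2))
```

  The search is exponential in <math|n> and is meant only to make the value
  and weight definitions concrete.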

  <subsection|Beatability>

  We will obtain a bound of <math|3/4> on the beatability of the Huffman
  algorithm. The high-level idea is to choose an arbitrary tree <math|T> and
  consider the leaves for which <math|T> beats <math|H> and vice-versa. We
  then apply structural properties of trees to limit the relative sizes of
  these sets of leaves, then use properties of Huffman trees to bound the
  relative probability that a sampled leaf falls in one set or the other.

  Before bounding the beatability of the Huffman algorithm in the No Fail
  compression model, we review some facts about Huffman trees, namely that
  nodes with lower probability occur deeper in the tree, and that siblings
  are always paired in order of probability (see, for example, page 402 of
  Gersting <cite|gersting-93>). In what follows, we will suppose that <math|H>
  is a Huffman tree.

  <\fact>
    <label|fact.huff.depths>If <math|d<rsub|H><around|(|v<rsub|1>|)>\<gtr\>d<rsub|H><around|(|v<rsub|2>|)>>
    then <math|p<rsub|H><around|(|v<rsub|1>|)>\<leq\>p<rsub|H><around|(|v<rsub|2>|)>>.
  </fact>

  <\fact>
    <label|fact.huff.sibling>If <math|v<rsub|1>> and <math|v<rsub|2>> are
    siblings with <math|p<rsub|H><around|(|v<rsub|1>|)>\<leq\>p<rsub|H><around|(|v<rsub|2>|)>>,
    then for every node <math|v<rsub|3>\<in\>H> either
    <math|p<rsub|H><around|(|v<rsub|3>|)>\<leq\>p<rsub|H><around|(|v<rsub|1>|)>>
    or <math|p<rsub|H><around|(|v<rsub|3>|)>\<geq\>p<rsub|H><around|(|v<rsub|2>|)>>.
  </fact>

  We next give a bound on the relative probabilities of nodes on any given
  level of a Huffman tree, subject to the tree not being too \Psparse\Q at
  the subsequent (deeper) level. Let <math|p<rsub|H><rsup|m*i*n><around|(|d|)>=min<rsub|v:d<rsub|H><around|(|v|)>=d>
  p<rsub|H><around|(|v|)>> and <math|p<rsub|H><rsup|m*a*x><around|(|d|)>=max<rsub|v:d<rsub|H><around|(|v|)>=d>
  p<rsub|H><around|(|v|)>>.

  <\lemma>
    <label|lem.huff.mult3>Choose any <math|d\<less\>max<rsub|v>
    d<rsub|H><around|(|v|)>> and nodes <math|v,w> such that
    <math|d<rsub|H><around|(|w|)>=d<rsub|H><around|(|v|)>=d>. If <math|v> is
    not the common ancestor of all nodes of depth greater than <math|d>, then
    <math|p<rsub|H><around|(|w|)>\<leq\>3*p<rsub|H><around|(|v|)>>.
  </lemma>

  <\proof>
    Let <math|a=p<rsub|H><around|(|v|)>>. By assumption there exists a
    non-leaf node <math|z\<neq\>v> with <math|d<rsub|H><around|(|z|)>=d>, say
    with children <math|z<rsub|1>> and <math|z<rsub|2>>. Then
    <math|p<rsub|H><around|(|z<rsub|1>|)>\<leq\>a> and
    <math|p<rsub|H><around|(|z<rsub|2>|)>\<leq\>a> by Fact
    <reference|fact.huff.depths>, so <math|p<rsub|H><around|(|z|)>\<leq\>2*a>.
    This implies that <math|v>'s sibling has probability at most <math|2*a>
    by Fact <reference|fact.huff.sibling>, so the parent of <math|v> has
    probability at most <math|3*a>. Fact <reference|fact.huff.depths> then
    implies that <math|p<rsub|H><around|(|w|)>\<leq\>3*a> as required.
  </proof>

  For any <math|T\<in\>X> and set of nodes <math|R\<subseteq\>T> we define
  the weight of <math|R> to be <math|w<rsub|T><around|(|R|)>=<big|sum><rsub|v\<in\>R>2<rsup|-d<rsub|T><around|(|v|)>>>.
  The Kraft inequality for binary trees is
  <math|w<rsub|T><around|(|T|)>\<leq\>1>. In fact, we have
  <math|w<rsub|T><around|(|T|)>=1> since we can assume each interior node of
  <math|T> has two children.

  <\lemma>
    <label|lem.huff.weight>Choose <math|R\<subseteq\>H> such that no node of
    <math|R> is a descendent of any other, and suppose
    <math|w<around|(|R|)>=2<rsup|-d>> for some <math|d\<in\><around|[|n|]>>.
    Then <math|p<rsup|m*i*n><rsub|H><around|(|d|)>\<leq\>p<around|(|R|)>\<leq\>p<rsup|m*a*x><rsub|H><around|(|d|)>>.
  </lemma>

  <\proof>
    We will show <math|p<around|(|R|)>\<leq\>p<rsup|m*a*x><rsub|H><around|(|d|)>>;
    the argument for the other inequality is similar. We proceed by induction
    on <math|<around|\||R|\|>>. If <math|<around|\||R|\|>=1> the result is
    trivial (since <math|R=<around|{|v|}>> where
    <math|d<rsub|H><around|(|v|)>=d>). Otherwise, since
    <math|w<around|(|R|)>=2<rsup|-d>>, there must be at least two nodes of
    the maximum depth present in <math|R>. Let <math|v> and <math|w> be the
    two such nodes with smallest probability, say with
    <math|p<rsub|H><around|(|v|)>\<leq\>p<rsub|H><around|(|w|)>>. Let
    <math|w<rprime|'>> be the parent of <math|w>. Then
    <math|p<rsub|H><around|(|w<rprime|'>|)>\<geq\>p<rsub|H><around|(|w|)>+p<rsub|H><around|(|v|)>>,
    since the sibling of <math|w> has probability at least
    <math|p<rsub|H><around|(|v|)>> by Fact <reference|fact.huff.sibling>.
    Also, <math|w<rprime|'>\<nin\>R> since <math|w\<in\>R> and no node of
    <math|R> is a descendent of any other. Let
    <math|R<rprime|'>=R\<cup\><around|{|w<rprime|'>|}>-<around|{|w,v|}>>.
    Then <math|w<around|(|R<rprime|'>|)>=w<around|(|R|)>>,
    <math|p<around|(|R<rprime|'>|)>\<geq\>p<around|(|R|)>>, and no node of
    <math|R<rprime|'>> is a descendent of any other. Thus, by induction,
    <math|p<around|(|R|)>\<leq\>p<around|(|R<rprime|'>|)>\<leq\>p<rsup|m*a*x><rsub|H><around|(|d|)>>
    as required.
  </proof>

  We are now ready to show that the beatability of the Huffman algorithm is
  at most <math|<frac|3|4>>.

  <\proposition>
    <label|prop.huffman.beatable>The beatability of the Huffman algorithm is
    at most <math|<frac|3|4>>.
  </proposition>

  Fix <math|\<Omega\>> and <math|p>. Let <math|H> denote the Huffman tree and
  choose any other tree <math|T>. Define <math|P=<around|{|v\<in\>\<Omega\>:d<rsub|T><around|(|v|)>\<less\>d<rsub|H><around|(|v|)>|}>>,
  <math|Q=<around|{|v\<in\>\<Omega\>:d<rsub|T><around|(|v|)>\<gtr\>d<rsub|H><around|(|v|)>|}>>.
  That is, <math|P> is the set of elements of <math|\<Omega\>> for which
  <math|T> beats <math|H>, and <math|Q> is the set of elements for which
  <math|H> beats <math|T>. Our goal is to show that
  <math|p<around|(|P|)>\<leq\>3*p<around|(|Q|)>>, which would imply that
  <math|v<around|(|T,H|)>\<leq\>3/4>.

  We first claim that <math|w<around|(|P|)>\<less\>w<around|(|Q|)>>. To see
  this, write <math|U=\<Omega\>-<around|(|P\<cup\>Q|)>> and note that, by the
  Kraft inequality,

  <\equation>
    <label|eq.kraft.1>w<around|(|P|)>+w<around|(|Q|)>+w<around|(|U|)>=1=w<rsub|T><around|(|P|)>+w<rsub|T><around|(|Q|)>+w<rsub|T><around|(|U|)>.
  </equation>

  Moreover, <math|w<rsub|T><around|(|Q|)>\<gtr\>0>,
  <math|w<rsub|T><around|(|U|)>=w<rsub|H><around|(|U|)>>, and
  <math|w<rsub|T><around|(|P|)>\<geq\>2*w<around|(|P|)>> (since
  <math|d<rsub|T><around|(|v|)>\<leq\>d<rsub|H><around|(|v|)>-1> for all
  <math|v\<in\>P>). Applying these inequalities to <eqref|eq.kraft.1> implies
  <math|w<around|(|P|)>-w<around|(|Q|)>\<less\>0>, completing the claim.
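
  The weight claim can be checked on a small instance. The sketch below is
  illustrative only: the depth profiles <verbatim|d_H> and <verbatim|d_T> are
  made-up examples, and weights without a subscript are taken with respect to
  <math|H>.

```python
from fractions import Fraction

def weight(depth, nodes):
    """w(R) = sum over v in R of 2^(-depth[v])."""
    return sum(Fraction(1, 2 ** depth[v]) for v in nodes)

d_H = {"a": 1, "b": 2, "c": 3, "d": 3}   # depths in a Huffman tree H
d_T = {"a": 2, "b": 2, "c": 2, "d": 2}   # depths in a competing tree T

P = [v for v in d_H if d_T[v] < d_H[v]]  # leaves where T beats H
Q = [v for v in d_H if d_T[v] > d_H[v]]  # leaves where H beats T
print(weight(d_H, P), weight(d_H, Q))    # prints 1/4 1/2
```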

  Our approach will be to express <math|P> and <math|Q> as disjoint unions
  <math|P=P<rsub|1>\<cup\>\<ldots\>\<cup\>P<rsub|r>> and
  <math|Q=Q<rsub|1>\<cup\>\<ldots\>\<cup\>Q<rsub|r>> such that
  <math|p<around|(|P<rsub|i>|)>\<leq\>3*p<around|(|Q<rsub|i>|)>> for all
  <math|i>. To this end, we express the quantities <math|w<around|(|P|)>> and
  <math|w<around|(|Q|)>> in binary: choose
  <math|x<rsub|1>,\<ldots\>,x<rsub|n>> and
  <math|y<rsub|1>,\<ldots\>,y<rsub|n>> from <math|<around|{|0,1|}>> such that
  <math|w<around|(|P|)>=<big|sum><rsub|i>x<rsub|i>*2<rsup|-i>> and
  <math|w<around|(|Q|)>=<big|sum><rsub|i>y<rsub|i>*2<rsup|-i>>. Since
  <math|w<around|(|P|)>> is a sum of element weights that are inverse powers
  of two, we can partition the elements of <math|P> into disjoint subsets
  <math|P<rsub|1>,\<ldots\>,P<rsub|n>> such that
  <math|w<around|(|P<rsub|i>|)>=x<rsub|i>*2<rsup|-i>> for all
  <math|i\<in\><around|[|n|]>>. Similarly, we can partition <math|Q> into
  disjoint subsets <math|Q<rsub|1>,\<ldots\>,Q<rsub|n>> such that
  <math|w<around|(|Q<rsub|i>|)>=y<rsub|i>*2<rsup|-i>> for all
  <math|i\<in\><around|[|n|]>>.

  Let <math|r=min <around|{|i:x<rsub|i>\<neq\>y<rsub|i>|}>>. Note that, since
  <math|w<around|(|P|)>\<less\>w<around|(|Q|)>>, we must have
  <math|x<rsub|r>=0> and <math|y<rsub|r>=1>.
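
  The binary expansions and the index <math|r> can be computed as in the
  following sketch. It assumes <math|0\<leq\>w\<less\>1> and that <math|w> is
  a multiple of <math|2<rsup|-n>>; the helper name and the sample weights
  <math|1/4> and <math|1/2> are illustrative.

```python
from fractions import Fraction

def binary_digits(w, n):
    """Digits x_1..x_n in {0,1} with w = sum_i x_i 2^-i, assuming 0 <= w < 1
    and that w is a multiple of 2^-n."""
    digits = []
    for _ in range(n):
        w *= 2
        bit = int(w >= 1)
        digits.append(bit)
        w -= bit
    return digits

x = binary_digits(Fraction(1, 4), 4)   # expansion of w(P)
y = binary_digits(Fraction(1, 2), 4)   # expansion of w(Q)
r = min(i + 1 for i in range(4) if x[i] != y[i])
print(x, y, r)  # prints [0, 1, 0, 0] [1, 0, 0, 0] 1
```

  Here <math|r=1>, with <math|x<rsub|1>=0> and <math|y<rsub|1>=1>, as the
  argument above predicts whenever <math|w<around|(|P|)>\<less\>w<around|(|Q|)>>.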

  We first show that <math|p<around|(|P<rsub|i>|)>\<leq\>3*p<around|(|Q<rsub|i>|)>>
  for each <math|i\<less\>r>. Since <math|x<rsub|i>=y<rsub|i>>, we either
  have <math|P<rsub|i>=Q<rsub|i>=\<emptyset\>> or else
  <math|w<around|(|P<rsub|i>|)>=w<around|(|Q<rsub|i>|)>=2<rsup|-i>>. In the
  latter case, suppose first that <math|<around|\||Q<rsub|i>|\|>=1>. Then,
  since <math|Q<rsub|i>> consists of a single leaf and <math|i> is not the
  maximum depth of tree <math|H>, we can apply Lemma
  <reference|lem.huff.weight> and Lemma <reference|lem.huff.mult3> to
  conclude <math|p<around|(|P<rsub|i>|)>\<leq\>p<rsup|m*a*x><rsub|H><around|(|i|)>\<leq\>3*p<around|(|Q<rsub|i>|)>>.
  Next suppose that <math|<around|\||Q<rsub|i>|\|>\<gtr\>1>. We would again
  like to apply Lemma <reference|lem.huff.mult3>, but we must first verify
  that its conditions are met. Suppose for contradiction that all nodes of
  depth greater than <math|i> share a common ancestor of depth <math|i>.
  Then, since <math|w<around|(|Q<rsub|i>|)>=2<rsup|-i>> and
  <math|<around|\||Q<rsub|i>|\|>\<gtr\>1>, it must be that <math|Q<rsub|i>>
  contains all such nodes, which contradicts the fact that <math|Q<rsub|r>>
  contains at least one node of depth greater than <math|i>. We conclude that
  the conditions of Lemma <reference|lem.huff.mult3> are satisfied for all
  <math|v> and <math|w> at depth <math|i>, and therefore
  <math|p<around|(|P<rsub|i>|)>\<leq\>p<rsup|m*a*x><rsub|H><around|(|i|)>\<leq\>3*p<rsup|m*i*n><rsub|H><around|(|i|)>\<leq\>3*p<around|(|Q<rsub|i>|)>>
  as required.

  We next consider <math|i\<geq\>r>. Let <math|P<rprime|'><rsub|r>=<big|cup><rsub|j\<geq\>r>P<rsub|j>>
  and <math|Q<rprime|'><rsub|r>=<big|cup><rsub|j\<geq\>r>Q<rsub|j>>. We claim
  that <math|p<around|(|P<rprime|'><rsub|r>|)>\<leq\>3*p<around|(|Q<rprime|'><rsub|r>|)>>.
  If <math|P<rprime|'><rsub|r>=\<emptyset\>> then this is certainly true, so
  suppose otherwise. Then <math|w<around|(|P<rprime|'><rsub|r>|)>\<less\>2<rsup|-r>>,
  so <math|P<rprime|'><rsub|r>> contains elements of depth greater than
  <math|r>. As in the case <math|i\<less\>r>, this implies that either
  <math|Q<rsub|r>> contains only a single node (and cannot be the common
  ancestor of all nodes of depth greater than <math|r>), or else not all
  nodes of depth greater than <math|r> have a common ancestor of depth
  <math|r>. We can therefore apply Lemma <reference|lem.huff.weight> and
  Lemma <reference|lem.huff.mult3> to conclude
  <math|p<around|(|P<rprime|'><rsub|r>|)>\<leq\>p<rsup|m*a*x><rsub|H><around|(|r|)>\<leq\>3*p<around|(|Q<rsub|r>|)>\<leq\>3*p<around|(|Q<rprime|'><rsub|r>|)>>.

  Since <math|P=P<rsub|1>\<cup\>\<ldots\>\<cup\>P<rsub|r-1>\<cup\>P<rsub|r><rprime|'>>
  and <math|Q=Q<rsub|1>\<cup\>\<ldots\>\<cup\>Q<rsub|r-1>\<cup\>Q<rsub|r><rprime|'>>
  are disjoint partitions, we conclude that
  <math|p<around|(|P|)>\<leq\>3*p<around|(|Q|)>> as required. <math|\<Box\>>

  We now give an example to demonstrate that the Huffman algorithm is at
  least <math|<around|(|2/3-\<epsilon\>|)>>-beatable for every
  <math|\<epsilon\>\<gtr\>0>. For any <math|n\<geq\>3>, consider the
  probability distribution given by <math|p<around|(|\<omega\><rsub|1>|)>=<frac|1|3>>,
  <math|p<around|(|\<omega\><rsub|i>|)>=<frac|1|3\<cdot\>2<rsup|i-2>>> for
  all <math|1\<less\>i\<less\>n>, and <math|p<around|(|\<omega\><rsub|n>|)>=<frac|1|3\<cdot\>2<rsup|n-3>>>.
  For this distribution, the Huffman tree <math|t> satisfies
  <math|d<rsub|t><around|(|\<omega\><rsub|i>|)>=i> for each <math|i\<less\>n>
  and <math|d<rsub|t><around|(|\<omega\><rsub|n>|)>=n-1>. Consider the
  alternative tree <math|t<rprime|'>> in which
  <math|d<rsub|t<rprime|'>><around|(|\<omega\><rsub|1>|)>=n-1> and
  <math|d<rsub|t<rprime|'>><around|(|\<omega\><rsub|i>|)>=i-1> for all <math|i\<gtr\>1>. Then
  <math|t<rprime|'>> wins if any of <math|\<omega\><rsub|2>,\<omega\><rsub|3>,\<ldots\>,\<omega\><rsub|n-1>>
  is chosen, ties on <math|\<omega\><rsub|n>>, and loses only on <math|\<omega\><rsub|1>>. Thus
  <math|v<around|(|t<rprime|'>,t|)>=<big|sum><rsub|i=2><rsup|n-1><frac|1|3\<cdot\>2<rsup|i-2>>+<frac|1|2>\<cdot\><frac|1|3\<cdot\>2<rsup|n-3>>=<frac|2|3>-<frac|1|3\<cdot\>2<rsup|n-2>>>,
  and hence the Huffman algorithm is <math|<around|(|<frac|2|3>-<frac|1|3\<cdot\>2<rsup|n-2>>|)>>-beatable
  for every <math|n\<geq\>3>.
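
  As a sanity check on the arithmetic above, the following Python sketch
  evaluates <math|v<around|(|t<rprime|'>,t|)>> exactly for a given
  <math|n>. It does not run the Huffman algorithm; it takes the depths stated
  in the example as given, and the function name
  <verbatim|example_value> is our own.

```python
from fractions import Fraction


def example_value(n):
    """Value v(t', t) of the alternative tree t' against the Huffman tree t."""
    # Probabilities p[1..n] from the example (index 0 is unused padding).
    p = [Fraction(0), Fraction(1, 3)]
    p += [Fraction(1, 3 * 2 ** (i - 2)) for i in range(2, n)]
    p.append(Fraction(1, 3 * 2 ** (n - 3)))
    assert sum(p) == 1
    # Depths are copied from the text, not computed by a Huffman implementation.
    d_t = [None] + list(range(1, n)) + [n - 1]
    d_alt = [None, n - 1] + [i - 1 for i in range(2, n + 1)]
    value = Fraction(0)
    for i in range(1, n + 1):
        if d_alt[i] == d_t[i]:
            value += p[i] / 2  # tie: half credit
        elif min(d_alt[i], d_t[i]) == d_alt[i]:
            value += p[i]  # t' is strictly shallower, so it wins
    return value
```

  For <math|n=5> this gives <math|5/8>, which agrees with
  <math|<frac|2|3>-<frac|1|3\<cdot\>2<rsup|3>>>.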

  We conclude the section by noting that if all probabilities are inverse
  powers of <math|2>, the Huffman algorithm is minmax optimal.\ 

  <\proposition>
    <label|prop.huffman.power2>Suppose there exist integers
    <math|a<rsub|1>,\<ldots\>,a<rsub|n>> such that
    <math|p<around|(|\<omega\><rsub|i>|)>=2<rsup|-a<rsub|i>>> for each
    <math|i\<leq\>n>. Then the value of the Huffman tree <math|H> is
    <math|v<around|(|H|)>=1/2>.
  </proposition>

  <\proof>
    We suppose that there exist integers <math|a<rsub|1>,\<ldots\>,a<rsub|n>>
    such that <math|p<around|(|\<omega\><rsub|i>|)>=2<rsup|-a<rsub|i>>> for
    each <math|i\<leq\>n>. Our goal is to show that the value of the Huffman
    tree <math|H> is <math|v<around|(|H|)>=1/2>.

    For this set of probabilities, the Huffman tree will set
    <math|d<rsub|H><around|(|\<omega\><rsub|i>|)>=a<rsub|i>> for all
    <math|\<omega\><rsub|i>\<in\>\<Omega\>>. In this case,
    <math|p<around|(|R|)>=w<around|(|R|)>> for all <math|R\<subseteq\>H>.
    Choose any other tree <math|T>, and define sets <math|P> and <math|Q> as
    in the proof of Proposition <reference|prop.huffman.beatable>. That is,
    <math|P> is the set of elements of <math|\<Omega\>> for which <math|T>
    beats <math|H>, and <math|Q> is the set of elements for which <math|H>
    beats <math|T>. Then, as in Proposition
    <reference|prop.huffman.beatable>, we must have
    <math|w<around|(|P|)>\<less\>w<around|(|Q|)>>, and hence
    <math|p<around|(|P|)>\<less\>p<around|(|Q|)>>. Thus
    <math|v<around|(|T,H|)>\<less\>1/2>. We conclude that the best response
    to the Huffman tree <math|H> must be <math|H> itself, and thus strategy
    <math|H> has a value of <math|1/2>.
  </proof>

  <subsection|Variant: allowed failures><label|sec.compress.fail>

  We consider a variant of the compression duel in which an algorithm can
  fail to encode certain elements. If we write <math|L<around|(|T|)>> to be
  the set of leaves of binary tree <math|T>, then in the (original) model of
  compression we require that <math|L<around|(|T|)>=\<Omega\>> for all
  <math|T\<in\>X>, whereas in the \PFail\Q model we require only that
  <math|L<around|(|T|)>\<subseteq\>\<Omega\>>. If
  <math|\<omega\>\<nin\>L<around|(|T|)>>, we will take
  <math|c<around|(|T,\<omega\>|)>=\<infty\>>. The Huffman algorithm is
  optimal for single-player compression in the Fail model.

  We note that our method of computing approximate minmax algorithms carries
  over to this variant; we need only change our best-response reduction to
  use a Multiple-Choice Knapsack Problem in which <em|at most> one element is
  chosen from each list. What is different, however, is that the Huffman
  algorithm is completely beatable in the Fail model. If we take
  <math|\<Omega\>=<around|{|\<omega\><rsub|1>,\<omega\><rsub|2>|}>> with
  <math|p<around|(|\<omega\><rsub|1>|)>=1> and
  <math|p<around|(|\<omega\><rsub|2>|)>=0>, the Huffman tree <math|H> places
  each of the elements of <math|\<Omega\>> at depth <math|2>. If <math|T> is
  the singleton tree that consists of <math|\<omega\><rsub|1>> as the root,
  then <math|v<around|(|T,H|)>=1>.

  <section|Binary Search Duel><label|sec:bst>

  In a binary search duel, <math|\<Omega\>=<around|[|n|]>> and <math|X> is
  the set of binary search trees on <math|\<Omega\>> (i.e. binary trees in
  which nodes are labeled with elements of <math|\<Omega\>> in such a way
  that an in-order traversal visits the elements of <math|\<Omega\>> in
  sorted order). Let <math|p> be a distribution on <math|\<Omega\>>. Then for
  <math|T\<in\>X> and <math|\<omega\>\<in\>\<Omega\>>,
  <math|c<around|(|T,\<omega\>|)>> is the depth of the node labeled by
  \P<math|\<omega\>>\Q in the tree <math|T>. In single-player binary search
  with uniform <math|p>, selecting the median element <math|m> in
  <math|\<Omega\>> as the root node and recursing on the left
  <math|<around|{|\<omega\>\|\<omega\>\<less\>m|}>> and right
  <math|<around|{|\<omega\>\|\<omega\>\<gtr\>m|}>> subsets to construct
  sub-trees is known to be optimal.
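
  The median construction can be sketched in a few lines of Python. This is
  an illustration only: we assume that the root sits at depth <math|1> and
  that the lower median is taken when the interval has even size, and the
  helper names are hypothetical.

```python
from fractions import Fraction


def median_tree_depths(n):
    """Depth of each item of 1..n when every sub-tree is rooted at its median."""
    depths = {}
    # Each entry is (lo, hi, depth): an interval of items and the depth of its root.
    stack = [(1, n, 1)]  # assumption: the root sits at depth 1
    while stack:
        lo, hi, depth = stack.pop()
        if lo == hi + 1:  # empty interval, nothing to place
            continue
        m = (lo + hi) // 2  # lower median of the interval
        depths[m] = depth
        stack.append((lo, m - 1, depth + 1))
        stack.append((m + 1, hi, depth + 1))
    return depths


def expected_cost_uniform(n):
    """Expected depth of a uniformly random item, as an exact fraction."""
    return Fraction(sum(median_tree_depths(n).values()), n)
```

  For <math|n=7> the items land at depths <math|1,2,2,3,3,3,3>, for an
  expected cost of <math|17/7>.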

  The binary search game can be represented as a bilinear duel. In this case,
  <math|K> and <math|K<rprime|'>> will be sets of stochastic matrices (as in
  the case of the compression game) and the entry
  <math|x<rsub|i,j>> will represent the probability that item
  <math|\<omega\><rsub|j>> is placed at depth <math|i>. Of course, not every
  stochastic matrix is realizable as a distribution on binary search trees
  (i.e. such that the probability that <math|\<omega\><rsub|j>> is placed at
  depth <math|i> is <math|x<rsub|i,j>>). In order to define linear
  constraints on <math|K> so that any matrix in <math|K> is realizable, we
  will introduce an auxiliary data structure in
  Section<nbsp><reference|sec:sas> called the
  <with|font-shape|small-caps|State-Action Structure> that captures the
  decisions made by a binary search tree. Using these ideas, we will be able
  to fit the binary search game into the bilinear duel framework introduced
  in Section<nbsp><reference|sec:blexact> and hence be able to efficiently
  compute a Nash equilibrium strategy for each player.

  Given a binary search tree <math|T\<in\>X>, we will write
  <math|c<rsub|T><around|(|\<omega\>|)>> for the depth of <math|\<omega\>> in
  <math|T>. We will also refer to <math|c<rsub|T><around|(|\<omega\>|)>> as
  the time that <math|T> finds <math|\<omega\>>.

  <subsection|Computing an equilibrium><label|sec:sas>

  In this subsection, we give an algorithm for computing a Nash equilibrium
  for the binary search game, based on the bilinear duel framework introduced
  in Section<nbsp><reference|sec:blexact>. We will do this by defining a
  structure called the <with|font-shape|small-caps|State-Action Structure>
  that we can use to represent the decisions made by a binary search tree
  using only polynomially many variables. The set of valid variable
  assignments in a <with|font-shape|small-caps|State-Action Structure> will
  also be defined by only polynomially many linear constraints and so these
  structures will naturally be closed under taking convex combinations. We
  will demonstrate that the value of playing
  <math|\<sigma\>\<in\>\<Delta\><around|(|X|)>> against any value matrix
  <math|V> (see Definition<nbsp><reference|def:penalty>) is a linear
  function of the variables in the <with|font-shape|small-caps|State-Action
  Structure> corresponding to <math|\<sigma\>>. Furthermore, all valid
  <with|font-shape|small-caps|State-Action Structures> can be efficiently
  realized as a distribution on binary search trees which achieves the same
  expected value.

  To apply the bilinear duel framework, we must give a mapping <math|\<phi\>>
  from the space of binary search trees to a convex set <math|K> defined
  explicitly by a polynomial number of linear constraints (on a polynomial
  number of variables). We now give an informal description of <math|K>: The
  idea is to represent a binary search tree <math|T\<in\>X> as a layered
  graph. The nodes (at each depth) alternate in type. One layer represents
  the current knowledge state of the binary search tree. After making some
  number of queries (and not yet finding the token), all the information that
  the binary search tree knows is an interval of values to which the token is
  confined - we refer to this as the <em|live interval>. The next layer of
  nodes represents an action - i.e. a query to some item in the live
  interval. Correspondingly, there will be three outgoing edges from an
  action node representing the possible replies that either the item is to
  the left, to the right, or at the query location (in which case the
  outgoing edge will exit to a terminal state).

  We will define a flow on this layered graph based on <math|T> and the
  distribution <math|p> on <math|\<Omega\>>. Flow will represent total
  probability - i.e. the total flow into a state node will represent the
  probability (under a random choice of <math|\<omega\>\<in\>\<Omega\>>
  according to <math|p>) that <math|T> reaches this state of knowledge (in
  exactly the corresponding number of queries). Then the flow out of a state
  node represents a decision of which item to query next. And lastly, the
  flow out of an action node splits according to Bayes' Rule - if all the
  information revealed so far is that the token is confined to some interval,
  we can express the probability that (say) our next query to a particular
  item finds the token as a conditional probability. We can then take convex
  combinations of these "basic" flows in order to form flows corresponding to
  distributions on binary search trees.

  We give a randomized rounding algorithm to select a random binary search
  tree based on a flow - in such a way that the marginal probabilities of
  finding a token <math|\<omega\><rsub|i>> at time <math|r> are exactly what
  the flow specifies they should be. The idea is that if we choose an
  outgoing edge for each state node (with probability proportional to the
  flow), then we have fixed a binary search tree because we have specified a
  decision rule for each possible internal state of knowledge. Suppose we
  were to now select an edge out of each action node (again with probability
  proportional to the flow) and we were to follow the unique path from the
  start node to a terminal node. This procedure would be equivalent to
  searching for a random token <math|\<omega\><rsub|i>> chosen
  according to <math|p> and using this token to choose outgoing edges from
  action nodes. This procedure generates a random path from the start node to
  a terminal node, and is in fact equivalent to sampling a random path in the
  path decomposition of the flow proportionally to the flow along the path.
  Because these two rounding procedures are equivalent, the marginal
  distribution that results from generating a binary search tree (and
  choosing a random element to look for) will exactly match the corresponding
  values of the flow.

  <subsection|Notation>

  The natural description of the strategy space of the binary search game is
  exponential (in <math|<around|\||\<Omega\>|\|>>), so we will assume that
  the value of playing any binary search tree <math|T> against an opponent's
  mixed strategy is given to us in a compact form which we will refer to as a
  value matrix:

  <\definition>
    <nbsp><label|def:penalty> A value matrix <math|V> is an
    <math|<around|\||\<Omega\>|\|>\<times\><around|\||\<Omega\>|\|>> matrix
    in which the entry <math|V<rsub|i,j>> is interpreted to be the value of
    finding item <math|\<omega\><rsub|j>> at time <math|i>.
  </definition>

  Given any binary search tree <math|T<rprime|'>\<in\>X>, we can define a
  value matrix <math|V<around|(|T<rprime|'>|)>> so that the expected value of
  playing any binary search tree <math|T\<in\>X> against <math|T<rprime|'>> in the
  binary search game can be written as <math|<big|sum><rsub|i,j>1<rsub|c<rsub|T><around|(|\<omega\><rsub|j>|)>=i>*V<around|(|T<rprime|'>|)><rsub|i,j>*p<around|(|\<omega\><rsub|j>|)>>:

  <\definition>
    Given a binary search tree <math|T<rprime|'>\<in\>X>, let
    <math|V<around|(|T<rprime|'>|)>> be a value matrix such that

    <\equation*>
      V<around|(|T<rprime|'>|)><rsub|i,j>=<around*|{|<tabular*|<tformat|<cwith|1|-1|1|1|cell-halign|l>|<cwith|1|-1|1|1|cell-lborder|0ln>|<cwith|1|-1|2|2|cell-halign|l>|<cwith|1|-1|2|2|cell-rborder|0ln>|<table|<row|<cell|0>|<cell|<text|if
      >c<rsub|T<rprime|'>><around|(|\<omega\><rsub|j>|)>\<less\>i>>|<row|<cell|<frac|1|2>>|<cell|<text|if
      >c<rsub|T<rprime|'>><around|(|\<omega\><rsub|j>|)>=i>>|<row|<cell|1>|<cell|<text|if
      >c<rsub|T<rprime|'>><around|(|\<omega\><rsub|j>|)>\<gtr\>i>>>>>|\<nobracket\>>
    </equation*>

    Similarly, given a mixed strategy <math|\<sigma\><rprime|'>\<in\>\<Delta\><around|(|X|)>>,
    let <math|V<around|(|\<sigma\><rprime|'>|)>=E<rsub|T<rprime|'>\<sim\>\<sigma\><rprime|'>><around|[|V<around|(|T<rprime|'>|)>|]>>.
  </definition>

  Note that not every value matrix <math|V> can be realized as the value
  matrix <math|V<around|(|T<rprime|'>|)>> for some <math|T<rprime|'>\<in\>X>.
  In fact, <math|V> need not be realizable as <math|V<around|(|\<sigma\>|)>>
  for some <math|\<sigma\>\<in\>\<Delta\><around|(|X|)>>. However, we will be
  able to compute the best response against any value matrix <math|V>,
  regardless of whether or not the matrix corresponds to playing the binary
  search game against an adversary playing some mixed strategy. Lastly, we
  define a stochastic matrix <math|I<around|(|T|)>>, given <math|T\<in\>X>.
  From <math|I<around|(|T|)>>, and <math|V<around|(|T<rprime|'>|)>> we can
  write the expected value of playing <math|T> against <math|T<rprime|'>> as
  an inner product. We let <math|\<less\>A,B\<gtr\><rsub|p>=<big|sum><rsub|i,j>A<rsub|i,j>*B<rsub|i,j>*p<around|(|\<omega\><rsub|j>|)>>
  when <math|A> and <math|B> are <math|<around|\||\<Omega\>|\|>\<times\><around|\||\<Omega\>|\|>>
  matrices.

  <\definition>
    Given a binary search tree <math|T\<in\>X>, let <math|I<around|(|T|)>> be
    an <math|<around|\||\<Omega\>|\|>\<times\><around|\||\<Omega\>|\|>>
    matrix in which <math|I<around|(|T|)><rsub|i,j>=1<rsub|c<rsub|T><around|(|\<omega\><rsub|j>|)>=i>>.
    Similarly, given <math|\<sigma\>\<in\>\<Delta\><around|(|X|)>>, let
    <math|I<around|(|\<sigma\>|)>=E<rsub|T\<sim\>\<sigma\>><around|[|I<around|(|T|)>|]>>.
  </definition>

  <\lemma>
    Given <math|\<sigma\>,\<sigma\><rprime|'>\<in\>\<Delta\><around|(|X|)>>,
    the expected value of playing <math|\<sigma\>> against
    <math|\<sigma\><rprime|'>> in the binary search game is exactly
    <math|\<less\>I<around|(|\<sigma\>|)>,V<around|(|\<sigma\><rprime|'>|)>\<gtr\><rsub|p>>.
  </lemma>

  <\proof>
    Consider any <math|T,T<rprime|'>\<in\>X>. Then the expected value of
    playing <math|T> against <math|T<rprime|'>> in the binary search game is exactly
    <math|<big|sum><rsub|i>p<around|(|\<omega\><rsub|i>|)>*<around*|[|1<rsub|c<rsub|T><around|(|\<omega\><rsub|i>|)>\<less\>c<rsub|T<rprime|'>><around|(|\<omega\><rsub|i>|)>>+<frac|1|2>*1<rsub|c<rsub|T><around|(|\<omega\><rsub|i>|)>=c<rsub|T<rprime|'>><around|(|\<omega\><rsub|i>|)>>|]>=\<less\>I<around|(|T|)>,V<around|(|T<rprime|'>|)>\<gtr\><rsub|p>>.
    And since <math|\<less\>I<around|(|T|)>,V<around|(|T<rprime|'>|)>\<gtr\><rsub|p>>
    is bilinear in the matrices <math|I<around|(|T|)>> and
    <math|V<around|(|T<rprime|'>|)>>, indeed the expected value of playing
    <math|\<sigma\>> against <math|\<sigma\><rprime|'>> is
    <math|\<less\>I<around|(|\<sigma\>|)>,V<around|(|\<sigma\><rprime|'>|)>\<gtr\><rsub|p>>.
  </proof>
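
  The identity in the lemma can be checked numerically for pure strategies.
  In the sketch below a tree is represented only by its list of depths
  <math|c<rsub|T><around|(|\<omega\><rsub|j>|)>> (with index <math|0>
  unused); this representation and all function names are our own
  assumptions, made for illustration.

```python
from fractions import Fraction


def indicator_matrix(depth, n):
    """I(T)[i][j] = 1 if T finds item j at time i (row and column 0 are padding)."""
    return [[1 if depth[j] == i else 0 for j in range(n + 1)] for i in range(n + 1)]


def value_matrix(depth, n):
    """V(T')[i][j] is 0, 1/2 or 1 when T' finds item j before, at, or after time i."""
    V = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if depth[j] == i:
                V[i][j] = Fraction(1, 2)
            elif max(depth[j], i) == depth[j]:
                V[i][j] = Fraction(1)  # T' is strictly slower than time i
    return V


def inner_product(A, B, p, n):
    """The p-weighted inner product: sum of A[i][j] * B[i][j] * p[j]."""
    return sum(A[i][j] * B[i][j] * p[j]
               for i in range(1, n + 1) for j in range(1, n + 1))


def direct_value(depth_t, depth_u, p, n):
    """Expected payoff of T against T', computed item by item."""
    total = Fraction(0)
    for j in range(1, n + 1):
        if depth_t[j] == depth_u[j]:
            total += p[j] / 2
        elif min(depth_t[j], depth_u[j]) == depth_t[j]:
            total += p[j]
    return total
```

  For example, with <math|n=3>, uniform <math|p>, the balanced tree (depths
  <math|2,1,2>) against the chain rooted at <math|\<omega\><rsub|1>> (depths
  <math|1,2,3>), both computations give <math|2/3>.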

  <subsection|<with|font-shape|small-caps|State-Action Structure>>

  <\definition>
    <nbsp><label|def:lsr> Given a distribution <math|p> on <math|\<Omega\>>
    and <math|\<omega\><rsub|i>,\<omega\><rsub|j>,\<omega\><rsub|k>\<in\>\<Omega\>>
    (with <math|i\<leq\>k\<leq\>j>), let

    <\equation*>
      p<rsub|i,j,k><rsup|L>=<frac|P*r<rsub|\<omega\><rsub|k<rprime|'>>\<sim\>p>*<around|[|i\<leq\>k<rprime|'>\<less\>k|]>|P*r<rsub|\<omega\><rsub|k<rprime|'>>\<sim\>p>*<around|[|i\<leq\>k<rprime|'>\<leq\>j|]>>,p<rsub|i,j,k><rsup|E>=<frac|P*r<rsub|\<omega\><rsub|k<rprime|'>>\<sim\>p>*<around|[|k<rprime|'>=k|]>|P*r<rsub|\<omega\><rsub|k<rprime|'>>\<sim\>p>*<around|[|i\<leq\>k<rprime|'>\<leq\>j|]>>,<text|and
      >p<rsub|i,j,k><rsup|R>=<frac|P*r<rsub|\<omega\><rsub|k<rprime|'>>\<sim\>p>*<around|[|k\<less\>k<rprime|'>\<leq\>j|]>|P*r<rsub|\<omega\><rsub|k<rprime|'>>\<sim\>p>*<around|[|i\<leq\>k<rprime|'>\<leq\>j|]>>
    </equation*>
  </definition>

  Intuitively, we can regard the interval
  <math|<around|[|\<omega\><rsub|i>,\<omega\><rsub|j>|]>> as being divided
  into the sub-intervals <math|<around|[|\<omega\><rsub|i>,\<omega\><rsub|k-1>|]>>,
  <math|<around|{|\<omega\><rsub|k>|}>> and
  <math|<around|[|\<omega\><rsub|k+1>,\<omega\><rsub|j>|]>>. Then the
  quantity <math|p<rsub|i,j,k><rsup|L>> represents the probability that
  a randomly generated element is contained in the first interval, conditioned
  on the element being contained in the original interval
  <math|<around|[|\<omega\><rsub|i>,\<omega\><rsub|j>|]>>. Similarly, one can
  interpret <math|p<rsub|i,j,k><rsup|E>> and <math|p<rsub|i,j,k><rsup|R>> as
  being conditional probabilities as well.
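
  A small Python sketch of these three conditional probabilities follows. It
  assumes that <math|p> is given as a list indexed from <math|1> (index
  <math|0> is padding) and that the interval has positive probability; the
  function name is hypothetical.

```python
from fractions import Fraction


def split_probabilities(p, i, j, k):
    """Return (pL, pE, pR) for the interval of items i..j queried at item k."""
    total = sum(p[i:j + 1])  # mass of the whole interval, assumed nonzero
    left = sum(p[i:k])  # mass of items i, ..., k-1
    right = sum(p[k + 1:j + 1])  # mass of items k+1, ..., j
    return left / total, p[k] / total, right / total
```

  With <math|p=<around|(|1/2,1/4,1/8,1/8|)>>, <math|i=2>, <math|j=4> and
  <math|k=3>, this returns <math|<around|(|1/2,1/4,1/4|)>>; the three values
  always sum to one.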

  We also define a set of knowledge states, which represent the current
  information that the binary search tree knows about the element and also
  how many queries have been made:

  <\definition>
    We define:

    <\enumerate>
      <item><math|<cS>=<around|{|<around|(|i,j,r|)>\|\<omega\><rsub|i>,\<omega\><rsub|j>\<in\>\<Omega\>,i\<less\>j,<text|and
      >r\<in\><around|{|0,1,\<ldots\>,<around|\||\<Omega\>|\|>|}>|}>>

      <item><math|<cA>=<around|{|<around|(|S,k|)>\|S=<around|(|i,j,r|)>\<in\><cS>,\<omega\><rsub|k>\<in\>\<Omega\>*<text|and
      >k\<in\><around|(|i,j|)>|}>>

      <item><math|<cF>=<around|{|<around|(|k,r|)>\|\<omega\><rsub|k>\<in\>\<Omega\>*<text|and
      >r\<in\><around|{|1,2,\<ldots\>,<around|\||\<Omega\>|\|>|}>|}>>
    </enumerate>

    We will refer to <math|<cS>> as the set of knowledge states. Additionally
    we will refer to <math|S<rsub|s*t*a*r*t>=<around|(|\<omega\><rsub|1>,\<omega\><rsub|n>,0|)>>
    as the start state. We will refer to <math|<cA>> as the set of action
    states and <math|<cF>> as the set of termination states.
  </definition>

  We can now define a <with|font-shape|small-caps|State-Action Structure>:

  <\definition>
    <label|def:sastructure>A <with|font-shape|small-caps|State-Action
    Structure> is a fixed directed graph generated as:

    <\enumerate>
      <item>Create a node <math|n<rsub|S>> for each <math|S\<in\><cS>>, a
      node <math|n<rsub|A>> for each <math|A\<in\><cA>> and a node
      <math|n<rsub|F>> for each <math|F\<in\><cF>>.

      <item>For each <math|S=<around|(|i,j,r|)>\<in\><cS>>, and for each
      <math|k> such that <math|i\<less\>k\<less\>j>, create a directed edge
      <math|e<rsub|S,k>> from <math|S> to
      <math|A=<around|(|S,k|)>\<in\><cA>>.

      <item>For each <math|A=<around|(|S,k|)>\<in\><cA>> and
      <math|S=<around|(|i,j,r|)>>, create a directed edge <math|e<rsub|A,F>>
      from <math|A> to <math|F=<around|(|k,r+1|)>> and directed edges
      <math|e<rsub|A,S<rsub|L>>> and <math|e<rsub|A,S<rsub|R>>> from <math|A>
      to <math|S<rsub|L>> and <math|S<rsub|R>> respectively for
      <math|S<rsub|L>=<around|(|i,k-1,r+1|)>> and
      <math|S<rsub|R>=<around|(|k+1,j,r+1|)>>.
    </enumerate>
  </definition>

  We will define a flow on this directed graph. The source of this flow will
  be the start node <math|S<rsub|s*t*a*r*t>> and the node corresponding to
  each termination state will be a sink. The total flow in this graph will be
  one unit, and this flow should be interpreted as representing the total
  probability of reaching a particular knowledge state, or performing a
  certain action.

  <\definition>
    <label|def:stateful>We will call a set of values <math|x<rsub|e>> for
    each directed edge in a <with|font-shape|small-caps|State-Action
    Structure> a stateful flow if (let us adopt the notation that
    <math|x<rsub|S,A>> is the flow on an edge <math|e<rsub|S,A>>):

    <\enumerate>
      <item>For all <math|e>, <math|0\<leq\>x<rsub|e>\<leq\>1>

      <item>All nodes except <math|n<rsub|S<rsub|s*t*a*r*t>>> and
      <math|n<rsub|F>> (for <math|F\<in\><cF>>) satisfy conservation of flow

      <item>For each action state <math|A=<around|(|S,k|)>\<in\><cA>> for
      <math|S=<around|(|i,j,r|)>>, the flow on the three out-going edges
      <math|e<rsub|A,F>,e<rsub|A,S<rsub|L>>> and <math|e<rsub|A,S<rsub|R>>>
      from <math|n<rsub|A>> satisfies <math|x<rsub|A,F>=p<rsub|i,j,k><rsup|E>*C>,
      <math|x<rsub|A,S<rsub|L>>=p<rsub|i,j,k><rsup|L>*C> and
      <math|x<rsub|A,S<rsub|R>>=p<rsub|i,j,k><rsup|R>*C> where
      <math|C=<big|sum><rsub|e=<around|(|S<rprime|'>,A|)>*<text|for
      >S<rprime|'>\<in\><cS>>x<rsub|S<rprime|'>,A>>
    </enumerate>
  </definition>

  Given <math|T\<in\>X>, we can define a flow <math|x<rsub|T>> in the
  <with|font-shape|small-caps|State-Action Structure> that captures the
  decisions made by <math|T>:

  <\definition>
    <label|def:xt>Given <math|T\<in\>X>, define <math|x<rsub|T>> as follows:

    <\enumerate>
      <item>For each <math|S=<around|(|i,j,r|)>\<in\><cS>> let
      <math|T<rsub|i,j>> be the sub-tree of <math|T> (if a unique such
      sub-tree exists) such that the labels contained in <math|T<rsub|i,j>>
      are exactly <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,...,\<omega\><rsub|j>|}>>.
      Suppose that the root of this sub-tree <math|T<rsub|i,j>> is
      <math|\<omega\><rsub|k>>. Then send all flow entering the node
      <math|n<rsub|S>> on the outgoing edge <math|e<rsub|S,A>> for
      <math|A=<around|(|S,k|)>>.

      <item>For each <math|A\<in\><cA>>, divide flow into an action node
      <math|n<rsub|A>> according to Condition <math|3> in
      Definition<nbsp><reference|def:stateful> among outgoing edges.
    </enumerate>
  </definition>

  Note that the flow out of <math|n<rsub|S<rsub|s*t*a*r*t>>> is one. Of
  course, the choice of how to split flow on outgoing edges from an action
  node <math|n<rsub|A>> is already well-defined. But we need to demonstrate
  that <math|x<rsub|T>> does indeed satisfy conservation of flow
  requirements, and hence is a stateful flow:

  <\lemma>
    <nbsp><label|lemma:isstateful> For any <math|T\<in\>X>, <math|x<rsub|T>>
    is a stateful flow.
  </lemma>

  <\proof>
    For some intervals <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,...,\<omega\><rsub|j>|}>>,
    there is no sub-tree in <math|T> for which the labels contained in the
    sub-tree are exactly <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,...,\<omega\><rsub|j>|}>>.
    If there is such a sub-tree, however, it is clearly unique. We will
    prove by induction that the only state nodes in the
    <with|font-shape|small-caps|State-Action Structure> which are reached by
    flow <math|x<rsub|T>> are state nodes for which there is such a sub-tree.

    We will prove this condition by induction on <math|r> for state nodes
    <math|n<rsub|S>> of the form <math|S=<around|(|i,j,r|)>>. This condition
    is true in the base case because all flow starts at the node
    <math|n<rsub|S<rsub|s*t*a*r*t>>> and <math|S<rsub|s*t*a*r*t>=<around|(|\<omega\><rsub|1>,\<omega\><rsub|n>,0|)>>
    and indeed the entire binary search tree <math|T> has the property that
    the set of labels used is exactly <math|<around|{|\<omega\><rsub|1>,\<omega\><rsub|2>,\<ldots\>,\<omega\><rsub|n>|}>>.

    Suppose by induction that there is some sub-tree <math|T<rsub|i,j>> of
    <math|T> for which the labels contained in the sub-tree are exactly
    <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,...,\<omega\><rsub|j>|}>>.
    Let <math|\<omega\><rsub|k>> be the label of the root node of
    <math|T<rsub|i,j>>. Then all flow entering <math|n<rsub|S>> would be sent
    to the action node <math|A=<around|(|S,k|)>> and all flow out of this
    action node would be sent to either a termination node or to state nodes
    <math|S<rsub|L>=<around|(|i,k-1,r+1|)>> or
    <math|S<rsub|R>=<around|(|k+1,j,r+1|)>> and both of the intervals
    <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,\<ldots\>,\<omega\><rsub|k-1>|}>>
    and <math|<around|{|\<omega\><rsub|k+1>,\<omega\><rsub|k+2>,\<ldots\>,\<omega\><rsub|j>|}>>
    do indeed have the property that there is a sub-tree that contains
    exactly each respective set of labels - these are just the left and right
    sub-trees of <math|T<rsub|i,j>>.
  </proof>

  The variables in a stateful flow capture marginal probabilities that we
  need to compute the expected value of playing a binary search tree <math|T>
  against some value matrix <math|V>:

  <\lemma>
    <nbsp><label|lemma:conditional> Consider any state
    <math|S=<around|(|i,j,r|)>\<in\><cS>>. The total flow in <math|x<rsub|T>>
    into <math|n<rsub|S>> is exactly the probability that (under a random
    choice of <math|\<omega\><rsub|k>\<sim\>p>), <math|\<omega\><rsub|k>> is
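    contained in a sub-tree of <math|T> that is rooted at depth <math|r+1>
    and whose labels are exactly <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,\<ldots\>,\<omega\><rsub|j>|}>>.
    Similarly the
    total flow in <math|x<rsub|T>> into any terminal node <math|n<rsub|F>>
    for <math|F=<around|(|\<omega\><rsub|f>,r|)>> is exactly the probability
    (under a random choice of <math|\<omega\><rsub|k>\<sim\>p>) that
    <math|\<omega\><rsub|k>=\<omega\><rsub|f>> and <math|c<rsub|T><around|(|\<omega\><rsub|k>|)>=r>.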
  </lemma>

  <\proof>
    We can again prove this lemma by induction on <math|r> for state nodes
    <math|n<rsub|S>> of the form <math|S=<around|(|i,j,r|)>>. In the base
    case, the flow into <math|n<rsub|S<rsub|s*t*a*r*t>>> is <math|1>, which
    is exactly the probability that (under a random choice of
    <math|\<omega\><rsub|t>\<sim\>p>), <math|\<omega\><rsub|t>> is contained
    in some sub-tree of <math|T> at depth <math|1>.

    So we can prove the inductive hypothesis by sub-conditioning on the event
    that the element <math|\<omega\><rsub|k>> is contained in some sub-tree
    of <math|T> at depth <math|r>. Let this subtree be <math|T<rprime|'>>. By
    the inductive hypothesis, this is exactly the flow into the node
    <math|n<rsub|S<rprime|'>>> where <math|S<rprime|'>=<around|(|i,j,r-1|)>>
    for some <math|\<omega\><rsub|i>,\<omega\><rsub|j>\<in\>\<Omega\>> and
    <math|i\<leq\>k\<leq\>j>. We can then condition on the event that
    <math|\<omega\><rsub|k>> is such that <math|i\<leq\>k\<leq\>j>. Let
    <math|\<omega\><rsub|r>> be the label of the root node of
    <math|T<rprime|'>>. Then using conditioning, the probability that
    <math|\<omega\><rsub|k>> is contained in the left-subtree of
    <math|T<rprime|'>> is exactly <math|p<rsub|i,j,r><rsup|L>>, and similarly
    for the right sub-tree. Also the probability that
    <math|\<omega\><rsub|k>=\<omega\><rsub|r>> is
    <math|p<rsub|i,j,r><rsup|E>>. And so Condition <math|3> in
    Definition<nbsp><reference|def:stateful> enforces the condition that the
    flow splits exactly as this total probability splits - i.e. the
    probability that <math|\<omega\><rsub|k>> is contained in the left and
    right sub-interval of <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,\<ldots\>,\<omega\><rsub|j>|}>>
    or contained in the root "<math|\<omega\><rsub|r>>" respectively. Note
    that the set of sub-trees at any particular depth in <math|T> correspond
    to disjoint intervals of <math|\<Omega\>>, and hence there is no other
    flow entering the state <math|n<rsub|S>>, and this proves the inductive
    hypothesis.
  </proof>

  As an immediate corollary:

  <\corollary>
    <label|cor:vmat>The expected value of playing <math|T> against value
    matrix <math|V>,

    <\equation*>
      \<less\>I<around|(|T|)>,V\<gtr\><rsub|p>=<big|sum><rsub|F=<around|(|\<omega\><rsub|k>,r|)>\<in\><cF>>x<rsub|T><rsup|i*n><around|(|F|)>*V<rsub|r,k>
    </equation*>

    where <math|x<rsub|T><rsup|i*n>> denotes the total flow into a node
    according to <math|x<rsub|T>>.
  </corollary>

  And as a second corollary:

  <\corollary>
    <label|cor:mmat>Given <math|T\<in\>X>,

    <\equation*>
      V<around|(|T|)><rsub|i,j>=<frac|<frac|1|2>*x<rsub|T><rsup|i*n><around|(|\<omega\><rsub|j>,i|)>+<big|sum><rsub|i<rprime|'>\<gtr\>i>x<rsub|T><rsup|i*n><around|(|\<omega\><rsub|j>,i<rprime|'>|)>|p<around|(|\<omega\><rsub|j>|)>>
    </equation*>

    where <math|x<rsub|T><rsup|i*n><around|(|\<omega\><rsub|j>,i|)>> denotes
    the total flow into <math|n<rsub|F>> for
    <math|F=<around|(|\<omega\><rsub|j>,i|)>\<in\><cF>>.
  </corollary>
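
  To make the flow <math|x<rsub|T>> concrete, the following Python sketch
  pushes one unit of flow through the structure along the decisions of a
  given tree and records the flow into each termination node. It is a
  sketch under our own assumptions: a tree is a nested tuple
  <verbatim|(left, k, right)> or <verbatim|None>, <math|p> is a list indexed
  from <math|1>, intervals are treated as closed (so single-item intervals
  are allowed), and both function names are hypothetical.

```python
from fractions import Fraction


def terminal_inflows(tree, p, n):
    """Return {(k, r): flow into termination node (k, r)} for the flow of a tree."""
    inflow = {}
    # Each entry is (sub-tree, i, j, r, flow entering the state (i, j, r)).
    stack = [(tree, 1, n, 0, Fraction(1))]
    while stack:
        node, i, j, r, flow = stack.pop()
        if node is None or flow == 0:
            continue  # empty interval, or a state that receives no flow
        left, k, right = node  # the state sends all of its flow to action (S, k)
        total = sum(p[i:j + 1])  # probability mass of the live interval
        # The action node splits its inflow by the conditional probabilities.
        inflow[(k, r + 1)] = flow * p[k] / total
        stack.append((left, i, k - 1, r + 1, flow * sum(p[i:k]) / total))
        stack.append((right, k + 1, j, r + 1, flow * sum(p[k + 1:j + 1]) / total))
    return inflow


def value_against(inflow, V):
    """Expected value against a value matrix V, summed over termination nodes."""
    return sum(flow * V[r][k] for (k, r), flow in inflow.items())
```

  For the balanced tree on three items with
  <math|p=<around|(|1/2,1/4,1/4|)>>, the inflows are <math|1/4> at
  <math|<around|(|\<omega\><rsub|2>,1|)>>, <math|1/2> at
  <math|<around|(|\<omega\><rsub|1>,2|)>> and <math|1/4> at
  <math|<around|(|\<omega\><rsub|3>,2|)>>, that is,
  <math|p<around|(|\<omega\><rsub|k>|)>> at the time
  <math|c<rsub|T><around|(|\<omega\><rsub|k>|)>> for each item.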

  <subsection|A rounding algorithm>

  <\proposition>
    <nbsp><label|prop:rr> Given a stateful flow <math|x>, there is an
    efficient randomized rounding procedure that generates a random
    <math|T\<in\>X> with the property that for any
    <math|\<omega\><rsub|j>\<in\>\<Omega\>> and for any
    <math|i\<in\><around|{|1,2,...,<around|\||\<Omega\>|\|>|}>>,
    <math|P*r*<around|[|c<rsub|T><around|(|\<omega\><rsub|j>|)>=i|]>=<frac|x<rsup|i*n><around|(|\<omega\><rsub|j>,i|)>|p<around|(|\<omega\><rsub|j>|)>>>.
  </proposition>

  <\proof>
    Note that <math|x> is a unit flow from <math|n<rsub|S<rsub|s*t*a*r*t>>> to
    the set of sink nodes <math|n<rsub|F>> for <math|F\<in\><cF>>. So if we
    could sample a random path proportional to the total flow along the path,
    the probability that the path ends at any sink <math|n<rsub|F>> for
    <math|F=<around|(|\<omega\><rsub|j>,r|)>> is exactly
    <math|x<rsup|i*n><around|(|\<omega\><rsub|j>,r|)>>.

    <with|font-series|bold|First Rounding Procedure:> Consider the following
    procedure for generating a path according to this distribution, i.e. the
    probability of generating any path is exactly the flow along the path.
    Start at the source node, and at every step choose a new edge to
    traverse with probability proportional to the flow along it. So if the
    process is currently at some node <math|n<rsub|S>>, the total flow into
    the node is <math|U>, and the total flow on some outgoing edge <math|e> is
    <math|u>, then edge <math|e> is chosen with probability exactly
    <math|<frac|u|U>>. The process continues until a sink node is reached.
    Notice that this procedure always terminates in
    <math|O<around|(|<around|\||\<Omega\>|\|>|)>> steps because each time we
    traverse an action node <math|n<rsub|A>>, the counter <math|r> is
    incremented and every edge in a <with|font-shape|small-caps|State-Action
    Structure> either points into or points out of an action node.

    The key to our randomized rounding procedure is an alternative way to
    generate a path from the source node to a sink such that the probability
    that the path ends at any sink <math|n<rsub|F>> for
    <math|F=<around|(|\<omega\><rsub|j>,r|)>> is <em|still> exactly
    <math|x<rsup|i*n><around|(|\<omega\><rsub|j>,r|)>>. Specifically, for each
    state node <math|n<rsub|S>>, we choose an outgoing edge in advance (to
    some action node) with probability proportional to the flow of <math|x>
    on that edge.

    <with|font-series|bold|Second Rounding Procedure:> If we fix these
    choices in advance, we can define an alternate path selection procedure
    which starts at the source node and traverses any edges that have already
    been decided upon. Whenever the process reaches an action node (in which
    case the outgoing edge has not been decided upon), we select an outgoing
    edge with probability proportional to the total flow on the edge. This
    procedure still
    satisfies the property that the probability that the path ends at any
    sink <math|n<rsub|F>> for <math|F=<around|(|\<omega\><rsub|j>,r|)>> is
    exactly <math|x<rsup|i*n><around|(|\<omega\><rsub|j>,r|)>>.

    <with|font-series|bold|Third Rounding Procedure:> Next, consider another
    modification to this procedure. Imagine still that the outgoing edges
    from every state node are chosen (randomly, as above in the
    <with|font-series|bold|Second Rounding Procedure>). Instead of choosing
    which outgoing edge to pick from an action node when we reach it, we
    could pick an item <math|\<omega\><rsub|k<rprime|'>>\<sim\>p> in
    advance and use this hidden value to determine which outgoing edge from
    an action node to traverse. We will maintain the invariant that if we are
    at <math|n<rsub|A>> and <math|A=<around|(|S,k|)>> for
    <math|S=<around|(|i,j,r|)>>, we must have
    <math|i\<leq\>k<rprime|'>\<leq\>j>. This is clearly true at the base
    case. Then we will traverse the edge <math|e<rsub|A,F>> for
    <math|F=<around|(|\<omega\><rsub|k>,r|)>> if <math|\<omega\><rsub|k<rprime|'>>=\<omega\><rsub|k>>.
    Otherwise, if <math|i\<leq\>k<rprime|'>\<leq\>k-1>, we will traverse the
    edge <math|e<rsub|A,S<rsub|L>>> for <math|S<rsub|L>=<around|(|i,k-1,r+1|)>>.
    Otherwise <math|k+1\<leq\>k<rprime|'>\<leq\>j> and we will traverse the
    edge <math|e<rsub|A,S<rsub|R>>> for <math|S<rsub|R>=<around|(|k+1,j,r+1|)>>.
    This clearly maintains the invariant that <math|k<rprime|'>> is contained
    in the interval corresponding to the current knowledge state.

    This third procedure is equivalent to the second procedure. This follows
    from interpreting Condition <math|3> in
    Definition<nbsp><reference|def:stateful> as a rule for splitting flow
    that is consistent with the conditional probability that
    <math|\<omega\><rsub|k<rprime|'>>> is contained in the left or right
    sub-interval of <math|<around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,...*\<omega\><rsub|j>|}>>
    or is equal to <math|\<omega\><rsub|k>> conditioned on
    <math|\<omega\><rsub|k<rprime|'>>\<in\><around|{|\<omega\><rsub|i>,\<omega\><rsub|i+1>,...*\<omega\><rsub|j>|}>>.
    An identical argument is used in the proof of
    Lemma<nbsp><reference|lemma:conditional>. In this case, we will say that
    <math|\<omega\><rsub|k<rprime|'>>> is the rule for choosing edges out of
    action nodes.

    Now we can prove the Proposition. The key insight is that once we have
    chosen the outgoing edges from each state node (but not the outgoing edges
    from each action node), we have determined a binary search tree: Given
    any element <math|\<omega\><rsub|k<rprime|'>>>, if we follow outgoing
    edges from action nodes using <math|\<omega\><rsub|k<rprime|'>>> as the
    rule, we must reach a terminal node <math|F=<around|(|\<omega\><rsub|k<rprime|'>>,r|)>>
    for some <math|r>. In fact, the value of <math|r> is determined by
    <math|\<omega\><rsub|k<rprime|'>>> because once
    <math|\<omega\><rsub|k<rprime|'>>> is chosen, there are no more random
    choices. So we can compute a vector <math|<wide|u|\<vect\>>> of dimension
    <math|<around|\||\<Omega\>|\|>> in which
    <math|<wide|u|\<vect\>><rsub|j>=r> is the value for which
    <math|F=<around|(|\<omega\><rsub|j>,r|)>> is reached when
    <math|\<omega\><rsub|j>> is the rule for choosing edges out of action
    nodes.

    Using the characterization in Proposition<nbsp><reference|prop:depth>, it
    is easy to verify that the transition rules in the
    <with|font-shape|small-caps|State-Action Structure> enforce that
    <math|<wide|u|\<vect\>>> is a depth vector and hence we can compute a
    binary search tree <math|T> which has the property that using selection
    rule <math|\<omega\><rsub|j>> results in reaching the sink node
    <math|F=<around|(|\<omega\><rsub|j>,c<rsub|T><around|(|\<omega\><rsub|j>|)>|)>>.

    Suppose we select each outgoing edge from a state node (as in the
    <with|font-series|bold|Third Rounding Procedure>) and select an
    <math|\<omega\><rsub|k<rprime|'>>\<sim\>p> (again as in the
    <with|font-series|bold|Third Rounding Procedure>) independently. Then
    from the choices of the outgoing edges from each state node, we can
    recover a binary search tree <math|T>. Then
    <math|P*r<rsub|T,\<omega\><rsub|k<rprime|'>>>*<around|[|c<rsub|T><around|(|\<omega\><rsub|k<rprime|'>>|)>=r|]>=x<rsup|i*n><around|(|\<omega\><rsub|k<rprime|'>>,r|)>>
    precisely because the <with|font-series|bold|First Rounding Procedure>
    and the <with|font-series|bold|Third Rounding Procedure> are equivalent.
    And then we can apply Bayes' Rule to compute that

    <\equation*>
      P*r<rsub|T>*<around|[|c<rsub|T><around|(|\<omega\><rsub|k<rprime|'>>|)>=r\|\<omega\><rsub|k<rprime|'>>=\<omega\><rsub|k>|]>=<frac|x<rsup|i*n><around|(|\<omega\><rsub|k>,r|)>|p<around|(|\<omega\><rsub|k>|)>>
    </equation*>
  </proof>
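
  The first rounding procedure can be sketched in Python as follows. This is
  a minimal illustration under stated assumptions, not the construction on
  the State-Action Structure itself: the flow is a hypothetical dictionary
  on a small acyclic graph, and the names <verbatim|FLOW>,
  <verbatim|sample_path> and <verbatim|sink_distribution> are ours. The
  sketch picks each outgoing edge with probability <math|u/U> and also
  computes exactly the probability of ending at each sink, which should
  coincide with the total flow into that sink.

```python
import random
from fractions import Fraction

# Hypothetical toy flow (an assumption for illustration): a unit flow on a
# small DAG from "start" to the sinks "F1", "F2", "F3".
FLOW = {
    ("start", "A"): Fraction(2, 3),
    ("start", "B"): Fraction(1, 3),
    ("A", "F1"): Fraction(1, 2),
    ("A", "F2"): Fraction(1, 6),
    ("B", "F2"): Fraction(1, 12),
    ("B", "F3"): Fraction(1, 4),
}


def out_edges(flow, node):
    """Return the outgoing edges of `node` that carry positive flow."""
    return [(e, f) for e, f in flow.items() if e[0] == node and f > 0]


def sample_path(flow, source, rng):
    """Walk from `source`, picking each outgoing edge with probability u/U.

    By flow conservation, U (the flow into a node) equals the total flow
    on its outgoing edges, which is what the weights normalize by.
    """
    path = [source]
    while out_edges(flow, path[-1]):
        edges = out_edges(flow, path[-1])
        weights = [float(f) for _, f in edges]
        (_, nxt), _ = rng.choices(edges, weights=weights, k=1)[0]
        path.append(nxt)
    return path


def sink_distribution(flow, source):
    """Exact probability that the random walk ends at each sink."""
    dist = {}
    stack = [(source, Fraction(1))]
    while stack:
        node, prob = stack.pop()
        edges = out_edges(flow, node)
        if not edges:
            dist[node] = dist.get(node, Fraction(0)) + prob
            continue
        total = sum(f for _, f in edges)
        for (_, nxt), f in edges:
            stack.append((nxt, prob * f / total))
    return dist
```

  On this toy flow, the exact sink distribution can be compared entry by
  entry with the inflow of each sink, which is the property the proof relies
  on.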

  <\theorem>
    There is an algorithm that runs in time polynomial in
    <math|<around|\||\<Omega\>|\|>> that computes an exact Nash equilibrium
    for the binary search game.
  </theorem>

  <\proof>
    We can now apply the bilinear duel framework introduced in
    Section<nbsp><reference|sec:blexact> to the binary search game. The space
    <math|K> is the set of all stateful flows. The set of variables is
    polynomially sized (see Definition<nbsp><reference|def:sastructure>),
    and the set of linear constraints is also polynomially sized and is given
    explicitly in Definition<nbsp><reference|def:stateful>. The function
    <math|\<phi\>> maps a binary search tree <math|T\<in\>X> to a stateful
    flow <math|x<rsub|T>>, and the procedure given in
    Definition<nbsp><reference|def:xt> for computing this mapping is
    efficient. The payoff matrix <math|M> is given explicitly in
    Corollary<nbsp><reference|cor:vmat> and
    Corollary<nbsp><reference|cor:mmat>. Lastly, we give a randomized
    rounding algorithm in Proposition<nbsp><reference|prop:rr>.
  </proof>

  <subsection|Beatability>

  We next consider the beatability of the classical algorithm when <math|p>
  is the uniform distribution on <math|\<Omega\>>. For lack of a better term,
  let us call this single-player optimum the median binary search, or median
  search.

  Here we give matching upper and lower bounds on the beatability of median
  search. The idea is that an adversary attempting to do well against median
  search can only place one item at depth <math|1>, two items at depth
  <math|2>, four items at depth <math|3> and so on. We can regard these as
  budget restrictions - the adversary cannot choose too many items to map to
  a particular depth. There are additional combinatorial restrictions as
  well. For example, an adversary cannot place two labels of depth <math|2>
  both to the right of the label of depth <math|1>: even though the
  root node in a binary search tree can have two children, it cannot have
  more than one right child.

  But suppose we relax this restriction, and only consider budget
  restrictions on the adversary. Then the resulting best response question
  becomes a bipartite maximum weight matching problem. Nodes on the left (in
  this bipartite graph) represent items, and nodes on the right represent
  depths (there is one node of depth <math|1>, two nodes of depth <math|2>,
  ...). And for any choice of a depth to assign to a node, we can evaluate
  the value of this decision - if this decision beats median search when
  searching for that element, we give the corresponding edge weight <math|1>.
  If it ties median search, we give the edge weight <math|<frac|1|2>> and
  otherwise we give the edge zero weight.

  We give an upper bound on the value of a maximum weight matching in this
  graph, hence giving an upper bound on how well an adversary can do if he is
  subject to only budget restrictions. If we now add the combinatorial
  restrictions too, this only makes the best response problem harder. So in
  this way, we are able to bound how much an adversary can beat median
  search. In fact, we give a lower bound that matches this upper bound - so
  our relaxation did not make the problem strictly easier (to beat median
  search).

  We focus on the scenario in which <math|<around|\||\<Omega\>|\|>=2<rsup|r>-1>
  and <math|p> is the uniform distribution. Throughout this section we denote
  <math|n=<around|\||\<Omega\>|\|>>. We fix <math|n> to be of the
  form <math|2<rsup|r>-1> because the optimal single-player strategy is then
  well-defined in the sense that the first query will be at precisely the
  median element, and if the element <math|\<omega\>> is not found on this
  query, then the problem will break down into one of two possible
  sub-problems of size <math|2<rsup|r-1>-1>. For this case, we give
  asymptotically matching upper and lower bounds on the beatability of median
  search.

  <\definition>
    We will call a <math|<around|\||\<Omega\>|\|>>-dimensional vector
    <math|<wide|u|\<vect\>>> over <math|<around|{|1,2,...<around|\||\<Omega\>|\|>|}>>
    a depth vector (over the universe <math|\<Omega\>>) if there is some
    <math|T\<in\>X> such that <math|<wide|u|\<vect\>><rsub|j>=c<rsub|T><around|(|\<omega\><rsub|j>|)>>.
  </definition>

  <\proposition>
    <label|prop:depth>A <math|<around|\||\<Omega\>|\|>>-dimensional vector
    <math|<wide|u|\<vect\>>> over <math|<around|{|1,2,...<around|\||\<Omega\>|\|>|}>>
    is a depth vector (over the universe <math|\<Omega\>>) if and only if

    <\enumerate>
      <item>exactly one entry of <math|<wide|u|\<vect\>>> is set to <math|1>
      (let the corresponding index be <math|j>), and

      <item>the vectors <math|<around|[|<wide|u|\<vect\>><rsub|1>-1,<wide|u|\<vect\>><rsub|2>-1,....*<wide|u|\<vect\>><rsub|j-1>-1|]>>
      and <math|<around|[|<wide|u|\<vect\>><rsub|j+1>-1,<wide|u|\<vect\>><rsub|j+2>-1,....*<wide|u|\<vect\>><rsub|n>-1|]>>
      are depth vectors over the universe
      <math|<around|{|\<omega\><rsub|1>,\<omega\><rsub|2>,...*\<omega\><rsub|j-1>|}>>
      and <math|<around|{|\<omega\><rsub|j+1>,\<omega\><rsub|j+2>,...*\<omega\><rsub|n>|}>>
      respectively.
    </enumerate>
  </proposition>

  <\proof>
    Given any vector <math|<wide|u|\<vect\>>> that (recursively) satisfies
    the above Conditions <math|1> and <math|2>, one can build up a binary
    search tree on <math|\<Omega\>> inductively, taking the empty vector to
    be the depth vector of the empty universe as the base case. Let
    <math|\<omega\><rsub|j>\<in\>\<Omega\>> be the unique item such that
    <math|<wide|u|\<vect\>><rsub|j>=1> which exists because
    <math|<wide|u|\<vect\>>> satisfies Condition <math|1>. Since
    <math|<wide|u|\<vect\>>> satisfies Condition <math|2>, the vectors
    <math|<wide|u|\<vect\>><rsub|L>=<around|[|<wide|u|\<vect\>><rsub|1>-1,<wide|u|\<vect\>><rsub|2>-1,....*<wide|u|\<vect\>><rsub|j-1>-1|]>>
    and <math|<wide|u|\<vect\>><rsub|R>=<around|[|<wide|u|\<vect\>><rsub|j+1>-1,<wide|u|\<vect\>><rsub|j+2>-1,....*<wide|u|\<vect\>><rsub|n>-1|]>>
    are depth vectors, and hence by induction we know that there are binary search trees
    <math|T<rsub|L>> and <math|T<rsub|R>> on the universe
    <math|<around|{|\<omega\><rsub|1>,\<omega\><rsub|2>,...*\<omega\><rsub|j-1>|}>>
    and <math|<around|{|\<omega\><rsub|j+1>,\<omega\><rsub|j+2>,...*\<omega\><rsub|n>|}>>
    respectively for which <math|<wide|u|\<vect\>><rsub|L><around|(|i|)>=c<rsub|T<rsub|L>><around|(|\<omega\><rsub|i>|)>>
    and <math|<wide|u|\<vect\>><rsub|R><around|(|i<rprime|'>|)>=c<rsub|T<rsub|R>><around|(|\<omega\><rsub|i<rprime|'>>|)>>
    for each <math|1\<leq\>i\<leq\>j-1> and
    <math|j+1\<leq\>i<rprime|'>\<leq\>n> respectively.

    So we can build a binary search tree <math|T> on <math|\<Omega\>> by
    labeling the root node <math|\<omega\><rsub|j>> and setting the left
    sub-tree to <math|T<rsub|L>> and the right sub-tree to <math|T<rsub|R>>.
    Since the in-order traversals of <math|T<rsub|L>> and of <math|T<rsub|R>>
    visit <math|<around|{|\<omega\><rsub|1>,\<omega\><rsub|2>,...*\<omega\><rsub|j-1>|}>>
    and <math|<around|{|\<omega\><rsub|j+1>,\<omega\><rsub|j+2>,...*\<omega\><rsub|n>|}>>
    in sorted order, the in-order traversal of <math|T> will visit
    <math|\<Omega\>> in sorted order and hence <math|T\<in\>X>.

    Note also that <math|c<rsub|T><around|(|\<omega\><rsub|i>|)>=1+c<rsub|T<rsub|L>><around|(|\<omega\><rsub|i>|)>>
    for <math|1\<leq\>i\<leq\>j-1> and similarly
    <math|c<rsub|T><around|(|\<omega\><rsub|i<rprime|'>>|)>=1+c<rsub|T<rsub|R>><around|(|\<omega\><rsub|i<rprime|'>>|)>>
    for <math|j+1\<leq\>i<rprime|'>\<leq\>n>. So this implies that
    <math|<wide|u|\<vect\>>> satisfies <math|<wide|u|\<vect\>><rsub|i>=c<rsub|T><around|(|\<omega\><rsub|i>|)>>
    for all <math|1\<leq\>i\<leq\>n>, as desired. This completes the
    inductive proof that if a vector <math|<wide|u|\<vect\>>> satisfies
    Conditions <math|1> and <math|2>, then it is a depth vector.

    Conversely, given <math|T\<in\>X>, there is only one element
    <math|\<omega\><rsub|j>> such that <math|c<rsub|T><around|(|\<omega\><rsub|j>|)>=1>
    and so Condition <math|1> is met. Let <math|T<rsub|L>> and
    <math|T<rsub|R>> be the binary search trees that are the left and right
    sub-tree of <math|T> rooted at <math|\<omega\><rsub|j>> respectively,
    where "<math|\<omega\><rsub|j>>" is the label of the root node in
    <math|T>. Again, <math|c<rsub|T><around|(|\<omega\><rsub|i>|)>=1+c<rsub|T<rsub|L>><around|(|\<omega\><rsub|i>|)>>
    for <math|1\<leq\>i\<leq\>j-1> and similarly
    <math|c<rsub|T><around|(|\<omega\><rsub|i<rprime|'>>|)>=1+c<rsub|T<rsub|R>><around|(|\<omega\><rsub|i<rprime|'>>|)>>
    for <math|j+1\<leq\>i<rprime|'>\<leq\>n> so the vector corresponding to
    <math|c<rsub|T>> does indeed satisfy Condition <math|2> by induction.
  </proof>
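
  The recursive characterization translates directly into a short check. The
  following Python sketch is illustrative only: the function names
  <verbatim|is_depth_vector> and <verbatim|build_tree> are ours, vectors are
  plain lists, and the empty vector is treated as the base case. The second
  function mirrors the inductive construction in the proof and returns a
  nested tuple whose middle entry is the 1-based index of the root item.

```python
def is_depth_vector(u):
    """Check Conditions 1 and 2 recursively; the empty vector is the base case."""
    if not u:
        return True
    if u.count(1) != 1:
        return False  # Condition 1 fails: need exactly one entry equal to 1
    j = u.index(1)
    left = [d - 1 for d in u[:j]]
    right = [d - 1 for d in u[j + 1:]]
    return is_depth_vector(left) and is_depth_vector(right)


def build_tree(u, offset=0):
    """Return a nested (left, root_index, right) tree for a depth vector `u`.

    `offset` is the number of items preceding this sub-vector, so that
    root_index is the 1-based index of the root in the full universe.
    """
    if not u:
        return None
    j = u.index(1)
    left = build_tree([d - 1 for d in u[:j]], offset)
    right = build_tree([d - 1 for d in u[j + 1:]], offset + j + 1)
    return (left, offset + j + 1, right)
```

  For instance, the median search vector on seven items passes the check,
  while a vector such as <math|<around|[|2,2,1|]>> does not, since its left
  part would need two roots.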

  <\claim>
    <nbsp><label|cor:upper> For any depth vector <math|<wide|u|\<vect\>>>,
    and any <math|s\<in\><around|{|1,2,...<around|\||\<Omega\>|\|>|}>>,

    <\equation*>
      <around|\||<around|{|j\<in\><around|[|n|]>\|<wide|u|\<vect\>><rsub|j>=s|}>|\|>\<leq\>2<rsup|s-1>.
    </equation*>
  </claim>

  <\lemma>
    <nbsp><label|lemma:bstbeatlow> The beatability of median search is at
    least <math|<frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>\<approx\><frac|5|8>>.
  </lemma>

  <\proof>
    Consider the depth vector for median search for <math|2<rsup|3>-1>
    (<math|r=3>): <math|<around|[|3,2,3,1,3,2,3|]>> and consider a partially
    filled vector <math|<around|[|2,1,\<ast\>,\<ast\>,2,\<ast\>,\<ast\>|]>>.
    We can generate the depth vector for median search for <math|r+1> from
    the depth vector for median search for <math|r> as follows: alternately
    interleave values of <math|r+1> into the depth vector for <math|r>. For
    example the depth vector for median search for <math|r=4> is
    <math|<around|[|4,3,4,2,4,3,4,1,4,3,4,2,4,3,4|]>>. We assume by induction
    that all entries in the partially filled vector are either <math|\<ast\>>s
    or are one less than the corresponding entry in the depth vector for
    median search. This holds in the base case <math|r=3>. We
    also assume that the <math|\<ast\>>s occur in blocks of length
    exactly two. This is also true in the base case. Then if we consider the
    depth vector for median search for <math|r+1>, if an entry of <math|r+1>
    is interleaved, we can place a value of <math|r> if the corresponding
    entry in the partially filled vector is interleaved between two entries
    that are already assigned numbers. Otherwise three entries are
    interleaved into a string of exactly two <math|\<ast\>>s. The median
    entry in this string of <math|5> symbols corresponds to a newly added
    <math|r+1> entry in the depth vector for median search. At the median of
    this <math|5> symbol string, we can place a value of <math|r>. This again
    creates sequences of <math|\<ast\>>s of length exactly two, because we
    have replaced only the median entry in the string of <math|5> symbols.

    Suppose we are given a partially filled depth vector with the property
    that one value <math|1> is placed, two values of <math|2> are placed, four
    values of <math|3> are placed, and so on up to <math|2<rsup|r-1>> values
    of <math|r>. Suppose additionally that all unfilled entries
    (which are given the value <math|\<ast\>> for now) occur in blocks of
    length exactly <math|2>. Then we can fill these symbols with the values
    <math|r+1> and <math|r+2>, such that the value of <math|r+1> aligns with
    a corresponding value of <math|r+1> in the depth vector for median search
    (precisely because any two consecutive symbols contain exactly one value
    of <math|r+1> in the depth vector corresponding to median search for
    <math|r+1>).

    We can use Proposition<nbsp><reference|prop:depth> to prove that this
    resulting completely filled vector is indeed a depth vector. How much
    does this strategy beat median search? There are <math|2<rsup|r>-1>
    locations (i.e. every index in which a value of <math|1>, <math|2>, ...
    or <math|r> is placed) in which this strategy beats median search, and
    there are <math|2<rsup|r-1>> locations in which this strategy ties median
    search, each worth <math|<frac|1|2>>. Note that this is for
    <math|2<rsup|r+1>-1> items, so after replacing <math|r+1> by <math|r>, the
    beatability of median search on <math|2<rsup|r>-1> items is at least
    <math|<frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>>, and

    <\equation*>
      lim<rsub|r\<rightarrow\>\<infty\>> <frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>=<frac|5|8>
    </equation*>
  </proof>
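
  As a sanity check on small instances, one can compute an exact best
  response against median search. The sketch below is not the inductive
  construction from the proof: it is a dynamic program over intervals and
  root depths that we introduce purely for verification, and all function
  names are ours. It assumes the uniform distribution and scores a win as
  <math|1> and a tie as <math|<frac|1|2>>.

```python
from fractions import Fraction
from functools import lru_cache


def median_depths(r):
    """Depth vector of median search on 2**r - 1 items."""
    if r == 0:
        return []
    sub = [d + 1 for d in median_depths(r - 1)]
    return sub + [1] + sub


def payoff(u, med):
    """Average payoff of depth vector `u` against `med` under uniform p."""
    total = Fraction(0)
    for mine, theirs in zip(u, med):
        if mine < theirs:
            total += 1
        elif mine == theirs:
            total += Fraction(1, 2)
    return total / len(med)


def best_response_value(r):
    """Exact best response value against median search, by dynamic
    programming over (interval, depth of the interval's root)."""
    med = median_depths(r)
    n = len(med)

    @lru_cache(maxsize=None)
    def best(lo, hi, depth):
        if lo > hi:
            return Fraction(0)
        options = []
        for k in range(lo, hi + 1):  # item k becomes the root at `depth`
            if depth < med[k]:
                here = Fraction(1)
            elif depth == med[k]:
                here = Fraction(1, 2)
            else:
                here = Fraction(0)
            options.append(here + best(lo, k - 1, depth + 1)
                           + best(k + 1, hi, depth + 1))
        return max(options)

    return best(0, n - 1, 1) / n


def claimed_bound(r):
    """The fraction (2^(r-1) - 1 + 2^(r-3)) / (2^r - 1) from the lemma."""
    return (Fraction(2) ** (r - 1) - 1 + Fraction(2) ** (r - 3)) / (2 ** r - 1)
```

  For seven items, the vector <math|<around|[|2,1,3,4,2,4,3|]>> obtained by
  filling the base-case pattern can be passed to <verbatim|payoff> together
  with <verbatim|median_depths(3)> to compare against
  <verbatim|claimed_bound(3)>.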

  <\lemma>
    <nbsp><label|lemma:bstbeathigh> The beatability of median search is at
    most <math|<frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>\<approx\><frac|5|8>>.
  </lemma>

  <\proof>
    One can give an upper bound on the beatability of median search by
    relaxing the question to a matching problem. Given a universe
    <math|\<Omega\>> of size <math|2<rsup|r>-1>, consider the following
    weighted matching problem: For every value of
    <math|s\<in\><around|{|1,2,...*r|}>>, add <math|2<rsup|s-1>> nodes on
    both the left and right side with label \Ps\Q. For any pair of nodes
    <math|a,b> where <math|a> is contained on the left side, and <math|b> is
    contained on the right side, set the value of the edge connecting
    <math|a> and <math|b> to be equal to <math|0> if the label of <math|a> is
    strictly smaller than the label of <math|b>, <math|<frac|1|2>> if the two
    labels have the same value, and <math|1> if the label of <math|a> is
    strictly larger than the label of <math|b>.

    Let <math|M> be the maximum value of a perfect matching. Let
    <math|<wide|M|\<bar\>>> be the average value, i.e.
    <math|<frac|M|2<rsup|r>-1>>.

    <claim|<math|<wide|M|\<bar\>>> is an upper bound on the beatability of
    median search.>

    <\proof>
      For any <math|s\<in\><around|{|1,2,...*r|}>>, the depth vector
      <math|<wide|u|\<vect\>><around|(|M|)>> corresponding to median search
      has exactly <math|2<rsup|s-1>> indices <math|j> for which
      <math|<wide|u|\<vect\>><around|(|M|)><rsub|j>=s>.

      We can make an adversary more powerful by allowing the adversary to
      choose any vector <math|<wide|u|\<vect\>>> which satisfies the
      condition that for any <math|s\<in\><around|{|1,2,...<around|\||\<Omega\>|\|>|}>>,
      the number of indices <math|j> for which
      <math|<wide|u|\<vect\>><rsub|j>=s> is at most <math|2<rsup|s-1>>.
      By Claim<nbsp><reference|cor:upper>, this is a weaker
      restriction than requiring the adversary to choose a vector
      <math|<wide|u|\<vect\>>> that is a depth vector. Since replacing a label
      by a smaller one never hurts the adversary, the
      adversary may as well choose a vector <math|<wide|u|\<vect\>>> that
      satisfies the constraint in Claim<nbsp><reference|cor:upper> with
      equality for every <math|s\<in\><around|{|1,2,...*r|}>>.

      In this case, where we allow the adversary to choose any vector
      <math|<wide|u|\<vect\>>> that satisfies
      Claim<nbsp><reference|cor:upper>, the best response question is exactly
      the matching problem described above: for each entry of
      <math|<wide|u|\<vect\>><around|(|M|)>>, the adversary only needs to
      choose what label <math|s\<in\><around|{|1,2,...*r|}>> to place at
      this location, subject to the above budget constraint that at most
      <math|2<rsup|s-1>> labels of type "s" are used in total.
    </proof>

    <claim|<math|<wide|M|\<bar\>>\<leq\><frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>>.>

    <\proof>
      Given a maximum-weight bipartite matching problem, the dual covering
      problem has variables <math|y<rsub|v>> corresponding to each node
      <math|v>, and the goal is to minimize <math|<big|sum><rsub|v>y<rsub|v>>
      subject to the constraint that for every edge <math|<around|(|u,v|)>>
      in the graph (which has value <math|w<around|(|u,v|)>>), the dual
      variables satisfy <math|y<rsub|u>+y<rsub|v>\<geq\>w<around|(|u,v|)>>
      and each variable <math|y<rsub|v>> is non-negative.

      So we can upper bound <math|M> by giving a feasible dual solution, which
      consequently also gives an upper bound on <math|<wide|M|\<bar\>>>.

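      Consider the following dual solution, with labels ranging over
      <math|<around|{|1,2,...*r|}>>: For each node on the right with
      label "s" for <math|s\<leq\>r-2>, set <math|y<rsub|v>> equal to
      <math|1>. For a node on the right with label "s" for <math|s=r-1>, set
      <math|y<rsub|v>> equal to <math|<frac|1|2>>, and for each node on the
      right with label "s" for <math|s=r>, set <math|y<rsub|v>=0>. On
      the left, only nodes with label "s" for <math|s=r> are given a non-zero
      dual variable, and this variable is set equal to <math|<frac|1|2>>.
      This solution is feasible: an edge of value <math|1> into a right node
      with label <math|r-1> must come from a left node with label <math|r>,
      and the only edges of positive value into a right node with label
      <math|r> have value <math|<frac|1|2>> and come from left nodes with
      label <math|r>.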

      The value of the dual <math|<big|sum><rsub|v>y<rsub|v>> is
      <math|1+2+...+2<rsup|r-3>+<frac|1|2>*2<rsup|r-2>+<frac|1|2>*2<rsup|r-1>=2<rsup|r-1>-1+2<rsup|r-3>>.
      And so this yields an upper bound on <math|<wide|M|\<bar\>>> of
      <math|<frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>> and

      <\equation*>
        lim<rsub|r\<rightarrow\>\<infty\>>
        <frac|2<rsup|r-1>-1+2<rsup|r-3>|2<rsup|r>-1>=<frac|5|8>
      </equation*>
    </proof>
  </proof>
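
  The dual argument can be checked mechanically for small <math|r>. The
  following Python sketch makes several assumptions that are ours: labels
  run over <math|1,...,r> with <math|2<rsup|s-1>> nodes per label on each
  side, the dual assigns <math|1>, <math|<frac|1|2>> and <math|0> to right
  labels at most <math|r-2>, equal to <math|r-1> and equal to <math|r>
  respectively, and <math|<frac|1|2>> to left label <math|r>. The
  brute-force matching routine is only meant for tiny instances.

```python
from fractions import Fraction
from itertools import permutations


def labels(r):
    """Node labels on one side: 2**(s-1) copies of s for s = 1..r."""
    return [s for s in range(1, r + 1) for _ in range(2 ** (s - 1))]


def weight(a, b):
    """Edge value between left label `a` and right label `b`."""
    if a > b:
        return Fraction(1)
    if a == b:
        return Fraction(1, 2)
    return Fraction(0)


def dual_solution(r):
    """Dual variables (left, right), indexed like labels(r)."""
    half = Fraction(1, 2)
    left = [half if s == r else Fraction(0) for s in labels(r)]
    right = [Fraction(1) if s <= r - 2 else half if s == r - 1 else Fraction(0)
             for s in labels(r)]
    return left, right


def is_feasible(r):
    """Check y_u + y_v >= w(u, v) on every edge of the complete bipartite graph."""
    left, right = dual_solution(r)
    lab = labels(r)
    return all(left[i] + right[j] >= weight(lab[i], lab[j])
               for i in range(len(lab)) for j in range(len(lab)))


def dual_value(r):
    """Objective value of the dual solution."""
    left, right = dual_solution(r)
    return sum(left) + sum(right)


def brute_force_matching(r):
    """Maximum perfect-matching value by enumeration (tiny r only)."""
    lab = labels(r)
    return max(sum(weight(a, b) for a, b in zip(lab, perm))
               for perm in set(permutations(lab)))
```

  Comparing <verbatim|dual_value(r)> with
  <verbatim|brute_force_matching(r)> for <math|r=2> and <math|r=3> shows
  whether the dual bound is tight on those instances.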

  <section|Conclusions and Future Directions><label|sec:conc>

  The dueling framework presents a fresh way of looking at classic
  optimization problems through the lens of competition. As we have
  demonstrated, standard algorithms for many optimization problems do not, in
  general, perform well in these competitive settings. This leads us to
  suspect that alternative algorithms, tailored to competition, may find use
  in practice. We have adapted linear programming and learning techniques
  into methods for constructing such algorithms.

  We have only just begun an exploration of the dueling framework for
  algorithm analysis; there are many open questions yet to consider. For
  instance, one avenue of future work is to compare the computational
  difficulty of solving an optimization problem with that of solving the
  associated duel. We know that one is not consistently more difficult than
  the other: in Appendix <reference|app:racing> we provide an example in
  which the optimization problem is computationally easy but the competitive
  variant appears difficult; an example of the opposite situation is given in
  Appendix <reference|app:easyduel>, where a computationally hard
  optimization problem has a duel which can be solved easily. Is there some
  structure underlying the relationship between the computational hardness of
  an optimization problem and its competitive analog?

  Perhaps more importantly, one could ask about performance loss inherent
  when players choose their algorithms competitively instead of using the
  (single-player) optimal algorithm. In other words, what is the
  <with|font-shape|italic|price of anarchy> <cite|KP99> of a given duel? Such
  a question requires a suitable definition of the social welfare for
  multiple algorithms, and in particular it may be that two competing
  algorithms perform better than a single optimal algorithm. Our main open
  question is: <em|does competition between algorithms improve or degrade
  expected performance?>

  <page-break>

  <\bibliography|bib|plain|dueling>
    <bib-list|[99]|>
  </bibliography>

  <page-break>

  <section|Proofs from Section<nbsp><reference|sec:defn>><label|app:defn>

  Here we present the proof of Lemma<nbsp><reference|lem:appx>. The proof
  follows a reduction from low-regret learning to computing approximate
  minmax strategies <cite|FS96>. It was shown there that if two players use
  \Plow regret\Q algorithms, then the empirical distribution over play will
  converge to the set of minmax strategies. However, instead of using the
  weighted majority algorithm, we use the \PFollow the expected leader\Q
  (FEL) algorithm <cite|KV05>. That algorithm gives a reduction between the
  ability to compute best responses and \Plow regret.\Q

  Note, for this section, we will use the fact that
  <math|x<rsup|t>*M*x<rprime|'>\<in\>[-C,C]> for
  <math|C=B<rsup|3>*n*n<rprime|'>> under our assumptions on
  <math|K,K<rprime|'>>, and <math|M>. We will extend the domain of
  <math|v:<reals><rsub|\<geq\>0><rsup|n>\<times\><reals><rsub|\<geq\>0><rsup|n<rprime|'>>\<rightarrow\><reals>>
  naturally by <math|v<around|(|x,x<rprime|'>|)>=x<rsup|t>*M*x<rprime|'>>.
  For <math|x\<in\><around|[|0,B|]><rsup|n>> and
  <math|x<rprime|'>\<in\><around|[|0,B|]><rsup|n<rprime|'>>>,
  <math|v<around|(|x,x<rprime|'>|)>\<in\>[-C,C]>. Additionally, for
  simplicity we will change the domains of <math|<cO>> and
  <math|<cO><rprime|'>> to <math|<reals><rsub|\<geq\>0><rsup|n>> and
  <math|<reals><rsub|\<geq\>0><rsup|n<rprime|'>>>, as follows. For any
  <math|x<rprime|'>\<in\><reals><rsub|\<geq\>0><rsup|n<rprime|'>>>, we simply
  take <math|<cO><around|(|B*x<rprime|'>/<around|\<\|\|\>|x<rprime|'>|\<\|\|\>><rsub|\<infty\>>|)>>
  as the best response to <math|x<rprime|'>> (for <math|x<rprime|'>=0>, an
  arbitrary element of <math|K>, such as <math|<cO><around|(|0|)>>, may be
  chosen). This scaling is valid since <math|arg max<rsub|x\<in\>K>
  x<rsup|t>*M*x<rprime|'>=arg max<rsub|x\<in\>K>
  x<rsup|t>*M*\<alpha\>*x<rprime|'>> for <math|\<alpha\>\<gtr\>0>. By
  linearity of <math|v>, it implies that, for the new oracle <math|<cO>> and
  any <math|x<rprime|'>\<in\><reals><rsub|\<geq\>0><rsup|n<rprime|'>>>,

  <\equation>
    <label|eq:scale>v<around|(|<cO><around|(|x<rprime|'>|)>,x<rprime|'>|)>\<geq\>max<rsub|x\<in\>K>
    v<around|(|x,x<rprime|'>|)>-<eps><frac|<around|\<\|\|\>|x<rprime|'>|\<\|\|\>><rsub|\<infty\>>|B>.
  </equation>

  Similarly for <math|<cO><rprime|'>>.

  Fix any sequence length <math|T\<geq\>1>. Consider <math|T> periods of
  repeated play of the duel. Let the strategies chosen by players 1 and 2, in
  period <math|t>, be <math|x<rsub|t>> and <math|x<rprime|'><rsub|t>>,
  respectively. Define the <with|font-shape|italic|regret> of player 1 on
  the sequence to be

  <\equation*>
    max<rsub|x\<in\>K> <big|sum><rsub|t=1><rsup|T>v<around|(|x,x<rprime|'><rsub|t>|)>-<big|sum><rsub|t=1><rsup|T>v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>.
  </equation*>

  Similarly define regret for player 2. The (possibly negative) regret of a
  player is how much better that player could have done using the best single
  strategy, where the best is chosen with the benefit of hindsight.

  <\observation>
    <label|ob:1>Suppose that in the sequences <math|x<rsub|1>,x<rsub|2>,\<ldots\>,x<rsub|T>>
    and <math|x<rprime|'><rsub|1>,x<rprime|'><rsub|2>,\<ldots\>,x<rprime|'><rsub|T>>,
    both players have at most <math|r> regret. Let
    <math|\<sigma\>=<around|(|x<rsub|1>+\<ldots\>+x<rsub|T>|)>/T>,
    <math|\<sigma\><rprime|'>=<around|(|x<rsub|1><rprime|'>+\<ldots\>+x<rsub|T><rprime|'>|)>/T>
    be the uniform mixed strategies over <math|x<rsub|1>,\<ldots\>,x<rsub|T>>,
    and <math|x<rsub|1><rprime|'>,\<ldots\>,x<rprime|'><rsub|T>>,
    respectively. Then <math|\<sigma\>> and <math|\<sigma\><rprime|'>> are
    <math|<eps>>-minmax strategies, for <math|<eps>=2*r/T>.
  </observation>

  <\proof>
    Say the minmax value of the game is <math|\<alpha\>>. Let
    <math|a=<frac|1|T>*<big|sum><rsub|t>v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>>.
    Then, by the definition of regret, <math|a\<geq\>\<alpha\>-r/T>, because
    otherwise player 1 would have more than <math|r> regret as seen by any
    minmax strategy for player 1, which guarantees at least an
    <math|\<alpha\>*T> payoff on the sequence. Also, we have that, against
    the uniform mixed strategy <math|\<sigma\>> over
    <math|x<rsub|1>,\<ldots\>,x<rsub|T>>, no strategy of player 2 can hold
    player 1 to a payoff below <math|a-r/T>, by the definition of
    regret (for player 2). Hence, <math|\<sigma\>> guarantees player 1 a
    payoff of at least <math|\<alpha\>-2*r/T>. A similar argument shows that
    <math|\<sigma\><rprime|'>> is <math|2*r/T>-minmax for player 2.
  </proof>
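
  The observation can be illustrated on a toy game. The sketch below
  assumes matching pennies as the duel (our choice, with minmax value
  <math|0>), represents strategies as lists of fractions on the simplex, and
  uses names of our own. It computes both regrets for a short sequence of
  play and the payoff that the averaged strategy of player 1 guarantees.

```python
from fractions import Fraction

# Matching pennies (an assumed toy game): player 1 maximizes x^T M x'.
M = [[Fraction(1), Fraction(-1)], [Fraction(-1), Fraction(1)]]
ALPHA = Fraction(0)  # minmax value of matching pennies

# Pure strategies; a linear payoff over the simplex is optimized at one of them.
PURE = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]


def v(x, y):
    """Bilinear payoff x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))


def average(seq):
    """Uniform mixture of a sequence of strategies."""
    return [sum(s[i] for s in seq) / len(seq) for i in range(2)]


def regrets(xs, ys):
    """Regret of player 1 (maximizer) and of player 2 (minimizer)."""
    realized = sum(v(x, y) for x, y in zip(xs, ys))
    best1 = max(sum(v(p, y) for y in ys) for p in PURE)
    best2 = min(sum(v(x, p) for x in xs) for p in PURE)
    return best1 - realized, realized - best2


def guarantee(sigma):
    """Worst-case payoff of player 1 when playing `sigma`."""
    return min(v(sigma, p) for p in PURE)


# A hypothetical sequence of T = 4 periods of play.
XS = [PURE[0], PURE[0], PURE[1], PURE[0]]
YS = [PURE[0], PURE[1], PURE[1], PURE[1]]
```

  With <verbatim|r> taken as the larger of the two regrets, the value
  <verbatim|guarantee(average(XS))> can be compared with
  <verbatim|ALPHA - 2 * r / len(XS)>.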

  The FEL algorithm for a player is simple. It has parameters
  <math|B,R\<gtr\>0,N\<geq\>1> and also takes as input an <math|<eps>> best
  response oracle for the player. For player 1 with best response oracle
  <math|<cO>>, the algorithm operates as follows. On each period
  <math|t=1,2,\<ldots\>>, it chooses <math|N> independent uniformly-random
  vectors <math|r<rsub|t*1>,r<rsub|t*2>,\<ldots\>,r<rsub|t*N>\<in\><around|[|0,R|]><rsup|m<rprime|'>>>.
  It plays,

  <\equation*>
    <frac|1|N><around*|(|<big|sum><rsub|j=1><rsup|N><cO><around*|(|r<rsub|t*j>+<big|sum><rsub|\<tau\>=1><rsup|t-1>x<rprime|'><rsub|\<tau\>>|)>|)>\<in\>K.
  </equation*>

  The above is seen to be in <math|K> by convexity. Also recall that for ease
  of analysis, we have assumed that <math|<cO>> takes as input any positive
  combination of points in <math|K<rprime|'>>.
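
  One period of FEL can be sketched as follows. This is a simplified
  illustration under assumptions of ours: the duel is a matrix game with
  <math|K> the simplex over rows, the oracle is an exact best response (so
  <math|<eps>=0>) that returns a pure row, the rescaling of the oracle input
  described above is omitted because an exact argmax is unaffected by
  positive scaling, and the names <verbatim|best_response> and
  <verbatim|fel_play> are hypothetical.

```python
import random


def best_response(M, y):
    """Exact best response oracle on the simplex: the best pure row against y."""
    scores = [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]
    best = max(range(len(M)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(M))]


def fel_play(M, history, R, N, rng):
    """One period of FEL for player 1: average N perturbed best responses.

    `history` holds the opponent's past plays; each perturbation is drawn
    uniformly from the cube [0, R]^(number of columns).
    """
    n_cols = len(M[0])
    total = [sum(y[j] for y in history) for j in range(n_cols)]
    play = [0.0] * len(M)
    for _ in range(N):
        noise = [rng.uniform(0.0, R) for _ in range(n_cols)]
        response = best_response(M, [total[j] + noise[j] for j in range(n_cols)])
        play = [p + q / N for p, q in zip(play, response)]
    return play
```

  Because each oracle output lies in the simplex, the averaged play does as
  well, which is the convexity remark made above.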

  <\lemma>
    <label|lem:a1>For any <math|B,C,R,T,\<beta\>,<eps>\<gtr\>0>, and any
    <math|r\<in\><around|[|0,R|]><rsup|m<rprime|'>>>,

    <\equation*>
      <big|sum><rsub|t=1><rsup|T>v<around|(|<cO><around|(|r+x<rprime|'><rsub|1>+x<rprime|'><rsub|2>+\<ldots\>+x<rprime|'><rsub|t>|)>,x<rprime|'><rsub|t>|)>\<geq\>max<rsub|x\<in\>K>
      <big|sum><rsub|t=1><rsup|T>v<around|(|x,x<rprime|'><rsub|t>|)>-2*C*R/B-T*<around|(|T+R/B|)><eps>.
    </equation*>
  </lemma>

  The proof is a straightforward modification of Kalai and Vempala's proof
  <cite|KV05>. What this is saying is that the \Pbe the leader\Q algorithm,
  which is \Pone step ahead\Q and uses the information for the current period
  in choosing the current period's play, has low regret. Moreover, one can
  perturb the payoffs by any amount in a bounded cube, and this won't affect
  the bounds significantly. The point of the perturbations, which we will
  choose randomly, will be to make it harder to predict what the algorithm
  will do. For the analysis, they will make it so that \Pbe the leader\Q and
  \Pfollow the leader\Q perform similarly.

  <\proof>
    Define <math|y<rsub|t>=r+x<rprime|'><rsub|1>+\<ldots\>+x<rprime|'><rsub|t-1>>.
    We first show,

    <\equation>
      <label|eq:woop>v<around|(|<cO><around|(|y<rsub|1>|)>,r|)>+<big|sum><rsub|t=1><rsup|T>v<around|(|<cO><around|(|y<rsub|t+1>|)>,x<rprime|'><rsub|t>|)>\<geq\>v<around|(|<cO><around|(|y<rsub|T+1>|)>,r|)>+<big|sum><rsub|t=1><rsup|T>v<around|(|<cO><around|(|y<rsub|T+1>|)>,x<rprime|'><rsub|t>|)>-T*<around|(|T+R/B|)><eps>.
    </equation>

    The fact that <math|<around|\<\|\|\>|r|\<\|\|\>><rsub|\<infty\>>\<leq\>R>
    implies that <math|v<around|(|x,r|)>\<in\>[-C*R/B,C*R/B]>, and hence,

    <align*|<tformat|<table|<row|<cell|C*R/B+<big|sum><rsub|t=1><rsup|T>v<around|(|<cO><around|(|y<rsub|t+1>|)>,x<rprime|'><rsub|t>|)>>|<cell|\<geq\>max<rsub|x\<in\>K><around*|(|v<around|(|x,r|)>+<big|sum><rsub|t=1><rsup|T>v<around|(|x,x<rprime|'><rsub|t>|)>|)>-T*<around|(|T+R/B|)><eps>>>|<row|<cell|>|<cell|\<geq\>max<rsub|x\<in\>K><around*|(|<big|sum><rsub|t=1><rsup|T>v<around|(|x,x<rprime|'><rsub|t>|)>|)>-T*<around|(|T+R/B|)><eps>-C*R/B,>>>>>

    which is equivalent to the lemma. We now prove (<reference|eq:woop>) by
    induction on <math|T>. For <math|T=0>, we have equality. For the
    induction step, it suffices to show that,

    <\equation*>
      v<around|(|<cO><around|(|y<rsub|T>|)>,r|)>+<big|sum><rsub|t=1><rsup|T-1>v<around|(|<cO><around|(|y<rsub|T>|)>,x<rprime|'><rsub|t>|)>\<geq\>v<around|(|<cO><around|(|y<rsub|T+1>|)>,r|)>+<big|sum><rsub|t=1><rsup|T-1>v<around|(|<cO><around|(|y<rsub|T+1>|)>,x<rprime|'><rsub|t>|)>-<around|(|R/B+T|)><eps>.
    </equation*>

    However, this is just an inequality between
    <math|v<around|(|<cO><around|(|y<rsub|T>|)>,y<rsub|T>|)>> and
    <math|v<around|(|<cO><around|(|y<rsub|T+1>|)>,y<rsub|T>|)>>, and hence
    follows from (<reference|eq:scale>) and the fact that
    <math|<around|\<\|\|\>|y<rsub|T>|\<\|\|\>><rsub|\<infty\>>/B\<leq\>R/B+T>.
    Hence we have established (<reference|eq:woop>) and also the lemma.
  </proof>

  <\lemma>
    <label|lem:a2>For any <math|\<delta\>\<geq\>0>, with probability
    <math|\<geq\>1-2*T*e<rsup|-2*\<delta\><rsup|2>*N>>,

    <\equation*>
      <big|sum><rsub|t=1><rsup|T>v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>\<geq\>max<rsub|x\<in\>K>
      <big|sum><rsub|t=1><rsup|T>v<around|(|x,x<rprime|'><rsub|t>|)>-\<delta\>*C*T-2*B*C*m<rprime|'>*T/R-2*C*R/B-T*<around|(|T+R/B|)><eps>.
    </equation*>
  </lemma>

  <\proof>
    It is clear that <math|y<rsub|t>> and <math|y<rsub|t+1>> are similarly
    distributed. For any fixed <math|x<rsub|1><rprime|'>,x<rsub|2><rprime|'>,\<ldots\>,x<rsub|T><rprime|'>>,
    define <math|<wide|x|\<bar\>><rsub|t>> by,

    <\equation*>
      <wide|x|\<bar\>><rsub|t>=<frac|1|R<rsup|m<rprime|'>>>*<big|int><rsub|r\<in\><around|[|0,R|]><rsup|m<rprime|'>>><cO><around*|(|r+x<rsub|1><rprime|'>+\<ldots\>+x<rprime|'><rsub|t-1>|)>*d*r.
    </equation*>

    By linearity of expectation and <math|v>, it is easy to see that
    <math|<E><around|[|x<rsub|t>\|x<rsub|1><rprime|'>,\<ldots\>,x<rsub|t-1><rprime|'>|]>=<wide|x|\<bar\>><rsub|t>>
    and,

    <\equation*>
      <E><around|[|v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)><nbsp>\|<nbsp>x<rsub|1><rprime|'>,\<ldots\>,x<rsub|t><rprime|'>|]>=v<around|(|<wide|x|\<bar\>><rsub|t>,x<rprime|'><rsub|t>|)>.
    </equation*>

    By Chernoff-Hoeffding bounds, since <math|v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>\<in\>[-C,C]>,
    for any <math|\<delta\>\<geq\>0>, we have that

    <\equation*>
      Pr <around*|[|<nbsp><around|\||v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>-v<around|(|<wide|x|\<bar\>><rsub|t>,x<rprime|'><rsub|t>|)>|\|>\<geq\>\<delta\>*C<nbsp><around*|\||<nbsp>x<rsub|1><rprime|'>,\<ldots\>,x<rsub|t><rprime|'>|]>\<leq\>2*e<rsup|-2*\<delta\><rsup|2>*N>.|\<nobracket\>>
    </equation*>

    Hence, by the union bound, <math|Pr <around*|[|<nbsp><around|\||<big|sum><rsub|t>v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>-<big|sum><rsub|t>v<around|(|<wide|x|\<bar\>><rsub|t>,x<rprime|'><rsub|t>|)>|\|>\<geq\>\<delta\>*C*T|]>\<leq\>2*T*e<rsup|-2*\<delta\><rsup|2>*N>>.

    The key observation of Kalai and Vempala is that
    <math|<wide|x|\<bar\>><rsub|t>> and <math|<wide|x|\<bar\>><rsub|t+1>> are
    close because the <math|m<rprime|'>>-dimensional translated cubes
    <math|x<rsub|1><rprime|'>+\<ldots\>+x<rprime|'><rsub|t-1>+<around|[|0,R|]><rsup|m<rprime|'>>>
    and <math|x<rsub|1><rprime|'>+\<ldots\>+x<rprime|'><rsub|t>+<around|[|0,R|]><rsup|m<rprime|'>>>
    overlap significantly. In particular, they overlap on all but at most
    a <math|B*m<rprime|'>/R> fraction <cite|KV05> of their volume. Since
    <math|v> is in <math|[-C,C]>, this means that
    <math|<around*|\||v<around|(|<wide|x|\<bar\>><rsub|t>,x<rprime|'><rsub|t>|)>-v<around|(|<wide|x|\<bar\>><rsub|t+1>,x<rprime|'><rsub|t>|)>|\|>\<leq\>2*B*C*m<rprime|'>/R>.
    This follows from the fact that <math|v> is bilinear, and hence when
    moved into the integral has exactly the same behavior on all but a
    <math|B*m<rprime|'>/R> fraction of the points in each cube. This implies
    that, with probability <math|\<geq\>1-2*T*e<rsup|-2*\<delta\><rsup|2>*N>>,

    <\equation*>
      <big|sum><rsub|t=1><rsup|T>v<around|(|x<rsub|t>,x<rprime|'><rsub|t>|)>\<geq\><big|sum><rsub|t=1><rsup|T>v<around|(|<wide|x|\<bar\>><rsub|t+1>,x<rprime|'><rsub|t>|)>-\<delta\>*C*T-2*B*C*m<rprime|'>*T/R.
    </equation*>

    Combining this with Lemma <reference|lem:a1> completes the proof.
  </proof>

  We are now ready to prove Lemma <reference|lem:appx>.

  <\proof>
    <dueto|Proof of Lemma <reference|lem:appx>>We take
    <math|T=<around*|(|4*C*<sqrt|max <around|(|m,m<rprime|'>|)>>/<around|(|3<eps>|)>|)><rsup|2/3>>,
    <math|R=B*<sqrt|max <around|(|m,m<rprime|'>|)>*T>> and <math|N=ln
    <around|(|4*T*C/\<delta\>|)>/<around|(|2<eps><rsup|2>|)>>. As long as
    <math|T\<geq\>max <around|(|m,m<rprime|'>|)>>, <math|R/B\<leq\>T> and
    hence Lemma <reference|lem:a2> implies that with probability at least
    <math|1-\<delta\>>, if both players play FEL then both will have regret
    at most

    <\equation*>
      <eps>T+4*C*<sqrt|max <around|(|m,m<rprime|'>|)>*T>+2*T<rsup|2><eps>\<leq\>4*C*<sqrt|max
      <around|(|m,m<rprime|'>|)>*T>+3*T<rsup|2><eps>\<leq\>12*<around|(|max
      <around|(|m,m<rprime|'>|)>*C<rsup|2>|)><rsup|2/3><eps><rsup|-1/3>.
    </equation*>

    Observation <reference|ob:1> completes the proof.
  </proof>

  <section|A Racing Duel><label|app:racing>

  The racing duel is a simple example in which the beatability is
  unbounded, the optimization problem is \Peasy,\Q but finding
  polynomial-time minmax algorithms remains a challenging open problem. The
  optimization problem behind the racing duel is routing under uncertainty.
  There is an underlying directed multigraph <math|<around|(|V,E|)>>
  containing designated start and terminal nodes <math|s,t\<in\>V>, along
  with a set of bounded weight vectors
  <math|\<Omega\>\<subset\><reals><rsub|\<geq\>0><rsup|E>>, where
  <math|\<omega\><rsub|e>> represents the delay in traversing edge <math|e>.
  The feasible set <math|X> is the set of paths from <math|s> to <math|t>.
  The probability distribution <math|p\<in\>\<Delta\><around|(|\<Omega\>|)>>
  is an arbitrary measure over <math|\<Omega\>>. Finally,
  <math|c<around|(|x,\<omega\>|)>=<big|sum><rsub|e\<in\>x>\<omega\><rsub|e>>.

  For general graphs, solving the racing duel seems quite challenging. This
  is true even when routing between two nodes with parallel edges, i.e.,
  <math|V=<around|{|s,t|}>> and all edges
  <math|E=<around|{|e<rsub|1>,e<rsub|2>,\<ldots\>,e<rsub|n>|}>> are from
  <math|s> to <math|t>. As mentioned in the introduction, this problem is a
  \Pprimal\Q duel in the sense that it can encode any duel with a
  finite strategy set. In particular, given any optimization problem with
  <math|<around|\||X|\|>=n>, we can create a race where each edge
  <math|e<rsub|i>\<in\>E> corresponds to a strategy <math|x<rsub|i>\<in\>X>,
  and the delays on the edges match the costs of the associated strategies.

  <subsection|Shortest path routing is 1-beatable>

  The single-player racing problem is easy: take the shortest path on the
  graph with weights <math|w<rsub|e>=<E><rsub|\<omega\>\<sim\>p><around|[|\<omega\><rsub|e>|]>>.
  However, this algorithm can be beaten almost always. Consider a graph with
  two parallel edges, <math|a> and <math|b>, both from <math|s> to <math|t>.
  Say the cost of <math|a> is <math|\<epsilon\>/2\<gtr\>0> with probability
  1, and the cost of <math|b> is 0 with probability <math|1-\<epsilon\>> and
  <math|1> with probability <math|\<epsilon\>>. The optimization algorithm
  will choose <math|a>, but <math|b> beats <math|a> with probability
  <math|1-\<epsilon\>>, which is arbitrarily close to 1.
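
  The numbers in this example can be checked with exact arithmetic. The
  short Python sketch below enumerates the two scenarios for edge b; the
  function name <verbatim|racing_example> and the scenario encoding are
  illustrative assumptions.

```python
from fractions import Fraction


def racing_example(eps):
    """Edge a costs eps/2 surely; edge b costs 0 w.p. 1 - eps and 1 w.p. eps."""
    # Each scenario is (probability, cost of a, cost of b).
    scenarios = [
        (1 - eps, eps / 2, Fraction(0)),
        (eps, eps / 2, Fraction(1)),
    ]
    expected_a = sum(p * ca for p, ca, cb in scenarios)
    expected_b = sum(p * cb for p, ca, cb in scenarios)
    # Probability that b is strictly faster than a.
    b_beats_a = sum(p for p, ca, cb in scenarios if cb < ca)
    return expected_a, expected_b, b_beats_a


expected_a, expected_b, b_beats_a = racing_example(Fraction(1, 100))
```

  With the parameter set to 1/100, edge a has the smaller expected delay
  (1/200 against 1/100), so the shortest-path algorithm picks a, yet b is
  faster with probability 99/100.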

  <subsection|Price of anarchy>

  Take social welfare to be the average performance,
  <math|W<around|(|x,x<rprime|'>|)>=<around|(|c<around|(|x|)>+c<around|(|x<rprime|'>|)>|)>/2>.
  Then the price of anarchy for racing is unbounded. Consider a graph with
  two parallel edges, <math|a> and <math|b>, both from <math|s> to <math|t>.
  The cost of <math|a> is <math|\<epsilon\>\<gtr\>0> with probability 1, and
  the cost of <math|b> is 0 with probability 3/4 and <math|1> with
  probability 1/4. Then <math|b> is a dominant strategy for both players,
  but its expected cost is <math|1/4>, whereas both players choosing
  <math|a> costs only <math|\<epsilon\>>. The price of anarchy is therefore
  <math|1/<around|(|4*\<epsilon\>|)>>, which can be arbitrarily large.
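
  The dominance claim and the resulting ratio can be verified by direct
  enumeration. The Python sketch below assumes that a player's payoff is the
  probability of being strictly faster plus half the probability of a tie,
  and that the parameter is below 1/4 so that both players choosing a is the
  social optimum; the function names are hypothetical.

```python
from fractions import Fraction


def win_prob(x, y, scenarios):
    """Pr[edge x is strictly cheaper than edge y] plus half the tie probability."""
    total = Fraction(0)
    for p, costs in scenarios:
        if costs[x] < costs[y]:
            total += p
        elif costs[x] == costs[y]:
            total += p / 2
    return total


def price_of_anarchy(eps):
    scenarios = [
        (Fraction(3, 4), {"a": eps, "b": Fraction(0)}),
        (Fraction(1, 4), {"a": eps, "b": Fraction(1)}),
    ]
    edges = ("a", "b")
    # b is dominant if it does strictly better than a against every edge.
    b_dominant = all(
        win_prob("b", y, scenarios) > win_prob("a", y, scenarios) for y in edges
    )
    expected = {e: sum(p * costs[e] for p, costs in scenarios) for e in edges}
    # Equilibrium welfare (both on b) divided by optimal welfare (both on a).
    return b_dominant, expected["b"] / expected["a"]


b_dominant, ratio = price_of_anarchy(Fraction(1, 100))
```

  With the parameter set to 1/100, b is dominant and the ratio is 25, in
  agreement with the formula above.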

  <section|When Competing is Easier than Playing Alone><label|app:easyduel>

  Recall that the racing problem from Appendix <reference|app:racing> was
  \Peasy\Q for single-player optimization, yet seemingly difficult to solve
  in the competitive setting. We now give a contrasting example: a problem
  for which competing is easier than solving the single-player optimization.

  The intuition behind our construction is as follows. The optimization
  problem will be based upon a computationally difficult decision problem,
  which an algorithm must attempt to answer. After the algorithm submits an
  answer, nature provides its own \Panswer\Q chosen uniformly at random. If
  the algorithm disagrees with nature, it incurs a large cost that is
  independent of whether or not it was correct. If the algorithm and nature
  agree, then the cost of answering the problem correctly is less than the
  cost of answering incorrectly.

  More formally, let <math|L\<subseteq\><around|{|0,1|}><rsup|\<ast\>>> be an
  arbitrary language, and let <math|z\<in\><around|{|0,1|}><rsup|\<ast\>>> be
  a string. Our duel will be <math|D<around|(|X,\<Omega\>,p,c|)>> where
  <math|X=\<Omega\>=<around|{|0,1|}>>, <math|p> is uniform, and the cost
  function is

  <\equation*>
    c<around|(|x,\<omega\>|)>=<choice|<tformat|<table|<row|<cell|0>|<cell|<text|if
    <math|(x=\<omega\>=1> and <math|z\<in\>L)> or <math|(x=\<omega\>=0> and
    >z\<nin\>L)>>|<row|<cell|1>|<cell|<text|if <math|(x=\<omega\>=1> and
    <math|z\<nin\>L)> or <math|(x=\<omega\>=0> and
    >z\<in\>L)>>|<row|<cell|2>|<cell|<text|if >x\<neq\>\<omega\>>>>>>
  </equation*>

  The unique optimal solution to this (single-player) problem is to output
  <math|1> if and only if <math|z\<in\>L>. Doing so is as computationally
  difficult as the decision problem itself. On the other hand, finding a
  minmax optimal algorithm is trivial for every <math|z> and <math|L>, since
  <em|every> algorithm has value <math|1/2>: for any <math|x<rprime|'>>,
  <math|v*<around|(|1-x<rprime|'>,x<rprime|'>|)>=Pr
  <around|[|\<omega\>\<neq\>x<rprime|'>|]>=1/2=v<around|(|x<rprime|'>,x<rprime|'>|)>>.
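
  Since the duel has only two strategies and two states of nature, both
  claims can be checked by enumeration. In the Python sketch below, the flag
  <verbatim|z_in_L> stands in for the (possibly hard to compute) truth value
  of the membership question, and the payoff is assumed to be the
  probability of a strictly lower cost plus half the probability of a tie;
  all names are illustrative.

```python
from fractions import Fraction
from itertools import product


def cost(x, omega, z_in_L):
    """Cost function of the duel for answer x and nature's answer omega."""
    if x != omega:
        return 2
    # Answering 1 is correct exactly when z is in L.
    return 0 if (x == 1) == z_in_L else 1


def expected_cost(x, z_in_L):
    """Single-player objective, with omega uniform on {0, 1}."""
    return sum(Fraction(1, 2) * cost(x, omega, z_in_L) for omega in (0, 1))


def value(x, x_prime, z_in_L):
    """Pr[c(x) < c(x')] plus half of Pr[c(x) = c(x')], omega uniform."""
    v = Fraction(0)
    for omega in (0, 1):
        cx, cy = cost(x, omega, z_in_L), cost(x_prime, omega, z_in_L)
        if cx < cy:
            v += Fraction(1, 2)
        elif cx == cy:
            v += Fraction(1, 4)
    return v


# Every pair of answers has value 1/2, whatever the truth of "z in L".
all_half = all(
    value(x, y, truth) == Fraction(1, 2)
    for x, y, truth in product((0, 1), (0, 1), (False, True))
)
```

  The single-player expected cost is 1 for the correct answer and 3/2 for
  the incorrect one, so optimizing alone requires deciding membership, while
  the duel value never depends on it.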

  <section|Asymmetric Games><label|app:asymmetric>

  All of the examples we have considered are symmetric with respect to the
  players, but our results extend to asymmetric games. Our analysis of
  bilinear duels in Section <reference|sec:bilinear> does not assume
  symmetry. For instance, we
  could consider a game where player 1 wins in the case of ties, so player
  1's payoff is <math|Pr <around|[|c<around|(|x,\<omega\>|)>\<leq\>c<around|(|x<rprime|'>,\<omega\>|)>|]>>.
  One natural example would be a ranking duel in which there is an
  \Pincumbent\Q search engine that appeared first, so a user prefers to
  continue using it rather than switching to a new one. This game can be
  represented in the same bilinear form as in Section <reference|sec:rank>,
  the only change being a small modification of the payoff matrix <math|M>.
  Other types of asymmetry, such as players having different objective
  functions, can be handled in the same way. For example, in a hiring duel,
  our analysis techniques apply even if the two players have different
  pools of candidates, of possibly different sizes and qualities.

  [AK04] Baruch Awerbuch and Robert D. Kleinberg. Adaptive routing with
  end-to-end feedback: distributed learning and geometric approaches. In STOC
  '04: Proceedings of the thirty-sixth annual ACM symposium on Theory of
  computing, pages 45-53, Chicago, IL, USA, 2004. ACM, New York, NY, USA.
  ISBN 1-58113-852-0. doi: http://doi.acm.org/10.1145/1007352.1007367.

  [BFHS09] Felix Brandt, Felix A. Fischer, Paul Harrenstein, and Yoav Shoham.
  Ranking games. Artif. Intell., 173(2):221-239, 2009.

  [AKT08] Itai Ashlagi, Piotr Krysta, and Moshe Tennenholtz. Social Context
  Games. In WINE, pages 675-683, 2008.

  [ATZ10] Itai Ashlagi, Moshe Tennenholtz, and Aviv Zohar. Competing
  Schedulers. In AAAI, 2010.

  [BS99] R. Burguet and J. Sakovics. Imperfect Competition in Auction
  Designs. International Economic Review, 40(1):231-247, 1999.

  [FS96] Yoav Freund and Robert E. Schapire. Game theory, on-line prediction
  and boosting. In COLT '96: Proceedings of the ninth annual conference on
  Computational learning theory, pages 325-332, Desenzano del Garda, Italy,
  1996. ACM, New York, NY, USA. ISBN 0-89791-811-8. doi:
  http://doi.acm.org/10.1145/238061.238163.

  [IKM06] Nicole Immorlica, Robert Kleinberg, and Mohammad Mahdian. Secretary
  Problems with Competing Employers. In Paul Spirakis, Marios Mavronicolas,
  and Spyros Kontogiannis, editors, Internet and Network Economics, volume
  4286 of Lecture Notes in Computer Science, pages 389-400. Springer Berlin /
  Heidelberg, 2006. url: http://dx.doi.org/10.1007/11944874_35.

  [KV05] Adam Kalai and Santosh Vempala. Efficient algorithms for online
  decision problems. Journal of Computer and System Sciences, 71(3):291-307,
  2005. Learning Theory 2003. ISSN 0022-0000. doi:
  10.1016/j.jcss.2004.10.016. url:
  http://www.sciencedirect.com/science/article/B6WJ0-4F2R60M-1/2/decd393207e921dd12ea556b5ac0bcc3.

  [MT04] D. Monderer and M. Tennenholtz. K-price auctions: Revenue
  Inequalities, Utility Equivalence, and Competition in Auction Design.
  Economic Theory, 24(2):255-270, 2004.

  [PS97] M. Peters and S. Severinov. Competition Among Sellers Who Offer
  Auctions Instead of Prices. Journal of Economic Theory, 75:141-179, 1997.

  [Mcafee93] P. McAfee. Mechanism Design by Competing Sellers. Econometrica,
  61:1281-1312, 1993.

  [FKS95] J. Feigenbaum, D. Koller, and P. Shor. A game-theoretic
  classification of interactive complexity classes. In SCT '95: Proceedings
  of the 10th Annual Structure in Complexity Theory Conference (SCT'95), page
  227, 1995. IEEE Computer Society, Washington, DC, USA. ISBN 0-8186-7052-5.

  [FIKU05] Lance Fortnow, Russell Impagliazzo, Valentine Kabanets, and
  Christopher Umans. On the Complexity of Succinct Zero-Sum Games. In CCC
  '05: Proceedings of the 20th Annual IEEE Conference on Computational
  Complexity, pages 323-332, 2005. IEEE Computer Society, Washington, DC,
  USA. ISBN 0-7695-2364-1. doi: http://dx.doi.org/10.1109/CCC.2005.18.

  [CMRS64] Y. Chow, S. Moriguti, H. Robbins, and S. Samuels. Optimal
  selection based on relative rank (the \Psecretary problem\Q). Israel
  Journal of Mathematics, 2(2):81-90, 1964. Hebrew University Magnes Press.
  ISSN 0021-2172. url: http://dx.doi.org/10.1007/BF02759948.

  [Yann] Mihalis Yannakakis. Expressing Combinatorial Optimization Problems
  by Linear Programs. J. Comput. Syst. Sci., 43(3):441-466, 1991.
</body>