<TeXmacs|1.99.7>

<style|<tuple|amsart|std-latex>>

<\body>
  <\hide-preamble>
    <assign|U|<macro|1|<protect><rule|.1in|.1in>>>

    <new-theorem|thm|Theorem>

    <new-theorem|cor|Corollary>

    <new-theorem|lem|Lemma>

    <new-theorem|prop|Proposition>

    <new-theorem|defn|Definition>

    <new-theorem|rem|Remark>

    <new-theorem|con|Condition>

    <new-theorem|ex|Example>

    <assign|norm|<macro|1|<left|\|\|><arg|1><right|\|\|>>>

    <assign|abs|<macro|1|<left|\|><arg|1><right|\|>>>

    <assign|set|<macro|1|<left|{><arg|1><right|}>>>

    <assign|Real|<macro|\<bbb-R\>>>

    <assign|eps|<macro|\<varepsilon\>>>

    <assign|To|<macro|\<longrightarrow\>>>

    <assign|A|<macro|\<cal-A\>>>

    <assign|va|<macro|<wide*|a|\<bar\>>>>

    <assign|vx|<macro|<wide*|x|\<bar\>>>>

    <assign|vnu|<macro|<wide*|\<nu\>|\<bar\>>>>

    <assign|vnun|<macro|<wide*|\<nu\>|\<bar\>><rsub|k,n>>>

    <assign|vnud|<macro|<wide*|\<nu\>|\<bar\>><rsup|\<delta\>>>>

    <assign|vgam|<macro|<wide*|\<gamma\>|\<bar\>>>>

    <assign|vB|<macro|<wide*|B|\<bar\>>>>

    <assign|LVertt|<macro|<left|\|><space|-0.17em><left|\|><space|-0.17em><left|\|>>>

    <assign|RVertt|<macro|<right|\|><space|-0.17em><right|\|><space|-0.17em><right|\|>>>

    <assign|1|<macro|\<bbb-I\>>>

    <assign|baselinestretch|<macro|1.3>>
  </hide-preamble>

  <doc-data|<doc-title|An estimation method for the chi-square divergence
  with application to test of hypotheses>|<doc-author|<author-data|<author-name|M.
  Broniatowski<rsup|<math|1>>>>>|<doc-author|<author-data|<author-name|S.
  Leorato<rsup|<math|2>>>|<author-affiliation|<rsup|<math|1>>Université de
  REIMS and LSTA, Université PARIS 6, 175 Rue du Chevaleret 75013 PARIS,
  FRANCE<next-line><rsup|<math|2>>Dip. di Statistica, Probabilità e
  Statistiche Applicate, University of Rome "La Sapienza", P.le A. Moro, 5
  00185 Roma>|<author-email|<rsup|<math|2>>samantha.leorato@uniroma1.it>>>>

  <abstract-data|<abstract-keywords|chi-square divergence| hypothesis
  testing| linear constraints| marginal distributions| contamination models|
  Fenchel-Legendre transform| inliers>|<abstract-msc|62F03|
  62F10|62F30>|<\abstract>
    We propose a new definition of the chi-square divergence between
    distributions. Based on convexity properties and duality, this version of
    the <math|\<chi\><rsup|2>> is well suited both for the classical
    applications of the <math|\<chi\><rsup|2>> for the analysis of
    contingency tables and for the statistical tests for parametric models,
    for which it has been advocated to be robust against <em|inliers>.

    We present two applications in testing. In the first one we deal with
    tests for finite and infinite numbers of linear constraints, while, in
    the second one, we apply <math|\<chi\><rsup|2>->methodology for
    parametric testing against contamination.
  </abstract>>

  <section|Introduction><label|sec:1.intro>

  The <math|\<chi\><rsup|2>> distance is commonly used for categorized data.
  For the continuous case, optimal groupings pertaining to the
  <math|\<chi\><rsup|2>> criterion have been proposed by various authors; see,
  e.g., <cite|Bosq80>, <cite|Lancaster>, <cite|greenwoodNikulin>. These
  methods are mainly applied to tests, since they may induce some bias in
  estimation.

  This paper introduces a new approach to the <math|\<chi\><rsup|2>>,
  placing its study within the range of divergence-based methods and
  presenting a technique that avoids grouping for estimation and testing. Let
  us first introduce some notation.

  Let <math|M<rsub|1>> denote the set of all probability measures on
  <math|\<bbb-R\><rsup|d>> and <math|M> the set of all signed measures on
  <math|\<bbb-R\><rsup|d>> with total mass 1. For <math|P\<in\>M<rsub|1>> and
  <math|Q\<in\>M>, introduce the <math|\<chi\><rsup|2>> distance between
  <math|P> and <math|Q> by

  <\equation>
    \<chi\><rsup|2><around|(|Q,P|)>=

    <\around*|{>
      <array|c|l*l|<tformat|<table|<row|<cell|<big|int><rsub|\<nosymbol\>><around*|(|<frac|d*Q-d*P|d*P>|)><rsup|2>*<space|0.17em>d*P>|<cell|Q*<text|is
      a.c. w.r.t. >P>>|<row|<cell|\<infty\>>|<cell|<text|otherwise.>>>>>>
    </around*|\<nobracket\>>

    <label|eqn:1.def>
  </equation>

  For <math|\<Omega\>> a subset of <math|M> denote

  <\equation>
    \<chi\><rsup|2><around|(|\<Omega\>,P|)>=inf<rsub|Q\<in\>\<Omega\>>
    \<chi\><rsup|2><around|(|Q,P|)>,<label|eqn:1.defOmega>
  </equation>

  with <math|inf<rsub|<around|{|\<emptyset\>|}>>=\<infty\>>.

  When the infimum in (<reference|eqn:1.defOmega>) is attained at some measure
  <math|Q<rsup|\<ast\>>> which belongs to <math|\<Omega\>>, then
  <math|Q<rsup|\<ast\>>> is the <em|projection> of <math|P> on
  <math|\<Omega\>>. The role of the class of measures <math|M> will
  appear later: working in <math|M> makes it possible to obtain
  <math|Q<rsup|\<ast\>>> easily through usual optimization methods, which
  might be quite difficult when we consider subsets <math|\<Omega\>> of
  <math|M<rsub|1>>.

  For a testing problem such as <math|H<rsub|0>:P\<in\>\<Omega\>> vs
  <math|H<rsub|1>:P\<nin\>\<Omega\>>, the test statistic will be an estimate
  of <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>>, which equals <math|0>
  under <math|H<rsub|0>>, since in that case <math|P=Q<rsup|\<ast\>>>.
  Therefore, under <math|H<rsub|0>>, there is no restriction when considering
  <math|\<Omega\>> a subset of <math|M>.

  The <math|\<chi\><rsup|2>> distance belongs to the so-called
  <math|\<phi\>->divergences, defined through

  <\equation>
    \<phi\><around|(|Q,P|)>=

    <\around*|{>
      <array|c|l*l|<tformat|<table|<row|<cell|<big|int>\<varphi\><around*|(|<frac|d*Q|d*P>|)>*d*P>|<cell|<text|when
      >Q\<ll\>P>>|<row|<cell|\<infty\>>|<cell|<text|otherwise>>>>>>
    </around*|\<nobracket\>>

    <label|eqn:1.phidiv>
  </equation>

  where <math|\<varphi\>> is a convex function defined on
  <math|\<bbb-R\><rsup|+>> satisfying <math|\<varphi\><around|(|1|)>=0>. This
  class of discrepancy measures between probability measures has been
  introduced by I. Csiszár <cite|Csiszar1967>, and the monograph by F. Liese
  and I. Vajda <cite|LV77> provides their main properties.

  The extension of <math|\<phi\>->divergences when <math|Q> is assumed to be
  in <math|M> is presented in <cite|Broniatowski-Keziou2003>, in the context
  of parametric estimation and tests.

  The class of minimum <math|\<phi\>->divergence test statistics includes,
  among others, the <em|loglikelihood ratio test>.

  For this class it is well known that first-order efficiency is not a
  useful criterion of discrimination. A notion of robustness against model
  contamination is found in Lindsay <cite|Lindsay1994> (for estimators) and
  in Jimenez and Shao <cite|JimenezShao2001> (for test procedures), which
  gives an instrument to compare the tests associated with different
  divergences. Although their argument deals with finite support models, it
  may serve as a benchmark for more general situations.

  From these papers it emerges that the <em|minimum Hellinger distance> test
  provides a reasonable compromise between robustness against model
  contaminations induced by outliers and by inliers.

  However, when the model might be subject to inlier contaminations only
  (namely missing data problems), as will be advocated in the present paper
  for contamination models, the minimum <math|\<chi\><rsup|2>->divergence
  test behaves better than both the minimum Hellinger distance and
  loglikelihood ratio tests, in terms of their <em|residual adjustment
  functions> (RAF), because (we refer to <cite|JimenezShao2001> for the
  notation)

  <\equation*>
    <around*|\||<frac|A<rsub|\<chi\><rsup|2>>(-1)|A<rsub|L*R>(-1)>|\|>=<frac|1|2>\<less\>1<space|1em><text|and><space|2em><around*|\||<frac|A<rsub|\<chi\><rsup|2>>(-1)|A<rsub|H*D>(-1)>|\|>=<frac|1|4>\<less\>1.
  </equation*>

  Formula (<reference|eqn:1.def>) is not suitable for statistical purposes as
  such. Indeed, suppose that we are interested in testing whether <math|P> is
  in some class <math|\<Omega\>> of distributions with absolutely continuous
  component. Let <math|X=<around|(|X<rsub|1>,\<ldots\>,X<rsub|n>|)>> be an
  i.i.d. sample with unknown distribution <math|P>. Assume that
  <math|P<rsub|n>\<assign\><frac|1|n>*<big|sum><rsub|i=1><rsup|n>\<delta\><rsub|X<rsub|i>>>,
  the empirical measure pertaining to <math|X>, is the only information
  available on <math|P>, where <math|\<delta\><rsub|x>> is the Dirac measure
  at point <math|x>. Then no <math|Q\<in\>\<Omega\>> is absolutely continuous
  with respect to <math|P<rsub|n>>, so that, for all
  <math|Q\<in\>\<Omega\>>, the <math|\<chi\><rsup|2>> distance between
  <math|Q> and <math|P<rsub|n>> is infinite. Therefore no plug-in technique
  can lead to a well-defined statistic in this usual case.

  Our approach solves this difficulty and is based on the ``dual
  representation'' for the <math|\<chi\><rsup|2>> divergence, which is a
  consequence of the convexity of the mapping
  <math|Q\<longmapsto\>\<chi\><rsup|2><around|(|Q,P|)>>, together with
  some regularity property; this will be set in Section
  <reference|sec:2.estimator>, together with conditions under which <math|P>
  has a <math|\<chi\><rsup|2>->projection on <math|\<Omega\>>. We will also
  provide an estimate for the function <math|<frac|d*Q<rsup|\<ast\>>|d*P>>,
  which indicates the local changes induced on <math|P> by the projection
  operator.

  In some cases it is possible to replace <math|\<Omega\>> by
  <math|\<Omega\>\<cap\>\<Lambda\><rsub|n>>, where <math|\<Lambda\><rsub|n>>
  is the set of all measures in <math|M> whose support is <math|X>, when this
  intersection is not empty, as happens when <math|\<Omega\>> is defined, for
  example, through moment conditions. This approach is called the Generalized
  Empirical Likelihood paradigm (see <cite|neweySmith2003> and references
  therein), and we will develop in Section 3 a complete study pertaining to
  this case when handling the <math|\<chi\><rsup|2>> divergence, in the event
  that <math|\<Omega\>> is defined through linear constraints, namely when
  <\equation>
    \<Omega\>=<around*|{|Q\<in\>M*<text|such
    that><big|int>f<around|(|x|)>*d*Q<around|(|x|)>=0|}><label|eqn:1.omegalinear>
  </equation>

  for some <math|\<bbb-R\><rsup|k>->valued function <math|f> defined on
  <math|\<bbb-R\><rsup|d>>. In this case the projection
  <math|Q<rsup|\<ast\>>> has a very simple form and its estimate is obtained
  as the solution of a linear system of equations, which motivates the choice
  of the <math|\<chi\><rsup|2>> criterion for tests of the form
  <math|H<rsub|0>:P\<in\>\<Omega\>> with <math|\<Omega\>> as in
  (<reference|eqn:1.omegalinear>). As is shown in Section 3, by Theorem
  <reference|th:2.linear> the constrained problem is in fact reduced to an
  unconstrained one.

  Also, for the problem of testing whether <math|P> belongs to
  <math|\<Omega\>>, our results include the asymptotic distribution of the
  test statistic under any <math|P> in the alternative, proving consistency
  of the procedure, a result that is not addressed in the current literature
  on Generalized Empirical Likelihood.

  In Section 3 we will apply the above results to the case of a test of fit,
  where <math|\<Omega\>=<around*|{|P<rsub|0>|}>> is a fixed p.m.

  When <math|\<Omega\>\<cap\>\<Lambda\><rsub|n>> is empty, some smoothing
  techniques have been proposed, following <cite|Beran77>, substituting
  <math|P<rsub|n>> by some regularized version; see
  <cite|MoralesPardoVajda1997>. In those cases we have chosen not to perform
  any smoothing, exploiting instead the dual representation in a parametric
  context. Section 4 addresses this approach through the study of
  contamination models, for a composite problem, when the contamination
  modifies a distribution with unknown parameter.

  <section|The definition of the estimator><label|sec:2.estimator>

  <subsection|Some properties of <math|\<chi\><rsup|2>->distance><label|subsec:2.duality>

  We will consider sets <math|\<Omega\>> of signed measures with total mass 1
  that integrate some class of functions <math|\<Phi\>>. The choice of
  <math|\<Phi\>> depends on the context as seen below. Let

  <\equation>
    M<rsub|\<Phi\>>\<assign\><around*|{|Q\<in\>M*<text| such that
    \ ><big|int><around|\||\<varphi\>|\|>*d<around|\||Q|\|>\<less\>\<infty\>,<text| for
    all >\<varphi\>\<in\>\<Phi\>|}>.<label|eqn:2.M>
  </equation>

  We first consider sufficient conditions for the existence of
  <math|Q<rsup|\<ast\>>>, the projection of <math|P> on <math|\<Omega\>>. We
  introduce the following notation.

  Let <math|\<b-Phi\>=\<Phi\>\<cup\>\<cal-B\><rsub|b>>, where
  <math|\<cal-B\><rsub|b>> is the class of all measurable bounded functions
  on <math|\<bbb-R\><rsup|d>>. Let <math|\<tau\><rsub|\<b-Phi\>>> be the
  coarsest topology on <math|M> which makes all mappings
  <math|Q\<longmapsto\><big|int>\<varphi\>*d*Q> continuous for all
  <math|\<varphi\>\<in\>\<b-Phi\>>. When <math|\<b-Phi\>> is restricted to
  <math|\<cal-B\><rsub|b>>, the <math|\<tau\><rsub|\<b-Phi\>>> topology turns
  out to be the usual <math|\<tau\>->topology (see e.g. <cite|GOR1979>).

  Assume that for all functions <math|\<varphi\>> in <math|\<Phi\>> there
  exists some positive <math|\<varepsilon\>> with

  <\equation*>
    <big|int>\<varphi\><rsup|2+\<varepsilon\>>*d*P\<less\>\<infty\>.
  </equation*>

  Whenever <math|\<Omega\>> is a closed set in <math|M<rsub|\<Phi\>>>
  equipped with the <math|\<tau\><rsub|\<b-Phi\>>> topology and
  <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>> is finite, then, as a
  consequence of Theorem 2.3 in <cite|Broniatowski-Keziou2003>, <math|P> has
  a projection in <math|\<Omega\>>. Moreover, when <math|\<Omega\>> is
  convex, uniqueness is achieved.

  In statistical applications the set <math|\<Omega\>> is often defined
  through some statistical functional; for example, let <math|\<Omega\>> be
  defined as in (<reference|eqn:1.omegalinear>). In this case
  <math|\<Phi\>\<assign\><around*|{|f|}>> and <math|\<Omega\>> is closed by
  the very definition of <math|\<Phi\>>; therefore the choice of the class of
  functions <math|\<Phi\>> is intimately connected with the set
  <math|\<Omega\>> under consideration. As seen in Section 4, and as
  developed in <cite|Broniatowski-Keziou2003> also when <math|\<Omega\>> is a
  subset of some parametric family of distributions, the class <math|\<Phi\>>
  can be defined with respect to <math|\<Omega\>>.

  We first provide a characterization of the
  <math|\<chi\><rsup|2>->projection of a p.m. <math|P> on some set
  <math|\<Omega\>> in <math|M>.

  Let <math|\<cal-D\>> denote the domain of the divergence for fixed
  <math|P>, namely

  <\equation*>
    \<cal-D\>=<around*|{|Q\<in\>M*<text|such that
    \ >\<chi\><rsup|2><around|(|Q,P|)>\<less\>\<infty\>|}>.
  </equation*>

  We have (see <cite|BroniatowskiKeziou2003b>, Theorem 2.6)

  <\thm>
    <label|th:2.characterization>Let <math|\<Omega\>> be a subset of
    <math|M>. Then

    <\enumerate>
      <item>If there exists some <math|Q<rsup|\<ast\>>> in <math|\<Omega\>>
      such that for all <math|Q> in <math|\<Omega\>\<cap\>\<cal-D\>>,

      <\equation*>
        q<rsup|\<ast\>>\<in\>L<rsub|1><around|(|Q|)>*<text|and
        ><big|int>q<rsup|\<ast\>>*d*Q<rsup|\<ast\>>\<leq\><big|int>q<rsup|\<ast\>>*d*Q
      </equation*>

      where <math|q<rsup|\<ast\>>=<frac|d*Q<rsup|\<ast\>>|d*P>>, then
      <math|Q<rsup|\<ast\>>> is the <math|\<chi\><rsup|2>->projection of
      <math|P> on <math|\<Omega\>>.

      <item>If <math|\<Omega\>> is convex and <math|P> has projection
      <math|Q<rsup|\<ast\>>> on <math|\<Omega\>>, then, for all <math|Q> in
      <math|\<Omega\>>, <math|q<rsup|\<ast\>>> belongs to
      <math|L<rsub|1><around|(|Q|)>> and <math|<big|int>q<rsup|\<ast\>>*d*Q<rsup|\<ast\>>\<leq\><big|int>q<rsup|\<ast\>>*d*Q>.
    </enumerate>
  </thm>

  Many statistically relevant problems in estimation and testing pertain to
  models defined by linear constraints (Empirical Likelihood paradigm and
  others). Section 3 is devoted to this case. We therefore present a
  characterization result for the <math|\<chi\><rsup|2>->projection on sets
  of measures defined by linear constraints.

  Let <math|\<Phi\>> be a collection (finite or infinite, countable or not)
  of real-valued functions defined on <math|\<bbb-R\><rsup|d>>, which we
  assume to contain the function 1. Let the subset <math|\<Omega\>> of
  <math|M> be defined by

  <\equation*>
    \<Omega\>=<around*|{|Q\<in\>M*<text| such that
    \ ><big|int>g*d*Q=0*<space|0.27em><text| for all
    \ ><space|0.27em>g*<text|in >\<Phi\>-<around|{|1|}>|}>.
  </equation*>

  Denote by <math|\<less\>\<Phi\>\<gtr\>> the linear span of <math|\<Phi\>>.

  We then have the following result (see <cite|BroniatowskiKeziou2003b>):

  <\thm>
    <label|th:2.linear>

    <\enumerate>
      <item><math|P> has a projection <math|Q<rsup|\<ast\>>> in
      <math|\<Omega\>> iff <math|Q<rsup|\<ast\>>> belongs to <math|\<Omega\>>
      and for all <math|Q\<in\>\<Omega\>>,
      <math|q<rsup|\<ast\>>\<in\>L<rsub|1><around|(|Q|)>> and
      <math|<big|int>q<rsup|\<ast\>>*d*Q<rsup|\<ast\>>\<leq\><big|int>q<rsup|\<ast\>>*d*Q>.

      <item>If <math|q<rsup|\<ast\>>> belongs to
      <math|\<less\>\<Phi\>\<gtr\>> and <math|Q<rsup|\<ast\>>> belongs to
      <math|\<Omega\>>, then <math|Q<rsup|\<ast\>>> is the projection of
      <math|P> on <math|\<Omega\>>.

      <item>If <math|P> has projection <math|Q<rsup|\<ast\>>> on
      <math|\<Omega\>>, then <math|q<rsup|\<ast\>>> belongs to
      <math|<wide|\<less\>\<Phi\>\<gtr\>|\<bar\>>>, the closure of
      <math|\<less\>\<Phi\>\<gtr\>> in <math|L<rsub|1><around|(|Q<rsup|\<ast\>>|)>>.
    </enumerate>
  </thm>

  <\rem>
    <label|rem:2.counterexample>The above result only provides a partial
    answer to the characterization of the projections. Let <math|P> be the
    uniform distribution on <math|<around|[|0,1|]>>. The set
    <math|M<rsub|1><around|(|P|)>> of all p.m.'s absolutely continuous with
    respect to <math|P> is a closed subset of <math|M<rsub|\<Phi\>>>, when
    <math|\<Phi\>\<assign\><around*|{|x\<mapsto\>x|}>\<cup\><around*|{|x\<mapsto\>1|}>>.
    Let <math|\<Omega\>\<assign\><around*|{|Q\<in\>M<rsub|1><around|(|P|)>:<space|0.27em><big|int>x*d*Q<around|(|x|)>=<frac|1|4>|}>>.
    Then <math|P> has a projection on <math|\<Omega\>> and
    <math|<frac|d*Q<rsup|\<ast\>>|d*P><around|(|x|)>*<with|font-size|1.41|1><rsub|<around*|{|q<rsup|\<ast\>>\<gtr\>0|}>><around|(|x|)>=c<rsub|0>+c<rsub|1>*x>,
    with <math|q<rsup|\<ast\>>=<frac|d*Q<rsup|\<ast\>>|d*P>>. The support of
    <math|Q<rsup|\<ast\>>> is strictly included in <math|<around|[|0,1|]>>.
    Otherwise we obtain <math|c<rsub|0>=<frac|5|2>> and <math|c<rsub|1>=-3>,
    a contradiction, since then <math|Q<rsup|\<ast\>>> is not a probability
    measure.
  </rem>
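
  As an illustrative check of the constants in Remark
  <reference|rem:2.counterexample> (a sketch, not part of the cited results),
  suppose that <math|q<rsup|\<ast\>>=c<rsub|0>+c<rsub|1>*x> on the whole of
  <math|<around|[|0,1|]>>. The total-mass and mean constraints then read

  <\equation*>
    <big|int><rsub|0><rsup|1><around*|(|c<rsub|0>+c<rsub|1>*x|)>*d*x=c<rsub|0>+<frac|c<rsub|1>|2>=1,<space|2em><big|int><rsub|0><rsup|1>x*<around*|(|c<rsub|0>+c<rsub|1>*x|)>*d*x=<frac|c<rsub|0>|2>+<frac|c<rsub|1>|3>=<frac|1|4>,
  </equation*>

  whose unique solution is <math|c<rsub|0>=<frac|5|2>>,
  <math|c<rsub|1>=-3>. The resulting function <math|<frac|5|2>-3*x> is
  negative for <math|x\<gtr\><frac|5|6>>, so it cannot be the density of a
  probability measure.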

  <subsection|An alternative version of the
  <math|\<chi\><rsup|2>>><label|subsec:3.duality>

  The <math|\<chi\><rsup|2>> distance defined on <math|M> for fixed <math|P>
  in <math|M<rsub|1>> through <math|\<chi\><rsup|2><around|(|Q,P|)>=<big|int><around*|(|<frac|d*Q|d*P>-1|)><rsup|2>*d*P>
  is a convex function; as such it is the upper envelope of its support
  hyperplanes. The first result, which is Proposition 2.1 in
  <cite|Broniatowski-Keziou2003>, provides the description of the hyperplanes
  in <math|M<rsub|\<Phi\>>>.

  <\prop>
    <label|th:2.hausdorff>Equip <math|M<rsub|\<Phi\>>> with the
    <math|\<tau\><rsub|\<b-Phi\>>->topology. Then <math|M<rsub|\<Phi\>>> is a
    Hausdorff locally convex topological space. Further, the topological dual
    space of <math|M<rsub|\<Phi\>>> is the set of all mappings
    <math|Q\<mapsto\><big|int>f*d*Q> when <math|f> belongs to
    <math|\<less\>\<b-Phi\>\<gtr\>>.
  </prop>

  Proposition 2.3 in <cite|Broniatowski-Keziou2003> asserts that the
  <math|\<chi\><rsup|2>> distance defined on <math|M> for fixed <math|P> in
  <math|M<rsub|1>> is l.s.c. in <math|<around*|(|M<rsub|\<Phi\>>,\<tau\><rsub|\<b-Phi\>>|)>>.
  We can now state the duality lemma.

  Define on <math|\<less\>\<b-Phi\>\<gtr\>>, the Fenchel-Legendre transform
  of <math|\<chi\><rsup|2><around|(|\<cdummy\>,P|)>>

  <\equation>
    <label|eqn:T(f,P)>T<around|(|f,P|)>\<assign\>sup<rsub|Q\<in\>M<rsub|\<Phi\>>>
    <big|int>f*d*Q-\<chi\><rsup|2><around|(|Q,P|)>.
  </equation>

  We have

  <\lem>
    <label|th:2.dualitylemma>The function
    <math|Q\<longmapsto\>\<chi\><rsup|2><around|(|Q,P|)>> admits the
    representation

    <\equation>
      \<chi\><rsup|2><around|(|Q,P|)>=sup<rsub|f\<in\>\<less\>\<b-Phi\>\<gtr\>>
      <big|int>f*d*Q-T<around|(|f,P|)>.<label|eqn:2.dualrepres>
    </equation>
  </lem>

  Standard optimization techniques yield

  <\equation*>
    T<around|(|f,P|)>=<big|int>f*d*P+<frac|1|4>*<big|int>f<rsup|2>*d*P
  </equation*>

  for all <math|f\<in\>\<less\>\<b-Phi\>\<gtr\>>, see e.g. <cite|Aze1997>,
  Chapter 4.

  The function <math|f<rsup|\<ast\>>=2*<around*|(|<frac|d*Q<rsup|\<ast\>>|d*P>-1|)>>
  attains the supremum in (<reference|eqn:2.dualrepres>) for
  <math|Q=Q<rsup|\<ast\>>>, as can be seen through classical convex
  optimization procedures.
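
  A heuristic version of these computations (a sketch only, not the rigorous
  argument of <cite|Aze1997>; the total-mass constraint on <math|Q> is not
  imposed here) runs as follows. Writing <math|q=<frac|d*Q|d*P>>, the
  quantity to maximize in (<reference|eqn:T(f,P)>) is
  <math|<big|int><around*|(|f*q-<around|(|q-1|)><rsup|2>|)>*d*P>, and the
  integrand is maximized pointwise at <math|q=1+<frac|f|2>>, which gives

  <\equation*>
    f*<around*|(|1+<frac|f|2>|)>-<frac|f<rsup|2>|4>=f+<frac|1|4>*f<rsup|2>.
  </equation*>

  Conversely, for fixed <math|q>, the map
  <math|f\<mapsto\>f*q-f-<frac|1|4>*f<rsup|2>> is maximized at
  <math|f=2*<around|(|q-1|)>>, where it equals
  <math|<around|(|q-1|)><rsup|2>>; integrating with respect to <math|P>
  recovers <math|\<chi\><rsup|2><around|(|Q,P|)>>.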

  We now consider a subclass <math|\<cal-F\>> in
  <math|\<less\>\<b-Phi\>\<gtr\>> and we assume:

  <\itemize>
    <item*|(C1)><math|f<rsup|\<ast\>>> belongs to
    <math|\<cal-F\>>.<label|(C1)>
  </itemize>

  Therefore

  <\equation*>
    \<chi\><rsup|2><around|(|Q,P|)>=sup<rsub|f\<in\>\<cal-F\>>
    <big|int>f*d*Q-T<around|(|f,P|)>
  </equation*>

  which we call the <em|dual representation> of the <math|\<chi\><rsup|2>>.

  This can be restated as follows: let

  <\equation*>
    m<rsub|f><around|(|x|)>\<assign\><big|int>f*d*Q-<around*|(|f<around|(|x|)>+<frac|1|4>*f<rsup|2><around|(|x|)>|)>.
  </equation*>

  Then

  <\equation>
    <label|eqn:2.chi2dual>\<chi\><rsup|2><around|(|Q,P|)>=sup<rsub|f\<in\>\<cal-F\>>
    <big|int>m<rsub|f><around|(|x|)>*d*P<around|(|x|)>.
  </equation>

  Hence we have

  <\equation>
    <label|eqn:2.chi2Omegadual>\<chi\><rsup|2><around|(|\<Omega\>,P|)>=inf<rsub|Q\<in\>\<Omega\>>
    sup<rsub|f\<in\>\<cal-F\>> <big|int>m<rsub|f><around|(|x|)>*d*P<around|(|x|)>.
  </equation>

  In the case when <math|\<Omega\>> is defined through a finite number of
  linear constraints, say

  <\equation*>
    \<Omega\>=<around*|{|Q\<in\>M:<space|0.27em><big|int>f<rsub|i><around|(|x|)>*d*Q<around|(|x|)>=a<rsub|i>,<space|0.27em>1\<leq\>i\<leq\>k|}>,
  </equation*>

  when <math|P> has a projection <math|Q<rsup|\<ast\>>> on <math|\<Omega\>>
  and <math|<text|supp><around|{|Q<rsup|\<ast\>>|}>> is known to coincide with
  that of <math|P>, then we may choose <math|\<cal-F\>> as the linear span of
  <math|<around|{|1,f<rsub|1>,\<ldots\>,f<rsub|k>|}>> and
  (<reference|eqn:2.chi2Omegadual>) turns out to be a parametric
  unconstrained optimization problem, since, by Theorem
  <reference|th:2.linear> (3)

  <\equation*>
    \<chi\><rsup|2><around|(|\<Omega\>,P|)>=sup<rsub|c<rsub|0>,c<rsub|1>,\<ldots\>,c<rsub|k>>
    c<rsub|0>+<big|sum><rsub|i=1><rsup|k>c<rsub|i>*a<rsub|i>-T*<around*|(|c<rsub|0>+<big|sum><rsub|i=1><rsup|k>c<rsub|i>*f<rsub|i>,P|)>.
  </equation*>
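
  As an illustration (a sketch under the additional assumption, not made
  above, that the matrix <math|G> defined below is invertible; the symbols
  <math|<wide*|g|\<bar\>>>, <math|<wide*|c|\<bar\>>>,
  <math|<wide*|b|\<bar\>>> and <math|G> are notation introduced only for
  this computation), write <math|<wide*|g|\<bar\>>=<around|(|1,f<rsub|1>,\<ldots\>,f<rsub|k>|)><rsup|\<top\>>>,
  <math|<wide*|c|\<bar\>>=<around|(|c<rsub|0>,c<rsub|1>,\<ldots\>,c<rsub|k>|)><rsup|\<top\>>>,
  <math|<wide*|b|\<bar\>>=<around*|(|0,a<rsub|1>-<big|int>f<rsub|1>*d*P,\<ldots\>,a<rsub|k>-<big|int>f<rsub|k>*d*P|)><rsup|\<top\>>>
  and <math|G=<big|int><wide*|g|\<bar\>>*<wide*|g|\<bar\>><rsup|\<top\>>*d*P>.
  Using the expression of <math|T<around|(|f,P|)>> given above, the function
  to be maximized is a concave quadratic form in <math|<wide*|c|\<bar\>>>,

  <\equation*>
    <wide*|c|\<bar\>><rsup|\<top\>>*<wide*|b|\<bar\>>-<frac|1|4>*<wide*|c|\<bar\>><rsup|\<top\>>*G*<wide*|c|\<bar\>>,
  </equation*>

  which is maximized at the solution of the linear system
  <math|G*<wide*|c|\<bar\>>=2*<wide*|b|\<bar\>>>, with maximal value
  <math|<wide*|b|\<bar\>><rsup|\<top\>>*G<rsup|-1>*<wide*|b|\<bar\>>>.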

  In some other cases we may have a complete description of all functions
  <math|<frac|d*Q|d*P>> when <math|Q> belongs to <math|\<Omega\>>. A typical
  example is when <math|P> and <math|Q> belong to parametric families.

  <subsection|The estimator <math|\<chi\><rsub|n><rsup|2>>><label|subsec:2.estimator>

  Let us now present the estimate of <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>>.

  Given an i.i.d. sample <math|X<rsub|1>,\<ldots\>,X<rsub|n>> with
  common unknown distribution <math|P>, define the estimate of
  <math|\<chi\><rsup|2><around|(|Q,P|)>> through

  <\equation>
    \<chi\><rsub|n><rsup|2><around|(|Q,P|)>\<assign\>sup<rsub|f\<in\>\<cal-F\>>
    <big|int>m<rsub|f><around|(|x|)>*d*P<rsub|n><around|(|x|)><label|eqn:2.def>
  </equation>

  a plug-in version of (<reference|eqn:2.chi2dual>).

  We also define the estimate of <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>>
  through

  <\equation>
    \<chi\><rsub|n><rsup|2><around|(|\<Omega\>,P|)>\<assign\>inf<rsub|Q\<in\>\<Omega\>>
    sup<rsub|f\<in\>\<cal-F\>> <big|int>m<rsub|f><around|(|x|)>*d*P<rsub|n><around|(|x|)>.<label|estim
    chi2>
  </equation>

  These estimates may seem cumbersome. However, in the case when we are able
  to reduce the class <math|\<cal-F\>> to a reasonable degree of complexity,
  these estimates perform quite well and can be used for testing
  <math|P\<in\>\<Omega\>> against <math|P\<nin\>\<Omega\>>. This will be made
  clear in the last two sections which serve as examples for the present
  approach.

  In some cases it is possible to commute the <math|sup> and the <math|inf>
  operators in (<reference|eqn:2.chi2Omegadual>), which then becomes

  <\equation>
    \<chi\><rsup|2><around|(|\<Omega\>,P|)>=sup<rsub|f\<in\>\<cal-F\>>
    inf<rsub|Q\<in\>\<Omega\>> <big|int>f*d*Q-T<around|(|f,P|)>,<label|eqn:2.chi2>
  </equation>

  in which the <math|inf> operator acts only on the linear functional
  <math|<big|int>f*d*Q>.

  Also, when (<reference|eqn:2.chi2>) holds, we may define an estimate of
  <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>> through

  <\equation>
    <label|eqn:2.chi><wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>=sup<rsub|f\<in\>\<cal-F\>>
    inf<rsub|Q\<in\>\<Omega\>> <big|int>f*d*Q-T<around|(|f,P<rsub|n>|)>.
  </equation>

  When (<reference|eqn:2.chi2>) holds, it is quite easy to get the limit
  properties of <math|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2>>.

  Indeed, by (<reference|eqn:2.chi2>) and (<reference|eqn:2.chi>)

  <align*|<tformat|<table|<row|<cell|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>-\<chi\><rsup|2><around|(|\<Omega\>,P|)>>|<cell|=<around*|(|sup<rsub|f\<in\>\<cal-F\>>
  inf<rsub|Q\<in\>\<Omega\>> <big|int>f*d*Q-T<around|(|f,P<rsub|n>|)>|)>-<around*|(|sup<rsub|f\<in\>\<cal-F\>>
  inf<rsub|Q\<in\>\<Omega\>> <big|int>f*d*Q-T<around|(|f,P|)>|)>.>>>>>

  Now define

  <\equation*>
    \<phi\><rsub|R><around|(|f|)>\<assign\>inf<rsub|Q\<in\>\<Omega\>>
    <big|int>f*d*Q-T<around|(|f,R|)>=inf<rsub|Q\<in\>\<Omega\>>
    <big|int>f*d*Q-<big|int><around*|(|f+<frac|1|4>*f<rsup|2>|)>*d*R,
  </equation*>

  a concave function of <math|f>.

  When <math|\<cal-F\>> is compact in a topology for which
  <math|\<phi\><rsub|R>> is uniformly continuous for all <math|R> in
  <math|M<rsub|1>>, then a sufficient condition for the a.s. convergence of
  <math|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>> to
  <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>> is

  <\equation*>
    lim<rsub|n\<rightarrow\>\<infty\>> sup<rsub|f\<in\>\<cal-F\>><around*|\||\<phi\><rsub|P<rsub|n>><around|(|f|)>-\<phi\><rsub|P><around|(|f|)>|\|>=0*<space|0.27em><space|0.27em>a.*s.
  </equation*>

  which in turn is

  <\equation*>
    lim<rsub|n\<rightarrow\>\<infty\>> sup<rsub|f\<in\>\<cal-F\>><around*|\||<big|int><around*|(|f+<frac|1|4>*f<rsup|2>|)>*d*P<rsub|n>-<big|int><around*|(|f+<frac|1|4>*f<rsup|2>|)>*d*P|\|>=0*<space|0.27em><space|0.27em>a.*s.
  </equation*>

  This clearly holds when the class of functions
  <math|<around*|{|<around*|(|f+<frac|1|4>*f<rsup|2>|)>,<space|0.27em>f\<in\>\<cal-F\>|}>>
  satisfies the functional Glivenko-Cantelli (GC) condition (see
  <cite|Pollard84>).

  The limit distribution of the statistic
  <math|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>> under
  <math|H<rsub|1>>, i.e. when <math|P> does not belong to <math|\<Omega\>>, can be
  obtained under the following hypotheses, following closely the proof of
  Theorem 3.6 in <cite|Broniatowski2002>, where a similar result is proved
  for the Kullback-Leibler divergence estimate.

  Assume

  <\itemize>
    <item*|(C2)><label|(C2)><math|P> has a unique projection
    <math|Q<rsup|\<ast\>>> on <math|\<Omega\>>.

    <item*|(C3)><label|(C3)>The class <math|\<cal-F\>> is compact in the
    sup-norm.

    <item*|(C4)><label|(C4)>The class <math|<around*|{|f+<frac|1|4>*f<rsup|2>,<space|0.27em>f\<in\>\<cal-F\>|}>>
    is a functional Donsker class.
  </itemize>

  We then have

  <\thm>
    <label|th:2.weakconv>Under <math|H<rsub|1>>, assume that <em|(C1)>--<em|(C4)>
    hold. The asymptotic distribution of

    <\equation*>
      <sqrt|n>*<around*|(|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>-\<chi\><rsup|2><around|(|\<Omega\>,P|)>|)>
    </equation*>

    is that of <math|B<rsub|P><around|(|g<rsup|\<ast\>>|)>>, where
    <math|B<rsub|P><around|(|\<cdummy\>|)>> is the <math|P->Brownian bridge
    defined on <math|\<cal-F\>>, and <math|g<rsup|\<ast\>>=-f<rsup|\<ast\>>-<frac|1|4>*f<rsup|\<ast\>><rsup|2>>.
  </thm>

  Therefore <math|<sqrt|n>*<around*|(|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>-\<chi\><rsup|2><around|(|\<Omega\>,P|)>|)>>
  has an asymptotic centered normal distribution with variance
  <math|E<rsub|P><around*|(|<around*|(|f<rsup|\<ast\>>+<frac|1|4>*f<rsup|\<ast\>><rsup|2>|)><rsup|2><around|(|X|)>|)>-E<rsub|P><around*|(|<around*|(|-f<rsup|\<ast\>>-<frac|1|4>*f<rsup|\<ast\>><rsup|2>|)><around|(|X|)>|)><rsup|2>>,
  where <math|X> has law <math|P>.

  The asymptotic distribution of <math|<wide|\<chi\>|\<bar\>><rsub|n><rsup|2>>
  under <math|H<rsub|0>>, i.e. when <math|P> belongs to <math|\<Omega\>>,
  cannot be obtained in a general framework and must be derived according to
  the context.

  In the next sections we develop two applications of the above statements.
  In the first one we consider sets <math|\<Omega\>> defined by an infinite
  number of linear constraints. We approximate <math|\<Omega\>> through some
  sieve technique and provide a consistent test for
  <math|H<rsub|0>:P\<in\>\<Omega\>>. We specialize this problem to the
  two-sample test for paired data. So, in this first application, we basically
  use the representation of the projection <math|Q<rsup|\<ast\>>> of <math|P>
  on linear sets as described through Theorem <reference|th:2.linear>. In
  this first range of applications we will project <math|P<rsub|n>> on the
  nonempty set <math|\<Omega\>\<cap\>\<Lambda\><rsub|n>>.

  The second application deals with parametric models and tests for
  contamination. We obtain a consistent test for the case when
  <math|\<Omega\>> is a set of parametrized distributions
  <math|F<rsub|\<theta\>>> for <math|\<theta\>> in
  <math|\<Theta\>\<subset\>\<bbb-R\><rsup|d>>. The test is

  <\equation*>
    H<rsub|0>:<space|0.27em>P\<in\>\<Omega\>=<around*|{|F<rsub|\<theta\>>,<space|0.22em>\<theta\>\<in\>\<Theta\>|}>,<space|0.22em><text|i.e.><space|0.27em>\<lambda\>=0

    <space|1em><text|vs><space|1em>H<rsub|1>:<space|0.22em>P\<in\><around|{|<around|(|1-\<lambda\>|)>*F<rsub|\<theta\>>+\<lambda\>*R,<space|0.17em>\<lambda\>\<neq\>0,<space|0.17em>\<theta\>\<in\>\<Theta\>|}>.
  </equation*>

  In this example we project <math|P<rsub|n>> on a set of absolutely
  continuous distributions and we make use of the minimax assumption
  (<reference|eqn:2.chi2>) which we prove to hold.

  <section|Test of a set of linear constraints><label|sec:3.linear>

  Let <math|\<cal-F\>> be a countable family of real-valued functions defined
  on <math|\<bbb-R\><rsup|d>>, <math|<around|{|a<rsub|i>|}><rsub|i=1><rsup|\<infty\>>>
  a real sequence and

  <\equation>
    \<Omega\>\<assign\><around*|{|Q\<in\>M*<text|such that
    \ ><big|int>f<rsub|i><around|(|x|)>*d*Q<around|(|x|)>=a<rsub|i>,<space|0.27em>i\<geq\>1|}>.<label|eqn:3.omega>
  </equation>

  We assume that <math|\<Omega\>> is not empty. In accordance with the
  previous section we assume that the function <math|f<rsub|0>\<assign\>1>
  belongs to <math|\<cal-F\>> with <math|a<rsub|0>=1>.

  Let <math|X<rsub|1>,\<ldots\>,X<rsub|n>> be an i.i.d. sample with common
  distribution <math|P>.

  We intend to propose a test for <math|H<rsub|0>:P\<in\>\<Omega\>> vs
  <math|H<rsub|1>:P\<nin\>\<Omega\>>.

  We first consider the case when <math|\<cal-F\>> is a finite collection of
  functions, and next extend our results to the infinite case.

  For notational convenience we write <math|P*f> for <math|<big|int>f*d*P>
  whenever defined.

  <subsection|Finite number of linear constraints><label|subsec:3.finite>

  Consider the set <math|\<Omega\>> defined in (<reference|eqn:3.omega>) with
  <math|c*a*r*d<around|{|\<cal-F\>|}>=k>. Introduce the estimate of
  <math|\<chi\><rsup|2><around|(|\<Omega\>,P|)>> through

  <\equation>
    \<chi\><rsub|n><rsup|2><around|(|\<Omega\>,P|)>=inf<rsub|Q\<in\>\<Omega\>\<cap\>\<Lambda\><rsub|n>>
    \<chi\><rsup|2><around|(|Q,P<rsub|n>|)>.<label|estim lin>
  </equation>

  Embedding the projection device in <math|M\<cap\>\<Lambda\><rsub|n>>
  instead of <math|M<rsub|1>\<cap\>\<Lambda\><rsub|n>> yields a simple
  solution for the optimum in (<reference|estim lin>), since no inequality
  constraints are needed. The topological context is also simpler than in
  the previous section, since the projection of <math|P<rsub|n>>
  belongs to <math|\<bbb-R\><rsup|n>>. When developed in
  <math|M<rsub|1>\<cap\>\<Lambda\><rsub|n>> this approach is known as the
  Generalized Empirical Likelihood (GEL) paradigm (see <cite|MR2002302>). Our
  approach differs from the latter through the use of the dual representation
  (<reference|estim chi2>), which provides consistency of the test procedure.

  It is readily checked that

  <\equation*>
    \<chi\><rsub|n><rsup|2><around|(|\<Omega\>,P|)>=<wide|\<chi\>|\<bar\>><rsub|n><rsup|2><around|(|\<Omega\>,P|)>
  </equation*>

  The set <math|\<Omega\>\<cap\>\<Lambda\><rsub|n>> is a convex closed subset
  in <math|\<bbb-R\><rsup|n>>. Hence, when the projection of <math|P<rsub|n>> on
  <math|\<Omega\>\<cap\>\<Lambda\><rsub|n>> exists, it is unique.
  In the next section we develop various properties of our estimates,
  which are based on the duality formula (<reference|eqn:2.def>).

  The next subsections provide all limit properties of
  <math|\<chi\><rsub|n><rsup|2><around|(|\<Omega\>,P|)>>.

  <subsubsection|Notation and basic properties>

  Let <math|Q<rsub|0>> be any fixed measure in <math|\<Omega\>>. By
  (<reference|eqn:2.chi2Omegadual>)

  <align|<tformat|<table|<row|<cell|\<chi\><rsup|2><around*|(|\<Omega\>,P|)>>|<cell|=sup<rsub|f\<in\>\<less\>\<cal-F\>\<gtr\>><around*|(|Q<rsub|0>-P|)>*f-<frac|1|4>*P*f<rsup|2>*<no-number>>>|<row|<cell|>|<cell|=sup<rsub|a<rsub|0>,a<rsub|1,>*\<ldots\>,a<rsub|k>>
  <big|sum><rsub|i=1><rsup|k>a<rsub|i>*<around*|(|Q<rsub|0>-P|)>*f<rsub|i>-<frac|1|4>*P*<around*|(|<big|sum><rsub|i=1><rsup|k>a<rsub|i>*f<rsub|i>+a<rsub|0>|)><rsup|2>.<label|linear-chi>>>>>>

  since, for <math|Q> in <math|\<Omega\>> and for all <math|f> in
  <math|\<cal-F\>>, <math|Q*f=Q<rsub|0>*f> and

  <\equation*>
    \<chi\><rsup|2><rsub|n>=sup<rsub|a<rsub|0>,a<rsub|1,>*\<ldots\>,a<rsub|k>>
    <big|sum><rsub|i=1><rsup|k>a<rsub|i>*<around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|i>-<frac|1|4>*P<rsub|n>*<around*|(|<big|sum><rsub|i=1><rsup|k>a<rsub|i>*f<rsub|i>+a<rsub|0>|)><rsup|2>.
  </equation*>

  The infinite dimensional optimization problem in
  (<reference|eqn:2.chi2Omegadual>) thus reduces to a
  <math|<around|(|k+1|)>->dimensional one, much easier to handle.

  We can write the chi-square and <math|\<chi\><rsub|n><rsup|2>> through a
  quadratic form.

  Define the vectors <math|<wide*|\<nu\>|\<bar\>><rsub|n>> and
  <math|<wide*|\<nu\>|\<bar\>>> by

  <align|<tformat|<table|<row|<cell|<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>>|<cell|=<wide*|\<nu\>|\<bar\>><around|(|\<cal-F\>,P<rsub|n>|)><rprime|'>=<around*|{|<around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|1>,\<ldots\>,<around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|k>|}><label|nugamma>>>|<row|<cell|<wide*|\<nu\>|\<bar\>><rprime|'>>|<cell|=<wide*|\<nu\>|\<bar\>><around|(|\<cal-F\>,P|)><rprime|'>=<around*|{|<around*|(|Q<rsub|0>-P|)>*f<rsub|1>,\<ldots\>,<around*|(|Q<rsub|0>-P|)>*f<rsub|k>|}><no-number>>>|<row|<cell|<wide*|\<gamma\>|\<bar\>><rsub|n>>|<cell|=<wide*|\<gamma\>|\<bar\>><rsub|n><around|(|\<cal-F\>|)>=<sqrt|n>*<around*|{|<around*|(|P<rsub|n>-P|)>*f<rsub|1>,\<ldots\>,<around*|(|P<rsub|n>-P|)>*f<rsub|k>|}>=<sqrt|n>*<around*|(|<wide*|\<nu\>|\<bar\>>-<wide*|\<nu\>|\<bar\>><rsub|n>|)>.*<no-number>>>>>>

  Let <math|S> be the covariance matrix of
  <math|<wide*|\<gamma\>|\<bar\>><rsub|n>>. Write <math|S<rsub|n>> for the
  empirical version of <math|S>, obtained by substituting <math|P<rsub|n>>
  for <math|P> in all entries of <math|S>.

  <\prop>
    <label|th:matrixform>Let <math|\<Omega\>> be as in
    <math|<around*|(|<reference|eqn:3.omega>|)>> and let
    <math|c*a*r*d<around*|{|\<cal-F\>|}>> be finite. We then have

    <\enumerate>
      <item*|(i)><math|\<chi\><rsub|n><rsup|2>=<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsub|n><rsup|-1><wide*|\<nu\>|\<bar\>><rsub|n>>

      <item*|(ii)><math|\<chi\><rsup|2><around*|(|\<Omega\>,P|)>=<wide*|\<nu\>|\<bar\>><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>>>
    </enumerate>
  </prop>

  <\proof>
    <\enumerate>
      <item*|(i)>Differentiating the function in (<reference|linear-chi>)
      with respect to <math|a<rsub|s>>, <math|s=0,1,\<ldots\>,k> yields

      <\equation>
        a<rsub|0>=-<big|sum><rsub|i=1><rsup|k>a<rsub|i>*P<rsub|n>*f<rsub|i><label|a0>
      </equation>

      for <math|s=0>, while for <math|s\<gtr\>0>

      <\equation>
        <around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|s>=<frac|1|2>*<around*|(|a<rsub|0>*P<rsub|n>*f<rsub|s>+<big|sum><rsub|i=1><rsup|k>a<rsub|i>*P<rsub|n>*f<rsub|i>*f<rsub|s>|)>.<label|eqn:3.a>
      </equation>

      Substituting <math|<around*|(|<reference|a0>|)>> in the last display,

      <\equation*>
        <around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|s>=<frac|1|2>*<big|sum><rsub|i=1><rsup|k>a<rsub|i>*<around*|(|P<rsub|n>*f<rsub|i>*f<rsub|s>-P<rsub|n>*f<rsub|s>*P<rsub|n>*f<rsub|i>|)>,
      </equation*>

      i.e.

      <\equation>
        2<wide*|\<nu\>|\<bar\>><rsub|n>=S<rsub|n><wide*|a|\<bar\>><label|a>
      </equation>

      where <math|<wide*|a|\<bar\>><rprime|'>=<around*|{|a<rsub|1>,a<rsub|2>,\<ldots\>,a<rsub|k>|}>>.

      Set <math|f<rsub|n><rsup|\<ast\>>=arg
      max<rsub|\<less\>\<cal-F\>\<gtr\>><around|(|Q<rsub|0>-P<rsub|n>|)>*f-<frac|1|4>*P<rsub|n>*f<rsup|2>>.
      For every <math|h\<in\>\<less\>\<cal-F\>\<gtr\>>,
      <math|<around*|(|Q<rsub|0>-P<rsub|n>|)>*h-<frac|1|2>*P<rsub|n>*h*f<rsub|n><rsup|\<ast\>>=0>
      . Set <math|h\<assign\>f<rsub|n><rsup|\<ast\>>> to obtain
      <math|<around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|n><rsup|\<ast\>>=<frac|1|2>*P<rsub|n><around|(|f<rsub|n><rsup|\<ast\>>|)><rsup|2>>.

      It then follows, using <math|<around*|(|<reference|a0>|)>> and
      <math|<around*|(|<reference|a>|)>>,

      <align*|<tformat|<table|<row|<cell|\<chi\><rsub|n><rsup|2>>|<cell|=<around*|[|<around*|(|Q<rsub|0>-P<rsub|n>|)>*f<rsub|n><rsup|\<ast\>>-<frac|1|4>*P<rsub|n><around|(|f<rsub|n><rsup|\<ast\>>|)><rsup|2>|]>=<frac|1|4>*P<rsub|n><around|(|f<rsub|n><rsup|\<ast\>>|)><rsup|2>=>>|<row|<cell|>|<cell|=<frac|1|4>*P<rsub|n>*<around*|(|<big|sum><rsub|i=1><rsup|k>a<rsub|i>*f<rsub|i>-<big|sum><rsub|i=1><rsup|k>a<rsub|i>*P<rsub|n>*f<rsub|i>|)><rsup|2>=>>|<row|<cell|>|<cell|=<frac|1|4><wide*|a|\<bar\>><rprime|'>S<rsub|n><wide*|a|\<bar\>>=<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsub|n><rsup|-1><wide*|\<nu\>|\<bar\>><rsub|n>.>>>>>

      <item*|(ii)>The proof is similar to the above one.
    </enumerate>
  </proof>
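
  As a minimal illustration of Proposition <reference|th:matrixform>,
  consider a single constraint, <math|k=1>, and assume that
  <math|f<rsub|1>> has positive variance under <math|P> and under
  <math|P<rsub|n>>. Then <math|<wide*|\<nu\>|\<bar\>>> and <math|S> are
  scalars and the two quadratic forms reduce to

  <\equation*>
    \<chi\><rsup|2><around*|(|\<Omega\>,P|)>=<frac|<around*|(|Q<rsub|0>*f<rsub|1>-P*f<rsub|1>|)><rsup|2>|P*f<rsub|1><rsup|2>-<around*|(|P*f<rsub|1>|)><rsup|2>>,<space|1em>\<chi\><rsub|n><rsup|2>=<frac|<around*|(|Q<rsub|0>*f<rsub|1>-P<rsub|n>*f<rsub|1>|)><rsup|2>|P<rsub|n>*f<rsub|1><rsup|2>-<around*|(|P<rsub|n>*f<rsub|1>|)><rsup|2>>.
  </equation*>

  In this case <math|n*\<chi\><rsub|n><rsup|2>> is the square of the
  familiar studentized statistic for testing that the mean of
  <math|f<rsub|1><around|(|X|)>> equals <math|Q<rsub|0>*f<rsub|1>>, with
  the variance estimated using the divisor <math|n>.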

  <subsubsection|Almost sure convergence>

  Call an envelope for <math|\<cal-F\>> a function <math|F> such that
  <math|<around*|\||f|\|>\<leq\>F> for all <math|f> in <math|\<cal-F\>>.

  <\thm>
    <label|th:qc2>Assume that <math|\<chi\><rsup|2><around*|(|\<Omega\>,P|)>>
    is finite. Let <math|\<cal-F\>> be a finite class of functions as in
    (<reference|eqn:3.omega>) with an envelope function <math|F> such that
    <math|P*F<rsup|2>\<less\>\<infty\>>.

    Then <math|<around*|\||\<chi\><rsub|n><rsup|2>-\<chi\><rsup|2><around*|(|\<Omega\>,P|)>|\|>\<rightarrow\>0>,
    <math|P-a.*s>.
  </thm>

  <\proof>
    From Proposition <reference|th:matrixform>,

    <align*|<tformat|<table|<row|<cell|<around*|\||\<chi\><rsub|n><rsup|2>-\<chi\><rsup|2><around*|(|\<Omega\>,P|)>|\|>>|<cell|=<around*|\||<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsub|n><rsup|-1><wide*|\<nu\>|\<bar\>><rsub|n>-<wide*|\<nu\>|\<bar\>><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>>|\|>>>|<row|<cell|>|<cell|\<leq\><around*|\||<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'><around*|(|S<rsub|n><rsup|-1>-S<rsup|-1>|)><wide*|\<nu\>|\<bar\>><rsub|n>|\|>+<around*|\||<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>><rsub|n>-<wide*|\<nu\>|\<bar\>><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>>|\|>.>>>>>

    For <math|<wide*|x|\<bar\>>> in <math|\<bbb-R\><rsup|k>> denote by
    <math|<around*|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>>> the Euclidean norm.
    On the space of <math|k\<times\>k> matrices introduce the algebraic
    norm <math|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||A|\|><space|-0.17em>|\|><space|-0.17em>|\|>=sup<rsub|<around*|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>>\<leq\>1>
    <frac|<around*|\<\|\|\>|A<wide*|x|\<bar\>>|\<\|\|\>>|<around*|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>>>=sup<rsub|<around*|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>>=1><around*|\<\|\|\>|A<wide*|x|\<bar\>>|\<\|\|\>>>.
    All entries of <math|A> satisfy <math|<around*|\||a<around*|(|i,j|)>|\|>\<leq\><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||A|\|><space|-0.17em>|\|><space|-0.17em>|\|>>.
    Moreover, if <math|A> is symmetric and <math|<around*|\||\<lambda\><rsub|1>|\|>\<leq\><around*|\||\<lambda\><rsub|2>|\|>\<leq\>\<ldots\>\<leq\><around*|\||\<lambda\><rsub|k>|\|>>
    are its eigenvalues, then <math|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||A|\|><space|-0.17em>|\|><space|-0.17em>|\|>=<around*|\||\<lambda\><rsub|k>|\|>>.
    Observe further that, if for all <math|<around*|(|i,j|)>>,
    <math|<around*|\||a<around*|(|i,j|)>|\|>\<leq\>\<varepsilon\>>, then, for
    any <math|<wide*|x|\<bar\>>\<in\>\<bbb-R\><rsup|k>>, such that
    <math|<around*|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>>=1>,
    <math|<around*|\<\|\|\>|A<wide*|x|\<bar\>>|\<\|\|\>><rsup|2>=<big|sum><rsub|i=1><rsup|k><around*|(|<big|sum><rsub|j>a<around*|(|i,j|)>*x<rsub|j>|)><rsup|2>\<leq\><big|sum><rsub|i><big|sum><rsub|j>a<around*|(|i,j|)><rsup|2><around*|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>><rsup|2>\<leq\>k<rsup|2>*\<varepsilon\><rsup|2>>,
    i.e. <math|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||A|\|><space|-0.17em>|\|><space|-0.17em>|\|>\<leq\>k*\<varepsilon\>>.

    For the first term in the RHS of the above display

    <align*|<tformat|<table|<row|<cell|A>|<cell|\<assign\><wide*|\<nu\>|\<bar\>><rsub|n><rprime|'><around*|(|S<rsub|n><rsup|-1>-S<rsup|-1>|)><wide*|\<nu\>|\<bar\>><rsub|n>=<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1/2>*<around*|(|S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|)>*S<rsup|-1/2><wide*|\<nu\>|\<bar\>><rsub|n>>>|<row|<cell|>|<cell|\<leq\><around*|\<\|\|\>|<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1/2>|\<\|\|\>><rsup|2><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>\<leq\><text|cst.>*<space|0.22em>k<space|0.22em><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>.>>>>>

    Hence if <math|B\<assign\><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>>
    tends to <math|0> a.s., so does <math|A>.

    First note that

    <align*|<tformat|<table|<row|<cell|S<rsub|n><rsup|-1>>|<cell|=<around*|(|S+S<rsub|n>-S|)><rsup|-1>=S<rsup|-1/2>*<around*|(|I+S<rsup|-1/2>*<around*|(|S<rsub|n>-S|)>*S<rsup|-1/2>|)><rsup|-1>*S<rsup|-1/2>=>>|<row|<cell|>|<cell|=S<rsup|-1/2>*<around*|[|I+<big|sum><rsub|h=1><rsup|\<infty\>><around*|(|S<rsup|-1/2>*<around*|(|S-S<rsub|n>|)>*S<rsup|-1/2>|)><rsup|h>|]>*S<rsup|-1/2>.>>>>>

    Hence

    <\equation*>
      S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I=<big|sum><rsub|h=1><rsup|\<infty\>><around*|(|S<rsup|-1/2>*<around*|(|S-S<rsub|n>|)>*S<rsup|-1/2>|)><rsup|h>,
    </equation*>

    which entails

    <align*|<tformat|<table|<row|<cell|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>>|<cell|=<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||<big|sum><rsub|h=1><rsup|\<infty\>><around*|(|S<rsup|-1/2>*<around*|(|S-S<rsub|n>|)>*S<rsup|-1/2>|)><rsup|h>|\|><space|-0.17em>|\|><space|-0.17em>|\|>\<leq\><big|sum><rsub|h=1><rsup|\<infty\>><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S-S<rsub|n>|\|><space|-0.17em>|\|><space|-0.17em>|\|><rsup|h><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|-1/2>|\|><space|-0.17em>|\|><space|-0.17em>|\|><rsup|2*h>>>|<row|<cell|>|<cell|=O<rsub|P>*<around*|(|\<lambda\><rsub|1><rsup|-1>*k*sup<rsub|i,j><around*|\||s<rsub|n><around*|(|i,j|)>-s<around*|(|i,j|)>|\|>|)>,>>>>>

    where <math|\<lambda\><rsub|1>> is the smallest eigenvalue of <math|S>.

    Since

    <\equation>
      C\<assign\>sup<rsub|i,j><around*|\||s<rsub|n><around*|(|i,j|)>-s<around*|(|i,j|)>|\|>\<leq\>sup<rsub|i,j><around*|\||<around*|(|P<rsub|n>-P|)>*f<rsub|i>*f<rsub|j>|\|>+sup<rsub|i><around*|\||<around*|(|P<rsub|n>-P|)>*f<rsub|i>|\|>*<around*|\||<around*|(|P<rsub|n>+P|)>*F|\|><label|technical
      2>
    </equation>

    the LLN implies that <math|C> tends to 0 a.s. which in turn implies that
    <math|B> tends to 0.

    Now consider the second term. <math|<around*|\||<wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>><rsub|n>-<wide*|\<nu\>|\<bar\>><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>>|\|>=<around*|\||<around*|(|<wide*|\<nu\>|\<bar\>><rsub|n>+<wide*|\<nu\>|\<bar\>>|)><rprime|'>*S<rsup|-1><around*|(|n<rsup|-1/2><wide*|\<gamma\>|\<bar\>><rsub|n>|)>|\|>>
    tends to 0 by LLN.
  </proof>
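
  The order bound for <math|B> used in the proof above can be made
  explicit. As a sketch, write <math|x<rsub|n>\<assign\>\<lambda\><rsub|1><rsup|-1>*<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S-S<rsub|n>|\|><space|-0.17em>|\|><space|-0.17em>|\|>>
  (a notation introduced only for this illustration) and assume that
  <math|x<rsub|n>\<less\>1>, which holds almost surely for all large
  <math|n> since <math|C> tends to 0. Using
  <math|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|-1/2>|\|><space|-0.17em>|\|><space|-0.17em>|\|><rsup|2>=\<lambda\><rsub|1><rsup|-1>>
  and summing the geometric series,

  <\equation*>
    B\<leq\><big|sum><rsub|h=1><rsup|\<infty\>>x<rsub|n><rsup|h>=<frac|x<rsub|n>|1-x<rsub|n>>,<space|1em>x<rsub|n>\<leq\>\<lambda\><rsub|1><rsup|-1>*k*C,
  </equation*>

  where the last inequality follows from the entrywise bound on the
  algebraic norm recalled at the beginning of the proof.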

  <subsubsection|Asymptotic distribution of the test statistic>

  Write

  <\equation*>
    n*\<chi\><rsub|n><rsup|2>=<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1>*<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n>+<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n><rprime|'><around*|(|S<rsub|n><rsup|-1>-S<rsup|-1>|)>*<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n>.
  </equation*>

  We then have

  <\thm>
    <label|th:dlf>Let <math|\<Omega\>> be defined by
    <math|<around*|(|<reference|eqn:3.omega>|)>> and <math|\<cal-F\>> be a
    finite class of linearly independent functions with envelope function
    <math|F> such that <math|P*F<rsup|2>\<less\>\<infty\>>. Set
    <math|k=c*a*r*d<around|{|\<cal-F\>|}>>. Then, under <math|H<rsub|0>>,

    <\equation*>
      n*\<chi\><rsub|n><rsup|2><above|\<longrightarrow\>|d>\<chi\><rsup|2><around*|(|k|)>
    </equation*>

    where <math|\<chi\><rsup|2><around*|(|k|)>> denotes a chi-square distribution with
    <math|k> degrees of freedom.
  </thm>

  <\proof>
    For <math|P> in <math|\<Omega\>>, <math|<wide*|\<nu\>|\<bar\>>=0> and hence
    <math|<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n>=-<wide*|\<gamma\>|\<bar\>><rsub|n>>.
    Therefore <math|n*\<chi\><rsub|n><rsup|2>=<wide*|\<gamma\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n>+<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n><rprime|'><around*|(|S<rsub|n><rsup|-1>-S<rsup|-1>|)>*<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n>>.

    By the multivariate CLT, <math|<wide*|\<gamma\>|\<bar\>><rsub|n>>
    converges in distribution to a centered Gaussian vector with covariance
    matrix <math|S>. By continuity of the mapping <math|h<around*|(|<wide*|y|\<bar\>>|)>=<wide*|y|\<bar\>><rprime|'>S<rsup|-1><wide*|y|\<bar\>>>,
    <math|<wide*|\<gamma\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n>>
    therefore has a limiting chi-square distribution with <math|k> degrees of
    freedom.

    It remains to prove that the second term is negligible. Indeed, arguing
    as in the proof of Theorem <reference|th:qc2>,

    <\equation*>
      <around*|(|<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n>|)><rprime|'>*<around*|(|S<rsub|n><rsup|-1>-S<rsup|-1>|)><around*|(|<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n>|)>\<leq\><around*|\<\|\|\>|<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1/2>|\<\|\|\>><rsup|2><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>
    </equation*>

    where <math|<around*|\<\|\|\>|<sqrt|n><wide*|\<nu\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1/2>|\<\|\|\>><rsup|2>=O<rsub|P><around*|(|1|)>>
    under <math|H<rsub|0>>. Hence it is enough to show that <math|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>>
    is <math|o<rsub|P><around*|(|1|)>>. This follows from
    (<reference|technical 2>).
  </proof>
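
  Theorem <reference|th:dlf> suggests the following decision rule, given
  here only as a sketch: reject <math|H<rsub|0>> at asymptotic level
  <math|\<alpha\>> when

  <\equation*>
    n*\<chi\><rsub|n><rsup|2>\<gtr\>q<rsub|k,1-\<alpha\>>,
  </equation*>

  where <math|q<rsub|k,1-\<alpha\>>> (a notation introduced only for this
  illustration) is the <math|<around|(|1-\<alpha\>|)>>-quantile of the
  chi-square distribution with <math|k> degrees of freedom. For instance,
  with <math|k=2> the limiting law is exponential with mean 2, so that
  <math|q<rsub|2,1-\<alpha\>>=-2*log \<alpha\>>, which gives
  <math|q<rsub|2,0.95>\<approx\>5.99> for <math|\<alpha\>=0.05>.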

  The asymptotic behavior of <math|\<chi\><rsub|n><rsup|2>> under
  <math|H<rsub|1>> is captured by

  <align|<tformat|<table|<row|<cell|<label|consistencyH1><sqrt|n>*<around*|(|\<chi\><rsub|n><rsup|2>-\<chi\><rsup|2>|)>>|<cell|=-2<wide*|\<gamma\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1><wide*|\<nu\>|\<bar\>>+<sqrt|n><wide*|\<nu\>|\<bar\>><rprime|'>S<rsup|-1/2>*<around*|(|S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|)>*S<rsup|-1/2><wide*|\<nu\>|\<bar\>><no-number>>>|<row|<cell|>|<cell|-2<wide*|\<gamma\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1/2>*<around*|(|S<rsup|1/2>*S<rsub|n><rsup|-1>*S<rsup|1/2>-I|)>*S<rsup|-1/2><wide*|\<nu\>|\<bar\>>+n<rsup|-1/2><wide*|\<gamma\>|\<bar\>><rsub|n><rprime|'>S<rsup|-1><rsub|n><wide*|\<gamma\>|\<bar\>><rsub|n>.>>>>>

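  Since <math|\<chi\><rsup|2><around*|(|\<Omega\>,P|)>\<gtr\>0> under
  <math|H<rsub|1>>, this shows, together with Theorem <reference|th:qc2>,
  that <math|n*\<chi\><rsub|n><rsup|2>> tends to infinity under
  <math|H<rsub|1>>, i.e. the test based on <math|n*\<chi\><rsub|n><rsup|2>> is
  asymptotically consistent.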

  <subsection|Infinite number of linear constraints, an approach by sieves>

  In various cases <math|\<Omega\>> is defined through a countable collection
  of linear constraints. An example is presented in Section 3.3. Suppose thus
  that <math|\<Omega\>> is defined as in <math|<around*|(|<reference|eqn:3.omega>|)>>,
  with <math|\<cal-F\>> an infinite class of functions

  <\equation*>
    \<cal-F\>=<around*|{|f<rsub|\<alpha\>>:\<bbb-R\><rsup|d>\<rightarrow\>\<bbb-R\>,\<alpha\>\<in\>A|}>
  </equation*>

  where <with|font-shape|slanted|A> <math|\<subseteq\>\<bbb-R\>> is a
  countable set of indices and <math|c*a*r*d<around*|(|\<cal-F\>|)>=c*a*r*d<around*|(|A|)>=\<infty\>>.
  Thus <math|\<Omega\>=<around*|{|Q\<in\>M:<space|0.17em>Q*f=Q<rsub|0>*f,<space|0.27em>f\<in\>\<cal-F\>|}>>,
  for some <math|Q<rsub|0>> in <math|M>.

  Assume that the projection <math|Q<rsup|\<ast\>>> exists in
  <math|\<Omega\>>. Then, by Theorem <reference|th:2.linear>

  <\equation*>
    f<rsup|\<ast\>>\<in\>c*l<rsub|L<rsub|1><around*|(|Q<rsup|\<ast\>>|)>>*<around*|(|\<less\>\<cal-F\>\<gtr\>|)>.
  </equation*>

  We approximate <math|\<cal-F\>> through a suitable increasing sequence of
  classes of functions <math|\<cal-F\><rsub|n>>, with finite cardinality
  <math|k=k<around|(|n|)>> increasing with <math|n>. Each
  <math|\<cal-F\><rsub|n>> induces a set <math|\<Omega\><rsub|n>> that
  contains <math|\<Omega\>>, since fewer constraints are imposed.

  Define therefore <math|<around*|{|\<cal-F\><rsub|n>|}><rsub|n\<geq\>1>>
  such that

  <align|<tformat|<table|<row|<cell|\<cal-F\><rsub|n>>|<cell|\<subseteq\>\<cal-F\><rsub|n+1>\<subset\>\<cal-F\>,<space|0.17em><text|for
  \ all \ \ >n\<geq\>1<label|effe>>>|<row|<cell|<space|1em>\<cal-F\>>|<cell|=<big|cup><rsub|n\<geq\>1>\<cal-F\><rsub|n><label|union>>>>>>

  and

  <\equation*>
    \<Omega\><rsub|n>=<around*|{|Q:Q*f=Q<rsub|0>*f,f\<in\>\<cal-F\><rsub|n>|}>.
  </equation*>

  We thus have <math|\<Omega\><rsub|n>\<supseteq\>\<Omega\><rsub|n+1>,<space|0.17em>n\<geq\>1>
  and <math|\<Omega\>=<big|cap><rsub|n\<geq\>1>\<Omega\><rsub|n>>.

  The idea of determining the projection of a measure <math|P> on a set
  <math|\<Omega\>> through an approximating sequence of sets (a sieve) was
  introduced in this setting in <nbsp><cite|TV93>.

  <\thm>
    [Teboulle-Vajda, 1993] With the above notation, define
    <math|Q<rsub|n><rsup|\<ast\>>> as the projection of <math|P> on
    <math|\<Omega\><rsub|n>>. Suppose that the above assumptions on
    <math|<around*|{|\<Omega\><rsub|n>|}><rsub|n\<geq\>1>> hold and that
    <math|\<Omega\><rsub|n>\<supseteq\>\<Omega\>> for each <math|n\<geq\>1>.
    Then

    <\equation>
      lim<rsub|n\<rightarrow\>\<infty\>><around*|\||f<rsup|\<ast\>>-f<rsub|n><rsup|\<ast\>>|\|><rsub|L<rsub|1><around*|(|P|)>>=lim<rsub|n\<rightarrow\>\<infty\>><around*|\||<frac|d*Q<rsup|\<ast\>>|d*P>-<frac|d*Q<rsub|n><rsup|\<ast\>>|d*P>|\|><rsub|L<rsub|1><around*|(|P|)>>=0.<label|vaj>
    </equation>
  </thm>

  By Scheffe's Lemma this is equivalent to
  <math|lim<rsub|n\<rightarrow\>\<infty\>>
  d<rsub|v*a*r><around|(|Q<rsub|n><rsup|\<ast\>>,Q<rsup|\<ast\>>|)>=0> where
  <math|d<rsub|v*a*r><around|(|Q,P|)>\<assign\>sup<rsub|A\<in\>\<cal-B\><around|(|\<bbb-R\><rsup|d>|)>><around*|\||Q<around|(|A|)>-P<around|(|A|)>|\|>>
  is the variation distance between the probability measures <math|P> and <math|Q>. When
  <math|sup<rsub|f\<in\>\<cal-F\>> sup<rsub|x>
  f<around|(|x|)>\<less\>\<infty\>> then (<reference|vaj>) implies

  <\equation>
    lim<rsub|n\<rightarrow\>\<infty\>> \<chi\><rsup|2><around*|(|\<Omega\><rsub|n>,P|)>=\<chi\><rsup|2><around*|(|\<Omega\>,P|)>.<label|sieve>
  </equation>

  The above result states that we can build a sequence of estimators of
  <math|\<chi\><rsup|2><around*|(|\<Omega\>,P|)>> letting
  <math|k=k<around|(|n|)>> grow to infinity together with <math|n>. Define

  <\equation*>
    \<chi\><rsub|n,k><rsup|2>=sup<rsub|f\<in\>\<less\>\<cal-F\><rsub|n>\<gtr\>><around*|(|Q<rsub|0>-P<rsub|n>|)>*f-<frac|1|4>*P<rsub|n>*f<rsup|2>.
  </equation*>

  In the following section we consider conditions on <math|k<around|(|n|)>>
  entailing the asymptotic normality of the suitably normalized sequence of
  estimates <math|\<chi\><rsub|n,k><rsup|2>> when <math|P> belongs to
  <math|\<Omega\>>, i.e. under <math|H<rsub|0>>.

  <subsubsection|Convergence in distribution under <math|H<rsub|0>>>

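  As a consequence of Theorem <reference|th:dlf>, when
  <math|k=k<around|(|n|)>> tends to infinity,
  <math|n*\<chi\><rsub|n,k><rsup|2>> tends to infinity in probability as
  <math|n\<rightarrow\>\<infty\>>, even under <math|H<rsub|0>>, so that it
  needs to be centered and rescaled.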

  We consider the statistic

  <\equation>
    <frac|n*\<chi\><rsub|n,k><rsup|2>-k|<sqrt|2*k>><label|chi-standard>
  </equation>

  which will be seen to have a nondegenerate limiting distribution as
  <math|k<around|(|n|)>> tends to infinity together with <math|n>.
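
  The centering and scaling in (<reference|chi-standard>) are those of a
  chi-square variable. As a short check, if
  <math|Z<rsub|1>,\<ldots\>,Z<rsub|k>> are i.i.d. standard normal random
  variables, then <math|E*Z<rsub|i><rsup|2>=1> and
  <math|V*a*r*Z<rsub|i><rsup|2>=E*Z<rsub|i><rsup|4>-1=2>, so that

  <\equation*>
    E*<big|sum><rsub|i=1><rsup|k>Z<rsub|i><rsup|2>=k,<space|1em>V*a*r*<big|sum><rsub|i=1><rsup|k>Z<rsub|i><rsup|2>=2*k.
  </equation*>

  Heuristically, for fixed <math|k> the statistic
  <math|n*\<chi\><rsub|n,k><rsup|2>> is close in law to such a sum by
  Theorem <reference|th:dlf>, which motivates subtracting <math|k> and
  dividing by <math|<sqrt|2*k>>.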

  As in <nbsp><cite|IKL93> and <nbsp><cite|IL90>, the proof
  of the asymptotic normality of (<reference|chi-standard>) relies mainly on
  the strong approximation of the empirical process. We briefly recall some
  useful notions.

  <\defn>
    A class of functions <math|\<cal-F\>> is <em|pregaussian> if there exists
    a version <math|B<rsub|P><rsup|0><around*|(|.|)>> of <math|P->Brownian
    bridges uniformly continuous in <math|\<ell\><rsup|\<infty\>><around*|(|\<cal-F\>|)>>,
    with respect to the metric <math|\<rho\><rsub|P><around*|(|f,g|)>=<around*|(|V*a*r<rsub|P>*<around*|\||f-g|\|>|)><rsup|1/2>>,
    where <math|\<ell\><rsup|\<infty\>><around*|(|\<cal-F\>|)>> is the Banach
    space of all functionals <math|H:\<cal-F\>\<rightarrow\>\<bbb-R\>>
    uniformly bounded and with norm <math|<around*|\||H|\|><rsub|\<cal-F\>>=sup<rsub|f\<in\>\<cal-F\>><around*|\||H<around*|(|f|)>|\|>>.
  </defn>

  For some <math|a\<gtr\>0>, let <math|\<delta\><rsub|n>> be a decreasing
  sequence with <math|\<delta\><rsub|n>=o<around*|(|n<rsup|-a>|)>>.

  <\defn>
    A class of functions <math|\<cal-F\>> is <em|Komlós-Major-Tusnády>
    (<em|KMT>) with respect to <math|P>, with rate <math|\<delta\><rsub|n>>
    <math|<around*|(|\<cal-F\>\<in\>K*M*T<around*|(|\<delta\><rsub|n>;P|)>|)>>
    iff it is pregaussian and there exists a version
    <math|B<rsub|n><rsup|0><around*|(|.|)>> of <math|P->Brownian bridges such
    that for any <math|t\<gtr\>0> it holds

    <\equation>
      Pr <around*|{|sup<rsub|f\<in\>\<cal-F\>><around*|\||<sqrt|n>*<around*|(|P<rsub|n>-P|)>*f-B<rsub|n><rsup|0><around*|(|f|)>|\|>\<geq\>\<delta\><rsub|n>*<around*|(|t+b*log
      n|)>|}>\<leq\>c*e<rsup|-\<theta\>*t>,<label|KMT>
    </equation>

    where the positive constants <math|b>, <math|c> and <math|\<theta\>>
    depend on <math|\<cal-F\>> only.
  </defn>

  We refer to <cite|BOR81>, <cite|Mas89>, <cite|BM89>, and <cite|Kol94> for
  classical and useful examples of KMT classes, together with
  calculations of rates; we will use the fact that a KMT class is also a
  Donsker class.

  From <math|<around*|(|<reference|KMT>|)>> and the Borel-Cantelli lemma it
  follows that

  <\equation>
    sup<rsub|f\<in\>\<cal-F\>><around*|\||\<gamma\><rsub|n><around|(|f|)>-B<rsub|n><rsup|0><around|(|f|)>|\|>=O*<around*|(|\<delta\><rsub|n>*log
    n|)><label|KMT2>
  </equation>

  a.s. where, with the same notation as in the finite case (see
  (<reference|nugamma>)), <math|\<gamma\><rsub|n><around|(|f|)>=<sqrt|n>*<around|(|P<rsub|n>-P|)>*f>
  is the empirical process indexed by <math|f\<in\>\<cal-F\>>.

  Let <math|<around*|{|\<cal-F\><rsub|n>|}><rsub|n\<geq\>1>> be a sequence of
  classes of linearly independent functions satisfying
  <math|<around*|(|<reference|effe>|)>>.

  For any <math|n>, let <math|<wide*|\<gamma\>|\<bar\>><rsub|n,k>=<wide*|\<gamma\>|\<bar\>><rsub|n><around|(|\<cal-F\><rsub|n>|)>>
  (resp. <math|<wide*|B|\<bar\>><rsub|n,k><rsup|0>>) be the <math|k->dimensional
  vector resulting from the projection of the empirical process
  <math|\<gamma\><rsub|n>> (resp. of the <math|P->Brownian bridge
  <math|B<rsub|n><rsup|0>>) defined on <math|\<cal-F\>> to the subset
  <math|\<cal-F\><rsub|n>>. Then, if <math|\<cal-F\><rsub|n>=<around*|{|f<rsub|1><rsup|<around|(|n|)>>,\<ldots\>,f<rsub|k><rsup|<around|(|n|)>>|}>>,
  <math|<wide*|\<gamma\>|\<bar\>><rsub|n,k>=<around*|{|\<gamma\><rsub|n><around|(|f<rsub|1><rsup|<around|(|n|)>>|)>,\<ldots\>,\<gamma\><rsub|n><around|(|f<rsub|k><rsup|<around|(|n|)>>|)>|}>>
  and <math|<wide*|B|\<bar\>><rsup|0><rsub|n,k>=<around*|{|B<rsub|n><rsup|0><around*|(|f<rsub|1><rsup|<around|(|n|)>>|)>,\<ldots\>,B<rsub|n><rsup|0><around*|(|f<rsub|k><rsup|<around|(|n|)>>|)>|}>>.
  Denote by <math|S<rsub|k>> the covariance matrix of the vector
  <math|<wide*|\<gamma\>|\<bar\>><rsub|n,k>> and <math|S<rsub|n,k>> its
  empirical covariance matrix. Let <math|\<lambda\><rsub|1,k>> be the
  smallest eigenvalue of <math|S<rsub|k>>.

  <\thm>
    <label|th:dli>Let <math|\<cal-F\>> have an envelope <math|F> and be
    <math|K*M*T<around*|(|\<delta\><rsub|n>;P|)>> for some sequence
    <math|\<delta\><rsub|n>*\<downarrow\>*0>. Define further a sequence
    <math|<around*|{|\<cal-F\><rsub|n>|}><rsub|n\<geq\>1>> of classes of
    linearly independent functions satisfying
    <math|<around*|(|<reference|effe>|)>>.

    Moreover, let <math|k> satisfy

    <align|<tformat|<table|<row|<cell|lim<rsub|n\<rightarrow\>\<infty\>>
    k<around|(|n|)>>|<cell|=\<infty\><no-number>>>|<row|<cell|lim<rsub|n\<rightarrow\>\<infty\>>
    \<lambda\><rsub|1,k><rsup|-1/2>*<space|0.17em>k<rsup|1/2>*\<delta\><rsub|n>*log
    n>|<cell|=0<label|B>>>|<row|<cell|lim<rsub|n\<rightarrow\>\<infty\>>
    \<lambda\><rsub|1,k><rsup|-1>*<space|0.17em>k<rsup|3/2>*n<rsup|-1/2>>|<cell|=0.<label|D>>>>>>

    Then under <math|H<rsub|0>>

    <\equation*>
      <frac|n*\<chi\><rsub|n,k><rsup|2>-k|<sqrt|2*k>><above|\<longrightarrow\>|d>N<around*|(|0,1|)>.
    </equation*>
  </thm>

  <\proof>
    By Proposition <reference|th:matrixform>

    <align*|<tformat|<table|<row|<cell|<frac|n*\<chi\><rsup|2><rsub|n,k>-k|<sqrt|2*k>>>|<cell|=<frac|<around*|(|<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1><wide*|B|\<bar\>><rsub|n,k><rsup|0>-k|<sqrt|2*k>>+2*<around*|(|2*k|)><rsup|-1/2><around*|(|<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>>>|<row|<cell|>|<cell|+<around*|(|2*k|)><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>>>|<row|<cell|>|<cell|+<around*|(|2*k|)><rsup|-1/2><wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'><around*|(|S<rsub|n,k><rsup|-1>-S<rsub|k><rsup|-1>|)><wide*|\<gamma\>|\<bar\>><rsub|n,k>>>|<row|<cell|>|<cell|=A+B+C+D.>>>>>

    The first term above can be written

    <\equation*>
      A=<frac|<around*|(|<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1><wide*|B|\<bar\>><rsub|n,k><rsup|0>-k|<sqrt|2*k>>=<frac|<big|sum><rsub|i=1><rsup|k><around*|(|Z<rsub|i><rsup|2>-E*Z<rsub|i><rsup|2>|)>|<sqrt|k*V*a*r*Z<rsub|i><rsup|2>>>
    </equation*>

    which converges weakly to the standard normal distribution by the CLT
    applied to the i.i.d. standard normal random variables <math|Z<rsub|i>>.

    As to the term <math|C>, it is straightforward that
    <math|C=o<around|(|B|)>>. From the proof of Theorem <reference|th:dlf>, <math|D>
    goes to zero if <math|\<lambda\><rsub|1,k><rsup|-1>*k<rsup|1/2><around*|(|sup<rsub|i,j><around*|\||s<rsub|n,k><around*|(|i,j|)>-s<rsub|k><around*|(|i,j|)>|\|>|)><around*|\<\|\|\>|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1/2>|\<\|\|\>><rsup|2>=o<rsub|P><around|(|1|)>>.
    Since, using (<reference|D>) and (<reference|effe>),
    <math|sup<rsub|i,j><around*|\||s<rsub|n,k><around*|(|i,j|)>-s<rsub|k><around*|(|i,j|)>|\|>\<leq\>sup<rsub|f,g\<in\>\<cal-F\>><around*|\||<around*|(|P<rsub|n>-P|)>*f*g|\|>>

    <math|+sup<rsub|f\<in\>\<cal-F\>><around*|\||<around*|(|P<rsub|n>-P|)>*f|\|>*<around*|\||<around*|(|P<rsub|n>+P|)>*F|\|>>
    <math|=O<rsub|P><around*|(|n<rsup|-1/2>|)>>, and considering that
    <math|<around*|\<\|\|\>|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1/2>|\<\|\|\>><rsup|2>=O<rsub|P><around|(|k|)>>,
    we are done.

    For B, <math|<around*|\||<around*|(|<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>|\|>=<around*|\||<around*|(|<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1/2>*S<rsub|k><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>|\|>>

    <math|\<leq\><around*|\<\|\|\>|<around*|(|<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)><rprime|'>*S<rsub|k><rsup|-1/2>|\<\|\|\>>*<around*|\<\|\|\>|S<rsub|k><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>|\<\|\|\>>>
    <math|=<sqrt|<big|sum><rsub|i=1><rsup|k>Z<rsub|i><rsup|2>>*<around*|\<\|\|\>|S<rsub|k><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>|\<\|\|\>>>
    where, as used in A, <math|Z<rsub|i><rsup|2>> are i.i.d. with a
    <math|\<chi\><rsup|2>> distribution with 1 df. Hence
    <math|<sqrt|<big|sum><rsub|i=1><rsup|k>Z<rsub|i><rsup|2>>=O<rsub|P><around*|(|k<rsup|1/2>|)>>.
    Further

    <align*|<tformat|<table|<row|<cell|<around*|\<\|\|\>|S<rsub|k><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|)>|\<\|\|\>>>|<cell|\<leq\><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsub|k><rsup|-1/2>|\|><space|-0.17em>|\|><space|-0.17em>|\|>\<cdot\><around*|\<\|\|\>|<wide*|\<gamma\>|\<bar\>><rsub|n,k>-<wide*|B|\<bar\>><rsub|n,k><rsup|0>|\<\|\|\>>>>|<row|<cell|>|<cell|\<leq\>\<lambda\><rsub|1,k><rsup|-1/2>*<space|0.17em>k<rsup|1/2>*<sqrt|<frac|1|k>*<big|sum><rsub|i=1><rsup|k><around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k><around|(|f<rsub|i>|)>-<wide*|B|\<bar\>><rsub|n,k><rsup|0><around|(|f<rsub|i>|)>|)><rsup|2>>>>|<row|<cell|>|<cell|\<leq\>\<lambda\><rsub|1,k><rsup|-1/2>*<space|0.17em>k<rsup|1/2>*sup<rsub|f\<in\>\<cal-F\>><around*|\||\<gamma\><rsub|n><around|(|f|)>-B<rsub|n><rsup|0><around|(|f|)>|\|>>>>>>

    from which <math|B=O<rsub|P>*<around*|(|\<lambda\><rsub|1,k><rsup|-1/2>*<space|0.17em>k<rsup|1/2>*\<delta\><rsub|n>*log
    n|)>=o<rsub|P><around*|(|1|)>> if (<reference|B>) holds. We have used the
    fact that <math|P> belongs to <math|\<Omega\>> in the last evaluation of
    <math|B>.
  </proof>

  <\rem>
    <label|rem:consistency>Under <math|H<rsub|1>>, using the relation
    <math|<wide*|\<nu\>|\<bar\>><rsub|n>=<wide*|\<nu\>|\<bar\>>-n<rsup|-1/2><wide*|\<gamma\>|\<bar\>><rsub|n,k>>,
    we can write

    <align*|<tformat|<table|<row|<cell|<frac|n*\<chi\><rsup|2><rsub|n,k>-k|<sqrt|2*k>>>|<cell|=<around|(|2*k|)><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>-k|)>+<around|(|2*k|)><rsup|-1/2>*<around*|(|n<wide*|\<nu\>|\<bar\>><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<nu\>|\<bar\>>-2*<sqrt|n><wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<nu\>|\<bar\>>|)>>>|<row|<cell|>|<cell|=<around|(|2*k|)><rsup|-1/2>*<around*|(|n<wide*|\<nu\>|\<bar\>><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<nu\>|\<bar\>>-2*<sqrt|n><wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<nu\>|\<bar\>>|)>+O<rsub|P><around|(|1|)>>>>>>

    where the <math|O<rsub|P><around|(|1|)>> term captures
    <math|<around|(|2*k|)><rsup|-1/2>*<around*|(|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>-k|)>>
    which coincides with the test statistic
    <math|<frac|n*\<chi\><rsup|2><rsub|n,k>-k|<sqrt|2*k>>> under <math|H*0>.
    We can bound the first term from below by

    <align*|<tformat|<table|<row|<cell|>|<cell|<around|(|2*k|)><rsup|-1/2>*n<wide*|\<nu\>|\<bar\>><rprime|'>S<rsub|k><rsup|-1><wide*|\<nu\>|\<bar\>><around*|(|1-O<rsub|P><around*|(|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsub|k><rsup|1/2>*S<rsub|n,k><rsup|-1>*S<rsub|k><rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>|)>|)>>>|<row|<cell|>|<cell|<space|1em>-<around|(|2*n/k|)><rsup|1/2><around*|\||<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1/2>|\|><around*|\||<wide*|\<nu\>|\<bar\>><rprime|'>S<rsub|k><rsup|-1/2>|\|>*<around*|(|1+O<rsub|P><around*|(|<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsub|k><rsup|1/2>*S<rsub|n,k><rsup|-1>*S<rsub|k><rsup|1/2>-I|\|><space|-0.17em>|\|><space|-0.17em>|\|>|)>|)>>>|<row|<cell|>|<cell|=O<rsub|P>*<around*|(|n*k<rsup|-1/2>|)>-O<rsub|P><around|(|n<rsup|1/2>|)>.>>>>>

    Hence, if (<reference|B>) and (<reference|D>) are satisfied, then the
    test is consistent also in the case of an infinite number of linear
    constraints.
  </rem>

  Both conditions (<reference|B>) and (<reference|D>) involve
  <math|\<lambda\><rsub|1,k>>, which cannot be estimated without further
  hypotheses on the structure of the class <math|\<cal-F\>>. However, for
  concrete problems, once <math|\<cal-F\>> is specified, it is possible to
  give bounds for <math|\<lambda\><rsub|1,k>> depending on <math|k>. This is
  shown in the next subsection for a particular class of goodness-of-fit
  tests.

  <subsection|Application: testing marginal distributions><label|sec:example>

  Let <math|P> be an unknown distribution on <math|\<bbb-R\><rsup|d>> with
  density bounded from below. We consider goodness-of-fit tests for the
  marginal distributions <math|P<rsub|1>,...,P<rsub|d>> of <math|P> on the
  basis of an i.i.d. sample <math|<around|(|X<rsub|1>,...,X<rsub|n>|)>>.

  Let thus <math|Q<rsub|1><rsup|0>,\<ldots\>,Q<rsub|d><rsup|0>> denote
  <math|d> distributions on <math|\<bbb-R\>>. The null hypothesis reads
  <math|H*0>: <math|P<rsub|j>=Q<rsub|j><rsup|0>> for <math|j=1,...,d>. That
  is to say that we simultaneously test goodness-of-fit of the marginal laws
  <math|P<rsub|1>,\<ldots\>,P<rsub|d>> to the laws
  <math|Q<rsub|1><rsup|0>,\<ldots\>,Q<rsub|d><rsup|0>>. Through the transform
  <math|P<rprime|'><around|(|y<rsub|1>,\<ldots\>,y<rsub|d>|)>=P<around*|(|<around*|(|Q<rsub|1><rsup|0>|)><rsup|-1><around*|(|y<rsub|1>|)>,\<ldots\>,<around*|(|Q<rsub|d><rsup|0>|)><rsup|-1><around*|(|y<rsub|d>|)>|)>>
  we can restrict the analysis to the case where all p.m.'s have support
  <math|<around|[|0,1|]><rsup|d>> and marginal laws uniform on
  <math|<around|[|0,1|]>> under H0. So without loss of generality we write
  <math|Q<rsub|0>> for the uniform distribution on <math|<around|[|0,1|]>>.

  P.J. Bickel, Y. Ritov and J.A. Wellner <cite|BRW91> focused on the
  estimation of linear functionals of the probability measure subject to the
  knowledge of the marginal laws in the case of r.v.'s with a.c.
  distribution, letting the number of cells grow to infinity.

  Define the class

  <\equation*>
    \<cal-F\>\<assign\><around*|{|<with|font-size|1.41|1><rsub|u,j>:<around*|[|0,1|]><rsup|d>\<longrightarrow\><around*|{|0,1|}>,<space|0.27em>j=1,\<ldots\>,d,<space|0.27em>u\<in\><around*|[|0,1|]>|}>
  </equation*>

  where <math|<with|font-size|1.41|1><rsub|u,j><around*|(|x<rsub|1>,\<ldots\>,x<rsub|d>|)>=1>
  if <math|x<rsub|j>\<leq\>u> and <math|<with|font-size|1.41|1><rsub|u,j><around*|(|x<rsub|1>,\<ldots\>,x<rsub|d>|)>=0>
  if <math|x<rsub|j>\<gtr\>u>.

  Let <math|\<Omega\>> be the set of all p.m.'s on
  <math|<around*|[|0,1|]><rsup|d>> with uniform marginals, i.e.

  <\equation>
    \<Omega\>=<around*|{|Q\<in\>M<rsub|1><around*|(|<around*|[|0,1|]><rsup|d>|)>*<text|
    such that \ >Q*f=<big|int><rsub|<around*|[|0,1|]><rsup|d>>f<around|(|x|)>*d*x<space|0.17em>,f\<in\>\<cal-F\>|}>.<label|Omega
    marge>
  </equation>

  This set <math|\<Omega\>> has the form of
  <math|<around*|(|<reference|eqn:3.omega>|)>>, where <math|\<cal-F\>> is the
  class of characteristic functions of intervals, which is a KMT class with
  rate <math|\<delta\><rsub|n>=n<rsup|-1/2*d>>
  (<math|\<delta\><rsub|n>=n<rsup|-1/2>> if <math|d=2>); see <cite|BM89>.

  We now build the family <math|\<cal-F\><rsub|n>> satisfying
  (<reference|effe>) and (<reference|union>).

  Let <math|m=m<around|(|n|)>> tend to <math|+\<infty\>> with <math|n>. Let
  <math|0\<less\>u<rsub|1>\<less\>\<ldots\>\<less\>u<rsub|m>\<less\>1> and
  <math|<around*|{|\<cal-U\><rsup|<around|(|n|)>>|}>> be the
  <math|m\<cdot\>d> points in <math|<around|[|0,1|]><rsup|d>> with
  coordinates in <math|<around|{|u<rsub|1>,\<ldots\>,u<rsub|m>|}>>.

  Let <math|\<cal-F\><rsub|n>> denote the class of characteristic functions
  of the <math|d->dimensional rectangles <math|<around|[|<wide*|0|\<bar\>>,<wide*|u|\<bar\>>|]>>
  for <math|<wide*|u|\<bar\>>\<in\>\<cal-U\><rsup|<around|(|n|)>>>. Hence
  <math|<text|card><around|{|\<cal-F\><rsub|n>|}>=k=m\<cdot\>d>.

  Namely,

  <\equation>
    \<cal-F\><rsub|n>=<around*|{|<with|font-size|1.41|1><rsub|u<rsub|i>,j>:<around*|[|0,1|]><rsup|d>\<rightarrow\><around*|{|0,1|}>,j=1,\<ldots\>,d*<text|,
    \ >u<rsub|i>\<in\><around*|(|0,1|)>,<space|0.17em>u<rsub|i>\<less\>u<rsub|i+1><space|0.17em>,i=1,\<ldots\>,m|}>,<label|effe>
  </equation>

  which satisfies <math|\<cal-F\><rsub|n>\<subseteq\>\<cal-F\><rsub|n+1>> for
  all <math|n\<geq\>1> (i.e. (<reference|effe>)) and
  <math|\<cal-F\>=<big|cup><rsub|n\<geq\>1>\<cal-F\><rsub|n>> (i.e.
  (<reference|union>)).

  The sequence <math|<around*|{|\<cal-F\><rsub|n>|}><rsub|n\<geq\>1>> and the
  class <math|\<cal-F\>> satisfy the conditions of Theorem
  <math|<reference|th:dli>>: <math|\<cal-F\>> (and consequently each
  <math|\<cal-F\><rsub|n>>) has envelope function <math|F=1> and
  <math|R*F<rsup|h>=1>, for all <math|R> in
  <math|M<rsub|1><around*|(|<around*|[|0,1|]><rsup|d>|)>> and <math|h> in
  <math|\<bbb-R\>>.

  In order to establish a lower bound for <math|\<lambda\><rsub|1,k>>, the
  smallest eigenvalue of <math|S<rsub|k>>, we will impose that the volumes of
  the cells in the grid defined by the <math|u<rsub|i><rsup|<around|(|n|)>>>
  do not shrink too rapidly to 0. Suppose that the intervals
  <math|<around|(|u<rsub|i>,u<rsub|i+1>|]>> are such that

  <\equation>
    0\<less\>lim<rsub|n\<rightarrow\>\<infty\>> inf min<rsub|i=1,...,m-1>
    k*<around*|(|u<rsub|i+1>-u<rsub|i>|)>\<leq\>lim<rsub|n\<rightarrow\>\<infty\>>
    sup max<rsub|i=1,...,m-1> k*<around*|(|u<rsub|i+1>-u<rsub|i>|)>\<less\>\<infty\>.<label|grid>
  </equation>
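
  As an illustration of (<reference|grid>), consider the equispaced grid
  <math|u<rsub|i>=i/<around|(|m+1|)>>, <math|i=1,\<ldots\>,m>; this
  particular grid is only an example and is not imposed in the text. Since
  <math|k=m\<cdot\>d>,

  <\equation*>
    k*<around*|(|u<rsub|i+1>-u<rsub|i>|)>=<frac|m*d|m+1>\<longrightarrow\>d<space|1em><text|as><space|0.27em>n\<rightarrow\>\<infty\>,
  </equation*>

  so both the lower and the upper limit in (<reference|grid>) equal
  <math|d>, and every spacing is of exact order <math|k<rsup|-1>>.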

  <\rem>
    The condition for the sequence <math|\<cal-F\><rsub|n>> to converge to
    <math|\<cal-F\>> coincides with (F2) and (F3) in <nbsp><cite|BRW91>.
  </rem>

  We first obtain an estimate for the eigenvalue <math|\<lambda\><rsub|1,k>>.
  The final result of this step is stated in Lemma <reference|lm:eigen2>
  below.

  Let <math|P> belong to <math|\<Omega\>>. Let us then write the matrix
  <math|S<rsub|k>>. We have <math|P*<with|font-size|1.41|1><rsub|u<rsub|i>,j>=Q<rsub|0>*<with|font-size|1.41|1><rsub|u<rsub|i>,j>=u<rsub|i>>
  for <math|i=1,...,m> and <math|j=1,...,d>. Moreover,
  <math|P*<with|font-size|1.41|1><rsub|u<rsub|i>,j>*<with|font-size|1.41|1><rsub|u<rsub|l>,h>=P*<around*|(|X<rsub|j>\<leq\>u<rsub|i>,<space|0.17em>X<rsub|h>\<leq\>u<rsub|l>|)>>,
  for every <math|h,j=1,...,d> and <math|l,i=1,\<ldots\>,m>. When <math|j=h>
  then <math|P*<with|font-size|1.41|1><rsub|u<rsub|i>,j>*<with|font-size|1.41|1><rsub|u<rsub|l>,j>=P*<around*|(|X<rsub|j>\<leq\>u<rsub|i>\<wedge\>u<rsub|l>|)>=u<rsub|i>\<wedge\>u<rsub|l>>.

  Consider for the vector of functions <math|f<rsub|j>> the following
  ordering

  <\equation*>
    <around*|(|f<rsub|1>,\<ldots\>,f<rsub|m>,f<rsub|m+1>,\<ldots\>,f<rsub|2*m>,\<ldots\>,f<rsub|<around*|(|d-1|)>*m+1>,\<ldots\>,f<rsub|d*m>|)>=<around*|(|<with|font-size|1.41|1><rsub|u<rsub|1>,1>,\<ldots\>,<with|font-size|1.41|1><rsub|u<rsub|m>,1>,<with|font-size|1.41|1><rsub|u<rsub|1>,2>,\<ldots\>,<with|font-size|1.41|1><rsub|u<rsub|m>,d>|)>.
  </equation*>

  The generic term of <math|S<rsub|k>> writes

  <align*|<tformat|<table|<row|<cell|s<rsub|k><around*|(|u,v|)>>|<cell|=s<rsub|k>*<around*|(|<around|(|j-1|)>*m+i,<around|(|h-1|)>*m+l|)>=P*<with|font-size|1.41|1><rsub|u<rsub|i>,j>*<with|font-size|1.41|1><rsub|u<rsub|l>,h>-P*<with|font-size|1.41|1><rsub|u<rsub|i>,j>*P*<with|font-size|1.41|1><rsub|u<rsub|l>,h>>>|<row|<cell|>|<\cell>
    =

    <\around*|{>
      <array|c|l*l|<tformat|<table|<row|<cell|u<rsub|i>-u<rsub|i><rsup|2><text|
      >,>|<cell|<text| if \ >j=h,<space|0.17em>i=l>>|<row|<cell|P*<around|(|X<rsub|j>\<leq\>u<rsub|i>\<wedge\>u<rsub|l>|)>-u<rsub|i>*u<rsub|l><text|
      >,>|<cell|<text| if \ >j=h,<space|0.17em>i\<neq\>l>>|<row|<cell|P*<around|(|X<rsub|j>\<leq\>u<rsub|i>,X<rsub|h>\<leq\>u<rsub|l>|)>-u<rsub|i>*u<rsub|l>,>|<cell|<text|
      if \ >j\<neq\>h>>>>>
    </around*|\<nobracket\>>
  </cell>>>>>
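
  As a simple illustration, assume (only for this example) that <math|P> is
  the uniform law on <math|<around|[|0,1|]><rsup|d>>, so that the
  coordinates <math|X<rsub|1>,\<ldots\>,X<rsub|d>> are independent. Then
  <math|P*<around|(|X<rsub|j>\<leq\>u<rsub|i>,X<rsub|h>\<leq\>u<rsub|l>|)>=u<rsub|i>*u<rsub|l>>
  for <math|j\<neq\>h>, all entries of <math|S<rsub|k>> with
  <math|j\<neq\>h> vanish, and <math|S<rsub|k>> is block diagonal with
  <math|d> identical <math|<around|(|m\<times\>m|)>> blocks whose entries are
  <math|u<rsub|i>\<wedge\>u<rsub|l>-u<rsub|i>*u<rsub|l>>.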

  We make use of the class of functions

  <align*|<tformat|<table|<row|<cell|\<cal-F\><rsub|n><rsup|\<delta\>>>|<cell|=<around*|{|f<rsub|j\<cdot\>i>-f<rsub|j*<around|(|i-1|)>>,<space|0.27em>i=1,\<ldots\>,m,<space|0.27em>j=1*\<ldots\>,d,<space|0.27em><space|0.27em>f<rsub|h>\<in\>\<cal-F\><rsub|n>,<space|0.27em>f<rsub|0>=0|}>>>|<row|<cell|>|<cell|=<around*|{|<with|font-size|1.41|1><rsub|A<rsub|i><rsup|j>>,<space|0.27em>i=1*\<ldots\>,m,<space|0.27em>j=1,\<ldots\>,d|}>.>>>>>

  In the above display the <math|<around|{|A<rsub|i><rsup|j>|}><rsub|i=1,\<ldots\>,m>>
  describe the partition of <math|<around|[|0,1|]>>, the support of the
  marginal distribution <math|P<rsub|j>>, induced by the vector
  <math|<around*|{|u<rsub|i>|}><rsub|i\<leq\>m>>. Namely we have for every
  <math|j=1,\<ldots\>,d>, <math|A<rsub|i><rsup|j>\<cap\>A<rsub|l><rsup|j>=\<emptyset\>>,
  <math|i\<neq\>l>, <math|\<cup\><rsub|i=1><rsup|m+1>A<rsub|i><rsup|j>=<around|[|0,1|]>>,
  with <math|A<rsub|m+1><rsup|j>=<around|[|u<rsub|m>,1|]>>, for all <math|j>.

  Let <math|S<rsub|k><rsup|\<delta\>>> be the covariance matrix of the vector
  <math|<wide*|\<gamma\>|\<bar\>><rsub|n><rsup|\<delta\>>=<wide*|\<gamma\>|\<bar\>><rsub|n><around|(|\<cal-F\><rsup|\<delta\>>|)>>
  and consider the vectors <math|<wide*|\<nu\>|\<bar\>><rsup|\<delta\>>> and
  <math|<wide*|\<nu\>|\<bar\>><rsub|n><rsup|\<delta\>>> defined as in
  (<reference|nugamma>).

  The matrix <math|S<rsub|k><rsup|\<delta\>>> has <math|<around|(|<around|(|j-1|)>*m+i,<around|(|h-1|)>*m+l|)>->th
  component equal to <math|P*<with|font-size|1.41|1><rsub|A<rsub|i><rsup|j>>*<with|font-size|1.41|1><rsub|A<rsub|l><rsup|h>>-P*<with|font-size|1.41|1><rsub|A<rsub|i><rsup|j>>*P*<with|font-size|1.41|1><rsub|A<rsub|l><rsup|h>>>,
  which is

  <\equation*>
    <\around*|{>
      <array|c|l*l|<tformat|<table|<row|<cell|p<rsub|i>-p<rsub|i><rsup|2>,>|<cell|<text|
      if \ >j=h,<space|0.17em>i=l>>|<row|<cell|-p<rsub|i>*p<rsub|l>,>|<cell|<text|
      if \ >j=h,<space|0.17em>i\<neq\>l>>|<row|<cell|P*<around|(|u<rsub|i-1>\<leq\>X<rsub|j>\<less\>u<rsub|i>,u<rsub|l-1>\<leq\>X<rsub|h>\<less\>u<rsub|l>|)>-p<rsub|i>*p<rsub|l>,>|<cell|<text|
      if \ >j\<neq\>h>>>>>
    </around*|\<nobracket\>>
  </equation*>

  where we have written <math|p<rsub|i>=P*<around|(|u<rsub|i-1>\<leq\>X<rsub|j>\<less\>u<rsub|i>|)>=u<rsub|i>-u<rsub|i-1>>,
  for all <math|j=1,\<ldots\>,d>.

  <math|\<chi\><rsup|2>> (and <math|\<chi\><rsub|n><rsup|2>>) can be written
  using <math|\<cal-F\><rsub|n><rsup|\<delta\>>> instead of
  <math|\<cal-F\><rsub|n>>:

  <\equation*>
    \<chi\><rsup|2><around|(|\<Omega\>,P|)>=<wide*|\<nu\>|\<bar\>><rprime|'>S<rsub|k><rsup|-1><wide*|\<nu\>|\<bar\>>=<around*|(|<wide*|\<nu\>|\<bar\>><rsup|\<delta\>>|)><rprime|'><around*|(|S<rsub|k><rsup|\<delta\>>|)><rsup|-1><around*|(|<wide*|\<nu\>|\<bar\>><rsup|\<delta\>>|)>.
  </equation*>

  Let <math|M> be the block diagonal matrix with <math|d> diagonal blocks,
  each equal to the <math|<around|(|m\<times\>m|)>> lower triangular matrix
  whose entries on and below the diagonal are all equal to one. Then
  <math|<wide*|\<nu\>|\<bar\>>=M<wide*|\<nu\>|\<bar\>><rsup|\<delta\>>>.

  On the other hand, after some algebra it can be checked that
  <math|S<rsub|k>=M*S<rsub|k><rsup|\<delta\>>*M<rprime|'>>.
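
  As a check of this identity in the smallest nontrivial case, take
  <math|d=1> and <math|m=2> (an illustrative choice), with the convention
  <math|u<rsub|0>=0>, so that <math|u<rsub|1>=p<rsub|1>>,
  <math|u<rsub|2>=p<rsub|1>+p<rsub|2>> and

  <\equation*>
    M=<matrix|<tformat|<table|<row|<cell|1>|<cell|0>>|<row|<cell|1>|<cell|1>>>>>,<space|1em>S<rsub|k><rsup|\<delta\>>=<matrix|<tformat|<table|<row|<cell|p<rsub|1>-p<rsub|1><rsup|2>>|<cell|-p<rsub|1>*p<rsub|2>>>|<row|<cell|-p<rsub|1>*p<rsub|2>>|<cell|p<rsub|2>-p<rsub|2><rsup|2>>>>>>.
  </equation*>

  Multiplying out, and using <math|p<rsub|1>*<around|(|1-p<rsub|1>-p<rsub|2>|)>=u<rsub|1>*<around|(|1-u<rsub|2>|)>>, gives

  <\equation*>
    M*S<rsub|k><rsup|\<delta\>>*M<rprime|'>=<matrix|<tformat|<table|<row|<cell|u<rsub|1>-u<rsub|1><rsup|2>>|<cell|u<rsub|1>-u<rsub|1>*u<rsub|2>>>|<row|<cell|u<rsub|1>-u<rsub|1>*u<rsub|2>>|<cell|u<rsub|2>-u<rsub|2><rsup|2>>>>>>,
  </equation*>

  which is the matrix with entries
  <math|u<rsub|i>\<wedge\>u<rsub|l>-u<rsub|i>*u<rsub|l>>, that is,
  <math|S<rsub|k>>.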

  Thus <math|<wide*|\<nu\>|\<bar\>><rsup|\<delta\>><rprime|'><around|(|S<rsub|k><rsup|\<delta\>>|)><rsup|-1><wide*|\<nu\>|\<bar\>><rsup|\<delta\>>=<wide*|\<nu\>|\<bar\>><rprime|'><around|(|M<rprime|'>|)><rsup|-1>*M<rprime|'>*S<rsub|k><rsup|-1>*M<around|(|M|)><rsup|-1><wide*|\<nu\>|\<bar\>>=\<chi\><rsup|2>>.
  Similar arguments yield <math|\<chi\><rsub|n><rsup|2>=<wide*|\<nu\>|\<bar\>><rsub|n><rsup|\<delta\>><rprime|'><around|(|S<rsub|n,k><rsup|\<delta\>>|)><rsup|-1><wide*|\<nu\>|\<bar\>><rsub|n><rsup|\<delta\>>>.

  The matrix <math|M> has all eigenvalues equal to one. This allows us to
  write, for <math|\<lambda\><rsub|1,\<delta\>>> the minimum eigenvalue of
  <math|S<rsub|k><rsup|\<delta\>>>:

  <\equation*>
    \<lambda\><rsub|1,k>\<leq\>min<rsub|x>
    <frac|x<rprime|'>*S<rsub|k>*x|<around|\<\|\|\>|x|\<\|\|\>><rsup|2>>\<leq\>min<rsub|y>
    <frac|y<rprime|'>*S<rsub|k><rsup|\<delta\>>*y|<around|\<\|\|\>|y|\<\|\|\>><rsup|2>>*max<rsub|x>
    <frac|<around|\<\|\|\>|M*x|\<\|\|\>><rsup|2>|<around|\<\|\|\>|x|\<\|\|\>><rsup|2>>=\<lambda\><rsub|1,\<delta\>>\<leq\>min<rsub|y>
    <frac|y<rprime|'>*S<rsub|k>*y|<around|\<\|\|\>|y|\<\|\|\>><rsup|2>>*max<rsub|x>
    <frac|<around|\<\|\|\>|M<rsup|-1>*x|\<\|\|\>><rsup|2>|<around|\<\|\|\>|x|\<\|\|\>><rsup|2>>=\<lambda\><rsub|1,k>.
  </equation*>

  We will now consider the covariance matrix of
  <math|<wide*|\<gamma\>|\<bar\>><rsub|n><rsup|\<delta\>>> under H0, when the
  underlying distribution is <math|Q<rsub|0>>, i.e. the uniform distribution
  on <math|<around|[|0,1|]><rsup|d>>. Denote this matrix by
  <math|S<rsub|k><rsup|0>>. We then have the following result.

  <\lem>
    <label|th:4.eigen1>If <math|P\<in\>\<Omega\>>, then

    (i) <math|S<rsub|k><rsup|0>=D<rsup|1/2>*<around|(|I-V|)>*D<rsup|1/2>>,
    where <math|D> and <math|V> are both block diagonal matrices with
    diagonal blocks equal to <math|<text|diag><around*|{|p<rsub|i>|}><rsub|i=1,\<ldots\>,m>>
    and to <math|U=<around*|{|<sqrt|p<rsub|i>*p<rsub|l>>|}><rsub|i=1,\<ldots\>,m,<space|0.17em>l=1,\<ldots\>,m>>
    respectively.

    (ii) The <math|<around|(|m\<times\>m|)>> matrix <math|U> has eigenvalues
    equal to

    <\equation*>
      \<lambda\><rsub|U>=

      <\around*|{>
        <array|c|l*l|<tformat|<table|<row|<cell|<around|(|1-p<rsub|m+1>|)>=<big|sum><rsub|i=1><rsup|m>p<rsub|i>>|<cell|<text|with
        multiplicity >1>>|<row|<cell|0>|<cell|<text|with multiplicity
        >m-1.>>>>>
      </around*|\<nobracket\>>
    </equation*>

    Moreover <math|<around|(|I-U|)><rsup|-1>=<around|(|I+<frac|1|p<rsub|m+1>>*U|)>>.

    (iii) For any eigenvalue <math|\<lambda\>> of <math|S<rsub|k><rsup|0>> it
    holds

    <\equation*>
      p<rsub|m+1>*min<rsub|1\<leq\>i\<leq\>m>
      p<rsub|i>\<leq\>\<lambda\>\<leq\>max<rsub|1\<leq\>i\<leq\>m> p<rsub|i>.
    </equation*>
  </lem>

  <\proof>
    (i) This can easily be checked through some calculation.

    (ii) First notice that

    <\equation>
      <label|eq:4.U>U<rsup|2>=<around|(|1-p<rsub|m+1>|)>*U
    </equation>

    Formula (<reference|eq:4.U>) implies that every eigenvalue of <math|U>
    equals either <math|0> or <math|1-p<rsub|m+1>>. On the other hand, summing up all
    diagonal entries in <math|U> we get <math|<text|trace><around|(|U|)>=<big|sum><rsub|i=1><rsup|m>p<rsub|i>=1-p<rsub|m+1>>.
    This allows us to conclude that exactly one eigenvalue equals
    <math|1-p<rsub|m+1>>, while the others must be zero.

    For the second statement, since all eigenvalues of <math|U> are smaller
    than one, the series expansion of
    <math|<around|(|1-x|)><rsup|-1>> gives <math|<around|(|I-U|)><rsup|-1>=I+<big|sum><rsub|h=1><rsup|\<infty\>>U<rsup|h>>.

    Then, using recursively (<reference|eq:4.U>),
    <math|<around|(|I-U|)><rsup|-1>=I+U*<big|sum><rsub|h=0><rsup|\<infty\>><around|(|1-p<rsub|m+1>|)><rsup|h>=I+<frac|1|p<rsub|m+1>>*U>.

    (iii) For any eigenvalue <math|\<lambda\>> of <math|S<rsub|k><rsup|0>> we
    have:

    <\equation*>
      \<lambda\>\<leq\>\<lambda\><rsub|k,k>=<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsub|k><rsup|0>|\|><space|-0.17em>|\|><space|-0.17em>|\|>\<leq\><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||D<rsup|1/2>|\|><space|-0.17em>|\|><space|-0.17em>|\|><rsup|2><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||<around|(|I-V|)>|\|><space|-0.17em>|\|><space|-0.17em>|\|>=max<rsub|1\<leq\>i\<leq\>m>
      p<rsub|i>*<around*|(|1-inf<rsub|<around|\<\|\|\>|<wide*|x|\<bar\>>|\<\|\|\>>=1><wide*|x|\<bar\>><rprime|'>V<wide*|x|\<bar\>>|)>=max<rsub|1\<leq\>i\<leq\>m>
      p<rsub|i>
    </equation*>

    where for the last identity we have used the fact that the eigenvalues of
    <math|V> coincide with those of <math|U>, with multiplicities multiplied
    by <math|d>.

    For the opposite inequality consider

    <align*|<tformat|<table|<row|<cell|\<lambda\><rsup|-1>>|<cell|\<leq\>\<lambda\><rsub|1,k><rsup|-1>=<around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||S<rsub|k><rsup|0><rsup|-1>|\|><space|-0.17em>|\|><space|-0.17em>|\|>\<leq\><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||D<rsup|-1>|\|><space|-0.17em>|\|><space|-0.17em>|\|><around*|\||<space|-0.17em><around*|\||<space|-0.17em><around*|\||<around*|(|I+<frac|1|p<rsub|m+1>>*V|)>|\|><space|-0.17em>|\|><space|-0.17em>|\|>>>|<row|<cell|>|<cell|=<around*|(|max<rsub|1\<leq\>i\<leq\>m>
    p<rsub|i><rsup|-1>|)>*<around*|(|1+<frac|1|p<rsub|m+1>>*<around|(|1-p<rsub|m+1>|)>|)>=<around*|(|min<rsub|1\<leq\>i\<leq\>m>
    p<rsub|i>|)><rsup|-1>*p<rsub|m+1><rsup|-1>.>>>>>
  </proof>
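
  To illustrate Lemma <reference|th:4.eigen1>, take <math|d=1>, <math|m=2>
  and <math|p<rsub|1>=p<rsub|2>=p<rsub|3>=1/3>; these values are chosen only
  for the example. Then <math|D=<frac|1|3>*I> and

  <\equation*>
    U=<frac|1|3>*<matrix|<tformat|<table|<row|<cell|1>|<cell|1>>|<row|<cell|1>|<cell|1>>>>>,<space|1em>S<rsub|k><rsup|0>=<frac|1|9>*<matrix|<tformat|<table|<row|<cell|2>|<cell|-1>>|<row|<cell|-1>|<cell|2>>>>>,<space|1em><around|(|I-U|)><rsup|-1>=I+3*U=<matrix|<tformat|<table|<row|<cell|2>|<cell|1>>|<row|<cell|1>|<cell|2>>>>>.
  </equation*>

  The eigenvalues of <math|U> are <math|2/3=1-p<rsub|3>> and <math|0>, and
  those of <math|S<rsub|k><rsup|0>> are <math|1/9> and <math|1/3>. Both
  bounds in (iii) are therefore attained in this example, since
  <math|p<rsub|3>*min<rsub|i> p<rsub|i>=1/9> and <math|max<rsub|i>
  p<rsub|i>=1/3>.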

  <\lem>
    <label|lm:eigen2>Suppose that <math|P> has a density on
    <math|<around|[|0,1|]><rsup|d>> bounded from below by
    <math|\<alpha\>\<gtr\>0>. Then the smallest eigenvalue of
    <math|S<rsub|k><rsup|\<delta\>>>, and consequently
    <math|\<lambda\><rsub|1,k>>, is bounded below by
    <math|\<alpha\>*p<rsub|m+1>*<space|0.17em>min<rsub|1\<leq\>i\<leq\>m> p<rsub|i>>.
  </lem>

  <\proof>
    Write <math|s<rsub|k><rsup|\<delta\>><around*|(|u,v|)>> for the
    <math|<around*|(|u,v|)>->th element of <math|S<rsub|k><rsup|\<delta\>>>.
    We have, for <math|P\<in\>\<Omega\>>, i.e. if <math|P*f=Q<rsub|0>*f>, for
    every <math|f\<in\>\<cal-F\><rsub|n><rsup|\<delta\>>>:

    <align*|<tformat|<table|<row|<cell|s<rsub|k><rsup|\<delta\>>*<around*|(|<around|(|j-1|)>*m+i,<around|(|h-1|)>*m+l|)>>|<cell|=s<rsub|k><rsup|\<delta\>><around*|(|u,v|)>=P*f<rsub|u>*f<rsub|v>-P*f<rsub|u>*P*f<rsub|v>=>>|<row|<cell|>|<cell|=P*<around*|(|f<rsub|u>-Q<rsub|0>*f<rsub|u>|)>*<around*|(|f<rsub|v>-Q<rsub|0>*f<rsub|v>|)>=P<around*|(|<wide|f<rsub|u>|\<bar\>><wide|f<rsub|v>|\<bar\>>|)>>>>>>

    where <math|<wide|f<rsub|u>|\<bar\>>=f<rsub|u>-Q<rsub|0>*f<rsub|u>>. For
    each vector <math|<wide*|a|\<bar\>>\<in\>\<bbb-R\><rsup|d\<cdot\>m>> it
    holds then

    <align*|<tformat|<table|<row|<cell|<wide*|a|\<bar\>><rprime|'>S<rsub|k><rsup|\<delta\>><wide*|a|\<bar\>>>|<cell|=<big|sum><rsub|u=1><rsup|d*m><big|sum><rsub|v=1><rsup|d*m>a<rsub|u>*a<rsub|v>*P<around*|(|<wide|f<rsub|u>|\<bar\>><wide|f<rsub|v>|\<bar\>>|)>=P<around*|(|<big|sum><rsub|u=1><rsup|d*m>a<rsub|u><wide|f<rsub|u>|\<bar\>>|)><rsup|2>>>|<row|<cell|>|<cell|=<big|int><rsub|<around*|[|0,1|]><rsup|d>><around*|(|<big|sum><rsub|u=1><rsup|d*m>a<rsub|u><wide|f<rsub|u>|\<bar\>>|)><rsup|2>*d*P\<geq\>\<alpha\>*<big|int><rsub|<around*|[|0,1|]><rsup|d>><around*|(|<big|sum><rsub|u=1><rsup|d*m>a<rsub|u><wide|f<rsub|u>|\<bar\>>|)><rsup|2>*d*Q<rsub|0>=>>|<row|<cell|>|<cell|=\<alpha\><space|0.17em><wide*|a|\<bar\>><rprime|'><around*|{|Q<rsub|0>*<around*|(|f<rsub|u>-Q<rsub|0>*f<rsub|u>|)>*<around*|(|f<rsub|v>-Q<rsub|0>*f<rsub|v>|)>|}><rsub|u,v><wide*|a|\<bar\>>=\<alpha\><space|0.17em><wide*|a|\<bar\>><rprime|'>S<rsub|k><rsup|0><wide*|a|\<bar\>>>>>>>

    The preceding inequality implies

    <\equation>
      inf<rsub|<wide*|a|\<bar\>>> <frac|<wide*|a|\<bar\>><rprime|'>S<rsub|k><rsup|\<delta\>><wide*|a|\<bar\>>|<around*|\||<wide*|a|\<bar\>>|\|><rsup|2>>\<geq\>\<alpha\>*<space|0.17em>inf<rsub|<wide*|a|\<bar\>>>
      <frac|<wide*|a|\<bar\>><rprime|'>S<rsub|k><rsup|0><wide*|a|\<bar\>>|<around*|\||<wide*|a|\<bar\>>|\|><rsup|2>><label|sigma1>
    </equation>

    which is a lower bound for the smallest eigenvalue of
    <math|S<rsub|k><rsup|\<delta\>>> depending on the smallest eigenvalue of
    <math|S<rsub|k><rsup|0>>.

    Apply Lemma <reference|th:4.eigen1> (iii) to get the lower bound for
    <math|\<lambda\><rsub|1,k>>.
  </proof>

  <\rem>
    <label|rem:alpha>Existence of <math|\<alpha\>\<gtr\>0> such that the
    density of <math|P> in <math|<around|[|0,1|]><rsup|d>> is bounded below
    by <math|\<alpha\>> seems necessary for this kind of approach; see
    assumption (P3) in <cite|BRW91>.
  </rem>

  From Theorem <reference|th:dli> and using (<reference|grid>) in order to
  evaluate <math|p<rsub|m+1>*<space|0.17em>min<rsub|1\<leq\>i\<leq\>m>
  p<rsub|i>>, together with the fact that the class <math|\<cal-F\>> is KMT
  with rate <math|\<delta\><rsub|n>=n<rsup|-1/2>> we obtain

  <\thm>
    <label|sec4:thm F infinite>Let (<reference|grid>) hold. Assume that
    <math|P> belongs to <math|\<Omega\>> defined by (<reference|Omega marge>)
    and has a density bounded below by some positive number. Let further
    <math|k=d\<cdot\>m<around|(|n|)>> be a sequence such that
    <math|lim<rsub|n\<rightarrow\>\<infty\>> k=\<infty\>> and
    <math|lim<rsub|n\<rightarrow\>\<infty\>> k<rsup|7/2>*n<rsup|-1/2>=0>.

    Then <math|<frac|n*\<chi\><rsub|n,k><rsup|2>-k|<sqrt|2*k>>=<frac|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|n,k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>-k|<sqrt|2*k>>>
    has a limiting standard normal distribution.
  </thm>
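
  For instance, if <math|k> grows polynomially, say <math|k> of order
  <math|n<rsup|a>> with <math|a\<gtr\>0> (an illustrative parametrization,
  not used in the text), then <math|k<rsup|7/2>*n<rsup|-1/2>> is of order
  <math|n<rsup|<around|(|7*a-1|)>/2>>, so the rate condition of Theorem
  <reference|sec4:thm F infinite> holds exactly when <math|a\<less\>1/7>.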

  In the remainder of this section we show that the conditions of
  Theorem <reference|sec4:thm F infinite> can be weakened for small values of
  <math|d>. When <math|d=2> the rate for <math|k=2*m> is achieved when
  condition (<reference|B>) holds.

  We consider the case when <math|d=2>; for larger values of <math|d>, see
  Remark <reference|rmk:d\<gtr\>2>.

  To lighten the notation, define <math|p<rsub|i,j>> and
  <math|N<rsub|i,j>> as <math|P*<around|(|A<rsub|i><rsup|1>\<times\>A<rsub|j><rsup|2>|)>>
  and <math|n*P<rsub|n>*<around|(|A<rsub|i><rsup|1>\<times\>A<rsub|j><rsup|2>|)>>,
  respectively, where the events <math|A<rsub|i><rsup|h>>, <math|h=1,2>,
  <math|i=1,\<ldots\>,m> are as above. The marginal distributions will be
  denoted <math|p<rsub|i,\<cdot\>>=p<rsub|\<cdot\>,i>=p<rsub|i>> (since H0
  holds), and the empirical marginal distributions by
  <math|N<rsub|i,\<cdot\>>/n> and <math|N<rsub|\<cdot\>,i>/n>.

  Returning to the proof of Theorem <reference|th:dli> we see that
  condition (<reference|D>) is used in order to ensure that
  <math|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'><around|(|S<rsub|n,k><rsup|-1>-S<rsub|k><rsup|-1>|)><wide*|\<gamma\>|\<bar\>><rsub|n,k>>
  goes to <math|0> in probability as <math|n> tends to infinity, while
  condition (<reference|B>) implies the convergence of
  <math|<frac|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>-2*m|<sqrt|4*m>>>
  to the standard normal distribution.

  Let

  <align*|<tformat|<table|<row|<cell|\<cal-Q\>=<around*|{|Q\<in\>M<rsub|1><around|(|<around|[|0,1|]><rsup|2>|)>|\<nobracket\>>>|<cell|:<around*|\<nobracket\>|<big|sum><rsub|i=1><rsup|m+1>q<rsub|i,j>=p<rsub|\<cdot\>,j>=q<rsub|\<cdot\>,j><rsup|0>=u<rsub|j+1>-u<rsub|j>,<space|0.22em>j=1,\<ldots\>,m+1;|\<nobracket\>>>>|<row|<cell|>|<cell|<around*|\<nobracket\>|<big|sum><rsub|j=1><rsup|m+1>q<rsub|i,j>=p<rsub|i,\<cdot\>>=q<rsub|i,\<cdot\>><rsup|0>=u<rsub|i+1>-u<rsub|i>,<space|0.22em>i=1,\<ldots\>,m+1|}>,>>>>>

  where <math|q<rsub|i,j><rsup|0>=Q<rsup|0>*<around|(|A<rsub|i><rsup|1>\<times\>A<rsub|j><rsup|2>|)>=<around|(|u<rsub|i+1>-u<rsub|i>|)>*<around|(|u<rsub|j+1>-u<rsub|j>|)>>.

  <\lem>
    When <math|P\<in\>\<Omega\>>, it holds

    <align|<tformat|<table|<row|<cell|n*\<chi\><rsub|n,k><rsup|2>>|<cell|=min<rsub|Q\<in\>\<cal-Q\>>
    <big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1><frac|<around|(|n*q<rsub|i,j>-N<rsub|i,j>|)><rsup|2>|N<rsub|i,j>>*\<bbb-I\><rsub|N<rsub|i,j>\<gtr\>0><label|chi>>>|<row|<cell|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>>|<cell|=min<rsub|Q\<in\>\<cal-Q\>>
    <big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1><frac|<around|(|n*q<rsub|i,j>-N<rsub|i,j>|)><rsup|2>|n*p<rsub|i,j>><label|chi2>>>>>>
  </lem>

  <\proof>
    We prove the first identity, since the proof of the second is
    similar. Following <cite|BRW91>, the right-hand side of the first identity is

    <\equation*>
      <big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1>N<rsub|i,j>*<around|(|a<rsub|i>+b<rsub|j>|)><rsup|2>,
    </equation*>

    where the vectors <math|a> and <math|b> <math|\<in\>\<bbb-R\><rsup|m+1>>
    are solutions of the equations

    <align*|<tformat|<table|<row|<cell|a<rsub|i>*<frac|N<rsub|i,\<cdot\>>|n>>|<cell|=p<rsub|i>-<frac|N<rsub|i,\<cdot\>>|n>-<big|sum><rsub|j=1><rsup|m+1>b<rsub|j>*<frac|N<rsub|i,j>|n>,<space|0.27em><space|0.27em><space|0.27em>i=1,\<ldots\>,m+1,>>|<row|<cell|b<rsub|j>*<frac|N<rsub|\<cdot\>,j>|n>>|<cell|=p<rsub|j>-<frac|N<rsub|\<cdot\>,j>|n>-<big|sum><rsub|i=1><rsup|m+1>a<rsub|i>*<frac|N<rsub|i,j>|n>,<space|0.27em><space|0.27em><space|0.27em>j=1,\<ldots\>,m+1.>>>>>

    Let <math|<wide*|a|\<bar\>>=<around|(|<wide|a|~><rsub|1>,\<ldots\>,<wide|a|~><rsub|m>,<wide|b|~><rsub|1>,\<ldots\>,<wide|b|~><rsub|m>|)>>
    be the coefficients in equation (<reference|a>). Making use of equations
    (<reference|a0>) and (<reference|eqn:3.a>) we obtain, using the class
    <math|\<cal-F\><rsub|n><rsup|\<delta\>>> in place of
    <math|\<cal-F\><rsub|n>> in the definition of
    <math|\<chi\><rsup|2><rsub|n,k>>,

    <align|<tformat|<table|<row|<cell|<label|a><wide|a|~><rsub|i>>|<cell|=2*<around*|(|a<rsub|i>-a<rsub|m+1>|)>,<space|0.27em>i=1,\<ldots\>,m>>|<row|<cell|<wide|b|~><rsub|j>>|<cell|=2*<around*|(|b<rsub|j>-b<rsub|m+1>|)>,<space|0.27em>j=1,\<ldots\>,m*<no-number>>>|<row|<cell|<wide|a|~><rsub|0>>|<cell|=2*<around*|(|a<rsub|m+1>+b<rsub|m+1>|)>.*<no-number>>>>>>

    From the proof of Proposition <reference|th:matrixform> we get, setting
    <math|\<delta\><rsub|i,j>=1> for <math|i=j> and 0 otherwise,

    <align*|<tformat|<table|<row|<cell|\<chi\><rsub|n,k><rsup|2>>|<cell|=<frac|1|4><wide*|a|\<bar\>><rprime|'>S<rsub|n,k><wide*|a|\<bar\>>>>|<row|<cell|>|<cell|=<frac|1|4*n>*<big|sum><rsub|i=1><rsup|m><big|sum><rsub|j=1><rsup|m><around*|[|<wide|a|~><rsub|i>*<wide|a|~><rsub|j>*<around*|(|N<rsub|i,\<cdot\>>*\<delta\><rsub|i,j>-N<rsub|i,\<cdot\>>*N<rsub|j,\<cdot\>>/n|)>+<wide|b|~><rsub|i>*<wide|b|~><rsub|j>*<around*|(|N<rsub|\<cdot\>,i>*\<delta\><rsub|i,j>-N<rsub|\<cdot\>,i>*N<rsub|\<cdot\>,j>/n|)>|\<nobracket\>>>>|<row|<cell|>|<cell|<space|2em><space|2em><space|2em><around*|\<nobracket\>|+2*<wide|a|~><rsub|i>*<wide|b|~><rsub|j>*<around*|(|N<rsub|i,j>-N<rsub|i\<cdot\>>*N<rsub|\<cdot\>,j>/n|)>|]>,>>>>>

    which, using (<reference|a>) and after some algebra yields

    <\equation*>
      n*\<chi\><rsub|n,k><rsup|2>=<big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1>N<rsub|i,j>*<around|(|a<rsub|i>+b<rsub|j>|)><rsup|2>.
    </equation*>
  </proof>

  We now can refine Theorem <reference|sec4:thm F infinite>.

  <\thm>
    Let (<reference|grid>) hold. Assume that <math|P\<in\>\<Omega\>>
    satisfies the condition in Lemma <reference|lm:eigen2> for some
    <math|\<alpha\>\<gtr\>0>.

    Let <math|m<around|(|n|)>> be such that
    <math|lim<rsub|n\<rightarrow\>\<infty\>> m=\<infty\>> and
    <math|lim<rsub|n\<rightarrow\>\<infty\>> m<rsup|3/2>*n<rsup|-1/2>*log
    n=0>.

    Then, under <math|H*0>,

    <\equation*>
      <frac|n*\<chi\><rsup|2><rsub|n,k>-2*m|<sqrt|4*m>>\<longrightarrow\>N<around|(|0,1|)>.
    </equation*>
  </thm>
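
  To see what this rate condition allows, suppose that <math|m> is of order
  <math|n<rsup|a>> with <math|a\<gtr\>0> (an illustrative parametrization).
  Then <math|m<rsup|3/2>*n<rsup|-1/2>*log n> is of order
  <math|n<rsup|<around|(|3*a-1|)>/2>*log n>, which tends to <math|0> exactly
  when <math|a\<less\>1/3>. By comparison, the condition
  <math|k<rsup|7/2>*n<rsup|-1/2>\<rightarrow\>0> of Theorem
  <reference|sec4:thm F infinite> requires <math|a\<less\>1/7> under the
  same parametrization.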

  <\proof>
    It is enough to prove <math|<frac|n*\<chi\><rsup|2><rsub|n,k>-<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>|<sqrt|4*m>>=o<rsub|P><around|(|1|)>>.

    Denote by <math|<wide|P|^>> and <math|<wide|P|\<bar\>>> the minimizers
    in <math|\<cal-Q\>> of the first and of the second expression in the
    preceding lemma, respectively. Let
    <math|<wide|p|^><rsub|i,j>> and <math|<wide|p|\<bar\>><rsub|i,j>> denote
    the respective probabilities of cells.

    We write

    <align*|<tformat|<table|<row|<cell|n*\<chi\><rsup|2><rsub|n,k>-<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>>|<cell|\<leq\><big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1><around*|(|N<rsub|i,j>-n*<wide|p|\<bar\>><rsub|i,j>|)><rsup|2>*<around*|(|<frac|1|N<rsub|i,j>>-<frac|1|n*p<rsub|i,j>>|)>>>|<row|<cell|>|<cell|\<leq\>max<rsub|i,j><around*|(|<frac|n*p<rsub|i,j>|N<rsub|i,j>>-1|)>*<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>>>>>>

    and

    <align*|<tformat|<table|<row|<cell|n*\<chi\><rsub|n,k><rsup|2>-<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>>|<cell|\<geq\>min<rsub|Q\<in\>\<cal-Q\>>
    <big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1><frac|<around*|(|N<rsub|i,j>-n*q<rsub|i,j>|)><rsup|2>|n*p<rsub|i,j>>*<around*|(|<frac|n*p<rsub|i,j>|N<rsub|i,j>>-1|)>>>|<row|<cell|>|<cell|\<geq\>-max<rsub|i,j><around*|\||<frac|n*p<rsub|i,j>|N<rsub|i,j>>-1|\|><wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>.>>>>>

    Whenever

    <\equation>
      <sqrt|m>*max<rsub|i,j><around*|\||<frac|n*p<rsub|i,j>|N<rsub|i,j>>-1|\|><above|\<rightarrow\>|P>0<label|limit
      celle>
    </equation>

    holds, then the above inequalities yield
    <math|<frac|n*\<chi\><rsup|2><rsub|n,k>-<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>|<sqrt|4*m>>=o<rsub|P><around*|(|<frac|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>|m>|)>=o<rsub|P>*<around*|(|<frac|<wide*|\<gamma\>|\<bar\>><rsub|n,k><rprime|'>S<rsub|k><rsup|-1><wide*|\<gamma\>|\<bar\>><rsub|n,k>-2*m|<sqrt|4*m>>*<frac|2|<sqrt|m>>+1|)>=o<rsub|P><around|(|1|)>>
    , which proves the claim.

    We now prove (<reference|limit celle>). We proceed as in Lemma 2 in
    <cite|BRW91>, using inequalities (10.3.2) in <cite|ShoWell1986>. Let
    <math|B<rsub|n>\<sim\><text|Bin><around|(|n,p|)>>. Then, for <math|t\<gtr\>1>,

    <\equation>
      <label|eqn:SW>Pr <around*|(|<frac|n*p|B<rsub|n>>\<geq\>t|)>\<leq\>exp
      <around*|{|-n*p*<space|0.17em>h*<around*|(|1/t|)>|}>*<space|1em><text|and><space|1em>Pr
      <around*|(|<frac|B<rsub|n>|n*p>\<geq\>t|)>\<leq\>exp
      <around*|{|-n*p*<space|0.17em>h<around*|(|t|)>|}>,
    </equation>

    where <math|h<around*|(|t|)>=t*log t-t+1> is a positive function.
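
    Indeed, <math|h<around|(|1|)>=0> and <math|h<rprime|'><around|(|t|)>=log
    t>, so <math|h> is nonnegative on <math|<around|(|0,\<infty\>|)>> and
    vanishes only at <math|t=1>. A Taylor expansion at
    <math|t=1+\<varepsilon\>> gives

    <\equation*>
      h<around|(|1+\<varepsilon\>|)>=<around|(|1+\<varepsilon\>|)>*log <around|(|1+\<varepsilon\>|)>-\<varepsilon\>=<frac|\<varepsilon\><rsup|2>|2>-<frac|\<varepsilon\><rsup|3>|6>+O<around|(|\<varepsilon\><rsup|4>|)>,
    </equation*>

    so <math|h<around|(|1+\<varepsilon\>|)>> is of exact order
    <math|\<varepsilon\><rsup|2>> for small <math|\<varepsilon\>>.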

    Since <math|N<rsub|i,j>\<sim\><text|Bin><around|(|n,p<rsub|i,j>|)>>,

    <align*|<tformat|<table|<row|<cell|P*r*<around*|{|max<rsub|i,j><around*|(|<frac|n*p<rsub|i,j>|N<rsub|i,j>>-1|)>\<geq\><frac|t|<sqrt|m>>|}>>|<cell|\<leq\><big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1>P*r*<around*|{|<frac|n*p<rsub|i,j>|N<rsub|i,j>>\<geq\><frac|t|<sqrt|m>>+1|}>>>|<row|<cell|>|<cell|\<leq\><big|sum><rsub|i=1><rsup|m+1><big|sum><rsub|j=1><rsup|m+1>exp
    <around*|{|-n*p<rsub|i,j>*h*<around*|(|1/<around*|(|1+t/<sqrt|m>|)>|)>|}>>>|<row|<cell|<around|(|<text|by
    (<reference|grid>) and by>p<rsub|i,j>\<gtr\>\<alpha\>*p<rsub|i,\<cdot\>>*p<rsub|\<cdot\>,j>|)>>|<cell|\<leq\><around|(|m+1|)><rsup|2>*exp
    <around*|{|-c*\<alpha\>*<frac|n|log n><around|(|log
    n|)>*m<rsup|-2>*h*<around*|(|1/<around*|(|1+t/<sqrt|m>|)>|)>|}>.>>>>>

    For <math|x=1+\<varepsilon\>>, <math|h<around|(|x|)>=O<around|(|\<varepsilon\><rsup|2>|)>>.
    Therefore, using (<reference|B>) with <math|k=2*m>, for every
    <math|M\<gtr\>0> there exists <math|n> large enough that

    <\equation*>
      \<alpha\>*c*<frac|n|log n>*m<rsup|-2>*<space|0.17em>h*<around*|(|1+<frac|-t/<sqrt|m>|1+t/<sqrt|m>>|)>\<geq\>M
    </equation*>

    and consequently <math|Pr <around*|{|max<rsub|i,j><around*|(|<frac|n*p<rsub|i,j>|N<rsub|i,j>>-1|)>\<geq\><frac|t|<sqrt|m>>|}>>
    goes to 0.

    The convergence to zero of <math|Pr <around*|{|max<rsub|i,j><around*|(|1-<frac|n*p<rsub|i,j>|N<rsub|i,j>>|)>\<geq\><frac|t|<sqrt|m>>|}>=>
    <math|Pr <around*|{|max<rsub|i,j> <frac|N<rsub|i,j>|n*p<rsub|i,j>>\<geq\><frac|1|1-t/<sqrt|m>>|}>>
    follows in a similar way from the second inequality in
    (<reference|eqn:SW>).
  </proof>

  <\rem>
    <label|rmk:d\<gtr\>2>The preceding arguments carry over to the case
    <math|d\<gtr\>2> and lead to the condition <math|lim<rsub|n>
    m<rsup|d+1/2>*n<rsup|-1/2>*log n=0>. However, for <math|d\<geq\>6> this
    last condition is stronger than (<reference|D>).
  </rem>

  <section|Application: a contamination model>

  Let <math|\<cal-P\><rsub|\<theta\>>> be an identifiable class of densities
  on <math|\<bbb-R\>>. A contamination model is typically written as

  <\equation>
    p<around|(|x|)>=<around|(|1-\<lambda\>|)>*f<rsub|\<theta\>><around|(|x|)>+\<lambda\>*r<around|(|x|)><label|eqn:5.contamination
    model>
  </equation>

  where <math|\<lambda\>> is assumed to be close to zero and
  <math|r<around|(|x|)>> is a density on <math|\<bbb-R\>> which represents
  the distribution of the contaminating data.

  An example is the case where <math|f<rsub|\<theta\>><around|(|x|)>=\<theta\>*e<rsup|-\<theta\>*x>>,
  <math|x\<gtr\>0>, and <math|r<around|(|x|)>> is a Pareto-type density, say

  <\equation>
    <label|eqn:5.pareto>r<around|(|x|)>\<assign\>r<rsub|\<gamma\>,\<nu\>><around|(|x|)>=\<gamma\>*\<nu\><rsup|\<gamma\>><around|(|x|)><rsup|-<around|(|\<gamma\>+1|)>>,
  </equation>

  with <math|x\<gtr\>\<nu\>> and <math|\<gamma\>\<gtr\>1,<space|0.27em>\<nu\>\<gtr\>1>.

  Such a case corresponds to a proportion <math|\<lambda\>> of outliers
  generated by the density <math|r<rsub|\<gamma\>,\<nu\>>>.
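
  As a sanity check on (<reference|eqn:5.pareto>), which uses nothing beyond the formula itself, <math|r<rsub|\<gamma\>,\<nu\>>> integrates to one and, for <math|\<gamma\>\<gtr\>1>, has a finite mean:

  <\equation*>
    <big|int><rsub|\<nu\>><rsup|\<infty\>>\<gamma\>*\<nu\><rsup|\<gamma\>>*x<rsup|-<around|(|\<gamma\>+1|)>>*d*x=\<nu\><rsup|\<gamma\>>*\<nu\><rsup|-\<gamma\>>=1,<space|2em><big|int><rsub|\<nu\>><rsup|\<infty\>>x*\<gamma\>*\<nu\><rsup|\<gamma\>>*x<rsup|-<around|(|\<gamma\>+1|)>>*d*x=<frac|\<gamma\>*\<nu\>|\<gamma\>-1>.
  </equation*>

  The tail of <math|r<rsub|\<gamma\>,\<nu\>>> decays polynomially, whereas that of <math|f<rsub|\<theta\>>> decays exponentially, so that large observations are much more likely under the contaminating density.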

  We test contamination when we have at hand a sample
  <math|X<rsub|1>,\<ldots\>,X<rsub|n>> of i.i.d. r.v.'s with unknown density
  function <math|p<around|(|x|)>> as in (<reference|eqn:5.contamination
  model>). We state the test paradigm as follows.

  Let <math|H*0> denote the composite null hypothesis <math|\<lambda\>=0>,
  i.e.

  <align*|<tformat|<table|<row|<cell|H*0>|<cell|:<space|0.27em>p<around|(|x|)>=f<rsub|\<theta\><rsub|0>><around|(|x|)>,<space|0.27em>\<theta\><rsub|0>\<in\>\<Theta\>>>|<row|<cell|>|<cell|<text|versus>>>|<row|<cell|H*1>|<cell|:<space|0.27em>p<around|(|x|)>=<around|(|1-\<lambda\>|)>*f<rsub|\<theta\>><around|(|x|)>+\<lambda\>*r<around|(|x|)>>>|<row|<cell|>|<cell|<space|2em><text|for
  some >\<theta\>\<in\>\<Theta\>*<space|0.17em><text|and with
  >\<lambda\>\<neq\>0.>>>>>

  Such problems have been addressed in the recent literature; see
  <cite|lemdaniPons99> and references therein. We assume identifiability,
  stating that, under <math|H*1>, <math|\<lambda\>>, <math|\<theta\>> and
  <math|r> are uniquely defined. This assumption holds for example when
  <math|f<rsub|\<theta\>><around|(|x|)>=\<theta\>*e<rsup|-\<theta\>*x>> and
  <math|r<around|(|x|)>> is as in (<reference|eqn:5.pareto>).

  For test problems pertaining to <math|\<lambda\>> we embed
  <math|p<around|(|x|)>> in the class of density functions of signed measures
  with total mass 1, allowing <math|\<lambda\>> to belong to
  <math|\<Lambda\><rsub|0>>, an open interval that contains 0.

  In order to present the test statistic, we first consider a simplified
  version of the problem above.

  Assume that <math|\<theta\><rsub|0>=\<alpha\>> is fixed, i.e.
  <math|\<Theta\>=<around*|{|\<alpha\>|}>>. We consider the hypotheses

  <align*|<tformat|<table|<row|<cell|H*0>|<cell|:<space|0.27em>p<around|(|x|)>=f<rsub|\<alpha\>><around|(|x|)>>>|<row|<cell|>|<cell|<text|versus>>>|<row|<cell|H*1>|<cell|:<space|0.27em>p<around|(|x|)>=<around|(|1-\<lambda\>|)>*f<rsub|\<alpha\>><around|(|x|)>+\<lambda\>*r<around|(|x|)>,<space|2em><text|with
  >\<lambda\>\<neq\>0.>>>>>

  In this case <math|\<Omega\>=<around*|{|f<rsub|\<alpha\>>|}>> and the null
  hypothesis H0 is simple.

  For this problem the <math|\<chi\><rsup|2>> approach appears legitimate.
  From the discussion in Section <reference|sec:1.intro>, the
  <math|\<chi\><rsup|2>> criterion is robust against inliers. A contamination
  model such as (<reference|eqn:5.contamination model>) captures the outlier
  contamination through the density <math|r>. As such, the test statistic
  does not need any robustness property against outliers, since they are
  included in the model. On the contrary, missing data might unduly favour
  <math|H*1>. Therefore the test statistic should be robust against such
  cases.

  By the necessary inclusion <math|f<rsup|\<ast\>>=2*<around*|(|<frac|q<rsup|\<ast\>>|p>-1|)>\<in\>\<cal-F\>>
  we define

  <\equation>
    \<cal-F\>=\<cal-F\><rsub|\<alpha\>>=<around*|{|g=2*<around*|(|<frac|f<rsub|\<alpha\>>|<around|(|1-\<lambda\>|)>*f<rsub|\<alpha\>>+\<lambda\>*r>-1|)>*<text|
    such that \ \ ><space|0.27em><big|int><around|\||g|\|>*f<rsub|\<alpha\>>\<less\>\<infty\>,<space|0.17em>\<lambda\>\<in\>\<Lambda\><rsub|0>|}>.<label|en;5.F>
  </equation>

  Following (<reference|eqn:2.chi>), we define

  <\equation>
    \<chi\><rsub|n><rsup|2><around|(|f<rsub|\<alpha\>>,p|)>=sup<rsub|g\<in\>\<cal-F\><rsub|\<alpha\>>>
    <big|int>g*f<rsub|\<alpha\>>-T<around|(|g,P<rsub|n>|)>.<label|eqn:5.chi>
  </equation>

  <\ex>
    <label|ex:5.example>Consider the case
    <math|f<rsub|\<alpha\>><around|(|x|)>=\<alpha\>*e<rsup|-\<alpha\>*x>> and
    <math|r<around|(|x|)>=\<gamma\>*\<nu\><rsup|\<gamma\>><around|(|x|)><rsup|-<around|(|\<gamma\>+1|)>>>
    for some <math|\<gamma\>> fixed, <math|x\<gtr\>\<nu\>>.

    Then <math|\<cal-F\><rsub|\<alpha\>>=<around*|{|2*<around*|(|<frac|\<alpha\>*e<rsup|-\<alpha\>*x>|<around|(|1-\<lambda\>|)>*\<alpha\>*e<rsup|-\<alpha\>*x>+\<lambda\>*r<rsub|\<gamma\>,\<nu\>>>-1|)>,<space|0.22em>\<lambda\>\<in\>\<Lambda\><rsub|0>*<text|such
    that ><big|int><frac|\<alpha\><rsup|2>*e<rsup|-2*\<alpha\>*x>|<around|(|1-\<lambda\>|)>*\<alpha\>*e<rsup|-\<alpha\>*x>+\<lambda\>*r<rsub|\<gamma\>,\<nu\>><around|(|x|)>>*d*x\<less\>\<infty\>|}>>.
  </ex>
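
  In Example <reference|ex:5.example> the integrability constraint is easy to check for nonnegative mixing weights. The bound below is only a sketch for <math|0\<leq\>\<lambda\>\<less\>1>; negative values of <math|\<lambda\>> in <math|\<Lambda\><rsub|0>> are not covered by it and have to be examined separately. Since <math|r<rsub|\<gamma\>,\<nu\>>\<geq\>0>,

  <\equation*>
    <big|int><rsub|0><rsup|\<infty\>><frac|\<alpha\><rsup|2>*e<rsup|-2*\<alpha\>*x>|<around|(|1-\<lambda\>|)>*\<alpha\>*e<rsup|-\<alpha\>*x>+\<lambda\>*r<rsub|\<gamma\>,\<nu\>><around|(|x|)>>*d*x\<leq\><frac|1|1-\<lambda\>>*<big|int><rsub|0><rsup|\<infty\>>\<alpha\>*e<rsup|-\<alpha\>*x>*d*x=<frac|1|1-\<lambda\>>\<less\>\<infty\>.
  </equation*>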

  Let us now return to the composite hypothesis.

  Let <math|\<Omega\>> be defined by

  <\equation*>
    \<Omega\>=<around*|{|q<around|(|x|)>=f<rsub|\<alpha\>><around|(|x|)>,\<alpha\>\<in\>\<Theta\>|}>.
  </equation*>

  We can write

  <\equation*>
    \<cal-F\><rsub|\<alpha\>>=<around*|{|g<around|(|\<theta\>,\<lambda\>,\<alpha\>|)>=2*<space|-0.17em><space|-0.17em><around*|(|<space|-0.17em><frac|f<rsub|\<alpha\>>|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\>>+\<lambda\>*r>-1<space|-0.17em>|)>:<big|int><around|\||g|\|>*f<rsub|\<alpha\>>\<less\>\<infty\>,\<lambda\>\<in\>\<Lambda\><rsub|0>,\<theta\>\<in\>\<Theta\>|}>
  </equation*>

  and

  <\equation*>
    \<chi\><rsup|2><around|(|\<Omega\>,P|)>=inf<rsub|\<alpha\>\<in\>\<Theta\>>
    sup<rsub|g\<in\>\<cal-F\><rsub|\<alpha\>>>
    <big|int>g*f<rsub|\<alpha\>>-T<around|(|g,P|)>.
  </equation*>

  The supremum is to be found over a class of functions
  <math|\<cal-F\><rsub|\<alpha\>>> which changes with <math|\<alpha\>>.

  Denote by <math|\<Delta\><rsub|\<alpha\>>> the subset of
  <math|<around|(|\<Theta\>,\<Lambda\><rsub|0>|)>> which parametrizes
  <math|\<cal-F\><rsub|\<alpha\>>>.

  <\ex>
    [Continued]<label|ex:5.continuation1> We assume
    <math|\<Theta\>=<around|[|<wide*|\<alpha\>|\<bar\>>,<wide|\<alpha\>|\<bar\>>|]>>,
    which corresponds, in our example, to the restriction of the expected
    value of <math|P> (under <math|H*0>) to the finite interval
    <math|<around|[|<frac|1|<wide|\<alpha\>|\<bar\>>>,<frac|1|<wide*|\<alpha\>|\<bar\>>>|]>>.

    Therefore

    <\equation>
      <label|eqn:5.chi>\<chi\><rsup|2><rsub|n><around|(|\<Omega\>,P|)>=inf<rsub|<wide*|\<alpha\>|\<bar\>>\<leq\>\<alpha\>\<leq\><wide|\<alpha\>|\<bar\>>>
      sup<rsub|<around|(|\<theta\>,\<lambda\>|)>\<in\>\<Delta\><rsub|\<alpha\>>>
      <big|int>2*<around*|(|<frac|\<alpha\>*e<rsup|-\<alpha\>*x>|<around|(|1-\<lambda\>|)>*\<theta\>*e<rsup|-\<theta\>*x>+\<lambda\>*r<rsub|\<gamma\>><around|(|x|)>>-1|)>*\<alpha\>*e<rsup|-\<alpha\>*x>*d*x-T<around|(|g<around|(|\<theta\>,\<lambda\>,\<alpha\>|)>;P<rsub|n>|)>.
    </equation>

    The supremum in (<reference|eqn:5.chi>) is evaluated over a set which
    changes with <math|\<alpha\>>.

    In accordance with the discussion in Section <reference|sec:2.estimator>
    we may define

    <\equation>
      <label|eqn:5.chi><array|c|l*c*l|<tformat|<table|<row|<cell|\<cal-F\><space|-0.17em><space|-0.17em>>|<cell|<space|-0.17em><space|-0.17em>=<space|-0.17em><space|-0.17em>>|<cell|<space|-0.17em><space|-0.17em><around*|{|g<around|(|\<theta\>,\<lambda\>,\<beta\>|)><space|-0.17em>=<space|-0.17em>2*<space|-0.17em><space|-0.17em><around*|(|<frac|\<beta\>*e<rsup|-\<beta\>*x>|<around|(|1-\<lambda\>|)>*\<theta\>*e<rsup|-\<theta\>*x>+\<lambda\>*r<around|(|x|)>>-1|)><space|-0.17em>:<big|int><frac|\<alpha\>*\<beta\>*e<rsup|-<around|(|\<alpha\>+\<beta\>|)>*x>|<around|(|1-\<lambda\>|)>*\<theta\>*e<rsup|-\<theta\>*x>+\<lambda\>*r>*d*x\<less\>\<infty\>,<around|(|\<alpha\>,\<theta\>,\<beta\>|)><space|-0.17em>\<in\><space|-0.17em>\<Theta\><rsup|3>,\<lambda\><space|-0.17em>\<in\><space|-0.17em>\<Lambda\><rsub|0><space|-0.17em>|}>>>|<row|<cell|<space|-0.17em><space|-0.17em>>|<cell|<space|-0.17em><space|-0.17em>\<subseteq\><space|-0.17em><space|-0.17em>>|<cell|<space|-0.17em><space|-0.17em><around*|{|g<around|(|\<theta\>,\<lambda\>,\<beta\>|)>:\<lambda\>\<in\>\<Lambda\><rsub|0>,<space|0.17em><around|(|\<theta\>,\<beta\>|)>\<in\>\<Gamma\><space|-0.17em>|}>,>>>>>
    </equation>

    a class not depending upon <math|\<alpha\>>.

    The resulting test statistic would then be

    <\equation>
      \<chi\><rsub|n><rsup|2><around|(|\<Omega\>,P|)>=inf<rsub|\<alpha\>\<in\>\<Theta\>>
      sup<rsub|<around|(|\<theta\>,\<beta\>|)>\<in\>\<Gamma\>,<space|0.27em>\<lambda\>\<in\>\<Lambda\><rsub|0>>
      <big|int>g<around|(|\<theta\>,\<lambda\>,\<beta\>|)>*\<alpha\>*e<rsup|-\<alpha\>*x>*d*x-T<around|(|g<around|(|\<theta\>,\<lambda\>,\<beta\>|)>;P<rsub|n>|)><label|eqn:5.chi>
    </equation>

    and the supremum in (<reference|eqn:5.chi>) is determined on a set that
    does not depend on <math|\<alpha\>>.
  </ex>

  The use of (<reference|eqn:5.chi>) was proposed by M. Broniatowski and A.
  Keziou <cite|Broniatowski-Keziou2003>. In our context as well, it is easy
  to see that (<reference|eqn:5.chi>) is preferable to
  (<reference|eqn:5.chi>), in the sense that it considerably reduces the
  computational complexity of
  the problem, from a subset of <math|<around*|{|<around|(|\<theta\>,\<lambda\>,\<beta\>|)>\<in\>\<Theta\>\<times\>\<Lambda\><rsub|0>\<times\>\<Theta\>|}>>
  to a subset of <math|<around|{|<around|(|\<lambda\>,\<theta\>|)>\<in\>\<Lambda\><rsub|0>\<times\>\<Theta\>|}>>.
  <vspace|1fn>

  We first derive the asymptotic distribution of the test statistic
  <math|\<chi\><rsub|n><rsup|2>> under <math|H*1>. In order to use Theorem
  <reference|th:2.weakconv> we commute the <math|inf> and <math|sup>
  operators in (<reference|eqn:5.chi>) by means of Lemma
  <reference|th:5.minimax> below.

  Assume

  <\itemize>
    <item*|(A1)><math|\<Theta\>> is compact.

    <item*|(A2)>For all <math|\<alpha\>> in <math|\<Theta\>>,
    <math|\<Delta\><rsub|\<alpha\>>> is compact.
  </itemize>

  Condition (A2) holds in our example owing to the compactness of the
  interval <math|\<Theta\>> and to the distribution of the outliers.

  <\lem>
    <label|th:5.minimax>Let

    <\equation*>
      \<Theta\><rsub|<space|-0.17em><space|-0.17em><around|(|<space|-0.17em>\<theta\><rsub|1><space|-0.17em>,\<lambda\><space|-0.17em>,\<theta\><rsub|2><space|-0.17em>|)>>=<around*|{|\<alpha\>\<in\>\<Theta\>:<space|0.17em><around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>\<in\>\<Delta\><rsub|\<alpha\>>|}>.
    </equation*>

    Under (A1) and (A2),

    <align|<tformat|<table|<row|<cell|>|<cell|inf<rsub|\<alpha\>\<in\>\<Theta\>>
    sup<rsub|<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>\<in\>\<Delta\><rsub|\<alpha\>>>
    <big|int>g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>*f<rsub|\<alpha\>>-T<around|(|g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>;P|)><label|eqn:5.minimax>>>|<row|<cell|>|<cell|<space|1em>=sup<rsub|<around|(|\<theta\><rsub|1>,\<theta\><rsub|2>,\<lambda\>|)>\<in\>\<Theta\><rsup|2>\<times\>\<Lambda\><rsub|0>>
    inf<rsub|\<alpha\>\<in\>\<Theta\><rsub|<space|-0.17em><space|-0.17em><around|(|<space|-0.17em>\<theta\><rsub|1><space|-0.17em>,<space|-0.17em>\<lambda\><space|-0.17em>,<space|-0.17em>\<theta\><rsub|2><space|-0.17em>|)>>>
    <big|int>g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>*f<rsub|\<alpha\>>-T<around|(|g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>;P|)>.*<no-number>>>>>>
  </lem>

  <\proof>
    For <math|\<Theta\><rsub|<space|-0.17em><space|-0.17em><around|(|<space|-0.17em>\<theta\><rsub|1><space|-0.17em>,<space|-0.17em>\<lambda\><space|-0.17em>,<space|-0.17em>\<theta\><rsub|2><space|-0.17em>|)>>>
    defined as above we have

    <align|<tformat|<table|<row|<cell|>|<cell|inf<rsub|\<alpha\>\<in\>\<Theta\>>
    sup<rsub|<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>\<in\>\<Delta\><rsub|\<alpha\>>>
    <big|int>g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>*f<rsub|\<alpha\>>-T<around|(|g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>;P|)><label|eqn:5.minimax>>>|<row|<cell|>|<cell|<space|1em>\<leq\>sup<rsub|<around|(|\<theta\><rsub|1>,\<theta\><rsub|2>,\<lambda\>|)>\<in\>\<Theta\><rsup|2>\<times\>\<Lambda\><rsub|0>>
    inf<rsub|\<alpha\>\<in\>\<Theta\><rsub|<space|-0.17em><space|-0.17em><around|(|<space|-0.17em>\<theta\><rsub|1><space|-0.17em>,<space|-0.17em>\<lambda\><space|-0.17em>,<space|-0.17em>\<theta\><rsub|2><space|-0.17em>|)>>>
    <big|int>g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>*f<rsub|\<alpha\>>-T<around|(|g<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>;P|)>.*<no-number>>>>>>

    On the other hand,

    <align*|<tformat|<table|<row|<cell|sup<rsub|\<theta\><rsub|1><space|-0.17em>,\<lambda\><space|-0.17em>,\<theta\><rsub|2>>
    <space|-0.17em><big|int><space|-0.17em>g*f<rsub|\<alpha\>><space|-0.17em>-<space|-0.17em>T<around|(|g;P|)><space|-0.17em><space|-0.17em><space|-0.17em>>|<cell|<space|-0.17em><space|-0.17em>=<space|-0.17em><space|-0.17em><space|-0.17em><space|-0.17em>sup<rsub|\<theta\><rsub|1><space|-0.17em>,\<lambda\><space|-0.17em>,\<theta\><rsub|2>><around*|{|<big|int><space|-0.17em>2*<space|-0.17em><frac|f<rsub|\<theta\><rsub|2>>|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\><rsub|1>>+\<lambda\>*r>*<frac|f<rsub|\<alpha\>>|p>*p*d*x<space|-0.17em>-<space|-0.17em><big|int><space|-0.17em><space|-0.17em><around*|(|<space|-0.17em><frac|f<rsub|\<theta\><rsub|2>>|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\><rsub|1>><space|-0.17em>+<space|-0.17em>\<lambda\>*r><space|-0.17em>|)><rsup|<space|-0.17em>2>*<space|-0.17em>p*d*x<space|-0.17em>-<space|-0.17em>1|}>>>|<row|<cell|>|<cell|<space|-0.17em><space|-0.17em><space|-0.17em>=<space|-0.17em><space|-0.17em><space|-0.17em><space|-0.17em>sup<rsub|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>>-<big|int><around*|(|<frac|f<rsub|\<theta\><rsub|2>>|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\><rsub|1>>+\<lambda\>*r>-<frac|f<rsub|\<alpha\>>|p>|)><rsup|2>*p*d*x+<big|int><around*|(|<frac|f<rsub|\<alpha\>>|p>-1|)><rsup|2>*p*d*x>>|<row|<cell|>|<cell|<space|-0.17em><space|-0.17em><space|-0.17em>\<leq\><space|-0.17em><space|-0.17em><space|-0.17em><space|-0.17em><big|int><around*|(|<frac|f<rsub|\<alpha\>>|p>-1|)><rsup|2>*p*d*x=\<chi\><rsup|2><around|(|f<rsub|\<alpha\>>,p|)>>>>>>

    and equality holds if <math|<around|(|\<theta\><rsub|1>,\<theta\><rsub|2>,\<lambda\>|)>>
    are such that <math|<frac|f<rsub|\<alpha\>>|p>=<frac|f<rsub|\<theta\><rsub|2>>|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\><rsub|1>>+\<lambda\>*r>>
    (identifiability makes it possible to find such <math|<around|(|\<theta\><rsub|1>,\<theta\><rsub|2>,\<lambda\>|)>>
    for every <math|\<alpha\>\<in\>\<Theta\>> and for every contaminated
    measure <math|p>).
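
    The completion of the square in the last display can be verified directly. In this sketch we write <math|u=f<rsub|\<theta\><rsub|2>>/<around*|(|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\><rsub|1>>+\<lambda\>*r|)>> (our notation), so that <math|g=2*<around|(|u-1|)>>, and we assume that <math|T<around|(|g;P|)>=<big|int><around*|(|g+<frac|1|4>*g<rsup|2>|)>*p*d*x>, the usual form of the dual functional for the <math|\<chi\><rsup|2>> divergence. Then

    <\equation*>
      <big|int>g*f<rsub|\<alpha\>>*d*x-T<around|(|g;P|)>=2*<big|int>u*f<rsub|\<alpha\>>*d*x-<big|int>u<rsup|2>*p*d*x-1=-<big|int><around*|(|u-<frac|f<rsub|\<alpha\>>|p>|)><rsup|2>*p*d*x+<big|int><around*|(|<frac|f<rsub|\<alpha\>>|p>-1|)><rsup|2>*p*d*x,
    </equation*>

    where both equalities use <math|<big|int>f<rsub|\<alpha\>>*d*x=<big|int>p*d*x=1>.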

    Also we have

    <align*|<tformat|<table|<row|<cell|>|<cell|sup<rsub|<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>>
    inf<rsub|\<alpha\>> <big|int>g*f<rsub|\<alpha\>>-T<around|(|g;P|)>>>|<row|<cell|>|<cell|<space|2em>=sup<rsub|<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>><around*|{|-<big|int><around*|(|<frac|f<rsub|\<theta\><rsub|2>>|<around|(|1-\<lambda\>|)>*f<rsub|\<theta\><rsub|1>>+\<lambda\>*r>-<frac|f<rsub|\<alpha\><rsup|\<ast\>>>|p>|)><rsup|2>*p*d*x+<big|int><around*|(|<frac|f<rsub|\<alpha\><rsup|\<ast\>>>|p>-1|)><rsup|2>*p*d*x|}>>>|<row|<cell|>|<cell|<space|2em>=\<chi\><rsup|2><around|(|f<rsub|\<alpha\><rsup|\<ast\>>>,p|)>,>>>>>

    for some <math|\<alpha\><rsup|\<ast\>>> in
    <math|\<Theta\><rsub|<space|-0.17em><space|-0.17em><around|(|<space|-0.17em>\<theta\><rsub|1><space|-0.17em>,<space|-0.17em>\<lambda\><space|-0.17em>,<space|-0.17em>\<theta\><rsub|2><space|-0.17em>|)>>>.

    We thus get

    <\equation*>
      sup<rsub|<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>>
      inf<rsub|\<alpha\>> <big|int>g*f<rsub|\<alpha\>>-T<around|(|g;P|)>=\<chi\><rsup|2><around|(|f<rsub|\<alpha\><rsup|\<ast\>>>,p|)>\<geq\>\<chi\><rsup|2><around|(|\<Omega\>,p|)>=inf<rsub|\<alpha\>>
      sup<rsub|<around|(|\<theta\><rsub|1>,\<lambda\>,\<theta\><rsub|2>|)>>
      <big|int>g*f<rsub|\<alpha\>>-T<around|(|g;P|)>,
    </equation*>

    which, by (<reference|eqn:5.minimax>), concludes the proof.
  </proof>

  Theorem <reference|th:2.weakconv> implies consistency of
  <math|\<chi\><rsub|n><rsup|2>> as an estimator of <math|\<chi\><rsup|2>>
  and convergence in distribution of <math|<sqrt|n>*<around*|(|\<chi\><rsub|n><rsup|2>-\<chi\><rsup|2>|)>>
  to a normally distributed r.v. with mean zero and variance given by
  <math|P<around*|(|<around*|(|-g<rsup|\<ast\>>-<frac|1|4>*g<rsup|\<ast\>><rsup|2>|)><rsup|2>|)>-<around*|(|P*<around*|(|-g<rsup|\<ast\>>-<frac|1|4>*g<rsup|\<ast\>><rsup|2>|)>|)><rsup|2>>,
  under <math|H*1>.

  The asymptotic distribution under the null hypothesis can be found subject
  to the choice of the parametric class <math|<around*|{|f<rsub|\<alpha\>>|}>>
  and of the density <math|r>, as can be deduced from Theorem 3.5 in
  <cite|Broniatowski-Keziou2003>. Following their Theorem 3.5, which holds
  for composite hypothesis testing in a parametric environment, the test
  statistic <math|n*\<chi\><rsub|n><rsup|2>> converges weakly, under
  <math|H*0>, to a chi-squared distribution with degrees of freedom depending
  on the dimension of the parameter space <math|\<Theta\>> and on the
  cardinality of the constraints induced by <math|P\<in\>\<Omega\>>.

  In the following, we focus on definition (<reference|eqn:5.chi>) for
  <math|\<chi\><rsub|n><rsup|2><around|(|\<Omega\>,P|)>>.

  The null hypothesis reduces the space <math|\<Theta\>\<times\>\<Lambda\><rsub|0>>
  to <math|\<Theta\>\<times\><around*|{|0|}>>.

  Theorem 3.5 in <cite|Broniatowski-Keziou2003> implies that the number
  <math|d> of degrees of freedom of the limiting chi-squared distribution
  equals the number of parameters of <math|P> under <math|H*0>. In the
  following we assume <math|d=1>, as in Example <reference|ex:5.example>.

  Let <math|h<around|(|\<theta\>,\<lambda\>;x|)>=<around|(|1-\<lambda\>|)>*f<rsub|\<theta\>><around|(|x|)>+\<lambda\>*r<around|(|x|)>>.

  Checking conditions (C.12)-(C.15) in <cite|Broniatowski-Keziou2003> yields:

  <\thm>
    Under <math|H*0>, with <math|P=P<rsub|\<theta\><rsub|0>>>, assume that

    <\itemize>
      <item*|(i)>The class of contaminated densities
      <math|<around*|{|h<around|(|\<theta\>,\<lambda\>|)>,\<theta\>\<in\>\<Theta\>,\<lambda\>\<in\>\<Lambda\><rsub|0>|}>>
      is <math|P<rsub|\<theta\><rsub|0>>->identifiable;

      <item*|(ii)>The class of functions <math|<around*|{|<frac|h<around|(|\<alpha\>,\<nu\>|)>|h<around|(|\<theta\>,\<lambda\>|)>>,\<theta\>\<in\>\<Theta\><rsub|\<alpha\>>,\<lambda\>\<in\>\<Lambda\><rsub|0>,\<alpha\>\<in\>\<Theta\>,<around|\||\<nu\>|\|>\<less\>\<varepsilon\>|}>>
      is <math|P<rsub|\<theta\><rsub|0>>->GC for some <math|\<varepsilon\>>
      small enough;

      <item*|(iii)>The densities <math|f<rsub|\<theta\>>> are differentiable
      up to the second order in some neighborhood
      <math|V<around|(|\<theta\><rsub|0>|)>> of <math|\<theta\><rsub|0>> and
      <math|F<rsub|\<theta\>><around|(|x|)>=<big|int><rsub|-\<infty\>><rsup|x>f<rsub|\<theta\>><around|(|u|)>*d*u>
      is differentiable with respect to <math|\<theta\>>;

      <item*|(iv)>There exists a neighborhood <math|V> of
      <math|<around|(|\<theta\><rsub|0>,0,\<theta\><rsub|0>,0|)>> such that,
      for every <math|<around|(|\<theta\>,\<lambda\>,\<alpha\>,\<nu\>|)>\<in\>V>
      we have

      <\equation*>
        <array|c|l*l*l|<tformat|<table|<row|<cell|<frac|f<rsub|\<alpha\>>|h<around|(|\<theta\>,\<lambda\>|)>>\<leq\>H<rsub|1><around|(|x|)>,>|<cell|<space|1em>>|<cell|<frac|<wide|f|\<ddot\>><rsub|\<alpha\>>|h<around|(|\<theta\>,\<lambda\>|)>>\<leq\>H<rsub|3><around|(|x|)>,>>|<row|<cell|<frac|<wide|f|\<dot\>><rsub|\<alpha\>>|h<around|(|\<theta\>,\<lambda\>|)>>\<leq\>H<rsub|2><around|(|x|)>,>|<cell|>|<cell|<frac|r|h<around|(|\<theta\>,\<lambda\>|)>>\<leq\>H<rsub|4><around|(|x|)>,>>>>>
      </equation*>

      <no-indent>where each of the functions <math|H<rsub|j>>
      (<math|j=1,2,3,4>) is square integrable with respect to the density
      <math|h<around|(|\<alpha\>,\<nu\>|)>> and is in
      <math|L<rsub|4><around|(|P<rsub|\<theta\><rsub|0>>|)>>.
    </itemize>

    Then <math|n*\<chi\><rsub|n><rsup|2>> converges in distribution to a
    chi-squared r.v. with one degree of freedom.
  </thm>
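
  As an illustration of how this result may be used (a sketch; the level <math|a> and the quantile notation <math|q<rsub|1-a>> are ours), an asymptotic test of level <math|a\<in\><around|(|0,1|)>> rejects <math|H*0> when

  <\equation*>
    n*\<chi\><rsub|n><rsup|2>\<gtr\>q<rsub|1-a>,
  </equation*>

  where <math|q<rsub|1-a>> denotes the <math|<around|(|1-a|)>>-quantile of the chi-squared distribution with one degree of freedom; for instance <math|q<rsub|0.95>\<approx\>3.84>.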

  <section*|Acknowledgements>

  This work was supported by <em|Progetto Ateneo di Padova> coordinated by
  Prof. G. Celant.

  <\bibliography|bib|plain|bibki2>
    <bib-list|[99]|>
  </bibliography>
</body>