<TeXmacs|1.99.7>

<style|<tuple|ieeetran|std-latex>>

<\body>
  <doc-data|<doc-title|Peak Reduction and Clipping Mitigation by Compressive
  Sensing>|<doc-author|<author-data|<author-misc|The authors are with the
  Department of Electrical Engineering, King Fahd University of Petroleum
  <math|&> Minerals, Dhahran, KSA e-mail: alsafadi@kfupm.edu.sa; naffouri@kfupm.edu.sa>|<author-misc|>|<author-name|Ebrahim<nbsp>Al-Safadi
  and Tareq<nbsp>Al-Naffouri>>>|<doc-date|<date|>>>

  <abstract-data|<\abstract>
    This work establishes the design, analysis, and fine-tuning of a
    Peak-to-Average-Power-Ratio (PAPR) reducing system, based on compressed
    sensing at the receiver of a peak-reducing sparse clipper applied to an
    OFDM signal at the transmitter. By exploiting the sparsity of the OFDM
    signal in the time domain relative to a pre-defined clipping threshold,
    the method depends on partially observing the frequency content of
    extremely simple sparse clippers to recover the locations, magnitudes,
    and phases of the clipped coefficients of the peak-reduced signal. We
    claim that in the absence of optimization algorithms at the transmitter
    that confine the frequency support of clippers to a predefined set of
    reserved-tones, no other tone-reservation method can reliably recover the
    original OFDM signal with such low complexity.

    Afterwards, we focus on designing different clipping signals that can
    convey a priori information regarding the support and phase of the
    peak-reducing signal to the receiver, followed by modified compressive
    sensing techniques for enhanced recovery. This includes data-based
    weighted <math|\<ell\><rsub|1>> minimization for enhanced support
    recovery and phase augmentation for homogeneous clippers, followed by
    Bayesian techniques.

    We show that using such techniques for a typical OFDM signal of 256
    subcarriers and 20<math|%> reserved tones, the PAPR can be reduced by
    approximately 4.5 dB with a significant increase in capacity compared to
    a system which uses all its tones for data transmission and clips to such
    levels. The design is hence appealing from both capacity and PAPR
    reduction aspects.
  </abstract>|<abstract-keywords| PAPR reduction| tone reservation
  techniques| compressive sensing| sparse signal estimation.>>

  <IEEEpeerreviewmaketitle>

  <section|Introduction>

  <IEEEPARstart|D|espite> the introduction of Single Carrier Frequency
  Division Multiple Access (SC-FDMA) into current multicarrier transmission
  standards, the success of Orthogonal Frequency Division Multiplexing (OFDM)
  in high data rate transmission remains truly remarkable, with no better
  proof than the fact that variants of the IEEE 802.16 and IEEE 802.18
  standards are still emerging <cite|OFDM_applications_3|Wimax2>.

  The main problem with OFDM signalling, however, lies in the high temporal
  peaks relative to the signal mean, captured by a parameter most commonly
  referred to as the Peak-to-Average-Power-Ratio (PAPR)<footnotemark>. Since an
  OFDM signal is typically constructed by the superposition of a large number
  of modulated subcarriers, its envelope fluctuates with significant
  variance, causing the high PAPR. This forces the use of expensive power
  amplifiers that must operate linearly over a wide range of signal
  amplitudes and that also dissipate a large amount of energy
  <cite|PAPR_overview3>.

  Due to the steadily growing importance of OFDM signals, the problem
  of high PAPR has received considerable attention ever since OFDM was
  adopted in major communication standards (see
  <cite|PAPR_overview1|PAPR_overview2> for an overview). In the last decade,
  the problem of high PAPR in OFDM systems has been tackled by a variety of
  approaches, including coding techniques <cite|coding1|coding3|coding4>,
  selective mapping <cite|selected_mapping|selected_mapping_2_low_complexity>,
  partial transmit sequences <cite|pts_1|pts_2>, constellation expansion
  (also known as tone injection) <cite|Constellation_reshaping|constellation1|constellation2|Tellado2>,
  tone-reservation <cite|Tellado|active_set|Tone_reservation_new>, and
  companding <cite|companding2|companding4|exponential_companding>, to name a
  few. Although many of these reduction techniques are very
  effective, the main obstacle limiting the implementation of most of them is
  their high complexity <cite|PAPR_overview3>.

  In this paper we design, fine-tune, implement, and analyze a novel
  tone-reservation based PAPR reducing system that makes a radically
  different utilization of these tones compared to previous techniques. Such
  a utilization could not have been practically developed without the
  implementation of algorithms capable of robust reconstruction from partial
  frequency observations. Furthermore, the application we propose completely
  shifts the stage at which signal processing complexity is required from
  the transmitter to the receiver of the communication system,
  and hence provides an alternative solution for communication models
  where the transmitter's complexity is a bottleneck.

  To the best of our knowledge, this is the first
  work in the literature where PAPR reduction is achieved using compressive
  sensing (CS) <cite|Safadi>. The methods throughout assume
  sparsity of clipping events relative to a clipping threshold and use null
  tones to estimate these events, providing the first application in this
  context of the major work of Candes and Tao on recovering sparse signals
  from highly incomplete frequency information <cite|Candes1>. As such,
  we also remove the obstacle faced by all previous tone-reservation-based
  PAPR reduction techniques, from the pioneering work of Tellado
  <cite|Tellado|Tellado2> until very recently
  <cite|Chen_tone|Kashin|Shao|Janaaththanan>, all of which required careful
  construction of peak-reducing signals at the transmitter in order to keep
  them orthogonal to the data signal in the frequency domain.

  Afterwards, we pursue several enhancements of the basic algorithm:
  designing different clipping techniques at the transmitter, modifying
  the CS algorithm to make use of a priori support and phase information, and
  applying Bayesian estimation techniques for joint support and amplitude
  estimation at the final stage.

  Unless mentioned otherwise, we use lower case letters for (column) vectors
  and upper case letters for matrices. Since we will be toggling extensively
  between the time domain and frequency domain, we will denote by
  <math|<wide|x|\<check\>>> the Discrete Fourier Transform (DFT) of <math|x>,
  while we reserve the hat notation <math|<wide|x|^>> to denote the estimate
  of <math|x>. We use <math|x<around|(|i|)>> to denote a scalar which is the
  <math|i<rsup|t*h>> coefficient of the vector <math|x>, while we reserve the
  subindex notation in <math|x<rsub|i>> to denote a vector that is the
  <math|i<rsup|t*h>> column of the matrix <math|X>. Furthermore, we denote by
  <math|x<rsup|H>> the Hermitian conjugate of <math|x>.

  The vectors we treat throughout are complex in general and of dimension
  <math|N>. We denote by <math|<around|\<\|\|\>|x|\<\|\|\>><rsub|p>=<around*|(|<big|sum><rsub|i=1><rsup|N><around|\||x<around|(|i|)>|\|><rsup|p>|)><rsup|1/p>>
  the <math|\<ell\><rsub|p>>-norm of a vector <math|x> where <math|p> could
  be an integer or a real number between zero and one. In the special case
  where <math|p=0> the definition is modified to the pseudo-norm
  <math|<around|\<\|\|\>|x|\<\|\|\>><rsub|0>=<big|sum><rsub|i=1><rsup|N>q<around|(|i|)>>,
  where <math|q<around|(|i|)>=1> if <math|x<around|(|i|)>\<neq\>0> and
  <math|q<around|(|i|)>=0> otherwise.

  Although we use the upper case letter <math|<with|font-series|bold|F>> for
  the Fourier matrix, it will be clear from context when we also use it to
  denote the Cumulative Distribution Function (CDF) of a random variable
  <math|<with|font-series|bold|x>>, <math|\<bbb-F\><rsub|<with|font-series|bold|x>><around|(|x|)>>
  and Complementary CDF, <math|<wide|\<bbb-F\>|\<bar\>><rsub|<with|font-series|bold|x>><around|(|x|)>=1-\<bbb-F\><rsub|<with|font-series|bold|x>><around|(|x|)>>.
  The Probability Density Function (PDF) will then be denoted by
  <math|f<rsub|<with|font-series|bold|x>><around|(|x|)>>. We use
  <math|E<around|[|<with|font-series|bold|x><rsup|m>|]>> to denote the
  <math|m<rsup|t*h>> moment of a random variable
  <math|<with|font-series|bold|x>>.

  <section|Transceiver Model>

  We define the time-domain complex base-band transceiver model as

  <\equation>
    y<around|(|k|)>=<big|sum><rsub|\<ell\>=0><rsup|L-1>h<around|(|\<ell\>|)>*x*<around|(|k-\<ell\>|)>+z<around|(|k|)>,
  </equation>

  where <math|<around|{|x<around|(|k|)>|}>> and
  <math|<around|{|y<around|(|k|)>|}>> denote the channel scalar input and
  output, <math|h=<around|(|h<around|(|0|)>,h<around|(|1|)>,\<ldots\>,h<around|(|L-1|)>|)>> is
  the impulse response of the channel, and <math|z<around|(|k|)>\<sim\><with|math-font|cal|C*N><around|(|0,\<sigma\><rsub|z><rsup|2>|)>>
  is AWGN. In matrix form this becomes

  <\equation>
    y=<with|font-series|bold|H>*x+z,
  </equation>

  where <math|y> and <math|x> are the time-domain OFDM receive and transmit
  signal blocks (after cyclic prefix removal) and
  <math|z\<sim\><with|math-font|cal|C*N><around|(|<with|font-series|bold|0>,\<sigma\><rsub|z><rsup|2>*<with|font-series|bold|I>|)>>.

  By the cyclic prefix, <math|<with|font-series|bold|H>> is a circulant
  matrix describing the cyclic convolution of the channel impulse response
  with the block <math|x> and can be decomposed into
  <math|<with|font-series|bold|H>=<with|font-series|bold|F><rsup|H>*<with|font-series|bold|DF>>
  where <math|<with|font-series|bold|F>> denotes the unitary Discrete Fourier
  Transform (DFT) matrix with <math|<around|(|k,\<ell\>|)>> element

  <\equation*>
    <around*|[|F<around|(|k,\<ell\>|)>|]>=N<rsup|-1/2>*<space|0.17em>e<rsup|-j*2*\<pi\>*k*\<ell\>/N>,<space|1em>k,\<ell\>\<in\><around|{|0,1,\<ldots\>,N-1|}>,
  </equation*>

  <math|<with|font-series|bold|D>=<with|font-series|medium|d*i*a*g><around|(|<wide|h|\<check\>>|)>>,
  and <math|<wide|h|\<check\>>=<sqrt|N>*<with|font-series|bold|F>*h> is the
  DFT of the channel impulse response (zero-padded to length <math|N>).
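
  As a numerical sanity check of the decomposition
  <math|<with|font-series|bold|H>=<with|font-series|bold|F><rsup|H>*<with|font-series|bold|DF>>,
  the following Python sketch builds a small circulant channel matrix and
  compares it with the right-hand side. The block length of 16, the 4 random
  taps, and the helper names unitary_dft and circulant_channel are
  illustrative assumptions and not part of the system model.

```python
import numpy as np

def unitary_dft(N):
    # Unitary DFT matrix with entries N^(-1/2) * exp(-j*2*pi*k*l/N).
    k = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

def circulant_channel(h, N):
    # N x N circulant matrix whose first column is h zero-padded to length N.
    h_pad = np.concatenate([h, np.zeros(N - len(h), dtype=complex)])
    return np.column_stack([np.roll(h_pad, shift) for shift in range(N)])

N, L = 16, 4                      # toy sizes (assumed for illustration)
rng = np.random.default_rng(0)
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)

F = unitary_dft(N)
H = circulant_channel(h, N)
h_pad = np.concatenate([h, np.zeros(N - L, dtype=complex)])
D = np.diag(np.sqrt(N) * (F @ h_pad))     # D = diag(h_check)

# Largest entry-wise deviation between H and F^H D F.
decomposition_error = np.max(np.abs(H - F.conj().T @ D @ F))
```

  Under these assumptions, decomposition_error should sit at the level of
  floating-point rounding.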

  <section|Basic PAPR Reduction Design><label|basic design>

  The time-domain OFDM signal <math|x> is typically constructed by taking the
  IDFT of the data vector <math|<wide|d|\<check\>>> whose entries are drawn
  from a generic constellation. Since this signal is of high PAPR, we add a
  peak-reducing signal <math|c> of arbitrary spectral support at the
  transmitter and then estimate it and subtract it from the demodulated
  signal at the receiver.

  In what follows, the main condition we impose on <math|c> is that it be
  sparse in time. This is the case if we set a clipping threshold
  <math|\<gamma\>> on the envelope of the OFDM symbols, or if the transmitter
  clips only the highest <math|s> peaks. By the incoherence property of the
  time-frequency bases <cite|Candes1>, this necessarily implies that <math|c>
  is then dense (i.e. non-sparse) in the frequency domain <cite|Tropp4> and
  such a condition thus cannot be satisfied in methods where the data and
  peak-reducing signal must occupy disjoint tones
  <cite|Tellado|Tone_reservation_new|active_set|Chen_tone|Kashin|Shao|Janaaththanan>.
  We will denote by <math|\<cal-I\><rsub|c>=<around|{|i:<around|\||c<around|(|i|)>|\|>\<neq\>0|}>>
  the sparse temporal support of <math|c> where
  <math|<around|\||\<cal-I\><rsub|c>|\|>=s=<around|\<\|\|\>|c|\<\|\|\>><rsub|0>>.

  Throughout this work, we will only consider clipping the Nyquist rate
  samples of the OFDM signal. This restriction is not essential to
  the data-augmented CS methods we prescribe, but removing it would
  require more elaborate tools such as recent findings that deal
  with block sparsity <cite|Block_Sparsity1|Block_Sparsity2>, and we
  defer such topics for lack of space. With this in mind, following
  <cite|Imai> and <cite|Wei> we assume that the entries of <math|x> are
  uncorrelated and that the real and imaginary parts of <math|x> are
  asymptotically Gaussian processes for large <math|N>. This directly implies
  that the entries of <math|x> are independent and that the envelope of
  <math|x> can be modeled as a sequence of i.i.d. Rayleigh random
  variables with a common CDF <math|\<bbb-F\><rsub|<around|\||X|\|>><around|(|<around|\||x|\|>|)>>
  and parameter <math|\<sigma\><rsub|<around|\||X|\|>>>, which we will use
  extensively throughout.

  Denoting by <math|\<Omega\>> the set of <math|N> frequencies in an OFDM
  signal, let <math|\<Omega\><rsub|d>\<subset\>\<Omega\>> be
  the set of frequencies that are used for data transmission and
  <math|\<Omega\><rsub|m>=\<Omega\>\<setminus\>\<Omega\><rsub|d>> the
  complementary set of measurement tones, of cardinality
  <math|<around|\||\<Omega\><rsub|m>|\|>=m>. Note that for compressive
  sensing purposes, a near optimal strategy is to use a random assignment of
  tones for estimating <math|c> <cite|Candes2>. <footnote|Based on results in
  <cite|Xia> it was found in <cite|Safadi> and <cite|Naffouri> that by using
  <with|font-shape|italic|difference sets>, one is able to boost the
  performance of the recovery algorithm and reduce the symbol error rate.>

  The data symbols <math|<wide|d|\<check\>><around|(|i|)>> are drawn from a QAM
  constellation of size <math|M> and are carried on
  <math|\<Omega\><rsub|d>> of cardinality
  <math|<around|\||\<Omega\><rsub|d>|\|>=N-m=k>. Consequently, the
  transmitted peak-reduced time-domain signal is

  <\equation>
    <wide|x|\<bar\>>=x+c=<with|font-series|bold|F><rsup|H>*<with|font-series|bold|S><rsub|x>*<wide|d|\<check\>>+c
  </equation>

  where <math|<with|font-series|bold|S><rsub|x>> is an <math|N\<times\>k>
  selection matrix containing only one element equal to 1 per column, and
  with <math|m> zero rows. The columns of
  <math|<with|font-series|bold|S><rsub|x>> index the subcarriers that are
  used for data transmission in the OFDM system. Similarly, we denote by
  <math|<with|font-series|bold|S><rsub|m>> the <math|N\<times\>m> matrix with
  a single element equal to 1 per column, whose columns span the orthogonal
  complement of the columns of <math|<with|font-series|bold|S><rsub|x>>.

  <no-indent>Demodulation amounts to computing the DFT

  <eqnarray|<tformat|<table|<row|<cell|<wide|y|\<check\>>>|<cell|=>|<cell|<with|font-series|bold|F>y=<with|font-series|bold|F>*<around|(|<with|font-series|bold|H>*<wide|x|\<bar\>>+z|)>>>|<row|<cell|>|<cell|=>|<cell|<with|font-series|bold|F>*<around|(|<with|font-series|bold|F><rsup|H>*<with|font-series|bold|D*F>*<around|(|<with|font-series|bold|F><rsup|H>*<with|font-series|bold|S><rsub|x>*<wide|d|\<check\>>+c|)>+z|)>>>|<row|<cell|>|<cell|=>|<cell|<with|font-series|bold|DS><rsub|x><wide|d|\<check\>>+<with|font-series|bold|DF>c+<wide|z|\<check\>><eq-number>>>>>>

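  where <math|<wide|z|\<check\>>=<with|font-series|bold|F>z> has the same
  distribution as <math|z> since <math|<with|font-series|bold|F>> is unitary.
  Assuming the channel is known at the receiver, we can now estimate <math|c>
  by projecting <math|<wide|y|\<check\>>> onto the orthogonal complement of
  the signal subspace. Since <math|<with|font-series|bold|D>> is diagonal,
  <math|<with|font-series|bold|S><rsub|m><rsup|T>*<with|font-series|bold|D>*<with|font-series|bold|S><rsub|x>=<with|font-series|bold|0>>
  and the data term vanishes, leaving us with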

  <eqnarray|<tformat|<table|<row|<cell|<wide|y|\<acute\>>>|<cell|=>|<cell|<with|font-series|bold|S><rsub|m><rsup|T>*<wide|y|\<check\>>>>|<row|<cell|>|<cell|=>|<cell|<with|font-series|bold|S><rsub|m><rsup|T>*<with|font-series|bold|D*F>*c+<wide|z|\<acute\>>>>|<row|<cell|>|<cell|=>|<cell|\<Psi\>*c+<wide|z|\<acute\>>.<eq-number><label|y>>>>>>

  Note that <math|<wide|z|\<acute\>>=<with|font-series|bold|S><rsub|m><rsup|T>*<with|font-series|bold|F>z>
  is an <math|m\<times\>1> vector of i.i.d. Gaussian entries with covariance
  matrix <math|<with|font-series|bold|R><rsub|<wide|z|\<acute\>>>=\<sigma\><rsub|z><rsup|2>*<with|font-series|bold|I><rsub|m\<times\>m>>.
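
  The measurement model in (<reference|y>) can be illustrated with a short
  Python sketch that builds the two selection matrices, transmits a toy
  peak-reduced block, and confirms that the reserved tones observe only the
  clipping signal. Noise is omitted, and the block size, the QPSK data, the
  unit-modulus channel gains, and all variable names are assumptions made
  purely for illustration.

```python
import numpy as np

N, m, s = 64, 16, 3                      # toy sizes (assumed)
rng = np.random.default_rng(1)
k_idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k_idx, k_idx) / N) / np.sqrt(N)

# Random split of the N tones into m measurement tones and N - m data tones.
perm = rng.permutation(N)
meas_tones = np.sort(perm[:m])
data_tones = np.sort(perm[m:])
S_m = np.eye(N)[:, meas_tones]           # N x m selection matrix
S_x = np.eye(N)[:, data_tones]           # N x (N - m) selection matrix

# QPSK data, hypothetical unit-modulus channel gains, s-sparse clipping signal.
d = (rng.choice([-1, 1], N - m) + 1j * rng.choice([-1, 1], N - m)) / np.sqrt(2)
h_check = np.exp(2j * np.pi * rng.random(N))
D = np.diag(h_check)
c = np.zeros(N, dtype=complex)
c[rng.choice(N, s, replace=False)] = rng.standard_normal(s) + 1j * rng.standard_normal(s)

x_bar = F.conj().T @ S_x @ d + c         # transmitted block, noise omitted
y_check = F @ (F.conj().T @ D @ F @ x_bar)
y_acute = S_m.T @ y_check                # keep the reserved tones only
Psi = S_m.T @ D @ F                      # m x N sensing matrix

# The data term vanishes on the reserved tones, so y_acute equals Psi c.
model_error = np.max(np.abs(y_acute - Psi @ c))
```

  In this noiseless toy setting, model_error should be numerically zero,
  which is the statement that the data occupies the orthogonal complement of
  the measurement tones.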

  The observation vector <math|<wide|y|\<acute\>>> is a projection of the
  sparse <math|N>-dimensional peak-reducing signal <math|c> onto a basis of
  dimension <math|m\<ll\>N> corrupted by <math|<wide|z|\<acute\>>>. To
  demonstrate how such an <math|N>-dimensional vector can be estimated from
  <math|m> linear measurements, we refer the reader to
  <cite|Candes1|Candes2|Donoho2|Tropp2|Tropp3|Fletcher|Wainwright1|Wainwright2>,
  which also investigate theoretical bounds on <math|m>, <math|s>, and
  <math|N> for guaranteed recovery under various conditions. Note that in our
  case, the number of measurements <math|m> is equivalent to the number of
  reserved tones, while the number of clipped coefficients is equivalent to
  <math|s>, and hence the amount of clipping should be below certain bounds
  for reliable recovery given a fixed number of tones <math|m>. However,
  these generic CS bounds will be significantly relaxed to our advantage in
  the second part of the paper when we exploit background information from
  the data vector <math|x>.

  Returning to our problem, assume the peak-reducing signal <math|c> is
  <math|s>-sparse in time. Given <math|<wide|y|\<acute\>>> in
  (<reference|y>), we can use any compressive sensing technique at the
  receiver to estimate <math|c>. We will follow the mainstream CS literature
  and use a convex relaxation of an otherwise NP-hard problem <cite|Tropp3>
  such as

  <eqnarray|<tformat|<table|<row|<cell|min<rsub|c\<in\><with|font-series|bold|C><rsup|N>><around|\<\|\|\>|<wide|y|\<acute\>>-\<Psi\>*c|\<\|\|\>><rsub|p><rsup|p>+\<lambda\><around|\<\|\|\>|<space|0.17em>c<space|0.17em>|\<\|\|\>><rsub|1><eq-number><label|CS>>>>>>

  <no-indent>for recovery, where <math|p> is either <math|1> (for basis
  pursuit <cite|Donoho1>) or <math|2> (for LASSO <cite|Tibshirani>) and
  <math|\<lambda\>> is a parameter for adjusting the sparsity penalty. The
  resulting solution by compressive sensing alone is an estimate
  <math|<wide|c|^><rsub|c*s>> of the peak reducing signal which not only
  reliably detects the positions of its nonzero entries, but also gives a
  good approximation to the corresponding amplitudes. Notice however that the
  estimation of <math|c> is by no means restricted to convex relaxations such
  as (<reference|CS>), and any compressive sensing method is valid in
  general, thus opening the door for many possible improvements in regard to
  complexity and efficiency.
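
  To make the recovery step concrete, the sketch below solves the
  <math|p=2> case of (<reference|CS>) with the iterative
  soft-thresholding algorithm (ISTA). ISTA is a generic proximal-gradient
  solver chosen here only for brevity and is not necessarily the solver used
  for the results in this paper. The objective is scaled by one half, which
  only rescales <math|\<lambda\>>. An ideal channel, a noiseless observation,
  and the sizes and names below are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    # Complex soft-thresholding: shrink magnitudes by tau, keep phases.
    mag = np.abs(v)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-15), 0.0)
    return scale * v

def ista_lasso(Psi, y, lam, n_iter=500):
    # Minimizes 0.5 * norm2(y - Psi c)^2 + lam * norm1(c) by proximal steps.
    step = 1.0 / np.linalg.norm(Psi, 2) ** 2
    c_hat = np.zeros(Psi.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = Psi.conj().T @ (Psi @ c_hat - y)
        c_hat = soft_threshold(c_hat - step * grad, step * lam)
    return c_hat

N, m, s = 64, 32, 3                      # toy sizes (assumed)
rng = np.random.default_rng(2)
k_idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k_idx, k_idx) / N) / np.sqrt(N)
Psi = F[np.sort(rng.choice(N, m, replace=False)), :]   # ideal channel, D = I

support = np.sort(rng.choice(N, s, replace=False))
c = np.zeros(N, dtype=complex)
c[support] = (1.0 + rng.random(s)) * np.exp(2j * np.pi * rng.random(s))

y_acute = Psi @ c                        # noiseless toy observation
c_cs = ista_lasso(Psi, y_acute, lam=0.01)
support_cs = np.sort(np.argsort(np.abs(c_cs))[-s:])    # s largest entries
```

  Here support_cs keeps the <math|s> largest entries of the estimate, which
  presumes that <math|s> is known. A threshold on the magnitudes would be
  the natural alternative when it is not.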

  <big-figure|<with|par-mode|center|<image|primary_illustration|3.5in|||><label|primary>>|Clipping
  and Tone Reservation>

  Fig. <reference|primary> illustrates the main points we have described so
  far, although caution must be taken as the actual OFDM signal is generally
  complex.

  The block diagram in Fig. <reference|Block Diagram> stresses that upon
  observing <math|y>, the receiver is confronted with two estimation
  problems: the first is the typical estimation of the transmitted (clipped)
  OFDM signal <math|<wide|x|\<bar\>>>, and the second is the estimation of
  the peak-reducing signal <math|c>. Although the noise statistics are the
  same in both cases, the estimation SNR is nevertheless very different,
  depending on the clipping procedure. We will hence reserve the SNR notation
  for the received signal-to-noise ratio and denote by CNR the
  clipper-to-noise ratio, which is defined as

  <eqnarray|<tformat|<table|<row|<cell|<with|font-series|medium|C*N*R>>|<cell|=>|<cell|<frac|E<around*|[|<around|\<\|\|\>|\<Psi\>*c|\<\|\|\>><rsup|2>|]>|E<around*|[|<around|\<\|\|\>|<wide|z|\<acute\>>|\<\|\|\>><rsup|2>|]>>>>|<row|<cell|>|<cell|=>|<cell|<frac|E<around*|[|<around|\<\|\|\>|<big|sum><rsub|k\<in\>\<cal-I\><rsub|c>>c<around|(|k|)>*\<psi\><rsub|k>|\<\|\|\>><rsup|2>|]>|m*\<sigma\><rsub|z><rsup|2>><eq-number>>>>>>

  where <math|\<psi\><rsub|k>> is the <math|k<rsup|t*h>> column of
  <math|\<Psi\>>. The CNR hence depends on the sparsity level
  <math|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>=<around|\||\<cal-I\><rsub|c>|\|>>
  and the magnitudes of <math|<around|{|c<around|(|k|)>|}><rsub|k\<in\>\<cal-I\><rsub|c>>>,
  which are both functions of the clipping threshold <math|\<gamma\>>. This
  is the parameter of concern when it comes to compressive sensing in this
  paper. In practice, the CNR is typically lower than the SNR since the
  energy of <math|c> leaks onto all the subcarriers even though the CS
  algorithm only has access to <math|<frac|m|N>> of them, and also since the
  magnitudes of the nonzero coefficients of <math|c> are generally smaller
  than those of <math|x>.

  <big-figure|<with|par-mode|center|<image|paper_block_diagrams_new_tx_and_rx.eps|2.5in|||><label|Block
  Diagram>>|Block Diagram of Basic Design>

  Note that in using CS our objective is to find the support
  <math|\<cal-I\><rsub|c>> of the sparse signal and its complex coefficients
  <math|<around|{|v<around|(|k|)>|}><rsub|k\<in\>\<cal-I\><rsub|c>>> at those
  locations. We could hence decouple the two problems by writing
  <math|c=<with|font-series|bold|S><rsub|c>*v<rsub|c>> and use CS for the
  first problem only, giving us <math|<wide|<with|font-series|bold|S>|^><rsup|<around|(|c*s|)>><rsub|c>>
  based on <math|<wide|\<cal-I\>|^><rsup|<around|(|c*s|)>><rsub|c>>, then
  refine our coefficient estimate by a more robust technique such as least
  squares after conditioning on the detected support. To do so we define the
  <math|m\<times\>s> matrix <math|<wide|\<Phi\>|^>=\<Psi\>*<wide|<with|font-series|bold|S>|^><rsup|<around|(|c*s|)>><rsub|c>>
  and refine our amplitude estimate to

  <\equation>
    <wide|v|^><rsup|<around|(|l*s\|c*s|)>><rsub|c>=<around|(|<wide|\<Phi\>|^><rsup|H>*<wide|\<Phi\>|^>|)><rsup|-1>*<wide|\<Phi\>|^><rsup|H>*<wide|y|\<acute\>><label|ls
    cs>
  </equation>

  in which <math|<wide|c|^><rsup|<around|(|c*s,l*s|)>>=<wide|<with|font-series|bold|S>|^><rsup|<around|(|c*s|)>><rsub|c>*<wide|v|^><rsup|<around|(|l*s\|c*s|)>><rsub|c>>
  follows. This dual approach is necessary in order to approach an oracle
  receiver that uses least squares (see the interesting discussion in
  <cite|Wainwright1>).
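
  The refinement in (<reference|ls cs>) amounts to an ordinary least-squares
  fit restricted to the detected columns of <math|\<Psi\>>. A minimal Python
  sketch follows, assuming for illustration that the CS stage has already
  returned the correct support, that the channel is ideal, and that the
  observation is noiseless. The helper name ls_refine and the toy values are
  hypothetical.

```python
import numpy as np

def ls_refine(Psi, y, support_hat):
    # Least squares on the columns of Psi indexed by the detected support.
    Phi = Psi[:, support_hat]
    v_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    c_hat = np.zeros(Psi.shape[1], dtype=complex)
    c_hat[support_hat] = v_hat
    return c_hat

N, m = 64, 24                            # toy sizes (assumed)
rng = np.random.default_rng(3)
k_idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k_idx, k_idx) / N) / np.sqrt(N)
Psi = F[np.sort(rng.choice(N, m, replace=False)), :]   # ideal channel, D = I

support = np.array([5, 20, 41])          # assumed to be detected by the CS stage
c = np.zeros(N, dtype=complex)
c[support] = np.array([1.0 + 0.5j, -0.8j, 0.6 - 0.9j])
y_acute = Psi @ c                        # noiseless toy observation

c_ls = ls_refine(Psi, y_acute, support)
refine_error = np.max(np.abs(c_ls - c))
```

  The call to np.linalg.lstsq computes the same solution as the
  pseudo-inverse expression in (<reference|ls cs>) whenever
  <math|<wide|\<Phi\>|^>> has full column rank, while avoiding the explicit
  matrix inverse.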

  <section|Comparison with Typical Tone-Reservation PAPR Reduction
  Techniques>

  The common function of reserved tones in the literature is to act as a
  frequency support for the peak reducing signal that is disjoint from the
  data-carrying tones <cite|Tellado|active_set|Tone_reservation_new|Shao|Kashin|Janaaththanan>.
  In other words, for each OFDM signal a search is conducted for some signal
  <math|c> that will reduce the PAPR while being spectrally confined to a
  limited number of tones such that <math|<around|\<\|\|\>|<wide|c|\<check\>>|\<\|\|\>><rsub|2>-<around|\<\|\|\>|<with|font-series|bold|S><rsub|m><rsup|T>*<wide|c|\<check\>>|\<\|\|\>><rsub|2>=0>
  and hence <math|<wide|c|\<check\>><rsup|<space|0.17em>H>*<wide|d|\<check\>>=0>.
  Although many different methods exist to find such a signal, we only
  mention the well-known work of Tellado <cite|Tellado> for brevity, which
  requires solving the convex optimization problem

  <eqnarray|<tformat|<table|<row|<cell|>|<cell|>|<cell|min<rsub|<wide|c|\<check\>>>
  <space|1em>t>>|<row|<cell|>|<cell|>|<cell|s.*t.*<space|0.22em><around|\<\|\|\>|x+<with|font-series|bold|F><rsup|H>*<with|font-series|bold|S>*<wide|c|\<check\>>|\<\|\|\>><rsup|2>\<leq\>t<eq-number><label|Tellado>>>>>>

  where <math|<wide|c|\<check\>>=<with|font-series|bold|F>*c> is nonzero only
  on <math|\<Omega\><rsub|m>> from the definition of
  <math|<with|font-series|bold|S>>. Clearly, this optimization approach
  should result in significantly more PAPR reduction compared to our design,
  since for the same number of reserved tones <math|m>, we can only clip
  the <math|s\<less\>m> largest peaks, whereas Tellado's method imposes no
  such restriction.

  Most importantly, however, the main complexity (i.e. the stage at which the
  optimization search is performed) in these techniques is at the
  transmitter, since the main concern is to find <math|c> that will reduce
  the PAPR while occupying completely disjoint tones in order to remain
  discernible at the receiver.

  <section|Enhanced PAPR Reduction by Data-Induced Weighted and
  Phase-Augmented <math|\<ell\><rsub|1>> Minimization><label|WCS>

  So far we were only interested in using compressive sensing in its most
  abstract form as it applies to our problem. We assumed, following the
  general literature on CS, that absolutely no information is known about the
  locations, magnitudes, and phases of the sparse signal <math|c>, beyond the
  incomplete frequency observations which we obtained from the reserved tones
  <math|\<Omega\><rsub|m>> <cite|Candes1|Candes2>. In other words, the model
  <math|<wide|y|\<acute\>>=\<Psi\>*c+<wide|z|\<acute\>>> was assumed to exist
  independently of the general transceiver model
  <math|y=<with|font-series|bold|H>*<wide|x|\<bar\>>+z>, even though in reality we know that <math|c>
  is intimately linked to <math|<wide|x|\<bar\>>> by the simple fact that
  it is superimposed on <math|x> in the time domain.

  The aim of this section is to demonstrate that for optimal PAPR
  reduction using CS, the estimation of the clipping signal at the receiver
  should exploit as much information as possible in both basis
  representations, which can be achieved by weighting, constraining, or
  rotating the frequency-based CS search, based on information we infer from
  the data in the time domain.

  The difficulty of these problems is strongly related to the way clipping is
  performed. Although we have full control in selecting the sparsity level
  and the clipping magnitudes and phases to best suit our purpose, no
  clipping technique can optimize both the support recovery
  <em|and> the coefficient estimation, and a compromise must be made regarding
  the quality of the two.

  <subsection|Homogeneous Clipping Techniques>

  We begin by defining two simple clipping techniques that do not
  require any optimization or spectral confinement. Although we derive
  their PDFs along with other properties, we focus exclusively on deterministic CS
  enhancement techniques<footnote|Although the LASSO estimate has a MAP
  interpretation <cite|Tibshirani>, we do not assume that any prior or statistic is
  used.> and delay the matter of Bayesian compressive estimation or sensing
  to the following section.

  <subsubsection|Peak Suppression to <math|\<gamma\>> (PS)><label|PS>

  Because clipping is done on the coefficients of <math|x> whose envelope
  exceeds <math|\<gamma\>>, the most natural construction of the clipping
  signal <math|c> is to suppress the magnitudes of the
  entries <math|x<around|(|i|)>:<around|\||x<around|(|i|)>|\|>\<gtr\>\<gamma\>> to
  <math|\<gamma\>> while preserving their angles, such that
  <math|<around|\||x<around|(|i|)>+c<around|(|i|)>|\|>=\<gamma\>> (see Fig. <reference|PS figure>).
  This is commonly expressed in the literature
  <cite|Bahai|Capacity_clipped_1> as

  <eqnarray|<tformat|<table|<row|<cell|<wide|x|\<bar\>><around|(|i|)>=<choice|<tformat|<table|<row|<cell|\<gamma\>*e<rsup|<space|0.27em>j*\<theta\><rsub|x<around|(|i|)>>><space|0.27em>>|<cell|<text|if><space|0.27em><around|\||x<around|(|i|)>|\|>\<gtr\>\<gamma\>,>>|<row|<cell|<space|0.17em><space|0.17em><space|0.17em><space|0.17em>x<around|(|i|)><space|0.27em>>|<cell|<text|otherwise>>>>>><eq-number>>>>>>

  The PDF of the nonzero coefficients of the resulting clipping signal
  <math|c<rsup|p*s>> depends on the PDF of <math|<around|\||x|\|><mid|\|><around|\||x|\|>\<gtr\>\<gamma\>>.
  Hence if we define the binary set <math|\<cal-Q\>> to label the mutually
  exclusive events of clipping or not clipping at a certain index <math|i>, then

  <eqnarray|<tformat|<table|<row|<cell|f<around|(|<around|\||c<rsup|p*s><around|(|i|)>|\|>|)>>|<cell|=>|<cell|<big|sum><rsub|q\<in\>\<cal-Q\>>P*<around*|(|<around|\||c<rsup|p*s><around|(|i|)>|\|><space|0.17em><mid|\|><space|0.17em>q|)>*P<around|(|q|)>>>|<row|<cell|>|<cell|=>|<cell|f<rsub|<around|\||X|\|>\|<around|\||X|\|>\<gtr\>\<gamma\>>*<around|(|<around|\||c<rsup|p*s><around|(|i|)>|\|>+\<gamma\>|)><around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)>>>|<row|<cell|>|<cell|>|<cell|<space|1em><space|1em>+<space|0.27em>\<delta\><around|(|<around|\||c<rsup|p*s><around|(|i|)>|\|>|)>*\<bbb-F\><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>>>|<row|<cell|>|<cell|=>|<cell|\<alpha\><rsup|-1><around|(|\<gamma\>|)><around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)>*f<rsub|<around|\||X|\|>>*<around|(|<around|\||c<rsup|p*s><around|(|i|)>|\|>+\<gamma\>|)>>>|<row|<cell|>|<cell|>|<cell|<space|1em><space|1em>\<cdot\><space|0.27em>u<around|(|<around|\||c<rsup|p*s><around|(|i|)>|\|>|)>+\<bbb-F\><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>*\<delta\><around|(|<around|\||c<rsup|p*s><around|(|i|)>|\|>|)><eq-number><label|pdf>>>>>>

  <no-indent>where <math|u<around|(|\<cdummy\>|)>> is the unit step function
  and <math|\<alpha\><around|(|\<gamma\>|)>=<big|int><rsub|\<gamma\>><rsup|\<infty\>>f<rsub|<around|\||X|\|>><around*|(|<around|\||x|\|>|)>*d<around|\||x|\|>>
  is a normalizing constant which depends only on <math|\<gamma\>> and is
  required to ensure that <math|<big|int><rsub|0><rsup|\<infty\>>f<rsub|<around|\||X|\|>\|<around|\||X|\|>\<gtr\>\<gamma\>>*<around*|(|<around|\||x|\|>\|<around|\||x|\|>\<gtr\>\<gamma\>|)>*d<around|\||x|\|>=1>.
  Not surprisingly, this is the most popular soft clipping scheme due to its
  simplicity and relatively low spectral distortion.

  Two features of this clipping scheme stand out in regard to CS enhancement.
  The first is that by suppressing all the data coefficients to a fixed and
  known threshold value <math|\<gamma\>>, we could actually infer some
  additional information regarding possible clipping locations from the
  distance between the estimated coefficients' magnitudes and
  <math|\<gamma\>>. This clipping scheme can hence provide additional
  information regarding the support <math|\<cal-I\><rsub|c>>. The second
  feature is that the nonzero coefficients of <math|c<rsup|p*s>> are exactly
  anti-phased with the data coefficients at
  <math|\<cal-I\><rsub|c>><footnote|We will call such signals
  <em|homogeneous> clippers since their phases are aligned with the data.>,
  giving us another source of information regarding the phases
  <math|\<theta\><rsub|c<rsup|p*s><around|(|\<cal-I\><rsub|c>|)>>> based on
  <math|<wide|<wide|x|\<bar\>>|^>>.
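
  Both features can be observed numerically. The Python sketch below applies
  peak suppression to a block of i.i.d. complex Gaussian samples, following
  the Rayleigh-envelope model assumed earlier, and checks that every clipped
  sample sits exactly at <math|\<gamma\>> and that the clipping signal is
  anti-phased with the data on its support. The function name
  peak_suppression, the threshold value of 1.5, and the block length are
  illustrative assumptions.

```python
import numpy as np

def peak_suppression(x, gamma):
    # Clip magnitudes above gamma down to gamma while preserving the phase.
    mag = np.abs(x)
    clipped = np.greater(mag, gamma)
    x_bar = np.where(clipped, gamma * np.exp(1j * np.angle(x)), x)
    return x_bar, x_bar - x, clipped

N, gamma = 256, 1.5                      # toy values (assumed)
rng = np.random.default_rng(4)
# i.i.d. complex Gaussian samples with unit average power.
x = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

x_bar, c_ps, clipped = peak_suppression(x, gamma)
support = np.flatnonzero(clipped)

# Phase difference between c and x on the support; anti-phased means pi.
phase_gap = np.abs(np.angle(c_ps[support] / x[support]))

def papr_db(v):
    # Peak power over average power of a block, in decibels.
    power = np.abs(v) ** 2
    return 10 * np.log10(np.max(power) / np.mean(power))

papr_before_db = papr_db(x)
papr_after_db = papr_db(x_bar)
```

  Since the magnitudes of the clipped samples of <math|<wide|x|\<bar\>>> all
  equal <math|\<gamma\>>, a receiver can flag estimated coefficients whose
  magnitude is close to <math|\<gamma\>> as likely clipping locations, which
  is the support information referred to above.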

  <big-figure|<with|par-mode|center|<image|peak_suppression_JNL.eps|2.5in|||><label|PS
  figure>>|Peak Suppression Illustrated on the Complex Plane>

  In terms of detectability by standard compressive sensing, however, the
  method is quite unsatisfying if left unenhanced, demanding a higher
  number of measurements for the same sparsity level and Symbol Error Rate
  (SER) compared to other clipping techniques. The main reasons are

  <\enumerate>
    <item><with|font-series|bold|Low CNR>: The CNR in PS decreases very
    rapidly with <math|\<gamma\>>. Assuming we neglect the effect of
    <math|\<Psi\>>,

    <eqnarray|<tformat|<table|<row|<cell|E<around|[|<around|\<\|\|\>|c<rsup|p*s>|\<\|\|\>><rsup|2>|]>>|<cell|=>|<cell|<big|sum><rsub|k\<in\>\<cal-I\><rsub|c>>E<around*|[|<around|\||c<rsup|p*s><around|(|k|)>|\|><rsup|2>|]>>>|<row|<cell|>|<cell|=>|<cell|E<around*|[|<around|\||c<rsup|p*s><around|(|k|)>|\|><rsup|2>|]>\<cdot\>E<around|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>|]>>>|<row|<cell|>|<cell|=>|<cell|<big|int><rsub|0><rsup|\<infty\>><around|\||c<rsup|p*s><around|(|k|)>|\|><rsup|2>*f<around|(|<around|\||c<rsup|p*s><around|(|k|)>|\|>|)>*d<around|\||c<rsup|p*s><around|(|k|)>|\|>>>|<row|<cell|>|<cell|>|<cell|<space|1em><space|1em>\<cdot\>E<around|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>|]>>>|<row|<cell|>|<cell|=>|<cell|<around*|[|\<alpha\><rsup|-1><around|(|\<gamma\>|)>*<around|(|2*\<sigma\><rsub|<around|\||X|\|>><rsup|2>+\<gamma\><rsup|2>|)>*e<rsup|-<frac|\<gamma\><rsup|2>|2*\<sigma\><rsub|<around|\||X|\|>><rsup|2>>>-\<gamma\>|]>>>|<row|<cell|>|<cell|>|<cell|<space|1em><space|1em>\<cdot\>E<around|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>|]><eq-number>>>>>>

    where the average sparsity

    <eqnarray*|<tformat|<table|<row|<cell|E<around|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>|]>>|<cell|=>|<cell|N<rsup|2><around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)><rsup|2>-N<around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)><rsup|2>>>|<row|<cell|>|<cell|>|<cell|<space|1em><space|1em>+N<around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)>>>>>>

    is simply the expectation of the Binomial corresponding to the sparsity
    level. Notice the accumulative effect of <math|\<gamma\>> on
    <math|E<around|[|<around|\<\|\|\>|c<rsup|p*s>|\<\|\|\>><rsup|2>|]>>.

    <item><with|font-series|bold|The vanishing of
    <math|<around|\||c<rsup|p*s><rsub|m*i*n>|\|>>>: the random magnitudes of
    <math|c<rsup|p*s>> are drawn from the tail distributions of the data
    coefficients, making the limiting distance between the minimum
    penetrating coefficient and <math|\<gamma\>> approach zero. This is a
    critical bottleneck in CS that cannot be completely compensated for by
    increasing the CNR. Fletcher et al. <cite|Fletcher> and Wainwright
    <cite|Wainwright1|Wainwright2|Wainwright3> stress this point.
  </enumerate>

  <subsubsection|Digital-Magnitude Clipping (DMC)><label|dc>

  <big-figure|<with|par-mode|center|<image|digital_clipping_JNL.eps|2.5in|||><label|illustration>>|Clipping
  with Fixed Magnitude <math|\<zeta\>>>

  In order to avoid the problems of the previous clipping technique, we could
  increment the magnitudes of <math|c<rsup|p*s>> by some constant until we're
  satisfied with the CNR and <math|<around|\||c<rsup|p*s><rsub|m*i*n>|\|>>.
  This however still leaves us with the burden of estimating the random
  magnitudes while destroying the enhanced support detection property of peak
  suppression. Instead, consider inverting the procedure from suppressing
  <em|to> a fixed value <math|\<gamma\>>, to suppressing <em|by> a fixed
  value <math|\<zeta\>>. <footnote|Quite expectedly, in <cite|Fletcher> it
  was shown that, with no modification or realization to this additional
  structure, a compressive estimation algorithm works best when all the
  nonzero coefficients in <math|c> are equal in magnitude.>

  Now that <math|<around|{|<around|\||c<around|(|k|)>|\|>|}><rsub|k\<in\>\<cal-I\><rsub|c>>=\<zeta\>>,
  we've decreased the degrees of freedom of <math|c> to
  <math|\<cal-I\><rsub|c>> and <math|\<theta\><rsub|c>> only. Furthermore,
  such a clipping scheme preserves the anti-phase property as well, thus
  possibly reducing the problem to that of support detection. <footnote|In
  the case of digital clipping with phase augmentation, the problem can also
  be recast as that of <em|detecting> a point on a sparse lattice, and a
  regularized sphere decoding algorithm could be used
  <cite|Rank_Deficient_Sphere|Finite_Alphabet|Giannakis_Finite_Alphabet>.>

  More generally, we could suppress the high peaks of <math|x> by a finite
  set of magnitudes <math|<around|{|\<zeta\><rsub|0>,\<zeta\><rsub|1>,\<ldots\>,\<zeta\><rsub|\<ell\>>|}>\<in\>\<bbb-Z\><rsup|\<ell\>>>,
  hence the attribute of Digital Magnitude Clipping (or simply Digital
  Clipping for short), although we will only focus here on the binary
  magnitude space <math|<around|\||c|\|>\<in\><around|{|0,\<zeta\>|}>>.
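
  As a rough illustration of the binary case, the sketch below builds a
  digital-magnitude clipping signal with NumPy. The function name
  digital_magnitude_clip, the test values of <math|\<gamma\>> and
  <math|\<zeta\>>, and the convention that the transmitted signal is
  <math|x+c> with <math|c> in anti-phase to <math|x> are assumptions made
  for this example only.

```python
import numpy as np

def digital_magnitude_clip(x, gamma, zeta):
    """Return (x_bar, c): the peak-reduced signal and the sparse clipper.

    Assumed convention: x_bar = x + c, with c in anti-phase to x on the
    support {k : |x(k)| > gamma} and |c(k)| = zeta on that support.
    """
    x = np.asarray(x, dtype=complex)
    support = np.abs(x) > gamma              # coefficients penetrating gamma
    c = np.zeros_like(x)
    c[support] = -zeta * np.exp(1j * np.angle(x[support]))
    return x + c, c

# Hypothetical test signal: unit-variance complex Gaussian samples.
rng = np.random.default_rng(0)
x = (rng.standard_normal(256) + 1j * rng.standard_normal(256)) / np.sqrt(2)
x_bar, c = digital_magnitude_clip(x, gamma=1.5, zeta=0.8)
```

  Every active coefficient of c has magnitude <math|\<zeta\>>, so each
  clipped sample of x is pulled toward the origin by exactly that amount
  while its phase is preserved.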

  Following the same procedure in finding (<reference|pdf>), and by noting
  the interesting relation <math|<around|\<\|\|\>|c|\<\|\|\>><rsub|p>=\<zeta\><around|\<\|\|\>|c|\<\|\|\>><rsub|0><rsup|1/p>,<space|0.17em><space|0.17em>p=1,2,\<ldots\>>,
  the PDF of the clipping signal's envelope is basically

  <eqnarray|<tformat|<table|<row|<cell|f<around*|(|<around|\||c|\|><rsup|d*m><around|(|i|)>|)>=<around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)>*\<delta\>*<around*|(|<around|\||c|\|>-\<zeta\>|)>+\<bbb-F\><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>*\<delta\><around*|(|<around|\||c|\|>|)>.<eq-number>>>>>>

  The magnitude of each coefficient has thus been reduced to a scaled Bernoulli random
  variable with probability of success <math|<around*|(|<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|)>>.
  Furthermore, the two clipping methods PS and DMC achieve the same CNR when

  <eqnarray|<tformat|<table|<row|<cell|\<zeta\>=<sqrt|\<alpha\><rsup|-1><around|(|\<gamma\>|)>*<around*|(|2*\<sigma\><rsub|<around|\||X|\|>><rsup|2>+\<gamma\><rsup|2>|)>*e<rsup|\<gamma\><rsup|2>/2*\<sigma\><rsub|<around|\||X|\|>><rsup|2>>-\<gamma\>>.<eq-number><label|CNR>>>>>>

  There is a conflicting interest in deciding the value of <math|\<zeta\>>.
  On one hand, the more we increase it the higher the CNR and the easier the
  support detection becomes, but on the other, the overall error of the
  system dramatically increases in case of faulty support detection.
  Furthermore, oversampling at the subsequent stage of transmission becomes
  more complex in this latter case.

  Nevertheless, we should at least set a lower bound on its value to ensure
  that all clipped coefficients will always end up with magnitudes equal
  to or below the desired clipping threshold <math|\<gamma\>>, depending on
  the envelope's maximum order statistic. Afterwards, we should be very
  conservative in increasing <math|\<zeta\>>.

  <subsection|Externally Weighted <math|\<ell\><rsub|1>>
  Minimization><label|EWCS>

  If by some prior information we have a better picture regarding the support
  <math|\<cal-I\><rsub|c>> beyond the Bernoulli process assumption, we can
  modify the LASSO in (<reference|CS>) by penalizing disfavored locations so
  that

  <eqnarray|<tformat|<table|<row|<cell|<wide|c|^>=arg min
  <around|\<\|\|\>|<wide|y|\<acute\>>-\<Psi\>*c|\<\|\|\>><rsub|2><rsup|2>+\<lambda\>*<around|\<\|\|\>|w<rsup|T>*c|\<\|\|\>><rsub|1>,<eq-number><label|weighted
  LASSO>>>>>>

  where <math|w> is a weighting vector imposed on the <math|\<ell\><rsub|1>>
  penalty term based on this prior information. In the literature, the source
  of <math|w> is from previous runs of the CS algorithm itself
  <cite|Candes4><cite|Wipf1>, where the hope is that with each iteration more
  confidence will exist in <math|<wide|\<cal-I\>|^><rsup|<around|(|k+1|)>><rsub|c>>
  based on, for instance <cite|Candes4>,

  <eqnarray|<tformat|<table|<row|<cell|w<around|(|i|)><rsup|<around|(|k+1|)>>\<propto\><around*|[|<space|0.27em><around|\||<wide|c|^><around|(|i|)><rsup|<around|(|k|)><rsup|c*s>>|\|>+\<epsilon\>|]><rsup|-1>*<space|2em>i=1,2,\<ldots\>,N<eq-number><label|w>>>>>>

  where <math|\<epsilon\>\<gtr\>0> is a small stabilizing parameter. We will
  refer to this procedure as <em|internally> weighted <math|\<ell\><rsub|1>>
  minimization.
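
  The reweighting rule in (<reference|w>) can be sketched in a few lines of
  NumPy. The function name reweight, the default value of
  <math|\<epsilon\>>, and the normalization by the largest weight are
  assumptions of this example; the proportionality in (<reference|w>) leaves
  the scaling open.

```python
import numpy as np

def reweight(c_hat, eps=1e-3):
    """One internal reweighting step: w(i) proportional to 1/(|c_hat(i)| + eps)."""
    w = 1.0 / (np.abs(c_hat) + eps)
    return w / w.max()        # scaling is arbitrary; the largest weight is set to 1

# Hypothetical previous CS estimate with one strong and one weak coefficient.
c_prev = np.array([0.0, 0.9 + 0.1j, 0.0, -0.05j])
w_next = reweight(c_prev)
```

  Locations where the previous estimate was large receive small weights and
  are therefore penalized less in the next run.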

  Repeating the CS algorithm is computationally expensive, and the process is
  sensitive to the quality of the first unguided CS estimate. Instead, we
  would rather use a one-shot weighting scheme that minimally increases the
  complexity of an ordinary LASSO. Fortunately, this could be done if we had
  an <em|external> source of information based on the data vector
  <math|<wide|<wide|x|\<bar\>>|^>>.

  Recall the discussion in <reference|PS> regarding embedded information on
  the support <math|\<cal-I\><rsub|c>> in peak suppression. The idea is that
  we expect the coefficients of <math|<wide|<wide|x|\<bar\>>|^>> whose
  magnitudes are close to <math|\<gamma\>> to be more probable clipping
  locations compared to ones that are not. Consequently, we can define a
  weighting vector <math|w<rsup|p*s>> based on the distance

  <eqnarray|<tformat|<table|<row|<cell|d<around|(|i|)>=<around|\||<around|\||<wide|<wide|x|\<bar\>>|^><around|(|i|)>|\|>-\<gamma\>|\|>,<space|2em>i=1,2,\<ldots\>,N<eq-number><label|ps>>>>>>

  and use it in (<reference|weighted LASSO>). Another data-based weighting
  scheme would be the posterior probability of not having a clip (<math|q=0>)
  given the observation (<reference|ps>), such that less likely clipping
  locations are more severely penalized by having a higher such posterior
  probability

  <eqnarray|<tformat|<table|<row|<cell|w<around|(|i|)><rsup|p*s>>|<cell|=>|<cell|Pr
  <around*|(|q=0<space|0.17em>\|<space|0.17em>d<around|(|i|)>|)><eq-number>>>|<row|<cell|>|<cell|=>|<cell|<frac|Pr
  <around|(|d<around|(|i|)>\|q=0|)>*Pr <around|(|q=0|)>|<big|sum><rsub|q\<in\>\<cal-Q\>>Pr
  <around|(|d<around|(|i|)>\|q|)>*Pr <around|(|q|)>>>>|<row|<cell|>|<cell|=>|<cell|<frac|f<rsub|<around|\||<wide|X|^>|\|>>*<around|(|\<gamma\>-d<around|(|i|)>|)>*\<bbb-F\><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>|f<rsub|<around|\||<wide|X|^>|\|>>*<around|(|\<gamma\>-d<around|(|i|)>|)>*\<bbb-F\><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>+f<rsub|<around|\||E|\|>><around|(|d<around|(|i|)>|)>*<wide|\<bbb-F\>|\<bar\>><rsub|<around|\||X|\|>><around|(|\<gamma\>|)>><label|ps>>>>>>

  where <math|f<rsub|<around|\||E|\|>>> is the density function corresponding
  to the estimation error of the data envelope
  <math|<around|\||<wide|x|^>|\|><around|(|i|)>>, which is the sole reason
  <math|d<around|(|i|)><space|-0.17em>\<gtr\><space|-0.17em>0> when
  conditioned on clipping <math|x<around|(|i|)>>. Using least squares to
  recover <math|x<around|(|i|)>>, we assume its error to be Gaussian and
  hence <math|f<rsub|<around|\||E|\|>>> and
  <math|f<rsub|<around|\||<wide|X|^>|\|>>=f<rsub|<around|\||X+E|\|>>> to be
  Rayleigh with parameters <math|\<sigma\><rsub|<around|\||E|\|>>> and
  <math|\<sigma\><rsub|<around|\||X+E|\|>>=<around*|[|2<rsup|-1>*<around|(|\<sigma\><rsub|X><rsup|2>+\<sigma\><rsub|E><rsup|2>|)>|]><rsup|1/2>>,
  respectively. Defining <math|\<eta\><around|(|\<gamma\>|)>=1-e<rsup|-\<gamma\><rsup|2>/\<sigma\><rsub|X><rsup|2>>>,
  this becomes

  <eqnarray|<tformat|<table|<row|<cell|w<around|(|i|)><rsup|p*s>>|<cell|=>|<cell|<frac|\<diamondsuit\>*e<rsup|\<clubsuit\>>|\<diamondsuit\>*e<rsup|\<clubsuit\>>+\<triangle\>*e<rsup|\<spadesuit\>>><eq-number><label|ps>>>|<row|<cell|>|<cell|=>|<cell|<choice|<tformat|<table|<row|<cell|<frac|\<diamondsuit\>|\<diamondsuit\>+\<triangle\>*e<rsup|\<spadesuit\>-\<clubsuit\>>>;>|<cell|<text|if><space|0.27em><around|\||\<clubsuit\>|\|>\<gtr\><around|\||\<spadesuit\>|\|>,>>|<row|<cell|<frac|\<diamondsuit\>*e<rsup|\<clubsuit\>-\<spadesuit\>>|\<diamondsuit\>*e<rsup|\<clubsuit\>-\<spadesuit\>>+\<triangle\>>>|<cell|<text|if><space|0.27em><around|\||\<spadesuit\>|\|>\<gtr\><around|\||\<clubsuit\>|\|>>>>>>>>>>>

  where

  <eqnarray|<tformat|<table|<row|<cell|\<diamondsuit\>=<frac|2*\<eta\><around|(|\<gamma\>|)>*<around|(|\<gamma\>-d<around|(|i|)>|)>|\<sigma\><rsub|X><rsup|2>+\<sigma\><rsub|E><rsup|2>>>|<cell|,>|<cell|\<clubsuit\>=<frac|<around|(|\<gamma\>-d<around|(|i|)>|)><rsup|2>|\<sigma\><rsub|X><rsup|2>+\<sigma\><rsub|E><rsup|2>>>>|<row|<cell|\<triangle\>=<frac|2*<around|(|1-\<eta\><around|(|\<gamma\>|)>|)>*d<around|(|i|)>|\<sigma\><rsub|E><rsup|2>>>|<cell|,>|<cell|\<spadesuit\>=<frac|d<around|(|i|)><rsup|2>|\<sigma\><rsub|E><rsup|2>>.>>>>>

  The second part of (<reference|ps>) is a necessary manipulation for
  numerical stability.
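
  The manipulation amounts to factoring the larger of the two exponentials
  out of the numerator and denominator. A minimal sketch, assuming scalar or
  array inputs that play the roles of <math|\<diamondsuit\>>,
  <math|\<clubsuit\>>, <math|\<triangle\>> and <math|\<spadesuit\>> (the
  function name stable_ratio and the numerical values are hypothetical):

```python
import numpy as np

def stable_ratio(A, a, B, b):
    """Evaluate A*exp(a) / (A*exp(a) + B*exp(b)) without overflow.

    The larger exponent is factored out of numerator and denominator,
    so every exponential that is actually evaluated is at most 1.
    """
    A, a, B, b = map(np.asarray, (A, a, B, b))
    m = np.maximum(a, b)
    num = A * np.exp(a - m)
    return num / (num + B * np.exp(b - m))

# Exponents this large would overflow a direct evaluation of exp().
w_example = stable_ratio(1.0, 1000.0, 2.0, 999.0)
```

  Evaluating the first expression directly would produce an infinite
  numerator and denominator for such exponents, whereas the factored form
  stays finite.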

  Notice also that what helps in suppressing only to <math|\<gamma\>> here is
  that we have a probabilistic means to cast out most of the possible false
  positives. Had we suppressed the magnitudes to the envelope mean for
  instance, <math|E<around|[|<around|\||x<around|(|i|)>|\|>|]>>, the
  procedure above would favor many locations as clipping positions by the
  fact that <math|<around|\||<wide|<wide|x|\<bar\>>|^>|\|>-E<around|[|<around|\||x<around|(|i|)>|\|>|]>>
  is small. Nonetheless, misleading bias to certain locations as candidates
  for clipping positions due to their coefficient's natural proximity to
  <math|\<gamma\>> can never be completely eliminated, even at infinite CNR.

  <subsection|Phase-Augmented CS for Homogeneous Clippers><label|Phase
  Augmented CS>

  In the case of homogeneous clipping, <math|\<theta\><rsub|c><around|(|\<cal-I\><rsub|c>|)>=\<theta\><rsub|<wide|x|\<bar\>>><around|(|\<cal-I\><rsub|c>|)>>
  at the transmitter, and consequently the CS algorithm should have access to
  additional information regarding the phases of the nonzero coefficients.
  The problem, however, is that we only have an estimate
  <math|\<theta\><rsub|<wide|<wide|x|\<bar\>>|^>><around|(|\<cal-I\><rsub|c>|)>>
  at the receiver, and the extent to which CS can benefit from this property
  depends on how good the estimate <math|<wide|<wide|x|\<bar\>>|^>> is in
  general. To this end, we will only consider the SNR as the parameter by
  which we judge the quality of the data estimate.

  Recall the discussion following Fig. <reference|Block Diagram> regarding
  the CNR and SNR, and consider the effect of gradually increasing
  <math|\<zeta\>> which we defined in <reference|dc>. Notice that when
  <math|\<zeta\>=0>, the <math|\<gamma\>>-penetrating coefficient attains its
  maximum SNR, then as we increase <math|\<zeta\>> the CNR increases as
  <math|\<zeta\><rsup|2>*E<around*|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>|]>>
  while the SNR decreases by <math|\<zeta\>*<around*|(|2*E<around|[|<around|\||x|\|>|]>-\<zeta\>|)>>.
  Consequently, the CNR will be larger than the SNR in the locations where
  <math|<around*|(|E<around*|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0>|]>-1|)>*\<zeta\><rsup|2>+2*E<around*|[|<around|\||x|\|>|]>*\<zeta\>-E<around*|[|<around|\||x|\|><rsup|2>|]>\<gtr\>0>.
  Fortunately, practical values of <math|\<zeta\>> relative to
  <math|E<around*|[|<around|\||x|\|>|]>> fall outside this region, and we
  would normally expect to gain information regarding
  <math|\<theta\><rsub|c>> from <math|<wide|<wide|x|\<bar\>>|^>> that is more
  reliable than information from CS alone.

  This fact encourages us to absorb, and perhaps even <em|replace>
  altogether, as much information as possible regarding
  <math|\<theta\><rsub|c>> from the estimated data vector
  <math|<wide|<wide|x|\<bar\>>|^>>. Assume first that we know the vector
  <math|\<theta\><rsub|c>>. We could then merge this information into the CS
  algorithm by expressing the clipping signal as
  <math|c=\<Theta\><rsub|c><around|\||c|\|>> such that

  <eqnarray|<tformat|<table|<row|<cell|c=<around*|[|<tabular*|<tformat|<table|<row|<cell|e<rsup|<space|0.17em>j*\<theta\><rsub|c<around|(|1|)>>>>|<cell|0>|<cell|0>|<cell|0>>|<row|<cell|0>|<cell|e<rsup|<space|0.17em>j*\<theta\><rsub|c<around|(|2|)>>>>|<cell|0>|<cell|0>>|<row|<cell|0>|<cell|0>|<cell|\<ddots\>>|<cell|0>>|<row|<cell|0>|<cell|0>|<cell|0>|<cell|e<rsup|<space|0.17em>j*\<theta\><rsub|c<around|(|N|)>>>>>>>>|]>\<cdot\><around*|[|<tabular*|<tformat|<table|<row|<cell|<around|\||c<around|(|1|)>|\|>>>|<row|<cell|<around|\||c<around|(|2|)>|\|>>>|<row|<cell|\<vdots\>>>|<row|<cell|<around|\||c<around|(|N|)>|\|>>>>>>|]>,>>|<row|<cell|<eq-number>>>>>>

  <no-indent>which could be directly fused into the measurement matrix
  <math|\<Psi\>>, thus transforming our model from
  <math|<wide|y|\<acute\>>=\<Psi\>*c+<wide|z|\<acute\>>> to
  <math|<wide|y|\<acute\>>=\<Psi\>*\<Theta\><rsub|c><around|\||c|\|>+<wide|z|\<acute\>>>
  where

  <eqnarray|<tformat|<table|<row|<cell|\<Psi\>*\<Theta\><rsub|c>=<around*|[|<tabular*|<tformat|<table|<row|<cell|\|>|<cell|\|>|<cell|>|<cell|\|>>|<row|<cell|e<rsup|<space|0.17em>j*\<theta\><rsub|c<around|(|1|)>>>*\<psi\><rsub|1>>|<cell|e<rsup|<space|0.17em>j*\<theta\><rsub|c<around|(|2|)>>>*\<psi\><rsub|2>>|<cell|\<ldots\>>|<cell|e<rsup|<space|0.17em>j*\<theta\><rsub|c<around|(|N|)>>>*\<psi\><rsub|N>>>|<row|<cell|\|>|<cell|\|>|<cell|>|<cell|\|>>>>>|]>>>>>>

  <no-indent>has now realigned the phases of the coefficients sought and
  reduced the problem to estimating a real sparse vector, with only the
  locations and magnitudes of the nonzero coefficients of <math|c> to be
  found. In the case of digital clipping, we can then force the magnitudes to
  the nearest alphabet values as well. In any case, with <math|\<Theta\><rsub|c>>
  unknown prior to CS, we will instead use
  <math|\<Theta\><rsub|<wide|<wide|x|\<bar\>>|^>>-2*\<pi\>*<with|font-series|bold|I><rsub|N\<times\>N>>
  to augment the CS algorithm. This could be done in two ways:

  <\enumerate>
    <item><with|font-series|bold|Sense then Rotate (StR)>: Use the standard
    CS or weighted CS algorithms used so far to regain
    <math|<wide|c|^><rsup|<space|0.17em><with|font-series|medium|N*o*P*A>>=arg<rsub|c<space|0.17em>\<in\><with|font-series|bold|C><rsup|N>>
    min <around|{|<around|\<\|\|\>|<wide|y|\<acute\>>-\<Psi\>*c|\<\|\|\>><rsub|2><rsup|2>+\<lambda\><around|\<\|\|\>|c|\<\|\|\>><rsub|1>|}>>
    where PA stands for <with|font-shape|italic|Phase Augmentation>, extract
    the locations and magnitudes of the nonzero coefficients from
    <math|<wide|c|^>>, and then rotate them according to the corresponding
    estimated directions in <math|<wide|<wide|x|\<bar\>>|^>>, i.e.,

    <eqnarray|<tformat|<table|<row|<cell|<around*|{|<wide|c|^><rsup|<space|0.17em><with|font-series|medium|S*t*R>><around|(|i|)>|}><rsub|i\<in\><wide|\<cal-I\>|^><rsub|c>>=<around*|{|<around|\||<space|0.17em><wide|c|^><rsup|<space|0.17em><with|font-series|medium|N*o*P*A>><around|(|i|)>|\|>*<space|0.17em>e<rsup|j*<around*|(|\<theta\><rsub|<wide|<wide|x|\<bar\>>|^><around|(|i|)>>-2*\<pi\>|)>>|}><rsub|i\<in\><wide|\<cal-I\>|^><rsub|c>><eq-number><label|StR>>>>>>

    <item><with|font-series|bold|Rotate then Sense (RtS)>: In this case
    supply the CS algorithm with the phase information from
    <math|<wide|<wide|x|\<bar\>>|^>> as described above. This rotation prior
    to compressive sensing recasts the problem as an estimation of a real
    vector with <math|2*m> real observations. Defining
    <math|<wide|\<Psi\>|~><rsub|c>=\<Psi\>*\<Theta\><rsub|c>>, we're left
    with the following model

    <eqnarray|<tformat|<table|<row|<cell|<wide|y|~>=<around*|[|<tabular*|<tformat|<cwith|1|-1|1|1|cell-halign|c>|<cwith|1|-1|1|1|cell-lborder|0ln>|<cwith|1|-1|1|1|cell-rborder|0ln>|<table|<row|<cell|\<Re\><wide|y|\<acute\>>>>|<row|<cell|\<Im\><wide|y|\<acute\>>>>>>>|]>=<around*|[|<tabular*|<tformat|<cwith|1|-1|1|1|cell-halign|c>|<cwith|1|-1|1|1|cell-lborder|0ln>|<cwith|1|-1|1|1|cell-rborder|0ln>|<table|<row|<cell|\<Re\>*<wide|\<Psi\>|~><rsub|c>>>|<row|<cell|\<Im\>*<wide|\<Psi\>|~><rsub|c>>>>>>|]>\<cdot\><around|\||c|\|>+<around*|[|<tabular*|<tformat|<cwith|1|-1|1|1|cell-halign|c>|<cwith|1|-1|1|1|cell-lborder|0ln>|<cwith|1|-1|1|1|cell-rborder|0ln>|<table|<row|<cell|\<Re\><wide|z|\<acute\>>>>|<row|<cell|\<Im\><wide|z|\<acute\>>>>>>>|]><eq-number>>>>>>

    for which we use the following program to recover <math|c>

    <eqnarray|<tformat|<table|<row|<cell|<wide|c|^><rsup|<space|0.17em><with|font-series|medium|R*t*S>>=arg<rsub|<around|\||<space|0.17em>c<space|0.17em>|\|>\<in\><space|0.17em><with|font-series|bold|R><rsup|N>>
    min <around*|{|<around|\<\|\|\>|<wide|y|~>-<wide|\<Psi\>|~><rsub|<wide|<wide|x|\<bar\>>|^>><space|0.17em><around|\||<space|0.17em>c<space|0.17em>|\|><space|0.17em>|\<\|\|\>><rsub|2><rsup|2>+\<lambda\><around|\<\|\|\>|c|\<\|\|\>><rsub|1>|}><eq-number><label|RtS>>>>>>

    where <math|<wide|\<Psi\>|~><rsub|<wide|<wide|x|\<bar\>>|^>>=\<Psi\>*<around*|(|\<Theta\><rsub|<wide|<wide|x|\<bar\>>|^>>-2*\<pi\>*<with|font-series|bold|I><rsub|N\<times\>N>|)>>.
    Notice that, similar to (<reference|StR>) one could also replace the
    phases of <math|<wide|c|^><rsup|<space|0.17em><with|font-series|medium|R*t*S>>>
    with <math|<around*|{|e<rsup|j*<around*|(|\<theta\><rsub|<wide|<wide|x|\<bar\>>|^><around|(|i|)>>-2*\<pi\>|)>>|}><rsub|i\<in\><wide|\<cal-I\>|^><rsub|c>>>
    after (<reference|RtS>) but we have not observed any significant
    improvement in doing so.
  </enumerate>
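
  The Rotate then Sense construction can be sketched as follows. This is
  only an illustration under stated assumptions: the phases are taken as
  known exactly (in practice they would come from
  <math|<wide|<wide|x|\<bar\>>|^>>), the measurements are noiseless, the
  matrix is a random complex Gaussian stand-in for <math|\<Psi\>>, and the
  LASSO is solved by a plain iterative soft-thresholding loop rather than
  the solver used in this work. All function and variable names are
  hypothetical.

```python
import numpy as np

def rotate_then_stack(Psi, theta):
    """Fuse phases into Psi and stack real/imaginary parts (2m x N, real)."""
    Psi_rot = Psi * np.exp(1j * theta)[None, :]   # column i scaled by e^{j*theta_i}
    return np.vstack([Psi_rot.real, Psi_rot.imag])

def ista(A, y, lam, n_iter=500):
    """Plain ISTA for min ||y - A u||_2^2 + lam * ||u||_1 over real u."""
    L = 2.0 * np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = u - 2.0 * A.T @ (A @ u - y) / L       # gradient step of size 1/L
        u = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return u

rng = np.random.default_rng(1)
m, N = 40, 64
Psi = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2 * m)
mag = np.zeros(N)
mag[[3, 17, 42]] = 0.8                            # sparse magnitudes |c|
theta = rng.uniform(0, 2 * np.pi, N)              # phases assumed known here
y = Psi @ (mag * np.exp(1j * theta))              # noiseless complex measurements

A = rotate_then_stack(Psi, theta)                 # real matrix with 2m rows
y_tilde = np.concatenate([y.real, y.imag])        # 2m real observations
mag_hat = ista(A, y_tilde, lam=1e-3, n_iter=2000)
```

  With exact phases the stacked real model reproduces the complex
  measurements exactly, so the unknown is a real vector observed through
  <math|2*m> real equations.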

  <section|Bayesian Estimation of Sparse Clipping Signals><label|bcs>

  To take into account the statistical information at hand, we could simply
  modify the dual-stage estimate in (<reference|ls cs>) to a linear minimum
  mean-square error (LMMSE) estimate of the amplitudes <math|v<rsub|c>> conditioned
  on the support estimate <math|<wide|\<cal-I\>|^><rsup|c*s><rsub|c>>

  <eqnarray|<tformat|<table|<row|<cell|<wide|v|^><rsub|c><rsup|l*m*m*s*e\|<wide|\<cal-I\>|^><rsup|c*s><rsub|c>>=\<sigma\><rsub|v<rsub|c>><rsup|2>*<wide|\<Phi\>|^><rsup|H>*<around*|(|\<sigma\><rsub|v<rsub|c>><rsup|2>*<wide|\<Phi\>|^>*<wide|\<Phi\>|^><rsup|H>+\<sigma\><rsub|z><rsup|2>*I|)><rsup|-1>*<around*|(|<wide|y|\<acute\>>-<wide|\<Phi\>|^>*E<rsub|v<rsub|c>>|)>.<label|mmse
  cs>>>>>>

  This should clearly improve upon the least-squares estimate (<reference|ls
  cs>) in case the distribution of <math|v<rsub|c>> is Gaussian, but it cannot
  incorporate any statistical information into the support estimate.
  Using a Maximum a Posteriori (MAP) estimate <math|<wide|c|^>=arg max
  P<around|(|<wide|y|\<acute\>>\|c|)>*P<around|(|c|)>> generally leads to
  non-convex optimization problems in sparse models, and we resort instead to
  an MMSE estimate. First define <math|J<rsup|<around|\||\<cal-I\>|\|>>> as
  the Hamming vector of length <math|N> and Hamming weight
  <math|<around|\||\<cal-I\>|\|>> with active coefficients according to the
  support set <math|\<cal-I\>>. Then marginalizing on all such possible
  vectors we obtain

  <eqnarray|<tformat|<table|<row|<cell|<wide|c|^><rsup|<space|0.17em><with|font-series|medium|M*M*S*E>>>|<cell|=>|<cell|E<around*|[|<space|0.17em>c<space|0.17em>\|<space|0.17em><wide|y|\<acute\>><space|0.17em>|]>>>|<row|<cell|>|<cell|=>|<cell|<big|sum><rsub|i=1><rsup|2<rsup|N>>E<around*|[|<space|0.17em>c<space|0.17em>\|<space|0.27em><wide|y|\<acute\>>,J<rsub|i>|]>*P<around|(|<wide|y|\<acute\>>\|J<rsub|i>|)>*P<around|(|J<rsub|i>|)><eq-number><label|MMSE2>>>>>>

  <no-indent>where <math|P<around|(|<wide|y|\<acute\>>|)>> has been dropped from
  (<reference|MMSE2>) since it does not depend on <math|i>. The estimate is a
  weighted sum of conditional expectations, and the formal (exact) approach
  requires computing <math|2<rsup|N>> terms, which is a formidable task for
  large <math|N>. To limit the search space, the key is to truncate the
  summation index to a much smaller subset of support vectors
  <math|<with|font-series|bold|J><rsup|\<ast\>>>. As such, the weights
  <math|<around|{|P<around|(|J<rsub|k>\|<wide|y|\<acute\>>|)>|}><rsub|k\<in\>J<rsup|\<ast\>>>>
  will not sum up to unity, and we will need to mitigate this by normalizing
  the truncated weighted sum by the sum of weights
  <math|\<cal-W\>=<big|sum><rsub|k\<in\><with|font-series|bold|J><rsup|\<ast\>>>P<around|(|<wide|y|\<acute\>>\|J<rsub|k>|)>*P<around|(|J<rsub|k>|)>>,
  hence reducing (<reference|MMSE2>) to

  <eqnarray|<tformat|<table|<row|<cell|<wide|c|^><rsup|<space|0.17em><with|font-series|medium|M*M*S*E>>>|<cell|\<approx\>>|<cell|<frac|1|\<cal-W\>>*<big|sum><rsub|k\<in\><with|font-series|bold|J><rsup|\<ast\>>>E<around*|[|<space|0.17em>c<space|0.17em>\|<space|0.27em><wide|y|\<acute\>>,J<rsub|k>|]>*P<around|(|<wide|y|\<acute\>>\|J<rsub|k>|)>*P<around|(|J<rsub|k>|)>.<eq-number><label|approx
  MMSE>>>>>>

  In effect, estimating <math|c> under an MMSE criterion boils down to
  appropriately selecting <math|<with|font-series|bold|J><rsup|\<ast\>>> and
  evaluating the terms <math|P<around|(|J<rsub|k>|)>>,
  <math|P<around|(|<wide|y|\<acute\>>\|J<rsub|k>|)>>, and
  <math|E<around*|[|<space|0.17em>c<space|0.17em>\|<space|0.27em><wide|y|\<acute\>>,J<rsub|k>|]>,\<forall\>J<rsub|k>\<in\><with|font-series|bold|J><rsup|\<ast\>>>,
  listed here in order of increasing complexity.
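
  Once these three terms are available for each candidate support, the
  normalized sum in (<reference|approx MMSE>) is straightforward to
  evaluate. The sketch below assumes that the likelihoods and priors are
  supplied as logarithms, a choice made here to avoid underflow and not
  taken from the text; the function name truncated_mmse and the toy inputs
  are hypothetical.

```python
import numpy as np

def truncated_mmse(cond_means, log_lik, log_prior):
    """Normalised weighted sum over a truncated set of candidate supports.

    cond_means : (K, N) array, row k holds E[c | y, J_k]
    log_lik    : (K,) array of log P(y | J_k)
    log_prior  : (K,) array of log P(J_k)
    """
    log_w = np.asarray(log_lik, dtype=float) + np.asarray(log_prior, dtype=float)
    w = np.exp(log_w - log_w.max())     # the common factor cancels on normalisation
    w /= w.sum()                        # division by the sum of weights W
    return w @ np.asarray(cond_means)

# Two equally weighted candidate supports (toy numbers).
means = np.array([[1.0, 0.0], [0.0, 1.0]])
c_mmse = truncated_mmse(means, [-1000.0, -1000.0], np.log([0.5, 0.5]))
```

  Because only ratios of the weights matter after normalization, subtracting
  the largest log-weight leaves the estimate unchanged.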

  When using peak suppression to <math|\<gamma\>>, the receiver is given a
  vague picture of where clipping has occurred based on the affinity of
  <math|<wide|<wide|x|\<bar\>>|^>> to <math|\<gamma\>>. Consequently, by
  sorting the magnitudes of the weighting vector <math|w<rsup|\<downarrow\>>>
  in (<reference|ps>) in ascending order, the probability of the true support
  coinciding with the first <math|\<beta\>> elements in <math|arg
  <around|{|w<rsup|\<downarrow\>>|}>> will increase rapidly with
  <math|\<beta\>>. Fig. <reference|Probability> shows a Monte Carlo
  simulation of this probability at different clipping thresholds. For
  instance, this implies that given a clipping threshold of
  <math|\<gamma\>=2*\<sigma\><rsub|<around|\||X|\|>>>, one could exclude
  <math|70%> of the <math|N> indices as having too low a probability of
  corresponding to a clipping location, thus reducing the possible candidates
  from <math|2<rsup|N>> to <math|2<rsup|\<beta\>>> Hamming vectors.

  <big-figure|<with|par-mode|center|<image|Pr_inclusion_vs_alpha_4_gamma_1000_c.eps|2.5in|||><label|Probability>>|Probability
  of support index set <math|\<cal-I\><rsub|c>> being completely included in
  the first <math|\<beta\>/N%> of <math|arg
  <around|{|w<rsup|\<downarrow\>>|}>>>

  Given this reduced set <math|<with|font-series|bold|J><rsup|<around|{|k:k\<leq\>\<beta\>|}>>>
  of vectors, we adopt a search over it by latching a vector of unity Hamming
  weight based on (<reference|approx MMSE>), and then proceed in a greedy
  fashion similar to Larsson <cite|Larsson> and Schniter
  <cite|Schniter1|Schniter2> until a maximum sparsity level
  <math|s<rsup|m*a*x>> is reached. This will preserve the quality of the
  greedy estimate using Fast Bayesian Matching Pursuit (FBMP) in
  <cite|Schniter1> while reducing the number of executions of
  (<reference|approx MMSE>) by

  <eqnarray|<tformat|<table|<row|<cell|100*<around*|(|1-<frac|\<beta\>*<around|(|1+\<rho\>\<cdot\>s<rsup|m*a*x>|)>-<frac|\<rho\>\<cdot\>s<rsup|m*a*x>*<around|(|s<rsup|m*a*x>+1|)>|2>|N*<around|(|1+\<rho\>\<cdot\>s<rsup|m*a*x>|)>-<frac|\<rho\>\<cdot\>s<rsup|m*a*x>*<around|(|s<rsup|m*a*x>+1|)>|2>>|)>%>>>>>

  where <math|\<rho\>> is the number of tested candidates for each Hamming
  weight. This would correspond to a reduction of <math|60-80%> of executions
  with our practical parameters, and we will henceforth refer to this
  procedure as <math|\<beta\>>-FBMP.
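
  The reduction formula above is easy to evaluate numerically. In the
  sketch below the values of <math|\<beta\>>, <math|\<rho\>> and
  <math|s<rsup|m*a*x>> are hypothetical choices made for illustration and
  are not the parameters used in our simulations; the function name is
  likewise an assumption.

```python
def execution_reduction_percent(N, beta, rho, s_max):
    """Percentage of approximate-MMSE evaluations saved by searching beta of N indices."""
    tail = rho * s_max * (s_max + 1) / 2          # term common to both counts
    reduced = beta * (1 + rho * s_max) - tail     # numerator of the ratio (beta candidates)
    full = N * (1 + rho * s_max) - tail           # denominator of the ratio (all N indices)
    return 100.0 * (1.0 - reduced / full)

# Hypothetical setting: keep roughly 30% of N = 256 indices.
saving = execution_reduction_percent(N=256, beta=77, rho=5, s_max=10)
```

  For this particular choice the saving falls inside the range quoted above,
  and it vanishes when <math|\<beta\>=N>, as expected.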

  <section|Performance Analysis and Simulations>

  For our simulation purposes we considered an OFDM signal of <math|N=256>
  subcarriers of which <math|m=0.2*N> are randomly dispersed measurement
  tones. The data coefficients were generated from a QAM constellation of
  size <math|M=32>. The Rayleigh fading channel model had 32 taps and
  operated in a <math|30> dB SNR environment. The performance parameters we
  considered were the SER, the relative temporal complexity, the PAPR
  reduction ability, and the capacity.

  Our primary objective was to test the SER variation with the clipping
  threshold <math|\<gamma\>> for a clipped OFDM signal that used our
  different adaptations of CS algorithms and clipping techniques. Observed as
  a variable, the clipping threshold in particular is of central importance
  due to its critical effect on both CS generic performance and the PAPR
  reduction. Decreasing <math|\<gamma\>> significantly reduces the PAPR but
  also implies a nonlinear increase in the average sparsity level that the
  estimation algorithms must tolerate. It also has a positive countereffect
  on CS performance since it increases the CNR, making the overall
  behavior of SER(<math|\<gamma\>>) difficult to predict.

  Furthermore, when testing the precise performance of an algorithm we used
  the Normalized Mean Square Error

  <eqnarray|<tformat|<table|<row|<cell|<with|font-series|medium|N*M*S*E>>|<cell|=>|<cell|E<around*|[|<frac|<around|\<\|\|\>|c-<wide|c|^>|\<\|\|\>><rsub|2><rsup|2>|<around|\<\|\|\>|c|\<\|\|\>><rsub|2><rsup|2>>|]>>>>>>

  to ensure that error decrease was not simply due to a decrease in the
  number of estimated variables.
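
  In a Monte Carlo setting the expectation is replaced by a sample mean over
  trials. A minimal sketch, assuming that the numerator is the squared
  <math|\<ell\><rsub|2>> norm of the estimation error and that trials with
  an all-zero clipping signal are skipped (both are assumptions of this
  example, as is the function name nmse):

```python
import numpy as np

def nmse(c_true, c_est):
    """Sample NMSE over trials: mean of ||c - c_hat||_2^2 / ||c||_2^2.

    Both inputs have shape (trials, N); trials with c = 0 are skipped.
    """
    c_true = np.atleast_2d(c_true)
    c_est = np.atleast_2d(c_est)
    err = np.sum(np.abs(c_true - c_est) ** 2, axis=1)
    ref = np.sum(np.abs(c_true) ** 2, axis=1)
    keep = ref > 0
    return float(np.mean(err[keep] / ref[keep]))
```

  An estimate of all zeros gives an NMSE of one under this definition, which
  is a convenient reference level when comparing algorithms.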

  <big-figure|<with|par-mode|center|<image|SER_gamma_7curves_500_c.eps|2.5in|||><label|SER>>|SER
  of PS vs <math|\<gamma\>>>

  <big-figure|<with|par-mode|center|<image|DC_NMSE_vs_zeta_1000.eps|2.5in|||><label|DC>>|NMSE
  of Digital Clipper estimate as a function of the coefficient magnitude
  <math|\<zeta\>>>

  <big-figure|<with|par-mode|center|<image|DC_NMSE_vs_gamma_1000_forced.eps|2.5in|||><label|DC>>|NMSE
  of Digital Clipper Estimate as a function of the clipping threshold
  <math|\<gamma\>>>

  <big-figure|<with|par-mode|center|<image|DC_SER_vs_gamma_1000_not_forc.eps|2.5in|||><label|DC>>|SER
  of Digital Clipping with <math|\<zeta\>=0.8*<space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>
  vs <math|\<gamma\>>>

  Fig. <reference|SER> shows the SER for Peak Suppressing clippers in
  <reference|PS> after QAM decoding <math|<around|(|<with|font-series|bold|FS><rsub|x>|)><rsup|\<dagger\>>*<around|(|<wide|<wide|x|\<bar\>>|^><rsup|l*s>+<wide|c|^><rsup|<around|(|p*s|)>>|)>>
  as the clipping threshold is varied. The methods tested were the reduced
  search space greedy method (<math|\<beta\>>-FBMP), the LASSO, the
  Phase-Augmented LASSO (PAL) using (<reference|RtS>), the data-based
  Weighted LASSO (WL), and the Weighted Phase-Augmented LASSO (WPAL). These
  were compared against two performance bounds: the lower bound of not
  estimating <math|c>, and the upper bound of an oracle receiver that knows
  the support <math|\<cal-I\><rsub|c>>, and simply uses least squares to
  estimate the coefficients' amplitudes. Interestingly, combining the support
  and phase augmentation techniques into the LASSO enables it to perform very
  close to the support oracle, and even beat it at low clipping thresholds
  where <math|s\<gtr\>0.55*<space|0.17em>m> since it has additional
  information regarding the coefficients' phases. Furthermore, weighting
  alone is more effective than phase-augmentation, although both
  significantly improve the performance of the LASSO.

  To see the effect of varying the magnitude of active coefficients in
  digital clipping of section <reference|dc> we plotted the NMSE vs
  <math|\<zeta\>> in Fig. <reference|DC>. This avoids a biased evaluation due
  to increased CNR with <math|\<zeta\>>. The results imply that embedding the
  phase information into the LASSO in (<reference|RtS>) is much more
  effective than rotating the estimate after compressed sensing in
  (<reference|StR>). They also show that the former method comes quite
  close to a phase oracle that uses the same technique for practical values
  of <math|\<zeta\>> relative to <math|\<sigma\><rsub|<around|\||X|\|>>>.
  However, as expected they eventually deviate as we increase <math|\<zeta\>>
  since this corresponds to decreasing the SNR and hence the accuracy of the
  phase information induced from the data vector estimate
  <math|\<theta\><rsub|<wide|<wide|x|\<bar\>>|^>>>. Fig. <reference|DC>
  implies that forcing the magnitudes of the estimates in (<reference|StR>)
  and (<reference|RtS>) is generally ineffective except in the very sparse
  cases for the former. The overall result on the SER is portrayed in Fig.
  <reference|DC> at a fixed <math|\<zeta\>=0.8*<space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>.

  Complexity-wise, we omit implementation details and orders of
  complexity since they match those of the standard algorithms we've built on,
  which are well documented in the CS literature (e.g.
  <cite|Tropp3|Schniter1|Boyd2>). Instead we investigate the practical aspect
  of the relative time required to execute the major techniques proposed in
  the paper compared to Tellado's primary tone-reservation algorithm using
  the same generic CVX software <cite|Boyd>.<footnote|With the only exception
  being Schniter's Greedy algorithm when evaluating <math|\<beta\>>-FBMP.> As
  such we collected the random execution times for <math|2000> runs of each,
  normalized them by the maximum execution time among all, and plotted their
  CCDF. Fig. <reference|temporal> depicts the results. Roughly speaking, the
  methods stemming from the LASSO required less than <math|12%> of the time
  required to execute Tellado's primary QCQP algorithm on average, while the
  <math|\<beta\>>-FBMP required less than <math|2%> of the time.

  A major advantage of clipping to a fixed threshold is that, unlike
  tone-reservation methods such as <cite|Tellado|Kashin>, the dynamic range,
  maximum power, and PAPR of the transmitted signal are fixed. The
  distribution of PAPR reduction, <math|10*log<rsub|10><around*|(|<frac|P<rsub|m*a*x>|\<gamma\><rsup|2>>|)>>,
  would simply follow from the distribution of the maximum squared
  coefficient in <math|x> (refer to <cite|Imai|Bahai|Wei> for relevant
  analysis) which we plot in Fig. <reference|CCDF>. The fixed maximum power
  followed from the clipping threshold that corresponded to a SER of
  <math|10<rsup|-2>> for the different techniques in this work.
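
  As a rough illustration of the PAPR-reduction expression above, the
  following Python sketch clips random QPSK OFDM symbols to a fixed
  threshold and evaluates the reduction per symbol. Several choices here are
  assumptions made only for the example: the helper names, the QPSK
  constellation, Nyquist-rate sampling, and in particular the reading of
  <math|\<sigma\><rsub|<around|\||X|\|>>> as the per-dimension standard
  deviation of the time-domain samples.

```python
import numpy as np

def papr_reduction_db(x, gamma):
    """PAPR reduction 10*log10(P_max / gamma^2) for one OFDM symbol x.

    Returns 0 dB when no sample exceeds gamma (nothing is clipped).
    """
    p_max = np.max(np.abs(x)) ** 2
    return max(10.0 * np.log10(p_max / gamma**2), 0.0)

def clip_to_threshold(x, gamma):
    """Limit each sample magnitude to gamma while keeping its phase."""
    mag = np.abs(x)
    scale = np.minimum(1.0, gamma / np.maximum(mag, 1e-300))
    return x * scale

rng = np.random.default_rng(1)
N, runs = 256, 500
qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
reductions = []
for _ in range(runs):
    data = rng.choice(qpsk, size=N)
    x = np.fft.ifft(data) * np.sqrt(N)            # unit average power
    # Assumed meaning of sigma: per-dimension std of the time samples.
    sigma = np.sqrt(np.mean(np.abs(x) ** 2) / 2)
    gamma = 2.25 * sigma                          # example threshold
    x_clipped = clip_to_threshold(x, gamma)
    reductions.append(papr_reduction_db(x, gamma))
reductions = np.array(reductions)
print(f"mean PAPR reduction: {reductions.mean():.2f} dB")
```

  Because the clipped peak always equals the threshold, only the unclipped
  peak varies from symbol to symbol, which is why the reduction inherits its
  distribution from the maximum squared coefficient.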

  <big-table|<assign|arraystretch|<macro|1.3>>
  <with|par-mode|center|<tabular*|<tformat|<cwith|1|-1|1|1|cell-lborder|1ln>|<cwith|1|-1|1|1|cell-halign|c>|<cwith|1|-1|1|1|cell-rborder|1ln>|<cwith|1|-1|2|2|cell-halign|c>|<cwith|1|-1|2|2|cell-rborder|1ln>|<cwith|1|-1|3|3|cell-halign|c>|<cwith|1|-1|3|3|cell-rborder|1ln>|<cwith|1|-1|4|4|cell-halign|c>|<cwith|1|-1|4|4|cell-rborder|1ln>|<cwith|1|-1|1|-1|cell-valign|c>|<cwith|1|1|1|-1|cell-tborder|1ln>|<cwith|1|1|1|-1|cell-bborder|1ln>|<cwith|6|6|1|-1|cell-bborder|1ln>|<table|<row|<cell|>|<cell|Tolerable
  <math|\<gamma\>>>|<cell|Avg. PAPR Red. (dB)>|<cell|<math|%> Exec.
  Time>>|<row|<cell|DC (RtS)>|<cell|2.40 <math|\<cdot\><space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>>|<cell|3.19>|<cell|<math|11.06%>>>|<row|<cell|<math|\<beta\>>-FBMP>|<cell|2.26
  <math|\<cdot\><space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>>|<cell|3.71>|<cell|<math|1.6%>>>|<row|<cell|LASSO>|<cell|2.25
  <math|\<cdot\><space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>>|<cell|3.75>|<cell|<math|12.3%>>>|<row|<cell|WPAL>|<cell|2.02
  <math|\<cdot\><space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>>|<cell|4.68>|<cell|<math|13.9%>>>|<row|<cell|Tellado>|<cell|->|<cell|4.37>|<cell|<math|100%>>>>>><label|table:summary>>|Summary
  of Results>

  The most fundamental parameter of interest given a desired clipping
  threshold is the channel capacity <cite|Tellado|Capacity_clipped_1>

  <eqnarray|<tformat|<table|<row|<cell|C=<big|sum><rsub|k=1><rsup|N>log<rsub|2><around*|(|1+<frac|<around|\||D<around|(|k,k|)>|\|><rsup|2>*\<sigma\><rsub|x<around|(|k|)>><rsup|2>|\<sigma\><rsub|z<around|(|k|)>><rsup|2>>|)>,>>>>>

  and we will thus consider two systems. The first system
  <math|\<cal-S\><rsub|1>> clips all coefficients above <math|\<gamma\>> and
  does not reserve tones to estimate the clipping signal <math|c>, resulting
  in a higher clipping noise over all <math|N> tones while retaining all of
  them for data transmission. The second system <math|\<cal-S\><rsub|2>>
  reserves <math|m> tones to estimate <math|c>, thus reducing the SER
  degradation while also reducing the data tones by <math|m>.

  <big-figure|<with|par-mode|center|<image|temporal_complexity_CCDF_5_curves_1000.eps|2.5in|||><label|temporal>>|CCDF
  of execution time normalized by maximum value>

  <big-figure|<with|par-mode|center|<image|CCDF_PAPR_Reduction_7_curves_lines.eps|2.5in|||><label|CCDF>>|CCDF
  of PAPR Reduction (dB)>

  Whether reserving tones is justified then depends heavily on the variances
  of the clipping noise <math|<around|{|\<sigma\><rsub|c><rsup|2><around|(|k;\<gamma\>|)>|}><rsub|k\<in\>\<Omega\><rsub|d>>>
  with and without estimation at the receiver. Furthermore, if the threshold
  <math|\<gamma\>> is sufficiently low relative to
  <math|\<sigma\><rsub|<around|\||X|\|>>> (e.g.
  <math|E<around*|[|<around|\<\|\|\>|c|\<\|\|\>><rsub|0><space|0.17em>;\<gamma\>|]>=10%>
  of <math|N>), the clipping noise on each tone is the sum of a reasonably
  large number of scaled time-domain coefficients of <math|c>, so the
  distribution of the priors in (<reference|pdf>) approaches a Gaussian.
  With this theoretical justification, supported by extensive simulations,
  we assume for simplicity that the distortion on each carrier is Gaussian
  with a common variance <math|\<sigma\><rsub|c><rsup|2>>. However, care
  must be taken when comparing this parameter for the two systems:
  <math|\<cal-S\><rsub|1>> has more data energy than <math|\<cal-S\><rsub|2>>
  because it uses all <math|N> tones, and will thus have a higher distortion
  variance at the same clipping level <math|\<gamma\>>, i.e.
  <math|\<sigma\><rsub|c<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N><rsup|2>\<gtr\>\<sigma\><rsub|c<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N-m><rsup|2>>.
  Consequently, the capacity of the first system (after dropping the tone
  index) will be

  <eqnarray|<tformat|<table|<row|<cell|C<rsub|1>=N*log<rsub|2><around*|(|1+<frac|<around|\||D|\|><rsup|2>*\<sigma\><rsub|x<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N><rsup|2>|<around|\||D|\|><rsup|2>*\<sigma\><rsub|c<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N><rsup|2>+\<sigma\><rsub|z><rsup|2>>|)><eq-number><label|C1>>>>>>

  while the capacity of the second will be

  <eqnarray|<tformat|<table|<row|<cell|C<rsub|2>=<around|(|N-m|)>*log<rsub|2><around*|(|1+<frac|<around|\||D|\|><rsup|2>*\<sigma\><rsub|x<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N-m><rsup|2>|<around|\||D|\|><rsup|2>*\<sigma\><rsub|<around*|(|c-<wide|c|^>|)>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N-m><rsup|2>+\<sigma\><rsub|z><rsup|2>>|)><eq-number><label|C2>>>>>>

  The use of reserved tones for CS is then justified if
  <math|C<rsub|2>\<gtr\>C<rsub|1>>, i.e. when

  <eqnarray|<tformat|<table|<row|<cell|\<sigma\><rsub|<around*|(|c-<wide|c|^>|)>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N-m><rsup|2>\<less\><frac|\<sigma\><rsub|x<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N-m><rsup|2>|<around*|[|1+<frac|<around|\||D|\|><rsup|2>*\<sigma\><rsub|x<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N><rsup|2>|<around|\||D|\|><rsup|2>*\<sigma\><rsub|c<space|0.17em>\<mid\><around|\||\<Omega\><rsub|d>|\|>=N><rsup|2>+\<sigma\><rsub|z><rsup|2>>|]><rsup|<frac|N|N-m>>-1>-<frac|\<sigma\><rsub|z><rsup|2>|<around|\||D|\|><rsup|2>><eq-number><label|C>>>>>>
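
  The comparison in (<reference|C1>), (<reference|C2>) and (<reference|C>)
  can be checked numerically with the short Python sketch below. It is a
  worked example under stated assumptions rather than part of the
  simulations: the function names are hypothetical, the channel is taken as
  flat with unit gain, the per-tone data variances of the two systems are
  set equal, and all numerical values are illustrative placeholders.

```python
import numpy as np

def capacity_no_reserved(N, D2, var_x, var_c, var_z):
    """C1: all N tones carry data; clipping noise var_c is left uncorrected."""
    return N * np.log2(1.0 + D2 * var_x / (D2 * var_c + var_z))

def capacity_reserved(N, m, D2, var_x, var_res, var_z):
    """C2: m tones are reserved; var_res is the residual variance of c - c_hat."""
    return (N - m) * np.log2(1.0 + D2 * var_x / (D2 * var_res + var_z))

def max_residual_variance(N, m, D2, var_x1, var_c1, var_x2, var_z):
    """Residual variance at which C2 equals C1; smaller values favor reserving tones."""
    sinr1 = D2 * var_x1 / (D2 * var_c1 + var_z)
    return var_x2 / ((1.0 + sinr1) ** (N / (N - m)) - 1.0) - var_z / D2

# Illustrative numbers only (assumptions, not results from this work).
N, m = 256, 51                 # roughly 20% of the tones reserved
D2, var_z = 1.0, 1e-3          # flat unit-gain channel, noise variance
var_x1 = var_x2 = 1.0          # per-tone data variance, taken equal here
var_c1 = 0.05                  # uncorrected clipping-noise variance

bound = max_residual_variance(N, m, D2, var_x1, var_c1, var_x2, var_z)
c1 = capacity_no_reserved(N, D2, var_x1, var_c1, var_z)
c2 = capacity_reserved(N, m, D2, var_x2, 0.5 * bound, var_z)
print(f"C1 = {c1:.1f} bits, C2 = {c2:.1f} bits, bound = {bound:.4f}")
```

  Evaluating the second capacity exactly at the returned bound reproduces
  the first capacity, which is a quick consistency check on the exponent
  <math|N/<around|(|N-m|)>> in (<reference|C>).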

  <big-figure|<with|par-mode|center|<image|capacity_vs_gamma_3curves_FAIR.eps|2.5in|||><label|capacity>>|Capacity
  per transmitted tone at different clipping thresholds>

  It is instructive to observe how the capacity behaves as a function of the
  clipping threshold <math|\<gamma\>>, since the distortion
  <math|\<sigma\><rsub|c><rsup|2>> and the quality of the estimate
  <math|<wide|\<sigma\>|^><rsub|c><rsup|2>> counteract each other
  nonlinearly. Fig. <reference|capacity> shows the results of <math|1000>
  runs at each <math|\<gamma\>> for estimating <math|\<sigma\><rsub|c><rsup|2>>
  and <math|\<sigma\><rsub|<around*|(|c-<wide|c|^>|)>><rsup|2>>. The results
  show that by reserving <math|20%> of the tones for data-based weighted and
  phase-augmented LASSO, the capacity of such a system can significantly
  exceed that of the naive system, which uses all the tones for data
  transmission. Moreover, the capacity associated with this technique
  behaves in a convex fashion, so that by reducing the capacity by less than
  1 bit per second per transmitted tone, the clipping threshold can be
  reduced from <math|\<gamma\>=2.5*<space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>
  to <math|\<gamma\>=2*<space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>.
  Unlike the semi-linear relation of <math|\<cal-S\><rsub|1>> with
  <math|\<gamma\>>, such behavior offers a very attractive compromise between
  capacity and peak reduction. Under the same conditions, the typical LASSO
  remains effective at clipping thresholds as low as
  <math|1.9*<space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>.

  Fig. <reference|capacity_SNR>, obtained at a fixed clipping threshold of
  <math|2.3*<space|0.17em>\<sigma\><rsub|<around|\||X|\|>>>, implies that
  increasing the SNR is much more rewarding for <math|\<cal-S\><rsub|2>>
  than for <math|\<cal-S\><rsub|1>>. The reason is that eliminating
  <math|\<sigma\><rsub|z><rsup|2>> has no effect on
  <math|\<sigma\><rsub|c><rsup|2>>, so the capacity of the naive system
  saturates beyond an SNR of 35 dB. On the other hand, decreasing the noise
  level also improves the CS estimate and hence has a dual effect in
  increasing the capacity, leading to the semi-linear relation with the SNR.

  <big-figure|<with|par-mode|center|<image|capacity_vs_SNR_3curves_FAIR.eps|2.5in|||><label|capacity_SNR>>|Capacity
  per transmitted tone vs SNR>

  <section|Conclusion>

  In this work we have established the new general concept of clipping
  mitigation (and hence PAPR reduction) in OFDM using compressive sensing
  techniques. The general framework stressed the use of reserved subcarriers
  to compressively estimate the locations and amplitudes of the clipped
  portions of a transmitted OFDM signal at the receiver, instead of using
  them at the transmitter as a spectral support for optimized peak reducing
  signals in the time domain. Consequently, the method shifts the
  <em|stage> at which signal processing complexity is required from the
  transmitter to the receiver, in contrast to previous techniques, thereby
  offering a practical solution for communication systems that use OFDM
  signals at the physical layer and require minimal complexity at the
  transmitter.

  The other major contribution is the demonstration that, at a marginal
  increase in complexity, one can augment the standard
  <math|\<ell\><rsub|1>> minimization of CS by extracting information
  regarding clipping locations, magnitudes, and phases from the data, and
  thereby enable the system to estimate sparse clippers far beyond the
  recoverability conditions of CS (e.g. sparsity levels above <math|55%> of
  <math|m>). Such augmentation was shown to significantly boost the overall
  system's capacity at low clipping thresholds, and thus suggests a very
  appealing compromise between capacity and peak reduction.


  <\thebibliography|1>
    <bibitem|OFDM_applications_3>J. G. Andrews, A. Ghosh, R. Muhamed,
    <with|font-shape|italic|Fundamentals of WiMAX: Understanding Broadband
    Wireless Networking>, Prentice Hall, part of the Prentice Hall
    Communications Engineering and Emerging Technologies Series, 2007.

    <bibitem|Wimax2>T. Jiang, W. Xiang, H. H. Chen, and Q. Ni, \PMulticast
    broadcasting services support in OFDMA-based WiMAX systems,"
    <with|font-shape|italic|IEEE Commun. Mag.>, vol. 45, no. 8, pp. 78-86,
    Aug. 2007.

    <bibitem|PAPR_overview3>S. Litsyn, <with|font-shape|italic|Peak Power
    Control in Multicarrier Communications>, Cambridge University Press,
    1<rsup|st> edition, Jan. 2007.

    <bibitem|PAPR_overview1>T. Jiang and Y. Wu, \PAn Overview:
    Peak-to-Average Power Ratio Reduction Techniques for OFDM Signals,"
    <with|font-shape|italic|IEEE Trans. Broadcast.>, vol. 54, no. 2, June
    2008.

    <bibitem|PAPR_overview2>S. H. Han and J. H. Lee, \PAn overview of
    peak-to-average power ratio reduction techniques for multicarrier
    transmission," <with|font-shape|italic|IEEE Pers. Commun.>, vol. 12, no.
    2, pp. 5665, Apr. 2005.

    <bibitem|coding1>K. Sathananthan, C. Tellambura, \PCoding to reduce both
    PAR and PICR of an OFDM signal," <with|font-shape|italic|IEEE Commun.
    Lett.>, vol.6, no.8, pp. 316-318, Aug 2002.

    <bibitem|coding3>J. A. Davis and J. Jedwab, \PPeak-to-Mean Power Control
    in OFDM, Golay Complementary Sequences, and Reed-Muller Codes,"
    <with|font-shape|italic|IEEE Trans. on Inf. Theory>, vol.45, No.7, Nov.
    1999.

    <bibitem|coding4>T. Jiang and G. X. Zhu, \PComplement Block Coding For
    Reduction in Peak-To-Average Power Ratio of OFDM Signals,"
    <with|font-shape|italic|IEEE Commun. Mag.>, vol. 43, no. 9, pp. 17-22,
    Sept. 2005.

    <bibitem|selected_mapping>R. W. Bauml, R. F. H. Fischer, and J. B. Huber,
    \PReducing the Peak-To-Average Power Ratio of Multicarrier Modulation by
    Selected Mapping," <with|font-shape|italic|Electronics Letters>, vol. 32,
    no. 22, pp. 2056-2057, 1996.

    <bibitem|selected_mapping_2_low_complexity>A. Ghassemi and T. A.
    Gulliver, \PA Low Complexity Selective Mapping OFDM using Multiple IFFT
    Stages," <with|font-shape|italic|International Journal of Communication
    Networks and Distributed Systems>, vol. 1, Issue 2, Sep. 2008.

    <bibitem|pts_1>S. H. Muller and J. B. Huber, \POFDM with Reduced
    Peak-to-Average Power Ratio by Optimum Combination of Partial Transmit
    Sequences, <with|font-shape|italic|Electronic Letters>, vol. 33, no. 5,
    pp. 20562057, Feb. 1997.

    <bibitem|pts_2>A. Alavi, C. Tellambura, and I. Fair, \PPAPR Reduction Of
    OFDM Signals using Partial Transmit Sequence: An Optimal Approach using
    Sphere Decoding," <with|font-shape|italic|IEEE Commun. Lett.>, vol. 9,
    no. 11, pp. 982984, Nov. 2005.

    <bibitem|Constellation_reshaping>Y. J. Kou, W. S. Lu, and A. Antoniou,
    \PA New Peak-To-Average Power-Ratio Reduction Algorithm For OFDM Systems
    via Constellation Extension," <with|font-shape|italic|IEEE Trans.
    Wireless Commun.>, vol. 6, no. 5, pp. 1823-1832, May 2007.

    <bibitem|constellation1>M. Malkin, B. Krongold, and J. M. Cioffi,
    \POptimal Constellation Distortion For PAR Reduction In OFDM Systems,"
    <with|font-shape|italic|PIMRC 2008, IEEE 19th International Symposium on
    Personal, Indoor and Mobile Radio Communications>, 2008.

    <bibitem|constellation2>B. Krongold and D. Jones, \PPAR Reduction in OFDM
    via Active Constellation Extension", <with|font-shape|italic|IEEE Trans.
    Broadcast.>, vol.49, iss.3, September 2003.

    <bibitem|Tellado2>J. Tellado and J. M. Cioffi, \PPeak Power Reduction for
    Multicarrier Transmission", <with|font-shape|italic|IEEE Globecom 99>,
    Rio de Janeiro, Brazil, Dec. 5-9, 1999.

    <bibitem|Tellado>J. Tellado, <with|font-shape|italic|Multicarrier
    Modulation with Low PAR Applications to DSL and Wireless>, Kluwer
    Academic Publishers, Norwell 2000.

    <bibitem|Tone_reservation_new>N. Andgart <with|font-shape|italic|et al.>,
    \PDesigning Tone Reservation PAR Reduction,"
    <with|font-shape|italic|EURASIP J. Appl. Signal Process.>, vol. 2006, pp
    82-82, 2006.

    <bibitem|active_set>B.S. Krongold and D.L. Jones, \PAn Active-Set
    Approach for OFDM PAR Reduction via Tone Reservation,"
    <with|font-shape|italic|IEEE Trans. Signal Process.>, vol.52, no.2, pp.
    495-509, Feb. 2004.

    <bibitem|Safadi>E. B. Al-Safadi and T. Y. Al-Naffouri, \POn Reducing the
    Complexity of Tone Reservation Based PAPR Reduction Schemes by
    Compressive Sensing," <with|font-shape|italic|IEEE Globecom '09>,
    Honolulu HI, Nov. 2009.

    <bibitem|Chen_tone>J. C. Chen and C. P. Li, \P Tone Reservation Using
    Near-Optimal Peak Reduction Tone Set Selection Algorithm for PAPR
    Reduction in OFDM Systems," <with|font-shape|italic|IEEE Signal Process.
    Lett.> vol. 17 no. 11 pp. 933-936, Nov. 2010.

    <bibitem|Kashin>J. Ilic and T. Strohmer, \PPAPR Reduction in OFDM using
    Kashin's Representation," <with|font-shape|italic|IEEE 10th Workshop on
    Signal Process. Advances in Wireless Commun.>, pp.444-448, Perugia,
    Italy, June 2009.

    <bibitem|Shao>Fei Shao <with|font-shape|italic|et al.>, \PSOCP Approach
    for PAPR Reduction Using Tone Reservation for the Future DVB-T/H
    Standards," <with|font-shape|italic|Multi-Carrier Systems &
    Solutions>, Springer Netherlands, 2009.

    <bibitem|Janaaththanan>S. Janaaththanan, \PA Gradient Based Algorithm for
    PAPR Reduction of OFDM using Tone Reservation Technique,"
    <with|font-shape|italic|IEEE Veh. Tech. Conf.>, pp. 2977-2980, Singapore,
    May 2008.

    <bibitem|companding2>T. Jiang, W. D. Xiang, P. C. Richardson, D. M. Qu,
    and G. X. Zhu, \POn the Nonlinear Companding Transform for Reduction in
    PAPR of MCM signals," <with|font-shape|italic|IEEE Trans. Wireless
    Commun.>, vol. 6, no. 6, pp. 2017-2021, Jun. 2007.

    <bibitem|companding4>T. Jiang, W. Yao, P. Guo, Y. Song, and D. Qu, \PTwo
    novel nonlinear companding schemes with iterative receiver to reduce PAPR
    in multicarrier modulation systems," <with|font-shape|italic|IEEE Trans.
    Inf. Theory>, vol. 52, no. 2, pp. 268-273, Mar. 2006.

    <bibitem|exponential_companding>Tao Jiang, Yang Yang and Yong-Hua Song,
    \PExponential Companding technique for PAPR reduction in OFDM Systems",
    <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol. 51, no. 2, pp. 244
    - 248, June 2005.

    <bibitem|Xia>P. Xia, S. Zhou, and G.B Giannakis, \PAchieving the Welch
    Bound with Difference Sets," <with|font-shape|italic|IEEE Int. Conf. on
    Acoustics, Speech, and Signal Processing>, March 2005.

    <bibitem|Block_Sparsity1>M. Stojnic <with|font-shape|italic|et al.>, \POn
    the Reconstruction of Block-Sparse Signals with an Optimal Number of
    Measurements," <with|font-shape|italic|IEEE Trans. on Signal Process.>,
    vol. 57 no. 8 pp. 3075-3085, 2009.

    <bibitem|Block_Sparsity2>Y. C. Eldar and H. Bölcskei, \PBlock-Sparsity:
    Coherence and Efficient Recovery," in <with|font-shape|italic|Proc. IEEE
    International Conference on Acoustics, Speech and Signal Processing>,
    pp.2885-2888, 2009.

    <bibitem|Candes1>E. J. Candes, J. Romberg and T. Tao. \PRobust
    Uncertainty Principles: Exact Signal Reconstruction From Highly
    Incomplete Frequency Information," <with|font-shape|italic|IEEE Trans.
    Inf. Theory>, vol. 52 pp. 489-509, 2004.

    <bibitem|Candes2>E. J. Candes, T. Tao. \PNear-optimal signal recovery
    from random projections:universal encoding strategies?,"
    <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol. 52 pp. 5406-5425,
    Dec. 2006.

    <bibitem|Candes3>E. J. Candes, J. Romberg and T. Tao. \PStable signal
    recovery from incomplete and inaccurate measurements,"
    <with|font-shape|italic|Comm. Pure Appl. Math.>, vol. 59 pp. 1207-1223,
    2005.

    <bibitem|Candes4>E. J. Candes, M. Wakin, and S. Boyd, \PEnhancing
    sparsity by reweighted <math|\<ell\><rsub|1>> minimization,"
    <with|font-shape|italic|J. Fourier Anal. Appl.>, vol. 14, no. 5, pp.
    877-905, 2008.

    <bibitem|Candes5>E. J. Candes and Yaniv Plan, \PNear-Ideal Model
    Selection by <math|\<ell\><rsub|1>> Minimization," Preprint, 2007.

    <bibitem|Donoho1>S. S. Chen, D. L. Donoho, and M. A. Saunders, \PAtomic
    Decomposition by Basis Pursuit," <with|font-shape|italic|SIAM J. Sci.
    Comput>. vol. 20, Issue 1, pp. 33-61 (1998).

    <bibitem|Donoho2>D. Donoho, \PCompressed Sensing,"
    <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol. 52(4), pp. 1289 -
    1306, April 2006.

    <bibitem|Tropp2>J. A. Tropp, A. C. Gilbert \PSignal Recovery from Random
    Measurements via Orthogonal Matching Pursuit,"
    <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol. 53, no. 12, pp.
    4655 - 4666, Dec. 2007.

    <bibitem|Tropp3>J. A. Tropp, \PJust relax: Convex programming methods for
    identifying sparse signals," <with|font-shape|italic|IEEE Trans. Inf.
    Theory>, vol. 52, no. 3, pp. 1030 - 1051, Mar. 2006.

    <bibitem|Tropp4>J. A. Tropp, \POn the Linear Independence of Spikes and
    Sines," <with|font-shape|italic|J. Fourier Anal. Appl.>, vol. 14, pp. 838
    - 858, 2008.

    <bibitem|Fletcher>A.K. Fletcher, S. Rangan, and V. Goyal. \PNecessary and
    Sufficient Conditions on Sparsity Pattern Recovery,"
    <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol. 55, num. 12 pp.
    5758 - 5772, Nov 2009.

    <bibitem|Wainwright1>M. J. Wainwright. \PSharp Thresholds for
    High-Dimensional and Noisy Sparsity Recovery Using
    <math|\<ell\><rsub|1>>-Constrained Quadratic Programming (Lasso),"
    <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol. 55, no. 5, May
    2009.

    <bibitem|Wainwright2>W. Wang, M. J. Wainwright, and K. Ramchandran
    \PInformation-theoretic limits on sparse signal recovery: Dense versus
    sparse measurement matrices," <with|font-shape|italic|Technical Report,
    Dept. of Statistics, UC Berkeley>, May 2008.

    <bibitem|Wainwright3>M. J. Wainwright, \PInformation-theoretic
    limitations on sparsity recovery in the high-dimensional and noisy
    setting," <with|font-shape|italic|IEEE Trans. Inf. Theory>, vol.
    55, pp. 5728-5741, December 2009.

    <bibitem|Naffouri>G. Caire, T.Y. Al-Naffouri, and A.K. Narayanan,
    \PImpulse Noise Cancellation in OFDM: an application of compressed
    sensing," <with|font-shape|italic|IEEE Int. Symp. on Info. Theory>, July
    2008.

    <bibitem|Rank_Deficient_Sphere>T. Cui and C. Tellambura, \PAn Efficient
    Generalized Sphere Decoder for Rank-Deficient MIMO Systems,"
    <with|font-shape|italic|IEEE Commun. Lett.>, vol. 9, no. 5, pp. 423-425,
    May 2005.

    <bibitem|Finite_Alphabet>Zhi Tian, Geert Leus, Vincenzo Lottici,
    \PDetection of sparse signals under finite-alphabet constraints," in
    <with|font-shape|italic|Proc. IEEE Int. Conf. on Acoust., Speech and
    Signal Process.>, 2009 pp.2349-2352.

    <bibitem|Giannakis_Finite_Alphabet>H. Zhu and G. B. Giannakis
    \PSparsity-Embracing Multiuser Detection for CDMA Systems with Low
    Activity Factor," in <with|font-shape|italic|Proc. IEEE Int. Symp. Inf.
    Theory>, Seoul, Korea, June 28-July 3, 2009.

    <bibitem|Larsson>E. G. Larsson and Y. Selén, \PLinear Regression With a
    Sparse Parameter Vector", <with|font-shape|italic|IEEE Trans. Signal
    Process.>, vol. 55, no. 2, February 2007.

    <bibitem|Schniter1>P. Schniter, L.C. Potter, and J. Ziniel, \PFast
    Bayesian Matching Pursuit: Model Uncertainty and Parameter Estimation for
    Sparse Linear Models," submitted to <with|font-shape|italic|IEEE Trans.
    Inf. Theory>.

    <bibitem|Schniter2>P. Schniter, L.C. Potter, and J. Ziniel, \PFast
    Bayesian matching pursuit," <with|font-shape|italic|Workshop on Inf.
    Theory and Applicat.> (ITA), La Jolla, CA, January 2008.

    <bibitem|Boyd>M. Grant and S. Boyd. CVX: Matlab software for disciplined
    convex programming (web page and software).
    http://stanford.edu/~boyd/cvx, February 2009.

    <bibitem|Boyd2>S. Boyd and L. Vandenberghe,
    <with|font-shape|italic|Convex Optimization>, Cambridge University Press,
    2004.

    <bibitem|Tibshirani>R. Tibshirani, \PRegression Shrinkage and Selection
    via the LASSO," <with|font-shape|italic|J. of the Roy. Stat. Soc.>,
    Series B, vol. 58, no. 1, pp. 267-288, 1996.

    <bibitem|Wipf1>D. Wipf and S. Nagarajan, \PIterative Reweighted
    <math|\<ell\><rsub|1>> and <math|\<ell\><rsub|2>> Methods for Finding
    Sparse Solutions," Submitted, 2009.

    <bibitem|Wipf4>D. Wipf and S. Nagarajan, \PA New View of Automatic
    Relevance Determination", <with|font-shape|italic|Advances in Neural Inf.
    Process. Syst.>, vol. 20, pp. 1625-1632, 2008.

    <bibitem|Imai>H. Ochiai, H. Imai, \POn the Distribution of the
    Peak-to-Average Power Ratio in OFDM Signals,"
    <with|font-shape|italic|IEEE Trans. Commun>, vol.49, no.2, pp.282-289,
    Feb 2001.

    <bibitem|Bahai>A.R.S. Bahai, M. Singh, A.J. Goldsmith, B.R. Saltzberg,
    \PA New Approach For Evaluating Clipping Distortion In Multicarrier
    Systems", <with|font-shape|italic|IEEE J. Sel. Areas Commun.>, vol.20,
    no.5, pp.1037-1046, June 2002.

    <bibitem|Wei>S. Wei, D.L. Goeckel, and P.E. Kelly, \PA Modern Extreme
    Value Theory Approach to Calculating the Distribution of the
    Peak-to-Average Power Ratio in OFDM Systems," in
    <with|font-shape|italic|Proc. IEEE Int. Conf. on Commun.>, vol.3, pp.
    1686-1690, 2002.

    <bibitem|Capacity_clipped_1>F. Peng and W. E. Ryan, \POn the Capacity of
    Clipped OFDM Channels," in <with|font-shape|italic|Proc. IEEE Int. Symp.
    Inf. Theory>, Seattle, WA, July 2006, pp. 1866 - 1870.
  </thebibliography>

  <new-page>
</body>