# **Monolithic Pixel Detectors**

Some examples and practical considerations

Piero Giubilato 26/11/2013 Torino

## **Introduction** – standard CMOS monolithic pixels



# Introduction – monolithic vs hybrid pixel detectors



Hybrid pixel detectors:

- Established, proven, effective technology.
- Unique possibility to use the best sensor depending on the radiation to track.
- Plenty of room for extremely advanced in-pixel electronics.
- Cost, complexity, moderate power consumption, material budget.
- Producing small (< 20 μm) pixels is still a challenge for bump bonding.



Monolithic pixel detectors:

- Young technology
- Sensing material limited to silicon
- No room in pixel for advanced signal processing
- Radiation tolerance could still be an issue for high-dose applications
- Cost effective, simple, low power and low material budget
- High resolution (pixels < 5 µm)

# Do we really need monolithic?


Monolithic pixel detectors are not the Holy Grail of pixelated radiation imagers!



Actually, they have severe limitations when compared to other technologies!



We need strong motivations to get real advantages by using them.

## **Motivations** – sheer quantity

The quantity of detectors to produce is large enough to call for cost savings in both production and assembly.



CMS tracker – 200 m<sup>2</sup>

Imagers – Mpieces/year

In such cases using a standard CMOS process could save the day.

## **Motivations** – passing particles

The radiation passing through the detector must be affected as little as possible. Ultra-thin detectors (less than 50 µm) are required.



## **Motivations** – passing particles 2

An ancillary constraint of minimizing the device thickness to the technical limit is that it is not possible to use any backbone to supply / support / connect a tiled large area detector. This implies using very large (many 10 cm<sup>2</sup>) dies realized by stitching, i.e. sensors larger than the reticle size. Bump-bonding is not practical for such oversized detectors.



# Motivations – ultra low power (<≈10 mW/cm<sup>2</sup>)

For a truly low mass, very low power is mandatory in large area systems (think about cooling, supply lines...)



33 kW in the detector and... 62 kW in the cables !

# Motivations – ultra low power (<≈10 mW/cm<sup>2</sup>)

Also for small and medium sensors operated in vacuum with no cooling (the detector must stay very thin, no support allowed) ultra low power can be a mandatory requirement.



The sensor must be large, ultra thin, and sit in vacuum

## **Motivations** – very small pixels

Whenever the pixel pitch has to go below 10 µm, the room for electronics in the pixel cell, even in hybrid detectors, starts to run short.



Piero Giubilato – Data driven monolithic pixel sensors – Torino 26/11/2013

# **Reasons for going monolithic:**

Ultra low power budget (thanks to sensor capacitance)

Low material budget (thin device, small clusters)

Small pixels (no room for bonding and/or complex in-pixel electronics)

Very large areas (stitching)

- All of these require very specific architectures and technical solutions!
- Low transistor count (per pixel), ultra low power, on-chip data reduction, etc.

## **Common ground** – power consumption factors



Analog: determined by collected charge over capacitance (Q/C) in the pixel: pixel sensor optimization.



Digital: determined by on-chip architecture and cluster size. The standard is rolling shutter; architectures with an in-pixel binary front-end are under study.



Data transmission off-chip: determined by cluster size and required bandwidth, unless data are reduced by a clustering algorithm.

### **Common ground** – low capacitance to low power 1



$$di_{eq} = g_m \cdot dv_{eq}$$

Transconductance $g_m$ is related to power consumption, hence higher current (power) in the first stage improves performance and lowers noise.

Noise, weak inversion (first term: flicker noise, second term: thermal noise):

$$dv_{eq}^{2} = \left(\frac{K_{F}}{WLC_{ox}^{2}f^{\alpha}} + \frac{2K_{B}Tn}{g_{m}}\right)df \quad g_{m} \sim I$$

Noise, strong inversion:

$$dv_{eq}^{2} = \left(\frac{K_{F}}{WLC_{ox}^{2}f^{\alpha}} + \frac{4K_{B}T\gamma}{g_{m}}\right)df \quad g_{m} \sim \sqrt{I}$$

- *K<sub>F</sub>* technology dependent constant
- W, L transistor width and length
- $C_{ox}$  gate oxide capacitance per unit area
- $g_m$  transistor transconductance
- *K<sub>B</sub>* Boltzmann constant
- *T* absolute temperature
- *n* weak inversion slope
- $\gamma$ often around 2/3 in strong inversion

### **Common ground** – low capacitance to low power 2



#### Analog power is very strongly dependent on Q/C => we want low C
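The scaling behind this statement can be sketched from the thermal-noise terms above. This is a minimal sketch under stated assumptions: the input signal is Q/C, thermal noise dominates over flicker noise, and the bandwidth is fixed, so S/N scales as (Q/C)·√g_m. Holding S/N constant with g_m ∼ I (weak inversion) or g_m ∼ √I (strong inversion) then fixes the current. The function name and the exponent parameter `m` are illustrative and do not come from the slides.

```python
def relative_current(qc_gain: float, m: int) -> float:
    """Current (power) needed for the same S/N after Q/C improves by `qc_gain`.

    Assumes thermal noise dominates, so S/N ~ (Q/C) * sqrt(g_m), and
    g_m ~ I**(1/m) with m = 1 (weak inversion) or m = 2 (strong inversion).
    Constant S/N then requires I ~ (Q/C)**(-2 * m).
    """
    if m not in (1, 2):
        raise ValueError("m must be 1 (weak inversion) or 2 (strong inversion)")
    return qc_gain ** (-2 * m)


for label, m in (("weak inversion", 1), ("strong inversion", 2)):
    factor = relative_current(2.0, m)
    print(f"{label}: doubling Q/C needs {factor:.4f}x the current")
```

Under these assumptions, doubling Q/C cuts the required front-end current to a quarter in weak inversion and to a sixteenth in strong inversion.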

## **Common ground** – Q/C in monolithic visualized



## **Common ground** – how to keep C very low (LePix)



## **Common ground** – depletion to reduce data and increase S/N



## **Common ground** – backlit + thinning to increase S/N



## **Common ground** – depleted + thinned + backlit to maximize S/N



- A back-illuminated, fully depleted sensor could see the whole spectrum.
- Spectroscopic capabilities over the full detection range thanks to low noise.
- Soft X-rays (0.5 keV – 10 keV): absorption length 1 μm – 100 μm.

- Visible range 0.05 μm – 7 μm: color imaging without filters.
- Large area (up to 20 x 20 cm<sup>2</sup>) and small pixel pitch (down to 1 μm) are key characteristics for Synchrotron Light Sources and Free Electron Lasers.

## **Common ground** – <sup>55</sup>Fe (5.9 keV) with depleted MAPS (LePix)



# **Clever architecture**

Assuming we implemented all previous technical tricks and squeezed the ultimate S/N out of our pixels, how can we effectively retrieve all data out of the matrix maintaining the low power goal?

- ALICE Inner Tracking System pALPIDE prototype.
- Ultra low power large area OrthoPix prototype.
- I'm sure no time for this: large TEM detectors.

# **pALPIDE** pixel detector



- Improve impact parameter resolution by a factor of  $\approx 3$
- Get closer to the IP (position of first layer): from 39 mm to 22 mm
- Reduce material budget: X/X<sub>0</sub> per layer: from 1.14% to 0.3% (inner layers)
- Reduce pixel size (currently 50  $\mu$m x 425  $\mu$m) by moving to monolithic pixels: foreseen size ranging roughly from 20  $\mu$m x 20  $\mu$m to 40  $\mu$m x 40  $\mu$m.
- Improve tracking efficiency and resolution at low pT
- Increase granularity: from 6 to 7 layers, reduced pixel size
- Fast readout of Pb-Pb interactions at > 50 kHz and pp interactions at ~ MHz.
- Fast insertion/removal for yearly maintenance.
- Possibility to replace non-functioning detector modules during the yearly shutdown

## **pALPIDE** – ALICE Inner Tracking System specifications



|                                                | Inner                | Inner              | Inner                | Middle               | Middle               | Outer                | Outer                |
|------------------------------------------------|----------------------|--------------------|----------------------|----------------------|----------------------|----------------------|----------------------|
| Layer                                          | 0                    | 1                  | 2                    | 3                    | 4                    | 5                    | 6                    |
| Position [mm]                                  | 23                   | 32                 | 39                   | 196                  | 245                  | 344                  | 393                  |
| Particles [10<sup>-5</sup> s cm<sup>-2</sup>]  | 30                   | 20                 | 15                   | 1                    | 0.7                  | 0.3                  | 0.3                  |
| NIEL [1 MeV n cm<sup>-2</sup>]                 | 9.2×10<sup>12</sup>  | 6×10<sup>12</sup>  | 3.8×10<sup>12</sup>  | 5.4×10<sup>11</sup>  | 5.0×10<sup>11</sup>  | 4.8×10<sup>11</sup>  | 4.6×10<sup>11</sup>  |
| TID [kGy]                                      | 6.46                 | 3.8                | 2.16                 | 0.15                 | 0.1                  | 0.08                 | 0.06                 |
| Material [% X<sub>0</sub>]                     | 0.3                  | 0.3                | 0.3                  | 0.89                 | 0.89                 | 0.89                 | 0.89                 |
| Data [Mbit chip<sup>-1</sup> s<sup>-1</sup>]   | 284                  | 174                | 121                  | 14                   | 12                   | 11                   | 10                   |

# **pALPIDE** – overview

Developed jointly by Wuhan, INFN, and CERN, pALPIDE is one of the prototypes under evaluation to equip the ALICE Inner Tracking System.

- Tower-Jazz 0.18  $\mu$m process, deep p-well, high-resistivity epitaxial layer.
- New low-power in-pixel discriminator front-end.
- Data-driven digital read-out ("priority encoder").



- 32 768 pixels of 22  $\mu m$  × 22  $\mu m$  (+ a few test pixels).
- Active area:  $11.3 \times 1.4 \text{ mm}^2$ .
- Prototyped in four different versions and on seven different substrates.



- AC sensitive circuit, always active.
- "Shaping time"  $\approx$  3 to 5 µs (still being defined).
- Hit latch inside each pixel (one).
- Global shutter capable (WRITE\_EN).



#### Block Size: 10.5 x 22.0 µm<sup>2</sup>

Memory size: 6.9 x 7.3 µm<sup>2</sup>



### All circuitry in deep p-well, except for the collection node.

## **pALPIDE** – global architecture



- Double column with mixed analog/digital cells.
- Asynchronous matrix readout through "priority encoder" (see next slides).

## **pALPIDE** – priority encoder addressing

Priority encoder tree schematic (signal labels: data[i], Pix\_reg[i], rst[i], valid[j], fdbk[j], addr[k]).

$$N_{stages} = \log_b(N) \qquad N_{blocks} = \frac{N-1}{b-1}$$

N is the total number of pixels to read and b is the number of inputs of the basic block.

- Each clock cycle the highest priority pixel address is read out (and reset).
- Pixel address readout time: 20 ns / address (@ 50 MHz)
- Asynchronous circuit with no clock propagation into the pixel matrix (combinatorial logic to manage the reset): power & noise reduction.
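The tree formulas and the read-and-reset behaviour described above can be illustrated with a short behavioural model. This is only a sketch, not the pALPIDE logic: it assumes the pixel count is an exact power of the block fan-in `b`, and that the lowest address has the highest priority. The function names `tree_size` and `read_out` are hypothetical.

```python
def tree_size(n_pixels: int, b: int) -> tuple[int, int]:
    """Return (N_stages, N_blocks) for a tree of b-input blocks over n_pixels.

    Assumes n_pixels is an exact power of b, so that
    N_stages = log_b(N) and N_blocks = (N - 1) / (b - 1).
    """
    stages, n = 0, n_pixels
    while n > 1:
        if n % b:
            raise ValueError("n_pixels must be a power of b")
        n //= b
        stages += 1
    return stages, (n_pixels - 1) // (b - 1)


def read_out(pix_reg: list[bool]) -> list[int]:
    """Return hit addresses in readout order, one per clock cycle.

    Each cycle the highest-priority hit (assumed here: lowest address)
    is read and its register is reset.
    """
    regs = list(pix_reg)
    addresses = []
    while any(regs):
        addr = regs.index(True)  # highest-priority pending hit
        addresses.append(addr)
        regs[addr] = False       # reset the pixel that was just read
    return addresses


print(tree_size(8, 2))  # 8 pixels with 2-input blocks
print(read_out([False, True, False, False, True, False, False, True]))
```

For 8 pixels and 2-input blocks the model gives 3 stages and 7 blocks, which matches the three address bits and seven `valid`/`fdbk` pairs visible in the schematic labels.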


T. Kugathasan



Pixels arranged in columns: 2 adjacent, mirrored, columns share the same digital area. After a trigger, read only the active pixels, then reset them. Possible readout architecture with priority encoder: basic cell of 4 pixels repeated.

# **pALPIDE** – analog/digital in pixel cell

T. Kugathasan



# **pALPIDE** – priority encoder synch or asynch implementation



Synchronous implementation:

- Easy encoding of hit data
- Little logic (smaller pixel area, 1 FF)
- Clock through the matrix: switching problems, power consumption

Asynchronous implementation:

- No clock propagation: lower power
- Asynchronous hit encoding
- Needs more logic to ensure proper reset: larger pixel area

## **pALPIDE** – priority encoder synch or asynch power consumption

T. Kugathasan



- Always reset the pixel with the highest priority (least significant bit).
- Asynchronous combinatorial logic to manage the reset.
- Readout time depends logarithmically on pixel count.

## **pALPIDE** – serializer and driver at the periphery

G. Mazza

Ask people here!

| Parameter               | Value      |
|-------------------------|------------|
| Input clock             | 40 MHz     |
| Transmission clock      | 1 GHz      |
| Transmission type       | DDR        |
| Line rate               | 2 Gbit/s   |
| Line encoding           | 8b/10b     |
| Effective line capacity | 1.6 Gbit/s |

Block diagram labels: driver, output MUX.
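The last three rows of the table follow from the first ones by simple arithmetic. A minimal sketch, assuming DDR means two bits per transmission clock cycle and that the line encoding carries 8 payload bits in every 10 transmitted bits (the function name is illustrative):

```python
def effective_line_capacity(tx_clock_hz: float, ddr: bool,
                            payload_bits: int, coded_bits: int) -> float:
    """Payload bit rate of a serial link after line-encoding overhead."""
    line_rate = tx_clock_hz * (2 if ddr else 1)  # DDR: one bit per clock edge
    return line_rate * payload_bits / coded_bits


capacity = effective_line_capacity(1e9, ddr=True, payload_bits=8, coded_bits=10)
print(f"Effective line capacity: {capacity / 1e9:.1f} Gbit/s")
```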


**Clock drivers** 

#### Power

| Source                                  | Power                       |
|-----------------------------------------|-----------------------------|
| Analog front end: 40 nW * 250000 pixels | 10 – 15 mW cm <sup>-2</sup> |
| Digital readout (simulations)           | 10 – 20 mW cm <sup>-2</sup> |
| Data transmission (simulations)         | 5 – 15 mW cm <sup>-2</sup>  |

Trying to approach 50 mW cm<sup>-2</sup>
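The figures in the power table can be cross-checked with a few lines. This sketch only uses the numbers quoted in the table; it assumes the 250000 pixels are meant per cm<sup>2</sup>, and the variable names are illustrative.

```python
PIXELS_PER_CM2 = 250_000           # from the table (assumed to be per cm^2)
ANALOG_POWER_PER_PIXEL_W = 40e-9   # 40 nW per front end

analog_mw_cm2 = PIXELS_PER_CM2 * ANALOG_POWER_PER_PIXEL_W * 1e3

# (low, high) estimates in mW/cm^2, copied from the table
budget_mw_cm2 = {
    "analog front end": (10, 15),
    "digital readout": (10, 20),
    "data transmission": (5, 15),
}
total_low = sum(low for low, _ in budget_mw_cm2.values())
total_high = sum(high for _, high in budget_mw_cm2.values())

print(f"Analog front end alone: {analog_mw_cm2:.1f} mW/cm^2")
print(f"Total budget: {total_low} to {total_high} mW/cm^2")
```

The front-end product reproduces the lower edge of the analog range, and the upper edge of the summed budget is the 50 mW cm<sup>-2</sup> quoted above.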

#### Speed

| Region                                        | Time    |
|-----------------------------------------------|---------|
| Rolling shutter, full matrix (@ 50 MHz)       | ≈20 µs  |
| Priority encoder, 2 hit pixels over 512 × 32  | ≈40 ns  |
| Priority encoder, 80 hit pixels over 512 × 32 | ≈1.6 µs |
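The priority-encoder rows of the speed table follow from one address per clock cycle at 50 MHz. A small sketch of that relation (the function name is illustrative; the rolling-shutter comparison simply converts the quoted ≈20 µs into clock cycles):

```python
CLOCK_HZ = 50e6  # readout clock from the table


def priority_encoder_time_s(n_hits: int, clock_hz: float = CLOCK_HZ) -> float:
    """Readout time when exactly one hit address is read per clock cycle."""
    return n_hits / clock_hz


for hits in (2, 80):
    print(f"{hits} hit pixels: {priority_encoder_time_s(hits) * 1e9:.0f} ns")

# Number of hits at which the priority encoder takes as long as the
# quoted ~20 us full-matrix rolling-shutter readout.
break_even_hits = round(20e-6 * CLOCK_HZ)
print(f"Break-even at about {break_even_hits} hits")
```

In this simple model the data-driven readout stays faster than the rolling shutter as long as the number of hit pixels per readout is below about a thousand.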

# Pushing the limit: OrthoPix



## **OrthoPix** – the case: sparsely populated "images"

Sparsely populated image: only a few % of the pixels are interesting



## **OrthoPix** – zero suppression (sparsification) overview

Current pixels (mostly hybrids but also monolithic) can do that

- Millions of pixels and frames/s
- Counting, suppressing, compression, ... in pixel!!
- Many successful applications (HEP, medical, light, ...)

But what if we also need any (combination) of the following?

- Very large area detector (many m<sup>2</sup> scale)
- Low power consumption (<10 mW/cm<sup>2</sup>)
- Low material budget (<50 μm thick)
- Small pixel pitch (<< 20 μm)
- Assembly simplicity and low cost


# **OrthoPix** – zero suppression (sparsification) approaches



- First approach: the pixel decides whether or not it is carrying data for the periphery. This requires space and power.
- Second approach: pixels are connected to the periphery in a static way, and they are brainless. Neither space nor power is required.

## **OrthoPix** – reshaping power geography



Sparsification to limit the output bandwidth for high count and/or speed.

### <u>BUT</u>

Carrying clock/data through the sensitive area is a power nightmare (a 50 MHz clock over 2 cm easily costs 5-10 mW/cm<sup>2</sup>).

### **IDEALLY**

Everything should happen in the periphery. No clock around!

Moving power (and data) to the periphery allows for an easier and more effective cooling, hence for lower power consumption and material budget.

## **OrthoPix** – starting point: xy projections and the Poisson limit



XY projections offer classic "static" compression, with the limitation that they fail in case of multiple hits: even at low occupancies Poisson statistics are against you!
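The failure mode can be made concrete with a toy example. The sketch below projects a set of hits onto the x and y axes, rebuilds every (x, y) combination compatible with the projections, and lists the "ghost" candidates that were never hit. It also evaluates the Poisson probability of two or more hits in the same projected region for a given mean number of hits. All names are illustrative, and the example is not the OrthoPix encoding itself.

```python
import math
from itertools import product


def project(hits: set[tuple[int, int]]) -> tuple[list[int], list[int]]:
    """Collapse a set of (x, y) hits into its x and y projections."""
    xs = sorted({x for x, _ in hits})
    ys = sorted({y for _, y in hits})
    return xs, ys


def candidates(xs: list[int], ys: list[int]) -> set[tuple[int, int]]:
    """All pixel positions compatible with the two projections."""
    return set(product(xs, ys))


def p_multiple_hits(mean_hits: float) -> float:
    """Poisson probability of two or more hits for a given mean."""
    return 1.0 - math.exp(-mean_hits) * (1.0 + mean_hits)


hits = {(2, 5), (7, 1)}
xs, ys = project(hits)
ghosts = candidates(xs, ys) - hits
print("ghost hits:", sorted(ghosts))
print(f"P(>=2 hits) at a mean of 0.5 hits: {p_multiple_hits(0.5):.3f}")
```

With a single hit the reconstruction is unambiguous; two hits on different rows and columns already produce two ghosts, and the Poisson term shows that multi-hit events remain non-negligible even when the mean is well below one.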

# **OrthoPix** – using $\pi_n$ : xy projections and the Poisson limit



## **OrthoPix** – using $\pi_n$ : xy projections and the Poisson limit



# **OrthoPix** – what does it mean in practical applications?

To produce large area detectors in a convenient way, large chips are necessary. Stitching allows single-piece detectors up to 10 cm on a side. OrthoPix can read them at GHz speed with minimal power consumption (10 mW cm<sup>-2</sup>).







Just seen "first light" yesterday... a lot of testing to come!

Designed in both standard CMOS and "specialty" BJT layouts. Realized in Tower-Jazz 0.18 μm, on substrates of various thickness and resistivity.

# **Backup slides**

## **pALPIDE** – pixel cell cascode amplifier details

T. Kugathasan



# Rad hard CMOS for proton imaging



# **Proton therapy** – physics rationale

Proton (ion) energy transfer is highly localized (Bragg peak): greater effectiveness and much lower collateral damage compared with traditional x-rays.



## **Proton imaging** – Bragg peak to reduce collateral damage

Much lower collateral damage compared with photons thanks to the focused energy deposition: less damage to surrounding tissues, less chance of secondary tumors.







# Proton therapy – aiming limits

Aiming the Bragg peak requires fine tuning of the <u>proton energy</u> to account for the tissue densities they have to traverse to reach the tumor.



X-ray 3D CTs cannot distinguish tissue densities with the required precision, leading to Bragg peak aiming errors much worse than the Bragg peak intrinsic spread. <u>But protons actually can</u> (and with much less dose).



# **Proton imaging** – state of the art

The pCT works on the same principle as a "standard" x-ray CT: recording particles passing through the target from different angles to reconstruct a 3D image. The main difference is that, while photons are simply absorbed, protons also scatter.









# **Microscopy** – a lot of R&D toward rad-hard MAPS in past years

- Developed in the frame of LBNL Laboratory Directed Research & Development (LDRD) grant.
- Manufactured in the AMS 0.35 μm CMOS-OPTO process (optimized for low leakage current, 5 metal layers), with 14 μm nominal epitaxial layer thickness.





**NW** layout: n-well diode with p+ guard-ring and thin oxide on top

- 96x96 pixels, 20x20 µm<sup>2</sup>, arrayed in several sub-sectors implementing different transistor layouts and different configurations of the charge collection diode.
- Simple 3-transistor (3T) pixel architecture.





**PO** layout: n-well diode with p+ rings, thin oxide on top and polysilicon ring

**GR** layout: n-well diode with enclosing p+ guard-ring

# **Microscopy – 200 keV electrons irradiation results**

- 200 keV electrons are expected to cause only ionising damage in Si (the threshold energy for displacement damage is 260 keV).
- Electron flux of ~2300 e<sup>-</sup>µm<sup>-2</sup>s<sup>-1</sup>, i.e. ~9x10<sup>5</sup> e<sup>-</sup>/pixel/s (e.g. diffraction mode).
- Irradiation performed in steps up to a total dose of 1.11 MRad. Dark levels monitored as a function of dose.
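The per-pixel rate quoted above follows directly from the flux and the pixel area. A quick check, assuming the 20x20 µm<sup>2</sup> pixels of the test chip described on the previous slide (variable names are illustrative):

```python
FLUX_E_PER_UM2_S = 2300   # electron flux from the slide
PIXEL_PITCH_UM = 20       # assumed: 20 x 20 um^2 pixels of the test chip

rate_per_pixel = FLUX_E_PER_UM2_S * PIXEL_PITCH_UM ** 2
print(f"{rate_per_pixel:.1e} electrons per pixel per second")
```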



- After irradiation, the increase of leakage current in the exposed pixels gives a latent image of the mesh wires.
- Measurement of PSF ~30 μm possible, but e<sup>-</sup> scattering on mesh borders spoils the actual figure.



## Microscopy – atoms e<sup>-</sup> imaging with 1 MPixel



#### TEAM 1K detector

- AMS 0.35 µm OPTO process.
- 1M pixel, 9.5 µm pixel pitch
- Rad-hard design.
- 25 MHz readout speed
- 16 parallel analog outputs
- Up to 400 Frames/s.
- Thinned down to 50 µm to reduce backscattering.



# **Microscopy** – cluster imaging for high resolution imaging

Cluster imaging: instead of integrating the e<sup>-</sup> flux in the detector, operate it in "single particle" tracking mode, retrieving the cluster generated by each e<sup>-</sup> impact. Reconstruct the image by summing up the coordinates of all the collected clusters.
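A minimal sketch of this idea for a single frame is given below. It assumes a fixed threshold, 4-connected clusters, and a charge-weighted centroid as the cluster coordinate; none of these choices is stated in the slides, and the function names are hypothetical.

```python
def find_clusters(frame, threshold):
    """Group above-threshold pixels into 4-connected clusters."""
    rows, cols = len(frame), len(frame[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] <= threshold or (r, c) in seen:
                continue
            stack, cluster = [(r, c)], []
            seen.add((r, c))
            while stack:  # flood fill from the seed pixel
                y, x = stack.pop()
                cluster.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and frame[ny][nx] > threshold:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            clusters.append(cluster)
    return clusters


def centroid(frame, cluster):
    """Charge-weighted (row, column) position of one cluster."""
    total = sum(frame[y][x] for y, x in cluster)
    cy = sum(y * frame[y][x] for y, x in cluster) / total
    cx = sum(x * frame[y][x] for y, x in cluster) / total
    return cy, cx


frame = [
    [0, 0, 0, 0, 0],
    [0, 10, 30, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 20],
]
clusters = find_clusters(frame, threshold=5)
positions = [centroid(frame, cluster) for cluster in clusters]
print(positions)
```

The final image would then be a histogram of such positions accumulated over many frames, possibly on a grid finer than the pixel pitch.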

Bright field





### **Microscopy** – atoms e<sup>-</sup> imaging with 4 MPixel



#### **TEAM 2K detector**

- AMS 0.35 µm OPTO process
- 4M pixel, 9.5 µm pixel pitch
- Rad-hard design
- 25 MHz readout speed
- 64 parallel analog outputs
- Up to 400 Frames/s
- Thinned down to 50 µm to reduce backscattering
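The quoted frame rate is consistent with the pixel count, readout clock, and number of outputs for both TEAM detectors. The check below assumes 1024 x 1024 and 2048 x 2048 pixel arrays for "1M" and "4M", one pixel per output per clock cycle, and no per-frame overhead; the function name is illustrative.

```python
def max_frame_rate(n_pixels: int, n_outputs: int, pixel_clock_hz: float) -> float:
    """Upper bound on frames/s when every output reads one pixel per clock."""
    return n_outputs * pixel_clock_hz / n_pixels


team_1k = max_frame_rate(1024 * 1024, n_outputs=16, pixel_clock_hz=25e6)
team_2k = max_frame_rate(2048 * 2048, n_outputs=64, pixel_clock_hz=25e6)
print(f"TEAM 1K: {team_1k:.0f} frames/s, TEAM 2K: {team_2k:.0f} frames/s")
```

Quadrupling both the pixel count and the number of parallel outputs leaves the bound unchanged, which is why the two detectors share the same "up to 400 frames/s" figure.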



## **Microscopy** – TEAM in 4MPixel counting mode

A recent reconstruction of the 20S proteasome from K2 Summit<sup>™</sup> counting data is estimated to be at 4.4 Å resolution (0.5 FSC). At this resolution both beta sheets and alpha helices are visible.

