Event camera
[[File:Prophesee Event Camera Evaluation Kit.jpg|thumb|Prophesee event camera evaluation kit]]
{{short description|Type of imaging sensor}}
{{Redirect|Silicon retina|visual prosthesis|Artificial silicon retina}}
{{Redirect|Dynamic vision sensor|information processing cameras|Smart vision sensor||Vision sensor (disambiguation){{!}}Vision sensor}}
An event camera, also known as a neuromorphic camera,{{cite journal |last1=Li |first1=Hongmin |last2=Liu |first2=Hanchao |last3=Ji |first3=Xiangyang |last4=Li |first4=Guoqi |last5=Shi |first5=Luping |title=CIFAR10-DVS: An Event-Stream Dataset for Object Classification |journal=Frontiers in Neuroscience |date=2017 |volume=11 |page=309 |doi=10.3389/fnins.2017.00309 |pmid=28611582 |language=English |issn=1662-453X|pmc=5447775 |doi-access=free }} silicon retina,{{cite journal |last1=Sarmadi |first1=Hamid |last2=Muñoz-Salinas |first2=Rafael |last3=Olivares-Mendez |first3=Miguel A. |last4=Medina-Carnicer |first4=Rafael |title=Detection of Binary Square Fiducial Markers Using an Event Camera |journal=IEEE Access |date=2021 |volume=9 |pages=27813–27826 |doi=10.1109/ACCESS.2021.3058423 |url=https://ieeexplore.ieee.org/document/9351958 |issn=2169-3536|arxiv=2012.06516 |bibcode=2021IEEEA...927813S |s2cid=228375825 }} or dynamic vision sensor,{{cite book |last1=Liu |first1=Min |last2=Delbruck |first2=Tobi |title=2017 IEEE International Symposium on Circuits and Systems (ISCAS) |chapter=Block-matching optical flow for dynamic vision sensors: Algorithm and FPGA implementation |date=May 2017 |pages=1–4 |doi=10.1109/ISCAS.2017.8050295 |arxiv=1706.05415 |isbn=978-1-4673-6853-7 |s2cid=2283149 |url=https://ieeexplore.ieee.org/document/8050295 |access-date=27 June 2021}} is an imaging sensor that responds to local changes in brightness. Event cameras do not capture images using a shutter as conventional (frame) cameras do. Instead, each pixel inside an event camera operates independently and asynchronously, reporting changes in brightness as they occur, and staying silent otherwise.
== Functional description ==
Event camera pixels independently respond to changes in brightness as they occur.{{Cite journal |last1=Lichtsteiner |first1=P. |last2=Posch |first2=C. |last3=Delbruck |first3=T. |date=February 2008 |title=A 128×128 120 dB 15μs Latency Asynchronous Temporal Contrast Vision Sensor |url=https://www.zora.uzh.ch/id/eprint/17629/1/Lichtsteiner_Latency_V.pdf |journal=IEEE Journal of Solid-State Circuits |volume=43 |issue=2 |pages=566–576 |bibcode=2008IJSSC..43..566L |doi=10.1109/JSSC.2007.914337 |issn=0018-9200 |s2cid=6119048 |access-date=2019-12-06 |archive-date=2021-05-03 |archive-url=https://web.archive.org/web/20210503085033/https://www.zora.uzh.ch/id/eprint/17629/1/Lichtsteiner_Latency_V.pdf |url-status=dead }} Each pixel stores a reference brightness level and continuously compares it to the current brightness level. If the difference in brightness exceeds a threshold, that pixel resets its reference level and generates an event: a discrete packet that contains the pixel address and timestamp. Events may also contain the polarity (increase or decrease) of a brightness change, or an instantaneous measurement of the illumination level,{{Cite journal |last1=Posch |first1=C. |last2=Matolin |first2=D. |last3=Wohlgenannt |first3=R. |date=January 2011 |title=A QVGA 143 dB Dynamic Range Frame-Free PWM Image Sensor With Lossless Pixel-Level Video Compression and Time-Domain CDS |journal=IEEE Journal of Solid-State Circuits |volume=46 |issue=1 |pages=259–275 |bibcode=2011IJSSC..46..259P |doi=10.1109/JSSC.2010.2085952 |issn=0018-9200 |s2cid=21317717}} depending on the specific sensor model. Thus, event cameras output an asynchronous stream of events triggered by changes in scene illumination.
[[File:Event camera comparison.jpg|thumb]]
Event cameras typically offer microsecond temporal resolution, 120 dB dynamic range, and less under/overexposure and motion blur{{Cite web|last=Longinotti|first=Luca|title=Product Specifications|url=https://inivation.com/support/product-specifications/|access-date=2019-04-21|website=iniVation|archive-date=2019-04-02|archive-url=https://web.archive.org/web/20190402163200/https://inivation.com/support/product-specifications/|url-status=dead}} than frame cameras. This allows them to track object and camera movement (optical flow) more accurately. They yield grey-scale information. Initially (2014), resolution was limited to 100 pixels.{{Citation needed|date=January 2025}} A later entry reached 640×480 resolution in 2019.{{Citation needed|date=January 2025}} Because individual pixels fire independently, event cameras appear suitable for integration with asynchronous computing architectures such as neuromorphic computing. Pixel independence allows these cameras to cope with scenes containing both brightly and dimly lit regions without having to average across them.{{Cite news|date=2022-01-29|title=A new type of camera|newspaper=The Economist|url=https://www.economist.com/science-and-technology/a-new-type-of-camera/21807384|access-date=2022-02-02|issn=0013-0613}} Although the camera reports events with microsecond resolution, the actual temporal resolution (or, alternatively, the sensing bandwidth) is on the order of tens of microseconds to a few milliseconds, depending on signal contrast, lighting conditions, and sensor design.{{cite arXiv |last1=Hu |first1=Yuhuang |title=v2e: From Video Frames to Realistic DVS Events |date=2021-04-19 |eprint=2006.07722 |last2=Liu |first2=Shih-Chii |last3=Delbruck |first3=Tobi|class=cs.CV }}
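The per-pixel event-generation principle described above can be illustrated with a short simulation. The following Python sketch converts a sequence of brightness frames into polarity events; it is a deliberately simplified toy model (no noise, latency or bandwidth limits), not the circuit of any particular sensor nor the algorithm of the cited v2e simulator, and the function name and threshold value are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2):
    """Toy per-pixel event generator from a brightness video (illustrative only).

    frames: (N, H, W) array of linear intensity; timestamps: (N,) seconds.
    Returns events as (t, x, y, polarity) tuples with polarity in {+1, -1}."""
    log_i = np.log(frames.astype(np.float64) + 1e-6)    # pixels respond to changes in log intensity
    ref = log_i[0].copy()                               # per-pixel reference brightness level
    events = []
    for k in range(1, len(frames)):
        diff = log_i[k] - ref
        ys, xs = np.nonzero(np.abs(diff) >= threshold)  # pixels whose change exceeds the threshold
        for y, x in zip(ys, xs):
            pol = 1 if diff[y, x] > 0 else -1
            n = int(abs(diff[y, x]) // threshold)       # one event per threshold crossing
            events.extend((timestamps[k], int(x), int(y), pol) for _ in range(n))
            ref[y, x] += pol * n * threshold            # reset the reference toward the new level
    return events
</syntaxhighlight>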
{| class="wikitable"
|+Typical image sensor characteristics
!Sensor
!Dynamic range (dB)
!Equivalent framerate (fps)
!Spatial resolution (MP)
!Power consumption (mW)
|-
|Human eye
|30–40
|200-300*
| -
|
|-
|High-end DSLR camera (Nikon D850)
|
|120
|2–8
| -
|-
|Ultrahigh-speed camera (Phantom v2640){{Cite web|url=https://www.phantomhighspeed.com/products/cameras/ultrahighspeed/v2640|title=Phantom v2640|website=www.phantomhighspeed.com|access-date=2019-04-22}}
|64
|12,500
|0.3–4
| -
|-
|Event camera{{Cite web|url=https://inivation.com/support/product-specifications/|title=Product Specifications|last=Longinotti|first=Luca|website=iniVation|access-date=2019-04-22|archive-date=2019-04-02|archive-url=https://web.archive.org/web/20190402163200/https://inivation.com/support/product-specifications/|url-status=dead}}
|120
|50,000 – 300,000**
|0.1–1
|30
|}
* Indicates human perception temporal resolution, including cognitive processing time. **Refers to change recognition rates, and varies according to signal and sensor model.
== Types ==
Temporal contrast sensors (such as the DVS (Dynamic Vision Sensor) or the sDVS{{Cite journal |last1=Serrano-Gotarredona |first1=T. |last2=Linares-Barranco |first2=B. |date=March 2013 |title=A 128x128 1.5% Contrast Sensitivity 0.9% FPN 3μs Latency 4mW Asynchronous Frame-Free Dynamic Vision Sensor Using Transimpedance Amplifiers |url=http://www.imse-cnm.csic.es/~bernabe/jssc13_AuthorAcceptedVersion.pdf |journal=IEEE Journal of Solid-State Circuits |volume=48 |issue=3 |pages=827–838 |bibcode=2013IJSSC..48..827S |doi=10.1109/JSSC.2012.2230553 |issn=0018-9200 |s2cid=6686013}} (sensitive-DVS)) produce events that indicate polarity (increase or decrease in brightness), while temporal image sensors indicate the instantaneous intensity with each event. The DAVIS{{Cite journal|last1=Brandli|first1=C.|last2=Berner|first2=R.|last3=Yang|first3=M.|last4=Liu|first4=S.|last5=Delbruck|first5=T.|date=October 2014|title=A 240 × 180 130 dB 3 µs Latency Global Shutter Spatiotemporal Vision Sensor|journal=IEEE Journal of Solid-State Circuits|volume=49|issue=10|pages=2333–2341|doi=10.1109/JSSC.2014.2342715|issn=0018-9200|bibcode=2014IJSSC..49.2333B|doi-access=free}} (Dynamic and Active-pixel Vision Sensor) combines a global-shutter active pixel sensor (APS) and a dynamic vision sensor (DVS) sharing the same photosensor array, and can therefore produce image frames alongside events. Many event cameras additionally carry an inertial measurement unit (IMU).
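The difference between these output types can be summarised as data records. The following Python sketch is purely illustrative; the class and field names are not taken from any particular camera's software interface.
<syntaxhighlight lang="python">
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PolarityEvent:
    """Output of a temporal contrast sensor (DVS/sDVS) -- illustrative field names."""
    t: float        # timestamp in seconds (reported with microsecond resolution)
    x: int          # pixel column
    y: int          # pixel row
    polarity: int   # +1 for a brightness increase, -1 for a decrease

@dataclass
class IntensityEvent:
    """Output of a temporal image sensor: an intensity value instead of a polarity."""
    t: float
    x: int
    y: int
    intensity: float   # instantaneous illumination measurement at the pixel

@dataclass
class DavisOutput:
    """DAVIS-style combined output: events plus optional frames and IMU samples."""
    events: List[PolarityEvent] = field(default_factory=list)  # from the DVS circuit
    frame: Optional[list] = None   # APS grey-scale frame from the shared photosensor array
    imu: Optional[list] = None     # optional inertial measurement unit sample
</syntaxhighlight>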
=== Retinomorphic sensors ===
{{Main|Retinomorphic sensor}}
[[File:Retinomorphic Sensor.jpg|thumb|Left: retinomorphic sensor circuit, with photosensitive capacitor at top. Right: Expected transient response of retinomorphic sensor to application of constant illumination.]]
Another class of event sensor is the so-called retinomorphic sensor. While the term retinomorphic has been used to describe event sensors generally,{{Cite book|last=Boahen|first=K.|title=Proceedings of Fifth International Conference on Microelectronics for Neural Networks |chapter=Retinomorphic vision systems |date=1996|chapter-url=https://ieeexplore.ieee.org/document/493766|pages=2–14|doi=10.1109/MNNFS.1996.493766|isbn=0-8186-7373-7|s2cid=62609792}}{{Cite journal|last1=Posch|first1=Christoph|last2=Serrano-Gotarredona|first2=Teresa|last3=Linares-Barranco|first3=Bernabe|last4=Delbruck|first4=Tobi|date=2014|title=Retinomorphic Event-Based Vision Sensors: Bioinspired Cameras With Spiking Output|url=https://ieeexplore.ieee.org/document/6887319|journal=Proceedings of the IEEE|volume=102|issue=10|pages=1470–1484|doi=10.1109/JPROC.2014.2346153|hdl=11441/102353|s2cid=11513955|issn=1558-2256|hdl-access=free}} in 2020 it was adopted as the name for a specific sensor design based on a resistor and a photosensitive capacitor in series.{{Cite journal|last1=Trujillo Herrera|first1=Cinthya|last2=Labram|first2=John G.|date=2020-12-07|title=A perovskite retinomorphic sensor|journal=Applied Physics Letters|volume=117|issue=23|pages=233501|doi=10.1063/5.0030097|bibcode=2020ApPhL.117w3501T|s2cid=230546095|issn=0003-6951|doi-access=free}} These capacitors are distinct from photocapacitors, which are used to store solar energy,{{Cite journal|last1=Miyasaka|first1=Tsutomu|last2=Murakami|first2=Takurou N.|date=2004-10-25|title=The photocapacitor: An efficient self-charging capacitor for direct storage of solar energy|url=https://aip.scitation.org/doi/10.1063/1.1810630|journal=Applied Physics Letters|volume=85|issue=17|pages=3932–3934|doi=10.1063/1.1810630|bibcode=2004ApPhL..85.3932M|issn=0003-6951|url-access=subscription}} and are instead designed to change capacitance under illumination. They charge or discharge slightly when the capacitance changes, but otherwise remain in equilibrium. When a photosensitive capacitor is placed in series with a resistor and an input voltage is applied across the circuit, the result is a sensor that outputs a voltage when the light intensity changes, but otherwise does not.
Unlike other event sensors (which typically combine a photodiode with several other circuit elements), these sensors produce the signal inherently. They can hence be considered a single device that produces the same result as a small circuit does in other event cameras. Retinomorphic sensors have to date{{As of when|date=January 2025}} only been studied in a research environment.{{Cite web|date=2021-01-18|title=Perovskite sensor sees more like the human eye|url=https://physicsworld.com/perovskite-sensor-sees-more-like-the-human-eye/|access-date=2021-10-28|website=Physics World|language=en-GB}}{{Cite web|title=Simple Eyelike Sensors Could Make AI Systems More Efficient|url=https://insidescience.org/news/simple-eyelike-sensors-could-make-ai-systems-more-efficient|access-date=2021-10-28|website=Inside Science|date=8 December 2020 |language=en}}{{Cite web|last=Hambling|first=David|title=AI vision could be improved with sensors that mimic human eyes|url=https://www.newscientist.com/article/2259491-ai-vision-could-be-improved-with-sensors-that-mimic-human-eyes/|access-date=2021-10-28|website=New Scientist|language=en-US}}{{Cite web|title=An eye for an AI: Optic device mimics human retina|url=https://www.sciencefocus.com/news/an-eye-for-an-ai-optic-device-mimics-human-retina/|access-date=2021-10-28|website=BBC Science Focus Magazine|language=en}}
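The transient response described above can be reproduced with a simple circuit simulation. The following Python sketch integrates the series resistor and photosensitive capacitor with forward-Euler steps; the component values, the linear illuminance-to-capacitance relation, and the choice to read the output across the resistor are illustrative assumptions, not parameters of any published device.
<syntaxhighlight lang="python">
import numpy as np

def retinomorphic_response(illum, dt=1e-6, R=1e6, C_dark=1e-9, dC_per_lux=1e-12, V_in=1.0):
    """Simulate a resistor in series with a photosensitive capacitor (illustrative values).

    illum: array of illuminance samples (lux), one per time step of length dt.
    Returns the voltage across the resistor, which is non-zero only while the
    light level (and hence the capacitance) is changing."""
    C = C_dark + dC_per_lux * np.asarray(illum, dtype=float)  # capacitance tracks the light level
    Q = C[0] * V_in                      # start in equilibrium: no current, no output voltage
    v_out = np.empty_like(C)
    for k in range(len(C)):
        I = (V_in - Q / C[k]) / R        # a capacitance change unbalances the loop and drives a current
        Q += I * dt                      # the capacitor charges or discharges toward the new equilibrium
        v_out[k] = I * R                 # output read across the resistor (one possible choice)
    return v_out

# Constant light, then a step increase: the output is a single decaying transient at the step.
light = np.concatenate([np.full(5000, 100.0), np.full(5000, 1000.0)])
voltage = retinomorphic_response(light)
</syntaxhighlight>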
== Algorithms ==
=== Image reconstruction ===
Image reconstruction from events has the potential to create images and video with high dynamic range, high temporal resolution, and reduced motion blur. Image reconstruction can be achieved using temporal smoothing, e.g. a high-pass or complementary filter. Alternative methods include optimization{{Cite book|last1=Pan|first1=Liyuan|last2=Scheerlinck|first2=Cedric|last3=Yu|first3=Xin|last4=Hartley|first4=Richard|last5=Liu|first5=Miaomiao|last6=Dai|first6=Yuchao|title=2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)|date=June 2019|chapter=Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera|url=https://ieeexplore.ieee.org/document/8953329|location=Long Beach, CA, USA|publisher=IEEE|pages=6813–6822|doi=10.1109/CVPR.2019.00698|isbn=978-1-7281-3293-8|arxiv=1811.10180|s2cid=53749928}} and gradient estimation{{Cite journal|last1=Scheerlinck|first1=Cedric|last2=Barnes|first2=Nick|last3=Mahony|first3=Robert|date=April 2019|title=Asynchronous Spatial Image Convolutions for Event Cameras|journal=IEEE Robotics and Automation Letters|volume=4|issue=2|pages=816–822|arxiv=1812.00438|doi=10.1109/LRA.2019.2893427|s2cid=59619729|issn=2377-3766}} followed by Poisson integration. It has also been shown that the image of a static scene can be recovered from noise events alone by analyzing their correlation with scene brightness.{{cite journal |last1=Cao |first1=Ruiming |last2=Galor |first2=Dekel |last3=Kohli |first3=Amit |last4=Yates |first4=Jacob L. |last5=Waller |first5=Laura |title=Noise2Image: noise-enabled static scene recovery for event cameras |journal=Optica |date=20 January 2025 |volume=12 |issue=1 |pages=46 |doi=10.1364/OPTICA.538916|arxiv=2404.01298 }}
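As an illustration of the temporal-smoothing approach, the following Python sketch applies a per-pixel high-pass (leaky integrator) filter to an event stream. It is a minimal example of the idea rather than any specific published implementation; the contrast step and cutoff frequency are assumed values.
<syntaxhighlight lang="python">
import numpy as np

def highpass_reconstruction(events, shape, contrast=0.2, cutoff=5.0):
    """Per-pixel high-pass reconstruction from events alone (minimal sketch).

    events: iterable of (t, x, y, polarity) sorted by time; shape: (H, W).
    Returns a log-intensity image that decays toward zero between events."""
    log_img = np.zeros(shape, dtype=np.float64)
    last_t = np.zeros(shape, dtype=np.float64)
    for t, x, y, p in events:
        # exponentially decay this pixel's state since its previous event (high-pass behaviour)
        log_img[y, x] *= np.exp(-cutoff * (t - last_t[y, x]))
        # each event corresponds to one contrast-threshold step in log intensity
        log_img[y, x] += p * contrast
        last_t[y, x] = t
    return log_img
</syntaxhighlight>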
=== Spatial convolutions ===
The concept of spatial event-driven convolution was postulated in 1999{{Cite journal|last1=Serrano-Gotarredona|first1=T.|last2=Andreou|first2=A.|last3=Linares-Barranco|first3=B.|date=Sep 1999|title=AER Image Filtering Architecture for Vision Processing Systems|journal= IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications|volume=46|issue=9|pages=1064–1071|doi=10.1109/81.788808|hdl=11441/76405|issn=1057-7122|hdl-access=free}} (before the DVS), and later generalized during the EU project CAVIAR{{Cite journal|last1=Serrano-Gotarredona|first1=R.|last2=et|first2=al|date=Sep 2009|title=CAVIAR: A 45k-Neuron, 5M-Synapse, 12G-connects/sec AER Hardware Sensory-Processing-Learning-Actuating System for High Speed Visual Object Recognition and Tracking|journal= IEEE Transactions on Neural Networks|volume=20|issue=9|pages=1417–1438|doi=10.1109/TNN.2009.2023653|pmid=19635693|issn=1045-9227|hdl=10261/86527|s2cid=6537174|hdl-access=free}} (during which the DVS was invented) by projecting, event by event, an arbitrary convolution kernel around the event coordinate in an array of integrate-and-fire pixels.{{Cite journal|last1=Serrano-Gotarredona|first1=R.|last2=Serrano-Gotarredona|first2=T.|last3=Acosta-Jimenez|first3=A.|last4=Linares-Barranco|first4=B.|date=Dec 2006|title=A Neuromorphic Cortical-Layer Microchip for Spike-Based Event Processing Vision Systems|journal= IEEE Transactions on Circuits and Systems I: Regular Papers|volume=53|issue=12|pages=2548–2566|doi=10.1109/TCSI.2006.883843|issn=1549-8328|hdl=10261/7823|s2cid=8287877|hdl-access=free}} Extension to multi-kernel event-driven convolutions{{Cite journal|last1=Camuñas-Mesa|first1=L.|last2=et|first2=al|date=Feb 2012|title=An Event-Driven Multi-Kernel Convolution Processor Module for Event-Driven Vision Sensors|journal=IEEE Journal of Solid-State Circuits|volume=47|issue=2|pages=504–517|doi=10.1109/JSSC.2011.2167409|bibcode=2012IJSSC..47..504C|hdl=11441/93004|s2cid=23238741|issn=0018-9200|hdl-access=free}} allows for event-driven deep convolutional neural networks.{{Cite journal|last1=Pérez-Carrasco|first1=J.A.|last2=Zhao|first2=B.|last3=Serrano|first3=C.|last4=Acha|first4=B.|last5=Serrano-Gotarredona|first5=T.|last6=Chen|first6=S.|last7=Linares-Barranco|first7=B.|date=November 2013|title=Mapping from Frame-Driven to Frame-Free Event-Driven Vision Systems by Low-Rate Rate-Coding and Coincidence Processing. Application to Feed-Forward ConvNets|journal= IEEE Transactions on Pattern Analysis and Machine Intelligence|volume=35|issue=11|pages=2706–2719|doi=10.1109/TPAMI.2013.71|pmid=24051730|hdl=11441/79657|s2cid=170040|issn=0162-8828|url=http://www.imse-cnm.csic.es/~bernabe/tpami2013_AdditionalMaterialVideo.wmv|hdl-access=free}}
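The projection scheme described above can be sketched in a few lines. The following Python example adds a kernel around each event coordinate into an array of integrate-and-fire accumulators, emitting an output event whenever an accumulator crosses a threshold; it omits leakage, refractory periods and other hardware details of the cited chips, so it should be read as a conceptual illustration with an assumed threshold value.
<syntaxhighlight lang="python">
import numpy as np

def event_driven_convolution(events, kernel, shape, threshold=1.0):
    """Event-driven convolution into an array of integrate-and-fire accumulators (sketch).

    events: iterable of (t, x, y, polarity); kernel: (kh, kw) array; shape: (H, W).
    Returns output events (t, x, y, +/-1) emitted when an accumulator crosses the threshold."""
    H, W = shape
    kh, kw = kernel.shape
    membrane = np.zeros((H, W), dtype=np.float64)    # integrate-and-fire state, one cell per pixel
    out = []
    for t, x, y, p in events:
        for dy in range(kh):                         # project the kernel around the event coordinate
            for dx in range(kw):
                yy, xx = y + dy - kh // 2, x + dx - kw // 2
                if 0 <= yy < H and 0 <= xx < W:
                    membrane[yy, xx] += p * kernel[dy, dx]
                    if abs(membrane[yy, xx]) >= threshold:             # fire ...
                        out.append((t, xx, yy, int(np.sign(membrane[yy, xx]))))
                        membrane[yy, xx] = 0.0                         # ... and reset
    return out
</syntaxhighlight>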
=== Motion detection and tracking ===
Segmentation and detection of moving objects viewed by an event camera can seem to be a trivial task, as it is done by the sensor on-chip. However, these tasks are difficult, because events carry little information{{Cite journal|last1=Gallego|first1=Guillermo|last2=Delbruck|first2=Tobi|last3=Orchard|first3=Garrick Michael|last4=Bartolozzi|first4=Chiara|last5=Taba|first5=Brian|last6=Censi|first6=Andrea|last7=Leutenegger|first7=Stefan|last8=Davison|first8=Andrew|last9=Conradt|first9=Jorg|last10=Daniilidis|first10=Kostas|last11=Scaramuzza|first11=Davide|date=2020|title=Event-based Vision: A Survey|url=https://ieeexplore.ieee.org/document/9138762|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence|volume=PP|issue=1 |pages=154–180|doi=10.1109/TPAMI.2020.3008413| arxiv=1904.08405 |pmid=32750812|s2cid=234740723|issn=1939-3539}} and do not contain useful visual features like texture and color.{{Cite book|last1=Mondal|first1=Anindya|last2=R|first2=Shashant|last3=Giraldo|first3=Jhony H.|last4=Bouwmans|first4=Thierry|last5=Chowdhury|first5=Ananda S.|title=2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) |chapter=Moving Object Detection for Event-based Vision using Graph Spectral Clustering |date=2021|chapter-url=https://openaccess.thecvf.com/content/ICCV2021W/GSP-CV/html/Mondal_Moving_Object_Detection_for_Event-Based_Vision_Using_Graph_Spectral_Clustering_ICCVW_2021_paper.html|language=en|pages=876–884|doi=10.1109/ICCVW54120.2021.00103|arxiv=2109.14979|isbn=978-1-6654-0191-3|s2cid=238227007|via=IEEE Xplore}} These tasks become even more challenging given a moving camera, because events are triggered everywhere on the image plane, produced by moving objects and the static scene (whose apparent motion is induced by the camera's ego-motion). 
Some of the recent{{When|date=January 2025}} approaches to solving this problem include the incorporation of motion-compensation models{{Cite book|last1=Mitrokhin|first1=Anton|last2=Fermuller|first2=Cornelia|last3=Parameshwara|first3=Chethan|last4=Aloimonos|first4=Yiannis|title=2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |chapter=Event-Based Moving Object Detection and Tracking |date=October 2018|chapter-url=https://ieeexplore.ieee.org/document/8593805|location=Madrid|publisher=IEEE|pages=1–9|doi=10.1109/IROS.2018.8593805|arxiv=1803.04523|isbn=978-1-5386-8094-0|s2cid=3845250}}{{Cite book|last1=Stoffregen|first1=Timo|last2=Gallego|first2=Guillermo|last3=Drummond|first3=Tom|last4=Kleeman|first4=Lindsay|last5=Scaramuzza|first5=Davide|date=2019|title=2019 IEEE/CVF International Conference on Computer Vision (ICCV)|chapter=Event-Based Motion Segmentation by Motion Compensation |chapter-url=https://openaccess.thecvf.com/content_ICCV_2019/html/Stoffregen_Event-Based_Motion_Segmentation_by_Motion_Compensation_ICCV_2019_paper.html|pages=7244–7253|doi=10.1109/ICCV.2019.00734 |arxiv=1904.01293|isbn=978-1-7281-4803-8 |s2cid=91183976 }} and traditional clustering algorithms.{{Cite book|last1=Piątkowska|first1=Ewa|last2=Belbachir|first2=Ahmed Nabil|last3=Schraml|first3=Stephan|last4=Gelautz|first4=Margrit|title=2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops |chapter=Spatiotemporal multiple persons tracking using Dynamic Vision Sensor |date=June 2012|chapter-url=https://ieeexplore.ieee.org/document/6238892|pages=35–40|doi=10.1109/CVPRW.2012.6238892|isbn=978-1-4673-1612-5|s2cid=310741}}{{Cite journal|last1=Chen|first1=Guang|last2=Cao|first2=Hu|last3=Aafaque|first3=Muhammad|last4=Chen|first4=Jieneng|last5=Ye|first5=Canbo|last6=Röhrbein|first6=Florian|last7=Conradt|first7=Jörg|last8=Chen|first8=Kai|last9=Bing|first9=Zhenshan|last10=Liu|first10=Xingbo|last11=Hinz|first11=Gereon|date=2018-12-02|title=Neuromorphic Vision Based Multivehicle Detection and Tracking for Intelligent Transportation System|journal=Journal of Advanced Transportation|language=en|volume=2018|pages=e4815383|doi=10.1155/2018/4815383|issn=0197-6729|doi-access=free}}{{cite book|last1=Mondal|first1=Anindya|last2=Das|first2=Mayukhmali|date=2021-11-08|title=2021 IEEE 8th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON)|chapter=Moving Object Detection for Event-based Vision using k-means Clustering|pages=1–6|doi=10.1109/UPCON52273.2021.9667636|arxiv=2109.01879|isbn=978-1-6654-0962-9|s2cid=237420620}}
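As a minimal illustration of the clustering-based approaches, the following Python sketch groups the events from a short time window into spatial clusters using k-means. It assumes a static camera and a known number of moving objects, ignores event polarity and timing within the window, and is not the method of any specific cited paper.
<syntaxhighlight lang="python">
import numpy as np
from sklearn.cluster import KMeans

def cluster_moving_objects(events, window=0.01, n_objects=2):
    """Group the events from one short time window into spatial clusters (illustrative sketch).

    events: array of shape (N, 4) with columns (t, x, y, polarity), sorted by time."""
    events = np.asarray(events, dtype=float)
    t0 = events[0, 0]
    slice_events = events[events[:, 0] < t0 + window]   # take one time slice of events
    xy = slice_events[:, 1:3]                            # cluster on pixel coordinates only
    km = KMeans(n_clusters=n_objects, n_init=10).fit(xy)
    return km.cluster_centers_, km.labels_               # object centroids and per-event labels
</syntaxhighlight>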
== Potential applications ==
Potential applications include most tasks classically served by conventional cameras, with an emphasis on machine vision tasks (such as object recognition, autonomous vehicles, and robotics). The US military is{{As of when|date=January 2025}} considering infrared and other event cameras because of their lower power consumption and reduced heat generation.
Given these advantages over conventional image sensors, the event camera is considered well suited to applications that require low power consumption and low latency, and to situations in which it is difficult to stabilize the camera's line of sight. These applications include the aforementioned autonomous systems, but also space imaging, security, defense, and industrial monitoring. Research into color sensing with event cameras is{{When|date=January 2025}} underway,{{Cite web |title=CED: Color Event Camera Dataset |url=https://rpg.ifi.uzh.ch/CED.html |access-date=2024-04-08 |website=rpg.ifi.uzh.ch}} but the technology is not yet{{When|date=January 2025}} practical for applications requiring color sensing.