structure from motion

{{Short description|Method of 3D reconstruction from moving objects}}

Structure from motion (SfM){{Cite journal | journal = Proceedings of the Royal Society of London | title = The interpretation of structure from motion | author = S. Ullman | year = 1979 | doi=10.1098/rspb.1979.0006 | pmid = 34162 | volume=203 | issue = 1153 | pages=405–426| bibcode = 1979RSPSB.203..405U | hdl = 1721.1/6298 | s2cid = 11995230 | url = https://dspace.mit.edu/bitstream/1721.1/6298/2/AIM-476.pdf | hdl-access = free }} is a photogrammetric range imaging technique for estimating three-dimensional structures from two-dimensional image sequences that may be coupled with local motion signals. It is a classic problem studied in the fields of computer vision and visual perception. In computer vision, the problem of SfM is to design an algorithm to perform this task. In visual perception, the problem of SfM is to find an algorithm by which biological creatures perform this task.

Principle

[[File:DSM construction site.jpg|thumb|interchange construction site]]

[[File:SfM PPT GUI vs PHOTO.png|thumb]]

[[File:Bezmiechowa DSM 3D 2010-05-29 Pteryx UAV.jpg|thumb|extracted from data collected during 30 min flight of Pteryx UAV]]

Humans perceive a great deal of information about the three-dimensional structure of their environment by moving around it. When the observer moves, objects around them appear to move by different amounts depending on their distance from the observer. This is known as motion parallax, and this depth information can be used to generate an accurate 3D representation of the world around them.{{cite book | title = Computer Vision |author1=Linda G. Shapiro |author2=George C. Stockman | publisher = Prentice Hall | year = 2001 | isbn = 978-0-13-030796-5 |author1-link=Linda Shapiro }}
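The inverse relation between apparent motion and distance can be illustrated with a minimal sketch. It assumes an idealized pinhole camera that translates sideways, parallel to the image plane; the function names, focal length and distances are hypothetical values chosen for illustration and do not come from the cited source.

```python
def parallax_shift_px(focal_length_px, baseline_m, depth_m):
    """Image shift (in pixels) of a static point when a pinhole camera
    translates sideways by baseline_m; the point lies at depth_m."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_length_px * baseline_m / depth_m


def depth_from_parallax(focal_length_px, baseline_m, shift_px):
    """Invert the relation above to recover depth from an observed shift."""
    if shift_px <= 0:
        raise ValueError("shift must be positive")
    return focal_length_px * baseline_m / shift_px


# Assumed numbers: focal length of 1000 px, observer moves 0.5 m sideways.
near = parallax_shift_px(1000.0, 0.5, 2.0)    # point 2 m away
far = parallax_shift_px(1000.0, 0.5, 20.0)    # point 20 m away
recovered = depth_from_parallax(1000.0, 0.5, near)
```

Under these assumptions the nearby point shifts ten times as far in the image as the distant one, and the observed shift can be inverted to recover depth.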

Finding structure from motion presents a similar problem to finding structure from stereo vision. In both instances, the correspondence between images and the reconstruction of the 3D object need to be found.

To find correspondences between images, features such as corner points (edges with gradients in multiple directions) are tracked from one image to the next. One of the most widely used feature detectors is the scale-invariant feature transform (SIFT), which takes the local extrema of a difference-of-Gaussians (DoG) pyramid as features. The first step in computing the SIFT descriptor is finding a dominant gradient direction; the descriptor is then rotated to fit this orientation, which makes it rotation-invariant.{{Cite journal | journal = International Journal of Computer Vision | title = Distinctive image features from scale-invariant keypoints | author = D. G. Lowe | year = 2004 | doi=10.1023/b:visi.0000029664.99615.94 | volume=60 | issue = 2 | pages=91–110| citeseerx = 10.1.1.73.2924 | s2cid = 221242327 }} Another common feature detector is SURF (speeded-up robust features).{{Cite journal | journal = 9th European Conference on Computer Vision | title = Surf: Speeded up robust features |author1=H. Bay |author2=T. Tuytelaars |author3=L. Van Gool |name-list-style=amp | year = 2006 }} In SURF, the DoG is replaced with a Hessian matrix-based blob detector. Also, instead of evaluating gradient histograms, SURF computes the sums of the gradient components and the sums of their absolute values.{{Cite journal | journal = Kybernetika | volume = 46 | issue = 5 | pages = 926–937 | title = The structure-from-motion reconstruction pipeline – a survey with focus on short image sequences |author1=K. Häming |author2=G. Peters |name-list-style=amp | year = 2010 | url = http://dml.cz/dmlcz/141400 }} Its use of integral images allows features to be detected very quickly with a high detection rate.{{Cite book|last1=Viola|first1=P.|last2=Jones|first2=M.|title=Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001 |chapter=Rapid object detection using a boosted cascade of simple features |date=2001|chapter-url=https://ieeexplore.ieee.org/document/990517|location=Kauai, HI, USA|publisher=IEEE Comput. Soc|volume=1|pages=I–511–I-518|doi=10.1109/CVPR.2001.990517|isbn=978-0-7695-1272-3|s2cid=2715202}} Compared with SIFT, SURF is therefore a faster feature detector, with the drawback of less accurate feature positions.
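The speed gained from integral images comes from the fact that the sum over any axis-aligned box costs four table lookups, regardless of the box size. The following is a minimal sketch of that idea only, not of the SURF detector itself; the function names and the small test image are illustrative assumptions.

```python
def integral_image(img):
    """Return a summed-area table padded with one leading row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above this row plus the running sum of this row.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1
    using four lookups in the summed-area table."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
table = integral_image(image)
total = box_sum(table, 0, 0, 3, 4)     # whole image
centre = box_sum(table, 1, 1, 3, 3)    # rows 1-2, columns 1-2
```

Box filters of any size can then be evaluated at constant cost per pixel, which is what makes box-filter approximations of the Hessian inexpensive.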

Another type of feature recently made practical for structure from motion is the general curve (e.g., locally an edge with gradients in one direction), part of a technology known as pointless SfM,{{cite book |last1=Nurutdinova |first1=Irina |last2=Fitzgibbon |first2=Andrew |title=2015 IEEE International Conference on Computer Vision (ICCV) |chapter=Towards Pointless Structure from Motion: 3D Reconstruction and Camera Parameters from General 3D Curves |date=2015 | pages=2363–2371 |chapter-url=http://openaccess.thecvf.com/content_iccv_2015/papers/Nurutdinova_Towards_Pointless_Structure_ICCV_2015_paper.pdf |doi=10.1109/ICCV.2015.272 |isbn=978-1-4673-8391-2 |s2cid=9120123 }}{{cite book |last1=Fabbri |first1=Ricardo |last2=Giblin |first2=Peter |last3=Kimia |first3=Benjamin |title=Computer Vision – ECCV 2012 |chapter=Camera Pose Estimation Using First-Order Curve Differential Geometry |date=2012 |volume=7575 |pages=231–244 |url=https://rfabbri.github.io/stuff/fabbri-giblin-kimia-eccv2012-final-ext.pdf |doi=10.1007/978-3-642-33765-9_17 |series=Lecture Notes in Computer Science |isbn=978-3-642-33764-2 |s2cid=15402824 }} which is useful when point features are insufficient, as is common in man-made environments.{{cite journal |last1=Apple |first1=ARKIT team |title=Understanding ARKit Tracking and Detection |journal=WWDC |date=2018 |url=https://developer.apple.com/videos/play/wwdc2018/610}}

The features detected from all the images will then be matched. One of the matching algorithms that track features from one image to another is the Lucas–Kanade tracker.{{Cite journal | journal = Ijcai81 | title = An iterative image registration technique with an application to stereo vision |author1=B. D. Lucas |author2=T. Kanade |name-list-style=amp }}
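As a rough illustration of how such a tracker estimates motion, the sketch below performs a single Lucas–Kanade least-squares step on one window. It is a simplified, assumption-laden version: it uses no image pyramid and no iteration, the images are plain nested lists, and the names `lucas_kanade_step` and `blob` as well as the synthetic quadratic pattern are hypothetical.

```python
def lucas_kanade_step(img1, img2, cx, cy, half=2):
    """Estimate the (u, v) shift of the window centred at (cx, cy)
    from img1 to img2 with one Lucas-Kanade least-squares step."""
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            ix = (img1[y][x + 1] - img1[y][x - 1]) / 2.0   # central difference in x
            iy = (img1[y + 1][x] - img1[y - 1][x]) / 2.0   # central difference in y
            it = img2[y][x] - img1[y][x]                   # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window lacks two-dimensional texture (aperture problem)")
    # Solve [[sxx, sxy], [sxy, syy]] [u, v]^T = [-sxt, -syt]^T by Cramer's rule.
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v


def blob(size, x0, y0):
    """Synthetic image: a smooth quadratic bowl centred at (x0, y0)."""
    return [[(x - x0) ** 2 + (y - y0) ** 2 for x in range(size)] for y in range(size)]


frame1 = blob(9, 4.0, 4.0)
frame2 = blob(9, 4.3, 3.8)    # the same pattern moved by (+0.3, -0.2) pixels
u, v = lucas_kanade_step(frame1, frame2, 4, 4)
```

The determinant check reflects why corner-like features are preferred: a window containing only a straight edge yields a singular system, so its motion cannot be determined in both directions.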

Some of the matched features will be matched incorrectly, so the matches should also be filtered. RANSAC (random sample consensus) is the algorithm usually used to remove outlier correspondences. In the paper of Fischler and Bolles, RANSAC is used to solve the location determination problem (LDP), where the objective is to determine the point in space from which an image was obtained, given that the image shows a set of landmarks with known locations.{{Cite journal | journal = Commun. ACM | title = Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography |author1=M. A. Fischler |author2=R. C. Bolles |name-list-style=amp | year = 1981 | doi=10.1145/358669.358692 | volume=24 | issue = 6 | pages=381–395| s2cid = 972888 | doi-access=free }}
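The sample-and-score loop of RANSAC can be sketched on a toy problem. The example below fits a 2D line rather than an epipolar or pose model, purely to keep it short; in an SfM pipeline the model would instead relate matched image points. The function name, threshold, iteration count and data are assumptions made for illustration.

```python
import math
import random


def ransac_line(points, threshold=0.1, iterations=200, seed=0):
    """Return (inliers, outliers) for the line with the largest consensus set."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # minimal sample: two points
        # Line through the samples as a*x + b*y + c = 0 with a unit normal (a, b).
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0.0:
            continue  # degenerate sample: identical points
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Consensus set: points whose distance to the line is within the threshold.
        consensus = [p for p in points if abs(a * p[0] + b * p[1] + c) <= threshold]
        if len(consensus) > len(best):
            best = consensus
    outliers = [p for p in points if p not in best]
    return best, outliers


# Ten points on y = 2x + 1 stand in for correct matches; three are gross errors.
good = [(float(x), 2.0 * x + 1.0) for x in range(10)]
bad = [(1.0, 9.0), (4.0, -3.0), (8.0, 2.5)]
inliers, outliers = ransac_line(good + bad)
```

Because each hypothesis is built from a minimal sample, a single draw containing no outliers is enough to recover the correct model, which is why the method tolerates a large fraction of wrong matches.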

The feature trajectories over time are then used to reconstruct their 3D positions and the camera's motion.{{Cite journal | journal = IEEE Computer Society Conference on Computer Vision and Pattern Recognition | title = Structure from Motion without Correspondence |author1=F. Dellaert |author2=S. Seitz |author3=C. Thorpe |author4=S. Thrun |name-list-style=amp | year = 2000 | url = http://www.ri.cmu.edu/pub_files/pub2/dellaert_frank_2000_1/dellaert_frank_2000_1.pdf }}
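Once camera positions are known or estimated, the 3D position of a matched feature can be recovered by intersecting its viewing rays. The sketch below uses midpoint triangulation of two rays, one simple option among several and not necessarily the method of the cited work; the function names and the camera centres, ray directions and test point are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom   # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom   # parameter of the closest point on ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


# Two assumed camera centres one unit apart, both observing the point (1, 2, 10).
target = (1.0, 2.0, 10.0)
cam1, cam2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
ray1 = [t - c for t, c in zip(target, cam1)]
ray2 = [t - c for t, c in zip(target, cam2)]
point = triangulate_midpoint(cam1, ray1, cam2, ray2)
```

With noisy measurements the two rays no longer intersect exactly, and the midpoint then serves as an initial estimate for further refinement.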

An alternative is given by so-called direct approaches, where geometric information (3D structure and camera motion) is directly estimated from the images, without intermediate abstraction to features or corners.{{Cite conference | last1 = Engel | first1 = Jakob | last2 = Schöps | first2 = Thomas | last3 = Cremers | first3 = Daniel | contribution = LSD-SLAM: Large-Scale Direct Monocular SLAM | year = 2014 | title = European Conference on Computer Vision (ECCV) 2014 | url = https://vision.in.tum.de/_media/spezial/bib/engel14eccv.pdf }}

There are several approaches to structure from motion. In incremental SfM,{{Cite journal | journal = IEEE Computer Society Conference on Computer Vision and Pattern Recognition | title = Structure-from-Motion Revisited |author1=J.L. Schönberger |author2=J.M. Frahm |name-list-style=amp | year = 2016 | url = https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Schonberger_Structure-From-Motion_Revisited_CVPR_2016_paper.pdf }} camera poses are solved for and added one by one to the collection. In global SfM,{{Cite journal | journal = International Journal of Computer Vision | volume = 9 | issue = 2 | pages = 137–154 | title = Shape and motion from image streams under orthography: a factorization method |author1=C. Tomasi |author2=T. Kanade |name-list-style=amp | year = 1992 | doi = 10.1007/BF00129684 | citeseerx = 10.1.1.131.9807 | s2cid = 2931825 }}{{Cite book |author1=V.M. Govindu | title = Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001 | chapter = Combining two-view constraints for motion estimation | volume = 2 | pages = II-218-II-225 |name-list-style=amp | year = 2001 | doi = 10.1109/CVPR.2001.990963 | isbn = 0-7695-1272-0 | s2cid = 8252027 }} the poses of all cameras are solved for at the same time. A somewhat intermediate approach is out-of-core SfM, where several partial reconstructions are computed that are then integrated into a global solution.

Applications

=Geosciences=

Structure-from-motion photogrammetry with multi-view stereo provides hyperscale landform models using images acquired from a range of digital cameras and optionally a network of ground control points. The technique is not limited in temporal frequency and can provide point cloud data comparable in density and accuracy to those generated by terrestrial and airborne laser scanning at a fraction of the cost.{{Cite journal|last1=Westoby|first1=M. J.|last2=Brasington|first2=J.|last3=Glasser|first3=N. F.|last4=Hambrey|first4=M. J.|last5=Reynolds|first5=J. M.|date=2012-12-15|title='Structure-from-Motion' photogrammetry: A low-cost, effective tool for geoscience applications|journal=Geomorphology|volume=179|pages=300–314|doi=10.1016/j.geomorph.2012.08.021|bibcode=2012Geomo.179..300W|s2cid=33695861 |url=https://pure.aber.ac.uk/portal/en/publications/structurefrommotion-photogrammetry-a-lowcost-effective-tool-for-geoscience-applications(8f446bb7-ebe8-4bf3-b6f3-89951cdc87ed).html|hdl=2160/11389|hdl-access=free}}{{Cite journal|last1=James|first1=M. 
R.|last2=Robson|first2=S.|date=2012-09-01|title=Straightforward reconstruction of 3D surfaces and topography with a camera: Accuracy and geoscience application|journal=Journal of Geophysical Research: Earth Surface|language=en|volume=117|issue=F3|pages=F03017|doi=10.1029/2011jf002289|issn=2156-2202|bibcode=2012JGRF..117.3017J|url=https://eprints.lancs.ac.uk/id/eprint/56018/1/James_and_Robson_2012_SfM_MVS.pdf|doi-access=free}}{{Cite journal|last1=Fonstad|first1=Mark A.|last2=Dietrich|first2=James T.|last3=Courville|first3=Brittany C.|last4=Jensen|first4=Jennifer L.|last5=Carbonneau|first5=Patrice E.|date=2013-03-30|title=Topographic structure from motion: a new development in photogrammetric measurement|journal=Earth Surface Processes and Landforms|language=en|volume=38|issue=4|pages=421–430|doi=10.1002/esp.3366|issn=1096-9837|url=http://dro.dur.ac.uk/20541/1/20541.pdf|bibcode=2013ESPL...38..421F|s2cid=15601931 }} Structure from motion is also useful in remote or rugged environments where terrestrial laser scanning is limited by equipment portability and airborne laser scanning is limited by terrain roughness causing loss of data and image foreshortening. 
The technique has been applied in many settings such as rivers,{{Cite journal|last1=Javernick|first1=L.|last2=Brasington|first2=J.|last3=Caruso|first3=B.|title=Modeling the topography of shallow braided rivers using Structure-from-Motion photogrammetry|journal=Geomorphology|volume=213|pages=166–182|doi=10.1016/j.geomorph.2014.01.006|year=2014|bibcode=2014Geomo.213..166J}} badlands,{{Cite journal|last1=Smith|first1=Mark William|last2=Vericat|first2=Damià|date=2015-09-30|title=From experimental plots to experimental landscapes: topography, erosion and deposition in sub-humid badlands from Structure-from-Motion photogrammetry|journal=Earth Surface Processes and Landforms|language=en|volume=40|issue=12|pages=1656–1671|doi=10.1002/esp.3747|issn=1096-9837|url=http://eprints.whiterose.ac.uk/85414/14/MWS_main_text_badlands_forLeeds.pdf|bibcode=2015ESPL...40.1656S|s2cid=128402144 }} sandy coastlines,{{Cite journal|last1=Goldstein|first1=Evan B|last2=Oliver|first2=Amber R|last3=deVries|first3=Elsemarie|last4=Moore|first4=Laura J|last5=Jass|first5=Theo|date=2015-10-22|title=Ground control point requirements for structure-from-motion derived topography in low-slope coastal environments|journal=PeerJ PrePrints|language=en|doi=10.7287/peerj.preprints.1444v1|issn=2167-9843|doi-access=free}}{{Cite journal|last1=Mancini|first1=Francesco|last2=Dubbini|first2=Marco|last3=Gattelli|first3=Mario|last4=Stecchi|first4=Francesco|last5=Fabbri|first5=Stefano|last6=Gabbianelli|first6=Giovanni|date=2013-12-09|title=Using Unmanned Aerial Vehicles (UAV) for High-Resolution Reconstruction of Topography: The Structure from Motion Approach on Coastal Environments|journal=Remote Sensing|language=en|volume=5|issue=12|pages=6880–6898|doi=10.3390/rs5126880|bibcode=2013RemS....5.6880M|doi-access=free|hdl=11380/1055514|hdl-access=free}} fault zones,{{Cite journal|last1=Johnson|first1=Kendra|last2=Nissen|first2=Edwin|last3=Saripalli|first3=Srikanth|last4=Arrowsmith|first4=J. 
Ramón|last5=McGarey|first5=Patrick|last6=Scharer|first6=Katherine|last7=Williams|first7=Patrick|last8=Blisniuk|first8=Kimberly|date=2014-10-01|title=Rapid mapping of ultrafine fault zone topography with structure from motion|journal=Geosphere|volume=10|issue=5|pages=969–986|doi=10.1130/GES01017.1|bibcode=2014Geosp..10..969J|doi-access=free}} landslides,{{Cite journal|last1=Del Soldato|first1=M.|last2=Riquelme|first2=A.|last3=Bianchini|first3=S.|last4=Tomàs|first4=R.|last5=Di Martire|first5=D.|last6=De Vita|first6=P.|last7=Moretti|first7=S.|last8=Calcaterra|first8=D.|date=2018-06-06|title=Multisource data integration to investigate one century of evolution for the Agnone landslide (Molise, southern Italy)|journal=Landslides|volume=15|issue=11|pages=2113–2128|doi=10.1007/s10346-018-1015-z|issn=1612-510X|doi-access=free|bibcode=2018Lands..15.2113D |hdl=2158/1131012|hdl-access=free}}{{Cite journal |last1=Tomás |first1=Roberto |last2=Pinheiro |first2=Marisa |last3=Pinto |first3=Pedro |last4=Pereira |first4=Eduardo |last5=Miranda |first5=Tiago |date=August 2023 |title=Preliminary analysis of the mechanisms, characteristics, and causes of a recent catastrophic structurally controlled rock planar slide in Esposende (northern Portugal) |url=https://link.springer.com/10.1007/s10346-023-02082-y |journal=Landslides |language=en |volume=20 |issue=8 |pages=1657–1665 |doi=10.1007/s10346-023-02082-y |bibcode=2023Lands..20.1657T |issn=1612-510X|hdl=1822/88576 |hdl-access=free }} and coral reef settings.{{Cite journal|last1=Bryson|first1=Mitch|last2=Duce|first2=Stephanie|last3=Harris|first3=Dan|last4=Webster|first4=Jody M.|last5=Thompson|first5=Alisha|last6=Vila-Concejo|first6=Ana|last7=Williams|first7=Stefan B.|title=Geomorphic changes of a coral shingle cay measured using Kite Aerial Photography|journal=Geomorphology|volume=270|pages=1–8|doi=10.1016/j.geomorph.2016.06.018|year=2016|bibcode=2016Geomo.270....1B}} SfM has also been successfully applied to the assessment of morphological changes{{Cite journal |last1=Conesa-García |first1=Carmelo |last2=Puig-Mengual |first2=Carlos |last3=Riquelme |first3=Adrián |last4=Tomás |first4=Roberto |last5=Martínez-Capel |first5=Francisco |last6=García-Lorenzo |first6=Rafael |last7=Pastor |first7=José L. |last8=Pérez-Cutillas |first8=Pedro |last9=Martínez-Salvador |first9=Alberto |last10=Cano-Gonzalez |first10=Miguel |date=2022-02-01 |title=Changes in stream power and morphological adjustments at the event-scale and high spatial resolution along an ephemeral gravel-bed channel |url=https://www.sciencedirect.com/science/article/pii/S0169555X2100461X |journal=Geomorphology |volume=398 |pages=108053 |doi=10.1016/j.geomorph.2021.108053 |bibcode=2022Geomo.39808053C |hdl=10251/190056 |issn=0169-555X|hdl-access=free }} and of large wood accumulation volume{{Cite journal|date=2019-12-01|title=Using Structure from Motion photogrammetry to assess large wood (LW) accumulations in the field|journal=Geomorphology|language=en|volume=346|doi=10.1016/j.geomorph.2019.106851|bibcode=2019Geomo.34606851S|last1=Spreitzer|first1=Gabriel|last2=Tunnicliffe|first2=Jon|last3=Friedrich|first3=Heide |pages=106851|s2cid=202908775 }} and porosity{{Cite journal|title=Large wood (LW) 3D accumulation mapping and assessment using structure from Motion photogrammetry in the laboratory|journal=Journal of Hydrology|language=en|volume=581|doi=10.1016/j.jhydrol.2019.124430|bibcode = 2020JHyd..58124430S|last1=Spreitzer|first1=Gabriel|last2=Tunnicliffe|first2=Jon|last3=Friedrich|first3=Heide |pages = 124430|year = 2020|s2cid=209465940 }} in fluvial systems, to the characterization of rock masses through the determination of properties such as the orientation and persistence of discontinuities,{{Cite journal|date=2017-01-01|title=Identification of Rock Slope Discontinuity Sets from Laser Scanner and Photogrammetric Point Clouds: A Comparative Analysis|journal=Procedia Engineering|language=en|volume=191|pages=838–845|doi=10.1016/j.proeng.2017.05.251|issn=1877-7058|last1=Riquelme|first1=A.|last2=Cano|first2=M.|last3=Tomás|first3=R.|last4=Abellán|first4=A.|doi-access=free|hdl=10045/67538|hdl-access=free}}{{Cite journal|date=2017-09-01|title=Comparing manual and remote sensing field discontinuity collection used in kinematic stability assessment of failed rock slopes|journal=International Journal of Rock Mechanics and Mining Sciences|language=en|volume=97|pages=24–32|doi=10.1016/j.ijrmms.2017.06.004|issn=1365-1609|last1=Jordá Bordehore|first1=Luis|last2=Riquelme|first2=Adrian|last3=Cano|first3=Miguel|last4=Tomás|first4=Roberto|bibcode=2017IJRMM..97...24J |hdl=10045/67528|url=http://rua.ua.es/dspace/bitstream/10045/67528/4/2017_Jorda_etal_IJRMMS_revised.pdf|hdl-access=free}} and to the evaluation of the stability of rock cut slopes.{{Cite journal |last1=Tomás |first1=R. |last2=Riquelme |first2=A. |last3=Cano |first3=M. |last4=Pastor |first4=J. L. |last5=Pagán |first5=J. I. |last6=Asensio |first6=J. L. |last7=Ruffo |first7=M. |date=2020-06-23 |title=Evaluación de la estabilidad de taludes rocosos a partir de nubes de puntos 3D obtenidas con un vehículo aéreo no tripulado |url=http://dx.doi.org/10.4995/raet.2020.13168 |journal=Revista de Teledetección |issue=55 |pages=1 |doi=10.4995/raet.2020.13168 |issn=1988-8740|hdl=10045/107612 |hdl-access=free }} A full range of digital cameras can be used, including digital SLRs, compact digital cameras and even smartphones. Generally, though, higher-accuracy data will be achieved with more expensive cameras, which include lenses of higher optical quality.
The technique therefore offers exciting opportunities to characterize surface topography in unprecedented detail and, with multi-temporal data, to detect elevation, position and volumetric changes that are symptomatic of earth surface processes. Structure from motion can be placed in the context of other digital surveying methods.

=Cultural heritage=

Cultural heritage is present everywhere. Its structural control, documentation and conservation are among humanity's main duties (UNESCO). From this point of view, SfM is used to assess the condition of a structure and to estimate the effort and cost of planning, maintenance, control and restoration. Sites often impose serious constraints, such as limited accessibility or the impossibility of installing invasive surveying pillars, that rule out traditional surveying routines (such as total stations); SfM provides a non-invasive alternative that requires no direct interaction between the structure and any operator. Its accuracy is sufficient where only qualitative considerations are needed, and it is fast enough to respond to the monument’s immediate management needs. Guidi, G.; Beraldin, J.A.; Atzeni, C. High accuracy 3D modelling of cultural heritage: The digitizing of Donatello. IEEE Trans. Image Process. 2004, 13, 370–380

The first operational phase is the careful preparation of the photogrammetric survey, in which the relation between the best distance from the object, the focal length, the ground sampling distance (GSD) and the sensor’s resolution is established. With this information, the programmed photographic acquisitions must be made with a vertical overlap of at least 60% (figure 02). Kraus, K., 2007. Photogrammetry: Geometry from Image and Laser Scans. Walter de Gruyter, 459 pp. {{ISBN|978-3-11-019007-6}}
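The relation between these quantities can be sketched with the usual pinhole approximation, in which the GSD scales with the distance to the object and the pixel pitch and inversely with the focal length. The function names and the numerical values below (distance, focal length, pixel pitch, image height) are assumptions for illustration, not figures from the cited sources.

```python
def ground_sampling_distance(distance_m, focal_length_mm, pixel_pitch_um):
    """Object-space size of one pixel in metres (pinhole approximation)."""
    return distance_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)


def exposure_spacing(gsd_m, pixels_along_axis, overlap):
    """Camera displacement between consecutive shots that gives the
    requested overlap fraction along one image axis."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    footprint_m = gsd_m * pixels_along_axis   # extent covered by one image
    return footprint_m * (1.0 - overlap)


# Assumed survey: object 10 m away, 50 mm lens, 5 micrometre pixels,
# 4000 pixels along the vertical image axis, 60% vertical overlap.
gsd = ground_sampling_distance(10.0, 50.0, 5.0)
spacing = exposure_spacing(gsd, 4000, 0.60)
```

Under these assumptions each pixel covers about 1 mm on the object and successive exposures would be taken roughly 1.6 m apart; a larger overlap shortens that spacing and increases the number of images required.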

Furthermore, structure-from-motion photogrammetry represents a non-invasive, highly flexible and low-cost methodology to digitalize historical documents.{{Cite journal|last1=Brandolini|first1=Filippo|last2=Patrucco|first2=Giacomo|date=September 2019|title=Structure-from-Motion (SFM) Photogrammetry as a Non-Invasive Methodology to Digitalize Historical Documents: A Highly Flexible and Low-Cost Approach?|journal=Heritage|language=en|volume=2|issue=3|pages=2124–2136|doi=10.3390/heritage2030128|doi-access=free|hdl=2434/666172|hdl-access=free}}
