CAD and image processing software tools play a key role in the RAS-assisted inspection loop, and their development is therefore one of the main objectives of ROBINS. The software is expected to:
- Build a 3D numerical model of the confined space subject to inspection, by means of image processing algorithms capable of combining 2D pictures and/or meshing algorithms capable of deriving textured meshes from point clouds and photographs
- Provide virtual tours of the space subject to inspection. The user should be able to examine the details of interest accurately by moving through the 3D virtual space and orienting the viewpoint as needed, always with a detailed rendering of the observed surface consistent with that viewpoint
- Provide the possibility to add hotspots and/or to associate additional information with selected parts of the 3D model (augmented virtual reality model)
- Identify critical or suspect areas from the analysis of visual data acquired during the inspection and highlight such areas in order to provide valuable guidance to the surveyor
The ROBINS project also aims at integrating image-processing algorithms specifically developed for the recognition of critical or suspect areas of the ship hull into the software dedicated to virtual tours, thus creating a unified environment for virtual inspection. Dedicated software tools and algorithms for image processing will make it easier to effectively identify critical or suspect areas in inspected spaces.
3D reconstruction of real-world objects has been an important research area for decades in both the computer vision and photogrammetry communities. Accurate surface reconstruction has been established as a necessity for a variety of mapping, modelling, and monitoring applications (e.g. thickness measurement, crack detection, coating condition assessment, evaluation of mechanical damage).
The most fundamental step required for surface reconstruction is the generation of a 3D point cloud. A point cloud is essentially a large collection of points placed in a three-dimensional coordinate system. Two main methods of extracting point clouds exist: 3D scanning and photogrammetry. 3D scanning techniques can be further subcategorized into laser scanning and structured light scanning. Spatial data is obtained by moving the laser head or the structured light cameras relative to the object being scanned, directly yielding point clouds of the object (or of sections of the object if it is too large). With photogrammetry techniques, 3D point clouds are generated from a large set of images by matching 2D points and edges, which are transformed into 3D data by forward ray intersection. The following steps are required to extract point cloud data from a set of photos:
- Feature recognition – each photograph is analyzed and key features are identified that are invariant to scale and rotation and may potentially be used to align the photos.
- Feature matching – corresponding features are matched between photos.
- Alignment of cameras – the positions of the cameras relative to each other and to the recognized features are calculated by minimizing the error between distances measured on the images and the distances expected for all cameras. The minimization is usually performed using the Levenberg-Marquardt algorithm; the whole procedure is known as bundle adjustment.
- Construction of dense point cloud – once the cameras are aligned and the distances between key features are known, construction of the dense point cloud can begin. This step is the most computationally intensive and can be performed using forward ray intersection (a minimal sketch of the preceding steps is given after this list).
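As an illustration, the sketch below walks through the first steps of this pipeline for a single image pair using OpenCV. The filenames and the intrinsic matrix `K` are placeholder assumptions rather than ROBINS parameters; a full photogrammetric pipeline would repeat this over all image pairs and jointly refine poses and points with Levenberg-Marquardt bundle adjustment.

```python
import cv2
import numpy as np

# Load two overlapping photographs (placeholder filenames).
img1 = cv2.imread("view_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_b.jpg", cv2.IMREAD_GRAYSCALE)

# 1. Feature recognition: SIFT keypoints are invariant to scale and rotation.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# 2. Feature matching with Lowe's ratio test to discard ambiguous matches.
matcher = cv2.BFMatcher()
raw = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in raw if m.distance < 0.75 * n.distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# 3. Camera alignment for the pair: estimate the essential matrix and
#    recover the relative rotation R and translation t (intrinsics K assumed).
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])  # assumption
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

# 4. Sparse triangulation by forward ray intersection; dense reconstruction
#    and bundle adjustment over all views would follow in a full pipeline.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points3d = (pts4d[:3] / pts4d[3]).T  # homogeneous -> Euclidean point cloud
```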
At the final stage, the reconstructed 3D model is textured by finding the best images for each triangle and adjusting colors. Since both the 3D TIN (triangulated irregular network) model and the whole set of captured images are registered within the global coordinate system, creating the texture map is essentially a problem of combining texture fragments. Individual texture fragments are obtained by mapping ("backprojecting") input images onto the generated 3D model. Multiple views produce multiple texture fragments, and the domains of these fragments are different parts of the TIN model. It should be noted that overlapping fragments may differ photometrically, due to different lighting, camera settings, or a non-Lambertian object surface (e.g. glass windows, water basins, or polished metal structures), and geometrically, due to model imprecision and/or imperfect registration. Thus, the texturing process should generate texture fragments very carefully in order to minimize visible seams, and also perform some post-processing in order to remove the remaining photometric differences between the fragments.
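A minimal sketch of the backprojection step is given below, assuming a simple pinhole camera with known pose; the function name is hypothetical, and a production texturing pipeline would additionally resolve occlusion and blend overlapping fragments to hide seams.

```python
import numpy as np

def backproject_uv(vertices, K, R, t, image_size):
    """Map 3D TIN vertices onto one registered input image, yielding the
    per-vertex texture (u, v) coordinates of its texture fragment.

    vertices   : (N, 3) array of mesh vertex positions (world frame)
    K          : (3, 3) camera intrinsic matrix
    R, t       : world-to-camera rotation (3, 3) and translation (3,)
    image_size : (width, height) of the input photograph
    """
    cam = vertices @ R.T + t           # world -> camera coordinates
    visible = cam[:, 2] > 0            # keep vertices in front of the camera
    proj = cam @ K.T                   # pinhole projection
    uv = proj[:, :2] / proj[:, 2:3]    # perspective divide -> pixel coords
    w, h = image_size
    inside = visible & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
                     & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    # Normalize to [0, 1] texture space; occlusion testing and photometric
    # blending between overlapping fragments are deliberately omitted here.
    return uv / np.array([w, h]), inside
```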
Accurate reconstructed 3D models of industrial environments are required for many purposes, such as maintenance, documentation, training, and monitoring. Much of the current research focuses on applying Virtual and Augmented Reality to provide various services in industrial environments. Accurate georeferenced photorealistic 3D models of an active construction site provide an important tool for impact assessment, decision-making, and project monitoring. The relatively cheap, flexible and general 3D model acquisition process is widely used for reverse engineering and rapid prototyping. Built-in tools allow distances, areas and volumes to be measured, or more sophisticated metric analysis to be performed on point clouds or meshes.
In summary, photogrammetry is a valuable and attractive approach in many applications, with the major advantage of being low-cost, portable, flexible and able to deliver highly detailed geometries and textures at the same time. However, many objects are problematic for image-based 3D modeling techniques (unstructured, monochrome, translucent, reflective, and/or self-resembling surfaces). Moreover, additional constraints relate to lighting conditions, which play a key role in the production of high-quality models. Problematic image capturing conditions result in a high level of noise in the final mesh models and in more topological errors. Therefore, new, improved and robust algorithms should be developed in order to deal with complex industrial sites.
Automatic defect detection is a continuously advancing research topic that still lacks an effective market implementation in the shipping industry. Recently, autonomous inspection of industrial, transport and building infrastructure has received a strong boost through the synergy of robotics and computer vision. A first approach to rust detection on vessels was developed within the MINOAS and INCASS frameworks.
The automation of corrosion and crack detection via non-contact and non-destructive techniques, instead of electrochemical methods, is an elaborate research problem. Typical methods perform image analysis on RGB data from metal surfaces. The employed methods fall mainly into two groups: those based on automated detection (e.g. in the wavelet domain, thresholding, spectral band combinations and analysis, image segmentation, boundary or shape analysis; a simple thresholding example is sketched below) and those based on image classification procedures.
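As a concrete example of the first group, the sketch below flags rust-like pixels by thresholding colour bands in HSV space with OpenCV. The threshold values and the minimum region size are illustrative assumptions, not validated detector settings.

```python
import cv2
import numpy as np

def rust_mask(bgr_image):
    """Flag rust-like pixels by thresholding in HSV colour space --
    a simple instance of the spectral-band / thresholding family of
    automated detection methods (range values are illustrative only)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    # Reddish-brown hues with moderate saturation (assumed rust signature).
    lower = np.array([0, 60, 40])
    upper = np.array([25, 255, 200])
    mask = cv2.inRange(hsv, lower, upper)
    # Morphological opening removes isolated noise pixels.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # Keep only connected regions large enough to be suspect areas.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] < 100:  # assumed minimum area (px)
            mask[labels == i] = 0
    return mask
```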
ROBINS will advance the state of the art in research and bring productive cutting-edge technologies closer to the shipping market via the robotic integration of advanced defect detection and recording. Novel platforms with integrated lightweight sensors will be built, specifically oriented towards defect detection. These platforms will take advantage of the modular structure of the proposed robots, allowing adaptation to each use case.
Simple image feature extraction may not be enough to detect damage in ship structures, due to complex lighting conditions and the great diversity of other conditions in data obtained from different robots. In this scenario, manual definition of features for defect representation is not feasible. Thus, digital image based defect detection should be built on a highly robust machine-learning approach such as convolutional neural networks (CNNs), which have already shown good results in object classification in photographs and look promising for industrial inspection applications. In contrast to manually designed image processing solutions, deep CNNs automatically generate powerful features through hierarchical learning strategies from massive amounts of training data, with a minimum of human interaction or expert process knowledge.
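The sketch below shows what such an approach can look like in practice: a minimal convolutional patch classifier with one supervised training step, written in PyTorch. The architecture, patch size and two-class labelling (defect / no defect) are illustrative assumptions, not the network actually developed in ROBINS.

```python
import torch
import torch.nn as nn

class DefectCNN(nn.Module):
    """Minimal convolutional classifier for image patches
    (e.g. 64x64 RGB crops labelled defect / no-defect).
    Architecture and sizes are illustrative only."""
    def __init__(self, num_classes=2):
        super().__init__()
        # Stacked conv/pool blocks learn features hierarchically,
        # from edges and textures up to defect-like patterns.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

# One supervised training step on a batch of labelled patches.
model = DefectCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
patches = torch.randn(8, 3, 64, 64)   # stand-in for real inspection crops
labels = torch.randint(0, 2, (8,))    # stand-in defect / no-defect labels
optimizer.zero_grad()
loss = criterion(model(patches), labels)
loss.backward()
optimizer.step()
```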