COMPARATIVE EVALUATION OF KEYPOINT DETECTORS FOR 3D DIGITAL AVATAR RECONSTRUCTION

Dušan Gajić, Gorana Gojić, Dinu Dragan, Veljko Petrović

DOI: 10.2298/FUEE2003379G
Pages: 379–394

Abstract


Three-dimensional personalized human avatars have been used successfully in shopping, entertainment, education, and health applications. However, automatically obtaining an avatar that is both complete and highly detailed remains challenging. One approach is to apply general-purpose, photogrammetry-based algorithms to a series of overlapping images of the person. We argue that the quality of avatar reconstruction can be increased by tailoring parts of the photogrammetry-based pipeline more specifically to the shape of the human body. In this context, we perform an extensive, standalone evaluation of eleven algorithms for keypoint detection, which is the first phase of the photogrammetry-based reconstruction pipeline. As a baseline, we include the well-established, patented Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) detection algorithms, since they are widely incorporated into photogrammetry-based software. All experiments are conducted on a dataset of 378 images of the human body captured in a controlled, multi-view stereo setup. Our findings are that binary detectors substantially outperform the commonly used SIFT-like detectors in the avatar reconstruction task, both in detection speed and in the number of detected keypoints.
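
The two criteria named in the abstract, detection speed and number of detected keypoints, can be illustrated with a small timing harness. The following is a minimal sketch and not the authors' evaluation code: the `benchmark` function, the `toy_threshold_detector` stand-in, and the nested-list image representation are all hypothetical, chosen so the example runs with the Python standard library alone.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a grayscale image is a list of rows of integer intensities.
Image = List[List[int]]
Keypoint = Tuple[int, int]
Detector = Callable[[Image], Sequence[Keypoint]]


def benchmark(detectors: Dict[str, Detector],
              images: Sequence[Image]) -> Dict[str, Dict[str, float]]:
    """Run every detector on every image and report, per detector,
    the mean wall-clock detection time and the mean keypoint count."""
    if not images:
        raise ValueError("at least one image is required")
    results: Dict[str, Dict[str, float]] = {}
    for name, detect in detectors.items():
        times, counts = [], []
        for img in images:
            start = time.perf_counter()
            keypoints = detect(img)
            times.append(time.perf_counter() - start)
            counts.append(len(keypoints))
        results[name] = {
            "mean_seconds": mean(times),
            "mean_keypoints": mean(counts),
        }
    return results


def toy_threshold_detector(threshold: int) -> Detector:
    """Hypothetical stand-in for a real detector: reports interior pixels
    that exceed each of their four direct neighbours by `threshold`."""
    def detect(img: Image) -> List[Keypoint]:
        points: List[Keypoint] = []
        for y in range(1, len(img) - 1):
            for x in range(1, len(img[0]) - 1):
                centre = img[y][x]
                neighbours = (img[y - 1][x], img[y + 1][x],
                              img[y][x - 1], img[y][x + 1])
                if all(centre - n >= threshold for n in neighbours):
                    points.append((x, y))
        return points
    return detect
```

With OpenCV installed, a real detector could presumably be plugged in the same way, for example by passing `cv2.ORB_create().detect` as one of the callables and loading the images as arrays; that substitution is an assumption about usage and is not taken from the paper.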


Keywords

Detector, Photogrammetry-based reconstruction, 3D human avatar, Structure from Motion, Multi-view Stereo

ISSN: 0353-3670 (Print)

ISSN: 2217-5997 (Online)

COBISS.SR-ID 12826626