Ⅰ. Introduction
Search and rescue (SAR) missions have traditionally been conducted with manned fixed- or rotary-wing aircraft and are considered challenging because scanning a large area by eye from the sky within a limited time, often in adverse weather conditions, is difficult. Recently, Unmanned Aerial Vehicles (UAVs) have emerged as a cost-effective option for SAR missions in various sectors. Human operators can control the UAVs remotely from a safe place, aided by computer vision analysis to identify people in distress in the area of interest. Researchers have studied technologies to improve SAR missions using UAVs, including the logic of the UAV scanning plan, UAV autonomy, and camera vision analysis for the successful identification of people to rescue. Table 1 summarizes the literature on these technologies.
Table 1. Literature reviews about UAV technologies for SAR missions
Beyond the visual range, most UAV pilots control their UAVs using the 1st-person view from the UAV camera (i.e., the moving actor's viewpoint facing the forward scene only, like an aircraft cockpit view). However, the 3rd-person view (i.e., a bird's-eye or god's-eye viewpoint, in which the moving actor is seen as an object among the other objects in a wider view of the environment) can provide good spatial information for better situation awareness. Meilinger and Vosgerau found that, for complex spatial representations, the 1st-person and 3rd-person views can interact with each other rather than forming separate representations [11].
Virtual Reality (VR) is a good training method owing to its capability to generate high-fidelity, 360-degree training environments. Albeaino, Eiris, Gheisari, and Issa tested the effectiveness of VR-based UAV flight training for building inspections and found that VR can create various realistic mission environments while saving the cost and time of implementing challenging variables [12].
VR technologies support various input techniques, and gesture input is one of them. Yang, Huang, Feng, Hong-An, and Guo-Zhong investigated interaction paradigms and classifications for VR applications, assuming that humans express their demands effectively with gestures [13].
The goal of this study is to develop and evaluate a prototype UAV training system for SAR missions using a VR application that provides a combined 1st-person and 3rd-person view. The VR application uses hand gestures to define the UAV scanning routes and supports automatic UAV flight along them, and the author evaluated its performance against the conventional method of manually flying the UAV for scanning.
Ⅱ. VR-based SAR Training System Prototype
This study used VR technology to demonstrate and train UAV pilots for SAR missions. The software team developed the UAV SAR training application using the Unreal game engine (Version 4.25). Users could run the application with an Oculus Rift S VR device and fly the UAVs with the Oculus Rift S controllers, because their layout is equivalent to most current drone controllers. The SAR sites in this application are a high-rise building on fire and steep mountain terrain. The usual SAR mission procedure using UAVs is to fly the UAVs close to the disaster site, identify the people in distress in a specific room of the building or at a specific point on the mountain, determine each person's criticality (e.g., are they close to the fire, or are they unconscious?), remember the spatial points, and report that point information to the rescue team. This study implemented this procedure.
The developed VR application provides both the 1st-person and the 3rd-person view to aid UAV pilots' situation awareness, and it requires a dedicated open space for interaction. The application shows 3D graphics of a building on fire and of mountains in the respective scenarios. Every UAV in the SAR mission carries a camera and provides its 1st-person camera view to the UAV pilot; the application presents this as an augmented 1st-person camera view frame at the scenario site. A 3D miniature digital twin of the building/mountain of interest was designed and placed close to the VR user in the application so that the UAV camera scanning can be planned before the SAR mission begins. The VR user sees the miniature as if hovering like a bird above the real place, which constitutes the 3rd-person view. Fig. 1 shows the initial visual scenes of the burning-building scenario and the mountain scenario, respectively.
Fig. 1. Initial view scenes
Users can customize the program settings to their preference, including the UAV's battery time limit, color, and speed. The application lets users draw the UAV's planned flight paths for camera scanning with hand gestures using the VR controllers, taking the 3D features of the building and mountain terrain into account. Once the scanning path has been drawn, the UAV begins an automatic flight along the predefined path, and the digital twin visualizes the flight progress on the path with different colors in the 3rd-person view. The planned scanning paths are magenta and turn cyan from the starting point to the current point as the automatic flight progresses along the path. This color coding complies with the design standards of electronic flight instruments for navigation [14]. Fig. 2 and Fig. 3 show the VR user drawing the UAV flight path on the 3D miniature digital twins of the building and the mountain, respectively, and the UAV's automatic flights along the predefined paths. Since the miniature is rendered in 3D, the VR user can move around it, within the open space reserved for VR interaction, to see all the 3D features of the building or mountain terrain, and can then draw the path lines on the surfaces at different angles. Users can define the UAV's camera scanning path with the UAV's battery life in mind, avoiding double scanning of the same sector in case the pilot forgets which regions the UAV has already finished scanning.
Fig. 2. Defining SAR UAV flight path by hand gesture at the building site
Fig. 3. Defining SAR UAV flight path by hand gesture at the mountain site
The pilot can pause the automatic flight to fly manually within a specific area when they think they have identified people at a specific point. When they confirm they have found people who need rescue in a specific room of the building or a specific sector of the mountain, the VR user can place a circle mark at that point on the miniature (Fig. 4). The VR user can change the color of the circle mark depending on the priority level (red: Level 3, high priority, for people who are close to the fire in the building or who are unconscious; yellow: Level 2, medium priority, for people less critical than red; white: Level 1, low priority, for people less critical than yellow, i.e., safe and conscious). The pilot can then resume the automatic flight along the rest of the predefined path.
Fig. 4. Markings for the identified people on the digital twin and identified people seen in the camera view
A UAV's augmented camera screen frame provides the 1st-person view for manual flight. The application can enlarge the 1st-person view frame so that the VR user can see the scene better. The 1st-person and 3rd-person views are synchronized with each other, and users can also adjust the scale of the miniature.
After two pilot studies to identify the potential problems in the test setting, the author conducted human-in-the-loop (HITL) simulation tests using the VR application for data collection.
Ⅲ. Experiment & Results
3-1 Research Design
The experiment was a within-subject design with two task conditions (conventional vs. novel) as the independent variable. The conventional condition has neither the miniature digital twin of the task site nor any automatic flight support; the user must control the SAR UAV with the 1st-person camera view display only, with no 3rd-person view option. The novel condition provides the digital twin, the hand-gesture input, and the automatic flight along the defined path, and users could refer to both the 1st-person camera view and the 3rd-person digital twin view. The dependent variables are the time to find people in distress, the number of identified people within each SAR site, and the perceived mental workload measured with the NASA Task Load Index (NASA-TLX). The NASA-TLX comprises six criteria (mental demand, physical demand, temporal demand, performance, effort, and frustration) to measure a participant's perceived mental workload [15]. There were 19 participants (18 males and 1 female), and the total participation time per individual was about 1 hour.
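For reference, the standard weighted NASA-TLX score combines the six subscale ratings (each 0-100) with weights obtained from 15 pairwise comparisons. The sketch below illustrates that standard computation with hypothetical values; the function name and data are illustrative only, and the means reported later in this paper use a different, transformed scale.

```python
def nasa_tlx_weighted(ratings: dict, tally: dict) -> float:
    """Overall weighted NASA-TLX workload on a 0-100 scale.

    ratings: each subscale rated 0-100 by the participant.
    tally: times each subscale was picked in the 15 pairwise comparisons.
    """
    assert sum(tally.values()) == 15, "expected 15 pairwise comparisons"
    return sum(ratings[k] * tally[k] for k in ratings) / 15.0

# Hypothetical single-participant responses (illustrative only)
ratings = {"mental": 70, "physical": 30, "temporal": 60,
           "performance": 40, "effort": 55, "frustration": 20}
tally = {"mental": 5, "physical": 1, "temporal": 3,
         "performance": 2, "effort": 3, "frustration": 1}
print(nasa_tlx_weighted(ratings, tally))  # prints 55.0
```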
The hypothesis is that human performance with the novel design is faster and more accurate than with the conventional condition, and that the perceived mental workload with the novel design is lower than with the conventional condition. The institutional review board (IRB) of Kent State University, Ohio, USA approved this experiment.
3-2 Procedures
After the informed consent procedure, the participants completed a tutorial session to familiarize themselves with the VR application. The experimenter then presented four different conditions and asked each participant to search for people in distress: a conventional SAR mission with only the 1st-person camera view at the mountain, a conventional SAR mission with only the 1st-person camera view at the burning building, a novel SAR mission with the 1st-person and 3rd-person views at the mountain, and a novel SAR mission with the 1st-person and 3rd-person views at the burning building. The order of conditions was counterbalanced across participants. The conventional conditions used only the 1st-person camera view frame for the SAR missions. Participants performed the task of searching for people in distress in the building and the mountain terrain. When participants found people in the novel conditions, they marked the point of the identified people; in the conventional conditions, they had no marking capability. Upon completing the tasks in all four conditions, the participants rated their subjective mental workload on the NASA-TLX sheets.
3-3 Results
The collected data were statistically analyzed using analysis of variance (ANOVA) in JMP 17.1.0 statistics software. The analysis results of the objective and subjective responses are as follows:
Objective Results: Number of Found People
The author conducted Student's t-tests for the pairwise comparisons (α = 0.05) for all the objective results. The responses show heterogeneous results between the building and mountain sites. At both task sites, the numbers of correct Level 3 (highest) priority determinations in the novel condition did not differ from the conventional condition. Significant differences between the conventional and novel conditions appeared in the number of Level 2 faults at the building site and in the number of Level 1 faults at the mountain site. Contrary to the hypothesis, participants made more Level 2 faults in the novel condition (M = 1.37) than in the conventional condition (M = 0.53) at the building site. However, in line with the hypothesis, participants made fewer Level 1 faults in the novel condition (M = 1.63) than in the conventional condition (M = 3.10) at the mountain site (Fig. 5). The letters above the bars indicate the Student's t-test levels: shared letters (A and A) imply no difference between the two conditions, while different letters (A and B) imply a significant difference.
Fig. 5. Level 2 faults (building) and Level 1 faults (mountain)
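The pairwise comparison used here corresponds to a paired Student's t-test on per-participant differences between the two conditions. A minimal sketch follows, using hypothetical fault counts rather than the study's actual data:

```python
import math
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired Student's t statistic for a within-subject comparison.

    Each participant contributes one value per condition; the test
    operates on the per-participant differences.
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # t = mean(d) / (sd(d) / sqrt(n)), with sd the sample standard deviation
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Hypothetical fault counts for three participants (illustrative only)
novel = [2, 3, 4]
conventional = [1, 1, 2]
print(round(paired_t(novel, conventional), 3))  # prints 5.0
```

The resulting t statistic is compared against the critical value for n - 1 degrees of freedom at α = 0.05 to decide significance.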
The overall accuracy (how accurately did participants identify a person in distress and rate that person's priority?) and the total success count show no difference between the conventional and novel conditions at the building site (Fig. 6). Contradicting the hypothesis (Fig. 7), the total accuracy at the mountain site was even lower in the novel condition (M = 1.89) than in the conventional (M = 3.10), and the total success count was lower in the novel condition (M = 3.95 times) than in the conventional (M = 6.58 times). The novel condition also showed a higher number of passing (i.e., overlooking people in distress; M = 8.05 times) than the conventional condition (M = 5.42 times) at the mountain site (Fig. 8). The number of passing did not differ between the conditions at the building site. Task duration did not differ between the two conditions in either the building or the mountain mission. Furthermore, the number of false alarms (i.e., identifying a person at a point where no one was in distress) did not differ between the conventional and novel conditions at either site.
Fig. 6. Accuracy & success rate in building site
Fig. 7. Accuracy & success rate in mountain site
Fig. 8. Number of passing
Subjective Results: Mental Workload
The author conducted Student's t-tests for the pairwise comparisons (α = 0.05) for all the subjective results. Fig. 9 and Fig. 10 show the NASA-TLX responses. Five of the six NASA-TLX criteria meet the hypothesis at the building site (Fig. 9): the responses were significantly lower in the novel condition than in the conventional for mental demand (novel M = -3.42, conventional M = -0.05), physical demand (novel M = -5.37, conventional M = -2.74), temporal demand (novel M = -7.47, conventional M = -4.79), effort (novel M = -2.89, conventional M = 1.26), and frustration (novel M = -6.84, conventional M = -3.42). Performance was the only criterion that failed to meet the hypothesis. At the mountain site (Fig. 10), the workload was lower in the novel condition only for physical demand (novel M = -2.95, conventional M = -0.95) and performance (novel M = 0.79, conventional M = -1.89). The remaining responses did not differ between the novel and conventional conditions.
Fig. 9. NASA-TLX responses in the building task
Fig. 10. NASA-TLX responses in the mountain task
Ⅳ. Discussions
This study evaluated whether the developed VR application, which uses hand-gesture input to define the automatic UAV scanning path, was effective for SAR missions at two mission sites. The results indicated the application did not show any comparative merit in mission success rate or accuracy. The novel method showed higher accuracy only for the Level 1 priority situations at the mountain site, in which the people in distress looked okay but still needed rescue. The higher accuracy and higher total success count in the conventional condition at the mountain site may indicate that the novel design is not aligned with the participants' intentions. The conventional scanning method during manual flight may draw on the participants' own capability of mapping the mission site. However, the novel design did help participants save mental workload at the building site. Compared with the mountain site, monitoring the situations inside building rooms is a simple, repetitive task because the scenes of each room transmitted from the UAV camera look the same. In this condition, human operators may not need a sophisticated mapping capability; participants could feel bored, and that boredom could degrade their monitoring performance. Since robots were invented to take over repetitive and boring human tasks, the novel method may have saved the workload of such tasks [16]. Compared with the building site, participants may need sophisticated mapping at the mountain site to avoid scanning the same region twice. Predefining the scanning patterns and automating the UAV flight at this site may therefore reinforce the user's performance rather than save mental workload. Reflecting this interpretation, the participants' NASA-TLX responses were higher only for the performance criterion; the participants did not feel the novel design saved the other mental workload criteria at the mountain site.
4-1 Human Factors when Applying Automation
The participants provided additional subjective feedback that the novel application had the benefit of efficiently predefining the UAV flight path for economical operation given the limited battery life. The feedback also noted that the capability of remembering the spatial points of people to rescue was more effective than in conventional UAV SAR models.
In some cases, the conventional manual flight method even outperformed the novel design. These results indicate that applying automation to a high-level task may not be beneficial for all kinds of operations, and they imply that the application of automation should be moderated by the site and the users' task behaviors. Humans are accustomed to performing voluntary actions based on multiple sensory inputs. The remote visual scenes during automatic flight may remove the users' voluntary control activities, so the users may feel that the mechanism of their behavior in the automatic mode differs from how they developed their expertise. This may degrade their performance in the automatic mode; they may even need training specifically for the automatic mode, treating it as a new task environment. The automatic mode may work when the users feel the required task workload exceeds their physical/mental capacity and can identify the particular subtask with which they want help. If automation is applied when users feel comfortable with the conventional mode and do not want help, mandatory automation may even break the human's optimal resource allocation (perception, attention, and vigilance). The participants in this study may have wanted to slow the search task down when they found it difficult to identify people in the automatic mode; however, they maintained the given UAV speed in the automatic mode, perhaps because they thought that speed was optimal. There was no difference in mission time between the conventional and novel conditions at either the building or the mountain site.
Automation sometimes leads people to become lazy because they naturally feel that they no longer have to perform a certain part of a task. At other times, automation may even add workload when users do not feel comfortable with the automatic mode. The optimal level of automation may be where the workload-induced stress of combined voluntary behavior and automation reaches the apex of the Yerkes-Dodson curve [17]: neither bored nor exhausted. The apex will vary depending on the task characteristics.
4-2 Limitations of the Study
A limitation of this study is that the participant group did not include any UAV pilots for SAR missions, and many participants had no prior UAV control experience. SAR UAV experts might prefer to apply their spatial-mapping expertise by manually controlling the UAVs before deciding to use the path drawing and automatic flight. To this end, the next phase of the study should test a hybrid manual/automatic method, without a mandatory automatic mode, for the SAR mission with experienced UAV pilots.
4-3 Motion Sickness Issues from VR Use
In the pilot tests, the conventional condition rendered the UAV camera scene across the entire field of view of the VR environment. Participants reported severe motion sickness with this setup: controlling the UAVs in a full-field 1st-person camera view in VR was very uncomfortable because of the conflict between their vision and their vestibular perception. While the UAV camera's attitude and angles moved and rotated, the VR user's body did not move accordingly. The design of the conventional mode was therefore changed to show the 1st-person camera view in a frame instead of filling the entire screen. No participants reported motion sickness in the main experiment sessions.
Ⅴ. Conclusion
The evaluation of a VR-based method for defining a UAV's automatic camera scanning paths using hand gestures was conducted with 19 non-professional UAV pilot participants. The prototype did not improve SAR mission performance, but it reduced the mental workload where people needed an automation aid. Based on this study's implications, professional SAR missions that require task experience and expertise in high-level mapping skills may benefit if the system selectively supports automation, especially when the task workload exceeds the pilot's capacity, rather than making the automation support mandatory. To validate this conclusion, the system needs a functional modification (selective automation), and another study with professional UAV pilots may provide more insights into applying automation to SAR missions.
Acknowledgment
This study was conducted with support from the Air Force Research Lab (AFRL) of the US Air Force. The author would like to thank Mr. Tyler Frost, a software engineer at AFRL, for the VR application development. The author also thanks Mr. Evan Wachholz for the data collection activity.
References
- V. San Juan, M. Santos, and J. M. Andujar, "Intelligent UAV map generation and discrete path planning for search and rescue operations," Complexity, 2018.
- A. Ryan, and J. K. Hedrick, "A mode-switching path planner for UAV-assisted search and rescue," in Proceedings of the 44th IEEE Conference on Decision and Control, pp. 1471-1476, Dec. 2005.
- M. Eldridge, J. Harvey, T. Sandercock, and A. Smith, Design and build a search and rescue UAV. Univ. of Adelaide, Adelaide, Australia, 2009.
- J. Cooper, and M. A. Goodrich, "Towards combining UAV and sensor operator roles in UAV-enabled visual search," in 3rd ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 351-358, Mar. 2008.
- M. A. Goodrich, B. S. Morse, D. Gerhardt, J. L. Cooper, M. Quigley, J. A. Adams, and C. Humphrey, "Supporting wilderness search and rescue using a camera-equipped mini UAV," Journal of Field Robotics, Vol. 25, No.1-2, pp. 89-110, 2008. https://doi.org/10.1002/rob.20226
- J. Sun, B. Li, Y. Jiang, and C. Y. Wen, "A camera-based target detection and positioning UAV system for search and rescue (SAR) purposes," Sensors, Vol. 16, No. 11, 2016.
- B. S. Morse, C. H. Engh, and M. A. Goodrich, "UAV video coverage quality maps and prioritized indexing for wilderness search and rescue," in 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 227-234, Mar. 2010.
- P. Doherty, and P. Rudol, "A UAV search and rescue scenario with human body detection and geo-localization," in Australasian Joint Conference on Artificial Intelligence, Berlin, Heidelberg, pp. 1-13, Dec. 2007.
- S. Verykokou, A. Doulamis, G. Athanasiou, C. Ioannidis, and A. Amditis, "UAV-based 3D modelling of disaster scenes for urban search and rescue," in 2016 IEEE International Conference on Imaging Systems and Techniques (IST), pp. 106-111, Oct. 2016.
- A. S. Khalaf, P. Pianpak, S. A. Alharthi, Z. Namini-Mianji, R. Torres, S. Tran, ... and Z. O. Toups, "An architecture for simulating drones in mixed reality games to explore future search and rescue scenarios," in Proceedings of the International ISCRAM Conference. Jan. 2018.
- T. Meilinger, and G. Vosgerau, "Putting egocentric and allocentric into perspective," in International Conference on Spatial Cognition, Berlin, Heidelberg, pp. 207-221, Aug. 2010.
- G. Albeaino, R. Eiris, M. Gheisari, and R. R. Issa, "DroneSim: A VR-based flight training simulator for drone-mediated building inspections," Construction Innovation, Vol. 22, No. 4, pp. 831-848, 2022. https://doi.org/10.1108/CI-03-2021-0049
- L. Yang, J. Huang, T. Feng, W. Hong-An, and D. Guo-Zhong, "Gesture interaction in virtual reality," Virtual Reality & Intelligent Hardware, Vol. 1, No. 1, pp. 84-112, 2019. https://doi.org/10.3724/SP.J.2096-5796.2018.0006
- J. E. Duven, Electronic Flight Displays, Federal Aviation Administration (FAA) Advisory Circular (AC) AC 25-11B, 2014.
- S. G. Hart, and L. E. Staveland, "Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research," Advances in Psychology, Vol. 52, pp. 139-183, 1988. https://doi.org/10.1016/S0166-4115(08)62386-9
- J. Wallen, The history of the industrial robot. Linkoping University Electronic Press. 2008.
- N. Rote, L. Collins, and D. Villicana, ASTE 561: Human Factors in Spacecraft Design Final Report: Performance vs. Workload, 2022.