Ⅰ. Introduction
Search and rescue (SAR) missions have traditionally been conducted with manned fixed- or rotary-wing aircraft and are considered challenging because scanning a large area by eye from the sky within a limited time, often in adverse weather conditions, is difficult. Recently, Unmanned Aerial Vehicles (UAVs) have emerged as a cost-effective option for SAR missions in various sectors. Human operators can control the UAVs remotely from a safe place, aided by computer vision analysis to identify people in distress in the area of interest. Researchers have studied technologies to improve SAR missions using UAVs, including the logic of the UAV scanning plan, UAV autonomy, and camera vision analysis for the successful identification of people to rescue. Table 1 summarizes the literature on these technologies.
Table 1. Literature reviews about UAV technologies for SAR missions
Beyond the visual range, most UAV pilots control their UAVs using the 1st-person view from the UAV camera (i.e., the moving actor's viewpoint facing the forward scene only, like an aircraft cockpit view). However, the 3rd-person view (i.e., a bird's-eye or god's-eye viewpoint, in which the moving actor is seen as an object among the other objects in a wider view of the environment) can provide good spatial information for better situation awareness. Meilinger and Vosgerau found that, for complex spatial representations, the 1st-person and 3rd-person views can interact with each other rather than forming separate representations [11].
Virtual Reality (VR) is a good training method owing to its capability to generate high-fidelity, 360-degree training environments. Albeaino, Eiris, Gheisari, and Issa tested the effectiveness of VR-based UAV flight training for building inspections and found that VR can create various realistic mission environments while saving the cost and time of implementing challenging variables [12].
VR technologies support various input techniques, and gesture input is one of them. Yang, Huang, Feng, Hong-An, and Guo-Zhong investigated interaction paradigms and classifications for VR applications, assuming that humans express their demands effectively with gestures [13].
The goal of this study is to develop and evaluate a prototype UAV training system for SAR missions using a VR application that provides a combined 1st-person and 3rd-person view. The VR application uses hand gestures to define the UAV scanning routes and supports automatic UAV flight along them, and the author evaluated its performance against the conventional method of manually flying the UAV for scanning.
Ⅱ. VR-based SAR Training System Prototype
This study used VR technology to demonstrate and train UAV pilots for SAR missions. The software team developed the UAV SAR training application using the Unreal game engine (Version 4.25). Users could run the application with an Oculus Rift S VR device and fly the UAVs with the Oculus Rift S controllers, because their layout is equivalent to most current drone controllers. The SAR sites in this application are a high-rise building on fire and steep mountain terrain. The usual SAR mission procedure using UAVs is to fly the UAVs close to the disaster site, identify the people in distress in a specific room of the building or at a specific point on the mountain, determine each person's criticality (e.g., are they close to the fire, or are they unconscious?), remember the spatial points, and report that point information to the rescue team. This study implemented this procedure.
The developed VR application provides both the 1st-person and the 3rd-person view to aid UAV pilots' situation awareness, and it requires a dedicated open space for interaction. The application shows 3D graphics of a building on fire and of mountains in the respective scenarios. Every UAV in the SAR mission carries a camera and provides its 1st-person camera view to the UAV pilot; the application presents this as an augmented 1st-person camera view frame at the scenario site. A 3D miniature digital twin of the building/mountain of interest was designed and placed close to the VR user in the application so that the UAV camera scanning can be planned before the SAR mission begins. The VR user sees the miniature as if hovering like a bird above the real place, which constitutes the 3rd-person view. Fig. 1 shows the initial visual scenes of the burning-building scenario and the mountain scenario, respectively.
Fig. 1. Initial view scenes
Users can customize the program settings to their preference, including the UAV's battery time limit, color, and speed. The application lets users draw the UAV's planned flight paths for camera scanning with hand gestures using the VR controllers, taking the 3D features of the building and mountain terrain into account. Once the scanning path has been drawn, the UAV begins an automatic flight along the predefined path, and the digital twin visualizes the flight progress on the path with different colors in the 3rd-person view. The planned scanning paths are magenta and turn cyan from the starting point to the current point as the automatic flight progresses along the path. This color coding complies with the design standards of electronic flight instruments for navigation [14]. Fig. 2 and Fig. 3 show the VR user drawing the UAV flight path on the 3D miniature digital twins of the building and the mountain, respectively, and the UAV's automatic flights along the predefined paths. Since the miniature is rendered in 3D, the VR user can move around it, within the open space reserved for VR interaction, to see all the 3D features of the building or mountain terrain, and can then draw the path lines on the surfaces at different angles. Users can define the UAV's camera scanning path with the UAV's battery life in mind, avoiding double scanning of the same sector in case the pilot forgets which regions the UAV has already finished scanning.
Fig. 2. Defining SAR UAV flight path by hand gesture at the building site
Fig. 3. Defining SAR UAV flight path by hand gesture at the mountain site
The pilot can pause the automatic flight to fly manually within a specific area when they think they have identified people at a specific point. When they confirm they have found people who need rescue in a specific room of the building or a specific sector of the mountain, the VR user can place a circle mark at that point on the miniature (Fig. 4). The VR user can change the color of the circle mark depending on the priority level (red: Level 3, high priority, for people who are close to the fire in the building or who are unconscious; yellow: Level 2, medium priority, for people less critical than red; white: Level 1, low priority, for people less critical than yellow, i.e., safe and conscious). The pilot can then resume the automatic flight along the rest of the predefined path.
Fig. 4. Markings for the identified people on the digital twin and identified people seen in the camera view
A UAV's augmented camera screen frame provides the 1st-person view for manual flight. The application can enlarge the 1st-person view frame so that the VR user can see the scene better. The 1st-person and 3rd-person views are synchronized with each other, and users can also adjust the scale of the miniature.
After two pilot studies to identify the potential problems in the test setting, the author conducted human-in-the-loop (HITL) simulation tests using the VR application for data collection.
Ⅲ. Experiment & Results
3-1 Research Design
The experiment was a within-subject design with two task conditions (conventional vs. novel) as the independent variable. The conventional condition has neither the miniature digital twin of the task site nor any automatic flight support; the user must control the SAR UAV with the 1st-person camera view display only, with no 3rd-person view option. The novel condition provides the digital twin, the hand-gesture input, and the automatic flight along the defined path, and users could refer to both the 1st-person camera view and the 3rd-person digital twin view. The dependent variables are the time to find people in distress, the number of identified people within each SAR site, and the perceived mental workload measured with the NASA Task Load Index (NASA-TLX). The NASA-TLX comprises six criteria (mental demand, physical demand, temporal demand, performance, effort, and frustration) to measure a participant's perceived mental workload [15]. There were 19 participants (18 males and 1 female), and the total participation time per individual was about 1 hour.
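For reference, the standard weighted NASA-TLX score combines the six subscale ratings (each 0-100) with weights obtained from 15 pairwise comparisons. The sketch below illustrates that standard computation with hypothetical values; the function name and data are illustrative only, and the means reported later in this paper use a different, transformed scale.

```python
def nasa_tlx_weighted(ratings: dict, tally: dict) -> float:
    """Overall weighted NASA-TLX workload on a 0-100 scale.

    ratings: each subscale rated 0-100 by the participant.
    tally: times each subscale was picked in the 15 pairwise comparisons.
    """
    assert sum(tally.values()) == 15, "expected 15 pairwise comparisons"
    return sum(ratings[k] * tally[k] for k in ratings) / 15.0

# Hypothetical single-participant responses (illustrative only)
ratings = {"mental": 70, "physical": 30, "temporal": 60,
           "performance": 40, "effort": 55, "frustration": 20}
tally = {"mental": 5, "physical": 1, "temporal": 3,
         "performance": 2, "effort": 3, "frustration": 1}
print(nasa_tlx_weighted(ratings, tally))  # prints 55.0
```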
The hypothesis is that human performance with the novel design is faster and more accurate than with the conventional condition, and that the perceived mental workload with the novel design is lower than with the conventional condition. The institutional review board (IRB) of Kent State University, Ohio, USA approved this experiment.
3-2 Procedures
After the informed consent procedure, the participants completed a tutorial session to familiarize themselves with the VR application. The experimenter then presented four different conditions and asked each participant to search for people in distress: a conventional SAR mission with only the 1st-person camera view at the mountain, a conventional SAR mission with only the 1st-person camera view at the burning building, a novel SAR mission with the 1st-person and 3rd-person views at the mountain, and a novel SAR mission with the 1st-person and 3rd-person views at the burning building. The order of conditions was counterbalanced across participants. The conventional conditions used only the 1st-person camera view frame for the SAR missions. Participants performed the task of searching for people in distress in the building and the mountain terrain. When participants found people in the novel conditions, they marked the point of the identified people; in the conventional conditions, they had no marking capability. Upon completing the tasks in all four conditions, the participants rated their subjective mental workload on the NASA-TLX sheets.
3-3 Results
The collected data were statistically analyzed using analysis of variance (ANOVA) in JMP 17.1.0 statistics software. The analysis results of the objective and subjective responses are as follows:
Objective Results: Number of Found People
The author conducted Student's t-tests for the pairwise comparisons (α = 0.05) for all the objective results. The responses show heterogeneous results between the building and mountain sites. At both task sites, the numbers of correct Level 3 (highest) priority determinations in the novel condition did not differ from the conventional condition. Significant differences between the conventional and novel conditions appeared in the number of Level 2 faults at the building site and in the number of Level 1 faults at the mountain site. Contrary to the hypothesis, participants made more Level 2 faults in the novel condition (M = 1.37) than in the conventional condition (M = 0.53) at the building site. However, in line with the hypothesis, participants made fewer Level 1 faults in the novel condition (M = 1.63) than in the conventional condition (M = 3.10) at the mountain site (Fig. 5). The letters above the bars indicate the Student's t-test levels: shared letters (A and A) imply no difference between the two conditions, while different letters (A and B) imply a significant difference.
Fig. 5. Level 2 faults (building) and Level 1 faults (mountain)
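The pairwise comparison used here corresponds to a paired Student's t-test on per-participant differences between the two conditions. A minimal sketch follows, using hypothetical fault counts rather than the study's actual data:

```python
import math
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired Student's t statistic for a within-subject comparison.

    Each participant contributes one value per condition; the test
    operates on the per-participant differences.
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # t = mean(d) / (sd(d) / sqrt(n)), with sd the sample standard deviation
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Hypothetical fault counts for three participants (illustrative only)
novel = [2, 3, 4]
conventional = [1, 1, 2]
print(round(paired_t(novel, conventional), 3))  # prints 5.0
```

The resulting t statistic is compared against the critical value for n - 1 degrees of freedom at α = 0.05 to decide significance.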
The overall accuracy (how accurately did participants identify a person in distress and rate that person's priority?) and the total success count show no difference between the conventional and novel conditions at the building site (Fig. 6). Contradicting the hypothesis (Fig. 7), the total accuracy at the mountain site was even lower in the novel condition (M = 1.89) than in the conventional (M = 3.10), and the total success count was lower in the novel condition (M = 3.95 times) than in the conventional (M = 6.58 times). The novel condition also showed a higher number of passing (i.e., overlooking people in distress; M = 8.05 times) than the conventional condition (M = 5.42 times) at the mountain site (Fig. 8). The number of passing did not differ between the conditions at the building site. Task duration did not differ between the two conditions in either the building or the mountain mission. Furthermore, the number of false alarms (i.e., identifying a person at a point where no one was in distress) did not differ between the conventional and novel conditions at either site.
Fig. 6. Accuracy & success rate in building site
Fig. 7. Accuracy & success rate in mountain site
Fig. 8. Number of passing
Subjective Results: Mental Workload
The author conducted Student's t-tests for the pairwise comparisons (α = 0.05) for all the subjective results. Fig. 9 and Fig. 10 show the NASA-TLX responses. Five of the six NASA-TLX criteria meet the hypothesis at the building site (Fig. 9): the responses were significantly lower in the novel condition than in the conventional for mental demand (novel M = -3.42, conventional M = -0.05), physical demand (novel M = -5.37, conventional M = -2.74), temporal demand (novel M = -7.47, conventional M = -4.79), effort (novel M = -2.89, conventional M = 1.26), and frustration (novel M = -6.84, conventional M = -3.42). Performance was the only criterion that failed to meet the hypothesis. At the mountain site (Fig. 10), the workload was lower in the novel condition only for physical demand (novel M = -2.95, conventional M = -0.95) and performance (novel M = 0.79, conventional M = -1.89). The remaining responses did not differ between the novel and conventional conditions.
Fig. 9. NASA-TLX responses in the building task
Fig. 10. NASA-TLX responses in the mountain task
Ⅳ. Discussions
This study evaluated whether the developed VR application, which uses hand-gesture input to define the automatic UAV scanning path, was effective for SAR missions at two mission sites. The results indicated the application did not show any comparative merit in mission success rate or accuracy. The novel method showed higher accuracy only for the Level 1 priority situations at the mountain site, in which the people in distress looked okay but still needed rescue. The higher accuracy and higher total success count in the conventional condition at the mountain site may indicate that the novel design is not aligned with the participants' intentions. The conventional scanning method during manual flight may draw on the participants' own capability of mapping the mission site. However, the novel design did help participants save mental workload at the building site. Compared with the mountain site, monitoring the situations inside building rooms is a simple, repetitive task because the scenes of each room transmitted from the UAV camera look the same. In this condition, human operators may not need a sophisticated mapping capability; participants could feel bored, and that boredom could degrade their monitoring performance. Since robots were invented to take over repetitive and boring human tasks, the novel method may have saved the workload of such tasks [16]. Compared with the building site, participants may need sophisticated mapping at the mountain site to avoid scanning the same region twice. Predefining the scanning patterns and automating the UAV flight at this site may therefore reinforce the user's performance rather than save mental workload. Reflecting this interpretation, the participants' NASA-TLX responses were higher only for the performance criterion; the participants did not feel the novel design saved the other mental workload criteria at the mountain site.
4-1 Human Factors when Applying Automation
The participants provided additional subjective feedback that the novel application had the benefit of efficiently predefining the UAV flight path for economical operation given the limited battery life. The feedback also noted that the capability of remembering the spatial points of people to rescue was more effective than in conventional UAV SAR models.
In some cases, the conventional manual flight method even outperformed the novel design. These results indicate that applying automation to a high-level task may not be beneficial for all kinds of operations, and they imply that the application of automation should be moderated by the site and the users' task behaviors. Humans are accustomed to performing voluntary actions based on multiple sensory inputs. The remote visual scenes during automatic flight may remove the users' voluntary control activities, so the users may feel that the mechanism of their behavior in the automatic mode differs from how they developed their expertise. This may degrade their performance in the automatic mode; they may even need training specifically for the automatic mode, treating it as a new task environment. The automatic mode may work when the users feel the required task workload exceeds their physical/mental capacity and can identify the particular subtask with which they want help. If automation is applied when users feel comfortable with the conventional mode and do not want help, mandatory automation may even break the human's optimal resource allocation (perception, attention, and vigilance). The participants in this study may have wanted to slow the search task down when they found it difficult to identify people in the automatic mode; however, they maintained the given UAV speed in the automatic mode, perhaps because they thought that speed was optimal. There was no difference in mission time between the conventional and novel conditions at either the building or the mountain site.
Automation sometimes leads people to become lazy because they naturally feel that they no longer have to perform a certain part of a task. At other times, automation may even add workload when users do not feel comfortable with the automatic mode. The optimal level of automation may be where the workload-induced stress of combined voluntary behavior and automation reaches the apex of the Yerkes-Dodson curve [17]: neither bored nor exhausted. The apex will vary depending on the task characteristics.
4-2 Limitations of the Study
A limitation of this study is that the participant group did not include any UAV pilots for SAR missions, and many participants had no prior UAV control experience. SAR UAV experts might prefer to apply their spatial-mapping expertise by manually controlling the UAVs before deciding to use the path drawing and automatic flight. To this end, the next phase of the study should test a hybrid manual/automatic method, without a mandatory automatic mode, for the SAR mission with experienced UAV pilots.
4-3 Motion Sickness Issues from VR Use
In the pilot tests, the conventional condition rendered the UAV camera scene across the entire field of view of the VR environment. Participants reported severe motion sickness with this setup: controlling the UAVs in a full-field 1st-person camera view in VR was very uncomfortable because of the conflict between their vision and their vestibular perception. While the UAV camera's attitude and angles moved and rotated, the VR user's body did not move accordingly. The design of the conventional mode was therefore changed to show the 1st-person camera view in a frame instead of filling the entire screen. No participants reported motion sickness in the main experiment sessions.
Ⅴ. Conclusion
The evaluation of a VR-based method for defining a UAV's automatic camera scanning paths using hand gestures was conducted with 19 non-professional UAV pilot participants. The prototype did not improve SAR mission performance, but it reduced the mental workload where people needed an automation aid. Based on this study's implications, professional SAR missions that require task experience and expertise in high-level mapping skills may benefit if the system selectively supports automation, especially when the task workload exceeds the pilot's capacity, rather than making the automation support mandatory. To validate this conclusion, the system needs a functional modification (selective automation), and another study with professional UAV pilots may provide more insights into applying automation to SAR missions.
Acknowledgment
This study was conducted with support from the Air Force Research Lab (AFRL) of the US Air Force. The author would like to thank Mr. Tyler Frost, a software engineer at AFRL, for the VR application development. The author also thanks Mr. Evan Wachholz for the data collection activity.
References
- V. San Juan, M. Santos, and J. M. Andujar, "Intelligent UAV map generation and discrete path planning for search and rescue operations," Complexity, 2018.
- A. Ryan, and J. K. Hedrick, "A mode-switching path planner for UAV-assisted search and rescue," in Proceedings of the 44th IEEE Conference on Decision and Control, pp. 1471-1476, Dec. 2005.
- M. Eldridge, J. Harvey, T. Sandercock, and A. Smith, Design and build a search and rescue UAV. Univ. of Adelaide, Adelaide, Australia, 2009.
- J. Cooper, and M. A. Goodrich, "Towards combining UAV and sensor operator roles in UAV-enabled visual search," in 3rd ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 351-358, Mar. 2008.
- M. A. Goodrich, B. S. Morse, D. Gerhardt, J. L. Cooper, M. Quigley, J. A. Adams, and C. Humphrey, "Supporting wilderness search and rescue using a camera-equipped mini UAV," Journal of Field Robotics, Vol. 25, No.1-2, pp. 89-110, 2008. https://doi.org/10.1002/rob.20226
- J. Sun, B. Li, Y. Jiang, and C. Y. Wen, "A camera-based target detection and positioning UAV system for search and rescue (SAR) purposes," Sensors, Vol. 16, No. 11, 2016.
- B. S. Morse, C. H. Engh, and M. A. Goodrich, "UAV video coverage quality maps and prioritized indexing for wilderness search and rescue," in 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 227-234, Mar. 2010.
- P. Doherty, and P. Rudol, "A UAV search and rescue scenario with human body detection and geo-localization," in Australasian Joint Conference on Artificial Intelligence, Berlin, Heidelberg, pp. 1-13, Dec. 2007.
- S. Verykokou, A. Doulamis, G. Athanasiou, C. Ioannidis, and A. Amditis, "UAV-based 3D modelling of disaster scenes for urban search and rescue," in 2016 IEEE International Conference on Imaging Systems and Techniques (IST), pp. 106-111, Oct. 2016.
- A. S. Khalaf, P. Pianpak, S. A. Alharthi, Z. Namini-Mianji, R. Torres, S. Tran, ... and Z. O. Toups, "An architecture for simulating drones in mixed reality games to explore future search and rescue scenarios," in Proceedings of the International ISCRAM Conference. Jan. 2018.
- T. Meilinger, and G. Vosgerau, "Putting egocentric and allocentric into perspective," in International Conference on Spatial Cognition, Berlin, Heidelberg, pp. 207-221, Aug. 2010.
- G. Albeaino, R. Eiris, M. Gheisari, and R. R. Issa, "DroneSim: A VR-based flight training simulator for drone-mediated building inspections," Construction Innovation, Vol. 22, No. 4, pp. 831-848, 2022. https://doi.org/10.1108/CI-03-2021-0049
- L. Yang, J. Huang, T. Feng, W. Hong-An, and D. Guo-Zhong, "Gesture interaction in virtual reality," Virtual Reality & Intelligent Hardware, Vol. 1, No. 1, pp. 84-112, 2019. https://doi.org/10.3724/SP.J.2096-5796.2018.0006
- J. E. Duven, Electronic Flight Displays, Federal Aviation Administration (FAA) Advisory Circular (AC) AC 25-11B, 2014.
- S. G. Hart, and L. E. Staveland, "Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research," Advances in Psychology, Vol. 52, pp. 139-183, 1988. https://doi.org/10.1016/S0166-4115(08)62386-9
- J. Wallen, The history of the industrial robot. Linkoping University Electronic Press. 2008.
- N. Rote, L. Collins, and D. Villicana, ASTE 561: Human Factors in Spacecraft Design Final Report: Performance vs. Workload, 2022.