Virtual Batting Cage and Human Model

Jordan Smith and Richard Davis



Abstract

The purpose of our project was to give participants in our virtual environment (VE) a greater sense of presence by giving them a virtual body that they could see in the VE. We hoped that participants would feel more a part of the environment if they were visually represented in that space rather than being disembodied spirits. In particular, we hoped that seeing virtual limbs next to other objects in the environment would improve depth perception. To test our representation, we constructed a VE of an indoor racquetball court in which the participant can bat a ball around and bounce it off the walls.

The Body Model

Because we had limited time and a limited number of Polhemus sensors, we decided to model only the torso and the right arm. The model was designed to follow the user's movements as closely as possible. Its configuration was driven by data collected at control points on the user's body using Polhemus Fastrak six degree of freedom sensors. We experimented with three sensor configurations to see how many sensors were needed to track the arm accurately.

The one sensor version had a single sensor mounted on the user's hand, as in figure 1. With this scheme we could only present a free floating hand in the correct position and orientation in space. This configuration may not give a fully compelling feeling of presence, but it is sufficient for certain kinds of interaction.

Figure 1: The One Sensor Model

The two sensor version had the first sensor on the user's hand and the second on the user's back near the shoulder, as in figure 2. We hoped to represent the entire torso and right arm with this method by solving the inverse kinematics for a six degree of freedom arm manipulator. The human arm has seven degrees of freedom, so this was clearly an approximation. The degree of freedom we dropped was the lateral rotation of the wrist. We chose it because it seemed to have the smallest range in human movement, and we hoped that the model would work acceptably without it. This was not the case. Small wrist waving motions by the user caused large changes in the model's elbow and shoulder angles, which was disorienting. The inverse kinematics was also a major bottleneck that slowed down the application and made it less interactive. For practical purposes, we had stopped using this configuration by the end of the project.

Figure 2: The Two Sensor Model

The three sensor version had the first sensor on the hand, the second sensor on the shoulder, and the third sensor on the humerus just below the elbow, as shown in figure 3. With this extra constraint it was fairly easy to represent the torso and right arm in the correct orientation. The base reference frame was the shoulder's frame, and the arm was extended from there.

We constrained the model so that the links in the model's arm stayed attached to one another. To do this, we found the rotation of the humerus from the difference between the measured shoulder and elbow orientations. A translation by a "known" humerus length then positioned us at the elbow. The elbow angle was determined by taking the three measured positions in space and calculating the angle between the two vectors leaving the elbow. After this rotation was applied, another translation by the "known" forearm length was applied. Finally, the hand orientation was determined by finding the rotation matrix between the measured hand orientation and the modified elbow orientation (the elbow orientation rotated by the elbow angle around the z-axis).

The final result was a fairly realistic model of the user's actions with a blocky representation of their body in the VE. The blocky body could easily be replaced with a less crude model. The "known" lengths mentioned above were measured from an average sized male, and we did not write any calibration code to change these lengths for different users, so the model could be noticeably off for a user of a significantly different size. Calibration could be added in the future, either by measuring the user's dimensions and entering the values in a GUI, or by adding a calibration step in which the user holds a known pose and the program solves some simple equations to determine the correct lengths. All in all, users stated that the body model enhanced their sense of presence.
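The elbow angle step described above can be sketched as follows. This is a minimal illustration, not the project's code: the Vec3 type and the function names are hypothetical, and the angle is measured at the elbow between the vector to the shoulder and the vector to the hand.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D vector type (not from the original source).
struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
double length(const Vec3& a) { return std::sqrt(dot(a, a)); }

// Angle in radians between the two vectors leaving the elbow:
// one toward the shoulder sensor, one toward the hand sensor.
// A fully extended arm gives pi; a bent arm gives a smaller value.
double elbowAngle(const Vec3& shoulder, const Vec3& elbow, const Vec3& hand) {
    Vec3 toShoulder = sub(shoulder, elbow);
    Vec3 toHand = sub(hand, elbow);
    double denom = length(toShoulder) * length(toHand);
    assert(denom > 0.0);  // the three sensor positions must be distinct
    double c = dot(toShoulder, toHand) / denom;
    c = std::fmax(-1.0, std::fmin(1.0, c));  // guard acos against rounding error
    return std::acos(c);
}
```

Only the measured positions are used here; the link lengths in the drawn model would still come from the "known" humerus and forearm values.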

Figure 3: The Three Sensor Model

Environment

Our original intention was to place users of our system in an environment similar to Lawrence Stark's Tele-Robotics demonstration, in which a user's hand would control a robotic arm. We soon realized, however, that it was fairly easy to create a simple set of bouncing ball games that would be much more engaging.

There are three games in our system. The first, seen in figure 4, is a simple paddle game, in which the user must use the paddle on his hand to keep the ball from hitting the floor. The second game, in figure 5, is similar to squash, where the user must hit the ball against a wall. The third game is similar to the first, but the user is surrounded by walls on all sides, so it is harder to lose track of the ball. In all three environments, the floor is placed at about waist level, so that it is easy to pick up the ball if it falls to the floor.

Figures 4 and 5: The Paddle Game and the Squash Game

We made it possible to play each of these games with any of our body configurations. Users could play with a disembodied hand, or with an arm representation with any number of sensors.

Implementation

We implemented our VR system using a combination of C++, OpenGL, and Tcl/Tk. We used OpenGL for the 3D graphics, Tcl/Tk to make the user interface easy to change and configure, and C++ to organize our application in an object-oriented way. The code was divided into three main modules: user interaction, the camera, and the world.

User Interface

The user interaction module included both C++ and Tcl source files, along with calls to the OpenGL libraries. The main window contained the OpenGL drawing area. This window was in charge of running the real time loop, which we implemented with Tcl timer callbacks in order to keep a constant frame rate.

This window was also in charge of communicating with the Polhemus driver process. The driver and the main window communicated by writing and reading seven floating point values in agreed-upon shared memory locations. The seven values represented a 3D position and a 4D quaternion for orientation. We had to modify and debug the Polhemus driver code to work with up to four sensors and in both z hemispheres. Once the window had polled and received values from the driver, it transformed them to a virtual position on the user's body, which was easier to deal with internally. This concept was described in class during Michael Deering's talk.
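As a rough illustration of the seven-value record, the sketch below unpacks a position and quaternion and converts the quaternion to a rotation matrix. The struct, the function names, and the component order (position first, quaternion with the scalar part first) are assumptions made for this example and are not taken from the driver code.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical layout of one sensor record: a position followed by a
// quaternion. The order of the components is an assumption.
struct SensorPose {
    float px, py, pz;      // position
    float qw, qx, qy, qz;  // orientation quaternion, scalar part first
};

using Mat3 = std::array<std::array<float, 3>, 3>;

// Unpack seven consecutive floats, for example as read from shared memory.
SensorPose unpackPose(const float* v) {
    return {v[0], v[1], v[2], v[3], v[4], v[5], v[6]};
}

// Convert the quaternion to a 3x3 rotation matrix, normalizing it first
// so that small numerical drift in the sensor data does not add scaling.
Mat3 toRotation(const SensorPose& p) {
    float n = std::sqrt(p.qw * p.qw + p.qx * p.qx + p.qy * p.qy + p.qz * p.qz);
    assert(n > 0.0f);
    float w = p.qw / n, x = p.qx / n, y = p.qy / n, z = p.qz / n;
    Mat3 m{};
    m[0][0] = 1 - 2 * (y * y + z * z); m[0][1] = 2 * (x * y - w * z);     m[0][2] = 2 * (x * z + w * y);
    m[1][0] = 2 * (x * y + w * z);     m[1][1] = 1 - 2 * (x * x + z * z); m[1][2] = 2 * (y * z - w * x);
    m[2][0] = 2 * (x * z - w * y);     m[2][1] = 2 * (y * z + w * x);     m[2][2] = 1 - 2 * (x * x + y * y);
    return m;
}
```

The resulting matrix can be combined with the position to build the transform that maps a sensor reading onto the corresponding point of the body model.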

The main window could also pop up object specific user interface windows for changing parameters within the program. These UI windows were all subclassed from the WindowUI class, and their implementation simply made certain object methods and fields accessible within Tcl. The methods and fields were exported through an interface layer to Tcl called OOT (Object Oriented Tcl), which was designed to integrate Tcl with C++. All of the actual graphical layouts of the UIs were implemented in Tcl, which made modifying the UI much easier. Figure 6 shows the UI for the world object.

Figure 6: The UI for the World Object

The Camera

The main window had a camera object which defined the user's viewing frustum. This camera could be oriented by a mouse interface or by a Polhemus sensor mounted on top of a head mounted display (HMD). This allowed the user to be immersed in the VE by tracking their gaze in virtual space and presenting the corresponding view on the HMD. To display our scene on the HMD, we used the SGI videoout program, which grabs a 640x480 pixel window and outputs it as an NTSC signal. This NTSC signal could be interpreted by the Virtual IO HMDs we were using. Unfortunately, we did not have time to write code to put out an interlaced NTSC signal for stereo viewing, though our camera model was fully capable of rendering the scene from center, left, and right eye points.
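The center, left, and right eye points can be pictured as offsets of the tracked head position along the camera's sideways axis. The sketch below is only an illustration under that assumption; the Eye enum, the eyePoint function, and the eye separation parameter are hypothetical and do not come from the project's camera class.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D vector type (not from the original source).
struct Vec3 { double x, y, z; };

enum class Eye { Left, Center, Right };

// Offset the tracked head position along the camera's unit "right" vector.
// eyeSeparation is the distance between the two eyes, in the same units
// as the head position; the center eye point is the head position itself.
Vec3 eyePoint(const Vec3& head, const Vec3& right, Eye eye, double eyeSeparation) {
    double s = 0.0;
    if (eye == Eye::Left)  s = -0.5 * eyeSeparation;
    if (eye == Eye::Right) s =  0.5 * eyeSeparation;
    return {head.x + s * right.x, head.y + s * right.y, head.z + s * right.z};
}
```

For stereo viewing, the scene would be rendered once from the left eye point and once from the right, with the two images sent to the matching displays.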

The World

The world object contains all the objects of the VE, represented in a scene graph. The scene graph includes group and instance nodes. Group nodes exist solely for grouping objects together. An instance node, by contrast, has a list of transformations to be applied at that node and a pointer to a single object, which may itself be a group. The leaves of the tree are atomic objects. The arm model is a special instance node which performs the inverse kinematics described earlier.
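A minimal sketch of this kind of scene graph is shown below. All class names are hypothetical, the transformations are reduced to translations to keep the example short, and the choice to have groups hold instances directly is an assumption rather than a description of the original classes.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical 3D vector type (not from the original source).
struct Vec3 { double x, y, z; };

// Anything an instance node can point to: an atomic object or a group.
struct Object {
    virtual ~Object() = default;
    // Append the world-space position of every atomic leaf below this
    // object, given the translation accumulated so far.
    virtual void collect(const Vec3& offset, std::vector<Vec3>& out) const = 0;
};

// Atomic leaf; here just a point at its local origin.
struct Atom : Object {
    void collect(const Vec3& offset, std::vector<Vec3>& out) const override {
        out.push_back(offset);
    }
};

// Instance node: a list of transformations applied to a single object.
struct Instance {
    std::vector<Vec3> translations;
    std::shared_ptr<Object> object;
    void collect(const Vec3& offset, std::vector<Vec3>& out) const {
        Vec3 t = offset;
        for (const Vec3& d : translations) { t.x += d.x; t.y += d.y; t.z += d.z; }
        object->collect(t, out);
    }
};

// Group node: exists only to hold several instances together.
struct Group : Object {
    std::vector<Instance> children;
    void collect(const Vec3& offset, std::vector<Vec3>& out) const override {
        for (const Instance& c : children) c.collect(offset, out);
    }
};
```

Because an instance only points to its object, the same atomic object (a ball, for example) can appear several times in the scene under different transformations.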

Dynamics

Special handling is needed for objects that move autonomously (i.e. not controlled by an input device) in our scenes. We create moving objects by associating "dynamic transforms" with the object. These special transforms find the new positions of objects influenced by gravity or other forces and are all updated once for each frame. Just before collision detection and rendering, these transforms are combined with all the others in the scene graph to compute the new position and orientation of each object.
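One way to picture a dynamic transform is as a small piece of state that is advanced once per frame. The sketch below assumes a constant acceleration such as gravity and a simple Euler-style integration step; the struct name, its fields, and the integration scheme are assumptions for illustration, not the original implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D vector type (not from the original source).
struct Vec3 { double x, y, z; };

// Hypothetical "dynamic transform": holds the state of one moving object
// and is updated once per frame, before collision detection and rendering.
struct DynamicTransform {
    Vec3 position;
    Vec3 velocity;
    Vec3 previous;  // position at the start of the last step

    // Advance by one time step under a constant acceleration.
    // The velocity is updated first and then used to move the object.
    void update(const Vec3& accel, double dt) {
        previous = position;
        velocity = {velocity.x + accel.x * dt,
                    velocity.y + accel.y * dt,
                    velocity.z + accel.z * dt};
        position = {position.x + velocity.x * dt,
                    position.y + velocity.y * dt,
                    position.z + velocity.z * dt};
    }
};
```

Keeping the previous position alongside the new one gives the collision step both ends of the path travelled during the frame.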

Collision Detection

Collision detection in our system is extremely simple in order to save processing time. We assume that all collisions are between spheres (the balls) and planar obstacles (the walls and paddles). We also assume that the momentum of the obstacles is not affected by the balls. This means that a ball's velocity after collision is only dependent on its velocity before collision, the speed of the obstacle (at the collision point), and the normal of the obstacle.
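Under these assumptions, the velocity after a collision can be written as a reflection of the ball's velocity relative to the obstacle. The sketch below is one plausible formulation, not the original code: the function name and the restitution parameter (1.0 for a perfectly elastic bounce) are assumptions, and the obstacle is treated as unaffected by the impact.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D vector type (not from the original source).
struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Velocity of the ball after hitting a planar obstacle.
// ballVel:     ball velocity before the collision
// obstacleVel: obstacle velocity at the collision point
// normal:      unit normal of the obstacle
// restitution: assumed bounciness, 1.0 = perfectly elastic
Vec3 bounce(const Vec3& ballVel, const Vec3& obstacleVel,
            const Vec3& normal, double restitution) {
    assert(std::fabs(dot(normal, normal) - 1.0) < 1e-6);  // normal must be unit length
    // Speed of the ball relative to the obstacle, measured along the normal.
    double closing = dot(sub(ballVel, obstacleVel), normal);
    double k = (1.0 + restitution) * closing;
    return {ballVel.x - k * normal.x,
            ballVel.y - k * normal.y,
            ballVel.z - k * normal.z};
}
```

With a stationary wall this reduces to mirroring the velocity about the wall; a paddle moving toward the ball adds its own speed to the rebound.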

In order to avoid computing the exact speed and orientation of the obstacle at the point of collision, we make the gross assumption that the obstacle's orientation does not change over the time step in which the moment of collision is computed. This can lead to obviously incorrect behavior in a small number of cases; for instance, when a user tries to hit a ball by twisting his hand without shifting it at all, the ball will simply slip through the paddle. To avoid this problem, we put in a simple check to ensure that a ball always remains on the same side of an obstacle from the beginning to the end of a time step (unless, of course, it is outside the bounds of the obstacle). This gives visually pleasing results in most cases.
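The same-side check can be sketched with signed distances to the obstacle's plane at the start and end of the time step. This is a simplified illustration: the Plane type and function names are hypothetical, the obstacle is treated as an unbounded plane (the bounds test mentioned above is omitted), and pushing the ball back to one radius from the plane is an assumed correction rather than the original one.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D vector type (not from the original source).
struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Obstacle plane given by a point on it and a unit normal.
struct Plane { Vec3 point; Vec3 normal; };

double signedDistance(const Plane& pl, const Vec3& p) {
    return dot(sub(p, pl.point), pl.normal);
}

// True when the ball center ends the step on the opposite side of the
// obstacle from where it started. "before" and "after" are the obstacle's
// plane at the start and end of the step; p0 and p1 are the ball centers.
bool crossed(const Plane& before, const Plane& after, const Vec3& p0, const Vec3& p1) {
    return signedDistance(before, p0) * signedDistance(after, p1) < 0.0;
}

// If the ball changed sides, move it back to its original side,
// one radius away from the obstacle's final plane.
Vec3 keepOnSide(const Plane& before, const Plane& after,
                const Vec3& p0, const Vec3& p1, double radius) {
    if (!crossed(before, after, p0, p1)) return p1;
    double side = signedDistance(before, p0) > 0.0 ? 1.0 : -1.0;
    double shift = side * radius - signedDistance(after, p1);
    return {p1.x + shift * after.normal.x,
            p1.y + shift * after.normal.y,
            p1.z + shift * after.normal.z};
}
```

Because the test compares the obstacle's plane before and after the step, it also catches the twisting paddle case, where the ball does not move but the plane sweeps past it.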

We check for collisions just after the positions of dynamic objects have been updated, and all objects know their positions at the beginning and the end of the time step. The dynamic transforms associated with ball objects are "collision transforms." These transforms check to see if their objects' paths crossed the paths of any obstacles, and correct their positions if a collision occurred. With the assumptions we have made, the exact moment of collision can be computed with a relatively simple quadratic equation.
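As an illustration of how a quadratic can arise, suppose the ball follows a ballistic arc under constant gravity during the time step while the obstacle translates at constant velocity with a fixed normal. The distance between the ball's surface and the plane is then quadratic in time. The sketch below solves for the earliest contact under those assumptions; the types, names, and the exact motion model are assumptions and may differ from the original code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types (not from the original source).
struct Vec3 { double x, y, z; };
struct Ball { Vec3 pos; Vec3 vel; double radius; };
struct Obstacle { Vec3 point; Vec3 vel; Vec3 normal; };  // unit normal, fixed over the step

Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Finds the earliest t in [0, dt] at which the ball's surface touches the
// plane, assuming the ball starts on the side the normal points toward.
// Gap along the normal: d(t) = a*t^2 + b*t + c, with
//   a = 0.5 * n.g,  b = n.(v_ball - v_obstacle),  c = n.(p_ball - p_plane) - r
// Returns true and sets tHit when such a time exists.
bool timeOfImpact(const Ball& ball, const Obstacle& ob, const Vec3& gravity,
                  double dt, double& tHit) {
    double a = 0.5 * dot(ob.normal, gravity);
    double b = dot(ob.normal, sub(ball.vel, ob.vel));
    double c = dot(ob.normal, sub(ball.pos, ob.point)) - ball.radius;
    double roots[2];
    int count = 0;
    if (std::fabs(a) < 1e-12) {
        // No acceleration along the normal: the equation is linear.
        if (std::fabs(b) > 1e-12) roots[count++] = -c / b;
    } else {
        double disc = b * b - 4.0 * a * c;
        if (disc >= 0.0) {
            double s = std::sqrt(disc);
            roots[count++] = (-b - s) / (2.0 * a);
            roots[count++] = (-b + s) / (2.0 * a);
        }
    }
    bool found = false;
    for (int i = 0; i < count; ++i) {
        if (roots[i] >= 0.0 && roots[i] <= dt && (!found || roots[i] < tHit)) {
            tHit = roots[i];
            found = true;
        }
    }
    return found;
}
```

When a collision time is found, the ball can be placed at its position at that moment and given its post-collision velocity for the remainder of the step.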