This document gives a high-level design overview of a software ecosystem for motion capturing. The emphasis is on maximal decoupling of functionalities, and on the creation or use of FOSS (Free Software and Open Source Software) libraries and applications. The horizon of the ecosystem's development is rather long-term, so the scope deliberately includes “everything”.
Motion capture uses sensors to “watch” a moving human or machine, and processes the raw sensor measurements into the motion of a mathematical model of that human or machine. This document also covers the following related activities under the name “motion capture”:
Analysis of motions: to classify, to quantify, and to assess the amount of certain motion pathologies in patients; to decompose a captured motion into a set of “elementary motion strokes/gaits”; to calibrate a patient-specific kinematic or dynamic model to make it fit to the motion capture data; to find the amount of “angriness” or “despair” in facial expressions; to process the raw data into application-specific form; to visualise in 2D or 3D environments, including synchronisation with other sensor data, such as camera images; …
Synthesis of motions: via a combination of “elementary motion strokes”; via key frames; via dynamic models that link neural and muscle stimulation signals to forces acting on the skeleton, and hence to motion; …
Mathematical representations of motions: instantaneous and non-instantaneous motions (“trajectories”); joint space and Cartesian space motions; (explicit) parameterized motions and (implicit) motion constraints; …
Clinical standardization of terminology and description of nominal motions (“healthy” motions) and the pathology-related deviations from those nominal motions; creation of families of nominal motion, where members of the same family are parameterized according to velocity, range, patient size, etc.; …
Integration with other sensing systems: many applications require the motion information to be closely integrated with other sensory information, such as computer vision, EMG and force plate signals, inertial and acceleration signals, (3D) laser scans, fluoroscopy; motion control of exoskeletons, orthoses or robots; surgical operation planning; dynamic registration; sport game analysis and scouting; preparation for surgery; …
There are different design, modelling and processing steps involved in motion capture; the items marked with a “(S)” in the list below consist mostly of software:
Marker design and placement: the markers must be “visible” to the sensors, and be uniquely identifiable in one way or another. Markers can be passive or active. Future systems will probably use no artificial markers at all, but use “natural landmarks”. The placement of markers is often standardized in a particular domain, in order to facilitate communication, data exchange, and (hopefully) to optimize identifiability and accuracy.
Sensor design and placement: the properties of the sensors should match the properties of the markers. The placement of the sensors can have a large impact on the accuracy of marker measurements.
(S) Active sensor control: some sensors require active control to achieve their goal. For example, a camera driven next to a walking patient, zooming in on the knee, requires advanced motion control of the robot camera.
In addition, this sensing, as well as the integration of different sensors, requires hard realtime performance of the acquisition and control software.
Kinematic model definition: most often, the users are not interested in the motion of the markers, but of the subject (“natural landmarks”, “anatomical landmarks”) to which the markers are (supposed to be rigidly) attached. This subject's motion is represented by a mathematical model of its motion capabilities, and the “joint angles” of this model are used to store and evaluate the subject's motion. The choice of this kinematic model has a large influence on the resulting motion model.
The easiest kinematic models are explicit: every degree of freedom of the relative motion X of the moving segments is “generated” by an ideal joint qi between ideal rigid bodies in the kinematic model:
X = f (q1, …, qn).
However, some parts of the human body, such as the shoulder, cannot be represented faithfully by an explicit kinematic model, and require an implicit model: each segment i has six degrees of freedom Xi, with kinematic/dynamic constraints that take away some motion freedom:
g (X1, …, Xm) = 0.
Implicit descriptions are more flexible, but computationally more involved, and intuitively less clear or deterministic.
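The difference between the two model types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a planar two-link chain stands in for the explicit model X = f(q1, q2), and a single revolute joint between two otherwise free planar segments stands in for the implicit constraint g(X1, X2) = 0. The function names and link lengths are hypothetical.

```python
import math


def forward_kinematics(q1, q2, l1=0.30, l2=0.25):
    """Explicit model X = f(q1, q2) for a planar two-link chain.

    Returns the pose (x, y, orientation) of the end of the second link.
    The link lengths l1 and l2 (metres) are illustrative values.
    """
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y, q1 + q2


def joint_constraint(segment_a, segment_b, length_a=0.30):
    """Implicit model g(X1, X2) = 0 for a revolute joint.

    Each segment pose is (x, y, theta) of its proximal end. The residual is
    zero when the distal end of segment A coincides with the proximal end
    of segment B; the relative orientation remains free.
    """
    xa, ya, theta_a = segment_a
    xb, yb, _ = segment_b
    return (
        xa + length_a * math.cos(theta_a) - xb,
        ya + length_a * math.sin(theta_a) - yb,
    )


# Explicit: joint angles in, pose out.
end_pose = forward_kinematics(0.0, 0.0)

# Implicit: two free poses in, constraint violation out.
residual = joint_constraint((0.0, 0.0, 0.0), (0.30, 0.0, 1.0))
```

In the explicit form every joint value yields a valid pose by construction. In the implicit form a solver has to keep the residual at zero, which is where the extra computational effort comes from.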
Assigning markers to kinematic model: the position of the markers with respect to the model has to be known, in order to be able to derive the model's motion from the markers' motion. In human motion capture, markers are mostly attached to the subject's skin, which results in “skin artefact” disturbances on the marker-to-model transformation.
(S) Sensor signal processing: the raw sensor data is most often not used directly, but is processed via computer algorithms into marker “measurements”; for example, calculation of the center of a visually bright spot in a camera image; template matching in a laser scan; etc.
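As an illustration of the first example, the centre of a bright spot can be computed as an intensity-weighted centroid of all pixels above a threshold. The sketch below assumes a grayscale image given as a list of rows and a single spot in the image; the function name and the threshold value are hypothetical.

```python
def bright_spot_centroid(image, threshold):
    """Intensity-weighted centroid (column, row) of pixels >= threshold.

    Assumes a single bright spot. Returns None when no pixel reaches
    the threshold.
    """
    total = 0.0
    sum_col = 0.0
    sum_row = 0.0
    for row_idx, row in enumerate(image):
        for col_idx, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_col += value * col_idx
                sum_row += value * row_idx
    if total == 0.0:
        return None
    return sum_col / total, sum_row / total


# A small synthetic image with a symmetric spot around pixel (2, 2).
image = [
    [0, 0, 0, 0, 0],
    [0, 10, 200, 10, 0],
    [0, 200, 250, 200, 0],
    [0, 10, 200, 10, 0],
    [0, 0, 0, 0, 0],
]
centroid = bright_spot_centroid(image, threshold=100)
```

A real pipeline would first segment the image into separate spots (one per marker) and apply the centroid to each of them.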
(S) Marker measurement processing: the “measurements” of the markers have to be transformed into the motion of the (implicit or explicit) kinematic model. Most often, this transformation is formulated mathematically as an optimization problem: the calculated model motion is the one that gives the smallest error between the modelled motion of the markers and their measured motion.
“Least-squares” optimization is the most common, but it has a number of known drawbacks: sensitivity to outliers; not all sensors have a uniform error distribution in all directions or for all markers; the skin artefacts are not “random noise”; etc.
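The sensitivity to outliers can be made concrete with a deliberately small example. The sketch assumes a planar one-link model whose markers lie at known distances from the joint, so that the least-squares joint angle has a closed form. The robust variant uses Huber-type iterative reweighting, which is one possible remedy chosen here for illustration and not a method prescribed by this document; all names and values are hypothetical.

```python
import math


def fit_angle(radii, measurements, weights=None):
    """Weighted least-squares joint angle q of a planar one-link model.

    Model marker i lies at radii[i] * (cos q, sin q). Minimising the
    weighted sum of squared marker errors gives q = atan2(S, C) with
    S = sum(w * r * y) and C = sum(w * r * x).
    """
    if weights is None:
        weights = [1.0] * len(radii)
    s = sum(w * r * y for w, r, (x, y) in zip(weights, radii, measurements))
    c = sum(w * r * x for w, r, (x, y) in zip(weights, radii, measurements))
    return math.atan2(s, c)


def fit_angle_robust(radii, measurements, delta=0.02, iterations=10):
    """Huber-type reweighting: markers with an error above delta (metres)
    get a weight delta / error instead of 1."""
    q = fit_angle(radii, measurements)
    for _ in range(iterations):
        weights = []
        for r, (x, y) in zip(radii, measurements):
            err = math.hypot(x - r * math.cos(q), y - r * math.sin(q))
            weights.append(1.0 if err <= delta else delta / err)
        q = fit_angle(radii, measurements, weights)
    return q


# Five markers on a link at a true angle of 0.5 rad; the middle marker is
# mis-detected and appears at an angle of 1.5 rad.
true_q = 0.5
radii = [0.10, 0.15, 0.20, 0.25, 0.30]
measurements = [(r * math.cos(true_q), r * math.sin(true_q)) for r in radii]
measurements[2] = (0.20 * math.cos(1.5), 0.20 * math.sin(1.5))

q_least_squares = fit_angle(radii, measurements)
q_robust = fit_angle_robust(radii, measurements)
```

With this data the plain least-squares estimate is pulled noticeably towards the outlier, while the reweighted estimate stays close to the true angle. The threshold `delta` has to be chosen relative to the expected measurement noise.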
Data association is an important aspect of marker measurement processing: measured markers have to be assigned to the corresponding model markers. It is hard to do this both fully automatically and computationally efficiently.
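A brute-force sketch shows both the idea and the efficiency problem. It assumes that the model predicts a position for every labelled marker and that at least as many unlabelled points have been measured; the assignment with the smallest total distance wins. The names and coordinates are hypothetical, and the exhaustive search is only feasible for a handful of markers.

```python
import itertools
import math


def associate(model_markers, measured_points):
    """Assign each model marker label to one measured point.

    Tries every assignment and keeps the one with the smallest total
    Euclidean distance. The number of candidates grows factorially with
    the number of markers, so this is for illustration only.
    Requires len(measured_points) >= len(model_markers).
    """
    labels = list(model_markers)
    best_cost = math.inf
    best_assignment = None
    indices = range(len(measured_points))
    for candidate in itertools.permutations(indices, len(labels)):
        cost = sum(
            math.dist(model_markers[label], measured_points[idx])
            for label, idx in zip(labels, candidate)
        )
        if cost < best_cost:
            best_cost = cost
            best_assignment = candidate
    return dict(zip(labels, best_assignment)), best_cost


# Predicted marker positions from the model, and unlabelled measurements.
predicted = {"hip": (0.00, 1.00), "knee": (0.05, 0.50), "ankle": (0.00, 0.10)}
measured = [(0.02, 0.12), (0.01, 0.98), (0.06, 0.52)]

assignment, total_distance = associate(predicted, measured)
```

Practical systems replace the exhaustive search by a polynomial-time assignment algorithm and use the previous frame's assignment as a prior, but the cost function has the same form.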
(S) Kinematic model identification (“kinematic calibration”): the transformation from the marker measurements into the best-fitting motion of the kinematic model. During calibration, a prescribed set of parameters in the kinematic model (not just the “joint angles” but also geometric measures) can be adapted in order to optimize the fit; these geometric adaptations cannot be arbitrary, because there are constraints between several of the geometric parameters; for example, the scale of the whole kinematic model must correspond to physically realistic dimensions of humans.
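A small instance of such a calibration is the estimation of a joint centre and of the distance between that centre and a marker, from the positions the marker takes while the segment rotates. The sketch below assumes planar motion around a fixed revolute joint and uses an algebraic least-squares circle fit; it relies on NumPy, and the plausibility bounds are illustrative placeholders rather than anthropometric data.

```python
import numpy as np


def fit_joint_centre(points):
    """Least-squares circle fit: returns ((cx, cy), radius).

    Uses the linear form x^2 + y^2 = 2*cx*x + 2*cy*y + c,
    with c = radius^2 - cx^2 - cy^2.
    """
    pts = np.asarray(points, dtype=float)
    design = np.column_stack(
        (2.0 * pts[:, 0], 2.0 * pts[:, 1], np.ones(len(pts)))
    )
    rhs = (pts ** 2).sum(axis=1)
    (cx, cy, c), *_ = np.linalg.lstsq(design, rhs, rcond=None)
    radius = float(np.sqrt(c + cx ** 2 + cy ** 2))
    return (float(cx), float(cy)), radius


def is_plausible_length(length, lower=0.2, upper=0.7):
    """Reject fits outside a physically realistic range (metres).

    The bounds are hypothetical and would come from domain knowledge.
    """
    return lower <= length <= upper


# Synthetic marker positions on an arc around a joint at (0.1, 0.9).
angles = np.linspace(-1.0, 0.5, 20)
marker_path = np.column_stack(
    (0.1 + 0.4 * np.cos(angles), 0.9 + 0.4 * np.sin(angles))
)

centre, segment_length = fit_joint_centre(marker_path)
accepted = is_plausible_length(segment_length)
```

The plausibility check is the simplest form of the constraints mentioned above; a full calibration would couple the parameters of all segments, for example through a common scale factor.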
(S) Dynamic model definition and identification (“dynamic calibration”): instead of using only a kinematic model, the inertial and force effects can be integrated into the motion model, resulting in a fully dynamic model. The model definition and identification problems are similar to the kinematic case, but have more parameters. However, the advantage of taking into account the dynamics is that some “noise” effects in the kinematic models can now be described more explicitly and (hence?) accurately.
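The identification of a dynamic model can be sketched for the simplest possible case. The example assumes a single rigid segment that swings around a fixed joint, with the torque equation tau = I * qdd + k * sin(q), where I is the inertia about the joint and k lumps mass, gravity and the distance to the centre of mass. The equation is linear in the unknown parameters (I, k), so ordinary least squares applies. The model, the names and the numbers are all assumptions for illustration, and the sketch relies on NumPy.

```python
import numpy as np


def identify_pendulum(q, qdd, tau):
    """Least-squares estimate of (inertia, gravity_term) from samples.

    Model: tau = inertia * qdd + gravity_term * sin(q).
    """
    regressor = np.column_stack((qdd, np.sin(q)))
    params, *_ = np.linalg.lstsq(regressor, tau, rcond=None)
    inertia, gravity_term = params
    return float(inertia), float(gravity_term)


# Synthetic, noise-free "measurements" with two excitation frequencies.
t = np.linspace(0.0, 2.0, 200)
q = 1.2 * np.sin(3.0 * t) + 0.5 * np.sin(7.0 * t)
qdd = -1.2 * 9.0 * np.sin(3.0 * t) - 0.5 * 49.0 * np.sin(7.0 * t)

true_inertia = 0.35                 # kg m^2, illustrative
true_gravity_term = 2.0 * 9.81 * 0.3  # m * g * l, illustrative
tau = true_inertia * qdd + true_gravity_term * np.sin(q)

inertia, gravity_term = identify_pendulum(q, qdd, tau)
```

With real data the accelerations have to be obtained by differentiating noisy position measurements, which is one reason why dynamic identification is harder in practice than this sketch suggests.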
(S) Muscle models, both kinematic and dynamic, often appear in this stage of the motion capture “work flow”.
(S) Recognition, classification, clustering: after a motion for the kinematic (or dynamic) model has been identified from the “raw” motion capture, the next step could be to classify the captured motion into a set of given classes, e.g., walking, running, jumping, etc. Clustering is a related post-processing activity, in which a set of motion trajectories is automatically divided into clusters with similar properties. It is not yet clear which motion metrics are the most appropriate for measuring the separation between different clusters.
A motion can be decomposed into a (non)linear combination of “elementary motions”, for recognition, classification, clustering, synthesis, etc. The scalar parameters in this combination could be the basis for the recognition, classification and/or clustering. This decomposition should be more model-based than the traditional but rather “blind” PCA, ICA or other factorization algorithms.
(S) Visualisation of motion data: the captured data is to be displayed to humans in a “virtual reality” environment: a bone or limb model, possibly with video superimposed, from different viewpoints, with synchronous visualisation of the other signals (EMG, force, …), etc.
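For the linear case, such a decomposition reduces to a least-squares problem once the elementary motions are given. The sketch below assumes a single joint-angle trajectory, sampled at the same instants as three hypothetical elementary motions; the names, the shapes of the elementary motions and the "dominant component" rule are all illustrative assumptions. It relies on NumPy.

```python
import numpy as np


def decompose(trajectory, elementary_motions):
    """Weights of a linear combination of given elementary motions.

    elementary_motions maps a name to a sampled trajectory with the same
    length as the input. Returns a dict name -> scalar weight.
    """
    names = list(elementary_motions)
    basis = np.column_stack([elementary_motions[name] for name in names])
    target = np.asarray(trajectory, dtype=float)
    weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return {name: float(w) for name, w in zip(names, weights)}


# Hypothetical elementary motions, sampled over one normalised cycle.
t = np.linspace(0.0, 1.0, 100)
elementary = {
    "swing": np.sin(2.0 * np.pi * t),
    "bounce": np.sin(4.0 * np.pi * t),
    "offset": np.ones_like(t),
}

captured = 0.8 * elementary["swing"] + 0.1 * elementary["bounce"] + 0.3
weights = decompose(captured, elementary)

# A very simple classification rule: the component with the largest weight.
dominant = max(weights, key=lambda name: abs(weights[name]))
```

The difference with PCA or ICA is that the basis is supplied by a model of the motion instead of being extracted from the data, so the resulting weights keep a physical interpretation.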
(S) Storage and retrieval of motion data: each run of motion capturing on a “patient” or a “machine” generates a lot of data, from various sensors, and using different models; these data should be stored in a very systematic way for later analysis and data mining. The database in which the data is stored should be integrated into the total workflow, and contain domain-specific knowledge to monitor the access and storage of the data, e.g., to guarantee correctness.
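As a minimal sketch of what “domain-specific knowledge to guarantee correctness” could mean, the example below lets the database itself reject inconsistent data. It assumes SQLite (from the Python standard library) as the store; the schema, table names and column names are hypothetical and much simpler than a real motion capture database would need.

```python
import sqlite3

# Hypothetical schema: every marker sample must belong to a known trial,
# and obviously invalid values are rejected by the database itself.
SCHEMA = """
CREATE TABLE trial (
    trial_id       INTEGER PRIMARY KEY,
    subject        TEXT NOT NULL,
    sample_rate_hz REAL NOT NULL CHECK (sample_rate_hz > 0)
);
CREATE TABLE marker_sample (
    trial_id INTEGER NOT NULL REFERENCES trial(trial_id),
    frame    INTEGER NOT NULL CHECK (frame >= 0),
    marker   TEXT NOT NULL,
    x REAL NOT NULL,
    y REAL NOT NULL,
    z REAL NOT NULL,
    PRIMARY KEY (trial_id, frame, marker)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and enable foreign-key checking."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert_sample(conn, trial_id, frame, marker, x, y, z):
    """Insert one sample; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO marker_sample VALUES (?, ?, ?, ?, ?, ?)",
            (trial_id, frame, marker, x, y, z),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_store()
conn.execute("INSERT INTO trial VALUES (1, 'subject-01', 100.0)")

stored = try_insert_sample(conn, 1, 0, "knee", 0.05, 0.50, 0.0)
duplicate = try_insert_sample(conn, 1, 0, "knee", 0.05, 0.50, 0.0)
orphan = try_insert_sample(conn, 99, 0, "knee", 0.05, 0.50, 0.0)
```

Here the second insert is rejected as a duplicate of an existing frame, and the third because it refers to a trial that does not exist. Bulk sensor data would more likely live in a binary format, with the database holding the metadata and the consistency rules.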
(S) Integrated Development Environments (IDE) support the integration of all or some of the above-mentioned software modules into one single interface to the human user. The IDE offers the most appropriate workflow to its different categories of users.
This Section lists software that is available under a Free Software or Open Source Software (“FOSS”) license.
BodyMech: a Matlab based open source package for 3D kinematic analysis. It does not focus on the sensor aspects of motion capturing, but on the subsequent marker processing steps.
Orchestra: a workflow support package.
…
Here are two of the more popular file formats in which (human) motion capture information is stored and exchanged: C3D (more information at the www.c3d.org website), BVH (BioVision Hierarchical data format).
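BVH is a plain-text format, so its structure is easy to illustrate. The sketch below reads the joint hierarchy and the motion block of a tiny BVH fragment. It assumes the common layout in which every keyword and every brace sits on its own line and joint names are unique; the function names and the sample skeleton are hypothetical, and this is not a complete or validating parser.

```python
def parse_bvh_hierarchy(text):
    """Return {joint_name: {"parent": name or None, "channels": [...]}}."""
    joints = {}
    open_nodes = []   # stack of open nodes; None stands for an "End Site"
    pending = None    # node whose opening brace is expected next
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        key = tokens[0]
        if key == "MOTION":
            break
        if key in ("ROOT", "JOINT"):
            pending = tokens[1]
            parent = open_nodes[-1] if open_nodes else None
            joints[pending] = {"parent": parent, "channels": []}
        elif key == "End":          # "End Site": a leaf without channels
            pending = None
        elif key == "{":
            open_nodes.append(pending)
        elif key == "}":
            open_nodes.pop()
        elif key == "CHANNELS":
            count = int(tokens[1])
            joints[open_nodes[-1]]["channels"] = tokens[2:2 + count]
    return joints


def parse_bvh_motion(text):
    """Return (frame_count, frame_time, frames) from the MOTION block."""
    lines = text.splitlines()
    start = next(i for i, l in enumerate(lines) if l.strip() == "MOTION")
    frame_count = int(lines[start + 1].split(":")[1])
    frame_time = float(lines[start + 2].split(":")[1])
    frames = [
        [float(value) for value in l.split()]
        for l in lines[start + 3:]
        if l.strip()
    ]
    return frame_count, frame_time, frames


SAMPLE_BVH = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT LeftUpLeg
  {
    OFFSET 3.9 0.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 -18.0 0.0
    }
  }
}
MOTION
Frames: 1
Frame Time: 0.0333333
0.0 35.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
"""

skeleton = parse_bvh_hierarchy(SAMPLE_BVH)
frame_count, frame_time, frames = parse_bvh_motion(SAMPLE_BVH)
channel_total = sum(len(joint["channels"]) for joint in skeleton.values())
```

Each frame line holds one value per channel, in the order in which the channels appear in the hierarchy, so `channel_total` should equal the length of every frame. C3D, in contrast, is a binary format and needs a dedicated reader.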
NetCDF (network Common Data Form) and HDF5 are two domain-independent binary data file formats with a lot of potential in the context of motion capture.
H-anim is a standard for the representation of human figures, basically for use in video and much less for accurate biomechanical modelling.
Here are detailed VRML files of the human skeleton. They can be imported in Blender by first transforming them to AC3D format (e.g., via white_dune on Linux).