PITA Fiscal Year 2008 Projects - Information and Systems Technology

Maples: Fault-tolerant middleware and diagnostic tools for sensor networks

Principal Investigators: Priya Narasimhan, Rajeev Gandhi

When multiple heterogeneous sensors are connected for critical sensing applications (such as those envisioned for Sensor Andrew), they must communicate with each other dependably, regardless of any failures that occur and regardless of differences in the underlying platform, operating system, location, etc. The proposed Maples embedded middleware will comprise a suite of tools and libraries that mask the intricate details (such as protocol differences, byte-order differences, concurrency, asynchrony, sensor failures, lost messages, consistency, etc.) of enabling distributed sensor nodes and their hosted applications to work together seamlessly and reliably.

Much of a deployed sensor network’s administrative cost will inevitably arise from manually finding and fixing problems. To support an automated, self-managing and dependable Sensor Andrew deployment, and to reduce the administrative burden for the hosted applications, Maples aims for fault tolerance and diagnosability. Maples’ capabilities would free Sensor Andrew’s application programmers from having to worry about the impact of failures or spend time and effort determining their root causes. Maples will automatically: (1) identify the culprit sensor node or component that is the root cause of a problem, and (2) perform recovery that targets the actual source of the problem, rather than distracting programmers and administrators with potential "red herrings."

The challenges arise from the fact that there can be multiple root causes for an observed problem manifestation, and multiple (possibly distributed) observed manifestations of the same underlying root cause. Maples’ failure-diagnosis strategy will determine which system information to monitor and how often to monitor it, in order to fingerpoint the root cause of failures accurately and rapidly, with low false-positive and false-negative rates. We propose to evaluate Maples using Sensor Andrew applications such as electricity monitoring and first-responder assistance.
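As a purely illustrative sketch of what identifying a culprit node from monitored data could look like, the Python fragment below compares one metric across peer nodes and flags any node that deviates strongly from the peer median. This is not the Maples algorithm, which the abstract does not specify; the function name `suspect_nodes`, the metric (a per-node message-loss rate), the node names, and the threshold of 3.5 are all assumptions made for the example.

```python
from statistics import median


def suspect_nodes(readings, threshold=3.5):
    """Return the ids of nodes whose metric is an outlier among its peers.

    `readings` maps a (hypothetical) node id to one monitored value,
    e.g. a message-loss rate. Uses a median/MAD outlier score.
    """
    values = list(readings.values())
    med = median(values)
    # Median absolute deviation: a spread estimate that one faulty
    # node cannot easily distort.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Most peers report the same value; flag any node that differs.
        return sorted(n for n, v in readings.items() if v != med)
    # 0.6745 rescales the MAD so the score resembles a z-score
    # for normally distributed data.
    return sorted(
        n for n, v in readings.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


# Assumed sample data: message-loss rate reported by five sensor nodes.
loss_rates = {"n1": 0.01, "n2": 0.02, "n3": 0.015, "n4": 0.40, "n5": 0.012}
print(suspect_nodes(loss_rates))
```

With the sample data, only `n4` is flagged. Lowering the threshold catches more faulty nodes at the cost of more false positives, which is the trade-off the false-positive and false-negative rates above refer to.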