Towards Succinctness in Specification Mining: The Scenario-Based Case
Supporting Materials
Specification mining methods extract candidate specifications from system execution traces. A major challenge for specification mining is succinctness. That is, beyond the soundness, completeness, and scalable performance of the mining method, one wants a succinct result: one that conveys a lot of information about the system under investigation while using a short, machine- and human-readable representation.
In this paper we address the succinctness challenge in the context of scenario-based specification mining, whose target formalism is live sequence charts (LSC), an expressive extension of classical sequence diagrams. We do this by adapting three classical notions: a definition of an equivalence relation over LSCs, a definition of a redundancy and inclusion relation based on isomorphic embeddings among LSCs, and a delta-discriminative measure based on an information gain metric on a sorted set of LSCs. These are applied on top of the commonly used statistical metrics of support and confidence.
A number of case studies show the utility of our approach towards succinct mined specifications.
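As a rough illustration of the support and confidence metrics mentioned above, the following minimal sketch counts, over a set of traces, how often a premise sequence is followed by a consequence sequence. It is a deliberately simplified, trace-level approximation and not the LSC semantics used in the paper; the function names and the FTP-like event names are hypothetical.

```python
from typing import List, Sequence, Tuple


def is_subsequence(pattern: Sequence[str], trace: Sequence[str]) -> bool:
    """Return True if the events of pattern occur in trace in order (gaps allowed)."""
    remaining = iter(trace)
    # 'event in remaining' advances the iterator, so order is respected.
    return all(event in remaining for event in pattern)


def support_and_confidence(
    premise: Sequence[str],
    consequence: Sequence[str],
    traces: List[Sequence[str]],
) -> Tuple[int, float]:
    """Trace-level support and confidence of the rule premise -> consequence.

    Assumption: support is the number of traces containing premise followed
    by consequence; confidence is that number divided by the number of
    traces containing the premise at all.
    """
    full = list(premise) + list(consequence)
    premise_hits = sum(1 for t in traces if is_subsequence(premise, t))
    full_hits = sum(1 for t in traces if is_subsequence(full, t))
    confidence = full_hits / premise_hits if premise_hits else 0.0
    return full_hits, confidence


# Hypothetical traces of method-call events.
traces = [
    ["login", "list", "retrieve", "logout"],
    ["login", "logout"],
    ["login", "retrieve", "logout"],
    ["connect", "quit"],
]
support, confidence = support_and_confidence(["login"], ["retrieve", "logout"], traces)
print(support, round(confidence, 3))
```

Under these assumptions, a candidate scenario would be reported as significant only if both values exceed user-chosen thresholds; the succinctness filters described above would then be applied to the surviving candidates.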
Machine Settings:
2 GHz Intel Core 2 Duo Fujitsu laptop with 2 GB RAM
Windows Vista Business
Mining program written in C# (.NET 2.0), compiled using VS.NET 2005
Target Program:
crossFTP -- a commercial open source FTP server built on top of Apache FTP server.
List of Method Calls Traced:
Input:
Output: All Succinct Scenarios Mined (Significant, Non-Redundant, Delta-Discriminative) (11 scenarios)
Comparative Output: All Significant Symbolic Scenarios Mined (118,012 scenarios)
Technical Report: