KiviatNavigator. Diploma thesis (Diplomarbeit). Navigation of Source Code Data using Kiviat-Graphs. Roman Flückiger of Olten, Switzerland

Diploma thesis (Diplomarbeit), October 2, 2006. KiviatNavigator: Navigation of Source Code Data using Kiviat-Graphs. Roman Flückiger of Olten, Switzerland. Supervised by Prof. Dr. Harald Gall and Dr. Martin Pinzger.

Software Evolution & Architecture Lab, Department of Informatics, University of Zurich
Author: Roman Flückiger
Project period: 3 April to October 2006

Acknowledgements

First and foremost, I would like to thank Martin Pinzger for his imperturbable calm and his reassurances during the dark hours of scientific research. Next, where would I be without the tenacious companionship of Michael Würsch and Andreas Jetter through all these days? Your good spirits and helpfulness are unheard of. Thanks, you two. Special thanks to Beat Fluri for his gracious and generous support in my hour of need. Further thanks to Patrick Knab for PostgreSQL support and for constantly taking away the Eclipse book; I have my own copy now. Additional thanks to my parents who, to my constant surprise, never seem to lose faith in me. And finally, thanks to my sister, Simone, for proofreading my work.

Abstract

Source code data of large software systems tends to be very complex. Visualizing and navigating these data pools in a way that reveals specific software traits remains a challenge. In this thesis we present an exploration strategy for navigating such source code data. We generate graphical views that expose specific design aspects, such as bad smells, and hotspots in general. The approach uses sequences of such views to incrementally gather knowledge about the code in scope. This finally allows us to identify entities of questionable design. Our approach uses the measurement mapping principle combined with kiviat diagrams to visualize system entities.
We further present a prototype implementation as an Eclipse plug-in and evaluate it in a case study, analyzing parts of the Mozilla source code.

Zusammenfassung (Summary)

The source code of large software systems tends to become very complex. Visualizing and navigating this data so that specific traits of the code are highlighted remains a challenge. In this thesis we will present a strategy for exploring source code data. We will generate graphical views intended to expose specific design weaknesses, such as bad smells, as well as suspicious structures in general. Our approach uses sequences of such views to incrementally gather knowledge about the source code. This ultimately allows us to identify entities of questionable structure. Our approach uses the measurement mapping principle, combined with kiviat diagrams as a visualization of software entities. Furthermore, we will implement a prototype as an Eclipse plug-in and evaluate it, the latter by means of a case study in which we analyze part of the Mozilla source code.
Contents

1 Introduction
  Contribution
  Structure of the Thesis
2 Related Work
  Polymetric Views
  CodeCrawler
  ArchView
  Simple Hierarchical Multi-Perspective (SHriMP)
3 Approach
  Exploring Large Graphs
  Incremental Exploration and Navigation
  Polymetric Views
  Preset Views Concept
  Preset View Catalog
  Provider/Consumer View
  Roots/Leaves View
  Concluding Thoughts
4 Implementation
  Integration Requirements
  System Setup
  Kiviat Navigator Overview
  Kiviat Navigator Architecture
  KiviatContainer and KiviatContainerGenerator
  IKiviatContainerNormalizer and KiviatMaxNormalizer
  KiviatNodeRealizer
5 Mozilla Case Study
  Approach
  Investigation
  Summary
6 Conclusions
  Contribution
  Outlook
A Contents of CD-ROM

List of Figures

2.1 Up to five metric values can be mapped on a CodeCrawler node
2.2 A polymetric view in CodeCrawler (source: [LD03])
2.3 A view from the ArchView approach (source: [Pin05])
2.4 On the left, a SHriMP visualization using the fisheye distortion algorithm (source: [SWFM97]). In the center, a newer implementation of SHriMP, and on the right a view from Creole, which also uses SHriMP (source: [Chi06])
3.1 Exploration paths as sequences of single viewpoints
A sample kiviat diagram, and how the three dimensions are mapped onto it
The left diagram shows a typical data consumer, the right one a data storage object
The left diagram shows a typical consumer of functionality. The right graphic depicts a provider
The left graphic shows a consumer of data and functionality, the right one a provider
The left graphic shows most probably dead code, the right diagram an entity that is both provider and consumer at the same time
Leaves of the inheritance tree
Roots of the inheritance tree
Extreme members of the view
This chart shows how the Kiviat Navigator is embedded into the Eclipse environment and where data comes from. Components in faded grey are not currently used by our implementation, but show the envisioned goal
The HierarchyView shows all entities available for visualization
The MetricView allows selection of metrics
This is a simplified graph of the components involved in the Kiviat Navigator. The modules with thicker black borders contain multiple classes, which mostly are members of model-view-control patterns
A simplified System Hotspot View [Pin05][LD03]. The metrics are the following: (0) imagix ClMemVar, (1) imagix ClMemTyp, (2) imagix ClMemFnc, (3) imagix ClMemCl, (4) length
Provider/Consumer View. The metrics are the following: (0) in invokesnrrelsdirect, (1) in accessesnrrelsdirect, (2) out invokesnrrelsdirect, (3) out accessesnrrelsdirect
Roots/Leaves View. The metrics are the following: (0) in overridesnrrelsdirect, (1) in inheritsnrrelsdirect, (2) out overridesnrrelsdirect, (3) out inheritsnrrelsdirect

List of Tables

3.1 Table of Metrics
Table of Findings

List of Listings

4.1 IKiviatContainerGenerator
NameSpaceDecl
5.2 Two method examples from nsxmlprocessinginstruction

Chapter 1 Introduction

Software systems tend to get large and complex during their lifetime. They are subject to constant change and extension in numerous ways. In addition, people leave projects, new developers join the team, and documentation gets sloppy or is never written at all. Sooner or later, there comes a time when the people working on a software system cease to know anything about the tricks and traps hidden within the behemoth that is their work. This is where reverse engineering comes into play. How can we recover from the raw source code data what we have lost along the way? While concepts exist for systematically recovering the architecture and structure of a software system, our mindset is a bit more optimistic: we assume that unknown software systems are not entirely bad.
We will focus on how to specifically expose unfortunate structures or bad smells [FBB+99]. We are going to do this using the concept of measurement mapping as the base of our approach. This will lead to a graphical representation of source code data with so-called kiviat diagrams. How we navigate the complexity of software systems with the help of this visualization is the core subject of this thesis.

1.1 Contribution

Our goal is to propose a simple and useful way of navigating source code data using kiviat graphs as presented in the ArchView approach [Pin05]. This should simplify the analysis of large and unknown source code data and may even be a first step towards standardizing such analyses. We will use kiviat graphs to devise views that highlight specific source code aspects, such as bad smells. We will further present a strategy called incremental exploration that uses sequences of such views to gather knowledge about a target system. During the thesis a prototype Eclipse plug-in implementing these concepts will be developed. Finally, we evaluate our tool in a case study, analyzing parts of the Mozilla source code, and discuss the results.

1.2 Structure of the Thesis

Chapter 2 presents different concepts for investigating and navigating source code data, considered as related work. Chapter 3 presents our approach to the problem. It starts with our own thoughts on inspecting large graphs, which lead to the principles of incremental exploration, and ends with examples of useful view configurations and their analysis. The subsequent chapter describes the implementation of our approach, the Kiviat Navigator. Our tool is finally put to the test in the case study in chapter 5, where we apply our methods to a subset of the Mozilla web browser source code. In the final chapter we conclude the thesis and give an outlook on future work.
Chapter 2 Related Work

In this chapter we review concepts and tools that focus on layout techniques and navigation of source code data, which is a part of information visualization. Before we focus on a few closely related approaches, we do a quick sweep of the vast field of other interesting work. First of all, there are visualization techniques that make use of the third dimension. One way of using 3D is to add the dimension of time to 2D views, a concept presented in [SDB98]. Another approach is the source viewer 3D (sv3d) [MMF03], which uses an extension of SeeSoft [ESS92] to represent software systems. Then there are concepts that focus on visualizing runtime information of programs, such as the program explorer [LN95]. Finally, coming back to the frame where our thesis fits best, there are static visualizations. A taxonomy of software visualization of this kind is given by Price et al. [PBS93]. This class contains approaches like Rigi [MK88], SeeSoft [ESS92], SHriMP [SWFM97] and CodeCrawler [LD03]. We will describe two of these concepts in more detail in the next sections.

We start with an introduction to polymetric views, the visualization method that is our primary focus. Next, we briefly present two software tools that use polymetric views to inspect source code data, CodeCrawler [LD03] and ArchView [Pin05]. Finally, we present the SHriMP approach [SWFM97], along with a number of graph navigation techniques used by this approach.

2.1 Polymetric Views

Polymetric views are the starting point of this thesis. They are the basic concept our visualization techniques use to lay out source code data and enrich it with information. The concept is intended solely for object-oriented source code data. A polymetric view is a two-dimensional visualization in which nodes represent software entities and edges represent relationships between entities. Furthermore, a number of metric measurements are mapped onto nodes (and edges).
This methodology is called measurement mapping. It should fulfill the representation condition: if a number a is bigger than a number b, the graphical representation of a and b must preserve this fact [LD03]. This approach essentially uses metric visualizations to show symptoms of the underlying source code data. The following two tools are both based on this concept.

CodeCrawler

CodeCrawler is presented by Lanza et al. in [LD03]. It uses rectangles as nodes and maps up to five metrics onto a single node. Figure 2.1 shows how this is done. Both the width and the height represent a measurement, as does the color of the node body. In some configurations the location of the node in the view is used to map two additional metric values.

Figure 2.1: Up to five metric values can be mapped on a CodeCrawler node.

CodeCrawler needs three basic ingredients to generate a polymetric view: a choice of entities, a choice of metrics, and a layout. The layout determines how nodes are arranged in a view, for instance whether they should be sorted in a specific manner. The layout strategies used by CodeCrawler include tree structures and scatterplots.

Figure 2.2 shows a so-called checker distribution of nodes. The entities are sorted according to a specific metric. In this particular case the target entities are attributes. The width and height of the nodes render the number of local accesses and the number of nonlocal accesses, respectively. The color indicates the total number of accesses. In this way, attributes that are never accessed at all, and can therefore be removed, line up in the top row. Attributes that are heavily accessed are found at the bottom. In addition, attributes that get predominantly nonlocal accesses stand out as very tall nodes, which makes them candidates for accessor methods.
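The measurement mapping behind the checker view described above can be sketched in a few lines of Python. This is only an illustration and not CodeCrawler's implementation: the names `Attribute`, `Node`, `scale` and `checker_view`, the size range of 4 to 40 units, the linear scaling, and the choice to sort by total accesses are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    """Hypothetical input record: one attribute and its access counts."""
    name: str
    local_accesses: int
    nonlocal_accesses: int


@dataclass
class Node:
    """Hypothetical rectangle node carrying three mapped metrics."""
    name: str
    width: float   # rendered from the number of local accesses
    height: float  # rendered from the number of nonlocal accesses
    shade: float   # 0.0 (light) to 1.0 (dark), from the total accesses


def scale(value, max_value, lo=4.0, hi=40.0):
    """Linear, monotonic mapping: a larger value never gets a smaller size."""
    if max_value == 0:
        return lo
    return lo + (hi - lo) * value / max_value


def total(attr):
    return attr.local_accesses + attr.nonlocal_accesses


def checker_view(attributes):
    """Sort attributes by total accesses (assumed sort metric) and map metrics."""
    max_local = max((a.local_accesses for a in attributes), default=0)
    max_nonlocal = max((a.nonlocal_accesses for a in attributes), default=0)
    max_total = max((total(a) for a in attributes), default=0)
    nodes = []
    for a in sorted(attributes, key=total):
        shade = total(a) / max_total if max_total else 0.0
        nodes.append(Node(a.name,
                          scale(a.local_accesses, max_local),
                          scale(a.nonlocal_accesses, max_nonlocal),
                          shade))
    return nodes
```

Because `scale` is monotonic, the sketch respects the representation condition for each metric: an attribute with more nonlocal accesses is never drawn shorter than one with fewer. Unaccessed attributes end up first in the sorted list with the minimum size and the lightest shade.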
Figure 2.2: A polymetric view in CodeCrawler (source: [LD03])

ArchView

The ArchView approach by Pinzger [Pin05] is also based on the concept of polymetric views. The crucial difference to CodeCrawler is the graphical representation of nodes. Instead of rectangles, ArchView uses kiviat diagrams (refer to Figure 2.3 for examples of kiviat diagrams). The advantage of such a representation is the possibility to map considerably more than five metrics onto a single node simultaneously. This benefit is actually seldom used. There are cases where the metrics are all of the same kind and kiviat diagrams prove their worth. But we will see that, most of the time, views with around four metrics have stronger interpretations.

Figure 2.3: A view from the ArchView approach (source: [Pin05])

Figure 2.3 shows a detailed modification hotspots view generated by ArchView. Nodes represent files in this view. The rendered metrics are concerned with problem reports, their priority and their severity. This reveals how many problems affected a file and how severe these problems were.

ArchView introduces a number of strong changes to polymetric views as CodeCrawler uses them. Firstly, there is just one layout in ArchView. The location of the nodes is subject to the user's wishes and not bound to any metric. Moreover, the representation condition of measurement mapping is no longer fulfilled by the complete visualization, since metrics are no longer normalized within a node. This means values can no longer be related within a single node, but they can still be related between all the diagrams. We will see later what consequences this has for our visualization. On the other hand, ArchView adds code releases as a new dimension to the concept.

2.2 Simple Hierarchical Multi-Perspective (SHriMP)

An entirely different approach to reverse engineering large source code data is the SHriMP approach [SWFM97].
When graphs reach a certain size and complexity, navigation techniques become necessary to find your way around. Software systems are usually large and complex and therefore also yield complex graphs. SHriMP focuses not on symptoms but on traversing hierarchies and relationships of source code. It uses different navigation strategies to accomplish this goal. To prevent information overload from displaying a whole system at once, SHriMP uses semantic zooming [HMM00] to hide and reveal information of nodes. This zoom is not applied to the whole view at the same time. You can choose which node to enlarge and show in detail while the context of the entity remains visible. This is called fisheye distortion. SHriMP even allows you to focus on different parts of the graph at the same time, to inspect disjoint source code entities. Figure 2.4 shows different implementations of the SHriMP approach.

Figure 2.4: On the left, a SHriMP visualization using the fisheye distortion algorithm (source: [SWFM97]). In the center, a newer implementation of SHriMP, and on the right a view from Creole, which also uses SHriMP (source: [Chi06])

Chapter 3 Approach

The previous chapter gave a brief insight into some existing source code visualization techniques. From the beginning our goal was to take the ArchView [Pin05] approach as our starting point and enhance it while integrating it into the Eclipse platform. The ArchView approach focuses on visualizing source code metrics, as opposed to SHriMP [SWFM97], which basically visualizes whole tree structures and uses zooming, panning and disjoint context focusing to navigate. The ArchView stand-alone implementation is more of a static visualizer. What is left as our contribution is to find out where navigation fits into this concept. The question we asked ourselves was how to navigate graphs as large as those that software systems usually yield. We found most of the answers in the work of Lanza et al.
[LD03] and Herman et al. [HMM00].

3.1 Exploring Large Graphs

Software systems are usually very large, and a single mind most of the time has difficulty understanding one entirely. That is certainly the case when you look at the whole system at once. Even other researchers who have committed quite some time to solving this problem have not come up with a satisfactory solution. Storey et al., for instance, presented the SHriMP approach [SWFM97] with the incentive to let the user see the whole system all the time, since this increases the understanding of the whole system. They use fisheye distortion methods to let the user zoom in on specific parts of the system without losing the connections to the rest. But their approach is not safe from information overload in the visualization. Some intelligent filtering will be necessary to make this concept useful.

So here is a thought about the exploration of large spaces. Since their vastness is the biggest obstacle to understanding, why let the user see everything and burden them with the difficult task of fading out irrelevant information, mentally or with graphical support? Being able to see the whole picture all the time certainly helps in building a mental map, but since the entire space is not understood at this time, most mental associations will be useless. There are other strategies for exploring such large networks: continuously fading relevant information in, instead of fading irrelevant facts out. Let us take the World Wide Web as an example [HMM00]. It is impossible to behold the whole network at once, but day by day, by searching for specific information, a user may explore this vast network and start to build a mental map of it, which is built solely from pieces of relevant information. There is no use in knowing the entire system if all you want to know can be found with a couple of mouse clicks, just two or three hyperlinks away.

So let us imagine our target software system as a large three-dimensional cloud (refer to Figure 3.1). It is impossible to get a grasp of it all from just one viewpoint. Something will always be hidden from view or covered by some other information. A layout alone cannot overcome the challenge of revealing the structure of large spaces [HMM00]. We need some kind of navigation to travel around. The key is to look at the target system from multiple perspectives or viewpoints and thereby gain information step by step. This approach is called incremental exploration and navigation and is also presented in the survey by Herman et al. [HMM00].

3.2 Incremental Exploration and Navigation

Incremental exploration is essentially the process of gauging a system through a sequence of different viewpoints (look at Figure 3.1 to get a general idea). The main question is who chooses the next step in the path, and how. There are two possibilities for the who. It is either the user, who has complete control over the next step at all times, or the computer, which suggests the next view configuration based upon some heuristics, for example. Of course the idea that there exist specific scenarios of view sequences that lead to the detection of certain code issues is very appealing. Whether such scenarios emerge remains to be shown during the case study.

Figure 3.1: Exploration paths as sequences of single viewpoints
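The two ways of choosing the next viewpoint can be illustrated with a small Python sketch. It is not taken from the Kiviat Navigator implementation: the names `View`, `ExplorationPath` and `suggest_next`, the metric names `fan_in` and `fan_out`, and the heuristic itself (keep the entities with the largest metric sum) are hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class View:
    """One viewpoint: which entities are shown and which metrics are mapped."""
    entities: tuple
    metrics: tuple


@dataclass
class ExplorationPath:
    """An exploration path is the ordered sequence of visited viewpoints."""
    steps: list = field(default_factory=list)

    def advance(self, view):
        # The caller decides what the next view is: it may be built by the
        # user by hand or come from a heuristic such as suggest_next below.
        self.steps.append(view)
        return view


def suggest_next(view, measurements, keep=2):
    """Hypothetical heuristic: narrow the view to the `keep` entities whose
    summed metric values in the current view are largest."""
    def score(entity):
        return sum(measurements[entity].get(m, 0) for m in view.metrics)

    ranked = sorted(view.entities, key=score, reverse=True)
    return View(tuple(ranked[:keep]), view.metrics)
```

A path could then mix both modes: the first view is chosen by the user, the second is suggested by the heuristic, and the third is again a user decision that keeps the suggested entities but switches to a different metric set. Relevant entities are faded in step by step instead of showing the whole system at once.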