Commit 50be5376 authored by Kevin Moran

Initial commit for public repo

Pipeline #34203178 passed with stages
in 12 minutes and 26 seconds
<?xml version="1.0" encoding="UTF-8"?>
<classpathentry kind="src" output="target/classes" path="src/main/java">
<attribute name="optional" value="true"/>
<attribute name="maven.pomderived" value="true"/>
<classpathentry excluding="**" kind="src" output="target/classes" path="src/main/resources">
<attribute name="maven.pomderived" value="true"/>
<classpathentry kind="src" output="target/test-classes" path="src/test/java">
<attribute name="optional" value="true"/>
<attribute name="maven.pomderived" value="true"/>
<attribute name="test" value="true"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-1.8">
<attribute name="maven.pomderived" value="true"/>
<classpathentry kind="con" path="org.eclipse.m2e.MAVEN2_CLASSPATH_CONTAINER">
<attribute name="maven.pomderived" value="true"/>
<classpathentry kind="output" path="target/classes"/>
# This file copied from the GVT code.
# eclipse specific git ignore
# External tool builders
# Locally stored "Eclipse launch configurations"
# This will suppress any download for dependencies and plugins or upload messages which would clutter the console log.
# `showDateTime` will show the passed time in milliseconds. You need to specify `--batch-mode` to make this work.
MAVEN_OPTS: " -Dorg.slf4j.simpleLogger.showDateTime=true -Djava.awt.headless=true"
# As of Maven 3.3.0 instead of this you may define these options in `.mvn/maven.config` so the same config is used
# when running from the command line.
# `installAtEnd` and `deployAtEnd` are only effective with recent versions of the corresponding plugins.
MAVEN_CLI_OPTS: "--batch-mode --errors --fail-at-end --show-version -DinstallAtEnd=true -DdeployAtEnd=true"
image: tomsontom/oracle-java8-mvn:latest
- test
- build
stage: test
- 'mvn $MAVEN_CLI_OPTS test'
stage: build
- 'mvn $MAVEN_CLI_OPTS -Dmaven.test.skip=true package'
- target/GCat.jar
- libs/
- html/
\ No newline at end of file
<?xml version="1.0" encoding="UTF-8"?>
Layout Changes: These are changes where the location of a component on the screen changes, but the size of the component remains consistent.
Size Changes: These are changes in which the size of a component changes, but its location remains consistent.
Image Changes: These are changes where an Image is modified or replaced.
Image Color Changes: These are changes where the correct Image content does not change, but the colors of that image do change.
Missing/Added Component Changes: These are changes where a component from a previous version of an application is removed from a subsequent version, or where a new component is added.
Text Color Changes: These are changes where the color of text changes, but the content and font remain consistent.
Text Content Changes: These are changes where the content of the text on a screen in a previous version of an app does not match a subsequent version.
Text Font-Style Changes: These are changes where the font type of text in the previous version of a screen does not match a subsequent version.
\ No newline at end of file
This small module parses the crashscope image data, separating it out by version number, removing some duplicates (it doesn't seem to get them all), and generating a file that shows which screens precisely match each other.
Navigate to the root directory of the crash scope data.
Execute python3
import os, sys, argparse
from shutil import copyfile
from imgproc.imageset import ImageSet
from PIL import Image
# For the progress bar
import time
import progressbar
def get_filenames_of_extension(directory, extension):
    """Iterate over the filenames in a directory with a specific extension."""
    filenames = (f for f in os.listdir(directory) if f.split(".")[-1] == extension)
    bar = progressbar.ProgressBar()
    for filename in bar(filenames):
        yield filename
def remove_duplicates(directory):
    """!!NOT ROBUST: it removes SOME duplicates, but some are not detected because
    of minute, imperceptible differences."""
    image_set = ImageSet()
    for filename in get_filenames_of_extension(directory, "png"):
        image = Image.open(directory + os.path.sep + filename)
        if not image_set.add(image):
            # Delete the image if it is a duplicate.
            os.remove(directory + os.path.sep + filename)
    print("| Operation complete. %d unique images identified." % len(image_set))
def match_screens(dir1, dir2):
    """Looks for IDENTICAL matches between two directories. Does not catch them all
    because of minute, imperceptible differences in the images (I think that's the reason)."""
    matches = []
    print("Building reference set...")
    image_set = ImageSet()
    for filename in get_filenames_of_extension(dir1, "png"):
        image = Image.open(dir1 + "/" + filename)
        # Remember the source filename so a match can be reported by name.
        image.info["filename"] = filename
        image_set.add(image)
    print("Checking new values...")
    for filename in get_filenames_of_extension(dir2, "png"):
        image = Image.open(dir2 + "/" + filename)
        if image in image_set:
            match_fn = image_set.find_match(image)
            if match_fn:
                # Assumed: record the pair as (dir1 filename, dir2 filename).
                matches.append((match_fn.info["filename"], filename))
    print("%d Matches found." % len(matches))
    return matches
def extract_version_numbers(directory):
    """Parses out the version numbers of the files in a directory."""
    versions = []
    for filename in os.listdir(directory):
        splt = filename.split("_")
        if len(splt) > 1:
            # The version is the second-to-last underscore-separated field.
            version = splt[-2]
            if version not in versions:
                versions.append(version)
    if len(versions) != 2:
        raise RuntimeError("There are not two versions present: %s" % str(versions))
    return versions
def make_dirs(directory, names):
    for name in names:
        if not os.path.exists(directory + os.path.sep + name):
            os.makedirs(directory + os.path.sep + name)
def sort_by_version(directory):
    """Iterates over the files in a directory, sorting the files into two new
    directories that correspond to their version numbers."""
    versions = extract_version_numbers(directory)
    make_dirs(directory, versions)
    for filename in get_filenames_of_extension(directory, "png"):
        if versions[0] in filename:
            copyfile(os.path.join(directory, filename),
                     os.path.join(directory, versions[0], filename))
        elif versions[1] in filename:
            copyfile(os.path.join(directory, filename),
                     os.path.join(directory, versions[1], filename))
        else:
            msg = "File %s does not contain a version number: %s" % (filename, str(versions))
            raise RuntimeError(msg)
def retrieve_xml(img_directory, xml_directory):
    """Determines the step number and version number of all the files in a directory,
    and copies the corresponding xml files into that directory."""
    version = img_directory.split(os.path.sep)[-1]
    step_numbers = set()
    for f in get_filenames_of_extension(img_directory, "png"):
        step_number = f.split("gnucash")[-1].split(".")[0]
        step_numbers.add(step_number)
    for f in get_filenames_of_extension(xml_directory, "xml"):
        if version in f and f.split("-")[-1].split(".")[0] in step_numbers:
            copyfile(os.path.join(xml_directory, f), os.path.join(img_directory, f))
from PIL import Image
from PIL import ImageChops
# import imagehash


class ImageSet:
    """Class that hashes images in order to eliminate duplicates."""

    def __init__(self):
        # Maps a hash of the pixel data to the list of images with that hash.
        self.__images = dict()
        self.__size = 0

    def __len__(self):
        return self.__size

    def __contains__(self, img):
        return self.find_match(img) is not None

    def images_are_equal(self, image1, image2):
        # This is used instead of the == operator, because == checks metainformation
        # in addition to graphical information, and we only care about graphical information.
        return ImageChops.difference(image1, image2).getbbox() is None

    def _process_image(self, image):
        """Images need to be cropped to remove the notification bar."""
        width, height = image.size
        cropped_image = image.crop((0, 50, width, height))  # For ignored area
        return cropped_image

    def add(self, img):
        """Attempts to add an image to the set. Returns True if successful,
        returns False if an equal image is already present."""
        # hsh = imagehash.phash(img)
        img = self._process_image(img)
        hsh = hash(img.tobytes())
        bucket = self.__images.setdefault(hsh, [])
        for image in bucket:
            if self.images_are_equal(image, img):
                return False
        # Append instead of overwriting so a hash collision keeps both images.
        bucket.append(img)
        self.__size += 1
        return True

    def get_images(self):
        imgs = []
        for key in self.__images:
            imgs += self.__images[key]
        return imgs

    def find_match(self, img):
        """Attempts to find a matching image in the set. If none is found, return None."""
        # hsh = imagehash.phash(img)
        img = self._process_image(img)
        hsh = hash(img.tobytes())
        for image in self.__images.get(hsh, []):
            if self.images_are_equal(image, img):
                return image
        return None
#! /usr/bin/env python3
from imgproc.filetools import *
data_root = os.getcwd()
screens = data_root + os.path.sep + "screenshots"
# Extracts the version numbers from the filenames of the images in screenshots.
print("Extracting version numbers...")
versions = extract_version_numbers(screens)
# Creates two new folders in the screenshots folder corresponding to the version numbers,
# and sorts all the screens in screenshots into their respective folder.
print("Sorting by versions...")
sort_by_version(screens)
# Paths to the new folders.
version0_path = screens + os.path.sep + versions[0]
version1_path = screens + os.path.sep + versions[1]
# Remove duplicates from both new folders.
print("Removing duplicates from version %s..." % versions[0])
remove_duplicates(version0_path)
print("Removing duplicates from version %s..." % versions[1])
remove_duplicates(version1_path)
# Match the screens between versions that are identical matches.
print("Attempting to match...")
matches = match_screens(version0_path, version1_path)
#print(version0_path, data_root)
#retrieve_xml(version0_path, data_root)
#retrieve_xml(version1_path, data_root)
# Write all the matches (represented as a list of tuples of filenames) to a file.
with open("matches.txt", "w") as output:
    string = ""
    for match in matches:
        string += ",".join(match) + "\n"
    output.write(string)
There are two modules here:
This runs the bipartite matching algorithm. To use it, change the variables SCREEN_PATH1 and SCREEN_PATH2 to the directories that contain the images that you want to match.
This is a convenience script that will take a csv file with matchings of the form
and, given the locations of these screens and the file, will concatenate the images of each matching into one image per pair, making it easier to visually verify the accuracy of the algorithm.
To use it, change the variable MATCH_CSV to the path to the file with the matchings, and SCREENS1 and SCREENS2 to the locations of the images. Note that the version on the left side of the "," in the csv file must be the same as the version corresponding to SCREENS1.
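As a rough illustration of what the bipartite matching step computes, the sketch below pairs two small lists of integer "hashes" so that the total Hamming distance is minimal. It is not the module's code: the hash values are made up, and a brute-force search over permutations stands in for the optimized assignment solver, so it is only practical for a handful of screens.

```python
from itertools import permutations


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def best_assignment(hashes1, hashes2):
    """Brute-force minimum-cost one-to-one matching (tiny inputs only).

    Returns (pairs, total_cost) where pairs is a list of (index1, index2).
    Assumes len(hashes1) <= len(hashes2).
    """
    best_cost, best_perm = None, None
    for perm in permutations(range(len(hashes2)), len(hashes1)):
        cost = sum(hamming(hashes1[i], hashes2[j]) for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm)), best_cost


# Hypothetical 8-bit hashes for three screens in each version.
old = [0b11110000, 0b00001111, 0b10101010]
new = [0b00001110, 0b10101011, 0b11110001]

pairs, total = best_assignment(old, new)
print(pairs, total)  # → [(0, 2), (1, 0), (2, 1)] 3
```

Each old screen is paired with the new screen whose hash differs in only one bit, which is the behaviour one would hope for when screens change only slightly between versions.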
import os
from PIL import Image
# Change these values
MATCH_CSV = "output.txt"
SCREENS1 = "/home/jhoskins/projs/lsh/cs-data/screenshots/1.1.10"
SCREENS2 = "/home/jhoskins/projs/lsh/cs-data/screenshots/1.1.13"
# Iterate over the csv of form oldversionscreen,newversionscreen
with open(MATCH_CSV) as f:
    for i, line in enumerate(f):
        file1, file2 = line.split(",")
        # Open the images corresponding to the match
        image1 = Image.open(SCREENS1 + os.path.sep + file1.strip())
        image2 = Image.open(SCREENS2 + os.path.sep + file2.strip())
        width1, height1 = image1.size
        width2, height2 = image2.size
        # Create a new image in the working directory, concatenating both images horizontally.
        new = Image.new("RGB", (width1 + width2, max(height1, height2)))
        new.paste(im=image1, box=(0, 0))
        new.paste(im=image2, box=(width1, 0))
        new.save("new" + str(i) + ".png")
from PIL import Image
from scipy.optimize import linear_sum_assignment
import imagehash
import numpy as np
import os
# Change these values
SCREEN_PATH1 = "/home/jhoskins/projs/lsh/cs-data/screenshots/1.1.10"
SCREEN_PATH2 = "/home/jhoskins/projs/lsh/cs-data/screenshots/1.1.13"
# These become the dimensions of the cost matrix
screen_count1 = len(list(os.listdir(SCREEN_PATH1)))
screen_count2 = len(list(os.listdir(SCREEN_PATH2)))
# Build the cost matrix
cost = np.zeros((screen_count1, screen_count2), dtype=int)
for i, filename1 in enumerate(os.listdir(SCREEN_PATH1)):
    for j, filename2 in enumerate(os.listdir(SCREEN_PATH2)):
        hsh1 = imagehash.dhash(Image.open(SCREEN_PATH1 + os.path.sep + filename1))
        hsh2 = imagehash.dhash(Image.open(SCREEN_PATH2 + os.path.sep + filename2))
        # Subtracting two image hashes gives their Hamming distance.
        hamming_distance = int(hsh1 - hsh2)
        cost[i, j] = hamming_distance
# Run the optimization algorithm
row_ind, col_ind = linear_sum_assignment(cost)
# Turn the mapping of row indices -> col indices into a mapping of
# 1.1.10 filenames -> 1.1.13 filenames
files1 = list(os.listdir(SCREEN_PATH1))
files2 = list(os.listdir(SCREEN_PATH2))
matching = []
for x, y in zip(row_ind, col_ind):
    matching.append((files1[x], files2[y]))
# Write the matches to a file
with open("matches.txt", "w") as f:
    s = ""
    for match in matching:
        s += ",".join(match) + "\n"
    f.write(s)
# GUI Change Analysis Tool (G-CAT)
## Project Summary
Mobile applications evolve at a rapid pace. During the development lifecycle of a mobile app, new features are implemented based on user demand, fixes are applied to existing bugs, and underlying platforms and APIs are updated, which all drive the evolution of an underlying codebase. To support this rapid development process, mobile software engineers need automated support for documenting and understanding the changes an app undergoes.
Because mobile applications are heavily GUI-driven, much of the functionality is tied to code related to the user interface. Therefore, changes to the user interface are of paramount importance for developers to understand and document in order to maintain a detailed working knowledge of an evolving codebase.
The purpose of this project is to develop a system to automatically detect, summarize, and document GUI-changes in mobile apps. This project will require the development and application of a differencing algorithm for attributed trees representing the hierarchical structure of mobile GUIs combined with computer vision techniques for detecting style and color differences.
The output of the tool will be detailed release notes that summarize the changes of a mobile GUI between two different releases or versions. Ideally the end product could be integrated as a plugin to an existing issue tracker (GitHub, GitLab).
## Use Cases
This tool is meant for comparing GUI changes between two commits. It can be used to:
* Optimize front-end development when working in a large group of developers
* Simplify GUI backtracking by highlighting where changes have occurred
* Quantify GUI changes for client approval and comparison
* Allow users to closely model another application's GUI
## Quick Start
### Prerequisites
GCAT takes two screenshots and two XML uiautomator dumps as input.
We recommend using the Android Debug Bridge (ADB) to do this.
* Download and install ADB
### Installation
* Download
* Unzip somewhere on your computer.
* Depending on your operating system, you may need to give the following files permission to run:
* GCAT/libs/pid-linux/perceptualdiff
* GCAT/libs/pid-windows/perceptualdiff.exe
* GCAT/libs/pid-mac/perceptualdiff
* Navigate to the newly created GCAT folder.
### Generating Input
* Connect to the Android instance running your app using adb.
To get screenshots of the two GUIs you want to compare:
adb shell screencap /sdcard/screenshot.png
adb pull /sdcard/screenshot.png guiscreenshot1.png
To get the uiautomator dumps of the two GUIs you want to compare:
adb shell uiautomator dump dump1.xml
Run these commands twice total, once when the Android instance is open to each GUI you want to compare.
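If you prefer to script the capture, a minimal sketch along the following lines may help. It only wraps the adb commands listed above; the helper names `capture_commands` and `capture`, and the numbered output filenames for the second capture, are assumptions for illustration and are not part of GCAT.

```python
import subprocess


def capture_commands(index):
    """Return the adb command lines from the steps above for capture number `index`."""
    return [
        ["adb", "shell", "screencap", "/sdcard/screenshot.png"],
        ["adb", "pull", "/sdcard/screenshot.png", "guiscreenshot%d.png" % index],
        ["adb", "shell", "uiautomator", "dump", "dump%d.xml" % index],
    ]


def capture(index, dry_run=True):
    """Print the commands (dry run) or execute them against the connected device."""
    for cmd in capture_commands(index):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if adb reports a failure.
            subprocess.run(cmd, check=True)


# Dry run: show what would be executed for the first GUI.
capture(1)
```

Call `capture(1, dry_run=False)` with the first GUI on screen, then `capture(2, dry_run=False)` with the second.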
### Running GCAT
* Run the following command in the GCAT folder, making sure to use __absolute paths__:
java -jar GCAT.jar <screenshotPath1> <xmlDumpPath1> <screenshotPath2> <xmlDumpPath2>
GCAT will be run with default settings. A summary and itemized list of changes will be output to the screen, and a full HTML report of the changes will be generated in the newly created folder, Outputs/html/Full-Report.html.
Open this file in your web browser to see the results of the GUI differencing algorithm.
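Because GCAT expects absolute paths, it can be convenient to build the command from whatever paths you have at hand. The helper below is a hypothetical sketch (the name `gcat_command` is not part of GCAT); it only resolves the four inputs to absolute paths and assembles the command shown above.

```python
import os


def gcat_command(screenshot1, xml1, screenshot2, xml2, jar="GCAT.jar"):
    """Assemble the GCAT command line with every input path made absolute."""
    paths = [os.path.abspath(p) for p in (screenshot1, xml1, screenshot2, xml2)]
    return ["java", "-jar", jar] + paths


cmd = gcat_command("guiscreenshot1.png", "dump1.xml",
                   "guiscreenshot2.png", "dump2.xml")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside the GCAT folder.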
## Usage
To streamline input for the command line, drag and drop the appropriate
files into the command line window. This should copy the absolute path.
The input files required are:
* New commit png file containing screenshot of entire screen - referenced in instructions as the absolute path imgPath1
* New commit xml file generated using UiAutomator - referenced in instructions as absolute path uidumpPath1
* Old commit png file containing screenshot of entire screen - referenced in instructions as absolute path imgPath2
* Old commit xml file generated using UiAutomator - referenced in instructions as absolute path uidumpPath2
For example files for each input see the following:
## Installation
1. Download from the root directory. Unzip it to where you keep your applications.
2. Using the command line, navigate to the new folder, and run:
java -jar GCAT.jar [optionalParameterTags] <imgPath1> <uidumpPath1> <imgPath2> <uidumpPath2>
__It is important that each path is absolute.__
The following tags can be added to your shell command to modify the output as follows:
| Command | Parameter | Default | Effect |
|---|---|---| --- |
|```--htmlreport```| None | None | Opens a visual summary of the GCAT output automatically in your default browser when the analysis is complete. |
|```--violationthreshold```| Integer | 15 | Sets the value that GCAT will use to determine when a visual difference should be counted as an image difference. |
|```--imgDiffthreshold```| Integer | 20 | Sets the value that GCAT will use to determine when objects have moved versus when they have been deleted. |
|```--ignoredareas```| String of the form "x,y,w,h:x2,y2,w2,h2", where x is the x coordinate of the area, y is the y coordinate of the area, w is the width, and h is the height. Areas are separated by a colon, ":" | "0,0,1440,100:0,2372,1440,188" | Sets areas on the screenshot that GCAT will ignore when scanning for changes. |
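To make the `--ignoredareas` format concrete, here is a small sketch that parses such a string into rectangles and checks whether a pixel falls inside one. It illustrates the string format only and is not GCAT's own parser; the function names and the treatment of the right and bottom edges as exclusive are assumptions.

```python
def parse_ignored_areas(spec):
    """Parse "x,y,w,h:x2,y2,w2,h2" into a list of (x, y, w, h) integer tuples."""
    areas = []
    for chunk in spec.split(":"):
        parts = [int(p) for p in chunk.split(",")]
        if len(parts) != 4:
            raise ValueError("Expected x,y,w,h but got: %r" % chunk)
        areas.append(tuple(parts))
    return areas


def is_ignored(areas, px, py):
    """True if pixel (px, py) lies inside any area (right/bottom edges exclusive)."""
    return any(x <= px < x + w and y <= py < y + h for x, y, w, h in areas)


default_areas = parse_ignored_areas("0,0,1440,100:0,2372,1440,188")
print(default_areas)  # → [(0, 0, 1440, 100), (0, 2372, 1440, 188)]
```

With the default value, the two rectangles cover a strip at the top and a strip at the bottom of a 1440-pixel-wide screenshot.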
## Output
If the tool is run multiple times, the output files are overwritten with the most recent input parameters.
An html page is created with:
* A natural language summary of the changes between the two GUIs
* Detailed information about each change
* Links to view each GUI hierarchy, as well as their maximum common tree.
The directory containing all HTML files can be found at the absolute path: Android-Summarizing-GUI-Changes/Code/guigit/html
* ../commonTree.htm - displays a visual tree hierarchy of the maximum spanning common tree
* ../tree.html - displays the tree hierarchies generated by each commit input
* ../summaryTemplateDefault.html - summary of GUI changes in natural language and detailed changes with images
* ../Landing Page/landingPage.html - navigation between separate HTML files
The tree.html hierarchy is color-coded according to the following schema in order to highlight differences:
* White = Nodes in the tree are exactly the same
* Grey = Layout changes near root of trees
* Red = Tree nodes are different in value or location in the tree
## Our Team
### Members
* Shuning Chen
* Louisa Doyle
* Claudia Estes
* John Hoskins
* George Purnell
### Team Leaders
* Cody
* Kevin
\ No newline at end of file
<html lang="en" class="no-js"><head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>GCAT - GUI Change Analysis Tool</title>
<link rel="stylesheet" type="text/css" href="res/css/normalize.css">
<link rel="stylesheet" type="text/css" href="res/css/landingPage.css">
<link href="|Raleway" rel="stylesheet">
<div class="container">
<img src="../html/res/images/whitegCatLogo.png"/>