
Test Runner design

This description starts off by assuming a new user, with no prior test data on their system, just to make it clear what the starting state is. It's a blank slate.

Startup and initial presentation

The user starts up the application. The application immediately starts unpacking its internal test archive, showing a modal dialog while that's happening and preventing anything else from being done until it's finished. (A sketch of this unpacking step appears after the note below.)

  • Future enhancement: make it contact SourceForge to check for the latest archive of test cases, and offer to download an update if there is one.
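
Here is a minimal sketch, in Java, of what the unpacking step could look like, assuming the test cases ship as a single zip resource bundled inside the application jar. The resource name "/test-cases.zip" and the class name are illustrative assumptions, not the actual implementation.

    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;

    public class ArchiveUnpacker
    {
        /** Unpacks the bundled test-case archive into destDir. */
        public static void unpack(Path destDir) throws IOException
        {
            InputStream in = ArchiveUnpacker.class.getResourceAsStream("/test-cases.zip");
            if (in == null)
                throw new IOException("Bundled test archive not found.");
            try (ZipInputStream zip = new ZipInputStream(in))
            {
                Path dest = destDir.normalize();
                ZipEntry entry;
                while ((entry = zip.getNextEntry()) != null)
                {
                    Path target = dest.resolve(entry.getName()).normalize();
                    if (!target.startsWith(dest))
                        continue;                 // skip unsafe ("zip-slip") entry paths
                    if (entry.isDirectory())
                        Files.createDirectories(target);
                    else
                    {
                        Files.createDirectories(target.getParent());
                        Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }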

Version 1 ("wrapper oriented") — see below for revised version

The first time, the user will not have any wrapper configurations defined. The app will immediately bring up the preference dialog so that the user can define one.

  • If the user clicks Save, the app checks whether the currently-selected wrapper definition is complete and gives the user feedback if it is not. (E.g.: if the program identified as the wrapper is not executable, or does not exist, or the user left out the output directory, or whatever.) Note: it only checks the currently-selected wrapper definition; if the user clicked Add and added a bunch of wrapper definitions all at once, it doesn't validate the other definitions, only the definition that is selected at the time the user clicks Save. (The dialog will have a checkbox for "proceed anyway".)
  • If the user clicks Cancel at this point (again, this is the first time through), none of the changes will be saved, which means that there will not be any wrappers defined. This will make the app complain to the user that no wrapper is defined and therefore that running tests is impossible. The dialog will have a checkbox for "proceed anyway" in case the user wants to go on, accepting the risks.
    • (LS) Perhaps there could be an option here for the Sarah-like user that would say something like 'import results', that would then take them to a file dialog that searched for the directory where the results were stored?
    • (MH) See below for addendum.

After this, the user will be faced with the main application window. There will be a list of test cases along the left-hand side, and the graph windows will be blank until the user clicks on a case in the list. If they do click on a case, then until there are results for that case, they will see only a plot of the expected results in the upper half of the plot panels.

(MH) Other idea regarding Save & Cancel above: what if selecting no wrapper at all were an option? Having no wrapper would then mean "view only". The buttons and menu options for running could be disabled when no wrapper is selected. The pull-down menu in the toolbar could have the words "-no wrapper-" written on it when no wrapper is selected. The wrapper preference pane would have a special predefined item in the list that could not be deleted and would always be an option (in fact, it could replace the current "newWrapper" item that is put in the list when the user first starts the app). The one thing I'm not sure about is how to allow the user to select the results to be viewed. Maybe the predefined "-no wrapper-" placeholder item in the list of wrappers could allow the user to fill in some of the fields, like the output directory, and not the rest, as a way of indicating which set of results should be presented? And if they selected a directory that had prior results, then this would have the effect of "importing" those results, as described by LS above.

(SK) Would it be possible to have a special class of configuration that allowed the user to create a non-wrapping wrapper, so that the actual configuration of the wrapper need only specify a name and an output directory? Maybe a check box on the configuration panel that specifies "view results only", or some such. When it is checked, the fields other than the output directory are grayed out, and the checks that are done when saving only look at the output directory. When done and faced with the main application window, there would need to be a "Compare results" button to press instead of "Run".

Version 2 ("software oriented")

The system will not immediately bring up the preference dialog. Instead, there will be a predefined special item that is always present: "—no software—". Once the initial unpacking of the archive is done, the user will be presented with the normal main screen of the SBML Test Runner. They will be able to browse the list of test cases, but until they configure a software application (see next section), they will not be able to do much else.

Configuring a software tool whose results are to be compared

The preference dialog will allow users to define how to obtain the results for a given software tool. Users can either (1) configure a wrapper program that runs a tool, (2) point the SBML Test Runner at a directory containing existing results that should be read without going through an intermediate wrapper, or (3) define nothing at all: no wrapper and no results.

  1. Case of a wrapper:
    • If the user clicks Save, the app checks whether the currently-selected wrapper definition is complete and gives the user feedback if it is not. (E.g.: if the program identified as the wrapper is not executable, or does not exist, or the user left out the output directory, or whatever.) Note: it only checks the currently-selected wrapper definition; if the user clicked Add and added a bunch of wrapper definitions all at once, it doesn't validate the other definitions, only the definition that is selected at the time the user clicks Save. (The dialog will have a checkbox for "proceed anyway".) A sketch of this validation check appears after this list.
    • If the user clicks Cancel at this point (again, this is the first time through), none of the changes will be saved, which means that there will not be any wrapper or directory of results defined. There will, however, remain the predefined special item "—no software—", which simply allows the user to view the test cases provided by the SBML Test Suite.
  2. Case of an externally-generated set of results:
    • A checkbox in the preference dialog will let the user indicate that the set of results already exists, and that no wrapper needs to be run. When this checkbox is checked, certain fields in the wrapper definition will be grayed out, and the user will basically only be able to indicate the output directory and the unsupported tags.
  3. Case of nothing defined:
    • The wrapper preference pane will always have a special predefined item, "—no software—", in the list that cannot be deleted and is always a selectable option. (It will replace the current "newWrapper" item that is put in the list when the user first starts the app). This item will be a no-op, with no software and no results associated with it. When selected, the user will be able to browse the test cases and expected results plots in the main panel, but they won't be able to run tools or compare results.
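
To make the validation step above concrete, here is a minimal sketch of the kind of completeness check the Save button could perform. The WrapperDefinition class and its fields are hypothetical; they stand in for whatever the preference dialog actually stores.

    import java.io.File;

    /** Hypothetical holder for the fields of one wrapper definition. */
    class WrapperDefinition
    {
        String name;             // e.g., "—no software—" for the predefined item
        String wrapperPath;      // path to the wrapper executable (may be empty)
        String outputDirectory;  // where the wrapper writes its .csv results
        boolean viewOnly;        // true for "results already exist" configurations

        /**
         * Returns a problem description, or null if the definition is usable.
         * In view-only mode only the output directory matters; otherwise the
         * wrapper program must also exist and be executable.
         */
        String validate()
        {
            if (outputDirectory == null || outputDirectory.isEmpty())
                return "No output directory has been specified.";
            if (viewOnly)
                return null;                        // nothing else to check
            File wrapper = new File(wrapperPath == null ? "" : wrapperPath);
            if (!wrapper.exists())
                return "The wrapper program does not exist.";
            if (!wrapper.canExecute())
                return "The wrapper program is not executable.";
            return null;
        }
    }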

The user can go to the preferences dialog and change settings at any time. (Note from MH: I propose that we prefill the output directory with a path to a directory suffixed with the name of the wrapper, so that the output produced by each wrapper is kept separate from other wrappers' output. This path could be located in the user's private test suite directory, so for example, ~/.testsuite/wrappername/ on unix-like systems.)
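
A minimal sketch of that prefill rule, assuming the proposed ~/.testsuite/wrappername/ layout (the class and method names are illustrative):

    import java.nio.file.Path;
    import java.nio.file.Paths;

    class DefaultPaths
    {
        /**
         * Default per-wrapper output directory, e.g., ~/.testsuite/COPASI/ on
         * unix-like systems, so each wrapper's output is kept separate.
         */
        static Path defaultOutputDir(String wrapperName)
        {
            return Paths.get(System.getProperty("user.home"), ".testsuite", wrapperName);
        }
    }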

  • Changing the output directory: If the user points the definition at a directory that already contains results, this would have the effect of "importing" those results, as described above.
  • Changing the wrapper definition: If the user changes the wrapper definition, this raises the question of what to do with the results attributed to that particular wrapper. If the user changes the unsupported tags list, the system should ask whether the results for cases that involve the changed tags should be invalidated (a sketch of this invalidation step appears after this list). If the user changes the path to the wrapper, the system should ask whether all results should be invalidated, with a checkbox letting the user suppress the question in the future.
  • Deleting a wrapper definition: The app will ask the user whether the results associated with that wrapper should be deleted as well.
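
To illustrate the tag-based invalidation mentioned above, here is a minimal sketch; the data structures (a map from case number to its tags, and a map from case number to its stored result) are hypothetical stand-ins for whatever the Runner actually caches.

    import java.util.Map;
    import java.util.Set;

    class ResultCache
    {
        /**
         * Invalidate cached results after the "unsupported tags" list changes:
         * any case whose tags intersect the changed tags has its stored
         * result discarded, forcing a rerun/recompare later.
         */
        static void invalidateForChangedTags(Map<String, Set<String>> caseTags,
                                             Map<String, String> results,
                                             Set<String> changedTags)
        {
            for (Map.Entry<String, Set<String>> e : caseTags.entrySet())
            {
                boolean affected = e.getValue().stream().anyMatch(changedTags::contains);
                if (affected)
                    results.remove(e.getKey());
            }
        }
    }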

Selecting which cases are run

It is not currently the case, but I propose that until we have the ability to simultaneously show multiple SBML levels/versions, we add a pull-down menu in the menu bar to allow the user to select which level/version of test cases is used. An option in this pull-down list would also be "Highest".

Normal operations

Menus are provided for all operations. In addition, certain common operations are accessible via buttons in the toolbar. In what follows, the buttons are given names, but the actual button shapes will be decided later, after we figure out the behaviors we want. Proposed icons are from http://glyphicons.com/

  • Run button: this is the most basic function. It executes the wrapper; when the wrapper is done, it looks for the output .csv file, compares the values in that file with the expected values, stores the result, and changes the color of the test case to indicate success/failure/something else. (A sketch of the comparison step appears after this list.) There are actually two sub-modes of this button:
    • If there are no cases selected in the list of cases, it runs starting from 00001.
      • (LS) Isn't there always at least one case selected? I would assume 'run' would run that selected case only. 'run all from 00001' seems like a different situation. Perhaps all we need is a 'select all' button, and then this becomes moot?
      • (MH) OK, maybe I'm trying to overload this button. Maybe it should always, and only, be "run selected". If there's one selected, that's the one. If there's more than one selected, they all get run. So, basically, we'd only have the 2nd case described next?
    • If there are cases selected in the list of cases, it runs only those particular cases. If the cases have results already, it will rerun them.
      • (LS) This is perfect.
      • (MH) So supposing that the function of the run button is to run the selected case(s), and that we add a Select all button, would it then make sense that to run all cases, one would first click the Select all button followed by Run? This would be 2 steps, unlike my crappy overloaded current version, but given what you guys are saying about how often one actually would want to run all, maybe that's reasonable?
      • (LS) I think that would work, yes. There would also continue to be a 'run all' option in the menu, for people that prefer that. Any other opinions?
    • Proposed icon: (current 'play' icon) (MH) (LS)
  • Select all button (LS): selects all tests in the window (i.e. if the list has been filtered, it only selects the filtered ones.)
    • Proposed icons:
      • 16,5 (rounded doubled square) (MH) (LS)
      • 14,2 (drawer with plus) (LS) (?)
      • 14,5 (drawer with arrow) (LS) (?)
      • 22,6 (opposite-facing arrows) (LS) (?)
      • Hmm, the drawers don't quite do it for me... How about 32,10 (multiple squares)? (MH) (LS) - sure, that works.
      • Also, a google image search for 'select all icon' gave me a stacked box with a checkmark and this option from Office.
      • Also also, Word has a 'select all' icon that looks like a bunch of selected text. Perhaps we could create something that looked like a bunch of selected tests?
    • There's no good place for a 'select all' menu item (though maybe we could find/create one?), but 'control-A' should also work to select all the tests (and could be mentioned in the pop-up text for the icon, to teach people about it?).
  • Filter button (LS): This is a menu item, but it's very commonly used, and could stand to be upgraded to a button.
    • There are actually a few things such a button could do. It could bring up either the current 'filter by tags' or 'exclude by tags' dialogs (or filter by range, but I think tags are more helpful). I actually would like to see those two options combined into a single 'filter by tags' window, where you say "show me all the tests that have tags (X or Y or Z) but that don't have (P or Q or R)". (A sketch of this combined filter appears after this list.)
    • Proposed icons:
      • 7,7 (multiple tags) (MH) slight disinclination, but it would work (LS)
      • 33,1 (funnel) (MH) slight preference (LS)
  • Diagnose (maybe a button, maybe not) (LS): Something that looks at all the tests that have failed, compares that list with the list of tests that have succeeded, and finds the most likely set of tags (or combinations of tags?) that cause the failures. This is for people looking for why their tool fails on some tests. (This feature can wait for v2)
    • Proposed icons:
      • 30,9 (swiss medical plus) (LS)
  • Upload (future version, maybe a button, maybe not): Something that lets the user upload the results to the SBML Test Suite Database.
    • Proposed icon:
      • 37,4 (upload to cloud) (MH) (LS)
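
Below is a minimal sketch of the Run button's compare step described above: read the wrapper's output .csv and the expected-results .csv, then check every value. The point-by-point pass criterion shown (|actual - expected| <= absTol + relTol * |expected|) is the tolerance rule described in the test suite documentation; the CSV handling and class names here are simplified assumptions.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.List;

    class ResultComparator
    {
        /** Returns true if every value is within the case's tolerances. */
        static boolean resultsMatch(Path actualCsv, Path expectedCsv,
                                    double absTol, double relTol) throws IOException
        {
            List<double[]> actual = readCsv(actualCsv);
            List<double[]> expected = readCsv(expectedCsv);
            if (actual.size() != expected.size())
                return false;
            for (int row = 0; row < expected.size(); row++)
            {
                double[] a = actual.get(row), e = expected.get(row);
                if (a.length != e.length)
                    return false;
                for (int col = 0; col < e.length; col++)
                    if (Math.abs(a[col] - e[col]) > absTol + relTol * Math.abs(e[col]))
                        return false;
            }
            return true;
        }

        /** Reads a simple numerical .csv file, skipping the header row. */
        private static List<double[]> readCsv(Path file) throws IOException
        {
            List<double[]> rows = new ArrayList<>();
            try (BufferedReader r = Files.newBufferedReader(file))
            {
                String line = r.readLine();           // skip the header row
                while ((line = r.readLine()) != null)
                {
                    String[] fields = line.split(",");
                    double[] values = new double[fields.length];
                    for (int i = 0; i < fields.length; i++)
                        values[i] = Double.parseDouble(fields[i].trim());
                    rows.add(values);
                }
            }
            return rows;
        }
    }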

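Likewise, here is a minimal sketch of the combined include/exclude tag filter proposed for the Filter button; the data structures are again hypothetical.

    import java.util.Set;

    class TagFilter
    {
        /**
         * Keep a test case if it has at least one of the included tags (or no
         * include list is given) and none of the excluded tags.
         */
        static boolean matches(Set<String> caseTags,
                               Set<String> includeTags, Set<String> excludeTags)
        {
            boolean hasIncluded = includeTags.isEmpty()
                || includeTags.stream().anyMatch(caseTags::contains);
            boolean hasExcluded = excludeTags.stream().anyMatch(caseTags::contains);
            return hasIncluded && !hasExcluded;
        }
    }
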
Here is a mockup of how the revised interface might look, based on my (MH) personal preferences among the icons above: [mockup image]

This looks good; my only suggestion is to put 'select all' to the left of 'play', so that if you want to run all, you're clicking from left to right. --LS

  • (MH) I thought of that too and did it that way originally, but then I thought that the operation of selecting everything would be less frequent than selecting a few cases and hitting 'run'. Also, when you first read it, seeing 'run' might be more suggestive. But I'm not strongly of a mind either way.
  • (LS) I guess we could try both and see if one makes more sense 'live'?
