High-level Science Requirements
This page lists high-level science requirements for a future astronomical software environment for data analysis. They are currently under discussion. You can either add your comment at the bottom of the page or use the fase@eso.org e-mail list to discuss specific topics. Once these top-level requirements have stabilized, we intend to add more detailed requirements at the next lower level.
- Installation and running:
- the environment must be easy to install on typical systems used by astronomers (e.g. POSIX compatible)
- standard procedures for easy distribution of patches and updates must be provided
- standards for documentation and help information must be defined
- the standard system must be freely available with no license fees
- it must be possible to use the system on standard desktop systems
- all code in the standard system must be available as open source
- it must be possible to add compiled scientific code for data analysis using standard languages
- the languages FORTRAN, C, C++ and Java should be supported
- it must be as easy as possible for astronomers with no special computer science background to add new application code
- both environment and scientific application code must be under revision control
- standards for error handling and logging must be provided
- standards for test and validation of the system as well as for individual scientific tasks must be established
- it must be possible to add internationalization for user interaction
- whenever possible existing open standards should be used
- standards for interfaces to commercial packages should be established
- Definition of scripting and execution
- it must be possible to execute individual tasks directly from the operating system shell
- a standard for passing parameters between tasks must exist
- range and type checking of input parameters must be supported
- execution of complex sequences of tasks must be supported
- both interactive and batch mode execution must be supported
- standard flow control such as looping, branching and conditional execution must be available
- the system must be scalable
- parallel execution of tasks must be possible
- re-synchronization of execution streams must be possible
- monitoring of tasks must be possible
- a detailed log of tasks executed must be available
- logs should include host, version, input and results for each executed task
- it must be possible to add comments and notes to the log (i.e. worksheets)
- it must be possible to repeat execution of tasks listed in logs
- any warnings or errors occurring during the execution must be reported and logged
- both interactive and batch execution of tasks must be supported
- GUIs must be available for execution of high level tasks
- optimized GUIs for applications may be offered
- definition of workflows may be considered
- access to Web services must be possible
- the place of execution of a task must be transparent to the users
- it must be possible to specify where tasks will be executed
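As an illustration of the range and type checking of input parameters required above, the following is a minimal sketch in Python. The names `ParamSpec` and `validate` are hypothetical and chosen only for this example; they do not refer to any existing package, and a real environment would need a richer parameter model.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ParamSpec:
    """Hypothetical description of one task parameter (illustration only)."""
    name: str
    type_: type
    minimum: Optional[float] = None  # inclusive lower bound, if any
    maximum: Optional[float] = None  # inclusive upper bound, if any


def validate(spec: ParamSpec, value: Any) -> Any:
    """Return value unchanged if it satisfies spec, otherwise raise an error."""
    # bool is a subclass of int in Python, so reject it for non-bool parameters.
    if isinstance(value, bool) and spec.type_ is not bool:
        raise TypeError(f"{spec.name}: expected {spec.type_.__name__}, got bool")
    if not isinstance(value, spec.type_):
        raise TypeError(
            f"{spec.name}: expected {spec.type_.__name__}, "
            f"got {type(value).__name__}"
        )
    if spec.minimum is not None and value < spec.minimum:
        raise ValueError(f"{spec.name}: {value} is below minimum {spec.minimum}")
    if spec.maximum is not None and value > spec.maximum:
        raise ValueError(f"{spec.name}: {value} is above maximum {spec.maximum}")
    return value


# Example: an exposure time in seconds, restricted to the range [0, 3600].
exposure = ParamSpec("exposure", float, 0.0, 3600.0)
checked = validate(exposure, 30.0)
```

Performing such checks before a task starts lets errors be reported and logged in a uniform way, independently of the language in which the task itself is written.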
- Data structures
- it must be possible to associate units, errors and quality flags with all numeric quantities of scientific interest
- standard astronomical coordinate, time and unit systems must be supported
- individual quantities may be grouped together (e.g. associated to a common object or generated by a specific task)
- collections of groups of data must be supported (i.e. tables or databases)
- manipulation of individual data, groups and collections must be consistent and easy
- data items may be scalars and arrays
- standard data types must be available (e.g. integer, real)
- multi-dimensional arrays must be supported
- standard mathematical operations and functions must be provided for all relevant data
- it must be possible to select subsets of collections
- propagation of errors and quality flags must be supported
- transformation of data to different coordinate, time and unit systems must be possible
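The requirements on units, errors and quality flags can be sketched with a small Python class. This is only an illustration under stated assumptions: the class `Quantity` and its method `scaled` are hypothetical names, errors are assumed to be independent 1-sigma Gaussian uncertainties (so they add in quadrature), and quality flags are assumed to be bit masks in which 0 means "good".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical numeric quantity carrying unit, error and quality flag."""
    value: float
    error: float   # 1-sigma uncertainty, in the same unit as value
    unit: str
    flag: int = 0  # bit mask; 0 is assumed to mean "good"

    def __add__(self, other: "Quantity") -> "Quantity":
        if self.unit != other.unit:
            raise ValueError(f"unit mismatch: {self.unit} vs {other.unit}")
        return Quantity(
            self.value + other.value,
            # Independent errors add in quadrature.
            math.hypot(self.error, other.error),
            self.unit,
            # A flag set on either input propagates to the result.
            self.flag | other.flag,
        )

    def scaled(self, factor: float, unit: str) -> "Quantity":
        """Convert to another unit that differs by a constant factor."""
        return Quantity(self.value * factor, self.error * abs(factor),
                        unit, self.flag)


a = Quantity(3.0, 0.3, "Jy")
b = Quantity(4.0, 0.4, "Jy", flag=2)
total = a + b                           # value 7.0 Jy, flag 2
total_mjy = total.scaled(1000.0, "mJy")
```

Keeping the error and flag next to the value means that every operation has a single place in which to propagate them, which is the consistency the list above asks for.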
- Access to data
- the exact location of data must be transparent to the user
- it must be possible to search specified locations for relevant data
- the location of data sets created may be specified
- access to reference data must be possible
- data sets may be grouped depending on their meta-data
- data may be shared by a well defined set of users
- data must be protected against multiple concurrent accesses
- access to read-only data sets must be possible
- it must be possible to read and write data in standard formats
- FITS data must be fully supported
- interaction with VO data must be provided
- it must be possible to handle and analyze large data sets
- a history of all changes to data sets must be recorded and associated with them
- Visualization of data
- it must be possible to view graphical representations of data being analyzed
- it must be possible to compare different data relating to the same target (e.g. field, object) by graphical means
- it must be possible to overlay images mapped in different coordinate systems and compare them
- Modeling of data
- it must be possible to show errors associated with data on the graphical representations
- easy classification of data samples must be supported
- statistical tests on data must be available
- robust estimators for basic data properties must be available
- comparison between data and models must be supported
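As one example of what "robust estimators for basic data properties" could mean in practice, the sketch below computes the median as a location estimate and the scaled median absolute deviation (MAD) as a scale estimate. The function name `robust_location_scale` is hypothetical; the factor 1.4826 makes the MAD consistent with the standard deviation for normally distributed data.

```python
import statistics
from typing import Sequence, Tuple


def robust_location_scale(sample: Sequence[float]) -> Tuple[float, float]:
    """Return (median, scaled MAD) of a non-empty sample."""
    center = statistics.median(sample)
    # Median of the absolute deviations from the median.
    mad = statistics.median([abs(x - center) for x in sample])
    # 1.4826 rescales the MAD to match sigma for Gaussian data.
    return center, 1.4826 * mad


# A single outlier (100.0) barely affects either estimate.
center, scale = robust_location_scale([1.0, 2.0, 3.0, 4.0, 100.0])
```

For this sample the mean would be pulled to 22.0 by the outlier, whereas the median stays at 3.0.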
-- PrebenGrosbol and ThijsVanDerHulst - 30 Nov 2004
-- PrebenGrosbol - 16 Dec 2004
Comments
Revised version for "data visualization"
The environment should comprise tools for data visualization.
Visualization tools should provide the following capabilities:
- be directly accessible via a dedicated GUI,
- be insertable into user-defined or user-built applications and GUIs
- have scripting capabilities
- be extensible
- have volume rendering capabilities
- be able to handle scalars, vectors and images
- it must be possible to show errors associated with data on the graphical representations
- it must support logarithmic scales
- support time series files (e.g. to build animations)
- provide a Lookup Table editor
- it must be possible to use multiple datasets simultaneously and to run multiple instances of the visualization tool simultaneously
- it must be possible to compare different data relating to the same target (e.g. field, object) by graphical means
- have "data picker" capabilities
- have "visualization saving" capabilities (e.g. save as PS or GIF)
- it must know and be able to handle the IAU-approved WCS, and consequently be able to overlay images with different WCS and compare them
- be capable of directly handling the most common astronomical data formats (e.g. FITS)