Main.ComplexMultiWavelengthScenario r1.4 - 25 Aug 2004 - 20:01 - DougTody


Distributed Multiwavelength Data Analysis Scenario

An increasing amount of astronomical data is stored in digital archives distributed around the world. Analysis often involves combining and comparing newly obtained data with existing data from such archives.

In this scenario the user analyzes data from multiple wavelength regimes, with data from different sources often varying considerably in representation and characteristics. The data to be analyzed may be either local or remote. The data collections to be analyzed may be of any size. The analysis is driven from the user's workstation and may include software written by the user, such as analysis scripts or algorithms.

In this case the following capabilities are needed:

  • Common software infrastructure. This is needed at several levels, e.g., standard data access services to serve up data from archives, and a standard data analysis environment to make use of such services and permit integration of data analysis or user interface components from multiple sources.

  • Location transparency. Ideally the user should not have to care whether the data to be analyzed is stored locally or in a remote archive, with the same tools available for analysis in either case. Because data volumes are growing exponentially, most future data access will be to remote data. Because network bandwidth is limited, this may require moving some of the computation to where the data is stored.

  • Scalability. It is impractical to write new software for each computational environment. It should be possible to use the same software transparently on a workstation, on a cluster, or on the Grid. Scalability is required in order to deal with large data volumes.

  • Data mediation. Data from different sources or wavelength regimes, produced at different times, is generally complex and heterogeneous and may differ in both content and representation. Active mediation is required to make it practical to combine large amounts of data at analysis time. This includes subsetting and filtering, data model transformation, and reformatting to a standard intermediate data representation. When data is accessed via the Virtual Observatory (VO), most of this is handled by the VO infrastructure, but the client still needs to deal with the data that is returned. Even ignoring VO concerns, data model mediation can be an issue when mixing software components from different systems.
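
The mediation steps named above (subsetting and filtering, data model transformation, and reformatting to a standard intermediate representation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout (`source`, `spectral_value`, `spectral_unit`, `flux`), the `Measurement` class, and the `mediate` function are hypothetical and do not correspond to any VO interface. Flux values are assumed to be in a common unit already and are passed through unchanged.

```python
from dataclasses import dataclass

# Physical constants (SI, exact by definition).
C = 299_792_458.0        # speed of light, m/s
H = 6.62607015e-34       # Planck constant, J s
KEV = 1.602176634e-16    # joules per keV


@dataclass(frozen=True)
class Measurement:
    """Hypothetical intermediate representation: wavelength always in metres."""
    source: str
    wavelength_m: float
    flux: float


def to_wavelength_m(value, unit):
    """Convert a spectral coordinate in one of a few assumed units to metres."""
    if unit == "m":
        return value
    if unit == "angstrom":
        return value * 1e-10
    if unit == "Hz":
        return C / value
    if unit == "keV":
        return H * C / (value * KEV)
    raise ValueError(f"unsupported spectral unit: {unit!r}")


def mediate(records, min_wavelength_m=0.0, max_wavelength_m=float("inf")):
    """Transform heterogeneous records to Measurement objects, keep only
    those inside the requested wavelength range, and sort by wavelength."""
    result = []
    for rec in records:
        wavelength = to_wavelength_m(rec["spectral_value"], rec["spectral_unit"])
        if min_wavelength_m <= wavelength <= max_wavelength_m:
            result.append(Measurement(rec["source"], wavelength, rec["flux"]))
    return sorted(result, key=lambda m: m.wavelength_m)


# Made-up sample records standing in for data from three wavelength regimes.
sample = [
    {"source": "optical", "spectral_value": 5000.0, "spectral_unit": "angstrom", "flux": 1.0},
    {"source": "radio", "spectral_value": 1.4e9, "spectral_unit": "Hz", "flux": 2.0},
    {"source": "xray", "spectral_value": 1.0, "spectral_unit": "keV", "flux": 3.0},
]

for m in mediate(sample):
    print(m.source, m.wavelength_m, m.flux)
```

In this sketch the analysis code sees only `Measurement` objects, so it does not need to know which archive or unit convention each record came from. A real mediation layer would also have to reconcile flux units, coordinate systems, and metadata.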

-- DougTody - 24 Aug 2004
