

% Id: $Author: merzky $ on $Date: 2002/05/03 10:09:03 $ ($Revision: 1.1 $)
%
% unknown bug in compilation
%    LaTeX Error: Something's wrong--perhaps a missing \item.
% Press Enter to continue
% This error should be ignored
%

\documentclass[$Date: 2002/05/03 10:09:03 $]{glabarticle}

%===============================================================================
\newenvironment{shortlist}{
    \begin{itemize}
    \setlength{\itemsep}{-1ex}
    \renewcommand{\labelenumi}{{\bf AP-\arabic{enumi}}}
}{
    \end{itemize}
}

\newcounter{shortnum-cnt}
\newenvironment{shortnum}[1]{
    \begin{list}{\textbf{#1\arabic{shortnum-cnt}:}}{
    \usecounter{shortnum-cnt}
    \settowidth{\labelwidth}{\textbf{#1-100}}
    \setlength{\labelsep}{3ex}
    \setlength{\itemsep}{-1ex}
    \setlength{\leftmargin}{\labelwidth}
    \addtolength{\leftmargin}{\labelsep}
    \addtolength{\leftmargin}{5ex}
    \setlength{\rightmargin}{10ex}
    \setlength{\itemindent}{0ex}
    }
}{
    \end{list}
}

%===============================================================================

\begin{document}

%===============================================================================

 \glabdocauthors  {Andr\'e Merzky, Florian Schintke and Thorsten Sch\"utt}
 \glabdoctitle    {Requirement Analysis}
 \glabwpname      {Data Management and Visualization}
 \glabdocfilename {requirements}
 \glabpartners    {ZIB}
 \glableadpartner {ZIB}
 \glabconfigid    {}
 \glabdocclas     {Internal}
 \glababstract    {
   This document describes the requirements for WP8 of the GridLab
   project -- ``Data Management and Visualization''.  It is based on
   user scenarios as provided by the application work packages, and
   on the requirements of WP1 -- ``Grid Application Toolkit (GAT)''.}
 \glablastamendment{\getcvsdate~~---~~\getcvstime} 

%===============================================================================

 \glabmaketitle

%===============================================================================

 \tableofcontents

 \newpage

 \section{Scope of this Document}
  
  This document derives the requirements for GridLab WP8 --
  Data Management and Visualization.  To that end, we first review
  the user requirements, the application developer requirements and
  the requirements of all work packages depending on WP8.  From
  these, we derive our own set of requirements.

  Requirements in the sense of this document include
  \begin{shortlist}
   \item functionality requirements,
   \item design requirements,
   \item architectural requirements.
  \end{shortlist}

  These have to fit into the overall scope of the GridLab project.\\

  The User Requirements and Application Developer Requirements as
  listed in this document are not specific to WP8, but hold
  generally for all WPs of the GridLab project.  They have been
  derived from discussions with end users and application developers
  (Cactus Team, Triana Team), and from discussions with WP1
  developers (GAT).\\

  The requirements for WP8 identified in this document have to be
  met by the WP8 architecture, WP8 design and WP8 implementation.
  The detailed plans for these are described in separate documents.

  \subsection{Terms used}

   Terms used throughout this document (such as requirement,
   application, work package, architecture etc.) are used in the
   sense described in Annex 1 of the GridLab project proposal or as
   defined by the GridLab Technical Board.\\

   Some terms, mainly those specific to WP8, are described in the
   next subsection.
 
  \subsection{About Data}

   Throughout this document, we talk about data, data sets, data
   files, checkpoint files and so on.  These terms are defined in
   this section. 

   \begin{description}
    \item[Data] 
          Data are information; raw facts. Data can be input into a
          program and processed in various ways.  Data can be
          produced by programs, devices and user input (via
          devices).\\
          Data can be \textit{volatile} (have a very short lifetime, 
          e.g. inside the main memory of a computer) or
          \textit{persistent} (have an unlimited lifetime, 
          e.g. in an archived file).\\
          Data can be distributed, e.g. stored in files located on
          different storage systems, or live inside of an
          application running on multiple resources.\\
          Data are often structured.\\
          In this document, the term \textit{data} refers to any one
          of: \textit{data set}, \textit{data file} or
          \textit{collection of data files}.
    \item[Data Set] 
          Data that belong together are often gathered in data
          sets.  Data in data sets usually share a common set of
          Meta Data, and are of the same or complementary types.
    \item[Data File] 
          A File is an entity that lives on a storage system,
          contains a stream of bytes, and can be addressed by a
          unique name.  Files are often annotated with Meta Data
          such as time of creation, owner, security information and
          size.  Data Files are files containing data sets.
    \item[Data Stream] 
          A Data Stream is a flow of Data from A to B.  It is
          similar to a Data File but in general allows no seek
          operation.  Streams can be converted to files (caching,
          buffering) and vice versa (dumping).
    \item[Collection of Data Files] 
          Data Files sharing the same set or a similar set of
          Meta Data can be gathered in Collections.  A Collection is
          not a physical entity, but a virtual object containing
          information about the Data Files it contains.
    \item[Meta Data] 
          Meta Data are data about Data Sets, Data Files and
          Collections.  Meta Data can contain very diverse
          information, ranging from the size of the data and
          information about data structures to arbitrary annotation
          strings provided by a user or application.\\
          Meta Data are usually stored in Meta Data Directories
          (MDS).  Data Management Services (DMS) provide operations 
          on Data, Data Files, Data Sets and Data Streams and
          guarantee a consistent update of the Meta Data entries 
          in the MDS.
    \item[Checkpoint Data] 
          Checkpoint Data are Data Sets created by an application or
          system utility.  They usually contain all information
          necessary to completely restore and restart an
          application in a state similar to the one it was in
          when the Checkpoint Data were created.
    \item[Checkpoint File] 
          Checkpoint Files are Data Files containing Checkpoint
          Data.  Checkpoint Files can be part of a Collection
          containing the complete set of Checkpoint Data of an
          application.
    \item[Migration]
          An application moving during runtime from one resource to
          another is performing a Migration.  This process also
          involves the movement of Data (usually Checkpoint Files)
          in the same direction.  This subprocess is called Data
          Migration.
   \end{description}


   As this document is concerned with the management and
   visualization of data, we will also mention a number of possible
   data sources.  Of main interest to the project are data produced
   by applications or experiments.  Also of interest can be data
   produced or gathered by middleware services or systems, such as
   monitoring systems and information systems.  All these data could
   potentially also be visualized, with the main emphasis on the
   visualization of application data.


 \section{GridLab Requirements and User Scenarios}
  
  From the user scenarios given in Annex 1 and elsewhere, and from
  discussions with application developers, numerous general
  requirements have been identified which must be met by the general
  GridLab infrastructure.  These are listed in the requirements
  document of WP1. Together with the user requirements and
  application developer requirements also listed there, these form
  the basis of the requirements identified here.\\

  This section reviews the user scenarios, and extracts all
  information relevant to data management and visualization.  From
  this information, and from other general user requirements as
  listed in the WP1 document, the requirements for WP8 are derived.
 
  \subsection*{Data Management Scenarios (DMS):}
  
   According to the application scenarios of Annex 1, the following
   Data Management situations/processes are to be enabled by the
   GridLab project:
 
   \begin{shortnum}{DMS}
    \item migration of data files from A to B,
    \item accompanied selection and migration of data files from A to B,
    \item fast transport of data sets/files from A to B,
    \item discover data sets,
    \item locate data sets,
    \item archival of data files,
    \item recombination of partitioned data sets/files,
    \item requests for information about a data set or data file and
          its contents/history/\ldots, and
    \item requests for information about a storage system and its
          optimal I/O parameters/variables.
   \end{shortnum}
 
  
  \subsection*{Visualization Scenarios (VS):}
  
   According to the application scenarios of Annex 1, the following
   Visualization situations/processes are to be enabled by the
   GridLab project:\\
  
   \begin{shortnum}{VS}
    \item visualization of past data sets,
    \item visualization of online data,
    \item visualization \textit{output} for further use,
    \item visual interaction with simulation code at runtime, and
    \item visual interaction with the Grid environment.
   \end{shortnum}
 
 

 \section{Requirement Analysis for WP8}
    
  From the requirements listed in the previous sections, it is
  possible to derive a well-defined set of requirements which have
  to be met by the WP8 architecture, design and implementation\footnote{
    Please note that a data set in the following list can mean 
    either (a) a volatile data set living inside an application, 
    or (b) a data file, or (c) an annotated collection of data files.
  }.  It is also possible to derive a well-defined set of requirements
  for all other work packages that WP8 depends on.


  \subsection*{Requirements which WP8 has to meet}

   \subsubsection*{Functionality Requirements (FR):}

    The following functionality must be provided by WP8:

    \begin{shortnum}{FR}
     \item migration of data,
     \item archival of data,
     \item replication of data,
     \item recombination of distributed/parallel written data,
     \item annotation of data,
     \item location of data\footnote{Location: for a
           known data set, return its exact 
           location.},
     \item discovery of data\footnote{Discovery: for 
           a set of attribute-value pairs, find a matching set 
           of data.},
     \item remote (online and offline) visualization of data,
     \item adaptive visualization (progressive, hierarchical) of data,
     \item data transformation\footnote{transform data from format A
           to format B},
     \item data extraction\footnote{transform data set A into subset
           B}, and
     \item secure data transport\footnote{including authorization,
         authentication and encryption}.
    \end{shortnum}

   \subsubsection*{Quality Requirements (QR):}

    The following quality constraints must be respected by WP8:

    \begin{shortnum}{QR}
     \item be application oriented,
     \item be usable on all types of resources,
     \item be usable in firewalled environments,
     \item be usable in disconnected environments,
     \item be usable in minimalistic environments,
     \item support and enforce use of security policies,
     \item support synchronous and asynchronous operation,
     \item provide abstract interfaces,
     \item provide a complete set of interfaces,
     \item provide ability to be discovered,
     \item provide ability to discover and use services dynamically,
     \item provide audit trails and verbose error messages,
     \item provide a test suite,
     \item be well documented,
     \item be extensible and `future proof',
     \item be informative, transparent,
     \item be lightweight (where possible),
     \item be robust and fault tolerant,
     \item be adaptive to variable Grid infrastructure,
     \item be able to recover from interruptions,
     \item be independent from other services (if possible or
           necessary),
     \item behave consistently and reproducibly,
     \item allow integration of 3rd party software/services, and
     \item allow instrumentation for monitoring purposes.
    \end{shortnum}


  \section{WP8 Requirements to other Work Packages}

   To meet the requirements identified in the previous section,
   WP8 has to be able to use other services.  Certain
   functionality and properties are required \textit{from} these
   services in order to achieve a successful outcome of our
   work.  These requirements are listed below.  We distinguish
   between requirements we \textit{must have} in order to be able to
   do our work at all, and requirements which would be \textit{nice
   to have} and which would ease our work
   significantly\footnote{Nice-to-have can also mean that the
   architecture or design would be simpler and cleaner if the
   respective requirement is met by the other WP.}.  If features in
   the ENR section cannot be provided by the respective WPs, WP8 has
   to provide them itself, possibly tailored to its own needs.

   \subsection*{External Requirements we MUST HAVE (EMR):}

    Please note that requirements to WP7 (Adaptive Components) are
    dealt with in a separate document issued by WP7: ``Use Cases for
    Adaptive Components'' (deliverable 7.1).

    \begin{shortnum}{EMR}
     \item WP5\phantom{0}  simple access to testbed resources,
     \item WP6\phantom{0}  simple interface to security functionality,
     \item WP6\phantom{0}  support for multiple credentials,
     \item WP6\phantom{0}  secure channel from A to B, not encrypted,
     \item WP10 ability to store custom meta data,
     \item WP10 ability to search meta data by regex matches to
                values in key-value pairs,
     \item WP11 hooks for audit trails.
    \end{shortnum}


   \subsection*{External Requirements which would be NICE TO HAVE (ENR):}

    \begin{shortnum}{ENR}
     \item WP5\phantom{0}  ability to simulate resource failures,
     \item WP5\phantom{0}  installation support,
     \item WP6\phantom{0}  secure encrypted channel from A to B,
     \item WP10 regex searches on meta data keys,
     \item WP11 network forecast for bandwidth and latency, and
     \item WP11 monitoring of storage systems (space).
    \end{shortnum}


  \section{Outlook}

  Having identified these requirements, the next step is to design
  an architecture for the work package which (a) fits into the
  global GridLab architecture and (b) meets all requirements above.
  Some requirements may not be enforceable by that architecture and
  have to be met at the implementation level.  The implementation
  plan for Work Package 8 should cover the identified requirements
  completely. 

\end{document} 


