

Dealing with OOC index-arrays and data arrays


Figure: Irregular OOC inspector in lip

The concept presented above has to be modified further when not only the index arrays but also the data arrays become out-of-core. The previously described Inspector phase is then insufficient: only part of the data array is held in memory, so the index array may refer to a data element that currently resides on disk. Detailed analysis leads to the conclusion that three types of non-local accesses may be observed in the index array:


Figure: Simplified C code showing parallel computation on an out-of-core data array using lip.

  MPI_Init( /* ... */ );
  LIP_Setup( /* ... */ );

  /* Generate index array describing
   * relationships between user data */

  /* Perform irregular computation
   * on the data */
  for ( /* ... */ )
  {
    /* read i-section's indices 
     * from a file */

    /* Inspector Phase (computes
     * optimized communication
     * and disk-memory mapping
     * pattern) */
    LIP_Localize(schedule,... );
    LIP_OOC_Localize(schedule,IObufmap,...);
    
    /* Create MPI derived datatypes  
     * for moving data between
     * the memory and the disk */
    LIP_IObufmap_get_datatype(IObufmap,...);
    
    /* Executor Phase (performs
     * communication and computation) */
    
    /* Exchange data between disk and memory 
     * in order to store in the memory
     * the data that will be needed
     * in this i-section */

    MPI_File_write( /* ... */ );
    MPI_File_read( /* ... */ );
    
    /* gather non-local irregular data */
    LIP_Gather(schedule,... );

    /* perform computation on data */
    for ( i = /* ... */ )
    {
      k = edge[i];
      y[k] = f( x[k] );
    }

    /* scatter non-local
     * irregular data (results) */
    LIP_Scatter(schedule,... );
  }

  /* Get MPI derived datatypes for moving    
   * the data obtained in the last 
   * iteration from the memory
   * to the disk  */
  LIP_IObufmap_get_datatype(IObufmap);
  
  /* Store the data on a disk */
  MPI_File_write( /* ... */ );

  LIP_Exit( /* ... */ );
  MPI_Finalize( /* ... */ );


The idea of the new parallelization scheme is presented in Fig. [*], and the corresponding pseudocode is shown in Fig. [*]. Since reading each element from disk at the moment it is needed would be very inefficient, the optimization must target not only efficient communication but also the minimization of I/O accesses to data elements that are not in memory. To this end, references to data residing on disk are examined and translated into references to a memory buffer by calling LIP_OOC_Localize() after LIP_Localize(). The translation changes the indices in the indirect array and in the communication schedule so that they point to the local memory buffer, as shown in Fig. [*]. The process produces a data structure called the I/O buffer mapping. It is updated as each subsequent i-section is examined and records which data must be moved between disk and main memory, because data currently held in memory may no longer be needed for processing the next i-section. Once the translation is done, the I/O buffer mapping object can be used to create MPI derived datatypes, obtained via a call to LIP_IObufmap_get_datatype(), which perform the actual data movement through MPI-IO routines such as MPI_File_read() and MPI_File_write(). In this way, irregular problems in which all arrays are OOC can also be parallelized.
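The index translation described above can be illustrated with a minimal C sketch. It is not the lip implementation: the type `io_bufmap` and the functions `bufmap_slot` and `ooc_localize` are hypothetical names introduced here, and the sketch assumes a single process, an integer index array, and a fixed-size memory buffer. It only shows the core idea of rewriting disk indices into buffer-slot indices while recording which disk element each slot must hold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O buffer mapping (not the lip type):
 * slot s of the in-memory buffer holds disk element disk_index[s]. */
typedef struct {
    int *disk_index;  /* disk index assigned to each buffer slot */
    int  count;       /* number of slots currently in use */
    int  capacity;    /* total number of slots in the buffer */
} io_bufmap;

/* Return the buffer slot that holds disk index g, assigning a new
 * slot on first use. Returns -1 when the buffer is full.
 * A linear search keeps the sketch short; a hash table would be
 * the natural choice for large i-sections. */
static int bufmap_slot(io_bufmap *m, int g)
{
    for (int s = 0; s < m->count; s++)
        if (m->disk_index[s] == g)
            return s;
    if (m->count == m->capacity)
        return -1;
    m->disk_index[m->count] = g;
    return m->count++;
}

/* Rewrite the n entries of edge[] in place so that they refer to
 * buffer slots instead of disk positions. Returns 0 on success and
 * -1 if the buffer cannot hold all distinct elements; in that case
 * the entries before the failing one have already been rewritten. */
static int ooc_localize(int *edge, int n, io_bufmap *m)
{
    assert(edge != NULL && m != NULL);
    for (int i = 0; i < n; i++) {
        int s = bufmap_slot(m, edge[i]);
        if (s < 0)
            return -1;
        edge[i] = s;
    }
    return 0;
}
```

For example, with `edge = {907, 12, 907, 455, 12}` and a four-slot buffer, the call rewrites `edge` to `{0, 1, 0, 2, 1}` and leaves `{907, 12, 455}` in `disk_index`. A list of this kind is the sort of displacement information from which a file-side derived datatype could be built, for instance with the standard `MPI_Type_create_indexed_block`; whether lip does this internally is not stated in the text.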

Created by Katarzyna Zając