
Indices and Data Out-of-core

This example shows how to handle the case in which both the index array and the data array are too large to fit into memory. The data array is stored in the file "data.file" and the index array in "index.file". To solve the problem we divide the out-of-core arrays into in-core sections (i-sections) and perform the calculation step by step, each time reading a new portion of data from the disk and writing it back after the calculation ends.

  isec_l_l = l_l/ISEC; /* sizes of arrays in memory */
  isec_n_l = n_l/ISEC; 

  y = malloc( isec_n_l * (sizeof *y) );
  x_l1 = malloc( (isec_n_l + isec_n_l) * (sizeof *x) ); 
  x_l2 = malloc( (isec_n_l + isec_n_l) * (sizeof *x) ); 
  perm = malloc( isec_n_l * (sizeof *perm) );
  perm_l = malloc( isec_n_l * (sizeof *perm) );     
  tperm_l = malloc( isec_n_l * (sizeof *perm) );

  MPI_File_open(MPI_COMM_SELF, fname1, MPI_MODE_RDWR, MPI_INFO_NULL, 
                &file1);
  MPI_File_open(MPI_COMM_SELF, fname2, MPI_MODE_RDWR, MPI_INFO_NULL, 
                &file2);
  MPI_File_open(MPI_COMM_SELF, fname3, MPI_MODE_RDWR, MPI_INFO_NULL, 
                &ifile);

The sizes of the arrays in memory are now given by isec_n_l and isec_l_l. We need one additional file for writing data and one extra array as a data buffer. The reason is that we cannot update the original file after computing each i-section: a later i-section may need to read the same data from the file again, and it must see the original, unaltered values. Instead we write the updated values to the second file, so that once all i-sections have been computed all our data is properly updated.
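The need for the second file can be seen in a small in-memory simulation. The sketch below is plain C and uses neither LIP nor MPI; the arrays src and dst stand in for the first and second file, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Process n index entries in nsec sections. Values are always read from
 * src (the untouched "first file") and accumulated into dst (the
 * "second file", which starts as a copy of src). */
void update_two_files(const double *src, double *dst,
                      const int *perm, int n, int nsec)
{
    assert(nsec > 0 && n % nsec == 0);
    int sec_len = n / nsec;
    for (int s = 0; s < nsec; s++)
        for (int i = s * sec_len; i < (s + 1) * sec_len; i++)
            dst[perm[i]] += -src[perm[i]];
}

/* Same computation, but reading and writing one array: a later section
 * sees values already modified by an earlier one.
 * Returns 0 on success, -1 if the work buffer cannot be allocated. */
int update_in_place(double *x, const int *perm, int n, int nsec)
{
    assert(nsec > 0 && n % nsec == 0);
    int sec_len = n / nsec;
    double *y = malloc((size_t)sec_len * sizeof *y);
    if (y == NULL)
        return -1;
    for (int s = 0; s < nsec; s++) {
        const int *p = perm + s * sec_len;
        for (int i = 0; i < sec_len; i++)
            y[i] = -x[p[i]];          /* read phase of this section */
        for (int i = 0; i < sec_len; i++)
            x[p[i]] += y[i];          /* update phase of this section */
    }
    free(y);
    return 0;
}
```

With x = {1, 2, 3, 4}, perm = {0, 1, 0, 1} and two sections, the two-file version leaves {-1, -2, 3, 4} in dst, which is what a single in-core pass would give, while the in-place version ends with {0, 0, 3, 4} because the second section reads values the first one has already changed.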
As in the previous example, we create all the necessary data structures and then start the main loop over all i-sections:

  LIP_IOBufmap_create(isec_n_l, &bufm);

  LIP_Datamap_create( LIP_DATAMAP_BLOCK, l, &l_l, LIP_INDEXTYPE_INT, 1, 0,
      &datamap );
  
  LIP_Schedule_create( l_l, &schedule ); /* l_l - size of array on disk */

  /*** for each i-section ***/ 
  for(isec_cnt=0;isec_cnt<ISEC; isec_cnt++)
  {

In each i-section we read one portion of the index array and translate its indices using the localize functions, as illustrated in the Fig in the Chapter .

    offset = isec_cnt * isec_n_l;
    
    /* Note: MPI_File_read_at counts the offset in etype units of the
       current file view; with the default view (etype MPI_BYTE) it
       would have to be multiplied by sizeof(int). */
    MPI_File_read_at(ifile,offset,perm,isec_n_l,MPI_INT,&status); 
    
    LIP_Localize( datamap, perm, LIP_INDEXTYPE_INT, perm_l, 
                 LIP_INDEXTYPE_INT, isec_n_l, 0, &schedule, 
                 LIP_INDEXINFO_NULL ); 
    
    LIP_Schedule_commit( &schedule );

    LIP_OOC_localize(perm_l, LIP_INDEXTYPE_INT, tperm_l, LIP_INDEXTYPE_INT, 
                     isec_n_l, 0, x_l1, MPI_DOUBLE, &bufm, &schedule);
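Conceptually, index translation replaces each global index by a position in a compact memory buffer. The following sketch illustrates that idea only; it is a hypothetical, simplified stand-in written in plain C, not the algorithm used inside LIP_Localize or LIP_OOC_localize, and the names translate_indices and slots are assumptions made for the illustration.

```c
#include <assert.h>

/* For each global index in perm[0..n-1], assign a slot in a compact
 * buffer and store the slot number in tperm[i]. slots[k] records which
 * global element occupies slot k (slots must have room for n entries).
 * Returns the number of distinct elements referenced. */
int translate_indices(const int *perm, int n, int *tperm, int *slots)
{
    int used = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int k = 0;
        while (k < used && slots[k] != perm[i])
            k++;                      /* linear search keeps the sketch short */
        if (k == used)
            slots[used++] = perm[i];  /* first reference: take a new slot */
        tperm[i] = k;
    }
    return used;
}
```

For perm = {7, 3, 7, 9} the function returns 3, fills slots with {7, 3, 9} and tperm with {0, 1, 0, 2}, so the computation can index a three-element buffer instead of the full array on disk.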

Next we have to update the disk arrays before reading new data into the memory buffers. To do this we get the MPI datatypes from the IOBufmap structure with the LIP_DATA_OLD flag and write the buffer to the file; after that we use the LIP_DATA_NEW flag to get the datatypes for the data required in the next step of the calculation.


   LIP_IOBufmap_get_datatype(bufm,MPI_DOUBLE,&memtype,&filetype,
                             LIP_DATA_OLD);

   if (MPI_DATATYPE_NULL!=memtype) 
   {
      MPI_File_set_view(file2,0,MPI_DOUBLE,filetype,"native",info);
      MPI_File_write_at(file2,0, x_l2, 1, memtype, &status);
   }
   LIP_IOBufmap_get_datatype(bufm,MPI_DOUBLE,&memtype,&filetype,
                             LIP_DATA_NEW);
   if (MPI_DATATYPE_NULL!=memtype) 
   {
      MPI_File_set_view(file1,0,MPI_DOUBLE,filetype,"native",info);
      MPI_File_set_view(file2,0,MPI_DOUBLE,filetype,"native",info);

      MPI_File_read_at(file1,0, x_l1, 1, memtype, &status);
      MPI_File_read_at(file2,0, x_l2, 1, memtype, &status);
   }
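The order of the two steps matters: the buffer contents are written back before the new data is fetched. A minimal in-memory model of this is sketched below. It is plain C with a single array file[] standing in for the disk, so it simplifies the two-file scheme above, and the names swap_buffer, old_slots and new_slots are hypothetical rather than part of LIP.

```c
#include <assert.h>

/* old_slots[k] / new_slots[k] give the global element held in buffer
 * slot k before and after the swap. */
void swap_buffer(double *file, double *buf,
                 const int *old_slots, int n_old,
                 const int *new_slots, int n_new)
{
    assert(n_old >= 0 && n_new >= 0);
    for (int k = 0; k < n_old; k++)
        file[old_slots[k]] = buf[k];   /* write back what the buffer held */
    for (int k = 0; k < n_new; k++)
        buf[k] = file[new_slots[k]];   /* then load what is needed next */
}
```

If an element appears in both old_slots and new_slots, writing first guarantees that the value loaded for the next step is the updated one.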

After the I/O phase we enter the executor routines, which consist of communication and computation.

    LIP_Gather( x_l1, x_l1 + isec_n_l, MPI_DOUBLE, schedule ); 

    for (i = 0 ; i < isec_n_l ; i++)
      y[i] = -x_l1[ tperm_l[i] ]; 

    for (i = 0 ; i < isec_n_l ; i++)
      x_l2[ i + isec_n_l ] = 0.0; 

    for (i = 0 ; i < isec_n_l ; i++)
      x_l2[ tperm_l[i] ] += y[i];

    LIP_Scatter( x_l2 + isec_n_l, x_l2, MPI_DOUBLE, schedule, LIP_OP_SUM );
  } /*end for each i-section*/
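The effect of a scatter with a sum reduction can be pictured with a single-process model. In the sketch below, ghost[] plays the role of the second half of x_l2, where contributions to elements owned elsewhere are accumulated, and owned[] plays the role of the owner's array. The function scatter_sum and the owner_index array are hypothetical; the real LIP_Scatter takes this information from the schedule and moves the data between processes.

```c
#include <assert.h>

/* Add each ghost contribution to the element it belongs to.
 * owner_index[k] is the position in owned[] of the element that
 * ghost[k] contributes to. */
void scatter_sum(double *owned, int n_owned,
                 const double *ghost, const int *owner_index, int n_ghost)
{
    for (int k = 0; k < n_ghost; k++) {
        assert(owner_index[k] >= 0 && owner_index[k] < n_owned);
        owned[owner_index[k]] += ghost[k];   /* sum, do not overwrite */
    }
}
```

This also suggests why the ghost part of x_l2 is set to zero before the accumulation loop: only the contributions made in the current i-section should be added to the owners.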

That is all for the loop over i-sections. After it finishes we make the final update of the disk arrays, using the datatypes obtained with the LIP_DATA_ALL flag.

  LIP_Schedule_free( &schedule );
  /* save all data to the disk */
  LIP_IOBufmap_get_datatype(bufm,MPI_DOUBLE,&memtype,&filetype,
                            LIP_DATA_ALL);
  if (MPI_DATATYPE_NULL!=memtype) 
  {
      MPI_File_set_view(file2,0,MPI_DOUBLE,filetype,"native",info);
      MPI_File_write_at(file2,0, x_l2, 1, memtype, &status);
      MPI_Type_free(&memtype);
      MPI_Type_free(&filetype);
  }

At the end we can compute the final sum of array x by reading it from file2.
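Since the array does not fit into memory, the sum has to be computed in portions as well. The sketch below shows one way to do this for a single process using standard C I/O instead of MPI-IO. It assumes the file holds raw doubles in the machine's native representation; the function name sum_file and the chunk parameter are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sum all doubles stored in a raw binary file, reading at most
 * chunk elements at a time. Returns 0 on success, -1 on error. */
int sum_file(const char *path, size_t chunk, double *sum)
{
    assert(chunk > 0 && sum != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    double *buf = malloc(chunk * sizeof *buf);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }
    double total = 0.0;
    size_t got;
    while ((got = fread(buf, sizeof *buf, chunk, f)) > 0)
        for (size_t i = 0; i < got; i++)
            total += buf[i];
    int failed = ferror(f);   /* distinguish a read error from end of file */
    free(buf);
    fclose(f);
    if (failed)
        return -1;
    *sum = total;
    return 0;
}
```

A call such as sum_file(fname2, isec_n_l, &total) would keep the memory footprint at one i-section, assuming fname2 names the second file as in the listings above.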

Created by Katarzyna Zając