Difference between revisions of "Main Page/DAT"

From Nekcem

Revision as of 17:39, 25 March 2012

SIZEu file

 ldim: dimension
 lxi: the degree of polynomials
 lx1: the number of grid points in one direction (lx1 = lxi + 1)
 ly1=lx1; lz1=lx1
 lelt: the maximum number of elements per core
 lp : the maximum number of cores


We'll have to use (E, lelt, lx1, lp) to represent the problem size, instead of c3d.rea.

E = total number of elements, lelt = number of elements per core,
lx1 = number of grid points in one direction, lp = number of cores.

I have had many different .rea files: c3d_6 (E=136K), c3d_7 (E=273K), etc.

Even for a fixed number of elements, as with c3d_7 (E=273K), memory usage differs with the number of cores (lp=32k, 65k, 131k). So, sorry, I wouldn't know which case is meant if it's just "c3d".

By the way, please remember that I have made major changes to the code so far, reducing memory usage by a factor of 2 in order to go from the 1.1 billion case up to the 2.2 billion case.

 from (E=273k, lx1=16, lp=131k): limit in the past  ---> (E=546k, lx1=16, lp=131k)

(E=999k, lx1=16, lp=131k) needed 500M, so I couldn't run it on BGP. But it should be OK on XK6, even with lp=262k.


If you still keep the old version of the code, you can compile it and see what the memory usage was. In the example below, the fourth number (92352484) is always the memory usage.


There, if we assume "nc" is approximately the same as the total number of grid points "n", we have the following:

 For the header:
 (1) coordinate => 3 columns * 4 bytes
 (2) cell data  => 9 columns * 4 bytes
 (3) cell type  => 1 column * 4 bytes
 For the 8 fields:
     3 columns * 4 bytes

So, we have 275M*(8 fields*3*4) + 275M*(3+9+1)*4 = 40.7 GB, i.e. about 40 GB?

Or, neglecting the cell type, we get 275M*(8*3*4+12*4) = 39.6 GB, i.e. about 39 GB?
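
The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not part of the code base: the function name `output_bytes` and its parameters are made up for illustration, and it assumes every value is stored in 4 bytes and every field takes 3 columns, as in the breakdown above.

```python
def output_bytes(n, nfields, include_cell_type=True, bytes_per_value=4):
    """Estimate the output size in bytes for n grid points (assuming nc ~ n).

    Header: 3 coordinate columns + 9 cell-data columns (+ 1 cell-type column).
    Fields: 3 columns per field.
    """
    header_columns = 3 + 9 + (1 if include_cell_type else 0)
    field_columns = 3 * nfields
    return n * (header_columns + field_columns) * bytes_per_value


n = 275_000_000  # "275M" grid points, as in the estimate above

with_cell_type = output_bytes(n, nfields=8)
without_cell_type = output_bytes(n, nfields=8, include_cell_type=False)

print(f"with cell type:    {with_cell_type / 1e9:.1f} GB")
print(f"without cell type: {without_cell_type / 1e9:.1f} GB")
```

With these assumptions the two cases come out at about 40.7 GB and 39.6 GB (using 1 GB = 10^9 bytes), so dropping the cell type saves only about 1.1 GB.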




The problem size on 32k cores was npt = E*lx1*lx1*lx1 = 546000*16*16*16 (where lx1=lxi+1), i.e., npt = 2,236,416,000 = 2.2 billion grid points.
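
As a quick check of the grid-point counts quoted on this page, here is a minimal Python sketch. The helper name `total_grid_points` is hypothetical; it assumes a 3D problem (ldim=3) with ly1=lz1=lx1, as stated in the SIZEu notes.

```python
def total_grid_points(E, lx1, ldim=3):
    """Total number of grid points: E elements with lx1 points per direction."""
    return E * lx1 ** ldim


lxi = 15       # polynomial degree, assuming lx1 = lxi + 1 = 16
lx1 = lxi + 1

old_limit = total_grid_points(273_000, lx1)  # the "1.1 billion" case
new_limit = total_grid_points(546_000, lx1)  # the "2.2 billion" case

print(f"E=273k: {old_limit:,} grid points")
print(f"E=546k: {new_limit:,} grid points")
```

Doubling E at fixed lx1 doubles the grid-point count, which matches the step from the 1.1 billion case to the 2.2 billion case.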

The output size with 4 fields will be: 2236416000*(4*3*4) + 2236416000*(1+9+3)*4 = 223,641,600,000 bytes (223 GB).

Memory required is 420M (I wouldn't know how to get a breakdown of memory between computation and I/O). But let me know if there is any way we can get that info; we can work on it.


TODO: make sure the thread is joined at the last step; make sure the buffer size is optimized according to the formula given above.