Difference between revisions of "Main Page/DAT"

From Nekcem
 
(6 intermediate revisions by the same user not shown)

Latest revision as of 15:43, 13 June 2012

SIZEu file

 ldim: dimension
 lxi: the polynomial degree
 lx1: the number of grid points in one direction (lx1=lxi+1)
 ly1=lx1; lz1=lx1
 lelt: the maximum number of elements per core
 lp : the maximum number of cores

Use (E, lelt, lx1, lp) to represent the size of the problem:

E = total number of elements, lelt = number of elements per core,
lx1 = number of grid points in one direction, lp = number of cores.

There are many different rea files, such as c3d_6 (E=136K) and c3d_7 (E=273K).

Even for a fixed number of elements, as with c3d_7 (E=273K), memory usage differs with the number of cores (lp=32k, 65k, 131k).

A major change in the code gave a 2x reduction in memory usage, which made it possible to go from the 1.1 billion to the 2.2 billion grid-point case:

 from (E=273k, lx1=16, lp=131k): limit in the past  ---> (E=546k, lx1=16, lp=131k)

The case (E=999k, lx1=16, lp=131k) needed 500M of memory, so it could not be run on BGP. It should be OK on XK6, even with lp=262k.
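The cases above can be converted to grid-point counts with npt = E*lx1*lx1*lx1, the relation stated further down this page. The following is only a small sketch: the function name and the case labels are made up for illustration, and "k" is read as 1000 (an assumption, consistent with the 546000 used elsewhere on this page).

```python
# Sketch: total grid points for the (E, lx1) cases quoted on this page.
# Assumption: "k" means 1000, so E=546k is 546000 elements.

def grid_points(num_elements: int, lx1: int) -> int:
    """Return npt = E * lx1^3 for E elements with lx1 points per direction."""
    return num_elements * lx1 ** 3

# Hypothetical labels; the E values are the ones quoted in the text.
cases = {
    "past limit": 273_000,
    "current limit": 546_000,
    "largest case": 999_000,
}

for label, num_elements in cases.items():
    npt = grid_points(num_elements, lx1=16)
    print(f"{label}: E={num_elements}, npt={npt:,} ({npt / 1e9:.1f} billion)")
```

With lx1=16, the E=273k and E=546k cases give about 1.1 and 2.2 billion grid points, which matches the figures quoted above.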


If we assume that "nc" is approximately the same as the total number of grid points "n", we have the following:

 For the header:
 (1) coordinate => 3 columns * 4 bytes
 (2) cell data  => 9 columns * 4 bytes
 (3) cell type  => 1 column  * 4 bytes
 For each of the 8 fields:
     3 columns * 4 bytes

So, we have 275M*(8 fields*3*4) + 275M*(3+9+1)*4 = 40.7 GB

Or, neglecting the cell type, we get 275M*(8*3*4+12*4) = 39.6 GB


The problem size on 32k cores was npt = E*lx1*lx1*lx1 = 546000*16*16*16 (where lx1=lxi+1), i.e., npt = 2,236,416,000, or 2.2 billion grid points.

The output size with 4 fields will be: 2236416000*(4*3*4) + 2236416000*(3+9+1)*4 = 223,641,600,000 bytes (223 GB)
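The size estimates on this page all follow the same pattern, so they can be checked with a few lines of code. This is a sketch only: the function and parameter names are invented for illustration, and it assumes, as the text does, that every column is a 4-byte value and that the cell count is about the same as the grid-point count.

```python
# Sketch: output size = n * (3 columns per field * fields + header columns) * 4 bytes.
# Assumptions: 4-byte values everywhere; cell count ~ grid-point count n.

BYTES_PER_VALUE = 4

def output_bytes(n_points: int, n_fields: int, include_cell_type: bool = True) -> int:
    """Estimated output size in bytes for n_points grid points."""
    # Header: 3 coordinate columns + 9 cell-data columns (+ 1 cell-type column).
    header_cols = 3 + 9 + (1 if include_cell_type else 0)
    # Each field contributes 3 columns.
    field_cols = 3 * n_fields
    return n_points * (field_cols + header_cols) * BYTES_PER_VALUE

# The 275M-point estimate with 8 fields, with and without the cell type.
small = output_bytes(275_000_000, n_fields=8)
small_no_type = output_bytes(275_000_000, n_fields=8, include_cell_type=False)

# The 2.2 billion point case (E=546000, lx1=16) with 4 fields.
npt = 546_000 * 16 ** 3
large = output_bytes(npt, n_fields=4)

for name, size in [("275M, 8 fields", small),
                   ("275M, 8 fields, no cell type", small_no_type),
                   ("2.2B, 4 fields", large)]:
    print(f"{name}: {size:,} bytes ({size / 1e9:.1f} GB)")
```

Here GB means 10^9 bytes, which is the convention the 223 GB figure above implies.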

Memory required is 420M.


TODO: make sure the thread is joined at the last step; make sure the buffer size is optimized according to the formula given above.