mirror of
				https://git.proxmox.com/git/mirror_iproute2
				synced 2025-10-31 00:42:48 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			98 lines
		
	
	
		
			4.4 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			98 lines
		
	
	
		
			4.4 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| Notes about distribution tables from Nistnet 
 | |
| -------------------------------------------------------------------------------
 | |
| I. About the distribution tables
 | |
| 
 | |
| The table used for "synthesizing" the distribution is essentially a scaled,
 | |
| translated, inverse to the cumulative distribution function.
 | |
| 
 | |
| Here's how to think about it: Let F() be the cumulative distribution
 | |
| function for a probability distribution X.  We'll assume we've scaled
 | |
| things so that X has mean 0 and standard deviation 1, though that's not
 | |
| so important here.  Then:
 | |
| 
 | |
| 	F(x) = P(X <= x) = \int_{-inf}^x f
 | |
| 
 | |
| where f is the probability density function.
 | |
| 
 | |
| F is monotonically increasing, so has an inverse function G, with range
 | |
| 0 to 1.  Here, G(t) = the x such that P(X <= x) = t.  (In general, G may
 | |
| have singularities if X has point masses, i.e., points x such that
 | |
| P(X = x) > 0.)
 | |
| 
 | |
| Now we create a tabular representation of G as follows:  Choose some table
 | |
| size N, and for the ith entry, put in G(i/N).  Let's call this table T.
 | |
| 
 | |
| The claim now is, I can create a (discrete) random variable Y whose
 | |
| distribution has the same approximate "shape" as X, simply by letting
 | |
| Y = T(U), where U is a discrete uniform random variable with range 1 to N.
 | |
| To see this, it's enough to show that Y's cumulative distribution function,
 | |
| (let's call it H), is a discrete approximation to F.  But
 | |
| 
 | |
| 	H(x) = P(Y <= x)
 | |
| 	     = (# of entries in T <= x) / N   -- as Y chosen uniformly from T
 | |
| 	     = i/N, where i is the largest integer such that G(i/N) <= x
 | |
| 	     = i/N, where i is the largest integer such that i/N <= F(x)
 | |
| 	     		-- since G and F are inverse functions (and F is
 | |
| 	     		   increasing)
 | |
| 	     = floor(N*F(x))/N
 | |
| 
 | |
| as desired.
 | |
| 
 | |
| II. How to create distribution tables (in theory)
 | |
| 
 | |
| How can we create this table in practice? In some cases, F may have a
 | |
| simple expression which allows evaluating its inverse directly.  The
 | |
| pareto distribution is one example of this.  In other cases, and
 | |
| especially for matching an experimentally observed distribution, it's
 | |
| easiest simply to create a table for F and "invert" it.  Here, we give
 | |
| a concrete example, namely how the new "experimental" distribution was
 | |
| created.
 | |
| 
 | |
| 1. Collect enough data points to characterize the distribution.  Here, I
 | |
| collected 25,000 "ping" roundtrip times to a "distant" point (time.nist.gov).
 | |
| That's far more data than is really necessary, but it was fairly painless to
 | |
| collect it, so...
 | |
| 
 | |
| 2. Normalize the data so that it has mean 0 and standard deviation 1.
 | |
| 
 | |
| 3. Determine the cumulative distribution.  The code I wrote creates a table
 | |
| covering the range -10 to +10, with granularity .00005.  Obviously, this
 | |
| is absurdly over-precise, but since it's a one-time only computation, I
 | |
| figured it hardly mattered.
 | |
| 
 | |
| 4. Invert the table: for each table entry F(x) = y, make the y*TABLESIZE
 | |
| (here, 4096) entry be x*TABLEFACTOR (here, 8192).  This creates a table
 | |
| for the ("normalized") inverse of size TABLESIZE, covering its domain 0
 | |
| to 1 with granularity 1/TABLESIZE.  Note that even with the granularity
 | |
| used in creating the table for F, it's possible not all the entries in
 | |
| the table for G will be filled in.  So, make a pass through the
 | |
| inverse's table, filling in any missing entries by linear interpolation.
 | |
| 
 | |
| III. How to create distribution tables (in practice)
 | |
| 
 | |
| If you want to do all this yourself, I've provided several tools to help:
 | |
| 
 | |
| 1. maketable does the steps 2-4 above, and then generates the appropriate
 | |
| header file.  So if you have your own time distribution, you can generate
 | |
| the header simply by:
 | |
| 
 | |
| 	maketable < time.values > header.h
 | |
| 
 | |
| 2. As explained in the other README file, the somewhat sleazy way I have
 | |
| of generating correlated values needs correction.  You can generate your
 | |
| own correction tables by compiling makesigtable and makemutable with
 | |
| your header file.  Check the Makefile to see how this is done.
 | |
| 
 | |
| 3. Warning: maketable, makesigtable and especially makemutable do
 | |
| enormous amounts of floating point arithmetic.  Don't try running
 | |
| these on an old 486.  (NIST Net itself will run fine on such a
 | |
| system, since in operation, it just needs to do a few simple integral
 | |
| calculations.  But getting there takes some work.)
 | |
| 
 | |
| 4. The tables produced are all normalized for mean 0 and standard
 | |
| deviation 1.  How do you know what values to use for real?  Here, I've
 | |
| provided a simple "stats" utility.  Give it a series of floating point
 | |
| values, and it will return their mean (mu), standard deviation (sigma),
 | |
| and correlation coefficient (rho).  You can then plug these values
 | |
| directly into NIST Net.
 | 
