Skip to content

C fasta Drake Diedrich - finite output state

CONTRIBUTE SOURCE CODE

fasta-dld3.c

This version of fasta looks inside the random number generator like gcc-5, and notices it has a finite number of states. The mapping through float is only used to build up the state-to-symbol map, after that it's all integers and, since the number of output states is modest, a table to lookup each of those. This avoids both the frequent loop over the lookup table boundaries, several instructions and branches inside the innermost loop. As a result, most of the CPU time is split between generating the random number integers and looking them up in the table. It's still much smaller than the minimum runtime environments of many languages.

I/O is also moved to system writes rather than stdio, as it's large memory spans being worked on now and already effectively a buffer.

The code is still idiomatic, but the sourcecode tuning extends into lightweight mathematical conversions in the random number generator and a standard lookup table that depends on that class of random number generator (any 32 bit would do this on a good sized machine).

Provide an example build command-line

cc -O3 -march=native -Wall -g fasta-dld3.c -o fasta-dld3

./fasta-dld3 1000 > dld3.out

diff fasta-output.txt dld3.out

perf stat ./fasta-dld3 25000000 >/dev/null

Performance counter stats for './fasta-dld3 25000000':

        563.81 msec task-clock                #    0.999 CPUs utilized          
             3      context-switches          #    0.005 K/sec                  
             0      cpu-migrations            #    0.000 K/sec                  
           120      page-faults               #    0.213 K/sec                  
 2,425,092,211      cycles                    #    4.301 GHz                    
 2,841,198,033      instructions              #    1.17  insn per cycle         
   208,425,354      branches                  #  369.675 M/sec                  
     3,442,013      branch-misses             #    1.65% of all branches        

   0.564248452 seconds time elapsed

   0.560295000 seconds user
   0.004002000 seconds sys