## 30 August 2010

### REVISED: More fun pointless code metrics

My apologies to everyone who looked at the numbers for GCC before. I had run cloc from the wrong directory and it was counting all the lines of code in all the source trees on my computer. Oops! The GCC numbers listed at the bottom are corrected. While I'm at it, let's look at GHC vs. GCC. Here is GHC:
```
1177 text files.
1159 unique files.
247 files ignored.

http://cloc.sourceforge.net v 1.51  T=56.0 s (16.6 files/s, 6818.4 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                               139          10199          13583          46412
XML                              33           3839            539          29416
C/C++ Header                    170           3171           5236           8509
HTML                             41            708             82           6761
Perl                             31           1578           1526           6284
Bourne Shell                     26            410            794           3983
Pascal                            2            753            356           2711
m4                                2            293            171           1959
yacc                              4            261              0           1465
make                             42            216            397            606
Bourne Again Shell                5             75            138            393
Lisp                              1             65             76            291
Assembly                          1             34             30            125
Teamcenter def                    3             20              0             54
CSS                               2              8              0             36
D                                 1             10             29             23
C Shell                           2             24             41             20
--------------------------------------------------------------------------------
SUM:                            929          59877          87076         234878
--------------------------------------------------------------------------------
```

And GCC:
```
58499 text files.
57735 unique files.
116986 files ignored.

http://cloc.sourceforge.net v 1.51  T=2794.0 s (19.4 files/s, 2776.8 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                             15704         350409         365314        1790292
Java                           6324         169097         641639         680923
C/C++ Header                   9712         135787         126549         659612
C++                           12898          98958         127224         428592
Bourne Shell                    126          56208          48639         317192
HTML                            440          30509           5304         139342
Fortran 90                     2978          12630          22272          71604
m4                              163           6547           1879          57799
Assembly                        195           7543           9400          45902
XML                              56           3621            214          29679
make                            115           3193            941          20374
Teamcenter def                   76           2903            379          18601
Expect                          219           4214           7719          16402
Fortran 77                      381            917           2776          10119
Objective C                     275           1922           1092           7021
Perl                             24            760           1157           4121
XSLT                             20            563            436           2805
Bourne Again Shell               12            372            599           1675
CSS                               8            332            143           1427
awk                              10            221            373           1370
Python                            4            301            142           1311
yacc                              2            107            109            987
Pascal                            4            218            200            985
C#                                9            230            506            879
MUMPS                             4            121              0            521
Tcl/Tk                            1             72            112            393
ASP.Net                           7             37              0            203
lex                               1             36             27            150
NAnt scripts                      2             17              0            148
MSBuild scripts                   1              1              0            140
Javascript                        2             20             81            122
Lisp                              1              4             21             59
MATLAB                            3             13              0             52
DTD                               3             28             70             26
Fortran 95                        2             10              7             21
DOS Batch                         3              0              0              7
--------------------------------------------------------------------------------
SUM:                          54338        1115068        1673037        4970376
--------------------------------------------------------------------------------
```

Keep in mind that the GCC tree includes lots of languages because GCC is a compiler for lots of languages. It also carries a lot of tests.
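
For anyone who wants to put the two totals side by side, here is a rough Python sketch. It only uses the SUM rows copied from the two cloc reports; the helper names `code_ratio` and `comment_share` are made up for this illustration, and "comment share" is just one arbitrary way to slice the numbers.

```python
# SUM rows copied by hand from the two cloc reports in this post.
GHC = {"files": 929, "blank": 59877, "comment": 87076, "code": 234878}
GCC = {"files": 54338, "blank": 1115068, "comment": 1673037, "code": 4970376}


def code_ratio(a, b):
    """How many times more code lines tree `a` has than tree `b`."""
    return a["code"] / b["code"]


def comment_share(tree):
    """Fraction of non-blank lines that cloc classified as comments."""
    return tree["comment"] / (tree["comment"] + tree["code"])


print(f"GCC/GHC code lines: {code_ratio(GCC, GHC):.1f}x")
print(f"GHC comment share:  {comment_share(GHC):.1%}")
print(f"GCC comment share:  {comment_share(GCC):.1%}")
```

Like everything else here, these figures are only as good as cloc's language detection and the directory it was pointed at.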

1. So GCC has more Haskell code than GHC? WTF?

2. Yes, WTF?
It's not only more, it's an order of magnitude more!

3. I'm surprised that GCC contains any Haskell. It must be something else that uses the .hs file extension.

I can't even imagine what would require 850k lines of Haskell. Skynet?

4. Have you tried sloccount instead of cloc? Cloc seems to have a large number of false hits on language types.

5. Yes, cloc seems to miscategorize things at times, which is unfortunate, but I just realized the real problem.

I ran cloc from the directory where I have _all my source code_ stored. I'll post new revised numbers as soon as I have them.
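
One way to guard against this kind of mistake is to pass the tree to cloc explicitly and sanity-check it first, instead of relying on the current directory. The sketch below is only an illustration: `cloc_command` is a hypothetical helper, and the `marker` check (looking for a `configure` file at the top of the tree) is an assumption about what the right directory looks like, not something cloc itself provides.

```python
import os


def cloc_command(tree, marker="configure"):
    """Build a cloc invocation for one explicit source tree.

    `marker` is a hypothetical sanity check: a file we expect to find at
    the top of the tree we actually mean to count.
    """
    root = os.path.abspath(tree)
    if not os.path.isdir(root):
        raise ValueError(f"not a directory: {root}")
    if not os.path.exists(os.path.join(root, marker)):
        raise ValueError(f"{marker} not found in {root}; wrong directory?")
    # cloc accepts the directory to count as a positional argument.
    return ["cloc", root]
```

The returned list can be handed to `subprocess.run(..., check=True)`, so the count always runs against the tree that was named rather than wherever the shell happens to be.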