My apologies to everyone who looked at the numbers for GCC before. I had run cloc from the wrong directory and it was counting all the lines of code in all the source trees on my computer. Oops! The GCC numbers listed at the bottom are corrected. While I'm at it, let's look at GHC vs. GCC. Here is GHC:
1177 text files.
1159 unique files.
247 files ignored.
http://cloc.sourceforge.net v 1.51 T=56.0 s (16.6 files/s, 6818.4 lines/s)
--------------------------------------------------------------------------------
Language                            files        blank      comment         code
--------------------------------------------------------------------------------
Haskell                               424        38213        64078       125830
C                                     139        10199        13583        46412
XML                                    33         3839          539        29416
C/C++ Header                          170         3171         5236         8509
HTML                                   41          708           82         6761
Perl                                   31         1578         1526         6284
Bourne Shell                           26          410          794         3983
Pascal                                  2          753          356         2711
m4                                      2          293          171         1959
yacc                                    4          261            0         1465
make                                   42          216          397          606
Bourne Again Shell                      5           75          138          393
Lisp                                    1           65           76          291
Assembly                                1           34           30          125
Teamcenter def                          3           20            0           54
CSS                                     2            8            0           36
D                                       1           10           29           23
C Shell                                 2           24           41           20
--------------------------------------------------------------------------------
SUM:                                  929        59877        87076       234878
--------------------------------------------------------------------------------
And GCC:
58499 text files.
57735 unique files.
116986 files ignored.
http://cloc.sourceforge.net v 1.51 T=2794.0 s (19.4 files/s, 2776.8 lines/s)
--------------------------------------------------------------------------------
Language                            files        blank      comment         code
--------------------------------------------------------------------------------
C                                   15704       350409       365314      1790292
Java                                 6324       169097       641639       680923
C/C++ Header                         9712       135787       126549       659612
Ada                                  4518       227132       307713       659411
C++                                 12898        98958       127224       428592
Bourne Shell                          126        56208        48639       317192
HTML                                  440        30509         5304       139342
Fortran 90                           2978        12630        22272        71604
m4                                    163         6547         1879        57799
Assembly                              195         7543         9400        45902
XML                                    56         3621          214        29679
make                                  115         3193          941        20374
Teamcenter def                         76         2903          379        18601
Expect                                219         4214         7719        16402
Fortran 77                            381          917         2776        10119
Objective C                           275         1922         1092         7021
Perl                                   24          760         1157         4121
XSLT                                   20          563          436         2805
Bourne Again Shell                     12          372          599         1675
CSS                                     8          332          143         1427
awk                                    10          221          373         1370
Python                                  4          301          142         1311
yacc                                    2          107          109          987
Pascal                                  4          218          200          985
C#                                      9          230          506          879
MUMPS                                   4          121            0          521
Tcl/Tk                                  1           72          112          393
ASP.Net                                 7           37            0          203
lex                                     1           36           27          150
NAnt scripts                            2           17            0          148
MSBuild scripts                         1            1            0          140
Javascript                              2           20           81          122
Haskell                                35           15            0          109
Lisp                                    1            4           21           59
MATLAB                                  3           13            0           52
DTD                                     3           28           70           26
Fortran 95                              2           10            7           21
DOS Batch                               3            0            0            7
--------------------------------------------------------------------------------
SUM:                                54338      1115068      1673037      4970376
--------------------------------------------------------------------------------
Keep in mind, the GCC tree includes lots of languages because it's a compiler for lots of things. And it has a lot of tests.
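To put the two reports side by side, here is a rough sketch in Python that takes the SUM rows and a couple of per-language rows from the tables above and works out a few ratios. The variable and function names are my own invention for illustration, and "comment density" is just one plausible way to compare the trees, not anything cloc itself reports.

```python
# Totals copied by hand from the SUM rows of the two cloc reports.
GHC = {"files": 929, "blank": 59877, "comment": 87076, "code": 234878}
GCC = {"files": 54338, "blank": 1115068, "comment": 1673037, "code": 4970376}

# Per-language code counts, also copied from the reports.
GHC_HASKELL_CODE = 125830  # Haskell row of the GHC report
GCC_C_CODE = 1790292       # C row of the GCC report


def comment_density(totals):
    """Fraction of non-blank lines that are comments (hypothetical metric)."""
    return totals["comment"] / (totals["comment"] + totals["code"])


# How many times larger the GCC tree is, measured in lines of code.
code_ratio = GCC["code"] / GHC["code"]

# Share of each tree written in its main implementation language.
haskell_share_of_ghc = GHC_HASKELL_CODE / GHC["code"]
c_share_of_gcc = GCC_C_CODE / GCC["code"]

print(f"GCC has {code_ratio:.1f}x the code lines of GHC")
print(f"Haskell is {haskell_share_of_ghc:.1%} of GHC's code lines")
print(f"C is {c_share_of_gcc:.1%} of GCC's code lines")
print(f"Comment density, GHC: {comment_density(GHC):.1%}")
print(f"Comment density, GCC: {comment_density(GCC):.1%}")
```

These ratios are only as good as cloc's language detection, and the GCC totals include its test suites and the runtime libraries for every language it compiles.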
So GCC has more Haskell code than GHC? WTF?
Yes, WTF?
It's not only more, it's an order of magnitude more!
I'm surprised that GCC contains any Haskell. It must be something else that uses the .hs file extension.
I can't even imagine what would require 850k lines of Haskell. Skynet?
Have you tried sloccount instead of cloc? Cloc seems to produce a large number of false hits on language types.
Yes, cloc seems to miscategorize things at times, which is unfortunate, but I just realized the real problem.
I ran cloc from the directory where I have _all my source code_ stored. I'll post revised numbers as soon as I have them.
Sorry about that!