These stats are from my experience loading the Unified Medical Language System (UMLS), maintained by the National Library of Medicine (NLM), into tables in MySQL version 3.23.39. Registered UMLS users can download the compressed 2001 UMLS Knowledge Sources. The download is 645 MB as a Unix .TGZ archive or 642 MB as a PC .ZIP archive.
More information about accessing the UMLS Knowledge Sources is on the UMLS Information page. Sample load scripts for putting the UMLS Metathesaurus into MySQL are also available.
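
The sample load scripts themselves are not reproduced on this page. As a rough sketch of what loading one file can look like, the snippet below builds a MySQL `LOAD DATA` statement. It assumes the Metathesaurus text files are pipe-delimited with a trailing pipe on every line; the function name `load_statement`, the path `META/MRCON`, and the table name are illustrative assumptions, and the table must already exist with columns matching the file.

```python
def load_statement(text_file: str, table: str) -> str:
    """Build a LOAD DATA statement for one pipe-delimited text file.

    Assumption: every line ends with a trailing '|', so the line
    terminator is declared as '|' followed by a newline.
    """
    return (
        f"LOAD DATA LOCAL INFILE '{text_file}' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY '|' "
        "LINES TERMINATED BY '|\\n';"
    )


# Hypothetical file path and table name, for illustration only.
print(load_statement("META/MRCON", "MRCON"))
```

Declaring the line terminator as a pipe plus newline is one way to keep the trailing delimiter from being read as an extra empty column; the real scripts may handle it differently.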
All file sizes are in bytes unless otherwise indicated.

| DIRECTORY NAME | # OF TABLES | SIZE OF TEXT FILES (uncompressed), before MySQL | SIZE OF TABLE FILES (.MYD, .MYI, .frm), after MySQL |
|---|---|---|---|
| META (Metathesaurus) | 18 | 3,182,121,481 | 3,553,484,570 |
| LEX (Lexicon) | 13 | 39,048,919 | 48,009,980 |
| LEX/LEX_DB (Lexical databases) | 3 | 581,585 | 875,002 |
| NET (Semantic Network) | 6 | 587,709 | 835,980 |
| UMLS (total) | 40 | 3,235,105,035 (3 GB) | 3,603,205,532 (3.3 GB) |
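
To put the before and after sizes in perspective, here is a small sketch that computes how much each directory grew once loaded into MySQL. The byte counts are copied from the table above; the helper name `growth_percent` is just an illustrative choice.

```python
# (text file bytes, table file bytes) per directory, from the table above.
sizes = {
    "META": (3_182_121_481, 3_553_484_570),
    "LEX": (39_048_919, 48_009_980),
    "LEX/LEX_DB": (581_585, 875_002),
    "NET": (587_709, 835_980),
    "UMLS (total)": (3_235_105_035, 3_603_205_532),
}


def growth_percent(before: int, after: int) -> float:
    """Return the size increase from text files to table files, in percent."""
    return (after - before) / before * 100


for name, (before, after) in sizes.items():
    print(f"{name:<14}{growth_percent(before, after):6.1f}%")
```

The large META directory grows by a little under 12 percent, while the small LEX/LEX_DB and NET directories grow by roughly 40 to 50 percent.
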
| FILE NAME | SIZE OF TEXT FILES (uncompressed), before MySQL | SIZE OF TABLE FILES (.MYD, .MYI, .frm), after MySQL |
|---|---|---|
| MRCOC | 466,167,879 | 524,103,382 |
| MRCON | 123,355,529 | 129,604,508 |
| MRSO | 89,085,376 | ** 31,835,392 |
| LRAGR | 21,929,585 | 24,302,690 |

| ELEMENT | COUNT |
|---|---|
| Concepts | 797,359 |
| Terms | 1,485,241 |
| Strings | 1,734,706 |
| Source Strings | 1,877,059 |