OCR Sheet
- Sheet: B5, 90kg per 1000 sheets
- Dropout color: No. 26 Violet, 50% Screen (DNP)
- Frame size: 5mm x 7mm
- Pitch: 7.62mm x 12.7mm
- Number of frames: 10 x 12 = 120
Characters
- Numeric: 10 (0-9)
- Capital Roman alphabet: 26 (A-Z)
- Special: 12 (\+-*/=()・,?’)
- Katakana: 51 (ア-ン)
- Total: 99
Data Collection
- Instructions: "Requests for Filling In the Handwritten Character Reading Sheet" (手書文字読取用紙記入上のお願い)
- No templates given for numerals and alphabetic characters
- Templates given for special characters and Katakana
- 1445 writers from 7 occupational categories
Scanning System
- Scanner: Flying Spot Scanner (FSS) with a Flying Spot Cathode Ray Tube 5CNP16 and a Photomultiplier Tube 7696
- Interval: 0.133mm x 0.133mm
- Spot size: 0.1333mm
- Intensity levels: 16 (4 bits)
- Number of pixels: 72 x 76, cut out to 64 x 63
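As a quick check of the scanning geometry above, the sampling interval and the pixel counts give the physical size of the scanned window and of the cut-out image. This is a minimal sketch: the variable and function names are illustrative, and it assumes the pixel counts are given as width x height.

```python
# Scanning geometry from the figures listed above.
INTERVAL_MM = 0.133      # sampling interval in both directions
SCANNED = (72, 76)       # pixels scanned per frame (assumed width, height)
CUT_OUT = (64, 63)       # pixels kept per sample (assumed width, height)


def extent_mm(pixels, interval=INTERVAL_MM):
    """Physical size in mm covered by a pixel grid, rounded to 3 decimals."""
    return tuple(round(n * interval, 3) for n in pixels)


scanned_mm = extent_mm(SCANNED)
cut_out_mm = extent_mm(CUT_OUT)
print("scanned window:", scanned_mm, "mm")
print("cut-out image: ", cut_out_mm, "mm")
```

Under these figures the cut-out image covers more than the 5mm x 7mm frame in both directions, so each sample includes some margin around the writing area.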
Compiling
- Place: Electrotechnical Laboratory (ETL)
- OCR sheet and scanning system designed jointly by ETL and Fujitsu
- Computer: TOSBAC-3400/41
- Software: FSSTOMT
- Date of Collection: Sept. 1973
- Date of Scanning: Sept. 1973-Mar. 1974
- Quality evaluated by human inspection
- 141319 samples in total
Format
- M-Type Data Format (ETL1, ETL6, ETL7)
- Fixed Record Length without Control Words
- Logical record length is 2052 bytes (1 byte = 8 bits)
- Big endian
- Sample Script
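The record layout above can be sketched as a small reader. A 64 x 63 image at 4 bits per pixel occupies 2016 bytes, which leaves 36 bytes of each 2052-byte record for other fields. The sketch below assumes that the image starts at byte offset 32 and that the high nibble of each byte holds the earlier pixel; neither detail is stated above, so treat `IMAGE_OFFSET` and the nibble order as assumptions and the function names as hypothetical.

```python
RECORD_LEN = 2052                   # logical record length in bytes
WIDTH, HEIGHT = 64, 63              # cut-out image size in pixels
IMAGE_BYTES = WIDTH * HEIGHT // 2   # 4 bits per pixel -> 2016 bytes
IMAGE_OFFSET = 32                   # ASSUMPTION: image follows a 32-byte header


def iter_records(data, record_len=RECORD_LEN):
    """Yield fixed-length records; there are no control words between them."""
    if len(data) % record_len:
        raise ValueError("data length is not a multiple of the record length")
    for start in range(0, len(data), record_len):
        yield data[start:start + record_len]


def unpack_image(record, offset=IMAGE_OFFSET):
    """Return the image as HEIGHT rows of WIDTH gray levels (0-15)."""
    packed = record[offset:offset + IMAGE_BYTES]
    pixels = []
    for byte in packed:
        pixels.append(byte >> 4)     # ASSUMPTION: high nibble comes first
        pixels.append(byte & 0x0F)
    return [pixels[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]


def read_images(path):
    """Read every record of one data file and unpack its image."""
    with open(path, "rb") as f:
        return [unpack_image(rec) for rec in iter_records(f.read())]
```

For example, `read_images("ETL1C-01")` would return one 63-row image per record of that file, with 0 as white and 15 as black.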
Contents
Contents of files:
Filename | Categories | # Categories | Sheet range | # Sheets | # Records | Notes |
ETL1C-01 | 01234567 | 8 | 1001-2960 | 1445 | 11560 | |
ETL1C-02 | 89ABCDEF | 8 | 1001-2960 | 1445 | 11560 | |
ETL1C-03 | GHIJKLMN | 8 | 1001-2960 | 1445 | 11560 | |
ETL1C-04 | OPQRSTUV | 8 | 1001-2960 | 1445 | 11560 | |
ETL1C-05 | WXYZ\+-* | 8 | 1001-2960 | 1445 | 11560 | |
ETL1C-06 | /=()・,?’ | 8 | 1001-2960 | 1445 | 11560 | |
ETL1C-07 | アイウエオカキク | 8 | 1001-2960 | 1411 | 11288 | |
ETL1C-08 | ケコサシスセソタ | 8 | 1001-2960 | 1411 | 11288 | |
ETL1C-09 | チツテトナニヌネ | 8 | 1001-2960 | 1411 | 11287 | ナ(NA) on Sheet 2672 is missing |
ETL1C-10 | ノハヒフヘホマミ | 8 | 1001-2960 | 1411 | 11288 | |
ETL1C-11 | ムメモヤイユエヨ | 8 | 1001-2960 | 1411 | 11288 | |
ETL1C-12 | ラリルレロワヰウ | 8 | 1001-2960 | 1411 | 11287 | リ(RI) on Sheet 2708 is missing |
ETL1C-13 | ヱヲン | 3 | 1001-2960 | 1411 | 4233 | |
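The record counts in the table follow from the number of categories and sheets per file, minus the two missing samples. A minimal sketch of that arithmetic, with the table transcribed into an illustrative dictionary (the names are not part of the dataset):

```python
RECORD_LEN = 2052  # bytes per record, from the Format section

# file name -> (categories, sheets, missing samples), transcribed from the table
FILES = {
    "ETL1C-01": (8, 1445, 0),
    "ETL1C-02": (8, 1445, 0),
    "ETL1C-03": (8, 1445, 0),
    "ETL1C-04": (8, 1445, 0),
    "ETL1C-05": (8, 1445, 0),
    "ETL1C-06": (8, 1445, 0),
    "ETL1C-07": (8, 1411, 0),
    "ETL1C-08": (8, 1411, 0),
    "ETL1C-09": (8, 1411, 1),   # NA on sheet 2672 is missing
    "ETL1C-10": (8, 1411, 0),
    "ETL1C-11": (8, 1411, 0),
    "ETL1C-12": (8, 1411, 1),   # RI on sheet 2708 is missing
    "ETL1C-13": (3, 1411, 0),
}


def expected_records(categories, sheets, missing=0):
    """One record per category per sheet, minus any missing samples."""
    return categories * sheets - missing


counts = {name: expected_records(*spec) for name, spec in FILES.items()}
sizes = {name: n * RECORD_LEN for name, n in counts.items()}  # bytes per file
total = sum(counts.values())
print(total)  # 141319, the sample total listed under Compiling
```

Comparing `sizes[name]` with the size of a downloaded file is a cheap way to detect a truncated copy before parsing it.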
List of available sheets:
1001-1026 1028-1149 1151-1243 1301-1306 1308-1316 1318-1355 1357 1360-1391 1393-1436 1438-1453 1455-1459 1461-1491 1501-1525 1527-1658 1660-1663 1665 1667-1695 1701-1766 1801-1837 1839-1884 2001-2019 2021-2025 2027-2153 2201-2391 2501-2696 2701-2744 2801-2802 *2803-2812 2813 *2814 2815-2817 *2818-2840 2901-2960
*: Katakana characters are missing
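The sheet list can be expanded programmatically to cross-check the sheet counts in the table. The sketch below is one possible parser; `parse_sheets` and the variable names are hypothetical.

```python
SHEETS = (
    "1001-1026 1028-1149 1151-1243 1301-1306 1308-1316 1318-1355 1357 "
    "1360-1391 1393-1436 1438-1453 1455-1459 1461-1491 1501-1525 1527-1658 "
    "1660-1663 1665 1667-1695 1701-1766 1801-1837 1839-1884 2001-2019 "
    "2021-2025 2027-2153 2201-2391 2501-2696 2701-2744 2801-2802 *2803-2812 "
    "2813 *2814 2815-2817 *2818-2840 2901-2960"
)


def parse_sheets(spec):
    """Return (all_sheets, sheets_without_katakana) in listed order."""
    all_sheets, no_katakana = [], []
    for token in spec.split():
        starred = token.startswith("*")      # '*' marks missing Katakana
        first, _, last = token.lstrip("*").partition("-")
        numbers = range(int(first), int(last or first) + 1)
        all_sheets.extend(numbers)
        if starred:
            no_katakana.extend(numbers)
    return all_sheets, no_katakana


all_sheets, no_katakana = parse_sheets(SHEETS)
print(len(all_sheets))                     # 1445 sheets in total
print(len(all_sheets) - len(no_katakana))  # 1411 sheets with Katakana
```

The two results match the "# Sheets" values given for the alphanumeric files and the Katakana files respectively.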
Samples
First ten samples (0: white, 15: black):