Specifications of ETL4

Summary

ETL4 is a dataset of handwritten images of 51 Hiragana characters. The OCR sheets were collected at Nagoya University and scanned at the Electrotechnical Laboratory (ETL) with the TOSBAC-3400 scanning system in fiscal year 1974 (Showa 49).


Data Collection

OCR Sheet (same as ETL1)

  • Sheet: B5, 90kg per 1000 sheets
  • Dropout color: No.26 Violet 50% Screen (DNP)
  • Frame size (width x height): 5mm x 7mm
  • Frame pitch (width x height): 7.62mm x 12.7mm
  • Number of frames: 10 x 12 = 120
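
The frame geometry above can be turned into a small layout calculation. The following is only a sketch: it assumes the grid is 10 frames across and 12 down (the list does not state the orientation), and it measures positions from the top-left corner of the first frame because the sheet margins are not given. The names `frame_origin`, `COLS`, and `ROWS` are illustrative.

```python
# Values copied from the OCR sheet list above (millimetres).
FRAME_W_MM, FRAME_H_MM = 5.0, 7.0
PITCH_X_MM, PITCH_Y_MM = 7.62, 12.7

# Assumption: "10 x 12" means 10 columns and 12 rows, following the
# "width x height" convention used for the frame size and pitch.
COLS, ROWS = 10, 12


def frame_origin(col: int, row: int) -> tuple[float, float]:
    """Top-left corner of a frame, relative to the first frame's corner."""
    return (col * PITCH_X_MM, row * PITCH_Y_MM)


# Enumerate every frame row by row.
frames = [frame_origin(c, r) for r in range(ROWS) for c in range(COLS)]

# Blank space left between neighbouring frames.
gap_x_mm = PITCH_X_MM - FRAME_W_MM
gap_y_mm = PITCH_Y_MM - FRAME_H_MM

print(len(frames), "frames")
print(f"gap between frames: {gap_x_mm:.2f} mm x {gap_y_mm:.2f} mm")
```

The pitch values correspond to 0.3 inch and 0.5 inch, which may explain the otherwise odd millimetre figures.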

Characters

  • Hiragana: 51(あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやいゆえよらりるれろわゐうゑをん)
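
The count of 51 refers to the cells of the traditional kana table, in which い, う, and え each appear more than once. A short check of the string above, offered as an illustration rather than as part of the specification:

```python
from collections import Counter

# Character list copied verbatim from the line above.
HIRAGANA = (
    "あいうえおかきくけこさしすせそたちつてと"
    "なにぬねのはひふへほまみむめもやいゆえよ"
    "らりるれろわゐうゑをん"
)

total = len(HIRAGANA)            # number of table cells
distinct = len(set(HIRAGANA))    # number of different characters
repeated = sorted(ch for ch, n in Counter(HIRAGANA).items() if n > 1)

print(total, distinct, repeated)
```

A classifier trained on this data would therefore typically see fewer distinct classes than 51, depending on how the repeated cells are labelled.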

Data Collection

Scanning System

  • Scanner: Flying Spot Scanner (FSS) with a Flying Spot Cathode Ray Tube 5CNP16 and a Photomultiplier Tube 7696
  • Interval: 0.133mm x 0.133mm
  • Spot diameter: 0.1333mm
  • Intensity levels: 16 (4bit)
  • Number of pixels: 72 x 76 = 5,472 pixels
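
The scanning parameters above imply the physical size of the scanned window and the storage needed per image. The sketch below works through that arithmetic. The packing of two 4-bit pixels per byte, high nibble first, is an assumption made for illustration (the byte layout is not stated here), and `unpack_4bit` is a hypothetical helper.

```python
# Values copied from the scanning-system list above.
INTERVAL_MM = 0.133
NX, NY = 72, 76
LEVELS = 16

pixels = NX * NY                              # 72 x 76
bits_per_pixel = (LEVELS - 1).bit_length()    # 16 levels need 4 bits
width_mm = NX * INTERVAL_MM                   # scanned window width
height_mm = NY * INTERVAL_MM                  # scanned window height

# Assumption: pixels are packed two per byte with no padding.
packed_bytes = pixels * bits_per_pixel // 8


def unpack_4bit(packed: bytes) -> list[int]:
    """Split each byte into two 4-bit values (assumed high nibble first)."""
    out = []
    for b in packed:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out


print(f"{pixels} pixels, {bits_per_pixel} bits each, {packed_bytes} bytes packed")
print(f"scanned window: {width_mm:.3f} mm x {height_mm:.3f} mm")
```

Under these assumptions the scanned window is larger than the 5mm x 7mm writing frame, which leaves a margin around each character.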

Compilation

  • Location: Electrotechnical Laboratory (ETL)
  • Computer: TOSBAC-3400/41
  • Software: FSSTOMT
  • Date of Compilation: Dec. 1974
  • Date of Scanning: Dec. 1974

Format


Sample

Metadata of sample 0

  • Serial Data Number: 500100
  • Serial Sheet Number: 5001
  • JIS Code: 0xb1
  • EBCDIC Code: 0x81
  • 4 Character Code: H A
  • Evaluation of Individual Character Image: 0
  • Evaluation of Character Group: 0
  • Sample Position Y on Sheet: 1
  • Sample Position X on Sheet: 0
  • Male-Female Code: 1
  • Age of Writer: 23
  • Industry Classification Code: 9144
  • Occupation Classification Code: 11
  • Sheet Gathering Date: 741202
  • Scanning Date: 741216
  • Number of X-Axis Sampling Points: 72
  • Number of Y-Axis Sampling Points: 76
  • Number of Levels of Pixel: 16
  • Magnification of Scanning Lens: 133
  • Serial Data Number (old): 0
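
Some of the sample fields above are easier to read once decoded. The sketch below rests on two assumptions that are not stated in this document: that the two date fields use a YYMMDD layout in the 1900s, and that the JIS Code is a JIS X 0201 half-width katakana byte standing for the matching hiragana. The helper names `parse_yymmdd` and `jis_to_hiragana` are hypothetical.

```python
import unicodedata
from datetime import date


def parse_yymmdd(value: int) -> date:
    """Interpret an integer such as 741202 as YYMMDD (assumed 19YY)."""
    yy, rest = divmod(value, 10000)
    mm, dd = divmod(rest, 100)
    return date(1900 + yy, mm, dd)


def jis_to_hiragana(code: int) -> str:
    """Map a JIS X 0201 katakana byte to hiragana (assumed encoding)."""
    # Shift_JIS decodes single bytes 0xA1-0xDF as half-width katakana.
    half = bytes([code]).decode("shift_jis")
    # NFKC folds half-width katakana to the full-width form.
    full = unicodedata.normalize("NFKC", half)
    # Basic hiragana sit 0x60 code points below the matching katakana.
    return chr(ord(full) - 0x60)


gathered = parse_yymmdd(741202)
scanned = parse_yymmdd(741216)
label = jis_to_hiragana(0xB1)

print(gathered.isoformat(), scanned.isoformat(), label)
```

If these assumptions hold, the decoded dates fall in December 1974, in line with the scanning date given under Compilation, and the label agrees with the "H A" character code if that is read as hiragana "a". The simple offset in `jis_to_hiragana` is only meant for unvoiced single-byte kana such as this one.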