Specification of ETL-3

Overview

ETL-3 was compiled in 1974 as a joint project of the Electrotechnical Laboratory and Hitachi Ltd. The dataset contains 48 classes of handwritten characters (numerals, capital Roman letters, and symbols), which the writers produced by copying model characters presented to them. Hitachi Ltd. collected the OCR sheets, and a TOSBAC-3400 computer system owned by the Electrotechnical Laboratory processed them.

Specifications

  • OCR Sheet (Same as for ETL-1)
    • Sheet: B5, 90 kg per 1000 sheets
    • Dropout color: No. 26 Violet 50 % Screen (DNP)
    • Frame size: 5 mm × 7 mm
    • Pitch: 7.62 mm × 12.7 mm
    • Number of frames: 10 columns × 12 rows = 120 frames
  • Character Classes
    • Numeric: 10 (0-9)
    • Capital Roman alphabet: 26 (A-Z)
    • Symbol: 12 (¥+-*/=()・,_▾)
    • Total: 48
  • Data Collection
    • Location: Hitachi Ltd.
    • Instructions for writers:
      1. Write with an HB graphite pencil. When correcting a character, erase it completely.
      2. Write each character inside the specified frame, in the style shown by the example.
      3. Write each character neatly and upright, filling the frame without crossing its border.
    • Number of writers: 200
    • Number of samples: 9600
  • Scanning System
    • Scanner: Flying Spot Scanner (FSS) with a Flying Spot Cathode Ray Tube 5CNP16 and a Photomultiplier Tube 7696
    • Interval: 0.133 mm × 0.133 mm
    • Spot size: 0.1333 mm
    • Intensity levels: 16 (4 bits)
    • Number of pixels: 72 × 76 = 5472 pixels
  • Compilation
    • Location: Electrotechnical Laboratory
    • Computer: TOSBAC-3400/41 (program: FSSTOMT)
    • Date of compilation: April 1974
    • Date of collection: April 1974
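
The figures listed above are mutually consistent, which a short Python sketch can confirm. The constant names and the two helpers (pack_nibbles, unpack_nibbles) are hypothetical and introduced only for illustration. The two-pixels-per-byte packing is an assumption about one convenient in-memory layout and is not taken from the ETL-3 file format. The sketch also assumes that 72 × 76 means 72 columns by 76 rows, and that each writer contributed one sample per class, which is consistent with 200 writers and 9600 samples.

```python
# Figures copied from the specification above.
NUMERIC, ALPHABET, SYMBOL = 10, 26, 12
WRITERS = 200
WIDTH, HEIGHT = 72, 76      # assumed order: columns x rows
BITS_PER_PIXEL = 4          # 16 intensity levels
INTERVAL_MM = 0.133         # sampling interval in both directions

# Derived quantities.
classes = NUMERIC + ALPHABET + SYMBOL        # total character classes
samples = WRITERS * classes                  # assumes one sample per class per writer
pixels = WIDTH * HEIGHT                      # pixels per scanned character
packed_bytes = pixels * BITS_PER_PIXEL // 8  # size if two pixels share one byte
scan_w_mm = WIDTH * INTERVAL_MM              # scanned width per character
scan_h_mm = HEIGHT * INTERVAL_MM             # scanned height per character


def pack_nibbles(levels):
    """Pack 4-bit values two per byte, high nibble first (illustrative layout)."""
    if len(levels) % 2:
        raise ValueError("expected an even number of pixels")
    if any(not 0 <= v <= 15 for v in levels):
        raise ValueError("intensity levels must be in 0..15")
    return bytes((levels[i] << 4) | levels[i + 1]
                 for i in range(0, len(levels), 2))


def unpack_nibbles(data):
    """Inverse of pack_nibbles: return the list of 4-bit values."""
    out = []
    for b in data:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out


if __name__ == "__main__":
    print("classes:", classes)
    print("samples:", samples)
    print("pixels per character:", pixels)
    print("bytes per character (packed):", packed_bytes)
    print(f"scanned area: {scan_w_mm:.3f} mm x {scan_h_mm:.3f} mm")

    # Round-trip a synthetic 16-level test pattern of the right size.
    image = [(x + y) % 16 for y in range(HEIGHT) for x in range(WIDTH)]
    assert unpack_nibbles(pack_nibbles(image)) == image
```

At 4 bits per pixel, one 5472-pixel character occupies 2736 bytes when packed this way.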

File Format

Reference

T. Saito, H. Yamada, S. Mori: “An Analysis of Handwritten Character Database (III)”, Bulletin of the Electrotechnical Laboratory, Vol.42, No.5, pp.385–434 (1978-05) (in Japanese).