Specifications of ETL-5

Overview

The ETL-5 dataset was collected at Fujitsu Limited and organized by the Electrotechnical Laboratory, using the TOSBAC-3400 computer system, in 1974. It contains scans of 51 Katakana characters, handwritten by writers who were given a character style guide.

Organization

  • OCR Sheet (Same as for ETL-1)
    • Sheet: B5 size (ISO 216), 90-kg-per-1000 OCR sheets
    • Dropout color: No. 26 Violet 50% Screen (DNP)
    • Frame size: 5 mm in width by 7 mm in height
    • Frame pitch: 7.62 mm (between columns) by 12.7 mm (between rows)
    • Frame layout: 10 columns by 12 rows = 120 frames
  • Class of characters
    • Katakana: 51 (ア-ワヰウヱヲン)
  • Collection
    • Location: Fujitsu Limited
    • Guide for writers (rough translation from original Japanese):
      1. Writing guide
        1. Use an HB graphite pencil for writing. To correct a mistake, erase it completely with an eraser.
        2. Write each character twice in the specified frames, as shown in the example.
        3. Refer to the character style guide before you start writing.
      2. Character style guide
        1. Watch out for the parts marked by circles.
        2. Keep the strokes orderly, not scattered.
        3. Make corners and dots clear.
        4. Write each character neatly and upright, filling the frame without going outside it.
    • Number of writers: 104
    • Number of samples: 10608 = (51 × 2 × 104)
  • Scanning System
    • Scanner: Flying Spot Scanner (FSS) with a Flying Spot Cathode Ray Tube 5CNP16 and a Photomultiplier Tube 7696
    • Pixel interval: 0.1 mm × 0.1 mm
    • Pixel size: 0.1 mm
    • Intensity levels: 16 (4 bits)
    • Number of pixels: 72 × 76 = 5472 pixels
  • Dataset Organization
    • Scanning location: Electrotechnical Laboratory
    • Computer system: TOSBAC-3400/41 (program: FSSTOMT)
    • Date of organization: February 1975
    • Date of scanning: February 1975
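
The figures listed above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original specification. It assumes that the first figure in "72 × 76" is the image width, and that the 4-bit pixels are stored two per byte with the high nibble first; neither point is stated in this document, and the helper name unpack_nibbles is hypothetical.

```python
# Figures taken from the specification above.
FRAME_COLS, FRAME_ROWS = 10, 12      # frame layout on one OCR sheet
NUM_CLASSES = 51                     # Katakana classes
COPIES_PER_WRITER = 2                # each character is written twice
NUM_WRITERS = 104
WIDTH_PX, HEIGHT_PX = 72, 76         # ASSUMPTION: first figure is the width
PIXEL_INTERVAL_MM = 0.1
BITS_PER_PIXEL = 4                   # 16 intensity levels

frames_per_sheet = FRAME_COLS * FRAME_ROWS
num_samples = NUM_CLASSES * COPIES_PER_WRITER * NUM_WRITERS
pixels_per_image = WIDTH_PX * HEIGHT_PX

# Physical extent of the scanned window, in millimetres.
scan_width_mm = round(WIDTH_PX * PIXEL_INTERVAL_MM, 1)
scan_height_mm = round(HEIGHT_PX * PIXEL_INTERVAL_MM, 1)

# ASSUMPTION: pixels are packed two per byte with no padding.
packed_bytes_per_image = pixels_per_image * BITS_PER_PIXEL // 8


def unpack_nibbles(packed):
    """Split each byte into two 4-bit values (hypothetical helper).

    ASSUMPTION: the high nibble holds the earlier pixel.
    """
    pixels = []
    for byte in packed:
        pixels.append(byte >> 4)      # high nibble
        pixels.append(byte & 0x0F)    # low nibble
    return pixels


if __name__ == "__main__":
    print("frames per sheet:", frames_per_sheet)
    print("samples:", num_samples)
    print("pixels per image:", pixels_per_image)
    print("scan window (mm):", scan_width_mm, "x", scan_height_mm)
    print("packed bytes per image:", packed_bytes_per_image)
```

Under these assumptions the scanned window (7.2 mm by 7.6 mm) is slightly larger than the 5 mm by 7 mm writing frame, so each image includes a margin around the frame.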

File Format

Reference

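  1. T. Saito, H. Yamada, S. Mori (斉藤泰一、山田博三、森俊二): "An Analysis of Handprinted Character Data Base (III)" (“手書文字データベースの解析(III)”), Bulletin of the Electrotechnical Laboratory (「電総研彙報」), Vol.42, No.5, pp.385–434 (1978-05). In Japanese.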