Specification of ETL-2

Data Collection

OCR Sheets

  • Paper: B4, 90 kg per 1000 sheets
  • Samples: e2shta01, e2shtb01, e2shtc01, e2shtd01

Characters

Scanner

  • ITV Camera Scanner 240×240
  • Sampling interval: 54 μm × 54 μm
  • Spot size: 54 μm
  • Intensity levels: 64 (6 bits)
  • Number of pixels: 60 × 60 = 3600
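
The scanner figures above imply a fixed size for each character image. The sketch below works out that arithmetic and shows one way 6-bit intensity values could be unpacked from a byte buffer. The tight packing and the MSB-first bit order are assumptions made for illustration, not something this specification states, and the names `image_geometry` and `unpack_6bit` are hypothetical.

```python
# Figures taken from the scanner list above.
WIDTH, HEIGHT = 60, 60   # pixels per character image
BITS_PER_PIXEL = 6       # 64 intensity levels
PITCH_UM = 54            # sampling interval in micrometres


def image_geometry():
    """Derive per-image sizes from the scanner parameters."""
    pixels = WIDTH * HEIGHT
    bits = pixels * BITS_PER_PIXEL
    return {
        "pixels": pixels,
        # Assumption: pixels are packed tightly with no padding bits.
        "packed_bytes": bits // 8,
        # Physical side length of the sampled area in millimetres.
        "side_mm": WIDTH * PITCH_UM / 1000,
    }


def unpack_6bit(packed: bytes) -> list[int]:
    """Unpack consecutive 6-bit values from a byte buffer.

    Assumption: values are stored MSB-first, four pixels per three bytes.
    """
    values = []
    acc = 0      # bit accumulator
    nbits = 0    # number of unread bits currently held in acc
    for byte in packed:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= BITS_PER_PIXEL:
            nbits -= BITS_PER_PIXEL
            values.append((acc >> nbits) & 0x3F)
        acc &= (1 << nbits) - 1  # keep only the unread low bits
    return values


if __name__ == "__main__":
    print(image_geometry())
    print(unpack_6bit(bytes([0x04, 0x20, 0xC4])))
```

Under these assumptions one image occupies 2700 bytes and covers a square of about 3.24 mm per side.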

Compile

  • Source of Collection: Dai Nippon Printing Co., Ltd. and The Mainichi Newspapers Co., Ltd.
  • Total samples: 52796
  • Scanning: Toshiba
  • Computer: TOSBAC-40C TOSPICS
  • Date of Collection: October 1973
  • Date of Scanning: October 1973

Format

Files

filename  # records  serial numbers  # categories  # sheets  source  font    original files
ETL2-1    9056       1-11520         1136          24        A       MINCHO  KPSM1-KPSM4
ETL2-2    10480      11521-23040     1048          24        A       MINCHO  KPSM5-KPSM8
ETL2-3    11360      28801-40320     1136          24        C       MINCHO  KPTM1-KPTM4
ETL2-4    10480      40321-51840     1048          24        C       MINCHO  KPTM5-KPTM8
ETL2-5    11420      23041-28800     571           24        B       GOTHIC  KPSG1-KPSG2
                     51841-57600                             D       GOTHIC  KPTG1-KPTG2
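
The per-file record counts can be cross-checked against the total of 52796 samples given under Compile. The following is a minimal sketch with the values transcribed by hand from the Files table; the names `FILES`, `total_records`, and `ranges_are_contiguous` are hypothetical.

```python
# (filename, number of records, serial-number ranges), from the Files table.
FILES = [
    ("ETL2-1", 9056, [(1, 11520)]),
    ("ETL2-2", 10480, [(11521, 23040)]),
    ("ETL2-3", 11360, [(28801, 40320)]),
    ("ETL2-4", 10480, [(40321, 51840)]),
    ("ETL2-5", 11420, [(23041, 28800), (51841, 57600)]),
]
TOTAL_SAMPLES = 52796  # "Total samples" under Compile


def total_records(files):
    """Sum the record counts over all files."""
    return sum(count for _, count, _ in files)


def ranges_are_contiguous(files):
    """Check that the serial ranges tile 1..max with no gap or overlap."""
    ranges = sorted(r for _, _, rs in files for r in rs)
    expected = 1
    for low, high in ranges:
        if low != expected:
            return False
        expected = high + 1
    return True


if __name__ == "__main__":
    print(total_records(FILES) == TOTAL_SAMPLES)
    print(ranges_are_contiguous(FILES))
```

The record counts add up to the stated total, and the serial ranges cover 1 to 57600 without gaps. The serial range is therefore larger than the number of records, so not every serial number corresponds to a stored record.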

Samples

filename  record  metadata                   image
ETL2_1    1       1 A KANJI MINCHO 上        [image: 1]
ETL2_2    101     11621 A KANJI MINCHO 浴    [image: 11621]
ETL2_3    201     29001 C KANJI MINCHO 内    [image: 29001]
ETL2_4    301     40621 C KANJI MINCHO 淡    [image: 40621]
ETL2_5    401     23441 B KANJI GOTHIC 切    [image: 23441]

Note on ETL2_2, record 101: the character code stored in this record is wrong (it is the code adjacent to the true one); the character is shown here as stored.