ETL-9G
Format
Files
- One data set contains 3036 characters written by a single writer. Each file holds 4 data sets, hence 12144 = 4 × 3036 records per file.
- Each writer fills 20 sheets: sheets 1-20 belong to the first writer, sheets 21-40 to the second writer, and so on.
| filename | # records | # categories | # data sets | data set indices | # sheets |
| --- | --- | --- | --- | --- | --- |
| ETL9G_01 | 12144 | 3036 | 4 | 1-4 | 80 |
| ETL9G_02 | 12144 | 3036 | 4 | 5-8 | 80 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| ETL9G_50 | 12144 | 3036 | 4 | 197-200 | 80 |
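The file layout above can be turned into a small lookup. The following is a minimal sketch, not an official tool: the function name `locate_etl9g_data_set` is hypothetical, and it assumes that the 4 data sets of a file are stored contiguously in index order and that records are counted from 1 within each file.

```python
CHARS_PER_DATA_SET = 3036   # characters written by one writer
DATA_SETS_PER_FILE = 4      # from the file table
NUM_FILES = 50              # ETL9G_01 .. ETL9G_50


def locate_etl9g_data_set(data_set_index):
    """Map a 1-based data set index (1..200) to (filename, first_record, last_record).

    Assumption: data sets are stored back to back in index order, and
    record numbers are 1-based within each file.
    """
    total = DATA_SETS_PER_FILE * NUM_FILES
    if not 1 <= data_set_index <= total:
        raise ValueError(f"data set index must be in 1..{total}")
    # file_no is 0-based here; slot is the position of the data set inside the file
    file_no, slot = divmod(data_set_index - 1, DATA_SETS_PER_FILE)
    first = slot * CHARS_PER_DATA_SET + 1
    last = first + CHARS_PER_DATA_SET - 1
    return f"ETL9G_{file_no + 1:02d}", first, last
```

For example, `locate_etl9g_data_set(197)` points at the first 3036 records of `ETL9G_50`, in line with the last row of the table.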
Samples
| filename | record | metadata and JIS code in hex | image |
| --- | --- | --- | --- |
| ETL9G_01 | 1 | (1, 12321, 'A.TSUGU ', 1, 0, 0, 0, 0, 0, 0, 8212, 8310, 0, 0) 0x3021 | |
| ETL9G_11 | 101 | (1, 12580, 'IN.HIBI ', 101, 0, 0, 0, 0, 0, 0, 8212, 8311, 4, 6) 0x3124 | |
| ETL9G_21 | 201 | (2, 12839, 'OU.OKI ', 49, 0, 0, 0, 0, 0, 0, 8212, 8406, 0, 3) 0x3227 | |
| ETL9G_31 | 301 | (2, 13101, 'KAI.BAI ', 301, 0, 0, 0, 0, 0, 0, 8212, 8405, 4, 9) 0x332d | |
| ETL9G_41 | 401 | (3, 13360, 'KAN.MA ', 401, 0, 0, 0, 0, 0, 0, 8212, 8403, 0, 6) 0x3430 | |
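The second metadata field in each sample is the decimal form of the hex JIS code shown after the tuple (12321 = 0x3021). As a hedged illustration, the sketch below turns such a code into a Unicode character. It assumes the code is a two-byte JIS X 0208 code point and converts it through Python's built-in `euc_jp` codec by setting the high bit of each byte. The function name `jis_to_char` is hypothetical.

```python
def jis_to_char(jis_code):
    """Convert a two-byte JIS code (e.g. 0x3021 or 12321) to a Unicode character.

    Assumption: the code is a JIS X 0208 code point. EUC-JP encodes such a
    code point by setting the high bit of both bytes, so the standard
    'euc_jp' codec can do the table lookup.
    """
    hi, lo = divmod(jis_code, 256)
    if not (0x21 <= hi <= 0x7E and 0x21 <= lo <= 0x7E):
        raise ValueError(f"not a two-byte JIS code: {jis_code:#06x}")
    return bytes([hi | 0x80, lo | 0x80]).decode("euc_jp")
```

With this sketch, `jis_to_char(0x3021)` and `jis_to_char(12321)` both return 亜, which agrees with the reading `'A.TSUGU '` in the first sample row.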
ETL-9B
ETL-9B is generated from ETL-9G by binarization. The threshold is given by T = λ∙h + (1-λ)∙μ, where h is Otsu's threshold [4] and μ is the average of all intensity levels in ETL-9G [5]. For ETL-9B, λ = 0.4 [1][2].
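The threshold formula can be illustrated with a short pure-Python sketch. It is not the code used to produce ETL-9B. It makes several assumptions: μ is taken as the mean intensity of the image being binarized, the image is a flat list of integer gray levels with `levels=16` as an assumed default, pixels strictly above T are mapped to 1, and all function names are hypothetical.

```python
def otsu_threshold(pixels, levels=16):
    """Return the gray level t that maximizes the between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, the variance is undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2  # scaled by total**2, same argmax
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def mixed_threshold(pixels, lam=0.4, levels=16):
    """T = lam * h + (1 - lam) * mu, with h from Otsu's method.
    Assumption: mu is the mean intensity of this image."""
    h = otsu_threshold(pixels, levels)
    mu = sum(pixels) / len(pixels)
    return lam * h + (1 - lam) * mu


def binarize(pixels, lam=0.4, levels=16):
    """Map each pixel to 1 if it lies strictly above T, else 0 (assumed polarity)."""
    t = mixed_threshold(pixels, lam, levels)
    return [1 if p > t else 0 for p in pixels]
```

With λ = 0.4 the mean μ carries more weight than Otsu's threshold h, so T sits closer to the average intensity than a pure Otsu threshold would.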
Format
Files
- One data set contains 3036 characters written by a single writer. Each file holds 40 data sets, hence 121440 = 40 × 3036 records per file.
- Each writer fills 20 sheets: sheets 1-20 belong to the first writer, sheets 21-40 to the second writer, and so on.
- The first record of each file is a dummy record filled with zeros.
- The last data set (3036 records) of ETL9B_5 is the model presented to examinees.
| filename | # records | # data sets | data set index | # sheets |
| --- | --- | --- | --- | --- |
| ETL9B_1 | 121440 | 40 | 1-40 | 800 |
| ETL9B_2 | 121440 | 40 | 41-80 | 800 |
| ETL9B_3 | 121440 | 40 | 81-120 | 800 |
| ETL9B_4 | 121440 | 40 | 121-160 | 800 |
| ETL9B_5 | 121440+3036 | 40+1 | 161-200 | 800+20 |
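The record layout of the ETL-9B files can be sketched as a lookup from a record index to the data set it belongs to. This is an illustration under stated assumptions: record index 0 is the dummy record and character records start at 1, data sets are stored contiguously in index order, and the model data set follows the 40 writer data sets in ETL9B_5. The function name `describe_etl9b_record` and the returned dictionary keys are hypothetical.

```python
CHARS_PER_DATA_SET = 3036    # characters written by one writer
DATA_SETS_PER_FILE = 40      # from the file table
SHEETS_PER_DATA_SET = 20     # 20 sheets per writer
NUM_FILES = 5                # ETL9B_1 .. ETL9B_5


def describe_etl9b_record(file_no, record_index):
    """Classify a record of ETL9B_<file_no>; record_index counts the dummy record as 0.

    Assumptions: character records start at index 1, data sets are stored
    in index order, and the model data set comes last in ETL9B_5.
    """
    if not 1 <= file_no <= NUM_FILES:
        raise ValueError("file_no must be in 1..5")
    if record_index == 0:
        return {"kind": "dummy"}
    # ETL9B_5 carries one extra data set: the model presented to examinees
    n_sets = DATA_SETS_PER_FILE + (1 if file_no == NUM_FILES else 0)
    if not 1 <= record_index <= n_sets * CHARS_PER_DATA_SET:
        raise ValueError("record_index out of range for this file")
    slot, position = divmod(record_index - 1, CHARS_PER_DATA_SET)
    if slot == DATA_SETS_PER_FILE:
        return {"kind": "model", "position": position}
    data_set = (file_no - 1) * DATA_SETS_PER_FILE + slot + 1
    first_sheet = (data_set - 1) * SHEETS_PER_DATA_SET + 1
    return {
        "kind": "writer",
        "data_set": data_set,
        "position": position,  # 0-based position within the data set
        "sheets": (first_sheet, first_sheet + SHEETS_PER_DATA_SET - 1),
    }
```

For instance, `describe_etl9b_record(2, 100)` reports data set 41 with sheets 801 to 820, the first data set of `ETL9B_2` according to the table.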
Samples
| filename | record index (dummy record as 0) | metadata and JIS code in hex | image |
| --- | --- | --- | --- |
| ETL9B_1 | 1 | (1, 9250, 'A.HI') 0x2422 | |
| ETL9B_2 | 100 | (801, 12349, 'AYA.') 0x303d | |
| ETL9B_3 | 200 | (1601, 12611, 'EI.A') 0x3143 | |
| ETL9B_4 | 300 | (2402, 12873, 'KA.Y') 0x3249 | |
| ETL9B_5 | 400 | (3203, 13135, 'KAKU') 0x334f | |
References
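1. T. Saito, H. Yamada, K. Yamamoto: "The JIS Level 1 handwritten kanji database ETL9 and its analysis", Trans. IECE Japan (D), special issue on image processing, Vol.J68-D, No.4, pp.757–764 (1985-04) (in Japanese).
2. T. Saito, H. Yamada, K. Yamamoto: "Analysis of handwritten character databases (VIII): Evaluation of the JIS Level 1 handwritten kanji database ETL9 by the directional pattern matching method", Bulletin of the Electrotechnical Laboratory, Vol.49, No.7, pp.487–525 (1985-07) (in Japanese).
3. T. Saito, K. Yamamoto, H. Yamada: "Analysis of handwritten character databases (IX): On the database ETL9 and its model characters", Bulletin of the Electrotechnical Laboratory, Vol.50, No.4, pp.259–263 (1986-04) (in Japanese).
4. N. Otsu: "An automatic threshold selection method based on discriminant and least squares criteria", Trans. IECE Japan (D), Vol.63-D, No.4, pp.349–356 (1980-04) (in Japanese).
5. T. Saito, H. Yamada: "An improvement of the discriminant threshold selection method", Transactions of the Information Processing Society of Japan, Vol.22, No.6, pp.596–599 (1981-11) (in Japanese).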