Training Summary

Pseudo-label data generation

| Folder | Output | Timestamp | Test score | Model | Data | GPU | K | batch | epoch | seed | percentile | fix_thresh | Other |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pseudo_1 | expand_train_1.json | 2022-10-13-15-15 | 0.51443849647 | mengzi | train.json | 3090 | 1 | 16 | 50 | 42 | 85 | 0.70 | warmup=0.1 |
| pseudo_2 | expand_train_2.json | 2022-10-13-22-48 | 0.55291467089 | mengzi | expand_train_1.json | 3090 | 1 | 16 | 50 | 42 | 70 | 0.70 | warmup=0.1 |
| pseudo_3 | expand_train_3.json | 2022-10-13-23-41 | 0.56821691118 | mengzi | expand_train_2.json | 3090 | 1 | 16 | 50 | 42 | 50 | 0.70 | warmup=0.1 |
| pseudo_3 | expand_train_4.json | 2022-10-13-23-41 | 0.56821691118 | mengzi | expand_train_2.json | 3090 | 1 | 16 | 50 | 42 | 30 | 0.70 | warmup=0.1 |
| pseudo_4 | expand_train.json | 2022-10-14-13-52 | 0.60286913357 | mengzi | expand_train_4.json | 3090 | 5 | 32 | 50 | 42 | 25 | 0.70 | warmup=0.1 |
| pseudo_5 | expand_train_cur_best.json | 2022_11_07_18_40_47 | 0.59952852320 | hfl/chinese-macbert-base | expand_train.json | 2080Ti*2 | 1 | 12 | 40 | 42 | 20 | 0.70 | pgd=3 |

Data augmentation

*.aug_tail.json

Tail classes (12, 22, 32, 35): back-translation through major languages (5 times per sentence) + Chinese EDA augmentation (20 times per sentence)

Files involved: eda.py, back_trans.py
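A minimal sketch of that tail-class augmentation. The helper signatures `eda(text, num_aug)` and `back_translate(text, lang)` and the `{"text": ..., "label": ...}` sample format are assumptions for illustration; see eda.py and back_trans.py for the real interfaces:

```python
# Hypothetical interfaces; the actual ones live in eda.py / back_trans.py.
from eda import eda                      # Chinese EDA: synonym replace / swap / insert / delete
from back_trans import back_translate    # zh -> pivot language -> zh round trip

TAIL_CLASSES = {12, 22, 32, 35}
PIVOT_LANGS = ["en", "fr", "de", "es", "ru"]   # five widely used pivot languages (assumed)

def augment_tail(samples):
    out = []
    for s in samples:
        out.append(s)
        if s["label"] not in TAIL_CLASSES:
            continue
        # 5 back-translated copies per sentence
        out.extend({**s, "text": back_translate(s["text"], lang)} for lang in PIVOT_LANGS)
        # 20 EDA variants per sentence
        out.extend({**s, "text": t} for t in eda(s["text"], num_aug=20))
    return out
```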

Model training

Run | Final prediction | expand_train_630 | expand_train_632 | expand_train_6422 | expand_train_6460
2022_10_22_19_12_04 3 3 3
2022_10_27_07_38_29 3 3 3 3
2022_11_01_04_26_32 3 3 3 3
2022_11_03_19_41_25 1、2 1、2、3、4 1、2、4
2022_11_05_05_55_17 0
2022_11_06_04_35_15 1、2 1、3
2022_11_06_19_08_24 2
2022_10_25_05_36_40 1
Notes: F1 = 0.6307787544, 0.63293263685, 0.64228001063, 0.63926327467 (each with percentile = 20, fix_thresh = 0.70)
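The cells appear to list which folds of each run are pooled into each merged submission. A minimal sketch of such a merge, assuming soft voting over the selected fold models' probabilities (the actual combination is kept in the ensembling code and ensemble/ensemble.xlsx, and may differ):

```python
import numpy as np

def merge_runs(fold_probs, selection):
    """fold_probs: {run_timestamp: {fold_id: (n_samples, n_classes) prob array}}.
    selection: {run_timestamp: [fold_id, ...]}, i.e. one column of the table above.
    Average the selected fold models and take the argmax per sample."""
    chosen = [fold_probs[run][fold] for run, folds in selection.items() for fold in folds]
    return np.mean(np.stack(chosen), axis=0).argmax(axis=1)
```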

Training log

Time | Member | Score | Pretrained model | Epochs | CV folds | Other settings | Train+val score | Val score
Training start: 2022/10/12 15:49:35
Training end: 2022/10/12 18:52:39
Submitted: 2022/10/12 18:54
张兆 0.46661411496 nghuyong/ernie-3.0-base-zh 60 5 warmup 0.1 (possibly misconfigured)
2080Ti*2 batch=24
random_seed=42
best_1.pt : 0.7630403185746878
best_2.pt : 0.785194086089213
best_3.pt : 0.8484339825318716
best_4.pt : 0.8481442706893945
best_5.pt : 0.8564035464087444
bagging : 0.9039802357258913
Training start: 2022/10/12 20:34:47
Training end: 2022/10/12 20:52:25
Submitted: 2022/10/12 20:54
张兆 0.50926177 Langboat/mengzi-bert-base 25 1 2080Ti*2 batch=24
random_seed=42
random_split_ratio=0.8
0.9035649164096997 0.555419
Training start: 2022/10/12 20:57:30
Training end: 2022/10/13 00:12:28
Submitted: 2022/10/13 08:23
张兆 0.54461296043 nghuyong/ernie-3.0-base-zh 30 10 2080Ti*2 batch=24
random_seed=42
best_1.pt : 0.9501953736953739
best_2.pt : 0.8090291204084319
best_3.pt : 0.6498208126075585
best_4.pt : 0.9427773331560824
best_5.pt : 0.9111302191404046
best_6.pt : 0.8292753225364519
best_7.pt : 0.8938565221741899
best_8.pt : 0.9537918662275343
best_9.pt : 0.7997188727837208
best_10.pt : 0.8814895748089789
bagging : 0.954136902855915
0.616938
0.586474
0.529849
0.564196
0.568573
0.516865
0.46167
0.598384
0.542902
0.67397
Training start: 2022/10/13 11:39:52
Training end: 2022/10/13 16:40:56
Submitted: 2022/10/13 17:38
张兆 0.56185948422 nghuyong/ernie-3.0-base-zh 50 10 2080Ti*2 batch=24
random_seed=42
correct_bias = True
best_1.pt : 0.958411
best_2.pt : 0.878444
best_3.pt : 0.950718
best_4.pt : 0.932414
best_5.pt : 0.929663
best_6.pt : 0.952183
best_7.pt : 0.952413
best_8.pt : 0.965563
best_9.pt : 0.949669
best_10.pt : 0.761528
bagging : 0.995802
0.627148
0.627045
0.524618
0.632453
0.603112
0.521837
0.550469
0.65892
0.498028
0.668365
Training start: 2022/10/15 10:00:01
Training end: 2022/10/15 10:25:30
Training start: 2022/10/15 11:25:11
Training end: 2022/10/15 11:59:54
张兆 not submitted nghuyong/ernie-3.0-base-zh 50 1 Without RDrop:
2080Ti*2 batch=24
random_seed=42
correct_bias = True
With RDrop:
2080Ti*4 batch=24
random_seed=42
correct_bias = True
RDrop=0.4
Without RDrop: 0.909574
With RDrop: 0.924773
Without RDrop: 0.568337
With RDrop: 0.635372
Training start: 2022/10/15 14:01:25
Training end: 2022/10/15 14:35:36
张兆 not submitted nghuyong/ernie-3.0-base-zh 50 1 With FGM (a minimal FGM sketch is given after this log):
2080Ti*4 batch=24
random_seed=42
correct_bias = True
Without FGM: 0.909574
With FGM: 0.924612
Without FGM: 0.568337
With FGM: 0.60684
Training start: 2022/10/15 14:37:24
Training end: 2022/10/15 15:49:57
张兆 not submitted nghuyong/ernie-3.0-base-zh 50 1 With PGD:
2080Ti*4 batch=24
random_seed=42
correct_bias = True
PGD_K=3
Without PGD: 0.909574
With PGD: 0.917637
Without PGD: 0.568337
With PGD: 0.582854
Training start: 2022/10/15 21:18:01
Training end: 2022/10/16 03:22:04
张兆 0.56684304078 nghuyong/ernie-3.0-base-zh 50 10 2080Ti*4 batch=24
random_seed=42
correct_bias = True
RDrop=0.4
best_1.pt : 0.964261
best_2.pt : 0.937494
best_3.pt : 0.93178
best_4.pt : 0.915917
best_5.pt : 0.954728
best_6.pt : 0.95125
best_7.pt : 0.951136
best_8.pt : 0.957213
best_9.pt : 0.959292
best_10.pt : 0.967787
bagging :  0.995802
0.58019
0.576124
0.526915
0.577535
0.612728
0.559396
0.549781
0.646989
0.537495
0.687826
李一鸣 0.60286913357 Langboat/mengzi-bert-base 40 5 batch=12, pseudo-labelling with labels produced by the model scoring F1 = 0.57 on the official test set
李一鸣 0.587 Langboat/mengzi-bert-base 40 1 batch=12, pseudo-labelling with labels produced by the model scoring F1 = 0.57 on the official test set
李一鸣 0.58719238 Langboat/mengzi-bert-base 40 5 batch=12, pseudo-labelling with labels produced by the model scoring F1 = 0.60 on the official test set, RDrop=0.1
李一鸣 0.58009964029 Langboat/mengzi-bert-base 40 1 batch=12, pseudo-labelling with labels produced by the model scoring F1 = 0.60 on the official test set, RDrop=0.1
李一鸣 0.60216 Langboat/mengzi-bert-base 40 1 batch=24, pseudo-labelling with labels produced by the model scoring F1 = 0.599 on the official test set (expand_train_cur_best.json), no extra tricks
Training start: 2022/10/16 17:28:14
Training end: 2022/10/16 18:10:27
张兆 0.52782175403 Langboat/mengzi-bert-base 50 1 2080Ti*2 batch=24 0.911089 0.604567
Training start: 2022/10/16 18:10:31
Training end: 2022/10/16 18:58:40
张兆 0.50326362551 Langboat/mengzi-bert-base 50 1 2080Ti*4 batch=24
RDrop=0.4
0.900517 0.560547
Training start: 2022/10/16 18:58:44
Training end: 2022/10/16 19:41:36
张兆 0.50364182397 nghuyong/ernie-3.0-base-zh 50 1 2080Ti*2 batch=24 0.914422 0.566295
Training start: 2022/10/16 20:57:37
Training end: 2022/10/16 21:48:23
张兆 0.52432191354 nghuyong/ernie-3.0-base-zh 50 1 2080Ti*4 batch=24
RDrop=0.4
0.924773 0.635372
张兆 0.60906089726
1:0.59323842065
3:0.62876738448
5:0.60725801303
2 3 4:0.61215379337
nghuyong/ernie-3.0-base-zh 40 5 V100*4 batch=128
fgm
0.940159
0.959345
0.982243
0.961593
0.961342
bagging:0.967881
0.83072
0.928769
0.927161
0.934107
0.943387
Large-model experiment
2022_10_29_18_22_10
张兆 0.58887581038 hfl/chinese-roberta-wwm-ext-large 50 1 V100*8 batch=64
fgm
0.987128 Epoch=27
0.924402
All-languages experiment
2022_10_29_18_43_16
张兆 0.60163217528 pretrained/nghuyong/ernie-3.0-base-zh 50 1 V100*8 batch=128
fgm
0.999038 Epoch =29
0.99521
Large-model experiment
2022_10_29_20_26_06
张兆 0.59526036696 nghuyong/ernie-3.0-xbase-zh 50 1 V100*8 batch=128
fgm
0.983985 Epoch =34
0.913857
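Several runs in the log above toggle FGM and PGD adversarial training. A standard FGM implementation for BERT-style models is sketched below (a common re-implementation, not necessarily this repo's exact code); PGD is the same idea applied iteratively for PGD_K steps with a projection back into an epsilon-ball around the original embeddings.

```python
import torch

class FGM:
    """Fast Gradient Method: perturb the word-embedding matrix along the
    gradient direction, run a second forward/backward pass, then restore."""
    def __init__(self, model, epsilon=1.0, emb_name="word_embeddings"):
        self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
        self.backup = {}

    def attack(self):
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.emb_name in name and param.grad is not None:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0 and not torch.isnan(norm):
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Per batch: loss.backward(); fgm.attack(); adv_loss.backward(); fgm.restore(); optimizer.step()
```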

Experiment plan

Investigate how different factors affect different models.

Base configuration: batch=12, epoch=40 (early stop), gpu=2,3, lr=2e-5, seed=42, split_test_ratio=0.2, dropout=0.3

Data: expand_train.json
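The grid below compares baseline, R-Drop at several weights, EMA, PGD, warmup and FGM. A minimal sketch of the R-Drop objective, assuming a single-label classification head and two forward passes with independent dropout (not this repo's exact code):

```python
import torch.nn.functional as F

def rdrop_loss(logits1, logits2, labels, alpha=0.1):
    """Cross-entropy on both passes plus a symmetric KL consistency term
    weighted by alpha (0.1 / 0.5 / 1.0 in the grid below)."""
    ce = F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels)
    log_p = F.log_softmax(logits1, dim=-1)
    log_q = F.log_softmax(logits2, dim=-1)
    kl = F.kl_div(log_p, log_q, log_target=True, reduction="batchmean") \
       + F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    return ce + alpha * kl / 2
```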

Model | baseline | rdrop=0.1 | rdrop=0.5 | rdrop=1.0 | ema=0.999 | pgd=3 | warmup=0.1 | fgm
mengzi-bert-base 2022_10_25_05_36_57
Epoch =11 0.943702
0.973459
0.58946182233
2022_10_25_05_36_51
Epoch =13 0.948241
0.981155
0.58861785176
2022_10_25_05_37_12
Epoch =12 0.953051
0.953051
0.58592248996
2022_10_25_05_37_15
Epoch =22 0.955365
0.984842
0.58437014014
2022_10_25_05_41_54
Epoch =20 0.955028
0.983924
0.58043132506
2022_10_25_05_37_08
Epoch =14 0.94846
0.982611
0.58877098445
ernie-3.0-base-zh 2022_10_21_17_44_31
Epoch = 12 0.939449
0.975266
0.61203747617
2022_10_22_16_39_14
Epoch=32 0.947126
0.986405
0.60179269740
2022_10_23_04_57_21
Epoch=20 0.94027
0.975243
0.58086732341
2022_10_21_21_46_43
Epoch =28 0.939117
0.981063
0.60027698592
2022_10_22_22_19_15
Epoch=20 0.942571
0.981936
0.59630101157
chinese-macbert-base 2022_10_19_09_04_25
Epoch =11 0.93218
0.973964
0.58566511183
2022_10_21_17_25_43
Epoch =27 0.937448
0.984506
0.56675022857
2022_10_20_20_00_52
Epoch =19 0.944687
0.983283
0.57831287521
2022_10_22_03_38_29
Epoch =17 0.942641
0.981227
0.58781315810
2022_10_20_02_42_28
Epoch =19 0.950837
0.98362
0.59952852320
2022_10_21_04_24_30
Epoch =33 0.955382
0.990917
0.57818127106
2022_10_19_14_51_36
Epoch =25 0.940839
0.988269
0.59486694090
chinese-roberta-wwm-ext 2022_10_25_08_19_12
Epoch=14 0.943605
0.979684
0.59554496893
2022_10_25_11_23_51
Epoch=25 0.948677
0.985914
0.59928113089
2022_10_25_15_54_51
Epoch=15 0.946985
0.980388
0.58545683003
2022_10_25_19_05_47
Epoch=15 0.945916
0.983087
0.60706732296

English models

Base configuration: batch=12, epoch=40 (early stop), gpu=2,3, lr=2e-5, seed=42, split_test_ratio=0.2, dropout=0.3

Data: expand_train_cur_best_en.json

Model | baseline | ema=0.999 | pgd=3 | fgm | warmup=0.1
ernie-2.0-base-en 2022_10_23_18_05_14
Epoch =24 0.869316
0.947128
0.58633437923
2022_10_23_18_18_01
Epoch =27 0.883606
0.973969
0.58900858108
2022_10_23_18_05_27
Epoch =18 0.851685
0.94073
0.56777475031
2022_10_23_18_19_40
Epoch =24 0.884273
0.953767
0.60969211560
2022_10_23_18_05_32
Epoch =25 0.86958
0.968817
0.59526836713
roberta-base
xlnet-base-cased 2022_10_24_05_41_37
Epoch =31 0.876336
0.976078
0.59146824680
2022_10_24_05_41_04
Epoch =17 0.868918
0.969398
0.59067118434
2022_10_24_05_41_12
Epoch =19 0.875453
0.970272
0.57834073199
2022_10_24_05_41_20
Epoch =21 0.879417
0.976025
0.58459104336
2022_10_24_05_41_29
Epoch =24 0.873381
0.974582
0.59797619630
deberta-v3-base 2022_10_24_10_44_57
Epoch =25 0.875878
0.943788
0.58797673621
2022_10_24_10_50_12
Epoch =37 0.880665
0.947626
0.59859900583
2022_10_24_10_52_08
Epoch =28 0.894997
0.955026
0.60094762341
2022_10_24_10_54_03
Epoch =24 0.881334
0.947271
0.59791066113
2022_10_24_10_51_34
Epoch =24 0.856213
0.933411
0.58096981486

Erlangshen: IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese

Data: expand_train_cur_best.json

2022_10_26_06_08_38: without fgm — Epoch = 14, 0.909819, 0.979235, 0.61420523782

2022_10_26_06_05_48: with fgm — Epoch = 8, 0.913229, 0.979437, 0.59672118680

ernie-2.0-base-en + fgm, 10-fold cross-validation, expand_train_cur_best_en.json

2022_10_25_05_36_32

1:Epoch =23 0.738594 0.943053 0.59779498710

2:Epoch =20 0.942162 0.965787 0.61183453255

3:Epoch =14 0.921992 0.961072 0.60124333189

4:Epoch =22 0.912673 0.964743 0.60470543799

5:Epoch =15 0.90293 0.962382 0.61298211869

6:Epoch =23 0.918961 0.974785 0.62131731657

7:Epoch =15 0.920576 0.963609

8:Epoch =20 0.914956 0.965676

9:Epoch =40 0.946059 0.991682

10:Epoch=29 0.938323 0.990791

0.970571

0.61755036431

ernie-3.0-base-zh + fgm, 10-fold cross-validation, expand_train_cur_best+en_zh.json

2022_10_25_05_36_40

1:Epoch =39 0.980802 0.997899 0.62529548458

2:Epoch =35 0.99583 0.999507 0.61470568046

3:Epoch=33 0.968477 0.996453 0.61029077511

4:Epoch =40 0.998369 0.99666 0.61501080887

5:Epoch =37 0.96937 0.996507 0.61981191769

6:Epoch =34 0.95805 0.995294 0.62014754016

7:Epoch =39 0.97682 0.994493 0.60765491039

8:Epoch =37 0.950761 0.994647

9:Epoch=35 0.983666 0.995142

10:Epoch =39 0.974248 0.997419

0.996792

0.62106441848
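The final `bagging` score in each cross-validation block above combines the fold models into one submission. A minimal sketch, assuming soft voting over averaged class probabilities (the repo may use a different combination rule, e.g. hard voting):

```python
import numpy as np

def bag_predictions(fold_probs):
    """fold_probs: list of (n_samples, n_classes) probability arrays,
    one per fold model. Average them and take the argmax per sample."""
    return np.mean(np.stack(fold_probs), axis=0).argmax(axis=1)
```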

Random-seed experiment

batch=128, bert='pretrained/nghuyong/ernie-3.0-base-zh'

data_file='expand_train_cur_best.json',

8*V100 32G

5-fold

Seed | Timestamp | 1 | 2 | 3 | 4 | 5 | all
3407 2022_10_27_07_36_12 Epoch =29
0.828277
0.940246
Epoch=38
0.922059
0.975522
Epoch =26
0.922241
0.978571
Epoch =20
0.920416
0.961822
0.59931987441
Epoch =18
0.928557
0.964518
0.976724
0.60708719687
fgm+3407 2022_10_27_07_32_33 Epoch =29
0.837168
0.958176
0.62207249038
Epoch =33
0.93111
0.978384
0.61360985387
Epoch=27
0.928721
0.981655
0.60667812125
Epoch =22
0.932218
0.959626
0.59740545961
Epoch =29
0.938336
0.972912
0.59479068059
0.984951
0.61382655005
219 2022_10_27_07_40_08 Epoch = 16
0.82509
0.936603
0.59974018113
Epoch = 17
0.921952
0.956262
0.60490219534
Epoch = 21
0.912299
0.955187
0.59659364810
Epoch = 16
0.932538
0.960904
Epoch = 35
0.936999
0.98227
0.61027444916
0.976059
0.61142175663
fgm+219 2022_10_27_07_44_22 Epoch=33
0.837085
0.942478
0.61739339132
Epoch =36
0.931217
0.978476
0.61062900994
Epoch =32
0.920706
0.981834
0.61664927614
Epoch =34
0.934104
0.978847
0.62235183026
Epoch =35
0.945433
0.986606
0.60045155119
0.991535
0.61824679505
909 2022_10_27_07_38_29 Epoch =38
0.827719
0.949487
Epoch =32
0.927994
0.969012
Epoch =30
0.918163
0.97716
0.63077875449
Epoch =18
0.927548
0.968832
Epoch=38
0.933141
0.982951
0.60616902305
0.985358
0.62524955972
fgm+909 2022_10_27_07_43_16 Epoch = 34
0.833804
0.948922
Epoch = 21
0.932729
0.96123
0.60084338886
Epoch = 13
0.921903
0.949437
0.59973293567
Epoch = 36
0.934
0.979314
0.62197043809
Epoch = 36
0.938507
0.98511
0.60984056326
0.978551
0.61553075216

SWA experiment

ernie-3.0-base-zh, single V100 32G, batch 32, epoch 50, data: expand_train_cur_best.json

with fgm | without fgm
2022_10_29_09_06_14
Epoch =50 0.916163
0.981227
0.62434188208
2022_10_29_09_03_42
Epoch =46 0.914986
0.980371
0.60998950294
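A minimal sketch of the SWA setup using torch.optim.swa_utils. The start epoch, SWA learning rate and the generic `compute_loss` callback are illustrative placeholders, not values taken from this repo:

```python
from torch.optim.swa_utils import AveragedModel, SWALR

def train_with_swa(model, train_loader, optimizer, compute_loss,
                   epochs=50, swa_start=30, swa_lr=1e-5):
    swa_model = AveragedModel(model)              # running average of the weights
    swa_scheduler = SWALR(optimizer, swa_lr=swa_lr)
    for epoch in range(epochs):
        model.train()
        for batch in train_loader:
            optimizer.zero_grad()
            compute_loss(model, batch).backward()
            optimizer.step()
        if epoch >= swa_start:
            swa_model.update_parameters(model)    # accumulate the average after each late epoch
            swa_scheduler.step()
    # BERT-style encoders have no BatchNorm, so the usual update_bn pass can be skipped.
    return swa_model
```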

Model-head structure experiment

ernie-3.0-base-zh, 4*V100 32G, batch 128, epoch 50, data: expand_train_cur_best.json, fgm

model1 model2 model3 model4 model5
last_hidden+MLP | last four layers concatenated | pooler_output | pooler_output+MLP | mean of first four + last four layers
2022_10_29_13_36_46
Epoch =29 0.917249
0.980737
0.61477357402
2022_10_29_09_10_37
Epoch =29 0.917644
0.985873
0.60476939744
2022_10_29_09_13_44
Epoch =24 0.921817
0.982066
0.60608243078
2022_10_29_09_14_02
Epoch =41 0.923856
0.982397
0.61429143697
2022_10_29_09_16_20
Epoch =14 0.918428
0.974142
0.60180561494
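One plausible reading of the five heads (model1–model5) is that each feeds a pooled sentence vector into the same linear classifier and differs only in how that vector is built. The sketch below assumes HuggingFace-style outputs with `output_hidden_states=True`; the repo's exact heads (MLP sizes, dropout, layer choice) may differ:

```python
import torch

def pooled_representation(outputs, variant):
    """outputs: HuggingFace model output with hidden_states enabled.
    Returns the sentence vector fed to the classifier for each head variant."""
    hs = outputs.hidden_states                  # embeddings + one tensor per layer
    cls = lambda layer: layer[:, 0]             # [CLS] vector of a layer
    if variant == "model1":                     # last_hidden (+ MLP before the classifier)
        return cls(outputs.last_hidden_state)
    if variant == "model2":                     # concatenation of the last four layers
        return torch.cat([cls(hs[-i]) for i in range(1, 5)], dim=-1)
    if variant in ("model3", "model4"):         # pooler_output (model4 adds an MLP on top)
        return outputs.pooler_output
    if variant == "model5":                     # mean over the first four and last four layers
        layers = list(hs[1:5]) + list(hs[-4:])
        return torch.stack([cls(t) for t in layers]).mean(dim=0)
    raise ValueError(variant)
```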

5-fold experiment

expand_train_cur_best.json

Model | Training config | 1 | 2 | 3 | 4 | 5 | all
ernie-xbase + fgm 2022_10_30_05_13_40
8*V100 32G
batch 128
Epoch=28
0.85882
0.962631
0.59420600432
Epoch =11
0.925827
0.957288
0.59320443123
Epoch =20
0.930223
0.982003
0.60906881597
Epoch =18
0.93642
0.976608
0.59153312985
Epoch =18
0.945131
0.984544
0.60618609586
0.99059
0.60398456406
ernie-xbase + fgm + swa 2022_10_30_05_17_06
single V100 32G
batch 16
Epoch =34
0.833288
0.59777530265
Epoch =23
0.936805
2022_11_02_03_26_58
Epoch=23
0.942784
2022_11_02_20_43_59
stopped
ernie + fgm + swa 2022_10_30_05_10_41
single V100 32G
batch 32
Epoch =18
0.850886
0.945433
0.61460610456
Epoch =22
0.937315
0.962094
0.61609908526
Epoch=47
0.927484
0.984062
0.61932705714
Epoch =43
0.936822
0.980606
0.61907224613
Epoch =29
0.948016
0.965169
0.61239652746
0.970118
0.61836815011

Loss-function experiment

ernie-3.0-base-zh, 2*V100 32G, batch 32, expand_train_cur_best.json

cb rfl ntrfl dbfl
2022_10_30_16_22_03 2022_10_30_16_26_50 2022_10_30_16_31_32 2022_10_30_16_32_21
Epoch =42
0.92314
0.986964
0.61037779477
Epoch=26
0.92106
0.98226
0.61772640943
Epoch =17
0.925418
0.981216
0.61171727349
Epoch =29
0.922694
0.981506
0.60664211626
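The abbreviations cb / rfl / ntrfl / dbfl are not expanded in this summary; they appear to be class-imbalance-aware losses. As a representative example only (not claimed to match any of the four exactly), a class-balanced focal loss that combines effective-number class weights with focal modulation:

```python
import torch
import torch.nn.functional as F

def class_balanced_focal_loss(logits, labels, samples_per_class, beta=0.9999, gamma=2.0):
    """Class-balanced weights (effective number of samples) plus focal modulation."""
    counts = torch.as_tensor(samples_per_class, dtype=torch.float, device=logits.device)
    weights = (1.0 - beta) / (1.0 - beta ** counts)   # effective-number class weights
    weights = weights / weights.sum() * len(counts)   # normalise to mean 1
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(1, labels.unsqueeze(1)).squeeze(1)
    focal = (1.0 - log_pt.exp()) ** gamma * (-log_pt)
    return (weights[labels] * focal).mean()
```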

Large-model random-seed experiment

batch=64, bert='pretrained/nghuyong/ernie-3.0-xbase-zh'

data_file='expand_train_cur_best.json'

4*V100 32G

4-fold

Seed | Timestamp | 1 | 2 | 3 | 4 | all
5267 2022_10_31_03_07_03 Epoch = 8
0.844945
0.936316
2022_11_01_03_36_44
Epoch = 14
0.929097
0.975143
2022_11_01_03_42_36
Epoch = 37
0.936798
0.983429
2022_11_01_03_48_52
Epoch = 20
0.937756
0.982108
2022_11_01_03_52_27
0.979784
6271 2022_10_31_03_07_58 Epoch =26
0.857581
0.953757
0.60340980050
Epoch =11
0.933323
0.956936
2022_11_01_03_59_29
Epoch =22
0.93793
0.981894
2022_11_01_04_03_06
Epoch =15
0.933826
0.980293
2022_11_01_04_06_31
0.985472
3254 2022_10_31_03_13_49 Epoch =15
0.848537
0.954258
2022_11_01_04_09_52
Epoch =23
0.930986
0.979635
0.59871831078
Epoch =33
0.936514
0.98287
0.62449404558
Epoch=24
0.931808
0.981126
0.59931316293
0.995635
0.61619123551
1618 2022_10_31_03_15_53 Epoch =24
0.850448
0.956344
2022_11_01_04_24_47
Epoch =17
0.921543
0.971829
2022_11_01_04_28_20
Epoch =9
0.927174
0.954966
2022_11_01_04_31_58
Epoch=41
0.939747
0.984025
0.61157607909
0.990993
5374 2022_10_31_03_19_03 Epoch = 12
0.851206
0.938386
2022_11_01_04_40_29
Epoch = 15
0.922498
0.977894
2022_11_01_04_44_08
Epoch = 14
0.935281
0.976279
2022_11_01_04_47_53
Epoch = 16
0.934405
0.97992
2022_11_01_04_51_41
0.98513
7606 2022_10_31_03_35_55 Epoch =14
0.851953
0.948443
0.61432388691
Epoch=21
0.931022
0.974039
2022_11_01_04_59_19
Epoch=17
0.935818
0.977052
2022_11_01_05_03_09
Epoch =24
0.944068
0.984329
0.58745971606
0.991329

Augmented-data experiment

expand_train_aug_tail.json

ernie-3.0-base-zh

fgm

model1

Config | Timestamp | 1 | 2 | 3 | 4 | 5 | all
batch=128
4*V100
seed=3407
2022_11_01_04_22_41
2022_11_02_03_08_56
Epoch =32
0.871526
0.974331
Epoch =28
0.964592
0.991178
Epoch =41
0.973115
0.994438
0.62325553677
Epoch =13
0.970786
0.989015
0.61143654969
Epoch =19
0.976139
0.992837
0.61164173186
0.997983
0.61838623151
batch=128
4*V100
seed=42
2022_11_01_04_26_32
2022_11_02_03_10_10
Epoch =29
0.87611
0.975097
Epoch =31
0.965207
0.990986
0.61300549904
Epoch =24
0.965826
0.991184
0.63293263685
Epoch =39
0.977998
0.995103
0.61587753363
Epoch =18
0.972853
0.991349
0.60745748898
0.997956
0.62664852015
batch=32
single GPU
seed=42
swa=True
2022_11_01_05_09_54 Epoch =48
0.876458
2022_11_02_03_40_24
Epoch =28
0.971478
2022_11_02_20_35_04
stopped

Random-seed experiment

batch=64, bert='pretrained/nghuyong/ernie-3.0-base-zh'

data_file='expand_train_cur_best.json',

4*V100 32G

fgm

4-fold

Seed | Timestamp | 1 | 2 | 3 | 4 | all
1 2022_11_02_03_32_27
2022_11_02_19_38_48
Epoch =12
0.853343
0.938496
Epoch =23
0.934691
0.961254
Epoch =18
0.93041
0.959059
Epoch =20
0.937899
0.977642
0.969301
2 2022_11_02_03_32_58
2022_11_02_19_47_37
Epoch =25
0.856089
0.955624
Epoch =25
0.935358
0.978399
Epoch =19
0.925913
0.957886
Epoch =18
0.937418
0.980764
0.968468
0.62021852110
3 2022_11_02_03_37_21
2022_11_02_19_56_27
Epoch =16
0.857157
0.941225
Epoch =21
0.936291
0.977023
Epoch =36
0.9289
0.977059
Epoch =42
0.931982
0.980649
0.9868
0.61908867574
4 2022_11_02_03_40_55
2022_11_02_20_05_19
Epoch=20
0.85396
0.940266
Epoch =13
0.934075
0.958485
Epoch=25
0.937078
0.977961
Epoch =13
0.933042
0.953424
0.968248

Other experiments

ernie-3.0-base-zh,model1,batch=128,4*V100 32G,seed=42

Config | Timestamp | 1 | 2 | 3 | 4 | 5 | all
train_1.json 2022_11_02_19_46_18 Epoch=30
0.951351
0.989547
Epoch =8
0.960934
0.982272
Epoch =33
0.972444
0.994191
0.62424420592
Epoch=32
0.971671
0.994274
0.62282102063
Epoch=42
0.983029
0.996411
0.60209126670
0.999394
0.62709795528
train_2.json 2022_11_02_19_48_24 Epoch =27
0.936979
0.987124
Epoch =14
0.968284
0.99085
Epoch =24
0.968725
0.992486
0.61248007133
Epoch =29
0.976006
0.994891
Epoch =50
0.980255
0.995916
0.999213
0.62776592938
train_3.json 2022_11_02_19_49_15 Epoch =33
0.945613
0.988999
Epoch =33
0.96962
0.993691
Epoch =22
0.970094
0.993323
Epoch =23
0.974919
0.994009
Epoch =44
0.980772
0.995917
0.999506
0.61825667047
assignee=True
expand_train_aug_tail.json
2022_11_02_19_51_58 Epoch =24
0.862726
0.969741
Epoch =7
0.945047
0.968843
Epoch =9
0.953333
0.976113
Epoch =30
0.960581
0.990204
Epoch =17
0.955794
0.983563
0.990541
0.61689728830
seed=909
expand_train_aug_tail.json
fgm=False
2022_11_02_19_54_00 Epoch =16
0.862872
0.970313
Epoch =27
0.961241
0.990883
Epoch =15
0.958736
0.985262
0.60714643290
Epoch =19
0.967793
0.990451
Epoch =16
0.965509
0.985466
0.996883
0.61925341996
seed=42
train_630_aug_tail.json
2022_11_02_19_58_44 Epoch =38
0.86662
0.973508
0.61892184410
Epoch=22
0.966239
0.991846
0.62699731852
Epoch =22
0.99014
0.996317
0.62829439429
Epoch =18
0.978446
0.992617
0.62607548483
Epoch=13
0.972539
0.989964
0.61848918549
0.99747
0.63099144316
seed=909
train_630_aug_tail.json
fgm=False
2022_11_02_19_59_44 Epoch =20
0.858304
0.969588
Epoch=14
0.961148
0.986552
Epoch =19
0.981033
0.992011
0.62140866596
Epoch =16
0.967573
0.986563
Epoch =10
0.967701
0.984855
0.995281
0.62570651609
seed=42
train_632_aug_tail.json
2022_11_02_20_01_13 Epoch =24
0.870113
0.973455
2022_11_03_03_59_12_3
Epoch =42
0.972512
0.994253
0.62641728896
Epoch =25
0.993971
0.997304
0.62363084980
Epoch =29
0.978241
0.994846
0.61682686555
Epoch =19
0.975262
0.992922
0.62228775322
0.998875
0.63094056485
seed=909
train_632_aug_tail.json
fgm=False
2022_11_02_20_01_55 Epoch =10
0.850151
0.964642
Epoch=25
0.964354
0.990974
Epoch =26
0.986338
0.994619
0.61113461788
Epoch =17
0.966243
0.988185
Epoch=19
0.968454
0.988842
0.997676
0.62093379110
seed=909
expand_train_630.json
fgm=False
2022_11_02_20_21_50 Epoch =19
0.818384
0.937568
0.59469983287
Epoch =28
0.930541
0.975334
0.60771179775
Epoch=20
0.952989
0.961648
0.60115782034
Epoch =31
0.928331
0.97673
0.61745018178
Epoch =25
0.93317
0.980749
0.62231670502
0.978132
0.62578734849
seed=909
expand_train_632.json
fgm=False
2022_11_02_20_23_08 Epoch =25
0.83736
0.960503
Epoch =14
0.936257
0.977144
Epoch=29
0.968587
0.984876
Epoch =19
0.949106
0.97852
Epoch =19
0.942543
0.977803
0.991369
0.60985007379

4-fold experiment

ernie-3.0-base-zh,model1,batch=128,4*V100 32G,seed=42

Timestamp | 1 | 2 | 3 | 4 | all
2022_11_03_19_40_33
train_630_aug_tail.json
Epoch=20
0.87107
0.967542
Epoch =27
0.976572
0.992577
Epoch =13
0.978638
0.989724
Epoch =21
0.974908
0.992153
0.997476
0.62603740631
2022_11_03_19_41_25
train_632_aug_tail.json
Epoch =12
0.872987
0.968018
0.61577661518
Epoch =33
0.978451
0.994023
Epoch=28
0.978055
0.992478
Epoch =20
0.974541
0.991169
0.99844
0.63234589689 (without fold 1: 0.62031840512)

Mixed experiments

ernie-3.0-base-zh,model1,batch=128,4*V100 32G

Timestamp | Config | 1 | 2 | 3 | 4 | 5 | all
2022_11_05_05_35_07 expand_train_6422.json
seed=42 fgm
K=4
Epoch =19
0.85822
0.942561
Epoch =30
0.954033
0.988508
Epoch=44
0.977698
0.987825
Epoch =46
0.973522
0.989223
0.992954
0.60793513753
2022_11_05_05_37_29 expand_train_6422.json
seed=42 fgm
K=5
Epoch=22
0.848551
0.962322
Epoch=19
0.94463
0.962904
Epoch =34
0.961831
0.990747
Epoch =33
0.963921
0.987833
Epoch =41
0.981031
0.989617
0.992817
0.61795701255
2022_11_05_05_40_49 expand_train_6422.json
seed=909 nfgm
K=4
Epoch =21
0.857822
0.939576
Epoch =23
0.947798
0.976549
Epoch=21
0.964533
0.978804
Epoch =28
0.94602
0.981704
0.967977
2022_11_05_05_41_19 expand_train_6422.json
seed=909 nfgm
K=5
Epoch =26
0.837493
0.941728
not submitted
Epoch =40
0.936513
0.981417
not submitted
Epoch=19
0.95189
0.970197
not submitted
Epoch =42
0.957603
0.985129
not submitted
Epoch =42
0.947453
0.959921
not submitted
0.983114
not submitted
2022_11_05_05_44_35 expand_train_6422_aug_tail.json
seed=42 fgm
K=5
Epoch =41
0.886191
0.976738
Epoch =11
0.979304
0.989131
Epoch =14
0.995935
0.995623
Epoch=34
0.988813
0.997246
Epoch =19
0.987683
0.994889
0.997153
0.62279051452
2022_11_05_05_45_50 expand_train_6422_aug_tail.json
seed=42 fgm
K=4
Epoch =21
0.89401
0.972746
Epoch =30
0.989453
0.996302
Epoch =14
0.989562
0.992872
Epoch =24
0.988703
0.995577
0.99765
0.6272329318
2022_11_05_05_46_36 expand_train_6422_aug_tail.json
seed=909 nfgm
K=4
Epoch =37
0.889075
0.971764
Epoch =21
0.977557
0.991733
Epoch=27
0.982924
0.994052
Epoch =25
0.980508
0.992498
0.998318
0.62790207858
2022_11_05_05_46_58 expand_train_6422_aug_tail.json
seed=909 nfgm
K=5
Epoch =14
0.872622
0.972961
Epoch =16
0.974689
0.990382
Epoch =28
0.991149
0.995408
Epoch =25
0.977126
0.992539
Epoch =23
0.978972
0.993038
0.997585
2022_11_05_05_55_17 expand_train_6422_aug_tail.json
seed=42 fgm
K=1
Epoch =25
0.966275
0.991975
0.62600679310
2022_11_05_08_36_58 expand_train_6422.json
seed=42 fgm
K=1
Epoch =31
0.935153
0.98574
0.61366218811
2022_11_05_11_38_44 expand_train_6422.json
seed=909 nfgm
K=1
Epoch =15
0.904939
0.943555
not submitted
2022_11_05_12_51_22 expand_train_6422_aug_tail.json
seed=909 nfgm
K=1
Epoch =16
0.961666
0.988959
0.60882394030

3-fold experiment

ernie-3.0-base-zh,model1,batch=128,4*V100 32G,seed=42 fgm,K=3

Timestamp | Config | 1 | 2 | 3 | all
2022_11_06_04_34_51 expand_train_aug_tail.json Epoch = 15
0.898812
0.965936
Epoch = 20
0.963193
0.98506
Epoch = 27
0.968975
0.988606
0.997509
2022_11_06_04_35_15 expand_train_6422_aug_tail.json Epoch =25
0.914169
0.970832
Epoch =18
0.989666
0.994014
Epoch =12
0.981403
0.988101
0.994475
0.62673116125
2022_11_06_04_37_58 expand_train_6422.json Epoch =27
0.876217
0.951335
Epoch =33
0.968127
0.986179
Epoch =31
0.943877
0.979998
0.978263
2022_11_06_04_38_28 expand_train_632.json Epoch =20
0.883645
0.956869
Epoch=26
0.950254
0.978839
Epoch=23
0.956906
0.979368
0.993394
2022_11_06_09_05_33 train_632_aug_tail.json Epoch = 14
0.898623
0.966144
Epoch = 21
0.9794
0.990766
Epoch = 38
0.971669
0.990213
0.998695

Final experiments

model1,batch=128,4*V100 32G,seed=42 fgm

Timestamp | Config | 1 | 2 | 3 | all
2022_11_06_18_41_14 chinese-roberta-wwm-ext
expand_train_6422_aug_tail.json
K=3
Epoch = 17
0.912253
0.970553
Epoch = 9
0.987278
0.990932
Epoch = 33
0.978513
0.992626
0.997274
2022_11_06_18_51_20 chinese-roberta-wwm-ext
expand_train_6422_aug_tail.json
K=1
Epoch =14
0.963594
0.991064
2022_11_06_19_06_54 chinese-roberta-wwm-ext
expand_train_6460_aug_tail.json
K=3
Epoch =20
0.926447
0.9753
Epoch =14
0.982489
0.991879
Epoch =26
0.982073
0.993274
0.997845
2022_11_06_19_07_09 chinese-roberta-wwm-ext
expand_train_6460_aug_tail.json
K=1
Epoch =22
0.967037
0.992853
2022_11_06_19_08_24 ernie-3.0-base-zh
expand_train_6460_aug_tail.json
K=3
Epoch =20
0.929239
0.976106
Epoch =20
0.989386
0.993937
Epoch =16
0.984976
0.992109
0.996779
2022_11_06_19_09_34 ernie-3.0-base-zh
expand_train_6460_aug_tail.json
K=1
Epoch =29
0.967763
0.99287

Model ensembling

ensemble/ensemble.xlsx

Test-set probing experiments

  1. How many test samples actually have label 32? (label 22 gives the same result)

Submission score: 0.00028845044

Predicting label 32 for every test sample gives recall $R=1$ for class 32 and zero F1 for the other 35 classes, so

$F1_{macro}=\frac{1}{36}\cdot\frac{2PR}{P+R}=\frac{1}{36}\cdot\frac{2P}{P+1}=0.00028845044$

Rearranging, $0.00028845044\cdot 18\,(P+1)=P$, i.e. $(1-0.00519210792)\,P=0.00519210792$.

There are 20839 test samples and all of them are predicted as 32, so $P=\frac{x}{20839}$, hence $\frac{0.00519210792}{1-0.00519210792}=\frac{x}{20839}$ and $x=108.763046419599$.

So there should be roughly 109 samples with label 32.

  2. How many test samples have label 2?

Submission score: 0.00891031259

With $R=1$ as above, $F1_{macro}=\frac{1}{36}\cdot\frac{2P}{P+1}=0.00891031259$.

Rearranging, $0.00891031259\cdot 18\,(P+1)=P$, i.e. $(1-0.16038562662)\,P=0.16038562662$.

With 20839 test samples, $\frac{0.16038562662}{1-0.16038562662}=\frac{x}{20839}$, giving $x=3980.72755672264$.

So there should be roughly 3981 samples with label 2.

  3. How many test samples have label 35?

Submission score: 0.00034578147
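The arithmetic above can be checked directly: with a constant-label submission, recall for that class is 1, every other class contributes 0 to the macro average, and precision is the true class count divided by the 20839 test samples.

```python
def count_from_score(score, n_test=20839, n_classes=36):
    """Invert macro-F1 = (1/n_classes) * 2P/(P+1) for a constant-label submission."""
    half = n_classes / 2                       # 18 for 36 classes
    precision = half * score / (1 - half * score)
    return n_test * precision

print(count_from_score(0.00028845044))  # ~108.8  -> about 109 samples of class 32
print(count_from_score(0.00891031259))  # ~3980.7 -> about 3981 samples of class 2
```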