Hi Jie,

Here are our new papers on logical reasoning data augmentation, prompt augmentation, and evaluation. Please consider adding these papers to your arXiv paper. Thanks a lot.
Logic-Driven Data Augmentation and Prompt Augmentation
We present an AMR-based, logic-driven data augmentation method for contrastive learning that improves the logical reasoning performance of discriminative language models. We also use the same AMR-based augmentation to augment prompts, which helped GPT-4 reach #1 on the ReClor leaderboard (one of the hardest logical reasoning reading comprehension datasets, with data collected from LSAT and GMAT questions). In addition, we achieve better performance than other baseline models on a range of logical reasoning reading comprehension and natural language inference tasks. Here are the details of the paper.
Our paper (Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Neset Tan, Nathan Young, Yang Chen, Yonghua Zhu, Michael Witbrock, Jiamou Liu)
"Enhancing Logical Reasoning of Large Language Models through Logic-Driven Data Augmentation" [Paper link] [Source code] [Model weights] [Leaderboard].
Out-of-Distribution Logical Reasoning Evaluation and Prompt Augmentation for Enhancing OOD Logical Reasoning
We present a systematic out-of-distribution (OOD) evaluation of logical reasoning tasks. We construct three new, more robust logical reasoning datasets, ReClor-Plus, LogiQA-Plus, and LogiQAv2-Plus, which are derived from ReClor, LogiQA, and LogiQAv2 by changing the order and form of the answer options. We find that chain-of-thought prompting alone does not improve model performance in the out-of-distribution setting, whereas using our AMR-based logic-driven data augmentation to augment the prompt does improve large language models' performance on out-of-distribution logical reasoning tasks. The three datasets have been collected by OpenAI/Evals.
"A Systematic Evaluation of Large Language Models on Out-of-Distribution Logical Reasoning Tasks" [Paper link] [Source code] [Dataset links].
An Empirical Study on Out-of-Distribution Multi-Step Logical Reasoning
We find that pre-trained language models struggle with robust multi-step logical reasoning, and one of the main reasons is the limited amount of training data for deeper multi-step reasoning. We therefore present a larger and deeper multi-step logical reasoning dataset named PARARULE-Plus. The dataset has also been collected by OpenAI/Evals.
"Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation" [Paper link] [Source code] [Dataset links].