Evaluation Method for BIRD Dataset [Enhancement] #159
Comments
@lucaordronneau Thanks for your interest in our work. Yes, EX is stricter: we consider the order of the returned results to be part of the user's requirements. Imagine an agent returning a long list in a messy order, which users would find very annoying. However, if you prefer to disregard order, you can try our new metric, SOFT-FT, which is currently in beta testing. A detailed explanation is available here:
Project Name: China Urban Bird Dataset
Project Description: We are compiling a dataset of birdwatching records from citizens across various cities in China for scientific research related to bird conservation. We welcome data sources from any channel, including but not limited to structured species distribution databases, citizen science projects, social media data, and historical literature data.
Data Requirements Details: The data sources we need should at least include species name, geographic information (precise coordinates or specific locations), and observation dates. It would be best if the data also included specific information such as the species' Latin name, Chinese name, and English name. We require data sources that are within the scope of China and at the urban scale.
Contribution Guidelines: We hope for submissions in Excel or CSV format, and other table formats compatible with the Windows system are also acceptable. We prefer data to be shared under a free license agreement, but we also support acquiring data through compensated purchase arrangements.
Contact Information: Please submit to the email address [email protected].
Hello,
I found a possible improvement to the evaluation process for the BIRD dataset. The prediction below is marked as incorrect by the evaluation method, but the only difference is the order of the elements.
The evaluation method uses strict equality. This occurs in the file bird/llm/src/evaluation.py on line 26.
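To illustrate the kind of relaxed check I have in mind, here is a minimal sketch of an order-insensitive comparison. It is not the code in evaluation.py; `rows_match` and its `ignore_column_order` flag are hypothetical names, and it assumes each result is a list of tuples of hashable values, such as the rows returned by `sqlite3`.

```python
from collections import Counter


def rows_match(predicted, expected, ignore_column_order=False):
    """Hypothetical helper: compare two SQL result sets without regard to row order.

    Rows are compared as a multiset, so duplicate rows still have to
    appear the same number of times in both results.
    """

    def normalize(row):
        values = tuple(row)
        if ignore_column_order:
            # Sort by repr so mixed types (int, str, None) can be ordered.
            values = tuple(sorted(values, key=repr))
        return values

    return Counter(map(normalize, predicted)) == Counter(map(normalize, expected))


predicted = [("b", 2), ("a", 1)]
expected = [("a", 1), ("b", 2)]

strict_result = predicted == expected            # list equality is order-sensitive
relaxed_result = rows_match(predicted, expected)  # row order is ignored
```

With strict list equality the two results above differ, while the relaxed check treats them as the same. Whether column order should also be ignored is a separate decision, which is why it sits behind its own flag.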