Stemming unknown proper nouns #9

meliksahturker · 2021-07-02T09:25:36Z

Hey Olga, good work with zeyrek.

I have a small suggestion for improvement. Zeyrek can return the stem of a known proper noun when the inflection is attached with an apostrophe. Example:
"istanbul'daki" -> "İstanbul"

but for an unknown proper noun it merges the inflection into the stem without parsing it. Example:
"melik'in" -> "melikin"

So my suggestion is that it should return the part before the apostrophe. I'm not sure whether it should also parse the inflection after the apostrophe, though. I might be missing some other apostrophe cases, but here I am only pointing out the behavior with unknown proper nouns and their inflections.

obulat · 2021-07-02T11:48:45Z

Thank you for opening the first issue here, @meliksahturker :)
Zemberek-nlp can parse unknown words, and I was planning to port that to Zeyrek as well, but I haven't had time for it, sorry. I would greatly appreciate any contributions in this area. Other than that, you are free to fork Zeyrek and just add the 'remove the part after the apostrophe' functionality, if you want.
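
For anyone who wants to try the 'remove the part after apostrophe' idea in a fork, here is a minimal sketch of the string handling only. It is not part of Zeyrek: the function name `stem_before_apostrophe` and the constant `APOSTROPHES` are hypothetical, and treating the typographic apostrophe (’) the same as the ASCII one is an assumption. It does not parse the suffix and does not change capitalization.

```python
# Hypothetical helper, not part of Zeyrek's API.
# Assumption: both the ASCII apostrophe and the typographic one separate
# a proper noun from its inflection.
APOSTROPHES = ("'", "\u2019")


def stem_before_apostrophe(word: str) -> str:
    """Return the part of `word` before its first apostrophe.

    The word is returned unchanged if it has no apostrophe, or if the
    apostrophe is the first character (to avoid returning an empty stem).
    """
    # str.find returns -1 when the character is absent; keep only
    # positions strictly after the first character.
    positions = [i for i in (word.find(ch) for ch in APOSTROPHES) if i > 0]
    return word[:min(positions)] if positions else word


if __name__ == "__main__":
    print(stem_before_apostrophe("melik'in"))
```

This would presumably be applied only as a fallback, when the analyzer does not recognize the word. How a fork would detect that case is not covered here.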
