_langdetect endpoint missing in ES 2.2? #13

Open
marbleman opened this issue Feb 23, 2016 · 8 comments

Comments

@marbleman

With ES 1.7 we used the _langdetect endpoint to verify the language of a document prior to indexing it, following the examples from https://github.com/jprante/elasticsearch-langdetect.

Trying the same with ES 2.2 and bundle 2.2.0.1, the example query now returns:

curl -XPOST 'localhost:9200/_langdetect?pretty' -d 'Das ist ein Test'
{
"error" : {
"root_cause" : [ {
"type" : "invalid_index_name_exception",
"reason" : "Invalid index name [_langdetect], must not start with '_'",
"index" : "_langdetect"
} ],
"type" : "invalid_index_name_exception",
"reason" : "Invalid index name [_langdetect], must not start with '_'",
"index" : "_langdetect"
},
"status" : 400
}

Is the endpoint still available somewhere?

@marbleman
Author

No comments? Hope my question wasn't too birdbrained... ;) However, if so, I would appreciate a hint on what I am missing...

@jprante
Owner

jprante commented Mar 2, 2016

Sorry, I overlooked the issue.

I released 2.2.0.2 with a fix.

The download link for the plugin zip file is:

https://github.com/jprante/elasticsearch-plugin-bundle/releases/download/2.2.0.2/elasticsearch-plugin-bundle-2.2.0.2-plugin.zip

@marbleman
Author

Thanks a lot for your response! I installed it right away. Unfortunately, I get an error regardless of whether I execute the request from Sense or from the command line:

curl -XPOST 'localhost:9200/_langdetect?pretty' -d 'Das ist ein Test'
{
"error" : {
"root_cause" : [ {
"type" : "illegal_state_exception",
"reason" : "failed to find action [org.xbib.elasticsearch.action.langdetect.LangdetectAction@d8b70e11] to execute"
} ],
"type" : "illegal_state_exception",
"reason" : "failed to find action [org.xbib.elasticsearch.action.langdetect.LangdetectAction@d8b70e11] to execute"
},
"status" : 500
}

@jprante
Owner

jprante commented Mar 2, 2016

OK, that was the reason why I removed the REST action. I have to investigate how to solve this class loader issue.

@marbleman
Author

Thanks in advance! IMHO the _langdetect REST endpoint is quite an important feature, since it lets us check the language prior to indexing. Each document can then be sent to the right index, which has the appropriate analyzers for that language.
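A minimal sketch of the "check the language prior to indexing" step, in Python. It assumes the endpoint answers with a JSON body containing a `languages` list of `language`/`probability` pairs, as in the examples of the langdetect README; that response shape, the helper name `top_language`, and the `min_probability` threshold are assumptions for illustration, not something confirmed in this thread.

```python
import json


def top_language(response_body, min_probability=0.5):
    """Return the most probable language code from a _langdetect response.

    Assumed response shape (not confirmed in this thread):
    {"languages": [{"language": "de", "probability": 0.99}, ...]}
    Returns None if nothing was detected or the best guess is too weak.
    """
    data = json.loads(response_body)
    candidates = data.get("languages", [])
    if not candidates:
        return None
    # Pick the candidate with the highest probability.
    best = max(candidates, key=lambda c: c.get("probability", 0.0))
    if best.get("probability", 0.0) < min_probability:
        return None
    return best.get("language")


# Hypothetical response body for the text 'Das ist ein Test'.
sample = (
    '{"languages": ['
    '{"language": "de", "probability": 0.99}, '
    '{"language": "en", "probability": 0.01}]}'
)
print(top_language(sample))
```

The returned code can then be used to decide which index the document goes to.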

@marbleman
Author

@jprante
Owner

jprante commented Mar 7, 2016

Attaching the right analyzer is not what the REST endpoint is for.

In ES 1.x this was possible by assigning an analyzer path. In ES 2.x this was removed. I will implement a multi-field name extension that automatically sets language analyzers; see https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html#_analyze_multiple_times

Thanks for finding the typo.
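For readers unfamiliar with the "analyze multiple times" pattern from the linked guide, here is a hand-written sketch of such a mapping for ES 2.x, built as a Python dict. It is not the automatic extension announced above; the helper name, the document type `doc`, the field name `title`, and the choice of the built-in `german` and `english` analyzers are assumptions for illustration.

```python
import json


def multi_language_mapping(field, analyzers, doc_type="doc"):
    """Build an ES 2.x index body with one sub-field per language.

    `analyzers` maps a sub-field name (e.g. "de") to an analyzer name
    (e.g. the built-in "german" analyzer).
    """
    sub_fields = {
        lang: {"type": "string", "analyzer": analyzer}
        for lang, analyzer in analyzers.items()
    }
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # The main field keeps the default analyzer; each
                    # sub-field re-analyzes the same text per language.
                    field: {"type": "string", "fields": sub_fields}
                }
            }
        }
    }


mapping = multi_language_mapping("title", {"de": "german", "en": "english"})
print(json.dumps(mapping, indent=2))
```

With a mapping like this, a query would target `title.de` or `title.en` depending on the language of interest.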

@marbleman
Author

This is probably not the right place to discuss "best practices" (which I would be interested in), but based on recommendations found around the web we decided to go for separate indices for each language, such as "myindex_de" and "myindex_en". Therefore we have to detect the language prior to indexing. This way we can search on "myindex_*" to get results in multiple languages, and we avoid all the trouble with mixed languages.
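The index-per-language scheme described above could be sketched like this in Python. The index names "myindex_de", "myindex_en" and the "myindex_*" pattern come from the comment; the set of supported languages and the "other" fallback suffix for unsupported languages are hypothetical additions for illustration.

```python
# Languages that have a dedicated index (assumption for this sketch).
SUPPORTED = {"de", "en"}


def index_for(language, base="myindex", fallback="other"):
    """Return the target index name for a detected language code.

    Unsupported or missing languages go to a hypothetical fallback index.
    """
    suffix = language if language in SUPPORTED else fallback
    return f"{base}_{suffix}"


def search_pattern(base="myindex"):
    """Return the wildcard pattern that covers all language indices."""
    return f"{base}_*"


print(index_for("de"))
print(index_for("fr"))
print(search_pattern())
```

Because the fallback index shares the same prefix, a search on the wildcard pattern would still include documents whose language has no dedicated index.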
