How to use a custom TokensRegex rules annotator with Stanford CoreNLP Server?
The TokensRegex rules color annotator (stanford-corenlp-full-2016-10-31/tokensregex/color.rules.txt) loads when CoreNLP is run from the command line, but fails with the web server: java.lang.IllegalArgumentException: Unknown annotator: color.
Setup

    # custom.properties
    annotators=tokenize,ssplit,pos,lemma,ner,regexner,color
    customAnnotatorClass.color = edu.stanford.nlp.pipeline.TokensRegexAnnotator
    color.rules = tokensregex/color.rules.txt
Command line

    $ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props custom.properties -file ./tokensregex/color.input.txt -outputFormat text
    [main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator color with class edu.stanford.nlp.pipeline.TokensRegexAnnotator
    ...
    [main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator color
    [main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Reading TokensRegex rules from tokensregex/color.rules.txt
    [main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Read 7 rules

    # color.input.txt.output
    Sentence #1 (9 tokens):
    both blue , light blue nice colors.
    [Text=both CharacterOffsetBegin=0 CharacterOffsetEnd=4 PartOfSpeech=CC Lemma=both NamedEntityTag=O]
    [Text=blue CharacterOffsetBegin=5 CharacterOffsetEnd=9 PartOfSpeech=JJ Lemma=blue NamedEntityTag=COLOR NormalizedNamedEntityTag=#0000FF]
    ...
Server

    java -mx2g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -c custom.properties

    wget --post-data 'both blue , light blue nice colors.' 'localhost:9000/?properties={"annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","outputFormat":"json"}' -O -

    HTTP request sent, awaiting response... 500 Internal Server Error
    2016-11-05 14:41:27 ERROR 500: Internal Server Error.

    java.lang.IllegalArgumentException: Unknown annotator: color
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.ensurePrerequisiteAnnotators(StanfordCoreNLP.java:304)
        at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.getProperties(StanfordCoreNLPServer.java:713)
        at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:540)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
        at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
        at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
        at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Solution

Include the custom annotator properties in the request:

    wget --post-data 'both blue , light blue nice colors.' 'localhost:9000/?properties={"color.rules":"tokensregex/color.rules.txt","customAnnotatorClass.color":"edu.stanford.nlp.pipeline.TokensRegexAnnotator","annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","enforceRequirements":"false","outputFormat":"json"}' -O -

Add "enforceRequirements":"false" to the request, and that should stop the error.
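Since the request has to repeat what is already in custom.properties, it may be convenient to generate the JSON from the file rather than typing it by hand. Below is a minimal sketch of that idea. The function name props_to_json is hypothetical (it is not part of CoreNLP), and it assumes plain key=value lines with no quotes, backslashes, or line continuations.

```shell
# Hypothetical helper (not part of CoreNLP): print the key=value pairs of a
# simple .properties file as a one-line JSON object for ?properties=...
# Assumes keys and values contain no quotes, backslashes, or continuations.
props_to_json() {
  awk '
    /^[ \t]*#/ { next }                      # skip comment lines
    /^[ \t]*$/ { next }                      # skip blank lines
    {
      eq = index($0, "=")                    # split on the first "="
      if (eq == 0) next                      # ignore lines without "="
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)       # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      printf "%s\"%s\":\"%s\"", (n++ ? "," : "{"), key, val
    }
    END { print (n ? "}" : "{}") }
  ' "$1"
}

# Usage (needs the server from the question running on localhost:9000):
#   wget --post-data 'some text' \
#     "localhost:9000/?properties=$(props_to_json custom.properties)" -O -
```

This only carries over what is in the file. Request-only keys from the solution above, such as "enforceRequirements" and "outputFormat", would still have to be added to the file or to the generated JSON.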