openNLP vs Spacy for Contentserver is the second part of a comparison between these two AI NLP packages in the Content Server environment.
openNLP vs Spacy for Contentserver Part 1
Go to the start of the openNLP series of articles: Starting Page
openNLP vs Spacy for Contentserver Part 2
Feature | openNLP | Spacy |
---|---|---|
Named Entities (NER) detection (ISO Language Codes) | fr, de, en, it, nl, da, es, pt, se. Other languages require training. | ca, zh, hr, da, nl, en, fi, fr, de, el, it, ja, ko, lt, mk, nb, pl, pt, ro, ru, sl, es, sv, uk, af, sq, am, grc, ar, eu, bn, bg, cs, et, fo, gu, he, hi, hu, is, id, ga, hn, ki, la, lv, lij, dsb, lg, ms, ml, mr, ne, nn, fa, sa, sr, tn, si, sk, tl, ta, tt, te, hh, ti, tr, hsb, ur, vi, yo. Other languages require training. |
Word Vectors | experimental | included in the larger models |
Visualizers | none | Part of speech, named entities, and span visualizers; usable in Jupyter notebooks and web based |
Connect to the Content Server | 1. Inside the Content Server in the JVM 2. From a Java client using REST | 1. With a Java REST client. This client invokes the Spacy processor for each entry to process. 2. Using jspybridge (JavaScript/Python bridge) and connecting the JavaScript part to the Content Server via REST. Remark: Using REST directly from Python won’t work due to the architecture of Content Server REST |
File Type Opener | Apache Tika | Apache Tika |
Application Architecture | Separate Client/can run in the Content Server | Separate Client |
LLM (Large Language Model) Interface | None as an LLM interface; standard NLP tasks such as Named Entity Recognition and Text Classification have to be implemented locally based on openNLP | Hugging Face, OpenAI API, including GPT-4 and various GPT-3 models (usage examples for standard NLP tasks such as Named Entity Recognition and Text Classification) |
Programming Language | Java | Python |
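
To make the first Spacy connection option more concrete, here is a minimal sketch of what the Python side could look like when a Java REST client hands over one entry at a time. It is not taken from the series: the line-based JSON exchange, the field names `id` and `text`, the function names, and the model name `en_core_web_sm` are all assumptions chosen for illustration. The Java client would fetch the content via REST and pass the text to this processor.

```python
import json
import sys
from typing import Callable, Iterable, Iterator

# An extractor maps raw text to a list of entity dictionaries.
Extractor = Callable[[str], list]


def spacy_extractor(model_name: str = "en_core_web_sm") -> Extractor:
    """Build an extractor backed by a Spacy pipeline (assumed model name)."""
    import spacy  # imported lazily; needs the spacy package and the model

    nlp = spacy.load(model_name)

    def extract(text: str) -> list:
        doc = nlp(text)
        return [
            {"text": ent.text, "label": ent.label_,
             "start": ent.start_char, "end": ent.end_char}
            for ent in doc.ents
        ]

    return extract


def process_entries(lines: Iterable[str], extract: Extractor) -> Iterator[str]:
    """Turn JSON request lines into JSON reply lines, one per entry."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between requests
        entry = json.loads(line)
        reply = {"id": entry["id"], "entities": extract(entry["text"])}
        yield json.dumps(reply)


def main() -> None:
    """Hypothetical entry point: read requests on stdin, answer on stdout."""
    extractor = spacy_extractor()
    for reply in process_entries(sys.stdin, extractor):
        print(reply, flush=True)
```

Keeping the extractor as a plain callable means the request handling can be exercised without a Spacy model installed, and the model is loaded once rather than once per entry. A script entry point would call `main()`; the Java client then writes one JSON request per line and reads one JSON reply per line.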