ContentServer Example Project with NLP
NLP application: AUTOCATEGORIZER – AI-based categories
This is a Content Server example project with an autocategorizer. It follows these guidelines:
- Definition of categories. Which attributes and categories should become components?
- Avoiding bias. Do the new categories imply bias, and if so, what impact does it have?
- Evaluation criteria. What counts as a good result for a test run?
- Choosing the categorizer: neural network or simple algorithm? For simple algorithms, several should be selected and tested.
- Is a pre-trained open-source categorizer available, or does the categorizer still need to be trained?
- Selection of the training data set and the test data set. Although “the more the better” applies, about 10% of the available data is enough for training and testing to get started. Keep the question of bias in mind when selecting the data.
- Data transfer.
- If needed, pre-process the data with natural language processing (NLP) tools.
- Training and testing each selected algorithm. Assessing accuracy through framework evaluation tools.
- Selection of the algorithm or neural network with the most favorable ratings from the test runs.
- Production run: A new business workspace or document is transferred to Python. The categorizer assigns categories to it and writes the new categories/attributes back to Content Server for that document or business workspace.
- Logs are generated as needed.
- A trained categorizer can be saved, retrained and used over and over again.
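
The training, testing, selection and saving steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the project's actual code: the toy documents, the labels `invoice` and `contract`, and the names `NaiveBayesCategorizer`, `MajorityCategorizer`, `split`, `accuracy`, `save` and `load` are all hypothetical, and only the standard library is used so the sketch stays self-contained. A real project would more likely rely on an NLP or machine learning framework and its evaluation tools. The transfer of data to and from Content Server is left out.

```python
import math
import pickle
import random
import re
from collections import Counter

# Hypothetical toy data; real input would come from Content Server.
DOCS = [
    "Invoice for consulting services, payment due in 30 days",
    "Invoice number 4711, total amount and payment terms",
    "Reminder: outstanding invoice amount, please arrange payment",
    "Payment received for invoice, amount credited",
    "Service contract between supplier and customer, term and termination",
    "This contract governs liability and termination of the agreement",
    "Supply agreement: contract parties, liability, signatures",
    "Amendment to the contract, agreement on extended term",
]
LABELS = ["invoice"] * 4 + ["contract"] * 4


def tokenize(text):
    """Minimal pre-processing: lowercase and keep alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.label_counts = Counter(labels)
        self.word_counts = {}
        for doc, label in zip(docs, labels):
            self.word_counts.setdefault(label, Counter()).update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokenize(doc):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


class MajorityCategorizer:
    """Baseline: always predicts the most frequent training label."""

    def fit(self, docs, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, doc):
        return self.label


def split(docs, labels, test_fraction=0.25, seed=0):
    """Shuffle reproducibly and return (train_pairs, test_pairs)."""
    pairs = list(zip(docs, labels))
    random.Random(seed).shuffle(pairs)
    cut = max(1, int(len(pairs) * test_fraction))
    return pairs[cut:], pairs[:cut]


def accuracy(model, pairs):
    """Share of test documents whose predicted label matches the true one."""
    return sum(model.predict(doc) == label for doc, label in pairs) / len(pairs)


def save(model):
    """Serialize only the learned state (plain containers) to bytes."""
    return pickle.dumps(vars(model))


def load(cls, blob):
    """Rebuild a categorizer of class `cls` from bytes produced by save()."""
    model = cls()
    vars(model).update(pickle.loads(blob))
    return model


train, test = split(DOCS, LABELS)
train_docs, train_labels = zip(*train)

# Train and test every candidate; on equal scores the first one listed wins.
candidates = {"naive_bayes": NaiveBayesCategorizer(), "majority": MajorityCategorizer()}
scores = {}
for name, model in candidates.items():
    model.fit(train_docs, train_labels)
    scores[name] = accuracy(model, test)

best_name = max(scores, key=scores.get)
blob = save(candidates[best_name])  # could be written to a file instead
restored = load(type(candidates[best_name]), blob)

print(scores, best_name)
print(restored.predict("payment of the invoice amount"))
```

With only eight documents the accuracy values carry no weight; the point is the workflow of comparing several simple candidates on held-out data and keeping the best one. Saving only the learned state, rather than the whole object, is one possible design choice that keeps the stored bytes independent of where the class is defined. Calling `fit` again on a restored categorizer retrains it from scratch in this sketch.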