A Data Record represents one piece of data to be labelled, such as a warranty claim, a product review or a customer complaint. A data record is composed of a set of data fields defined in the Data Record Template.
Example
| Id | Author | Message | Creation Date |
|---|---|---|---|
| 126243 | John G. | Hi, I’m facing a problem in the login page as I don’t remember my password. Can you help me, please? | 2018-01-10 12:32:01 |
All the data records in a task are considered unique and independent of each other. During labelling, PrediCX therefore does not take into consideration any relationships between data records, such as sequences of messages in a chatbot conversation or a set of warranty claims for the same car.
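As a rough illustration of the structure described above, the example record could be modelled as a set of named fields checked against a template. This is only a sketch: `DataRecord` and `TEMPLATE_FIELDS` are hypothetical names chosen for the illustration, not part of PrediCX, and the field names are simply taken from the example table.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical template: field names taken from the example table.
TEMPLATE_FIELDS = ("Id", "Author", "Message", "Creation Date")


@dataclass(frozen=True)
class DataRecord:
    """One independent item to be labelled; holds no links to other records."""
    fields: dict

    def __post_init__(self):
        # Reject records whose field names do not match the template exactly.
        if set(self.fields) != set(TEMPLATE_FIELDS):
            raise ValueError(f"fields must be exactly {TEMPLATE_FIELDS}")


record = DataRecord({
    "Id": 126243,
    "Author": "John G.",
    "Message": "Hi, I’m facing a problem in the login page as I don’t "
               "remember my password. Can you help me, please?",
    "Creation Date": datetime(2018, 1, 10, 12, 32, 1),
})
```

Note that the record carries no reference to any other record, which mirrors the independence assumption: anything that links records together, such as a conversation id, would have to be an ordinary field in the template.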
Import/Export data records
Data record import/export operations can be performed using the webapp (recommended for light use only) or the REST API (recommended for heavy use and integration purposes). In order to import data records, all the values in the data record fields must be encoded using UTF-8.
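Data exported from older systems is often not UTF-8, so it can help to check and convert values before importing them. The sketch below uses only the Python standard library; `ensure_utf8` is a hypothetical helper, and the fallback encoding `cp1252` is an assumption about the source system that should be adjusted to the actual origin of the data.

```python
def ensure_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return raw unchanged if it is valid UTF-8, otherwise transcode it.

    source_encoding is an assumption about where the data came from.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Not valid UTF-8: reinterpret using the assumed legacy encoding.
        return raw.decode(source_encoding).encode("utf-8")


# The curly apostrophe is a single byte in cp1252, which is invalid UTF-8.
legacy = "Hi, I’m facing a problem".encode("cp1252")
clean = ensure_utf8(legacy)
print(clean.decode("utf-8"))
```

A check of this kind only detects byte sequences that are invalid UTF-8; it cannot tell whether text that happens to decode cleanly was really written in UTF-8.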
Labelling process
The data record labelling process allows users to label data records while minimizing the manual labelling effort and maximizing the labelling performance. The process is composed of a manual phase and an automatic phase.
During the manual phase, the system provides the user with the best data record to label next, in order to achieve the best labelling performance. When a user labels one data record, the next one is provided; different users never get the same data record assigned to them. During this process the user assigns one or more labels to each data record, depending on the business needs. The user can also change a data record's labels directly at any time.
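The hand-out behaviour described above (best record first, never the same record for two users) can be sketched with a priority queue. How PrediCX actually ranks records is not stated here, so the priority scores below are purely an assumption, as are the `LabellingQueue` class, the user names and the record ids other than 126243.

```python
import heapq


class LabellingQueue:
    """Hands out records by priority; each record goes to one user only."""

    def __init__(self, priorities: dict):
        # heapq is a min-heap, so negate the score to pop the highest first.
        self._heap = [(-score, rec_id) for rec_id, score in priorities.items()]
        heapq.heapify(self._heap)
        self.assigned = {}  # record id -> user it was handed to

    def next_for(self, user: str):
        """Return the best unassigned record id, or None when exhausted."""
        if not self._heap:
            return None
        _, rec_id = heapq.heappop(self._heap)
        self.assigned[rec_id] = user
        return rec_id


# Hypothetical scores: higher means more useful to label next.
queue = LabellingQueue({126243: 0.9, 126244: 0.4, 126245: 0.7})
first = queue.next_for("alice")
second = queue.next_for("bob")
```

Because a record is removed from the queue as soon as it is handed out, two users can never receive the same one. A multi-user service would additionally need to guard `next_for` with a lock or a transactional store.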
The automatic phase labels all remaining data records, both those previously labelled automatically and those still unlabelled, based on the knowledge gained from the manually labelled samples. Every time the automatic labelling process is executed, all of the automatically labelled data records are re-labelled. This guarantees that the data records are labelled using the most up-to-date and complete model.
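The re-labelling rule can be illustrated with a short sketch: manual labels are left untouched, while every other record is labelled again on each run, whether or not it already had automatic labels. The record layout, the `run_automatic_phase` function and the keyword-based `toy_model` are all hypothetical stand-ins; the real model is not described in this section.

```python
def run_automatic_phase(records: dict, predict) -> dict:
    """Relabel every record that has no manual label.

    records maps record id -> {"text": str, "labels": set, "manual": bool}.
    predict is any callable mapping text to a set of labels (the current model).
    """
    for rec in records.values():
        if not rec["manual"]:
            # Previously auto-labelled records are overwritten, not skipped.
            rec["labels"] = predict(rec["text"])
    return records


def toy_model(text: str) -> set:
    # Stand-in for a trained model: a single keyword rule.
    return {"login"} if "password" in text else {"other"}


records = {
    1: {"text": "cannot log in", "labels": {"login"}, "manual": True},
    2: {"text": "forgot my password", "labels": {"billing"}, "manual": False},
    3: {"text": "late delivery", "labels": set(), "manual": False},
}
run_automatic_phase(records, toy_model)
```

In this run, record 1 keeps its manual label, record 2 has its stale automatic label replaced, and record 3 receives its first label.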