I am trying to train a spaCy textcat_multilabel model. I thought I had everything set up correctly, but I keep getting a config validation error.
This is the label section of my config:
[components.textcat_multilabel]
factory = "textcat_multilabel"
scorer = {"@scorers": "spacy.textcat_multilabel_scorer.v2"}
threshold = 0.5
labels = ["Operational (Frontline)", "Certified/Technical", "Administrative (General)", "Corporate (HR/Finance/Procurement)", "Digital (Applications)", "Digital (ICT)", "Communication and Engagement", "Environmental/Scientific", "Leadership/Management/Coaching/Mentoring", "Policy/Legislation/Regulatory", "Cultural Capability", "Project Management", "Workplace Health and Safety", "Analytical (Data/GIS/Modelling)", "Other"]
This command
python -m spacy train .\config.cfg --output ..\output --paths.tain .\train.spacy --paths.dev .\dev.spacy
throws this error
=========================== Initializing pipeline ===========================
✘ Config validation error
textcat_multilabel -> labels extra fields not permitted
{'nlp': <spacy.lang.en.English object at 0x00000210C3758B10>, 'name': 'textcat_multilabel', 'labels': ['Operational (Frontline)', 'Certified/Technical', 'Administrative (General)','Corporate (HR/Finance/Procurement)', 'Digital (Applications)', 'Digital (ICT)', 'Communication and Engagement', 'Environmental/Scientific', 'Leadership/Management/Coaching/Mentoring', 'Policy/Legislation/Regulatory', 'Cultural Capability', 'Project Management', 'Workplace Health and Safety', 'Analytical (Data/GIS/Modelling)', 'Other'], 'model': {'@architectures': 'spacy.TextCatBOW.v2', 'exclusive_classes': False, 'ngram_size': 1, 'no_output_layer': False, 'nO': None}, 'scorer': {'@scorers': 'spacy.textcat_multilabel_scorer.v2'}, 'threshold': 0.5, '@factories': 'textcat_multilabel'}
Have I got this all wrong? Is there another way to specify labels in the config?
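For reference, the error suggests that `labels` is not an accepted setting in the component block itself. One alternative I am considering, as a sketch only, is to move the labels into the `[initialize]` section. This assumes that the component's `initialize` method accepts a `labels` argument and that my spaCy version reads it from this block; I have not confirmed either, and the list is shortened here for illustration:

```ini
# Component block without the labels setting
[components.textcat_multilabel]
factory = "textcat_multilabel"
scorer = {"@scorers": "spacy.textcat_multilabel_scorer.v2"}
threshold = 0.5

# Assumption: labels passed at initialization time instead
[initialize.components.textcat_multilabel]
labels = ["Operational (Frontline)", "Certified/Technical", "Other"]
```

If no labels are given here, my understanding is that spaCy would try to infer them from the training data, but I would like to confirm that as well.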
Thanks.