ArabicNLP 2026 Shared Task

Task B (Task 2): Textual Harmful Prompt Detection

Systems receive Arabic prompts directed at LLMs and detect unsafe prompts across binary and harm-category settings.

Definition

Task B (Task 2) focuses on textual harmful prompt detection for Arabic LLM safety evaluation. Given an Arabic prompt directed at an LLM, systems determine whether the prompt is safe or unsafe and, when unsafe, identify the harm category.

The task targets Arabic safety risks in prompts, including direct harmful requests, indirect or disguised unsafe requests, dialectal phrasing, and prompts that require careful interpretation of intent.

Subtasks

  • Subtask B1: Given an Arabic prompt, classify it as Safe or Unsafe.
  • Subtask B2: Given an unsafe Arabic prompt, classify it into the relevant harm domain.

Harm domains include self-harm, harm to others, harassment, adult content, bullying, hate speech, and fraud or illegal activities.
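One way to represent the two label levels in code is sketched below. It is only an illustration: the names `HarmDomain` and `Prediction` and the label strings are assumptions made here, and the released data remains the authoritative source for the actual label names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HarmDomain(Enum):
    # Label strings are hypothetical; the released data defines the real ones.
    SELF_HARM = "self-harm"
    HARM_TO_OTHERS = "harm to others"
    HARASSMENT = "harassment"
    ADULT_CONTENT = "adult content"
    BULLYING = "bullying"
    HATE_SPEECH = "hate speech"
    FRAUD_OR_ILLEGAL = "fraud or illegal activities"


@dataclass(frozen=True)
class Prediction:
    """Hypothetical container for one prompt's predictions."""
    is_unsafe: bool                       # Subtask B1 decision
    domain: Optional[HarmDomain] = None   # Subtask B2 decision, if any

    def __post_init__(self) -> None:
        # Subtask B2 is defined only for unsafe prompts, so a prompt
        # predicted Safe must not carry a harm domain.
        if not self.is_unsafe and self.domain is not None:
            raise ValueError("a Safe prediction cannot have a harm domain")


safe = Prediction(is_unsafe=False)
unsafe = Prediction(is_unsafe=True, domain=HarmDomain.FRAUD_OR_ILLEGAL)
```

A system that takes part in Subtask B1 only can leave `domain` unset for unsafe prompts.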

Leaderboard

The CodaBench competitions for Task B (Task 2) are open for participation.

Datasets

The dataset contains Arabic prompts annotated for safety evaluation. The released data will be the authoritative source for final split sizes and labels.

Evaluation

The official metric is macro-F1. Accuracy, macro-precision, macro-recall, and weighted F1 may also be reported for analysis.

Subtask B1 is evaluated as binary classification. Subtask B2 is evaluated as harm-category classification.
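For reference, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below is a minimal pure-Python version, not the official scorer. The function name `macro_f1` is an assumption, and so is the choice to average over every label that appears in either the gold or the predicted labels. The released scorer may handle the label set differently.

```python
from collections import Counter
from typing import Hashable, Sequence


def macro_f1(gold: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for class g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true class but was missed

    labels = set(gold) | set(pred)
    if not labels:
        return 0.0

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is non-zero
        # for any label that occurs in gold or pred.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom)
    return sum(scores) / len(scores)


# Toy Subtask B1 example with made-up labels.
gold = ["Safe", "Safe", "Unsafe", "Unsafe"]
pred = ["Safe", "Unsafe", "Unsafe", "Unsafe"]
score = macro_f1(gold, pred)
```

In this toy example the Safe class has F1 = 2/3 and the Unsafe class has F1 = 0.8, so the macro-F1 is about 0.733 while accuracy is 0.75. The same function applies to Subtask B2 by passing harm-domain labels instead of Safe/Unsafe.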

Submission

Scorers, Format Checkers, and Baselines

Scorer scripts, format checkers, baseline systems, and starter-kit material will be released in the ArGuard repository.

Guidelines

The submission process will include a system development phase using the development set and a final evaluation phase using the blind test set.

  • Each team should maintain a single submission account.
  • The most recent valid submission before the deadline will be considered the final submission.
  • Output filenames and archive formats will be specified with the starter kit.
  • Teams should include their team name and a short method description with each submission.

Submission Site

The official submission site will be linked once the competition opens.