Configuring Detection Thresholds

Detection thresholds set the confidence level at which Discuse flags content, balancing false positives against false negatives. In Discuse these thresholds are project settings, configured in the dashboard (or through the settings API); an API request only specifies which checks to run and carries no numeric thresholds. This guide explains how the trade-off works and how to choose suitable values for your platform.

How Discuse thresholds work

Each check returns a confidence score from 0.0 to 1.0 and a hit flag. The hit fires when the score reaches or exceeds the threshold you have configured for that category in your project. The configurable thresholds are:

Threshold (project setting)	Applies to
`threshold_sentiment`	Overall cutoff for negative sentiment
`threshold_toxicity`	Toxic text
`threshold_profanity`	Profanity
`threshold_threat`	Threats
`threshold_insult`	Insults
`threshold_spam`	Spam classifier confidence
`threshold_images`	Explicit images (overall)
`threshold_images_porn`	Pornographic images
`threshold_images_sexual`	Sexually suggestive images

These are the names exposed by the project settings API; the per-request settings object contains only the check_* on/off toggles plus expected_language. You change thresholds in the dashboard, not per request.
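The hit rule described above is a plain comparison. The following is a minimal sketch of it, assuming a hypothetical `scores` object keyed by category name (the actual response shape is not shown in this guide) and a thresholds object keyed by the project setting names from the table:

```javascript
// Hypothetical shapes: `scores` maps a category to a 0.0-1.0 confidence,
// `thresholds` uses the project setting names from the table above.
function isHit(score, threshold) {
  // A hit fires when the score reaches or exceeds the threshold.
  return score >= threshold;
}

function flaggedCategories(scores, thresholds) {
  return Object.keys(scores).filter((category) => {
    const threshold = thresholds[`threshold_${category}`];
    // Categories without a configured threshold are never flagged here.
    return typeof threshold === 'number' && isHit(scores[category], threshold);
  });
}

const flagged = flaggedCategories(
  { toxicity: 0.72, spam: 0.4, threat: 0.5 },
  { threshold_toxicity: 0.7, threshold_spam: 0.75, threshold_threat: 0.5 }
);
console.log(flagged);
```

Note that a score exactly equal to the threshold counts as a hit, which is why `threat` at 0.5 is flagged in this example.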

The trade-off

Lower Threshold = More Strict
├── More content flagged
├── Higher false positive rate
├── Fewer harmful posts slip through
└── More user friction

Higher Threshold = More Permissive
├── Less content flagged
├── Lower false positive rate
├── More harmful posts may slip through
└── Better user experience

Visualizing the trade-off

                False Positives ─────────────────────────►
                     Few                              Many
               ┌─────────────────────────────────────────┐
False      Few │   ◄─── Ideal Zone                      │
Negatives      │        (tuned cutoff,                  │
               │         low false rates)               │
      │        │                                        │
      │        │              Your platform's           │
      │        │              optimal point →  ●        │
      ▼        │                                        │
          Many │                      Worst of both  ──►│
               └─────────────────────────────────────────┘
                       Threshold: 0.9   0.7   0.5   0.3
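To make the trade-off concrete, the sketch below sweeps a few thresholds over a small, made-up labeled sample and counts both kinds of error at each one. The sample data and the `countErrors` helper are illustrative assumptions, not Discuse output.

```javascript
// Illustrative, made-up sample: each item has a classifier score and a
// human label saying whether the content was actually harmful.
const sample = [
  { score: 0.95, harmful: true },
  { score: 0.80, harmful: true },
  { score: 0.65, harmful: false },
  { score: 0.55, harmful: true },
  { score: 0.40, harmful: false },
  { score: 0.20, harmful: false }
];

// Count false positives and false negatives for one threshold.
function countErrors(items, threshold) {
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const { score, harmful } of items) {
    const flagged = score >= threshold;
    if (flagged && !harmful) falsePositives += 1;
    if (!flagged && harmful) falseNegatives += 1;
  }
  return { threshold, falsePositives, falseNegatives };
}

const sweep = [0.3, 0.5, 0.7, 0.9].map((t) => countErrors(sample, t));
console.table(sweep);
```

On this sample, a threshold of 0.3 produces two false positives and no false negatives, while 0.9 produces no false positives and two false negatives. Running the same sweep on your own reviewed content shows where the balance sits for your platform.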

Which thresholds suit which platform?

The values below are starting points, expressed with Discuse threshold names (threshold_toxicity, threshold_images_porn, and so on). Treat them as a baseline to tune against your own false-positive data, not as definitive numbers. A lower threshold flags content more strictly.

Social media platforms

General-purpose social platforms need balanced moderation:

const SOCIAL_MEDIA_THRESHOLDS = {
  threshold_toxicity: 0.7,
  threshold_profanity: 0.6,
  threshold_threat: 0.5,        // Lower (stricter) for threats
  threshold_insult: 0.7,
  threshold_spam: 0.75,
  threshold_images_porn: 0.6,
  threshold_images_sexual: 0.8  // More permissive for suggestive content
};

Professional / business platforms

Business spaces usually need stricter moderation:

const PROFESSIONAL_THRESHOLDS = {
  threshold_toxicity: 0.5,
  threshold_profanity: 0.4,
  threshold_threat: 0.3,
  threshold_insult: 0.5,
  threshold_spam: 0.6,
  threshold_images_porn: 0.3,
  threshold_images_sexual: 0.5
};

Gaming communities

Gaming platforms may tolerate more banter and trash talk while staying strict on genuine threats:

const GAMING_THRESHOLDS = {
  threshold_toxicity: 0.8,
  threshold_profanity: 0.85,    // Banter allowed
  threshold_threat: 0.5,        // Still strict on real threats
  threshold_insult: 0.8,
  threshold_spam: 0.8,
  threshold_images_porn: 0.6,
  threshold_images_sexual: 0.9
};

Children's platforms

Platforms for minors need the strictest settings:

const CHILDRENS_THRESHOLDS = {
  threshold_toxicity: 0.3,
  threshold_profanity: 0.2,
  threshold_threat: 0.2,
  threshold_insult: 0.3,
  threshold_spam: 0.5,
  threshold_images_porn: 0.1,   // Maximum strictness
  threshold_images_sexual: 0.2
};
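Because scores range from 0.0 to 1.0, a preset value outside that range would either never fire or fire on everything. A small sanity check before copying a preset into the dashboard can catch typos. This is a sketch only; `validateThresholds` is a hypothetical app-side helper, not part of Discuse.

```javascript
// Returns a list of problems; an empty list means the preset looks sane.
function validateThresholds(thresholds) {
  const problems = [];
  for (const [name, value] of Object.entries(thresholds)) {
    if (!name.startsWith('threshold_')) {
      problems.push(`${name}: unexpected key`);
    } else if (typeof value !== 'number' || Number.isNaN(value)) {
      problems.push(`${name}: not a number`);
    } else if (value < 0 || value > 1) {
      problems.push(`${name}: ${value} is outside 0.0-1.0`);
    }
  }
  return problems;
}

console.log(validateThresholds({ threshold_toxicity: 0.3, threshold_spam: 1.5 }));
```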

Can I change thresholds dynamically?

Discuse stores one set of thresholds per project, so per-user or per-context variation has to be handled in your application. The patterns below compute the effective threshold on your side; you can then either route the request to different projects (each with its own configured thresholds) or apply the comparison yourself to the scores Discuse returns.

User trust levels

Adjust effective thresholds based on user reputation:

function getThresholds(user) {
  // PLATFORM_THRESHOLDS is whichever flat preset your platform uses,
  // for example SOCIAL_MEDIA_THRESHOLDS.
  const baseThresholds = PLATFORM_THRESHOLDS;

  const trustMultipliers = {
    new_user: 0.8,       // Stricter (lower effective threshold)
    basic_user: 1.0,     // Standard
    verified_user: 1.15, // Slightly more permissive
    trusted_user: 1.3,   // More permissive
    moderator: 1.5       // Most permissive
  };

  // Unknown trust levels fall back to the standard multiplier.
  const multiplier = trustMultipliers[user.trustLevel] ?? 1.0;

  // Scale each numeric threshold, capped at 0.95 so no check is ever
  // effectively disabled; non-numeric entries pass through unchanged.
  return Object.fromEntries(
    Object.entries(baseThresholds).map(([key, value]) => [
      key,
      typeof value === 'number' ? Math.min(value * multiplier, 0.95) : value
    ])
  );
}

Context-based thresholds

Different content types may need different thresholds:

// Keys match the Discuse threshold names used in the presets above.
const CONTEXT_THRESHOLDS = {
  // Public posts visible to everyone
  public_post: {
    threshold_toxicity: 0.6,
    threshold_profanity: 0.5
  },

  // Direct messages between users
  direct_message: {
    threshold_toxicity: 0.7,   // Slightly more permissive
    threshold_profanity: 0.6
  },

  // Comments on public content
  comment: {
    threshold_toxicity: 0.55,  // Stricter than posts
    threshold_profanity: 0.5
  },

  // Profile information
  profile: {
    threshold_toxicity: 0.5,   // Strict for public-facing content
    threshold_profanity: 0.4
  }
};

function getContextThresholds(contentType) {
  return CONTEXT_THRESHOLDS[contentType] || CONTEXT_THRESHOLDS.public_post;
}

Time-based adjustments

Adjust thresholds during high-risk periods:

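// Multiply every threshold by the same factor (below 1.0 = stricter).
function adjustThresholds(thresholds, multiplier) {
  return Object.fromEntries(
    Object.entries(thresholds).map(([key, value]) => [key, value * multiplier])
  );
}

function getTimeAdjustedThresholds(baseThresholds, now = new Date()) {
  // Uses the server's local time zone.
  const hour = now.getHours();
  const dayOfWeek = now.getDay();

  // Stricter during off-hours (22:00-05:59) when fewer moderators are available
  const isOffHours = hour < 6 || hour >= 22;
  const isWeekend = dayOfWeek === 0 || dayOfWeek === 6;

  let multiplier = 1.0;

  if (isOffHours) multiplier *= 0.85;
  if (isWeekend) multiplier *= 0.9;

  return adjustThresholds(baseThresholds, multiplier);
}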
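When trust-level and time-based adjustments are both active, their multipliers stack, so it helps to clamp the result at both ends. The sketch below is one possible way to combine them; `combineAdjustments` and its `min`/`max` bounds are illustrative assumptions, and the multipliers are taken from the `new_user` and off-hours values in the examples above.

```javascript
// Scale every threshold by a combined multiplier and clamp to [min, max] so
// stacked adjustments can neither disable a check nor flag everything.
function combineAdjustments(base, multipliers, { min = 0.05, max = 0.95 } = {}) {
  const combined = multipliers.reduce((product, m) => product * m, 1.0);
  return Object.fromEntries(
    Object.entries(base).map(([key, value]) => {
      const clamped = Math.min(max, Math.max(min, value * combined));
      // Round to three decimals to hide floating-point noise.
      return [key, Math.round(clamped * 1000) / 1000];
    })
  );
}

const base = { threshold_toxicity: 0.7, threshold_threat: 0.5 };
const effective = combineAdjustments(base, [0.8, 0.85]); // new_user, off-hours
console.log(effective);
```

Here the combined multiplier is 0.8 × 0.85 = 0.68, so a base toxicity threshold of 0.7 becomes roughly 0.476 for a new user posting during off-hours.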

Implementing threshold configuration

Centralized configuration

// config/moderation.js — mirrors your Discuse project thresholds so app-side
// routing stays in sync with the values configured in the dashboard.
export const ModerationConfig = {
  thresholds: {
    threshold_toxicity:      parseFloat(process.env.THRESHOLD_TOXICITY || '0.7'),
    threshold_profanity:     parseFloat(process.env.THRESHOLD_PROFANITY || '0.6'),
    threshold_threat:        parseFloat(process.env.THRESHOLD_THREAT || '0.5'),
    threshold_insult:        parseFloat(process.env.THRESHOLD_INSULT || '0.7'),
    threshold_spam:          parseFloat(process.env.THRESHOLD_SPAM || '0.75'),
    threshold_images_porn:   parseFloat(process.env.THRESHOLD_IMAGES_PORN || '0.6'),
    threshold_images_sexual: parseFloat(process.env.THRESHOLD_IMAGES_SEXUAL || '0.8')
  },

  actions: {
    high_confidence: 'auto_block',     // score > 0.95
    medium_confidence: 'human_review', // 0.7 <= score <= 0.95
    low_confidence: 'allow_with_flag'  // category threshold <= score < 0.7
  }
};
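The three action tiers can be expressed as a small decision function. This is a sketch under assumptions: the tier boundaries come from the comments in the config above, while the `'allow'` outcome for scores below the category threshold and the `decideAction` name are illustrative additions.

```javascript
// Tier boundaries taken from the comments in ModerationConfig.actions.
const AUTO_BLOCK_ABOVE = 0.95;
const REVIEW_FROM = 0.7;

function decideAction(score, threshold) {
  if (score < threshold) return 'allow';             // no hit (assumed outcome)
  if (score > AUTO_BLOCK_ABOVE) return 'auto_block'; // high confidence
  if (score >= REVIEW_FROM) return 'human_review';   // medium confidence
  return 'allow_with_flag';                          // low confidence
}

console.log(decideAction(0.65, 0.6)); // hit, but below the review tier
console.log(decideAction(0.72, 0.75)); // below the category threshold
```

When a category threshold is above 0.7 (for example `threshold_spam` at 0.75), the low-confidence tier is empty for that category: every hit goes straight to human review or auto-block.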

Updating thresholds at runtime

Allow thresholds to be adjusted without redeploying:

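class ModerationService {
  constructor() {
    // defaultThresholds: your baseline, e.g. ModerationConfig.thresholds
    this.thresholds = defaultThresholds;
    // Fire-and-forget: defaults stay in effect until the remote config loads.
    this.loadRemoteConfig();
  }

  async loadRemoteConfig() {
    try {
      const response = await fetch('/api/admin/moderation-config');
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      const data = await response.json();
      if (!data.thresholds) throw new Error('Response has no thresholds');
      this.thresholds = data.thresholds;
      console.log('Loaded remote moderation config');
    } catch (error) {
      console.warn('Using default thresholds:', error);
    }
  }

  async checkContent(content, context) {
    // callModerationAPI, getThresholdsForContext and applyThresholds are
    // app-side helpers that you supply.
    const result = await callModerationAPI(content);
    const thresholds = this.getThresholdsForContext(context);

    return this.applyThresholds(result, thresholds);
  }
}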
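Remote configuration can arrive incomplete or malformed, so it may be safer to merge it over the defaults than to replace them wholesale. The following sketch shows one way to do that; `mergeThresholds` is a hypothetical helper and the validation rules are an assumption based on the 0.0 to 1.0 score range.

```javascript
// Merge remotely loaded thresholds over defaults, keeping the default for
// any key that is missing, unknown, or not a number within 0.0-1.0.
function mergeThresholds(defaults, remote) {
  const merged = { ...defaults };
  if (remote === null || typeof remote !== 'object') return merged;
  for (const key of Object.keys(defaults)) {
    const value = remote[key];
    if (typeof value === 'number' && value >= 0 && value <= 1) {
      merged[key] = value;
    }
  }
  return merged;
}

const defaults = { threshold_toxicity: 0.7, threshold_spam: 0.75 };
const remote = { threshold_toxicity: 0.65, threshold_spam: 'high', extra: 0.1 };
console.log(mergeThresholds(defaults, remote));
```

In this example only `threshold_toxicity` is overridden; the invalid `threshold_spam` value and the unknown `extra` key are ignored.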

Measuring threshold effectiveness

Key metrics

const MODERATION_METRICS = {
  // Accuracy
  precision: 'True positives / (True positives + False positives)',
  recall: 'True positives / (True positives + False negatives)',
  f1_score: 'Harmonic mean of precision and recall',

  // User impact
  block_rate: 'Content blocked / Total content',
  appeal_rate: 'Appeals filed / Content blocked',
  appeal_success: 'Appeals won / Appeals filed',

  // Operational
  review_queue_size: 'Items waiting for human review',
  review_time: 'Average time to human decision'
};
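The accuracy metrics above follow directly from three counts. A minimal sketch, assuming you already have true-positive, false-positive, and false-negative counts from reviewed decisions (`computeMetrics` is an illustrative name):

```javascript
// Precision, recall and F1 from confusion counts; null when undefined
// (for example, precision when nothing was flagged).
function computeMetrics({ truePositives, falsePositives, falseNegatives }) {
  const flagged = truePositives + falsePositives;
  const harmful = truePositives + falseNegatives;
  const precision = flagged === 0 ? null : truePositives / flagged;
  const recall = harmful === 0 ? null : truePositives / harmful;
  const f1 =
    precision === null || recall === null || precision + recall === 0
      ? null
      : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

console.log(computeMetrics({ truePositives: 80, falsePositives: 20, falseNegatives: 20 }));
```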

A/B testing thresholds

Test threshold changes on a slice of traffic:

async function moderateWithExperiment(content, userId) {
  const experiment = getExperiment(userId, 'threshold_test');

  const thresholds = experiment === 'control'
    ? CURRENT_THRESHOLDS
    : EXPERIMENTAL_THRESHOLDS;

  const result = await checkContent(content);
  const decision = applyThresholds(result, thresholds);

  // Log for analysis
  await logExperiment({
    experiment: 'threshold_test',
    variant: experiment,
    content_id: content.id,
    scores: result,
    decision: decision,
    timestamp: Date.now()
  });

  return decision;
}
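The `getExperiment` call above is not defined in this guide. One common approach is deterministic bucketing, so that a given user always lands in the same variant. The sketch below assumes that approach, using an FNV-1a style hash over the string's UTF-16 code units; the `treatmentShare` parameter and the `'treatment'` variant name are illustrative.

```javascript
// 32-bit FNV-1a style hash of a string; stable across processes and restarts.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Assign a user to 'control' or 'treatment' for a named experiment.
function getExperiment(userId, experimentName, treatmentShare = 0.1) {
  // Including the experiment name keeps buckets independent across tests.
  const bucket = fnv1a(`${experimentName}:${userId}`) / 0x100000000; // [0, 1)
  return bucket < treatmentShare ? 'treatment' : 'control';
}

console.log(getExperiment('user-42', 'threshold_test'));
```

Because the assignment depends only on the user ID and experiment name, no lookup table is needed and a user's variant stays fixed for the life of the experiment.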

Analyzing results

-- Calculate precision and recall for each threshold variant
SELECT
  variant,
  COUNT(*) as total_decisions,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) as true_positives,
  SUM(CASE WHEN blocked AND NOT actually_harmful THEN 1 ELSE 0 END) as false_positives,
  SUM(CASE WHEN NOT blocked AND actually_harmful THEN 1 ELSE 0 END) as false_negatives,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) * 1.0 /
    NULLIF(SUM(CASE WHEN blocked THEN 1 ELSE 0 END), 0) as precision,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) * 1.0 /
    NULLIF(SUM(CASE WHEN actually_harmful THEN 1 ELSE 0 END), 0) as recall
FROM moderation_decisions
WHERE experiment = 'threshold_test'
GROUP BY variant;

Threshold tuning process

Step 1: Establish a baseline

// Start with conservative thresholds
const INITIAL_THRESHOLDS = {
  threshold_toxicity: 0.5,
  threshold_profanity: 0.5,
  threshold_spam: 0.6
};

Step 2: Collect data

async function logModerationDecision(content, result, decision) {
  await db.insert('moderation_log', {
    content_id: content.id,
    content_hash: hashContent(content.text),
    scores: result.results,
    thresholds_used: currentThresholds,
    decision: decision,
    user_trust_level: content.author.trustLevel,
    created_at: Date.now()
  });
}

Step 3: Analyze error rates

Review blocked content and user appeals to identify:

False positives: safe content that was wrongly blocked
False negatives: harmful content that went undetected
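To see which threshold is responsible for most wrong blocks, it can help to group reviewed blocks by the category that triggered them. This is a sketch only: the review record shape (`category`, `judgedSafe`) and the helper name are assumptions, and the value computed is the share of blocks later judged safe (that is, 1 minus precision) for each category.

```javascript
// Hypothetical review records: the category that triggered the block and
// whether a human reviewer later judged the content to be safe.
function wrongBlockShareByCategory(reviews) {
  const totals = {};
  for (const { category, judgedSafe } of reviews) {
    if (!totals[category]) totals[category] = { blocked: 0, wrong: 0 };
    totals[category].blocked += 1;
    if (judgedSafe) totals[category].wrong += 1;
  }
  return Object.fromEntries(
    Object.entries(totals).map(([category, t]) => [category, t.wrong / t.blocked])
  );
}

const shares = wrongBlockShareByCategory([
  { category: 'toxicity', judgedSafe: true },
  { category: 'toxicity', judgedSafe: false },
  { category: 'spam', judgedSafe: false }
]);
console.log(shares);
```

A category with a high share is a candidate for a higher threshold in the next step.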

Step 4: Adjust and iterate

// Based on analysis, raise thresholds that fire too often
const ADJUSTED_THRESHOLDS = {
  threshold_toxicity: 0.65,  // Raised after false positives
  threshold_profanity: 0.55,
  threshold_spam: 0.7
};

Step 5: Monitor continuously

Set up alerts to track threshold effectiveness:

// getModerationStats and sendAlert are app-side helpers that you supply;
// reportLimit is the report count you consider abnormal for your platform.
async function checkModerationHealth(reportLimit) {
  const stats = await getModerationStats({ hours: 24 });

  // Alert if false positive rate too high
  if (stats.appealSuccessRate > 0.3) {
    sendAlert('High appeal success rate - thresholds may be too strict');
  }

  // Alert if harmful content is getting through
  if (stats.reportedAfterApproval > reportLimit) {
    sendAlert('Increase in reported content - thresholds may be too permissive');
  }
}

Best practices summary

Start conservative: begin with stricter thresholds and relax them based on data.
Use context: different surfaces (public posts, private messages, profiles) call for different thresholds.
Trust levels matter: adjust the effective threshold in your application based on user reputation.
Measure everything: track precision, recall, and user impact.
Iterate continuously: content moderation is never "done".
Document decisions: record why thresholds were changed.
Have fallbacks: route borderline cases to human review.

Remember: in Discuse these thresholds are project settings. Change them in the dashboard or through the settings API; the per-request settings object only specifies which check_* checks run.

Next steps

AI Content Moderation Guide - Introduction to AI-based moderation
Scaling Content Moderation - Implementing for high volume
Text Analysis - Details of text-specific moderation