Configuring Detection Thresholds

Detection thresholds set the confidence level at which Discuse flags content, letting you balance false positives against false negatives. In Discuse these thresholds are project settings, configured in the dashboard (or through the settings API); an API request only toggles which checks run and does not carry numeric thresholds. This guide explains how the trade-off works and how to choose values for your platform.

How Discuse thresholds work

Every check returns a confidence score from 0.0 to 1.0 and a hit flag. hit is set when the score reaches or exceeds the threshold configured for that category in your project. The configurable thresholds are:

Threshold (project setting)	Applies to
`threshold_sentiment`	Overall negative-sentiment cut-off
`threshold_toxicity`	Toxic text
`threshold_profanity`	Profanity
`threshold_threat`	Threats
`threshold_insult`	Insults
`threshold_spam`	Spam classifier confidence
`threshold_images`	Explicit/obscene imagery (overall)
`threshold_images_porn`	Pornographic imagery
`threshold_images_sexual`	Sexually suggestive imagery

These are the names exposed by the project settings API; the settings object on each request carries only the on/off check_* toggles and expected_language. You change thresholds in the dashboard, not at the per-request level.

The trade-off

Lower Threshold = More Strict
├── More content flagged
├── Higher false positive rate
├── Fewer harmful posts slip through
└── More user friction

Higher Threshold = More Permissive
├── Less content flagged
├── Lower false positive rate
├── More harmful posts may slip through
└── Better user experience

Visualizing the trade-off

                False Positives ─────────────────────────►
                     Few                              Many
               ┌─────────────────────────────────────────┐
False      Few │ ◄── Ideal zone           Too strict ──► │
Negatives      │     (few false positives,               │
               │      few false negatives)               │
      │        │                                         │
      │        │            ● ◄── Your platform's        │
      │        │                  optimal point          │
      ▼        │                                         │
          Many │ ◄── Too permissive                      │
               └─────────────────────────────────────────┘
                       Threshold: 0.9   0.7   0.5   0.3
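
To make the trade-off concrete, the sketch below counts false positives and false negatives at several thresholds for a small hand-labelled sample. The scores and labels are made up for illustration, and countErrors is a hypothetical helper, not part of Discuse. The only rule taken from this guide is that a check hits when the score reaches or exceeds the threshold.

```javascript
// Illustrative sample: confidence scores with human labels (made-up data).
const labelledSample = [
  { score: 0.95, harmful: true },
  { score: 0.80, harmful: true },
  { score: 0.65, harmful: false },
  { score: 0.55, harmful: true },
  { score: 0.40, harmful: false },
  { score: 0.20, harmful: false }
];

// Hypothetical helper: count both error types at one threshold.
// A score at or above the threshold counts as a hit.
function countErrors(sample, threshold) {
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const { score, harmful } of sample) {
    const hit = score >= threshold;
    if (hit && !harmful) falsePositives += 1;
    if (!hit && harmful) falseNegatives += 1;
  }
  return { threshold, falsePositives, falseNegatives };
}

// Sweep the thresholds shown on the diagram's axis.
const sweep = [0.3, 0.5, 0.7, 0.9].map((t) => countErrors(labelledSample, t));
console.table(sweep);
```

On this sample, moving the threshold from 0.9 down to 0.3 removes both false negatives and introduces two false positives. Running the same sweep over your own reviewed content is one way to locate the point that suits your platform.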

Which thresholds suit which platform?

The values below are starting points expressed in Discuse's threshold names (threshold_toxicity, threshold_images_porn, and so on). Treat them as a baseline to tune against your own false-positive data, not as guaranteed-correct numbers. A lower threshold flags more aggressively.

Social media platforms

General-purpose social platforms need balanced moderation:

const SOCIAL_MEDIA_THRESHOLDS = {
  threshold_toxicity: 0.7,
  threshold_profanity: 0.6,
  threshold_threat: 0.5,        // Lower (stricter) for threats
  threshold_insult: 0.7,
  threshold_spam: 0.75,
  threshold_images_porn: 0.6,
  threshold_images_sexual: 0.8  // More permissive for suggestive content
};

Professional / business platforms

Business contexts generally call for stricter moderation:

const PROFESSIONAL_THRESHOLDS = {
  threshold_toxicity: 0.5,
  threshold_profanity: 0.4,
  threshold_threat: 0.3,
  threshold_insult: 0.5,
  threshold_spam: 0.6,
  threshold_images_porn: 0.3,
  threshold_images_sexual: 0.5
};

Gaming communities

Gaming platforms can tolerate somewhat more banter while staying strict on real threats:

const GAMING_THRESHOLDS = {
  threshold_toxicity: 0.8,
  threshold_profanity: 0.85,    // Banter allowed
  threshold_threat: 0.5,        // Still strict on real threats
  threshold_insult: 0.8,
  threshold_spam: 0.8,
  threshold_images_porn: 0.6,
  threshold_images_sexual: 0.9
};

Children's platforms

Platforms built for minors need the strictest settings:

const CHILDRENS_THRESHOLDS = {
  threshold_toxicity: 0.3,
  threshold_profanity: 0.2,
  threshold_threat: 0.2,
  threshold_insult: 0.3,
  threshold_spam: 0.5,
  threshold_images_porn: 0.1,   // Maximum strictness
  threshold_images_sexual: 0.2
};

Can I change thresholds dynamically?

Discuse stores one set of thresholds per project, so per-user or per-context variation lives in your application. The patterns below compute an effective threshold on your side; you then either route to different projects (each with its own configured thresholds) or apply the comparison yourself against the scores Discuse returns.

User trust levels

Adjust effective thresholds based on user reputation:

// PLATFORM_THRESHOLDS is assumed to be the flat preset you chose above
// (for example SOCIAL_MEDIA_THRESHOLDS).
function getThresholds(user) {
  const baseThresholds = PLATFORM_THRESHOLDS;

  const trustMultipliers = {
    new_user: 0.8,       // Stricter (lower effective threshold)
    basic_user: 1.0,     // Standard
    verified_user: 1.15, // Slightly more permissive
    trusted_user: 1.3,   // More permissive
    moderator: 1.5       // Most permissive
  };

  // Unknown trust levels fall back to the standard multiplier
  const multiplier = trustMultipliers[user.trustLevel] ?? 1.0;

  // Scale every threshold, capping at 0.95 so no check is effectively disabled
  return Object.fromEntries(
    Object.entries(baseThresholds).map(([key, value]) => [
      key,
      Math.min(value * multiplier, 0.95)
    ])
  );
}

Context-based thresholds

Different content types may need different thresholds:

const CONTEXT_THRESHOLDS = {
  // Public posts visible to everyone
  public_post: {
    toxic: 0.6,
    profanity: 0.5
  },

  // Direct messages between users
  direct_message: {
    toxic: 0.7,      // Slightly more permissive
    profanity: 0.6
  },

  // Comments on public content
  comment: {
    toxic: 0.55,     // Stricter than posts
    profanity: 0.5
  },

  // Profile information
  profile: {
    toxic: 0.5,      // Strict for public-facing content
    profanity: 0.4
  }
};

function getContextThresholds(contentType) {
  return CONTEXT_THRESHOLDS[contentType] || CONTEXT_THRESHOLDS.public_post;
}

Time-based adjustments

Adjust thresholds during high-risk periods:

function getTimeAdjustedThresholds(baseThresholds) {
  const hour = new Date().getHours();
  const dayOfWeek = new Date().getDay();

  // Stricter during off-hours when fewer moderators available
  const isOffHours = hour < 6 || hour > 22;
  const isWeekend = dayOfWeek === 0 || dayOfWeek === 6;

  let multiplier = 1.0;

  if (isOffHours) multiplier *= 0.85;
  if (isWeekend) multiplier *= 0.9;

  return adjustThresholds(baseThresholds, multiplier);
}
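
The adjustThresholds helper called above is not defined in this guide. A minimal sketch follows, assuming thresholds are a flat object of numbers like the presets earlier. The 0.95 ceiling mirrors the cap used in getThresholds; the 0.05 floor is an assumption added here so that stacked multipliers cannot push a threshold to zero.

```javascript
// Hypothetical helper: scale every numeric threshold by a multiplier and
// clamp the result to [min, max] so it stays a usable confidence cut-off.
function adjustThresholds(baseThresholds, multiplier, min = 0.05, max = 0.95) {
  return Object.fromEntries(
    Object.entries(baseThresholds).map(([key, value]) => [
      key,
      Math.min(Math.max(value * multiplier, min), max)
    ])
  );
}

// Off-hours on a weekend: both multipliers stack (0.85 * 0.9 = 0.765).
const nightWeekend = adjustThresholds(
  { threshold_toxicity: 0.7, threshold_threat: 0.5 },
  0.85 * 0.9
);
console.log(nightWeekend);
```

Because the multipliers stack, a weekend night tightens every threshold by roughly a quarter, so it is worth checking the resulting values against your strictest preset before enabling this.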

Implementing threshold configuration

Centralized configuration

// config/moderation.js — mirrors your Discuse project thresholds so app-side
// routing stays in sync with the values configured in the dashboard.
export const ModerationConfig = {
  thresholds: {
    threshold_toxicity:      parseFloat(process.env.THRESHOLD_TOXICITY || '0.7'),
    threshold_profanity:     parseFloat(process.env.THRESHOLD_PROFANITY || '0.6'),
    threshold_threat:        parseFloat(process.env.THRESHOLD_THREAT || '0.5'),
    threshold_insult:        parseFloat(process.env.THRESHOLD_INSULT || '0.7'),
    threshold_spam:          parseFloat(process.env.THRESHOLD_SPAM || '0.75'),
    threshold_images_porn:   parseFloat(process.env.THRESHOLD_IMAGES_PORN || '0.6'),
    threshold_images_sexual: parseFloat(process.env.THRESHOLD_IMAGES_SEXUAL || '0.8')
  },

  actions: {
    high_confidence: 'auto_block',     // score > 0.95
    medium_confidence: 'human_review', // score from 0.7 to 0.95
    low_confidence: 'allow_with_flag'  // score from the threshold up to 0.7
  }
};
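
The action bands in the comments above can be turned into a small decision function. This is a sketch, not a Discuse feature: decideAction is a hypothetical helper, the band edges are copied from those comments, and the 'allow' result for scores below the threshold is an assumption, since the config does not name an action for that case.

```javascript
// Band edges taken from the comments in ModerationConfig.actions.
const AUTO_BLOCK_ABOVE = 0.95;
const REVIEW_FROM = 0.7;

// Hypothetical helper: map one score to an action for a given threshold.
function decideAction(score, threshold) {
  if (score < threshold) return 'allow';              // no hit (assumed action name)
  if (score > AUTO_BLOCK_ABOVE) return 'auto_block';  // high confidence
  if (score >= REVIEW_FROM) return 'human_review';    // medium confidence
  return 'allow_with_flag';                           // hit, but low confidence
}

console.log(decideAction(0.97, 0.6)); // auto_block
console.log(decideAction(0.80, 0.6)); // human_review
console.log(decideAction(0.65, 0.6)); // allow_with_flag
console.log(decideAction(0.50, 0.6)); // allow
```

Note that when a threshold is itself 0.7 or higher, as with threshold_spam at 0.75, the low-confidence band is empty and every hit goes to review or auto-block.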

Runtime threshold updates

Allow threshold changes without redeploying:

class ModerationService {
  constructor() {
    this.thresholds = defaultThresholds;
    this.loadRemoteConfig();
  }

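  async loadRemoteConfig() {
    try {
      const response = await fetch('/api/admin/moderation-config');
      // fetch() resolves on HTTP error statuses too, so check explicitly
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      const data = await response.json();
      // Merge so keys missing from the remote config keep their defaults
      this.thresholds = { ...this.thresholds, ...data.thresholds };
      console.log('Loaded remote moderation config');
    } catch (error) {
      console.warn('Using default thresholds:', error);
    }
  }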

  async checkContent(content, context) {
    const result = await callModerationAPI(content);
    const thresholds = this.getThresholdsForContext(context);

    return this.applyThresholds(result, thresholds);
  }
}
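
The applyThresholds step used above is where your application performs the comparison itself. A minimal standalone sketch follows. It assumes the scores arrive as a flat map of category name to confidence and that each category maps to a threshold_<category> key; both are assumptions made for illustration, so adapt the key lookup to the response shape your project actually receives.

```javascript
// ASSUMPTION: `scores` is a flat map of category name to confidence score,
// e.g. { toxicity: 0.82 }. The real response shape may differ.
function applyThresholds(scores, thresholds) {
  const hits = Object.entries(scores)
    .filter(([category, score]) => {
      const threshold = thresholds[`threshold_${category}`];
      // Categories without a configured numeric threshold never hit.
      return typeof threshold === 'number' && score >= threshold;
    })
    .map(([category]) => category);

  return { flagged: hits.length > 0, hits };
}

const decision = applyThresholds(
  { toxicity: 0.82, profanity: 0.4, spam: 0.75 },
  { threshold_toxicity: 0.7, threshold_profanity: 0.6, threshold_spam: 0.75 }
);
console.log(decision); // flagged, with hits for toxicity and spam
```

A score equal to its threshold counts as a hit, matching the "reaches or exceeds" rule described at the start of this guide.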

Measuring threshold effectiveness

Key metrics

const MODERATION_METRICS = {
  // Accuracy
  precision: 'True positives / (True positives + False positives)',
  recall: 'True positives / (True positives + False negatives)',
  f1_score: 'Harmonic mean of precision and recall',

  // User impact
  block_rate: 'Content blocked / Total content',
  appeal_rate: 'Appeals filed / Content blocked',
  appeal_success: 'Appeals won / Appeals filed',

  // Operational
  review_queue_size: 'Items waiting for human review',
  review_time: 'Average time to human decision'
};
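
The accuracy formulas above translate directly into code. The sketch below is illustrative: computeAccuracyMetrics is a hypothetical helper and the counts in the example are made up. It returns null rather than dividing by zero when nothing was flagged or nothing was harmful.

```javascript
// Hypothetical helper: precision, recall, and F1 from reviewed-decision counts.
function computeAccuracyMetrics({ truePositives, falsePositives, falseNegatives }) {
  const flagged = truePositives + falsePositives;
  const harmful = truePositives + falseNegatives;

  // Return null instead of dividing by zero when a denominator is empty.
  const precision = flagged > 0 ? truePositives / flagged : null;
  const recall = harmful > 0 ? truePositives / harmful : null;

  // F1 is the harmonic mean of precision and recall.
  const f1 =
    precision !== null && recall !== null && precision + recall > 0
      ? (2 * precision * recall) / (precision + recall)
      : null;

  return { precision, recall, f1 };
}

// Made-up counts: 80 correct blocks, 20 wrongful blocks, 20 misses.
const metrics = computeAccuracyMetrics({
  truePositives: 80,
  falsePositives: 20,
  falseNegatives: 20
});
console.log(metrics);
```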

A/B testing thresholds

Test threshold changes on a slice of traffic:

async function moderateWithExperiment(content, userId) {
  const experiment = getExperiment(userId, 'threshold_test');

  const thresholds = experiment === 'control'
    ? CURRENT_THRESHOLDS
    : EXPERIMENTAL_THRESHOLDS;

  const result = await checkContent(content);
  const decision = applyThresholds(result, thresholds);

  // Log for analysis
  await logExperiment({
    experiment: 'threshold_test',
    variant: experiment,
    content_id: content.id,
    scores: result,
    decision: decision,
    timestamp: Date.now()
  });

  return decision;
}
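
The getExperiment call above needs to give the same user the same variant on every request. One common way to do that is to hash the user ID into a bucket, sketched below. The hash choice (a 32-bit FNV-1a over UTF-16 code units), the 'treatment' variant name, and the 10% default share are all assumptions for illustration; the code above only requires that the control group be labelled 'control'.

```javascript
// 32-bit FNV-1a style hash over UTF-16 code units: small, dependency-free,
// and stable across runs, which is all that bucketing needs.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical implementation: deterministic assignment by hashed user ID.
function getExperiment(userId, experimentName, treatmentShare = 0.1) {
  // Salting with the experiment name keeps buckets independent across experiments.
  const bucket = fnv1a(`${experimentName}:${userId}`) / 0x100000000; // in [0, 1)
  return bucket < treatmentShare ? 'treatment' : 'control';
}

console.log(getExperiment('user-42', 'threshold_test'));
```

Assigning by user rather than by request keeps each person's experience consistent and makes appeal rates comparable between variants.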

Analyzing results

-- Calculate precision and recall for each threshold variant
SELECT
  variant,
  COUNT(*) as total_decisions,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) as true_positives,
  SUM(CASE WHEN blocked AND NOT actually_harmful THEN 1 ELSE 0 END) as false_positives,
  SUM(CASE WHEN NOT blocked AND actually_harmful THEN 1 ELSE 0 END) as false_negatives,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) * 1.0 /
    NULLIF(SUM(CASE WHEN blocked THEN 1 ELSE 0 END), 0) as precision,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) * 1.0 /
    NULLIF(SUM(CASE WHEN actually_harmful THEN 1 ELSE 0 END), 0) as recall
FROM moderation_decisions
WHERE experiment = 'threshold_test'
GROUP BY variant;

Threshold tuning workflow

Step 1: Establish a baseline

// Start with conservative thresholds
const INITIAL_THRESHOLDS = {
  threshold_toxicity: 0.5,
  threshold_profanity: 0.5,
  threshold_spam: 0.6
};

Step 2: Collect data

async function logModerationDecision(content, result, decision) {
  await db.insert('moderation_log', {
    content_id: content.id,
    content_hash: hashContent(content.text),
    scores: result.results,
    thresholds_used: currentThresholds,
    decision: decision,
    user_trust_level: content.author.trustLevel,
    created_at: Date.now()
  });
}

Step 3: Analyze false rates

Review blocked content and user appeals to identify:

False positives: safe content that was blocked by mistake
False negatives: harmful content that was not caught

Step 4: Adjust and repeat

// Based on analysis, raise thresholds that fire too often
const ADJUSTED_THRESHOLDS = {
  threshold_toxicity: 0.65,  // Raised after false positives
  threshold_profanity: 0.55,
  threshold_spam: 0.7
};

Step 5: Monitor continuously

Set up alerts on threshold effectiveness:

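// getModerationStats and alert are placeholders for your own stats query and
// alerting hook; reportLimit is the number of post-approval reports you tolerate.
async function checkModerationHealth(reportLimit) {
  const last24Hours = 24 * 60 * 60 * 1000; // Look-back window in milliseconds
  const stats = await getModerationStats(last24Hours);

  // Alert if false positive rate too high
  if (stats.appealSuccessRate > 0.3) {
    alert('High appeal success rate - thresholds may be too strict');
  }

  // Alert if harmful content is getting through
  if (stats.reportedAfterApproval > reportLimit) {
    alert('Increase in reported content - thresholds may be too permissive');
  }
}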

Best practices summary

Start conservative: begin with strict thresholds and loosen them based on data.
Use context: different surfaces (public posts, DMs, profiles) warrant different thresholds.
Trust levels matter: adjust effective thresholds in your app according to user reputation.
Measure everything: track precision, recall, and user impact.
Iterate continuously: moderation is never "done".
Document decisions: keep a record of why thresholds changed.
Keep a fallback: route borderline cases to human review.

Remember: in Discuse these thresholds are project settings. Change them in the dashboard or through the settings API; the per-request settings object only toggles which check_* checks run.

Next steps

AI Content Moderation Guide - Understanding AI moderation
Scaling Content Moderation - Implementation at scale
Text Analysis - Text-specific moderation details