Configuring Detection Thresholds

Detection thresholds determine the confidence level at which Discuse flags content, and they set the balance between false positives and false negatives. In Discuse, these thresholds are project settings configured in the dashboard (or through the settings API); the API request itself only toggles which checks run and carries no numeric threshold values. This guide explains how that trade-off works and how to choose suitable values for your platform.

How do Discuse thresholds work?

Each check returns a confidence score between 0.0 and 1.0 and a hit flag. The hit flag is set when the score reaches or exceeds the threshold configured for that category in your project. The configurable thresholds are:

Threshold (project setting)	Applies to
`threshold_sentiment`	Overall negative-sentiment cutoff
`threshold_toxicity`	Toxic text
`threshold_profanity`	Profanity
`threshold_threat`	Threats
`threshold_insult`	Insults
`threshold_spam`	Spam classifier confidence
`threshold_images`	Explicit images (general)
`threshold_images_porn`	Pornographic images
`threshold_images_sexual`	Sexually suggestive images

These are the names exposed by the project settings API; the settings object sent with each request contains only the on/off check_* keys and the expected_language value. You change thresholds from the dashboard, not per request.
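
The hit rule described above comes down to an inclusive comparison. The following is a minimal sketch of that rule, not Discuse's implementation; the helper name `isHit` and the example scores are assumptions made for illustration.

```javascript
// Hypothetical helper mirroring the documented rule: a check "hits" when its
// score reaches or exceeds the threshold configured for that category.
function isHit(score, threshold) {
  return score >= threshold;
}

// Example with threshold_toxicity set to 0.7 in the project settings
console.log(isHit(0.7, 0.7));  // true: reaching the threshold counts as a hit
console.log(isHit(0.69, 0.7)); // false
```

Because the comparison is inclusive, a score exactly equal to the threshold is flagged.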

The trade-off

Lower Threshold = More Strict
├── More content flagged
├── Higher false positive rate
├── Fewer harmful posts slip through
└── More user friction

Higher Threshold = More Permissive
├── Less content flagged
├── Lower false positive rate
├── More harmful posts may slip through
└── Better user experience

Visualizing the trade-off

                False Positives ─────────────────────────►
                     Few                              Many
               ┌─────────────────────────────────────────┐
False      Few │   ◄─── Ideal Zone                      │
Negatives      │        (High threshold,                │
               │         low false rates)               │
      │        │                                        │
      │        │              Your platform's           │
      │        │              optimal point →  ●        │
      ▼        │                                        │
          Many │                      Too permissive ──►│
               └─────────────────────────────────────────┘
                       Threshold: 0.3   0.5   0.7   0.9
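
To make the trade-off concrete, the sketch below counts false positives and false negatives at several thresholds. The labeled samples are made up for illustration, and `countErrors` is a hypothetical helper; it applies the same rule as the hit flag (score at or above the threshold means flagged).

```javascript
// Made-up labeled samples, for illustration only: each has a score and a
// human judgment of whether the content was actually harmful.
const samples = [
  { score: 0.95, harmful: true },
  { score: 0.80, harmful: true },
  { score: 0.65, harmful: false },
  { score: 0.60, harmful: true },
  { score: 0.40, harmful: false },
  { score: 0.35, harmful: true },
  { score: 0.20, harmful: false },
  { score: 0.10, harmful: false }
];

// Count both error types at a given threshold.
function countErrors(items, threshold) {
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const { score, harmful } of items) {
    const flagged = score >= threshold;
    if (flagged && !harmful) falsePositives += 1; // safe content flagged
    if (!flagged && harmful) falseNegatives += 1; // harmful content missed
  }
  return { threshold, falsePositives, falseNegatives };
}

for (const threshold of [0.3, 0.5, 0.7, 0.9]) {
  console.log(countErrors(samples, threshold));
}
```

On this toy data, raising the threshold from 0.3 to 0.9 reduces false positives from 2 to 0 while false negatives grow from 0 to 3.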

Which thresholds suit which platform?

The values below are starting points expressed with Discuse's threshold names (threshold_toxicity, threshold_images_porn, and so on). Treat them as a baseline to tune against your own false positive data, not as definitive numbers. A lower threshold flags content more aggressively.

Social media platforms

General-purpose social platforms need balanced moderation:

const SOCIAL_MEDIA_THRESHOLDS = {
  threshold_toxicity: 0.7,
  threshold_profanity: 0.6,
  threshold_threat: 0.5,        // Lower (stricter) for threats
  threshold_insult: 0.7,
  threshold_spam: 0.75,
  threshold_images_porn: 0.6,
  threshold_images_sexual: 0.8  // More permissive for suggestive content
};

Professional / business platforms

Workplace environments usually call for stricter moderation:

const PROFESSIONAL_THRESHOLDS = {
  threshold_toxicity: 0.5,
  threshold_profanity: 0.4,
  threshold_threat: 0.3,
  threshold_insult: 0.5,
  threshold_spam: 0.6,
  threshold_images_porn: 0.3,
  threshold_images_sexual: 0.5
};

Gaming communities

Gaming platforms can tolerate more banter while staying strict on genuine threats:

const GAMING_THRESHOLDS = {
  threshold_toxicity: 0.8,
  threshold_profanity: 0.85,    // Banter allowed
  threshold_threat: 0.5,        // Still strict on real threats
  threshold_insult: 0.8,
  threshold_spam: 0.8,
  threshold_images_porn: 0.6,
  threshold_images_sexual: 0.9
};

Platforms for children

Platforms aimed at minors require the strictest settings:

const CHILDRENS_THRESHOLDS = {
  threshold_toxicity: 0.3,
  threshold_profanity: 0.2,
  threshold_threat: 0.2,
  threshold_insult: 0.3,
  threshold_spam: 0.5,
  threshold_images_porn: 0.1,   // Maximum strictness
  threshold_images_sexual: 0.2
};

Can I change thresholds dynamically?

Discuse stores a single set of thresholds per project, so per-user or per-context variations are handled in your application. The patterns below compute the effective threshold on your side; you then either route requests to different projects (each with its own configured thresholds) or apply the comparison yourself against the scores Discuse returns.
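
The second option, comparing the returned scores yourself, could look like the sketch below. The response shape is an assumption: `scores` is taken to be a flat object mapping a category name to its score, and each category is matched to a `threshold_<category>` key. Adapt both to the actual response format.

```javascript
// Assumed shape: `scores` maps a category name to a 0.0-1.0 score, e.g.
// { toxicity: 0.82, spam: 0.1 }. This is an illustration, not the API format.
function applyThresholds(scores, thresholds) {
  const hits = [];
  for (const [category, score] of Object.entries(scores)) {
    const threshold = thresholds[`threshold_${category}`];
    // Skip categories that have no app-side threshold configured.
    if (typeof threshold !== 'number') continue;
    if (score >= threshold) hits.push({ category, score, threshold });
  }
  return { flagged: hits.length > 0, hits };
}

const decision = applyThresholds(
  { toxicity: 0.82, spam: 0.1 },
  { threshold_toxicity: 0.7, threshold_spam: 0.75 }
);
console.log(decision.flagged); // true, because 0.82 >= 0.7
```

Returning the list of hits, rather than a bare boolean, keeps the triggering category and score available for logging and review.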

User trust levels

Adjust effective thresholds based on user reputation:

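// Scale every numeric threshold by `multiplier`, recursing into nested groups.
// Results are capped at 0.95 so that no check is effectively switched off.
function adjustThresholds(thresholds, multiplier) {
  return Object.fromEntries(
    Object.entries(thresholds).map(([key, value]) => [
      key,
      typeof value === 'number'
        ? Math.min(value * multiplier, 0.95)
        : adjustThresholds(value, multiplier)
    ])
  );
}

function getThresholds(user) {
  // PLATFORM_THRESHOLDS is whichever preset fits your platform,
  // for example SOCIAL_MEDIA_THRESHOLDS.
  const baseThresholds = PLATFORM_THRESHOLDS;

  const trustMultipliers = {
    new_user: 0.8,       // Stricter (lower effective threshold)
    basic_user: 1.0,     // Standard
    verified_user: 1.15, // Slightly more permissive
    trusted_user: 1.3,   // More permissive
    moderator: 1.5       // Most permissive
  };

  // Unknown or missing trust levels fall back to the standard multiplier.
  const multiplier = trustMultipliers[user.trustLevel] ?? 1.0;

  return adjustThresholds(baseThresholds, multiplier);
}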

Context-based thresholds

Different content types may need different thresholds:

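// Keys use the same threshold_* names as the project settings so these
// objects can be combined with the platform presets.
const CONTEXT_THRESHOLDS = {
  // Public posts visible to everyone
  public_post: {
    threshold_toxicity: 0.6,
    threshold_profanity: 0.5
  },

  // Direct messages between users
  direct_message: {
    threshold_toxicity: 0.7,      // Slightly more permissive
    threshold_profanity: 0.6
  },

  // Comments on public content
  comment: {
    threshold_toxicity: 0.55,     // Stricter than posts
    threshold_profanity: 0.5
  },

  // Profile information
  profile: {
    threshold_toxicity: 0.5,      // Strict for public-facing content
    threshold_profanity: 0.4
  }
};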

function getContextThresholds(contentType) {
  return CONTEXT_THRESHOLDS[contentType] || CONTEXT_THRESHOLDS.public_post;
}

Time-based adjustments

Adjust thresholds during high-risk periods:

function getTimeAdjustedThresholds(baseThresholds, now = new Date()) {
  // Read the clock once; getHours() and getDay() use the server's local time zone.
  const hour = now.getHours();
  const dayOfWeek = now.getDay();

  // Stricter during off-hours (23:00 to 05:59) when fewer moderators are available
  const isOffHours = hour < 6 || hour > 22;
  const isWeekend = dayOfWeek === 0 || dayOfWeek === 6;

  let multiplier = 1.0;

  if (isOffHours) multiplier *= 0.85;
  if (isWeekend) multiplier *= 0.9;

  // adjustThresholds(thresholds, multiplier) scales each numeric threshold.
  return adjustThresholds(baseThresholds, multiplier);
}
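
When trust and time adjustments are stacked, the combined multiplier can drift far from 1.0. The sketch below shows one way to combine them while clamping the result. The helper names and the 0.05 lower bound are assumptions for illustration; the 0.95 upper bound matches the cap used in the trust-level example.

```javascript
// Multiply a flat threshold set by `multiplier`, keeping each value inside
// [min, max] so stacked adjustments can neither flag almost everything nor
// switch a check off.
function scaleThresholds(thresholds, multiplier, min = 0.05, max = 0.95) {
  return Object.fromEntries(
    Object.entries(thresholds).map(([key, value]) => [
      key,
      Math.min(Math.max(value * multiplier, min), max)
    ])
  );
}

// Combine a trust multiplier and a time multiplier into one effective set.
function effectiveThresholds(base, trustMultiplier, timeMultiplier) {
  return scaleThresholds(base, trustMultiplier * timeMultiplier);
}

const base = { threshold_toxicity: 0.6, threshold_profanity: 0.5 };
// New user (0.8) posting during weekend off-hours (0.85 * 0.9)
console.log(effectiveThresholds(base, 0.8, 0.85 * 0.9));
// threshold_toxicity is about 0.367, threshold_profanity about 0.306
```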

Implementing threshold configuration

Centralized configuration

// config/moderation.js — mirrors your Discuse project thresholds so app-side
// routing stays in sync with the values configured in the dashboard.
export const ModerationConfig = {
  thresholds: {
    threshold_toxicity:      parseFloat(process.env.THRESHOLD_TOXICITY || '0.7'),
    threshold_profanity:     parseFloat(process.env.THRESHOLD_PROFANITY || '0.6'),
    threshold_threat:        parseFloat(process.env.THRESHOLD_THREAT || '0.5'),
    threshold_insult:        parseFloat(process.env.THRESHOLD_INSULT || '0.7'),
    threshold_spam:          parseFloat(process.env.THRESHOLD_SPAM || '0.75'),
    threshold_images_porn:   parseFloat(process.env.THRESHOLD_IMAGES_PORN || '0.6'),
    threshold_images_sexual: parseFloat(process.env.THRESHOLD_IMAGES_SEXUAL || '0.8')
  },

  actions: {
    high_confidence: 'auto_block',     // score > 0.95
    medium_confidence: 'human_review', // 0.7 - 0.95
    low_confidence: 'allow_with_flag'  // threshold - 0.7
  }
};
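
The action tiers in the configuration above can be turned into a small decision function. This is a sketch under stated assumptions: the boundary handling (0.7 inclusive for review, strictly above 0.95 for blocking) and the `allow` result for scores below the threshold are choices made here, not something the configuration defines.

```javascript
const ACTIONS = {
  high_confidence: 'auto_block',     // score > 0.95
  medium_confidence: 'human_review', // 0.7 - 0.95
  low_confidence: 'allow_with_flag'  // threshold - 0.7
};

// Map a score to an action, given the threshold for its category.
function chooseAction(score, threshold) {
  if (score < threshold) return 'allow'; // assumed: no hit means no action
  if (score > 0.95) return ACTIONS.high_confidence;
  if (score >= 0.7) return ACTIONS.medium_confidence;
  return ACTIONS.low_confidence;
}

console.log(chooseAction(0.97, 0.5)); // auto_block
console.log(chooseAction(0.8, 0.5));  // human_review
console.log(chooseAction(0.6, 0.5));  // allow_with_flag
console.log(chooseAction(0.4, 0.5));  // allow
```

For a category whose threshold is above 0.7, such as threshold_spam at 0.75, the low-confidence tier is empty and every hit goes to review or blocking.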

Runtime threshold updates

Allow threshold adjustments without redeploying:

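class ModerationService {
  constructor() {
    // defaultThresholds is your baseline set, e.g. ModerationConfig.thresholds.
    this.thresholds = defaultThresholds;
    // Not awaited: defaults stay in effect until the remote config arrives.
    this.loadRemoteConfig();
  }

  async loadRemoteConfig() {
    try {
      const response = await fetch('/api/admin/moderation-config');
      if (!response.ok) {
        throw new Error(`Config request failed with status ${response.status}`);
      }
      const data = await response.json();
      if (!data.thresholds) {
        throw new Error('Config response has no thresholds field');
      }
      this.thresholds = data.thresholds;
      console.log('Loaded remote moderation config');
    } catch (error) {
      console.warn('Using default thresholds:', error);
    }
  }

  async checkContent(content, context) {
    // callModerationAPI, getThresholdsForContext and applyThresholds are
    // application-side helpers that you provide.
    const result = await callModerationAPI(content);
    const thresholds = this.getThresholdsForContext(context);

    return this.applyThresholds(result, thresholds);
  }
}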
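
Remotely loaded values are worth checking before they replace the defaults, since a malformed entry would silently change moderation behavior. The sketch below is one possible guard; `validateThresholds` is a hypothetical helper, and the rule it applies (known keys only, finite numbers in the 0 to 1 range) is an assumption.

```javascript
// Keep only entries that match a known key and hold a finite number in
// [0, 1]; anything else falls back to the corresponding default.
function validateThresholds(candidate, defaults) {
  const result = { ...defaults };
  for (const key of Object.keys(defaults)) {
    const value = candidate?.[key];
    if (typeof value === 'number' && Number.isFinite(value) && value >= 0 && value <= 1) {
      result[key] = value;
    }
  }
  return result;
}

const defaults = { threshold_toxicity: 0.7, threshold_spam: 0.75 };
const remote = { threshold_toxicity: 0.65, threshold_spam: '0.9', extra: 1 };
console.log(validateThresholds(remote, defaults));
// { threshold_toxicity: 0.65, threshold_spam: 0.75 }
```

The string '0.9' is rejected rather than parsed, and the unknown key `extra` is dropped, so only well-formed overrides take effect.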

Measuring threshold effectiveness

Key metrics

const MODERATION_METRICS = {
  // Accuracy
  precision: 'True positives / (True positives + False positives)',
  recall: 'True positives / (True positives + False negatives)',
  f1_score: 'Harmonic mean of precision and recall',

  // User impact
  block_rate: 'Content blocked / Total content',
  appeal_rate: 'Appeals filed / Content blocked',
  appeal_success: 'Appeals won / Appeals filed',

  // Operational
  review_queue_size: 'Items waiting for human review',
  review_time: 'Average time to human decision'
};
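
The accuracy metrics can be computed directly from confusion counts. The sketch below uses a hypothetical `computeMetrics` helper and made-up counts; it returns null where a ratio is undefined instead of dividing by zero.

```javascript
// Compute precision, recall and F1 from true/false positive and negative counts.
function computeMetrics({ truePositives, falsePositives, falseNegatives }) {
  const flagged = truePositives + falsePositives; // everything that was blocked
  const harmful = truePositives + falseNegatives; // everything actually harmful
  const precision = flagged === 0 ? null : truePositives / flagged;
  const recall = harmful === 0 ? null : truePositives / harmful;
  const f1 =
    precision === null || recall === null || precision + recall === 0
      ? null
      : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Made-up counts: 80 correct blocks, 20 wrong blocks, 20 misses
console.log(computeMetrics({ truePositives: 80, falsePositives: 20, falseNegatives: 20 }));
// precision 0.8, recall 0.8, f1 approximately 0.8
```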

A/B testing thresholds

Test threshold changes on a subset of traffic:

async function moderateWithExperiment(content, userId) {
  const experiment = getExperiment(userId, 'threshold_test');

  const thresholds = experiment === 'control'
    ? CURRENT_THRESHOLDS
    : EXPERIMENTAL_THRESHOLDS;

  const result = await checkContent(content);
  const decision = applyThresholds(result, thresholds);

  // Log for analysis
  await logExperiment({
    experiment: 'threshold_test',
    variant: experiment,
    content_id: content.id,
    scores: result,
    decision: decision,
    timestamp: Date.now()
  });

  return decision;
}
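
The experiment above relies on each user staying in the same variant across requests. One common way to get that is to hash the user ID into a bucket. The sketch below is an assumption about how `getExperiment` might work, not a prescribed implementation: the FNV-1a style hash, the `treatment` variant name, and the `treatmentShare` parameter are all illustrative.

```javascript
// FNV-1a style 32-bit hash over the string's UTF-16 code units.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministically assign a user to 'control' or 'treatment'.
// The experiment name is mixed in so different experiments bucket independently.
function getExperiment(userId, experimentName, treatmentShare = 0.1) {
  const bucket = fnv1a(`${experimentName}:${userId}`) / 0x100000000; // in [0, 1)
  return bucket < treatmentShare ? 'treatment' : 'control';
}

console.log(getExperiment('user-42', 'threshold_test'));
```

Because the assignment depends only on the inputs, no per-user state has to be stored, and raising `treatmentShare` later keeps existing treatment users in the treatment group.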

Analyzing results

-- Calculate precision and recall for each threshold variant
SELECT
  variant,
  COUNT(*) as total_decisions,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) as true_positives,
  SUM(CASE WHEN blocked AND NOT actually_harmful THEN 1 ELSE 0 END) as false_positives,
  SUM(CASE WHEN NOT blocked AND actually_harmful THEN 1 ELSE 0 END) as false_negatives,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) * 1.0 /
    NULLIF(SUM(CASE WHEN blocked THEN 1 ELSE 0 END), 0) as precision,
  SUM(CASE WHEN blocked AND actually_harmful THEN 1 ELSE 0 END) * 1.0 /
    NULLIF(SUM(CASE WHEN actually_harmful THEN 1 ELSE 0 END), 0) as recall
FROM moderation_decisions
WHERE experiment = 'threshold_test'
GROUP BY variant;

Threshold tuning workflow

Step 1: Establish a baseline

// Start with conservative thresholds
const INITIAL_THRESHOLDS = {
  threshold_toxicity: 0.5,
  threshold_profanity: 0.5,
  threshold_spam: 0.6
};

Step 2: Collect data

async function logModerationDecision(content, result, decision) {
  await db.insert('moderation_log', {
    content_id: content.id,
    content_hash: hashContent(content.text),
    scores: result.results,
    thresholds_used: currentThresholds,
    decision: decision,
    user_trust_level: content.author.trustLevel,
    created_at: Date.now()
  });
}

Step 3: Analyze error rates

Review blocked content and user appeals to identify:

False positives: safe content that was blocked by mistake
False negatives: harmful content that was not caught
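
For the false positive side, grouping reviewed blocks by the category that triggered them shows which threshold fires too often. The sketch below assumes a hypothetical reviewed-log shape (`category`, `actuallyHarmful`); false negatives need a separate source, such as user reports on approved content.

```javascript
// For each category, compute the share of blocks that reviewers judged wrong.
function falsePositiveShareByCategory(entries) {
  const totals = {};
  for (const { category, actuallyHarmful } of entries) {
    if (!totals[category]) totals[category] = { blocked: 0, falsePositives: 0 };
    totals[category].blocked += 1;
    if (!actuallyHarmful) totals[category].falsePositives += 1;
  }
  return Object.fromEntries(
    Object.entries(totals).map(([category, t]) => [category, t.falsePositives / t.blocked])
  );
}

// Made-up reviewed entries, for illustration only
const reviewed = [
  { category: 'toxicity', actuallyHarmful: true },
  { category: 'toxicity', actuallyHarmful: false },
  { category: 'toxicity', actuallyHarmful: false },
  { category: 'toxicity', actuallyHarmful: true },
  { category: 'spam', actuallyHarmful: true },
  { category: 'spam', actuallyHarmful: true }
];
console.log(falsePositiveShareByCategory(reviewed));
// { toxicity: 0.5, spam: 0 }
```

In this toy log, half of the toxicity blocks were wrong, which would point at threshold_toxicity as the first candidate to raise.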

Step 4: Adjust and iterate

// Based on analysis, raise thresholds that fire too often
const ADJUSTED_THRESHOLDS = {
  threshold_toxicity: 0.65,  // Raised after false positives
  threshold_profanity: 0.55,
  threshold_spam: 0.7
};

Step 5: Monitor continuously

Set up alerts on threshold effectiveness:

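// Illustrative limit only: derive the real value from your own baseline.
const MAX_REPORTED_AFTER_APPROVAL = 50;

async function checkModerationHealth() {
  // getModerationStats, last24Hours and sendAlert are application-side helpers.
  const stats = await getModerationStats(last24Hours);

  // Alert if false positive rate too high
  if (stats.appealSuccessRate > 0.3) {
    sendAlert('High appeal success rate - thresholds may be too strict');
  }

  // Alert if harmful content is getting through
  if (stats.reportedAfterApproval > MAX_REPORTED_AFTER_APPROVAL) {
    sendAlert('Increase in reported content - thresholds may be too permissive');
  }
}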

Best practices summary

Start cautiously: begin with stricter thresholds and relax them based on data.
Use context: different surfaces (public posts, DMs, profiles) call for different thresholds.
Trust levels matter: adjust the effective threshold based on user reputation in your application.
Measure everything: track precision, recall, and user impact.
Iterate continuously: moderation is never "done".
Document decisions: keep a record of why thresholds changed.
Have fallbacks: route borderline cases to human review.

Remember: in Discuse these thresholds are project settings. Change them from the dashboard or through the settings API; the settings object sent with each request only toggles which check_* checks run.

Next steps

AI Content Moderation Guide - Understanding AI moderation
Scaling Content Moderation - High-volume implementation
Text Analysis - Text-specific moderation details