Spam Detection

Discuse spam detection classifies text as spam or non-spam and returns a confidence score with the verdict. Enable check_spam, send the text to POST https://api.discuse.com/api/v2/check, and read the verdict from results.spamfinder. It catches promotional spam, scams, and bot-generated noise that simple keyword filters often miss.

What does spam detection catch?

The model is trained on high-volume patterns that slip past blocklists:

Promotional spam and unsolicited advertising
Scam and phishing messages
Bot-generated and copy-pasted content

It returns a label (such as spam or ham) along with a confidence score, so you can decide how strict to be.

How do I run a spam check?

curl -X POST https://api.discuse.com/api/v2/check \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "content": {
      "text": "CONGRATULATIONS! You won $10,000! Click here to claim: bit.ly/fake"
    },
    "settings": {
      "check_spam": true
    }
  }'
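
The same request can be made from Python. The sketch below is a minimal version that uses only the standard library. The names build_check_request and check_spam are illustrative helpers, not part of a Discuse SDK, and the timeout value is an assumption.

```python
import json
import urllib.request

API_URL = "https://api.discuse.com/api/v2/check"

def build_check_request(text, api_key):
    """Build the POST request shown in the curl example (illustrative helper)."""
    payload = {"content": {"text": text}, "settings": {"check_spam": True}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )

def check_spam(text, api_key, timeout=10):
    """Send the request and return the decoded JSON response body."""
    request = build_check_request(text, api_key)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The later examples call check_spam(text) with a single argument. Binding the key once, for instance with functools.partial(check_spam, api_key=...), gives that shape.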

Response format

{
  "has_violations": true,
  "cached": false,
  "message": "Spam content detected",
  "results": {
    "hits": true,
    "spamfinder": {
      "text": "CONGRATULATIONS! You won $10,000! Click here to claim: bit.ly/fake",
      "label": "spam",
      "confidence": 0.97,
      "is_spam": true,
      "hit": true
    }
  },
  "usage": {
    "api_requests_used": 8,
    "api_requests_limit": 5000,
    "api_requests_remaining": 4992
  }
}

Which fields does the spam result return?

Field	Type	Meaning
`text`	string	The text that was classified
`label`	string	Model classification (such as `spam` or `ham`)
`confidence`	number	The model's confidence in the label (0.0–1.0)
`is_spam`	bool	Raw model verdict: `label == spam`, threshold-unaware
`hit`	bool	Threshold-aware verdict: `is_spam` AND `confidence` ≥ your project's spam threshold

is_spam vs. hit

is_spam is the raw verdict: the model labeled the text as spam, whatever its confidence. hit additionally requires that confidence meet or exceed your project's configured spam threshold. Gate moderation actions on hit, not is_spam, so that a low-confidence spam label does not penalize a borderline message.
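
The relationship between the two fields can be written out as a small sketch. It mirrors the rule from the field table (is_spam AND confidence ≥ threshold). The function name and the example threshold of 0.7 are assumptions for illustration; the real hit value is computed by the API against your project setting.

```python
def derive_hit(label, confidence, threshold):
    """Reproduce the documented rule: hit = is_spam AND confidence >= threshold."""
    is_spam = label == "spam"  # raw verdict, ignores confidence
    hit = is_spam and confidence >= threshold
    return is_spam, hit

# A low-confidence spam label is is_spam but not a hit at a 0.7 threshold.
print(derive_hit("spam", 0.55, 0.7))  # (True, False)
print(derive_hit("spam", 0.97, 0.7))  # (True, True)
```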

How do I interpret the confidence score?

confidence shows how certain the model is about its label. The bands below describe a spam label:

0.0 – 0.3: very low, likely legitimate.
0.3 – 0.5: low, borderline.
0.5 – 0.7: moderate, suspicious.
0.7 – 0.9: high, very likely spam.
0.9 – 1.0: very high, almost certainly spam.
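
For logging or dashboards it can help to turn the score into one of these bands. The sketch below does that. Because the listed ranges share their endpoints, it assumes each boundary value belongs to the higher band, which the source does not specify. The function name is hypothetical.

```python
import bisect

# Boundaries between the five bands listed above.
_BOUNDARIES = [0.3, 0.5, 0.7, 0.9]
_BANDS = ["very low", "low", "moderate", "high", "very high"]

def confidence_band(confidence):
    """Map a confidence in [0.0, 1.0] to its band name."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    # bisect_right puts a value equal to a boundary into the higher band (assumption).
    return _BANDS[bisect.bisect_right(_BOUNDARIES, confidence)]
```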

Recommended thresholds

Set your project's spam threshold to match your platform's tolerance:

const SPAM_THRESHOLDS = {
  strict: 0.5,      // professional platforms, financial services
  standard: 0.7,    // social media, forums
  permissive: 0.85  // creative platforms, open communities
};

Use cases

Comment sections

// checkSpam is assumed to be your own wrapper around POST /api/v2/check.
async function moderateComment(comment) {
  const result = await checkSpam(comment.text);
  const spam = result.results.spamfinder;

  if (spam.hit) {
    if (spam.confidence > 0.9) {
      return { action: 'reject', reason: 'spam_detected' };
    }
    return { action: 'review', reason: 'possible_spam' };
  }
  return { action: 'approve' };
}

User registration

def validate_registration(user_data):
    bio = user_data.get('bio')
    if bio:
        result = check_spam(bio)
        if result['results']['spamfinder']['hit']:
            return {'approved': False, 'reason': 'Spam content detected in profile'}
    return {'approved': True}

Messaging platforms

async function filterMessage(message, sender) {
  const result = await checkSpam(message.text);
  const spam = result.results.spamfinder;

  if (spam.hit) {
    await incrementSpamCount(sender.id);
    const spamCount = await getSpamCount(sender.id);
    if (spamCount > 3) {
      await banUser(sender.id, 'repeated_spam');
    }
    return { delivered: false, reason: 'Message filtered as spam' };
  }
  return { delivered: true };
}

Combining with other checks

Run spam alongside sentiment and language in a single request:

{
  "content": {
    "text": "Check out this amazing deal! Click here: example.com/offer"
  },
  "settings": {
    "check_spam": true,
    "check_sentiment": true,
    "check_language": true
  }
}

The response then carries results.spamfinder, results.sentiment, and results.language side by side.
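
A handler for a combined response might pull the three blocks apart as in the sketch below. Only the spamfinder fields are documented in this section, so the sentiment and language blocks are passed through untouched, and a missing block is treated as None. Both function names are hypothetical.

```python
def split_results(response):
    """Return the spam, sentiment, and language blocks from a combined response."""
    results = response.get("results", {})
    return {
        "spam": results.get("spamfinder"),
        "sentiment": results.get("sentiment"),  # shape not covered in this section
        "language": results.get("language"),    # shape not covered in this section
    }

def is_spam_hit(response):
    """True only when a spamfinder block is present and its hit flag is set."""
    spam = split_results(response)["spam"]
    return bool(spam and spam.get("hit"))
```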

Best practices

Use graded responses

Instead of a binary block/allow decision, branch on confidence:

function handleSpamResult(spam) {
  if (!spam.hit) return 'allow';
  if (spam.confidence > 0.95) return 'silent_delete';
  if (spam.confidence > 0.8)  return 'block_notify';
  if (spam.confidence > 0.6)  return 'flag_for_review';
  return 'apply_friction';
}

Track repeat offenders

async function assessUser(userId, spam) {
  if (spam.hit) {
    await incrementUserSpamScore(userId, spam.confidence);
  }
  const userScore = await getUserSpamScore(userId);
  if (userScore > 10.0) await autoSuspendUser(userId);
  else if (userScore > 5.0) await flagForManualReview(userId);
}

Whitelist trusted users

To reduce false positives and save quota, skip the spam check for verified or high-trust accounts:

function shouldCheckSpam(user) {
  if (user.isVerified) return false;
  if (user.trustScore > 0.9) return false;
  return true;
}

Usage limits

Spam detection draws from your sentiment-analysis quota:

Plan	Monthly analyses	Notes
Basic	1,000	Spam + sentiment included
Gold	5,000	Spam + sentiment included
Platinum	15,000	Spam + sentiment included
Ultimate	30,000	Spam + sentiment included

Cached responses do not count toward your quota.
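
Each response carries a usage block and a cached flag, so quota can be tracked on the client side. The sketch below reads the fields shown in the response format. The 10% warning level and the function name are assumptions.

```python
def quota_status(response, warn_fraction=0.10):
    """Summarize quota from the response's usage block."""
    usage = response["usage"]
    remaining = usage["api_requests_remaining"]
    limit = usage["api_requests_limit"]
    return {
        # Cached responses do not count toward the quota.
        "cached": response.get("cached", False),
        "remaining": remaining,
        # Flag the quota as low once the remaining share drops to warn_fraction.
        "low": limit > 0 and remaining / limit <= warn_fraction,
    }
```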

Integration examples

Express.js middleware

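const spamFilter = async (req, res, next) => {
  try {
    // req.body is undefined unless a body parser such as express.json() is mounted.
    if (req.body && req.body.text) {
      const result = await checkSpam(req.body.text);
      if (result.results.spamfinder.hit) {
        return res.status(400).json({
          error: 'spam_detected',
          message: 'Your message was flagged as spam'
        });
      }
    }
    next();
  } catch (err) {
    // Forward API or network failures to Express error handling.
    next(err);
  }
};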

app.post('/api/comments', spamFilter, createComment);

Python Flask

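from functools import wraps
from flask import Flask, request, jsonify

app = Flask(__name__)

def spam_filter(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        # silent=True returns None for a missing or malformed JSON body instead of raising.
        payload = request.get_json(silent=True) or {}
        text = payload.get('text')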
        if text:
            result = check_spam(text)
            if result['results']['spamfinder']['hit']:
                return jsonify({
                    'error': 'spam_detected',
                    'message': 'Your message was flagged as spam'
                }), 400
        return f(*args, **kwargs)
    return decorated

@app.route('/api/comments', methods=['POST'])
@spam_filter
def create_comment():
    pass

Next steps

Text Analysis - combine spam with sentiment scoring
Language Detection - detect and enforce content language
Quick Start Guide - get your first API key
