python - Filter elements from a list based on whether they contain spam terms


So I've made a script that scrapes some sites and builds a list of results. Each result has the following structure:

result = {
    'id': id,
    'name': name,
    'url': url,
    'datetime': datetime,
}

I want to filter the list of results based on spam terms appearing in the name. I've defined the following function, and it seems to filter out some results, but not all of them:

def filterspamgigslist(thelist):
    index = 0
    spamterms = ['paid', 'hire', 'work', 'review', 'survey',
                 'home', 'rent', 'cash', 'pay', 'flex',
                 'facebook', 'sex', '$$$', 'boss', 'secretary',
                 'loan', 'supplemental', 'income', 'sales',
                 'dollars', 'money']
    for i in thelist:
        for y in spamterms:
            if y in i['name'].lower():
                thelist.pop(index)
                break
        index += 1
    return thelist

Any clue why this might not be filtering out all the results that contain these spam terms? Maybe I need to call .split() on the name after calling .lower(), since some of the names are phrases?

I guess you've got a problem with modifying thelist in place while iterating over it, as Jakub suggested.
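To see why this skips items, here is a minimal sketch that mimics the question's loop on made-up data (the names and the helper `pop_while_iterating` are assumptions for illustration, not part of the original script). When an element is popped, the next element slides into its slot, but the loop still advances to the following position, so the element that slid down is never checked.

```python
def pop_while_iterating(names, term):
    """Mimic the question's loop: pop matches from the list being iterated."""
    index = 0
    for name in names:
        if term in name.lower():
            # Removing the current item shifts every later item one slot left.
            names.pop(index)
        index += 1
    return names


# Hypothetical sample data: the first two names both contain 'pay'.
names = ['pay day', 'pay now', 'guitar lesson']
leftover = pop_while_iterating(list(names), 'pay')

# 'pay now' moved into slot 0 after the pop and was skipped by the loop.
print(leftover)
```

Only the first of two adjacent matches is removed, which matches the "filters some, but not all" symptom described in the question.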

The obvious way is to return a new list. I split it into 2 functions for readability:

def is_spam(value):
    spam_terms = ['paid', 'hire', 'work', 'review', 'survey',
                  'home', 'rent', 'cash', 'pay', 'flex',
                  'facebook', 'sex', '$$$', 'boss', 'secretary',
                  'loan', 'supplemental', 'income', 'sales',
                  'dollars', 'money']
    # Lowercase once, then look for any spam term as a substring.
    lowered = value.lower()
    for term in spam_terms:
        if term in lowered:
            return True
    return False

def filter_spam_gigs_list(results):
    # Build a new list instead of mutating the one being iterated.
    return [i for i in results if not is_spam(i['name'])]
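A quick check of this approach on made-up data may help. The sample results and the shortened term list below are assumptions for illustration only. The sketch also shows that `in` on strings is a substring test, so multi-word names need no `.split()`.

```python
# Shortened, hypothetical term list; the real one is longer.
SPAM_TERMS = ['paid', 'hire', 'work', 'cash', 'pay']


def is_spam(value):
    """Return True if any spam term occurs as a substring of value."""
    lowered = value.lower()
    return any(term in lowered for term in SPAM_TERMS)


def filter_spam_gigs_list(results):
    """Return a new list without the spam results; the input is left untouched."""
    return [r for r in results if not is_spam(r['name'])]


# Hypothetical scraped results (only the fields needed here).
sample = [
    {'id': 1, 'name': 'Get PAID to take surveys'},
    {'id': 2, 'name': 'Easy cash from home'},
    {'id': 3, 'name': 'Drummer wanted for jazz trio'},
]

kept = filter_spam_gigs_list(sample)
print([r['name'] for r in kept])
```

One thing to keep in mind: because the match is on substrings, a harmless name such as "Homework help" would also be flagged here, since it contains "work".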
