python - Filter elements from a list based on whether they contain spam terms
So I've made a script that scrapes some sites and builds a list of results. Each result has the following structure:
result = {'id': id, 'name': name, 'url': url, 'datetime': datetime, }
I want to filter the list of results based on spam terms appearing in the name. I've defined the following function, and it seems to filter out some results, but not all of them:
def filterspamgigslist(thelist):
    index = 0
    spamterms = ['paid','hire','work','review','survey',
                 'home','rent','cash','pay','flex',
                 'facebook','sex','$$$','boss','secretary',
                 'loan','supplemental','income','sales',
                 'dollars','money']
    for i in thelist:
        for y in spamterms:
            if y in i['name'].lower():
                thelist.pop(index)
                break
        index += 1
    return thelist
Any clue why this might not be filtering out all the results that contain these spam terms? Maybe I need to call .split() on the name after calling .lower(), since some of the names are phrases?
I guess the problem is that you're modifying thelist in place while iterating over it, as Jakub suggested.
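To see why popping during iteration lets some spam through, here is a minimal sketch. The function name pop_while_iterating and the sample names are made up for illustration; only the loop shape follows the question's code. When an item is popped, the following item shifts into the freed slot, but the loop has already moved past that slot, so that item is never checked.

```python
def pop_while_iterating(names, bad):
    """Mimic the question's loop: pop matching items while iterating."""
    names = list(names)  # work on a copy so the caller's list is untouched
    index = 0
    for name in names:
        if bad in name:
            names.pop(index)  # the next item shifts into position `index`
        index += 1            # ...and the loop then skips over it
    return names

# Hypothetical sample data: two adjacent spam entries.
result = pop_while_iterating(["cash now", "cash later", "guitarist wanted"], "cash")
print(result)  # ['cash later', 'guitarist wanted']
```

"cash later" survives because it slid into index 0 right after "cash now" was removed, and the loop had already moved on to index 1.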
The obvious way is to return a new list. I'd split it into two functions for readability:
def is_spam(value):
    spam_terms = ['paid','hire','work','review','survey',
                  'home','rent','cash','pay','flex',
                  'facebook','sex','$$$','boss','secretary',
                  'loan','supplemental','income','sales',
                  'dollars','money']
    # Lowercase once, then look for any spam term as a substring.
    lowered = value.lower()
    for term in spam_terms:
        if term in lowered:
            return True
    return False

def filter_spam_gigs_list(results):
    # Build a new list instead of mutating the one being iterated.
    return [i for i in results if not is_spam(i['name'])]
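On the .split() idea from the question: a substring test already handles multi-word names, so splitting is not needed to fix the missed results. It does change what counts as a match, though. The sketch below is only an illustration under assumptions: the shortened SPAM_TERMS set, the helper names is_spam_substring and is_spam_word, and the sample results are all hypothetical.

```python
# Shortened term list, for illustration only.
SPAM_TERMS = {'paid', 'hire', 'work', 'cash'}

def is_spam_substring(name):
    # Substring match: 'work' also matches inside 'fireworks'.
    lowered = name.lower()
    return any(term in lowered for term in SPAM_TERMS)

def is_spam_word(name):
    # Whole-word match: split the lowercased name on whitespace first.
    return any(word in SPAM_TERMS for word in name.lower().split())

# Hypothetical results, reduced to the fields used here.
results = [
    {'id': 1, 'name': 'Get PAID to work from home'},
    {'id': 2, 'name': 'Fireworks show volunteers'},
    {'id': 3, 'name': 'Drummer wanted'},
]

by_substring = [r for r in results if not is_spam_substring(r['name'])]
by_word = [r for r in results if not is_spam_word(r['name'])]

print([r['id'] for r in by_substring])  # [3]
print([r['id'] for r in by_word])       # [2, 3]
```

Substring matching is stricter and produces false positives such as "Fireworks". Whole-word matching avoids those but would miss a term with punctuation attached, such as "paid!", unless the punctuation is stripped first.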