Python for loop unreasonably stops halfway while iterating through CSV rows
I am dealing with a CSV file to analyze lecture feedback data. The format looks like this:

    "5631","18650","10",,,"2015-09-18 09:35:11"
    "18650","null","10",,,"2015-09-18 09:37:12"
    "18650","5631","10",,,"2015-09-18 09:37:19"
    "58649","null","6",,,"2015-09-18 09:38:13"
    "45379","31541","10","its friday","nothing yet keep up","2015-09-18 09:39:46"
I am trying to get rid of bad data. Only entries "id1","id2" that have a corresponding "id2","id1" entry are considered valid.
I am using nested loops to try to find a matching entry for each row. However, the outer loop seems to stop halfway for no reason. Here's my code:
    import csv

    class filter:
        file1 = open('encodedpeerinteractions.fa2015.csv')
        peerinter = csv.reader(file1, delimiter=',')

        def __init__(self):
            super()

        def filter(self):
            file2 = open('filteredinteractions.csv', 'a')
            for row in self.peerinter:
                print(row)
                if row[0] == 'null' or row[1] == 'null':
                    continue
                id1 = int(row[0])
                id2 = int(row[1])
                # The inner loop iterates over the same reader as the outer loop
                for test in self.peerinter:
                    if test[0] == 'null' or test[1] == 'null':
                        continue
                    if int(test[0]) == id2 and int(test[1]) == id1:
                        file2.write("\n")
                        file2.write(str(row))
                        break
            file2.close()
I have tried using pdb to step through the code. It is fine for the first couple of loops, then it jumps to file2.close() and returns. The program prints out a few valid entries, but far too few.
I tested the CSV file and it loads properly into memory with over 18000 entries. I also tested using print, which gives the same result, so nothing is wrong with appending to the file.
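Edit
Now I understand what the problem is. As this question says, I break out when there's a match, but when there's no match the inner loop consumes the rest of the file without resetting it. When control returns to the outer loop, there is nothing left to read, so it ends. I should make the rows a list or reset the reader.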
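The exhaustion described above can be reproduced in a few lines. This is a minimal sketch, not the original program: the in-memory `SAMPLE` string and both function names are hypothetical stand-ins for the real file, and the inner loop never breaks, which corresponds to the no-match case.

```python
import csv
import io

# Hypothetical in-memory sample standing in for the real CSV file.
SAMPLE = '''"5631","18650","10"
"18650","null","10"
"18650","5631","10"
"58649","null","6"
"45379","31541","10"
'''


def outer_rows_shared_reader(text):
    """Nest two loops over ONE reader and record what the outer loop sees."""
    reader = csv.reader(io.StringIO(text))
    seen = []
    for row in reader:
        seen.append(row[0])
        for _ in reader:  # no break: consumes every remaining row
            pass
    return seen


def outer_rows_from_list(text):
    """Same loops, but over a list, which can be iterated repeatedly."""
    rows = list(csv.reader(io.StringIO(text)))
    seen = []
    for row in rows:
        seen.append(row[0])
        for _ in rows:  # each for loop gets a fresh iterator over the list
            pass
    return seen


print(outer_rows_shared_reader(SAMPLE))  # ['5631']
print(outer_rows_from_list(SAMPLE))      # ['5631', '18650', '18650', '58649', '45379']
```

A csv.reader is a one-pass iterator, so both loops advance the same position in the file. Reading the rows into a list first gives each loop its own independent pass.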
You are making this way more complicated than it needs to be.
Given:

    $ cat /tmp/so.csv
    "5631","18650","10",,,"2015-09-18 09:35:11"
    "18650","null","10",,,"2015-09-18 09:37:12"
    "18650","5631","10",,,"2015-09-18 09:37:19"
    "58649","null","6",,,"2015-09-18 09:38:13"
    "45379","31541","10","its friday","nothing yet keep up","2015-09-18 09:39:46"
You can use csv and filter to drop the rows you don't want:

    >>> import csv
    >>> with open('/tmp/so.csv') as f:
    ...     list(filter(lambda row: 'null' not in row[0:2], csv.reader(f)))
    ...
    [['5631', '18650', '10', '', '', '2015-09-18 09:35:11'], ['18650', '5631', '10', '', '', '2015-09-18 09:37:19'], ['45379', '31541', '10', 'its friday', 'nothing yet keep up', '2015-09-18 09:39:46']]
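The filter above only removes the 'null' rows. The question also asks to keep an "id1","id2" entry only when a matching "id2","id1" entry exists. One possible way to do that without nested loops is a set lookup, sketched here. The `reciprocal_rows` helper and the in-memory `SAMPLE` are hypothetical, and the sketch assumes a match anywhere in the file counts, so both rows of a pair are kept (the original nested loop only looked at later rows).

```python
import csv
import io

# Hypothetical in-memory copy of the sample data from the question.
SAMPLE = '''"5631","18650","10",,,"2015-09-18 09:35:11"
"18650","null","10",,,"2015-09-18 09:37:12"
"18650","5631","10",,,"2015-09-18 09:37:19"
"58649","null","6",,,"2015-09-18 09:38:13"
"45379","31541","10","its friday","nothing yet keep up","2015-09-18 09:39:46"
'''


def reciprocal_rows(rows):
    """Keep rows whose (id1, id2) pair also appears reversed as (id2, id1)."""
    # Drop rows where either id is the string 'null'.
    rows = [r for r in rows if 'null' not in r[0:2]]
    # Collect every (id1, id2) pair once so each lookup is a set membership test.
    pairs = {(r[0], r[1]) for r in rows}
    return [r for r in rows if (r[1], r[0]) in pairs]


valid = reciprocal_rows(csv.reader(io.StringIO(SAMPLE)))
for row in valid:
    print(row)
```

On the sample data this keeps the two rows for the 5631/18650 pair and drops the 45379/31541 row, which has no reverse entry. Because the rows are materialized once and checked against a set, the reader is never iterated twice.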